Haskell triangular loop (correct use of (++))

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Haskell triangular loop (correct use of (++))

rohan sumant
Suppose I have a list of distinct integers and I wish to generate all possible unordered pairs (a,b) where a/=b.

Ex: [1,2,3,0] --> [(1,2),(1,3),(1,0),(2,3),(2,0),(3,0)]

 The approach I am following is this :-

mkpairs [] = []
mkpairs (x:xs) = (map (fn x) xs) ++ (mkpairs xs)

fn x y = (x,y)

It is generating the desired output but I am a bit unsure about the time complexity of the function mkpairs. In an imperative language a nested triangular for loop would do the trick in O(n^2) or more precisely (n*(n-1)/2) operations. Does my code follow the same strategy? I am particularly worried about the (++) operator. I think that (++) wouldn't add to the time complexity since the initial code fragment (map (fn x) xs) is to be computed anyway. Am I wrong here? Is this implementation running O(n^2)? If not, could you please show me how to write a nested triangular loop in Haskell?

Rohan Sumant
   

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

Sumit Sahrawat, Maths & Computing,
 IIT (BHU)

Haskell is a declarative language. The primary means of programming in a declarative language is to provide definitions for stuff, similar to mathematics. e.g. inc x = x + 1

If you want to define a value using an explicit sequence of steps, then you have to use monads. The Haskell wikibook has a good tutorial on using the state monad to generate random numbers. This allows mutation, preventing which is one of Haskell's prime features, so I wouldn't recommend wiring code like this.

Complexity analysis is usually tricky in Haskell. You can refer to the book 'purely functional days structures' to know more.

Regards,
  Sumit


_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

Silent Leaf
In reply to this post by rohan sumant
Dunno if that's what you're interested in, or if it's best in terms of efficiency, but there's syntax inside the language made just for this kind of thing, called list comprehension. It comes from math's definition of sets by comprehension, and since it's part of the language I'd have a tendency to trust its efficiency, but I might be entirely wrong on this aspect.

Anyways, for your problem, say I want to create the set of pairs of your example:

let result = [(x,y) | let xs = [1,2,3,0], (x,ix) <- zip xs [1,2..], y <- drop ix xs, x /= y]
in result == [(1,2),(1,3),(1,0),(2,3),(2,0),(3,0)]

Basically the syntax is: [ parameterized result element | conditions on the parameters]
the conditions being a sequence of comma-separated items that are either: local variable declarations without the 'in', example being (let input = [1,2,3,0]), pattern-accepting generation of values from a list, or conditions on the parameters (here x and y).

In order to build y's list I decided to zip xs with a list of indexes starting to 1, thereby ensuring no pair is twice in, considering the order doesn't matter.
I'd bet the syntax is monad/do related, with all those right-to left arrows. Plus it fits the bill of what's actually happening here.

Of course if you want a function, you can still write thereafter
mkpairs :: Integral a => a -> [(a,a)]
mkpairs n = [(x,y) | let xs = [1..n] ++ [0], (x,ix) <- zip xs [1,2..], y <- drop ix xs, x /= y]

If you don't care about the order, I guess xs = [0..n] will be much more efficient, relatively speaking.
Pretty sure the function even works for n == 0, since y <- drop 1 [0] won't have a thing to yield, hence, result = [].


_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

rohan sumant
@Silent Leaf

I am indeed familiar with the list comprehension syntax indeed. I agree with you that it certainly is the better alternative to writing handcrafted functions especially when they involve (++). However the code you have mentioned doesn't get the job done correctly. Your approach implements a square nested loop which makes it at least twice as inefficient than the one I am rooting for. The problem lies with the dropWhile function. It will begin from the start of the list for every new (x,ix). This is particularly bad in Haskell because the garbage collector cannot do away with unnecessary prefixes of the input string, thereby wasting a lot of memory.




Rohan Sumant
   

On Sun, Apr 10, 2016 at 11:42 PM, Silent Leaf <[hidden email]> wrote:
Dunno if that's what you're interested in, or if it's best in terms of efficiency, but there's syntax inside the language made just for this kind of thing, called list comprehension. It comes from math's definition of sets by comprehension, and since it's part of the language I'd have a tendency to trust its efficiency, but I might be entirely wrong on this aspect.

Anyways, for your problem, say I want to create the set of pairs of your example:

let result = [(x,y) | let xs = [1,2,3,0], (x,ix) <- zip xs [1,2..], y <- drop ix xs, x /= y]
in result == [(1,2),(1,3),(1,0),(2,3),(2,0),(3,0)]

Basically the syntax is: [ parameterized result element | conditions on the parameters]
the conditions being a sequence of comma-separated items that are either: local variable declarations without the 'in', example being (let input = [1,2,3,0]), pattern-accepting generation of values from a list, or conditions on the parameters (here x and y).

In order to build y's list I decided to zip xs with a list of indexes starting to 1, thereby ensuring no pair is twice in, considering the order doesn't matter.
I'd bet the syntax is monad/do related, with all those right-to left arrows. Plus it fits the bill of what's actually happening here.

Of course if you want a function, you can still write thereafter
mkpairs :: Integral a => a -> [(a,a)]
mkpairs n = [(x,y) | let xs = [1..n] ++ [0], (x,ix) <- zip xs [1,2..], y <- drop ix xs, x /= y]

If you don't care about the order, I guess xs = [0..n] will be much more efficient, relatively speaking.
Pretty sure the function even works for n == 0, since y <- drop 1 [0] won't have a thing to yield, hence, result = [].


_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners



_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

Silent Leaf
Ah, from that I gather I probably won't think about anything you wouldn't, then ^^

But I'll try because it's fun anyway!
If dropping incessantly from the start of the list at each iteration is the problem (is that it? or did I not understand correctly?), could we not manage the problem with memoization, using a recursive function of the like:
f 1 = drop 1 xs
f n = drop 1 (f (n-1))
I can't be sure at all, so I'll ask: would haskell remember the result of each f 1, f 2, f 3, instead of recalculating them all the time? it would allow for only one "drop" operation per iteration, and i guess it's the best we can get with the general path I tried... but it seems obvious from you both messages, there must be better ones no matter what, efficiently speaking. :P

it could be naive, but one way to manage mere lists of integral numbers, would be to transform them into one big number, and instead of dropping, we do integral divisions/recuperation of remainders. i bet *if* that's more efficient, there's a library to do that already.

one only has to to write the number that serves as list as a juxtaposition of n-sized clusters of digits, n being the biggest power of ten reached by the biggest number of the list. smaller numbers, tha have not enough digits could be written with zeroes to fill in the rest: "0010" for example, and of course so as to not lose trailing zeroes of the number at the leftest, one has to start (from the right) with the smaller numbers: 123...010...002001.
But who knows if it's really more efficient? I'd have a tendency to say, arithmetic is just numbers, the computer can do it way quicker and with much less memory, but maybe not.

Just an idea off the top of my head. I bet no matter the technique we use, handling one big number could end up being faster than a big list, and with the right set of functions, it could be just as easy, but i could be totally wrong.
And if too big numbers are problematic, we can still attempt intermediary solutions, like lists of clustered numbers ^^
Sorry for the stupid ideas, hopefully soon my wild imagination will be better handled by what i'll learn when i get a little less newbie. :P

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

Silent Leaf
In reply to this post by rohan sumant
I'm sorry to hear, no implicit memoization (but then is there an explicit approach?). in a pure language, this seems hardly logical, considering the functional "to one output always the same result" and the language's propensity to use recursion that uses then previous values already calculated. Really hope for an explicit memoization! and i don't mean a manual one ^^ if that is even possible?

Anyway, i just don't get your function f. you did get that in mine, xs was the value of my comprehension list, aka [.s1..n] ++ [0]
From there, if I'm not wrong, your function creates a list of all truncated lists, I supposed to be used with a call for the nth element? True, it memorizes all needed things, and in theory only "drops" once per element of the list.

As for incorporating it, i'm thinking, local variable? ^^ the function itself could be outside the comprehension list, in a let or where, or completely out of everything, I don't really see the problem. then it's just a matter of it being called onto xs inside the comprehension list, like that:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, (x,i) <- zip xs [1,2..], y <- yss !! i, x /= y]
The remaining possible issue is the call to !!... dunno if that's costy or not.

The best way to go through the list I suppose would be by recursion... oh wait, I'm writing as I'm thinking, and I'm thinking:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, x <- xs, ys <- yss, y <- ys, x /= y]
after all, why not use the invisible internal recursion? What do you think?

As for the number-instead-of-number-list, the crucial point i mentioned was to determine the maximum number of digit (biggest power of ten reached), and fill up the holes of the smaller numbers with zeroes, so your examples would be:
[1,2,3] = 123 --maximum size of a number = 1, no need to fill up
[12,3] = 1203 --maximum size of a number = 2 digits, thus the hole beside 3 gets filled with a zero, just like on good old digital watches ^^
Do you think extraction of clusters of digits from numbers would be advantageous, efficiently speaking?

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

Rein Henrichs
Compilers generally don't provide memoization optimizations because there isn't a single dominating strategy: it varies on a case-by-case basis. (For example, memoization of some functions can take advantage of structures with sub-linear indexing, but this can't be done generically.)

When you can implement your own memoization yourself within the language with exactly the properties you desire (and a number of libraries have already done so for many common strategies ([1], [2], [3], [4])), there has been little motivation to add an implementation to GHC which is either inferior by way of its generality or extremely complex by way of its many special cases.


On Sun, Apr 10, 2016 at 10:06 PM Silent Leaf <[hidden email]> wrote:
I'm sorry to hear, no implicit memoization (but then is there an explicit approach?). in a pure language, this seems hardly logical, considering the functional "to one output always the same result" and the language's propensity to use recursion that uses then previous values already calculated. Really hope for an explicit memoization! and i don't mean a manual one ^^ if that is even possible?

Anyway, i just don't get your function f. you did get that in mine, xs was the value of my comprehension list, aka [.s1..n] ++ [0]
From there, if I'm not wrong, your function creates a list of all truncated lists, I supposed to be used with a call for the nth element? True, it memorizes all needed things, and in theory only "drops" once per element of the list.

As for incorporating it, i'm thinking, local variable? ^^ the function itself could be outside the comprehension list, in a let or where, or completely out of everything, I don't really see the problem. then it's just a matter of it being called onto xs inside the comprehension list, like that:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, (x,i) <- zip xs [1,2..], y <- yss !! i, x /= y]
The remaining possible issue is the call to !!... dunno if that's costy or not.

The best way to go through the list I suppose would be by recursion... oh wait, I'm writing as I'm thinking, and I'm thinking:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, x <- xs, ys <- yss, y <- ys, x /= y]
after all, why not use the invisible internal recursion? What do you think?

As for the number-instead-of-number-list, the crucial point i mentioned was to determine the maximum number of digit (biggest power of ten reached), and fill up the holes of the smaller numbers with zeroes, so your examples would be:
[1,2,3] = 123 --maximum size of a number = 1, no need to fill up
[12,3] = 1203 --maximum size of a number = 2 digits, thus the hole beside 3 gets filled with a zero, just like on good old digital watches ^^
Do you think extraction of clusters of digits from numbers would be advantageous, efficiently speaking?
_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

Rein Henrichs
I should also mention that many times when you think you want memoization in Haskell, what you actually want is to restructure your computation to take better advantage of laziness (and especially sharing).

On Mon, Apr 11, 2016 at 2:10 PM Rein Henrichs <[hidden email]> wrote:
Compilers generally don't provide memoization optimizations because there isn't a single dominating strategy: it varies on a case-by-case basis. (For example, memoization of some functions can take advantage of structures with sub-linear indexing, but this can't be done generically.)

When you can implement your own memoization yourself within the language with exactly the properties you desire (and a number of libraries have already done so for many common strategies ([1], [2], [3], [4])), there has been little motivation to add an implementation to GHC which is either inferior by way of its generality or extremely complex by way of its many special cases.


On Sun, Apr 10, 2016 at 10:06 PM Silent Leaf <[hidden email]> wrote:
I'm sorry to hear, no implicit memoization (but then is there an explicit approach?). in a pure language, this seems hardly logical, considering the functional "to one output always the same result" and the language's propensity to use recursion that uses then previous values already calculated. Really hope for an explicit memoization! and i don't mean a manual one ^^ if that is even possible?

Anyway, i just don't get your function f. you did get that in mine, xs was the value of my comprehension list, aka [.s1..n] ++ [0]
From there, if I'm not wrong, your function creates a list of all truncated lists, I supposed to be used with a call for the nth element? True, it memorizes all needed things, and in theory only "drops" once per element of the list.

As for incorporating it, i'm thinking, local variable? ^^ the function itself could be outside the comprehension list, in a let or where, or completely out of everything, I don't really see the problem. then it's just a matter of it being called onto xs inside the comprehension list, like that:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, (x,i) <- zip xs [1,2..], y <- yss !! i, x /= y]
The remaining possible issue is the call to !!... dunno if that's costy or not.

The best way to go through the list I suppose would be by recursion... oh wait, I'm writing as I'm thinking, and I'm thinking:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, x <- xs, ys <- yss, y <- ys, x /= y]
after all, why not use the invisible internal recursion? What do you think?

As for the number-instead-of-number-list, the crucial point i mentioned was to determine the maximum number of digit (biggest power of ten reached), and fill up the holes of the smaller numbers with zeroes, so your examples would be:
[1,2,3] = 123 --maximum size of a number = 1, no need to fill up
[12,3] = 1203 --maximum size of a number = 2 digits, thus the hole beside 3 gets filled with a zero, just like on good old digital watches ^^
Do you think extraction of clusters of digits from numbers would be advantageous, efficiently speaking?
_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

Jon Surrell
Rein's comment about taking advantage of laziness and sharing over memoization made an impact with me. Seeing it stated like that, it seems obvious. Does anyone know of resources that explain this type of restructuring, preferably with practical examples?

Thanks,
Jon

On Mon, Apr 11, 2016 at 11:12 PM Rein Henrichs <[hidden email]> wrote:
I should also mention that many times when you think you want memoization in Haskell, what you actually want is to restructure your computation to take better advantage of laziness (and especially sharing).

On Mon, Apr 11, 2016 at 2:10 PM Rein Henrichs <[hidden email]> wrote:
Compilers generally don't provide memoization optimizations because there isn't a single dominating strategy: it varies on a case-by-case basis. (For example, memoization of some functions can take advantage of structures with sub-linear indexing, but this can't be done generically.)

When you can implement your own memoization yourself within the language with exactly the properties you desire (and a number of libraries have already done so for many common strategies ([1], [2], [3], [4])), there has been little motivation to add an implementation to GHC which is either inferior by way of its generality or extremely complex by way of its many special cases.


On Sun, Apr 10, 2016 at 10:06 PM Silent Leaf <[hidden email]> wrote:
I'm sorry to hear, no implicit memoization (but then is there an explicit approach?). in a pure language, this seems hardly logical, considering the functional "to one output always the same result" and the language's propensity to use recursion that uses then previous values already calculated. Really hope for an explicit memoization! and i don't mean a manual one ^^ if that is even possible?

Anyway, i just don't get your function f. you did get that in mine, xs was the value of my comprehension list, aka [.s1..n] ++ [0]
From there, if I'm not wrong, your function creates a list of all truncated lists, I supposed to be used with a call for the nth element? True, it memorizes all needed things, and in theory only "drops" once per element of the list.

As for incorporating it, i'm thinking, local variable? ^^ the function itself could be outside the comprehension list, in a let or where, or completely out of everything, I don't really see the problem. then it's just a matter of it being called onto xs inside the comprehension list, like that:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, (x,i) <- zip xs [1,2..], y <- yss !! i, x /= y]
The remaining possible issue is the call to !!... dunno if that's costy or not.

The best way to go through the list I suppose would be by recursion... oh wait, I'm writing as I'm thinking, and I'm thinking:
result = [(x,y) | let xs = [1,2,3,0], let yss = f xs, x <- xs, ys <- yss, y <- ys, x /= y]
after all, why not use the invisible internal recursion? What do you think?

As for the number-instead-of-number-list, the crucial point i mentioned was to determine the maximum number of digit (biggest power of ten reached), and fill up the holes of the smaller numbers with zeroes, so your examples would be:
[1,2,3] = 123 --maximum size of a number = 1, no need to fill up
[12,3] = 1203 --maximum size of a number = 2 digits, thus the hole beside 3 gets filled with a zero, just like on good old digital watches ^^
Do you think extraction of clusters of digits from numbers would be advantageous, efficiently speaking?
_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

atomly
In reply to this post by rohan sumant

:: atomly ::

[ [hidden email] : www.atomly.com  : http://blog.atomly.com/ ...
[ atomiq records : new york city : +1.347.692.8661 ...
[ e-mail [hidden email] for atomly info and updates ...

On Sun, Apr 10, 2016 at 11:59 AM, rohan sumant <[hidden email]> wrote:
Suppose I have a list of distinct integers and I wish to generate all possible unordered pairs (a,b) where a/=b.

Ex: [1,2,3,0] --> [(1,2),(1,3),(1,0),(2,3),(2,0),(3,0)]

 The approach I am following is this :-

mkpairs [] = []
mkpairs (x:xs) = (map (fn x) xs) ++ (mkpairs xs)

fn x y = (x,y)

It is generating the desired output but I am a bit unsure about the time complexity of the function mkpairs. In an imperative language a nested triangular for loop would do the trick in O(n^2) or more precisely (n*(n-1)/2) operations. Does my code follow the same strategy? I am particularly worried about the (++) operator. I think that (++) wouldn't add to the time complexity since the initial code fragment (map (fn x) xs) is to be computed anyway. Am I wrong here? Is this implementation running O(n^2)? If not, could you please show me how to write a nested triangular loop in Haskell?

Rohan Sumant
   

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners



_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Haskell triangular loop (correct use of (++))

AntC
In reply to this post by rohan sumant
> rohan sumant <r.s.sumant <at> gmail.com> writes:
>
>  The approach I am following is this :-
> mkpairs [] = []
>
> mkpairs (x:xs) = (map (fn x) xs) ++ (mkpairs xs)
> fn x y = (x,y)
>
> It is generating the desired output ...

Good! You've got the right approach for triangular loops.
That is a form

mkpairs (x:xs) = {- stuff -} $ mkpairs xs

A couple of things look non-idiomatic. So before I get on to your main question:

As well as an empty list being a special case,
neither can you make any pairs from a singleton ;-). I suggest:

mkpairs [x] = []

The auxiliary `fn` is messy. I would put explicit lambda:

mkpairs (x:xs) = map (\x' -> (x, x') ) xs ++ mkpairs xs

Or even switch on tuple sections
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/
syntax-extns.html#tuple-sections

mkpairs (x:xs) = map (x, ) xs ++ mkpairs xs

> but I am a bit unsure about the time complexity of the function mkpairs. ...
> I am particularly worried about the (++) operator. ...

You are spot-on! (++) has terrible time complexity.
Use the showS trick for constant-time concatenation.
Look in the Prelude for class Show a/method showsPrec,
explained in the Haskell 2010 report section 6.3.3.

To get that to work, you need a 'worker' function
to hand across the successor for each list.
It's conventional to call that `go`,
and because everyone uses `go`,
wrap it inside mkpairs using a `where`.

Complete code:

mkPairs [] = []
mkPairs [x] = []
mkPairs (x:xs) = go xs $ mkPairs xs
  where go [] s = s
             go (x':xs') s = (x, x') : go xs' s

Now is that any faster in practice?
Probably not until you get to a list with thousands of elements.

HTH
_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners