folds -- help!

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

folds -- help!

7stud-2
This is an example that shows how foldl and foldr work (from RWH p.93-94):

foldl (+) 0 (1:2:3:[])
   == foldl (+) (0 + 1)             (2:3:[])
   == foldl (+) ((0 + 1) + 2)       (3:[])
   == foldl (+) (((0 + 1) + 2) + 3) []
   ==           (((0 + 1) + 2) + 3)


foldr (+) 0 (1:2:3:[])
   ==  1 +           foldr (+) 0 (2:3:[])
   ==  1 + (2 +      foldr (+) 0 (3:[])
   ==  1 + (2 + (3 + foldr (+) 0 []))
   ==  1 + (2 + (3 + 0))

The book says on p.94:

-----
The difference between foldl and foldr should be clear from looking at where
the parentheses and the empty list elements show up.  With foldl, the empty
list element is on the left, and all the parentheses group to the left.
With foldr, the zero value is on the right, and the parentheses group to the
right.
----

Huh?  With foldl, the only empty list element I see is on the right.

Initially, it looked to me ike they did the same thing, and that the only
difference was the way they called step.  I think "step" is a horrible,
non-descriptive name, so I'm going to use "accFunc" instead:

foldl calls: accFunc acc x

foldr calls: accFunc x acc

So it looks like you can define a function using either one and get the
same result.  Here is a test:

--I am going to use odd for pfunc and [1, 2, 3] for xs:

myFilter1 pfunc xs = foldl accFunc [] xs
    where accFunc acc x  
            | pfunc x       = acc ++ [x]
            | otherwise     = acc

myFilter2 pfunc xs = foldr accFunc [] xs
    where accFunc x acc
            | pfunc x       = acc ++ [x]
            | otherwise     = acc


*Main> myFilter1 odd [1, 2, 3]
[1,3]
*Main> myFilter2 odd [1, 2, 3]
[3,1]

Hmmm.  So there is a difference.  foldr appears to grab elements from
the end of the list.  Therefore, to get the same result from the function
that uses foldr, I did this:


myFilter3 pfunc xs = foldr accFunc [] xs
    where accFunc x acc
            | pfunc x       = x : acc
            | otherwise     = acc  


*Main> myFilter3 odd [1, 2, 3]
[1,3]

But then RWH explains that you would never use foldl in practice because it
thunks the result, which for large lists can overwhelm the maximum memory
alloted for a thunk.  But it appears to me the same thunk problem would
occur with foldr.  So why is foldr used in practice but not foldl?





Reply | Threaded
Open this post in threaded view
|

folds -- help!

Daniel Fischer-4
Am Montag, 9. M?rz 2009 17:46 schrieb 7stud:

> This is an example that shows how foldl and foldr work (from RWH p.93-94):
>
> foldl (+) 0 (1:2:3:[])
>    == foldl (+) (0 + 1)             (2:3:[])
>    == foldl (+) ((0 + 1) + 2)       (3:[])
>    == foldl (+) (((0 + 1) + 2) + 3) []
>    ==           (((0 + 1) + 2) + 3)
>
>
> foldr (+) 0 (1:2:3:[])
>    ==  1 +           foldr (+) 0 (2:3:[])
>    ==  1 + (2 +      foldr (+) 0 (3:[])
>    ==  1 + (2 + (3 + foldr (+) 0 []))
>    ==  1 + (2 + (3 + 0))
>
> The book says on p.94:
>
> -----
> The difference between foldl and foldr should be clear from looking at
> where the parentheses and the empty list elements show up.  With foldl, the
> empty list element is on the left, and all the parentheses group to the
> left. With foldr, the zero value is on the right, and the parentheses group
> to the right.
> ----
>
> Huh?  With foldl, the only empty list element I see is on the right.

What they meant was "the value that is the result in case the fold is applied
to an empty list", in this case the 0, in the definition

fold(l/r) f z xs = ...

the 'z'.

>
> Initially, it looked to me ike they did the same thing, and that the only
> difference was the way they called step.  I think "step" is a horrible,
> non-descriptive name, so I'm going to use "accFunc" instead:
>
> foldl calls: accFunc acc x
>
> foldr calls: accFunc x acc
>
> So it looks like you can define a function using either one and get the
> same result.

Note that in general the list elements and acc have different types, so only
one of
accFun acc x
and
accFun x acc
typechecks.

If the types are the same, in general accFun x acc /= accFun acc x, so foldr
and foldl give different results, too.

>  Here is a test:
>
> --I am going to use odd for pfunc and [1, 2, 3] for xs:
>
> myFilter1 pfunc xs = foldl accFunc [] xs
>     where accFunc acc x
>
>             | pfunc x       = acc ++ [x]
>             | otherwise     = acc
>
> myFilter2 pfunc xs = foldr accFunc [] xs
>     where accFunc x acc
>
>             | pfunc x       = acc ++ [x]
>             | otherwise     = acc
>
> *Main> myFilter1 odd [1, 2, 3]
> [1,3]
> *Main> myFilter2 odd [1, 2, 3]
> [3,1]
>
> Hmmm.  So there is a difference.  foldr appears to grab elements from
> the end of the list.  Therefore, to get the same result from the function
> that uses foldr, I did this:
>
>
> myFilter3 pfunc xs = foldr accFunc [] xs
>     where accFunc x acc
>
>             | pfunc x       = x : acc
>             | otherwise     = acc
>
> *Main> myFilter3 odd [1, 2, 3]
> [1,3]
>
> But then RWH explains that you would never use foldl in practice because it
> thunks the result, which for large lists can overwhelm the maximum memory
> alloted for a thunk.  But it appears to me the same thunk problem would
> occur with foldr.  So why is foldr used in practice but not foldl?
>

Since with foldr, the parentheses are grouped to the right:
x0 `f` (x1 `f` (x2 `f` ... (xn `f` z) ... )),

if f can start delivering the result without looking at its second argument,
you can start consuming the result before the fold has traversed the whole
list.

Common examples are things like

concat = foldr (++) [],
so
concat [l1,l2,l3,l4,l5] = l1 ++ (foldr (++) [] [l2,l3,l4,l5])
and the start (l1) can be used before further reducing the fold,

and = foldr (&&) True

and [True,False,..........]
needs only inspect the list until it encounters the first False (if any,
otherwise it must of course traverse the whole list)

or = foldr (||) False

foldr is useful if the combination function is lazy in its second argument.

foldl on the other hand can't deliver anything before the whole list is
consumed. So since foldl builds thunks (except in some easy cases where the
optimiser sees it should be strict), which would have to be evaluated at the
end when they've become rather large, foldl isn't as useful and one uses the
strict left fold, foldl'.
Reply | Threaded
Open this post in threaded view
|

folds -- help!

Peter Verswyvelen-2
In reply to this post by 7stud-2
Maybe it helps to visualize it like this. Instead of computing the sum by
using a fold with (+), we just construct data:
data Expr = N Int
          | Expr :+: Expr
          deriving Show

ns :: [Expr]
ns = map N [1..3]

lf :: Expr
lf = foldl1 (:+:) ns

rf :: Expr
rf = foldr1 (:+:) ns

For simplicity Iused foldl1 and foldr1, which only work on non-empty lists.

(regarding the weird :+: well in Haskell, you can use operators for data
constructors when they start with a colon)

Run this with GHCi, and evaluate lf and rf. You should get

*Main> lf
(N 1 :+: N 2) :+: N 3
*Main> rf
N 1 :+: (N 2 :+: N 3)

So really, foldl "folds on the left", because the parentheses are on the
left side. Similarly for foldr.

Does this help?

On Mon, Mar 9, 2009 at 5:46 PM, 7stud <[hidden email]> wrote:

> This is an example that shows how foldl and foldr work (from RWH p.93-94):
>
> foldl (+) 0 (1:2:3:[])
>   == foldl (+) (0 + 1)             (2:3:[])
>   == foldl (+) ((0 + 1) + 2)       (3:[])
>   == foldl (+) (((0 + 1) + 2) + 3) []
>   ==           (((0 + 1) + 2) + 3)
>
>
> foldr (+) 0 (1:2:3:[])
>   ==  1 +           foldr (+) 0 (2:3:[])
>   ==  1 + (2 +      foldr (+) 0 (3:[])
>   ==  1 + (2 + (3 + foldr (+) 0 []))
>   ==  1 + (2 + (3 + 0))
>
> The book says on p.94:
>
> -----
> The difference between foldl and foldr should be clear from looking at
> where
> the parentheses and the empty list elements show up.  With foldl, the empty
> list element is on the left, and all the parentheses group to the left.
> With foldr, the zero value is on the right, and the parentheses group to
> the
> right.
> ----
>
> Huh?  With foldl, the only empty list element I see is on the right.
>
> Initially, it looked to me ike they did the same thing, and that the only
> difference was the way they called step.  I think "step" is a horrible,
> non-descriptive name, so I'm going to use "accFunc" instead:
>
> foldl calls: accFunc acc x
>
> foldr calls: accFunc x acc
>
> So it looks like you can define a function using either one and get the
> same result.  Here is a test:
>
> --I am going to use odd for pfunc and [1, 2, 3] for xs:
>
> myFilter1 pfunc xs = foldl accFunc [] xs
>    where accFunc acc x
>            | pfunc x       = acc ++ [x]
>            | otherwise     = acc
>
> myFilter2 pfunc xs = foldr accFunc [] xs
>    where accFunc x acc
>            | pfunc x       = acc ++ [x]
>            | otherwise     = acc
>
>
> *Main> myFilter1 odd [1, 2, 3]
> [1,3]
> *Main> myFilter2 odd [1, 2, 3]
> [3,1]
>
> Hmmm.  So there is a difference.  foldr appears to grab elements from
> the end of the list.  Therefore, to get the same result from the function
> that uses foldr, I did this:
>
>
> myFilter3 pfunc xs = foldr accFunc [] xs
>    where accFunc x acc
>            | pfunc x       = x : acc
>            | otherwise     = acc
>
>
> *Main> myFilter3 odd [1, 2, 3]
> [1,3]
>
> But then RWH explains that you would never use foldl in practice because it
> thunks the result, which for large lists can overwhelm the maximum memory
> alloted for a thunk.  But it appears to me the same thunk problem would
> occur with foldr.  So why is foldr used in practice but not foldl?
>
>
>
>
>
> _______________________________________________
> Beginners mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/beginners
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20090309/add4f687/attachment.htm
Reply | Threaded
Open this post in threaded view
|

folds -- help!

Adrian Neumann
In reply to this post by 7stud-2

Am 09.03.2009 um 17:46 schrieb 7stud:

> This is an example that shows how foldl and foldr work (from RWH p.
> 93-94):
>
> foldl (+) 0 (1:2:3:[])
>    == foldl (+) (0 + 1)             (2:3:[])
>    == foldl (+) ((0 + 1) + 2)       (3:[])
>    == foldl (+) (((0 + 1) + 2) + 3) []
>    ==           (((0 + 1) + 2) + 3)
>
>
> foldr (+) 0 (1:2:3:[])
>    ==  1 +           foldr (+) 0 (2:3:[])
>    ==  1 + (2 +      foldr (+) 0 (3:[])
>    ==  1 + (2 + (3 + foldr (+) 0 []))
>    ==  1 + (2 + (3 + 0))
>
> The book says on p.94:
>
> -----
> The difference between foldl and foldr should be clear from looking  
> at where
> the parentheses and the empty list elements show up.  With foldl,  
> the empty
> list element is on the left, and all the parentheses group to the  
> left.
> With foldr, the zero value is on the right, and the parentheses  
> group to the
> right.
> ----
>
> Huh?  With foldl, the only empty list element I see is on the right.

Have a look at foldl.com and foldr.com. With "empty list element"  
they mean the 0.

> But then RWH explains that you would never use foldl in practice  
> because it
> thunks the result, which for large lists can overwhelm the maximum  
> memory
> alloted for a thunk.  But it appears to me the same thunk problem  
> would
> occur with foldr.  So why is foldr used in practice but not foldl?

The problem is that foldr can lazyly produce a result. Try foldr (:)  
[] [1..]. It works. Now try foldl (flip (:)) [] [1..]. It breaks.  
However foldl is tail recursive, so the compiler can optimize the  
recursion away. In some cases that is beneficial. Notice that there  
is no difference between

foldr g a
foldl f a

(for appropriate g and f) if g and f are strict in both arguments.  
Have a look at the "Foldl as foldr" wikipage. Also have a look at  
this paper

 > http://www.cs.nott.ac.uk/~gmh/fold.pdf

Regards,

Adrian

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 194 bytes
Desc: Signierter Teil der Nachricht
Url : http://www.haskell.org/pipermail/beginners/attachments/20090309/a5bb8304/PGP.bin
Reply | Threaded
Open this post in threaded view
|

Re: folds -- help!

7stud-2
In reply to this post by Daniel Fischer-4
Daniel Fischer <daniel.is.fischer <at> web.de> writes:

>
> Am Montag, 9. M?rz 2009 17:46 schrieb 7stud:
> > This is an example that shows how foldl and foldr work (from RWH p.93-94):
> >
> > foldl (+) 0 (1:2:3:[])
> >    == foldl (+) (0 + 1)             (2:3:[])
> >    == foldl (+) ((0 + 1) + 2)       (3:[])
> >    == foldl (+) (((0 + 1) + 2) + 3) []
> >    ==           (((0 + 1) + 2) + 3)
> >
> >
> > foldr (+) 0 (1:2:3:[])
> >    ==  1 +           foldr (+) 0 (2:3:[])
> >    ==  1 + (2 +      foldr (+) 0 (3:[])
> >    ==  1 + (2 + (3 + foldr (+) 0 []))
> >    ==  1 + (2 + (3 + 0))
> >
> > The book says on p.94:
> >
> > -----
> > The difference between foldl and foldr should be clear from looking at
> > where the parentheses and the empty list elements show up.  With foldl, the
> > empty list element is on the left, and all the parentheses group to the
> > left. With foldr, the zero value is on the right, and the parentheses group
> > to the right.
> > ----
> >
> > Huh?  With foldl, the only empty list element I see is on the right.
>
> What they meant was "the value that is the result in case the fold is applied
> to an empty list", in this case the 0, in the definition
>

So that's an error right? Or is that correct haskell terminology?

The book also says on p. 95:

-------------
Like foldl, foldr takes a function and a base case(what to do when the input
list is empty) as arguments.
-------------

That also does not seem correct.  For example:

foldrSum xs =  foldr accFunc 0 xs
    where accFunc x acc = acc + x

*Main> foldrSum [1, 2, 3]
6

In that example, the first two arguments to foldr are the function accFunc
and 0.  It does not seem accurate to say that  "0 is what to do when the
input list is empty".   What foldr  does when the input list is empty
is return the value of the acc parameter variable:

foldr _ acc [] = acc

In my example, the value of the acc parameter is 6 "when the input list is
empty"--not the value 0, which is the argument to foldr.


> fold(l/r) f z xs = ...

>
> the 'z'.
>
> >
> > Initially, it looked to me ike they did the same thing, and that the only
> > difference was the way they called step.  I think "step" is a horrible,
> > non-descriptive name, so I'm going to use "accFunc" instead:
> >
> > foldl calls: accFunc acc x
> >
> > foldr calls: accFunc x acc
> >
> > So it looks like you can define a function using either one and get the
> > same result.
>
> Note that in general the list elements and acc have different types, so only
> one of
> accFun acc x
> and
> accFun x acc
> typechecks.
>

I don't know how that comment is relevant.  In my examples, acc and x have
different types:

*Main> :type []
[] :: [a]

*Main> :type 1
1 :: (Num t) => t

And both examples work fine.

> If the types are the same, in general accFun x acc /= accFun acc x,
>

Is that correct?  Can you give some examples?  Here is what I tried:

1)
accFunc1 acc x = x + acc
accFunc2 x acc = x + acc

*Main> let x = 1
*Main> let acc = 3
*Main> accFunc1 acc x
4
*Main> accFunc1 x acc
4

2)
accFunc3 acc x = x ++ acc
accFunc4 x acc = x ++ acc

*Main> let x = [1]
*Main> let acc = [2, 3]
*Main> accFunc3 acc x
[1,2,3]
*Main> accFunc4 x acc
[1,2,3]


> so foldr and foldl give different results, too.
>

I think the results produced by my example functions myFilter1 and myFilter2
demonstrate that, but the differing results are because of the way foldr and
foldl are defined.



> >  Here is a test:
> >
> > --I am going to use odd for pfunc and [1, 2, 3] for xs:
> >
> > myFilter1 pfunc xs = foldl accFunc [] xs
> >     where accFunc acc x
> >
> >             | pfunc x       = acc ++ [x]
> >             | otherwise     = acc
> >
> > myFilter2 pfunc xs = foldr accFunc [] xs
> >     where accFunc x acc
> >
> >             | pfunc x       = acc ++ [x]
> >             | otherwise     = acc
> >
> > *Main> myFilter1 odd [1, 2, 3]
> > [1,3]
> > *Main> myFilter2 odd [1, 2, 3]
> > [3,1]
> >
> > Hmmm.  So there is a difference.  foldr appears to grab elements from
> > the end of the list.  Therefore, to get the same result from the function
> > that uses foldr, I did this:
> >
> >
> > myFilter3 pfunc xs = foldr accFunc [] xs
> >     where accFunc x acc
> >
> >             | pfunc x       = x : acc
> >             | otherwise     = acc
> >
> > *Main> myFilter3 odd [1, 2, 3]
> > [1,3]
> >
> > But then RWH explains that you would never use foldl in practice because it
> > thunks the result, which for large lists can overwhelm the maximum memory
> > alloted for a thunk.  But it appears to me the same thunk problem would
> > occur with foldr.  So why is foldr used in practice but not foldl?
> >
>
> Since with foldr, the parentheses are grouped to the right:
>
> if f can start delivering the result without looking at its second argument,
> you can start consuming the result before the fold has traversed the whole
> list.
>

Ok, that isn't clearly illustrated by the example in the book:

> > foldl (+) 0 (1:2:3:[])
> >    == foldl (+) (0 + 1)             (2:3:[])
> >    == foldl (+) ((0 + 1) + 2)       (3:[])
> >    == foldl (+) (((0 + 1) + 2) + 3) []
> >    ==           (((0 + 1) + 2) + 3)
> >
> >
> > foldr (+) 0 (1:2:3:[])
> >    ==  1 +           foldr (+) 0 (2:3:[])
> >    ==  1 + (2 +      foldr (+) 0 (3:[])
> >    ==  1 + (2 + (3 + foldr (+) 0 []))
> >    ==  1 + (2 + (3 + 0))
> >

In that example, it doesn't look like anything in foldr can be evaluated
until the whole fold has been completed.

> Common examples are things like
>
> concat = foldr (++) [],
> so
> concat [l1,l2,l3,l4,l5] = l1 ++ (foldr (++) [] [l2,l3,l4,l5])
> and the start (l1) can be used before further reducing the fold,
>

So does haskell store a thunk for everything to the right of l1?
You said that when using foldr you can start "consuming" the beginning of the
result before the whole result is reduced.  I don't quite get that.


> and = foldr (&&) True
>
> [to evaluate the expression] and [True,False,..........]
> [haskell] needs only inspect the list until it encounters the first False
>(if any), otherwise it must of course traverse the whole list
>
> or = foldr (||) False
>
> foldr is useful if the combination function is lazy in its second argument.
>

Ok.
 
> foldl on the other hand can't deliver anything before the whole list is
> consumed. So since foldl builds thunks (except in some easy cases where the
> optimiser sees it should be strict), which would have to be evaluated at the
> end when they've become rather large, foldl isn't as useful and one uses the
> strict left fold, foldl'.
>

Thanks.



Reply | Threaded
Open this post in threaded view
|

Re: folds -- help!

7stud-2
In reply to this post by Peter Verswyvelen-2
Peter Verswyvelen <bugfact <at> gmail.com> writes:
>
> Does this help?
>

Sorry, I'm only on chapter 4 of RWH, and I don't understand that notation.



Reply | Threaded
Open this post in threaded view
|

Re: folds -- help!

Kurt Hutchinson
In reply to this post by 7stud-2
On Tue, Mar 10, 2009 at 3:59 AM, 7stud <[hidden email]> wrote:

> -------------
> Like foldl, foldr takes a function and a base case(what to do when the input
> list is empty) as arguments.
> -------------
>
> That also does not seem correct. ?For example:
>
> foldrSum xs = ?foldr accFunc 0 xs
> ? ?where accFunc x acc = acc + x
>
> *Main> foldrSum [1, 2, 3]
> 6
>
> In that example, the first two arguments to foldr are the function accFunc
> and 0. ?It does not seem accurate to say that ?"0 is what to do when the
> input list is empty". ? What foldr ?does when the input list is empty
> is return the value of the acc parameter variable:
>

I'm not sure why they explain the base case for fold in that way. At
least to me, that is only a trivial result of the 'zero' value's main
purpose, which is to be the initial value used in the accumulating
fold function. When starting the fold, we have to start somewhere, so
the accFunc needs a seed value. That value, and the first element of
the list, are fed into accFunc to start things off. Then it just so
happens that if the list is empty, the seed value is the result, since
no folding can happen.

> foldr _ acc [] = acc
>
> In my example, the value of the acc parameter is 6 "when the input list is
> empty"--not the value 0, which is the argument to foldr.

You're thinking of a slightly different 'empty' here. You're thinking
of what happens when you reach the end of the list, after folding it
all, and there are no more elements to fold. In this example, you're
right that the acc parameter is 6. But what if the list you *first
gave to foldr* was empty? Then it would evaluate to 0, the initial
seed value.

Now you may be thinking, "Why would I ever apply foldr to an empty
list? Obviously that would do nothing." Well you may not know whether
a list is empty, if it's the result of other calculations.

You also may be thinking, "Why do I need to provide a seed value, why
can't foldr just start with the first two elements of the list?"
That's because the accFunc does not always evaluate to the same type
as the elements in the list. For example, you could use a fold to
count the number of 'a's in a list of characters. Then the type of
accFunc would be "accFunc :: Int -> Char -> Int" for foldl or "accFunc
:: Char -> Int -> Int" for foldr. It takes one element from the list,
the previous result, and evaluates to a new result. But in the very
first fold step, there's no previous result, so you have to provide
one.

It's called zero as a convention, but it doesn't actually have to *be*
zero. It can be any initial value you want. Make your foldSum function
only evaluate to results of 10 or larger by doing this:

foldSum xs = foldr  accFunc 10 xs
  where accFunc x acc = acc + x

[---snip---]

>> if f can start delivering the result without looking at its second argument,
>> you can start consuming the result before the fold has traversed the whole
>> list.
>>
>
> Ok, that isn't clearly illustrated by the example in the book:
>
>> > foldl (+) 0 (1:2:3:[])
>> > ? ?== foldl (+) (0 + 1) ? ? ? ? ? ? (2:3:[])
>> > ? ?== foldl (+) ((0 + 1) + 2) ? ? ? (3:[])
>> > ? ?== foldl (+) (((0 + 1) + 2) + 3) []
>> > ? ?== ? ? ? ? ? (((0 + 1) + 2) + 3)
>> >
>> >
>> > foldr (+) 0 (1:2:3:[])
>> > ? ?== ?1 + ? ? ? ? ? foldr (+) 0 (2:3:[])
>> > ? ?== ?1 + (2 + ? ? ?foldr (+) 0 (3:[])
>> > ? ?== ?1 + (2 + (3 + foldr (+) 0 []))
>> > ? ?== ?1 + (2 + (3 + 0))
>> >
>
> In that example, it doesn't look like anything in foldr can be evaluated
> until the whole fold has been completed.

You're right, that example doesn't show how you could start using the
result without fully evaluating the fold, since addition doesn't give
partial results. The concat example is better in that regard.

>> Common examples are things like
>>
>> concat = foldr (++) [],
>> so
>> concat [l1,l2,l3,l4,l5] = l1 ++ (foldr (++) [] [l2,l3,l4,l5])
>> and the start (l1) can be used before further reducing the fold,
>>
>
> So does haskell store a thunk for everything to the right of l1?
> You said that when using foldr you can start "consuming" the beginning of the
> result before the whole result is reduced. ?I don't quite get that.

A thunk is used as a stand-in for most calculations before the result
is actually calculated. That way, if you never try to use the result,
the calculation never needs to be done, and that means less work. As
an example that ties to the concat example above, say your program
only wanted to test if the result of the concat fold was an empty
list. The function 'null' takes a list and evaluates to True or False,
based on whether the list is empty or not. So:

someFunc xs = null ( concat xs )
  where
  concat ys = foldr (++) [] ys

The 'null' function only needs to test whether the list that is the
result of foldConcat has at least one element. Let's say l1 has an
element. So it's kind of evaluated like this:
  someFunc [ l1, l2, l3, l4, l5 ]
  null (concat [ l1, l2, l3, l4, l5 ] )
  null ( l1 ++ ( thunk with rest of fold ))
  False

The rest of the fold doesn't need to be evaluated, since the beginning
part is enough for 'null' to tell that the result would have at least
one element (because l1 does). That's one way foldr can be used to
start consuming the result before the entire fold is done. It depends
completely on the accFunc: if it can return partial results, like
concat, then you can start consuming the result before a full
evaluation. Some accFunc's can't return partial results, like regular
addition. In that case, it's probably better to use foldl' (note the
apostrophe), which will force the thunks to be evaluated as they are
generated and so use less memory. foldl is the same as foldl' except
it does generate thunks, and then evaluates them all at the end of the
fold, so it uses a bunch of memory to store the thunks in the
meantime, which usually isn't useful.

Kurt
Reply | Threaded
Open this post in threaded view
|

Re: folds -- help!

Heinrich Apfelmus
In reply to this post by 7stud-2
7stud wrote:

> This is an example that shows how foldl and foldr work (from RWH p.93-94):
>
> foldl (+) 0 (1:2:3:[])
>    == foldl (+) (0 + 1)             (2:3:[])
>    == foldl (+) ((0 + 1) + 2)       (3:[])
>    == foldl (+) (((0 + 1) + 2) + 3) []
>    ==           (((0 + 1) + 2) + 3)
>
>
> foldr (+) 0 (1:2:3:[])
>    ==  1 +           foldr (+) 0 (2:3:[])
>    ==  1 + (2 +      foldr (+) 0 (3:[])
>    ==  1 + (2 + (3 + foldr (+) 0 []))
>    ==  1 + (2 + (3 + 0))
>
> The book says on p.94:
>
> -----
> The difference between foldl and foldr should be clear from looking at where
> the parentheses and the empty list elements show up.  With foldl, the empty
> list element is on the left, and all the parentheses group to the left.
> With foldr, the zero value is on the right, and the parentheses group to the
> right.
> ----
>
> Huh?  With foldl, the only empty list element I see is on the right.

A fold like  foldr f z  is best understood as a function that replaces
each  (:)  with  f  and each  []  with  z . See also the diagrams on

  http://en.wikipedia.org/wiki/Fold_(higher-order_function)

>From this point of view, z "corresponds to the empty list".


> Initially, it looked to me like they did the same thing, and that the only
> difference was the way they called step.

They are only the same when the operation  f  is associative, i.e. if it
satisfies

  f x (f y z) = f (f x y) z


> But then RWH explains that you would never use foldl in practice because it
> thunks the result, which for large lists can overwhelm the maximum memory
> alloted for a thunk.  But it appears to me the same thunk problem would
> occur with foldr.  So why is foldr used in practice but not foldl?

See also

  http://en.wikibooks.org/wiki/Haskell/Performance_Introduction#Space


Regards,
apfelmus

--
http://apfelmus.nfshost.com

Reply | Threaded
Open this post in threaded view
|

Re: folds -- help!

7stud-2
In reply to this post by Kurt Hutchinson
Kurt Hutchinson <kelanslists <at> gmail.com> writes:
> You're thinking of a slightly different 'empty' here. You're thinking
> of what happens when you reach the end of the list, after folding it
> all, and there are no more elements to fold. In this example, you're
> right that the acc parameter is 6. But what if the list you *first
> gave to foldr* was empty? Then it would evaluate to 0, the initial
> seed value.
>

Ok.

>> if f can start delivering the result without looking at its second argument,
>> you can start consuming the result before the fold has traversed the whole
>> list.
> >>
> >
> > Ok, that isn't clearly illustrated by the example in the book:
> >
> >> > foldl (+) 0 (1:2:3:[])
> >> > ? ?== foldl (+) (0 + 1) ? ? ? ? ? ? (2:3:[])
> >> > ? ?== foldl (+) ((0 + 1) + 2) ? ? ? (3:[])
> >> > ? ?== foldl (+) (((0 + 1) + 2) + 3) []
> >> > ? ?== ? ? ? ? ? (((0 + 1) + 2) + 3)
> >> >
> >> >
> >> > foldr (+) 0 (1:2:3:[])
> >> > ? ?== ?1 + ? ? ? ? ? foldr (+) 0 (2:3:[])
> >> > ? ?== ?1 + (2 + ? ? ?foldr (+) 0 (3:[])
> >> > ? ?== ?1 + (2 + (3 + foldr (+) 0 []))
> >> > ? ?== ?1 + (2 + (3 + 0))
> >> >
> >
> > In that example, it doesn't look like anything in foldr can be evaluated
> > until the whole fold has been completed.
>
> You're right, that example doesn't show how you could start using the
> result without fully evaluating the fold, since addition doesn't give
> partial results. The concat example is better in that regard.
>

Ok. I guess I did understand something.  Therefore, I think
the example in the book that uses addition with foldl and foldr is TERRIBLE.  
I think the sections on folds in RWH are a catastrophe and
need to be rewritten.  The authors need to get rid of the "bit twiddling"
example and provide an example like concat to clearly show the
difference between foldl and foldr.  I think the example using foldl
and foldr with addition should be a secondary example to demonstrate
that sometimes it doesn't matter whether you use foldl or foldr.


> >> Common examples are things like
> >>
> >> concat = foldr (++) [],
> >> so
> >> concat [l1,l2,l3,l4,l5] = l1 ++ (foldr (++) [] [l2,l3,l4,l5])
> >> and the start (l1) can be used before further reducing the fold,
> >>
> >
> > So does haskell store a thunk for everything to the right of l1?
> > You said that when using foldr you can start "consuming" the beginning of
> > the before the whole result is reduced. ?I don't quite get that.
>
> A thunk is used as a stand-in for most calculations before the result
> is actually calculated. That way, if you never try to use the result,
> the calculation never needs to be done, and that means less work. As
> an example that ties to the concat example above, say your program
> only wanted to test if the result of the concat fold was an empty
> list. The function 'null' takes a list and evaluates to True or False,
> based on whether the list is empty or not. So:
>
> someFunc xs = null ( concat xs )
>   where
>   concat ys = foldr (++) [] ys
>
> The 'null' function only needs to test whether the list that is the
> result of foldConcat has at least one element. Let's say l1 has an
> element. So it's kind of evaluated like this:
>   someFunc [ l1, l2, l3, l4, l5 ]
>   null (concat [ l1, l2, l3, l4, l5 ] )
>   null ( l1 ++ ( thunk with rest of fold ))
>   False
>
> The rest of the fold doesn't need to be evaluated, since the beginning
> part is enough for 'null' to tell that the result would have at least
> one element (because l1 does). That's one way foldr can be used to
> start consuming the result before the entire fold is done.
>

Ok.  The terminology is just a little confusing for me. When you say
"start consuming before the fold is done" it sounds to me like
somehow you are reading partial results off the front of the result
before foldr returns the entire result.  But as you explained above
what you actually mean is that foldr will return the entire result--
with part of the result being a thunk, and subsequently you can do
operations on the result that may never require the thunked part to
be evaluated.


> It depends
> completely on the accFunc: if it can return partial results, like
> concat, then you can start consuming the result before a full
> evaluation. Some accFunc's can't return partial results, like regular
> addition. In that case, it's probably better to use foldl' (note the
> apostrophe), which will force the thunks to be evaluated as they are
> generated and so use less memory. foldl is the same as foldl' except
> [foldl] does generate thunks, and then evaluates them all at the end of the
> fold, so it uses a bunch of memory to store the thunks in the
> meantime, which usually isn't useful.
>
> Kurt
>

Thanks to you and Daniel for the great explanations.  I'm feeling a lot
better about folds now.


....
....
......
....
...
...
....
....
....
....
...
...
...
....

Reply | Threaded
Open this post in threaded view
|

folds -- help!

John Dorsey-2
In reply to this post by Adrian Neumann
Adrian Neumann wrote:

> Notice that there is no difference between
>
> foldr g a
> foldl f a
>
> (for appropriate g and f) if g and f are strict in both arguments.  

Be careful... as apfelmus noted elsewhere in this thread, that's not (in
general) true.

Prelude> foldr (^) 2 [3,5]
847288609443
Prelude> foldl (^) 2 [3,5]
32768

The reason?  Integer exponentiation (^) isn't associative and
commutative.  So the first is (3 ^ (5^2)) = 3^25, while the second is
((2 ^ 3) ^ 5) = 2^15.

Cheers,
John