learning lazy evaluation

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

learning lazy evaluation

Dimitri DeFigueiredo
I found an interesting situation while making recursive calls that I am
trying to understand. I re-wrote the 'minimum' function from the prelude
as such:

minimum_bug :: (Ord a) => [a] -> a
minimum_bug  [x] = x
minimum_bug  (x:xs) | x > minimum_bug xs = minimum_bug xs
                     | otherwise          = x

(I understand I should be using something like "minbug xs = foldr1 min
xs" for this, but bare with me)

The interesting thing is that the function has exponential time
behavior. In a way, that makes sense as I am making two recursive calls.
But I was expecting GHC to be smart enough to transform it into
something like:

minimum_bug' :: (Ord a) => [a] -> a
minimum_bug'  [x] = x
minimum_bug'  (x:xs) | x > y      = y
                      | otherwise  = x
                         where y = minimum_bug' xs

(This one always works as expected)

I understand that lazy evaluation implies memoization in some cases.
When does GHC only use memoization to avoid this kind of behavior?

Another interesting fact is that in my simple tests the exponential
behavior does NOT occur when I pass the function a list in sorted order.

main = do putStrLn "started - in order list"
           putStrLn $ show $ minimum_bug [1..30000]  -- no problem
           putStrLn "started - out of order list"
           putStrLn $ show $ minimum_bug [27,26..1]  -- don't do this
with large numbers!
           putStrLn "done!"

It's not clear to me how the sorted list is able to escape the
exponential slow down.


Cheers,


Dimitri


Reply | Threaded
Open this post in threaded view
|

learning lazy evaluation

Bob Ippolito
On Mon, Oct 20, 2014 at 11:57 AM, Dimitri DeFigueiredo <
defigueiredo at ucdavis.edu> wrote:

> I found an interesting situation while making recursive calls that I am
> trying to understand. I re-wrote the 'minimum' function from the prelude as
> such:
>
> minimum_bug :: (Ord a) => [a] -> a
> minimum_bug  [x] = x
> minimum_bug  (x:xs) | x > minimum_bug xs = minimum_bug xs
>                     | otherwise          = x
>
> (I understand I should be using something like "minbug xs = foldr1 min xs"
> for this, but bare with me)
>
> The interesting thing is that the function has exponential time behavior.
> In a way, that makes sense as I am making two recursive calls. But I was
> expecting GHC to be smart enough to transform it into something like:
>
> minimum_bug' :: (Ord a) => [a] -> a
> minimum_bug'  [x] = x
> minimum_bug'  (x:xs) | x > y      = y
>                      | otherwise  = x
>                         where y = minimum_bug' xs
>
> (This one always works as expected)
>
> I understand that lazy evaluation implies memoization in some cases.
> When does GHC only use memoization to avoid this kind of behavior?
>

I assume you're trying this with ghci or compiling without optimizations.
In these scenarios GHC isn't doing anything clever at all, so there will be
two separate calls to minimum_bug in each recursion. Using a variable
ensures that this value is explicitly shared between the guard and the
result in the list.

This section of Parallel and Concurrent Programming in Haskell helped me
understand Haskell's non-strict evaluation model:
http://chimera.labs.oreilly.com/books/1230000000929/ch02.html#sec_par-eval-whnf

Another interesting fact is that in my simple tests the exponential

> behavior does NOT occur when I pass the function a list in sorted order.
>
> main = do putStrLn "started - in order list"
>           putStrLn $ show $ minimum_bug [1..30000]  -- no problem
>           putStrLn "started - out of order list"
>           putStrLn $ show $ minimum_bug [27,26..1]  -- don't do this with
> large numbers!
>           putStrLn "done!"
>
> It's not clear to me how the sorted list is able to escape the exponential
> slow down.
>

Because displaying the value doesn't have any recursion to it, since it's
`x` and not `minumum_bug x`.

You can walk through the execution.

-- Original call
minimum_bug [1, 2]
-- Expand the first matching pattern
minimum_bug (1:[2]) | 1 > minimum_bug [2]
-- minimum_bug [2] is forced due to the guard
-- Evaluate minimum_bug [2]
minimum_bug [2]
-> 2
-- Step back to the guard we were trying to evaluate
minimum_bug (1:[2]) | 1 > 2
1 > 2
-> False
-- Guard fails, go to the otherwise case
  | otherwise = 1
-> 1

The `1` is already in normal form, so `putStrLn . show` doesn't have to do
any extra reductions.

If you walk through with `minimum_bug [2, 1]` you'll end up with
`minimum_bug [1]` as the result, which is not in normal form and thus
requires additional evaluation when `putStrLn . show` attempts to display
it. If you use a larger list and walk through, you can see that the amount
of redundant evaluation explodes when the list is in descending order.

-bob
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20141020/c5562420/attachment.html>

Reply | Threaded
Open this post in threaded view
|

learning lazy evaluation

Dimitri DeFigueiredo
Thanks Bob.

The (lack of) exponential explosion makes sense now. And, yes, I was
using both ghci and ghc 7.6.3 without any flags.

When memoization takes place is still a little murky to me, though.
Figure 2.2 in that chapter of Simon Marlow's book has the kind of
"aliasing" with the number '1' (two pointers pointing to the same
location) that I thought would also happen with 'minimum_bug xs' , but I
guess 1::Int is a special case that is different from other terms
without optimization or the depiction is not accurate.

I have a further question on that is troubling me. It seems that
constructors are treated differently from other expressions when it
comes to evaluation. The examples in the book show that:

Prelude> let x = 1 + 2 :: Int
Prelude> :sprint x
x = _


In other words, x is just an unevaluated thunk.
But at the same time:

Prelude> let x = 1 + 2 :: Int
Prelude> let z = (x,x)

Prelude> :sprint z
z = (_,_)


So, 'z' has been evaluated to WHNF.
I was expecting to get:

Prelude> :sprint z
z = _


Meaning, an unevaluated expression. Just like we did with 'x', but it
seems z has already been partially evaluated. Am I getting this right?
Are expressions with constructors evaluated differently? Why is :sprint
z different from 'z = _' in this case?


Thanks again,


Dimitri

On 20/10/14 13:21, Bob Ippolito wrote:

>
>
> On Mon, Oct 20, 2014 at 11:57 AM, Dimitri DeFigueiredo
> <defigueiredo at ucdavis.edu <mailto:defigueiredo at ucdavis.edu>> wrote:
>
>     I found an interesting situation while making recursive calls that
>     I am trying to understand. I re-wrote the 'minimum' function from
>     the prelude as such:
>
>     minimum_bug :: (Ord a) => [a] -> a
>     minimum_bug  [x] = x
>     minimum_bug  (x:xs) | x > minimum_bug xs = minimum_bug xs
>                         | otherwise          = x
>
>     (I understand I should be using something like "minbug xs = foldr1
>     min xs" for this, but bare with me)
>
>     The interesting thing is that the function has exponential time
>     behavior. In a way, that makes sense as I am making two recursive
>     calls. But I was expecting GHC to be smart enough to transform it
>     into something like:
>
>     minimum_bug' :: (Ord a) => [a] -> a
>     minimum_bug'  [x] = x
>     minimum_bug'  (x:xs) | x > y      = y
>                          | otherwise  = x
>                             where y = minimum_bug' xs
>
>     (This one always works as expected)
>
>     I understand that lazy evaluation implies memoization in some cases.
>     When does GHC only use memoization to avoid this kind of behavior?
>
>
> I assume you're trying this with ghci or compiling without
> optimizations. In these scenarios GHC isn't doing anything clever at
> all, so there will be two separate calls to minimum_bug in each
> recursion. Using a variable ensures that this value is explicitly
> shared between the guard and the result in the list.
>
> This section of Parallel and Concurrent Programming in Haskell helped
> me understand Haskell's non-strict evaluation model:
> http://chimera.labs.oreilly.com/books/1230000000929/ch02.html#sec_par-eval-whnf
>
>     Another interesting fact is that in my simple tests the
>     exponential behavior does NOT occur when I pass the function a
>     list in sorted order.
>
>     main = do putStrLn "started - in order list"
>               putStrLn $ show $ minimum_bug [1..30000]  -- no problem
>               putStrLn "started - out of order list"
>               putStrLn $ show $ minimum_bug [27,26..1]  -- don't do
>     this with large numbers!
>               putStrLn "done!"
>
>     It's not clear to me how the sorted list is able to escape the
>     exponential slow down.
>
>
> Because displaying the value doesn't have any recursion to it, since
> it's `x` and not `minumum_bug x`.
>
> You can walk through the execution.
>
> -- Original call
> minimum_bug [1, 2]
> -- Expand the first matching pattern
> minimum_bug (1:[2]) | 1 > minimum_bug [2]
> -- minimum_bug [2] is forced due to the guard
> -- Evaluate minimum_bug [2]
> minimum_bug [2]
> -> 2
> -- Step back to the guard we were trying to evaluate
> minimum_bug (1:[2]) | 1 > 2
> 1 > 2
> -> False
> -- Guard fails, go to the otherwise case
>   | otherwise = 1
> -> 1
>
> The `1` is already in normal form, so `putStrLn . show` doesn't have
> to do any extra reductions.
>
> If you walk through with `minimum_bug [2, 1]` you'll end up with
> `minimum_bug [1]` as the result, which is not in normal form and thus
> requires additional evaluation when `putStrLn . show` attempts to
> display it. If you use a larger list and walk through, you can see
> that the amount of redundant evaluation explodes when the list is in
> descending order.
>
> -bob
>
>
>
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20141020/818c61af/attachment-0001.html>

Reply | Threaded
Open this post in threaded view
|

learning lazy evaluation

Bob Ippolito
That is an interesting observation. Haskell is defined to have non-strict
evaluation, which basically means the evaluation must happen from the
outside in. It does not guarantee laziness, so even in interpreted settings
it doesn't *have* to create a thunk. What you're seeing is simply an
implementation detail of GHC that's consistent with the Haskell language
but inconsistent with the expectation that it should be maximally lazy.

On Mon, Oct 20, 2014 at 2:18 PM, Dimitri DeFigueiredo <
defigueiredo at ucdavis.edu> wrote:

>  Thanks Bob.
>
> The (lack of) exponential explosion makes sense now. And, yes, I was using
> both ghci and ghc 7.6.3 without any flags.
>
> When memoization takes place is still a little murky to me, though.
> Figure 2.2 in that chapter of Simon Marlow's book has the kind of
> "aliasing" with the number '1' (two pointers pointing to the same location)
> that I thought would also happen with 'minimum_bug xs' , but I guess 1::Int
> is a special case that is different from other terms without optimization
> or the depiction is not accurate.
>
> I have a further question on that is troubling me. It seems that
> constructors are treated differently from other expressions when it comes
> to evaluation. The examples in the book show that:
>
> Prelude> let x = 1 + 2 :: Int
> Prelude> :sprint x
> x = _
>
>
> In other words, x is just an unevaluated thunk.
> But at the same time:
>
> Prelude> let x = 1 + 2 :: Int
> Prelude> let z = (x,x)
>
> Prelude> :sprint z
> z = (_,_)
>
>
> So, 'z' has been evaluated to WHNF.
> I was expecting to get:
>
> Prelude> :sprint z
> z = _
>
>
> Meaning, an unevaluated expression. Just like we did with 'x', but it
> seems z has already been partially evaluated. Am I getting this right? Are
> expressions with constructors evaluated differently? Why is :sprint z
> different from 'z = _' in this case?
>
>
> Thanks again,
>
>
> Dimitri
>
>
> On 20/10/14 13:21, Bob Ippolito wrote:
>
>
>
> On Mon, Oct 20, 2014 at 11:57 AM, Dimitri DeFigueiredo <
> defigueiredo at ucdavis.edu> wrote:
>
>> I found an interesting situation while making recursive calls that I am
>> trying to understand. I re-wrote the 'minimum' function from the prelude as
>> such:
>>
>> minimum_bug :: (Ord a) => [a] -> a
>> minimum_bug  [x] = x
>> minimum_bug  (x:xs) | x > minimum_bug xs = minimum_bug xs
>>                     | otherwise          = x
>>
>> (I understand I should be using something like "minbug xs = foldr1 min
>> xs" for this, but bare with me)
>>
>> The interesting thing is that the function has exponential time behavior.
>> In a way, that makes sense as I am making two recursive calls. But I was
>> expecting GHC to be smart enough to transform it into something like:
>>
>> minimum_bug' :: (Ord a) => [a] -> a
>> minimum_bug'  [x] = x
>> minimum_bug'  (x:xs) | x > y      = y
>>                      | otherwise  = x
>>                         where y = minimum_bug' xs
>>
>> (This one always works as expected)
>>
>> I understand that lazy evaluation implies memoization in some cases.
>> When does GHC only use memoization to avoid this kind of behavior?
>>
>
>  I assume you're trying this with ghci or compiling without
> optimizations. In these scenarios GHC isn't doing anything clever at all,
> so there will be two separate calls to minimum_bug in each recursion. Using
> a variable ensures that this value is explicitly shared between the guard
> and the result in the list.
>
>  This section of Parallel and Concurrent Programming in Haskell helped me
> understand Haskell's non-strict evaluation model:
> http://chimera.labs.oreilly.com/books/1230000000929/ch02.html#sec_par-eval-whnf
>
>  Another interesting fact is that in my simple tests the exponential
>> behavior does NOT occur when I pass the function a list in sorted order.
>>
>> main = do putStrLn "started - in order list"
>>           putStrLn $ show $ minimum_bug [1..30000]  -- no problem
>>           putStrLn "started - out of order list"
>>           putStrLn $ show $ minimum_bug [27,26..1]  -- don't do this with
>> large numbers!
>>           putStrLn "done!"
>>
>> It's not clear to me how the sorted list is able to escape the
>> exponential slow down.
>>
>
>  Because displaying the value doesn't have any recursion to it, since
> it's `x` and not `minumum_bug x`.
>
>  You can walk through the execution.
>
>  -- Original call
> minimum_bug [1, 2]
> -- Expand the first matching pattern
> minimum_bug (1:[2]) | 1 > minimum_bug [2]
> -- minimum_bug [2] is forced due to the guard
> -- Evaluate minimum_bug [2]
> minimum_bug [2]
> -> 2
> -- Step back to the guard we were trying to evaluate
> minimum_bug (1:[2]) | 1 > 2
> 1 > 2
> -> False
> -- Guard fails, go to the otherwise case
>   | otherwise = 1
> -> 1
>
>  The `1` is already in normal form, so `putStrLn . show` doesn't have to
> do any extra reductions.
>
>  If you walk through with `minimum_bug [2, 1]` you'll end up with
> `minimum_bug [1]` as the result, which is not in normal form and thus
> requires additional evaluation when `putStrLn . show` attempts to display
> it. If you use a larger list and walk through, you can see that the amount
> of redundant evaluation explodes when the list is in descending order.
>
>  -bob
>
>
>
>
> _______________________________________________
> Beginners mailing listBeginners at haskell.orghttp://www.haskell.org/mailman/listinfo/beginners
>
>
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20141021/d2c5ba23/attachment.html>