[Proposal] Strict `sum` and `product`

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: [Proposal] Strict `sum` and `product`

Emily Pillmore
Nice, thanks Chessai


On Mon, Oct 26, 2020 at 9:16 PM, chessai <[hidden email]> wrote:
Emily, I already put up an MR: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4355

On Mon, Oct 26, 2020, 20:14 Emily Pillmore <[hidden email]> wrote:
Just to keep this going so it doesn't die out. I'm going to request that Hécate submit a PR to address the composite criticism here, implementing `sum|product` in terms of `foldr`, and providing a strict equivalent `sum'|product'` variant for each as discussed. We'll move the discussion to that PR. 

Cheers,
Emily 


On Thu, Oct 22, 2020 at 7:05 AM, Fumiaki Kinoshita <[hidden email]> wrote:
For a forced example, consider a wrapping of Double with a test
for infinity on the left argument


 a + b | isInfinite a = a


This makes sense only if sum is foldr (+) 0, while the implementation of sum for lists is foldl (+) 0. But hey, we are comparing a hypothetical type with hundreds of real world uses. 

Haskell should remain a lazy language
I'm not sure if I agree. We've been bitten by laziness too much, hence the introduction of BangPatterns, StrictData, deepseq, etc. IMO Haskell ecosystem should be more strict.

I'm uncomfortable with making foldMap/sum strict.

No one is suggesting to change foldMap; what OP wants to change is sum and product, and foldMap _happens to be_ the current default implementation.

> we dont knowing the context. the best way to avoid the cost of computation is not to compute it; what alternatives are there
i this s[ecific problem.

The context is pretty clear; the status quo (sum = foldl (+) 0) is flat-out wrong. I'd say more than 80% of the uses of sum is supposed to look at all the elements. And even if (+) is short-circuiting in the way Jon suggested, lazy foldl does not take any advantage of it.

if one was to make foldMap/sum strict why not also other functions, e.g. scanAccum, where does one stop on the grounds of possibly saving costs in so as yet
unknown application
I'm not sure which function you are talking about (there's no function called scanAccum in base nor containers). If you mean scanl, lazy accumulator still makes sense because the resulting list can be consumed incrementally. foldl is not.

Instead of thinking/programming  in a pure lazy language, you are then in the position of thinking lazily apart from some set of
to be determined exceptional cases. the simplicity and uniformity of the model is gone.
Again, there's little to be gained from lazy foldl. It's avoided in general and I'd say it's not "uniform" to use the function.

4. As per item 3 but from a pedagogical point of view. I taught UG  FP for years. Complicating the model is not going to help students,
to say nothing of the caveats that would have to be added to existing teaching materials.
Also from a pedagogical point of view, a number of beginners fall into a trap of lazy foldl. Caveats should be added to the current one if at all, but this danger zone can be removed with one byte change.

I'm seriously worried that the use of foldl in the standard library might make beginners think that it's fine to use foldl.

As far as I remember, the same change has been suggested a while ago but didn't make it. I hope we don't get discouraged this time and go ahead with the change.

2020年10月22日(木) 19:19 David Duke <[hidden email]>:
Concur with Jon's point. 
I'm uncomfortable with making foldMap/sum strict.
Haskell was intended to explore a region of the programming design space that had not been well covered. and it remains
if not unique  then at least exemplary in that respect. When  I write in Haskell I do so with the knowledge that I am working with
lazy evaluation with the benefits and risk that comes with that. If I'm desperate to have finer operational control, I  either
swallow that pill and adapt my use of Haskell accordingly, or pick up a different tool from the programming language toolbox.

Now the OP pointed to an issue with one specific use of sum. However, that seems like weak grounds to make library changes:
1. we dont knowing the context. the best way to avoid the cost of computation is not to compute it; what alternatives are there
i this s[ecific problem.
2. if one was to make foldMap/sum strict why not also other functions, e.g. scanAccum, where does one stop on the grounds of possibly saving costs in so as yet
unknown application
3. Instead of thinking/programming  in a pure lazy language, you are then in the position of thinking lazily apart from some set of
to be determined exceptional cases. the simplicity and uniformity of the model is gone.
4. As per item 3 but from a pedagogical point of view. I taught UG  FP for years. Complicating the model is not going to help students,
to say nothing of the caveats that would have to be added to existing teaching materials.
So whilst  the post was clearly intended to trigger discussion, in terms on conveying sentiment a strict  foldMap would be a big '-'
for me personally. Whilst there are performance challenges in working  with Haskell. The primary role of the programming notation
is allowing one to think about the problem and obtain a clear, correct solution. Performance tuning comes later if at all, and
her are known techniques for doing so  with Haskell/ghc.
5. OTOH I see little   harm in adding a library that provides a strict foldMap for those
who want to use it after  deliberate thought.

YMMMocV
David







On Thu, Oct 22, 2020 at 10:36 AM Jon Fairbairn <[hidden email]> wrote:
Sylvain Henry <[hidden email]> writes:

> Do we have an actual example of the use of a lazy sum in the
> wild?

That’s not a very good argument, though I’ve seen it used quite
a lot.  The problem is that just because no-one can currently
think of a “real” example it doesn’t mean that there are no
cases where laziness is required.

For a forced example, consider a wrapping of Double with a test
for infinity on the left argument


 a + b | isInfinite a = a


which doesn’t evaluate it’s second argument if the first is
infinite.

> If the most common case is the strict one, making `sum`
> strict would be a better default.

Haskell should remain a lazy language, so the defaults should be
lazy unless they can be proved strict anyway.

> If need be we could also provide `lazySum` or something, but
> is there really a need?

Who can truly say? I’d support sum' as the strict version as
that is consistent with other usages.

  — Jón

>
> Sylvain
>
>
> On 18/10/2020 22:16, Oleg Grenrus wrote:
>>
>> Sorry I wasn't clear myself. Option C is "add sum'"
>>
>> For lists:
>>
>>     sum  = foldr  (+) 0
>>     sum' = foldl' (+) 0
>>
>> For Foldable
>>
>>     sum  = getSum #. foldMap  Sum
>>     sum' = getSum #. foldMap' Sum
>>
>> - Oleg
>>
>> On 18.10.2020 23.11, Hécate wrote:
>>>
>>> Indeed, and I initially went to suggest a
>>> foldMap'`-based implementation to keep with the current
>>> implementation of many Foldable functions that are based
>>> on `foldMap` rather than a raw `fold`.
>>>
>>> On 18/10/2020 22:04, Oleg Grenrus wrote:
>>>> For the sake of bigger audience I didn't bother
>>>> mentioning #. which is a coercion helper. It's
>>>> essentially better (.) when the first argument is
>>>> newtype constructor (i.e. coerce).
>>>>
>>>> So with Option A (strict):
>>>>
>>>>     sum = getSum #. foldMap' Sum
>>>>
>>>> Or Option B (lazy)
>>>>
>>>>     sum = getSum #. foldMap Sum
>>>>
>>>> ---
>>>>
>>>> There is also third option, Option C:
>>>>
>>>>     sum  = foldr  (+) 0
>>>>     sum' = foldl' (+) 0
>>>>
>>>> I don't think this is worthwhile, but it is an option.
>>>> (to rehash, I don't consider maintaining status quo to
>>>> be an option at all).
>>>>
>>>> - Oleg
>>>> On 18.10.2020 22.54, Vanessa McHale wrote:
>>>>>
>>>>> It's
>>>>>
>>>>> sum = getSum #. foldMap Sum
>>>>>
>>>>> in base.
>>>>>
>>>>> On 10/18/20 2:49 PM, Oleg Grenrus wrote:
>>>>>>
>>>>>> The problem is the current definition of sum for lists
>>>>>> which uses foldl, i.e non-strict left fold
>>>>>>
>>>>>>     sum = foldl (+) 0
>>>>>>
>>>>>> It's utterly broken. Either we should change it to
>>>>>> foldl' to work on some types where addition is strict,
>>>>>> Option A:
>>>>>>
>>>>>>     sum = foldl' (+) 0
>>>>>>
>>>>>> or alternatively (to make people using lazy
>>>>>> accumulator types), Option B:
>>>>>>
>>>>>>     sum = foldr (+) 0
>>>>>>
>>>>>> The current state is no good for anyone or anything.
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> Related issue which Hecate didn't clearly mention, is
>>>>>> that Foldable class default implementation has
>>>>>>
>>>>>>    class Foldable f where
>>>>>>        ...
>>>>>>        sum = getSum . foldMap Sum -- this is "good" lazy definition
>>>>>>
>>>>>> If we select option A, then I argue that for
>>>>>> consistency the default `Foldable.sum` should be
>>>>>>
>>>>>>        sum = getSum . foldMap' Sum -- strict foldMap'
>>>>>>
>>>>>> If we select option B, Foldable definition doesn't need to be changed.
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> I repeat, using non-strict left fold, foldl, for sum
>>>>>> and product is not good for anything.
>>>>>> Either foldr or foldl'.
>>>>>>
>>>>>> I have no strong preference. Current state is unacceptable.
>>>>>>
>>>>>> -  Oleg
>>>>>>
>>>>>> On 18.10.2020 22.24, Henning Thielemann wrote:
>>>>>>>
>>>>>>> On Sun, 18 Oct 2020, Hécate wrote:
>>>>>>>
>>>>>>>> In conclusion, leaving things to the optimiser that
>>>>>>>> could be trivially made fast every time seems
>>>>>>>> needlessly risky.
>>>>>>>
>>>>>>> `seq` is still a hack. A strict 'sum' and 'product'
>>>>>>> would still fail on a lazy accumulator type, say a
>>>>>>> lazy pair type. If at all, sum and product should be
>>>>>>> deepseq-strict. So currently, letting the optimiser
>>>>>>> make a lazy sum strict is still the smaller hack.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Libraries mailing list
>>>>>>> [hidden email]
>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>
>>>>>> _______________________________________________
>>>>>> Libraries mailing list
>>>>>> [hidden email]
>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>
>>>>> _______________________________________________
>>>>> Libraries mailing list
>>>>> [hidden email]
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>
>>>> _______________________________________________
>>>> Libraries mailing list
>>>> [hidden email]
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>> --
>>> Hécate ✨
>>> IRC: Uniaika
>>> WWW:https://glitchbra.in
>>> RUN: BSD
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> [hidden email]
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

--
Jón Fairbairn                                 [hidden email]
http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated 2014-04-05)

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


--
David Duke
Emeritus Professor of Computer Science
School of Computing University of Leeds UK
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
12