Repair to floating point enumerations?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Repair to floating point enumerations?

Malcolm Wallace
Dear Haskell-Primers (and libraries).

Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
Prelude, regarding numeric enumerations for Floats/Doubles:

    Prelude> [0, 0.3 .. 1.1]
    [0.0,0.3,0.6,0.899999,1.2]

What is odd is that the last value of the list is much larger than the
specified termination value.  But this is exactly as specified by the
Haskell'98 Report.

    http://haskell.org/onlinereport/basic.html#sect6.3.4

    "For Float and Double, the semantics of the enumFrom family is given
    by the rules for Int above, except that the list terminates when the
    elements become greater than e3+i/2 for positive increment i, or
    when they become less than e3+i/2 for negative i.

We have discussed this question (and related ones, such as whether Float
and Double belong in the Enum class at all) several times before, and I
do not wish to rehash all of those points again e.g.:

    http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
    http://www.haskell.org/pipermail/haskell/2001-October/008218.html
    http://www.haskell.org/pipermail/haskell/2002-October/010607.html

Phil proposes that, although retaining the instances of Enum for Float
and Double, we simplify the definitions of the numericEnumFrom family:

  numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
  numericEnumFrom         =  iterate (+1)
  numericEnumFromThen n m =  iterate (+(m-n)) n
  numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
  numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)

The particular feature of note is that the odd fudge factor of (1/2 *
the increment) is removed.  The inexact nature of floating point numbers
would therefore cause a specification like

    [ 0.0, 0.1 .. 0.3 ]

to yield the sequence

    [ 0.0, 0.1, 0.2 ]

that is, to omit the upper bound, because (3 * 0.1) is actually
represented as 0.30000000000004, strictly greater than 0.3.

Phil argues that this behaviour is more desirable: "the simple fix is
that the user must add a suitable epsilon to the upper bound.  The key
word here is *suitable*.  The old definitions included completely
bizarre and often highly unsuitable choices of epsilon."

This proposal seems to me to improve the consistency of the enumeration
syntax across both the integral and floating types.  Some users may
still be surprised, but the surprise will be easier to explain.

I am bringing this question to the attention of all who are interested
in Haskell Prime, because it seems like a sensible and well-reasoned
change.  Discussion on whether to adopt this proposal for H' is welcome.

But as maintainer and bug-fixer of the Haskell'98 Report, I have also
been asked whether we should make this change retrospectively to the
Haskell'98 language (as a "typo").  Since it involves not merely an
ordinary library function, but a Prelude function, and moreover a
function that is used in the desugaring of syntax, it is less clear to
me whether to alter Haskell'98.

Thoughts?

Regards,
    Malcolm
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

RE: Repair to floating point enumerations?

Mitchell, Neil
Hi Malcolm,

The current behaviour does sound like a bug, and the revised behaviour
does sound like a fix - and one that has a sensible explanation if
people trip over it. In general having floating point be a member of
classes such as Eq has some obvious problems, but I realise is a
necessity for practical programming. Given that we have Eq Double, then
Enum Double seems no worse.

If we don't alter H98 then a valid program under H98 vs H' will give
different results without any warning - that seems really bad. In
addition, having two instances around for one typeclass/type pair in a
big library (base) which are switched with some flag seems like a
nightmare for compiler writers. So I think a good solution would be to
fix H98 as a typo, and include it in H'.

Thanks

Neil


> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Malcolm Wallace
> Sent: 15 October 2008 10:41 am
> To: [hidden email]
> Cc: [hidden email]
> Subject: Repair to floating point enumerations?
>
> Dear Haskell-Primers (and libraries).
>
> Recently, Phil Wadler has pointed out a weird anomaly in the
> Haskell'98 Prelude, regarding numeric enumerations for Floats/Doubles:
>
>     Prelude> [0, 0.3 .. 1.1]
>     [0.0,0.3,0.6,0.899999,1.2]
>
> What is odd is that the last value of the list is much larger
> than the specified termination value.  But this is exactly as
> specified by the
> Haskell'98 Report.
>
>     http://haskell.org/onlinereport/basic.html#sect6.3.4
>
>     "For Float and Double, the semantics of the enumFrom
> family is given
>     by the rules for Int above, except that the list
> terminates when the
>     elements become greater than e3+i/2 for positive increment i, or
>     when they become less than e3+i/2 for negative i.
>
> We have discussed this question (and related ones, such as
> whether Float and Double belong in the Enum class at all)
> several times before, and I do not wish to rehash all of
> those points again e.g.:
>
>     http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
>     http://www.haskell.org/pipermail/haskell/2001-October/008218.html
>     http://www.haskell.org/pipermail/haskell/2002-October/010607.html
>
> Phil proposes that, although retaining the instances of Enum
> for Float and Double, we simplify the definitions of the
> numericEnumFrom family:
>
>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a
> -> a -> [a]
>   numericEnumFrom         =  iterate (+1)
>   numericEnumFromThen n m =  iterate (+(m-n)) n
>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>   numericEnumFromThenTo n m p = takeWhile (<= p)
> (numericEnumFromThen n m)
>
> The particular feature of note is that the odd fudge factor
> of (1/2 * the increment) is removed.  The inexact nature of
> floating point numbers would therefore cause a specification like
>
>     [ 0.0, 0.1 .. 0.3 ]
>
> to yield the sequence
>
>     [ 0.0, 0.1, 0.2 ]
>
> that is, to omit the upper bound, because (3 * 0.1) is
> actually represented as 0.30000000000004, strictly greater than 0.3.
>
> Phil argues that this behaviour is more desirable: "the
> simple fix is that the user must add a suitable epsilon to
> the upper bound.  The key word here is *suitable*.  The old
> definitions included completely bizarre and often highly
> unsuitable choices of epsilon."
>
> This proposal seems to me to improve the consistency of the
> enumeration syntax across both the integral and floating
> types.  Some users may still be surprised, but the surprise
> will be easier to explain.
>
> I am bringing this question to the attention of all who are
> interested in Haskell Prime, because it seems like a sensible
> and well-reasoned change.  Discussion on whether to adopt
> this proposal for H' is welcome.
>
> But as maintainer and bug-fixer of the Haskell'98 Report, I
> have also been asked whether we should make this change
> retrospectively to the
> Haskell'98 language (as a "typo").  Since it involves not
> merely an ordinary library function, but a Prelude function,
> and moreover a function that is used in the desugaring of
> syntax, it is less clear to me whether to alter Haskell'98.
>
> Thoughts?
>
> Regards,
>     Malcolm
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries
>
>

==============================================================================
Please access the attached hyperlink for an important electronic communications disclaimer:

http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
==============================================================================

_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

David Roundy-2
In reply to this post by Malcolm Wallace
On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:

> Dear Haskell-Primers (and libraries).
>
> Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
> Prelude, regarding numeric enumerations for Floats/Doubles:
>
>     Prelude> [0, 0.3 .. 1.1]
>     [0.0,0.3,0.6,0.899999,1.2]
>
> What is odd is that the last value of the list is much larger than the
> specified termination value.  But this is exactly as specified by the
> Haskell'98 Report.
>
>     http://haskell.org/onlinereport/basic.html#sect6.3.4
>
>     "For Float and Double, the semantics of the enumFrom family is given
>     by the rules for Int above, except that the list terminates when the
>     elements become greater than e3+i/2 for positive increment i, or
>     when they become less than e3+i/2 for negative i.
>
> We have discussed this question (and related ones, such as whether Float
> and Double belong in the Enum class at all) several times before, and I
> do not wish to rehash all of those points again e.g.:
>
>     http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
>     http://www.haskell.org/pipermail/haskell/2001-October/008218.html
>     http://www.haskell.org/pipermail/haskell/2002-October/010607.html
>
> Phil proposes that, although retaining the instances of Enum for Float
> and Double, we simplify the definitions of the numericEnumFrom family:
>
>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
>   numericEnumFrom         =  iterate (+1)
>   numericEnumFromThen n m =  iterate (+(m-n)) n
>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>   numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
>
> The particular feature of note is that the odd fudge factor of (1/2 *
> the increment) is removed.  The inexact nature of floating point numbers
> would therefore cause a specification like
>
>     [ 0.0, 0.1 .. 0.3 ]
>
> to yield the sequence
>
>     [ 0.0, 0.1, 0.2 ]
>
> that is, to omit the upper bound, because (3 * 0.1) is actually
> represented as 0.30000000000004, strictly greater than 0.3.
>
> Phil argues that this behaviour is more desirable: "the simple fix is
> that the user must add a suitable epsilon to the upper bound.  The key
> word here is *suitable*.  The old definitions included completely
> bizarre and often highly unsuitable choices of epsilon."
>
> This proposal seems to me to improve the consistency of the enumeration
> syntax across both the integral and floating types.  Some users may
> still be surprised, but the surprise will be easier to explain.
>
> I am bringing this question to the attention of all who are interested
> in Haskell Prime, because it seems like a sensible and well-reasoned
> change.  Discussion on whether to adopt this proposal for H' is welcome.
>
> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
> been asked whether we should make this change retrospectively to the
> Haskell'98 language (as a "typo").  Since it involves not merely an
> ordinary library function, but a Prelude function, and moreover a
> function that is used in the desugaring of syntax, it is less clear to
> me whether to alter Haskell'98.
>
> Thoughts?

It sounds like a bad fix to me.  It seems important that the
[0.0,0.1.. x] notation should work correctly in the common cases.  And
the common case really is that the final value is intended as an exact
multiple of the increment.

Why not look for a heuristic that gets the common cases right, rather
than going with an elegant wrong solution? After all, these
enumerations are most often used by people who neither care nor know
how they're implemented, but who most likely would prefer if haskell
worked as well as matlab, python, etc.

One reasonable option would be to actually take into account the
expected roundoff error (which isn't hard to compute for simple sums
like this).

It would also be a good idea to reduce that roundoff error by using
multiplication rather than addition to create the enumeration (which
also allows us to ensure that finite enumerations terminate).
e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
Admittedly, this is a stupid thing to compute, but it's also stupid to
add many small numbers together in sequence, since it always serves to
increase the roundoff error.  It's true that most people's C code
would be just as naive, but we're writing Haskell, and you're talking
about the Prelude, which should be written intelligently, not using
the simplest code, in such a way that it won't bite programmers.

David
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

Lennart Augustsson
I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple
of 0.1 get exactly what they deserve if x is not included.  Floating
point numbers are not the real numbers, and the sooner they learn that
the better.  We can fudge this all we like, but 0.1 is never going to
be exactly representable as a binary floating point number no matter
what we do.

On Wed, Oct 15, 2008 at 3:44 PM, David Roundy <[hidden email]> wrote:

> On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
>> Dear Haskell-Primers (and libraries).
>>
>> Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
>> Prelude, regarding numeric enumerations for Floats/Doubles:
>>
>>     Prelude> [0, 0.3 .. 1.1]
>>     [0.0,0.3,0.6,0.899999,1.2]
>>
>> What is odd is that the last value of the list is much larger than the
>> specified termination value.  But this is exactly as specified by the
>> Haskell'98 Report.
>>
>>     http://haskell.org/onlinereport/basic.html#sect6.3.4
>>
>>     "For Float and Double, the semantics of the enumFrom family is given
>>     by the rules for Int above, except that the list terminates when the
>>     elements become greater than e3+i/2 for positive increment i, or
>>     when they become less than e3+i/2 for negative i.
>>
>> We have discussed this question (and related ones, such as whether Float
>> and Double belong in the Enum class at all) several times before, and I
>> do not wish to rehash all of those points again e.g.:
>>
>>     http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
>>     http://www.haskell.org/pipermail/haskell/2001-October/008218.html
>>     http://www.haskell.org/pipermail/haskell/2002-October/010607.html
>>
>> Phil proposes that, although retaining the instances of Enum for Float
>> and Double, we simplify the definitions of the numericEnumFrom family:
>>
>>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
>>   numericEnumFrom         =  iterate (+1)
>>   numericEnumFromThen n m =  iterate (+(m-n)) n
>>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>>   numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
>>
>> The particular feature of note is that the odd fudge factor of (1/2 *
>> the increment) is removed.  The inexact nature of floating point numbers
>> would therefore cause a specification like
>>
>>     [ 0.0, 0.1 .. 0.3 ]
>>
>> to yield the sequence
>>
>>     [ 0.0, 0.1, 0.2 ]
>>
>> that is, to omit the upper bound, because (3 * 0.1) is actually
>> represented as 0.30000000000004, strictly greater than 0.3.
>>
>> Phil argues that this behaviour is more desirable: "the simple fix is
>> that the user must add a suitable epsilon to the upper bound.  The key
>> word here is *suitable*.  The old definitions included completely
>> bizarre and often highly unsuitable choices of epsilon."
>>
>> This proposal seems to me to improve the consistency of the enumeration
>> syntax across both the integral and floating types.  Some users may
>> still be surprised, but the surprise will be easier to explain.
>>
>> I am bringing this question to the attention of all who are interested
>> in Haskell Prime, because it seems like a sensible and well-reasoned
>> change.  Discussion on whether to adopt this proposal for H' is welcome.
>>
>> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
>> been asked whether we should make this change retrospectively to the
>> Haskell'98 language (as a "typo").  Since it involves not merely an
>> ordinary library function, but a Prelude function, and moreover a
>> function that is used in the desugaring of syntax, it is less clear to
>> me whether to alter Haskell'98.
>>
>> Thoughts?
>
> It sounds like a bad fix to me.  It seems important that the
> [0.0,0.1.. x] notation should work correctly in the common cases.  And
> the common case really is that the final value is intended as an exact
> multiple of the increment.
>
> Why not look for a heuristic that gets the common cases right, rather
> than going with an elegant wrong solution? After all, these
> enumerations are most often used by people who neither care nor know
> how they're implemented, but who most likely would prefer if haskell
> worked as well as matlab, python, etc.
>
> One reasonable option would be to actually take into account the
> expected roundoff error (which isn't hard to compute for simple sums
> like this).
>
> It would also be a good idea to reduce that roundoff error by using
> multiplication rather than addition to create the enumeration (which
> also allows us to ensure that finite enumerations terminate).
> e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
> Admittedly, this is a stupid thing to compute, but it's also stupid to
> add many small numbers together in sequence, since it always serves to
> increase the roundoff error.  It's true that most people's C code
> would be just as naive, but we're writing Haskell, and you're talking
> about the Prelude, which should be written intelligently, not using
> the simplest code, in such a way that it won't bite programmers.
>
> David
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries
>
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

RE: Repair to floating point enumerations?

Duncan Coutts
In reply to this post by Mitchell, Neil
On Wed, 2008-10-15 at 11:25 +0100, Mitchell, Neil wrote:

> Hi Malcolm,
>
> The current behaviour does sound like a bug, and the revised behaviour
> does sound like a fix - and one that has a sensible explanation if
> people trip over it. In general having floating point be a member of
> classes such as Eq has some obvious problems, but I realise is a
> necessity for practical programming. Given that we have Eq Double, then
> Enum Double seems no worse.
>
> If we don't alter H98 then a valid program under H98 vs H' will give
> different results without any warning - that seems really bad. In
> addition, having two instances around for one typeclass/type pair in a
> big library (base) which are switched with some flag seems like a
> nightmare for compiler writers. So I think a good solution would be to
> fix H98 as a typo, and include it in H'.

I would take the contrary position and say H98 should be left alone and
the change should be proposed for H'.

The argument is that H98 is done and the revised report was published 7
years ago. Changing H98 now just doesn't seem to make much sense.

If we're talking about changing the meaning of 'valid' programs then
doing that at a boundary like H98 -> H' seems much more sensible than
having to explain the difference in H98-pre2008 programs vs
H98-post2008.

People will not expect all programs to be exactly the same between H98
and H' (otherwise there would be little need for a new standard). Yes H'
is mostly compatible extensions but not all of it.

Duncan

_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

David Roundy-2
In reply to this post by Lennart Augustsson
Here's a counter-proposal:

numericEnumFromThenTo   :: RealFloat a => a -> a -> a -> [a]
numericEnumFrom 0 = map fromInteger [1..]
numericEnumFrom n = map ((n+).fromInteger) [1..]
numericEnumFromThen n m =  map (\x -> n+fromInteger x*(m-n)) [1..]
numericEnumFromTo n m = takeWhile (<= m*(1 + epsilon)) (numericEnumFrom n)
numericEnumFromThenTo n m p = takeWhile (<= p*(1 + 2*epsilon)) (numericEnumFromThen n m)

epsilon :: RealFloat a => a
epsilon = 1/2^(floatDigits (undefined :: a))

This uses quite a reasonable approximation for the roundoff error, and
has the advantage of not inappropriately returning _|_.  It does
sometimes create duplicate entries in the list, but I think that is
better that returning an infinite list of duplicate entries as the
code proposed below does.

And yes, the fuzzy comparison is a bit ugly, but at least it means
that every user is not forced to implement fuzzy comparison in their
quick and dirty code (which is the only thing this syntax is good
for).

David

On Wed, Oct 15, 2008 at 03:55:09PM +0100, Lennart Augustsson wrote:

> I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple
> of 0.1 get exactly what they deserve if x is not included.  Floating
> point numbers are not the real numbers, and the sooner they learn that
> the better.  We can fudge this all we like, but 0.1 is never going to
> be exactly representable as a binary floating point number no matter
> what we do.
>
> On Wed, Oct 15, 2008 at 3:44 PM, David Roundy <[hidden email]> wrote:
> > On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
> >> Dear Haskell-Primers (and libraries).
> >>
> >> Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
> >> Prelude, regarding numeric enumerations for Floats/Doubles:
> >>
> >>     Prelude> [0, 0.3 .. 1.1]
> >>     [0.0,0.3,0.6,0.899999,1.2]
> >>
> >> What is odd is that the last value of the list is much larger than the
> >> specified termination value.  But this is exactly as specified by the
> >> Haskell'98 Report.
> >>
> >>     http://haskell.org/onlinereport/basic.html#sect6.3.4
> >>
> >>     "For Float and Double, the semantics of the enumFrom family is given
> >>     by the rules for Int above, except that the list terminates when the
> >>     elements become greater than e3+i/2 for positive increment i, or
> >>     when they become less than e3+i/2 for negative i.
> >>
> >> We have discussed this question (and related ones, such as whether Float
> >> and Double belong in the Enum class at all) several times before, and I
> >> do not wish to rehash all of those points again e.g.:
> >>
> >>     http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
> >>     http://www.haskell.org/pipermail/haskell/2001-October/008218.html
> >>     http://www.haskell.org/pipermail/haskell/2002-October/010607.html
> >>
> >> Phil proposes that, although retaining the instances of Enum for Float
> >> and Double, we simplify the definitions of the numericEnumFrom family:
> >>
> >>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
> >>   numericEnumFrom         =  iterate (+1)
> >>   numericEnumFromThen n m =  iterate (+(m-n)) n
> >>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
> >>   numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
> >>
> >> The particular feature of note is that the odd fudge factor of (1/2 *
> >> the increment) is removed.  The inexact nature of floating point numbers
> >> would therefore cause a specification like
> >>
> >>     [ 0.0, 0.1 .. 0.3 ]
> >>
> >> to yield the sequence
> >>
> >>     [ 0.0, 0.1, 0.2 ]
> >>
> >> that is, to omit the upper bound, because (3 * 0.1) is actually
> >> represented as 0.30000000000004, strictly greater than 0.3.
> >>
> >> Phil argues that this behaviour is more desirable: "the simple fix is
> >> that the user must add a suitable epsilon to the upper bound.  The key
> >> word here is *suitable*.  The old definitions included completely
> >> bizarre and often highly unsuitable choices of epsilon."
> >>
> >> This proposal seems to me to improve the consistency of the enumeration
> >> syntax across both the integral and floating types.  Some users may
> >> still be surprised, but the surprise will be easier to explain.
> >>
> >> I am bringing this question to the attention of all who are interested
> >> in Haskell Prime, because it seems like a sensible and well-reasoned
> >> change.  Discussion on whether to adopt this proposal for H' is welcome.
> >>
> >> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
> >> been asked whether we should make this change retrospectively to the
> >> Haskell'98 language (as a "typo").  Since it involves not merely an
> >> ordinary library function, but a Prelude function, and moreover a
> >> function that is used in the desugaring of syntax, it is less clear to
> >> me whether to alter Haskell'98.
> >>
> >> Thoughts?
> >
> > It sounds like a bad fix to me.  It seems important that the
> > [0.0,0.1.. x] notation should work correctly in the common cases.  And
> > the common case really is that the final value is intended as an exact
> > multiple of the increment.
> >
> > Why not look for a heuristic that gets the common cases right, rather
> > than going with an elegant wrong solution? After all, these
> > enumerations are most often used by people who neither care nor know
> > how they're implemented, but who most likely would prefer if haskell
> > worked as well as matlab, python, etc.
> >
> > One reasonable option would be to actually take into account the
> > expected roundoff error (which isn't hard to compute for simple sums
> > like this).
> >
> > It would also be a good idea to reduce that roundoff error by using
> > multiplication rather than addition to create the enumeration (which
> > also allows us to ensure that finite enumerations terminate).
> > e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
> > Admittedly, this is a stupid thing to compute, but it's also stupid to
> > add many small numbers together in sequence, since it always serves to
> > increase the roundoff error.  It's true that most people's C code
> > would be just as naive, but we're writing Haskell, and you're talking
> > about the Prelude, which should be written intelligently, not using
> > the simplest code, in such a way that it won't bite programmers.
> >
> > David
> > _______________________________________________
> > Libraries mailing list
> > [hidden email]
> > http://www.haskell.org/mailman/listinfo/libraries
> >
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

Christopher Lane Hinson

I agree with David, we should be using multiplication, not addition.
However, I think that under the law of least surprise, we should
require that for all a,b,z:

all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].

For example, anything in the neighborhood of this is just unfair, even
if it's within David's fudge factor:

Prelude> map (\x -> 1 / (x-0.6)) [0,0.1..0.55]
[-1.6666666666666667,-2.0,-2.5,-3.333333333333334,-5.000000000000001,-10.000000000000002,Infinity]

If I want to include the terminating value, then what I really want is
probably some f such that:

f (6 :: Int) 0 0.55 = [0,0.11,0.22,0.33,0.44,0.55]

--Lane



On Wed, 15 Oct 2008, David Roundy wrote:

> Here's a counter-proposal:
>
> numericEnumFromThenTo   :: RealFloat a => a -> a -> a -> [a]
> numericEnumFrom 0 = map fromInteger [1..]
> numericEnumFrom n = map ((n+).fromInteger) [1..]
> numericEnumFromThen n m =  map (\x -> n+fromInteger x*(m-n)) [1..]
> numericEnumFromTo n m = takeWhile (<= m*(1 + epsilon)) (numericEnumFrom n)
> numericEnumFromThenTo n m p = takeWhile (<= p*(1 + 2*epsilon)) (numericEnumFromThen n m)
>
> epsilon :: RealFloat a => a
> epsilon = 1/2^(floatDigits (undefined :: a))
>
> This uses quite a reasonable approximation for the roundoff error, and
> has the advantage of not inappropriately returning _|_.  It does
> sometimes create duplicate entries in the list, but I think that is
> better that returning an infinite list of duplicate entries as the
> code proposed below does.
>
> And yes, the fuzzy comparison is a bit ugly, but at least it means
> that every user is not forced to implement fuzzy comparison in their
> quick and dirty code (which is the only thing this syntax is good
> for).
>
> David
>
> On Wed, Oct 15, 2008 at 03:55:09PM +0100, Lennart Augustsson wrote:
>> I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple
>> of 0.1 get exactly what they deserve if x is not included.  Floating
>> point numbers are not the real numbers, and the sooner they learn that
>> the better.  We can fudge this all we like, but 0.1 is never going to
>> be exactly representable as a binary floating point number no matter
>> what we do.
>>
>> On Wed, Oct 15, 2008 at 3:44 PM, David Roundy <[hidden email]> wrote:
>>> On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
>>>> Dear Haskell-Primers (and libraries).
>>>>
>>>> Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
>>>> Prelude, regarding numeric enumerations for Floats/Doubles:
>>>>
>>>>     Prelude> [0, 0.3 .. 1.1]
>>>>     [0.0,0.3,0.6,0.899999,1.2]
>>>>
>>>> What is odd is that the last value of the list is much larger than the
>>>> specified termination value.  But this is exactly as specified by the
>>>> Haskell'98 Report.
>>>>
>>>>     http://haskell.org/onlinereport/basic.html#sect6.3.4
>>>>
>>>>     "For Float and Double, the semantics of the enumFrom family is given
>>>>     by the rules for Int above, except that the list terminates when the
>>>>     elements become greater than e3+i/2 for positive increment i, or
>>>>     when they become less than e3+i/2 for negative i.
>>>>
>>>> We have discussed this question (and related ones, such as whether Float
>>>> and Double belong in the Enum class at all) several times before, and I
>>>> do not wish to rehash all of those points again e.g.:
>>>>
>>>>     http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
>>>>     http://www.haskell.org/pipermail/haskell/2001-October/008218.html
>>>>     http://www.haskell.org/pipermail/haskell/2002-October/010607.html
>>>>
>>>> Phil proposes that, although retaining the instances of Enum for Float
>>>> and Double, we simplify the definitions of the numericEnumFrom family:
>>>>
>>>>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
>>>>   numericEnumFrom         =  iterate (+1)
>>>>   numericEnumFromThen n m =  iterate (+(m-n)) n
>>>>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>>>>   numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
>>>>
>>>> The particular feature of note is that the odd fudge factor of (1/2 *
>>>> the increment) is removed.  The inexact nature of floating point numbers
>>>> would therefore cause a specification like
>>>>
>>>>     [ 0.0, 0.1 .. 0.3 ]
>>>>
>>>> to yield the sequence
>>>>
>>>>     [ 0.0, 0.1, 0.2 ]
>>>>
>>>> that is, to omit the upper bound, because (3 * 0.1) is actually
>>>> represented as 0.30000000000004, strictly greater than 0.3.
>>>>
>>>> Phil argues that this behaviour is more desirable: "the simple fix is
>>>> that the user must add a suitable epsilon to the upper bound.  The key
>>>> word here is *suitable*.  The old definitions included completely
>>>> bizarre and often highly unsuitable choices of epsilon."
>>>>
>>>> This proposal seems to me to improve the consistency of the enumeration
>>>> syntax across both the integral and floating types.  Some users may
>>>> still be surprised, but the surprise will be easier to explain.
>>>>
>>>> I am bringing this question to the attention of all who are interested
>>>> in Haskell Prime, because it seems like a sensible and well-reasoned
>>>> change.  Discussion on whether to adopt this proposal for H' is welcome.
>>>>
>>>> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
>>>> been asked whether we should make this change retrospectively to the
>>>> Haskell'98 language (as a "typo").  Since it involves not merely an
>>>> ordinary library function, but a Prelude function, and moreover a
>>>> function that is used in the desugaring of syntax, it is less clear to
>>>> me whether to alter Haskell'98.
>>>>
>>>> Thoughts?
>>>
>>> It sounds like a bad fix to me.  It seems important that the
>>> [0.0,0.1.. x] notation should work correctly in the common cases.  And
>>> the common case really is that the final value is intended as an exact
>>> multiple of the increment.
>>>
>>> Why not look for a heuristic that gets the common cases right, rather
>>> than going with an elegant wrong solution? After all, these
>>> enumerations are most often used by people who neither care nor know
>>> how they're implemented, but who most likely would prefer if haskell
>>> worked as well as matlab, python, etc.
>>>
>>> One reasonable option would be to actually take into account the
>>> expected roundoff error (which isn't hard to compute for simple sums
>>> like this).
>>>
>>> It would also be a good idea to reduce that roundoff error by using
>>> multiplication rather than addition to create the enumeration (which
>>> also allows us to ensure that finite enumerations terminate).
>>> e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
>>> Admittedly, this is a stupid thing to compute, but it's also stupid to
>>> add many small numbers together in sequence, since it always serves to
>>> increase the roundoff error.  It's true that most people's C code
>>> would be just as naive, but we're writing Haskell, and you're talking
>>> about the Prelude, which should be written intelligently, not using
>>> the simplest code, in such a way that it won't bite programmers.
>>>
>>> David
>>> _______________________________________________
>>> Libraries mailing list
>>> [hidden email]
>>> http://www.haskell.org/mailman/listinfo/libraries
>>>
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://www.haskell.org/mailman/listinfo/libraries
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries
>
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

Isaac Dupree
Christopher Lane Hinson wrote:
>
> I agree with David, we should be using multiplication, not addition.
> However, I think that under the law of least surprise, we should
> require that for all a,b,z:
>
> all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].

so that [0,0.1..0.3] doesn't include the terminating value
that's a little more than the literal 0.3?

> For example, anything in the neighborhood of this is just unfair, even
> if it's within David's fudge factor:
>
> Prelude> map (\x -> 1 / (x-0.6)) [0,0.1..0.55]
> [-1.6666666666666667,-2.0,-2.5,-3.333333333333334,-5.000000000000001,-10.000000000000002,Infinity]

but that's a significant fudge, 0.5 versus 0.55 versus 0.6
-- right?

-Isaac
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

David Roundy-2
On Thu, Oct 16, 2008 at 05:31:50PM -0400, Isaac Dupree wrote:

> Christopher Lane Hinson wrote:
>>
>> I agree with David, we should be using multiplication, not addition.
>> However, I think that under the law of least surprise, we should
>> require that for all a,b,z:
>>
>> all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].
>
> so that [0,0.1..0.3] doesn't include the terminating value that's a
> little more than the literal 0.3?

We could do what octave appears to do, which is to set anything within
the fudge factor (but greater than the upper bound) equal to the upper
bound.  That seems like a reasonable option to me.

David
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

Jon Fairbairn
David Roundy <[hidden email]> writes:

> On Thu, Oct 16, 2008 at 05:31:50PM -0400, Isaac Dupree wrote:
>> Christopher Lane Hinson wrote:
>>>
>>> I agree with David, we should be using multiplication, not addition.
>>> However, I think that under the law of least surprise, we should
>>> require that for all a,b,z:
>>>
>>> all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].
>>
>> so that [0,0.1..0.3] doesn't include the terminating value that's a
>> little more than the literal 0.3?
>
> We could do what octave appears to do, which is to set anything within
> the fudge factor (but greater than the upper bound) equal to the upper
> bound.  That seems like a reasonable option to me.

I've not been following this closely, but I've not seen a
clear definition of what this is all for.  Without that,
we're going to see a succession of fudges that will work for
some notions of what the notation means and fail for others.

Assuming z>=a, must the resulting list include a? include z
exactly when?  If a or z is a literal, does include here
mean "include the value represented by the literal" or some
other nearby value?  Must the step (l!!(i+1)-l!!i) always be
the same? Must it equal (b-a)? (I can even imagine that for
some applications, a list that looked like this: [0.0,
1.0e-2, 2.0e-2, 3.0e-2, 4.0e-2, 5.0e-2,
6.0000000000000005e-2, 7.0e-2, 8.0e-2, 9.0e-2,
9.999999999999999e-2, 0.10999999999999999, 0.11] would be
preferable to one that included only one of the last two
elements).

Whatever is decided for floats, the definition must be
consistent with that for discrete quantities. I'm not sure
this is possible. Since floating point numbers are full of
surprises for anyone who is used to real numbers (and even a
few for people who are quite familiar with floats), the
principle of least surprise would surely indicate no
instance of Enum for floats.

--
Jón Fairbairn                                 [hidden email]


_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

Roman Leshchinskiy
In reply to this post by Malcolm Wallace
On 15/10/2008, at 20:41, Malcolm Wallace wrote:

> Phil proposes that, although retaining the instances of Enum for Float
> and Double, we simplify the definitions of the numericEnumFrom family:
>
>  numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a ->  
> [a]
>  numericEnumFrom         =  iterate (+1)
>  numericEnumFromThen n m =  iterate (+(m-n)) n
>  numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>  numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen  
> n m)


I'd like to raise the following two points in this context.

Firstly, an array library which attempts to provide reasonable  
counterparts to the list functions would want to define array versions  
of enumFromTo and enumFromThenTo. To be efficient, it must be able to  
determine the expected number of elements reasonably fast (in constant  
time).

Secondly, a parallel array library (such as the one provided by DPH)  
also needs to be able to generate, say, the first half of the array on  
one processor and the second half on another. This requires enumFromTo  
to obey some form of the following law for some f and g:

   enumFromTo m n = enumFromTo m (f m n) ++ enumFromTo (g m n) n

It might make sense to try to provide this for Float and Double for  
the sake of consistency with future libraries. I'm not sure how easy  
it is to do, though.

Roman


_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

Simon Marlow-7
In reply to this post by Malcolm Wallace
Malcolm Wallace wrote:

> Phil proposes that, although retaining the instances of Enum for Float
> and Double, we simplify the definitions of the numericEnumFrom family:
>
>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
>   numericEnumFrom         =  iterate (+1)
>   numericEnumFromThen n m =  iterate (+(m-n)) n
>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>   numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)

I'll leave it to the floating-point experts to decide exactly what to do
here for Haskell' (but I note that David Roundy's version looks better than
the iterate version above, because the errors won't accumulate).

> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
> been asked whether we should make this change retrospectively to the
> Haskell'98 language (as a "typo").  Since it involves not merely an
> ordinary library function, but a Prelude function, and moreover a
> function that is used in the desugaring of syntax, it is less clear to
> me whether to alter Haskell'98.

We definitely can't make breaking changes to Haskell 98; this would be much
more than a "typo".

However, this does give us a problem if we decide to make a change here for
H', as Neil points out, because there would be two mutually-incompatible
instances for Enum Float.  We'd need to have a clear distinction between
programs that are Haskell 98 and those that are not, with only the former
allowed to use the haskell98 package.

Cheers,
        Simon

_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime
Reply | Threaded
Open this post in threaded view
|

Re: Repair to floating point enumerations?

Neil Mitchell
Hi

Bringing up an old thread, but this issue just bit me in real life:

[0, 2.5 .. 3.75 ] = [0, 2.5, 5.0]

I was comparative testing some C++ and some Haskell, and its a shame
when its the Haskell that is clearly wrong! I don't think any end
decision was reached in this thread, but I think this is sufficiently
incorrect to call the Haskell 98 behaviour a bug.

Thanks

Neil

On Wed, Oct 22, 2008 at 10:11 AM, Simon Marlow <[hidden email]> wrote:

> Malcolm Wallace wrote:
>
>> Phil proposes that, although retaining the instances of Enum for Float
>> and Double, we simplify the definitions of the numericEnumFrom family:
>>
>>  numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
>>  numericEnumFrom         =  iterate (+1)
>>  numericEnumFromThen n m =  iterate (+(m-n)) n
>>  numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>>  numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
>
> I'll leave it to the floating-point experts to decide exactly what to do
> here for Haskell' (but I note that David Roundy's version looks better than
> the iterate version above, because the errors won't accumulate).
>
>> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
>> been asked whether we should make this change retrospectively to the
>> Haskell'98 language (as a "typo").  Since it involves not merely an
>> ordinary library function, but a Prelude function, and moreover a
>> function that is used in the desugaring of syntax, it is less clear to
>> me whether to alter Haskell'98.
>
> We definitely can't make breaking changes to Haskell 98; this would be much
> more than a "typo".
>
> However, this does give us a problem if we decide to make a change here for
> H', as Neil points out, because there would be two mutually-incompatible
> instances for Enum Float.  We'd need to have a clear distinction between
> programs that are Haskell 98 and those that are not, with only the former
> allowed to use the haskell98 package.
>
> Cheers,
>        Simon
>
> _______________________________________________
> Haskell-prime mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-prime
>
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime