nubBy seems broken in recent GHCs

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

nubBy seems broken in recent GHCs

Cale Gibbard
According to the Report:

  nubBy            :: (a -> a -> Bool) -> [a] -> [a]
  nubBy eq []      =  []
  nubBy eq (x:xs)  =  x : nubBy eq (filter (\y -> not (eq x y)) xs)

Hence, we should have that

nubBy (<) (1:2:[])
= 1 : nubBy (<) (filter (\y -> not (1 < y)) (2:[]))
= 1 : nubBy (<) []
= 1 : []

However in ghc-6.10.3:

Prelude Data.List> nubBy (<) [1,2]
[1,2]

The order of the parameters to the function which is passed to nubBy
is *important* since the function might not be an equivalence
relation. nubBy is more generally useful for sieving even when the
relation is not symmetric. groupBy, for a similar reason, has
applications for grouping beyond those provided by equivalence
relations, and I think we should be able to rely on its behaviour.

Moreover, I contend that the Report's order is more sensible, since
the parameters to the relation stay in the left-to-right order in
which they occurred in the list. It would be good if this could get
fixed for 6.10.4.

 - Cale

PS: There is actually another implementation of groupBy which would
*also* be useful to have, but which is exactly the same for
equivalence relations as groupBy, and it compares adjacent elements
rather than the first element of a group with successive ones.
(groupByAdjacent or something would be a reasonable name.)

An implementation of it can be found here:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=5595
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: [Haskell-cafe] nubBy seems broken in recent GHCs

Cale Gibbard
2009/6/6 Bertram Felgenhauer <[hidden email]>:

> Interesting. This was changed in response to
>
>    http://hackage.haskell.org/trac/ghc/ticket/2528
>
> | Tue Sep  2 11:29:50 CEST 2008  Simon Marlow <[hidden email]>
> |   * #2528: reverse the order of args to (==) in nubBy to match nub
> |   This only makes a difference when the (==) definition is not
> |   reflexive, but strictly speaking it does violate the report definition
> |   of nubBy, so we should fix it.
>
> It turns out that 'elem' differs from the report version and should
> have its comparison reversed. Of course that would only ever matter
> for broken Eq instances.
>
> However, the report also states that the nubBy function may assume that
> the given predicate defines an equivalence relation.
>
>    http://haskell.org/onlinereport/list.html#sect17.6
>
> So I'm not sure there's anything to be fixed here - although backing
> out the above patch probably won't hurt anybody.
>
> Bertram

Yeah, while most Eq instances really do define an equivalence relation
(at least extensionally), and the Report authors probably thought in
terms of an equivalence relation (even going so far as to say that
nubBy can assume its parameter is one, which I think was a mistake), I
think nubBy has a much more general use. It does a generalised kind of
sieving, specifically,

nubBy f xs is the unique subsequence of xs that:
1) Has the property that if x and y are elements such that x occurs
before y in it, then f x y is False.
2) The sequence of indices of selected elements is lexicographically
minimum for all subsequences satisfying condition 1. (That is, it
always picks the earliest elements possible.)

I think this is how it ought to be specified.

Similarly, groupBy f xs is (and should be) the unique list of
contiguous sublists of xs such that:
1) concat (groupBy f xs) = xs
2) If x is the head of any of the sublists and y is any other element
of that sublist, then f x y
3) The sequence of lengths of the sublists is lexicographically
maximum for all lists satisfying the first two properties (That is, it
always prefers adding elements to an earlier group to starting a new
group.)

 - Cale
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: [Haskell-cafe] nubBy seems broken in recent GHCs

Duncan Coutts
In reply to this post by Cale Gibbard
On Sat, 2009-06-06 at 18:39 +0200, Bertram Felgenhauer wrote:

> Interesting. This was changed in response to
>
>     http://hackage.haskell.org/trac/ghc/ticket/2528
>
> | Tue Sep  2 11:29:50 CEST 2008  Simon Marlow <[hidden email]>
> |   * #2528: reverse the order of args to (==) in nubBy to match nub
> |   This only makes a difference when the (==) definition is not
> |   reflexive, but strictly speaking it does violate the report definition
> |   of nubBy, so we should fix it.
>
> It turns out that 'elem' differs from the report version and should
> have its comparison reversed. Of course that would only ever matter
> for broken Eq instances.
>
> However, the report also states that the nubBy function may assume that
> the given predicate defines an equivalence relation.
>
>     http://haskell.org/onlinereport/list.html#sect17.6
>
> So I'm not sure there's anything to be fixed here - although backing
> out the above patch probably won't hurt anybody.

Seems to me the obvious solution is to revert the nubBy change and then
fix nub so that so that we still get nub = nubBy (==) (which is what
ticket #2528 was complaining about in the first place).

We need more SmallCheck properties for the List module! When Don and I
were testing our stream versions of the List module we uncovered several
of these little weirdnesses which were underspecified in the report, or
different between the report and common implementations. For example I
seem to recall that genericTake and take do not coincide when their
types coincide.

Duncan

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: [Haskell-cafe] nubBy seems broken in recent GHCs

Henning Thielemann
In reply to this post by Cale Gibbard

On Tue, 9 Jun 2009, Cale Gibbard wrote:

> Similarly, groupBy f xs is (and should be) the unique list of
> contiguous sublists of xs such that:
> 1) concat (groupBy f xs) = xs
> 2) If x is the head of any of the sublists and y is any other element
> of that sublist, then f x y
> 3) The sequence of lengths of the sublists is lexicographically
> maximum for all lists satisfying the first two properties (That is, it
> always prefers adding elements to an earlier group to starting a new
> group.)

groupBy defined this way was found to be inappropriate for many cases,
like grouping a list into increasing sequences using groupBy (<=). Other
cases can be found in Haskell-Cafe or [hidden email].
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries