Proposal: add ordNub somewhere in containers

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: add ordNub somewhere in containers

Henning Thielemann

On Wed, 18 Oct 2017, Gershom B wrote:

> Vis a vis NonEmpty, base provides a partial `fromList`, much as I
> dislike it. Hence, the proper "roll your own" is `fromList . ordNub .
> toList`.

My idea of NonEmpty is to prove totality by implementation. I'd prefer a
custom obviously total implementation of NonEmpty.ordNub.
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: add ordNub somewhere in containers

Andreas Abel-2
In reply to this post by David Feuer
Your current proposal sounds fine, I think it is safe to go ahead.

 > ordNubBy :: (a -> a -> Ordering) -> [a] -> [a]

can be taken at a second step, if there is no consensus now.

Personally, I would include it.

On 18.10.2017 19:49, David Feuer wrote:

> I am convinced that we should add
>
> ordNub :: Ord a => [a] -> [a]
> ordNubOn :: Ord b => (a -> b) -> [a] -> [b]
> intNub :: [Int] -> [Int]
> intNubOn :: (a -> Int) -> [a] -> [a]
>
> And because nub preserves non-emptiness, I believe we should also offer
>
> ordNub1 :: Ord a => NonEmpty a -> NonEmpty a
> ordNubOn1 :: Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
> intNub1 :: NonEmpty Int -> NonEmpty Int
> intNubOn1 :: (a -> Int) -> NonEmpty a -> NonEmpty a
>
> I imagine we should also add these operations for Data.Sequence.Seq.
>
> I'm not yet convinced that we should add
>
> ordNubBy :: (a -> a -> Ordering) -> [a] -> [a]
>
> but I'm open to further discussion of that question. My main concern is
> that the properties of the comparison argument require careful
> documentation. In its favor, using it improperly cannot *expose* a
> broken Set to later operations.
>
> I would very much like to hear further bikeshedding around names and
> namespaces.
>
> On Oct 16, 2017 6:18 PM, "Gershom B" <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     There have been many discussions over the years about adding an
>     efficient order preserving nub somewhere to our core libraries. It
>     always comes down to the same issue: an efficient nub wants to be
>     backed by an efficient `Set`, but the API of the `nub` itself doesn't
>     make reference to any other data structures besides lists. So it feels
>     a bit conceptually strange to put an efficient nub anywhere besides
>     `Data.List` even though it can't go there without inverting our
>     dependency tree in a weird way or inlining an efficient set
>     implementation into the middle of it.
>
>     Nonetheless, the convenience of having a good `nub` lying around in a
>     core library is undeniable, and after writing the "usual" one in my
>     code for the zillionth time, I decided to raise an issue about it:
>
>     https://github.com/haskell/containers/issues/439
>     <https://github.com/haskell/containers/issues/439>
>
>     I was promptly directed here to make a proper proposal.
>
>     So, here:
>
>     1) I propose two new functions,
>
>     `ordNub` and `intNub`
>
>     with the standard implementation (from
>     https://github.com/nh2/haskell-ordnub
>     <https://github.com/nh2/haskell-ordnub>):
>
>     import qualified Data.Set as Set
>
>     ordNub :: (Ord a) => [a] -> [a]
>     ordNub l = go Set.empty l
>        where
>          go _ [] = []
>          go s (x:xs) = if x `Set.member` s then go s xs
>                                            else x : go (Set.insert x s) xs
>
>     and the same implementation, but specialized to `Int` and using
>     `IntSet`s.
>
>     The rationale for the names is that the former has a long history of
>     use in folklore, and the latter is the obvious specialization of it.
>
>     2) I propose these functions be added to a new module in the
>     `containers` library: `Data.Containers.ListUtils`. This can also
>     potentially in the future add efficient list intersection, etc. as
>     documented on the above reference link.
>
>     The rationale for the new module is that it can provide a meaningful
>     home for such functions which operate on lists, but require other data
>     structures to be implemented efficiently...
>
>     Discussion period: 2 weeks.
>
>     --Gershom
>     _______________________________________________
>     Libraries mailing list
>     [hidden email] <mailto:[hidden email]>
>     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>     <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>
>
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>


--
Andreas Abel  <><      Du bist der geliebte Mensch.

Department of Computer Science and Engineering
Chalmers and Gothenburg University, Sweden

[hidden email]
http://www.cse.chalmers.se/~abela/
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: add nubOrd somewhere in containers

Joachim Breitner-2
In reply to this post by David Feuer
Hi,

Am Mittwoch, den 18.10.2017, 13:49 -0400 schrieb David Feuer:
> I am convinced that we should add
>
> ordNub :: Ord a => [a] -> [a]
> ordNubOn :: Ord b => (a -> b) -> [a] -> [b]
> intNub :: [Int] -> [Int]
> intNubOn :: (a -> Int) -> [a] -> [a]

can we shed the names a bit? I think the operation “nubbing” should be
first, and the details (”…On”, “…By”, “…OrdOn”, “Ord”, “Int”) should be
the second component. This would also have the advantage of listing all
”nub” variants together in an index.

So:

nubOrd   :: Ord a => [a] -> [a]
nubOrdOn :: Ord b => (a -> b) -> [a] -> [b]
nubInt   :: [Int] -> [Int]
nubIntOn :: (a -> Int) -> [a] -> [a]

This is consistent with, say “map” vs. “mapM” and other suffix-based
naming in the libraries.


Greetings,
Joachim
--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: add nubOrd somewhere in containers

Elliot Cameron-2
That's also consistent with the commonly used "extra" package: https://hackage.haskell.org/package/extra-1.6/docs/Data-List-Extra.html#v:nubOrd

On Fri, Oct 20, 2017 at 10:14 AM, Joachim Breitner <[hidden email]> wrote:
Hi,

Am Mittwoch, den 18.10.2017, 13:49 -0400 schrieb David Feuer:
> I am convinced that we should add
>
> ordNub :: Ord a => [a] -> [a]
> ordNubOn :: Ord b => (a -> b) -> [a] -> [b]
> intNub :: [Int] -> [Int]
> intNubOn :: (a -> Int) -> [a] -> [a]

can we shed the names a bit? I think the operation “nubbing” should be
first, and the details (”…On”, “…By”, “…OrdOn”, “Ord”, “Int”) should be
the second component. This would also have the advantage of listing all
”nub” variants together in an index.

So:

nubOrd   :: Ord a => [a] -> [a]
nubOrdOn :: Ord b => (a -> b) -> [a] -> [b]
nubInt   :: [Int] -> [Int]
nubIntOn :: (a -> Int) -> [a] -> [a]

This is consistent with, say “map” vs. “mapM” and other suffix-based
naming in the libraries.


Greetings,
Joachim
--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: add ordNub somewhere in containers

Gershom Bazerman
In reply to this post by David Feuer
The discussion period is well over and it seems the consensus is for
adding this, basically as David proposes.  -- i.e. ordNub, and
ordNubOn, but not ordNubBy. (and also the -1 variants and the int-
variants). I think the only module name proposed to add them to is
Data.Containers.ListUtils, so I guess that stands.

David -- are you up for adding these, or would you like someone else
(i.e. me) to prepare a patch?

--Gershom


On Wed, Oct 18, 2017 at 1:49 PM, David Feuer <[hidden email]> wrote:

> I am convinced that we should add
>
> ordNub :: Ord a => [a] -> [a]
> ordNubOn :: Ord b => (a -> b) -> [a] -> [b]
> intNub :: [Int] -> [Int]
> intNubOn :: (a -> Int) -> [a] -> [a]
>
> And because nub preserves non-emptiness, I believe we should also offer
>
> ordNub1 :: Ord a => NonEmpty a -> NonEmpty a
> ordNubOn1 :: Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
> intNub1 :: NonEmpty Int -> NonEmpty Int
> intNubOn1 :: (a -> Int) -> NonEmpty a -> NonEmpty a
>
> I imagine we should also add these operations for Data.Sequence.Seq.
>
> I'm not yet convinced that we should add
>
> ordNubBy :: (a -> a -> Ordering) -> [a] -> [a]
>
> but I'm open to further discussion of that question. My main concern is that
> the properties of the comparison argument require careful documentation. In
> its favor, using it improperly cannot *expose* a broken Set to later
> operations.
>
> I would very much like to hear further bikeshedding around names and
> namespaces.
>
> On Oct 16, 2017 6:18 PM, "Gershom B" <[hidden email]> wrote:
>>
>> There have been many discussions over the years about adding an
>> efficient order preserving nub somewhere to our core libraries. It
>> always comes down to the same issue: an efficient nub wants to be
>> backed by an efficient `Set`, but the API of the `nub` itself doesn't
>> make reference to any other data structures besides lists. So it feels
>> a bit conceptually strange to put an efficient nub anywhere besides
>> `Data.List` even though it can't go there without inverting our
>> dependency tree in a weird way or inlining an efficient set
>> implementation into the middle of it.
>>
>> Nonetheless, the convenience of having a good `nub` lying around in a
>> core library is undeniable, and after writing the "usual" one in my
>> code for the zillionth time, I decided to raise an issue about it:
>>
>> https://github.com/haskell/containers/issues/439
>>
>> I was promptly directed here to make a proper proposal.
>>
>> So, here:
>>
>> 1) I propose two new functions,
>>
>> `ordNub` and `intNub`
>>
>> with the standard implementation (from
>> https://github.com/nh2/haskell-ordnub):
>>
>> import qualified Data.Set as Set
>>
>> ordNub :: (Ord a) => [a] -> [a]
>> ordNub l = go Set.empty l
>>   where
>>     go _ [] = []
>>     go s (x:xs) = if x `Set.member` s then go s xs
>>                                       else x : go (Set.insert x s) xs
>>
>> and the same implementation, but specialized to `Int` and using `IntSet`s.
>>
>> The rationale for the names is that the former has a long history of
>> use in folklore, and the latter is the obvious specialization of it.
>>
>> 2) I propose these functions be added to a new module in the
>> `containers` library: `Data.Containers.ListUtils`. This can also
>> potentially in the future add efficient list intersection, etc. as
>> documented on the above reference link.
>>
>> The rationale for the new module is that it can provide a meaningful
>> home for such functions which operate on lists, but require other data
>> structures to be implemented efficiently...
>>
>> Discussion period: 2 weeks.
>>
>> --Gershom
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: add ordNub somewhere in containers

Henning Thielemann

On Mon, 13 Nov 2017, Gershom B wrote:

> The discussion period is well over and it seems the consensus is for
> adding this, basically as David proposes.  -- i.e. ordNub, and
> ordNubOn, but not ordNubBy. (and also the -1 variants and the int-
> variants). I think the only module name proposed to add them to is
> Data.Containers.ListUtils, so I guess that stands.

I think there should be some more thoughts on a good module name. E.g. use
of plural (Utils vs. Util) - singular makes more sense in qualified
identifiers, use of abbreviations (Util vs. Utility) - we have 'import ...
as' for abbreviations, consistency with other data structures - where
would we put ordNub for Seq, in Data.Container.SequenceUtility or in
Data.Sequence?
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
12