Proposal for containers: Add 'lookup' function to Data.Set

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Thomas
Well, "intern" includes a "and if it is not there, add it".  Calling
this "intern" without that behavior would be highly misleading.  That
said, maybe we just want an "intern" function?  It would save us a dip
into the set, for new strings and leave (marginally) less room to make
a mistake.  Are there many cases where we want `lookup` where we don't
actually want `intern`?

On Tue, Jun 28, 2016 at 2:22 AM, Chris Wong <[hidden email]> wrote:

> Python uses "intern", so perhaps that can serve as the name.
>
> (See https://docs.python.org/2/library/functions.html#intern)
>
> On Tue, Jun 28, 2016 at 9:47 AM, David Feuer <[hidden email]> wrote:
>> +1 on the function. -1/2 on the name.
>>
>> On Jun 27, 2016 5:45 PM, "Nicolas Godbout" <[hidden email]>
>> wrote:
>>>
>>>
>>> WHAT
>>>
>>> It is proposed to add a ‘lookup' function on 'Set' in the "containers"
>>> package. Feedback during the next two weeks is welcome.
>>>
>>> The function
>>>
>>> > lookup :: Ord a => a -> Set a -> Maybe a
>>>
>>> is almost indentical to the 'member' function but, in addition, returns
>>> the value
>>> stored in the set.
>>>
>>> WHY
>>>
>>> The point of this proposal is to facilitate program-wide data sharing. The
>>> 'lookup'
>>> function gives access to a pointer to an object already stored in a Set
>>> and equal
>>> to a given argument. The 'lookup' function is a natural extension to the
>>> current
>>> 'lookupLT', 'lookupGT', 'lookupLE' and 'lookupGE' functions, with obvious
>>> semantics.
>>>
>>> Example use case: In a parser, the memory footprint can be reduced by
>>> collapsing
>>> all equal strings to a single instance of each string. To achieve this,
>>> one needs
>>> a way to get a previously seen string (internally, a pointer) equal to a
>>> newly
>>> parsed string. Amazingly, this is very difficult with the current
>>> "containers" library interface.
>>> One current option is to use a Map instead, e.g., 'Map String String'
>>> which stores twice as many pointers as necessary.
>>>
>>> HOW
>>>
>>> The git pull request at
>>> https://github.com/haskell/containers/pull/291
>>> contains the straight-forward implementation of the 'lookup’ function on
>>> 'Set', with test cases,
>>> as a patch against the current containers master branch.
>>>
>>>
>>> Salutations,
>>> Nicolas.
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> [hidden email]
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>
>
>
> --
> Chris Wong (https://lambda.xyz)
>
> "I had not the vaguest idea what this meant and when I could not
> remember the words, my tutor threw the book at my head, which did not
> stimulate my intellect in any way." -- Bertrand Russell
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Feuer
I suspect we may want both functions.

On Wed, Jun 29, 2016 at 4:57 PM, David Thomas <[hidden email]> wrote:

> Well, "intern" includes a "and if it is not there, add it".  Calling
> this "intern" without that behavior would be highly misleading.  That
> said, maybe we just want an "intern" function?  It would save us a dip
> into the set, for new strings and leave (marginally) less room to make
> a mistake.  Are there many cases where we want `lookup` where we don't
> actually want `intern`?
>
> On Tue, Jun 28, 2016 at 2:22 AM, Chris Wong <[hidden email]> wrote:
>> Python uses "intern", so perhaps that can serve as the name.
>>
>> (See https://docs.python.org/2/library/functions.html#intern)
>>
>> On Tue, Jun 28, 2016 at 9:47 AM, David Feuer <[hidden email]> wrote:
>>> +1 on the function. -1/2 on the name.
>>>
>>> On Jun 27, 2016 5:45 PM, "Nicolas Godbout" <[hidden email]>
>>> wrote:
>>>>
>>>>
>>>> WHAT
>>>>
>>>> It is proposed to add a ‘lookup' function on 'Set' in the "containers"
>>>> package. Feedback during the next two weeks is welcome.
>>>>
>>>> The function
>>>>
>>>> > lookup :: Ord a => a -> Set a -> Maybe a
>>>>
>>>> is almost indentical to the 'member' function but, in addition, returns
>>>> the value
>>>> stored in the set.
>>>>
>>>> WHY
>>>>
>>>> The point of this proposal is to facilitate program-wide data sharing. The
>>>> 'lookup'
>>>> function gives access to a pointer to an object already stored in a Set
>>>> and equal
>>>> to a given argument. The 'lookup' function is a natural extension to the
>>>> current
>>>> 'lookupLT', 'lookupGT', 'lookupLE' and 'lookupGE' functions, with obvious
>>>> semantics.
>>>>
>>>> Example use case: In a parser, the memory footprint can be reduced by
>>>> collapsing
>>>> all equal strings to a single instance of each string. To achieve this,
>>>> one needs
>>>> a way to get a previously seen string (internally, a pointer) equal to a
>>>> newly
>>>> parsed string. Amazingly, this is very difficult with the current
>>>> "containers" library interface.
>>>> One current option is to use a Map instead, e.g., 'Map String String'
>>>> which stores twice as many pointers as necessary.
>>>>
>>>> HOW
>>>>
>>>> The git pull request at
>>>> https://github.com/haskell/containers/pull/291
>>>> contains the straight-forward implementation of the 'lookup’ function on
>>>> 'Set', with test cases,
>>>> as a patch against the current containers master branch.
>>>>
>>>>
>>>> Salutations,
>>>> Nicolas.
>>>>
>>>> _______________________________________________
>>>> Libraries mailing list
>>>> [hidden email]
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> [hidden email]
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>
>>
>>
>>
>> --
>> Chris Wong (https://lambda.xyz)
>>
>> "I had not the vaguest idea what this meant and when I could not
>> remember the words, my tutor threw the book at my head, which did not
>> stimulate my intellect in any way." -- Bertrand Russell
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

Alec
In reply to this post by David Thomas
> Are there many cases where we want `lookup` where we don't actually want `intern`?

Anecdotally, the majority of the times I've wanted `intern` functionality, I've been implementing something full-text-search-like.  This meant having a "load the lexicon" phase, where a set of blessed words get `intern`ed, and then a query phase where I want is `lookup` because my lexicon is doing double duty as a stoplist.  I have no sense of whether this would be a common case.

Also, +1 for `intern`, +1 for `lookup`.  I'm fine with both names.

On Wed, Jun 29, 2016 at 4:57 PM David Thomas <[hidden email]> wrote:
Well, "intern" includes a "and if it is not there, add it".  Calling
this "intern" without that behavior would be highly misleading.  That
said, maybe we just want an "intern" function?  It would save us a dip
into the set, for new strings and leave (marginally) less room to make
a mistake.  Are there many cases where we want `lookup` where we don't
actually want `intern`?

On Tue, Jun 28, 2016 at 2:22 AM, Chris Wong <[hidden email]> wrote:
> Python uses "intern", so perhaps that can serve as the name.
>
> (See https://docs.python.org/2/library/functions.html#intern)
>
> On Tue, Jun 28, 2016 at 9:47 AM, David Feuer <[hidden email]> wrote:
>> +1 on the function. -1/2 on the name.
>>
>> On Jun 27, 2016 5:45 PM, "Nicolas Godbout" <[hidden email]>
>> wrote:
>>>
>>>
>>> WHAT
>>>
>>> It is proposed to add a ‘lookup' function on 'Set' in the "containers"
>>> package. Feedback during the next two weeks is welcome.
>>>
>>> The function
>>>
>>> > lookup :: Ord a => a -> Set a -> Maybe a
>>>
>>> is almost indentical to the 'member' function but, in addition, returns
>>> the value
>>> stored in the set.
>>>
>>> WHY
>>>
>>> The point of this proposal is to facilitate program-wide data sharing. The
>>> 'lookup'
>>> function gives access to a pointer to an object already stored in a Set
>>> and equal
>>> to a given argument. The 'lookup' function is a natural extension to the
>>> current
>>> 'lookupLT', 'lookupGT', 'lookupLE' and 'lookupGE' functions, with obvious
>>> semantics.
>>>
>>> Example use case: In a parser, the memory footprint can be reduced by
>>> collapsing
>>> all equal strings to a single instance of each string. To achieve this,
>>> one needs
>>> a way to get a previously seen string (internally, a pointer) equal to a
>>> newly
>>> parsed string. Amazingly, this is very difficult with the current
>>> "containers" library interface.
>>> One current option is to use a Map instead, e.g., 'Map String String'
>>> which stores twice as many pointers as necessary.
>>>
>>> HOW
>>>
>>> The git pull request at
>>> https://github.com/haskell/containers/pull/291
>>> contains the straight-forward implementation of the 'lookup’ function on
>>> 'Set', with test cases,
>>> as a patch against the current containers master branch.
>>>
>>>
>>> Salutations,
>>> Nicolas.
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> [hidden email]
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>
>
>
> --
> Chris Wong (https://lambda.xyz)
>
> "I had not the vaguest idea what this meant and when I could not
> remember the words, my tutor threw the book at my head, which did not
> stimulate my intellect in any way." -- Bertrand Russell
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Feuer

Let me be firm (co-maintainer's prerogative): the function will not be named `lookup`. The current leader in my eyes is Andreas Abel's `lookupEntry`, but I could be convinced to use something else if others prefer

On Jun 29, 2016 5:12 PM, "Alec" <[hidden email]> wrote:
> Are there many cases where we want `lookup` where we don't actually want `intern`?

Anecdotally, the majority of the times I've wanted `intern` functionality, I've been implementing something full-text-search-like.  This meant having a "load the lexicon" phase, where a set of blessed words get `intern`ed, and then a query phase where I want is `lookup` because my lexicon is doing double duty as a stoplist.  I have no sense of whether this would be a common case.

Also, +1 for `intern`, +1 for `lookup`.  I'm fine with both names.

On Wed, Jun 29, 2016 at 4:57 PM David Thomas <[hidden email]> wrote:
Well, "intern" includes a "and if it is not there, add it".  Calling
this "intern" without that behavior would be highly misleading.  That
said, maybe we just want an "intern" function?  It would save us a dip
into the set, for new strings and leave (marginally) less room to make
a mistake.  Are there many cases where we want `lookup` where we don't
actually want `intern`?

On Tue, Jun 28, 2016 at 2:22 AM, Chris Wong <[hidden email]> wrote:
> Python uses "intern", so perhaps that can serve as the name.
>
> (See https://docs.python.org/2/library/functions.html#intern)
>
> On Tue, Jun 28, 2016 at 9:47 AM, David Feuer <[hidden email]> wrote:
>> +1 on the function. -1/2 on the name.
>>
>> On Jun 27, 2016 5:45 PM, "Nicolas Godbout" <[hidden email]>
>> wrote:
>>>
>>>
>>> WHAT
>>>
>>> It is proposed to add a ‘lookup' function on 'Set' in the "containers"
>>> package. Feedback during the next two weeks is welcome.
>>>
>>> The function
>>>
>>> > lookup :: Ord a => a -> Set a -> Maybe a
>>>
>>> is almost indentical to the 'member' function but, in addition, returns
>>> the value
>>> stored in the set.
>>>
>>> WHY
>>>
>>> The point of this proposal is to facilitate program-wide data sharing. The
>>> 'lookup'
>>> function gives access to a pointer to an object already stored in a Set
>>> and equal
>>> to a given argument. The 'lookup' function is a natural extension to the
>>> current
>>> 'lookupLT', 'lookupGT', 'lookupLE' and 'lookupGE' functions, with obvious
>>> semantics.
>>>
>>> Example use case: In a parser, the memory footprint can be reduced by
>>> collapsing
>>> all equal strings to a single instance of each string. To achieve this,
>>> one needs
>>> a way to get a previously seen string (internally, a pointer) equal to a
>>> newly
>>> parsed string. Amazingly, this is very difficult with the current
>>> "containers" library interface.
>>> One current option is to use a Map instead, e.g., 'Map String String'
>>> which stores twice as many pointers as necessary.
>>>
>>> HOW
>>>
>>> The git pull request at
>>> https://github.com/haskell/containers/pull/291
>>> contains the straight-forward implementation of the 'lookup’ function on
>>> 'Set', with test cases,
>>> as a patch against the current containers master branch.
>>>
>>>
>>> Salutations,
>>> Nicolas.
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> [hidden email]
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>
>
>
> --
> Chris Wong (https://lambda.xyz)
>
> "I had not the vaguest idea what this meant and when I could not
> remember the words, my tutor threw the book at my head, which did not
> stimulate my intellect in any way." -- Bertrand Russell
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

Alec
`lookupEntry` seems fine to me...

\begin{bikeshedding}
but it bugs me a little that (according to the API) Sets have "members" (`member`, `notMember`, and `splitMember`) and "elems" (`elems`, `elemAt`), but (at least as of yet) no "entries".  (In my brief scan of the docs, it appears "element" is the preferred English for "thing that can be in a Set").  Andreas' suggestion applies the name to Map as well, so I buy that "entry" is actually a new notion distinct from "element" or "member"---but if a user isn't expected to have that notion in mind when seeking this functionality, it seems mildly disadvantageous to introduce a new word for "thing that can be in a Set" to the API.
\end{bikeshedding}

On Wed, Jun 29, 2016 at 5:18 PM David Feuer <[hidden email]> wrote:

Let me be firm (co-maintainer's prerogative): the function will not be named `lookup`. The current leader in my eyes is Andreas Abel's `lookupEntry`, but I could be convinced to use something else if others prefer

On Jun 29, 2016 5:12 PM, "Alec" <[hidden email]> wrote:
> Are there many cases where we want `lookup` where we don't actually want `intern`?

Anecdotally, the majority of the times I've wanted `intern` functionality, I've been implementing something full-text-search-like.  This meant having a "load the lexicon" phase, where a set of blessed words get `intern`ed, and then a query phase where I want is `lookup` because my lexicon is doing double duty as a stoplist.  I have no sense of whether this would be a common case.

Also, +1 for `intern`, +1 for `lookup`.  I'm fine with both names.

On Wed, Jun 29, 2016 at 4:57 PM David Thomas <[hidden email]> wrote:
Well, "intern" includes a "and if it is not there, add it".  Calling
this "intern" without that behavior would be highly misleading.  That
said, maybe we just want an "intern" function?  It would save us a dip
into the set, for new strings and leave (marginally) less room to make
a mistake.  Are there many cases where we want `lookup` where we don't
actually want `intern`?

On Tue, Jun 28, 2016 at 2:22 AM, Chris Wong <[hidden email]> wrote:
> Python uses "intern", so perhaps that can serve as the name.
>
> (See https://docs.python.org/2/library/functions.html#intern)
>
> On Tue, Jun 28, 2016 at 9:47 AM, David Feuer <[hidden email]> wrote:
>> +1 on the function. -1/2 on the name.
>>
>> On Jun 27, 2016 5:45 PM, "Nicolas Godbout" <[hidden email]>
>> wrote:
>>>
>>>
>>> WHAT
>>>
>>> It is proposed to add a ‘lookup' function on 'Set' in the "containers"
>>> package. Feedback during the next two weeks is welcome.
>>>
>>> The function
>>>
>>> > lookup :: Ord a => a -> Set a -> Maybe a
>>>
>>> is almost indentical to the 'member' function but, in addition, returns
>>> the value
>>> stored in the set.
>>>
>>> WHY
>>>
>>> The point of this proposal is to facilitate program-wide data sharing. The
>>> 'lookup'
>>> function gives access to a pointer to an object already stored in a Set
>>> and equal
>>> to a given argument. The 'lookup' function is a natural extension to the
>>> current
>>> 'lookupLT', 'lookupGT', 'lookupLE' and 'lookupGE' functions, with obvious
>>> semantics.
>>>
>>> Example use case: In a parser, the memory footprint can be reduced by
>>> collapsing
>>> all equal strings to a single instance of each string. To achieve this,
>>> one needs
>>> a way to get a previously seen string (internally, a pointer) equal to a
>>> newly
>>> parsed string. Amazingly, this is very difficult with the current
>>> "containers" library interface.
>>> One current option is to use a Map instead, e.g., 'Map String String'
>>> which stores twice as many pointers as necessary.
>>>
>>> HOW
>>>
>>> The git pull request at
>>> https://github.com/haskell/containers/pull/291
>>> contains the straight-forward implementation of the 'lookup’ function on
>>> 'Set', with test cases,
>>> as a patch against the current containers master branch.
>>>
>>>
>>> Salutations,
>>> Nicolas.
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> [hidden email]
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>
>
>
> --
> Chris Wong (https://lambda.xyz)
>
> "I had not the vaguest idea what this meant and when I could not
> remember the words, my tutor threw the book at my head, which did not
> stimulate my intellect in any way." -- Bertrand Russell
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

Bardur Arantsson-2
On 06/29/2016 11:48 PM, Alec wrote:

> `lookupEntry` seems fine to me...
>
> \begin{bikeshedding}
> but it bugs me a little that (according to the API) Sets have "members"
> (`member`, `notMember`, and `splitMember`) and "elems" (`elems`,
> `elemAt`), but (at least as of yet) no "entries".  (In my brief scan of
> the docs, it appears "element" is the preferred English for "thing that
> can be in a Set").  Andreas' suggestion applies the name to Map as well,
> so I buy that "entry" is actually a new notion distinct from "element"
> or "member"---but if a user isn't expected to have that notion in mind
> when seeking this functionality, it seems mildly disadvantageous to
> introduce a new word for "thing that can be in a Set" to the API.
> \end{bikeshedding}
>

"I love bikesheds! Don't you!?" :)

I can only agree that it's a little incongrouous... but I'll defer to
maintainer's privilege in the interest of getting things done :).

Regards,

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Feuer
Paint it what you like. Just not `lookup` :-).

On Wed, Jun 29, 2016 at 6:06 PM, Bardur Arantsson <[hidden email]> wrote:

> On 06/29/2016 11:48 PM, Alec wrote:
>> `lookupEntry` seems fine to me...
>>
>> \begin{bikeshedding}
>> but it bugs me a little that (according to the API) Sets have "members"
>> (`member`, `notMember`, and `splitMember`) and "elems" (`elems`,
>> `elemAt`), but (at least as of yet) no "entries".  (In my brief scan of
>> the docs, it appears "element" is the preferred English for "thing that
>> can be in a Set").  Andreas' suggestion applies the name to Map as well,
>> so I buy that "entry" is actually a new notion distinct from "element"
>> or "member"---but if a user isn't expected to have that notion in mind
>> when seeking this functionality, it seems mildly disadvantageous to
>> introduce a new word for "thing that can be in a Set" to the API.
>> \end{bikeshedding}
>>
>
> "I love bikesheds! Don't you!?" :)
>
> I can only agree that it's a little incongrouous... but I'll defer to
> maintainer's privilege in the interest of getting things done :).
>
> Regards,
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

wren romano-2
FWIW, as someone who often interns strings in her programs, I'd
suggest *not* naming it "intern" (or variations thereon). One issue is
the "and store it if it's not there yet" issue already mentioned. But
another —and imo more important— issue is that one of the main goals
of interning things is so that you can get a fast equality check. In
general whenever we intern things we're going to return some token,
which supports fast equality, and takes up as little memory as
possible, but which can later on be cashed in for the original
(shared) value if we actually need it. In impure languages we can use
the pointer to the object as our token, but we can't do that (without
serious grossness) in Haskell so we'd return some other token like an
integer, perhaps with a newtype wrapper to keep track of its
providence so we can avoid mixing up tokens from different intern
tables.

I already have an implementation of intern tables for ByteString in my
local libraries; it'd be trivial to extend that to intern tables for
String or Text or anything else supporting either (a) equality and
hashability, or (b) orderability. IMO, if users want an intern table
then they should ask for one, they shouldn't use some ad hoc
combination of other data types which obfuscates the true purpose.
Adding this to containers should be proposed on a separate thread, but
I'm entirely willing to do it if people actually want it

--
Live well,
~wren
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

Nicolas Godbout

Here is recap of the numerous answers to this proposal one week into voting.
As a reminder, the original proposal is to add the following to Data.Set

    lookup :: Ord a => a -> Set a -> Maybe a

There is essentially unconditional support for inclusion of such a function. The debate centers around the name given to the function. There were quite a number of +1 votes for the ‘lookup’ name as is. There were also quite a number of valid objections, in particular from the "containers" package containers. The final name will _not_ be 'lookup', it remains to decide what the name is.

The following names have been floated so far, together with opinions (expressed on the list and my own).

* lookupEQ
    pro: it fits in the lookupGT, lookupLT, lookupGE, lookupLE  cluster of existing Data.Set functions
    con: the cluster set of functions have atypical names and don't seem to generate any enthusiasm on the list

* find
    pro: closer to the semantics, similar to Data.Map.findWithDefault
    con: nothing to do with the signature of Data.List.find
In my opinion, this one is dead.

* lookupSharedPointer, lookupWithSharing, lookupIntern
    pro: express the intention of returning a pointer to enable sharing
    con: new pattern for libraries, explicitly mentions pointers which is decidedly un-Haskell-like
My strong vote goes to eliminating these. Pointer behaviour is fairly obvious to expert Haskellers, but should not be mentioned to beginners. Let them ask on the Haskell mailing list why there is a 'lookup'-like function on Sets and let us just briefly mention the sharing behavior in the Haddock docs.

* lookupEntry, lookupKey, lookupWithKey
   pro: These names are in the spirit of "container" functions.
   con: In the case of 'Entry', it introduces a new concept distinct from a member or element.
These are the names deserving discussion and voting. There is already a (+1) for 'lookupEntry'
It was mentioned that a pattern emerges with 'insertWithKey', 'adjustWithKey', 'updateWithKey' functions from Data.Map
     > lookupWithKey :: Ord k => k -> Map a -> Maybe (k,a)
suggesting the current proposal to converge to
     > lookupWithKey :: Ord a => a -> Set a -> Maybe a

As a side note, it was noted in several replies that all relevant "container" lookup functions should come with the guarantee that the copy from the container is returned. I vote (+1) on that!

My personal take on the current matter is that whatever we choose should be compatible with Data.List.
    > lookup??? :: Ord a => [a] -> a -> Maybe a
    > lookup??? :: Ord a => a -> Set a -> Maybe a
    > lookup??? :: Ord k => k -> Map k a -> Maybe (k,a)
where the element returned is the one from the container, enabling sharing. This kills the name 'lookup' since it is already taken in Data.List. What seems to make the most sense is 'lookupEntry', which defines the concept of 'entry' as whatever is logically stored in the container. A Map is therefore defined as logically storing key-value pairs, exactly as in an association list.

+1 on 'lookupEntry'


Nicolas.



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

Nicolas Godbout
In reply to this post by wren romano-2
Correction, there are several errors in the signatures at the end of my last message.
    > lookup??? :: Eq a => a -> [a] -> Maybe a
    > lookup??? :: Ord a => a -> Set a -> Maybe a
Also, copying the pattern for maps gives a slightly stranger signature
    > lookup??? :: (Ord k, Eq a) => (k,a) -> Map k a -> Maybe (k,a)
which combines a key lookup with an equality test on the value.

Nicolas.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Feuer
In reply to this post by Nicolas Godbout

I think it bears mentioning that there's no need to discuss pointers to explain these functions. == in general may be coarser than structural equality, since most interesting data structures can represent the same abstract object in more than one way. So we need only explain that the result is the element stored in the set that is == to the requested value.

On Jul 4, 2016 11:32 AM, "Nicolas Godbout" <[hidden email]> wrote:

Here is recap of the numerous answers to this proposal one week into voting.
As a reminder, the original proposal is to add the following to Data.Set

    lookup :: Ord a => a -> Set a -> Maybe a

There is essentially unconditional support for inclusion of such a function. The debate centers around the name given to the function. There were quite a number of +1 votes for the ‘lookup’ name as is. There were also quite a number of valid objections, in particular from the "containers" package containers. The final name will _not_ be 'lookup', it remains to decide what the name is.

The following names have been floated so far, together with opinions (expressed on the list and my own).

* lookupEQ
    pro: it fits in the lookupGT, lookupLT, lookupGE, lookupLE  cluster of existing Data.Set functions
    con: the cluster set of functions have atypical names and don't seem to generate any enthusiasm on the list

* find
    pro: closer to the semantics, similar to Data.Map.findWithDefault
    con: nothing to do with the signature of Data.List.find
In my opinion, this one is dead.

* lookupSharedPointer, lookupWithSharing, lookupIntern
    pro: express the intention of returning a pointer to enable sharing
    con: new pattern for libraries, explicitly mentions pointers which is decidedly un-Haskell-like
My strong vote goes to eliminating these. Pointer behaviour is fairly obvious to expert Haskellers, but should not be mentioned to beginners. Let them ask on the Haskell mailing list why there is a 'lookup'-like function on Sets and let us just briefly mention the sharing behavior in the Haddock docs.

* lookupEntry, lookupKey, lookupWithKey
   pro: These names are in the spirit of "container" functions.
   con: In the case of 'Entry', it introduces a new concept distinct from a member or element.
These are the names deserving discussion and voting. There is already a (+1) for 'lookupEntry'
It was mentioned that a pattern emerges with 'insertWithKey', 'adjustWithKey', 'updateWithKey' functions from Data.Map
     > lookupWithKey :: Ord k => k -> Map a -> Maybe (k,a)
suggesting the current proposal to converge to
     > lookupWithKey :: Ord a => a -> Set a -> Maybe a

As a side note, it was noted in several replies that all relevant "container" lookup functions should come with the guarantee that the copy from the container is returned. I vote (+1) on that!

My personal take on the current matter is that whatever we choose should be compatible with Data.List.
    > lookup??? :: Ord a => [a] -> a -> Maybe a
    > lookup??? :: Ord a => a -> Set a -> Maybe a
    > lookup??? :: Ord k => k -> Map k a -> Maybe (k,a)
where the element returned is the one from the container, enabling sharing. This kills the name 'lookup' since it is already taken in Data.List. What seems to make the most sense is 'lookupEntry', which defines the concept of 'entry' as whatever is logically stored in the container. A Map is therefore defined as logically storing key-value pairs, exactly as in an association list.

+1 on 'lookupEntry'


Nicolas.



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Feuer
In reply to this post by Nicolas Godbout

What we want for Map is definitely not that, but rather

??? :: Ord k => k -> Map k v -> Maybe (k, v)

In each case we look up a key to retrieve an "entry" (whether we want that terminology or not). In the case of a Map, the entry is a key-value pair; in the case of a Set it is merely a key.

On Jul 4, 2016 11:48 AM, "Nicolas Godbout" <[hidden email]> wrote:
Correction, there are several errors in the signatures at the end of my last message.
    > lookup??? :: Eq a => a -> [a] -> Maybe a
    > lookup??? :: Ord a => a -> Set a -> Maybe a
Also, copying the pattern for maps gives a slightly stranger signature
    > lookup??? :: (Ord k, Eq a) => (k,a) -> Map k a -> Maybe (k,a)
which combines a key lookup with an equality test on the value.

Nicolas.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

Nicolas Godbout


Le 4 juil. 2016 à 11:53, David Feuer <[hidden email]> a écrit :

What we want for Map is definitely not that, but rather

??? :: Ord k => k -> Map k v -> Maybe (k, v)

In each case we look up a key to retrieve an "entry" (whether we want that terminology or not). In the case of a Map, the entry is a key-value pair; in the case of a Set it is merely a key.

Agreed.

However, I am still scratching my head about how the functions we are considering would behave on Lists. The existence of 'Data.List.lookup' also muddles the issues since it works on association lists and not just any list.

The challenge: find a name '???' such that all three of the following make sense:

   ??? :: Ord a => a -> Set a -> Maybe a
   ??? :: Eq a => a -> [a] -> Maybe a
   ??? :: Ord k => k -> Map k v -> Maybe (k, v)

A name such a 'lookupKey' looks weird on lists and sets. It is not quite a "containers" problem, but it would be nice to be compatible with "base" names.

Under this spotlight, what makes sense is 'lookupElem' where elements of lists and sets are obvious, and the elements of a Map are defined as the key-value pairs. Data.Map currently has no function on elements, so this avenue seems open and viable.



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Feuer

I barely care about Data.List association lists. They have no business being in Data.List at all. If compatibility with association lists gets in the way of a good interface, I'll take the good interface anyway.

On Jul 4, 2016 12:18 PM, "Nicolas Godbout" <[hidden email]> wrote:


Le 4 juil. 2016 à 11:53, David Feuer <[hidden email]> a écrit :

What we want for Map is definitely not that, but rather

??? :: Ord k => k -> Map k v -> Maybe (k, v)

In each case we look up a key to retrieve an "entry" (whether we want that terminology or not). In the case of a Map, the entry is a key-value pair; in the case of a Set it is merely a key.

Agreed.

However, I am still scratching my head about how the functions we are considering would behave on Lists. The existence of 'Data.List.lookup' also muddles the issues since it works on association lists and not just any list.

The challenge: find a name '???' such that all three of the following make sense:

   ??? :: Ord a => a -> Set a -> Maybe a
   ??? :: Eq a => a -> [a] -> Maybe a
   ??? :: Ord k => k -> Map k v -> Maybe (k, v)

A name such a 'lookupKey' looks weird on lists and sets. It is not quite a "containers" problem, but it would be nice to be compatible with "base" names.

Under this spotlight, what makes sense is 'lookupElem' where elements of lists and sets are obvious, and the elements of a Map are defined as the key-value pairs. Data.Map currently has no function on elements, so this avenue seems open and viable.



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

David Feuer
In reply to this post by Nicolas Godbout

To add to the summary, we also considered a function to insert an element into a set or a pair into a map preserving the existing key. For maps, we'd want a variant of insertWithKey, I suppose. TBH, I find the current insertion behavior perplexing. Perhaps it would make sense to add modules with more sensible insertion functions? How much risk is there of breaking things by changing the behavior of the current functions to *preserve* elements/keys rather than replacing them, and making insertWithKey give the passed function the stored key rather than the given one?

On Jul 4, 2016 11:32 AM, "Nicolas Godbout" <[hidden email]> wrote:

Here is recap of the numerous answers to this proposal one week into voting.
As a reminder, the original proposal is to add the following to Data.Set

    lookup :: Ord a => a -> Set a -> Maybe a

There is essentially unconditional support for inclusion of such a function. The debate centers around the name given to the function. There were quite a number of +1 votes for the ‘lookup’ name as is. There were also quite a number of valid objections, in particular from the "containers" package containers. The final name will _not_ be 'lookup', it remains to decide what the name is.

The following names have been floated so far, together with opinions (expressed on the list and my own).

* lookupEQ
    pro: it fits in the lookupGT, lookupLT, lookupGE, lookupLE  cluster of existing Data.Set functions
    con: the cluster set of functions have atypical names and don't seem to generate any enthusiasm on the list

* find
    pro: closer to the semantics, similar to Data.Map.findWithDefault
    con: nothing to do with the signature of Data.List.find
In my opinion, this one is dead.

* lookupSharedPointer, lookupWithSharing, lookupIntern
    pro: express the intention of returning a pointer to enable sharing
    con: new pattern for libraries, explicitly mentions pointers which is decidedly un-Haskell-like
My strong vote goes to eliminating these. Pointer behaviour is fairly obvious to expert Haskellers, but should not be mentioned to beginners. Let them ask on the Haskell mailing list why there is a 'lookup'-like function on Sets and let us just briefly mention the sharing behavior in the Haddock docs.

* lookupEntry, lookupKey, lookupWithKey
   pro: These names are in the spirit of "container" functions.
   con: In the case of 'Entry', it introduces a new concept distinct from a member or element.
These are the names deserving discussion and voting. There is already a (+1) for 'lookupEntry'
It was mentioned that a pattern emerges with 'insertWithKey', 'adjustWithKey', 'updateWithKey' functions from Data.Map
     > lookupWithKey :: Ord k => k -> Map a -> Maybe (k,a)
suggesting the current proposal to converge to
     > lookupWithKey :: Ord a => a -> Set a -> Maybe a

As a side note, it was noted in several replies that all relevant "container" lookup functions should come with the guarantee that the copy from the container is returned. I vote (+1) on that!

My personal take on the current matter is that whatever we choose should be compatible with Data.List.
    > lookup??? :: Ord a => [a] -> a -> Maybe a
    > lookup??? :: Ord a => a -> Set a -> Maybe a
    > lookup??? :: Ord k => k -> Map k a -> Maybe (k,a)
where the element returned is the one from the container, enabling sharing. This kills the name 'lookup' since it is already taken in Data.List. What seems to make the most sense is 'lookupEntry', which defines the concept of 'entry' as whatever is logically stored in the container. A Map is therefore defined as logically storing key-value pairs, exactly as in an association list.

+1 on 'lookupEntry'


Nicolas.



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal for containers: Add 'lookup' function to Data.Set

Ben Millwood
In reply to this post by Henning Thielemann
I agree with Henning, and moreover with wren's point that Data.Set is an
often-unsatisfactory data structure for interning in general. Map can do
the job more obviously and more comprehensively, at the cost of a few
more pointers -- and when has Haskell ever been about trading clarity
for fewer pointers?

I don't think I'll formally vote, since I haven't written any Haskell in
months, but I think the addition of this at-first-glance
seemingly-useless function adds surprise and a little bit of
headscratching to the user interface as a whole, to the extent that I
feel like it has a "weirdness cost" even if most users can just ignore
it. It also complicates (in my mind) the semantic story of Data.Set,
from something purely about a collection of elements and a membership
test to something which has specific opinions on which of two notionally
"equal" things are most appropriate in a given situation. I think the
cost of turning Data.Set from a simple thing into a complicated thing is
a subtle one that should not totally be overlooked -- API simplicity is
a virtue in itself.

Add to that Henning and wren's evidence, which I interpret as saying
that this function isn't quite the *perfect* tool for seemingly *any*
job, and it doesn't seem worth the cognitive / maintenance cost to me.

(I perhaps do not expect these considerations to be very strong or very
important, but I think it would be unwise to ignore them altogether.)

On Tue, Jun 28, 2016 at 11:19:16AM +0200, Henning Thielemann wrote:

>
>On Mon, 27 Jun 2016, Nicolas Godbout wrote:
>
>>Example use case: In a parser, the memory footprint can be reduced by collapsing
>>all equal strings to a single instance of each string. To achieve this, one needs
>>a way to get a previously seen string (internally, a pointer) equal to a newly
>>parsed string. Amazingly, this is very difficult with the current "containers" library interface.
>>One current option is to use a Map instead, e.g., 'Map String String'
>>which stores twice as many pointers as necessary.
>
>I think I'd prefer the (Map String String) variant to an obscure Set
>lookup function. I wonder whether you will later use the Map anyway,
>as the compiler grows and you need to attach more data to tokens. In
>order to make your intent clearer you might define
>
>  newtype SharedToken = SharedToken String
>
>and use (Map String SharedToken).
>_______________________________________________
>Libraries mailing list
>[hidden email]
>http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
12