Add NonEmptyMap and NonEmptySet to containers

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
32 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

Zemyla
A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just Single or Deep.

On Thu, Apr 25, 2019, 09:55 David Feuer <[hidden email]> wrote:
I don't see the benefit there, unless you see a way to work it into the representation.

On Thu, Apr 25, 2019, 10:53 AM Zemyla <[hidden email]> wrote:
As long as we're doing this, can we also add NonEmptySeq as well?

On Thu, Apr 25, 2019, 09:11 Artyom Kazak <[hidden email]> wrote:
I'm -1 on any kind of Map = NEMap.

An ordinary map and a non-empty map are semantically different. I believe that if I non-empty maps were already in containers, I would pretty much always care whether a Map I see in code is a 0-map or 1-map.

Similarly, I prefer Int and Word instead of Int and Unsigned.Int. (Luckily that's already the case.)

We already have a precedent with Text and ByteString, where the lazy and the strict versions are only distinguished by the module prefix. In my experience, modules where both are used are pretty common, and I end up just introducing type LByteString = Lazy.ByteString in all my projects, because otherwise I need to scroll to the imports section whenever I need to know which flavor of bytestring is being used. (Or if I'm reading haddocks, I have to look at the link because Haddock hides module prefixes.)

"why not both" is even worse. I still can't trust the Map, but now I also have to learn and remember that two modules are the same. Speaking from experience again – most people seem to be surprised by the fact that Data.Map.Lazy and Data.Map.Strict export the same Map type. The proposed module hierarchy would move containers to the top of my "packages that confuse beginners" list, beating even aeson.

As an aside, I wish we had a proper interface for container-like structures, or at least a solution to name scoping. I really like the way Rust does it, for instance, where certain functions can be "attached" to a type – I'm hesitant to call them "methods" because Rust is not an OOP language.
On Apr 25 2019, at 2:49 pm, Mario Blažević <[hidden email]> wrote:
On 2019-04-18 11:00 p.m., David Feuer wrote:
I'm in favor of the proposal. I find the isomorphism between Map (a,b) v
and Map a (NonemptyMap b v) very pleasant. The fact that others have
written less-performant implementations of this idea is rather
convincing. The fact that doing this removes partial matches in the
implementation is nice. And I'll take performance improvements where I
can get them. The main question is the proper name of the type. Just
Data.Map.Nonempty.Map, or .NonemptyMap? Should the empty be capitalized?

There seems to be a consensus for Data.Map.NonEmpty.NEMap, with the
type and the functions slightly off the regular ones. This design would
make it easier to use regular and non-empty containers together, but it
be annoying for the use case of replacing all uses of an existing
regular container with a non-empty one. I'd rather change just the
import declaration than all occurrences of the type name and functions.

I don't want to derail the implementation with bikeshedding, so I'm
just going to ask why not both? The library can both export the tweaked
names and add a module, say Data.NonEmpty.Map.Lazy, that exports the
type synonym Map = NEMap. It would also rename all the functions back to
their names from Data.Map.Lazy.



On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson

adding non-empty variants of Map and Set, analogous to
Data.List.NonEmpty for List, to containers. semigroupoids
demonstrates the many uses and structure of non-empty containers in
general, and libraries such as
interest in non-empty maps and sets in particular. My favorite
use-case is that they're needed to "curry" containers: for example,
|Map (k0, k1) v| is isomorphic not to |Map k0 (Map k1 v)| but to
|Map k0 (NonEmptyMap k1 v)|. I like this use-case because it comes
from the containers themselves.

Importantly, there's no good way to do this outside of containers;
doing so leads to imbalancing / extra indirection, or massive code
duplication. If one wraps the container was an extra value like
Data.List.NonEmpty, one's left with an unavoidable extra
indirection/imbalance. One can rectify this by copying and modifying
the implementation of containers, but that's hardly maintainable;
even as though the algorithms are the same, enough lines are touched
that merging upstream containers is nigh impossible.

On the other hand, the non-empty containers can be elegantly and
sufficiently implemented alongside their originals by taking the Bin
constructor and breaking it out into it's own type, mutually
recursive with the original. This avoids the indirection/imbalancing
and code duplication problems: the algorithms work exactly as before
creating the same trees (remember the UNPACK), and no code
duplicated since the functions become mutually recursive matching
the types.

To briefly summarize the thread:

1. I proposed the issue after performing this same refactor on the
dependent-map package:
a fork of containers.
changes the types, to make sure UNPACK preserved the importance.
the benchmarks showed rather than degrading performance, PR 616
actually /improved/ it.

 If there is preliminary consensus, I'll make a second PR on top
which generalizes the functions like on my dependent-map branch.

Thanks,

John

_______________________________________________
Libraries mailing list


_______________________________________________
Libraries mailing list

_______________________________________________
Libraries mailing list
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

John Ericson-2
I haven't looked into `Seq` in addition to `Map` and `Set`, just
`IntSet` and `IntMap`. But it might be a similar thing? I take it that
with `Seq` today only the root can be empty and everything else is
single or deep? That means we *would* just use a single Maybe-like thing
at the top level, no mutual recursion. But on the other hand to make
that to make that work efficiently we would would need GHC to support
unboxing sums, so 1 + 2 variants can become a flat 3.

Also, https://github.com/haskell/containers/pull/616 is now where the
actual implementation is happening, not just the datatype changes as
before. Feel free to comment on the concrete work in progress, everyone!

John

On 4/25/19 11:36 AM, Zemyla wrote:

> A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just
> Single or Deep.
>
> On Thu, Apr 25, 2019, 09:55 David Feuer <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     I don't see the benefit there, unless you see a way to work it
>     into the representation.
>
>     On Thu, Apr 25, 2019, 10:53 AM Zemyla <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         As long as we're doing this, can we also add NonEmptySeq as well?
>
>         On Thu, Apr 25, 2019, 09:11 Artyom Kazak <[hidden email]
>         <mailto:[hidden email]>> wrote:
>
>             I'm -1 on any kind of |Map = NEMap|.
>
>             An ordinary map and a non-empty map are semantically
>             different. I believe that if I non-empty maps were already
>             in |containers|, I would pretty much always care whether a
>             |Map| I see in code is a 0-map or 1-map.
>
>             Similarly, I prefer |Int| and |Word| instead of |Int| and
>             |Unsigned.Int|. (Luckily that's already the case.)
>
>             We already have a precedent with |Text| and |ByteString|,
>             where the lazy and the strict versions are only
>             distinguished by the module prefix. In my experience,
>             modules where both are used are pretty common, and I end
>             up just introducing |type LByteString = Lazy.ByteString|
>             in all my projects, because otherwise I need to scroll to
>             the imports section whenever I need to know which flavor
>             of bytestring is being used. (Or if I'm reading haddocks,
>             I have to look at the link because Haddock hides module
>             prefixes.)
>
>             "why not both" is even worse. I still can't trust the
>             |Map|, but now I also have to learn and remember that two
>             modules are the same. Speaking from experience again –
>             most people seem to be surprised by the fact that
>             |Data.Map.Lazy| and |Data.Map.Strict| export the same
>             |Map| type. The proposed module hierarchy would move
>             |containers| to the top of my "packages that confuse
>             beginners" list, beating even |aeson|.
>
>             As an aside, I wish we had a proper interface for
>             container-like structures, or at least a solution to name
>             scoping. I really like the way Rust does it, for instance,
>             where certain functions can be "attached" to a type – I'm
>             hesitant to call them "methods" because Rust is not an OOP
>             language.
>             On Apr 25 2019, at 2:49 pm, Mario Blažević
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 On 2019-04-18 11:00 p.m., David Feuer wrote:
>
>                     I'm in favor of the proposal. I find the
>                     isomorphism between Map (a,b) v
>                     and Map a (NonemptyMap b v) very pleasant. The
>                     fact that others have
>                     written less-performant implementations of this
>                     idea is rather
>                     convincing. The fact that doing this removes
>                     partial matches in the
>                     implementation is nice. And I'll take performance
>                     improvements where I
>                     can get them. The main question is the proper name
>                     of the type. Just
>                     Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>                     empty be capitalized?
>
>
>                 There seems to be a consensus for
>                 Data.Map.NonEmpty.NEMap, with the
>                 type and the functions slightly off the regular ones.
>                 This design would
>                 make it easier to use regular and non-empty containers
>                 together, but it
>                 be annoying for the use case of replacing all uses of
>                 an existing
>                 regular container with a non-empty one. I'd rather
>                 change just the
>                 import declaration than all occurrences of the type
>                 name and functions.
>
>                 I don't want to derail the implementation with
>                 bikeshedding, so I'm
>                 just going to ask why not both? The library can both
>                 export the tweaked
>                 names and add a module, say Data.NonEmpty.Map.Lazy,
>                 that exports the
>                 type synonym Map = NEMap. It would also rename all the
>                 functions back to
>                 their names from Data.Map.Lazy.
>
>
>
>                     On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson
>                     <[hidden email]> wrote:
>
>                     In
>                     https://github.com/haskell/containers/issues/608 I
>                     proposed
>                     adding non-empty variants of Map and Set, analogous to
>                     Data.List.NonEmpty for List, to containers.
>                     semigroupoids
>                     demonstrates the many uses and structure of
>                     non-empty containers in
>                     general, and libraries such as
>                     https://github.com/mstksg/nonempty-containers and
>                     https://github.com/andrewthad/non-empty-containers
>                     demonstrate the
>                     interest in non-empty maps and sets in particular.
>                     My favorite
>                     use-case is that they're needed to "curry"
>                     containers: for example,
>                     |Map (k0, k1) v| is isomorphic not to |Map k0 (Map
>                     k1 v)| but to
>                     |Map k0 (NonEmptyMap k1 v)|. I like this use-case
>                     because it comes
>                     from the containers themselves.
>
>                     Importantly, there's no good way to do this
>                     outside of containers;
>                     doing so leads to imbalancing / extra indirection,
>                     or massive code
>                     duplication. If one wraps the container was an
>                     extra value like
>                     Data.List.NonEmpty, one's left with an unavoidable
>                     extra
>                     indirection/imbalance. One can rectify this by
>                     copying and modifying
>                     the implementation of containers, but that's
>                     hardly maintainable;
>                     even as though the algorithms are the same, enough
>                     lines are touched
>                     that merging upstream containers is nigh impossible.
>
>                     On the other hand, the non-empty containers can be
>                     elegantly and
>                     sufficiently implemented alongside their originals
>                     by taking the Bin
>                     constructor and breaking it out into it's own
>                     type, mutually
>                     recursive with the original. This avoids the
>                     indirection/imbalancing
>                     and code duplication problems: the algorithms work
>                     exactly as before
>                     creating the same trees (remember the UNPACK), and
>                     no code
>                     duplicated since the functions become mutually
>                     recursive matching
>                     the types.
>
>                     To briefly summarize the thread:
>
>                     1. I proposed the issue after performing this same
>                     refactor on the
>                     dependent-map package:
>                     https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>                     a fork of containers.
>                     2. I made
>                     https://github.com/haskell/containers/pull/616
>                     which just
>                     changes the types, to make sure UNPACK preserved
>                     the importance.
>                     3.
>                     https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>                     the benchmarks showed rather than degrading
>                     performance, PR 616
>                     actually /improved/ it.
>
>                      If there is preliminary consensus, I'll make a
>                     second PR on top
>                     which generalizes the functions like on my
>                     dependent-map branch.
>
>                     Thanks,
>
>                     John
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email]
>                     <mailto:[hidden email]>
>                     <mailto:[hidden email]
>                     <mailto:[hidden email]>>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email] <mailto:[hidden email]>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                 _______________________________________________
>                 Libraries mailing list
>                 [hidden email] <mailto:[hidden email]>
>                 http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>             _______________________________________________
>             Libraries mailing list
>             [hidden email] <mailto:[hidden email]>
>             http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>         _______________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

David Feuer
No, a Seq can be Empty at the bottom too. It's definitely not a mutually recursive situation like Set or Map, and it's not immediately a top-level one like IntSet or IntMap. A nonempty sequence type would need somewhat different functions for everything. In most cases, the changes seem fairly straightforward. I'm not sure about weird functions like inits and tails. Some extra care might be required for replicate.

I believe it would be possible to restructure things to make the distinction a top-level one, by using a possibly-empty type for the top and a nonempty one below. Zemyla has already identified some very solid unrelated reasons to want to separate the tops from the rest, but I'm somewhat concerned about source code duplication with that general approach. Even if we do that, it's not clear to me that we can make a non-empty/possibly-empty distinction without incurring a performance penalty.

On Thu, Apr 25, 2019, 1:17 PM John Ericson <[hidden email]> wrote:
I haven't looked into `Seq` in addition to `Map` and `Set`, just
`IntSet` and `IntMap`. But it might be a similar thing? I take it that
with `Seq` today only the root can be empty and everything else is
single or deep? That means we *would* just use a single Maybe-like thing
at the top level, no mutual recursion. But on the other hand to make
that to make that work efficiently we would would need GHC to support
unboxing sums, so 1 + 2 variants can become a flat 3.

Also, https://github.com/haskell/containers/pull/616 is now where the
actual implementation is happening, not just the datatype changes as
before. Feel free to comment on the concrete work in progress, everyone!

John

On 4/25/19 11:36 AM, Zemyla wrote:
> A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just
> Single or Deep.
>
> On Thu, Apr 25, 2019, 09:55 David Feuer <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     I don't see the benefit there, unless you see a way to work it
>     into the representation.
>
>     On Thu, Apr 25, 2019, 10:53 AM Zemyla <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         As long as we're doing this, can we also add NonEmptySeq as well?
>
>         On Thu, Apr 25, 2019, 09:11 Artyom Kazak <[hidden email]
>         <mailto:[hidden email]>> wrote:
>
>             I'm -1 on any kind of |Map = NEMap|.
>
>             An ordinary map and a non-empty map are semantically
>             different. I believe that if I non-empty maps were already
>             in |containers|, I would pretty much always care whether a
>             |Map| I see in code is a 0-map or 1-map.
>
>             Similarly, I prefer |Int| and |Word| instead of |Int| and
>             |Unsigned.Int|. (Luckily that's already the case.)
>
>             We already have a precedent with |Text| and |ByteString|,
>             where the lazy and the strict versions are only
>             distinguished by the module prefix. In my experience,
>             modules where both are used are pretty common, and I end
>             up just introducing |type LByteString = Lazy.ByteString|
>             in all my projects, because otherwise I need to scroll to
>             the imports section whenever I need to know which flavor
>             of bytestring is being used. (Or if I'm reading haddocks,
>             I have to look at the link because Haddock hides module
>             prefixes.)
>
>             "why not both" is even worse. I still can't trust the
>             |Map|, but now I also have to learn and remember that two
>             modules are the same. Speaking from experience again –
>             most people seem to be surprised by the fact that
>             |Data.Map.Lazy| and |Data.Map.Strict| export the same
>             |Map| type. The proposed module hierarchy would move
>             |containers| to the top of my "packages that confuse
>             beginners" list, beating even |aeson|.
>
>             As an aside, I wish we had a proper interface for
>             container-like structures, or at least a solution to name
>             scoping. I really like the way Rust does it, for instance,
>             where certain functions can be "attached" to a type – I'm
>             hesitant to call them "methods" because Rust is not an OOP
>             language.
>             On Apr 25 2019, at 2:49 pm, Mario Blažević
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 On 2019-04-18 11:00 p.m., David Feuer wrote:
>
>                     I'm in favor of the proposal. I find the
>                     isomorphism between Map (a,b) v
>                     and Map a (NonemptyMap b v) very pleasant. The
>                     fact that others have
>                     written less-performant implementations of this
>                     idea is rather
>                     convincing. The fact that doing this removes
>                     partial matches in the
>                     implementation is nice. And I'll take performance
>                     improvements where I
>                     can get them. The main question is the proper name
>                     of the type. Just
>                     Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>                     empty be capitalized?
>
>
>                 There seems to be a consensus for
>                 Data.Map.NonEmpty.NEMap, with the
>                 type and the functions slightly off the regular ones.
>                 This design would
>                 make it easier to use regular and non-empty containers
>                 together, but it
>                 be annoying for the use case of replacing all uses of
>                 an existing
>                 regular container with a non-empty one. I'd rather
>                 change just the
>                 import declaration than all occurrences of the type
>                 name and functions.
>
>                 I don't want to derail the implementation with
>                 bikeshedding, so I'm
>                 just going to ask why not both? The library can both
>                 export the tweaked
>                 names and add a module, say Data.NonEmpty.Map.Lazy,
>                 that exports the
>                 type synonym Map = NEMap. It would also rename all the
>                 functions back to
>                 their names from Data.Map.Lazy.
>
>
>
>                     On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson
>                     <[hidden email]> wrote:
>
>                     In
>                     https://github.com/haskell/containers/issues/608 I
>                     proposed
>                     adding non-empty variants of Map and Set, analogous to
>                     Data.List.NonEmpty for List, to containers.
>                     semigroupoids
>                     demonstrates the many uses and structure of
>                     non-empty containers in
>                     general, and libraries such as
>                     https://github.com/mstksg/nonempty-containers and
>                     https://github.com/andrewthad/non-empty-containers
>                     demonstrate the
>                     interest in non-empty maps and sets in particular.
>                     My favorite
>                     use-case is that they're needed to "curry"
>                     containers: for example,
>                     |Map (k0, k1) v| is isomorphic not to |Map k0 (Map
>                     k1 v)| but to
>                     |Map k0 (NonEmptyMap k1 v)|. I like this use-case
>                     because it comes
>                     from the containers themselves.
>
>                     Importantly, there's no good way to do this
>                     outside of containers;
>                     doing so leads to imbalancing / extra indirection,
>                     or massive code
>                     duplication. If one wraps the container was an
>                     extra value like
>                     Data.List.NonEmpty, one's left with an unavoidable
>                     extra
>                     indirection/imbalance. One can rectify this by
>                     copying and modifying
>                     the implementation of containers, but that's
>                     hardly maintainable;
>                     even as though the algorithms are the same, enough
>                     lines are touched
>                     that merging upstream containers is nigh impossible.
>
>                     On the other hand, the non-empty containers can be
>                     elegantly and
>                     sufficiently implemented alongside their originals
>                     by taking the Bin
>                     constructor and breaking it out into it's own
>                     type, mutually
>                     recursive with the original. This avoids the
>                     indirection/imbalancing
>                     and code duplication problems: the algorithms work
>                     exactly as before
>                     creating the same trees (remember the UNPACK), and
>                     no code
>                     duplicated since the functions become mutually
>                     recursive matching
>                     the types.
>
>                     To briefly summarize the thread:
>
>                     1. I proposed the issue after performing this same
>                     refactor on the
>                     dependent-map package:
>                     https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>                     a fork of containers.
>                     2. I made
>                     https://github.com/haskell/containers/pull/616
>                     which just
>                     changes the types, to make sure UNPACK preserved
>                     the importance.
>                     3.
>                     https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>                     the benchmarks showed rather than degrading
>                     performance, PR 616
>                     actually /improved/ it.
>
>                      If there is preliminary consensus, I'll make a
>                     second PR on top
>                     which generalizes the functions like on my
>                     dependent-map branch.
>
>                     Thanks,
>
>                     John
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email]
>                     <mailto:[hidden email]>
>                     <mailto:[hidden email]
>                     <mailto:[hidden email]>>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email] <mailto:[hidden email]>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                 _______________________________________________
>                 Libraries mailing list
>                 [hidden email] <mailto:[hidden email]>
>                 http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>             _______________________________________________
>             Libraries mailing list
>             [hidden email] <mailto:[hidden email]>
>             http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>         _______________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

John Ericson-2

Thanks for the clarification, David. Sounds like a good wait-and-see thing: after this lands, Zemyla's proposal and the experience with NonEmpty{Map,Set} can inform what Seq should end up looking like.

John

On 4/25/19 1:28 PM, David Feuer wrote:
No, a Seq can be Empty at the bottom too. It's definitely not a mutually recursive situation like Set or Map, and it's not immediately a top-level one like IntSet or IntMap. A nonempty sequence type would need somewhat different functions for everything. In most cases, the changes seem fairly straightforward. I'm not sure about weird functions like inits and tails. Some extra care might be required for replicate.

I believe it would be possible to restructure things to make the distinction a top-level one, by using a possibly-empty type for the top and a nonempty one below. Zemyla has already identified some very solid unrelated reasons to want to separate the tops from the rest, but I'm somewhat concerned about source code duplication with that general approach. Even if we do that, it's not clear to me that we can make a non-empty/possibly-empty distinction without incurring a performance penalty.

On Thu, Apr 25, 2019, 1:17 PM John Ericson [hidden email] wrote:
I haven't looked into `Seq` in addition to `Map` and `Set`, just
`IntSet` and `IntMap`. But it might be a similar thing? I take it that
with `Seq` today only the root can be empty and everything else is
single or deep? That means we *would* just use a single Maybe-like thing
at the top level, no mutual recursion. But on the other hand to make
that to make that work efficiently we would would need GHC to support
unboxing sums, so 1 + 2 variants can become a flat 3.

Also, https://github.com/haskell/containers/pull/616 is now where the
actual implementation is happening, not just the datatype changes as
before. Feel free to comment on the concrete work in progress, everyone!

John

On 4/25/19 11:36 AM, Zemyla wrote:
> A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just
> Single or Deep.
>
> On Thu, Apr 25, 2019, 09:55 David Feuer <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     I don't see the benefit there, unless you see a way to work it
>     into the representation.
>
>     On Thu, Apr 25, 2019, 10:53 AM Zemyla <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         As long as we're doing this, can we also add NonEmptySeq as well?
>
>         On Thu, Apr 25, 2019, 09:11 Artyom Kazak <[hidden email]
>         <mailto:[hidden email]>> wrote:
>
>             I'm -1 on any kind of |Map = NEMap|.
>
>             An ordinary map and a non-empty map are semantically
>             different. I believe that if I non-empty maps were already
>             in |containers|, I would pretty much always care whether a
>             |Map| I see in code is a 0-map or 1-map.
>
>             Similarly, I prefer |Int| and |Word| instead of |Int| and
>             |Unsigned.Int|. (Luckily that's already the case.)
>
>             We already have a precedent with |Text| and |ByteString|,
>             where the lazy and the strict versions are only
>             distinguished by the module prefix. In my experience,
>             modules where both are used are pretty common, and I end
>             up just introducing |type LByteString = Lazy.ByteString|
>             in all my projects, because otherwise I need to scroll to
>             the imports section whenever I need to know which flavor
>             of bytestring is being used. (Or if I'm reading haddocks,
>             I have to look at the link because Haddock hides module
>             prefixes.)
>
>             "why not both" is even worse. I still can't trust the
>             |Map|, but now I also have to learn and remember that two
>             modules are the same. Speaking from experience again –
>             most people seem to be surprised by the fact that
>             |Data.Map.Lazy| and |Data.Map.Strict| export the same
>             |Map| type. The proposed module hierarchy would move
>             |containers| to the top of my "packages that confuse
>             beginners" list, beating even |aeson|.
>
>             As an aside, I wish we had a proper interface for
>             container-like structures, or at least a solution to name
>             scoping. I really like the way Rust does it, for instance,
>             where certain functions can be "attached" to a type – I'm
>             hesitant to call them "methods" because Rust is not an OOP
>             language.
>             On Apr 25 2019, at 2:49 pm, Mario Blažević
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 On 2019-04-18 11:00 p.m., David Feuer wrote:
>
>                     I'm in favor of the proposal. I find the
>                     isomorphism between Map (a,b) v
>                     and Map a (NonemptyMap b v) very pleasant. The
>                     fact that others have
>                     written less-performant implementations of this
>                     idea is rather
>                     convincing. The fact that doing this removes
>                     partial matches in the
>                     implementation is nice. And I'll take performance
>                     improvements where I
>                     can get them. The main question is the proper name
>                     of the type. Just
>                     Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>                     empty be capitalized?
>
>
>                 There seems to be a consensus for
>                 Data.Map.NonEmpty.NEMap, with the
>                 type and the functions slightly off the regular ones.
>                 This design would
>                 make it easier to use regular and non-empty containers
>                 together, but it
>                 be annoying for the use case of replacing all uses of
>                 an existing
>                 regular container with a non-empty one. I'd rather
>                 change just the
>                 import declaration than all occurrences of the type
>                 name and functions.
>
>                 I don't want to derail the implementation with
>                 bikeshedding, so I'm
>                 just going to ask why not both? The library can both
>                 export the tweaked
>                 names and add a module, say Data.NonEmpty.Map.Lazy,
>                 that exports the
>                 type synonym Map = NEMap. It would also rename all the
>                 functions back to
>                 their names from Data.Map.Lazy.
>
>
>
>                     On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson
>                     [hidden email] wrote:
>
>                     In
>                     https://github.com/haskell/containers/issues/608 I
>                     proposed
>                     adding non-empty variants of Map and Set, analogous to
>                     Data.List.NonEmpty for List, to containers.
>                     semigroupoids
>                     demonstrates the many uses and structure of
>                     non-empty containers in
>                     general, and libraries such as
>                     https://github.com/mstksg/nonempty-containers and
>                     https://github.com/andrewthad/non-empty-containers
>                     demonstrate the
>                     interest in non-empty maps and sets in particular.
>                     My favorite
>                     use-case is that they're needed to "curry"
>                     containers: for example,
>                     |Map (k0, k1) v| is isomorphic not to |Map k0 (Map
>                     k1 v)| but to
>                     |Map k0 (NonEmptyMap k1 v)|. I like this use-case
>                     because it comes
>                     from the containers themselves.
>
>                     Importantly, there's no good way to do this
>                     outside of containers;
>                     doing so leads to imbalancing / extra indirection,
>                     or massive code
>                     duplication. If one wraps the container was an
>                     extra value like
>                     Data.List.NonEmpty, one's left with an unavoidable
>                     extra
>                     indirection/imbalance. One can rectify this by
>                     copying and modifying
>                     the implementation of containers, but that's
>                     hardly maintainable;
>                     even as though the algorithms are the same, enough
>                     lines are touched
>                     that merging upstream containers is nigh impossible.
>
>                     On the other hand, the non-empty containers can be
>                     elegantly and
>                     sufficiently implemented alongside their originals
>                     by taking the Bin
>                     constructor and breaking it out into it's own
>                     type, mutually
>                     recursive with the original. This avoids the
>                     indirection/imbalancing
>                     and code duplication problems: the algorithms work
>                     exactly as before
>                     creating the same trees (remember the UNPACK), and
>                     no code
>                     duplicated since the functions become mutually
>                     recursive matching
>                     the types.
>
>                     To briefly summarize the thread:
>
>                     1. I proposed the issue after performing this same
>                     refactor on the
>                     dependent-map package:
>                     https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>                     a fork of containers.
>                     2. I made
>                     https://github.com/haskell/containers/pull/616
>                     which just
>                     changes the types, to make sure UNPACK preserved
>                     the importance.
>                     3.
>                     https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>                     the benchmarks showed rather than degrading
>                     performance, PR 616
>                     actually /improved/ it.
>
>                      If there is preliminary consensus, I'll make a
>                     second PR on top
>                     which generalizes the functions like on my
>                     dependent-map branch.
>
>                     Thanks,
>
>                     John
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email]
>                     <mailto:[hidden email]>
>                     <mailto:[hidden email]
>                     <mailto:[hidden email]>>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email] <mailto:[hidden email]>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                 _______________________________________________
>                 Libraries mailing list
>                 [hidden email] <mailto:[hidden email]>
>                 http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>             _______________________________________________
>             Libraries mailing list
>             [hidden email] <mailto:[hidden email]>
>             http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>         _______________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

Zemyla
We could have a GADT for the FingerTree:

data Emptiness = E | NE

data FingerTree e a where
  EmptyT :: FingerTree 'E a
  SingleT :: a -> FingerTree e a
  DeepT :: !(Digit a) -> FingerTree e' (Node a) -> !(Digit a) -> FingerTree e a

And because Seq is a newtype wrapper around FingerTree (Elem a) anyway, Seq and NonEmptySeq would become newtype wrappers around FingerTree 'E (Elem a) and FingerTree 'NE (Elem a). This avoids the majority of the code duplication.

On Thu, Apr 25, 2019, 12:34 John Ericson <[hidden email]> wrote:

Thanks for the clarification, David. Sounds like a good wait-and-see thing: after this lands, Zemyla's proposal and the experience with NonEmpty{Map,Set} can inform what Seq should end up looking like.

John

On 4/25/19 1:28 PM, David Feuer wrote:
No, a Seq can be Empty at the bottom too. It's definitely not a mutually recursive situation like Set or Map, and it's not immediately a top-level one like IntSet or IntMap. A nonempty sequence type would need somewhat different functions for everything. In most cases, the changes seem fairly straightforward. I'm not sure about weird functions like inits and tails. Some extra care might be required for replicate.

I believe it would be possible to restructure things to make the distinction a top-level one, by using a possibly-empty type for the top and a nonempty one below. Zemyla has already identified some very solid unrelated reasons to want to separate the tops from the rest, but I'm somewhat concerned about source code duplication with that general approach. Even if we do that, it's not clear to me that we can make a non-empty/possibly-empty distinction without incurring a performance penalty.

On Thu, Apr 25, 2019, 1:17 PM John Ericson [hidden email] wrote:
I haven't looked into `Seq` in addition to `Map` and `Set`, just
`IntSet` and `IntMap`. But it might be a similar thing? I take it that
with `Seq` today only the root can be empty and everything else is
single or deep? That means we *would* just use a single Maybe-like thing
at the top level, no mutual recursion. But on the other hand to make
that to make that work efficiently we would would need GHC to support
unboxing sums, so 1 + 2 variants can become a flat 3.

Also, https://github.com/haskell/containers/pull/616 is now where the
actual implementation is happening, not just the datatype changes as
before. Feel free to comment on the concrete work in progress, everyone!

John

On 4/25/19 11:36 AM, Zemyla wrote:
> A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just
> Single or Deep.
>
> On Thu, Apr 25, 2019, 09:55 David Feuer <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     I don't see the benefit there, unless you see a way to work it
>     into the representation.
>
>     On Thu, Apr 25, 2019, 10:53 AM Zemyla <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         As long as we're doing this, can we also add NonEmptySeq as well?
>
>         On Thu, Apr 25, 2019, 09:11 Artyom Kazak <[hidden email]
>         <mailto:[hidden email]>> wrote:
>
>             I'm -1 on any kind of |Map = NEMap|.
>
>             An ordinary map and a non-empty map are semantically
>             different. I believe that if I non-empty maps were already
>             in |containers|, I would pretty much always care whether a
>             |Map| I see in code is a 0-map or 1-map.
>
>             Similarly, I prefer |Int| and |Word| instead of |Int| and
>             |Unsigned.Int|. (Luckily that's already the case.)
>
>             We already have a precedent with |Text| and |ByteString|,
>             where the lazy and the strict versions are only
>             distinguished by the module prefix. In my experience,
>             modules where both are used are pretty common, and I end
>             up just introducing |type LByteString = Lazy.ByteString|
>             in all my projects, because otherwise I need to scroll to
>             the imports section whenever I need to know which flavor
>             of bytestring is being used. (Or if I'm reading haddocks,
>             I have to look at the link because Haddock hides module
>             prefixes.)
>
>             "why not both" is even worse. I still can't trust the
>             |Map|, but now I also have to learn and remember that two
>             modules are the same. Speaking from experience again –
>             most people seem to be surprised by the fact that
>             |Data.Map.Lazy| and |Data.Map.Strict| export the same
>             |Map| type. The proposed module hierarchy would move
>             |containers| to the top of my "packages that confuse
>             beginners" list, beating even |aeson|.
>
>             As an aside, I wish we had a proper interface for
>             container-like structures, or at least a solution to name
>             scoping. I really like the way Rust does it, for instance,
>             where certain functions can be "attached" to a type – I'm
>             hesitant to call them "methods" because Rust is not an OOP
>             language.
>             On Apr 25 2019, at 2:49 pm, Mario Blažević
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 On 2019-04-18 11:00 p.m., David Feuer wrote:
>
>                     I'm in favor of the proposal. I find the
>                     isomorphism between Map (a,b) v
>                     and Map a (NonemptyMap b v) very pleasant. The
>                     fact that others have
>                     written less-performant implementations of this
>                     idea is rather
>                     convincing. The fact that doing this removes
>                     partial matches in the
>                     implementation is nice. And I'll take performance
>                     improvements where I
>                     can get them. The main question is the proper name
>                     of the type. Just
>                     Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>                     empty be capitalized?
>
>
>                 There seems to be a consensus for
>                 Data.Map.NonEmpty.NEMap, with the
>                 type and the functions slightly off the regular ones.
>                 This design would
>                 make it easier to use regular and non-empty containers
>                 together, but it
>                 be annoying for the use case of replacing all uses of
>                 an existing
>                 regular container with a non-empty one. I'd rather
>                 change just the
>                 import declaration than all occurrences of the type
>                 name and functions.
>
>                 I don't want to derail the implementation with
>                 bikeshedding, so I'm
>                 just going to ask why not both? The library can both
>                 export the tweaked
>                 names and add a module, say Data.NonEmpty.Map.Lazy,
>                 that exports the
>                 type synonym Map = NEMap. It would also rename all the
>                 functions back to
>                 their names from Data.Map.Lazy.
>
>
>
>                     On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson
>                     [hidden email] wrote:
>
>                     In
>                     https://github.com/haskell/containers/issues/608 I
>                     proposed
>                     adding non-empty variants of Map and Set, analogous to
>                     Data.List.NonEmpty for List, to containers.
>                     semigroupoids
>                     demonstrates the many uses and structure of
>                     non-empty containers in
>                     general, and libraries such as
>                     https://github.com/mstksg/nonempty-containers and
>                     https://github.com/andrewthad/non-empty-containers
>                     demonstrate the
>                     interest in non-empty maps and sets in particular.
>                     My favorite
>                     use-case is that they're needed to "curry"
>                     containers: for example,
>                     |Map (k0, k1) v| is isomorphic not to |Map k0 (Map
>                     k1 v)| but to
>                     |Map k0 (NonEmptyMap k1 v)|. I like this use-case
>                     because it comes
>                     from the containers themselves.
>
>                     Importantly, there's no good way to do this
>                     outside of containers;
>                     doing so leads to imbalancing / extra indirection,
>                     or massive code
>                     duplication. If one wraps the container was an
>                     extra value like
>                     Data.List.NonEmpty, one's left with an unavoidable
>                     extra
>                     indirection/imbalance. One can rectify this by
>                     copying and modifying
>                     the implementation of containers, but that's
>                     hardly maintainable;
>                     even as though the algorithms are the same, enough
>                     lines are touched
>                     that merging upstream containers is nigh impossible.
>
>                     On the other hand, the non-empty containers can be
>                     elegantly and
>                     sufficiently implemented alongside their originals
>                     by taking the Bin
>                     constructor and breaking it out into it's own
>                     type, mutually
>                     recursive with the original. This avoids the
>                     indirection/imbalancing
>                     and code duplication problems: the algorithms work
>                     exactly as before
>                     creating the same trees (remember the UNPACK), and
>                     no code
>                     duplicated since the functions become mutually
>                     recursive matching
>                     the types.
>
>                     To briefly summarize the thread:
>
>                     1. I proposed the issue after performing this same
>                     refactor on the
>                     dependent-map package:
>                     https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>                     a fork of containers.
>                     2. I made
>                     https://github.com/haskell/containers/pull/616
>                     which just
>                     changes the types, to make sure UNPACK preserved
>                     the importance.
>                     3.
>                     https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>                     the benchmarks showed rather than degrading
>                     performance, PR 616
>                     actually /improved/ it.
>
>                      If there is preliminary consensus, I'll make a
>                     second PR on top
>                     which generalizes the functions like on my
>                     dependent-map branch.
>
>                     Thanks,
>
>                     John
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email]
>                     <mailto:[hidden email]>
>                     <mailto:[hidden email]
>                     <mailto:[hidden email]>>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email] <mailto:[hidden email]>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                 _______________________________________________
>                 Libraries mailing list
>                 [hidden email] <mailto:[hidden email]>
>                 http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>             _______________________________________________
>             Libraries mailing list
>             [hidden email] <mailto:[hidden email]>
>             http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>         _______________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

David Feuer
Oh, I see what you mean. But then containers would depend on GADTs, which is rather counter to tradition. And I don't really see the benefit over a newtype and smart constructors around the existing Seq type.

On Thu, Apr 25, 2019, 1:41 PM Zemyla <[hidden email]> wrote:
We could have a GADT for the FingerTree:

data Emptiness = E | NE

data FingerTree e a where
  EmptyT :: FingerTree 'E a
  SingleT :: a -> FingerTree e a
  DeepT :: !(Digit a) -> FingerTree e' (Node a) -> !(Digit a) -> FingerTree e a

And because Seq is a newtype wrapper around FingerTree (Elem a) anyway, Seq and NonEmptySeq would become newtype wrappers around FingerTree 'E (Elem a) and FingerTree 'NE (Elem a). This avoids the majority of the code duplication.

On Thu, Apr 25, 2019, 12:34 John Ericson <[hidden email]> wrote:

Thanks for the clarification, David. Sounds like a good wait-and-see thing: after this lands, Zemyla's proposal and the experience with NonEmpty{Map,Set} can inform what Seq should end up looking like.

John

On 4/25/19 1:28 PM, David Feuer wrote:
No, a Seq can be Empty at the bottom too. It's definitely not a mutually recursive situation like Set or Map, and it's not immediately a top-level one like IntSet or IntMap. A nonempty sequence type would need somewhat different functions for everything. In most cases, the changes seem fairly straightforward. I'm not sure about weird functions like inits and tails. Some extra care might be required for replicate.

I believe it would be possible to restructure things to make the distinction a top-level one, by using a possibly-empty type for the top and a nonempty one below. Zemyla has already identified some very solid unrelated reasons to want to separate the tops from the rest, but I'm somewhat concerned about source code duplication with that general approach. Even if we do that, it's not clear to me that we can make a non-empty/possibly-empty distinction without incurring a performance penalty.

On Thu, Apr 25, 2019, 1:17 PM John Ericson [hidden email] wrote:
I haven't looked into `Seq` in addition to `Map` and `Set`, just
`IntSet` and `IntMap`. But it might be a similar thing? I take it that
with `Seq` today only the root can be empty and everything else is
single or deep? That means we *would* just use a single Maybe-like thing
at the top level, no mutual recursion. But on the other hand to make
that to make that work efficiently we would would need GHC to support
unboxing sums, so 1 + 2 variants can become a flat 3.

Also, https://github.com/haskell/containers/pull/616 is now where the
actual implementation is happening, not just the datatype changes as
before. Feel free to comment on the concrete work in progress, everyone!

John

On 4/25/19 11:36 AM, Zemyla wrote:
> A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just
> Single or Deep.
>
> On Thu, Apr 25, 2019, 09:55 David Feuer <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     I don't see the benefit there, unless you see a way to work it
>     into the representation.
>
>     On Thu, Apr 25, 2019, 10:53 AM Zemyla <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         As long as we're doing this, can we also add NonEmptySeq as well?
>
>         On Thu, Apr 25, 2019, 09:11 Artyom Kazak <[hidden email]
>         <mailto:[hidden email]>> wrote:
>
>             I'm -1 on any kind of |Map = NEMap|.
>
>             An ordinary map and a non-empty map are semantically
>             different. I believe that if I non-empty maps were already
>             in |containers|, I would pretty much always care whether a
>             |Map| I see in code is a 0-map or 1-map.
>
>             Similarly, I prefer |Int| and |Word| instead of |Int| and
>             |Unsigned.Int|. (Luckily that's already the case.)
>
>             We already have a precedent with |Text| and |ByteString|,
>             where the lazy and the strict versions are only
>             distinguished by the module prefix. In my experience,
>             modules where both are used are pretty common, and I end
>             up just introducing |type LByteString = Lazy.ByteString|
>             in all my projects, because otherwise I need to scroll to
>             the imports section whenever I need to know which flavor
>             of bytestring is being used. (Or if I'm reading haddocks,
>             I have to look at the link because Haddock hides module
>             prefixes.)
>
>             "why not both" is even worse. I still can't trust the
>             |Map|, but now I also have to learn and remember that two
>             modules are the same. Speaking from experience again –
>             most people seem to be surprised by the fact that
>             |Data.Map.Lazy| and |Data.Map.Strict| export the same
>             |Map| type. The proposed module hierarchy would move
>             |containers| to the top of my "packages that confuse
>             beginners" list, beating even |aeson|.
>
>             As an aside, I wish we had a proper interface for
>             container-like structures, or at least a solution to name
>             scoping. I really like the way Rust does it, for instance,
>             where certain functions can be "attached" to a type – I'm
>             hesitant to call them "methods" because Rust is not an OOP
>             language.
>             On Apr 25 2019, at 2:49 pm, Mario Blažević
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 On 2019-04-18 11:00 p.m., David Feuer wrote:
>
>                     I'm in favor of the proposal. I find the
>                     isomorphism between Map (a,b) v
>                     and Map a (NonemptyMap b v) very pleasant. The
>                     fact that others have
>                     written less-performant implementations of this
>                     idea is rather
>                     convincing. The fact that doing this removes
>                     partial matches in the
>                     implementation is nice. And I'll take performance
>                     improvements where I
>                     can get them. The main question is the proper name
>                     of the type. Just
>                     Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>                     empty be capitalized?
>
>
>                 There seems to be a consensus for
>                 Data.Map.NonEmpty.NEMap, with the
>                 type and the functions slightly off the regular ones.
>                 This design would
>                 make it easier to use regular and non-empty containers
>                 together, but it
>                 be annoying for the use case of replacing all uses of
>                 an existing
>                 regular container with a non-empty one. I'd rather
>                 change just the
>                 import declaration than all occurrences of the type
>                 name and functions.
>
>                 I don't want to derail the implementation with
>                 bikeshedding, so I'm
>                 just going to ask why not both? The library can both
>                 export the tweaked
>                 names and add a module, say Data.NonEmpty.Map.Lazy,
>                 that exports the
>                 type synonym Map = NEMap. It would also rename all the
>                 functions back to
>                 their names from Data.Map.Lazy.
>
>
>
>                     On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson
>                     [hidden email] wrote:
>
>                     In
>                     https://github.com/haskell/containers/issues/608 I
>                     proposed
>                     adding non-empty variants of Map and Set, analogous to
>                     Data.List.NonEmpty for List, to containers.
>                     semigroupoids
>                     demonstrates the many uses and structure of
>                     non-empty containers in
>                     general, and libraries such as
>                     https://github.com/mstksg/nonempty-containers and
>                     https://github.com/andrewthad/non-empty-containers
>                     demonstrate the
>                     interest in non-empty maps and sets in particular.
>                     My favorite
>                     use-case is that they're needed to "curry"
>                     containers: for example,
>                     |Map (k0, k1) v| is isomorphic not to |Map k0 (Map
>                     k1 v)| but to
>                     |Map k0 (NonEmptyMap k1 v)|. I like this use-case
>                     because it comes
>                     from the containers themselves.
>
>                     Importantly, there's no good way to do this
>                     outside of containers;
>                     doing so leads to imbalancing / extra indirection,
>                     or massive code
>                     duplication. If one wraps the container was an
>                     extra value like
>                     Data.List.NonEmpty, one's left with an unavoidable
>                     extra
>                     indirection/imbalance. One can rectify this by
>                     copying and modifying
>                     the implementation of containers, but that's
>                     hardly maintainable;
>                     even as though the algorithms are the same, enough
>                     lines are touched
>                     that merging upstream containers is nigh impossible.
>
>                     On the other hand, the non-empty containers can be
>                     elegantly and
>                     sufficiently implemented alongside their originals
>                     by taking the Bin
>                     constructor and breaking it out into it's own
>                     type, mutually
>                     recursive with the original. This avoids the
>                     indirection/imbalancing
>                     and code duplication problems: the algorithms work
>                     exactly as before
>                     creating the same trees (remember the UNPACK), and
>                     no code
>                     duplicated since the functions become mutually
>                     recursive matching
>                     the types.
>
>                     To briefly summarize the thread:
>
>                     1. I proposed the issue after performing this same
>                     refactor on the
>                     dependent-map package:
>                     https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>                     a fork of containers.
>                     2. I made
>                     https://github.com/haskell/containers/pull/616
>                     which just
>                     changes the types, to make sure UNPACK preserved
>                     the importance.
>                     3.
>                     https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>                     the benchmarks showed rather than degrading
>                     performance, PR 616
>                     actually /improved/ it.
>
>                      If there is preliminary consensus, I'll make a
>                     second PR on top
>                     which generalizes the functions like on my
>                     dependent-map branch.
>
>                     Thanks,
>
>                     John
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email]
>                     <mailto:[hidden email]>
>                     <mailto:[hidden email]
>                     <mailto:[hidden email]>>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email] <mailto:[hidden email]>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                 _______________________________________________
>                 Libraries mailing list
>                 [hidden email] <mailto:[hidden email]>
>                 http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>             _______________________________________________
>             Libraries mailing list
>             [hidden email] <mailto:[hidden email]>
>             http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>         _______________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

John Ericson-2

I think the concrete benefit to GADTs is that the tags are not monomorphized wrt the equality constraints. So I'd view this or the unpacking sums things as two ways to end up with the 3 variant tag and no extra indirection or partiality. So for those goals, it's, adopt GADTs or modify GHC, pick your poison! :)

John

On 4/25/19 2:04 PM, David Feuer wrote:
Oh, I see what you mean. But then containers would depend on GADTs, which is rather counter to tradition. And I don't really see the benefit over a newtype and smart constructors around the existing Seq type.

On Thu, Apr 25, 2019, 1:41 PM Zemyla <[hidden email]> wrote:
We could have a GADT for the FingerTree:

data Emptiness = E | NE

data FingerTree e a where
  EmptyT :: FingerTree 'E a
  SingleT :: a -> FingerTree e a
  DeepT :: !(Digit a) -> FingerTree e' (Node a) -> !(Digit a) -> FingerTree e a

And because Seq is a newtype wrapper around FingerTree (Elem a) anyway, Seq and NonEmptySeq would become newtype wrappers around FingerTree 'E (Elem a) and FingerTree 'NE (Elem a). This avoids the majority of the code duplication.

On Thu, Apr 25, 2019, 12:34 John Ericson [hidden email] wrote:

Thanks for the clarification, David. Sounds like a good wait-and-see thing: after this lands, Zemyla's proposal and the experience with NonEmpty{Map,Set} can inform what Seq should end up looking like.

John

On 4/25/19 1:28 PM, David Feuer wrote:
No, a Seq can be Empty at the bottom too. It's definitely not a mutually recursive situation like Set or Map, and it's not immediately a top-level one like IntSet or IntMap. A nonempty sequence type would need somewhat different functions for everything. In most cases, the changes seem fairly straightforward. I'm not sure about weird functions like inits and tails. Some extra care might be required for replicate.

I believe it would be possible to restructure things to make the distinction a top-level one, by using a possibly-empty type for the top and a nonempty one below. Zemyla has already identified some very solid unrelated reasons to want to separate the tops from the rest, but I'm somewhat concerned about source code duplication with that general approach. Even if we do that, it's not clear to me that we can make a non-empty/possibly-empty distinction without incurring a performance penalty.

On Thu, Apr 25, 2019, 1:17 PM John Ericson [hidden email] wrote:
I haven't looked into `Seq` in addition to `Map` and `Set`, just
`IntSet` and `IntMap`. But it might be a similar thing? I take it that
with `Seq` today only the root can be empty and everything else is
single or deep? That means we *would* just use a single Maybe-like thing
at the top level, no mutual recursion. But on the other hand to make
that to make that work efficiently we would would need GHC to support
unboxing sums, so 1 + 2 variants can become a flat 3.

Also, https://github.com/haskell/containers/pull/616 is now where the
actual implementation is happening, not just the datatype changes as
before. Feel free to comment on the concrete work in progress, everyone!

John

On 4/25/19 11:36 AM, Zemyla wrote:
> A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just
> Single or Deep.
>
> On Thu, Apr 25, 2019, 09:55 David Feuer <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     I don't see the benefit there, unless you see a way to work it
>     into the representation.
>
>     On Thu, Apr 25, 2019, 10:53 AM Zemyla <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         As long as we're doing this, can we also add NonEmptySeq as well?
>
>         On Thu, Apr 25, 2019, 09:11 Artyom Kazak <[hidden email]
>         <mailto:[hidden email]>> wrote:
>
>             I'm -1 on any kind of |Map = NEMap|.
>
>             An ordinary map and a non-empty map are semantically
>             different. I believe that if I non-empty maps were already
>             in |containers|, I would pretty much always care whether a
>             |Map| I see in code is a 0-map or 1-map.
>
>             Similarly, I prefer |Int| and |Word| instead of |Int| and
>             |Unsigned.Int|. (Luckily that's already the case.)
>
>             We already have a precedent with |Text| and |ByteString|,
>             where the lazy and the strict versions are only
>             distinguished by the module prefix. In my experience,
>             modules where both are used are pretty common, and I end
>             up just introducing |type LByteString = Lazy.ByteString|
>             in all my projects, because otherwise I need to scroll to
>             the imports section whenever I need to know which flavor
>             of bytestring is being used. (Or if I'm reading haddocks,
>             I have to look at the link because Haddock hides module
>             prefixes.)
>
>             "why not both" is even worse. I still can't trust the
>             |Map|, but now I also have to learn and remember that two
>             modules are the same. Speaking from experience again –
>             most people seem to be surprised by the fact that
>             |Data.Map.Lazy| and |Data.Map.Strict| export the same
>             |Map| type. The proposed module hierarchy would move
>             |containers| to the top of my "packages that confuse
>             beginners" list, beating even |aeson|.
>
>             As an aside, I wish we had a proper interface for
>             container-like structures, or at least a solution to name
>             scoping. I really like the way Rust does it, for instance,
>             where certain functions can be "attached" to a type – I'm
>             hesitant to call them "methods" because Rust is not an OOP
>             language.
>             On Apr 25 2019, at 2:49 pm, Mario Blažević
>             <[hidden email] <mailto:[hidden email]>> wrote:
>
>                 On 2019-04-18 11:00 p.m., David Feuer wrote:
>
>                     I'm in favor of the proposal. I find the
>                     isomorphism between Map (a,b) v
>                     and Map a (NonemptyMap b v) very pleasant. The
>                     fact that others have
>                     written less-performant implementations of this
>                     idea is rather
>                     convincing. The fact that doing this removes
>                     partial matches in the
>                     implementation is nice. And I'll take performance
>                     improvements where I
>                     can get them. The main question is the proper name
>                     of the type. Just
>                     Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>                     empty be capitalized?
>
>
>                 There seems to be a consensus for
>                 Data.Map.NonEmpty.NEMap, with the
>                 type and the functions slightly off the regular ones.
>                 This design would
>                 make it easier to use regular and non-empty containers
>                 together, but it
>                 be annoying for the use case of replacing all uses of
>                 an existing
>                 regular container with a non-empty one. I'd rather
>                 change just the
>                 import declaration than all occurrences of the type
>                 name and functions.
>
>                 I don't want to derail the implementation with
>                 bikeshedding, so I'm
>                 just going to ask why not both? The library can both
>                 export the tweaked
>                 names and add a module, say Data.NonEmpty.Map.Lazy,
>                 that exports the
>                 type synonym Map = NEMap. It would also rename all the
>                 functions back to
>                 their names from Data.Map.Lazy.
>
>
>
>                     On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson
>                     [hidden email] wrote:
>
>                     In
>                     https://github.com/haskell/containers/issues/608 I
>                     proposed
>                     adding non-empty variants of Map and Set, analogous to
>                     Data.List.NonEmpty for List, to containers.
>                     semigroupoids
>                     demonstrates the many uses and structure of
>                     non-empty containers in
>                     general, and libraries such as
>                     https://github.com/mstksg/nonempty-containers and
>                     https://github.com/andrewthad/non-empty-containers
>                     demonstrate the
>                     interest in non-empty maps and sets in particular.
>                     My favorite
>                     use-case is that they're needed to "curry"
>                     containers: for example,
>                     |Map (k0, k1) v| is isomorphic not to |Map k0 (Map
>                     k1 v)| but to
>                     |Map k0 (NonEmptyMap k1 v)|. I like this use-case
>                     because it comes
>                     from the containers themselves.
>
>                     Importantly, there's no good way to do this
>                     outside of containers;
>                     doing so leads to imbalancing / extra indirection,
>                     or massive code
>                     duplication. If one wraps the container was an
>                     extra value like
>                     Data.List.NonEmpty, one's left with an unavoidable
>                     extra
>                     indirection/imbalance. One can rectify this by
>                     copying and modifying
>                     the implementation of containers, but that's
>                     hardly maintainable;
>                     even as though the algorithms are the same, enough
>                     lines are touched
>                     that merging upstream containers is nigh impossible.
>
>                     On the other hand, the non-empty containers can be
>                     elegantly and
>                     sufficiently implemented alongside their originals
>                     by taking the Bin
>                     constructor and breaking it out into it's own
>                     type, mutually
>                     recursive with the original. This avoids the
>                     indirection/imbalancing
>                     and code duplication problems: the algorithms work
>                     exactly as before
>                     creating the same trees (remember the UNPACK), and
>                     no code
>                     duplicated since the functions become mutually
>                     recursive matching
>                     the types.
>
>                     To briefly summarize the thread:
>
>                     1. I proposed the issue after performing this same
>                     refactor on the
>                     dependent-map package:
>                     https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>                     a fork of containers.
>                     2. I made
>                     https://github.com/haskell/containers/pull/616
>                     which just
>                     changes the types, to make sure UNPACK preserved
>                     the importance.
>                     3.
>                     https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>                     the benchmarks showed rather than degrading
>                     performance, PR 616
>                     actually /improved/ it.
>
>                      If there is preliminary consensus, I'll make a
>                     second PR on top
>                     which generalizes the functions like on my
>                     dependent-map branch.
>
>                     Thanks,
>
>                     John
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email]
>                     <mailto:[hidden email]>
>                     <mailto:[hidden email]
>                     <mailto:[hidden email]>>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                     _______________________________________________
>                     Libraries mailing list
>                     [hidden email] <mailto:[hidden email]>
>                     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
>                 _______________________________________________
>                 Libraries mailing list
>                 [hidden email] <mailto:[hidden email]>
>                 http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>             _______________________________________________
>             Libraries mailing list
>             [hidden email] <mailto:[hidden email]>
>             http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>         _______________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

Henning Thielemann
In reply to this post by Mario Blažević

On Thu, 25 Apr 2019, Mario Blažević wrote:

> (Speaking of which, what would be the type of deleteNE?)

NonEmptyMap.delete :: k -> NonEmptyMap k a -> Map k a
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

Henning Thielemann
In reply to this post by Artyom Kazak-2

On Thu, 25 Apr 2019, Artyom Kazak wrote:

> We already have a precedent with Text and ByteString, where the lazy and
> the strict versions are only distinguished by the module prefix. In my
> experience, modules where both are used are pretty common, and I end up
> just introducing type LByteString = Lazy.ByteString in all my projects,

Why not just use Lazy.ByteString? That's very comprehensible and that is
what qualification is intended for.
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

Theodore Lief Gannon
In reply to this post by Vladislav Zavialov-2
On Fri, Apr 19, 2019 at 2:11 AM Vladislav Zavialov <[hidden email]> wrote:
NonEmptyMap and NonEmptySet are way more readable than NEMap and NESet, the latter read as “NEM ap” and “NES et”.

Agreed on this, I really dislike prefixed abbreviations. Less so for suffixes; MapNE and SetNE are harder to mis-parse. But overall, I prefer spelling it out. There is value in clarity.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

Jon Purdy
In reply to this post by Andreas Abel


On Thu, Apr 18, 2019 at 11:49 PM Andreas Abel <[hidden email]> wrote:

We definitely should /not/ repeat the choice made for
Data.List.NonEmpty.NonEmpty.  Using non-empty lists /always/ requires a
type synonym, like

   import qualified Data.List.NonEmpty as NEList
   type NEList = NEList.NonEmpty

which is quite ugly, and type synonyms are sometimes expanded in GHC
error messages, which causes additional trouble.

I’m not sure what you mean here by “always requires a type synonym”—I typically use NonEmpty like any other module meant to be imported qualified, like Text, Vector, Map, Set, &c.:

import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NonEmpty  -- or NE if I’m lazy

example :: NonEmpty X -> Y


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Add NonEmptyMap and NonEmptySet to containers

Andreas Abel
 > I’m not sure what you mean here by “always requires a type synonym”
 > ...
 > example :: NonEmpty X -> Y

Jon, here you are exactly demonstrating the problem.  NonEmpty what?
List, Set, Map, HashMap, IntMap, ...?

On 2019-04-28 00:07, Jon Purdy wrote:

>
>
> On Thu, Apr 18, 2019 at 11:49 PM Andreas Abel <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>
>     We definitely should /not/ repeat the choice made for
>     Data.List.NonEmpty.NonEmpty.  Using non-empty lists /always/ requires a
>     type synonym, like
>
>         import qualified Data.List.NonEmpty as NEList
>         type NEList = NEList.NonEmpty
>
>     which is quite ugly, and type synonyms are sometimes expanded in GHC
>     error messages, which causes additional trouble.
>
>
> I’m not sure what you mean here by “always requires a type synonym”—I
> typically use NonEmpty like any other module meant to be imported
> qualified, like Text, Vector, Map, Set, &c.:
>
> import Data.List.NonEmpty (NonEmpty)
> import qualified Data.List.NonEmpty as NonEmpty  -- or NE if I’m lazy
>
> example :: NonEmpty X -> Y
>
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
12