Proposal: Add filterM or filterA to Data.Sequence

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

Proposal: Add filterM or filterA to Data.Sequence

David Feuer
This can be given exactly the same implementation as the one for lists:

filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a)
filterM p = foldr go (pure empty)
  where
    go x r = f <$> p x <*> r
      where
        f flg ys = if flg then x <| ys else ys


Bikeshed all you like over the name.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Andreas Abel
+1

On 28.12.2014 06:22, David Feuer wrote:

> This can be given exactly the same implementation as the one for lists:
>
> filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a)
> filterM p = foldr go (pure empty)
>    where
>      go x r = f <$> p x <*> r
>        where
>          f flg ys = if flg then x <| ys else ys
>
>
> Bikeshed all you like over the name.


--
Andreas Abel  <><      Du bist der geliebte Mensch.

Department of Computer Science and Engineering
Chalmers and Gothenburg University, Sweden

[hidden email]
http://www2.tcs.ifi.lmu.de/~abel/
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Edward Kmett-2
In reply to this post by David Feuer
+1 to just generalizing filterM in Data.Sequence

On Sun, Dec 28, 2014 at 12:22 AM, David Feuer <[hidden email]> wrote:
This can be given exactly the same implementation as the one for lists:

filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a)
filterM p = foldr go (pure empty)
  where
    go x r = f <$> p x <*> r
      where
        f flg ys = if flg then x <| ys else ys


Bikeshed all you like over the name.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Johan Tibell-2
We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.

On Mon, Dec 29, 2014 at 6:13 AM, Edward Kmett <[hidden email]> wrote:
+1 to just generalizing filterM in Data.Sequence

On Sun, Dec 28, 2014 at 12:22 AM, David Feuer <[hidden email]> wrote:
This can be given exactly the same implementation as the one for lists:

filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a)
filterM p = foldr go (pure empty)
  where
    go x r = f <$> p x <*> r
      where
        f flg ys = if flg then x <| ys else ys


Bikeshed all you like over the name.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Artyom Kazak-2
I’m not sure I understand. Do you mean that |filterM| shouldn’t exist
for data structures for which |filterM . toList| is as fast?

If this is the case, I wish that it was at least specified in the
documentation that e.g. “this function doesn’t exist because the naive
composition is guaranteed to be optimised away / the faster version is
actually impossible to write / etc.”. I find myself wondering way too
often whether some piece of code I’ve written is a potential candidate
for optimisation, and knowing in advance that the naive version is the
“recommended” approach lets me not waste my time on benchmarking code
which was already benchmarked by others.

On 12/30/2014 08:05 PM, Johan Tibell wrote:

> We should check that `filterM . toList` isn't as fast. That was the
> case for bytestring and we rejected adding filterM there for that reason.
>
> On Mon, Dec 29, 2014 at 6:13 AM, Edward Kmett <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     +1 to just generalizing filterM in Data.Sequence
>
>     On Sun, Dec 28, 2014 at 12:22 AM, David Feuer
>     <[hidden email] <mailto:[hidden email]>> wrote:
>
>         This can be given exactly the same implementation as the one
>         for lists:
>
>         filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a)
>         filterM p = foldr go (pure empty)
>           where
>             go x r = f <0.3927809241601645gt; p x <*> r
>               where
>                 f flg ys = if flg then x <| ys else ys
>
>
>         Bikeshed all you like over the name.
>         _______________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://www.haskell.org/mailman/listinfo/libraries
>
>
>
>     _______________________________________________
>     Libraries mailing list
>     [hidden email] <mailto:[hidden email]>
>     http://www.haskell.org/mailman/listinfo/libraries
>
>
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Johan Tibell-2
The larger point is this: should we have applicative functions for every higher order function, for every data type? For example, one could make an argument for having Map.alterA in addition to Map.alter, because without alterA one would have to use a combination of lookup and insert, which could be slower, to achieve the same effect.

The obvious downside is the explosion of functions in the API, which is even worse due to their already being lazy and strict versions of most higher-order function (i.e. now we'd have to have 2*2 versions of every function). This seems like a failure in composability and abstraction. Until we've figured out a way to deal with this general issue that doesn't involve duplicating tons of code and swelling the API, I've been pushing back on changes like this.

In this particular case, having to go via lists might hurt performance a bit, but might still be better than the other alternative.

As one of the maintainers of the containers package these are the kind of issues I have to consider*.

* This is by the way one of the reasons that it's important for packages to have dedicated maintainers, so make sure proposed changes are considered in the larger context of the health and evolution of the package as a whole.

On Tue, Dec 30, 2014 at 12:52 PM, Artyom <[hidden email]> wrote:
I’m not sure I understand. Do you mean that |filterM| shouldn’t exist for data structures for which |filterM . toList| is as fast?

If this is the case, I wish that it was at least specified in the documentation that e.g. “this function doesn’t exist because the naive composition is guaranteed to be optimised away / the faster version is actually impossible to write / etc.”. I find myself wondering way too often whether some piece of code I’ve written is a potential candidate for optimisation, and knowing in advance that the naive version is the “recommended” approach lets me not waste my time on benchmarking code which was already benchmarked by others.

On 12/30/2014 08:05 PM, Johan Tibell wrote:

We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.

On Mon, Dec 29, 2014 at 6:13 AM, Edward Kmett <[hidden email] <mailto:[hidden email]>> wrote:

    +1 to just generalizing filterM in Data.Sequence

    On Sun, Dec 28, 2014 at 12:22 AM, David Feuer
    <[hidden email] <mailto:[hidden email]>> wrote:

        This can be given exactly the same implementation as the one
        for lists:

        filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a)
        filterM p = foldr go (pure empty)
          where
            go x r = f <0.3927809241601645gt; p x <*> r
              where
                f flg ys = if flg then x <| ys else ys


        Bikeshed all you like over the name.
        _______________________________________________
        Libraries mailing list
        [hidden email] <mailto:[hidden email]>
        http://www.haskell.org/mailman/listinfo/libraries



    _______________________________________________
    Libraries mailing list
    [hidden email] <mailto:[hidden email]>
    http://www.haskell.org/mailman/listinfo/libraries




_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

David Feuer
In reply to this post by Johan Tibell-2
On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell <[hidden email]> wrote:
> We should check that `filterM . toList` isn't as fast. That was the case for
> bytestring and we rejected adding filterM there for that reason.

Let me expand that definition to what it really looks like:

  filterMSeq f = fmap fromList . filterM f . toList

Yes, you can do this. Yes, it's probably even pretty efficient. But
it's a bit painful to have to expand things out like this just to
switch a bit of code from lists to sequences.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

David Feuer
In reply to this post by Johan Tibell-2
On Tue, Dec 30, 2014 at 1:05 PM, Johan Tibell <[hidden email]> wrote:
> The obvious downside is the explosion of functions in the API, which is even
> worse due to their already being lazy and strict versions of most
> higher-order function (i.e. now we'd have to have 2*2 versions of every
> function). This seems like a failure in composability and abstraction. Until
> we've figured out a way to deal with this general issue that doesn't involve
> duplicating tons of code and swelling the API, I've been pushing back on
> changes like this.

1. I think it's better to duplicate code in a library than to make its
users duplicate the code themselves.

2. One option would be to make Data.Sequence.Private, exposing all the
types and various internals as well as defining all the necessary
instances, and then to add Data.Sequence.This, Data.Sequence.That,
etc.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Johan Tibell-2


On Tue, Dec 30, 2014 at 1:21 PM, David Feuer <[hidden email]> wrote:
On Tue, Dec 30, 2014 at 1:05 PM, Johan Tibell <[hidden email]> wrote:
> The obvious downside is the explosion of functions in the API, which is even
> worse due to their already being lazy and strict versions of most
> higher-order function (i.e. now we'd have to have 2*2 versions of every
> function). This seems like a failure in composability and abstraction. Until
> we've figured out a way to deal with this general issue that doesn't involve
> duplicating tons of code and swelling the API, I've been pushing back on
> changes like this.

1. I think it's better to duplicate code in a library than to make its
users duplicate the code themselves.

I tend to disagree, especially if we're not talking about duplicating

    filterMSeq f = fmap fromList . filterM f . toList

but the whole implementation of Seq.filter (which would need to if you want an optimal implementation of filterMSeq).
 
2. One option would be to make Data.Sequence.Private, exposing all the
types and various internals as well as defining all the necessary
instances, and then to add Data.Sequence.This, Data.Sequence.That,
etc.

I don't want to expose the internals (we don't do that for other containers types) because we wouldn't, in practice, be able to change the implementation if we wanted to.


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Kim-Ee Yeoh
Administrator
In reply to this post by Johan Tibell-2

On Wed, Dec 31, 2014 at 1:05 AM, Johan Tibell <[hidden email]> wrote:
As one of the maintainers of the containers package these are the kind of issues I have to consider*.

I share these concerns too with Johan, especially about code duplication and package maintenance.

Kudos to him for stewarding it so successfully all along for I and countless others appreciate his work. And I think it's unfair to push work around like this. What happens is open source fatigue and you find your best contributors leaving.

Traditionally, if you found some missing function, you would roll your own package and upload it for the benefit of everyone.

The proper resolution of this proposal is Data.MissingSeq.filterM.

-- Kim-Ee

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Milan Straka
In reply to this post by David Feuer
Hi all,

> -----Original message-----
> From: David Feuer <[hidden email]>
> Sent: 30 Dec 2014, 13:15
>
> On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell <[hidden email]> wrote:
> > We should check that `filterM . toList` isn't as fast. That was the case for
> > bytestring and we rejected adding filterM there for that reason.
>
> Let me expand that definition to what it really looks like:
>
>   filterMSeq f = fmap fromList . filterM f . toList
>
> Yes, you can do this. Yes, it's probably even pretty efficient. But
> it's a bit painful to have to expand things out like this just to
> switch a bit of code from lists to sequences.

the problem with this is that there are really many functions that
"would be useful", but we do not really want to add them all.

A general rule I have been using is "can this be implemented
straightforwardly using existing methods without hurting performance
much"? So I would like to see the numbers, please.

Also, filterM is not yet settled in base, so we should wait some time
until it is.

Cheers,
Milan

PS: It is a shame we cannot reuse Foldable and/or Traversable for
filtering. Maybe we should add a new Filterable class?
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

David Feuer
There won't be any such numbers in this case, because access to the
details of the sequence structure does not seem to help implement this
function. The same is true of the following, all of which already
exist:

replicate, replicateM, iterateN  -- all are implemented using replicateA
unfoldr, unfoldl --implemented using <| and |>
null -- can be written just as well using  null xs = case viewl xs of
{EmptyL -> True; _ -> False}
scanl, scanr, scanl1, scanr1 -- defined ultimately in terms of traverse
spanl, spanr, breakl, breakr, takeWhile, dropWhile -- defined in terms
of splitAt
findIndicesL, findIndicesR -- defined in terms of foldlWithIndex, foldrWithIndex
take, drop -- previously defined using splitAt; currently defined
using splitAt', but those two functions are likely to merge
filter -- uses foldl
elemIndexL, elemIndicesL, elemIndexR, elemIndicesR, findIndexL,
findIndexR -- all defined using findIndicesL and findIndicesR
update -- uses adjust
zip, zipWith3, zip3, zipWith4, zip4 -- All defined from zipWith

I think almost all of these functions are good to have around
(although the elemIndex family seems a bit silly). Adding a few more
to round things out seems reasonable to me as well.


On Tue, Dec 30, 2014 at 5:41 PM, Milan Straka <[hidden email]> wrote:

> Hi all,
>
>> -----Original message-----
>> From: David Feuer <[hidden email]>
>> Sent: 30 Dec 2014, 13:15
>>
>> On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell <[hidden email]> wrote:
>> > We should check that `filterM . toList` isn't as fast. That was the case for
>> > bytestring and we rejected adding filterM there for that reason.
>>
>> Let me expand that definition to what it really looks like:
>>
>>   filterMSeq f = fmap fromList . filterM f . toList
>>
>> Yes, you can do this. Yes, it's probably even pretty efficient. But
>> it's a bit painful to have to expand things out like this just to
>> switch a bit of code from lists to sequences.
>
> the problem with this is that there are really many functions that
> "would be useful", but we do not really want to add them all.
>
> A general rule I have been using is "can this be implemented
> straightforwardly using existing methods without hurting performance
> much"? So I would like to see the numbers, please.
>
> Also, filterM is not yet settled in base, so we should wait some time
> until it is.
>
> Cheers,
> Milan
>
> PS: It is a shame we cannot reuse Foldable and/or Traversable for
> filtering. Maybe we should add a new Filterable class?
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Milan Straka
Hi all,

> -----Original message-----
> From: David Feuer <[hidden email]>
> Sent: 30 Dec 2014, 18:03
>
> There won't be any such numbers in this case, because access to the
> details of the sequence structure does not seem to help implement this
> function.

sorry for not making clear -- I was interested in difference between the
mentioned
| filterMSeq f = fmap fromList . filterM f . toList
and your foldr-based implementation. I can imagine that one can use
filterMSeq in place instead of filterM, so that is why I am interested
between filterMSeq and the specialized implementation (which is just
a plain foldr + <|).

> The same is true of the following, all of which already exist:
>
> replicate, replicateM, iterateN  -- all are implemented using replicateA
> unfoldr, unfoldl --implemented using <| and |>
> null -- can be written just as well using  null xs = case viewl xs of
> {EmptyL -> True; _ -> False}
> scanl, scanr, scanl1, scanr1 -- defined ultimately in terms of traverse
> spanl, spanr, breakl, breakr, takeWhile, dropWhile -- defined in terms
> of splitAt
> findIndicesL, findIndicesR -- defined in terms of foldlWithIndex, foldrWithIndex
> take, drop -- previously defined using splitAt; currently defined
> using splitAt', but those two functions are likely to merge
> filter -- uses foldl
> elemIndexL, elemIndicesL, elemIndexR, elemIndicesR, findIndexL,
> findIndexR -- all defined using findIndicesL and findIndicesR
> update -- uses adjust
> zip, zipWith3, zip3, zipWith4, zip4 -- All defined from zipWith
>
> I think almost all of these functions are good to have around
> (although the elemIndex family seems a bit silly). Adding a few more
> to round things out seems reasonable to me as well.

true, that is why it is a general rule (to not add functions which can
be implemented using others), not an absolute rule.

The question is whether filterM is as "basic" method used frequently
enough (for example, no one obviously questions zip).

Note that it is not obvious what type does filterM has (unlike e.g. zip),
now that we have Applicative and also filterM is changing in base. From
this point of view, it makes sense to use
  fmap fromList . filterM f . toList
because then we do not have inconsistent filterM-s.

Cheers,
Milan

> On Tue, Dec 30, 2014 at 5:41 PM, Milan Straka <[hidden email]> wrote:
> > Hi all,
> >
> >> -----Original message-----
> >> From: David Feuer <[hidden email]>
> >> Sent: 30 Dec 2014, 13:15
> >>
> >> On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell <[hidden email]> wrote:
> >> > We should check that `filterM . toList` isn't as fast. That was the case for
> >> > bytestring and we rejected adding filterM there for that reason.
> >>
> >> Let me expand that definition to what it really looks like:
> >>
> >>   filterMSeq f = fmap fromList . filterM f . toList
> >>
> >> Yes, you can do this. Yes, it's probably even pretty efficient. But
> >> it's a bit painful to have to expand things out like this just to
> >> switch a bit of code from lists to sequences.
> >
> > the problem with this is that there are really many functions that
> > "would be useful", but we do not really want to add them all.
> >
> > A general rule I have been using is "can this be implemented
> > straightforwardly using existing methods without hurting performance
> > much"? So I would like to see the numbers, please.
> >
> > Also, filterM is not yet settled in base, so we should wait some time
> > until it is.
> >
> > Cheers,
> > Milan
> >
> > PS: It is a shame we cannot reuse Foldable and/or Traversable for
> > filtering. Maybe we should add a new Filterable class?
>
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

David Feuer
On Tue, Dec 30, 2014 at 6:22 PM, Milan Straka <[hidden email]> wrote:

> Hi all,
>
>> -----Original message-----
>> From: David Feuer <[hidden email]>
>> Sent: 30 Dec 2014, 18:03
>>
>> There won't be any such numbers in this case, because access to the
>> details of the sequence structure does not seem to help implement this
>> function.
>
> sorry for not making clear -- I was interested in difference between the
> mentioned
> | filterMSeq f = fmap fromList . filterM f . toList
> and your foldr-based implementation. I can imagine that one can use
> filterMSeq in place instead of filterM, so that is why I am interested
> between filterMSeq and the specialized implementation (which is just
> a plain foldr + <|).

Oh, I really don't know which is better (and that's true also for
unfoldr and unfoldl--is the current implementation better or worse
than using fromList?).

> Note that it is not obvious what type does filterM has (unlike e.g. zip),
> now that we have Applicative and also filterM is changing in base. From
> this point of view, it makes sense to use
>   fmap fromList . filterM f . toList
> because then we do not have inconsistent filterM-s.

I have no problem with having filterM in Data.Sequence be more general
than filterM in Control.Monad for old versions.

I think both unfold(r/l) and filterM actually make a potential point
in my favor—if there are multiple implementations that seem
reasonable, having whichever turns out to be fastest in the library
will save users time benchmarking them.

David
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Add filterM or filterA to Data.Sequence

Bardur Arantsson-2
In reply to this post by Milan Straka
On 2014-12-30 23:41, Milan Straka wrote:

> Hi all,
>
>> -----Original message-----
>> From: David Feuer <[hidden email]>
>> Sent: 30 Dec 2014, 13:15
>>
>> On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell <[hidden email]> wrote:
>>> We should check that `filterM . toList` isn't as fast. That was the case for
>>> bytestring and we rejected adding filterM there for that reason.
>>
>> Let me expand that definition to what it really looks like:
>>
>>   filterMSeq f = fmap fromList . filterM f . toList
>>
>> Yes, you can do this. Yes, it's probably even pretty efficient. But
>> it's a bit painful to have to expand things out like this just to
>> switch a bit of code from lists to sequences.
>
> the problem with this is that there are really many functions that
> "would be useful", but we do not really want to add them all.
>
> A general rule I have been using is "can this be implemented
> straightforwardly using existing methods without hurting performance
> much"? So I would like to see the numbers, please.
>

Well, there's an opposing point called "usability". If there are already
"obvious" functions to use in other data structure modules then they
should *probably* also be represented in the other data structure modules.

(I don't claim that this is necessarily a "trump", but I think it bears
consideration.)

Regards,

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries