Discussion: traversing roots and leaves

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Discussion: traversing roots and leaves

David Feuer
My proposal to move liftA2 into the Applicative class should help get fmaps out of internal nodes in traverse implementations. But there are still two trouble spots:

1. The smaller trouble spot is at the root, when dealing with newtype wrappers. For example, the simplest way to write traverse for Data.Sequence.Seq would be

instance Traversable Seq where
  traverse f (Seq xs) = Seq <$> traverse f xs

The extra fmap may not be free. Today, the only fix is to manually inline the definition of the underlying traversal and manually fuse the fmaps. It works, but it's gross! We could work around this easily using an extra Traversable method:

mapTraversed :: Applicative f
  => (t b -> r) -> (a -> f b) -> t a -> f r

but that wouldn't help with the larger problem below.

2. Leaves are really painful. Data structures that store at most one element per leaf make obvious traversals expensive. For example, in Data.Map, the obvious implementation would be

instance Traversable (Map k) where
  traverse _ Tip = pure Tip
  traverse f (Bin s k v l r) =
    liftA3 (Bin s k) (f v) (traverse f l) (traverse f r)

The trouble is that we would generate a slew of `pure Tip`s that we'd then have to combine. For instance, (Bin 1 k v Tip Tip) would give  liftA3 (Bin 1 k) (f v) (pure Tip) (pure Tip)

Today we work around this by looking ahead at the children of the element we're considering, so we get

fmap (\v' -> Bin 1 k v' Tip Tip) (f v)

instead, which is still pretty lousy but better.



It makes me rather sad to have to write disgustingly ugly Traversable instances just to avoid silly performance issues like this. Does anyone have an idea for fixing (2), and ideally simultaneously taking care of (1)?

Thanks,
David Feuer

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Discussion: traversing roots and leaves

Mario Blažević
On 2017-01-19 02:57 PM, David Feuer wrote:

> 2. Leaves are really painful. Data structures that store at most one
> element per leaf make obvious traversals expensive. For example, in
> Data.Map, the obvious implementation would be
>
> instance Traversable (Map k) where
>   traverse _ Tip = pure Tip
>   traverse f (Bin s k v l r) =
>     liftA3 (Bin s k) (f v) (traverse f l) (traverse f r)
>
> The trouble is that we would generate a slew of `pure Tip`s that we'd
> then have to combine. For instance, (Bin 1 k v Tip Tip) would give
>  liftA3 (Bin 1 k) (f v) (pure Tip) (pure Tip)
>
>
> It makes me rather sad to have to write disgustingly ugly Traversable
> instances just to avoid silly performance issues like this. Does anyone
> have an idea for fixing (2), and ideally simultaneously taking care of (1)?

        I can't offer any immediate fix, but my impression of problem (2) is
that it's for compiler to solve. I doesn't seem right that you should be
manually enumerating all the simple cases close to the tips and inlining
them explicitly.

        Since traverse is forcing the entire spine of the Map, there is not
issue with the change in strictness. The compiler should in principle be
smart enough to enumerate all possible node shapes and expand the
function definitions for each of them. It would be similar to loop
unrolling in some ways.

        Since this kind of transformation would probably be expensive at
compile-time and would increase the generated code size, it should
likely require a command-line option (-O2) or a pragma to activate.

        In short, the solution to problem #2 is to log a GHC proposal and get
the burden off your shoulders.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Discussion: traversing roots and leaves

David Feuer
No, I don't think so. The trouble is that GHC can't assume the Applicative instance is valid. Optimizing this requires knowledge of things like

liftA3 f (pure x) m (pure y) = (\m' -> f x m' y) <$> m

GHC can only discover such facts when enough inlining and/or specialization happen. Also, figuring out how close you should get to the leaves before trying to coalesce actions may be a judgement call in some cases.

On Jan 23, 2017 3:01 PM, "Mario Blažević" <[hidden email]> wrote:
On 2017-01-19 02:57 PM, David Feuer wrote:
2. Leaves are really painful. Data structures that store at most one
element per leaf make obvious traversals expensive. For example, in
Data.Map, the obvious implementation would be

instance Traversable (Map k) where
  traverse _ Tip = pure Tip
  traverse f (Bin s k v l r) =
    liftA3 (Bin s k) (f v) (traverse f l) (traverse f r)

The trouble is that we would generate a slew of `pure Tip`s that we'd
then have to combine. For instance, (Bin 1 k v Tip Tip) would give
 liftA3 (Bin 1 k) (f v) (pure Tip) (pure Tip)


It makes me rather sad to have to write disgustingly ugly Traversable
instances just to avoid silly performance issues like this. Does anyone
have an idea for fixing (2), and ideally simultaneously taking care of (1)?

        I can't offer any immediate fix, but my impression of problem (2) is that it's for compiler to solve. I doesn't seem right that you should be manually enumerating all the simple cases close to the tips and inlining them explicitly.

        Since traverse is forcing the entire spine of the Map, there is not issue with the change in strictness. The compiler should in principle be smart enough to enumerate all possible node shapes and expand the function definitions for each of them. It would be similar to loop unrolling in some ways.

        Since this kind of transformation would probably be expensive at compile-time and would increase the generated code size, it should likely require a command-line option (-O2) or a pragma to activate.

        In short, the solution to problem #2 is to log a GHC proposal and get the burden off your shoulders.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries