space-efficient, composable list transformers [was: Re: Reifying case expressions [was: Re: On stream processing, and a new release of timeplot coming]]

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

space-efficient, composable list transformers [was: Re: Reifying case expressions [was: Re: On stream processing, and a new release of timeplot coming]]

Sebastian Fischer-2
Hello Heinrich,

On Tue, Dec 27, 2011 at 1:09 PM, Heinrich Apfelmus <[hidden email]> wrote:
Sebastian Fischer wrote:
all functions defined in terms of `ListTo` and `interpret`
are spine strict - they return a result only after consuming all input list
constructors.
Indeed, the trouble is that my original formulation cannot return a result before it has evaluated all the case expressions. To include laziness, we need a way to return results early.

Sebastian's  ListTransformer  type does precisely that for the special case of lists as results,

Hmm, I think it does more than that.. (see below)

but it turns out that there is also a completely generic way of returning results early. In particular, we can leverage lazy evaluation for the result type.

This is nice! It would be cool if we could get the benefits of ListConsumer and ListTransformer in a single data type.

I know that you chose these names to avoid confusion, but I would like to advertise again the idea of choosing the *same* names for the constructors as the combinators they represent [...] This technique for designing data structures has the huge advantage that it's immediately clear how to interpret it and which laws are supposed to hold. 

I also like your names better, although they suggest that there is a single possible interpretation function. Even at the expense of blinding eyes to the possibility of other interpretation functions, I agree that it makes things clearer to use names from a *motivating* interpretation. In hindsight, my names for the constructors of ListTransformer seem to be inspired by operations on handles. So, `Cut` should have been named `Close` instead..
 
Especially in the case of lists, I think that this approach clears up a lot of confusion about seemingly new concepts like Iteratees and so on.

A share the discomfort with seemingly alien concepts and agree that clarity of exposition is crucial, both for the meaning of defined combinators and their implementation. We should aim at combinators that people are already familiar with, either because they are commonplace (like id, (.), or fmap) or because they are used by many other libraries (like the Applicative combinators).

A good way to explain the meaning of the combinators is via the meaning of the same combinators on a familiar type. Your interpretation function is a type-class morphism from `ListTo a b` to `[a] -> b`. For Functor we have:

    interpret (fmap f a)  =  fmap f (interpret a)

On the left side, we use `fmap` for `ListTo a` on the right side for `((->) l)`. Similarly, we have the following properties for the coresponding Applicative instances:

    interpret (pure x)  =  pure x
    interpret (a <*> b)  =  interpret a <*> interpret b

Such morphism properties simplify how to think about programs a lot, because one can think about programs as if they were written in the *meaning* type without knowing anything about the *implementation* type. The computed results are the same but they are computed more efficiently.

Your `ListTo` type achieves space efficiency for Applicative composition of list functions by executing them in lock-step. Because of the additional laziness provided by the `Fmap` constructor, compositions like

    interpret a . interpret b

can also be executed in constant space. However, we cannot use the space efficient Applicative combinators again to form parallel compositions of sequential ones because we are already in the meaning type.

We could implement composition for the `ListTo` type as follows

    (<.) :: ListTo b c -> ListTo a [b] -> ListTo a c
    a <. b = interpret a <$> b

But if we use a result of this function as argument of <*>, then the advantage of using `ListTo` is lost. While

    interpret ((,) <$> andL <*> andL)

runs in constant space,

    interpret ((,) <$> (andL <. idL) <*> (andL <. idL))

does not.

The ListTransformer type supports composition in lock-step via a category instance. The meaning of `ListTransformer a b` is `[a] -> [b]` with the additional restriction that all functions `f` in the image of the interpretation function are incremental:

    xs `isPrefixOf` ys  ==>  f xs `isPrefixOf` f ys

Composition as implemented in the ListTransformer type satisfies morphism properties for the category instance:

    transformList id  =  id
    transformList (a . b)  =  transformList a . transformList b

As it is implemented on the ListTransformer type directly (without using the interpretation function), it can be combined with the Applicative instance for parallel composition without losing space efficiency.

The Applicative instance for `ListTransformer` is different from the Applicative instance for `ListTo` (or `ListConsumer`). While

    interpret ((,) <$> idL <*> idL)

is of type `[a] -> ([a],[a])`

    transformList ((,) <$> idL <*> idL)

is of type `[a] -> [(a,a)]`. We could achieve the latter behaviour with the former instance by using an additional fmap. But

    uncurry zip <$> ((,) <$> idL <*> idL)

has the same disadvantages regarding space efficiency as referred to above because the computation of `zip` cannot occur in lock-step with another computation.

The meaning of the Applicative instance of `ListTransformer` can again be described using morphism properties on Applicative structures. The corresponding structure on the meaning type `[a] -> [b]` is the following combination of the Applicative instances for functions and for ZipLists.

    fmap f a = liftA (fmap f) a
    pure x = pure (getZipList (pure x))
    a <*> b = getZipList <$> liftA2 (<*>) (ZipList <$> a) (ZipList <$> b)

I have a gut feeling that the laziness provided by the `Fmap` constructor is too implicit to be useful for the kind of lock-step composition provided by ListTransformer. So I don't have high hopes that we can unify `ListConsumer` and `ListTransformer` into a single type.

Do you have an idea?

Sebastian

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: space-efficient, composable list transformers [was: Re: Reifying case expressions]

Heinrich Apfelmus
Sebastian Fischer wrote:

> Your `ListTo` type achieves space efficiency for Applicative composition of
> list functions by executing them in lock-step. Because of the additional
> laziness provided by the `Fmap` constructor, compositions like
>
>     interpret a . interpret b
>
> can also be executed in constant space. However, we cannot use the space
> efficient Applicative combinators again to form parallel compositions of
> sequential ones because we are already in the meaning type.
>
> We could implement composition for the `ListTo` type as follows
>
>     (<.) :: ListTo b c -> ListTo a [b] -> ListTo a c
>     a <. b = interpret a <$> b
>
> But if we use a result of this function as argument of <*>, then the
> advantage of using `ListTo` is lost. While
>
>     interpret ((,) <$> andL <*> andL)
>
> runs in constant space,
>
>     interpret ((,) <$> (andL <. idL) <*> (andL <. idL))
>
> does not.
>
> The ListTransformer type supports composition in lock-step via a category
> instance. The meaning of `ListTransformer a b` is `[a] -> [b]` with the
> additional restriction that all functions `f` in the image of the
> interpretation function are incremental:
>
>     xs `isPrefixOf` ys  ==>  f xs `isPrefixOf` f ys
>
> [..]
>
> The Applicative instance for `ListTransformer` is different from the
> Applicative instance for `ListTo` (or `ListConsumer`). While
>
>     interpret ((,) <$> idL <*> idL)
>
> is of type `[a] -> ([a],[a])`
>
>     transformList ((,) <$> idL <*> idL)
>
> is of type `[a] -> [(a,a)]`.
> [..]

Ah, so  ListTransformer  is actually quite different from  ListTo
because the applicative instance yields a different type. Then again,
the former can be obtained form the latter via  unzip .

> I have a gut feeling that the laziness provided by the `Fmap` constructor
> is too implicit to be useful for the kind of lock-step composition provided
> by ListTransformer. So I don't have high hopes that we can unify
> `ListConsumer` and `ListTransformer` into a single type.
>
> Do you have an idea?

Well, the simple solution would be to restrict the type of  (<.)  to

     (<.) :: ListTo b c -> ListTransformer a b -> ListTo a c

so that the second argument is guaranteed to be incremental. Of course,
this is rather unsatisfactory.

Fortunately, there is a nicer solution that keeps everything in the
ListTo  type. The main problem with  Fmap  is that it can be far from
incremental, because we can plug in any function we like:

     example :: ListTo a [a]
     example = Fmap reverse

Without an explicit guarantee that the function is incremental, we can't
do anything here. But we can just add another constructor to that effect
if we turn  ListTo  into a GADT:

     data ListTo a b where
         CaseOf   :: b ->  (a -> ListTo a b)  -> ListTo a b
         Fmap     :: (b -> c) -> ListTo a b   -> ListTo a c

         FmapCons :: b -> ListTo a [b] -> ListTo a [b]

The interpretation for this case is given by the morphism

     interpret (FmapCons x g) = fmap (x:) $ interpret g

and sequential composition reads

     -- sequential composition
     -- interpret (a <. b) = interpret $ interpret a <$> b
     (<.) :: ListTo b c -> ListTo a [b] -> ListTo a c
     (CaseOf _ cons) <. (FmapCons y b) = cons y <. b
     (Fmap f a)      <. (FmapCons y b) = Fmap f     $ a <. (FmapCons y b)
     (FmapCons x a)  <. (FmapCons y b) = FmapCons x $ a <. (FmapCons y b)
     a <. (CaseOf nil cons) = CaseOf (interpret a nil) ((a <.) . cons)
     a <. (Fmap f b)        = fmap (interpret a . f) b

Of course, the identity has to be redefined to make use of the new guarantee

     idL :: ListTo a [a]
     idL = caseOf [] $ \x -> FmapCons x idL

I'm going to omit the new definition for the applicative instance, the
full code can be found here:

     https://gist.github.com/1550676

With all these combinators in place, even examples like

     liftA2 (,) (andL <. takeL 3) (andL <. idL)

should work as expected.


While nice, the above solution is not perfect. One thing we can do with
  ListTransformer  type is to perform an apply first and then do a
sequential composition.

     a <. (b <*> c)

This only works because the result of  <*>  is already zipped.


And there is an even more worrisome observation: all this work would
have been superfluous if we had *partial evaluation*, i.e. if it were
possible to evaluate expressions like  \xs -> f (4:xs)  beneath the
lambda. Then we could dispense with all the constructor yoga above and
simply use a plain

      type ListTo a b = [a] -> b

with the applicative instance

      instance Applicative (ListTo a) where
          pure b = const b
          (f <*> x) cs = case cs of
              []     -> f [] $ x []
              (c:cs) -> let f' = f . (c:); x; = x . (c:) in
                        f' `partialseq` x' `partialseq` (f' <*> x')

to obtain space efficient parallel and sequential composition. In fact,
by using constructors, we are essentially doing partial evaluation by hand.


Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: space-efficient, composable list transformers [was: Re: Reifying case expressions]

Jan Christiansen-2
Hi,

On Jan 2, 2012, at 2:34 PM, Heinrich Apfelmus wrote:

> Without an explicit guarantee that the function is incremental, we can't do anything here. But we can just add another constructor to that effect if we turn  ListTo  into a GADT:
>
>    data ListTo a b where
>        CaseOf   :: b ->  (a -> ListTo a b)  -> ListTo a b
>        Fmap     :: (b -> c) -> ListTo a b   -> ListTo a c
>
>        FmapCons :: b -> ListTo a [b] -> ListTo a [b]

I did not follow your discussion but how about using an additional GADT for the argument of Fmap, that is

data Fun a b where
  Fun :: (a -> b) -> Fun a b
  Cons :: a -> Fun [a] [a]

data ListTo a b where
  CaseOf   :: b ->  (a -> ListTo a b) -> ListTo a b
  Fmap     :: Fun b c -> ListTo a b   -> ListTo a c

and provide a function to interpret this data type as well

interpretFun :: Fun a b -> a -> b
interpretFun (Fun f)  = f
interpretFun (Cons x) = (x:)

for the sequential composition if I am not mistaken.

(<.) :: ListTo b c -> ListTo a [b] -> ListTo a c
(CaseOf _ cons) <. (Fmap (Cons y) b) = cons y <. b
(Fmap f a)  <. (Fmap g b) = Fmap f $ a <. (Fmap g b)
a <. (CaseOf nil cons)    = CaseOf (interpret a nil) ((a <.) . cons)
a <. (Fmap f b)           = fmap (interpret a . interpretFun f) b


-- functor instance
instance Functor (ListTo a) where
  fmap f = normalize . Fmap (Fun f)

normalize :: ListTo a b -> ListTo a b
normalize (Fmap (Fun f) (Fmap (Fun g) c)) = fmap (f . g) c
normalize x = x

Cheers, Jan
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: space-efficient, composable list transformers [was: Re: Reifying case expressions]

Heinrich Apfelmus
Jan Christiansen wrote:

> On Jan 2, 2012, at 2:34 PM, Heinrich Apfelmus wrote:
>
>> Without an explicit guarantee that the function is incremental, we can't do anything here. But we can just add another constructor to that effect if we turn  ListTo  into a GADT:
>>
>>    data ListTo a b where
>>        CaseOf   :: b ->  (a -> ListTo a b)  -> ListTo a b
>>        Fmap     :: (b -> c) -> ListTo a b   -> ListTo a c
>>
>>        FmapCons :: b -> ListTo a [b] -> ListTo a [b]
>
> I did not follow your discussion but how about using an additional GADT for the argument of Fmap, that is
>
> data Fun a b where
>   Fun :: (a -> b) -> Fun a b
>   Cons :: a -> Fun [a] [a]
>
> data ListTo a b where
>   CaseOf   :: b ->  (a -> ListTo a b) -> ListTo a b
>   Fmap     :: Fun b c -> ListTo a b   -> ListTo a c
>
> and provide a function to interpret this data type as well
>
> interpretFun :: Fun a b -> a -> b
> interpretFun (Fun f)  = f
> interpretFun (Cons x) = (x:)
>
> for the sequential composition if I am not mistaken.
>
> (<.) :: ListTo b c -> ListTo a [b] -> ListTo a c
> (CaseOf _ cons) <. (Fmap (Cons y) b) = cons y <. b
> (Fmap f a)  <. (Fmap g b) = Fmap f $ a <. (Fmap g b)
> a <. (CaseOf nil cons)    = CaseOf (interpret a nil) ((a <.) . cons)
> a <. (Fmap f b)           = fmap (interpret a . interpretFun f) b
>
>
> -- functor instance
> instance Functor (ListTo a) where
>   fmap f = normalize . Fmap (Fun f)
>
> normalize :: ListTo a b -> ListTo a b
> normalize (Fmap (Fun f) (Fmap (Fun g) c)) = fmap (f . g) c
> normalize x = x
>
> Cheers, Jan

Nice, that is a lot simpler indeed, and even closer to the core of the idea.


Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe