Abstracting over things that can be unpacked

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Abstracting over things that can be unpacked

Johan Tibell-2
Hi all,

These ideas are still in very early stages. I present them here in hope of starting a discussion. (We discussed this quite a bit at last year's ICFP, I hope this slightly different take on the problem might lead to new ideas.)

I think the next big step in Haskell performance is going to come from using better data representation in common types such as list, sets, and maps. Today these polymorphic data structures use both more memory and have more indirections than necessary, due to boxing of values. This boxing is due to the values being stored in fields of polymorphic type.

First idea: instead of rejecting unpack pragmas on polymorphic fields, have them require a class constraint on the field types. Example:

    data UnboxPair a b = (Unbox a, Unbox b) => UP {-# UNPACK #-} !a {-# UNPACK #-} !b

The Unbox type class would be similar in spirit to the class with the same name in the vector package, but be implemented internally by GHC. To a first approximation instances would only exist for fields that unpack to non-pointer types (e.g. Int.)

Second idea: Introduce a new pragma, that has similar effect on representations as DPH's [::] vector type. This new pragma does deep unpacking, allowing for more types to be instances of the Unbox type e.g. pairs. Example:

    data T = C {-# UNWRAP #-} (a, b)

If you squint a bit this pragma does the same as [: (a, b) :], except no vectors are involved. The final representation would be the unpacked representation of a and b, concatenated together (e.g. (Int, Int) would result in the field above being 128-bit wide on a 64-bit machine.

The meta-idea tying these two ideas together is to allow for some abstraction over representation transforming pragmas, such as UNPACK.

P.S. Before someone suggest using type families. Please read my email titled "Avoiding O(n^2) instances when using associated data types to unpack values into constructors."

Cheers,
  Johan


_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Reply | Threaded
Open this post in threaded view
|

Re: Abstracting over things that can be unpacked

Ryan Newton
+1 ! 


On Fri, Mar 2, 2012 at 7:40 PM, Johan Tibell <[hidden email]> wrote:
Hi all,

These ideas are still in very early stages. I present them here in hope of starting a discussion. (We discussed this quite a bit at last year's ICFP, I hope this slightly different take on the problem might lead to new ideas.)

I think the next big step in Haskell performance is going to come from using better data representation in common types such as list, sets, and maps. Today these polymorphic data structures use both more memory and have more indirections than necessary, due to boxing of values. This boxing is due to the values being stored in fields of polymorphic type.

First idea: instead of rejecting unpack pragmas on polymorphic fields, have them require a class constraint on the field types. Example:

    data UnboxPair a b = (Unbox a, Unbox b) => UP {-# UNPACK #-} !a {-# UNPACK #-} !b

The Unbox type class would be similar in spirit to the class with the same name in the vector package, but be implemented internally by GHC. To a first approximation instances would only exist for fields that unpack to non-pointer types (e.g. Int.)

Second idea: Introduce a new pragma, that has similar effect on representations as DPH's [::] vector type. This new pragma does deep unpacking, allowing for more types to be instances of the Unbox type e.g. pairs. Example:

    data T = C {-# UNWRAP #-} (a, b)

If you squint a bit this pragma does the same as [: (a, b) :], except no vectors are involved. The final representation would be the unpacked representation of a and b, concatenated together (e.g. (Int, Int) would result in the field above being 128-bit wide on a 64-bit machine.

The meta-idea tying these two ideas together is to allow for some abstraction over representation transforming pragmas, such as UNPACK.

P.S. Before someone suggest using type families. Please read my email titled "Avoiding O(n^2) instances when using associated data types to unpack values into constructors."

Cheers,
  Johan


_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Reply | Threaded
Open this post in threaded view
|

Re: Abstracting over things that can be unpacked

Twan van Laarhoven
In reply to this post by Johan Tibell-2
On 03/03/12 01:40, Johan Tibell wrote:

> Hi all,
>
> These ideas are still in very early stages. I present them here in hope
> of starting a discussion. (We discussed this quite a bit at last year's
> ICFP, I hope this slightly different take on the problem might lead to
> new ideas.)
>
> I think the next big step in Haskell performance is going to come from
> using better data representation in common types such as list, sets, and
> maps. Today these polymorphic data structures use both more memory and
> have more indirections than necessary, due to boxing of values. This
> boxing is due to the values being stored in fields of polymorphic type.
>
> First idea: instead of rejecting unpack pragmas on polymorphic fields,
> have them require a class constraint on the field types. Example:
>
>      data UnboxPair a b = (Unbox a, Unbox b) => UP {-# UNPACK #-} !a {-#
> UNPACK #-} !b

I expect that this will not be easy to implement, because it requires
interaction with things like the garbage collector. For example,
UnboxPair will need a different info table for different a and b.

It might be possible to essentially specialize UnboxPair for each
different a and b for which it is used, but that gets tricky with
generic code.

> The Unbox type class would be similar in spirit to the class with the
> same name in the vector package, but be implemented internally by GHC.
> To a first approximation instances would only exist for fields that
> unpack to non-pointer types (e.g. Int.)
>
> Second idea: Introduce a new pragma, that has similar effect on
> representations as DPH's [::] vector type. This new pragma does deep
> unpacking, allowing for more types to be instances of the Unbox type.

Could this be handled by just having/deriving an Unbox instance for
(a,b)? I imagine the Unbox type class would have to contain essentially
the same things as Storable, maybe something like

     type UnboxedRepr :: Int -> #
     class Unbox a where
         type Repr a :: # -- gives size and alignment
         unbox :: a -> Repr a
         box   :: Repr a -> a

A problem with an instance (Unboxed a, Unboxed b) => Unboxed (a,b) is
that it allows arbitrarily large unboxed values to be created at
runtime. That doesn't work when you use specialization to create the
needed info tables.


Twan

_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Reply | Threaded
Open this post in threaded view
|

Re: Abstracting over things that can be unpacked

Johan Tibell-2
On Sat, Mar 3, 2012 at 8:06 AM, Twan van Laarhoven <[hidden email]> wrote:
I expect that this will not be easy to implement, because it requires interaction with things like the garbage collector. For example, UnboxPair will need a different info table for different a and b.

It might be possible to essentially specialize UnboxPair for each different a and b for which it is used, but that gets tricky with generic code.

I believe use-site code generation is the way to go. This is what C++ does (at compile time) and C# (at runtime, using the JIT) does.
 
The Unbox type class would be similar in spirit to the class with the
same name in the vector package, but be implemented internally by GHC.
To a first approximation instances would only exist for fields that
unpack to non-pointer types (e.g. Int.)

Second idea: Introduce a new pragma, that has similar effect on
representations as DPH's [::] vector type. This new pragma does deep
unpacking, allowing for more types to be instances of the Unbox type.

Could this be handled by just having/deriving an Unbox instance for (a,b)? I imagine the Unbox type class would have to contain essentially the same things as Storable, maybe something like

   type UnboxedRepr :: Int -> #
   class Unbox a where
       type Repr a :: # -- gives size and alignment
       unbox :: a -> Repr a
       box   :: Repr a -> a

A problem with an instance (Unboxed a, Unboxed b) => Unboxed (a,b) is that it allows arbitrarily large unboxed values to be created at runtime. That doesn't work when you use specialization to create the needed info tables.

As I mentioned further up in the email, I think this needs to be done at compile time. However, I'm not sure type classes are the right mechanism, as they don't guarantee that the polymorphism is resolved at compile time. Perhaps type families, in some form, is the right solution.

-- Johan


_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Reply | Threaded
Open this post in threaded view
|

Re: Abstracting over things that can be unpacked

Alexey Khudyakov
> As I mentioned further up in the email, I think this needs to be done at
> compile time. However, I'm not sure type classes are the right mechanism, as
> they don't guarantee that the polymorphism is resolved at compile time.
> Perhaps type families, in some form, is the right solution.
>
There is problem with type families. Currently GHC is unable to unpack them[1].

[1] http://hackage.haskell.org/trac/ghc/ticket/3990

_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Reply | Threaded
Open this post in threaded view
|

RE: Abstracting over things that can be unpacked

Simon Peyton Jones
In reply to this post by Johan Tibell-2

 

First idea: instead of rejecting unpack pragmas on polymorphic fields, have them require a class constraint on the field types. Example:

 

    data UnboxPair a b = (Unbox a, Unbox b) => UP {-# UNPACK #-} !a {-# UNPACK #-} !b

 

The Unbox type class would be similar in spirit to the class with the same name in the vector package, but be implemented internally by GHC. To a first approximation instances would only exist for fields that unpack to non-pointer types (e.g. Int.)

 

Johan I don’t understand how this would work.   What methods does the class Unbox have?  What is the final representation of a UP constructor?  How many words?   Pointers or non-pointers?

 

At the moment each constructor has a fixed layout.

 

I’m really not sure what you have in mind here.  More detail needed!

 

Simon


_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users