Bulk Synchronous Parallel

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

Bulk Synchronous Parallel

Gregory Crosswhite-2
Hey everyone,

Has anyone done any work with bulk synchronous parallel computing in Haskell?  The idea behind the model is that you divide your computation into a series of computation and communication phases, and it has recently occurred to me that this might be an ideal setup for parallelizing a pure language like Haskell because you can think of it as alternating between a stage that independently applies a bunch of functions to a bunch of independent chunks of data and a stage that applies a big function to all of the chunks that recombines them into new chunks for the next parallel phase, so that all stages are conceptually pure even if eventually the second stage is turned into something involving communication and hence side-effectful under the hood.

Experiences?  Thoughts?

Cheers,
Greg

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Sebastian Sylvan


On Mon, Apr 19, 2010 at 11:03 PM, Gregory Crosswhite <[hidden email]> wrote:
Hey everyone,

Has anyone done any work with bulk synchronous parallel computing in Haskell?  The idea behind the model is that you divide your computation into a series of computation and communication phases, and it has recently occurred to me that this might be an ideal setup for parallelizing a pure language like Haskell because you can think of it as alternating between a stage that independently applies a bunch of functions to a bunch of independent chunks of data and a stage that applies a big function to all of the chunks that recombines them into new chunks for the next parallel phase, so that all stages are conceptually pure even if eventually the second stage is turned into something involving communication and hence side-effectful under the hood.

Experiences?  Thoughts?


You may want to check out NDP, e.g. here: http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell

It's at a higher level of abstraction, in a way. You don't need to worry about the dicing up and recombining, the compiler takes care of it for you. You just write things in terms of parallel arrays (which can be small, e.g. 2 element wide) and the compiler will fuse/flatten these together into big bulk parallel computations with communication between them. 

--
Sebastian Sylvan

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Gregory Crosswhite-2
Thanks for the link;  my ultimate interest, though, is in an architecture that could scale to multiple machines rather than multiple cores with shared memory on a single machine.  Has there been any interest and/or progress in making DPH run on multiple machines and other NUMA architectures?

Cheers,
Greg


On Apr 19, 2010, at 3:33 PM, Sebastian Sylvan wrote:



On Mon, Apr 19, 2010 at 11:03 PM, Gregory Crosswhite <[hidden email]> wrote:
Hey everyone,

Has anyone done any work with bulk synchronous parallel computing in Haskell?  The idea behind the model is that you divide your computation into a series of computation and communication phases, and it has recently occurred to me that this might be an ideal setup for parallelizing a pure language like Haskell because you can think of it as alternating between a stage that independently applies a bunch of functions to a bunch of independent chunks of data and a stage that applies a big function to all of the chunks that recombines them into new chunks for the next parallel phase, so that all stages are conceptually pure even if eventually the second stage is turned into something involving communication and hence side-effectful under the hood.

Experiences?  Thoughts?


You may want to check out NDP, e.g. here: http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell

It's at a higher level of abstraction, in a way. You don't need to worry about the dicing up and recombining, the compiler takes care of it for you. You just write things in terms of parallel arrays (which can be small, e.g. 2 element wide) and the compiler will fuse/flatten these together into big bulk parallel computations with communication between them. 

--
Sebastian Sylvan


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Jason Dusek
2010/04/19 Gregory Crosswhite <[hidden email]>:
> Thanks for the link;  my ultimate interest, though, is in an architecture
> that could scale to multiple machines rather than multiple cores with shared
> memory on a single machine.  Has there been any interest and/or progress in
> making DPH run on multiple machines and other NUMA architectures?

  I wonder what it would take to do this.

  One approach is some compiler "magic" that provides you with an RTS
  that can communicate with other RTSen over TCP and chunks the computation
  "appropriately". Or maybe you give it a chunk size and it gives you some
  number of executables that find one another using Bonjour. Values not on
  "this node" are found via some hashing scheme (the same scheme used to
  chunk in the first place).

  There is a lot to know about this problem area.

  It would be a great alternative to OpenMPI.

--
Jason Dusek
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Alexander Solla-2

On Apr 20, 2010, at 11:05 AM, Jason Dusek wrote:

>> Thanks for the link;  my ultimate interest, though, is in an  
>> architecture
>> that could scale to multiple machines rather than multiple cores  
>> with shared
>> memory on a single machine.  Has there been any interest and/or  
>> progress in
>> making DPH run on multiple machines and other NUMA architectures?
>
>  I wonder what it would take to do this.

Erlang.  ;0)
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Gregory Crosswhite-2
You know, I looked into Erlang, and while it looks intriguing it isn't great for my purposes because I want to be able to call Fortran routines to do the heavy number-crunching, and Erlang doesn't have a good standard FFI like Haskell.

Also, I really don't want to use a dynamically typed language again if I can help it.  ;-)

Cheers,
Greg


On Apr 20, 2010, at 10:14 PM, Alexander Solla wrote:

>
> On Apr 20, 2010, at 11:05 AM, Jason Dusek wrote:
>
>>> Thanks for the link;  my ultimate interest, though, is in an architecture
>>> that could scale to multiple machines rather than multiple cores with shared
>>> memory on a single machine.  Has there been any interest and/or progress in
>>> making DPH run on multiple machines and other NUMA architectures?
>>
>> I wonder what it would take to do this.
>
> Erlang.  ;0)
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Aaron D. Ball-3
In reply to this post by Jason Dusek
On Tue, Apr 20, 2010 at 14:05, Jason Dusek <[hidden email]> wrote:

>  One approach is some compiler "magic" that provides you with an RTS
>  that can communicate with other RTSen over TCP and chunks the computation
>  "appropriately".

The approaches to Haskell multi-host parallelism I've seen all seem to
be (a) dead and (b) overly complicated, which I can't help but suspect
are related.

I don't need a tool that automatically figures out how to distribute
any workload in an intelligent way and handles all the communication
for me.  If I have the basic building block, which is the ability to
serialize a Haskell expression with its dependencies and read them
into another Haskell instance where I can evaluate them, I can handle
the other pieces, which are

- passing strings back and forth in whatever way is convenient
- deciding how to divide up my workload.

In the Ruby universe, DRb combines the serialization and "passing
strings around" job and lets me figure out how to divide up the work,
and it would be delightful if there were something similarly simple in
the Haskell world.
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Don Stewart-2
aarondball+haskell:
> On Tue, Apr 20, 2010 at 14:05, Jason Dusek <[hidden email]> wrote:
>
> >  One approach is some compiler "magic" that provides you with an RTS
> >  that can communicate with other RTSen over TCP and chunks the computation
> >  "appropriately".
>
> The approaches to Haskell multi-host parallelism I've seen all seem to
> be (a) dead and (b) overly complicated, which I can't help but suspect
> are related.

Eden is active, afaik,

    http://www.mathematik.uni-marburg.de/~eden/ 

They just had a hackathon in St Andrews,

    http://hackage.haskell.org/trac/ghc/wiki/HackPar
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Aaron D. Ball-3
On Wed, Apr 21, 2010 at 14:14, Don Stewart <[hidden email]> wrote:

> Eden is active, afaik,
>
>    http://www.mathematik.uni-marburg.de/~eden/

Unfortunately, Eden is one of the examples I had in mind when
referring to distributed Haskell projects as overly complicated and
[for practical purposes] dead.  Their last release available for
download was in 2006.  Their beta is "available upon request", which
doesn't raise my confidence in the level of active development or
openness of the project.  From the notes here:

> They just had a hackathon in St Andrews,
>
>    http://hackage.haskell.org/trac/ghc/wiki/HackPar

they don't seem to have even a source code repository yet, and they
appear to be bogged down in that complexity I mentioned.  It looks
like they want to engage large, academically interesting problems and
produce papers, as opposed to producing small tools that solve simple
problems in simple ways.

The practical solution, as a result, continues to be "use another language".
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Jason Dusek
In reply to this post by Jason Dusek
2010/4/21 Aaron D. Ball <[hidden email]>:
> I don't need a tool that automatically figures out how to distribute
> any workload in an intelligent way and handles all the communication
> for me.

  You are right in general. Only if you want to rely on purity and a
  few source code annotations to get you parallelism relatively
  cheaply do you care about these compiler approaches. This is something
  that Haskell can do that Ruby, C and friends really can not do -- thus
  I mention it.

> If I have the basic building block, which is the ability to
> serialize a Haskell expression with its dependencies and read them
> into another Haskell instance where I can evaluate them, I can handle
> the other pieces, which are
>
> - passing strings back and forth in whatever way is convenient
> - deciding how to divide up my workload.

  Do add also, configuring servers and their connections.

> In the Ruby universe, DRb combines the serialization and "passing
> strings around" job and lets me figure out how to divide up the work,
> and it would be delightful if there were something similarly simple in
> the Haskell world.

  I think Holumbus has got some promising stuff for user-managed
  distributed workers:

    http://holumbus.fh-wedel.de/trac/browser/distribution

  What do you think?

--
Jason Dusek
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Bas van Dijk-2
In reply to this post by Aaron D. Ball-3
On Wed, Apr 21, 2010 at 8:07 PM, Aaron D. Ball
<[hidden email]> wrote:
> If I have the basic building block, which is the ability to
> serialize a Haskell expression with its dependencies and read them
> into another Haskell instance where I can evaluate them, I can handle
> the other pieces

How I wish we had Clean's dynamics[1]!

regards,

Bas

[1] See section 8 of the Clean report:
http://clean.cs.ru.nl/download/Clean20/doc/CleanLangRep.2.1.pdf
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Don Stewart-2
v.dijk.bas:
> On Wed, Apr 21, 2010 at 8:07 PM, Aaron D. Ball
> <[hidden email]> wrote:
> > If I have the basic building block, which is the ability to
> > serialize a Haskell expression with its dependencies and read them
> > into another Haskell instance where I can evaluate them, I can handle
> > the other pieces
>
> How I wish we had Clean's dynamics[1]!

You could imagine shipping GHCi bytecode between processes, much as
Clean could do.
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Peter Gammie
In reply to this post by Aaron D. Ball-3
On Thu, Apr 22, 2010 at 4:07 AM, Aaron D. Ball
<[hidden email]> wrote:

> I don't need a tool that automatically figures out how to distribute
> any workload in an intelligent way and handles all the communication
> for me.  If I have the basic building block, which is the ability to
> serialize a Haskell expression with its dependencies and read them
> into another Haskell instance where I can evaluate them, I can handle
> the other pieces, which are
>
> - passing strings back and forth in whatever way is convenient
> - deciding how to divide up my workload.

Alice/ML is the place to look for this technology.

http://www.ps.uni-saarland.de/alice/

The project may be dead (I don't know), but they did have the most
sophisticated take on pickling that I've seen. It's an ML variant,
with futures, running on top of the same platform used by Mozart/Oz of
CTM fame.

Why do you want to pass strings around?

cheers
peter
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

wren ng thornton
Peter Gammie wrote:
> Alice/ML is the place to look for this technology.
>
> http://www.ps.uni-saarland.de/alice/
>
> The project may be dead (I don't know), but they did have the most
> sophisticated take on pickling that I've seen. It's an ML variant,
> with futures, running on top of the same platform used by Mozart/Oz of
> CTM fame.

I seem to recall AliceML decided to move away from the Mozart/Oz runtime
and develop their own (and that the project died shortly thereafter).
I'd love to be wrong though, AliceML was very nice.

--
Live well,
~wren
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Bulk Synchronous Parallel

Thomas Horstmeyer
In reply to this post by Aaron D. Ball-3
Hi Aaron,

On 21.04.2010 20:29, Aaron D. Ball wrote:
 > Unfortunately, Eden is one of the examples I had in mind when
 > referring to distributed Haskell projects as overly complicated and
 > [for practical purposes] dead.  Their last release available for
 > download was in 2006.  Their beta is "available upon request", which
 > doesn't raise my confidence in the level of active development or
 > openness of the project.
 >
You're partly right. The development of the Eden system itself has been
neglected for a time. Mostly because Jost Berthold (the main system
programmer) finished his PhD and left Marburg (and has only limited
spare time with his new job). The remaining members of the Eden group
here were mostly language users and language developers, not system
implementors. No one really opted to take over responsibility...
However, this is changing right now. We are actively merging the Eden
runtime and that of the GpH language (as both need basically the same
features) to get a new runtime based on GHC 6.12 (and GHC HEAD). This
runtime, while still in beta state, is already usable. However, building
it is currently still a bit too complicated to bother end users with it.
So, if your "complicated" is referring to this, you are right. We hope
to simplify this and to have a usable release in the near future. If
your "complicated" was targetting the use of the Eden language itself,
however, I object!


 >
 >> They just had a hackathon in St Andrews,
 >>
 >>     http://hackage.haskell.org/trac/ghc/wiki/HackPar
 >
 > they don't seem to have even a source code repository yet, and they
 > appear to be bogged down in that complexity I mentioned.
 >
Of course, a source code repository for Eden always existed. The new
repository was meant for the merged version of Eden/GpH. And that exists
as well. The Eden webpage will be updated as soon as we have a
pre-release of the 6.12 branch.


Best regards,
Thomas Horstmeyer (Eden group Marburg)

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe