parallel Haskell with limited sparks


parallel Haskell with limited sparks

Henning Thielemann

When I read about parallel programming in Haskell, it is always about
manual chunking of data. Why? Most of my applications with parallelization
benefit from a different computation scheme: start a fixed number of
threads (at most the number of available computing cores), and whenever a
thread finishes a task it gets assigned a new one. This is how make -j,
cabal install -j, and ghc -j work. I wrote my own package pooled-io, which
does the same for IO in Haskell, and there seem to be more packages that
implement the same idea. Can I have that computation scheme for non-IO
computations, too? With Parallel Strategies, monad-par etc.?
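For pure computations, the closest thing in the 'parallel' package to a
bounded worker pool is probably parBuffer from Control.Parallel.Strategies,
which keeps only a fixed number of sparks outstanding at a time rather than
sparking the whole list up front. A minimal sketch (the buffer size 8 is an
arbitrary choice, not a recommendation):

```haskell
import Control.Parallel.Strategies (parBuffer, rseq, using)

-- Evaluate the list elements in parallel, but keep at most ~8 sparks
-- outstanding at once; a new element is sparked each time one is
-- consumed. The RTS still decides which core runs each spark.
results :: [Integer]
results = map square [1 .. 20] `using` parBuffer 8 rseq
  where square n = n * n

main :: IO ()
main = print results
-- prints [1,4,9,16,25,36,49,64,81,100,121,144,169,196,225,256,289,324,361,400]
```

Note that this bounds the number of live sparks, not the number of OS
threads; the thread count is still governed by the RTS -N option.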
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Re: parallel Haskell with limited sparks

Henning Thielemann

On Tue, 9 Jul 2019, Henning Thielemann wrote:

> When I read about parallel programming in Haskell, it is always about
> manual chunking of data. Why? Most of my applications with
> parallelization benefit from a different computation scheme: start a
> fixed number of threads (at most the number of available computing
> cores), and whenever a thread finishes a task it gets assigned a new
> one. This is how make -j, cabal install -j, and ghc -j work. I wrote my
> own package pooled-io, which does the same for IO in Haskell, and there
> seem to be more packages that implement the same idea. Can I have that
> computation scheme for non-IO computations, too? With Parallel
> Strategies, monad-par etc.?

Maybe I misunderstood something, and the stuff from the 'parallel' package
already works the way I expected: starting 100 sparks does not mean that
GHC tries to run 100 threads concurrently, and chunking is only necessary
when computing the single list elements in parallel causes too much
parallelization overhead.
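If that reading is right, the two cases would look roughly like this (a
sketch using Control.Parallel.Strategies; the chunk size 100 is an
arbitrary illustration):

```haskell
import Control.Parallel.Strategies (parList, parListChunk, rseq, using)

xs :: [Integer]
xs = [1 .. 1000]

-- One spark per element: sparks are cheap to create, and the RTS
-- balances them across the available cores by work stealing.
perElement :: [Integer]
perElement = map (* 2) xs `using` parList rseq

-- One spark per 100-element chunk: only worthwhile when evaluating a
-- single element is too little work to pay for its own spark.
chunked :: [Integer]
chunked = map (* 2) xs `using` parListChunk 100 rseq

main :: IO ()
main = print (sum perElement, sum chunked)
-- prints (1001000,1001000)
```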

Re: parallel Haskell with limited sparks

Bryan Richter
On 7/9/19 12:02 PM, Henning Thielemann wrote:
 >
 > On Tue, 9 Jul 2019, Henning Thielemann wrote:
 >
 >> When I read about parallel programming in Haskell, it is always
 >> about manual chunking of data. Why? Most of my applications with
 >> parallelization benefit from a different computation scheme: start
 >> a fixed number of threads (at most the number of available computing
 >> cores), and whenever a thread finishes a task it gets assigned a new
 >> one. This is how make -j, cabal install -j, and ghc -j work. I wrote
 >> my own package pooled-io, which does the same for IO in Haskell, and
 >> there seem to be more packages that implement the same idea. Can I
 >> have that computation scheme for non-IO computations, too? With
 >> Parallel Strategies, monad-par etc.?
 >
 > Maybe I misunderstood something, and the stuff from the 'parallel'
 > package already works the way I expected: starting 100 sparks does
 > not mean that GHC tries to run 100 threads concurrently, and chunking
 > is only necessary when computing the single list elements in parallel
 > causes too much parallelization overhead.

This is absolutely correct. :)

     GHC doesn’t force us to use a fixed number of rpar calls; we
     can call it as many times as we like, and the system will
     automatically distribute the parallel work among the available
     cores. If the work is divided into smaller chunks, then the system
     will be able to keep all the cores busy for longer.

     A fixed division of work is often called static partitioning,
     whereas distributing smaller units of work among processors at
     runtime is called dynamic partitioning. GHC already provides the
     mechanism for dynamic partitioning; we just have to supply it with
     enough tasks by calling rpar often enough so that it can do its
     job and balance the work evenly.

     The argument to rpar is called a spark. The runtime collects
     sparks in a pool and uses this as a source of work when there
     are spare processors available, using a technique called work
     stealing. Sparks may be evaluated at some point in the future, or
     they might not—it all depends on whether there is a spare core
     available. Sparks are very cheap to create: rpar essentially just
     writes a pointer to the expression into an array.

     - https://www.oreilly.com/library/view/parallel-and-concurrent/9781449335939/ch02.html

Parallel and Concurrent Programming in Haskell goes into a lot of
detail about chunking strategies, and why you would or wouldn't try to
limit the number of sparks.
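The rpar mechanism the quote describes can be seen in miniature below (a
sketch; fib is just a deliberately slow stand-in for real work):

```haskell
import Control.Parallel.Strategies (rpar, rseq, runEval)

-- Naive Fibonacci: deliberately expensive, standing in for real work.
fib :: Int -> Integer
fib n | n < 2     = toInteger n
      | otherwise = fib (n - 1) + fib (n - 2)

main :: IO ()
main = do
  let (a, b) = runEval $ do
        x <- rpar (fib 25)  -- sparked: a spare core may steal this
        y <- rseq (fib 24)  -- evaluated by the current thread
        _ <- rseq x         -- wait for the spark before returning
        return (x, y)
  print (a, b)
-- prints (75025,46368)
```

Whether the spark actually runs on another core depends on the RTS -N
setting; with -N1 it is simply evaluated when demanded, which is exactly
the "may or may not run in parallel" behaviour described above.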

-Bryan


Re: parallel Haskell with limited sparks

Henning Thielemann

Thank you for the quote! I had even skimmed Simon Marlow's book before
asking, but missed that paragraph. :-(