I'm looking for the "right" concurrency library/semantics for what should be a reasonably simple problem. I have a little simulator: runWorldSim :: MTGen -> SimState -> IO SimState it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: mergeWorld :: [SimState] -> SimState to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~20000 iterations. It's an obvious place for parallelism. I'm looking for a concurrency library with something like: forkSequence :: Int -> [IO a] -> IO [a] which I could call with something like this: forkSequence 4 (take 10 (repeat (runWorldSim g ss))) this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger. It strikes me as something someone would already have done, but I can't find anything in hackage. Probably I've missed something obvious? Any pointers? If not, what would be the best/easiest existing package to write an extension to? Thanks, Patrick. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
Patrick Caldon <[hidden email]> writes:
> it takes about a second to run on a PC. It's functional except it > whacks the rng, which needs IO. I run 5-10 of these jobs, and then > use: Which RNG are you using that it needs so much IO? > > mergeWorld :: [SimState] -> SimState > > to pick the best features of the runs and build another possible world > (state). Then I use this new world to run another 5-10 jobs and so > on. I run this through ~20000 iterations. > > It's an obvious place for parallelism. > > I'm looking for a concurrency library with something like: > > forkSequence :: Int -> [IO a] -> IO [a] > > which I could call with something like this: > > forkSequence 4 (take 10 (repeat (runWorldSim g ss))) > > this would construct 4 threads, then dispatch the 10 jobs onto the > threads, and pack up the > results into a list I could run through my merger. > > It strikes me as something someone would already have done, but I > can't find anything in hackage. Probably I've missed something > obvious? Any pointers? > > If not, what would be the best/easiest existing package to write an > extension to? > > Thanks, > Patrick. > > > _______________________________________________ > Haskell-Cafe mailing list > [hidden email] > http://www.haskell.org/mailman/listinfo/haskell-cafe -- Ivan Lazar Miljenovic [hidden email] IvanMiljenovic.wordpress.com _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
Ivan Lazar Miljenovic wrote:
> Patrick Caldon <[hidden email]> writes: > >> it takes about a second to run on a PC. It's functional except it >> whacks the rng, which needs IO. I run 5-10 of these jobs, and then >> use: >> > Which RNG are you using that it needs so much IO? Mersenne Twister, System.Random.Mersenne. The ordinary rng kills performance. Patrick. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Patrick Caldon-2
Patrick Caldon wrote:
> > I'm looking for the "right" concurrency library/semantics for what > should be a reasonably simple problem. > > I have a little simulator: > > runWorldSim :: MTGen -> SimState -> IO SimState > > it takes about a second to run on a PC. It's functional except it > whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: > > mergeWorld :: [SimState] -> SimState > > to pick the best features of the runs and build another possible world > (state). Then I use this new world to run another 5-10 jobs and so > on. I run this through ~20000 iterations. > > It's an obvious place for parallelism. > > I'm looking for a concurrency library with something like: > > forkSequence :: Int -> [IO a] -> IO [a] > > which I could call with something like this: > > forkSequence 4 (take 10 (repeat (runWorldSim g ss))) > > this would construct 4 threads, then dispatch the 10 jobs onto the > threads, and pack up the > results into a list I could run through my merger. run-time is quite good at spreading out the lightweight threads onto all your cores, so the easiest thing to do is run the 10 jobs on 10 (light-weight) threads and let the run-time sort out the rest. So if what you want is a function: runPar :: [IO a] -> IO [a] you can easily construct this. Shameless plug: my CHP library effectively has this function already, runParallel :: [CHP a] -> CHP [a] (CHP being a slight layer on top of IO). But you can do it just as easily with, say, STM. Here is a version where order doesn't matter (apologies for the point-free style): import Control.Concurrent import Control.Concurrent.STM import Control.Monad modifyTVar :: TVar a -> (a -> a) -> STM () modifyTVar tv f = readTVar tv >>= writeTVar tv . f runPar :: [IO a] -> IO [a] runPar ps = do resVar <- newTVarIO [] mapM_ (forkIO . (>>= atomically . modifyTVar resVar . (:))) ps atomically $ do res <- readTVar resVar when (length res < length ps) retry return res If order does matter, you can zip the results with an index, and sort by the index afterwards. If efficiency matters, you can perform other tweaks. But the principle is quite straightforward. Or you can refactor your code to take the IO dependency out of your random number generation, and run the sets of pure code in parallel using the parallel library. If all you are using IO for is random numbers, that's probably the nicest approach. Thanks, Neil. P.S. take 10 . repeat is the same as replicate 10 _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
Neil Brown wrote:
> Patrick Caldon wrote: >> >> I'm looking for the "right" concurrency library/semantics for what >> should be a reasonably simple problem. >> >> I have a little simulator: >> >> runWorldSim :: MTGen -> SimState -> IO SimState >> >> it takes about a second to run on a PC. It's functional except it >> whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: >> >> mergeWorld :: [SimState] -> SimState >> >> to pick the best features of the runs and build another possible >> world (state). Then I use this new world to run another 5-10 jobs >> and so on. I run this through ~20000 iterations. >> >> It's an obvious place for parallelism. >> >> I'm looking for a concurrency library with something like: >> >> forkSequence :: Int -> [IO a] -> IO [a] >> >> which I could call with something like this: >> >> forkSequence 4 (take 10 (repeat (runWorldSim g ss))) >> >> this would construct 4 threads, then dispatch the 10 jobs onto the >> threads, and pack up the >> results into a list I could run through my merger. > Why particularly do you want to run the 10 jobs on 4 threads? > Haskell's run-time is quite good at spreading out the lightweight > threads onto all your cores, so the easiest thing to do is run the 10 > jobs on 10 (light-weight) threads and let the run-time sort out the > rest. Thanks so much for that! I'll give it a go. Different threads is just because some of the jobs are memory hogs, and I want to minimize the number running simultaneously. I'll see what happens with a runPar-like approach, and use a queue-based approach if it becomes a problem. > So if what you want is a function: > > runPar :: [IO a] -> IO [a] > > you can easily construct this. Shameless plug: my CHP library > effectively has this function already, runParallel :: [CHP a] -> CHP > [a] (CHP being a slight layer on top of IO). But you can do it just > as easily with, say, STM. Here is a version where order doesn't > matter (apologies for the point-free style): > > import Control.Concurrent > import Control.Concurrent.STM > import Control.Monad > > modifyTVar :: TVar a -> (a -> a) -> STM () > modifyTVar tv f = readTVar tv >>= writeTVar tv . f > > runPar :: [IO a] -> IO [a] > runPar ps > = do resVar <- newTVarIO [] > mapM_ (forkIO . (>>= atomically . modifyTVar resVar . (:))) ps > atomically $ do res <- readTVar resVar > when (length res < length ps) retry > return res > > If order does matter, you can zip the results with an index, and sort > by the index afterwards. If efficiency matters, you can perform other > tweaks. But the principle is quite straightforward. Or you can > refactor your code to take the IO dependency out of your random number > generation, and run the sets of pure code in parallel using the > parallel library. If all you are using IO for is random numbers, > that's probably the nicest approach. > implementation using System.Random, but had to rewrite it because performance was poor :( . > P.S. take 10 . repeat is the same as replicate 10 Thanks again! Patrick. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
On Fri, Dec 4, 2009 at 12:28 PM, Patrick Caldon <[hidden email]> wrote:
Have you tried this, pure, library? http://hackage.haskell.org/package/mersenne-random-pure64
-- Sebastian Sylvan _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Patrick Caldon-2
On Fri, 2009-12-04 at 22:51 +1100, Patrick Caldon wrote:
> I'm looking for the "right" concurrency library/semantics for what > should be a reasonably simple problem. > > I have a little simulator: > > runWorldSim :: MTGen -> SimState -> IO SimState > > it takes about a second to run on a PC. It's functional except it whacks > the rng, which needs IO. Wait! This is not going to work! You cannot use the MTGen from the mersenne-random in a concurrent IO program because the C code uses a single global mutable RNG state. Your "independent" simulations would not be independent and you would not get reproducible results. Indeed you could get incorrect results or segfaults because the C code does not expect to be called from multiple threads simultaneously (there is no locking). Personally I would attack this by eliminating the IO. There's no justification for a random number generator being in IO. And look at the problems it causes! There are other MT implementations that do not use C code which assumes it's ok to use one single global mutable RNG state for an entire process. There are pure-Haskell MT impls that use mutable variables in ST but give an overall pure lazy list of random numbers. If you don't need MT specifically then there are other fast RNGs too. Duncan _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Patrick Caldon-2
On 04/12/09 11:51, Patrick Caldon wrote:
> > I'm looking for the "right" concurrency library/semantics for what > should be a reasonably simple problem. > > I have a little simulator: > > runWorldSim :: MTGen -> SimState -> IO SimState > > it takes about a second to run on a PC. It's functional except it > whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: > > mergeWorld :: [SimState] -> SimState > > to pick the best features of the runs and build another possible world > (state). Then I use this new world to run another 5-10 jobs and so > on. I run this through ~20000 iterations. > > It's an obvious place for parallelism. > to evaluate pure functions instead. If you only use IO for the random numbers then you can either keep a StdGen in your SimState or else use a "State StdGen" monad. Since your random number use is presumably already in monadic IO you could probably switch to a state monad fairly trivially. Paul. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Duncan Coutts-4
On Fri, Dec 4, 2009 at 7:38 AM, Duncan Coutts <[hidden email]> wrote:
So use the PRNG in the statistics package instead. It's got some nice features that make it a better choice than mersenne-random for essentially all uses:
_______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Patrick Caldon-2
It appears there are several implementations existing on Hackage of the
following function, in various disguises: runPar :: [IO a] -> IO [a] the idea being that the IO computations are run in parallel, rather than sequentially. My own Streaming Component Combinators package contains a similar function, but somewhat generalized: class Monad m => ParallelizableMonad m where parallelize :: m a -> m b -> m (a, b) instance ParallelizableMonad IO -- implemented using forkIO instance ParallelizableMonad Identity -- implemented using par instance ParallelizableMonad Maybe -- implemented using par Would there be any interest in having this class packaged in a separate library? If so, can you sugest a better name or some additional functionality? _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
On Wed, Dec 9, 2009 at 2:17 PM, Mario Blazevic <[hidden email]> wrote:
> It appears there are several implementations existing on Hackage of > the following function, in various disguises: > > runPar :: [IO a] -> IO [a] > > > the idea being that the IO computations are run in parallel, rather than > sequentially. My own Streaming Component Combinators package contains a > similar function, but somewhat generalized: > > > class Monad m => ParallelizableMonad m where > parallelize :: m a -> m b -> m (a, b) > > instance ParallelizableMonad IO -- implemented using forkIO > instance ParallelizableMonad Identity -- implemented using par > instance ParallelizableMonad Maybe -- implemented using par > > > Would there be any interest in having this class packaged in a > separate library? If so, can you sugest a better name or some additional > functionality? A similar function that I'm fond of: forkExec :: IO a -> IO (IO a) forkExec k = do result <- newEmptyMVar _ <- forkIO $ k >>= putMVar result return (takeMVar result) Although I don't think it can be generalized to non-IO monads. Antoine _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
Antoine Latter wrote:
> A similar function that I'm fond of: > > forkExec :: IO a -> IO (IO a) It's cute that forkExec already has a dual operation with just the right name (specialised to IO): join :: IO (IO a) -> IO a _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Antoine Latter-2
> A similar function that I'm fond of: > > forkExec :: IO a -> IO (IO a) > forkExec k > = do > result <- newEmptyMVar > _ <- forkIO $ k >>= putMVar result > return (takeMVar result) > > Although I don't think it can be generalized to non-IO monads. > > Antoine > I can't test it right now, but wouldn't the following do the job in the Identity monad? forkExec :: Identity a -> Identity (Identity a) forkExec k = let result = runIdentity k in result `par` return (Identity result) _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
On Wed, Dec 9, 2009 at 3:44 PM, Mario Blazevic <[hidden email]> wrote:
> > I can't test it right now, but wouldn't the following do the job in > the Identity monad? > > forkExec :: Identity a -> Identity (Identity a) > forkExec k = let result = runIdentity k > in result `par` return (Identity result) > Since Identity is a newtype, would that be equivalent to "result `par` result"? The forkExec in the IO monad let's other computations keep going until I need the result from the forked computation. In a pure computation, I can already get the same result with `par` and laziness, right? Antoine _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Matthew Brecknell-3
It's a good thing then that forkExec and return are denotationally equal
(though not operationally). Otherwise, I'd be worried. Matthew Brecknell wrote: > Antoine Latter wrote: >> A similar function that I'm fond of: >> >> forkExec :: IO a -> IO (IO a) > > It's cute that forkExec already has a dual operation with just the right > name (specialised to IO): > > join :: IO (IO a) -> IO a > > > > _______________________________________________ > Haskell-Cafe mailing list > [hidden email] > http://www.haskell.org/mailman/listinfo/haskell-cafe > > _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
>
> > I can't test it right now, but wouldn't the > > following do the job in the Identity monad? > > > forkExec :: Identity a -> Identity (Identity a) > > forkExec k = let result = runIdentity k > > in result `par` return (Identity result) > > > Since Identity is a newtype, would that be equivalent to "result `par` > result"? The forkExec in the IO monad let's other computations keep > going until I need the result from the forked computation. You're right, it doesn't seem to work the way I hoped. The equivalent function on Maybe monad works, though, so it is possible to write forkExec in monads other than IO. > In a pure computation, I can already get the same result with `par` > and laziness, right? Yes. The goal is to enable writing monadic parallel computations which work under any parallelizable monad. For example, I'm using it to run two trampolining producer/consumer coroutines in parallel. A large majority of interesting coroutines I have are completely agnostic with respect to the underlying monad. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
Free forum by Nabble | Edit this page |