I am running a Haskell program with some par/pseq annotations.
When I use threadscope to view the eventlog (not "timeline" but "events") I see a lot of stackoverflow messages. What is this? I don't get any RTS errors printed on the console. The program finishes normally (albeit with less speedup than expected) Should I file this as a possible GHC RTS bug? At least the output is confusing. (... and why can't I copy/paste the text from threadscope's output window) _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
I don't think a stack overflow event indicates an RTS bug. Stack
overflow events usually result in the RTS trying to adjust the stack size, and only if that fails, the program is halted. > (... and why can't I copy/paste the text from threadscope's output window) As a workaround, you can use the "show-ghc-events" binary that is provided by the ghc-events package. Cheers, Andres _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
> As a workaround, you can use the "show-ghc-events" binary that is > provided by the ghc-events package. Thanks, I wasn't aware of that. Are the following lines normal for an eventlog? ... 1877298000: cap 1: waking up thread 4 on cap 1 1877299000: cap 1: thread 4 is runnable 1877305000: cap 6: thread 4 is runnable 1877306000: cap 1: migrating thread 4 to cap 6 1877334000: cap 1: running thread 16 1877345000: cap 6: running thread 4 1877348000: cap 6: stopping thread 4 (thread finished) 1877428000: cap 3: stopping thread 14 (stack overflow) 1877428000: cap 3: running thread 14 1877501000: cap 1: stopping thread 16 (stack overflow) 1877503000: cap 1: running thread 16 1877606000: cap 3: stopping thread 14 (stack overflow) 1877607000: cap 3: running thread 14 1877658000: cap 1: stopping thread 16 (stack overflow) 1877659000: cap 1: running thread 16 1877723000: cap 4: stopping thread 10 (stack overflow) 1877724000: cap 4: running thread 10 1877769000: cap 3: stopping thread 14 (stack overflow) 1877770000: cap 3: running thread 14 ... _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
Here's source and logs:
http://www.imn.htwk-leipzig.de/~waldmann/draft/skpp11/subseqsum/Subseqsum.hs The program is meant to show an application of the "third homomorphism theorem" approach (hom-based structural parallel programming). My observation (for this program) is that there is little speedup, indeed the threadscope picture shows a peak only at the very end, but the eventlog contains surprising (for me) stack and heap overflows. Would this work better with Data.Sequence instead of List? (Is there a really cheap way (O(1)) to split some Data.Sequence roughly in half?) PS: I keep telling my students that "structural parallel programming" is the right thing to do, but I find it surprisingly difficult to exhibit clear-cut examples that support the claim (and work with standard ghc, so students can reproduce it). I appreciate any comments. - J.W. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
----- Original Message -----
> From: Johannes Waldmann <[hidden email]> > Sent: Tuesday, June 7, 2011 8:22 AM > > Here's source and logs: > > http://www.imn.htwk-leipzig.de/~waldmann/draft/skpp11/subseqsum/Subseqsum.hs > > The program is meant to show an application of the "third homomorphism > theorem" > approach (hom-based structural parallel programming). > > My observation (for this program) is that there is little speedup, > indeed the threadscope picture shows a peak only at the very end, > but the eventlog contains surprising (for me) stack and heap overflows. > > Would this work better with Data.Sequence instead of List? > (Is there a really cheap way (O(1)) to split some Data.Sequence roughly in > half?) You can split a Sequence in O(log n) time, or a bit cheaper near the ends. I'd be surprised if you see any speedup with this code. The serial algorithm is already O(n), maintaining two counters, and with a lazily generated list it probably runs in constant memory. length and splitAt are already O(n), make two separate traversals, and require allocating memory for the entire list. Even if you have an efficiently splittable data structure, the parallel version of the code needs to keep four counts for each chunk, and do more on each update. Have you tried a version that splits the list into larger chunks, and uses the serial algorithm for computing the left, right, overall, and maximum sums of each chunk, before summing them together? > PS: I keep telling my students that "structural parallel programming" > is the right thing to do, but I find it surprisingly difficult > to exhibit clear-cut examples that support the claim > (and work with standard ghc, so students can reproduce it). > > I appreciate any comments. - J.W. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Johannes Waldmann
Hi,
On 07/06/11 14:22, Johannes Waldmann wrote: > Would this work better with Data.Sequence instead of List? > (Is there a really cheap way (O(1)) to split some Data.Sequence roughly in half?) I came up with this using immutable unboxed arrays, which gives a nice parallel speedup (and somehow avoids the stack overflows, I didn't work out where they were coming from unfortunately): SPARKS: 1000268 (102821 converted, 0 pruned) INIT time 0.02s ( 0.02s elapsed) MUT time 0.90s ( 0.46s elapsed) GC time 0.03s ( 0.03s elapsed) EXIT time 0.01s ( 0.04s elapsed) Total time 0.97s ( 0.53s elapsed) %GC time 3.1% (5.8% elapsed) Alloc rate 586,961,335 bytes per MUT second Productivity 94.4% of total user, 173.5% of total elapsed on my dual-core laptop until around 1e6 elements when I compile with: ghc -O2 -Wall --make -threaded -rtsopts -fforce-recomp Subseqsum.hs and run with: ./Subseqsum 1e6 +RTS -N -s -M1G -A512M but after that (eg: 1e7) the GC time dominates and it slows right down. Note that I haven't tested it for correctness! So there may be bugs: ----8<---- import Data.List (unfoldr) import Control.Parallel (par, pseq) import Data.Monoid (Monoid, mempty, mappend) import Data.Array.Unboxed (UArray, listArray, (!)) import System.Environment (getArgs) main :: IO () main = do [ nn ] <- getArgs let n = read nn xs = stuff a = listArray (0, n - 1) xs print . t $ sss 0 n a stuff :: [Int] stuff = unfoldr ( \ x -> seq x $ Just ( x, mod (113 * x + 558) 335 - 167 ) ) 0 data O = O { s :: ! Int, l :: !Int, r :: !Int , t :: !Int } instance Monoid O where mempty = O { s = 0, r = 0, l = 0, t = 0 } o1 `mappend` o2 = let s' = s o1 + s o2 r' = max (r o2) ( s o2 + r o1 ) l' = max (l o1) ( s o1 + l o2 ) t' = max (r o1 + l o2) $ max ( t o1 ) ( t o2 ) in O { s = s', r = r', l = l', t = t' } msingle :: Int -> O msingle x = O { s = x, r = max x 0, l = max x 0, t = max x 0} sss :: Int -> Int -> UArray Int Int -> O sss lo hi a | lo == hi = mempty | lo + 1 == hi = msingle (a ! lo) | otherwise = let mid = (lo + hi) `div` 2 x = sss lo mid a y = sss mid hi a in x `par` y `pseq` (x `mappend` y) ----8<---- > PS: I keep telling my students that "structural parallel programming" I don't know that term, so I might be missing the point. Sorry if so. Thanks, Claude -- http://claudiusmaximus.goto10.org _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Johannes Waldmann
GHC starts threads with a small stack size to efficiently support lightweight concurrency. As a thread uses more stack space, it will be expanded as needed up to some maximum fixed size. I think these stack overflow events you see are the runtime expanding the thread stacks.
You can adjust the initial and maximum stack sizes using the -k (initial) and -K (max) RTS options. Quoting from the GHC users guide (http://www.haskell.org/ghc/docs/7.0-latest/html/users_guide/runtime-control.html#setting-rts-options): -ksize [Default: 1k] Set the initial stack size for new threads. Thread stacks (including the main thread's stack) live on the heap, and grow as required. The default value is good for concurrent applications with lots of small threads; if your program doesn't fit this model then increasing this option may help performance. The main thread is normally started with a slightly larger heap to cut down on unnecessary stack growth while the program is starting up. -Ksize [Default: 8M] Set the maximum stack size for an individual thread to size bytes. This option is there purely to stop the program eating up all the available memory in the machine if it gets into an infinite loop. On Jun 7, 2011, at 3:55 AM, Johannes Waldmann wrote: > >> As a workaround, you can use the "show-ghc-events" binary that is >> provided by the ghc-events package. > > Thanks, I wasn't aware of that. > > Are the following lines normal for an eventlog? > > ... > 1877298000: cap 1: waking up thread 4 on cap 1 > 1877299000: cap 1: thread 4 is runnable > 1877305000: cap 6: thread 4 is runnable > 1877306000: cap 1: migrating thread 4 to cap 6 > 1877334000: cap 1: running thread 16 > 1877345000: cap 6: running thread 4 > 1877348000: cap 6: stopping thread 4 (thread finished) > 1877428000: cap 3: stopping thread 14 (stack overflow) > 1877428000: cap 3: running thread 14 > 1877501000: cap 1: stopping thread 16 (stack overflow) > 1877503000: cap 1: running thread 16 > 1877606000: cap 3: stopping thread 14 (stack overflow) > 1877607000: cap 3: running thread 14 > 1877658000: cap 1: stopping thread 16 (stack overflow) > 1877659000: cap 1: running thread 16 > 1877723000: cap 4: stopping thread 10 (stack overflow) > 1877724000: cap 4: running thread 10 > 1877769000: cap 3: stopping thread 14 (stack overflow) > 1877770000: cap 3: running thread 14 > ... > > > _______________________________________________ > Haskell-Cafe mailing list > [hidden email] > http://www.haskell.org/mailman/listinfo/haskell-cafe > _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
In reply to this post by Claude Heiland-Allen-2
> I came up with this using immutable unboxed arrays [...]
> sss :: Int -> Int -> UArray Int Int -> O Nice. - Although what you do here is actually "C programming": you have a "global" array, and work on its indices. Actually you always have a pair of indices, denoting a subsequence. It would be nice if the program text showed this directly, using data Seq a = Seq { contents :: ! ( Array Int a ) , lo :: ! Int, hi :: ! Int } and then all (well, most) of the methods in Data.Sequence, implemented in such a way that the "contents" is just copied around, e.g., splitAt :: Int -> Seq a -> ( Seq a, Seq a ) splitAt k s = ( s { hi = lo s + k - 1 } , s { lo = lo s + k } ) and then a Foldable instance, and then ... specializations for the types that are allowed as elements of an unboxed array. (and that's the part that does not seem to be working easily. Can this be done by some specialize/rules magic in GHC?) > > PS: I keep telling my students that "structural parallel programming" > I don't know that term, so I might be missing the point. Sorry if so. Your code is fine. - Some pointers are here: http://www.iis.sinica.edu.tw/~scm/2008/constructing-list-homomorphism/ _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe |
Free forum by Nabble | Edit this page |