Executing conduit streams in parallel leads to memory leaks

Simon Hafner
When I run my conduit pipeline on its own, it works as expected, with
the low, constant memory usage that conduit advertises. It is a bit
slow, so I tried to speed it up with worker pools (via parallel-io) and
staged folding (via stm-conduit). With those changes, however, the
memory usage indicates that all the ByteStrings read from the files
are fully allocated and kept in memory, even though they are no longer
used after a single conduit step. [1]

I suspected that closing the file was involved, with the release of
the file handle somehow keeping the read string in memory, so I wanted
to make absolutely sure that is not the problem. [2] To undo that
specific part, replace `Lib.readFile` with `B.readFile`.
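
For readers without the repository at hand, here is a minimal sketch of
what "making sure the handle is released" could look like. It is not
the code from the repository: `readFileStrict` is a hypothetical
stand-in for `Lib.readFile`, and it assumes `B` is the strict
`Data.ByteString` module.

```haskell
import qualified Data.ByteString as B
import System.IO (IOMode (ReadMode), withFile)

-- Hypothetical stand-in for the post's `Lib.readFile` (an assumption,
-- not the repository's code). The strict `B.hGetContents` reads the
-- whole file before returning, and `withFile` guarantees the handle is
-- closed by the time the ByteString reaches the caller.
readFileStrict :: FilePath -> IO B.ByteString
readFileStrict path = withFile path ReadMode B.hGetContents
```

With this shape the handle cannot outlive the read, so any remaining
retention would have to come from whatever still references the
returned ByteString.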

I was not using a worker pool at first, so maybe `mapConcurrently_`
somehow allocated all the threads, but with the pooled solution that
should no longer be an issue.
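
To make the pooling idea concrete, here is a rough sketch that uses
only `base` and `bytestring`. It is not the parallel-io or stm-conduit
API and not the code from the repository: `boundedMapM`, `tryAny` and
`fileSize` are hypothetical names. A semaphore caps how many files are
being processed at once, and each worker reduces its ByteString to a
small, fully evaluated result before returning.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (SomeException, bracket_, evaluate, throwIO, try)
import qualified Data.ByteString as B

-- `try` specialised to catch every exception (hypothetical helper).
tryAny :: IO a -> IO (Either SomeException a)
tryAny = try

-- Hypothetical helper: run `work` on every path, with at most `limit`
-- invocations of `work` active at the same time. One lightweight
-- thread is forked per path, but each waits on the semaphore before
-- it starts working. Results come back in input order.
boundedMapM :: Int -> (FilePath -> IO a) -> [FilePath] -> IO [a]
boundedMapM limit work paths = do
  sem <- newQSem limit
  slots <- mapM (spawn sem) paths
  mapM collect slots
  where
    spawn sem path = do
      slot <- newEmptyMVar
      _ <- forkIO $ do
        r <- tryAny (bracket_ (waitQSem sem) (signalQSem sem) (work path))
        putMVar slot r
      pure slot
    -- Rethrow a worker's exception in the collecting thread.
    collect slot = takeMVar slot >>= either throwIO pure

-- Example worker (an assumption for illustration): read a file and
-- keep only its length. `evaluate` forces the Int, so the returned
-- value holds no reference to the ByteString.
fileSize :: FilePath -> IO Int
fileSize path = do
  bytes <- B.readFile path
  evaluate (B.length bytes)
```

A call such as `boundedMapM 4 fileSize paths` would then have at most
four file contents being processed at any moment, provided the worker
really does return a value that no longer references the ByteString.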

What else could cause all the ByteStrings to be kept in memory in the
parallel version?

The example is available at:
https://github.com/reactormonk/non-constant-memory

[1] https://github.com/reactormonk/non-constant-memory/blob/master/src/Lib.hs#L51
[2] https://github.com/reactormonk/non-constant-memory/blob/master/src/Lib.hs#L62
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.