Bug-free, leak-free, battle-tested library for broadcast channels?

Saurabh Nanda
(cross-posted from Reddit because I'm not sure of the audience overlap between haskell-cafe & reddit)

I want to broadcast some instrumentation data from deep within my Haskell app. This data will have listeners in some environments (say, debug), but not others (say, production). Which library should I be using? A little searching threw up two possible candidates:

* http://hackage.haskell.org/package/split-channel
* https://hackage.haskell.org/package/broadcast-chan

But I'm not sure if these are battle-tested. Any help would be appreciated.

-- Saurabh.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Merijn Verstraaten
I don't know about split-channel, which seems to try and have a lot of functionality, but broadcast-chan is literally (and I mean the "copy+paste" kind of literally) the exact same implementation as Control.Concurrent.Chan.

The only difference is that with Control.Concurrent.Chan, each Chan always has access to both the read end and the write end of the Chan. That read end keeps the data inside the Chan alive (and thus in memory!) indefinitely if no one reads from that Chan. Using dupChan creates a new read end that tracks data in the Chan separately from the original one.

This means that if you have a worker that only ever writes into a Chan, the original read end is basically keeping everything you write into the Chan alive forever.

broadcast-chan does the exact same thing as 'newBroadcastTChan' from Control.Concurrent.STM.TChan: your original Chan contains only a "write" end and no read end. The end result is that if you create a new write channel, any message you write into it while there are no listeners is immediately dropped and GCed. If you create "read" ends, then each read end receives every message written into the write end for as long as that read end is active. So if you create 5 workers with a new read end each, every message written after those read ends have been created will be seen by all workers.

So, if your problem is "I want to broadcast messages, but messages that are sent when there are no listeners should be dropped and forgotten", then use broadcast-chan.
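
A minimal sketch of the behaviour described above. It uses `newBroadcastTChanIO` and `dupTChan` from the stm package rather than broadcast-chan itself, since the broadcast-chan API is not shown in this thread. The helper `drain` and the message strings are made up for illustration.

```haskell
import Control.Concurrent.STM

-- Hypothetical helper: read everything currently queued on a listener.
drain :: TChan a -> STM [a]
drain c = do
  next <- tryReadTChan c
  case next of
    Nothing -> pure []
    Just x  -> (x :) <$> drain c

demo :: IO ([String], [String])
demo = do
  chan <- newBroadcastTChanIO             -- write end only
  atomically $ writeTChan chan "dropped"  -- no listeners yet, so nobody will see this
  l1 <- atomically $ dupTChan chan        -- first read end
  l2 <- atomically $ dupTChan chan        -- second read end
  atomically $ mapM_ (writeTChan chan) ["a", "b"]
  xs <- atomically $ drain l1
  ys <- atomically $ drain l2
  pure (xs, ys)

main :: IO ()
main = demo >>= print
```

Both listeners should see "a" and "b", and neither should see "dropped", because it was written before any read end existed.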

The package itself isn't "battle-tested", but since it's basically a copy+paste of Control.Concurrent.Chan with a trivial wrapper, I feel fairly confident that it doesn't have any major problems.

Cheers,
Merijn



Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Greg Horn
I'm not sure if it's exactly what you're looking for, but I've been using zeromq for this for a long time: http://hackage.haskell.org/package/zeromq4-haskell. It's a binding to a C++ library, but that means you can also communicate with programs written in other languages that have bindings.


Re: Bug-free, leak-free, battle-tested library for broadcast channels?

David Turner-2
In reply to this post by Saurabh Nanda
The implementation in STM works well and certainly ticks the battle-tested box:

https://hackage.haskell.org/package/stm-2.4.4.1/docs/Control-Concurrent-STM-TChan.html#v:newBroadcastTChan

Cheers,




Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Saurabh Nanda
In reply to this post by Merijn Verstraaten
> broadcast-chan does the exact same thing as 'newBroadcastTChan' from Control.Concurrent.STM.TChan [...]

If broadcast-chan does the exact same thing as newBroadcastTChan, why the need for a new library? This is what I'm trying to understand in the discussion on Reddit [1]. Would you prefer the mailing list or Reddit for the discussion?



Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Saurabh Nanda
In reply to this post by David Turner-2
> The implementation in STM works well and certainly ticks the battle-tested box:
>
> https://hackage.haskell.org/package/stm-2.4.4.1/docs/Control-Concurrent-STM-TChan.html#v:newBroadcastTChan


What does the following comment in the documentation really mean (highlighted by >>><<<)?

"Create a write-only TChan. >>> More precisely, readTChan will retry even after items have been written to the channel.<<< The only way to read a broadcast channel is to duplicate it with dupTChan."



-- Saurabh.



Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Saurabh Nanda
In reply to this post by Merijn Verstraaten
Am I using newBroadcastTChan & STM correctly here? https://gist.github.com/saurabhnanda/3cc39f1e0a646254dd8819b01cd04ac3 Is it acceptable to basically create a broadcast `TChan` and make it "escape" the STM monad?




Re: Bug-free, leak-free, battle-tested library for broadcast channels?

David Turner-2
In reply to this post by Saurabh Nanda
At a high level, it means it does what I think you're looking for: the "read end" of the channel created there isn't connected to anything, so there's no leak if there are no consumers. But duplicating it gives you a new "read end" that yields values as you would expect.

Without going into too much detail about STM, everything runs in a transaction, and `retry` means to roll back any changes within the transaction and start again. An unconditional `retry` (including unconditionally reading from a channel created with `newBroadcastTChan`) is a deadlock and therefore probably a mistake. So don't do that!
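
A small sketch of the safe pattern, assuming the stm package: the listener reads from a duplicate made with `dupTChan`, never from the write-only channel itself. `readTChan` on the duplicate retries (blocks) only until a value arrives. The name `demoBlockingRead` and the value 42 are illustrative.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM

demoBlockingRead :: IO Int
demoBlockingRead = do
  chan     <- newBroadcastTChanIO        -- write-only end: do not read from this
  listener <- atomically (dupTChan chan) -- read end, created before the write
  done     <- newEmptyMVar
  _ <- forkIO $ do
    -- Retries until something has been written after the duplicate was made.
    x <- atomically (readTChan listener)
    putMVar done x
  atomically (writeTChan chan 42)
  takeMVar done

main :: IO ()
main = demoBlockingRead >>= print
```

Because the duplicate exists before the write, the value is delivered whether the forked thread starts reading before or after `writeTChan` runs.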



Re: Bug-free, leak-free, battle-tested library for broadcast channels?

David Turner-2
In reply to this post by Saurabh Nanda
Looks good to me. In particular, creating things in one transaction and using them in another is normal for STM.

If you're creating the channel in a transaction on its own you might prefer to call this:
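
Assuming the function meant here is `newBroadcastTChanIO` from stm (the link target is not preserved in this archive, so that is a guess), a minimal sketch of a channel that is created once in IO and then used from separate transactions could look like this. The `Instrumentation` wrapper and the `emit`/`subscribe` helpers are hypothetical names, not taken from the gist.

```haskell
import Control.Concurrent.STM

-- Hypothetical wrapper type; not from the gist or any library.
newtype Instrumentation = Instrumentation { events :: TChan String }

-- Created directly in IO, without wrapping a transaction around it.
newInstrumentation :: IO Instrumentation
newInstrumentation = Instrumentation <$> newBroadcastTChanIO

-- The same thing via a transaction that does nothing but create the channel.
newInstrumentationSTM :: IO Instrumentation
newInstrumentationSTM = Instrumentation <$> atomically newBroadcastTChan

-- Each call runs its own transaction on the channel created earlier.
emit :: Instrumentation -> String -> IO ()
emit inst msg = atomically (writeTChan (events inst) msg)

subscribe :: Instrumentation -> IO (TChan String)
subscribe inst = atomically (dupTChan (events inst))

demoEscape :: IO String
demoEscape = do
  inst     <- newInstrumentation
  listener <- subscribe inst
  emit inst "start"
  atomically (readTChan listener)

main :: IO ()
main = demoEscape >>= putStrLn
```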



Re: Bug-free, leak-free, battle-tested library for broadcast channels?

David Turner-2
On an unrelated note, are you sure about using `getCurrentTime` to measure the start and end times of your process? If you want to report on the duration of the process, it can be better to use the monotonic clock from here:

https://hackage.haskell.org/package/clock-0.7.2/docs/System-Clock.html

The trouble with the real-time clock is that it can change discontinuously, even backwards, and it is hard to account for things like leap seconds with it, so durations calculated by subtracting two UTCTimes are a little unreliable.

OTOH you might well want to know the real start and end times of your process instead/as well - it depends on your application.
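
As a rough sketch of recording both, assuming the clock package's `getTime Monotonic` and `toNanoSecs` together with `getCurrentTime` from the time package: the wall-clock time is kept only as a label, and the duration comes from the monotonic clock. The helper name `timed` is made up for this example.

```haskell
import Control.Concurrent (threadDelay)
import Data.Time.Clock (UTCTime, getCurrentTime)
import System.Clock (Clock (Monotonic), getTime, toNanoSecs)

-- Hypothetical helper: run an action and return its wall-clock start time,
-- its duration in nanoseconds (from the monotonic clock), and its result.
timed :: IO a -> IO (UTCTime, Integer, a)
timed action = do
  startedAt <- getCurrentTime        -- for reporting only
  t0        <- getTime Monotonic     -- for measuring
  result    <- action
  t1        <- getTime Monotonic
  pure (startedAt, toNanoSecs t1 - toNanoSecs t0, result)

main :: IO ()
main = do
  (startedAt, nanos, ()) <- timed (threadDelay 10000)
  putStrLn (show startedAt ++ ": task started")
  putStrLn ("task took " ++ show (fromIntegral nanos / 1e6 :: Double) ++ " ms")
```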

Hope that helps,

David



Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Saurabh Nanda
> The trouble with the real-time clock is that it can change discontinuously, even backwards, and it is hard to account for things like leap seconds with it, so durations calculated by subtracting two UTCTimes are a little unreliable.

Thanks for pointing that out. Didn't realize this subtlety.

-- Saurabh.
 


Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Gregory Collins-3
In reply to this post by Saurabh Nanda
Try "unagi-chan"; it's supposed to be much faster than Control.Concurrent.Chan and comes with bounded channels out of the box.

Greg


Re: Bug-free, leak-free, battle-tested library for broadcast channels?

Joachim Durchholz
In reply to this post by David Turner-2
On 25.01.2017 at 09:47, David Turner wrote:
> The trouble with the real-time clock is that it can change
> discontinuously, even backwards, and it is hard to account for things
> like leap seconds with it, so durations calculated by subtracting two
> UTCTimes are a little unreliable.
>
> OTOH you might well want to know the real start and end times of your
> process instead/as well - it depends on your application.

You use the proper tool for the job.
Monotonic clock for measuring durations, UTC for reporting approximate
points in time. I.e. something like
   2017-01-12 10:43: benchmark #3 started
   2017-01-12 10:43: benchmark #3 took 3.472 ms to complete

Mixing the monotonic durations and UTC is a path to madness, and nothing
good can come from it. (SCNR the hyperbole.)