Re: Cloud Haskell and network latency issues with -threaded

Tim Watson
Hi Kostirya,

I'm putting the parallel-haskell and ghc-users lists on cc, just in case other (better informed) folks want to chip in here.

----

First of all, I'm assuming you're talking about network latency when compiling with -threaded - if not I apologise for misunderstanding!

There is apparently an outstanding network latency issue when compiling with -threaded, but according to a conversation I had with the other developers on #haskell-distributed, this is not something that's specific to Cloud Haskell. It is something to do with the threaded runtime system, so would need to be solved for GHC (or is it just the Network package!?) in general. Writing up a simple C program and equivalent socket use in Haskell and comparing the latency using -threaded will show this up.
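A rough sketch of what the Haskell half of such a comparison might look like follows. It is not the benchmark from the wiki page or from NTTCP-4, just a minimal loopback ping-pong. It assumes the `network` and `bytestring` packages are installed, and the names `loopback`, `echoOnce` and `meanRoundTripMicros` are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Concurrent (forkIO, rtsSupportsBoundThreads)
import Control.Monad (replicateM_, unless)
import qualified Data.ByteString as BS
import GHC.Clock (getMonotonicTimeNSec)
import Network.Socket
import qualified Network.Socket.ByteString as NSB

-- Loopback address for a given port (hypothetical helper).
loopback :: PortNumber -> SockAddr
loopback port = SockAddrInet port (tupleToHostAddress (127, 0, 0, 1))

-- Accept one connection and echo every byte back until the peer closes.
echoOnce :: Socket -> IO ()
echoOnce listener = do
  (conn, _) <- accept listener
  setSocketOption conn NoDelay 1
  let loop = do
        msg <- NSB.recv conn 1
        unless (BS.null msg) (NSB.sendAll conn msg >> loop)
  loop
  close conn

-- Mean round-trip time in microseconds over the given number of pings.
meanRoundTripMicros :: Int -> IO Double
meanRoundTripMicros rounds = do
  listener <- socket AF_INET Stream defaultProtocol
  bind listener (loopback 0)          -- port 0: let the OS pick a free port
  listen listener 1
  port <- socketPort listener
  _ <- forkIO (echoOnce listener)
  client <- socket AF_INET Stream defaultProtocol
  setSocketOption client NoDelay 1    -- avoid Nagle delays on 1-byte messages
  connect client (loopback port)
  start <- getMonotonicTimeNSec
  replicateM_ rounds $ do
    NSB.sendAll client "x"
    _ <- NSB.recv client 1
    return ()
  end <- getMonotonicTimeNSec
  close client
  close listener
  return (fromIntegral (end - start) / (1000 * fromIntegral rounds))

main :: IO ()
main = do
  -- True only when the program was linked with -threaded.
  putStrLn ("threaded RTS: " ++ show rtsSupportsBoundThreads)
  t <- meanRoundTripMicros 10000
  putStrLn ("mean round trip: " ++ show t ++ " us")
```

Building this once with `ghc -O2` and once with `ghc -O2 -threaded`, then comparing the reported means against a C equivalent, should be enough to see whether the gap shows up on a given machine. The absolute numbers will vary with hardware and GHC version.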

See the latency section in http://haskell-distributed.github.com/wiki/networktransport.html for some more details. According to that, there *are* some things we might be able to do, but the 20% latency isn't going to change significantly on the face of things.

We have an open ticket to look into this (https://cloud-haskell.atlassian.net/browse/NTTCP-4) and at some point we'll try and put together the sample programs in a github repository (if that's not already done - I might've missed previous spikes done by Edsko or others) and investigate further.

One of the other (more experienced!) devs might be able to chip in and proffer a better explanation.

Cheers,
Tim


On 6 Feb 2013, at 13:27, [hidden email] wrote:

> Have you ever needed to run Haskell in non-threaded mode during intense network data exchange?
> I am seeing a twofold performance penalty in threaded mode, but I have to use threaded mode because epoll and kevent are only available there.
>

[snip]

>
>
> On Wednesday, 6 February 2013 at 12:33:36 UTC+2, Tim Watson wrote:
> Hello all,
>
> It's been a busy week for Cloud Haskell and I wanted to share a few of
> our news items with you all.
>
> Firstly, we have a new home page at http://haskell-distributed.github.com,
> into which most of the documentation and wiki pages have been merged. Making
> sassy looking websites is not really my bag, so I'm very grateful to the
> various authors whose Creative Commons licensed designs and layouts made
> it easy to put together. We've already had some pull requests to fix minor
> problems on the site, so thanks very much to those who've contributed already!
>
> As well as the new site, you will find a few of us hanging out on the
> #haskell-distributed channel on freenode. Please do come along and join in
> the conversation.
>
> We also recently split up the distributed-process project into separate
> git repositories, one for each component that makes up Cloud Haskell. This
> was done partly for administrative purposes and partly because we're in the
> process of setting up CI builds for all the projects.
>
> Finally, we've moved from Github's issue tracker to a hosted Jira/Bamboo setup
> at https://cloud-haskell.atlassian.net - pull requests are naturally still welcome
> via Github! Although you can browse issues freely without logging in, you will
> need to provide an email address and get an account in order to submit new ones.
> If you have any difficulties logging in, please don't hesitate to contact me
> directly, via this forum or the cloud-haskell-developers mailing list (on
> google groups).
>
> As always, we'd be delighted to hear any feedback!
>
> Cheers,
> Tim


_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Cloud Haskell and network latency issues with -threaded

Andreas Voellmy
Hi Edsko, 

Can you explain the figure linked to on that page a bit? E.g. how should the axes be labelled? 


On Wed, Feb 6, 2013 at 9:33 AM, Edsko de Vries <[hidden email]> wrote:
Hi,

Just for clarity's sake (as the author of that "Latency" section that Tim referred to): I have addressed all of the issues listed there in Network.Transport.TCP, with the exception of the first (the -threaded issue). As Tim points out, this is not a Cloud Haskell specific issue; I have written this up as a short blog post at http://www.edsko.net/2013/02/06/performance-problems-with-threaded .

Edsko



On Wednesday, 6 February 2013 14:09:22 UTC, Tim Watson wrote:
[snip]

RE: Cloud Haskell and network latency issues with -threaded

Simon Peyton Jones
In reply to this post by Tim Watson

> I (with help from Kazu and helpful comments from Bryan and Johan) have nearly completed an overhaul to the IO manager based on my observations and we are in the final stages of getting it into GHC

This is really helpful. Thank you very much Andreas, Kazu, Bryan, Johan.

Simon

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Andreas Voellmy
Sent: 06 February 2013 14:28
To: [hidden email]
Cc: [hidden email]; parallel-haskell; [hidden email]
Subject: Re: Cloud Haskell and network latency issues with -threaded

 

Hi all,

I haven't followed the conversations around CloudHaskell closely, but I noticed the discussion around latency using the threaded runtime system, and I thought I'd jump in here.

I've been developing a server in Haskell that serves hundreds to thousands of clients over very long-lived TCP sockets. I also had latency problems with GHC. For example, with 100 clients I had a 10 ms (millisecond) latency and with 500 clients I had a 29 ms latency. I looked into the problem and found that some bottlenecks in the threaded IO manager were the cause. I made some hacks there and got the latency for 100 and 500 clients down to under 0.2 ms. I (with help from Kazu and helpful comments from Bryan and Johan) have nearly completed an overhaul to the IO manager based on my observations and we are in the final stages of getting it into GHC. Hopefully our work will also fix the latency issues in CloudHaskell programs :)

It would be very helpful if someone has some benchmark CloudHaskell applications and workloads to test with. Does anyone have these handy?

Cheers,
Andi

 

On Wed, Feb 6, 2013 at 9:09 AM, Tim Watson <[hidden email]> wrote:

[snip]

RE: Cloud Haskell and network latency issues with -threaded

Edward Z. Yang
Hey folks,

The latency changes sound relevant to some work on the scheduler I'm doing;
is there a place I can see the changes?

Thanks,
Edward

Excerpts from Simon Peyton-Jones's message of Wed Feb 06 10:10:10 -0800 2013:

> I (with help from Kazu and helpful comments from Bryan and Johan) have nearly completed an overhaul to the IO manager based on my observations and we are in the final stages of getting it into GHC
>
> This is really helpful. Thank you very much Andreas, Kazu, Bryan, Johan.
>
> Simon
>
> [snip]

Re: Cloud Haskell and network latency issues with -threaded

Andreas Voellmy
Hi Edward, 

I did two things to improve latency for my application: (1) rework the IO manager and (2) stabilize the work pushing. (1) seems like a big win and we are almost done with the work on that part. It is less clear whether (2) will generally help much. It helped me when I developed it against 7.4.1, but it doesn't seem to have much impact on HEAD on the few measurements I did. The idea of (2) was to keep running averages of the run queue length of each capability, then push work when these running averages get too out-of-balance. The desired effect (which seems to work on my particular application) is to avoid cases in which threads are pushed back and forth among cores, which may make cache usage worse. You can see my patch here: https://github.com/AndreasVoellmy/ghc-arv/commits/push-work-exchange-squashed
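To make idea (2) a little more concrete, here is a toy model in plain Haskell. It is not the code from the patch linked above (that change lives in the runtime system); the smoothing factor, the threshold, and the names `updateAverage`, `step` and `shouldPush` are assumptions chosen purely for illustration.

```haskell
module Main (main) where

import Data.List (foldl')

-- Exponentially weighted running average; alpha weights the newest sample.
updateAverage :: Double -> Double -> Int -> Double
updateAverage alpha avg sample =
  alpha * fromIntegral sample + (1 - alpha) * avg

-- Fold one round of run-queue length samples (one per capability)
-- into the per-capability running averages.
step :: Double -> [Double] -> [Int] -> [Double]
step alpha = zipWith (updateAverage alpha)

-- Push work only when the smoothed queue lengths differ by more than
-- the threshold, so a momentary spike does not trigger migration.
shouldPush :: Double -> [Double] -> Bool
shouldPush _ [] = False
shouldPush threshold avgs = maximum avgs - minimum avgs > threshold

main :: IO ()
main = do
  let alpha     = 0.25
      threshold = 2.0
      -- A one-off spike on capability 0, versus a sustained backlog.
      spike     = [[8, 0], [0, 0], [0, 0]]
      sustained = replicate 4 [8, 0]
      run       = foldl' (step alpha) [0, 0]
  print (shouldPush threshold (run spike))      -- prints False
  print (shouldPush threshold (run sustained))  -- prints True
```

The point of smoothing is visible in the two cases: a single long queue reading decays away before it crosses the threshold, while a backlog that persists over several samples does trigger a push.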

-Andi


On Fri, Feb 8, 2013 at 12:10 AM, Edward Z. Yang <[hidden email]> wrote:
Hey folks,

The latency changes sound relevant to some work on the scheduler I'm doing;
is there a place I can see the changes?

Thanks,
Edward

[snip]

Re: Cloud Haskell and network latency issues with -threaded

Edward Z. Yang
OK. I think it is high priority for us to get some latency benchmarks
into nofib so that GHC devs (including me) can start measuring changes
off them.  I know Edsko has some benchmarks here:
http://www.edsko.net/2013/02/06/performance-problems-with-threaded/
but they depend on network which makes it a little difficult to move into nofib.
I'm working on other scheduler changes that may help you guys out; we
should keep each other updated.

I noticed your patch also incorporates the "make yield actually work" patch;
do you think the improvement in 7.4.1 was due to that specific change?
(Have you instrumented the run queues and checked how your patch changes
the distribution of jobs over your runtime?)

Somewhat unrelatedly, if you have some good latency tests already,
it may be worth a try compiling your copy of GHC -fno-omit-yields, so that
forced context switches get serviced more predictably.
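For background on that flag: by default GHC omits yield points in code that does not allocate, so a tight non-allocating loop may not be preempted promptly. The sketch below is only an illustration of that effect, not one of the latency tests discussed in this thread; the names `spin` and `overshootMicros` and the iteration count are hypothetical. It measures how late a 1 ms `threadDelay` wakes up while another Haskell thread spins.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import GHC.Clock (getMonotonicTimeNSec)

-- A strict counting loop; with optimisation it typically does not allocate.
spin :: Int -> Int
spin n = go 0 0
  where
    go !acc !i
      | i >= n    = acc
      | otherwise = go (acc + i) (i + 1)

-- How many microseconds later than requested a threadDelay returned.
overshootMicros :: Int -> IO Int
overshootMicros requested = do
  start <- getMonotonicTimeNSec
  threadDelay requested
  end <- getMonotonicTimeNSec
  return (fromIntegral ((end - start) `div` 1000) - requested)

main :: IO ()
main = do
  done <- newEmptyMVar
  -- Run the loop to completion in its own Haskell thread.
  _ <- forkIO (putMVar done $! spin 50000000)
  -- While it runs, see how promptly the main thread gets rescheduled.
  late <- overshootMicros 1000
  putStrLn ("1 ms delay overshot by " ++ show late ++ " us")
  result <- takeMVar done
  putStrLn ("spin result: " ++ show result)
```

Compiling with `ghc -O2` and then with `ghc -O2 -fno-omit-yields`, on a single capability, is one way to compare the overshoot. Whether a difference shows up depends on the GHC version and on whether the loop really compiles to non-allocating code, so treat any numbers as indicative only.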

Cheers,
Edward

Excerpts from Andreas Voellmy's message of Thu Feb 07 21:20:25 -0800 2013:

> Hi Edward,
>
> I did two things to improve latency for my application: (1) rework the IO
> manager and (2) stabilize the work pushing. (1) seems like a big win and we
> are almost done with the work on that part. It is less clear whether (2)
> will generally help much. It helped me when I developed it against 7.4.1,
> but it doesn't seem to have much impact on HEAD on the few measurements I
> did. The idea of (2) was to keep running averages of the run queue length
> of each capability, then push work when these running averages get too
> out-of-balance. The desired effect (which seems to work on my particular
> application) is to avoid cases in which threads are pushed back and forth
> among cores, which may make cache usage worse. You can see my patch here:
> https://github.com/AndreasVoellmy/ghc-arv/commits/push-work-exchange-squashed
> .
>
> -Andi
>
> On Fri, Feb 8, 2013 at 12:10 AM, Edward Z. Yang <[hidden email]> wrote:
>
> > Hey folks,
> >
> > The latency changes sound relevant to some work on the scheduler I'm doing;
> > is there a place I can see the changes?
> >
> > Thanks,
> > Edward
> >
> > Excerpts from Simon Peyton-Jones's message of Wed Feb 06 10:10:10 -0800
> > 2013:
> > > I (with help from Kazu and helpful comments from Bryan and Johan) have
> > nearly completed an overhaul to the IO manager based on my observations and
> > we are in the final stages of getting it into GHC
> > >
> > > This is really helpful. Thank you very much Andreas, Kazu, Bryan, Johan.
> > >
> > > Simon
> > >
> > > From: [hidden email] [mailto:
> > [hidden email]] On Behalf Of Andreas Voellmy
> > > Sent: 06 February 2013 14:28
> > > To: [hidden email]
> > > Cc: [hidden email]; parallel-haskell;
> > [hidden email]
> > > Subject: Re: Cloud Haskell and network latency issues with -threaded
> > >
> > > Hi all,
> > >
> > > I haven't followed the conversations around CloudHaskell closely, but I
> > noticed the discussion around latency using the threaded runtime system,
> > and I thought I'd jump in here.
> > >
> > > I've been developing a server in Haskell that serves hundreds to
> > thousands of clients over very long-lived TCP sockets. I also had latency
> > problems with GHC. For example, with 100 clients I had a 10 ms
> > (millisecond) latency and with 500 clients I had a 29ms latency. I looked
> > into the problem and found that some bottlenecks in the threaded IO manager
> > were the cause. I made some hacks there and got the latency for 100 and 500
> > clients down to under 0.2 ms. I (with help from Kazu and helpful comments
> > from Bryan and Johan) have nearly completed an overhaul to the IO manager
> > based on my observations and we are in the final stages of getting it into
> > GHC. Hopefully our work will also fix the latency issues in CloudHaskell
> > programs :)
> > >
> > > It would be very helpful if someone has some benchmark CloudHaskell
> > applications and workloads to test with. Does anyone have these handy?
> > >
> > > Cheers,
> > > Andi
> > >
> > > On Wed, Feb 6, 2013 at 9:09 AM, Tim Watson <[hidden email]
> > <mailto:[hidden email]>> wrote:
> > > Hi Kostirya,
> > >
> > > I'm putting the parallel-haskell and ghc-users lists on cc, just in case
> > other (better informed) folks want to chip in here.
> > >
> > > ----
> > >
> > > First of all, I'm assuming you're talking about network latency when
> > compiling with -threaded - if not I apologise for misunderstanding!
> > >
> > > There is apparently an outstanding network latency issue when compiling
> > with -threaded, but according to a conversation I had with the other
> > developers on #haskell-distributed, this is not something that's specific
> > to Cloud Haskell. It is something to do with the threaded runtime system,
> > so would need to be solved for GHC (or is it just the Network package!?) in
> > general. Writing up a simple C program and equivalent socket use in Haskell
> > and comparing the latency using -threaded will show this up.
> > >
> > > See the latency section in
> > http://haskell-distributed.github.com/wiki/networktransport.html for some
> > more details. According to that, there *are* some things we might be able
> > to do, but the 20% latency isn't going to change significantly on the face
> > of things.
> > >
> > > We have an open ticket to look into this (
> > https://cloud-haskell.atlassian.net/browse/NTTCP-4) and at some point
> > we'll try and put together the sample programs in a github repository (if
> > that's not already done - I might've missed previous spikes done by Edsko
> > or others) and investigate further.
> > >
> > > One of the other (more experienced!) devs might be able to chip in and
> > proffer a better explanation.
> > >
> > > Cheers,
> > > Tim
> > >
> > > On 6 Feb 2013, at 13:27, [hidden email]<mailto:[hidden email]>
> > wrote:
> > >
> > > > > Have you ever needed to run Haskell in non-threaded mode during
> > > > > intensive network data exchange?
> > > > > I am seeing a twofold performance penalty in threaded mode. But I
> > > > > must use threaded mode because epoll and kevent are only available
> > > > > in threaded mode.
> > > >
> > >
> > > [snip]
> > >
> > > >
> > > >
> > > > > On Wednesday, 6 February 2013 at 12:33:36 UTC+2, Tim Watson wrote:
> > > > Hello all,
> > > >
> > > > It's been a busy week for Cloud Haskell and I wanted to share a few of
> > > > our news items with you all.
> > > >
> > > > Firstly, we have a new home page at
> > http://haskell-distributed.github.com,
> > > > into which most of the documentation and wiki pages have been merged.
> > Making
> > > > sassy looking websites is not really my bag, so I'm very grateful to
> > the
> > > > > various authors whose Creative Commons licensed designs and layouts
> > made
> > > > it easy to put together. We've already had some pull requests to fix
> > minor
> > > > problems on the site, so thanks very much to those who've contributed
> > already!
> > > >
> > > > As well as the new site, you will find a few of us hanging out on the
> > > > #haskell-distributed channel on freenode. Please do come along and
> > join in
> > > > the conversation.
> > > >
> > > > We also recently split up the distributed-process project into separate
> > > > git repositories, one for each component that makes up Cloud Haskell.
> > This
> > > > was done partly for administrative purposes and partly because we're
> > in the
> > > > process of setting up CI builds for all the projects.
> > > >
> > > > Finally, we've moved from Github's issue tracker to a hosted
> > Jira/Bamboo setup
> > > > at https://cloud-haskell.atlassian.net - pull requests are naturally
> > still welcome
> > > > via Github! Although you can browse issues freely without logging in,
> > you will
> > > > need to provide an email address and get an account in order to submit
> > new ones.
> > > > If you have any difficulties logging in, please don't hesitate to
> > contact me
> > > > directly, via this forum or the cloud-haskell-developers mailing list
> > (on
> > > > google groups).
> > > >
> > > > As always, we'd be delighted to hear any feedback!
> > > >
> > > > Cheers,
> > > > Tim
> > >
> >

_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Re: Cloud Haskell and network latency issues with -threaded

Andreas Voellmy



On Fri, Feb 8, 2013 at 12:30 AM, Edward Z. Yang <[hidden email]> wrote:
> OK. I think it is high priority for us to get some latency benchmarks
> into nofib so that GHC devs (including me) can start measuring changes
> against them.  I know Edsko has some benchmarks here:
> http://www.edsko.net/2013/02/06/performance-problems-with-threaded/
> but they depend on the network, which makes them a little difficult to move into nofib.
> I'm working on other scheduler changes that may help you guys out; we
> should keep each other updated.

That would be great :)
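
As a rough illustration of what a network-free latency benchmark could look like, here is a minimal sketch that times round trips between two Haskell threads over a pair of MVars. It is only an assumption about the general shape of such a test, not something taken from nofib or from Edsko's benchmarks; the names roundTrips and meanNs are made up for this example. Note that it exercises the scheduler only, not the IO manager or any sockets.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM, replicateM_)
import Data.Word (Word64)
import GHC.Clock (getMonotonicTimeNSec)

-- | Measure n round trips between the calling thread and an echo thread.
-- Each sample is the elapsed time in nanoseconds for one request/reply pair.
-- (roundTrips is a hypothetical name, not part of any existing benchmark.)
roundTrips :: Int -> IO [Word64]
roundTrips n = do
  request <- newEmptyMVar
  reply <- newEmptyMVar
  -- Echo thread: answer exactly n requests, then exit.
  _ <- forkIO (replicateM_ n (takeMVar request >>= putMVar reply))
  replicateM n $ do
    start <- getMonotonicTimeNSec
    putMVar request ()
    takeMVar reply
    end <- getMonotonicTimeNSec
    pure (end - start)

-- | Mean of the samples in nanoseconds; 0 for an empty list.
meanNs :: [Word64] -> Double
meanNs [] = 0
meanNs xs = fromIntegral (sum xs) / fromIntegral (length xs)
```

With a small main that prints meanNs of the samples, the same program can be built with and without -threaded (and run with +RTS -N2 in the threaded case) to compare the two runtimes.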
 

> I noticed your patch also incorporates the "make yield actually work" patch;
> do you think the improvement in 7.4.1 was due to that specific change?
 
I believe that patch is unrelated to the scheduler change and, strictly speaking, probably should not be in there. I needed it for the IO manager revisions to work properly.
 
> (Have you instrumented the run queues and checked how your patch changes
> the distribution of jobs over your runtime?)

I didn't do this very rigorously, but I think I added some print statements to the scheduler and looked at some eventlogs in ThreadScope to see that the pushing of threads between capabilities slows down after a while. I had planned to write a script to analyze an eventlog file and extract these stats, but I never got around to it.

-Andi

 
> Somewhat unrelatedly, if you have some good latency tests already,
> it may be worth trying to compile your copy of GHC with -fno-omit-yields, so that
> forced context switches get serviced more predictably.
>
> Cheers,
> Edward

Excerpts from Andreas Voellmy's message of Thu Feb 07 21:20:25 -0800 2013:
> Hi Edward,
>
> I did two things to improve latency for my application: (1) rework the IO
> manager and (2) stabilize the work pushing. (1) seems like a big win and we
> are almost done with the work on that part. It is less clear whether (2)
> will generally help much. It helped me when I developed it against 7.4.1,
> but it doesn't seem to have much impact on HEAD on the few measurements I
> did. The idea of (2) was to keep running averages of the run queue length
> of each capability, then push work when these running averages get too
> out-of-balance. The desired effect (which seems to work on my particular
> application) is to avoid cases in which threads are pushed back and forth
> among cores, which may make cache usage worse. You can see my patch here:
> https://github.com/AndreasVoellmy/ghc-arv/commits/push-work-exchange-squashed
> .
>
> -Andi
>
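
To make idea (2) above a little more concrete, here is a minimal sketch of keeping a running average of each capability's run queue length and only signalling a push when the averages drift too far apart. This is not the code from the patch linked above (the real change lives in the RTS scheduler); the exponential weighting, the smoothing factor alpha, the threshold, and the names updateAvg and imbalance are all assumptions made for illustration.

```haskell
import Data.List (maximumBy, minimumBy)
import Data.Ord (comparing)

-- | Fold a new run queue length sample into a running average.
-- alpha is a hypothetical smoothing factor in (0, 1]; larger values
-- make the average follow recent samples more closely.
updateAvg :: Double -> Double -> Int -> Double
updateAvg alpha avg len = alpha * fromIntegral len + (1 - alpha) * avg

-- | Given one running average per capability, report the indices of the
-- busiest and idlest capability if their averages differ by more than
-- the (hypothetical) threshold; otherwise report that no push is needed.
imbalance :: Double -> [Double] -> Maybe (Int, Int)
imbalance threshold avgs
  | null avgs = Nothing
  | hi - lo > threshold = Just (hiIx, loIx)
  | otherwise = Nothing
  where
    indexed = zip [0 :: Int ..] avgs
    (hiIx, hi) = maximumBy (comparing snd) indexed
    (loIx, lo) = minimumBy (comparing snd) indexed
```

Because the decision is based on smoothed averages rather than instantaneous queue lengths, a brief spike on one capability does not immediately trigger a push, which is the kind of back-and-forth movement the description above is trying to avoid.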
> On Fri, Feb 8, 2013 at 12:10 AM, Edward Z. Yang <[hidden email]> wrote:
>
> > Hey folks,
> >
> > The latency changes sound relevant to some work on the scheduler I'm doing;
> > is there a place I can see the changes?
> >
> > Thanks,
> > Edward
> >
> > Excerpts from Simon Peyton-Jones's message of Wed Feb 06 10:10:10 -0800
> > 2013:
> > > I (with help from Kazu and helpful comments from Bryan and Johan) have
> > nearly completed an overhaul to the IO manager based on my observations and
> > we are in the final stages of getting it into GHC
> > >
> > > This is really helpful. Thank you very much Andreas, Kazu, Bryan, Johan.
> > >
> > > Simon
> > >
> > > From: [hidden email] [mailto:
> > [hidden email]] On Behalf Of Andreas Voellmy
> > > Sent: 06 February 2013 14:28
> > > To: [hidden email]
> > > Cc: [hidden email]; parallel-haskell;
> > [hidden email]
> > > Subject: Re: Cloud Haskell and network latency issues with -threaded
> > >
> > > Hi all,
> > >
> > > I haven't followed the conversations around CloudHaskell closely, but I
> > noticed the discussion around latency using the threaded runtime system,
> > and I thought I'd jump in here.
> > >
> > > I've been developing a server in Haskell that serves hundreds to
> > thousands of clients over very long-lived TCP sockets. I also had latency
> > problems with GHC. For example, with 100 clients I had a 10 ms
> > (millisecond) latency and with 500 clients I had a 29ms latency. I looked
> > into the problem and found that some bottlenecks in the threaded IO manager
> > were the cause. I made some hacks there and got the latency for 100 and 500
> > clients down to under 0.2 ms. I (with help from Kazu and helpful comments
> > from Bryan and Johan) have nearly completed an overhaul to the IO manager
> > based on my observations and we are in the final stages of getting it into
> > GHC. Hopefully our work will also fix the latency issues in CloudHaskell
> > programs :)
> > >
> > > It would be very helpful if someone has some benchmark CloudHaskell
> > applications and workloads to test with. Does anyone have these handy?
> > >
> > > Cheers,
> > > Andi
> > >
