[RFC] Compiler pipeline timings visualization

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

[RFC] Compiler pipeline timings visualization

GHC - devs mailing list
tl;dr: I propose adding a new GHC flag for generating a log that allows
visualizing how much time each stage in the compiler pipleline took, similar
to Clang's -ftime-trace.

Hello, everyone.

I'd like to tell you about a feature I've been working on recently, namely,
the ability to generate and visualize how long each stage of the compilation
pipeline takes to complete. This is basically about the same as
the -ftime-trace flag that has landed in Clang several months ago [1].

The initial motivation for this feature was the desire to understand why
compilation times are so long when using LLVM as backend. But later I realized
that this functionality is useful on its own for end users, not just GHC devs,
so it would make sense to add a new flag -ddump-time-trace.

Since not only does Haskell have complex type system, but also there is
a variety of language extensions, we, the Haskell users, often experience
very long compilation times. For instance, the usage of the TypeFaimilies
extension can considerably slow down the compilation process [2].
It is useful to understand how you can fix your code so that it compiles faster.

There are two options for that right now:
- Building GHC from source for profiling and using the just-built GHC for
  compiling your problem code.
- Building the compiler from source with event log support [3].

The former option just doesn't do it, since the output would be
"which GHC function calls took how much time", but there'd be no information
about which part of the user code was being compiled.

The latter option is much closer to what we need. If we link the GHC executable
with the -eventlog flag, then various significant events will be written to
a special log file. For example, "Parser began parsing the Main.hs file after
5 ms since GHC has started, and ended parsing it 3 ms after that".
The resulting log file can be parsed with the ghc-events library [4], and also
visualized using the ThreadScope app [5].

Bu this approach has its disadvantages.

Firstly, if the user wants visualization, they'd have to install ThreadScope.
Some companies' internal policies prohibit installing third-party apps from
the internet. It would be good if we could generate a log that could be
visualized on any computer with a browser. For that we could use
the Chrome Trace Event format [6]. This is an ordinary JSON file with a specific
structure that can be viewed in the Chrome browser by going to
the chrome://tracing page, or on https://speedscope.app. A file in exactly this
format would be generated by Clang if you passed it the -ftime-trace flag.

Secondly, the event log contains many events that are not relevant to the end
user, for example, thread creation, memory allocation, etc.

As an initial proof of concept, I've developed a command line tool for
transforming event log files to Chrome Trace Event files [7].

Thirdly, in order for the event log to be generated, you'd still have to build
GHC from source. The majority of the GHC users won't go this way. Not only
would it require some basic understanding of the GHC building process, but also
building itself takes quite some time. It would be great if the needed
functionality came with GHC out of the box.

This is why I've added support for generating Trace Event files into my fork
of GHC [8], and I would like to propose including this feature into the mainline
GHC.

(Note that my implementation is still a bit buggy, for example, it only works
in -j mode. This will be fixed.)

You can now execute the following command:

ghc -ddump-to-file -ddump-file-prefix="Main." -ddump-time-trace -j -O3 Main.hs

And receive this image [9]. Here, we've compiled two modules, one of which
depends on the other.

One more difference from event log is that now various metadata
(like file and module names) are emitted as a separate JSON attribute, instead
of being encoded in the name of an event. For example, parsing the Main.hs file
and parsing the QSort.hs file in one compilation are events with the same name,
but different metadata. We can group them together if we want to know how much
time the parsing took overall. The event log format doesn't allow us to do it.

So, we can now generate a Trace Event file from a single GHC invocation.
However, most projects use build systems that invoke the compiler many times.
It would be good if we could form a log for the whole project.

This is solved by merging logs. However, there is one subtlety: the events in
logs use relative timestamps (the number of microseconds since the process
has started). To combine logs from multiple processes, we add a clock
synchronization point into each trace log in the form of an additional
'beginningOfTime' JSON attribute that contains the absolute time when
the proccess has started.

There is a Python script that performs the actual merging [10].

This is how it looks like with multiple processes [11].

Also, I've implemented generation of the 'beginningOfTime' attribute in
LLVM [12], so build systems could take advantage of that and combine GHC trace
logs with llc/opt trace logs when GHC uses LLVM as backend.

Thoughts?

Sergej.

[1]  https://aras-p.info/blog/2019/01/16/time-trace-timeline-flame-chart-profiler-for-Clang/
[2]  https://gitlab.haskell.org/ghc/ghc/issues/8095
[3]  https://gitlab.haskell.org/ghc/ghc/-/wikis/event-log
[4]  http://hackage.haskell.org/package/ghc-events
[5]  https://wiki.haskell.org/ThreadScope
[6]  https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit
[7]  https://github.com/broadwaylamb/ghc-eventlog-chrome
[8]  https://gitlab.haskell.org/broadwaylamb/ghc/-/commits/timetrace-better-metadata
[9]  https://user-images.githubusercontent.com/16309982/79079705-39775e00-7d19-11ea-9507-eb0f11581c63.png
[10] https://github.com/broadwaylamb/merge_trace_events
[11] https://user-images.githubusercontent.com/16309982/79080338-ad673580-7d1c-11ea-9e30-5e6f72e77555.png
[12] https://github.com/llvm/llvm-project/commit/2899103108d38215af8aae377cd0e54794278209

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Compiler pipeline timings visualization

Andreas Klebinger
Hi Sergej,

I think this is a good idea in general, and it seems you did some great
work there already.
Something like this can also help with pinpointing performance issues
inside the compiler
so would not just be useful to end users.

intuitively I would assume that instead of adding another way
to produce compiler events we should:
* Ship GHC with eventlog enabled by default
* Provide an official converter from eventLog to the chrome trace format.

The eventlog format is quite flexible, and if it's unsuitable to what
you want I would
prefer to extend it rather than adding support for additional formats
inside GHC itself.

This way we:
* Continue to have one unified way to dump events from haskell apps (the
eventlog)
* Users need not go to third party apps, as the converter could
reasonably come with GHC (like hp2ps)
* We are free to include information that can't (easily) be encoded in
the chrome format.

The downside is that users will have to invoke ghc, followed by some
other tool to get the
chrome trace, but to me that seems like a worthwhile tradeoff for
keeping the compiler and
various visualization formats somewhat separated.

The obvious question there is how much enabling the eventlog by default
would impact non-logging ghc
invocations. I have not measured this and it might rule out this
approach if it has a big impact and isn't
easily corrected.

As a last point I want to encourage you to open a ticket about this.
Mailing list threads tend to be harder to follow and find down the line
then tickets in my experience.

Cheers,
Andreas

Sergej Jaskiewicz via ghc-devs schrieb am 01.05.2020 um 11:09:

> tl;dr: I propose adding a new GHC flag for generating a log that allows
> visualizing how much time each stage in the compiler pipleline took, similar
> to Clang's -ftime-trace.
>
> Hello, everyone.
>
> I'd like to tell you about a feature I've been working on recently, namely,
> the ability to generate and visualize how long each stage of the compilation
> pipeline takes to complete. This is basically about the same as
> the -ftime-trace flag that has landed in Clang several months ago [1].
>
> The initial motivation for this feature was the desire to understand why
> compilation times are so long when using LLVM as backend. But later I realized
> that this functionality is useful on its own for end users, not just GHC devs,
> so it would make sense to add a new flag -ddump-time-trace.
>
> Since not only does Haskell have complex type system, but also there is
> a variety of language extensions, we, the Haskell users, often experience
> very long compilation times. For instance, the usage of the TypeFaimilies
> extension can considerably slow down the compilation process [2].
> It is useful to understand how you can fix your code so that it compiles faster.
>
> There are two options for that right now:
> - Building GHC from source for profiling and using the just-built GHC for
>    compiling your problem code.
> - Building the compiler from source with event log support [3].
>
> The former option just doesn't do it, since the output would be
> "which GHC function calls took how much time", but there'd be no information
> about which part of the user code was being compiled.
>
> The latter option is much closer to what we need. If we link the GHC executable
> with the -eventlog flag, then various significant events will be written to
> a special log file. For example, "Parser began parsing the Main.hs file after
> 5 ms since GHC has started, and ended parsing it 3 ms after that".
> The resulting log file can be parsed with the ghc-events library [4], and also
> visualized using the ThreadScope app [5].
>
> Bu this approach has its disadvantages.
>
> Firstly, if the user wants visualization, they'd have to install ThreadScope.
> Some companies' internal policies prohibit installing third-party apps from
> the internet. It would be good if we could generate a log that could be
> visualized on any computer with a browser. For that we could use
> the Chrome Trace Event format [6]. This is an ordinary JSON file with a specific
> structure that can be viewed in the Chrome browser by going to
> the chrome://tracing page, or on https://speedscope.app. A file in exactly this
> format would be generated by Clang if you passed it the -ftime-trace flag.
>
> Secondly, the event log contains many events that are not relevant to the end
> user, for example, thread creation, memory allocation, etc.
>
> As an initial proof of concept, I've developed a command line tool for
> transforming event log files to Chrome Trace Event files [7].
>
> Thirdly, in order for the event log to be generated, you'd still have to build
> GHC from source. The majority of the GHC users won't go this way. Not only
> would it require some basic understanding of the GHC building process, but also
> building itself takes quite some time. It would be great if the needed
> functionality came with GHC out of the box.
>
> This is why I've added support for generating Trace Event files into my fork
> of GHC [8], and I would like to propose including this feature into the mainline
> GHC.
>
> (Note that my implementation is still a bit buggy, for example, it only works
> in -j mode. This will be fixed.)
>
> You can now execute the following command:
>
> ghc -ddump-to-file -ddump-file-prefix="Main." -ddump-time-trace -j -O3 Main.hs
>
> And receive this image [9]. Here, we've compiled two modules, one of which
> depends on the other.
>
> One more difference from event log is that now various metadata
> (like file and module names) are emitted as a separate JSON attribute, instead
> of being encoded in the name of an event. For example, parsing the Main.hs file
> and parsing the QSort.hs file in one compilation are events with the same name,
> but different metadata. We can group them together if we want to know how much
> time the parsing took overall. The event log format doesn't allow us to do it.
>
> So, we can now generate a Trace Event file from a single GHC invocation.
> However, most projects use build systems that invoke the compiler many times.
> It would be good if we could form a log for the whole project.
>
> This is solved by merging logs. However, there is one subtlety: the events in
> logs use relative timestamps (the number of microseconds since the process
> has started). To combine logs from multiple processes, we add a clock
> synchronization point into each trace log in the form of an additional
> 'beginningOfTime' JSON attribute that contains the absolute time when
> the proccess has started.
>
> There is a Python script that performs the actual merging [10].
>
> This is how it looks like with multiple processes [11].
>
> Also, I've implemented generation of the 'beginningOfTime' attribute in
> LLVM [12], so build systems could take advantage of that and combine GHC trace
> logs with llc/opt trace logs when GHC uses LLVM as backend.
>
> Thoughts?
>
> Sergej.
>
> [1]  https://aras-p.info/blog/2019/01/16/time-trace-timeline-flame-chart-profiler-for-Clang/
> [2]  https://gitlab.haskell.org/ghc/ghc/issues/8095
> [3]  https://gitlab.haskell.org/ghc/ghc/-/wikis/event-log
> [4]  http://hackage.haskell.org/package/ghc-events
> [5]  https://wiki.haskell.org/ThreadScope
> [6]  https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit
> [7]  https://github.com/broadwaylamb/ghc-eventlog-chrome
> [8]  https://gitlab.haskell.org/broadwaylamb/ghc/-/commits/timetrace-better-metadata
> [9]  https://user-images.githubusercontent.com/16309982/79079705-39775e00-7d19-11ea-9507-eb0f11581c63.png
> [10] https://github.com/broadwaylamb/merge_trace_events
> [11] https://user-images.githubusercontent.com/16309982/79080338-ad673580-7d1c-11ea-9e30-5e6f72e77555.png
> [12] https://github.com/llvm/llvm-project/commit/2899103108d38215af8aae377cd0e54794278209
>
> _______________________________________________
> ghc-devs mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

RE: [RFC] Compiler pipeline timings visualization

GHC - devs mailing list
This sounds great!

Event logging needn't add much cost if we don't log much -- eg just compiler phases.  It simply depends on how many events you generate!

For naïve uses, the log you show in [9] is a bit complicated.  Something even simpler might be good.

Simon

|  -----Original Message-----
|  From: ghc-devs <[hidden email]> On Behalf Of Andreas
|  Klebinger
|  Sent: 01 May 2020 11:04
|  To: Sergej Jaskiewicz <[hidden email]>
|  Cc: [hidden email]
|  Subject: Re: [RFC] Compiler pipeline timings visualization
|  
|  Hi Sergej,
|  
|  I think this is a good idea in general, and it seems you did some great
|  work there already.
|  Something like this can also help with pinpointing performance issues
|  inside the compiler
|  so would not just be useful to end users.
|  
|  intuitively I would assume that instead of adding another way
|  to produce compiler events we should:
|  * Ship GHC with eventlog enabled by default
|  * Provide an official converter from eventLog to the chrome trace format.
|  
|  The eventlog format is quite flexible, and if it's unsuitable to what
|  you want I would
|  prefer to extend it rather than adding support for additional formats
|  inside GHC itself.
|  
|  This way we:
|  * Continue to have one unified way to dump events from haskell apps (the
|  eventlog)
|  * Users need not go to third party apps, as the converter could
|  reasonably come with GHC (like hp2ps)
|  * We are free to include information that can't (easily) be encoded in
|  the chrome format.
|  
|  The downside is that users will have to invoke ghc, followed by some
|  other tool to get the
|  chrome trace, but to me that seems like a worthwhile tradeoff for
|  keeping the compiler and
|  various visualization formats somewhat separated.
|  
|  The obvious question there is how much enabling the eventlog by default
|  would impact non-logging ghc
|  invocations. I have not measured this and it might rule out this
|  approach if it has a big impact and isn't
|  easily corrected.
|  
|  As a last point I want to encourage you to open a ticket about this.
|  Mailing list threads tend to be harder to follow and find down the line
|  then tickets in my experience.
|  
|  Cheers,
|  Andreas
|  
|  Sergej Jaskiewicz via ghc-devs schrieb am 01.05.2020 um 11:09:
|  > tl;dr: I propose adding a new GHC flag for generating a log that allows
|  > visualizing how much time each stage in the compiler pipleline took,
|  similar
|  > to Clang's -ftime-trace.
|  >
|  > Hello, everyone.
|  >
|  > I'd like to tell you about a feature I've been working on recently,
|  namely,
|  > the ability to generate and visualize how long each stage of the
|  compilation
|  > pipeline takes to complete. This is basically about the same as
|  > the -ftime-trace flag that has landed in Clang several months ago [1].
|  >
|  > The initial motivation for this feature was the desire to understand
|  why
|  > compilation times are so long when using LLVM as backend. But later I
|  realized
|  > that this functionality is useful on its own for end users, not just
|  GHC devs,
|  > so it would make sense to add a new flag -ddump-time-trace.
|  >
|  > Since not only does Haskell have complex type system, but also there is
|  > a variety of language extensions, we, the Haskell users, often
|  experience
|  > very long compilation times. For instance, the usage of the
|  TypeFaimilies
|  > extension can considerably slow down the compilation process [2].
|  > It is useful to understand how you can fix your code so that it
|  compiles faster.
|  >
|  > There are two options for that right now:
|  > - Building GHC from source for profiling and using the just-built GHC
|  for
|  >    compiling your problem code.
|  > - Building the compiler from source with event log support [3].
|  >
|  > The former option just doesn't do it, since the output would be
|  > "which GHC function calls took how much time", but there'd be no
|  information
|  > about which part of the user code was being compiled.
|  >
|  > The latter option is much closer to what we need. If we link the GHC
|  executable
|  > with the -eventlog flag, then various significant events will be
|  written to
|  > a special log file. For example, "Parser began parsing the Main.hs file
|  after
|  > 5 ms since GHC has started, and ended parsing it 3 ms after that".
|  > The resulting log file can be parsed with the ghc-events library [4],
|  and also
|  > visualized using the ThreadScope app [5].
|  >
|  > Bu this approach has its disadvantages.
|  >
|  > Firstly, if the user wants visualization, they'd have to install
|  ThreadScope.
|  > Some companies' internal policies prohibit installing third-party apps
|  from
|  > the internet. It would be good if we could generate a log that could be
|  > visualized on any computer with a browser. For that we could use
|  > the Chrome Trace Event format [6]. This is an ordinary JSON file with a
|  specific
|  > structure that can be viewed in the Chrome browser by going to
|  > the chrome://tracing page, or on https://speedscope.app. A file in
|  exactly this
|  > format would be generated by Clang if you passed it the -ftime-trace
|  flag.
|  >
|  > Secondly, the event log contains many events that are not relevant to
|  the end
|  > user, for example, thread creation, memory allocation, etc.
|  >
|  > As an initial proof of concept, I've developed a command line tool for
|  > transforming event log files to Chrome Trace Event files [7].
|  >
|  > Thirdly, in order for the event log to be generated, you'd still have
|  to build
|  > GHC from source. The majority of the GHC users won't go this way. Not
|  only
|  > would it require some basic understanding of the GHC building process,
|  but also
|  > building itself takes quite some time. It would be great if the needed
|  > functionality came with GHC out of the box.
|  >
|  > This is why I've added support for generating Trace Event files into my
|  fork
|  > of GHC [8], and I would like to propose including this feature into the
|  mainline
|  > GHC.
|  >
|  > (Note that my implementation is still a bit buggy, for example, it only
|  works
|  > in -j mode. This will be fixed.)
|  >
|  > You can now execute the following command:
|  >
|  > ghc -ddump-to-file -ddump-file-prefix="Main." -ddump-time-trace -j -O3
|  Main.hs
|  >
|  > And receive this image [9]. Here, we've compiled two modules, one of
|  which
|  > depends on the other.
|  >
|  > One more difference from event log is that now various metadata
|  > (like file and module names) are emitted as a separate JSON attribute,
|  instead
|  > of being encoded in the name of an event. For example, parsing the
|  Main.hs file
|  > and parsing the QSort.hs file in one compilation are events with the
|  same name,
|  > but different metadata. We can group them together if we want to know
|  how much
|  > time the parsing took overall. The event log format doesn't allow us to
|  do it.
|  >
|  > So, we can now generate a Trace Event file from a single GHC
|  invocation.
|  > However, most projects use build systems that invoke the compiler many
|  times.
|  > It would be good if we could form a log for the whole project.
|  >
|  > This is solved by merging logs. However, there is one subtlety: the
|  events in
|  > logs use relative timestamps (the number of microseconds since the
|  process
|  > has started). To combine logs from multiple processes, we add a clock
|  > synchronization point into each trace log in the form of an additional
|  > 'beginningOfTime' JSON attribute that contains the absolute time when
|  > the proccess has started.
|  >
|  > There is a Python script that performs the actual merging [10].
|  >
|  > This is how it looks like with multiple processes [11].
|  >
|  > Also, I've implemented generation of the 'beginningOfTime' attribute in
|  > LLVM [12], so build systems could take advantage of that and combine
|  GHC trace
|  > logs with llc/opt trace logs when GHC uses LLVM as backend.
|  >
|  > Thoughts?
|  >
|  > Sergej.
|  >
|  > [1]  https://aras-p.info/blog/2019/01/16/time-trace-timeline-flame-
|  chart-profiler-for-Clang/
|  > [2]  https://gitlab.haskell.org/ghc/ghc/issues/8095
|  > [3]  https://gitlab.haskell.org/ghc/ghc/-/wikis/event-log
|  > [4]  http://hackage.haskell.org/package/ghc-events
|  > [5]  https://wiki.haskell.org/ThreadScope
|  > [6]  https://docs.google.com/document/d/1CvAClvFfyA5R-
|  PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit
|  > [7]  https://github.com/broadwaylamb/ghc-eventlog-chrome
|  > [8]  https://gitlab.haskell.org/broadwaylamb/ghc/-/commits/timetrace-
|  better-metadata
|  > [9]  https://user-images.githubusercontent.com/16309982/79079705-
|  39775e00-7d19-11ea-9507-eb0f11581c63.png
|  > [10] https://github.com/broadwaylamb/merge_trace_events
|  > [11] https://user-images.githubusercontent.com/16309982/79080338-
|  ad673580-7d1c-11ea-9e30-5e6f72e77555.png
|  > [12] https://github.com/llvm/llvm-
|  project/commit/2899103108d38215af8aae377cd0e54794278209
|  >
|  > _______________________________________________
|  > ghc-devs mailing list
|  > [hidden email]
|  > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
|  
|  _______________________________________________
|  ghc-devs mailing list
|  [hidden email]
|  http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Compiler pipeline timings visualization

Ben Gamari-2
In reply to this post by Andreas Klebinger
Andreas Klebinger <[hidden email]> writes:

> Hi Sergej,
>
> I think this is a good idea in general, and it seems you did some great
> work there already.
> Something like this can also help with pinpointing performance issues
> inside the compiler
> so would not just be useful to end users.
>
> intuitively I would assume that instead of adding another way
> to produce compiler events we should:
> * Ship GHC with eventlog enabled by default
For what it's worth, I have also been thinking about proposing this.
I've been doing a lot of performance characterisation recently where I
would have liked to use our binary distributions. Sadly this isn't an
option as the ghc executable doesn't support the eventlog.

I've not taken measurements on the eventlog overhead in ghc-bin but I
suspect that any overhead that *does* exist can be easily eliminated (as
I describe in #17949). We just need to ensure that
Debug.Trace.{traceEvent,traceMarker} know not to pack their buffers if
their respective tracing flags aren't enabled.

Frankly, I also question the value of shipping the non-tracing-enabled
RTS at all. Enabling tracing by default would allow us to eliminate
three whole RTS builds from CI, which I suspect would be worthwhile.
The size overhead on a statically-linked executable appears to be fairly
small in the grand scheme of things:

    -rw-r--r-- 1 ben users 6.0M Apr 26 15:11 libHSrts-1.0.a
    -rw-r--r-- 1 ben users 6.4M Apr 26 15:21 libHSrts-1.0_l.a


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

signature.asc (497 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Compiler pipeline timings visualization

Ben Gamari-3
In reply to this post by GHC - devs mailing list
Sergej Jaskiewicz via ghc-devs <[hidden email]> writes:

> tl;dr: I propose adding a new GHC flag for generating a log that allows
> visualizing how much time each stage in the compiler pipleline took, similar
> to Clang's -ftime-trace.
>
> Hello, everyone.

Hi Sergej,


[snip]

> The latter option is much closer to what we need. If we link the GHC executable
> with the -eventlog flag, then various significant events will be written to
> a special log file. For example, "Parser began parsing the Main.hs file after
> 5 ms since GHC has started, and ended parsing it 3 ms after that".
> The resulting log file can be parsed with the ghc-events library [4], and also
> visualized using the ThreadScope app [5].
>
I'm a bit short on time but here are a few points in no particular order:

Out of curiosity, have you seen Alp's work [1] in this area? This work
allows use to the ghc-events-analyze package [2] to visualize (e.g. [3])
the existing withTiming eventlog output in a reasonably easy-to-consume
format.

That being said, I can certainly see the benefit of using the Chrome
tracing infrastructure for this; it would make for a nicer user
experience than the static image that ghc-events-analyze spits out.

I also know that Matthew Pickering has been working in this area and I'm
sure will have something interesting to add.

I will admit that I am a bit reluctant to add support for a *particular*
tracing format to GHC itself. I think it would make the most sense for
GHC to produce a consumer-agnostic trace representation (via our
existing eventlog mechanism) and for users to transform this into the
format their tools require.

Our current withTimings infrastructure is quite ad-hoc and could perhaps
expose more detail. It would be great to know what sorts of detail
people would find useful.

Cheers,

- Ben


[1] https://www.haskell.org/ghc/blog/20190924-eventful-ghc.html
[2] https://hackage.haskell.org/package/ghc-events-analyze
[3] https://www.haskell.org/ghc/blog/images/eventful-ghc/ghc-events-viz.svg

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

signature.asc (497 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Compiler pipeline timings visualization

Matthew Pickering
Hi Sergej,

I definitely agree adding the capability for GHC to emit telemetry
would be useful. I also prefer the eventlog like other people has
already mentioned. We should at least ship with compiler so it can
emit these user events even if it can't emit RTS events.

In future we should probably look more into uprobes in order to make
tracing more efficient but we have what we have for now.

Recently I have been working with Dimity Ivanov on some tooling
related to this. He has implemented a standard tracing format and
exporters for many common diagnostic tools.

https://github.com/ethercrow/opentelemetry-haskell

You can see some examples of eventlogs generated by GHC in this issue:

https://github.com/ethercrow/opentelemetry-haskell/issues/9

Cheers,

Matt


On Fri, May 1, 2020 at 5:55 PM Ben Gamari <[hidden email]> wrote:

>
> Sergej Jaskiewicz via ghc-devs <[hidden email]> writes:
>
> > tl;dr: I propose adding a new GHC flag for generating a log that allows
> > visualizing how much time each stage in the compiler pipleline took, similar
> > to Clang's -ftime-trace.
> >
> > Hello, everyone.
>
> Hi Sergej,
>
>
> [snip]
>
> > The latter option is much closer to what we need. If we link the GHC executable
> > with the -eventlog flag, then various significant events will be written to
> > a special log file. For example, "Parser began parsing the Main.hs file after
> > 5 ms since GHC has started, and ended parsing it 3 ms after that".
> > The resulting log file can be parsed with the ghc-events library [4], and also
> > visualized using the ThreadScope app [5].
> >
> I'm a bit short on time but here are a few points in no particular order:
>
> Out of curiosity, have you seen Alp's work [1] in this area? This work
> allows use to the ghc-events-analyze package [2] to visualize (e.g. [3])
> the existing withTiming eventlog output in a reasonably easy-to-consume
> format.
>
> That being said, I can certainly see the benefit of using the Chrome
> tracing infrastructure for this; it would make for a nicer user
> experience than the static image that ghc-events-analyze spits out.
>
> I also know that Matthew Pickering has been working in this area and I'm
> sure will have something interesting to add.
>
> I will admit that I am a bit reluctant to add support for a *particular*
> tracing format to GHC itself. I think it would make the most sense for
> GHC to produce a consumer-agnostic trace representation (via our
> existing eventlog mechanism) and for users to transform this into the
> format their tools require.
>
> Our current withTimings infrastructure is quite ad-hoc and could perhaps
> expose more detail. It would be great to know what sorts of detail
> people would find useful.
>
> Cheers,
>
> - Ben
>
>
> [1] https://www.haskell.org/ghc/blog/20190924-eventful-ghc.html
> [2] https://hackage.haskell.org/package/ghc-events-analyze
> [3] https://www.haskell.org/ghc/blog/images/eventful-ghc/ghc-events-viz.svg
> _______________________________________________
> ghc-devs mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Compiler pipeline timings visualization

Ömer Sinan Ağacan
Hi,

While better representation/visualization of the pipeline might be useful, my
opinion is that it might not be as useful as you might expect. The problem is
lazy evaluation: it assigns the blame to incorrect places.

For example, in GHC if I update a record field in pass 1 and use it in pass 2,
allocation and runtime costs of generating the field (maybe an analysis result)
is assigned to pass 2, which is not what you want most of the time (one of the
reasons why I think lazy evaluation is not a good default).

Ömer

Matthew Pickering <[hidden email]>, 3 May 2020 Paz, 17:44
tarihinde şunu yazdı:

>
> Hi Sergej,
>
> I definitely agree adding the capability for GHC to emit telemetry
> would be useful. I also prefer the eventlog like other people has
> already mentioned. We should at least ship with compiler so it can
> emit these user events even if it can't emit RTS events.
>
> In future we should probably look more into uprobes in order to make
> tracing more efficient but we have what we have for now.
>
> Recently I have been working with Dimity Ivanov on some tooling
> related to this. He has implemented a standard tracing format and
> exporters for many common diagnostic tools.
>
> https://github.com/ethercrow/opentelemetry-haskell
>
> You can see some examples of eventlogs generated by GHC in this issue:
>
> https://github.com/ethercrow/opentelemetry-haskell/issues/9
>
> Cheers,
>
> Matt
>
>
> On Fri, May 1, 2020 at 5:55 PM Ben Gamari <[hidden email]> wrote:
> >
> > Sergej Jaskiewicz via ghc-devs <[hidden email]> writes:
> >
> > > tl;dr: I propose adding a new GHC flag for generating a log that allows
> > > visualizing how much time each stage in the compiler pipleline took, similar
> > > to Clang's -ftime-trace.
> > >
> > > Hello, everyone.
> >
> > Hi Sergej,
> >
> >
> > [snip]
> >
> > > The latter option is much closer to what we need. If we link the GHC executable
> > > with the -eventlog flag, then various significant events will be written to
> > > a special log file. For example, "Parser began parsing the Main.hs file after
> > > 5 ms since GHC has started, and ended parsing it 3 ms after that".
> > > The resulting log file can be parsed with the ghc-events library [4], and also
> > > visualized using the ThreadScope app [5].
> > >
> > I'm a bit short on time but here are a few points in no particular order:
> >
> > Out of curiosity, have you seen Alp's work [1] in this area? This work
> > allows use to the ghc-events-analyze package [2] to visualize (e.g. [3])
> > the existing withTiming eventlog output in a reasonably easy-to-consume
> > format.
> >
> > That being said, I can certainly see the benefit of using the Chrome
> > tracing infrastructure for this; it would make for a nicer user
> > experience than the static image that ghc-events-analyze spits out.
> >
> > I also know that Matthew Pickering has been working in this area and I'm
> > sure will have something interesting to add.
> >
> > I will admit that I am a bit reluctant to add support for a *particular*
> > tracing format to GHC itself. I think it would make the most sense for
> > GHC to produce a consumer-agnostic trace representation (via our
> > existing eventlog mechanism) and for users to transform this into the
> > format their tools require.
> >
> > Our current withTimings infrastructure is quite ad-hoc and could perhaps
> > expose more detail. It would be great to know what sorts of detail
> > people would find useful.
> >
> > Cheers,
> >
> > - Ben
> >
> >
> > [1] https://www.haskell.org/ghc/blog/20190924-eventful-ghc.html
> > [2] https://hackage.haskell.org/package/ghc-events-analyze
> > [3] https://www.haskell.org/ghc/blog/images/eventful-ghc/ghc-events-viz.svg
> > _______________________________________________
> > ghc-devs mailing list
> > [hidden email]
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> _______________________________________________
> ghc-devs mailing list
> [hidden email]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs