What does the DWARF information generated by your GHC branch look like?


Johan Tibell-2
Hi!

(I've CCed ghc-devs on this email, as I think the question is of general
interest.)

I enjoyed reading your paper [1] and I have some questions.

 * What does the generated DWARF information look like? For example, will
you fill in the .debug_line section so that standard tools like "perf
report" and gprof can be used on Haskell code? Code pointers would be
appreciated.

 * Does your GHC allow DWARF information to be generated without actually
using any of the RTS (e.g. eventlog) machinery? This would be very useful
if you want to use "perf record/report" only, in order to achieve minimal
overhead when that matters. Another way to ask the same question, do you
have a ghc -g flag that has no implication for the runtime settings?

 * Do you generate DW_TAG_subprogram sections in the .debug_info section so
that other tools can figure out the name of Haskell functions?

Cheers,
Johan

1. "Causality of Optimized Haskell: What is burning our cycles?"
http://eprints.whiterose.ac.uk/76448/

Johan Tibell-2
On Thu, Feb 27, 2014 at 6:43 PM, Peter Wortmann <scpmw at leeds.ac.uk> wrote:

>
> Johan Tibell wrote:
> > I enjoyed reading your paper [1] and I have some questions.
>
> Thanks! The DWARF patches are currently under review for Trac #3693. Any
> feedback would be very appreciated:
> https://github.com/scpmw/ghc/commits/profiling-import


I haven't had time to look at it all. I added a comment on
https://github.com/scpmw/ghc/commit/bbf6f35d8c341c8aadca1a48657084c007837b21


> >  will you fill in the .debug_line section so that standard tools like
>  > "perf report" and gprof can be used on Haskell code?
>
> Yes, even though from a few quick tests the results of "perf report"
> aren't too useful, as source code links are pretty coarse and jump
> around a lot - especially for optimised Haskell code. There's the option
> to instead annotate with source code links to a generated ".dump-simpl"
> file, which might turn out to be more useful.
>

I think that in general we should be as "standard" (i.e. close to how e.g.
GCC uses DWARF) as possible and put extra information in e.g. .debug_ghc.
That way we maximize the chance that standard tools will do something
sensible.


> > Code pointers would be appreciated.
>
> Is this about how .debug_line information is generated? We take the same
> approach as LLVM (and GCC, I think) and simply annotate the assembly
> with suitable .file & .loc directives. That way we can leave all the
> heavy lifting to the assembler.
>
> Current patch is here:
> https://github.com/scpmw/ghc/commit/c5294576


Makes sense. Thanks.


> >  * Does your GHC allow DWARF information to be generated without
> > actually using any of the RTS (e.g. eventlog) machinery?
>
> The RTS just serves as a DWARF interpreter for its own executable (+
> libraries) in this, so yes, it's fully independent. On the other hand,
> having special code allows us to avoid a few subtleties about Haskell
> code that are hard to communicate to standard debugging tools
> (especially concerning stack tracing).
>

Sounds good. As long as it's possible to use this without the RTS/eventlog
support I'll be happy. I have profiling needs (e.g. in unordered-containers)
where changes in the <5% range are interesting and any extra overhead will
skew the results.


> > Another way to ask the same question, do you have a ghc -g flag that
> > has no implication for the runtime settings?
>
> Right now -g does not affect the RTS at all. We might want to change
> that at some point though so we can get rid of the libdwarf dependency.
>

That sounds good (the "doesn't affect the RTS" part). I didn't understand
the libdwarf part.


> >  * Do you generate DW_TAG_subprogram sections in the .debug_info
> > section so that other tools can figure out the name of Haskell
> > functions?
>
> Yes, we are setting the "name" attribute to a suitable Haskell name.
> Sadly, at least GDB seems to ignore it and falls back to the symbol
> name. I investigated this some time ago, and I think the reason was that
> it doesn't recognize the Haskell language ID (which isn't standardized,
> obviously). Simply pretending to be C(++) might fix this, but I would be
> a bit scared of other side-effects.
>

Let's try to get our name standardized and pushed into GDB. It's hopefully
as simple as sending an email to the GDB devs.

Cheers,
Johan

Johan Tibell-2
On Fri, Feb 28, 2014 at 3:15 PM, Peter Wortmann <scpmw at leeds.ac.uk> wrote:

> Johan Tibell wrote:
> > Let's try to get our name standardized and pushed into GDB. It's
> > hopefully as simple as sending an email to the GDB devs.
>
> Strictly speaking standardization is the job of the DWARF format
> committee, which seem to currently be in the process of specifying
> DWARF5[1]. Not quite sure we have a strong enough case, but if we wanted
> our own language ID these would probably be the right people to ask.
>

I just emailed the DWARF discussion mailing list to sort this out.

-- Johan

Johan Tibell-2
Yet another question while I have you on the line. :)

Do we follow the best practices here:
http://wiki.dwarfstd.org/index.php?title=Best_Practices

-- Johan

Austin Seipp-5
(Passive note - this conversation looks like Johan talking to himself,
as I suspect Peter is not 'reply all'-ing to the developers list. So
other people may be a tad lost.)

On Fri, Feb 28, 2014 at 9:36 AM, Johan Tibell <johan.tibell at gmail.com> wrote:

> Yet another question while I have you on the line. :)
>
> Do we follow the best practices here:
> http://wiki.dwarfstd.org/index.php?title=Best_Practices
>
> -- Johan

--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/


Johan Tibell-2
On Fri, Feb 28, 2014 at 4:53 PM, Austin Seipp <austin at well-typed.com> wrote:
> (Passive note - this conversation looks like Johan talking to himself,
> as I suspect Peter is not 'reply all'-ing to the developers list. So
> other people may be a tad lost.)

Nope, this is just me having a stream of consciousness discussion. If
you see 2 replies from Peter and 5 emails from me (including this
one), that's it. :)


Roman Cheplyaka-2
In reply to this post by Austin Seipp-5
Or he's not subscribed to this list and his messages do not come through

* Austin Seipp <austin at well-typed.com> [2014-02-28 09:53:31-0600]

> (Passive note - this conversation looks like Johan talking to himself,
> as I suspect Peter is not 'reply all'-ing to the developers list. So
> other people may be a tad lost.)
>
> On Fri, Feb 28, 2014 at 9:36 AM, Johan Tibell <johan.tibell at gmail.com> wrote:
> > Yet another question while I have you on the line. :)
> >
> > Do we follow the best practices here:
> > http://wiki.dwarfstd.org/index.php?title=Best_Practices
> >
> > -- Johan

Johan Tibell-2
On Fri, Feb 28, 2014 at 4:57 PM, Roman Cheplyaka <roma at ro-che.info> wrote:
> Or he's not subscribed to this list and his messages do not come through

Ah yes. From my end I can see that he tried to CC the mailing list,
but I missed that the messages didn't get through.


Peter Wortmann
In reply to this post by Roman Cheplyaka-2

Roman Cheplyaka wrote:
> Or he's not subscribed to this list and his messages do not come through

Ah thanks, that's probably it. I accumulated lots of error mails from
the mailing list, which however didn't mention subscribing. Sorry about
the confusion...

Greetings,
  Peter Wortmann





Peter Wortmann
In reply to this post by Johan Tibell-2

Johan Tibell wrote:
> Do we follow the best practices here: http://wiki.dwarfstd.org/index.php?title=Best_Practices

Not quite sure what exactly you are referring to, here's the current
state:

> For DW_TAG_compilation_unit and DW_TAG_partial_unit DIEs, the name
> attribute should contain the path name of the primary source file from
> which the compilation unit was derived (see Section 3.1.1).

Yes, we do that.

> If the compiler was invoked with a full path name, it is recommended
> to use the path name as given to the compiler, although it is
> considered acceptable to convert the path name to an equivalent path
> where none of the components is a symbolic link.

I am simply using ModLocation for this. The results make sense, even
though I haven't tried crazy symbolic link combinations yet. If we find
something to improve we should probably do it for GHC as a whole.

> combining the compilation directory (see DW_AT_comp_dir) with the
> relative path name.

We set this attribute, albeit simply using getCurrentDirectory. This
might be an oversight, but I couldn't see a location where GHC stores
the "compilation directory" path.
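
As an illustration of the rule quoted above, here is a minimal sketch of how a DWARF consumer might combine DW_AT_comp_dir with a relative DW_AT_name. The function name resolve_source_path is hypothetical and none of this is taken from the GHC patches or from any particular debugger; it only shows the path arithmetic, assuming POSIX-style paths.

```python
import posixpath


def resolve_source_path(comp_dir: str, name: str) -> str:
    """Combine DW_AT_comp_dir and DW_AT_name into one source path.

    Hypothetical helper for illustration only. An absolute DW_AT_name is
    used as given; a relative one is resolved against the compilation
    directory recorded in DW_AT_comp_dir.
    """
    if posixpath.isabs(name):
        return posixpath.normpath(name)
    return posixpath.normpath(posixpath.join(comp_dir, name))


if __name__ == "__main__":
    # Example values are made up.
    print(resolve_source_path("/home/user/proj", "src/Foo.hs"))
```

If getCurrentDirectory differs from the directory the relative name was meant to be resolved against, a lookup like this is where a consumer would end up with the wrong file.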

> For modules, subroutines, variables, parameters, constants, types, and
> labels, the DW_AT_name attribute should contain the name of the
> corresponding program object as it appears in the source code

We make a "best effort" to provide a suitable name for every single
procedure. Note that a single function in Haskell might become multiple
subroutines in DWARF, or not appear at all due to inlining.

> In general, the value of DW_AT_name should be such that a
> fully-qualified name constructed from the DW_AT_name attributes of the
> object and its containing objects will uniquely represent that object
> in a form natural to the source language.

This would probably require us to have a DIE for modules. Not quite sure
how we would approach that.
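
To make the point about module DIEs concrete, here is a minimal sketch of how a consumer could build a fully-qualified name by walking from a DIE up through its containing DIEs. The Die class and qualified_name function are hypothetical stand-ins for this example, not GHC or GDB data structures, and whether GHC should emit a DW_TAG_module DIE at all is exactly the open question above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Die:
    """Hypothetical, heavily simplified debugging information entry."""
    tag: str
    name: Optional[str]
    parent: Optional["Die"] = None


def qualified_name(die: Die, sep: str = ".") -> str:
    """Join the DW_AT_name values of a DIE and its containing DIEs."""
    parts = []
    node: Optional[Die] = die
    while node is not None:
        # The compile unit's name is a source path, not a scope name.
        if node.tag != "DW_TAG_compile_unit" and node.name:
            parts.append(node.name)
        node = node.parent
    return sep.join(reversed(parts))


if __name__ == "__main__":
    unit = Die("DW_TAG_compile_unit", "Data/Map.hs")
    module = Die("DW_TAG_module", "Data.Map", unit)
    with_module = Die("DW_TAG_subprogram", "insert", module)
    without_module = Die("DW_TAG_subprogram", "insert", unit)
    print(qualified_name(with_module))
    print(qualified_name(without_module))
```

Without the intermediate module DIE, the subprogram's name alone no longer identifies it uniquely across modules.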

> The producer may also generate a DW_AT_linkage_name attribute for
> program objects

We do that.

> In many cases, however, it is expensive for a consumer to parse the
> hierarchy, and the presence of the mangled name may be beneficial to
> performance.

This might be the underlying reason why GDB shows mangled names for
languages with unknown IDs (such as Haskell). We'll see whether Johan's
query to the GDB team sheds some light on that.

Greetings,
  Peter Wortmann




Peter Wortmann
In reply to this post by Johan Tibell-2

[copy of the dropped reply, for anybody interested]

Johan Tibell wrote:
> I enjoyed reading your paper [1] and I have some questions.

Thanks! The DWARF patches are currently under review for Trac #3693. Any
feedback would be very appreciated:
https://github.com/scpmw/ghc/commits/profiling-import

>  * What does the generated DWARF information look like?

So far we generate:
- .debug_info: Information about all generated procedures and blocks.
- .debug_line: Source-code links for all generated code
- .debug_frame: Unwind information for the GHC stack
- .debug_ghc: Everything we can't properly represent as DWARF

>  will you fill in the .debug_line section so that standard tools like
> "perf report" and gprof can be used on Haskell code?

Yes, even though from a few quick tests the results of "perf report"
aren't too useful, as source code links are pretty coarse and jump
around a lot - especially for optimised Haskell code. There's the option
to instead annotate with source code links to a generated ".dump-simpl"
file, which might turn out to be more useful.

> Code pointers would be appreciated.

Is this about how .debug_line information is generated? We take the same
approach as LLVM (and GCC, I think) and simply annotate the assembly
with suitable .file & .loc directives. That way we can leave all the
heavy lifting to the assembler.

Current patch is here:
https://github.com/scpmw/ghc/commit/c5294576
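
The directive approach can be sketched as follows. This is an illustration only and not code from the patch: the helper emit_with_locs and the tuple layout are made up for the example, and the instructions are arbitrary. The only real interface used is the GNU assembler's `.file <n> "<path>"` and `.loc <n> <line> <column>` directives, from which the assembler builds the .debug_line section.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (source file, line, column, assembly instruction); hypothetical layout.
Annotated = Tuple[str, int, int, str]


def emit_with_locs(instrs: Iterable[Annotated]) -> List[str]:
    """Interleave .file/.loc directives with assembly instructions."""
    file_ids: Dict[str, int] = {}
    out: List[str] = []
    last_loc: Optional[Tuple[int, int, int]] = None
    for path, line, col, instr in instrs:
        if path not in file_ids:
            # Register each source file once and give it a small index.
            file_ids[path] = len(file_ids) + 1
            out.append(f'\t.file {file_ids[path]} "{path}"')
        loc = (file_ids[path], line, col)
        if loc != last_loc:
            # Only emit a new .loc when the source position changes.
            out.append(f"\t.loc {loc[0]} {line} {col}")
            last_loc = loc
        out.append(f"\t{instr}")
    return out


if __name__ == "__main__":
    for text in emit_with_locs([
        ("Foo.hs", 3, 1, "movq %rbx, %rax"),
        ("Foo.hs", 3, 1, "addq $8, %rax"),
        ("Foo.hs", 4, 5, "jmp *(%rbp)"),
    ]):
        print(text)
```

The compiler only has to know which source span each instruction came from; encoding the line-number program is left to the assembler.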

>  * Does your GHC allow DWARF information to be generated without
> actually using any of the RTS (e.g. eventlog) machinery?

The RTS just serves as a DWARF interpreter for its own executable (+
libraries) in this, so yes, it's fully independent. On the other hand,
having special code allows us to avoid a few subtleties about Haskell
code that are hard to communicate to standard debugging tools
(especially concerning stack tracing).

> Another way to ask the same question, do you have a ghc -g flag that
> has no implication for the runtime settings?

Right now -g does not affect the RTS at all. We might want to change
that at some point though so we can get rid of the libdwarf dependency.

>  * Do you generate DW_TAG_subprogram sections in the .debug_info
> section so that other tools can figure out the name of Haskell
> functions?

Yes, we are setting the "name" attribute to a suitable Haskell name.
Sadly, at least GDB seems to ignore it and falls back to the symbol
name. I investigated this some time ago, and I think the reason was that
it doesn't recognize the Haskell language ID (which isn't standardized,
obviously). Simply pretending to be C(++) might fix this, but I would be
a bit scared of other side-effects.

Greetings,
  Peter Wortmann






Nathan Howell-2
In reply to this post by Johan Tibell-2
I did get a language ID assigned a couple years ago, it should be in DWARF
5.

Accepted:  DW_LANG_Haskell assigned value 0x18. -- April 18.2012

http://dwarfstd.org/ShowIssue.php?issue=120218.1
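
For context, a minimal sketch of why the assigned value matters to a consumer: tools typically map the DW_AT_language code of a compilation unit to a known language and fall back to generic behaviour otherwise. The table below lists only a few codes, and describe_language is a hypothetical helper, not an API of GDB or any DWARF library. The C-family values and the vendor range 0x8000 to 0xffff come from the DWARF standard; 0x18 is the Haskell value quoted above.

```python
# A small excerpt of DW_LANG_* codes; deliberately incomplete.
DW_LANG_NAMES = {
    0x0001: "DW_LANG_C89",
    0x0002: "DW_LANG_C",
    0x0004: "DW_LANG_C_plus_plus",
    0x000C: "DW_LANG_C99",
    0x0018: "DW_LANG_Haskell",
}

DW_LANG_LO_USER = 0x8000
DW_LANG_HI_USER = 0xFFFF


def describe_language(code: int) -> str:
    """Name a DW_AT_language code, flagging vendor and unknown values."""
    name = DW_LANG_NAMES.get(code)
    if name is not None:
        return name
    if DW_LANG_LO_USER <= code <= DW_LANG_HI_USER:
        return f"vendor-specific language (0x{code:04x})"
    return f"unknown language (0x{code:04x})"


if __name__ == "__main__":
    print(describe_language(0x18))
    print(describe_language(0x8001))
```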



On Fri, Feb 28, 2014 at 7:34 AM, Johan Tibell <johan.tibell at gmail.com> wrote:

> On Fri, Feb 28, 2014 at 3:15 PM, Peter Wortmann <scpmw at leeds.ac.uk> wrote:
>
>> Johan Tibell wrote:
>> > Let's try to get our name standardized and pushed into GDB. It's
>> > hopefully as simple as sending an email to the GDB devs.
>>
>> Strictly speaking standardization is the job of the DWARF format
>> committee, which seem to currently be in the process of specifying
>> DWARF5[1]. Not quite sure we have a strong enough case, but if we wanted
>> our own language ID these would probably be the right people to ask.
>>
>
> I just emailed the DWARF discussion mailing list to sort this out.
>
> -- Johan

Peter Wortmann

Nathan Howell wrote:
> I did get a language ID assigned a couple years ago, it should be in DWARF
> 5.
>
> Accepted:  DW_LANG_Haskell assigned value 0x18. -- April 18.2012

Nice work. We'll start using that one then :)

Greetings,
  Peter Wortmann





Johan Tibell-2
On Mon, Mar 3, 2014 at 2:54 PM, Peter Wortmann <scpmw at leeds.ac.uk> wrote:
>
> Nathan Howell wrote:
>> I did get a language ID assigned a couple years ago, it should be in DWARF
>> 5.
>>
>> Accepted:  DW_LANG_Haskell assigned value 0x18. -- April 18.2012
>
> Nice work. We'll start using that one then :)

Great!

-- Johan