include the new -flate-dmd-anal in -O2?

Nicolas Frisby
TO: Performance czars and devs

I pushed a patch yesterday enabling a second demand analysis at the end of
the core2core simplification pipeline. The flag is -flate-dmd-anal, and it
is off by default.

My question:

    What's the protocol for deciding if -O2 should imply it?

See http://ghc.haskell.org/trac/ghc/wiki/LateDmd for context.

In particular, this section includes highlights of some nofib runs I did.

  http://ghc.haskell.org/trac/ghc/wiki/LateDmd#Newperformancenumbers

For some tests, it decreases allocation by 10% to 20%. But on the platforms
I have tried, it also causes a couple of repeatable slowdowns of up to 10%.
I've investigated a bit but haven't found a clear explanation. I'm worried
that the cause is something like caching effects.

Any suggestions on how I should proceed with my investigation?

Also: I'd appreciate it if any developers would run some benchmarks on
whatever platforms they have available and add the results to the same
section of the wiki page.

  http://ghc.haskell.org/trac/ghc/wiki/LateDmd#Newperformancenumbers

NB: Unfortunately, it is key to build the libraries twice: once with
-flate-dmd-anal in GhcLibHcOpts and once without. I have not determined how
to do this robustly without a distclean; please let me know if you have a
better method.

So I've used

# Use exactly one of the two GhcLibHcOpts lines per build:
# one build without -flate-dmd-anal and one build with it.
#GhcLibHcOpts         = -O2
GhcLibHcOpts         = -O2 -flate-dmd-anal
SplitObjs            = NO
DYNAMIC_BY_DEFAULT   = NO
DYNAMIC_GHC_PROGRAMS = NO

The last three aren't necessary, but please record what you use, if you are
so generous as to run it :).
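
Since the two library builds should differ only in GhcLibHcOpts, it may help to generate both settings fragments from one place so nothing else drifts between them. The following is only a sketch of that idea: the helper name `settings` is hypothetical, and the setting names and values are the ones listed above.

```python
# Settings shared by both builds (taken from the list above).
COMMON = [
    ("SplitObjs", "NO"),
    ("DYNAMIC_BY_DEFAULT", "NO"),
    ("DYNAMIC_GHC_PROGRAMS", "NO"),
]


def settings(late_dmd):
    """Render the settings fragment for one of the two library builds.

    `settings` is a hypothetical helper, not part of the GHC build system.
    """
    opts = "-O2 -flate-dmd-anal" if late_dmd else "-O2"
    pairs = [("GhcLibHcOpts", opts)] + COMMON
    width = max(len(name) for name, _ in pairs)
    return "\n".join(f"{name.ljust(width)} = {value}" for name, value in pairs) + "\n"


if __name__ == "__main__":
    # Print both variants so they can be compared or recorded on the wiki.
    for late_dmd in (False, True):
        print(f"# late demand analysis: {late_dmd}")
        print(settings(late_dmd))
```

Recording the printed fragment next to each set of results makes it clear which configuration produced which numbers.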

Thanks.

Johan Tibell-2
Hi Nicolas,

In my opinion we should look at nofib (slow) and make sure that

 1) it's at least neutral on average (runtimes and preferably allocations
too),
 2) there are some benchmarks that improve significantly (that's why we're
making the change after all), and
 3) we can attribute the losses to something other than significantly worse
Core (or at least more programs get better than get worse).

If these 3 hold and the compile times aren't up too much, I think it's a
candidate for being on by default in -O2.
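
The three checks above can be roughly sketched in code. This is a minimal illustration, not an existing nofib tool: it assumes you already have, per benchmark, the ratio of the runtime with -flate-dmd-anal to the runtime without it (below 1.0 means faster). The function names, the 0.95 "significant win" threshold, and the benchmark names in the example are all assumptions made for illustration. Check 3 cannot really be automated, so the sketch only counts winners against losers and lists the regressions whose Core still needs to be inspected by hand.

```python
from math import exp, log


def geometric_mean(values):
    """Geometric mean, the usual way to average ratios."""
    if not values:
        raise ValueError("need at least one ratio")
    return exp(sum(log(v) for v in values) / len(values))


def summarize(ratios, big_win=0.95):
    """Summarize runtime ratios (with flag / without flag) per benchmark.

    `big_win` is an assumed threshold for a "significant" improvement.
    """
    mean = geometric_mean(list(ratios.values()))
    improved = sorted(name for name, r in ratios.items() if r < 1.0)
    regressed = sorted(name for name, r in ratios.items() if r > 1.0)
    return {
        "geomean": mean,
        # Check 1: at least neutral on average.
        "neutral_on_average": mean <= 1.0,
        # Check 2: some benchmarks improve significantly.
        "significant_wins": sorted(n for n, r in ratios.items() if r <= big_win),
        # Check 3 (fallback part): more programs get better than get worse.
        "more_better_than_worse": len(improved) > len(regressed),
        # Check 3 (manual part): these still need their Core inspected.
        "to_investigate": regressed,
    }


if __name__ == "__main__":
    # Placeholder names and ratios, NOT real nofib measurements.
    example = {"bench_a": 0.85, "bench_b": 1.00, "bench_c": 1.08}
    for key, value in summarize(example).items():
        print(f"{key}: {value}")
```

The same summary could be run on allocation ratios to cover the "preferably allocations too" part of the first check.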

In my mind the key is to understand why the programs that got worse got
worse. For example, when I enabled -funbox-small-strict-fields by default
there were some losers, but the reasons they lost were more accidental
than due to -funbox-small-strict-fields itself, so I was happy to turn it
on by default anyway.

-- Johan



On Fri, Aug 30, 2013 at 12:28 PM, Nicolas Frisby
<nicolas.frisby at gmail.com> wrote:

>     What's the protocol for deciding if -O2 should imply it?


Edward Z. Yang
In reply to this post by Nicolas Frisby
Perf builds of GHC also use -O2 for ghc-stage2, so check out what happens
to GHC itself with late demand analysis.

Edward

Excerpts from Nicolas Frisby's message of Fri Aug 30 10:28:24 -0700 2013:

>     What's the protocol for deciding if -O2 should imply it?