Repeated computations under a lambda

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Repeated computations under a lambda

Conal Elliott
I'm seeing what looks like repeated computation under a lambda with `-O` and `-O2`. The following definition:

> exampleC :: Double -> Double -> Double
> exampleC = \ t -> let s = sin t in \ x -> x + s

yields this Core:

> -- RHS size: {terms: 13, types: 6, coercions: 0}
> exampleC :: Double -> Double -> Double
> exampleC =
>   \ (t_afI6 :: Double) (eta_B1 :: Double) ->
>     case eta_B1 of _ { D# x_aj5c ->
>     case t_afI6 of _ { D# x1_aj5l ->
>     D# (+## x_aj5c (sinDouble# x1_aj5l))
>     }
>     }

The `sinDouble#` here depends only on `t_afI6` (`t`) but still appears under the binding of `eta_B1` (`x`).

I'm concerned because many of my uses of such functions involve computations dependent only on `t` (time) but with millions of uses (space) per `t`. (I'm working on a GHC Core plugin (compiling to categories), with one use generating graphics GPU code.)

Does the optimization I expected somehow happen later in the compilation pipeline?

Are there Core-manipulating functions in GHC that I can use to make it happen earlier (via a `BuiltinRule`, i.e., Core-to-Core function)?

Thanks,
-- Conal


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Sebastian Graf
Hi Conal,

so if I understand this right, you'd rather not wanted `exampleC` to be eta-expanded (or the binding of `s` to be floated into the lambda)?
Or is it that you want CSE to find out that you always supply the same `t` as the first argument and share the partial application and thus the work of computing `s`?

If it's the former: GHC doesn't normally do this, unless it has found out that no sharing (of work done to evaluate `s`) would be lost through eta-expansion.
This is the case when `exampleC` is always called with two arguments, so that no binding of `s` is shared, for example.
Could you maybe post a complete module/expression representative of all uses of `exampleC`?

If it's the latter, I'm afraid I can't really help, but surely someone else can.

Cheers,
Sebastian

On Tue, Jul 18, 2017 at 5:34 PM, Conal Elliott <[hidden email]> wrote:
I'm seeing what looks like repeated computation under a lambda with `-O` and `-O2`. The following definition:

> exampleC :: Double -> Double -> Double
> exampleC = \ t -> let s = sin t in \ x -> x + s

yields this Core:

> -- RHS size: {terms: 13, types: 6, coercions: 0}
> exampleC :: Double -> Double -> Double
> exampleC =
>   \ (t_afI6 :: Double) (eta_B1 :: Double) ->
>     case eta_B1 of _ { D# x_aj5c ->
>     case t_afI6 of _ { D# x1_aj5l ->
>     D# (+## x_aj5c (sinDouble# x1_aj5l))
>     }
>     }

The `sinDouble#` here depends only on `t_afI6` (`t`) but still appears under the binding of `eta_B1` (`x`).

I'm concerned because many of my uses of such functions involve computations dependent only on `t` (time) but with millions of uses (space) per `t`. (I'm working on a GHC Core plugin (compiling to categories), with one use generating graphics GPU code.)

Does the optimization I expected somehow happen later in the compilation pipeline?

Are there Core-manipulating functions in GHC that I can use to make it happen earlier (via a `BuiltinRule`, i.e., Core-to-Core function)?

Thanks,
-- Conal


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Joachim Breitner-2
In reply to this post by Conal Elliott
Hi,


Am Dienstag, den 18.07.2017, 08:34 -0700 schrieb Conal Elliott:

> I'm seeing what looks like repeated computation under a lambda with
> `-O` and `-O2`. The following definition:
>
> > exampleC :: Double -> Double -> Double
> > exampleC = \ t -> let s = sin t in \ x -> x + s
>
> yields this Core:
>
> > -- RHS size: {terms: 13, types: 6, coercions: 0}
> > exampleC :: Double -> Double -> Double
> > exampleC =
> >   \ (t_afI6 :: Double) (eta_B1 :: Double) ->
> >     case eta_B1 of _ { D# x_aj5c ->
> >     case t_afI6 of _ { D# x1_aj5l ->
> >     D# (+## x_aj5c (sinDouble# x1_aj5l))
> >     }
> >     }
ghc -O -dverbose-core2core shows you that the problem is this phase:

==================== Simplifier ====================
  Max iterations = 4
  SimplMode {Phase = 2 [main],
             inline,
             rules,
             eta-expand,
             case-of-case}

It does not happen with -fno-do-lambda-eta-expansion (but you’d lose in
other parts.)

> I'm concerned because many of my uses of such functions involve
> computations dependent only on `t` (time) but with millions of uses
> (space) per `t`. (I'm working on a GHC Core plugin (compiling to
> categories), with one use generating graphics GPU code.)

Did you measure whether this really is a problem? The benefits of not
dealing with dynamically allocated functions might outweigh the cost of
recalculating sin.

Greetings,
Joachim
--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Conal Elliott
Thanks very much for this reply, Joachim. I see that `-fno-do-lambda-eta-expansion` with my example prevents moving the computation under the lambda where it gets repeatedly evaluated. I don't understand what this code motion/substitution has to do with eta-expansion. Is it that the `let` expression itself is eta-expanded? The GHC Users Guide describes this flag as "eta-expand let-bindings to increase their arity", which doesn't seem to fit here, since the `let`-bindings are not function-valued (though the `let` expression is).

Thanks also for the suggestion of using `-dverbose-core2core` to see where the unwanted substitution happened.

Did you measure whether this really is a problem? The benefits of not
dealing with dynamically allocated functions might outweigh the cost of
recalculating sin.

No, I haven't measured. In this case, I'm compiling Haskell to GLSL for execution on a GPU, where the inner lambda will be over space, which means at least one application per pixel, so the computations moved under the inner lambda will be redundantly computed a few millions of times per frame (and much more with anti-aliasing). Instead, I want to move those calculations to once per frame and stored in quickly accessed video memory. As the space-independent computation gets more complex, I expect the saving to grow.

Thanks again,  
-- Conal

On Tue, Jul 18, 2017 at 12:08 PM, Joachim Breitner <[hidden email]> wrote:
Hi,


Am Dienstag, den 18.07.2017, 08:34 -0700 schrieb Conal Elliott:
> I'm seeing what looks like repeated computation under a lambda with
> `-O` and `-O2`. The following definition:
>
> > exampleC :: Double -> Double -> Double
> > exampleC = \ t -> let s = sin t in \ x -> x + s
>
> yields this Core:
>
> > -- RHS size: {terms: 13, types: 6, coercions: 0}
> > exampleC :: Double -> Double -> Double
> > exampleC =
> >   \ (t_afI6 :: Double) (eta_B1 :: Double) ->
> >     case eta_B1 of _ { D# x_aj5c ->
> >     case t_afI6 of _ { D# x1_aj5l ->
> >     D# (+## x_aj5c (sinDouble# x1_aj5l))
> >     }
> >     }

ghc -O -dverbose-core2core shows you that the problem is this phase:

==================== Simplifier ====================
  Max iterations = 4
  SimplMode {Phase = 2 [main],
             inline,
             rules,
             eta-expand,
             case-of-case}

It does not happen with -fno-do-lambda-eta-expansion (but you’d lose in
other parts.)

> I'm concerned because many of my uses of such functions involve
> computations dependent only on `t` (time) but with millions of uses
> (space) per `t`. (I'm working on a GHC Core plugin (compiling to
> categories), with one use generating graphics GPU code.)

Did you measure whether this really is a problem? The benefits of not
dealing with dynamically allocated functions might outweigh the cost of
recalculating sin.

Greetings,
Joachim
--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Joachim Breitner-2
Hi Conal,

Am Dienstag, den 18.07.2017, 15:35 -0700 schrieb Conal Elliott:
> Thanks very much for this reply, Joachim. I see that `-fno-do-lambda-
> eta-expansion` with my example prevents moving the computation under
> the lambda where it gets repeatedly evaluated. I don't understand
> what this code motion/substitution has to do with eta-expansion. Is
> it that the `let` expression itself is eta-expanded? The GHC Users
> Guide describes this flag as "eta-expand let-bindings to increase
> their arity", which doesn't seem to fit here, since the `let`-
> bindings are not function-valued (though the `let` expression is).

In the code

exampleC :: Double -> Double -> Double
exampleC = \ t -> let s = sin t in \ x -> x + s

the exampleC is the bound things that is eta-expanded. Ok, it is a top-
level function, but that does not make a big difference. When eta-
expanded, it becomes

exampleC :: Double -> Double -> Double
exampleC = \ t y -> (let s = sin
t in \ x -> x + s) y

and from then on, usual simplifications kick in to produce

exampleC :: Double -> Double -> Double
exampleC = \ t y -> let s = sin t in y + s

and eventually

exampleC :: Double -> Double -> Double
exampleC = \ t y -> y + sin t

(I ignore the unboxing of Doubles here).


> > Did you measure whether this really is a problem? The benefits of not
> > dealing with dynamically allocated functions might outweigh the cost of
> > recalculating sin.
>
> No, I haven't measured. In this case, I'm compiling Haskell to GLSL
> for execution on a GPU, where the inner lambda will be over space,
> which means at least one application per pixel, so the computations
> moved under the inner lambda will be redundantly computed a few
> millions of times per frame (and much more with anti-aliasing).
> Instead, I want to move those calculations to once per frame and
> stored in quickly accessed video memory. As the space-independent
> computation gets more complex, I expect the saving to grow.
Hmm. The problem is that GHC’s optimizer uses a cost-model (in this
case: calling an anonymous function is worse than recomputing sind)
that does not hold in your case. Besides simply turning off some
transformations, I am not sure what best you could do to avoid this…

Gruß,
Joachim
--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Conal Elliott
> the exampleC is the bound things that is eta-expanded.

Oh! I get it now. Thanks.

> The problem is that GHC’s optimizer uses a cost-model [...] that does not hold in your case.

Makes sense a well. I'll use `-fno-do-lambda-eta-expansion` for now and see well it works.

Thanks very much for these tips.

Regards, - Conal

On Tue, Jul 18, 2017 at 3:44 PM, Joachim Breitner <[hidden email]> wrote:
Hi Conal,

Am Dienstag, den 18.07.2017, 15:35 -0700 schrieb Conal Elliott:
> Thanks very much for this reply, Joachim. I see that `-fno-do-lambda-
> eta-expansion` with my example prevents moving the computation under
> the lambda where it gets repeatedly evaluated. I don't understand
> what this code motion/substitution has to do with eta-expansion. Is
> it that the `let` expression itself is eta-expanded? The GHC Users
> Guide describes this flag as "eta-expand let-bindings to increase
> their arity", which doesn't seem to fit here, since the `let`-
> bindings are not function-valued (though the `let` expression is).

In the code

exampleC :: Double -> Double -> Double
exampleC = \ t -> let s = sin t in \ x -> x + s

the exampleC is the bound things that is eta-expanded. Ok, it is a top-
level function, but that does not make a big difference. When eta-
expanded, it becomes

exampleC :: Double -> Double -> Double
exampleC = \ t y -> (let s = sin
t in \ x -> x + s) y

and from then on, usual simplifications kick in to produce

exampleC :: Double -> Double -> Double
exampleC = \ t y -> let s = sin t in y + s

and eventually

exampleC :: Double -> Double -> Double
exampleC = \ t y -> y + sin t

(I ignore the unboxing of Doubles here).


> > Did you measure whether this really is a problem? The benefits of not
> > dealing with dynamically allocated functions might outweigh the cost of
> > recalculating sin.
>
> No, I haven't measured. In this case, I'm compiling Haskell to GLSL
> for execution on a GPU, where the inner lambda will be over space,
> which means at least one application per pixel, so the computations
> moved under the inner lambda will be redundantly computed a few
> millions of times per frame (and much more with anti-aliasing).
> Instead, I want to move those calculations to once per frame and
> stored in quickly accessed video memory. As the space-independent
> computation gets more complex, I expect the saving to grow.

Hmm. The problem is that GHC’s optimizer uses a cost-model (in this
case: calling an anonymous function is worse than recomputing sind)
that does not hold in your case. Besides simply turning off some
transformations, I am not sure what best you could do to avoid this…

Gruß,
Joachim
--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Conal Elliott
In reply to this post by Sebastian Graf
Hi Sebastian,

Thanks for the reply. It's that I don't want `exampleC` to be eta-expanded. Apparently GHC does by default even when doing so moves computation under lambda. I've thought otherwise for a very long time.

-- Conal

On Tue, Jul 18, 2017 at 9:48 AM, Sebastian Graf <[hidden email]> wrote:
Hi Conal,

so if I understand this right, you'd rather not wanted `exampleC` to be eta-expanded (or the binding of `s` to be floated into the lambda)?
Or is it that you want CSE to find out that you always supply the same `t` as the first argument and share the partial application and thus the work of computing `s`?

If it's the former: GHC doesn't normally do this, unless it has found out that no sharing (of work done to evaluate `s`) would be lost through eta-expansion.
This is the case when `exampleC` is always called with two arguments, so that no binding of `s` is shared, for example.
Could you maybe post a complete module/expression representative of all uses of `exampleC`?

If it's the latter, I'm afraid I can't really help, but surely someone else can.

Cheers,
Sebastian

On Tue, Jul 18, 2017 at 5:34 PM, Conal Elliott <[hidden email]> wrote:
I'm seeing what looks like repeated computation under a lambda with `-O` and `-O2`. The following definition:

> exampleC :: Double -> Double -> Double
> exampleC = \ t -> let s = sin t in \ x -> x + s

yields this Core:

> -- RHS size: {terms: 13, types: 6, coercions: 0}
> exampleC :: Double -> Double -> Double
> exampleC =
>   \ (t_afI6 :: Double) (eta_B1 :: Double) ->
>     case eta_B1 of _ { D# x_aj5c ->
>     case t_afI6 of _ { D# x1_aj5l ->
>     D# (+## x_aj5c (sinDouble# x1_aj5l))
>     }
>     }

The `sinDouble#` here depends only on `t_afI6` (`t`) but still appears under the binding of `eta_B1` (`x`).

I'm concerned because many of my uses of such functions involve computations dependent only on `t` (time) but with millions of uses (space) per `t`. (I'm working on a GHC Core plugin (compiling to categories), with one use generating graphics GPU code.)

Does the optimization I expected somehow happen later in the compilation pipeline?

Are there Core-manipulating functions in GHC that I can use to make it happen earlier (via a `BuiltinRule`, i.e., Core-to-Core function)?

Thanks,
-- Conal


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs




_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

David Feuer-2
On Tuesday, July 18, 2017 3:55:28 PM EDT Conal Elliott wrote:
> Hi Sebastian,
>
> Thanks for the reply. It's that I don't want `exampleC` to be eta-expanded.
> Apparently GHC does by default even when doing so moves computation under
> lambda. I've thought otherwise for a very long time.

GHC really likes to eta-expand, because that can be very good for allocation,
unboxing, and I don't know what else. Do you really need to represent the
intermediate result by a *function*? Would it work just to save the Double
itself? I suspect you could likely convince GHC to leave it alone.

David
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Conal Elliott
Here's the code in question, slightly rephrased:

> exampleC t = \ x -> x + s where s = sin t

I wrote it this way so that `sin t` would be computed once per `t` and reused for each value of `x`. The intermediate result `s` has type `Double`---not a function. Without `-fno-do-lambda-eta-expansion`, phase 2 of `ghc -O` causes the `s = sin t` binding to be moved under the `\ x -> ...`. I've been using this programming style for this purpose for longer than I can remember, and apparently I've been mistaken about its effectiveness at least part of that time.

-- Conal

On Tue, Jul 18, 2017 at 4:52 PM, David Feuer <[hidden email]> wrote:
On Tuesday, July 18, 2017 3:55:28 PM EDT Conal Elliott wrote:
> Hi Sebastian,
>
> Thanks for the reply. It's that I don't want `exampleC` to be eta-expanded.
> Apparently GHC does by default even when doing so moves computation under
> lambda. I've thought otherwise for a very long time.

GHC really likes to eta-expand, because that can be very good for allocation,
unboxing, and I don't know what else. Do you really need to represent the
intermediate result by a *function*? Would it work just to save the Double
itself? I suspect you could likely convince GHC to leave it alone.

David


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Joachim Breitner-2
Hi,

Am Dienstag, den 18.07.2017, 17:01 -0700 schrieb Conal Elliott:

> Here's the code in question, slightly rephrased:
>
> > exampleC t = \ x -> x + s where s = sin t
>
> I wrote it this way so that `sin t` would be computed once per `t`
> and reused for each value of `x`. The intermediate result `s` has
> type `Double`---not a function. Without `-fno-do-lambda-eta-
> expansion`, phase 2 of `ghc -O` causes the `s = sin t` binding to be
> moved under the `\ x -> ...`. I've been using this programming style
> for this purpose for longer than I can remember, and apparently I've
> been mistaken about its effectiveness at least part of that time.
usually it is very effective. GHC does this transformation not nilly-
willy, but only when its heuristics indicate that it is ok, usually
because the operation (here sin) is known to be very cheap. If that
code would call some unknown function, or allocate memory, then GHC
would certainly preserve your intention.

Greetings,
Joachim
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: Repeated computations under a lambda

GHC - devs mailing list
In reply to this post by Conal Elliott

I opened a ticket and replied:

 

https://ghc.haskell.org/trac/ghc/ticket/13996#comment:1

 

Simon

 

 

 

From: ghc-devs [mailto:[hidden email]] On Behalf Of Conal Elliott
Sent: 19 July 2017 01:01
To: David Feuer <[hidden email]>
Cc: [hidden email]
Subject: Re: Repeated computations under a lambda

 

Here's the code in question, slightly rephrased:

 

> exampleC t = \ x -> x + s where s = sin t

 

I wrote it this way so that `sin t` would be computed once per `t` and reused for each value of `x`. The intermediate result `s` has type `Double`---not a function. Without `-fno-do-lambda-eta-expansion`, phase 2 of `ghc -O` causes the `s = sin t` binding to be moved under the `\ x -> ...`. I've been using this programming style for this purpose for longer than I can remember, and apparently I've been mistaken about its effectiveness at least part of that time.

 

-- Conal

 

On Tue, Jul 18, 2017 at 4:52 PM, David Feuer <[hidden email]> wrote:

On Tuesday, July 18, 2017 3:55:28 PM EDT Conal Elliott wrote:
> Hi Sebastian,
>
> Thanks for the reply. It's that I don't want `exampleC` to be eta-expanded.
> Apparently GHC does by default even when doing so moves computation under
> lambda. I've thought otherwise for a very long time.

GHC really likes to eta-expand, because that can be very good for allocation,
unboxing, and I don't know what else. Do you really need to represent the
intermediate result by a *function*? Would it work just to save the Double
itself? I suspect you could likely convince GHC to leave it alone.

David

 


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Repeated computations under a lambda

Conal Elliott
In reply to this post by Joachim Breitner-2
> GHC does this transformation not nilly-willy, .... If that
> code would call some unknown function, or allocate memory, then GHC
> would certainly preserve your intention.

Wonderful. Thanks for the clarification.

-- Conal

On Tue, Jul 18, 2017 at 6:51 PM, Joachim Breitner <[hidden email]> wrote:
Hi,

Am Dienstag, den 18.07.2017, 17:01 -0700 schrieb Conal Elliott:
> Here's the code in question, slightly rephrased:
>
> > exampleC t = \ x -> x + s where s = sin t
>
> I wrote it this way so that `sin t` would be computed once per `t`
> and reused for each value of `x`. The intermediate result `s` has
> type `Double`---not a function. Without `-fno-do-lambda-eta-
> expansion`, phase 2 of `ghc -O` causes the `s = sin t` binding to be
> moved under the `\ x -> ...`. I've been using this programming style
> for this purpose for longer than I can remember, and apparently I've
> been mistaken about its effectiveness at least part of that time.

usually it is very effective. GHC does this transformation not nilly-
willy, but only when its heuristics indicate that it is ok, usually
because the operation (here sin) is known to be very cheap. If that
code would call some unknown function, or allocate memory, then GHC
would certainly preserve your intention.

Greetings,
Joachim
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Loading...