panic when compiling SHA


panic when compiling SHA

Kazu Yamamoto (山本和彦)
Hi,

When I tried to build the SHA library with GHC HEAD on 32-bit Linux,
GHC HEAD panicked. GHC 7.4.2 can build SHA on the same machine.

Configuring SHA-1.6.1...
Building SHA-1.6.1...
Failed to install SHA-1.6.1
Last 10 lines of the build log ( /home/kazu/work/rpf/.cabal-sandbox/logs/SHA-1.6.1.log ):
Preprocessing library SHA-1.6.1...
[1 of 1] Compiling Data.Digest.Pure.SHA ( Data/Digest/Pure/SHA.hs, dist/dist-sandbox-ef3aaa11/build/Data/Digest/Pure/SHA.o )
ghc: panic! (the 'impossible' happened)
  (GHC version 7.7.20131202 for i386-unknown-linux):
        regSpill: out of spill slots!
       regs to spill = 1129
       slots left    = 677

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

--Kazu


panic when compiling SHA

Manuel Gómez
On Thu, Dec 26, 2013 at 8:37 PM, Kazu Yamamoto <kazu at iij.ad.jp> wrote:
> When I tried to build the SHA library with GHC HEAD on 32-bit Linux,
> GHC HEAD panicked. GHC 7.4.2 can build SHA on the same machine.
>
> [1 of 1] Compiling Data.Digest.Pure.SHA ( Data/Digest/Pure/SHA.hs, dist/dist-sandbox-ef3aaa11/build/Data/Digest/Pure/SHA.o )
> ghc: panic! (the 'impossible' happened)
>   (GHC version 7.7.20131202 for i386-unknown-linux):
>         regSpill: out of spill slots!
>        regs to spill = 1129
>        slots left    = 677

I ran into this a while ago.  Turns out it's a known bug[0], but it
seems like it's been hard to reproduce.  FWIW, I also hit it on 32-bit
Linux.  You can probably work around it with `-fno-regs-graph`.

[0]: <https://ghc.haskell.org/trac/ghc/ticket/5361>
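(Editor's sketch of where that workaround would go in a package description; the stanza below is illustrative, not taken from the real SHA.cabal.)

```cabal
library
  -- Fall back to the linear register allocator: either delete an
  -- existing -fregs-graph from ghc-options, or pass -fno-regs-graph
  -- explicitly to override it.
  ghc-options: -O2 -fno-regs-graph
```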


panic when compiling SHA

Ben Lippmeier-2
In reply to this post by Kazu Yamamoto (山本和彦)

On 27/12/2013, at 12:07 PM, Kazu Yamamoto (山本和彦) wrote:

> Hi,
>
> When I tried to build the SHA library with GHC HEAD on 32-bit Linux,
> GHC HEAD panicked. GHC 7.4.2 can build SHA on the same machine.
>
> Configuring SHA-1.6.1...
> Building SHA-1.6.1...
> Failed to install SHA-1.6.1
> Last 10 lines of the build log ( /home/kazu/work/rpf/.cabal-sandbox/logs/SHA-1.6.1.log ):
> Preprocessing library SHA-1.6.1...
> [1 of 1] Compiling Data.Digest.Pure.SHA ( Data/Digest/Pure/SHA.hs, dist/dist-sandbox-ef3aaa11/build/Data/Digest/Pure/SHA.o )
> ghc: panic! (the 'impossible' happened)
>  (GHC version 7.7.20131202 for i386-unknown-linux):
> regSpill: out of spill slots!
>       regs to spill = 1129
>       slots left    = 677

There are only a fixed number of register spill slots, and when they're all used the compiler can't dynamically allocate more of them.

This SHA benchmark is pathological in that the intermediate code expands to have many variables with long, overlapping live ranges. The underlying problem is really that the inliner and/or other optimisations have gone crazy and made a huge intermediate program. We *could* give it more spill slots, to make it compile, but the generated code would be horrible.

Try turning down the optimisation level, reducing inliner keenness, or reducing the SpecConstr flags.
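(Editor's sketch of how those knobs map to GHC command-line flags; the exact threshold value is illustrative, not tuned.)

```sh
# Lower the optimisation level:
ghc -O1 -c Data/Digest/Pure/SHA.hs

# Keep -O2 but make the inliner less keen (smaller unfolding threshold):
ghc -O2 -funfolding-use-threshold=8 -c Data/Digest/Pure/SHA.hs

# Keep -O2 but switch off call-pattern specialisation (SpecConstr):
ghc -O2 -fno-spec-constr -c Data/Digest/Pure/SHA.hs
```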

Ben.



panic when compiling SHA

Simon Marlow-7
On 28/12/13 03:58, Ben Lippmeier wrote:

>
> On 27/12/2013, at 12:07 PM, Kazu Yamamoto (山本和彦) wrote:
>
>> Hi,
>>
>> When I tried to build the SHA library with GHC HEAD on 32-bit Linux,
>> GHC HEAD panicked. GHC 7.4.2 can build SHA on the same machine.
>>
>> Configuring SHA-1.6.1...
>> Building SHA-1.6.1...
>> Failed to install SHA-1.6.1
>> Last 10 lines of the build log ( /home/kazu/work/rpf/.cabal-sandbox/logs/SHA-1.6.1.log ):
>> Preprocessing library SHA-1.6.1...
>> [1 of 1] Compiling Data.Digest.Pure.SHA ( Data/Digest/Pure/SHA.hs, dist/dist-sandbox-ef3aaa11/build/Data/Digest/Pure/SHA.o )
>> ghc: panic! (the 'impossible' happened)
>>   (GHC version 7.7.20131202 for i386-unknown-linux):
>> regSpill: out of spill slots!
>>        regs to spill = 1129
>>        slots left    = 677
>
> There are only a fixed number of register spill slots, and when
> they're all used the compiler can't dynamically allocate more of
> them.

Not true any more in 7.8+ with the linear allocator.  I think it might
still be true for the graph allocator, which is sadly suffering from a
little bitrot and probably doesn't generate very good code with the new
code generator.

So, avoiding -fregs-graph should work around this with 7.8.

Cheers,
        Simon


> This SHA benchmark is pathological in that the intermediate code expands to have many variables with long, overlapping live ranges. The underlying problem is really that the inliner and/or other optimisations have gone crazy and made a huge intermediate program. We *could* give it more spill slots, to make it compile, but the generated code would be horrible.
>
> Try turning down the optimisation level, reduce inliner keenness, or reduce SpecConstr flags.
>
> Ben.
>



panic when compiling SHA

Kazu Yamamoto (山本和彦)
Hi,

>> There are only a fixed number of register spill slots, and when
>> they're all used the compiler can't dynamically allocate more of
>> them.
>
> Not true any more in 7.8+ with the linear allocator.  I think it might
> still be true for the graph allocator, which is sadly suffering from a
> little bitrot and probably doesn't generate very good code with the
> new code generator.
>
> So, avoiding -fregs-graph should work around this with 7.8.

I confirmed that removing -fregs-graph works around this with
7.8.

--Kazu


panic when compiling SHA

Ben Lippmeier-2

On 04/01/2014, at 23:22 , Kazu Yamamoto (山本和彦) <kazu at iij.ad.jp> wrote:

> Hi,
>
>>> There are only a fixed number of register spill slots, and when
>>> they're all used the compiler can't dynamically allocate more of
>>> them.
>>
>> Not true any more in 7.8+ with the linear allocator.  I think it might
>> still be true for the graph allocator, which is sadly suffering from a
>> little bitrot and probably doesn't generate very good code with the
>> new code generator.
>>
>> So, avoiding -fregs-graph should work around this with 7.8.
>
> I confirmed that removing -fregs-graph works around this with
> 7.8.

Ok, my mistake. We originally added -fregs-graph when compiling that module because both allocators had a fixed stack size, but the graph allocator did a better job of allocation and avoided overflowing the stack.

Note that removing the flag isn't a "solution" to the underlying problem of the intermediate code being awful. Switching to the linear allocator just permits compilation of core code that was worse than before. Now it needs to spill more registers when compiling the same source code.

Ben.



panic when compiling SHA

Kazu Yamamoto (山本和彦)
Ben,

> Note that removing the flag isn't a "solution" to the underlying
> problem of the intermediate code being awful. Switching to the
> linear allocator just permits compilation of core code that was
> worse than before. Now it needs to spill more registers when
> compiling the same source code.

So, would you reopen #5361 yourself?

        https://ghc.haskell.org/trac/ghc/ticket/5361

--Kazu


panic when compiling SHA

Ben Lippmeier-2

On 06/01/2014, at 14:08 , Kazu Yamamoto (山本和彦) <kazu at iij.ad.jp> wrote:

> Ben,
>
>> Note that removing the flag isn't a "solution" to the underlying
>> problem of the intermediate code being awful. Switching to the
>> linear allocator just permits compilation of core code that was
>> worse than before. Now it needs to spill more registers when
>> compiling the same source code.
>
> So, would you reopen #5361 by yourself?
>
> https://ghc.haskell.org/trac/ghc/ticket/5361

Not if we just have this one test. I'd be keen to blame excessive use of inline pragmas in the SHA library itself, or excessive optimisation flags. It's not really a bug in GHC until there are two tests that exhibit the same problem.

Ben.




panic when compiling SHA

Simon Peyton Jones
In reply to this post by Ben Lippmeier-2
| Note that removing the flag isn't a "solution" to the underlying problem
| of the intermediate code being awful. Switching to the linear allocator
| just permits compilation of core code that was worse than before. Now it
| needs to spill more registers when compiling the same source code.

In what way is the intermediate code awful? How could it be fixed?

Worth opening a ticket for that issue?  At the moment it's invisible because the issue appears superficially to be about register allocation.

Simon




panic when compiling SHA

Adam Wick
In reply to this post by Ben Lippmeier-2
On Jan 6, 2014, at 12:20 AM, Ben Lippmeier <benl at ouroborus.net> wrote:

> On 06/01/2014, at 14:08 , Kazu Yamamoto (山本和彦) <kazu at iij.ad.jp> wrote:
>> Ben,
>>
>>> Note that removing the flag isn't a "solution" to the underlying
>>> problem of the intermediate code being awful. Switching to the
>>> linear allocator just permits compilation of core code that was
>>> worse than before. Now it needs to spill more registers when
>>> compiling the same source code.
>>
>> So, would you reopen #5361 by yourself?
>>
>> https://ghc.haskell.org/trac/ghc/ticket/5361
>
> Not if we just have this one test. I'd be keen to blame excessive use of inline pragmas in the SHA library itself, or excessive optimisation flags. It's not really a bug in GHC until there are two tests that exhibit the same problem.


The SHA library uses SPECIALIZE, INLINE, and bang patterns in fairly standard ways. There's nothing too exotic in there; I just sprinkled hints in places I thought would be useful, and then backed those up with benchmarking.

If GHC simply emitted rotten code in this case, I'd agree: wait for more examples, and put the onus on the developer to make it work better. However, right now, GHC crashes on valid input. Which is a bug. So I'd argue that the ticket should be re-opened. I suppose, alternatively, the documentation on SPECIALIZE, INLINE, and bang patterns could be changed to note that using them is not officially supported.

If the problem is pretty fundamental, then perhaps instead of panicking and dying, GHC should instead default back to a worse register allocator. Perhaps it could print a warning when that happens, but that's optional. That would be an easier way to fix this bug if there are deeper algorithmic problems, or if fixing it for SHA would simply move the failure line a little further down the field. (Obviously this route opens a performance regression on my end, but hey, that's my problem.)


- Adam


panic when compiling SHA

Ben Lippmeier-2
In reply to this post by Simon Peyton Jones

On 06/01/2014, at 19:43 , Simon Peyton-Jones <simonpj at microsoft.com> wrote:

> | Note that removing the flag isn't a "solution" to the underlying problem
> | of the intermediate code being awful. Switching to the linear allocator
> | just permits compilation of core code that was worse than before. Now it
> | needs to spill more registers when compiling the same source code.
>
> In what way is the intermediate code awful?

Because the error message from the register allocator tells us that there are over 1000 live variables at a particular point in the assembly code, but the "biggest" SHA hashing algorithm (SHA-3) should only need to maintain 25 words of state (says Wikipedia).


> How could it be fixed?


Someone that cares enough about the SHA library would need to understand why it's producing the intermediate code it does. My gentle suggestion is that when a library developer starts adding INLINE pragmas to their program it becomes their job to understand why the intermediate code is how it is.


> Worth opening a ticket for that issue?  At the moment it's invisible because the issue appears superficially to be about register allocation.

I'd open a ticket against the SHA library saying the choice of optimisation flags / pragmas is probably causing code explosion during compilation. If the developer then decides this is really a problem in GHC I'd want some description of what core transforms they need to happen to achieve good performance. The strategy of "inline everything and hope for the best" is understandable (I've used it!) but only gets you so far...

The bug report is like someone saying "GHC can't compile my 100MB core program". You can either open a ticket against GHC, or ask "why have you got a 100MB core program?"

Ben.



panic when compiling SHA

Ben Lippmeier-2
In reply to this post by Adam Wick

On 07/01/2014, at 9:26 , Adam Wick <awick at galois.com> wrote:

>> Not if we just have this one test. I'd be keen to blame excessive use of inline pragmas in the SHA library itself, or excessive optimisation flags. It's not really a bug in GHC until there are two tests that exhibit the same problem.
>
> The SHA library uses SPECIALIZE, INLINE, and bang patterns in fairly standard ways. There's nothing too exotic in there, I just basically sprinkled hints in places I thought would be useful, and then backed those up with benchmarking.

Ahh. It's the "sprinkled hints in places I thought would be useful" which is what I'm concerned about. If you just add pragmas without understanding their effect on the core program then it'll bite further down the line. Did you compare the object code size as well as wall clock speedup?


> If GHC simply emitted rotten code in this case, I?d agree: wait for more examples, and put the onus on the developer to make it work better. However, right now, GHC crashes on valid input. Which is a bug. So I?d argue that the ticket should be re-opened. I suppose, alternatively, the documentation on SPECIALIZE, INLINE, and bang patterns could be changed to note that using them is not officially supported.

Sadly, "valid input" isn't a well-defined concept in practice. You could write a "valid" 10GB Haskell source file that obeyed the Haskell standard grammar, but I wouldn't expect that to compile either. You could also write small (< 1k) source programs that trigger complexity problems in Hindley-Milner style type inference. You could also use compile-time meta programming (like Template Haskell) to generate intermediate code that is well formed but much too big to compile. The fact that a program obeys a published grammar is not sufficient to expect it to compile with a particular implementation (sorry to say).


> If the problem is pretty fundamental, then perhaps instead of panicking and dying, GHC should instead default back to a worse register allocator. Perhaps it could print a warning when that happens, but that?s optional. That would be an easier way to fix this bug if there are deeper algorithmic problems, or if fixing it for SHA would simply move the failure line a little further down the field. (Obviously this route opens a performance regression on my end, but hey, that?s my problem.)

Adding an INLINE pragma is akin to using compile-time meta programming. I suspect your meta programming is more broken than GHC in this case, but I'd be happy to be proven otherwise. Right now the panic from the register allocator is all the feedback you've got that something is wrong, and the SHA library is the only one I've seen that causes this problem. See above discussion about "valid input".

Ben.



panic when compiling SHA

Jan Stolarek
>  GHC crashes on valid input. Which is a bug.
As Ben pointed out, it is conceivable that the compiler will not be able to handle a correct
program. But as a user I would expect GHC to detect such situations (if possible) and display
an error message, not crash with a panic (which clearly says this is a bug and should be reported).

Janek


panic when compiling SHA

Adam Wick
On Jan 7, 2014, at 4:11 AM, Jan Stolarek <jan.stolarek at p.lodz.pl> wrote:
>> GHC crashes on valid input. Which is a bug.
> As Ben pointed out, it is conceivable that the compiler will not be able to handle a correct program.

Personally, I find this view extremely disappointing. If my SHA library failed to work on a valid input, I would consider that a bug. Why is GHC special? Keep in mind that I'm not saying that this bug needs to be highest priority and fixed immediately, but instead I'm merely arguing that it should be acknowledged as a bug.

> But as a user I would expect GHC to detect such situations (if possible) and display an error
> message, not crash with a panic (which clearly says this is a bug and should be reported).


Personally, I'd find this an acceptable, if a bit disappointing, solution. Essentially you're redefining "valid input." It just seems a shame to be doing so because of an implementation weakness rather than an actual, fundamental problem.


- Adam


panic when compiling SHA

Adam Wick
In reply to this post by Ben Lippmeier-2
On Jan 7, 2014, at 2:27 AM, Ben Lippmeier <benl at ouroborus.net> wrote:
> On 07/01/2014, at 9:26 , Adam Wick <awick at galois.com> wrote:
>
>>> Not if we just have this one test. I'd be keen to blame excessive use of inline pragmas in the SHA library itself, or excessive optimisation flags. It's not really a bug in GHC until there are two tests that exhibit the same problem.
>>
>> The SHA library uses SPECIALIZE, INLINE, and bang patterns in fairly standard ways. There's nothing too exotic in there, I just basically sprinkled hints in places I thought would be useful, and then backed those up with benchmarking.
>
> Ahh. It's the "sprinkled hints in places I thought would be useful" which is what I'm concerned about. If you just add pragmas without understanding their effect on the core program then it'll bite further down the line. Did you compare the object code size as well as wall clock speedup?

I understand the pragmas and what they do with my code. I use SPECIALIZE twice, for two functions. In both cases it was clearer to write the function as (a -> a -> a -> a), but I wanted specialized versions for the two instantiations that were actually used, (a == Word32) and (a == Word64). This benchmarked as faster while maintaining code clarity and concision. I use INLINE in five places, each of them a SHA step function, with the understanding that it would generate ideal code for the performance-critical parts of the algorithm: straight-line, single-block code with no conditionals.
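(Editor's sketch of the kind of SPECIALIZE usage described here; the step function body is hypothetical and not taken from the SHA library.)

```haskell
{-# LANGUAGE BangPatterns #-}
module Step (step) where

import Data.Bits (Bits, rotateL, xor)
import Data.Word (Word32, Word64)

-- A hypothetical SHA-style step function, written once generically
-- and then specialised to the two word sizes actually used.
step :: (Num a, Bits a) => a -> a -> a -> a
step !a !b !c = (a `rotateL` 5) + (b `xor` c)
{-# SPECIALIZE step :: Word32 -> Word32 -> Word32 -> Word32 #-}
{-# SPECIALIZE step :: Word64 -> Word64 -> Word64 -> Word64 #-}
```

GHC compiles the two specialised copies alongside the generic one, so call sites at Word32 and Word64 avoid dictionary passing without duplicating the source.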

When I did my original performance work, several versions of GHC ago, I did indeed consider compile time, runtime performance, and space usage. I picked what I thought was a reasonable balance at the time.

I also just performed an experiment in which I took the SHA library, deleted all instances of INLINE and SPECIALIZE, and compiled it with HEAD on 32-bit Linux. You get the same crash. So my usage of SPECIALIZE and INLINE is beside the point.

> Sadly, "valid input" isn't a well defined concept in practice. You could write a "valid" 10GB Haskell source file that obeyed the Haskell standard grammar, but I wouldn't expect that to compile either.

I would. I'm a little disappointed that ghc-devs does not. I wouldn't expect it to compile quickly, but I would expect it to run.

> You could also write small (< 1k) source programs that trigger complexity problems in Hindley-Milner style type inference. You could also use compile-time meta programming (like Template Haskell) to generate intermediate code that is well formed but much too big to compile. The fact that a program obeys a published grammar is not sufficient to expect it to compile with a particular implementation (sorry to say).

If I write a broken Template Haskell macro, then yes, I agree. This is not the case in this example.

> Adding an INLINE pragma is akin to using compile-time meta programming.

Is it? I find that a strange point of view. Isn't INLINE just a strong hint to the compiler that this function should be inlined? How is using INLINE any different from simply manually inserting the code at every call site?


- Adam


panic when compiling SHA

Ben Lippmeier-2

On 08/01/2014, at 10:57 , Adam Wick <awick at galois.com> wrote:

> I also just performed an experiment in which I took the SHA library, deleted all instances of INLINE and SPECIALIZE, and compiled it with HEAD on 32-bit Linux. You get the same crash. So my usage of SPECIALIZE and INLINE is beside the point.

Ok, then maybe the default inliner heuristics are a bit too eager for this program. Whether that's a bug is open for debate. The standard way of setting such heuristics is to compile a "representative" set of benchmarks (eg, nofib) and choose some settings that give good average performance. I don't think this is an ideal approach, but it's the typical one for compiler engineering.


>> Sadly, "valid input" isn't a well defined concept in practice. You could write a "valid" 10GB Haskell source file that obeyed the Haskell standard grammar, but I wouldn't expect that to compile either.
>
> I would. I?m a little disappointed that ghc-devs does not. I wouldn?t expect it to compile quickly, but I would expect it to run.

To satisfy such a demand GHC would need to have linear space usage with respect to the input program size. This implies it must also be linear with respect to the number of top-level declarations, number of variables, number of quantifiers in type sigs, and any other countable thing in the input program. It would also need to be linear for other finite resources that might run out, like symbol table entries. If you had 1Gig top-level foreign exported declarations in the source program I suspect the ELF linker would freak out.

I'm not trying to be difficult or argumentative -- I think limits like these come naturally with a concrete implementation.

I agree it's sad that client programmers can't just enable -O2 and expect every program to work. It'd be nice to have optimisation levels that give resource or complexity guarantees, like "enabling this won't make the code-size non-linear in the input size", but that's not how it works at the moment. I'm not aware of any compiler for a "high level" language that gives such guarantees, but there may be some. I'd be interested to know of any.


>> Adding an INLINE pragma is akin to using compile-time meta programming.
>
> Is it? I find that a strange point of view. Isn?t INLINE just a strong hint to the compiler that this function should be inlined? How is using INLINE any different from simply manually inserting the code at every call site?

It's not a "hint" -- it *forces* inlining at every call site like you said. It'll make a new copy of the function body for every call site, and not back-out if the program gets "too big".

Suppose:

f x = g x ... g x' ... g x''
g y = h y ... h y' ... h y''
h z = i z ... i z' ... i z''
...

now force inlining for all of f g h etc. I'd expect to see at least 3*3*3=27 copies of the body of 'i' in the core program, and even more if SpecConstr and the LiberateCase transform are turned on. We had (and have) big problems like this with DPH. It took too long for the DPH team to unlearn the dogma that "inlining and call pattern specialisation make the program better".
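(Editor's note: the f/g/h sketch above, filled out as a compilable toy; the function bodies are hypothetical, the call-site counts are what matter.)

```haskell
module Blowup where

-- Hypothetical bodies: each function has three call sites to the next,
-- mirroring the f/g/h/i chain sketched above.
f, g, h, i :: Int -> Int
f x = g x + g (x + 1) + g (x + 2)
{-# INLINE f #-}
g y = h y + h (y * 2) + h (y * 3)
{-# INLINE g #-}
h z = i z + i (z - 1) + i (z - 2)
{-# INLINE h #-}
i w = w * w
{-# INLINE i #-}
-- With all four forced inline, the body of 'i' is duplicated
-- 3 * 3 * 3 = 27 times inside the unfolding of 'f'.
```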

Ben.




panic when compiling SHA

Carter Schonwald
In reply to this post by Adam Wick
Adam,
I agree that it should be considered a misfeature (or at the very least a
good stress test that currently breaks the register allocator). That said,
INLINE / INLINEABLE are only needed for intermodule optimization. Have you
tried using the special "inline" primop selectively, or using INLINEABLE
plus a selective inline? I think inline should work in the defining module
even if you don't provide an INLINE or INLINEABLE.

Question 1: does the code compile well when you use -fllvm? (It seems like
the discussion so far has been NCG-focused.)
How does the generated assembly fare that way vs. the workaround path on the NCG?
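(Editor's sketch of the call-site "inline" combinator Carter mentions, exported from GHC.Exts; the step function here is hypothetical.)

```haskell
module UseInline (compress) where

import GHC.Exts (inline)

-- A hypothetical step function carrying no INLINE pragma at all.
step :: Int -> Int -> Int
step a b = a * 2 + b

-- 'inline' requests inlining at this one call site only; the
-- definition must be visible (same module, or an exposed unfolding).
compress :: Int -> Int
compress x = inline step x 7
```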






panic when compiling SHA

Jan Stolarek
In reply to this post by Ben Lippmeier-2
> It's not a "hint" -- it *forces* inlining at every call site like you said.
There are exceptions: a function must be fully applied to be inlined, and there are loop-breakers
(e.g. a self-recursive function will not be inlined).
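(Editor's toy illustration of the loop-breaker point; the function is hypothetical.)

```haskell
{-# LANGUAGE BangPatterns #-}
module LoopBreaker (sumTo) where

-- 'go' is self-recursive, so GHC marks it as a loop-breaker: the
-- INLINE pragma is effectively ignored at the recursive call site
-- and the loop is not unrolled. The wrapper 'sumTo' can still inline.
sumTo :: Int -> Int
sumTo = go 0
  where
    go :: Int -> Int -> Int
    go !acc 0 = acc
    go !acc n = go (acc + n) (n - 1)
    {-# INLINE go #-}
```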

Janek


panic when compiling SHA

Iavor Diatchki
In reply to this post by Carter Schonwald
Hello,

I find it a bit perplexing (and not at all constructive) that we are
arguing over semantics here.  We have a program (1 module, ~1000 lines of
"no fancy extension Haskell"), which causes GHC to panic.  This is a bug.
 An invariant that we were assuming did not actually hold.  Hence the
message that the "impossible" happened.  If GHC decides to refuse to
compile a program, it should not panic but, rather, explain what happened
and maybe suggest a workaround.

I am not familiar with GHC's back-end, but it seems that there might be
something interesting that's going on here.   The SHA library works fine
with 7.6.3, and it compiles (admittedly very slowly) using GHC head on my
64-bit machine.   So something has changed, and it'd be nice if we
understood what's causing the problem.

Ben suggested that the issue might be the INLINE pragmas, but clearly
that's not the problem, as Adam reproduced the same behavior without those
pragmas.  If the issue is indeed with the built-in inline heuristics, it
sounds like we either should fix the heuristics, or come up with some
suggestions about what to avoid in user programs.  Or, perhaps, the issue is
something completely unrelated (e.g., a bug in the register allocator).
Either way, I think this deserves a ticket.

-Iavor


On Tue, Jan 7, 2014 at 10:11 PM, Carter Schonwald <
carter.schonwald at gmail.com> wrote:

> Adam,
> I agree that it should be considered a misfeature (or at the very least a
> good stress test that currently breaks the register allocator). That said,
> INLINE / INLINEABLE are only needed for intermodule optimization, have you
> tried using the special "inline" primop selectively, or using INLINEABLE
> plus selective inline? I think inline should work in the defining module
> even if you don't provide an INLINE or INLINEABLE.
>
> question 1: does the code compile well when you use -fllvm? (seems like
> the discussion so far has been NCG focused).
> how does the generated assembly fare that way vs the workaround path on
> NCG?
>
>
>
>
> On Tue, Jan 7, 2014 at 6:57 PM, Adam Wick <awick at galois.com> wrote:
>
>> On Jan 7, 2014, at 2:27 AM, Ben Lippmeier <benl at ouroborus.net> wrote:
>> > On 07/01/2014, at 9:26 , Adam Wick <awick at galois.com> wrote:
>> >
>> >>> Not if we just have this one test. I'd be keen to blame excessive use
>> of inline pragmas in the SHA library itself, or excessive optimisation
>> flags. It's not really a bug in GHC until there are two tests that exhibit
>> the same problem.
>> >>
>> >> The SHA library uses SPECIALIZE, INLINE, and bang patterns in fairly
>> standard ways. There's nothing too exotic in there, I just basically
>> sprinkled hints in places I thought would be useful, and then backed those
>> up with benchmarking.
>> >
>> > Ahh. It's the "sprinkled hints in places I thought would be useful"
>> which is what I'm concerned about. If you just add pragmas without
>> understanding their effect on the core program then it'll bite further down
>> the line. Did you compare the object code size as well as wall clock
>> speedup?
>>
>> I understand the pragmas and what they do with my code. I use SPECIALIZE
>> twice for two functions. In both functions, it was clearer to write the
>> function as (a -> a -> a -> a), but I wanted specialized versions for the
>> two versions that were going to be used, in which (a == Word32) or (a ==
>> Word64). This benchmarked as faster while maintaining code clarity and
>> concision. I use INLINE in five places, each of them a SHA step function,
>> with the understanding that it would generate ideal code for a compiler for
>> the performance-critical parts of the algorithm: straight line,
>> single-block code with no conditionals.
>>
>> When I did my original performance work, several versions of GHC ago, I
>> did indeed consider compile time, runtime performance, and space usage. I
>> picked what I thought was a reasonable balance at the time.
>>
>> I also just performed an experiment in which I took the SHA library,
>> deleted all instances of INLINE and SPECIALIZE, and compiled it with HEAD
>> on 32-bit Linux. You get the same crash. So my usage of SPECIALIZE and
>> INLINE is beside the point.
>>
>> > Sadly, "valid input" isn't a well defined concept in practice. You
>> could write a "valid" 10GB Haskell source file that obeyed the Haskell
>> standard grammar, but I wouldn't expect that to compile either.
>>
>> I would. I'm a little disappointed that ghc-devs does not. I wouldn't
>> expect it to compile quickly, but I would expect it to run.
>>
>> > You could also write small (< 1k) source programs that trigger
>> complexity problems in Hindley-Milner style type inference. You could also
>> use compile-time meta programming (like Template Haskell) to generate
>> intermediate code that is well formed but much too big to compile. The fact
>> that a program obeys a published grammar is not sufficient to expect it to
>> compile with a particular implementation (sorry to say).
>>
>> If I write a broken Template Haskell macro, then yes, I agree. This is
>> not the case in this example.
>>
>> > Adding an INLINE pragma is akin to using compile-time meta programming.
>>
>> Is it? I find that a strange point of view. Isn't INLINE just a strong
>> hint to the compiler that this function should be inlined? How is using
>> INLINE any different from simply manually inserting the code at every call
>> site?
>>
>>
>> - Adam
>
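For reference, the SPECIALIZE pattern Adam describes above (a polymorphic step function kept generic for clarity, with specialised versions generated for Word32 and Word64) looks roughly like this; the function name and body are hypothetical stand-ins, not the SHA library's actual code:

```haskell
module ShaStepSketch where

import Data.Bits (Bits, rotateL, xor)
import Data.Word (Word32, Word64)

-- The generic definition stays readable, while the pragmas ask GHC to
-- emit monomorphic copies for the two types actually used, avoiding
-- dictionary passing in the hot path.
{-# SPECIALIZE step :: Word32 -> Word32 -> Word32 -> Word32 #-}
{-# SPECIALIZE step :: Word64 -> Word64 -> Word64 -> Word64 #-}
step :: (Bits a, Num a) => a -> a -> a -> a
step a b c = (a `rotateL` 5) + (b `xor` c)
```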


panic when compiling SHA

Carter Schonwald
Well said, Iavor.
It perhaps hints at the register allocators needing some love? I hope to
dig deep into those myself later this year, but maybe they need some wibbles
to clean up for 7.8 right now?


On Wed, Jan 8, 2014 at 2:14 AM, Iavor Diatchki <iavor.diatchki at gmail.com>wrote:

> Hello,
>
> I find it a bit perplexing (and not at all constructive) that we are
> arguing over semantics here.  We have a program (1 module, ~1000 lines of
> "no fancy extension Haskell"), which causes GHC to panic.  This is a bug.
>  An invariant that we were assuming did not actually hold.  Hence the
> message that the "impossible" happened.  If GHC decides to refuse to
> compile a program, it should not panic but, rather, explain what happened
> and maybe suggest a workaround.
>
> I am not familiar with GHC's back-end, but it seems that there might be
> something interesting that's going on here.   The SHA library works fine
> with 7.6.3, and it compiles (admittedly very slowly) using GHC head on my
> 64-bit machine.   So something has changed, and it'd be nice if we
> understood what's causing the problem.
>
> Ben suggested that the issue might be the INLINE pragmas, but clearly
> that's not the problem, as Adam reproduced the same behavior without those
> pragmas.  If the issue is indeed with the built-in inline heuristics, it
> sounds like we either should fix the heuristics, or come up with some
> suggestions about what to avoid in user programs.  Or, perhaps, the issue
> something completely unrelated (e.g., a bug in the register allocator).
> Either way, I think this deserves a ticket.
>
> -Iavor
>
>
>
>
>
>
>
>
> On Tue, Jan 7, 2014 at 10:11 PM, Carter Schonwald <
> carter.schonwald at gmail.com> wrote:
>
>> Adam,
>> I agree that it should be considered a misfeature (or at the very least a
>> good stress test that currently breaks the register allocator). That said,
>> INLINE / INLINEABLE are only needed for intermodule optimization, have
>> you tried using the special "inline" primop selectively, or using
>> INLINEABLE plus selective inline? I think inline should work in the
>> defining module even if you don't provide an INLINE or INLINEABLE.
>>
>> question 1: does the code compile well when you use -fllvm? (seems like
>> the discussion so far has been NCG focused).
>> how does the generated assembly fair that way vs the workaroudn path on
>> NCG?
>>
>>
>>
>>
>> On Tue, Jan 7, 2014 at 6:57 PM, Adam Wick <awick at galois.com> wrote:
>>
>>> On Jan 7, 2014, at 2:27 AM, Ben Lippmeier <benl at ouroborus.net> wrote:
>>> > On 07/01/2014, at 9:26 , Adam Wick <awick at galois.com> wrote:
>>> >
>>> >>> Not if we just have this one test. I'd be keen to blame excessive
>>> use of inline pragmas in the SHA library itself, or excessive optimisation
>>> flags. It's not really a bug in GHC until there are two tests that exhibit
>>> the same problem.
>>> >>
>>> >> The SHA library uses SPECIALIZE, INLINE, and bang patterns in fairly
>>> standard ways. There?s nothing too exotic in there, I just basically
>>> sprinkled hints in places I thought would be useful, and then backed those
>>> up with benchmarking.
>>> >
>>> > Ahh. It's the "sprinkled hints in places I thought would be useful"
>>> which is what I'm concerned about. If you just add pragmas without
>>> understanding their effect on the core program then it'll bite further down
>>> the line. Did you compare the object code size as well as wall clock
>>> speedup?
>>>
>>> I understand the pragmas and what they do with my code. I use SPECIALIZE
>>> twice for two functions. In both functions, it was clearer to write the
>>> function as (a -> a -> a -> a), but I wanted specialized versions for the
>>> two versions that were going to be used, in which (a == Word32) or (a ==
>>> Word64). This benchmarked as faster while maintaining code clarity and
>>> concision. I use INLINE in five places, each of them a SHA step function,
>>> with the understanding that it would generate ideal code for a compiler for
>>> the performance-critical parts of the algorithm: straight line,
>>> single-block code with no conditionals.
>>>
>>> When I did my original performance work, several versions of GHC ago, I
>>> did indeed consider compile time, runtime performance, and space usage. I
>>> picked what I thought was a reasonable balance at the time.
>>>
>>> I also just performed an experiment in which I took the SHA library,
>>> deleted all instances of INLINE and SPECIALIZE, and compiled it with HEAD
>>> on 32-bit Linux. You get the same crash. So my usage of SPECIALIZE and
>>> INLINE is beside the point.
>>>
>>> > Sadly, "valid input" isn't a well defined concept in practice. You
>>> could write a "valid" 10GB Haskell source file that obeyed the Haskell
>>> standard grammar, but I wouldn't expect that to compile either.
>>>
>>> I would. I?m a little disappointed that ghc-devs does not. I wouldn?t
>>> expect it to compile quickly, but I would expect it to run.
>>>
>>> > You could also write small (< 1k) source programs that trigger
>>> complexity problems in Hindley-Milner style type inference. You could also
>>> use compile-time meta programming (like Template Haskell) to generate
>>> intermediate code that is well formed but much too big to compile. The fact
>>> that a program obeys a published grammar is not sufficient to expect it to
>>> compile with a particular implementation (sorry to say).
>>>
>>> If I write a broken Template Haskell macro, then yes, I agree. This is
>>> not the case in this example.
>>>
>>> > Adding an INLINE pragma is akin to using compile-time meta programming.
>>>
>>> Is it? I find that a strange point of view. Isn?t INLINE just a strong
>>> hint to the compiler that this function should be inlined? How is using
>>> INLINE any different from simply manually inserting the code at every call
>>> site?
>>>
>>>
>>> - Adam
>>> _______________________________________________
>>> ghc-devs mailing list
>>> ghc-devs at haskell.org
>>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>>
>>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
>>
>
