[commit: ghc] simd: Fixup stack spills when generating AVX instructions. (b787b5d)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[commit: ghc] simd: Fixup stack spills when generating AVX instructions. (b787b5d)

David Terei
Do we have a longer term solution for this? I love the SIMD work, so
please don't take this as a complaint, I agree with getting it done
for now with the mangler and truth is it's doubtful we'll et around to
adding TNTC support to LLVM anytime soon, so mangler is here to stay
for now.

BUT. I would like to see some thought about how in an ideal world with
lots of free time to hack on GHC we'd kill the mangler. Do we know
that plan for this patch? If so it may be nice to document it
somewhere and perhaps create a trac ticket.

On 14 February 2013 23:16, Geoffrey Mainland <gmainlan at microsoft.com> wrote:

> Repository : ssh://darcs.haskell.org//srv/darcs/ghc
>
> On branch  : simd
>
> http://hackage.haskell.org/trac/ghc/changeset/b787b5d1e687fb28643cdd6a847ccc26bb014a79
>
>>---------------------------------------------------------------
>
> commit b787b5d1e687fb28643cdd6a847ccc26bb014a79
> Author: Geoffrey Mainland <gmainlan at microsoft.com>
> Date:   Sat Nov 26 12:45:23 2011 +0000
>
>     Fixup stack spills when generating AVX instructions.
>
>     LLVM uses aligned AVX moves to spill values onto the stack, which requires
>     32-bye aligned stacks. Since the stack in only 16-byte aligned, LLVM inserts
>     extra instructions that munge the stack pointer. This is very very bad for the
>     GHC calling convention, so we tell LLVM to assume the stack is 32-byte
>     aligned. This patch rewrites the spill instructions that LLVM generates so they
>     do not require an aligned stack.
>
>>---------------------------------------------------------------
>
>  compiler/llvmGen/LlvmMangler.hs |   38 ++++++++++++++++++++++++++++++++++++++
>  1 files changed, 38 insertions(+), 0 deletions(-)
>
> diff --git a/compiler/llvmGen/LlvmMangler.hs b/compiler/llvmGen/LlvmMangler.hs
> index 83a2be7..745dcc6 100644
> --- a/compiler/llvmGen/LlvmMangler.hs
> +++ b/compiler/llvmGen/LlvmMangler.hs
> @@ -20,6 +20,10 @@ import System.IO
>  import Data.List ( sortBy )
>  import Data.Function ( on )
>
> +#if x86_64_TARGET_ARCH
> +#define REWRITE_AVX
> +#endif
> +
>  -- Magic Strings
>  secStmt, infoSec, newLine, textStmt, dataStmt, syntaxUnified :: B.ByteString
>  secStmt       = B.pack "\t.section\t"
> @@ -47,6 +51,7 @@ llvmFixupAsm dflags f1 f2 = {-# SCC "llvm_mangler" #-} do
>      w <- openBinaryFile f2 WriteMode
>      ss <- readSections r w
>      hClose r
> +    let fixed = (map rewriteAVX . fixTables) ss
>      let fixed = fixTables ss
>      mapM_ (writeSection w) fixed
>      hClose w
> @@ -90,6 +95,39 @@ writeSection w (hdr, cts) = do
>      B.hPutStrLn w hdr
>    B.hPutStrLn w cts
>
> +#if REWRITE_AVX
> +rewriteAVX :: Section -> Section
> +rewriteAVX = rewriteVmovaps . rewriteVmovdqa
> +
> +rewriteVmovdqa :: Section -> Section
> +rewriteVmovdqa = rewriteInstructions vmovdqa vmovdqu
> +  where
> +    vmovdqa, vmovdqu :: B.ByteString
> +    vmovdqa = B.pack "vmovdqa"
> +    vmovdqu = B.pack "vmovdqu"
> +
> +rewriteVmovap :: Section -> Section
> +rewriteVmovap = rewriteInstructions vmovap vmovup
> +  where
> +    vmovap, vmovup :: B.ByteString
> +    vmovap = B.pack "vmovap"
> +    vmovup = B.pack "vmovup"
> +
> +rewriteInstructions :: B.ByteString -> B.ByteString -> Section -> Section
> +rewriteInstructions matchBS replaceBS (hdr, cts) =
> +    (hdr, loop cts)
> +  where
> +    loop :: B.ByteString -> B.ByteString
> +    loop cts =
> +        case B.breakSubstring cts matchBS of
> +          (hd,tl) | B.null tl -> hd
> +                  | otherwise -> hd `B.append` replaceBS `B.append`
> +                                 loop (B.drop (B.length matchBS) tl)
> +#else /* !REWRITE_AVX */
> +rewriteAVX :: Section -> Section
> +rewriteAVX = id
> +#endif /* !REWRITE_SSE */
> +
>  -- | Reorder and convert sections so info tables end up next to the
>  -- code. Also does stack fixups.
>  fixTables :: [Section] -> [Section]
>
>
>
> _______________________________________________
> ghc-commits mailing list
> ghc-commits at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-commits


Reply | Threaded
Open this post in threaded view
|

[commit: ghc] simd: Fixup stack spills when generating AVX instructions. (b787b5d)

Geoffrey Mainland
I thought I might provoke you with that patch :) Note that it is only on
the simd branch at the moment.

The stack-alignment problem I'm having and the TNTC problem are the same
in one respect: either we mangle LLVM's assembly output, or we get
patches into LLVM. In an ideal world, we would get patches into LLVM
that solve both issues. My issue would be solved if we could patch LLVM
so that llc could be told to use unaligned moves for SSE/AVX register
spills. I also need to revisit the Win32 stack alignment issue.

Geoff

On 02/14/2013 11:49 PM, David Terei wrote:

> Do we have a longer term solution for this? I love the SIMD work, so
> please don't take this as a complaint, I agree with getting it done
> for now with the mangler and truth is it's doubtful we'll et around to
> adding TNTC support to LLVM anytime soon, so mangler is here to stay
> for now.
>
> BUT. I would like to see some thought about how in an ideal world with
> lots of free time to hack on GHC we'd kill the mangler. Do we know
> that plan for this patch? If so it may be nice to document it
> somewhere and perhaps create a trac ticket.
>
> On 14 February 2013 23:16, Geoffrey Mainland <gmainlan at microsoft.com>
wrote:
>> Repository : ssh://darcs.haskell.org//srv/darcs/ghc
>>
>> On branch : simd
>>
>>
http://hackage.haskell.org/trac/ghc/changeset/b787b5d1e687fb28643cdd6a847ccc26bb014a79

>>
>>> ---------------------------------------------------------------
>>
>> commit b787b5d1e687fb28643cdd6a847ccc26bb014a79
>> Author: Geoffrey Mainland <gmainlan at microsoft.com>
>> Date: Sat Nov 26 12:45:23 2011 +0000
>>
>> Fixup stack spills when generating AVX instructions.
>>
>> LLVM uses aligned AVX moves to spill values onto the stack, which
requires
>> 32-bye aligned stacks. Since the stack in only 16-byte aligned, LLVM
inserts
>> extra instructions that munge the stack pointer. This is very very
bad for the
>> GHC calling convention, so we tell LLVM to assume the stack is 32-byte
>> aligned. This patch rewrites the spill instructions that LLVM
generates so they
>> do not require an aligned stack.
>>
>>> ---------------------------------------------------------------
>>
>> compiler/llvmGen/LlvmMangler.hs | 38
++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 38 insertions(+), 0 deletions(-)
>>
>> diff --git a/compiler/llvmGen/LlvmMangler.hs
b/compiler/llvmGen/LlvmMangler.hs

>> index 83a2be7..745dcc6 100644
>> --- a/compiler/llvmGen/LlvmMangler.hs
>> +++ b/compiler/llvmGen/LlvmMangler.hs
>> @@ -20,6 +20,10 @@ import System.IO
>> import Data.List ( sortBy )
>> import Data.Function ( on )
>>
>> +#if x86_64_TARGET_ARCH
>> +#define REWRITE_AVX
>> +#endif
>> +
>> -- Magic Strings
>> secStmt, infoSec, newLine, textStmt, dataStmt, syntaxUnified ::
B.ByteString
>> secStmt = B.pack "\t.section\t"
>> @@ -47,6 +51,7 @@ llvmFixupAsm dflags f1 f2 = {-# SCC "llvm_mangler"
#-} do

>> w <- openBinaryFile f2 WriteMode
>> ss <- readSections r w
>> hClose r
>> + let fixed = (map rewriteAVX . fixTables) ss
>> let fixed = fixTables ss
>> mapM_ (writeSection w) fixed
>> hClose w
>> @@ -90,6 +95,39 @@ writeSection w (hdr, cts) = do
>> B.hPutStrLn w hdr
>> B.hPutStrLn w cts
>>
>> +#if REWRITE_AVX
>> +rewriteAVX :: Section -> Section
>> +rewriteAVX = rewriteVmovaps . rewriteVmovdqa
>> +
>> +rewriteVmovdqa :: Section -> Section
>> +rewriteVmovdqa = rewriteInstructions vmovdqa vmovdqu
>> + where
>> + vmovdqa, vmovdqu :: B.ByteString
>> + vmovdqa = B.pack "vmovdqa"
>> + vmovdqu = B.pack "vmovdqu"
>> +
>> +rewriteVmovap :: Section -> Section
>> +rewriteVmovap = rewriteInstructions vmovap vmovup
>> + where
>> + vmovap, vmovup :: B.ByteString
>> + vmovap = B.pack "vmovap"
>> + vmovup = B.pack "vmovup"
>> +
>> +rewriteInstructions :: B.ByteString -> B.ByteString -> Section ->
Section

>> +rewriteInstructions matchBS replaceBS (hdr, cts) =
>> + (hdr, loop cts)
>> + where
>> + loop :: B.ByteString -> B.ByteString
>> + loop cts =
>> + case B.breakSubstring cts matchBS of
>> + (hd,tl) | B.null tl -> hd
>> + | otherwise -> hd `B.append` replaceBS `B.append`
>> + loop (B.drop (B.length matchBS) tl)
>> +#else /* !REWRITE_AVX */
>> +rewriteAVX :: Section -> Section
>> +rewriteAVX = id
>> +#endif /* !REWRITE_SSE */
>> +
>> -- | Reorder and convert sections so info tables end up next to the
>> -- code. Also does stack fixups.
>> fixTables :: [Section] -> [Section]
>>
>>
>>
>> _______________________________________________
>> ghc-commits mailing list
>> ghc-commits at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-commits
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs




Reply | Threaded
Open this post in threaded view
|

[commit: ghc] simd: Fixup stack spills when generating AVX instructions. (b787b5d)

David Terei
OK. Could you add some documentation to this page on the mangler I
just created please?

http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Backends/LLVM/Mangler

Just a sentence is enough, but I'd like to keep track of what we are
using the mangler for in a very readable form (i.e., more than the
source code).

Cheers,
David

On 15 February 2013 09:41, Geoffrey Mainland <mainland at apeiron.net> wrote:

> I thought I might provoke you with that patch :) Note that it is only on
> the simd branch at the moment.
>
> The stack-alignment problem I'm having and the TNTC problem are the same
> in one respect: either we mangle LLVM's assembly output, or we get
> patches into LLVM. In an ideal world, we would get patches into LLVM
> that solve both issues. My issue would be solved if we could patch LLVM
> so that llc could be told to use unaligned moves for SSE/AVX register
> spills. I also need to revisit the Win32 stack alignment issue.
>
> Geoff
>
> On 02/14/2013 11:49 PM, David Terei wrote:
>> Do we have a longer term solution for this? I love the SIMD work, so
>> please don't take this as a complaint, I agree with getting it done
>> for now with the mangler and truth is it's doubtful we'll et around to
>> adding TNTC support to LLVM anytime soon, so mangler is here to stay
>> for now.
>>
>> BUT. I would like to see some thought about how in an ideal world with
>> lots of free time to hack on GHC we'd kill the mangler. Do we know
>> that plan for this patch? If so it may be nice to document it
>> somewhere and perhaps create a trac ticket.
>>
>> On 14 February 2013 23:16, Geoffrey Mainland <gmainlan at microsoft.com>
> wrote:
>>> Repository : ssh://darcs.haskell.org//srv/darcs/ghc
>>>
>>> On branch : simd
>>>
>>>
> http://hackage.haskell.org/trac/ghc/changeset/b787b5d1e687fb28643cdd6a847ccc26bb014a79
>>>
>>>> ---------------------------------------------------------------
>>>
>>> commit b787b5d1e687fb28643cdd6a847ccc26bb014a79
>>> Author: Geoffrey Mainland <gmainlan at microsoft.com>
>>> Date: Sat Nov 26 12:45:23 2011 +0000
>>>
>>> Fixup stack spills when generating AVX instructions.
>>>
>>> LLVM uses aligned AVX moves to spill values onto the stack, which
> requires
>>> 32-bye aligned stacks. Since the stack in only 16-byte aligned, LLVM
> inserts
>>> extra instructions that munge the stack pointer. This is very very
> bad for the
>>> GHC calling convention, so we tell LLVM to assume the stack is 32-byte
>>> aligned. This patch rewrites the spill instructions that LLVM
> generates so they
>>> do not require an aligned stack.
>>>
>>>> ---------------------------------------------------------------
>>>
>>> compiler/llvmGen/LlvmMangler.hs | 38
> ++++++++++++++++++++++++++++++++++++++
>>> 1 files changed, 38 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/compiler/llvmGen/LlvmMangler.hs
> b/compiler/llvmGen/LlvmMangler.hs
>>> index 83a2be7..745dcc6 100644
>>> --- a/compiler/llvmGen/LlvmMangler.hs
>>> +++ b/compiler/llvmGen/LlvmMangler.hs
>>> @@ -20,6 +20,10 @@ import System.IO
>>> import Data.List ( sortBy )
>>> import Data.Function ( on )
>>>
>>> +#if x86_64_TARGET_ARCH
>>> +#define REWRITE_AVX
>>> +#endif
>>> +
>>> -- Magic Strings
>>> secStmt, infoSec, newLine, textStmt, dataStmt, syntaxUnified ::
> B.ByteString
>>> secStmt = B.pack "\t.section\t"
>>> @@ -47,6 +51,7 @@ llvmFixupAsm dflags f1 f2 = {-# SCC "llvm_mangler"
> #-} do
>>> w <- openBinaryFile f2 WriteMode
>>> ss <- readSections r w
>>> hClose r
>>> + let fixed = (map rewriteAVX . fixTables) ss
>>> let fixed = fixTables ss
>>> mapM_ (writeSection w) fixed
>>> hClose w
>>> @@ -90,6 +95,39 @@ writeSection w (hdr, cts) = do
>>> B.hPutStrLn w hdr
>>> B.hPutStrLn w cts
>>>
>>> +#if REWRITE_AVX
>>> +rewriteAVX :: Section -> Section
>>> +rewriteAVX = rewriteVmovaps . rewriteVmovdqa
>>> +
>>> +rewriteVmovdqa :: Section -> Section
>>> +rewriteVmovdqa = rewriteInstructions vmovdqa vmovdqu
>>> + where
>>> + vmovdqa, vmovdqu :: B.ByteString
>>> + vmovdqa = B.pack "vmovdqa"
>>> + vmovdqu = B.pack "vmovdqu"
>>> +
>>> +rewriteVmovap :: Section -> Section
>>> +rewriteVmovap = rewriteInstructions vmovap vmovup
>>> + where
>>> + vmovap, vmovup :: B.ByteString
>>> + vmovap = B.pack "vmovap"
>>> + vmovup = B.pack "vmovup"
>>> +
>>> +rewriteInstructions :: B.ByteString -> B.ByteString -> Section ->
> Section
>>> +rewriteInstructions matchBS replaceBS (hdr, cts) =
>>> + (hdr, loop cts)
>>> + where
>>> + loop :: B.ByteString -> B.ByteString
>>> + loop cts =
>>> + case B.breakSubstring cts matchBS of
>>> + (hd,tl) | B.null tl -> hd
>>> + | otherwise -> hd `B.append` replaceBS `B.append`
>>> + loop (B.drop (B.length matchBS) tl)
>>> +#else /* !REWRITE_AVX */
>>> +rewriteAVX :: Section -> Section
>>> +rewriteAVX = id
>>> +#endif /* !REWRITE_SSE */
>>> +
>>> -- | Reorder and convert sections so info tables end up next to the
>>> -- code. Also does stack fixups.
>>> fixTables :: [Section] -> [Section]
>>>
>>>
>>>
>>> _______________________________________________
>>> ghc-commits mailing list
>>> ghc-commits at haskell.org
>>> http://www.haskell.org/mailman/listinfo/ghc-commits
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>
>