LLVM and dynamic linking

Ben Gamari

# The Problem

Dynamic linking is currently broken with the LLVM code generator. This
can be easily seen by attempting to compile GHC with,

    GhcDynamic = YES
    DYNAMIC_BY_DEFAULT = YES
    DYNAMIC_GHC_PROGRAMS = YES
    BuildFlavour = quick-llvm

This build will fail with an error along the lines of,

    dll-split: internal error: invalid closure, info=0x402ec0
    (GHC version 7.7.20131212 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

After some poking around with the help of Peter Wortmann, it seems clear
that this is due to a subtle difference in how LLVM emits function
symbols. While the NCG emits these symbols with `.type @object`, LLVM
emits `.type @function`.

It appears that the `.type` annotation guides the linker in choosing the
relocation mechanism to use for the symbol. While `@object` symbols use
the Global Offset Table, `@function` symbols are relocated through the
Procedure Linkage Table, a table of trampoline calls which are fixed up
at runtime. This means that static references to functions end up
pointing not to the object itself (and its info table) but instead to
some linker-generated assembly. When the garbage collector attempts to
examine the info table of one of these references, it finds nonsense and
fails.

# A solution

Peter demonstrated that manually modifying the assembler produced by
llc, passing this through GHC's mangler, and assembling the result
yields a functional binary.

As far as I can tell, LLVM's intermediate language doesn't expose any
way to force a function to `.type @object`. Unfortunately this means
that, at least for now, the only fix is to augment the mangler with
logic to perform this transform. I've done this in my `llvm-dynamic`
branch[1] (in addition to finding a bug in the `rewriteInstructions`
function used by AVX rewriting).
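
The transform described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in the `llvm-dynamic` branch: the names `rewriteTypeDirective`, `replaceAll` and `mangle` are hypothetical, and the sketch assumes the x86 spelling `@function` (ARM assemblers write `%function`, which is not handled here).

```haskell
module Main where

import Data.List (isPrefixOf)

-- | Hypothetical helper: rewrite one line of assembler text. A `.type`
-- directive that marks a symbol as a function is changed to mark it as an
-- object; every other line is returned untouched.
rewriteTypeDirective :: String -> String
rewriteTypeDirective line
  | isTypeDirective line = replaceAll "@function" "@object" line
  | otherwise            = line
  where
    -- A directive line is ".type" preceded only by spaces or tabs.
    isTypeDirective l = ".type" `isPrefixOf` dropWhile (`elem` " \t") l

-- | Replace every occurrence of a non-empty needle in a string.
replaceAll :: String -> String -> String -> String
replaceAll needle replacement = go
  where
    go [] = []
    go s@(c:cs)
      | needle `isPrefixOf` s = replacement ++ go (drop (length needle) s)
      | otherwise             = c : go cs

-- | Apply the rewrite to a whole assembly file, line by line.
mangle :: String -> String
mangle = unlines . map rewriteTypeDirective . lines

main :: IO ()
main = putStr (mangle sample)
  where
    -- Made-up input resembling llc output for a single symbol.
    sample = unlines
      [ "\t.globl\tfoo_info"
      , "\t.type\tfoo_info,@function"
      , "foo_info:"
      , "\tjmpq\t*%rax"
      ]
```

Only `.type` lines are touched, so instruction operands that happen to contain the same text are left alone.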

This branch compiles on my x86_64 machine to produce what appears to be
a functional compiler. Unfortunately installation issues (which I'll
describe shortly in a new thread) prevent me from verifying this. I'm
currently waiting for a build on my ARM box, but assuming this fix works,
GHC could (finally) have first-class, stable ARM support.

Comments?

Cheers,

- Ben


[1] https://github.com/bgamari/ghc/tree/llvm-dynamic


Simon Marlow-7
This sounds right to me.  Did you submit a patch?

Note that dynamic linking with LLVM is likely to produce significantly
worse code than with the NCG right now, because the LLVM back end uses
dynamic references even for symbols in the same package, whereas the NCG
back-end uses direct static references for these.

Cheers,
Simon

On 14/12/2013 22:13, Ben Gamari wrote:

> [...]

Ben Gamari
Simon Marlow <marlowsd at gmail.com> writes:

> This sounds right to me.  Did you submit a patch?
>
Not yet, I'm currently fighting through some build system issues which
are preventing me from actually installing and testing the compiler on
my ARM box.

> Note that dynamic linking with LLVM is likely to produce significantly
> worse code than with the NCG right now, because the LLVM back end uses
> dynamic references even for symbols in the same package, whereas the NCG
> back-end uses direct static references for these.
>
Right. However it (hopefully) works on ARM which is more than I can say
about the NCG. Moreover, I'm hopeful that it will be possible to fix
LLVM's output.

Would this not simply be a matter of flagging package-local symbols with
LLVM's `private` linkage type[1]? In the case where you have references
both internal and external to the package could you not define two
overlapping symbols, one flagged with `private` and the other
`external`? Perhaps I'm missing a subtlety?

Cheers,

- Ben


[1] http://llvm.org/docs/LangRef.html#linkage-types

Ben Gamari
Simon Marlow <marlowsd at gmail.com> writes:

> This sounds right to me.  Did you submit a patch?
>
> Note that dynamic linking with LLVM is likely to produce significantly
> worse code than with the NCG right now, because the LLVM back end uses
> dynamic references even for symbols in the same package, whereas the NCG
> back-end uses direct static references for these.
>
Today with the help of Edward Yang I examined the code produced by the
LLVM backend in light of this statement. I was surprised to find that
LLVM's code appears to be no worse than the NCG with respect to
intra-package references.

My test case can be found here[2] and can be built with the included
`build.sh` script. The test consists of two modules built into a shared
library. One module, `LibTest`, exports a few simple members while the
other module (`LibTest2`) defines members that consume them. Care is
taken to ensure the members are not inlined.
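
For orientation, the shape of the test can be sketched in a single file. This is a stand-in rather than the code in the linked repository: the real test splits the definitions across `LibTest` and `LibTest2`, and the `NOINLINE` pragmas are an assumption, since the post does not say how inlining was prevented. The member names are those used in the examples that follow.

```haskell
module Main where

-- Stand-ins for the members exported by `LibTest`. The NOINLINE pragmas
-- are an assumption about how the test keeps GHC from inlining them.
{-# NOINLINE helloWorld #-}
helloWorld :: String
helloWorld = "Hello World!"

{-# NOINLINE infoRef #-}
infoRef :: Int -> Int
infoRef n = n + 1

-- Stand-ins for the consumers defined in `LibTest2`.
testHelloWorld :: IO String
testHelloWorld = return helloWorld

testInfoRef :: IO Int
testInfoRef = return (infoRef 2)

main :: IO ()
main = do
  testHelloWorld >>= putStrLn
  testInfoRef >>= print
```

Keeping the definitions in a separate module from their use sites is what makes the references cross a module boundary inside one package, which is the case under discussion.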

The tests were done on x86_64 running LLVM 3.4 and GHC HEAD with the
patches[1] I referred to in my last message. Please let me know if I've
missed something.



# Evaluation

## First example ##

The first member is a simple `String` (defined in `LibTest`),

    helloWorld :: String
    helloWorld = "Hello World!"

The use-site is quite straightforward,

    testHelloWorld :: IO String
    testHelloWorld = return helloWorld
   
With `-O1` the code looks reasonable in both cases. Most importantly,
both backends use IP relative addressing to find the symbol.

### LLVM ###

    0000000000000ef8 <rKw_info>:
         ef8: 48 8b 45 00           mov    0x0(%rbp),%rax
         efc: 48 8d 1d cd 11 20 00 lea    0x2011cd(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
         f03: ff e0                 jmpq   *%rax
   
    0000000000000f28 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
         f28: eb ce                 jmp    ef8 <rKw_info>
         f2a: 66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)

### NCG ###

    0000000000000d58 <rH1_info>:
     d58: 48 8d 1d 71 13 20 00 lea    0x201371(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
     d5f: ff 65 00             jmpq   *0x0(%rbp)
   
    0000000000000d88 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
     d88: eb ce                 jmp    d58 <rH1_info>


With `-O0` the code is substantially longer but the relocation behavior
is still correct, as one would expect.
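
The `# 2020d0` annotations in the `-O1` listings above can be checked by hand: a RIP-relative operand is resolved against the address of the next instruction. A small sketch of that arithmetic, using the two `lea` instructions from the listings (`ripTarget` is a name made up for this illustration):

```haskell
module Main where

import Numeric (showHex)

-- | Address referenced by a RIP-relative operand: the displacement is added
-- to the address of the *next* instruction, i.e. the instruction's own
-- address plus its encoded length.
ripTarget :: Integer -> Integer -> Integer -> Integer
ripTarget insnAddr insnLen disp = insnAddr + insnLen + disp

main :: IO ()
main = do
  -- LLVM: efc: 48 8d 1d cd 11 20 00   lea 0x2011cd(%rip),%rbx  (7 bytes)
  putStrLn (showHex (ripTarget 0xefc 7 0x2011cd) "")
  -- NCG:  d58: 48 8d 1d 71 13 20 00   lea 0x201371(%rip),%rbx  (7 bytes)
  putStrLn (showHex (ripTarget 0xd58 7 0x201371) "")
```

Both lines print `2020d0`, the address of `libtestzm0zi1zi0zi0_LibTest_helloWorld_closure`, so each backend reaches the closure with a single position-independent `lea` and no GOT or PLT indirection.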

Looking at the definition of `helloWorld`[3] itself it becomes clear that
the LLVM backend is more likely to use PLT relocations over GOT. In
general, `stg_*` primitives are called through the PLT. As far as I can
tell, both of these call mechanisms will incur two memory
accesses. However, in the case of the PLT the call will consist of two
JMPs whereas the GOT will consist of only one. Is this a cause for
concern? Could these two jumps interfere with prediction?

In general the LLVM backend produces a few more instructions than the
NCG, although this doesn't appear to be related to the handling of
relocations. One example is the (to me) inexplicable `mov` at the
beginning of LLVM's `rKw_info`.


## Second example ##

The second example demonstrates an actual call,

    -- Definition (in LibTest)
    infoRef :: Int -> Int
    infoRef n = n + 1

    -- Call site
    testInfoRef :: IO Int
    testInfoRef = return (infoRef 2)

With `-O1` this produces the following code,

### LLVM ###

    0000000000000fb0 <rLy_info>:
         fb0: 48 8b 45 00           mov    0x0(%rbp),%rax
         fb4: 48 8d 1d a5 10 20 00 lea    0x2010a5(%rip),%rbx        # 202060 <rLx_closure>
         fbb: ff e0                 jmpq   *%rax
   
    0000000000000fe0 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
         fe0: eb ce                 jmp    fb0 <rLy_info>

### NCG ###

    0000000000000e10 <rI3_info>:
     e10: 48 8d 1d 51 12 20 00 lea    0x201251(%rip),%rbx        # 202068 <rI2_closure>
     e17: ff 65 00             jmpq   *0x0(%rbp)
   
    0000000000000e40 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
     e40: eb ce                 jmp    e10 <rI3_info>
     
Again, LLVM is a bit more verbose but appears to handle
intra-package calls efficiently.



[1] https://github.com/bgamari/ghc/commits/llvm-dynamic
[2] https://github.com/bgamari/ghc-linking-tests/tree/master/ghc-test
[3] `helloWorld` definitions:

LLVM:
    00000000000010a8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
        10a8: 50                   push   %rax
        10a9: 4c 8d 75 f0           lea    -0x10(%rbp),%r14
        10ad: 4d 39 fe             cmp    %r15,%r14
        10b0: 73 07                 jae    10b9 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x11>
        10b2: 49 8b 45 f0           mov    -0x10(%r13),%rax
        10b6: 5a                   pop    %rdx
        10b7: ff e0                 jmpq   *%rax
        10b9: 4c 89 ef             mov    %r13,%rdi
        10bc: 48 89 de             mov    %rbx,%rsi
        10bf: e8 0c fd ff ff       callq  dd0 <newCAF@plt>
        10c4: 48 85 c0             test   %rax,%rax
        10c7: 74 22                 je     10eb <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x43>
        10c9: 48 8b 0d 18 0f 20 00 mov    0x200f18(%rip),%rcx        # 201fe8 <_DYNAMIC+0x228>
        10d0: 48 89 4d f0           mov    %rcx,-0x10(%rbp)
        10d4: 48 89 45 f8           mov    %rax,-0x8(%rbp)
        10d8: 48 8d 05 21 00 00 00 lea    0x21(%rip),%rax        # 1100 <cJC_str>
        10df: 4c 89 f5             mov    %r14,%rbp
        10e2: 49 89 c6             mov    %rax,%r14
        10e5: 58                   pop    %rax
        10e6: e9 b5 fc ff ff       jmpq   da0 <ghczmprim_GHCziCString_unpackCStringzh_info@plt>
        10eb: 48 8b 03             mov    (%rbx),%rax
        10ee: 5a                   pop    %rdx
        10ef: ff e0                 jmpq   *%rax


NCG:

    0000000000000ef8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
     ef8: 48 8d 45 f0           lea    -0x10(%rbp),%rax
     efc: 4c 39 f8             cmp    %r15,%rax
     eff: 72 3f                 jb     f40 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x48>
     f01: 4c 89 ef             mov    %r13,%rdi
     f04: 48 89 de             mov    %rbx,%rsi
     f07: 48 83 ec 08           sub    $0x8,%rsp
     f0b: b8 00 00 00 00       mov    $0x0,%eax
     f10: e8 1b fd ff ff       callq  c30 <newCAF@plt>
     f15: 48 83 c4 08           add    $0x8,%rsp
     f19: 48 85 c0             test   %rax,%rax
     f1c: 74 20                 je     f3e <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x46>
     f1e: 48 8b 1d cb 10 20 00 mov    0x2010cb(%rip),%rbx        # 201ff0 <_DYNAMIC+0x238>
     f25: 48 89 5d f0           mov    %rbx,-0x10(%rbp)
     f29: 48 89 45 f8           mov    %rax,-0x8(%rbp)
     f2d: 4c 8d 35 1c 00 00 00 lea    0x1c(%rip),%r14        # f50 <cGG_str>
     f34: 48 83 c5 f0           add    $0xfffffffffffffff0,%rbp
     f38: ff 25 7a 10 20 00     jmpq   *0x20107a(%rip)        # 201fb8 <_DYNAMIC+0x200>
     f3e: ff 23                 jmpq   *(%rbx)
     f40: 41 ff 65 f0           jmpq   *-0x10(%r13)

Carter Schonwald
great work! :)


On Fri, Dec 27, 2013 at 3:21 PM, Ben Gamari <bgamari.foss at gmail.com> wrote:

> [...]

Aaron Friel
Replying to include the email list. You're right, the LLVM backend and the GMP licensing issues are orthogonal - or should be. The problem is I get build errors when trying to build GHC with LLVM and dynamic libraries.

The result is that I get a few different choices when producing a platform image for development, with some uncomfortable tradeoffs:

  1. LLVM-built GHC, dynamic libs - doesn't build.
  2. LLVM-built GHC, static libs - potential licensing oddities with me shipping a statically linked ghc binary that is now GPLed. I am not a lawyer, but the situation makes me uncomfortable.
  3. GCC/ASM-built GHC, dynamic libs - this is the *standard* for most platforms shipping ghc binaries, but it means that one of the biggest and most critical users of the LLVM backend is neglecting it. It also bifurcates development resources for GHC. Optimization work is duplicated, and devs are already getting into the uncomfortable position of suggesting to users that they should trust GHC to build their programs in a particular way, but not itself.
  4. GCC/ASM-built GHC, static libs - worst of all possible worlds.

Because of this, the libgmp and llvm-backend issues aren't entirely orthogonal. Trac ticket #7885 is exactly the issue I get when trying to build option 1.

From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: Monday, December 30, 2013 1:05 PM
To: Aaron Friel <aaron at frieltek.com>

Good question but you forgot to email the mailing list too :-)

Using LLVM has nothing to do with GMP. Use the native code gen (it's simpler) and integer-simple.

That said, standard ghc dylinks to a system copy of GMP anyways (I think). Building ghc as a dylib is orthogonal.

-Carter

On Dec 30, 2013, at 1:58 PM, Aaron Friel <aaron at frieltek.com> wrote:

Excellent research - I'm curious if this is the right thread to inquire about the status of trying to link GHC itself dynamically.

I've been attempting to do so with various LLVM versions (3.2, 3.3, 3.4) using snapshot builds of GHC (within the past week) from git, and I hit ticket #7885 [https://ghc.haskell.org/trac/ghc/ticket/7885] every time (even the exact same error message).

I'm interested in dynamically linking GHC with LLVM to avoid the entanglement with libgmp's license.

If this is the wrong thread or if I should reply instead to the trac item, please let me know.

From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: Friday, December 27, 2013 2:41 PM
To: Ben Gamari <bgamari.foss at gmail.com>
Cc: ghc-devs at haskell.org

great work! :)


On Fri, Dec 27, 2013 at 3:21 PM, Ben Gamari <bgamari.foss at gmail.com<mailto:bgamari.foss at gmail.com>> wrote:
Simon Marlow <marlowsd at gmail.com<mailto:marlowsd at gmail.com>> writes:

> This sounds right to me.  Did you submit a patch?
>
> Note that dynamic linking with LLVM is likely to produce significantly
> worse code that with the NCG right now, because the LLVM back end uses
> dynamic references even for symbols in the same package, whereas the NCG
> back-end uses direct static references for these.
>
Today with the help of Edward Yang I examined the code produced by the
LLVM backend in light of this statement. I was surprised to find that
LLVM's code appears to be no worse than the NCG with respect to
intra-package references.

My test case can be found here[2] and can be built with the included
`build.sh` script. The test consists of two modules build into a shared
library. One module, `LibTest`, exports a few simple members while the
other module (`LibTest2`) defines members that consume them. Care is
taken to ensure the members are not inlined.

The tests were done on x86_64 running LLVM 3.4 and GHC HEAD with the
patches[1] I referred to in my last message. Please let me know if I've
missed something.



# Evaluation

## First example ##

The first member is a simple `String` (defined in `LibTest`),

    helloWorld :: String
    helloWorld = "Hello World!"

The use-site is quite straightforward,

    testHelloWorld :: IO String
    testHelloWorld = return helloWorld

With `-O1` the code looks reasonable in both cases. Most importantly,
both backends use IP relative addressing to find the symbol.

### LLVM ###

    0000000000000ef8 <rKw_info>:
         ef8:   48 8b 45 00             mov    0x0(%rbp),%rax
         efc:   48 8d 1d cd 11 20 00    lea    0x2011cd(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
         f03:   ff e0                   jmpq   *%rax

    0000000000000f28 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
         f28:   eb ce                   jmp    ef8 <rKw_info>
         f2a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

### NCG ###

    0000000000000d58 <rH1_info>:
     d58:       48 8d 1d 71 13 20 00    lea    0x201371(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
     d5f:       ff 65 00                jmpq   *0x0(%rbp)

    0000000000000d88 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
     d88:       eb ce                   jmp    d58 <rH1_info>


With `-O0` the code is substantially longer but the relocation behavior
is still correct, as one would expect.

Looking at the definition of `helloWorld`[3] itself it becomes clear that
the LLVM backend is more likely to use PLT relocations over GOT. In
general, `stg_*` primitives are called through the PLT. As far as I can
tell, both of these call mechanisms will incur two memory
accesses. However, in the case of the PLT the call will consist of two
JMPs whereas the GOT will consist of only one. Is this a cause for
concern? Could these two jumps interfere with prediction?

In general the LLVM backend produces a few more instructions than the
NCG although this doesn't appear to be related to handling of
relocations. For instance, the inexplicable (to me) `mov` at the
beginning of LLVM's `rKw_info`.


## Second example ##

The second example demonstrates an actual call,

    -- Definition (in LibTest)
    infoRef :: Int -> Int
    infoRef n = n + 1

    -- Call site
    testInfoRef :: IO Int
    testInfoRef = return (infoRef 2)

With `-O1` this produces the following code,

### LLVM ###

    0000000000000fb0 <rLy_info>:
         fb0:   48 8b 45 00             mov    0x0(%rbp),%rax
         fb4:   48 8d 1d a5 10 20 00    lea    0x2010a5(%rip),%rbx        # 202060 <rLx_closure>
         fbb:   ff e0                   jmpq   *%rax

    0000000000000fe0 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
         fe0:   eb ce                   jmp    fb0 <rLy_info>

### NCG ###

    0000000000000e10 <rI3_info>:
     e10:       48 8d 1d 51 12 20 00    lea    0x201251(%rip),%rbx        # 202068 <rI2_closure>
     e17:       ff 65 00                jmpq   *0x0(%rbp)

    0000000000000e40 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
     e40:       eb ce                   jmp    e10 <rI3_info>

Again, it seems that LLVM is a bit more verbose but seems to handle
intra-package calls efficiently.



[1] https://github.com/bgamari/ghc/commits/llvm-dynamic
[2] https://github.com/bgamari/ghc-linking-tests/tree/master/ghc-test
[3] `helloWorld` definitions:

LLVM:
    00000000000010a8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
        10a8:   50                      push   %rax
        10a9:   4c 8d 75 f0             lea    -0x10(%rbp),%r14
        10ad:   4d 39 fe                cmp    %r15,%r14
        10b0:   73 07                   jae    10b9 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x11>
        10b2:   49 8b 45 f0             mov    -0x10(%r13),%rax
        10b6:   5a                      pop    %rdx
        10b7:   ff e0                   jmpq   *%rax
        10b9:   4c 89 ef                mov    %r13,%rdi
        10bc:   48 89 de                mov    %rbx,%rsi
        10bf:   e8 0c fd ff ff          callq  dd0 <newCAF at plt>
        10c4:   48 85 c0                test   %rax,%rax
        10c7:   74 22                   je     10eb <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x43>
        10c9:   48 8b 0d 18 0f 20 00    mov    0x200f18(%rip),%rcx        # 201fe8 <_DYNAMIC+0x228>
        10d0:   48 89 4d f0             mov    %rcx,-0x10(%rbp)
        10d4:   48 89 45 f8             mov    %rax,-0x8(%rbp)
        10d8:   48 8d 05 21 00 00 00    lea    0x21(%rip),%rax        # 1100 <cJC_str>
        10df:   4c 89 f5                mov    %r14,%rbp
        10e2:   49 89 c6                mov    %rax,%r14
        10e5:   58                      pop    %rax
        10e6:   e9 b5 fc ff ff          jmpq   da0 <ghczmprim_GHCziCString_unpackCStringzh_info at plt>
        10eb:   48 8b 03                mov    (%rbx),%rax
        10ee:   5a                      pop    %rdx
        10ef:   ff e0                   jmpq   *%rax


NCG:

    0000000000000ef8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
     ef8:       48 8d 45 f0             lea    -0x10(%rbp),%rax
     efc:       4c 39 f8                cmp    %r15,%rax
     eff:       72 3f                   jb     f40 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x48>
     f01:       4c 89 ef                mov    %r13,%rdi
     f04:       48 89 de                mov    %rbx,%rsi
     f07:       48 83 ec 08             sub    $0x8,%rsp
     f0b:       b8 00 00 00 00          mov    $0x0,%eax
     f10:       e8 1b fd ff ff          callq  c30 <newCAF@plt>
     f15:       48 83 c4 08             add    $0x8,%rsp
     f19:       48 85 c0                test   %rax,%rax
     f1c:       74 20                   je     f3e <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x46>
     f1e:       48 8b 1d cb 10 20 00    mov    0x2010cb(%rip),%rbx        # 201ff0 <_DYNAMIC+0x238>
     f25:       48 89 5d f0             mov    %rbx,-0x10(%rbp)
     f29:       48 89 45 f8             mov    %rax,-0x8(%rbp)
     f2d:       4c 8d 35 1c 00 00 00    lea    0x1c(%rip),%r14        # f50 <cGG_str>
     f34:       48 83 c5 f0             add    $0xfffffffffffffff0,%rbp
     f38:       ff 25 7a 10 20 00       jmpq   *0x20107a(%rip)        # 201fb8 <_DYNAMIC+0x200>
     f3e:       ff 23                   jmpq   *(%rbx)
     f40:       41 ff 65 f0             jmpq   *-0x10(%r13)
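
A note for reading the two listings above: the addresses objdump prints
after `#` and inside `callq`/`jmpq` operands are the 32-bit displacement
added to the address of the *following* instruction. The sketch below is
not part of the original test repository; `ripTarget` and `le32` are
hypothetical helper names, and the instruction lengths are simply counted
from the byte columns of the listings.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Int (Int32)
import Data.Word (Word8, Word32, Word64)
import Numeric (showHex)

-- | Decode a little-endian 32-bit displacement as it appears in the
-- byte column of the disassembly (hypothetical helper).
le32 :: [Word8] -> Int32
le32 bytes = fromIntegral (foldr step 0 bytes :: Word32)
  where
    step b acc = (acc `shiftL` 8) .|. fromIntegral b

-- | Address referred to by a rip-relative operand or rel32 branch:
-- instruction address + encoded length + signed displacement.
ripTarget :: Word64 -> Word64 -> Int32 -> Word64
ripTarget addr len disp = addr + len + fromIntegral disp

main :: IO ()
main = do
  let hex x = "0x" ++ showHex x ""
  -- NCG   f1e:  48 8b 1d cb 10 20 00   mov 0x2010cb(%rip),%rbx
  putStrLn (hex (ripTarget 0xf1e 7 (le32 [0xcb, 0x10, 0x20, 0x00])))   -- 0x201ff0
  -- LLVM  10c9: 48 8b 0d 18 0f 20 00   mov 0x200f18(%rip),%rcx
  putStrLn (hex (ripTarget 0x10c9 7 (le32 [0x18, 0x0f, 0x20, 0x00])))  -- 0x201fe8
  -- LLVM  10bf: e8 0c fd ff ff         callq dd0 (newCAF PLT stub)
  putStrLn (hex (ripTarget 0x10bf 5 (le32 [0x0c, 0xfd, 0xff, 0xff])))  -- 0xdd0
  -- NCG   f38:  ff 25 7a 10 20 00      jmpq *0x20107a(%rip)
  putStrLn (hex (ripTarget 0xf38 6 (le32 [0x7a, 0x10, 0x20, 0x00])))   -- 0x201fb8
```

Read this way, the final jump differs between the two listings: the NCG
jumps indirectly through a slot at 0x201fb8, while the LLVM output jumps
to a PLT stub at 0xda0, which then has to perform a second jump of its
own.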

LLVM and dynamic linking

Carter Schonwald
7.8 should have working dylib support on the llvm backend. (I believe some
of the relevant patches are in head already, though Ben Gamari can opine on
that)

why do you want ghc to be built with llvm? (I know I've tried myself in the
past, and it should soon be doable to build 7.8 with llvm, using 7.8 as the
bootstrap compiler)


LLVM and dynamic linking

Aaron Friel
Because I think it's going to be an organizational issue and a duplication of effort if GHC is built one way but the future direction of LLVM is another.

Imagine if GCC started developing a new engine and it didn't work with one of the biggest, most regular consumers of GCC. Say, the Linux kernel, or itself. At first, the situation is optimistic - if this engine doesn't work for the project that has the smartest, brightest GCC hackers potentially looking at it, then it should fix itself soon enough. Suppose the situation lingers though, and continues for months without fix. The new GCC backend starts to become the default, and the community around GCC advocates for end-users to use it to optimize code for their projects and it even becomes the default for some platforms, such as ARM.

What I've described is analogous to the GHC situation - and the result is that GHC isn't self-hosting on some platforms and the inertia that used to be behind the LLVM backend seems to have stagnated. Whereas LLVM used to be the "new hotness", I've noticed that issues like Trac #7787 no longer have a lot of eyes on them and externally it seems like GHC has accepted a bifurcated approach for development.

I dramatize the situation above, but there's some truth to it. The LLVM backend needs some care and attention and if the majority of GHC devs can't build GHC with LLVM, then that means the smartest, brightest GHC hackers won't have their attention turned toward fixing those problems. If a patch to GHC-HEAD broke compilation for every backend, it would be fixed in short order. If a new version of GCC did not work with GHC, I can imagine it would be only hours before the first patches came in resolving the issue. On OS X Mavericks, an incompatibility with GHC has led to a swift reaction and strong support for resolving platform issues. The attention to the LLVM backend is visibly smaller, but I don't know enough about the people working on GHC to know if it is actually smaller.

The way I am trying to change this is by making it easier for people to start using GHC (by putting images on Docker.io) and, in the process, learning about GHC's build process and trying to make things work for my own projects. The Docker image allows anyone with a Linux kernel to build and play with GHC HEAD. The information about building GHC yourself is difficult to approach and I found it hard to get started, and I want to improve that too, so I'm learning and asking questions.

LLVM and dynamic linking

Carter Schonwald
well, please feel welcome to ask for help as much as you need! To repeat:
if you use ghc HEAD, it should be doable to build GHC head (using head as
the bootstrap compiler) using LLVM. Once Ben's llvm dynamic linking patches
land, you should be able to do both dynamic and static linking with llvm.

As for your Mavericks example, if you review ghc trac and the mailing lists
plus irc logs, it took the effort of several folks spread over several
months to make sure that once Mavericks / Xcode 5 landed, it would be
"easy" to fix.

that said, there's no need to take such a polarizing tone, with speculations
about the priorities of the various GHC devs. We're all volunteers (ok,
there are some who are paid volunteers) who care about making sure ghc works
as well as possible for everyone, but have finite time in the day, and there
are so many different ways that ghc can be made better. (and in many cases,
we have a day job that needs attention too).

please test things and holler when they don't work, and if you can debug
problems and cook up good patches, great!

in the case of llvm and dynamic linking, the root cause was actually pretty
darn subtle, and I'm immensely grateful that Ben Gamari got to the root of
it. (I'd definitely hit the problem myself, and I was absolutely stumped
when I tried to investigate it.)


LLVM and dynamic linking

Aaron Friel
I eagerly look forward to these patches, and I hope they are able to land in time for the 7.8 release as well. Do you have any additional information on them - or are they part of a branch I could look at?

And I apologize for the polarizing tone - I'm overdramatizing the situation and I'm new to following GHC at the root (or head, whichever). Regardless, the LLVM dynamic linking issue has popped up now and again (there are a good number of trac issues) and I'm eager to see that GHC is able to be built properly with it and have it stay working.

I have no doubt the issues Ben and others have been working with are subtle and complex. There are absolutely brilliant people here working on GHC, so any problem left unsolved is bound to be uniquely difficult.

From: Carter Schonwald<mailto:carter.schonwald at gmail.com>
Sent: ?Wednesday?, ?January? ?1?, ?2014 ?9?:?53? ?PM
To: Aaron Friel<mailto:aaron at frieltek.com>
Cc: ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>

well, please feel welcome to ask for help as much as you need! To repeat: if you use ghc HEAD, it should be doable to build GHC head (using head as the bootstrap compiler) using LLVM. Once Ben's llvm dy linking patches land, you should be able to do both dynamic and static linking with  llvm.

As for your Mavericks example, if you review ghc trac and the mailing lists plus irc logs, it took the effort of several folks spread over several months to make sure that once Mavericks / Xcode 5 landed, that it would be "easy" to fix.

that said, theres no need to take such a polarizing tone, with speculations about the priorities of the various GHC devs. We're all volunteers  (ok, theres a some who are paid volunteers) who care about making sure ghc works as well as possible for everyone, but have finite time in the day, and so many different ways to ghc can be made better. (and in many cases, have a day job that also needs attention too).

please test things and holler when they don't work, and if you can debug problems and cook up good patches, great!

in the case of llvm and dynamic linking, the root cause was actually pretty darn subtle, and I'm immensely grateful that Ben Gamari got to the root of it. (I'd definitely hit the problem myself, and I was absolutely stumped when I tried to investigate it.)


On Wed, Jan 1, 2014 at 10:03 PM, Aaron Friel <aaron at frieltek.com<mailto:aaron at frieltek.com>> wrote:
Because I think it?s going to be an organizational issue and a duplication of effort if GHC is built one way but the future direction of LLVM is another.

Imagine if GCC started developing a new engine and it didn?t work with one of the biggest, most regular consumers of GCC. Say, the Linux kernel, or itself. At first, the situation is optimistic - if this engine doesn?t work for the project that has the smartest, brightest GCC hackers potentially looking at it, then it should fix itself soon enough. Suppose the situation lingers though, and continues for months without fix. The new GCC backend starts to become the default, and the community around GCC advocates for end-users to use it to optimize code for their projects and it even becomes the default for some platforms, such as ARM.

What I?ve described is analogous to the GHC situation - and the result is that GHC isn?t self-hosting on some platforms and the inertia that used to be behind the LLVM backend seems to have stagnated. Whereas LLVM used to be the ?new hotness?, I?ve noticed that issues like Trac #7787 no longer have a lot of eyes on them and externally it seems like GHC has accepted a bifurcated approach for development.

I dramatize the situation above, but there?s some truth to it. The LLVM backend needs some care and attention and if the majority of GHC devs can?t build GHC with LLVM, then that means the smartest, brightest GHC hackers won?t have their attention turned toward fixing those problems. If a patch to GHC-HEAD broke compilation for every backend, it would be fixed in short order. If a new version of GCC did not work with GHC, I can imagine it would be only hours before the first patches came in resolving the issue. On OS X Mavericks, an incompatibility with GHC has led to a swift reaction and strong support for resolving platform issues. The attention to the LLVM backend is visibly smaller, but I don?t know enough about the people working on GHC to know if it is actually smaller.

The way I am trying to change this is by making it easier for people to start using GHC (by putting images on Docker.io) and, in the process, learning about GHC?s build process and trying to make things work for my own projects. The Docker image allows anyone with a Linux kernel to build and play with GHC HEAD. The information about building GHC yourself is difficult to approach and I found it hard to get started, and I want to improve that too, so I?m learning and asking questions.

From: Carter Schonwald<mailto:carter.schonwald at gmail.com>
Sent: ?Wednesday?, ?January? ?1?, ?2014 ?5?:?54? ?PM
To: Aaron Friel<mailto:aaron at frieltek.com>
Cc: ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>

7.8 should have working dylib support on the llvm backend. (i believe some of the relevant patches are in head already, though Ben Gamari can opine on that)

why do you want ghc to be built with llvm? (i know i've tried myself in the past, and it should be doable with 7.8 using 7.8 soon too)


On Wed, Jan 1, 2014 at 5:38 PM, Aaron Friel <aaron at frieltek.com> wrote:
Replying to include the email list. You're right, the llvm backend and the gmp licensing issues are orthogonal - or should be. The problem is I get build errors when trying to build GHC with LLVM and dynamic libraries.

The result is that I get a few different choices when producing a platform image for development, with some uncomfortable tradeoffs:

  1. LLVM-built GHC, dynamic libs - doesn't build.
  2. LLVM-built GHC, static libs - potential licensing oddities with me shipping a statically linked ghc binary that is now gpled. I am not a lawyer, but the situation makes me uncomfortable.
  3. GCC/ASM-built GHC, dynamic libs - this is the *standard* for most platforms shipping ghc binaries, but it means that one of the biggest and most critical users of the LLVM backend is neglecting it. It also bifurcates development resources for GHC. Optimization work is duplicated and already devs are getting into the uncomfortable position of suggesting to users that they should trust GHC to build your programs in a particular way, but not itself.
  4. GCC/ASM-built GHC, static libs - worst of all possible worlds.

Because of this, the libgmp and llvm-backend issues aren't entirely orthogonal. Trac ticket #7885 is exactly the issue I get when trying to compile #1.

From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: Monday, December 30, 2013 1:05 PM
To: Aaron Friel <aaron at frieltek.com>

Good question but you forgot to email the mailing list too :-)

Using llvm has nothing to do with Gmp. Use the native code gen (it's simpler) and integer-simple.

That said, standard ghc dylinks to a system copy of Gmp anyways (I think). Building ghc as a Dylib is orthogonal.

-Carter

On Dec 30, 2013, at 1:58 PM, Aaron Friel <aaron at frieltek.com> wrote:

Excellent research - I'm curious if this is the right thread to inquire about the status of trying to link GHC itself dynamically.

I've been attempting to do so with various LLVM versions (3.2, 3.3, 3.4) using snapshot builds of GHC (within the past week) from git, and I hit ticket #7885 [https://ghc.haskell.org/trac/ghc/ticket/7885] every time (even the exact same error message).

I'm interested in dynamically linking GHC with LLVM to avoid the entanglement with libgmp's license.

If this is the wrong thread or if I should reply instead to the trac item, please let me know.

LLVM and dynamic linking

Carter Schonwald
You can try it out yourself pretty easily; it's linked from the master ticket on
this: https://ghc.haskell.org/trac/ghc/ticket/4210#comment:27
Ben's ghc repo is at
https://github.com/bgamari/ghc/compare/llvm-intra-package
(NB: it's a work in progress of his.)


On Thu, Jan 2, 2014 at 2:31 AM, Aaron Friel <aaron at frieltek.com> wrote:

>  I eagerly look forward to these patches, I hope they are able to land in
> time for the 7.8 release as well. Do you have any additional information on
> them - or is it part of a branch I could look at?
>
>  And I apologize for the polarizing tone - I'm overdramatizing the
> situation and I'm new to following GHC at the root (or head, whichever).
> Regardless, the LLVM dynamic linking issue has popped up now and again
> (there are a good number of trac issues) and I'm eager to see that GHC is
> able to be built properly with it and have it stay working.
>
>  I have no doubt the issues Ben and others have been working with are
> subtle and complex. There are absolutely brilliant people here working on
> GHC, so any problem left unsolved is bound to be uniquely difficult.
>
>  *From:* Carter Schonwald <carter.schonwald at gmail.com>
> *Sent:* Wednesday, January 1, 2014 9:53 PM
>
> *To:* Aaron Friel <aaron at frieltek.com>
> *Cc:* ghc-devs at haskell.org
>
>  well, please feel welcome to ask for help as much as you need! To
> repeat: if you use ghc HEAD, it should be doable to build GHC head (using
> head as the bootstrap compiler) using LLVM. Once Ben's LLVM dynamic linking
> patches land, you should be able to do both dynamic and static linking with
> llvm.
>
>  As for your Mavericks example, if you review ghc trac and the mailing
> lists plus irc logs, it took the effort of several folks spread over
> several months to make sure that once Mavericks / Xcode 5 landed, it
> would be "easy" to fix.
>
>  that said, there's no need to take such a polarizing tone, with
> speculations about the priorities of the various GHC devs. We're all
> volunteers (ok, there are some who are paid volunteers) who care about
> making sure ghc works as well as possible for everyone, but have finite
> time in the day, and so many different ways ghc can be made better (and
> in many cases, a day job that also needs attention).
>
>  please test things and holler when they don't work, and if you can debug
> problems and cook up good patches, great!
>
>  in the case of llvm and dynamic linking, the root cause was actually
> pretty darn subtle, and I'm immensely grateful that Ben Gamari got to the
> root of it. (I'd definitely hit the problem myself, and I was absolutely
> stumped when I tried to investigate it.)
>
>

LLVM and dynamic linking

Simon Peyton Jones
In reply to this post by Aaron Friel
Aaron,

> The LLVM backend needs some care and attention

I'm sure you are right about this. Could you become one of the people offering that care and attention? Who are the GHC developers? They are simply volunteers who make time to give something back to their community, and GHC relies absolutely on their commitment and expertise. So do please join in if you can; it's clearly something you care about, and have some knowledge of.

With thanks and best wishes,

Simon


LLVM and dynamic linking

Aaron Friel
I am eager to learn and try to work on this :)


LLVM and dynamic linking

Austin Seipp-5
In reply to this post by Aaron Friel
Hi all,

Apologies for the late reply.

First off, one thing to note wrt GMP: GMP is an LGPL library which we
link against. Technically, we need to allow relinking to be compliant
and free of the LGPL for our own executables, but this should be
reasonably possible - on systems where there is a system-wide GMP
installed, we use that copy (this occurs mostly on OSX and Linux.) And
so do executables compiled by GHC. Even when GHC uses static linking
or dynamic linking for haskell code in this case, it will still always
dynamically link to libgmp - meaning replacing the shared object
should be possible. This is just the way modern Linux/OSX systems
distribute system-wide C libraries, as you expect.

In the case where we don't have this, we build our own copy of libgmp
inside the source tree and use that instead. That said there are other
reasons why we might want to be free of GMP entirely, but that's
neither here nor there. In any case, the issue is pretty orthogonal to
LLVM, dynamic haskell linking, etc - on a Linux system, you should
reasonably be able to swap out a `libgmp.so` for another modified
copy[1], and your Haskell programs should be compliant in this
regard.[2]
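
A minimal sketch of how one might check this for a particular executable, assuming a Linux system where `ldd` is available. The helper names `find_dynamic_gmp` and `ldd_output_for` are hypothetical, and the parser only inspects the text that `ldd` prints; it says nothing about licensing itself.

```python
import re
import subprocess

# Matches an `ldd` line such as:
#   libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f...)
# and also the unresolved form "libgmp.so.10 => not found".
_GMP_LINE = re.compile(
    r"^\s*(libgmp\.so(?:\.\d+)*)\s+=>\s+(.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$"
)


def find_dynamic_gmp(ldd_output):
    """Return (soname, resolved path) for libgmp, or None if it is not listed."""
    for line in ldd_output.splitlines():
        match = _GMP_LINE.match(line)
        if match:
            return match.group(1), match.group(2)
    return None


def ldd_output_for(binary_path):
    """Run `ldd` on a binary and return its textual output (Linux only)."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `find_dynamic_gmp(ldd_output_for("./my-program"))` (the path is a placeholder) on a GHC-built binary would be expected to return a pair when libgmp is a shared dependency, and `None` when it is absent, for example in an integer-simple build.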

Now, as for LLVM.

For one, LLVM actually is a 'relatively' cheap backend to have around.
I say LLVM is 'relatively' cheap because All External Dependencies
Have A Cost. The code is reasonably small, and in any case GHC still
does most of the heavy lifting - the LLVM backend and native code
generator share a very large amount of code. We don't really duplicate
optimizations ourselves, for example, and some optimizations we do
perform on our IR can't be done by LLVM anyway (it doesn't have enough
information.)

But LLVM has some very notable costs for GHC developers:

  * It's slower to compile with, because it tries to re-optimize the
code we give it, but it mostly accomplishes nothing beyond advanced
optimizations like vectorization/scalar evolution.
  * We support a wide range of LLVM versions (a nightmare IMO) which
means pinning down specific versions and supporting them all is rather
difficult. Combined with e.g. distro maintainers who may patch bugs
themselves, and the things you're depending on in the wild (or what
users might report bugs with) aren't as solid or well understood.
  * LLVM is extremely large, extremely complex, and the people who
can sensibly work on both GHC and LLVM are few and far between. So
fixing these issues is time consuming, difficult, and mostly tedious
grunt work.

All this basically sums up to the fact that dealing with LLVM comes
with complications all on its own that makes it a different kind of
beast to handle.

So, the LLVM backend definitely needs some love. All of these things
are solvable (and I have some ideas for solving most of them), but
none of them will quite come for free. But I think there are some real
improvements that can be made here, ones that would make LLVM much more
smoothly supported for GHC itself. If you'd like to help it'd be
really appreciated - I'd like to see LLVM have more love put forth,
but it's a lot of work of course!

(Finally, in reference to the last point: I am in the obvious
minority, but I am favorable to having the native code generator
around, even if it's a bit old and crufty these days - at least it's
small, fast and simple enough to be grokked and hacked on, and I don't
think it fragments development all that much. By comparison, LLVM is a
mammoth beast of incredible size with a sizeable entry barrier IMO. I
think there's merit to having both a simple, 'obviously working'
option in addition to the heavy duty one.)

[1] Relevant tool: http://nixos.org/patchelf.html
[2] Of course, IANAL, but there you go.

On Wed, Jan 1, 2014 at 9:03 PM, Aaron Friel <aaron at frieltek.com> wrote:

> Because I think it?s going to be an organizational issue and a duplication
> of effort if GHC is built one way but the future direction of LLVM is
> another.
>
> Imagine if GCC started developing a new engine and it didn?t work with one
> of the biggest, most regular consumers of GCC. Say, the Linux kernel, or
> itself. At first, the situation is optimistic - if this engine doesn?t work
> for the project that has the smartest, brightest GCC hackers potentially
> looking at it, then it should fix itself soon enough. Suppose the situation
> lingers though, and continues for months without fix. The new GCC backend
> starts to become the default, and the community around GCC advocates for
> end-users to use it to optimize code for their projects and it even becomes
> the default for some platforms, such as ARM.
>
> What I?ve described is analogous to the GHC situation - and the result is
> that GHC isn?t self-hosting on some platforms and the inertia that used to
> be behind the LLVM backend seems to have stagnated. Whereas LLVM used to be
> the ?new hotness?, I?ve noticed that issues like Trac #7787 no longer have a
> lot of eyes on them and externally it seems like GHC has accepted a
> bifurcated approach for development.
>
> I dramatize the situation above, but there?s some truth to it. The LLVM
> backend needs some care and attention and if the majority of GHC devs can?t
> build GHC with LLVM, then that means the smartest, brightest GHC hackers
> won?t have their attention turned toward fixing those problems. If a patch
> to GHC-HEAD broke compilation for every backend, it would be fixed in short
> order. If a new version of GCC did not work with GHC, I can imagine it would
> be only hours before the first patches came in resolving the issue. On OS X
> Mavericks, an incompatibility with GHC has led to a swift reaction and
> strong support for resolving platform issues. The attention to the LLVM
> backend is visibly smaller, but I don?t know enough about the people working
> on GHC to know if it is actually smaller.
>
> The way I am trying to change this is by making it easier for people to
> start using GHC (by putting images on Docker.io) and, in the process,
> learning about GHC?s build process and trying to make things work for my own
> projects. The Docker image allows anyone with a Linux kernel to build and
> play with GHC HEAD. The information about building GHC yourself is difficult
> to approach and I found it hard to get started, and I want to improve that
> too, so I?m learning and asking questions.
>
> From: Carter Schonwald
> Sent: ?Wednesday?, ?January? ?1?, ?2014 ?5?:?54? ?PM
> To: Aaron Friel
> Cc: ghc-devs at haskell.org
>
> 7.8 should have working dylib support on the llvm backend. (i believe some
> of the relevant patches are in head already, though Ben Gamari can opine on
> that)
>
> why do you want ghc to be built with llvm? (i know i've tried myself in the
> past, and it should be doable with 7.8 using 7.8 soon too)
>
>
> On Wed, Jan 1, 2014 at 5:38 PM, Aaron Friel <aaron at frieltek.com> wrote:
>>
>> Replying to include the email list. You?re right, the llvm backend and the
>> gmp licensing issues are orthogonal - or should be. The problem is I get
>> build errors when trying to build GHC with LLVM and dynamic libraries.
>>
>> The result is that I get a few different choices when producing a platform
>> image for development, with some uncomfortable tradeoffs:
>>
> >> 1. LLVM-built GHC, dynamic libs - doesn't build.
> >> 2. LLVM-built GHC, static libs - potential licensing oddities with me
> >> shipping a statically linked ghc binary that is now GPLed. I am not a
> >> lawyer, but the situation makes me uncomfortable.
> >> 3. GCC/ASM-built GHC, dynamic libs - this is the *standard* for most
> >> platforms shipping ghc binaries, but it means that one of the biggest and
> >> most critical users of the LLVM backend is neglecting it. It also bifurcates
> >> development resources for GHC. Optimization work is duplicated and already
> >> devs are getting into the uncomfortable position of suggesting to users that
> >> they should trust GHC to build their programs in a particular way, but not
> >> itself.
> >> 4. GCC/ASM-built GHC, static libs - worst of all possible worlds.
>>
>>
> >> Because of this, the libgmp and llvm-backend issues aren't entirely
>> orthogonal. Trac ticket #7885 is exactly the issue I get when trying to
>> compile #1.
>>
>> From: Carter Schonwald
> >> Sent: Monday, December 30, 2013 1:05 PM
>> To: Aaron Friel
>>
>> Good question but you forgot to email the mailing list too :-)
>>
>> Using llvm has nothing to do with Gmp. Use the native code gen (it's
> >> simpler) and integer-simple.
>>
> >> That said, standard ghc dylinks to a system copy of Gmp anyways (I think).
> >> Building ghc as a Dylib is orthogonal.
>>
>> -Carter
>>
>> On Dec 30, 2013, at 1:58 PM, Aaron Friel <aaron at frieltek.com> wrote:
>>
> >> Excellent research - I'm curious if this is the right thread to inquire
> >> about the status of trying to link GHC itself dynamically.
> >>
> >> I've been attempting to do so with various LLVM versions (3.2, 3.3, 3.4)
> >> using snapshot builds of GHC (within the past week) from git, and I hit
> >> ticket #7885 [https://ghc.haskell.org/trac/ghc/ticket/7885] every time (even
> >> the exact same error message).
> >>
> >> I'm interested in dynamically linking GHC with LLVM to avoid the
> >> entanglement with libgmp's license.
>>
>> If this is the wrong thread or if I should reply instead to the trac item,
>> please let me know.



--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

LLVM and dynamic linking

George Colpitts
wrt

> We support a wide range of LLVM versions

Why can't we stop doing that and only support one or two? E.g. GHC 7.8
would only support LLVM 3.3 and perhaps 3.4.





On Tue, Jan 7, 2014 at 4:54 PM, Austin Seipp <austin at well-typed.com> wrote:

> Hi all,
>
> Apologies for the late reply.
>
> First off, one thing to note wrt GMP: GMP is an LGPL library which we
> link against. Technically, we need to allow relinking to be compliant
> and free of the LGPL for our own executables, but this should be
> reasonably possible - on systems where there is a system-wide GMP
> installed, we use that copy (this occurs mostly on OSX and Linux.) And
> so do executables compiled by GHC. Even when GHC uses static linking
> or dynamic linking for haskell code in this case, it will still always
> dynamically link to libgmp - meaning replacing the shared object
> should be possible. This is just the way modern Linux/OSX systems
> distribute system-wide C libraries, as you expect.
>
> In the case where we don't have this, we build our own copy of libgmp
> inside the source tree and use that instead. That said there are other
> reasons why we might want to be free of GMP entirely, but that's
> neither here nor there. In any case, the issue is pretty orthogonal to
> LLVM, dynamic haskell linking, etc - on a Linux system, you should
> reasonably be able to swap out a `libgmp.so` for another modified
> copy[1], and your Haskell programs should be compliant in this
> regard.[2]
>
> Now, as for LLVM.
>
> For one, LLVM actually is a 'relatively' cheap backend to have around.
> I say LLVM is 'relatively' cheap because All External Dependencies
> Have A Cost. The code is reasonably small, and in any case GHC still
> does most of the heavy lifting - the LLVM backend and native code
> generator share a very large amount of code. We don't really duplicate
> optimizations ourselves, for example, and some optimizations we do
> perform on our IR can't be done by LLVM anyway (it doesn't have enough
> information.)
>
> But LLVM has some very notable costs for GHC developers:
>
>   * It's slower to compile with, because it tries to re-optimize the
> code we give it, but it mostly accomplishes nothing beyond advanced
> optimizations like vectorization/scalar evolution.
>   * We support a wide range of LLVM versions (a nightmare IMO) which
> means pinning down specific versions and supporting them all is rather
> difficult. Combine that with e.g. distro maintainers who may patch bugs
> themselves, and the things you're depending on in the wild (or what
> users might report bugs with) aren't as solid or well understood.
>   * LLVM is extremely large, extremely complex, and the people who can
> sensibly work on both GHC and LLVM are few and far between. So fixing
> these issues is time consuming, difficult, and mostly tedious grunt
> work.
>
> All this basically sums up to the fact that dealing with LLVM comes
> with complications all on its own that makes it a different kind of
> beast to handle.
>
> So, the LLVM backend definitely needs some love. All of these things
> are solvable (and I have some ideas for solving most of them), but
> none of them will quite come for free. But there are some real
> improvements that can be made here I think, and make LLVM much more
> smoothly supported for GHC itself. If you'd like to help it'd be
> really appreciated - I'd like to see LLVM have more love put forth,
> but it's a lot of work of course!
>
> (Finally, in reference to the last point: I am in the obvious
> minority, but I am favorable to having the native code generator
> around, even if it's a bit old and crufty these days - at least it's
> small, fast and simple enough to be grokked and hacked on, and I don't
> think it fragments development all that much. By comparison, LLVM is a
> mammoth beast of incredible size with a sizeable entry barrier IMO. I
> think there's merit to having both a simple, 'obviously working'
> option in addition to the heavy duty one.)
>
> [1] Relevant tool: http://nixos.org/patchelf.html
> [2] Of course, IANAL, but there you go.

LLVM and dynamic linking

Austin Seipp-5
Personally I'd be in favor of that to keep it easy, but there hasn't
really been any poll about what to do. For the most part, supporting a
wide range tends to work fine, but I think it's the wrong thing to do
in any case.

IMO the truly 'correct' thing to do is not to rely on the system LLVM
at all, but to use a version specifically tested with and distributed with
GHC. This can be a private binary only we use. We already do this with
MinGW on Windows actually, because in practice, relying on versions
'in the wild' is somewhat troublesome. In our case, we really just
need the bitcode compiler and optimizer, which are pretty small
pieces.

Relying on a moving target like the system install, or whatever
random XYZ install from SVN (or some derivative forked toolchain!), is
problematic for developers, and users invariably want to try new
combinations, which can break in subtle or odd ways.

I think it's more sensible and straightforward - for the vast majority
of users and use-cases - for us to pick a version that is tested,
reliably works and optimizes code well, and ship that. Then users just
know '-fasm is faster for compiling, -fllvm will optimize better for
some code.' That's all they really need to know.

If LLVM is to be considered 'stable' for Tier 1 GHC platforms, I'm
sympathetic to Aaron's argument, and I'd say it should be held to the
same standards as the NCG. That means it should be considered a
reliable option and we should vet it to reasonable standards, even if
it's a bit more work.

It's just really hard to do that right now. But I think implementing
this wouldn't be difficult; it just has some sticky bits about how to
do it.

We can of course upgrade it over time - but I think trying to hit
moving targets in the wild is a bad long-term solution.
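
To make the "pick a tested version" idea concrete, here is a minimal sketch of the kind of check a build or driver script could run against the output of `llc --version`. Everything in it is an assumption for illustration: the allow-list just reuses the 3.3 and 3.4 versions George mentions, the function names are hypothetical, and it assumes the tool prints a line of the form "LLVM version X.Y". It is not how GHC itself detects LLVM.

```python
import re

# Hypothetical allow-list: the versions suggested in this thread for GHC 7.8.
SUPPORTED_LLVM_VERSIONS = {(3, 3), (3, 4)}


def parse_llvm_version(version_output):
    """Extract (major, minor) from text such as the output of `llc --version`.

    Returns None when no "LLVM version X.Y" line is present.
    """
    match = re.search(r"LLVM version (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def is_supported(version_output, supported=SUPPORTED_LLVM_VERSIONS):
    """True only when a version was found and it is on the allow-list."""
    return parse_llvm_version(version_output) in supported
```

With a bundled LLVM the check becomes trivial, since only one version can ever be found; with a system LLVM it at least turns "breaks in subtle or odd ways" into an early, explicit refusal.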



On Tue, Jan 7, 2014 at 3:07 PM, George Colpitts
<george.colpitts at gmail.com> wrote:

> wrt
>
> We support a wide range of LLVM versions
>
> Why can't we stop doing that and only support one or two, e.g. GHC 7.8 would
> only support llvm 3.3 and perhaps 3.4?
>



--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

LLVM and dynamic linking

Carter Schonwald
Well-said points. There's a lot we can do, and I think many of those
active in GHC have discussed various ideas to explore in this area
after the GHC 7.8 release.

I believe someone did an experiment with llvm-general as an alternative GHC
backend a few months back. Who was it who did that?
(llvm-general only makes sense for stage-2 GHC, but it does provide the
advantage of statically linking LLVM as a Haskell lib.)


On Wed, Jan 8, 2014 at 3:01 AM, Austin Seipp <austin at well-typed.com> wrote:

> Personally I'd be in favor of that to keep it easy, but there hasn't
> really been any poll about what to do. For the most part, supporting a
> wide range tends to work fine, but I think it's the wrong thing to do
> in any case.
>
> IMO the truly 'correct' thing to do is not to rely on the system LLVM
> at all, but to use a version specifically tested with and distributed with
> GHC. This can be a private binary only we use. We already do this with
> MinGW on Windows actually, because in practice, relying on versions
> 'in the wild' is somewhat troublesome. In our case, we really just
> need the bitcode compiler and optimizer, which are pretty small
> pieces.
>
> Relying on a moving target like the system install, or whatever
> random XYZ install from SVN (or some derivative forked toolchain!), is
> problematic for developers, and users invariably want to try new
> combinations, which can break in subtle or odd ways.
>
> I think it's more sensible and straightforward - for the vast majority
> of users and use-cases - for us to pick a version that is tested,
> reliably works and optimizes code well, and ship that. Then users just
> know '-fasm is faster for compiling, -fllvm will optimize better for
> some code.' That's all they really need to know.
>
> If LLVM is to be considered 'stable' for Tier 1 GHC platforms, I'm
> sympathetic to Aaron's argument, and I'd say it should be held to the
> same standards as the NCG. That means it should be considered a
> reliable option and we should vet it to reasonable standards, even if
> it's a bit more work.
>
> It's just really hard to do that right now. But I think implementing
> this wouldn't be difficult; it just has some sticky bits about how to
> do it.
>
> We can of course upgrade it over time - but I think trying to hit
> moving targets in the wild is a bad long-term solution.
>
>
>
> On Tue, Jan 7, 2014 at 3:07 PM, George Colpitts
> <george.colpitts at gmail.com> wrote:
> > wrt
> >
> > We support a wide range of LLVM versions
> >
> > Why can't we stop doing that and only support one or two, e.g. GHC 7.8
> > would only support llvm 3.3 and perhaps 3.4?
> >
> >
> >
> >
> >
> > On Tue, Jan 7, 2014 at 4:54 PM, Austin Seipp <austin at well-typed.com> wrote:
> >>
> >> Hi all,
> >>
> >> Apologies for the late reply.
> >>
> >> First off, one thing to note wrt GMP: GMP is an LGPL library which we
> >> link against. Technically, we need to allow relinking to be compliant
> >> and free of the LGPL for our own executables, but this should be
> >> reasonably possible - on systems where there is a system-wide GMP
> >> installed, we use that copy (this occurs mostly on OSX and Linux.) And
> >> so do executables compiled by GHC. Even when GHC uses static linking
> >> or dynamic linking for haskell code in this case, it will still always
> >> dynamically link to libgmp - meaning replacing the shared object
> >> should be possible. This is just the way modern Linux/OSX systems
> >> distribute system-wide C libraries, as you expect.
> >>
> >> In the case where we don't have this, we build our own copy of libgmp
> >> inside the source tree and use that instead. That said there are other
> >> reasons why we might want to be free of GMP entirely, but that's
> >> neither here nor there. In any case, the issue is pretty orthogonal to
> >> LLVM, dynamic haskell linking, etc - on a Linux system, you should
> >> reasonably be able to swap out a `libgmp.so` for another modified
> >> copy[1], and your Haskell programs should be compliant in this
> >> regard.[2]
> >>
> >> Now, as for LLVM.
> >>
> >> For one, LLVM actually is a 'relatively' cheap backend to have around.
> >> I say LLVM is 'relatively' cheap because All External Dependencies
> >> Have A Cost. The code is reasonably small, and in any case GHC still
> >> does most of the heavy lifting - the LLVM backend and native code
> >> generator share a very large amount of code. We don't really duplicate
> >> optimizations ourselves, for example, and some optimizations we do
> >> perform on our IR can't be done by LLVM anyway (it doesn't have enough
> >> information.)
> >>
> >> But LLVM has some very notable costs for GHC developers:
> >>
> >>   * It's slower to compile with, because it tries to re-optimize the
> >> code we give it, but it mostly accomplishes nothing beyond advanced
> >> optimizations like vectorization/scalar evolution.
> >>   * We support a wide range of LLVM versions (a nightmare IMO) which
> >> means pinning down specific versions and supporting them all is rather
> >> difficult. Combine that with e.g. distro maintainers who may patch bugs
> >> themselves, and the things you're depending on in the wild (or what
> >> users might report bugs with) aren't as solid or well understood.
> >>   * LLVM is extremely large, extremely complex, and the people who can
> >> sensibly work on both GHC and LLVM are few and far between. So fixing
> >> these issues is time consuming, difficult, and mostly tedious grunt
> >> work.
> >>
> >> All this basically sums up to the fact that dealing with LLVM comes
> >> with complications all on its own that makes it a different kind of
> >> beast to handle.
> >>
> >> So, the LLVM backend definitely needs some love. All of these things
> >> are solvable (and I have some ideas for solving most of them), but
> >> none of them will quite come for free. But there are some real
> >> improvements that can be made here I think, and make LLVM much more
> >> smoothly supported for GHC itself. If you'd like to help it'd be
> >> really appreciated - I'd like to see LLVM have more love put forth,
> >> but it's a lot of work of course!.
> >>
> >> (Finally, in reference to the last point: I am in the obvious
> >> minority, but I am favorable to having the native code generator
> >> around, even if it's a bit old and crufty these days - at least it's
> >> small, fast and simple enough to be grokked and hacked on, and I don't
> >> think it fragments development all that much. By comparison, LLVM is a
> >> mammoth beast of incredible size with a sizeable entry barrier IMO. I
> >> think there's merit to having both a simple, 'obviously working'
> >> option in addition to the heavy duty one.)
> >>
> >> [1] Relevant tool: http://nixos.org/patchelf.html
> >> [2] Of course, IANAL, but there you go.
> >>
> >> On Wed, Jan 1, 2014 at 9:03 PM, Aaron Friel <aaron at frieltek.com> wrote:
> >> > Because I think it's going to be an organizational issue and a
> >> > duplication of effort if GHC is built one way but the future direction
> >> > of LLVM is another.
> >> >
> >> > Imagine if GCC started developing a new engine and it didn't work with
> >> > one of the biggest, most regular consumers of GCC. Say, the Linux
> >> > kernel, or itself. At first, the situation is optimistic - if this
> >> > engine doesn't work for the project that has the smartest, brightest
> >> > GCC hackers potentially looking at it, then it should fix itself soon
> >> > enough. Suppose the situation lingers though, and continues for months
> >> > without fix. The new GCC backend starts to become the default, and the
> >> > community around GCC advocates for end-users to use it to optimize
> >> > code for their projects and it even becomes the default for some
> >> > platforms, such as ARM.
> >> >
> >> > What I've described is analogous to the GHC situation - and the result
> >> > is that GHC isn't self-hosting on some platforms and the inertia that
> >> > used to be behind the LLVM backend seems to have stagnated. Whereas
> >> > LLVM used to be the "new hotness", I've noticed that issues like Trac
> >> > #7787 no longer have a lot of eyes on them and externally it seems
> >> > like GHC has accepted a bifurcated approach for development.
> >> >
> >> > I dramatize the situation above, but there's some truth to it. The
> >> > LLVM backend needs some care and attention and if the majority of GHC
> >> > devs can't build GHC with LLVM, then that means the smartest,
> >> > brightest GHC hackers won't have their attention turned toward fixing
> >> > those problems. If a patch to GHC-HEAD broke compilation for every
> >> > backend, it would be fixed in short order. If a new version of GCC did
> >> > not work with GHC, I can imagine it would be only hours before the
> >> > first patches came in resolving the issue. On OS X Mavericks, an
> >> > incompatibility with GHC has led to a swift reaction and strong
> >> > support for resolving platform issues. The attention to the LLVM
> >> > backend is visibly smaller, but I don't know enough about the people
> >> > working on GHC to know if it is actually smaller.
> >> >
> >> > The way I am trying to change this is by making it easier for people
> >> > to start using GHC (by putting images on Docker.io) and, in the
> >> > process, learning about GHC's build process and trying to make things
> >> > work for my own projects. The Docker image allows anyone with a Linux
> >> > kernel to build and play with GHC HEAD. The information about building
> >> > GHC yourself is difficult to approach and I found it hard to get
> >> > started, and I want to improve that too, so I'm learning and asking
> >> > questions.
> >> >
> >> > From: Carter Schonwald
> >> > Sent: Wednesday, January 1, 2014 5:54 PM
> >> > To: Aaron Friel
> >> > Cc: ghc-devs at haskell.org
> >> >
> >> > 7.8 should have working dylib support on the llvm backend. (i believe
> >> > some of the relevant patches are in head already, though Ben Gamari
> >> > can opine on that)
> >> >
> >> > why do you want ghc to be built with llvm? (i know i've tried myself
> >> > in the past, and it should be doable with 7.8 using 7.8 soon too)
> >> >
> >> >
> >> > On Wed, Jan 1, 2014 at 5:38 PM, Aaron Friel <aaron at frieltek.com> wrote:
> >> >>
> >> >> Replying to include the email list. You're right, the llvm backend
> >> >> and the gmp licensing issues are orthogonal - or should be. The
> >> >> problem is I get build errors when trying to build GHC with LLVM and
> >> >> dynamic libraries.
> >> >>
> >> >> The result is that I get a few different choices when producing a
> >> >> platform image for development, with some uncomfortable tradeoffs:
> >> >>
> >> >> 1. LLVM-built GHC, dynamic libs - doesn't build.
> >> >> 2. LLVM-built GHC, static libs - potential licensing oddities with me
> >> >>    shipping a statically linked ghc binary that is now GPLed. I am
> >> >>    not a lawyer, but the situation makes me uncomfortable.
> >> >> 3. GCC/ASM-built GHC, dynamic libs - this is the *standard* for most
> >> >>    platforms shipping ghc binaries, but it means that one of the
> >> >>    biggest and most critical users of the LLVM backend is neglecting
> >> >>    it. It also bifurcates development resources for GHC. Optimization
> >> >>    work is duplicated and already devs are getting into the
> >> >>    uncomfortable position of suggesting to users that they should
> >> >>    trust GHC to build your programs in a particular way, but not
> >> >>    itself.
> >> >> 4. GCC/ASM-built GHC, static libs - worst of all possible worlds.
> >> >>
> >> >>
> >> >> Because of this, the libgmp and llvm-backend issues aren't entirely
> >> >> orthogonal. Trac ticket #7885 is exactly the issue I get when trying
> >> >> to compile #1.
> >> >>
> >> >> From: Carter Schonwald
> >> >> Sent: Monday, December 30, 2013 1:05 PM
> >> >> To: Aaron Friel
> >> >>
> >> >> Good question but you forgot to email the mailing list too :-)
> >> >>
> >> >> Using llvm has nothing to do with Gmp. Use the native code gen (it's
> >> >> simpler) and integer-simple.
> >> >>
> >> >> That said, standard ghc dylinks to a system copy of Gmp anyways (I
> >> >> think). Building ghc as a Dylib is orthogonal.
> >> >>
> >> >> -Carter
> >> >>
> >> >> On Dec 30, 2013, at 1:58 PM, Aaron Friel <aaron at frieltek.com> wrote:
> >> >>
> >> >> Excellent research - I'm curious if this is the right thread to
> >> >> inquire about the status of trying to link GHC itself dynamically.
> >> >>
> >> >> I've been attempting to do so with various LLVM versions (3.2, 3.3,
> >> >> 3.4) using snapshot builds of GHC (within the past week) from git,
> >> >> and I hit ticket #7885 [https://ghc.haskell.org/trac/ghc/ticket/7885]
> >> >> every time (even the exact same error message).
> >> >>
> >> >> I'm interested in dynamically linking GHC with LLVM to avoid the
> >> >> entanglement with libgmp's license.
> >> >>
> >> >> If this is the wrong thread or if I should reply instead to the trac
> >> >> item, please let me know.
> >> >
> >> >
> >> > _______________________________________________
> >> > ghc-devs mailing list
> >> > ghc-devs at haskell.org
> >> > http://www.haskell.org/mailman/listinfo/ghc-devs
> >> >
> >>
> >>
> >>
> >> --
> >> Regards,
> >>
> >> Austin Seipp, Haskell Consultant
> >> Well-Typed LLP, http://www.well-typed.com/
> >
> >
>
>
>
> --
> Regards,
>
> Austin Seipp, Haskell Consultant
> Well-Typed LLP, http://www.well-typed.com/
>


LLVM and dynamic linking

Simon Marlow-7
In reply to this post by Ben Gamari
On 27/12/13 20:21, Ben Gamari wrote:

> Simon Marlow <marlowsd at gmail.com> writes:
>
>> This sounds right to me.  Did you submit a patch?
>>
>> Note that dynamic linking with LLVM is likely to produce significantly
>> worse code than with the NCG right now, because the LLVM back end uses
>> dynamic references even for symbols in the same package, whereas the NCG
>> back-end uses direct static references for these.
>>
> Today with the help of Edward Yang I examined the code produced by the
> LLVM backend in light of this statement. I was surprised to find that
> LLVM's code appears to be no worse than the NCG with respect to
> intra-package references.
>
> My test case can be found here[2] and can be built with the included
> `build.sh` script. The test consists of two modules built into a shared
> library. One module, `LibTest`, exports a few simple members while the
> other module (`LibTest2`) defines members that consume them. Care is
> taken to ensure the members are not inlined.
>
> The tests were done on x86_64 running LLVM 3.4 and GHC HEAD with the
> patches[1] I referred to in my last message. Please let me know if I've
> missed something.

This is good news; however, what worries me is that I still don't
understand *why* you got these results.  Where in the LLVM backend is
the magic that does something special for intra-package references?  I
know where it is in the NCG backend - CLabel.labelDynamic - but I can't
see this function used at all in the LLVM backend.  So what is the
mechanism that lets LLVM optimise these calls?  Is it happening
magically in the linker, perhaps?  But that would only be possible when
using -Bsymbolic or -Bsymbolic-functions, which is a choice made at link
time.

As far as I can tell, all we do is pass a flag to llc to tell it to
compile for dynamic/PIC, in DriverPipeline.runPhase.
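
To make the question concrete, the kind of decision being asked about can be pictured with the toy model below. It is only a sketch: `PackageId`, `Label`, `RefKind` and `referenceKind` are hypothetical names invented for illustration and are not the real `CLabel.labelDynamic` or any other GHC code. It only encodes the behaviour described above for the NCG, namely direct references for symbols in the same package and dynamic references otherwise.

```haskell
-- Toy model only. Every name here is hypothetical; this is not GHC's
-- CLabel module, just the shape of the decision discussed above.

newtype PackageId = PackageId String
  deriving (Eq, Show)

data Label = Label
  { labelPackage :: PackageId  -- package that defines the symbol
  , labelName    :: String     -- symbol name
  } deriving (Eq, Show)

data RefKind
  = Direct    -- direct (e.g. IP-relative) reference to the symbol
  | Indirect  -- reference that goes through the GOT or PLT
  deriving (Eq, Show)

-- | Decide how code in package 'thisPkg' should refer to 'lbl'.
-- The first argument says whether we are compiling for dynamic linking.
referenceKind :: Bool -> PackageId -> Label -> RefKind
referenceKind buildingDynamic thisPkg lbl
  | not buildingDynamic         = Direct    -- static build: always direct
  | labelPackage lbl == thisPkg = Direct    -- intra-package reference
  | otherwise                   = Indirect  -- symbol lives in another package
```

In this model, a reference from `LibTest2` to `helloWorld` in the same `libtest` package would come out as `Direct`, while a reference to something in `ghc-prim` would come out as `Indirect`. The open question is where, if anywhere, the LLVM path makes the equivalent choice.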

Cheers,
        Simon


>
>
> # Evaluation
>
> ## First example ##
>
> The first member is a simple `String` (defined in `LibTest`),
>
>      helloWorld :: String
>      helloWorld = "Hello World!"
>
> The use-site is quite straightforward,
>
>      testHelloWorld :: IO String
>      testHelloWorld = return helloWorld
>
> With `-O1` the code looks reasonable in both cases. Most importantly,
> both backends use IP relative addressing to find the symbol.
>
> ### LLVM ###
>
>      0000000000000ef8 <rKw_info>:
>           ef8: 48 8b 45 00           mov    0x0(%rbp),%rax
>           efc: 48 8d 1d cd 11 20 00 lea    0x2011cd(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
>           f03: ff e0                 jmpq   *%rax
>
>      0000000000000f28 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
>           f28: eb ce                 jmp    ef8 <rKw_info>
>           f2a: 66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)
>
> ### NCG ###
>
>      0000000000000d58 <rH1_info>:
>       d58: 48 8d 1d 71 13 20 00 lea    0x201371(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
>       d5f: ff 65 00             jmpq   *0x0(%rbp)
>
>      0000000000000d88 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
>       d88: eb ce                 jmp    d58 <rH1_info>
>
>
> With `-O0` the code is substantially longer but the relocation behavior
> is still correct, as one would expect.
>
> Looking at the definition of `helloWorld`[3] itself it becomes clear that
> the LLVM backend is more likely to use PLT relocations over GOT. In
> general, `stg_*` primitives are called through the PLT. As far as I can
> tell, both of these call mechanisms will incur two memory
> accesses. However, in the case of the PLT the call will consist of two
> JMPs whereas the GOT will consist of only one. Is this a cause for
> concern? Could these two jumps interfere with prediction?
>
> In general the LLVM backend produces a few more instructions than the
> NCG although this doesn't appear to be related to handling of
> relocations. For instance, the inexplicable (to me) `mov` at the
> beginning of LLVM's `rKw_info`.
>
>
> ## Second example ##
>
> The second example demonstrates an actual call,
>
>      -- Definition (in LibTest)
>      infoRef :: Int -> Int
>      infoRef n = n + 1
>
>      -- Call site
>      testInfoRef :: IO Int
>      testInfoRef = return (infoRef 2)
>
> With `-O1` this produces the following code,
>
> ### LLVM ###
>
>      0000000000000fb0 <rLy_info>:
>           fb0: 48 8b 45 00           mov    0x0(%rbp),%rax
>           fb4: 48 8d 1d a5 10 20 00 lea    0x2010a5(%rip),%rbx        # 202060 <rLx_closure>
>           fbb: ff e0                 jmpq   *%rax
>
>      0000000000000fe0 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
>           fe0: eb ce                 jmp    fb0 <rLy_info>
>
> ### NCG ###
>
>      0000000000000e10 <rI3_info>:
>       e10: 48 8d 1d 51 12 20 00 lea    0x201251(%rip),%rbx        # 202068 <rI2_closure>
>       e17: ff 65 00             jmpq   *0x0(%rbp)
>
>      0000000000000e40 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
>       e40: eb ce                 jmp    e10 <rI3_info>
>
> Again, it seems that LLVM is a bit more verbose but seems to handle
> intra-package calls efficiently.
>
>
>
> [1] https://github.com/bgamari/ghc/commits/llvm-dynamic
> [2] https://github.com/bgamari/ghc-linking-tests/tree/master/ghc-test
> [3] `helloWorld` definitions:
>
> LLVM:
>      00000000000010a8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
>          10a8: 50                   push   %rax
>          10a9: 4c 8d 75 f0           lea    -0x10(%rbp),%r14
>          10ad: 4d 39 fe             cmp    %r15,%r14
>          10b0: 73 07                 jae    10b9 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x11>
>          10b2: 49 8b 45 f0           mov    -0x10(%r13),%rax
>          10b6: 5a                   pop    %rdx
>          10b7: ff e0                 jmpq   *%rax
>          10b9: 4c 89 ef             mov    %r13,%rdi
>          10bc: 48 89 de             mov    %rbx,%rsi
>          10bf: e8 0c fd ff ff       callq  dd0 <newCAF@plt>
>          10c4: 48 85 c0             test   %rax,%rax
>          10c7: 74 22                 je     10eb <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x43>
>          10c9: 48 8b 0d 18 0f 20 00 mov    0x200f18(%rip),%rcx        # 201fe8 <_DYNAMIC+0x228>
>          10d0: 48 89 4d f0           mov    %rcx,-0x10(%rbp)
>          10d4: 48 89 45 f8           mov    %rax,-0x8(%rbp)
>          10d8: 48 8d 05 21 00 00 00 lea    0x21(%rip),%rax        # 1100 <cJC_str>
>          10df: 4c 89 f5             mov    %r14,%rbp
>          10e2: 49 89 c6             mov    %rax,%r14
>          10e5: 58                   pop    %rax
>          10e6: e9 b5 fc ff ff       jmpq   da0 <ghczmprim_GHCziCString_unpackCStringzh_info@plt>
>          10eb: 48 8b 03             mov    (%rbx),%rax
>          10ee: 5a                   pop    %rdx
>          10ef: ff e0                 jmpq   *%rax
>
>
> NCG:
>
>      0000000000000ef8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
>       ef8: 48 8d 45 f0           lea    -0x10(%rbp),%rax
>       efc: 4c 39 f8             cmp    %r15,%rax
>       eff: 72 3f                 jb     f40 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x48>
>       f01: 4c 89 ef             mov    %r13,%rdi
>       f04: 48 89 de             mov    %rbx,%rsi
>       f07: 48 83 ec 08           sub    $0x8,%rsp
>       f0b: b8 00 00 00 00       mov    $0x0,%eax
>       f10: e8 1b fd ff ff       callq  c30 <newCAF@plt>
>       f15: 48 83 c4 08           add    $0x8,%rsp
>       f19: 48 85 c0             test   %rax,%rax
>       f1c: 74 20                 je     f3e <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x46>
>       f1e: 48 8b 1d cb 10 20 00 mov    0x2010cb(%rip),%rbx        # 201ff0 <_DYNAMIC+0x238>
>       f25: 48 89 5d f0           mov    %rbx,-0x10(%rbp)
>       f29: 48 89 45 f8           mov    %rax,-0x8(%rbp)
>       f2d: 4c 8d 35 1c 00 00 00 lea    0x1c(%rip),%r14        # f50 <cGG_str>
>       f34: 48 83 c5 f0           add    $0xfffffffffffffff0,%rbp
>       f38: ff 25 7a 10 20 00     jmpq   *0x20107a(%rip)        # 201fb8 <_DYNAMIC+0x200>
>       f3e: ff 23                 jmpq   *(%rbx)
>       f40: 41 ff 65 f0           jmpq   *-0x10(%r13)
>
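
For anyone reading the listings above, the symbol names use GHC's Z-encoding, in which `zm` stands for `-`, `zi` for `.` and `zh` for `#`. The following is a minimal, partial decoder, offered as a reading aid only: `zDecode` and `lowerEscapes` are names made up for this sketch, and only a handful of escapes are handled.

```haskell
-- Partial decoder for GHC's Z-encoded symbol names.
-- 'zDecode' and 'lowerEscapes' are illustrative names, not GHC functions,
-- and only a few of the escapes are covered.

-- | The 'z'-prefixed escapes handled by this sketch.
lowerEscapes :: [(Char, Char)]
lowerEscapes =
  [ ('m', '-')  -- zm
  , ('i', '.')  -- zi
  , ('h', '#')  -- zh
  , ('u', '_')  -- zu
  , ('z', 'z')  -- zz
  ]

-- | Decode the escapes listed above; anything else is copied unchanged.
zDecode :: String -> String
zDecode ('z' : c : rest)
  | Just d <- lookup c lowerEscapes = d : zDecode rest
zDecode ('Z' : 'Z' : rest) = 'Z' : zDecode rest
zDecode (c : rest)         = c : zDecode rest
zDecode []                 = []
```

With this, `zDecode "libtestzm0zi1zi0zi0_LibTest_helloWorld_closure"` gives `"libtest-0.1.0.0_LibTest_helloWorld_closure"`, i.e. package `libtest-0.1.0.0`, module `LibTest`, closure of `helloWorld`, and `zDecode "ghczmprim_GHCziCString_unpackCStringzh_info"` gives `"ghc-prim_GHC.CString_unpackCString#_info"`.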



LLVM and dynamic linking

Ben Gamari
Simon Marlow <marlowsd at gmail.com> writes:

> On 27/12/13 20:21, Ben Gamari wrote:
>> Simon Marlow <marlowsd at gmail.com> writes:
>>
>>> This sounds right to me.  Did you submit a patch?
>>>
>>> Note that dynamic linking with LLVM is likely to produce significantly
>>> worse code than with the NCG right now, because the LLVM back end uses
>>> dynamic references even for symbols in the same package, whereas the NCG
>>> back-end uses direct static references for these.
>>>
>> Today with the help of Edward Yang I examined the code produced by the
>> LLVM backend in light of this statement. I was surprised to find that
>> LLVM's code appears to be no worse than the NCG with respect to
>> intra-package references.
>>
>> My test case can be found here[2] and can be built with the included
>> `build.sh` script. The test consists of two modules built into a shared
>> library. One module, `LibTest`, exports a few simple members while the
>> other module (`LibTest2`) defines members that consume them. Care is
>> taken to ensure the members are not inlined.
>>
>> The tests were done on x86_64 running LLVM 3.4 and GHC HEAD with the
>> patches[1] I referred to in my last message. Please let me know if I've
>> missed something.
>
> This is good news; however, what worries me is that I still don't
> understand *why* you got these results.  Where in the LLVM backend is
> the magic that does something special for intra-package references?
>
As far as I can tell, the backend itself does nothing in particular to
handle this.

> I know where it is in the NCG backend - CLabel.labelDynamic - but I
> can't see this function used at all in the LLVM backend.
>
Right. For the record, I took a first stab at implementing[1] the logic
that I thought would be needed to get LLVM to do efficient dynamic
linking before taking this measurement. I probably should have reused
more of the machinery used by the NCG, however. I don't believe I managed
to get this code stable before dropping it when I realized that LLVM
already somehow did the right thing.

> So what is the mechanism that lets LLVM optimise these calls?  Is it
> happening magically in the linker, perhaps?  But that would only be
> possible when using -Bsymbolic or -Bsymbolic-functions, which is a
> choice made at link time.
>
This seems like the most likely explanation, but given that we don't pass
either flag I really don't see why the linker would do this. More
research is necessary, it seems.

> As far as I can tell, all we do is pass a flag to llc to tell it to
> compile for dynamic/PIC, in DriverPipeline.runPhase.
>
Right. Very mysterious.

Cheers,

- Ben


[1] https://github.com/bgamari/ghc/tree/llvm-intra-package