Given a Cmm expression such as

    (_c8Gq::F64) = call MO_F64_Sqrt(_s8oX::F64);   // CmmUnsafeForeignCall

the native code generator produces an actual call to the sqrt C function, which has the side-effect of causing all floating-point registers to be dumped as they are caller-saved. In the nbody benchmark, this is particularly bad for a rather hot piece of code (see below).

Ideally the NCG would recognize this foreign call and instead use the `sqrtsd` SSE instruction when targeting x86-64.

Does anyone know if the NCG can produce this instruction? I think it would be beneficial, as the below would turn into one or two instructions.

Other math functions such as sin/cos require x87 FPU instructions, which as far as I know we're not using.

    ;;;;;;;;;;;
    ; NCG generates this in parts of the nbody benchmark
    ; to compute the sqrt
    ;
    subq $8,%rsp
    movsd %xmm9,176(%rsp)     ; all floating-point registers
    movsd %xmm1,184(%rsp)     ; are caller-saved in SysV ABI
    movsd %xmm2,192(%rsp)
    movsd %xmm3,200(%rsp)
    movq %rdi,208(%rsp)
    movq %rcx,216(%rsp)
    movq %rsi,224(%rsp)
    movsd %xmm4,232(%rsp)
    movsd %xmm5,240(%rsp)
    movsd %xmm6,248(%rsp)
    movsd %xmm7,256(%rsp)
    movsd %xmm8,264(%rsp)
    movsd %xmm11,272(%rsp)
    call _sqrt
    ;; the loads
    ;; below are interleaved
    ;; with computations
    addq $8,%rsp
    movsd 264(%rsp),%xmm1
    movsd 240(%rsp),%xmm2
    movsd 224(%rsp),%xmm2
    movsd 232(%rsp),%xmm4
    movq 200(%rsp),%rax
    movsd 248(%rsp),%xmm4
    movsd 256(%rsp),%xmm4
    movq 216(%rsp),%rcx
    movsd 192(%rsp),%xmm2

~kavon

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Hi Kavon,

I looked a bit and it does not appear that there is an SSE sqrt in the native code gen. It should be easy to add (see a similar addition here: https://phabricator.haskell.org/D3265). The x87 version was available for 32-bit. I think if you use the LLVM backend it will give you the SSE sqrt.

Ryan

On Fri, Apr 28, 2017 at 9:27 AM, Kavon Farvardin <[hidden email]> wrote:
> Given a Cmm expression such as
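For anyone who wants to compare the two backends on this point, here is a minimal sketch of a reproduction. It is not taken from the nbody benchmark; the module layout and the name `sumNorms` are made up for illustration. It assumes that `sqrt` at type `Double` ends up as the `MO_F64_Sqrt` call shown in the original message, which is worth confirming with `-ddump-cmm` on your GHC version.

```haskell
module Main (main) where

import Data.List (foldl')

-- Hypothetical hot loop: every iteration takes one Double square root,
-- so the generated code for sqrt is easy to find in the assembly output.
sumNorms :: Int -> Double
sumNorms n = foldl' step 0 [1 .. n]
  where
    step :: Double -> Int -> Double
    step acc i =
      let x = fromIntegral i
          y = x + 1
      in acc + sqrt (x * x + y * y)

main :: IO ()
main = print (sumNorms 1000000)
```

Assuming the file is saved as `Sqrt.hs`, building it once with `ghc -O2 -fasm -keep-s-files Sqrt.hs` and once with `ghc -O2 -fllvm -keep-s-files Sqrt.hs` leaves the assembly in `Sqrt.s` each time. Searching that file for `call` near `sqrt` versus a `sqrtsd` instruction should show whether the behaviour described in this thread applies to the GHC version in use.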
Ryan Yates <[hidden email]> writes:
> Hi Kavon,
>
> I looked a bit and it does not appear that there is an SSE sqrt in the
> native code gen. It should be easy to add (see a similar addition here:
> https://phabricator.haskell.org/D3265). The x87 version was available for
> 32-bit. I think if you use the LLVM backend it will give you the SSE sqrt.
>
Indeed. I pushed a starting point to D3508; I haven't validated it but there's a chance it will work. If not it would be great if someone could pick it up. Otherwise I'll return to it when I have time.

Cheers,

- Ben