FFI-free NaN checks? (isDoubleNan and friends)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

FFI-free NaN checks? (isDoubleNan and friends)

Mateusz Kowalczyk
Hi,

Recently at a client I was profiling some code and isDoubleNaN lit up.
We were checking a lot of doubles for NaN as that's what customer would
send in.

I went to investigate and I found that FFI is used to achieve this. I
was always under the impression that FFI costs a little. I had at the
time replaced the code with a hack with great results:

```
isNaN' :: Double -> Bool
isNaN' d = d /= d
```

While this worked and provided good speedup in my case, this fails
catastrophically if the program is compiled with -ffast-math. This is
expected. I have since reverted it. Seeking an alternative solution I
have thought about re-implementing the C code with a native Haskell
version: after all it just checks a few bits. Apparently unsafeCoerce#
and friends were a big no-no but I found
https://phabricator.haskell.org/D3358 . I have implemented the code at
the bottom of this post. Obviously it's missing endianness (compile-time
switch).

This seems to be faster for smaller `mkInput` list than Prelude.isNaN
but slower slightly on the one below. The `/=` version is the fastest
but very fragile.

My question to you all is whether implementing a version of this
function in Haskell makes sense and if not, why not? The
stgDoubleToWord64 is implemented in CMM and I don't know anything about
the costs of that.

* Is there a cheaper alternative to FFI way?
* If yes, does anyone know how to write it such that it compiles to same
code but without the call overhead? I must have failed below as it's
slower on some inputs.

Basically if a faster way exists for isNaN, something I have to do a
lot, I'd love to hear about it.

I leave you with basic code I managed to come up with. 8.4.x only.


```
{-# LANGUAGE MagicHash    #-}
{-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm #-}
module Main (main) where

import GHC.Float
import GHC.Prim

isNaN' :: Double -> Bool
isNaN' d = d /= d

isNaNBits :: Double -> Bool
isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of
  1# -> case bits `and#` mantissaMask of
    0## -> False
    _ -> True
  _ -> False
  where
    bits :: Word#
    bits = stgDoubleToWord64 d

    expMask, mantissaMask :: Word#
    expMask = 0x7FF0000000000000##
    mantissaMask = 0x000FFFFFFFFFFFFF##

main :: IO ()
main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000)
`seq` pure ()
  where
    nan :: Double
    nan = log (-1)

    mkInput :: Int -> [Double]
    mkInput n = take n $ cycle [1, nan]

    sumFilter :: (Double -> Bool) -> [Double] -> Double
    sumFilter p = Prelude.sum . Prelude.filter (not . p)
```

--
Mateusz K.
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: FFI-free NaN checks? (isDoubleNan and friends)

Brandon Allbery
If the FFI version is done with "safe", consider using "unsafe" instead. There are technical reasons why this is slightly incorrect, but unless you're fiddling with the CPU's FP control flags they're mostly irrelevant and you can treat isNaN as pure and non-side-effectful, significantly reducing the overhead. You may also be able to use "ccall" to take advantage of C compiler level optimizations, or simply to directly invoke a CPU-based test with asm(); but you'll need to hide that in a C preprocessor #define, so that it looks syntactically like a function call to the FFI.

(One of the technical reasons is that various OSes have been known to introduce bugs in their FP register and state handling across system calls, in which case the "safe" version may turn "complete FP chaos" into merely "wrong answer". It's your call whether, or which side, of this bothers you.)

On Mon, Mar 5, 2018 at 4:53 PM, Mateusz Kowalczyk <[hidden email]> wrote:
Hi,

Recently at a client I was profiling some code and isDoubleNaN lit up.
We were checking a lot of doubles for NaN as that's what customer would
send in.

I went to investigate and I found that FFI is used to achieve this. I
was always under the impression that FFI costs a little. I had at the
time replaced the code with a hack with great results:

```
isNaN' :: Double -> Bool
isNaN' d = d /= d
```

While this worked and provided good speedup in my case, this fails
catastrophically if the program is compiled with -ffast-math. This is
expected. I have since reverted it. Seeking an alternative solution I
have thought about re-implementing the C code with a native Haskell
version: after all it just checks a few bits. Apparently unsafeCoerce#
and friends were a big no-no but I found
https://phabricator.haskell.org/D3358 . I have implemented the code at
the bottom of this post. Obviously it's missing endianness (compile-time
switch).

This seems to be faster for smaller `mkInput` list than Prelude.isNaN
but slower slightly on the one below. The `/=` version is the fastest
but very fragile.

My question to you all is whether implementing a version of this
function in Haskell makes sense and if not, why not? The
stgDoubleToWord64 is implemented in CMM and I don't know anything about
the costs of that.

* Is there a cheaper alternative to FFI way?
* If yes, does anyone know how to write it such that it compiles to same
code but without the call overhead? I must have failed below as it's
slower on some inputs.

Basically if a faster way exists for isNaN, something I have to do a
lot, I'd love to hear about it.

I leave you with basic code I managed to come up with. 8.4.x only.


```
{-# LANGUAGE MagicHash    #-}
{-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm #-}
module Main (main) where

import GHC.Float
import GHC.Prim

isNaN' :: Double -> Bool
isNaN' d = d /= d

isNaNBits :: Double -> Bool
isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of
  1# -> case bits `and#` mantissaMask of
    0## -> False
    _ -> True
  _ -> False
  where
    bits :: Word#
    bits = stgDoubleToWord64 d

    expMask, mantissaMask :: Word#
    expMask = 0x7FF0000000000000##
    mantissaMask = 0x000FFFFFFFFFFFFF##

main :: IO ()
main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000)
`seq` pure ()
  where
    nan :: Double
    nan = log (-1)

    mkInput :: Int -> [Double]
    mkInput n = take n $ cycle [1, nan]

    sumFilter :: (Double -> Bool) -> [Double] -> Double
    sumFilter p = Prelude.sum . Prelude.filter (not . p)
```

--
Mateusz K.
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



--
brandon s allbery kf8nh                               sine nomine associates
[hidden email]                                  [hidden email]
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: FFI-free NaN checks? (isDoubleNan and friends)

Mateusz Kowalczyk
On 03/05/2018 10:23 PM, Brandon Allbery wrote:

> If the FFI version is done with "safe", consider using "unsafe" instead.
> There are technical reasons why this is slightly incorrect, but unless
> you're fiddling with the CPU's FP control flags they're mostly irrelevant
> and you can treat isNaN as pure and non-side-effectful, significantly
> reducing the overhead. You may also be able to use "ccall" to take
> advantage of C compiler level optimizations, or simply to directly invoke a
> CPU-based test with asm(); but you'll need to hide that in a C preprocessor
> #define, so that it looks syntactically like a function call to the FFI.
>
> (One of the technical reasons is that various OSes have been known to
> introduce bugs in their FP register and state handling across system calls,
> in which case the "safe" version may turn "complete FP chaos" into merely
> "wrong answer". It's your call whether, or which side, of this bothers you.)

Perhaps I was a little unclear. The FFI-using isDoubleNaN is something
GHC does!

```
libraries/base/GHC/Float.hs:foreign import ccall unsafe "isDoubleNaN"
isDoubleNaN :: Double -> Int
```
```
HsInt
isDoubleNaN(HsDouble d)
{
  union stg_ieee754_dbl u;

  u.d = d;

  return (
    u.ieee.exponent  == 2047 /* 2^11 - 1 */ &&  /* Is the exponent all
ones? */
    (u.ieee.mantissa0 != 0 || u.ieee.mantissa1 != 0)
        /* and the mantissa non-zero? */
    );
}
```

My question is whether it could do better by not doing FFI and instead
computing natively and if not, why not?

> On Mon, Mar 5, 2018 at 4:53 PM, Mateusz Kowalczyk <[hidden email]>
> wrote:
>
>> Hi,
>>
>> Recently at a client I was profiling some code and isDoubleNaN lit up.
>> We were checking a lot of doubles for NaN as that's what customer would
>> send in.
>>
>> I went to investigate and I found that FFI is used to achieve this. I
>> was always under the impression that FFI costs a little. I had at the
>> time replaced the code with a hack with great results:
>>
>> ```
>> isNaN' :: Double -> Bool
>> isNaN' d = d /= d
>> ```
>>
>> While this worked and provided good speedup in my case, this fails
>> catastrophically if the program is compiled with -ffast-math. This is
>> expected. I have since reverted it. Seeking an alternative solution I
>> have thought about re-implementing the C code with a native Haskell
>> version: after all it just checks a few bits. Apparently unsafeCoerce#
>> and friends were a big no-no but I found
>> https://phabricator.haskell.org/D3358 . I have implemented the code at
>> the bottom of this post. Obviously it's missing endianness (compile-time
>> switch).
>>
>> This seems to be faster for smaller `mkInput` list than Prelude.isNaN
>> but slower slightly on the one below. The `/=` version is the fastest
>> but very fragile.
>>
>> My question to you all is whether implementing a version of this
>> function in Haskell makes sense and if not, why not? The
>> stgDoubleToWord64 is implemented in CMM and I don't know anything about
>> the costs of that.
>>
>> * Is there a cheaper alternative to FFI way?
>> * If yes, does anyone know how to write it such that it compiles to same
>> code but without the call overhead? I must have failed below as it's
>> slower on some inputs.
>>
>> Basically if a faster way exists for isNaN, something I have to do a
>> lot, I'd love to hear about it.
>>
>> I leave you with basic code I managed to come up with. 8.4.x only.
>>
>>
>> ```
>> {-# LANGUAGE MagicHash    #-}
>> {-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm #-}
>> module Main (main) where
>>
>> import GHC.Float
>> import GHC.Prim
>>
>> isNaN' :: Double -> Bool
>> isNaN' d = d /= d
>>
>> isNaNBits :: Double -> Bool
>> isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of
>>   1# -> case bits `and#` mantissaMask of
>>     0## -> False
>>     _ -> True
>>   _ -> False
>>   where
>>     bits :: Word#
>>     bits = stgDoubleToWord64 d
>>
>>     expMask, mantissaMask :: Word#
>>     expMask = 0x7FF0000000000000##
>>     mantissaMask = 0x000FFFFFFFFFFFFF##
>>
>> main :: IO ()
>> main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000)
>> `seq` pure ()
>>   where
>>     nan :: Double
>>     nan = log (-1)
>>
>>     mkInput :: Int -> [Double]
>>     mkInput n = take n $ cycle [1, nan]
>>
>>     sumFilter :: (Double -> Bool) -> [Double] -> Double
>>     sumFilter p = Prelude.sum . Prelude.filter (not . p)
>> ```
>>
>> --
>> Mateusz K.
>> _______________________________________________
>> ghc-devs mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>
>
>


--
Mateusz K.
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: FFI-free NaN checks? (isDoubleNan and friends)

Brandon Allbery
I'd in general expect good C code to optimize a little better; and in particular, decomposing an IEEE float is almost certainly more expensive in Haskell than in C, because unions let you cheat. (And I recall looking at the implementation of decodeFloat once; it's significantly longer than that C.) But I have to wonder if that code would be better done with something more native; the implementation may be a portable default, and you might be able to find something x86-specific that is faster.

On Tue, Mar 6, 2018 at 5:35 AM, Mateusz Kowalczyk <[hidden email]> wrote:
On 03/05/2018 10:23 PM, Brandon Allbery wrote:
> If the FFI version is done with "safe", consider using "unsafe" instead.
> There are technical reasons why this is slightly incorrect, but unless
> you're fiddling with the CPU's FP control flags they're mostly irrelevant
> and you can treat isNaN as pure and non-side-effectful, significantly
> reducing the overhead. You may also be able to use "ccall" to take
> advantage of C compiler level optimizations, or simply to directly invoke a
> CPU-based test with asm(); but you'll need to hide that in a C preprocessor
> #define, so that it looks syntactically like a function call to the FFI.
>
> (One of the technical reasons is that various OSes have been known to
> introduce bugs in their FP register and state handling across system calls,
> in which case the "safe" version may turn "complete FP chaos" into merely
> "wrong answer". It's your call whether, or which side, of this bothers you.)

Perhaps I was a little unclear. The FFI-using isDoubleNaN is something
GHC does!

```
libraries/base/GHC/Float.hs:foreign import ccall unsafe "isDoubleNaN"
isDoubleNaN :: Double -> Int
```
```
HsInt
isDoubleNaN(HsDouble d)
{
  union stg_ieee754_dbl u;

  u.d = d;

  return (
    u.ieee.exponent  == 2047 /* 2^11 - 1 */ &&  /* Is the exponent all
ones? */
    (u.ieee.mantissa0 != 0 || u.ieee.mantissa1 != 0)
        /* and the mantissa non-zero? */
    );
}
```

My question is whether it could do better by not doing FFI and instead
computing natively and if not, why not?

> On Mon, Mar 5, 2018 at 4:53 PM, Mateusz Kowalczyk <[hidden email]>
> wrote:
>
>> Hi,
>>
>> Recently at a client I was profiling some code and isDoubleNaN lit up.
>> We were checking a lot of doubles for NaN as that's what customer would
>> send in.
>>
>> I went to investigate and I found that FFI is used to achieve this. I
>> was always under the impression that FFI costs a little. I had at the
>> time replaced the code with a hack with great results:
>>
>> ```
>> isNaN' :: Double -> Bool
>> isNaN' d = d /= d
>> ```
>>
>> While this worked and provided good speedup in my case, this fails
>> catastrophically if the program is compiled with -ffast-math. This is
>> expected. I have since reverted it. Seeking an alternative solution I
>> have thought about re-implementing the C code with a native Haskell
>> version: after all it just checks a few bits. Apparently unsafeCoerce#
>> and friends were a big no-no but I found
>> https://phabricator.haskell.org/D3358 . I have implemented the code at
>> the bottom of this post. Obviously it's missing endianness (compile-time
>> switch).
>>
>> This seems to be faster for smaller `mkInput` list than Prelude.isNaN
>> but slower slightly on the one below. The `/=` version is the fastest
>> but very fragile.
>>
>> My question to you all is whether implementing a version of this
>> function in Haskell makes sense and if not, why not? The
>> stgDoubleToWord64 is implemented in CMM and I don't know anything about
>> the costs of that.
>>
>> * Is there a cheaper alternative to FFI way?
>> * If yes, does anyone know how to write it such that it compiles to same
>> code but without the call overhead? I must have failed below as it's
>> slower on some inputs.
>>
>> Basically if a faster way exists for isNaN, something I have to do a
>> lot, I'd love to hear about it.
>>
>> I leave you with basic code I managed to come up with. 8.4.x only.
>>
>>
>> ```
>> {-# LANGUAGE MagicHash    #-}
>> {-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm #-}
>> module Main (main) where
>>
>> import GHC.Float
>> import GHC.Prim
>>
>> isNaN' :: Double -> Bool
>> isNaN' d = d /= d
>>
>> isNaNBits :: Double -> Bool
>> isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of
>>   1# -> case bits `and#` mantissaMask of
>>     0## -> False
>>     _ -> True
>>   _ -> False
>>   where
>>     bits :: Word#
>>     bits = stgDoubleToWord64 d
>>
>>     expMask, mantissaMask :: Word#
>>     expMask = 0x7FF0000000000000##
>>     mantissaMask = 0x000FFFFFFFFFFFFF##
>>
>> main :: IO ()
>> main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000)
>> `seq` pure ()
>>   where
>>     nan :: Double
>>     nan = log (-1)
>>
>>     mkInput :: Int -> [Double]
>>     mkInput n = take n $ cycle [1, nan]
>>
>>     sumFilter :: (Double -> Bool) -> [Double] -> Double
>>     sumFilter p = Prelude.sum . Prelude.filter (not . p)
>> ```
>>
>> --
>> Mateusz K.
>> _______________________________________________
>> ghc-devs mailing list
>> [hidden email]
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>
>
>


--
Mateusz K.



--
brandon s allbery kf8nh                               sine nomine associates
[hidden email]                                  [hidden email]
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: FFI-free NaN checks? (isDoubleNan and friends)

Mateusz Kowalczyk
On 03/06/2018 10:43 AM, Brandon Allbery wrote:
> I'd in general expect good C code to optimize a little better; and in
> particular, decomposing an IEEE float is almost certainly more expensive in
> Haskell than in C, because unions let you cheat. (And I recall looking at
> the implementation of decodeFloat once; it's significantly longer than that
> C.) But I have to wonder if that code would be better done with something
> more native; the implementation may be a portable default, and you might be
> able to find something x86-specific that is faster.

There's a https://c9x.me/x86/html/file_module_x86_id_316.html that the
‘d /= d’ way compiles to. I suppose maybe I could just keep using that
and fall back onto isDoubleNaN if __FAST_MATH__ is set…

> On Tue, Mar 6, 2018 at 5:35 AM, Mateusz Kowalczyk <[hidden email]>
> wrote:
>
>> On 03/05/2018 10:23 PM, Brandon Allbery wrote:
>>> If the FFI version is done with "safe", consider using "unsafe" instead.
>>> There are technical reasons why this is slightly incorrect, but unless
>>> you're fiddling with the CPU's FP control flags they're mostly irrelevant
>>> and you can treat isNaN as pure and non-side-effectful, significantly
>>> reducing the overhead. You may also be able to use "ccall" to take
>>> advantage of C compiler level optimizations, or simply to directly
>> invoke a
>>> CPU-based test with asm(); but you'll need to hide that in a C
>> preprocessor
>>> #define, so that it looks syntactically like a function call to the FFI.
>>>
>>> (One of the technical reasons is that various OSes have been known to
>>> introduce bugs in their FP register and state handling across system
>> calls,
>>> in which case the "safe" version may turn "complete FP chaos" into merely
>>> "wrong answer". It's your call whether, or which side, of this bothers
>> you.)
>>
>> Perhaps I was a little unclear. The FFI-using isDoubleNaN is something
>> GHC does!
>>
>> ```
>> libraries/base/GHC/Float.hs:foreign import ccall unsafe "isDoubleNaN"
>> isDoubleNaN :: Double -> Int
>> ```
>> ```
>> HsInt
>> isDoubleNaN(HsDouble d)
>> {
>>   union stg_ieee754_dbl u;
>>
>>   u.d = d;
>>
>>   return (
>>     u.ieee.exponent  == 2047 /* 2^11 - 1 */ &&  /* Is the exponent all
>> ones? */
>>     (u.ieee.mantissa0 != 0 || u.ieee.mantissa1 != 0)
>>         /* and the mantissa non-zero? */
>>     );
>> }
>> ```
>>
>> My question is whether it could do better by not doing FFI and instead
>> computing natively and if not, why not?
>>
>>> On Mon, Mar 5, 2018 at 4:53 PM, Mateusz Kowalczyk <
>> [hidden email]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Recently at a client I was profiling some code and isDoubleNaN lit up.
>>>> We were checking a lot of doubles for NaN as that's what customer would
>>>> send in.
>>>>
>>>> I went to investigate and I found that FFI is used to achieve this. I
>>>> was always under the impression that FFI costs a little. I had at the
>>>> time replaced the code with a hack with great results:
>>>>
>>>> ```
>>>> isNaN' :: Double -> Bool
>>>> isNaN' d = d /= d
>>>> ```
>>>>
>>>> While this worked and provided good speedup in my case, this fails
>>>> catastrophically if the program is compiled with -ffast-math. This is
>>>> expected. I have since reverted it. Seeking an alternative solution I
>>>> have thought about re-implementing the C code with a native Haskell
>>>> version: after all it just checks a few bits. Apparently unsafeCoerce#
>>>> and friends were a big no-no but I found
>>>> https://phabricator.haskell.org/D3358 . I have implemented the code at
>>>> the bottom of this post. Obviously it's missing endianness (compile-time
>>>> switch).
>>>>
>>>> This seems to be faster for smaller `mkInput` list than Prelude.isNaN
>>>> but slower slightly on the one below. The `/=` version is the fastest
>>>> but very fragile.
>>>>
>>>> My question to you all is whether implementing a version of this
>>>> function in Haskell makes sense and if not, why not? The
>>>> stgDoubleToWord64 is implemented in CMM and I don't know anything about
>>>> the costs of that.
>>>>
>>>> * Is there a cheaper alternative to FFI way?
>>>> * If yes, does anyone know how to write it such that it compiles to same
>>>> code but without the call overhead? I must have failed below as it's
>>>> slower on some inputs.
>>>>
>>>> Basically if a faster way exists for isNaN, something I have to do a
>>>> lot, I'd love to hear about it.
>>>>
>>>> I leave you with basic code I managed to come up with. 8.4.x only.
>>>>
>>>>
>>>> ```
>>>> {-# LANGUAGE MagicHash    #-}
>>>> {-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm
>> #-}
>>>> module Main (main) where
>>>>
>>>> import GHC.Float
>>>> import GHC.Prim
>>>>
>>>> isNaN' :: Double -> Bool
>>>> isNaN' d = d /= d
>>>>
>>>> isNaNBits :: Double -> Bool
>>>> isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of
>>>>   1# -> case bits `and#` mantissaMask of
>>>>     0## -> False
>>>>     _ -> True
>>>>   _ -> False
>>>>   where
>>>>     bits :: Word#
>>>>     bits = stgDoubleToWord64 d
>>>>
>>>>     expMask, mantissaMask :: Word#
>>>>     expMask = 0x7FF0000000000000##
>>>>     mantissaMask = 0x000FFFFFFFFFFFFF##
>>>>
>>>> main :: IO ()
>>>> main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000)
>>>> `seq` pure ()
>>>>   where
>>>>     nan :: Double
>>>>     nan = log (-1)
>>>>
>>>>     mkInput :: Int -> [Double]
>>>>     mkInput n = take n $ cycle [1, nan]
>>>>
>>>>     sumFilter :: (Double -> Bool) -> [Double] -> Double
>>>>     sumFilter p = Prelude.sum . Prelude.filter (not . p)
>>>> ```
>>>>
>>>> --
>>>> Mateusz K.
>>>> _______________________________________________
>>>> ghc-devs mailing list
>>>> [hidden email]
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Mateusz K.
>>
>
>
>


--
Mateusz K.
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: FFI-free NaN checks? (isDoubleNan and friends)

Sylvain Henry-2
In reply to this post by Mateusz Kowalczyk
Hi,

You can try with foreign primops, it should be faster than the FFI:

In IsDoubleNanPrim.s:

.global isDoubleNan_prim
isDoubleNan_prim:
    xor %rbx,%rbx
    ucomisd %xmm1, %xmm1
    lahf
    testb $68, %ah
    jnp .Lout
    mov $1, %rbx
.Lout:
    jmp * (%rbp)


In IsDoubleNan.hs:

{-# LANGUAGE GHCForeignImportPrim #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnliftedFFITypes #-}

module Main where

import GHC.Base

foreign import prim "isDoubleNan_prim" isDoubleNan_prim :: Double# -> Int#

isDoubleNan :: Double -> Bool
isDoubleNan (D# d#) = case isDoubleNan_prim d# of
    0# -> False
    _  -> True

main :: IO ()
main = do
    let testNaN x = putStrLn $ "Testing " ++ show x ++ ": " ++ show
(isDoubleNan x)
    testNaN 10.3
    testNaN (0/0)

Compile with: ghc -Wall -O IsDoubleNan.hs IsDoubleNanPrim.s

I haven't benchmarked this but I would be interested to see the
comparison with the other versions on your benchmarks!

Cheers,
Sylvain


On 05/03/2018 22:53, Mateusz Kowalczyk wrote:

> Hi,
>
> Recently at a client I was profiling some code and isDoubleNaN lit up.
> We were checking a lot of doubles for NaN as that's what customer would
> send in.
>
> I went to investigate and I found that FFI is used to achieve this. I
> was always under the impression that FFI costs a little. I had at the
> time replaced the code with a hack with great results:
>
> ```
> isNaN' :: Double -> Bool
> isNaN' d = d /= d
> ```
>
> While this worked and provided good speedup in my case, this fails
> catastrophically if the program is compiled with -ffast-math. This is
> expected. I have since reverted it. Seeking an alternative solution I
> have thought about re-implementing the C code with a native Haskell
> version: after all it just checks a few bits. Apparently unsafeCoerce#
> and friends were a big no-no but I found
> https://phabricator.haskell.org/D3358 . I have implemented the code at
> the bottom of this post. Obviously it's missing endianness (compile-time
> switch).
>
> This seems to be faster for smaller `mkInput` list than Prelude.isNaN
> but slower slightly on the one below. The `/=` version is the fastest
> but very fragile.
>
> My question to you all is whether implementing a version of this
> function in Haskell makes sense and if not, why not? The
> stgDoubleToWord64 is implemented in CMM and I don't know anything about
> the costs of that.
>
> * Is there a cheaper alternative to FFI way?
> * If yes, does anyone know how to write it such that it compiles to same
> code but without the call overhead? I must have failed below as it's
> slower on some inputs.
>
> Basically if a faster way exists for isNaN, something I have to do a
> lot, I'd love to hear about it.
>
> I leave you with basic code I managed to come up with. 8.4.x only.
>
>
> ```
> {-# LANGUAGE MagicHash    #-}
> {-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm #-}
> module Main (main) where
>
> import GHC.Float
> import GHC.Prim
>
> isNaN' :: Double -> Bool
> isNaN' d = d /= d
>
> isNaNBits :: Double -> Bool
> isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of
>    1# -> case bits `and#` mantissaMask of
>      0## -> False
>      _ -> True
>    _ -> False
>    where
>      bits :: Word#
>      bits = stgDoubleToWord64 d
>
>      expMask, mantissaMask :: Word#
>      expMask = 0x7FF0000000000000##
>      mantissaMask = 0x000FFFFFFFFFFFFF##
>
> main :: IO ()
> main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000)
> `seq` pure ()
>    where
>      nan :: Double
>      nan = log (-1)
>
>      mkInput :: Int -> [Double]
>      mkInput n = take n $ cycle [1, nan]
>
>      sumFilter :: (Double -> Bool) -> [Double] -> Double
>      sumFilter p = Prelude.sum . Prelude.filter (not . p)
> ```
>

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs