GHC fails to fuse [1 .. 30000000 :: Double] but fuses fromIntegral <$> [1 :: Int .. 30000000]?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

GHC fails to fuse [1 .. 30000000 :: Double] but fuses fromIntegral <$> [1 :: Int .. 30000000]?

Mateusz Kowalczyk
Hi,

I had few minutes timeout waiting for CI today and I stumbled upon [1].
This is a many years old question and the optimal solution given there
was using Vector. I thought that for such a problem it should not be
necessary at all.

Indeed I wrote `mean :: [Int] -> Int` and got a much faster solution.
However the original signature was `mean :: [Double] -> Int`. After
trivial changes (change couple of places to Double and add a single
fromIntegral) I expected the same result: after all, code is nearly
exactly the same.

However to my disappointment, the code in question was much slower than
the vector solution: how could this be? Looking in Core, I saw that GHC
was not getting rid of [Double] like it was with [Int]. I was able to
convince GHC to go back to fusing the list into the worker with a `map
fromIntegral` but I am unhappy: why is this needed? I don't understand
why GHC would not decide to fuse the initial attempt. Honestly it seems
like a bug to me.

What are your thoughts? For reference here is the core with
fromIntegral: as expected, GHC fuses the list and inserts int2Double# as
it's generating it:

```
Main.$wgo =
  \ (w_s5xT :: GHC.Prim.Int#)
    (ww_s5xX :: GHC.Prim.Int#)
    (ww1_s5xY :: GHC.Prim.Double#) ->
    case w_s5xT of wild_Xz {
      __DEFAULT ->
        Main.$wgo
          (GHC.Prim.+# wild_Xz 1#)
          (GHC.Prim.+# ww_s5xX 1#)
          (GHC.Prim.+## ww1_s5xY (GHC.Prim.int2Double# wild_Xz));
      30000000# ->
        (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
    }
```

Contrast this with version that's commented out in [2]:

```
Main.$wgo =
  \ (w_s5vE :: [Double])
    (ww_s5vI :: GHC.Prim.Int#)
    (ww1_s5vJ :: GHC.Prim.Double#) ->
    case w_s5vE of _ [Occ=Dead] {
      [] -> (# ww_s5vI, ww1_s5vJ #);
      : y_a3dU ys_a3dV ->
        case y_a3dU of _ [Occ=Dead] { GHC.Types.D# y1_a3dG ->
        Main.$wgo
          ys_a3dV (GHC.Prim.+# ww_s5vI 1#) (GHC.Prim.+## ww1_s5vJ y1_a3dG)
        }
    }



    case Main.$wgo Main.main3 0# 0.0##

Main.main3 =
  GHC.Real.numericEnumFromTo
    @ Double
    GHC.Classes.$fOrdDouble
    GHC.Float.$fFractionalDouble
    Main.main5
    Main.main4

Main.main4 = GHC.Types.D# 3.0e7##

Main.main5 = GHC.Types.D# 1.0##
```

Why? There should be nothing stopping it from doing

```

Main.$wgo =
  \ (w_s5xT :: GHC.Prim.Double#)
    (ww_s5xX :: GHC.Prim.Int#)
    (ww1_s5xY :: GHC.Prim.Double#) ->
    case w_s5xT of wild_Xz {
      __DEFAULT ->
        Main.$wgo
          (GHC.Prim.+# wild_Xz 1#)
          (GHC.Prim.+# ww_s5xX 1#)
          (GHC.Prim.+## ww1_s5xY wild_Xz);
      3.0e7## ->
        (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
```

Perhaps it's afraid to make the pattern match on a floating point number?

Insights welcome.


[1]:
https://stackoverflow.com/questions/3300995/computing-the-mean-of-a-list-efficiently-in-haskell/45840148
[2]: https://stackoverflow.com/a/45840148/1432740
--
Mateusz K.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: GHC fails to fuse [1 .. 30000000 :: Double] but fuses fromIntegral <$> [1 :: Int .. 30000000]?

Mateusz Kowalczyk
On 08/23/2017 02:00 PM, Mateusz Kowalczyk wrote:

> Hi,
>
> I had few minutes timeout waiting for CI today and I stumbled upon [1].
> This is a many years old question and the optimal solution given there
> was using Vector. I thought that for such a problem it should not be
> necessary at all.
>
> Indeed I wrote `mean :: [Int] -> Int` and got a much faster solution.
> However the original signature was `mean :: [Double] -> Int`. After
> trivial changes (change couple of places to Double and add a single
> fromIntegral) I expected the same result: after all, code is nearly
> exactly the same.
>
> However to my disappointment, the code in question was much slower than
> the vector solution: how could this be? Looking in Core, I saw that GHC
> was not getting rid of [Double] like it was with [Int]. I was able to
> convince GHC to go back to fusing the list into the worker with a `map
> fromIntegral` but I am unhappy: why is this needed? I don't understand
> why GHC would not decide to fuse the initial attempt. Honestly it seems
> like a bug to me.
>
> What are your thoughts? For reference here is the core with
> fromIntegral: as expected, GHC fuses the list and inserts int2Double# as
> it's generating it:
>
> ```
> Main.$wgo =
>   \ (w_s5xT :: GHC.Prim.Int#)
>     (ww_s5xX :: GHC.Prim.Int#)
>     (ww1_s5xY :: GHC.Prim.Double#) ->
>     case w_s5xT of wild_Xz {
>       __DEFAULT ->
>         Main.$wgo
>           (GHC.Prim.+# wild_Xz 1#)
>           (GHC.Prim.+# ww_s5xX 1#)
>           (GHC.Prim.+## ww1_s5xY (GHC.Prim.int2Double# wild_Xz));
>       30000000# ->
>         (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
>     }
> ```
>
> Contrast this with version that's commented out in [2]:
>
> ```
> Main.$wgo =
>   \ (w_s5vE :: [Double])
>     (ww_s5vI :: GHC.Prim.Int#)
>     (ww1_s5vJ :: GHC.Prim.Double#) ->
>     case w_s5vE of _ [Occ=Dead] {
>       [] -> (# ww_s5vI, ww1_s5vJ #);
>       : y_a3dU ys_a3dV ->
>         case y_a3dU of _ [Occ=Dead] { GHC.Types.D# y1_a3dG ->
>         Main.$wgo
>           ys_a3dV (GHC.Prim.+# ww_s5vI 1#) (GHC.Prim.+## ww1_s5vJ y1_a3dG)
>         }
>     }
>
> …
>
>     case Main.$wgo Main.main3 0# 0.0##
> …
> Main.main3 =
>   GHC.Real.numericEnumFromTo
>     @ Double
>     GHC.Classes.$fOrdDouble
>     GHC.Float.$fFractionalDouble
>     Main.main5
>     Main.main4
> …
> Main.main4 = GHC.Types.D# 3.0e7##
> …
> Main.main5 = GHC.Types.D# 1.0##
> ```
>
> Why? There should be nothing stopping it from doing
>
> ```
>
> Main.$wgo =
>   \ (w_s5xT :: GHC.Prim.Double#)
>     (ww_s5xX :: GHC.Prim.Int#)
>     (ww1_s5xY :: GHC.Prim.Double#) ->
>     case w_s5xT of wild_Xz {
>       __DEFAULT ->
>         Main.$wgo
>           (GHC.Prim.+# wild_Xz 1#)
>           (GHC.Prim.+# ww_s5xX 1#)
>           (GHC.Prim.+## ww1_s5xY wild_Xz);
>       3.0e7## ->
>         (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
> ```
>
> Perhaps it's afraid to make the pattern match on a floating point number?
>
> Insights welcome.
>
>
> [1]:
> https://stackoverflow.com/questions/3300995/computing-the-mean-of-a-list-efficiently-in-haskell/45840148
> [2]: https://stackoverflow.com/a/45840148/1432740
>

Correction: The signatures of mean I was referring to were supposed to
be ‘[Int] -> Double’ and then ‘[Double] -> Double’. Somewhat obviously.

--
Mateusz K.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: GHC fails to fuse [1 .. 30000000 :: Double] but fuses fromIntegral <$> [1 :: Int .. 30000000]?

Siddhanathan Shanmugam
I believe it is because of numericEnumFromTo which forces the elements of the list.


On Wed, Aug 23, 2017 at 10:54 AM, Mateusz Kowalczyk <[hidden email]> wrote:
On 08/23/2017 02:00 PM, Mateusz Kowalczyk wrote:
> Hi,
>
> I had few minutes timeout waiting for CI today and I stumbled upon [1].
> This is a many years old question and the optimal solution given there
> was using Vector. I thought that for such a problem it should not be
> necessary at all.
>
> Indeed I wrote `mean :: [Int] -> Int` and got a much faster solution.
> However the original signature was `mean :: [Double] -> Int`. After
> trivial changes (change couple of places to Double and add a single
> fromIntegral) I expected the same result: after all, code is nearly
> exactly the same.
>
> However to my disappointment, the code in question was much slower than
> the vector solution: how could this be? Looking in Core, I saw that GHC
> was not getting rid of [Double] like it was with [Int]. I was able to
> convince GHC to go back to fusing the list into the worker with a `map
> fromIntegral` but I am unhappy: why is this needed? I don't understand
> why GHC would not decide to fuse the initial attempt. Honestly it seems
> like a bug to me.
>
> What are your thoughts? For reference here is the core with
> fromIntegral: as expected, GHC fuses the list and inserts int2Double# as
> it's generating it:
>
> ```
> Main.$wgo =
>   \ (w_s5xT :: GHC.Prim.Int#)
>     (ww_s5xX :: GHC.Prim.Int#)
>     (ww1_s5xY :: GHC.Prim.Double#) ->
>     case w_s5xT of wild_Xz {
>       __DEFAULT ->
>         Main.$wgo
>           (GHC.Prim.+# wild_Xz 1#)
>           (GHC.Prim.+# ww_s5xX 1#)
>           (GHC.Prim.+## ww1_s5xY (GHC.Prim.int2Double# wild_Xz));
>       30000000# ->
>         (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
>     }
> ```
>
> Contrast this with version that's commented out in [2]:
>
> ```
> Main.$wgo =
>   \ (w_s5vE :: [Double])
>     (ww_s5vI :: GHC.Prim.Int#)
>     (ww1_s5vJ :: GHC.Prim.Double#) ->
>     case w_s5vE of _ [Occ=Dead] {
>       [] -> (# ww_s5vI, ww1_s5vJ #);
>       : y_a3dU ys_a3dV ->
>         case y_a3dU of _ [Occ=Dead] { GHC.Types.D# y1_a3dG ->
>         Main.$wgo
>           ys_a3dV (GHC.Prim.+# ww_s5vI 1#) (GHC.Prim.+## ww1_s5vJ y1_a3dG)
>         }
>     }
>
> …
>
>     case Main.$wgo Main.main3 0# 0.0##
> …
> Main.main3 =
>   GHC.Real.numericEnumFromTo
>     @ Double
>     GHC.Classes.$fOrdDouble
>     GHC.Float.$fFractionalDouble
>     Main.main5
>     Main.main4
> …
> Main.main4 = GHC.Types.D# 3.0e7##
> …
> Main.main5 = GHC.Types.D# 1.0##
> ```
>
> Why? There should be nothing stopping it from doing
>
> ```
>
> Main.$wgo =
>   \ (w_s5xT :: GHC.Prim.Double#)
>     (ww_s5xX :: GHC.Prim.Int#)
>     (ww1_s5xY :: GHC.Prim.Double#) ->
>     case w_s5xT of wild_Xz {
>       __DEFAULT ->
>         Main.$wgo
>           (GHC.Prim.+# wild_Xz 1#)
>           (GHC.Prim.+# ww_s5xX 1#)
>           (GHC.Prim.+## ww1_s5xY wild_Xz);
>       3.0e7## ->
>         (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
> ```
>
> Perhaps it's afraid to make the pattern match on a floating point number?
>
> Insights welcome.
>
>
> [1]:
> https://stackoverflow.com/questions/3300995/computing-the-mean-of-a-list-efficiently-in-haskell/45840148
> [2]: https://stackoverflow.com/a/45840148/1432740
>

Correction: The signatures of mean I was referring to were supposed to
be ‘[Int] -> Double’ and then ‘[Double] -> Double’. Somewhat obviously.

--
Mateusz K.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: GHC fails to fuse [1 .. 30000000 :: Double] but fuses fromIntegral <$> [1 :: Int .. 30000000]?

Joachim Breitner-2
In reply to this post by Mateusz Kowalczyk
Hi,

fusion can only happen when the appropriate rewrite rules have been
defined in the library. For Int, that is the case:
http://hackage.haskell.org/package/base-4.10.0.0/docs/src/GHC.Enum.html#line-457

But the enumeration for general numeric types, defined here:
http://hackage.haskell.org/package/base-4.10.0.0/docs/src/GHC.Real.html#numericEnumFrom
does not have such a setup.

Presumably it could be added, but enumerating doubles is bad style
anyways; see https://ghc.haskell.org/trac/ghc/ticket/8012

Greetings,
Joachim


Am Mittwoch, den 23.08.2017, 14:00 +0100 schrieb Mateusz Kowalczyk:

> Hi,
>
> I had few minutes timeout waiting for CI today and I stumbled upon [1].
> This is a many years old question and the optimal solution given there
> was using Vector. I thought that for such a problem it should not be
> necessary at all.
>
> Indeed I wrote `mean :: [Int] -> Int` and got a much faster solution.
> However the original signature was `mean :: [Double] -> Int`. After
> trivial changes (change couple of places to Double and add a single
> fromIntegral) I expected the same result: after all, code is nearly
> exactly the same.
>
> However to my disappointment, the code in question was much slower than
> the vector solution: how could this be? Looking in Core, I saw that GHC
> was not getting rid of [Double] like it was with [Int]. I was able to
> convince GHC to go back to fusing the list into the worker with a `map
> fromIntegral` but I am unhappy: why is this needed? I don't understand
> why GHC would not decide to fuse the initial attempt. Honestly it seems
> like a bug to me.
>
> What are your thoughts? For reference here is the core with
> fromIntegral: as expected, GHC fuses the list and inserts int2Double# as
> it's generating it:
>
> ```
> Main.$wgo =
>   \ (w_s5xT :: GHC.Prim.Int#)
>     (ww_s5xX :: GHC.Prim.Int#)
>     (ww1_s5xY :: GHC.Prim.Double#) ->
>     case w_s5xT of wild_Xz {
>       __DEFAULT ->
>         Main.$wgo
>           (GHC.Prim.+# wild_Xz 1#)
>           (GHC.Prim.+# ww_s5xX 1#)
>           (GHC.Prim.+## ww1_s5xY (GHC.Prim.int2Double# wild_Xz));
>       30000000# ->
>         (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
>     }
> ```
>
> Contrast this with version that's commented out in [2]:
>
> ```
> Main.$wgo =
>   \ (w_s5vE :: [Double])
>     (ww_s5vI :: GHC.Prim.Int#)
>     (ww1_s5vJ :: GHC.Prim.Double#) ->
>     case w_s5vE of _ [Occ=Dead] {
>       [] -> (# ww_s5vI, ww1_s5vJ #);
>       : y_a3dU ys_a3dV ->
>         case y_a3dU of _ [Occ=Dead] { GHC.Types.D# y1_a3dG ->
>         Main.$wgo
>           ys_a3dV (GHC.Prim.+# ww_s5vI 1#) (GHC.Prim.+## ww1_s5vJ y1_a3dG)
>         }
>     }
>
> …
>
>     case Main.$wgo Main.main3 0# 0.0##
> …
> Main.main3 =
>   GHC.Real.numericEnumFromTo
>     @ Double
>     GHC.Classes.$fOrdDouble
>     GHC.Float.$fFractionalDouble
>     Main.main5
>     Main.main4
> …
> Main.main4 = GHC.Types.D# 3.0e7##
> …
> Main.main5 = GHC.Types.D# 1.0##
> ```
>
> Why? There should be nothing stopping it from doing
>
> ```
>
> Main.$wgo =
>   \ (w_s5xT :: GHC.Prim.Double#)
>     (ww_s5xX :: GHC.Prim.Int#)
>     (ww1_s5xY :: GHC.Prim.Double#) ->
>     case w_s5xT of wild_Xz {
>       __DEFAULT ->
>         Main.$wgo
>           (GHC.Prim.+# wild_Xz 1#)
>           (GHC.Prim.+# ww_s5xX 1#)
>           (GHC.Prim.+## ww1_s5xY wild_Xz);
>       3.0e7## ->
>         (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #)
> ```
>
> Perhaps it's afraid to make the pattern match on a floating point number?
>
> Insights welcome.
>
>
> [1]:
> https://stackoverflow.com/questions/3300995/computing-the-mean-of-a-list-efficiently-in-haskell/45840148
> [2]: https://stackoverflow.com/a/45840148/1432740
--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

--
Joachim “nomeata” Breitner
  [hidden email]
  https://www.joachim-breitner.de/

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

signature.asc (849 bytes) Download Attachment