Hi,
I had few minutes timeout waiting for CI today and I stumbled upon [1]. This is a many years old question and the optimal solution given there was using Vector. I thought that for such a problem it should not be necessary at all. Indeed I wrote `mean :: [Int] -> Int` and got a much faster solution. However the original signature was `mean :: [Double] -> Int`. After trivial changes (change couple of places to Double and add a single fromIntegral) I expected the same result: after all, code is nearly exactly the same. However to my disappointment, the code in question was much slower than the vector solution: how could this be? Looking in Core, I saw that GHC was not getting rid of [Double] like it was with [Int]. I was able to convince GHC to go back to fusing the list into the worker with a `map fromIntegral` but I am unhappy: why is this needed? I don't understand why GHC would not decide to fuse the initial attempt. Honestly it seems like a bug to me. What are your thoughts? For reference here is the core with fromIntegral: as expected, GHC fuses the list and inserts int2Double# as it's generating it: ``` Main.$wgo = \ (w_s5xT :: GHC.Prim.Int#) (ww_s5xX :: GHC.Prim.Int#) (ww1_s5xY :: GHC.Prim.Double#) -> case w_s5xT of wild_Xz { __DEFAULT -> Main.$wgo (GHC.Prim.+# wild_Xz 1#) (GHC.Prim.+# ww_s5xX 1#) (GHC.Prim.+## ww1_s5xY (GHC.Prim.int2Double# wild_Xz)); 30000000# -> (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #) } ``` Contrast this with version that's commented out in [2]: ``` Main.$wgo = \ (w_s5vE :: [Double]) (ww_s5vI :: GHC.Prim.Int#) (ww1_s5vJ :: GHC.Prim.Double#) -> case w_s5vE of _ [Occ=Dead] { [] -> (# ww_s5vI, ww1_s5vJ #); : y_a3dU ys_a3dV -> case y_a3dU of _ [Occ=Dead] { GHC.Types.D# y1_a3dG -> Main.$wgo ys_a3dV (GHC.Prim.+# ww_s5vI 1#) (GHC.Prim.+## ww1_s5vJ y1_a3dG) } } … case Main.$wgo Main.main3 0# 0.0## … Main.main3 = GHC.Real.numericEnumFromTo @ Double GHC.Classes.$fOrdDouble GHC.Float.$fFractionalDouble Main.main5 Main.main4 … Main.main4 = GHC.Types.D# 3.0e7## … Main.main5 = GHC.Types.D# 1.0## ``` Why? There should be nothing stopping it from doing ``` Main.$wgo = \ (w_s5xT :: GHC.Prim.Double#) (ww_s5xX :: GHC.Prim.Int#) (ww1_s5xY :: GHC.Prim.Double#) -> case w_s5xT of wild_Xz { __DEFAULT -> Main.$wgo (GHC.Prim.+# wild_Xz 1#) (GHC.Prim.+# ww_s5xX 1#) (GHC.Prim.+## ww1_s5xY wild_Xz); 3.0e7## -> (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #) ``` Perhaps it's afraid to make the pattern match on a floating point number? Insights welcome. [1]: https://stackoverflow.com/questions/3300995/computing-the-mean-of-a-list-efficiently-in-haskell/45840148 [2]: https://stackoverflow.com/a/45840148/1432740 -- Mateusz K. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
On 08/23/2017 02:00 PM, Mateusz Kowalczyk wrote:
> Hi, > > I had few minutes timeout waiting for CI today and I stumbled upon [1]. > This is a many years old question and the optimal solution given there > was using Vector. I thought that for such a problem it should not be > necessary at all. > > Indeed I wrote `mean :: [Int] -> Int` and got a much faster solution. > However the original signature was `mean :: [Double] -> Int`. After > trivial changes (change couple of places to Double and add a single > fromIntegral) I expected the same result: after all, code is nearly > exactly the same. > > However to my disappointment, the code in question was much slower than > the vector solution: how could this be? Looking in Core, I saw that GHC > was not getting rid of [Double] like it was with [Int]. I was able to > convince GHC to go back to fusing the list into the worker with a `map > fromIntegral` but I am unhappy: why is this needed? I don't understand > why GHC would not decide to fuse the initial attempt. Honestly it seems > like a bug to me. > > What are your thoughts? For reference here is the core with > fromIntegral: as expected, GHC fuses the list and inserts int2Double# as > it's generating it: > > ``` > Main.$wgo = > \ (w_s5xT :: GHC.Prim.Int#) > (ww_s5xX :: GHC.Prim.Int#) > (ww1_s5xY :: GHC.Prim.Double#) -> > case w_s5xT of wild_Xz { > __DEFAULT -> > Main.$wgo > (GHC.Prim.+# wild_Xz 1#) > (GHC.Prim.+# ww_s5xX 1#) > (GHC.Prim.+## ww1_s5xY (GHC.Prim.int2Double# wild_Xz)); > 30000000# -> > (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #) > } > ``` > > Contrast this with version that's commented out in [2]: > > ``` > Main.$wgo = > \ (w_s5vE :: [Double]) > (ww_s5vI :: GHC.Prim.Int#) > (ww1_s5vJ :: GHC.Prim.Double#) -> > case w_s5vE of _ [Occ=Dead] { > [] -> (# ww_s5vI, ww1_s5vJ #); > : y_a3dU ys_a3dV -> > case y_a3dU of _ [Occ=Dead] { GHC.Types.D# y1_a3dG -> > Main.$wgo > ys_a3dV (GHC.Prim.+# ww_s5vI 1#) (GHC.Prim.+## ww1_s5vJ y1_a3dG) > } > } > > … > > case Main.$wgo Main.main3 0# 0.0## > … > Main.main3 = > GHC.Real.numericEnumFromTo > @ Double > GHC.Classes.$fOrdDouble > GHC.Float.$fFractionalDouble > Main.main5 > Main.main4 > … > Main.main4 = GHC.Types.D# 3.0e7## > … > Main.main5 = GHC.Types.D# 1.0## > ``` > > Why? There should be nothing stopping it from doing > > ``` > > Main.$wgo = > \ (w_s5xT :: GHC.Prim.Double#) > (ww_s5xX :: GHC.Prim.Int#) > (ww1_s5xY :: GHC.Prim.Double#) -> > case w_s5xT of wild_Xz { > __DEFAULT -> > Main.$wgo > (GHC.Prim.+# wild_Xz 1#) > (GHC.Prim.+# ww_s5xX 1#) > (GHC.Prim.+## ww1_s5xY wild_Xz); > 3.0e7## -> > (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #) > ``` > > Perhaps it's afraid to make the pattern match on a floating point number? > > Insights welcome. > > > [1]: > https://stackoverflow.com/questions/3300995/computing-the-mean-of-a-list-efficiently-in-haskell/45840148 > [2]: https://stackoverflow.com/a/45840148/1432740 > Correction: The signatures of mean I was referring to were supposed to be ‘[Int] -> Double’ and then ‘[Double] -> Double’. Somewhat obviously. -- Mateusz K. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
I believe it is because of numericEnumFromTo which forces the elements of the list.
On Wed, Aug 23, 2017 at 10:54 AM, Mateusz Kowalczyk <[hidden email]> wrote:
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Mateusz Kowalczyk
Hi,
fusion can only happen when the appropriate rewrite rules have been defined in the library. For Int, that is the case: http://hackage.haskell.org/package/base-4.10.0.0/docs/src/GHC.Enum.html#line-457 But the enumeration for general numeric types, defined here: http://hackage.haskell.org/package/base-4.10.0.0/docs/src/GHC.Real.html#numericEnumFrom does not have such a setup. Presumably it could be added, but enumerating doubles is bad style anyways; see https://ghc.haskell.org/trac/ghc/ticket/8012 Greetings, Joachim Am Mittwoch, den 23.08.2017, 14:00 +0100 schrieb Mateusz Kowalczyk: > Hi, > > I had few minutes timeout waiting for CI today and I stumbled upon [1]. > This is a many years old question and the optimal solution given there > was using Vector. I thought that for such a problem it should not be > necessary at all. > > Indeed I wrote `mean :: [Int] -> Int` and got a much faster solution. > However the original signature was `mean :: [Double] -> Int`. After > trivial changes (change couple of places to Double and add a single > fromIntegral) I expected the same result: after all, code is nearly > exactly the same. > > However to my disappointment, the code in question was much slower than > the vector solution: how could this be? Looking in Core, I saw that GHC > was not getting rid of [Double] like it was with [Int]. I was able to > convince GHC to go back to fusing the list into the worker with a `map > fromIntegral` but I am unhappy: why is this needed? I don't understand > why GHC would not decide to fuse the initial attempt. Honestly it seems > like a bug to me. > > What are your thoughts? For reference here is the core with > fromIntegral: as expected, GHC fuses the list and inserts int2Double# as > it's generating it: > > ``` > Main.$wgo = > \ (w_s5xT :: GHC.Prim.Int#) > (ww_s5xX :: GHC.Prim.Int#) > (ww1_s5xY :: GHC.Prim.Double#) -> > case w_s5xT of wild_Xz { > __DEFAULT -> > Main.$wgo > (GHC.Prim.+# wild_Xz 1#) > (GHC.Prim.+# ww_s5xX 1#) > (GHC.Prim.+## ww1_s5xY (GHC.Prim.int2Double# wild_Xz)); > 30000000# -> > (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #) > } > ``` > > Contrast this with version that's commented out in [2]: > > ``` > Main.$wgo = > \ (w_s5vE :: [Double]) > (ww_s5vI :: GHC.Prim.Int#) > (ww1_s5vJ :: GHC.Prim.Double#) -> > case w_s5vE of _ [Occ=Dead] { > [] -> (# ww_s5vI, ww1_s5vJ #); > : y_a3dU ys_a3dV -> > case y_a3dU of _ [Occ=Dead] { GHC.Types.D# y1_a3dG -> > Main.$wgo > ys_a3dV (GHC.Prim.+# ww_s5vI 1#) (GHC.Prim.+## ww1_s5vJ y1_a3dG) > } > } > > … > > case Main.$wgo Main.main3 0# 0.0## > … > Main.main3 = > GHC.Real.numericEnumFromTo > @ Double > GHC.Classes.$fOrdDouble > GHC.Float.$fFractionalDouble > Main.main5 > Main.main4 > … > Main.main4 = GHC.Types.D# 3.0e7## > … > Main.main5 = GHC.Types.D# 1.0## > ``` > > Why? There should be nothing stopping it from doing > > ``` > > Main.$wgo = > \ (w_s5xT :: GHC.Prim.Double#) > (ww_s5xX :: GHC.Prim.Int#) > (ww1_s5xY :: GHC.Prim.Double#) -> > case w_s5xT of wild_Xz { > __DEFAULT -> > Main.$wgo > (GHC.Prim.+# wild_Xz 1#) > (GHC.Prim.+# ww_s5xX 1#) > (GHC.Prim.+## ww1_s5xY wild_Xz); > 3.0e7## -> > (# GHC.Prim.+# ww_s5xX 1#, GHC.Prim.+## ww1_s5xY 3.0e7## #) > ``` > > Perhaps it's afraid to make the pattern match on a floating point number? > > Insights welcome. > > > [1]: > https://stackoverflow.com/questions/3300995/computing-the-mean-of-a-list-efficiently-in-haskell/45840148 > [2]: https://stackoverflow.com/a/45840148/1432740 [hidden email] http://www.joachim-breitner.de/ -- Joachim “nomeata” Breitner [hidden email] https://www.joachim-breitner.de/ _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. signature.asc (849 bytes) Download Attachment |
Free forum by Nabble | Edit this page |