Data.List.sortBy creates unnecessary thunks

Previous Topic Next Topic
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
Report Content as Inappropriate

Data.List.sortBy creates unnecessary thunks

Siddhanathan Shanmugam
Data.List.sortBy uses a DList-like representation to efficiently chunk a list, but the final conversion from DList to List is not strict. This leaves unnecessary suspensions while sorting. This was first noticed in https://github.com/ghc-proposals/ghc-proposals/pull/50, where sorting a random list with a naive merge sort is faster than the one in base.

Proposing the following change, which should improve performance and space usage.

diff --git a/libraries/base/Data/OldList.hs b/libraries/base/Data/OldList.hs
index c3618c42a9..a68c4f42af 100644
--- a/libraries/base/Data/OldList.hs
+++ b/libraries/base/Data/OldList.hs
@@ -851,7 +851,7 @@ sortBy cmp = mergeAll . sequences
     ascending a as (b:bs)
       | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
-    ascending a as bs   = as [a]: sequences bs
+    ascending a as bs   = as [a] `seq` as [a] : sequences bs
     mergeAll [x] = x
     mergeAll xs  = mergeAll (mergePairs xs)

And, as a side note, can we please add more type signatures and comments to the sort/sortBy function?


Libraries mailing list
[hidden email]