Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Gregory Popovitch

Motivation:
----------

Data.List.sort is a very important functionality in Haskell. I believe that
the proposed implementation is:

- significantly faster than the current implementation on unsorted lists,
typically 14% to 27% faster
- more laziness-friendly, i.e.:
    take 3 $ sort l
  will require significantly less comparisons than the current
implementation

Proposed Implementation
-----------------------

sort :: (Ord a) => [a] -> [a]
sort =  sortBy compare

sortBy cmp [] = []
sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' : reduce t
                         where v' = merge v w
                               x' = merge x y

    reduce (x:y:t) = merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys')
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


Effect and Interactions
-----------------------

I have a stack project with a criterion test for this new implementation,
available at https://github.com/greg7mdp/ghc-sort.
I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and had the following
results:

- sorting of random lists of integers is 27% faster
- sorting of random lists of strings is 14% faster
- sorting of already sorted lists is significantly slower, but still much
faster than sorting random lists
- proposed version is more laziness friendly. For example this version of
sortBy requires 11 comparisons to find
  the smallest element of a 15 element list, while the default
Data.List.sortBy requires 15 comparisons.
  (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs)


Test results
------------

Criterion output (descending/ascending results are for already sorted
lists).
I barely understand what Criterion does, and I am puzzled with the various
"T" output - maybe there is a bug in my bench code:

vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec ghc-sort
benchmarking ascending ints/ghc
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 160.6 ms   (153.4
ms .. 167.8 ms)
                     0.997 R²   (0.986 R² .. 1.000 R²)
mean                 161.7 ms   (158.3 ms .. 165.9 ms)
std dev              5.210 ms   (3.193 ms .. 7.006 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking ascending ints/greg
TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms .. 554.9 ms)
                     0.996 R²   (0.987 R² .. 1.000 R²)
mean                 466.2 ms   (449.0 ms .. 475.0 ms)
std dev              14.94 ms   (0.0 s .. 15.29 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking descending ints/ghc
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 165.1 ms   (148.2
ms .. 178.2 ms)
                     0.991 R²   (0.957 R² .. 1.000 R²)
mean                 158.7 ms   (154.0 ms .. 164.3 ms)
std dev              7.075 ms   (4.152 ms .. 9.903 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking descending ints/greg
TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms .. 508.3 ms)
                     0.999 R²   (0.995 R² .. 1.000 R²)
mean                 476.0 ms   (467.5 ms .. 480.0 ms)
std dev              7.447 ms   (67.99 as .. 7.865 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/ghc
TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s .. 3.019 s)
                     0.999 R²   (0.997 R² .. 1.000 R²)
mean                 2.812 s    (2.785 s .. 2.838 s)
std dev              44.06 ms   (543.9 as .. 44.97 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s .. 2.076 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.028 s    (2.019 s .. 2.033 s)
std dev              7.832 ms   (0.0 s .. 8.178 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking shakespeare/ghc
TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s .. 6.694 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.499 s    (6.468 s .. 6.518 s)
std dev              28.85 ms   (0.0 s .. 32.62 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking shakespeare/greg
TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s .. 5.763 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 5.582 s    (5.537 s .. 5.607 s)
std dev              39.30 ms   (0.0 s .. 43.49 ms)
variance introduced by outliers: 19% (moderately inflated)


Costs and Drawbacks
-------------------

The only cost I see is the reduced performance when sorting already sorted
lists. However, since this remains quite efficient, indeed over 4 times
faster than sorting unsorted lists, I think it is an acceptable tradeoff.

Final note
----------

My Haskell is very rusty. I worked on this a couple years ago when I was
learning Haskell, and meant to propose it to the Haskell community, but
never got to it at the time.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Siddhanathan Shanmugam
Thank you! This identifies a space leak in base which went unnoticed for 7 years.

Your implementation can be improved further. Instead of splitting into pairs, you could instead split into lists of sorted sublists by replacing the pairs function with the following

    pair = foldr f []
      where
        f x [] = [[x]]
        f x (y:ys)
          | x `cmp` head y == LT = (x:y):ys
          | otherwise            = [x]:y:ys

This should give you the same performance improvements for sorting random lists, but better performance while sorting ascending lists.

The version in base takes it one step further by using a DList to handle the descending case efficiently as well, except there's a space leak right now because of which it is slower.

On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch <[hidden email]> wrote:

Motivation:
----------

Data.List.sort is a very important functionality in Haskell. I believe that
the proposed implementation is:

- significantly faster than the current implementation on unsorted lists,
typically 14% to 27% faster
- more laziness-friendly, i.e.:
    take 3 $ sort l
  will require significantly less comparisons than the current
implementation

Proposed Implementation
-----------------------

sort :: (Ord a) => [a] -> [a]
sort =  sortBy compare

sortBy cmp [] = []
sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' : reduce t
                         where v' = merge v w
                               x' = merge x y

    reduce (x:y:t) = merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys')
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


Effect and Interactions
-----------------------

I have a stack project with a criterion test for this new implementation,
available at https://github.com/greg7mdp/ghc-sort.
I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and had the following
results:

- sorting of random lists of integers is 27% faster
- sorting of random lists of strings is 14% faster
- sorting of already sorted lists is significantly slower, but still much
faster than sorting random lists
- proposed version is more laziness friendly. For example this version of
sortBy requires 11 comparisons to find
  the smallest element of a 15 element list, while the default
Data.List.sortBy requires 15 comparisons.
  (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs)


Test results
------------

Criterion output (descending/ascending results are for already sorted
lists).
I barely understand what Criterion does, and I am puzzled with the various
"T" output - maybe there is a bug in my bench code:

vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec ghc-sort
benchmarking ascending ints/ghc
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 160.6 ms   (153.4
ms .. 167.8 ms)
                     0.997 R²   (0.986 R² .. 1.000 R²)
mean                 161.7 ms   (158.3 ms .. 165.9 ms)
std dev              5.210 ms   (3.193 ms .. 7.006 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking ascending ints/greg
TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms .. 554.9 ms)
                     0.996 R²   (0.987 R² .. 1.000 R²)
mean                 466.2 ms   (449.0 ms .. 475.0 ms)
std dev              14.94 ms   (0.0 s .. 15.29 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking descending ints/ghc
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 165.1 ms   (148.2
ms .. 178.2 ms)
                     0.991 R²   (0.957 R² .. 1.000 R²)
mean                 158.7 ms   (154.0 ms .. 164.3 ms)
std dev              7.075 ms   (4.152 ms .. 9.903 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking descending ints/greg
TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms .. 508.3 ms)
                     0.999 R²   (0.995 R² .. 1.000 R²)
mean                 476.0 ms   (467.5 ms .. 480.0 ms)
std dev              7.447 ms   (67.99 as .. 7.865 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/ghc
TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s .. 3.019 s)
                     0.999 R²   (0.997 R² .. 1.000 R²)
mean                 2.812 s    (2.785 s .. 2.838 s)
std dev              44.06 ms   (543.9 as .. 44.97 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s .. 2.076 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.028 s    (2.019 s .. 2.033 s)
std dev              7.832 ms   (0.0 s .. 8.178 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking shakespeare/ghc
TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s .. 6.694 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.499 s    (6.468 s .. 6.518 s)
std dev              28.85 ms   (0.0 s .. 32.62 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking shakespeare/greg
TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s .. 5.763 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 5.582 s    (5.537 s .. 5.607 s)
std dev              39.30 ms   (0.0 s .. 43.49 ms)
variance introduced by outliers: 19% (moderately inflated)


Costs and Drawbacks
-------------------

The only cost I see is the reduced performance when sorting already sorted
lists. However, since this remains quite efficient, indeed over 4 times
faster than sorting unsorted lists, I think it is an acceptable tradeoff.

Final note
----------

My Haskell is very rusty. I worked on this a couple years ago when I was
learning Haskell, and meant to propose it to the Haskell community, but
never got to it at the time.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Gregory Popovitch
Thank you @Siddhanathan! I welcome any improvement you may make, as I said I
am very far from a Haskell expert.
 
I just tested your change with my test project
(https://github.com/greg7mdp/ghc-sort)
and here are my results (mean times in ms):

input                        GHC sort          Orig proposal        your
change
----------------------------------------------------------------------------
---
sorted ints (ascending)      153               467                  139
sorted ints (descending)     152               472                  599
random ints                 2824              2077                 2126
random strings              6564              5613                 5983
 
Your change is a definite improvement for sorted integers in ascending
order, but is worse for other cases.

Is there a real need to optimize the sort for already sorted list? Of course
it should not be a degenerate
case and take longer than sorting random numbers, but this is not the case
here. Sorting already sorted
lists is, even with my version, over 4 times faster than sorting random
lists. This sounds perfectly
acceptable to me, and I feel that trying to optimize this specific case
further, if it comes at the
detriment of the general case, is not desirable.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 11:41 AM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Thank you! This identifies a space leak in base which went unnoticed for 7
years.

Your implementation can be improved further. Instead of splitting into
pairs, you could instead split into lists of sorted sublists by replacing
the pairs function with the following

    pair = foldr f []
      where
        f x [] = [[x]]
        f x (y:ys)
          | x `cmp` head y == LT = (x:y):ys
          | otherwise            = [x]:y:ys

This should give you the same performance improvements for sorting random
lists, but better performance while sorting ascending lists.

The version in base takes it one step further by using a DList to handle the
descending case efficiently as well, except there's a space leak right now
because of which it is slower.

On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch <[hidden email]>
wrote:



        Motivation:
        ----------
       
        Data.List.sort is a very important functionality in Haskell. I
believe that
        the proposed implementation is:
       
        - significantly faster than the current implementation on unsorted
lists,
        typically 14% to 27% faster
        - more laziness-friendly, i.e.:
            take 3 $ sort l
          will require significantly less comparisons than the current
        implementation
       
        Proposed Implementation
        -----------------------
       
        sort :: (Ord a) => [a] -> [a]
        sort =  sortBy compare
       
        sortBy cmp [] = []
        sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
          where
            pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                         | otherwise        = [x, y] : pair t
            pair [x] = [[x]]
            pair []  = []
       
            reduce (v:w:x:y:t) = merge v' x' : reduce t
                                 where v' = merge v w
                                       x' = merge x y
       
            reduce (x:y:t) = merge x y : reduce t
            reduce xs      = xs
       
            merge xs []           = xs
            merge []  ys          = ys
            merge xs@(x:xs') ys@(y:ys')
                 | x `cmp` y == GT  = y : merge xs  ys'
                 | otherwise        = x : merge xs' ys
       
       
        Effect and Interactions
        -----------------------
       
        I have a stack project with a criterion test for this new
implementation,
        available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> .
        I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and had the
following
        results:
       
        - sorting of random lists of integers is 27% faster
        - sorting of random lists of strings is 14% faster
        - sorting of already sorted lists is significantly slower, but still
much
        faster than sorting random lists
        - proposed version is more laziness friendly. For example this
version of
        sortBy requires 11 comparisons to find
          the smallest element of a 15 element list, while the default
        Data.List.sortBy requires 15 comparisons.
          (see
       
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> )
       
       
        Test results
        ------------
       
        Criterion output (descending/ascending results are for already
sorted
        lists).
        I barely understand what Criterion does, and I am puzzled with the
various
        "T" output - maybe there is a bug in my bench code:
       
        vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec ghc-sort
        benchmarking ascending ints/ghc
        TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 160.6 ms
(153.4
        ms .. 167.8 ms)
                             0.997 R²   (0.986 R² .. 1.000 R²)
        mean                 161.7 ms   (158.3 ms .. 165.9 ms)
        std dev              5.210 ms   (3.193 ms .. 7.006 ms)
        variance introduced by outliers: 12% (moderately inflated)
       
        benchmarking ascending ints/greg
        TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms .. 554.9
ms)
                             0.996 R²   (0.987 R² .. 1.000 R²)
        mean                 466.2 ms   (449.0 ms .. 475.0 ms)
        std dev              14.94 ms   (0.0 s .. 15.29 ms)
        variance introduced by outliers: 19% (moderately inflated)
       
        benchmarking descending ints/ghc
        TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 165.1 ms
(148.2
        ms .. 178.2 ms)
                             0.991 R²   (0.957 R² .. 1.000 R²)
        mean                 158.7 ms   (154.0 ms .. 164.3 ms)
        std dev              7.075 ms   (4.152 ms .. 9.903 ms)
        variance introduced by outliers: 12% (moderately inflated)
       
        benchmarking descending ints/greg
        TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms .. 508.3
ms)
                             0.999 R²   (0.995 R² .. 1.000 R²)
        mean                 476.0 ms   (467.5 ms .. 480.0 ms)
        std dev              7.447 ms   (67.99 as .. 7.865 ms)
        variance introduced by outliers: 19% (moderately inflated)
       
        benchmarking random ints/ghc
        TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s .. 3.019 s)
                             0.999 R²   (0.997 R² .. 1.000 R²)
        mean                 2.812 s    (2.785 s .. 2.838 s)
        std dev              44.06 ms   (543.9 as .. 44.97 ms)
        variance introduced by outliers: 19% (moderately inflated)
       
        benchmarking random ints/greg
        TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s .. 2.076 s)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 2.028 s    (2.019 s .. 2.033 s)
        std dev              7.832 ms   (0.0 s .. 8.178 ms)
        variance introduced by outliers: 19% (moderately inflated)
       
        benchmarking shakespeare/ghc
        TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s .. 6.694 s)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 6.499 s    (6.468 s .. 6.518 s)
        std dev              28.85 ms   (0.0 s .. 32.62 ms)
        variance introduced by outliers: 19% (moderately inflated)
       
        benchmarking shakespeare/greg
        TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s .. 5.763 s)
                             1.000 R²   (0.999 R² .. 1.000 R²)
        mean                 5.582 s    (5.537 s .. 5.607 s)
        std dev              39.30 ms   (0.0 s .. 43.49 ms)
        variance introduced by outliers: 19% (moderately inflated)
       
       
        Costs and Drawbacks
        -------------------
       
        The only cost I see is the reduced performance when sorting already
sorted
        lists. However, since this remains quite efficient, indeed over 4
times
        faster than sorting unsorted lists, I think it is an acceptable
tradeoff.
       
        Final note
        ----------
       
        My Haskell is very rusty. I worked on this a couple years ago when I
was
        learning Haskell, and meant to propose it to the Haskell community,
but
        never got to it at the time.
       
        _______________________________________________
        Libraries mailing list
        [hidden email]
        http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>
       



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Siddhanathan Shanmugam
Interesting. You are right, performance for sorting random lists has priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]

    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs

    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs

    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)

    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs

    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]> wrote:
Thank you @Siddhanathan! I welcome any improvement you may make, as I said I
am very far from a Haskell expert.

I just tested your change with my test project
(https://github.com/greg7mdp/ghc-sort)
and here are my results (mean times in ms):

input                        GHC sort          Orig proposal        your
change
----------------------------------------------------------------------------
---
sorted ints (ascending)      153               467                  139
sorted ints (descending)     152               472                  599
random ints                 2824              2077                 2126
random strings              6564              5613                 5983

Your change is a definite improvement for sorted integers in ascending
order, but is worse for other cases.

Is there a real need to optimize the sort for already sorted list? Of course
it should not be a degenerate
case and take longer than sorting random numbers, but this is not the case
here. Sorting already sorted
lists is, even with my version, over 4 times faster than sorting random
lists. This sounds perfectly
acceptable to me, and I feel that trying to optimize this specific case
further, if it comes at the
detriment of the general case, is not desirable.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 11:41 AM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Thank you! This identifies a space leak in base which went unnoticed for 7
years.

Your implementation can be improved further. Instead of splitting into
pairs, you could instead split into lists of sorted sublists by replacing
the pairs function with the following

    pair = foldr f []
      where
        f x [] = [[x]]
        f x (y:ys)
          | x `cmp` head y == LT = (x:y):ys
          | otherwise            = [x]:y:ys

This should give you the same performance improvements for sorting random
lists, but better performance while sorting ascending lists.

The version in base takes it one step further by using a DList to handle the
descending case efficiently as well, except there's a space leak right now
because of which it is slower.

On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch <[hidden email]>
wrote:



        Motivation:
        ----------

        Data.List.sort is a very important functionality in Haskell. I
believe that
        the proposed implementation is:

        - significantly faster than the current implementation on unsorted
lists,
        typically 14% to 27% faster
        - more laziness-friendly, i.e.:
            take 3 $ sort l
          will require significantly less comparisons than the current
        implementation

        Proposed Implementation
        -----------------------

        sort :: (Ord a) => [a] -> [a]
        sort =  sortBy compare

        sortBy cmp [] = []
        sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
          where
            pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                         | otherwise        = [x, y] : pair t
            pair [x] = [[x]]
            pair []  = []

            reduce (v:w:x:y:t) = merge v' x' : reduce t
                                 where v' = merge v w
                                       x' = merge x y

            reduce (x:y:t) = merge x y : reduce t
            reduce xs      = xs

            merge xs []           = xs
            merge []  ys          = ys
            merge xs@(x:xs') ys@(y:ys')
                 | x `cmp` y == GT  = y : merge xs  ys'
                 | otherwise        = x : merge xs' ys


        Effect and Interactions
        -----------------------

        I have a stack project with a criterion test for this new
implementation,
        available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> .
        I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and had the
following
        results:

        - sorting of random lists of integers is 27% faster
        - sorting of random lists of strings is 14% faster
        - sorting of already sorted lists is significantly slower, but still
much
        faster than sorting random lists
        - proposed version is more laziness friendly. For example this
version of
        sortBy requires 11 comparisons to find
          the smallest element of a 15 element list, while the default
        Data.List.sortBy requires 15 comparisons.
          (see

https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> )


        Test results
        ------------

        Criterion output (descending/ascending results are for already
sorted
        lists).
        I barely understand what Criterion does, and I am puzzled with the
various
        "T" output - maybe there is a bug in my bench code:

        vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec ghc-sort
        benchmarking ascending ints/ghc
        TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 160.6 ms
(153.4
        ms .. 167.8 ms)
                             0.997 R²   (0.986 R² .. 1.000 R²)
        mean                 161.7 ms   (158.3 ms .. 165.9 ms)
        std dev              5.210 ms   (3.193 ms .. 7.006 ms)
        variance introduced by outliers: 12% (moderately inflated)

        benchmarking ascending ints/greg
        TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms .. 554.9
ms)
                             0.996 R²   (0.987 R² .. 1.000 R²)
        mean                 466.2 ms   (449.0 ms .. 475.0 ms)
        std dev              14.94 ms   (0.0 s .. 15.29 ms)
        variance introduced by outliers: 19% (moderately inflated)

        benchmarking descending ints/ghc
        TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime                 165.1 ms
(148.2
        ms .. 178.2 ms)
                             0.991 R²   (0.957 R² .. 1.000 R²)
        mean                 158.7 ms   (154.0 ms .. 164.3 ms)
        std dev              7.075 ms   (4.152 ms .. 9.903 ms)
        variance introduced by outliers: 12% (moderately inflated)

        benchmarking descending ints/greg
        TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms .. 508.3
ms)
                             0.999 R²   (0.995 R² .. 1.000 R²)
        mean                 476.0 ms   (467.5 ms .. 480.0 ms)
        std dev              7.447 ms   (67.99 as .. 7.865 ms)
        variance introduced by outliers: 19% (moderately inflated)

        benchmarking random ints/ghc
        TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s .. 3.019 s)
                             0.999 R²   (0.997 R² .. 1.000 R²)
        mean                 2.812 s    (2.785 s .. 2.838 s)
        std dev              44.06 ms   (543.9 as .. 44.97 ms)
        variance introduced by outliers: 19% (moderately inflated)

        benchmarking random ints/greg
        TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s .. 2.076 s)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 2.028 s    (2.019 s .. 2.033 s)
        std dev              7.832 ms   (0.0 s .. 8.178 ms)
        variance introduced by outliers: 19% (moderately inflated)

        benchmarking shakespeare/ghc
        TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s .. 6.694 s)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 6.499 s    (6.468 s .. 6.518 s)
        std dev              28.85 ms   (0.0 s .. 32.62 ms)
        variance introduced by outliers: 19% (moderately inflated)

        benchmarking shakespeare/greg
        TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s .. 5.763 s)
                             1.000 R²   (0.999 R² .. 1.000 R²)
        mean                 5.582 s    (5.537 s .. 5.607 s)
        std dev              39.30 ms   (0.0 s .. 43.49 ms)
        variance introduced by outliers: 19% (moderately inflated)


        Costs and Drawbacks
        -------------------

        The only cost I see is the reduced performance when sorting already
sorted
        lists. However, since this remains quite efficient, indeed over 4
times
        faster than sorting unsorted lists, I think it is an acceptable
tradeoff.

        Final note
        ----------

        My Haskell is very rusty. I worked on this a couple years ago when I
was
        learning Haskell, and meant to propose it to the Haskell community,
but
        never got to it at the time.

        _______________________________________________
        Libraries mailing list
        [hidden email]
        http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>






_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Evan Laforge
In reply to this post by Gregory Popovitch
On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]> wrote:
> Is there a real need to optimize the sort for already sorted list?

It's useful for data constructors with the precondition that the input
is sorted.  I generally keep things in sorted order, but since
sortedness is not tracked in the type and I'm not willing to face
corruption if a transformation slightly de-sorts the list I just
unconditionally sort the input.  It's nice to have that be relatively
cheap for the common case.

I'm just giving a case where it's useful, not advocating for the
status quo.  I don't know how to weight the trade-off for a general
purpose sort.  I'd personally be willing to switch to a specialized
sort that works well on mostly sorted input if it made a performance
difference.
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Gregory Popovitch
In reply to this post by Siddhanathan Shanmugam
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().
 
I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.
       
        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):
       
        input                        GHC sort          Orig proposal
your
        change
       
----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983
       
        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.
       
        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.
       
        Thanks,
       
        greg
       
        ________________________________
       
        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.
       


        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.
       
        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following
       
            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys
       
        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.
       
        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.
       
        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:
       
       
       
                Motivation:
                ----------
       
                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:
       
                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation
       
                Proposed Implementation
                -----------------------
       
                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare
       
                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []
       
                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y
       
                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs
       
                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys
       
       
                Effect and Interactions
                -----------------------
       
                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>
       
        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:
       
                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see
       
       
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>
       
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)
       


                Test results
                ------------
       
                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:
       
                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)
       
                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)
       
                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)
       
                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)
       
                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)
       
                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)
       
                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)
       
                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)
       
       
                Costs and Drawbacks
                -------------------
       
                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.
       
                Final note
                ----------
       
                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.
       
                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>
       
        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >
       
       
       
       
       



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Siddhanathan Shanmugam
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >










_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Tobias Florek
In reply to this post by Gregory Popovitch
Hi!

> Test results

See e.g.

 https://github.com/rust-lang/rust/pull/38192

for a good suite of test vectors for sort functions. I still remember
first benchmarking a sort I wrote in university which performed well,
but broke down (relatively) with test cases that are not random, sorted,
rev sorted.

Great work you two btw!

Cheers,
 Tob(ias Florek)


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

signature.asc (856 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Siddhanathan Shanmugam


On Sun, Mar 26, 2017 at 2:05 PM, Tobias Florek <[hidden email]> wrote:
Hi!

> Test results

See e.g.

 https://github.com/rust-lang/rust/pull/38192

That's a good idea actually. Maybe it would be a welcomed addition to the https://github.com/haskell-perf/sequences benchmarks.
 


for a good suite of test vectors for sort functions. I still remember
first benchmarking a sort I wrote in university which performed well,
but broke down (relatively) with test cases that are not random, sorted,
rev sorted.

Great work you two btw!

Cheers,
 Tob(ias Florek)


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Siddhanathan Shanmugam
In reply to this post by Siddhanathan Shanmugam
We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >











_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Gregory Popovitch
Pretty cool by the way, we now have a 31% improvement for sorting lists of random integers vs the current Data.List.sortBy.
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 12:53 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >











_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: fix for Data.List.sortBy

Gregory Popovitch
In reply to this post by Siddhanathan Shanmugam
Sid,
 
I'd be delighted to submit the patch, as long as I have permission (which I probably don't), you feel confident about the change and maybe a couple of other people agree.
 
Here is the proposed change. Tests shows significant speed improvement (30%) when sorting lists of random numbers, and same efficiency for sorting already sorted lists (both ascending and descending). 
 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 6:53 PM
To: Gregory Popovitch
Subject: RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Since I don't see any regressions, this doesn't really need CLC approval. The changes are also small enough that a Github PR may be accepted (otherwise, the change goes in via Phabricator).

Are you interested in implementing this patch? If yes, a standard Github PR should be fine. Right now gSort is a three line change I think. It will be changed in ghc/libraries/base/Data/OldList.hs on the ghc/ghc repo on Github.

I'm hoping for some more comments from other Haskellers, before pushing for this change in base. I feel like we may be missing a potential optimization that someone else might spot. So probably going to wait a few days. 


On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <[hidden email]> wrote:
Hi Sid,
 
Thanks, glad you looked into that. My understanding of the Haskell execution model is really poor, so I can't say one way or the other, but I felt that laziness ought to be considered as well, and I'm glad it was :-)
 
So  in conclusion it looks like we have a winner with your latest gSortBy version. How do we get this pushed to the GHC library?
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 2:12 PM
To: Gregory Popovitch

Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Hi Greg,

On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <[hidden email]> wrote:
Unfortunately, this optimization makes the sort less lazy, so doing something like:
 
take 4 $ sort l
 
requires more sorting of the list l with this change. I'm not sure it is a good tradeoff.
 

I think you're running without optimizations turned on. It is lazy in my case.

Also, the difference should be negligible (if any at all). Here's an example of the list being sorted:

[11,4,6,8,2,5,1,7,9,55,11,3]
...
[[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
...
[[1,2,4,5,6,7,8,9,11,55],[3,11]]
 * 1 3
 * 2 3
 * 4 3
 * 4 11
[1,2,3,4]

The number of operations saved is only in the last merge. It's only lazy at this step.

So we save at most one traversal of the list, which is not too expensive since our worst case bounds is O(n log n) anyway.

This should mean that the asymptotic performance should be identical, regardless of the number of comparisons saved. Of course, you do get better constants, but I would be surprised if those constants translated to significantly better performance for a reasonable size list.
 
 
 I do agree that it would be nice to have a more serious validation test suite.
 
Thanks,
 
greg
 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 12:53 PM

To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >













_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: fix for Data.List.sortBy

David Feuer
The first seq is useless: constructor application is never suspended. I haven't had a chance to look at the rest yet.

On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <[hidden email]> wrote:
Sid,
 
I'd be delighted to submit the patch, as long as I have permission (which I probably don't), you feel confident about the change and maybe a couple of other people agree.
 
Here is the proposed change. Tests shows significant speed improvement (30%) when sorting lists of random numbers, and same efficiency for sorting already sorted lists (both ascending and descending). 
 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 6:53 PM
To: Gregory Popovitch
Subject: RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Since I don't see any regressions, this doesn't really need CLC approval. The changes are also small enough that a Github PR may be accepted (otherwise, the change goes in via Phabricator).

Are you interested in implementing this patch? If yes, a standard Github PR should be fine. Right now gSort is a three line change I think. It will be changed in ghc/libraries/base/Data/OldList.hs on the ghc/ghc repo on Github.

I'm hoping for some more comments from other Haskellers, before pushing for this change in base. I feel like we may be missing a potential optimization that someone else might spot. So probably going to wait a few days. 


On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <[hidden email]> wrote:
Hi Sid,
 
Thanks, glad you looked into that. My understanding of the Haskell execution model is really poor, so I can't say one way or the other, but I felt that laziness ought to be considered as well, and I'm glad it was :-)
 
So  in conclusion it looks like we have a winner with your latest gSortBy version. How do we get this pushed to the GHC library?
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 2:12 PM
To: Gregory Popovitch

Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Hi Greg,

On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <[hidden email]> wrote:
Unfortunately, this optimization makes the sort less lazy, so doing something like:
 
take 4 $ sort l
 
requires more sorting of the list l with this change. I'm not sure it is a good tradeoff.
 

I think you're running without optimizations turned on. It is lazy in my case.

Also, the difference should be negligible (if any at all). Here's an example of the list being sorted:

[11,4,6,8,2,5,1,7,9,55,11,3]
...
[[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
...
[[1,2,4,5,6,7,8,9,11,55],[3,11]]
 * 1 3
 * 2 3
 * 4 3
 * 4 11
[1,2,3,4]

The number of operations saved is only in the last merge. It's only lazy at this step.

So we save at most one traversal of the list, which is not too expensive since our worst case bounds is O(n log n) anyway.

This should mean that the asymptotic performance should be identical, regardless of the number of comparisons saved. Of course, you do get better constants, but I would be surprised if those constants translated to significantly better performance for a reasonable size list.
 
 
 I do agree that it would be nice to have a more serious validation test suite.
 
Thanks,
 
greg
 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 12:53 PM

To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >













_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: fix for Data.List.sortBy

Dan Burton
Does this rely on Common Subexpression Elimination optimization in order to work? Would it work more reliably if the `seq`-ed expression were let-bound? 

-- Dan Burton

On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <[hidden email]> wrote:
The first seq is useless: constructor application is never suspended. I haven't had a chance to look at the rest yet.

On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <[hidden email]> wrote:
Sid,
 
I'd be delighted to submit the patch, as long as I have permission (which I probably don't), you feel confident about the change and maybe a couple of other people agree.
 
Here is the proposed change. Tests shows significant speed improvement (30%) when sorting lists of random numbers, and same efficiency for sorting already sorted lists (both ascending and descending). 
 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 6:53 PM
To: Gregory Popovitch
Subject: RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Since I don't see any regressions, this doesn't really need CLC approval. The changes are also small enough that a Github PR may be accepted (otherwise, the change goes in via Phabricator).

Are you interested in implementing this patch? If yes, a standard Github PR should be fine. Right now gSort is a three line change I think. It will be changed in ghc/libraries/base/Data/OldList.hs on the ghc/ghc repo on Github.

I'm hoping for some more comments from other Haskellers, before pushing for this change in base. I feel like we may be missing a potential optimization that someone else might spot. So probably going to wait a few days. 


On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <[hidden email]> wrote:
Hi Sid,
 
Thanks, glad you looked into that. My understanding of the Haskell execution model is really poor, so I can't say one way or the other, but I felt that laziness ought to be considered as well, and I'm glad it was :-)
 
So  in conclusion it looks like we have a winner with your latest gSortBy version. How do we get this pushed to the GHC library?
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 2:12 PM
To: Gregory Popovitch

Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Hi Greg,

On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <[hidden email]> wrote:
Unfortunately, this optimization makes the sort less lazy, so doing something like:
 
take 4 $ sort l
 
requires more sorting of the list l with this change. I'm not sure it is a good tradeoff.
 

I think you're running without optimizations turned on. It is lazy in my case.

Also, the difference should be negligible (if any at all). Here's an example of the list being sorted:

[11,4,6,8,2,5,1,7,9,55,11,3]
...
[[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
...
[[1,2,4,5,6,7,8,9,11,55],[3,11]]
 * 1 3
 * 2 3
 * 4 3
 * 4 11
[1,2,3,4]

The number of operations saved is only in the last merge. It's only lazy at this step.

So we save at most one traversal of the list, which is not too expensive since our worst case bounds is O(n log n) anyway.

This should mean that the asymptotic performance should be identical, regardless of the number of comparisons saved. Of course, you do get better constants, but I would be surprised if those constants translated to significantly better performance for a reasonable size list.
 
 
 I do agree that it would be nice to have a more serious validation test suite.
 
Thanks,
 
greg
 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 12:53 PM

To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >













_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: fix for Data.List.sortBy

Siddhanathan Shanmugam
Turns out we don't need seq at all. A simple refactoring of the merge function does the trick equally well.

    mergePairs (a:b:xs) = merge id a b : mergePairs xs
    mergePairs xs       = xs

    merge f as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = merge (f.(b:)) as  bs'
      | otherwise       = merge (f.(a:)) as' bs
    merge f [] bs       = f bs
    merge f as []       = f as

This variant is 10% faster in my tests.


On Mon, Mar 27, 2017 at 5:49 PM, Dan Burton <[hidden email]> wrote:
Does this rely on Common Subexpression Elimination optimization in order to work? Would it work more reliably if the `seq`-ed expression were let-bound? 

I don't think it relies heavily on CSE. The seq's are there to avoid a cascading series of thunk evaluations. Using let expressions doesn't seem to affect the benchmarks.
 

-- Dan Burton

On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <[hidden email]> wrote:
The first seq is useless: constructor application is never suspended. I haven't had a chance to look at the rest yet.

On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <[hidden email]> wrote:
Sid,
 
I'd be delighted to submit the patch, as long as I have permission (which I probably don't), you feel confident about the change and maybe a couple of other people agree.
 
Here is the proposed change. Tests shows significant speed improvement (30%) when sorting lists of random numbers, and same efficiency for sorting already sorted lists (both ascending and descending). 
 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 6:53 PM
To: Gregory Popovitch
Subject: RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Since I don't see any regressions, this doesn't really need CLC approval. The changes are also small enough that a Github PR may be accepted (otherwise, the change goes in via Phabricator).

Are you interested in implementing this patch? If yes, a standard Github PR should be fine. Right now gSort is a three line change I think. It will be changed in ghc/libraries/base/Data/OldList.hs on the ghc/ghc repo on Github.

I'm hoping for some more comments from other Haskellers, before pushing for this change in base. I feel like we may be missing a potential optimization that someone else might spot. So probably going to wait a few days. 


On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <[hidden email]> wrote:
Hi Sid,
 
Thanks, glad you looked into that. My understanding of the Haskell execution model is really poor, so I can't say one way or the other, but I felt that laziness ought to be considered as well, and I'm glad it was :-)
 
So  in conclusion it looks like we have a winner with your latest gSortBy version. How do we get this pushed to the GHC library?
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 2:12 PM
To: Gregory Popovitch

Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Hi Greg,

On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <[hidden email]> wrote:
Unfortunately, this optimization makes the sort less lazy, so doing something like:
 
take 4 $ sort l
 
requires more sorting of the list l with this change. I'm not sure it is a good tradeoff.
 

I think you're running without optimizations turned on. It is lazy in my case.

Also, the difference should be negligible (if any at all). Here's an example of the list being sorted:

[11,4,6,8,2,5,1,7,9,55,11,3]
...
[[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
...
[[1,2,4,5,6,7,8,9,11,55],[3,11]]
 * 1 3
 * 2 3
 * 4 3
 * 4 11
[1,2,3,4]

The number of operations saved is only in the last merge. It's only lazy at this step.

So we save at most one traversal of the list, which is not too expensive since our worst case bounds is O(n log n) anyway.

This should mean that the asymptotic performance should be identical, regardless of the number of comparisons saved. Of course, you do get better constants, but I would be surprised if those constants translated to significantly better performance for a reasonable size list.
 
 
 I do agree that it would be nice to have a more serious validation test suite.
 
Thanks,
 
greg
 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 12:53 PM

To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >













_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: fix for Data.List.sortBy

Gregory Popovitch
Sid, this new version (diff below) is really fast when sorting random ints, but slower when sorting strings:
 
input                        GHC sort          Orig proposal        gSort
-------------------------------------------------------------------------
sorted ints (ascending)      151               460                  147
sorted ints (descending)     151               467                  171
random ints                 2771              2010                 1365
random strings              6542              5524                 5991
 
 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Tuesday, March 28, 2017 2:45 AM
To: Dan Burton
Cc: Haskell Libraries; Gregory Popovitch
Subject: Re: fix for Data.List.sortBy

Turns out we don't need seq at all. A simple refactoring of the merge function does the trick equally well.

    mergePairs (a:b:xs) = merge id a b : mergePairs xs
    mergePairs xs       = xs

    merge f as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = merge (f.(b:)) as  bs'
      | otherwise       = merge (f.(a:)) as' bs
    merge f [] bs       = f bs
    merge f as []       = f as

This variant is 10% faster in my tests.


On Mon, Mar 27, 2017 at 5:49 PM, Dan Burton <[hidden email]> wrote:
Does this rely on Common Subexpression Elimination optimization in order to work? Would it work more reliably if the `seq`-ed expression were let-bound? 

I don't think it relies heavily on CSE. The seq's are there to avoid a cascading series of thunk evaluations. Using let expressions doesn't seem to affect the benchmarks.
 

-- Dan Burton

On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <[hidden email]> wrote:
The first seq is useless: constructor application is never suspended. I haven't had a chance to look at the rest yet.

On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <[hidden email]> wrote:
Sid,
 
I'd be delighted to submit the patch, as long as I have permission (which I probably don't), you feel confident about the change and maybe a couple of other people agree.
 
Here is the proposed change. Tests shows significant speed improvement (30%) when sorting lists of random numbers, and same efficiency for sorting already sorted lists (both ascending and descending). 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 6:53 PM
To: Gregory Popovitch
Subject: RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Since I don't see any regressions, this doesn't really need CLC approval. The changes are also small enough that a Github PR may be accepted (otherwise, the change goes in via Phabricator).

Are you interested in implementing this patch? If yes, a standard Github PR should be fine. Right now gSort is a three line change I think. It will be changed in ghc/libraries/base/Data/OldList.hs on the ghc/ghc repo on Github.

I'm hoping for some more comments from other Haskellers, before pushing for this change in base. I feel like we may be missing a potential optimization that someone else might spot. So probably going to wait a few days. 


On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <[hidden email]> wrote:
Hi Sid,
 
Thanks, glad you looked into that. My understanding of the Haskell execution model is really poor, so I can't say one way or the other, but I felt that laziness ought to be considered as well, and I'm glad it was :-)
 
So  in conclusion it looks like we have a winner with your latest gSortBy version. How do we get this pushed to the GHC library?
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 2:12 PM
To: Gregory Popovitch

Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Hi Greg,

On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <[hidden email]> wrote:
Unfortunately, this optimization makes the sort less lazy, so doing something like:
 
take 4 $ sort l
 
requires more sorting of the list l with this change. I'm not sure it is a good tradeoff.
 

I think you're running without optimizations turned on. It is lazy in my case.

Also, the difference should be negligible (if any at all). Here's an example of the list being sorted:

[11,4,6,8,2,5,1,7,9,55,11,3]
...
[[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
...
[[1,2,4,5,6,7,8,9,11,55],[3,11]]
 * 1 3
 * 2 3
 * 4 3
 * 4 11
[1,2,3,4]

The number of operations saved is only in the last merge. It's only lazy at this step.

So we save at most one traversal of the list, which is not too expensive since our worst case bounds is O(n log n) anyway.

This should mean that the asymptotic performance should be identical, regardless of the number of comparisons saved. Of course, you do get better constants, but I would be surprised if those constants translated to significantly better performance for a reasonable size list.
 
 
 I do agree that it would be nice to have a more serious validation test suite.
 
Thanks,
 
greg
 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 12:53 PM

To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >













_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: fix for Data.List.sortBy

Siddhanathan Shanmugam


On Tue, Mar 28, 2017 at 5:03 AM, Gregory Popovitch <[hidden email]> wrote:
Sid, this new version (diff below) is really fast when sorting random ints, but slower when sorting strings:

Ok, let's not use that then. Also, add an inline pragma on the function merge.


    {-# INLINE merge #-}
    merge as@(a:as') bs@(b:bs')


 
 
input                        GHC sort          Orig proposal        gSort
-------------------------------------------------------------------------
sorted ints (ascending)      151               460                  147
sorted ints (descending)     151               467                  171
random ints                 2771              2010                 1365
random strings              6542              5524                 5991
 
 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Tuesday, March 28, 2017 2:45 AM
To: Dan Burton
Cc: Haskell Libraries; Gregory Popovitch
Subject: Re: fix for Data.List.sortBy

Turns out we don't need seq at all. A simple refactoring of the merge function does the trick equally well.

    mergePairs (a:b:xs) = merge id a b : mergePairs xs
    mergePairs xs       = xs

    merge f as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = merge (f.(b:)) as  bs'
      | otherwise       = merge (f.(a:)) as' bs
    merge f [] bs       = f bs
    merge f as []       = f as

This variant is 10% faster in my tests.


On Mon, Mar 27, 2017 at 5:49 PM, Dan Burton <[hidden email]> wrote:
Does this rely on Common Subexpression Elimination optimization in order to work? Would it work more reliably if the `seq`-ed expression were let-bound? 

I don't think it relies heavily on CSE. The seq's are there to avoid a cascading series of thunk evaluations. Using let expressions doesn't seem to affect the benchmarks.
 

-- Dan Burton

On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <[hidden email]> wrote:
The first seq is useless: constructor application is never suspended. I haven't had a chance to look at the rest yet.

On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <[hidden email]> wrote:
Sid,
 
I'd be delighted to submit the patch, as long as I have permission (which I probably don't), you feel confident about the change and maybe a couple of other people agree.
 
Here is the proposed change. Tests shows significant speed improvement (30%) when sorting lists of random numbers, and same efficiency for sorting already sorted lists (both ascending and descending). 
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 6:53 PM
To: Gregory Popovitch
Subject: RE: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Since I don't see any regressions, this doesn't really need CLC approval. The changes are also small enough that a Github PR may be accepted (otherwise, the change goes in via Phabricator).

Are you interested in implementing this patch? If yes, a standard Github PR should be fine. Right now gSort is a three line change I think. It will be changed in ghc/libraries/base/Data/OldList.hs on the ghc/ghc repo on Github.

I'm hoping for some more comments from other Haskellers, before pushing for this change in base. I feel like we may be missing a potential optimization that someone else might spot. So probably going to wait a few days. 


On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <[hidden email]> wrote:
Hi Sid,
 
Thanks, glad you looked into that. My understanding of the Haskell execution model is really poor, so I can't say one way or the other, but I felt that laziness ought to be considered as well, and I'm glad it was :-)
 
So  in conclusion it looks like we have a winner with your latest gSortBy version. How do we get this pushed to the GHC library?
 
Thanks,
 
greg


From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 2:12 PM
To: Gregory Popovitch

Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Hi Greg,

On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <[hidden email]> wrote:
Unfortunately, this optimization makes the sort less lazy, so doing something like:
 
take 4 $ sort l
 
requires more sorting of the list l with this change. I'm not sure it is a good tradeoff.
 

I think you're running without optimizations turned on. It is lazy in my case.

Also, the difference should be negligible (if any at all). Here's an example of the list being sorted:

[11,4,6,8,2,5,1,7,9,55,11,3]
...
[[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
...
[[1,2,4,5,6,7,8,9,11,55],[3,11]]
 * 1 3
 * 2 3
 * 4 3
 * 4 11
[1,2,3,4]

The number of operations saved is only in the last merge. It's only lazy at this step.

So we save at most one traversal of the list, which is not too expensive since our worst case bounds is O(n log n) anyway.

This should mean that the asymptotic performance should be identical, regardless of the number of comparisons saved. Of course, you do get better constants, but I would be surprised if those constants translated to significantly better performance for a reasonable size list.
 
 
 I do agree that it would be nice to have a more serious validation test suite.
 
Thanks,
 
greg
 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Siddhanathan Shanmugam
Sent: Monday, March 27, 2017 12:53 PM

To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

We can improve things a bit further by forcing evaluation (with seq) along the way appropriately.



gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
  where
    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                 | otherwise        = [x, y] : pair t
    pair [x] = [[x]]
    pair []  = []

    reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
                         where v' = merge v w `seq` merge v w
                               x' = merge x y `seq` merge x y
                         
    reduce (x:y:t) = merge x y `seq` merge x y : reduce t
    reduce xs      = xs

    merge xs []           = xs
    merge []  ys          = ys
    merge xs@(x:xs') ys@(y:ys') 
         | x `cmp` y == GT  = y : merge xs  ys'
         | otherwise        = x : merge xs' ys


gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) `seq` (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as




Before the change:

benchmarking random ints/ghc
time                 3.687 s    (3.541 s .. NaN s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.691 s    (3.669 s .. 3.705 s)
std dev              21.45 ms   (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/greg
time                 2.648 s    (2.482 s .. 2.822 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 2.704 s    (2.670 s .. 2.736 s)
std dev              52.68 ms   (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.733 s    (2.682 s .. 2.758 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.707 s    (2.689 s .. 2.718 s)
std dev              16.84 ms   (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)

After the change:

benchmarking random ints/greg
time                 2.576 s    (2.548 s .. 2.628 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.590 s    (2.578 s .. 2.599 s)
std dev              12.99 ms   (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking random ints/gSort
time                 2.538 s    (2.412 s .. 2.627 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.543 s    (2.517 s .. 2.560 s)
std dev              26.16 ms   (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)





On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <[hidden email]> wrote:
Theoretically, we could do better. We currently only exploit monotonic runs in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]


The constant factors here might be too high to notice the difference though.


> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called natural merge sort. The algorithm is linearithmic in the worst case, but linear in the best case (already sorted list).



On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <[hidden email]> wrote:
Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
Data.List.sort().

I have updated the test programs in https://github.com/greg7mdp/ghc-sort to
include your new version.

Here are the results (your new version looks like a definite improvement vs
the current GHC one):

input                        GHC sort         My Orig proposal     gSort
----------------------------------------------------------------------------
---
sorted ints (ascending)      151               456                  148
sorted ints (descending)     152               466                  155
random ints                 2732              2006                 2004
random strings              6564              5549                 5528


So replacing the current GHC version with gSort is a no brainer, as it is
better in all regards.

However, still my version is more laziness-friendly, i.e. it requires fewer
comparisons to get the
N smallest elements of a list (see
https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).

I wonder if this might not be a more useful trait than being able to sort
already sorted lists super fast.

Thanks,

greg

________________________________

From: [hidden email] [mailto:[hidden email]] On Behalf Of
Siddhanathan Shanmugam
Sent: Sunday, March 26, 2017 1:05 PM
To: Gregory Popovitch
Cc: Haskell Libraries
Subject: Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is more
laziness-friendly.


Interesting. You are right, performance for sorting random lists has
priority over performance for sorting already-sorted lists.

Ignore the numbers for my previous version. Can you compare GHC's sort, your
proposal, and gSort below?


gSort :: Ord a => [a] -> [a]
gSort = gSortBy compare
gSortBy cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a]  xs
      | otherwise       = ascending  b (a:) xs
    sequences xs = [xs]


    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs  = (a:as) : sequences bs


    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
    ascending a as bs   = as [a] `seq` as [a] : sequences bs


    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)


    mergePairs (a:b:xs) = merge a b : mergePairs xs
    mergePairs xs       = xs


    merge as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = b : merge as  bs'
      | otherwise       = a : merge as' bs
    merge [] bs         = bs
    merge as []         = as


Thanks,
Sid


On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <[hidden email]>
wrote:


        Thank you @Siddhanathan! I welcome any improvement you may make, as
I said I
        am very far from a Haskell expert.

        I just tested your change with my test project
        (https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> )
        and here are my results (mean times in ms):

        input                        GHC sort          Orig proposal
your
        change

----------------------------------------------------------------------------
        ---
        sorted ints (ascending)      153               467
139
        sorted ints (descending)     152               472
599
        random ints                 2824              2077
2126
        random strings              6564              5613
5983

        Your change is a definite improvement for sorted integers in
ascending
        order, but is worse for other cases.

        Is there a real need to optimize the sort for already sorted list?
Of course
        it should not be a degenerate
        case and take longer than sorting random numbers, but this is not
the case
        here. Sorting already sorted
        lists is, even with my version, over 4 times faster than sorting
random
        lists. This sounds perfectly
        acceptable to me, and I feel that trying to optimize this specific
case
        further, if it comes at the
        detriment of the general case, is not desirable.

        Thanks,

        greg

        ________________________________

        From: [hidden email] [mailto:[hidden email]] On
Behalf Of
        Siddhanathan Shanmugam
        Sent: Sunday, March 26, 2017 11:41 AM
        To: Gregory Popovitch
        Cc: Haskell Libraries
        Subject: Re: Proposal: a new implementation for Data.List.sort and
        Data.List.sortBy, which has better performance characteristics and
is more
        laziness-friendly.



        Thank you! This identifies a space leak in base which went unnoticed
for 7
        years.

        Your implementation can be improved further. Instead of splitting
into
        pairs, you could instead split into lists of sorted sublists by
replacing
        the pairs function with the following

            pair = foldr f []
              where
                f x [] = [[x]]
                f x (y:ys)
                  | x `cmp` head y == LT = (x:y):ys
                  | otherwise            = [x]:y:ys

        This should give you the same performance improvements for sorting
random
        lists, but better performance while sorting ascending lists.

        The version in base takes it one step further by using a DList to
handle the
        descending case efficiently as well, except there's a space leak
right now
        because of which it is slower.

        On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
<[hidden email]>
        wrote:



                Motivation:
                ----------

                Data.List.sort is a very important functionality in Haskell.
I
        believe that
                the proposed implementation is:

                - significantly faster than the current implementation on
unsorted
        lists,
                typically 14% to 27% faster
                - more laziness-friendly, i.e.:
                    take 3 $ sort l
                  will require significantly less comparisons than the
current
                implementation

                Proposed Implementation
                -----------------------

                sort :: (Ord a) => [a] -> [a]
                sort =  sortBy compare

                sortBy cmp [] = []
                sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
                  where
                    pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
                                 | otherwise        = [x, y] : pair t
                    pair [x] = [[x]]
                    pair []  = []

                    reduce (v:w:x:y:t) = merge v' x' : reduce t
                                         where v' = merge v w
                                               x' = merge x y

                    reduce (x:y:t) = merge x y : reduce t
                    reduce xs      = xs

                    merge xs []           = xs
                    merge []  ys          = ys
                    merge xs@(x:xs') ys@(y:ys')
                         | x `cmp` y == GT  = y : merge xs  ys'
                         | otherwise        = x : merge xs' ys


                Effect and Interactions
                -----------------------

                I have a stack project with a criterion test for this new
        implementation,
                available at https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort>

        <https://github.com/greg7mdp/ghc-sort
<https://github.com/greg7mdp/ghc-sort> > .
                I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
had the
        following
                results:

                - sorting of random lists of integers is 27% faster
                - sorting of random lists of strings is 14% faster
                - sorting of already sorted lists is significantly slower,
but still
        much
                faster than sorting random lists
                - proposed version is more laziness friendly. For example
this
        version of
                sortBy requires 11 comparisons to find
                  the smallest element of a 15 element list, while the
default
                Data.List.sortBy requires 15 comparisons.
                  (see


https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>

<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
<https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs> >
)



                Test results
                ------------

                Criterion output (descending/ascending results are for
already
        sorted
                lists).
                I barely understand what Criterion does, and I am puzzled
with the
        various
                "T" output - maybe there is a bug in my bench code:

                vagrant@vagrant-ubuntu-trusty-64:/vagrant$ stack exec
ghc-sort
                benchmarking ascending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
160.6 ms
        (153.4
                ms .. 167.8 ms)
                                     0.997 R²   (0.986 R² .. 1.000 R²)
                mean                 161.7 ms   (158.3 ms .. 165.9 ms)
                std dev              5.210 ms   (3.193 ms .. 7.006 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking ascending ints/greg
                TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms ..
554.9
        ms)
                                     0.996 R²   (0.987 R² .. 1.000 R²)
                mean                 466.2 ms   (449.0 ms .. 475.0 ms)
                std dev              14.94 ms   (0.0 s .. 15.29 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking descending ints/ghc
                TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
165.1 ms
        (148.2
                ms .. 178.2 ms)
                                     0.991 R²   (0.957 R² .. 1.000 R²)
                mean                 158.7 ms   (154.0 ms .. 164.3 ms)
                std dev              7.075 ms   (4.152 ms .. 9.903 ms)
                variance introduced by outliers: 12% (moderately inflated)

                benchmarking descending ints/greg
                TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms ..
508.3
        ms)
                                     0.999 R²   (0.995 R² .. 1.000 R²)
                mean                 476.0 ms   (467.5 ms .. 480.0 ms)
                std dev              7.447 ms   (67.99 as .. 7.865 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/ghc
                TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
3.019 s)
                                     0.999 R²   (0.997 R² .. 1.000 R²)
                mean                 2.812 s    (2.785 s .. 2.838 s)
                std dev              44.06 ms   (543.9 as .. 44.97 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking random ints/greg
                TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
2.076 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 2.028 s    (2.019 s .. 2.033 s)
                std dev              7.832 ms   (0.0 s .. 8.178 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/ghc
                TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
6.694 s)
                                     1.000 R²   (1.000 R² .. 1.000 R²)
                mean                 6.499 s    (6.468 s .. 6.518 s)
                std dev              28.85 ms   (0.0 s .. 32.62 ms)
                variance introduced by outliers: 19% (moderately inflated)

                benchmarking shakespeare/greg
                TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
5.763 s)
                                     1.000 R²   (0.999 R² .. 1.000 R²)
                mean                 5.582 s    (5.537 s .. 5.607 s)
                std dev              39.30 ms   (0.0 s .. 43.49 ms)
                variance introduced by outliers: 19% (moderately inflated)


                Costs and Drawbacks
                -------------------

                The only cost I see is the reduced performance when sorting
already
        sorted
                lists. However, since this remains quite efficient, indeed
over 4
        times
                faster than sorting unsorted lists, I think it is an
acceptable
        tradeoff.

                Final note
                ----------

                My Haskell is very rusty. I worked on this a couple years
ago when I
        was
                learning Haskell, and meant to propose it to the Haskell
community,
        but
                never got to it at the time.

                _______________________________________________
                Libraries mailing list
                [hidden email]
                http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>

        <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
<http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >













_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries




_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RE: fix for Data.List.sortBy

Gregory Popovitch
OK, so current proposed changes are below:
 

test results look good.
 
Thanks,
 
greg



_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: fix for Data.List.sortBy

Siddhanathan Shanmugam
    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs

as well.

On Tue, Mar 28, 2017 at 11:43 AM, Gregory Popovitch <[hidden email]> wrote:
OK, so current proposed changes are below:
 

test results look good.
 
Thanks,
 
greg




_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: fix for Data.List.sortBy

David Feuer
You probably should bind the value you're forcing, even if the difference is small.

On Mar 28, 2017 2:51 PM, "Siddhanathan Shanmugam" <[hidden email]> wrote:
    mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs

as well.

On Tue, Mar 28, 2017 at 11:43 AM, Gregory Popovitch <[hidden email]> wrote:
OK, so current proposed changes are below:
 

test results look good.
 
Thanks,
 
greg




_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
12
Loading...