# How to improve its performance ?

11 messages
Open this post in threaded view
|

## How to improve its performance ?

 import Data.List combination :: [a] -> [[a]] combination [] =  [[]] combination (x:xs) =  (map (x:) (combination xs) )++ (combination xs) samp = [1..100] allTwoGroup = [(x, samp\\x) | x <- combination samp] The above code is used to calculate all the two groups from sample data ? It is very slow ! Sincerely! e^(π.i) + 1 = 0
Open this post in threaded view
|

## Re: How to improve its performance ?

 Am Donnerstag 18 März 2010 00:53:28 schrieb zaxis: > import Data.List > > combination :: [a] -> [[a]] > combination [] =  [[]] > combination (x:xs) =  (map (x:) (combination xs) )++ (combination xs) That would normally be called sublists (or subsets, if one regards lists as representing a set), I think. And, apart from the order in which they are generated, it's the same as Data.List.subsequences (only less efficient). > > samp = [1..100] > allTwoGroup = [(x, samp\\x) | x <- combination samp] > > The above code is used to calculate all the two groups from sample data All partitions into two sublists/sets/samples. > ? It is very slow ! I found it surprisingly not-slow (code compiled with -O2, as usual). There are two points where you waste time. First, in combination (x:xs) you calculate (combination xs) twice. If the order in which the sublists come doesn't matter, it's better to do it only once: combination (x:xs) = concat [(x:ys), ys] | ys <- combination xs] Second, (\\) is slow, xs \\ ys is O(length xs * length ys). Also, (\\) requires an Eq constraint. If you're willing to constrain the type further, to (Ord a => [a] -> [([a],[a])]), and call it only on ordered lists, you can replace (\\) by the much faster difference of oredered lists (implementation left as an exercise for the reader). But you can work with unconstrained types, and faster, if you build the two complementary sublists at the same time. The idea is, -- An empty list has one partition into two complementary sublists: partitions2 [] = [([],[])] -- For a nonempty list (x:xs), the partitions into two complementary -- sublists each have x either in the first sublist or in the second. -- Each partition induces a corresponding partition of the tail, xs, -- by removing x from the group in which it appears. -- Conversely, every partition ox xs gives rise to two partitions -- of (x:xs), by adding x to either the first or the second sublist. So partitions2 (x:xs)     = concat [ [(x:ys,zs),(ys,x:zs)] | (ys,zs) <- partitions2 xs ] We can also write the second case as partitions2 (x:xs) = concatMap (choice x) (partitions2 xs) where choice x (ys,zs) = [(x:ys,zs),(ys,x:zs)] Now it's very easy to recognise that partitions2 is a fold, partitions2 xs = foldr (concatMap . choice) [([],[])] xs > > Sincerely! > _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: How to improve its performance ?

 On Mar 17, 2010, at 6:14 PM, Daniel Fischer wrote: > I found it surprisingly not-slow (code compiled with -O2, as usual). > There are two points where you waste time. I found one big point where time is wasted:  in computing the powerset   of a list.  He's making 2^n elements, and then iterating through them   all and filtering, but only needs n^2 or n `choose` 2 of the   (depending on the semantics for his "groups"). The answer is to do something like: allPairs list = [(x,y) | x <- list, y <- list] to get it done in n^2 time. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: How to improve its performance ?

 In reply to this post by Daniel Fischer-4 The time is wasted to run combination even if use `combination (x:xs) = concat [(x:ys), ys] | ys <- combination xs] ' instead. in ghci >combination [1..20] will wait for a long time ....... Daniel Fischer-4 wrote Am Donnerstag 18 März 2010 00:53:28 schrieb zaxis: > import Data.List > > combination :: [a] -> [[a]] > combination [] =  [[]] > combination (x:xs) =  (map (x:) (combination xs) )++ (combination xs) That would normally be called sublists (or subsets, if one regards lists as representing a set), I think. And, apart from the order in which they are generated, it's the same as Data.List.subsequences (only less efficient). > > samp = [1..100] > allTwoGroup = [(x, samp\\x) | x <- combination samp] > > The above code is used to calculate all the two groups from sample data All partitions into two sublists/sets/samples. > ? It is very slow ! I found it surprisingly not-slow (code compiled with -O2, as usual). There are two points where you waste time. First, in combination (x:xs) you calculate (combination xs) twice. If the order in which the sublists come doesn't matter, it's better to do it only once: combination (x:xs) = concat [(x:ys), ys] | ys <- combination xs] Second, (\\) is slow, xs \\ ys is O(length xs * length ys). Also, (\\) requires an Eq constraint. If you're willing to constrain the type further, to (Ord a => [a] -> [([a],[a])]), and call it only on ordered lists, you can replace (\\) by the much faster difference of oredered lists (implementation left as an exercise for the reader). But you can work with unconstrained types, and faster, if you build the two complementary sublists at the same time. The idea is, -- An empty list has one partition into two complementary sublists: partitions2 [] = [([],[])] -- For a nonempty list (x:xs), the partitions into two complementary -- sublists each have x either in the first sublist or in the second. -- Each partition induces a corresponding partition of the tail, xs, -- by removing x from the group in which it appears. -- Conversely, every partition ox xs gives rise to two partitions -- of (x:xs), by adding x to either the first or the second sublist. So partitions2 (x:xs)     = concat [ [(x:ys,zs),(ys,x:zs)] | (ys,zs) <- partitions2 xs ] We can also write the second case as partitions2 (x:xs) = concatMap (choice x) (partitions2 xs) where choice x (ys,zs) = [(x:ys,zs),(ys,x:zs)] Now it's very easy to recognise that partitions2 is a fold, partitions2 xs = foldr (concatMap . choice) [([],[])] xs > > Sincerely! > _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe e^(π.i) + 1 = 0
Open this post in threaded view
|

## Re: How to improve its performance ?

 In reply to this post by Alexander Solla-2 `allPairs list = [(x,y) | x <- list, y <- list]  ` is not what `combination` does ! >let allPairs list = [(x,y) | x <- list, y <- list] >allPairs [1,2,3] [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)] >combination [1,2,3] [[1,2,3],[2,3],[1,3],[3],[1,2],[2],[1],[]] Alexander Solla-2 wrote On Mar 17, 2010, at 6:14 PM, Daniel Fischer wrote: > I found it surprisingly not-slow (code compiled with -O2, as usual). > There are two points where you waste time. I found one big point where time is wasted:  in computing the powerset   of a list.  He's making 2^n elements, and then iterating through them   all and filtering, but only needs n^2 or n `choose` 2 of the   (depending on the semantics for his "groups"). The answer is to do something like: allPairs list = [(x,y) | x <- list, y <- list] to get it done in n^2 time. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe e^(π.i) + 1 = 0
Open this post in threaded view
|

## Re: How to improve its performance ?

 On Mar 17, 2010, at 8:33 PM, zaxis wrote: > > `allPairs list = [(x,y) | x <- list, y <- list]  ` is not what   > `combination` > does ! >> let allPairs list = [(x,y) | x <- list, y <- list] >> allPairs [1,2,3] > [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)] Yeah, I know that.  I said so specifically.  combination computes the   power set of a list.  You said your goal was to compute a set of "two   groups".  You don't need the power set in order to compute a set of   pairs.  Moreover, computing the power set is a slow operation.   Indeed, it is the source of your slowness. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: How to improve its performance ?

 As Daniel pointed out earlier, this may be of a little bit of help:http://haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/src/Data-List.html#subsequences On 18 March 2010 04:03, Alexander Solla wrote: On Mar 17, 2010, at 8:33 PM, zaxis wrote: `allPairs list = [(x,y) | x <- list, y <- list]  ` is not what `combination` does ! let allPairs list = [(x,y) | x <- list, y <- list] allPairs [1,2,3] [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)] Yeah, I know that.  I said so specifically.  combination computes the power set of a list.  You said your goal was to compute a set of "two groups".  You don't need the power set in order to compute a set of pairs.  Moreover, computing the power set is a slow operation.  Indeed, it is the source of your slowness. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe -- Ozgur Akgun _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: How to improve its performance ?

 In reply to this post by Alexander Solla-2 Am Donnerstag 18 März 2010 05:03:28 schrieb Alexander Solla: > On Mar 17, 2010, at 8:33 PM, zaxis wrote: > > `allPairs list = [(x,y) | x <- list, y <- list]  ` is not what > > `combination` > > does ! > > > >> let allPairs list = [(x,y) | x <- list, y <- list] > >> allPairs [1,2,3] > > > > [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)] > > Yeah, I know that.  I said so specifically.  combination computes the > power set of a list.  You said your goal was to compute a set of "two > groups".  You don't need the power set in order to compute a set of > pairs.  Moreover, computing the power set is a slow operation. > Indeed, it is the source of your slowness. I think you've been fooled by the names. If you look at the code (or run it on a smaller sample), you'll see that allTwoGroup computes the list of all partitions of sample into two complementary sets. If sample were [1,2,3], the result would be (apart from the order) [([1,2,3],[]),([1,2],[3]),([1,3],[2]),([2,3],[1]),([1],[2,3]),([2],[1,3]), ([3],[1,2]),([],[1,2,3])] with sample = [1 .. 100], it's a list of 2^100 elements, that'll take a long time to compute no matter how. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: How to improve its performance ?

 In reply to this post by z_axis Am Donnerstag 18 März 2010 04:29:53 schrieb zaxis: > The time is wasted to run combination even if use `combination (x:xs) = > concat [(x:ys), ys] | ys <- combination xs] ' instead. > in ghci > > >combination [1..20] > > will wait for a long time ....... Hm, really? Prelude> :set +s                                         Prelude> let combination [] = [[]]; combination (x:xs) = [x:ys | ys <- combination xs] ++ combination xs                                                               (0.00 secs, 1818304 bytes)                                                               Prelude> let combs [] = [[]]; combs (x:xs) = concat [[x:ys,ys] | ys <- combs xs]   (0.00 secs, 2102280 bytes)                                                               Prelude> length [1 .. 2^20]                                                       1048576                                                                                 (0.07 secs, 49006024 bytes)                                                             Prelude> length \$ combination [1 .. 20] 1048576                                       (8.28 secs, 915712452 bytes)                   Prelude> length \$ combs [1 .. 20] 1048576                                 (0.78 secs, 146841964 bytes) That's interpreted, so not optimised. Optimisation narrows the gap, speeds both up significantly, so put combination :: [a] -> [[a]] combination [] = [[]] combination (x:xs) = [x:ys | ys <- combination xs] ++ combination xs combs :: [a] -> [[a]] combs [] = [[]] combs (x:xs) = concat [[x:ys,ys] | ys <- combs xs] mcombs :: [a] -> [[a]] mcombs = foldr (flip (>>=) . f) [[]]       where         f x xs = [x:xs,xs] in a file, compile with -O2 and load into ghci: Prelude Parts> length \$ combination [1 .. 20] 1048576 (0.16 secs, 43215220 bytes) Prelude Parts> length \$ combs [1 .. 20] 1048576 (0.05 secs, 55573436 bytes) Prelude Parts> length \$ mcombs [1 .. 20] 1048576 (0.06 secs, 55572692 bytes) Prelude Parts> length \$ combination [1 .. 24] 16777216 (3.06 secs, 674742880 bytes) Prelude Parts> length \$ combs [1 .. 24] 16777216 (0.62 secs, 881788880 bytes) Prelude Parts> length \$ mcombs [1 .. 24] 16777216 (0.62 secs, 881788956 bytes) Prelude Parts> length [1 .. 2^24] 16777216 (0.64 secs, 675355184 bytes) So combs and the pointfree combinator version mcombs are equally fast and significantly faster than combination. In fact they're as fast as a simple enumeration. Now, if you actually let ghci print out the result, the printing takes a long time. So much that the difference in efficiency is hardly discernible or not at all. > > Daniel Fischer-4 wrote: > > Am Donnerstag 18 März 2010 00:53:28 schrieb zaxis: > >> import Data.List > >> > >> combination :: [a] -> [[a]] > >> combination [] =  [[]] > >> combination (x:xs) =  (map (x:) (combination xs) )++ (combination xs) > > > > That would normally be called sublists (or subsets, if one regards > > lists as > > representing a set), I think. And, apart from the order in which they > > are generated, it's the same as Data.List.subsequences (only less > > efficient). > > > >> samp = [1..100] > >> allTwoGroup = [(x, samp\\x) | x <- combination samp] > >> > >> The above code is used to calculate all the two groups from sample > >> data > > > > All partitions into two sublists/sets/samples. > > > >> ? It is very slow ! > > > > I found it surprisingly not-slow (code compiled with -O2, as usual). > > There are two points where you waste time. > > First, in > > > > combination (x:xs) > > > > you calculate (combination xs) twice. If the order in which the > > sublists come doesn't matter, it's better to do it only once: > > > > combination (x:xs) = concat [(x:ys), ys] | ys <- combination xs] > > > > Second, (\\) is slow, xs \\ ys is O(length xs * length ys). > > Also, (\\) requires an Eq constraint. If you're willing to constrain > > the type further, to (Ord a => [a] -> [([a],[a])]), and call it only > > on ordered > > lists, you can replace (\\) by the much faster difference of oredered > > lists > > (implementation left as an exercise for the reader). > > > > But you can work with unconstrained types, and faster, if you build > > the two > > complementary sublists at the same time. > > The idea is, > > -- An empty list has one partition into two complementary sublists: > > partitions2 [] = [([],[])] > > -- For a nonempty list (x:xs), the partitions into two complementary > > -- sublists each have x either in the first sublist or in the second. > > -- Each partition induces a corresponding partition of the tail, xs, > > -- by removing x from the group in which it appears. > > -- Conversely, every partition ox xs gives rise to two partitions > > -- of (x:xs), by adding x to either the first or the second sublist. > > So partitions2 (x:xs) > >     = concat [ [(x:ys,zs),(ys,x:zs)] | (ys,zs) <- partitions2 xs ] > > > > We can also write the second case as > > > > partitions2 (x:xs) = concatMap (choice x) (partitions2 xs) > > > > where > > > > choice x (ys,zs) = [(x:ys,zs),(ys,x:zs)] > > > > Now it's very easy to recognise that partitions2 is a fold, > > > > partitions2 xs = foldr (concatMap . choice) [([],[])] xs _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe