multiple computations, same input

21 messages
12
Open this post in threaded view
|

multiple computations, same input

 How do I perform multiple computations on a long lazy list without introducing a space leak?Doing a single computation like this works great:       f = show . filter (> 1)But if I do something like this:      f lst = show (filter (> 1) lst, filter (> 2) lst)then it looks like GHC won't garbage collect list elements until the first filter has completely finished, and the second filter has iterated over them. Is there an easy way to feed each element into both functions, instead of feeding all elements to one function, and then all to the next?Thanks,Greg _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 Hi Greg, > But if I do something like this: >       f lst = show (filter (> 1) lst, filter (> 2) lst) > then it looks like GHC won't garbage collect list elements > until the first > filter has completely finished There is a very good reason for this, show (a,b) is essentially show (a,b) = "(" ++ show a ++ ", " ++ show b ++ ")" so all the elements of a are shown first, which means that lst is held up in a thunk of filter, to be evaluated later. In this particular case, you know that not (>2) implies not (>1) so you can speed it up with: f lst = show (part1, part2)    where        part1 = filter (>1) lst        part2 = fitler (>2) part1 note that part2 only examines part1. As long as part1 is smaller than lst, this will reduce the space leak. There is really no way to "eliminate" the space leak in this circumstance, since you really do need to hold a part of the data in memory while you show the first one, so you can then show the second one. Thanks Neil _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 > hold a part of the data in memory while you show the first one,Here would be a better example then.     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst))It ought to be *possible* to compute both operations without holding onto any of the list elements.  In the imperative world, you'd say:    sum1 = 0    sum2 = 0    for num in lst        sum1 += num if num > 1        sum2 += num if num > 2    end    puts sum1, sum2One could probably hack it together with foldM, the State monad, and maybe some strictness, but I'd like to make full use of laziness and stick to the basic list operations if it at all possible. I'm not so concerned with solving this particular problem as I am in learning the purely functional technique for performing orthogonal computations on the same input without introducing a space leak.Maybe something like this?     arr (sum . filter (>1)) &&& arr (sum . filter (>2))Thanks,Greg _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 Hi, > Here would be a better example then. > >     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst)) I suspected that you actually wanted to do something "cleverer" with the list, for the sake of argument, I'm going to change >1 to p1 and >2 to p2 - to show how this can be done in the general case. With the specific information you know about >1 vs >2 you can do better, but this gets across the general point: f lst = show (sumPairs (>1) (>2) lst) sumPairs :: (Int -> Bool) -> (Int -> Bool) -> [Int] -> (Int, Int) sumPairs p1 p2 [] = (0, 0) sumPairs p1 p2 (x:xs) = (add p1 a, add p2 b)     where        (a,b) = sumPairs xs        add pred value = if pred x then x+value else value [Untested, something like this should work] You can actually arrive at this solution entirely be reasoning on the program, i.e. not coming up with a fresh definition. The above code essentially follows your imperative pseudo code - I think its constant space, but I'm not too sure... Thanks Neil _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 Neil Mitchell wrote: > I suspected that you actually wanted to do something "cleverer" with > the list, for the sake of argument, I'm going to change >1 to p1 and > >2 to p2 - to show how this can be done in the general case. With the > specific information you know about >1 vs >2 you can do better, but > this gets across the general point: > > f lst = show (sumPairs (>1) (>2) lst) > > sumPairs :: (Int -> Bool) -> (Int -> Bool) -> [Int] -> (Int, Int) > sumPairs p1 p2 [] = (0, 0) > sumPairs p1 p2 (x:xs) = (add p1 a, add p2 b) >     where >        (a,b) = sumPairs xs >        add pred value = if pred x then x+value else value > > [Untested, something like this should work]     Nope.  That won't work because you end up creating huge "add" thunks which cause end up causing a stack overflow (tested with GHC -O2).  I think you are probably going to need strictness in order to skin this cat in Haskell.  Here's an example that does work... import Data.List main = print \$ para_filter_sum (> 1) (> 2) lst twos = 2: twos lst = take 10000000 \$ [1,2,3,4,5] ++ twos -- f lst = show (filter (> 1) lst, filter (> 2) lst) para_filter_sum f g xs =     foldl' (\(n,m) elem -> seq n \$ seq m \$              (n+if f elem then elem else 0,               m+if g elem then elem else 0 ) ) (0,0) xs Greg Buchholz _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 In reply to this post by Neil Mitchell Thanks Neil.  How do I add in another ~10 computations, or map a list of a 100 computations to the same input?Isn't there a way to do this without one computation having to be aware of the other?  This feels like a situation Parsec users would find themselves in all the time.  When you have a bunch of parsers in a 'choice', does the start of the input stream linger until the last parser is executed? Thanks,GregOn 3/27/06, Neil Mitchell <[hidden email]> wrote: Hi,> Here would be a better example then.>>     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst))I suspected that you actually wanted to do something "cleverer" with the list, for the sake of argument, I'm going to change >1 to p1 and>2 to p2 - to show how this can be done in the general case. With thespecific information you know about >1 vs >2 you can do better, but this gets across the general point:f lst = show (sumPairs (>1) (>2) lst)sumPairs :: (Int -> Bool) -> (Int -> Bool) -> [Int] -> (Int, Int)sumPairs p1 p2 [] = (0, 0)sumPairs p1 p2 (x:xs) = (add p1 a, add p2 b)     where       (a,b) = sumPairs xs       add pred value = if pred x then x+value else value[Untested, something like this should work]You can actually arrive at this solution entirely be reasoning on the program, i.e. not coming up with a fresh definition.The above code essentially follows your imperative pseudo code - Ithink its constant space, but I'm not too sure...ThanksNeil _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 > Thanks Neil.  How do I add in another ~10 computations, or map a list of a > 100 computations to the same input? > > Isn't there a way to do this without one computation having to be aware of > the other? I guess they have to be aware at some level, perhaps arrows generalise the awareness they need, to perhaps you'd need something else. > This feels like a situation Parsec users would find themselves in all the > time.  When you have a bunch of parsers in a 'choice', does the start of the > input stream linger until the last parser is executed? No, as soon as one token is accepted from any parser, that input is decided upon, and it will never go back. If you want that behaviour you have to wrap the particular parser in try, which does give the backtracking (and space leak) Thanks Neil _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 On 3/28/06, Neil Mitchell <[hidden email]> wrote: > > This feels like a situation Parsec users would find themselves in all the > > time.  When you have a bunch of parsers in a 'choice', does the start of the > > input stream linger until the last parser is executed? > > No, as soon as one token is accepted from any parser, that input is > decided upon, and it will never go back. If you want that behaviour > you have to wrap the particular parser in try, which does give the > backtracking (and space leak) > I personally find this behaviour terribly confusing. It makes writing the parser highly unmodular. It forces me to know exactly what a certain parser recognizes to know whether I need to wrap a 'try' around it when I compose it in parallel with another parser. Which is why I much prefer to use parsing libraries based on Koen Claessen's technique which performs all parses at the same time. It works breadth-first on the parse forest (the set of parse trees). Text.ParserCombinators.ReadP is one example which uses this technique. Cheers, /Josef _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 In reply to this post by Greg Fitzgerald On Mon, Mar 27, 2006 at 03:10:18PM -0800, Greg Fitzgerald wrote: >  > hold a part of the data in memory while you show the first one, > > Here would be a better example then. > >     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst)) > > It ought to be *possible* to compute both operations without holding onto > any of the list elements. I wonder if it would be possible to remove the space-leak by running both branches concurrently, and scheduling threads in a way that would minimise the space-leak. I proposed this before   http://www.haskell.org/pipermail/haskell-cafe/2005-December/013428.htmlI would like to hear opinions from some compiler gurus. Best regards Tomasz _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 Tomasz Zielonka wrote: > I wonder if it would be possible to remove the space-leak by running both > branches concurrently, and scheduling threads in a way that would > minimise the space-leak. I proposed this before > >   http://www.haskell.org/pipermail/haskell-cafe/2005-December/013428.html    FWIW, (not much), I asked a similar questions over on the Lambda-the-Ultimate blog a while back...     http://lambda-the-ultimate.org/node/923    http://lambda-the-ultimate.org/node/485...Anyway, I can't help but think that there might be a happy medium between eager and lazy evaluation. Greg Buchholz _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 In reply to this post by Tomasz Zielonka On Mar 28, 2006, at 1:02 AM, Tomasz Zielonka wrote: > On Mon, Mar 27, 2006 at 03:10:18PM -0800, Greg Fitzgerald wrote: >>> hold a part of the data in memory while you show the first one, >> >> Here would be a better example then. >> >>     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst)) >> >> It ought to be *possible* to compute both operations without   >> holding onto >> any of the list elements. > > I wonder if it would be possible to remove the space-leak by   > running both > branches concurrently, and scheduling threads in a way that would > minimise the space-leak. I proposed this before > >   http://www.haskell.org/pipermail/haskell-cafe/2005-December/  > 013428.html > > I would like to hear opinions from some compiler gurus. This is possible in principle with something like resource-bounded   eagerness, but it's not at all easy.  The problem is this: when lst   gets big, you need to identify who's hanging on to it, and figure out   that they are actually planning to consume it and come up with   something smaller as a result.  This is all pretty heavyweight---not   hard in principle, but hard enough in practice that it may not be   worth the investment. That said, there's a transformation that goes something like this: a = foldr f z xs         ==>           (a,b) = foldr (f `cross` g)   (z,y) xs b = foldr g y xs This could in principle at least pluck the lowest-hanging fruit (sum,   filter, etc.). However it runs into some significant problems: - Only works with folds - Has some problems with bottoms, if I recall rightly - Not expressible using something like RULES;     requires a special transformation in the compiler. - It is often a pessimization. That last point is a killer. > > Best regards > Tomasz > _______________________________________________ > Haskell-Cafe mailing list > [hidden email] > http://www.haskell.org/mailman/listinfo/haskell-cafe_______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 In reply to this post by Greg Buchholz ...Anyway, I can't help but think that there might be a happy mediumbetween eager and lazy evaluation. What I'd love to see is the compiler continue to be call-by-need, but be smart enough to recognize when multiple expressions will all eventually need to be evaluated.  A simple example:    show (a + b) (+) requires *both* 'a' and 'b' be evaluated to show the result, not 'a' *then* 'b'.  It'd be great if the compiler can seek out any shared lazy data structures in evaluating 'a' and 'b', and start computing them both with one element at a time. Has anyone put any thought into something like this?Thanks,Greg  _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: multiple computations, same input

 On Tue, Mar 28, 2006 at 12:27:43PM -0800, Greg Fitzgerald wrote: > > > > ...Anyway, I can't help but think that there might be a happy medium > > between eager and lazy evaluation. > > > What I'd love to see is the compiler continue to be call-by-need, but be > smart enough to recognize when multiple expressions will all eventually need > to be evaluated.  A simple example: > >     show (a + b) > > (+) requires *both* 'a' and 'b' be evaluated to show the result, not 'a' > *then* 'b'.  It'd be great if the compiler can seek out any shared lazy data > structures in evaluating 'a' and 'b', and start computing them both with one > element at a time. > > Has anyone put any thought into something like this? This is called strictness analysis and is a fundamental optimization of any haskell compiler. this paper has information on how this information is used in ghc, and a search for 'strictness analysis' will turn up a plethora of algorithms for calculating it.  http://citeseer.ist.psu.edu/jones91unboxed.html        John -- John Meacham - ⑆repetae.net⑆john⑈ _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Pragmatic concurrency Re: multiple computations, same input

Open this post in threaded view
|

Re: Pragmatic concurrency Re: multiple computations, same input

Open this post in threaded view
|

Re: Pragmatic concurrency Re: multiple computations, same input

 Robin Green wrote: > On Wed, 29 Mar 2006 12:50:02 +0100 > Jon Fairbairn <[hidden email]> wrote: >> [snip] >> 1) choosing the optimal reduction strategy is undecidable >> >> 2) we shouldn't (in general) attempt to do undecidable >>    things automatically >> [snip] >[snip] > I suggest that a Haskell program should be treated as an executable > specification. In some cases the compiler can't optimise the program > well enough, so we (by which I mean, ordinary programmers, not > compiler geeks) should be able to explicitly provide our own > optimisations, as rewrite rules (generalised ones, or specialised > ones for individual functions). Democratise the means of automated > optimisation! This sounds good. The only thing I'm wondering is what do we actually gain by using Haskell in the first place instead of just a strict language? It seems that Haskell's lazyness gives a succinct but too inefficient program which then needs extra code in the form of rewrite rules/pragmas, or else a complete rewrite in terms of seq etc to get it to run fast enough without space leaks... Regards, Brian. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: Pragmatic concurrency Re: multiple computations, same input

Open this post in threaded view
|

Re: Pragmatic concurrency Re: multiple computations, same input

 In reply to this post by Brian Hulley On Wed, 29 Mar 2006, Brian Hulley wrote: > This sounds good. The only thing I'm wondering is what do we actually gain by > using Haskell in the first place instead of just a strict language? It seems > that Haskell's lazyness gives a succinct but too inefficient program which > then needs extra code in the form of rewrite rules/pragmas, or else a complete > rewrite in terms of seq etc to get it to run fast enough without space > leaks... > Often the laziness is useful for purposes of efficiency as well though. -- [hidden email] Sometimes you gotta fight fire with fire. Most of the time you just get burnt worse though. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

Re: Pragmatic concurrency Re: multiple computations, same input

 In reply to this post by greenrd On Wed, Mar 29, 2006 at 03:23:04PM +0100, Robin Green wrote: > I suggest that a Haskell program should be treated as an executable > specification. In some cases the compiler can't optimise the program > well enough, so we (by which I mean, ordinary programmers, not compiler > geeks) should be able to explicitly provide our own optimisations, as > rewrite rules (generalised ones, or specialised ones for individual > functions). Democratise the means of automated optimisation! Then we > should be able to prove formally that our rewrite rules preserve > functional correctness. This is the approach I am pursuing in the > programming language I am working on, which is a derivative of Haskell. have you seen the RULES pragma? it is implemented in both ghc and jhc.   http://www.haskell.org/ghc/docs/6.4/html/users_guide/rewrite-rules.html        John -- John Meacham - ⑆repetae.net⑆john⑈ _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe