multiple computations, same input


multiple computations, same input

Greg Fitzgerald
How do I perform multiple computations on a long lazy list without introducing a space leak?

Doing a single computation like this works great:
      f = show . filter (> 1)

But if I do something like this:
      f lst = show (filter (> 1) lst, filter (> 2) lst)
then it looks like GHC won't garbage collect list elements until the first filter has completely finished, and the second filter has iterated over them.

Is there an easy way to feed each element into both functions, instead of feeding all elements to one function, and then all to the next?

Thanks,
Greg





Re: multiple computations, same input

Neil Mitchell
Hi Greg,

> But if I do something like this:
>       f lst = show (filter (> 1) lst, filter (> 2) lst)
> then it looks like GHC won't garbage collect list elements
> until the first
> filter has completely finished

There is a very good reason for this: show (a,b) is essentially

show (a,b) = "(" ++ show a ++ ", " ++ show b ++ ")"

so all the elements of a are shown first, which means that lst is kept
alive by the thunk for the second filter, to be evaluated later. In this
particular case, you know that anything satisfying (>2) also satisfies
(>1), so you can speed it up with:

f lst = show (part1, part2)
   where
       part1 = filter (>1) lst
       part2 = filter (>2) part1

note that part2 only examines part1. As long as part1 is smaller than
lst, this will reduce the space leak.

There is really no way to "eliminate" the space leak in this
circumstance, since you really do need to hold a part of the data in
memory while you show the first one, so you can then show the second
one.
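
For concreteness, here's the same suggestion as a small complete program
(an untested sketch; the example input is made up):

    f :: [Int] -> String
    f lst = show (part1, part2)
        where
            part1 = filter (>1) lst
            part2 = filter (>2) part1   -- walks part1, not the original lst

    main :: IO ()
    main = putStrLn (f [0..10])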

Thanks

Neil

Re: multiple computations, same input

Greg Fitzgerald
> hold a part of the data in memory while you show the first one,

Here would be a better example then.

    f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst))

It ought to be *possible* to compute both operations without holding onto any of the list elements. 

In the imperative world, you'd say:
    sum1 = 0
    sum2 = 0
    for num in lst
        sum1 += num if num > 1
        sum2 += num if num > 2
    end
    puts sum1, sum2

One could probably hack it together with foldM, the State monad, and maybe some strictness, but I'd like to make full use of laziness and stick to the basic list operations if at all possible.
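
For what it's worth, here is a rough sketch of the kind of strict fold I mean
(using foldl' rather than foldM; the names are made up and I haven't tried it):

    import Data.List (foldl')

    -- One traversal; the seqs keep both running sums evaluated, so no
    -- chain of thunks builds up behind the pair.
    sumBoth :: [Int] -> (Int, Int)
    sumBoth = foldl' step (0, 0)
        where
            step (s1, s2) x = s1 `seq` s2 `seq`
                ( if x > 1 then s1 + x else s1
                , if x > 2 then s2 + x else s2 )

    main :: IO ()
    main = print (sumBoth [1 .. 1000000])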

I'm not so concerned with solving this particular problem as I am in learning the purely functional technique for performing orthogonal computations on the same input without introducing a space leak.

Maybe something like this?
    arr (sum . filter (>1)) &&& arr (sum . filter (>2))

Thanks,
Greg



Re: multiple computations, same input

Neil Mitchell
Hi,

> Here would be a better example then.
>
>     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst))

I suspected that you actually wanted to do something "cleverer" with
the list. For the sake of argument, I'm going to change (>1) to p1 and
(>2) to p2, to show how this can be done in the general case. With the
specific information you know about (>1) vs (>2) you can do better, but
this gets across the general point:

f lst = show (sumPairs (>1) (>2) lst)

sumPairs :: (Int -> Bool) -> (Int -> Bool) -> [Int] -> (Int, Int)
sumPairs p1 p2 [] = (0, 0)
sumPairs p1 p2 (x:xs) = (add p1 a, add p2 b)
    where
       (a,b) = sumPairs p1 p2 xs
       add pred value = if pred x then x+value else value

[Untested, something like this should work]

You can actually arrive at this solution entirely by reasoning on the
program, i.e. without coming up with a fresh definition.

The above code essentially follows your imperative pseudo-code - I
think it's constant space, but I'm not too sure...

Thanks

Neil

Re: multiple computations, same input

Greg Buchholz
Neil Mitchell wrote:

> I suspected that you actually wanted to do something "cleverer" with
> the list, for the sake of argument, I'm going to change >1 to p1 and
> >2 to p2 - to show how this can be done in the general case. With the
> specific information you know about >1 vs >2 you can do better, but
> this gets across the general point:
>
> f lst = show (sumPairs (>1) (>2) lst)
>
> sumPairs :: (Int -> Bool) -> (Int -> Bool) -> [Int] -> (Int, Int)
> sumPairs p1 p2 [] = (0, 0)
> sumPairs p1 p2 (x:xs) = (add p1 a, add p2 b)
>     where
>        (a,b) = sumPairs p1 p2 xs
>        add pred value = if pred x then x+value else value
>
> [Untested, something like this should work]

    Nope.  That won't work, because you end up creating huge "add" thunks
which end up causing a stack overflow (tested with GHC -O2).  I think
you are probably going to need strictness in order to skin this cat in
Haskell.  Here's an example that does work...

import Data.List
main = print $ para_filter_sum (> 1) (> 2) lst

twos = 2: twos
lst = take 10000000 $ [1,2,3,4,5] ++ twos

-- f lst = show (filter (> 1) lst, filter (> 2) lst)
-- One traversal; the seqs force both running sums at each step, so no
-- chain of thunks can build up.
para_filter_sum f g xs =
    foldl' (\(n,m) elem -> seq n $ seq m $
             (n+if f elem then elem else 0,
              m+if g elem then elem else 0 ) ) (0,0) xs


Greg Buchholz


Re: multiple computations, same input

Greg Fitzgerald
In reply to this post by Neil Mitchell
Thanks Neil.  How do I add in another ~10 computations, or map a list of 100 computations to the same input?

Isn't there a way to do this without one computation having to be aware of the other? 

This feels like a situation Parsec users would find themselves in all the time.  When you have a bunch of parsers in a 'choice', does the start of the input stream linger until the last parser is executed?

Thanks,
Greg





Re: multiple computations, same input

Neil Mitchell
> Thanks Neil.  How do I add in another ~10 computations, or map a list of
> 100 computations to the same input?
>
> Isn't there a way to do this without one computation having to be aware of
> the other?

I guess they have to be aware of each other at some level; perhaps arrows
generalise the awareness they need, or perhaps you'd need something else.


> This feels like a situation Parsec users would find themselves in all the
> time.  When you have a bunch of parsers in a 'choice', does the start of the
> input stream linger until the last parser is executed?

No, as soon as one token is accepted from any parser, that input is
decided upon, and it will never go back. If you want that behaviour
you have to wrap the particular parser in try, which does give the
backtracking (and the space leak).
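
A tiny made-up example of the difference (assuming plain Parsec; untested):

    import Text.ParserCombinators.Parsec

    -- Without try, the first branch consumes the 'a' before failing on 'c',
    -- so the alternative is never attempted.
    p1, p2 :: Parser String
    p1 = string "ab" <|> string "ac"
    p2 = try (string "ab") <|> string "ac"

    main :: IO ()
    main = do
        print (parse p1 "" "ac")   -- Left <parse error>
        print (parse p2 "" "ac")   -- Right "ac"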

Thanks

Neil

Re: multiple computations, same input

Josef Svenningsson
On 3/28/06, Neil Mitchell <[hidden email]> wrote:
> > This feels like a situation Parsec users would find themselves in all the
> > time.  When you have a bunch of parsers in a 'choice', does the start of the
> > input stream linger until the last parser is executed?
>
> No, as soon as one token is accepted from any parser, that input is
> decided upon, and it will never go back. If you want that behaviour
> you have to wrap the particular parser in try, which does give the
> backtracking (and space leak)
>
I personally find this behaviour terribly confusing. It makes writing
the parser highly unmodular. It forces me to know exactly what a
certain parser recognizes to know whether I need to wrap a 'try'
around it when I compose it in parallel with another parser. Which is
why I much prefer to use parsing libraries based on Koen Claessen's
technique which performs all parses at the same time. It works
breadth-first on the parse forest (the set of parse trees).
Text.ParserCombinators.ReadP is one example which uses this technique.
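
For example (a little sketch of mine, untested), with ReadP you can write the
symmetric choice directly and no try is needed, because both branches are
pursued at once:

    import Text.ParserCombinators.ReadP

    p :: ReadP String
    p = string "ab" +++ string "ac"   -- symmetric choice, no commitment

    main :: IO ()
    main = print (readP_to_S p "ac")  -- [("ac","")]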

Cheers,

/Josef

Re: multiple computations, same input

Tomasz Zielonka
In reply to this post by Greg Fitzgerald
On Mon, Mar 27, 2006 at 03:10:18PM -0800, Greg Fitzgerald wrote:
>  > hold a part of the data in memory while you show the first one,
>
> Here would be a better example then.
>
>     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst))
>
> It ought to be *possible* to compute both operations without holding onto
> any of the list elements.

I wonder if it would be possible to remove the space-leak by running both
branches concurrently, and scheduling threads in a way that would
minimise the space-leak. I proposed this before

  http://www.haskell.org/pipermail/haskell-cafe/2005-December/013428.html

I would like to hear opinions from some compiler gurus.

Best regards
Tomasz

Re: multiple computations, same input

Greg Buchholz
Tomasz Zielonka wrote:
> I wonder if it would be possible to remove the space-leak by running both
> branches concurrently, and scheduling threads in a way that would
> minimise the space-leak. I proposed this before
>
>   http://www.haskell.org/pipermail/haskell-cafe/2005-December/013428.html

    FWIW (not much), I asked a similar question over on the
Lambda-the-Ultimate blog a while back...

    http://lambda-the-ultimate.org/node/923
    http://lambda-the-ultimate.org/node/485

...Anyway, I can't help but think that there might be a happy medium
between eager and lazy evaluation.

Greg Buchholz

Re: multiple computations, same input

Jan-Willem Maessen-2
In reply to this post by Tomasz Zielonka

On Mar 28, 2006, at 1:02 AM, Tomasz Zielonka wrote:

> On Mon, Mar 27, 2006 at 03:10:18PM -0800, Greg Fitzgerald wrote:
>>> hold a part of the data in memory while you show the first one,
>>
>> Here would be a better example then.
>>
>>     f lst = show (sum (filter (> 1) lst), sum (filter (> 2) lst))
>>
>> It ought to be *possible* to compute both operations without holding
>> onto any of the list elements.
>
> I wonder if it would be possible to remove the space-leak by running
> both branches concurrently, and scheduling threads in a way that would
> minimise the space-leak. I proposed this before
>
>   http://www.haskell.org/pipermail/haskell-cafe/2005-December/013428.html
>
> I would like to hear opinions from some compiler gurus.

This is possible in principle with something like resource-bounded
eagerness, but it's not at all easy.  The problem is this: when lst
gets big, you need to identify who's hanging on to it, and figure out
that they are actually planning to consume it and come up with
something smaller as a result.  This is all pretty heavyweight---not
hard in principle, but hard enough in practice that it may not be
worth the investment.

That said, there's a transformation that goes something like this:

    a = foldr f z xs
    b = foldr g y xs        ==>        (a,b) = foldr (f `cross` g) (z,y) xs

This could in principle at least pluck the lowest-hanging fruit (sum,
filter, etc.).
However it runs into some significant problems:
- Only works with folds
- Has some problems with bottoms, if I recall rightly
- Not expressible using something like RULES;
    requires a special transformation in the compiler.
- It is often a pessimization.

That last point is a killer.
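
Spelled out, that transformation might look roughly like this (my own
sketch; `cross` is just an assumed name for the obvious pairing combinator):

    -- Pair up two foldr operators so a single traversal builds both results.
    cross :: (x -> a -> a) -> (x -> b -> b) -> x -> (a, b) -> (a, b)
    cross f g x (a, b) = (f x a, g x b)

    fused :: (x -> a -> a) -> a -> (x -> b -> b) -> b -> [x] -> (a, b)
    fused f z g y xs = foldr (f `cross` g) (z, y) xs

    main :: IO ()
    main = print (fused (+) 0 (*) 1 [1 .. 10 :: Int])   -- (55,3628800)

Whether the single traversal actually helps depends on how lazily the
resulting pair is consumed, which is one way the pessimization above shows up.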

>
> Best regards
> Tomasz


Re: multiple computations, same input

Greg Fitzgerald
In reply to this post by Greg Buchholz
> ...Anyway, I can't help but think that there might be a happy medium
> between eager and lazy evaluation.

What I'd love to see is the compiler continue to be call-by-need, but be smart enough to recognize when multiple expressions will all eventually need to be evaluated.  A simple example:

    show (a + b)

(+) requires that *both* 'a' and 'b' be evaluated to show the result, not 'a' *then* 'b'.  It'd be great if the compiler could seek out any lazy data structures shared between 'a' and 'b', and start computing them both, one element at a time.

Has anyone put any thought into something like this?

Thanks,
Greg




 




Re: multiple computations, same input

John Meacham
On Tue, Mar 28, 2006 at 12:27:43PM -0800, Greg Fitzgerald wrote:

> >
> > ...Anyway, I can't help but think that there might be a happy medium
> > between eager and lazy evaluation.
>
>
> What I'd love to see is the compiler continue to be call-by-need, but be
> smart enough to recognize when multiple expressions will all eventually need
> to be evaluated.  A simple example:
>
>     show (a + b)
>
> (+) requires *both* 'a' and 'b' be evaluated to show the result, not 'a'
> *then* 'b'.  It'd be great if the compiler can seek out any shared lazy data
> structures in evaluating 'a' and 'b', and start computing them both with one
> element at a time.
>
> Has anyone put any thought into something like this?

This is called strictness analysis, and it is a fundamental optimization in
any Haskell compiler.

This paper has information on how that analysis is used in GHC, and a
search for 'strictness analysis' will turn up a plethora of algorithms
for computing it.

 http://citeseer.ist.psu.edu/jones91unboxed.html
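
A toy illustration of the sort of thing the analysis enables (my own
example, not from the paper):

    -- Strictness analysis sees that both arguments are always demanded,
    -- so with -O GHC can evaluate them before the call and pass them
    -- unboxed via the worker/wrapper transformation.
    mean2 :: Double -> Double -> Double
    mean2 a b = (a + b) / 2

    main :: IO ()
    main = print (mean2 1 3)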

        John


--
John Meacham - ⑆repetae.net⑆john⑈

Pragmatic concurrency Re: multiple computations, same input

Jon Fairbairn
In reply to this post by Tomasz Zielonka
On 2006-03-28 at 08:02+0200 Tomasz Zielonka wrote:
> I wonder if it would be possible to remove the space-leak by running both
> branches concurrently, and scheduling threads in a way that would
> minimise the space-leak. I proposed this before
>
>   http://www.haskell.org/pipermail/haskell-cafe/2005-December/013428.html
>
> I would like to hear opinions from some compiler gurus.

This is something I've been thinking about on and off for a
long time (probably since John Hughes mentioned the case of
"average"). I even kept Tomasz's original message in my
inbox until today in the hope that I'd get round to sending
a response, but my flaky health gets in the way. So here,
and I hope people will allow for the fact that I'm half
asleep as I write this, is an attempt.

There are some observations I'd like to make, and a
proposal. Since the proposal relates (in a small way) to
concurrency and is, I think worthwhile, I've cc'd this
message to haskell-prime.

1) choosing the optimal reduction strategy is undecidable

2) we shouldn't (in general) attempt to do undecidable
   things automatically

3) Separation of concerns: Pragmatic decisions about
   evaluation order should be kept separate from the
   denotational aspect of the code. By this token, seq
   shouldn't be a function (because it isn't one), but a
   pragma.  The fact that it's shorter to write seq a b than
   {-# SEQ a #-} b is a matter of syntax, so shouldn't rate
   highly in language design decisions. Perhaps we want a
   different syntax for this kind of pragma, but that's a
   side issue.

So, to take Tomasz's example of wc, we want to be able to
define it essentially this way:

wc cs = (ll, ww, cc) where ll = lines cs
                           ww = words cs
                           cc = length cs

but add [a] pragma[s] to the effect that evaluation should
be input driven, and that ll, ww, and cc are to be given
equal time. Something like {-# STEPPER cs; ROUND_ROBIN
ll,ww,cc #-} (please do not take this as a suggestion of
real syntax!).

The way I would implement this is to add a new primitive,
STEP, which is like seq except that it only evaluates its
argument until it encounters another STEP. (It really isn't
much different to seq).

So after the compiler understood the pragma, it would
replace wc with this (allowing the compiler to pretend step
is a function):

wc cs = (ll, ww, cc) where ll = lines cs'
                           ww = words cs'
                           cc = length cs'
                           cs' = foldr (\a -> STEP ll . STEP ww . STEP cc . (a:))
                                       []
                                       cs

Evaluation would start as normal (a wrinkle here is that, the way I've
written it, whichever element of the tuple is evaluated first gets two
goes at the start, but that's a compiler detail). When it came to
evaluating cs', it would be looking at a thunk something like

STEP ll (STEP ww (STEP cc ('x': ...)))

update the thunk to

(STEP ww (STEP cc ('x': ...)))

evaluate ll until (and if) it hits the thunk again, update
it to

(STEP cc ('x': ...))

evaluate ww until it hits the thunk, update it to

'x' : (STEP ...)

evaluate cc, and so on.

It seems to me that this wouldn't take much effort to
implement, but it would provide a simple means of removing
space leaks from a whole bunch of programmes without
mangling the source code much.

  Jón


--
Jón Fairbairn                              Jon.Fairbairn at cl.cam.ac.uk



Re: Pragmatic concurrency Re: multiple computations, same input

greenrd
On Wed, 29 Mar 2006 12:50:02 +0100
Jon Fairbairn <[hidden email]> wrote:

> There are some observations I'd like to make, and a
> proposal. Since the proposal relates (in a small way) to
> concurrency and is, I think worthwhile, I've cc'd this
> message to haskell-prime.
>
> 1) choosing the optimal reduction strategy is undecidable
>
> 2) we shouldn't (in general) attempt to do undecidable
>    things automatically
>
> 3) Separation of concerns: Pragmatic decisions about
>    evaluation order should be kept separate from the
>    denotational aspect of the code. By this token, seq
>    shouldn't be a function (because it isn't one), but a
>    pragma.  The fact that it's shorter to write seq a b than
>    {-# SEQ a #-} b is a matter of syntax, so shouldn't rate
>    highly in language design decisions. Perhaps we want a
>    different syntax for this kind of pragma, but that's a
>    side issue.

I don't like pragmas because (at least in C) they are defined to be
optional and can be ignored by the compiler. We need optimisation
methods that work across all Haskell implementations (of a given
Haskell standard).

I suggest that a Haskell program should be treated as an executable
specification. In some cases the compiler can't optimise the program
well enough, so we (by which I mean, ordinary programmers, not compiler
geeks) should be able to explicitly provide our own optimisations, as
rewrite rules (generalised ones, or specialised ones for individual
functions). Democratise the means of automated optimisation! Then we
should be able to prove formally that our rewrite rules preserve
functional correctness. This is the approach I am pursuing in the
programming language I am working on, which is a derivative of Haskell.

(In principle you could write rewrite rules in Template Haskell, but I
don't know if anyone has tried that.)

This way of looking at it is nice, because then we don't have to shut
off whole avenues of fruitful thought, on the grounds of "Oh no, the
compiler is far too stupid to do that", or "Oh no, that's far too much
of a special case for this particular example, and it would bloat the
compiler too much to include little things like this".

The way I would optimise the wc example in my language is as follows:

First translate it into a monadic pipeline in the State monad:

wc = evalState $ do
        w <- passthru (length . words)
        l <- passthru (length . lines)
        c <- passthru length
        return (w,l,c)
        where
                passthru = gets

Then convert that monadic action into a semi-lazy imperative pipeline on
lists (semi-lazy because the pipeline is evaluated lazily, but the
side-effects of the pipeline are evaluated strictly - or something
like that - I have difficulty explaining it). This is too involved to go
into here (and I haven't worked out the details of the rewrite rules
yet), but the basic idea looks like this pseudo-shell-script:

words -output w | lines -output l | length -output c >/dev/null
echo "(`cat w`, `cat l`, `cat c`)"
rm -f w l c

Each command in the first line of this pseudo-shell-script copies its
list from standard input to standard output, and stores its result in a
temporary file named by the -output option. (Obviously, in the real
code, temporary files wouldn't be used, and nor would operating system
pipes be used - I just found them convenient in order to analogise my
solution as a shell script.)

Despite the apparent waste of copying a list three times, this is
actually more efficient than the original code because it doesn't need
to store any lists in memory.

There might be better ways to do it, but that's just an idea off the top
of my head.
--
Robin

Re: Pragmatic concurrency Re: multiple computations, same input

Brian Hulley
Robin Green wrote:

> On Wed, 29 Mar 2006 12:50:02 +0100
> Jon Fairbairn <[hidden email]> wrote:
>> [snip]
>> 1) choosing the optimal reduction strategy is undecidable
>>
>> 2) we shouldn't (in general) attempt to do undecidable
>>    things automatically
>> [snip]
>[snip]
> I suggest that a Haskell program should be treated as an executable
> specification. In some cases the compiler can't optimise the program
> well enough, so we (by which I mean, ordinary programmers, not
> compiler geeks) should be able to explicitly provide our own
> optimisations, as rewrite rules (generalised ones, or specialised
> ones for individual functions). Democratise the means of automated
> optimisation!

This sounds good. The only thing I'm wondering is what do we actually gain
by using Haskell in the first place instead of just a strict language? It
seems that Haskell's laziness gives a succinct but too inefficient program
which then needs extra code in the form of rewrite rules/pragmas, or else a
complete rewrite in terms of seq etc to get it to run fast enough without
space leaks...

Regards, Brian.


Re: Pragmatic concurrency Re: multiple computations, same input

Brian Hulley
Brian Hulley wrote:

> Robin Green wrote:
>> On Wed, 29 Mar 2006 12:50:02 +0100
>> Jon Fairbairn <[hidden email]> wrote:
>>> [snip]
>>> 1) choosing the optimal reduction strategy is undecidable
>>>
>>> 2) we shouldn't (in general) attempt to do undecidable
>>>    things automatically
>>> [snip]
>> [snip]
>> I suggest that a Haskell program should be treated as an executable
>> specification. In some cases the compiler can't optimise the program
>> well enough, so we (by which I mean, ordinary programmers, not
>> compiler geeks) should be able to explicitly provide our own
>> optimisations, as rewrite rules (generalised ones, or specialised
>> ones for individual functions). Democratise the means of automated
>> optimisation!
>
> This sounds good. The only thing I'm wondering is what do we actually
> gain by using Haskell in the first place instead of just a strict
> language? It seems that Haskell's laziness gives a succinct but too
> inefficient program which then needs extra code in the form of
> rewrite rules/pragmas, or else a complete rewrite in terms of seq etc
> to get it to run fast enough without space leaks...

Thinking about this some more, I realised Jon had already answered this
question in his 3rd point:

     On Wed, 29 Mar 2006 12:50:02 +0100
     Jon Fairbairn <[hidden email]> wrote:
     > 3) Separation of concerns: Pragmatic decisions about
     >    evaluation order should be kept separate from the
     >    denotational aspect of the code. By this token, seq

I wonder if there could be a really large repository of rewrite rules on the
web somewhere, with heuristics to determine various strategies for applying
them.

There would also need to be some automated way of proving correctness of
rewrite rules, so that if someone submitted a new one it would be sure not
to introduce bugs into the optimization.

In this way, the Haskell community could gradually chip away at the
undecidableness of automatically optimizing Haskell programs, because it may
turn out to be the case that most functions are members of a very small
subset of the possible Haskell functions and could thus be handled by a
finite set of rewrite rules.

Regards, Brian.



Re: Pragmatic concurrency Re: multiple computations, same input

Philippa Cowderoy
In reply to this post by Brian Hulley
On Wed, 29 Mar 2006, Brian Hulley wrote:

> This sounds good. The only thing I'm wondering is what do we actually gain by
> using Haskell in the first place instead of just a strict language? It seems
> that Haskell's laziness gives a succinct but too inefficient program which
> then needs extra code in the form of rewrite rules/pragmas, or else a complete
> rewrite in terms of seq etc to get it to run fast enough without space
> leaks...
>

Often the laziness is useful for purposes of efficiency as well though.

--
[hidden email]

Sometimes you gotta fight fire with fire. Most
of the time you just get burnt worse though.

Re: Pragmatic concurrency Re: multiple computations, same input

John Meacham
In reply to this post by greenrd
On Wed, Mar 29, 2006 at 03:23:04PM +0100, Robin Green wrote:
> I suggest that a Haskell program should be treated as an executable
> specification. In some cases the compiler can't optimise the program
> well enough, so we (by which I mean, ordinary programmers, not compiler
> geeks) should be able to explicitly provide our own optimisations, as
> rewrite rules (generalised ones, or specialised ones for individual
> functions). Democratise the means of automated optimisation! Then we
> should be able to prove formally that our rewrite rules preserve
> functional correctness. This is the approach I am pursuing in the
> programming language I am working on, which is a derivative of Haskell.

Have you seen the RULES pragma? It is implemented in both GHC and jhc.
 
http://www.haskell.org/ghc/docs/6.4/html/users_guide/rewrite-rules.html
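
For flavour, a rule in that style looks roughly like this (a sketch along
the lines of the map fusion example in those docs):

    {-# RULES
    "map/map"   forall f g xs.   map f (map g xs) = map (f . g) xs
      #-}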

        John

--
John Meacham - ⑆repetae.net⑆john⑈

Re: Pragmatic concurrency Re: multiple computations, same input

Tomasz Zielonka
In reply to this post by Jon Fairbairn
On Wed, Mar 29, 2006 at 12:50:02PM +0100, Jon Fairbairn wrote:

> [...]
>
> but add [a] pragma[s] to the effect that evaluation should
> be input driven, and that ll, ww, and cc are to be given
> equal time. Something like {-# STEPPER cs; ROUND_ROBIN
> ll,ww,cc #-} (please do not take this as a suggestion of
> real syntax!).
>
> The way I would implement this is to add a new primitive,
> STEP, which is like seq except that it only evaluates its
> argument until it encounters another STEP. (It really isn't
> much different to seq).
>
> [...]
>
> It seems to me that this wouldn't take much effort to
> implement, but it would provide a simple means of removing
> space leaks from a whole bunch of programmes without
> mangling the source code much.

Actually, it may require no effort from compiler implementors.
I just managed to get the desired effect in current GHC! :-)

I implemented your stepper idea by writing a function, stepper, that
rewrites the list, invoking "yield" every 500 processed elements. This
way I can consume the list concurrently without the space leak - when a
thread evaluates too many list elements, it gets preempted. I think it
suffices if the RTS employs a round-robin scheduler, but I am not sure
that's important.

The code isn't as beautiful as the naive wc implementation. That's
because I haven't yet thought about how to hide newEmptyMVar, forkIO,
putMVar and takeMVar. Perhaps someone will come up with a solution to
this.

    import Control.Concurrent
    import Control.Monad
    import System.IO.Unsafe (unsafePerformIO)

    stepper l = s n l
      where
        n = 500
        -- after every n elements, yield so the other consumers can catch up
        s 0 (x:xs) = unsafePerformIO $ do
            yield
            return (x : s n xs)
        s i (x:xs) = x : s (i-1) xs
        s _ []     = []

    main = do
        cs <- liftM stepper getContents
        ll <- newEmptyMVar
        ww <- newEmptyMVar
        cc <- newEmptyMVar
        forkIO $ putMVar ll $! length (lines cs)
        forkIO $ putMVar ww $! length (words cs)
        forkIO $ putMVar cc $! length cs
        takeMVar ll >>= print
        takeMVar ww >>= print
        takeMVar cc >>= print

See how well it works:

    $ cat words words words words | ./A +RTS -sstderr
    ./A +RTS -K8M -sstderr
    394276
    394272
    3725868                                             <- that's the size of cs
    643,015,284 bytes allocated in the heap
     72,227,708 bytes copied during GC
        109,948 bytes maximum residency (46 sample(s))  <- no space leak!

           2452 collections in generation 0 (  0.33s)
             46 collections in generation 1 (  0.00s)

              2 Mb total memory in use                  <- no space leak!

      INIT  time    0.00s  (  0.01s elapsed)
      MUT   time    1.25s  (  1.27s elapsed)
      GC    time    0.33s  (  0.36s elapsed)
      EXIT  time    0.00s  (  0.00s elapsed)
      Total time    1.58s  (  1.64s elapsed)

      %GC time      20.9%  (22.0% elapsed)

      Alloc rate    514,412,227 bytes per MUT second

      Productivity  79.1% of total user, 76.2% of total elapsed

Thanks for your idea, Jon! :-)

Best regards
Tomasz