laziness and optimization

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

laziness and optimization

Michael Mossey
I understand a bit about the concept of "lazy evaluation." I think of
that as saying an imperative language would always make one evaluation,
whereas Haskell might make 0 evaluations. I have another similar
situation where an imperative language would make N evaluations of the
same expression, and I would like Haskell to make only 1.

This is the situation: the graphical score editor displays
"LayoutItems." A LayoutItem can be a single displayed entity, like a
round notehead, or it can be composed of several entities.

A common situation in my code is the need to determine the size and
shape of a LayoutItem. For a fundamental item, this can be looked up in
a table or read from the font properties. For a composite item, some
computation is required: the code must determine the positions of each
sub-item and compute the bounds of a shape containing all of them.

It's this latter computation, finding the bounds of a composite item,
which might come up multiple times. Consider that I ask for the bounds
of a composite-composite item (a composite item composed of composite
items). It will run the computation associated with each composite
sub-item, even though it is very likely I already make that computation
when I first constructed and placed that sub-item.

In an imperative language, one might cache values for later lookup. This
raises the problem of keeping the cache current to the current state.

So I'm wondering to what extent the haskell compiler recognizes
computations it's done before. In a purely functional language this
should be pretty easy, right? If it sees the same expression, it knows
it will have the same value. That's my understanding, so far.

Thanks,
Mike
Reply | Threaded
Open this post in threaded view
|

laziness and optimization

Brent Yorgey-2
On Sat, Mar 21, 2009 at 06:02:58AM -0700, Michael Mossey wrote:
>
> So I'm wondering to what extent the haskell compiler recognizes
> computations it's done before. In a purely functional language this should
> be pretty easy, right? If it sees the same expression, it knows it will
> have the same value. That's my understanding, so far.

It really doesn't; this isn't as easy in a purely functional language
as you think.  For one thing, how to recognize which computations have
been done before?  The runtime would have to keep around some sort of
hash mapping expressions to values, and incur a lot of overhead
checking each thing to be evaluated against this map. Furthermore,
this may be very expensive memory-wise.  Consider the the case where
some expression generates a very long (say, million-element) list, but
it is immediately processed by some other code which counts how many
threes the list contains.  This can be done in constant memory, since
the garbage collector can come along behind and clean up the processed
parts of the list even while it is still being generated.  But if we
need to keep the list around just in case the expression that
generated it is used again, we will need to keep a million-element
list in memory!  So sometimes, memoizing expression values like this
is a pessimization, and it's very difficult to tell the difference.

This is a tricky problem; I've run into pretty much the exact same
problem (the need to cache computed sizes for recursively constructed
elements) in my 'diagrams' library.  Probably the easiest solution
(which I haven't actually yet implemented in my library, but plan to
do something along these lines!) is just to cache the size like you
would in an imperative language.  For example:
 
    data Thingy = Composite [Sized Thingy]
                | Primitive String

    data Sized a = S { size :: Int, thing :: a }

    thingySize :: Thingy -> Int
    thingySize (Primitive s) = length s
    thingySize (Composite sts) = sum (map size sts)

    mkSizedThingy :: Thingy -> Sized Thingy
    mkSizedThingy t = S (thingySize t) t

This has exactly the property you want: due to laziness, the size of a
SizedThingy will be computed *only* if it is ever needed; and it will
be computed at most once.

-Brent