strange stack overflow with Data.Map

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

strange stack overflow with Data.Map

David Roundy-2
Hi all,

I've got a problem that I'm seeing using either Data.Map or Data.IntMap.

> module Main where
> import Data.List
> import qualified Data.IntMap as Map

> stats elems = foldl add_elem Map.empty elems
> add_elem m x = Map.insertWith (+) x 1 m

> main = print $ stats $ take 1000000 $ repeat 1

This program has a space leak and runs out of stack space.  I'm guessing
that I'm being bit here by an unnatural amount of laziness in
Map.insertWith, but I can't see how to fix this.  :(  I'm sure it's
something elementary...

I tried defining

add_elem m x = let m' = Map.insertWith (+) x 1 m
                   Just num = Map.lookup x m'
               in seq num m'
to force the (+) to be evaluated strictly, but that didn't help.
--
David Roundy
http://www.darcs.net
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Udo Stenzel
David Roundy wrote:
> > stats elems = foldl add_elem Map.empty elems
> > add_elem m x = Map.insertWith (+) x 1 m
> [...]
> I tried defining
>
> add_elem m x = let m' = Map.insertWith (+) x 1 m
>                    Just num = Map.lookup x m'
>                in seq num m'
> to force the (+) to be evaluated strictly, but that didn't help.

No, it doesn't, unless you also use foldl'.  Not tested, but been
there, seen it, bought the t-shirt.


Udo.
--
The Force is what holds everything together.  It has its dark side, and
it has its light side.  It's sort of like cosmic duct tape.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (196 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Bulat Ziganshin
In reply to this post by David Roundy-2
Hello David,

Thursday, December 29, 2005, 3:42:05 AM, you wrote:

>> stats elems = foldl add_elem Map.empty elems

DR> This program has a space leak and runs out of stack space.  I'm guessing
DR> that I'm being bit here by an unnatural amount of laziness in
DR> Map.insertWith

stack overflows AFAIK is never occur because of laziness, but only
because your recursion is not tail-optimized. as Udo says, problem may
be fixed by using proper fold. if foldl' will not work - try to write
your own, explicit looping scheme


--
Best regards,
 Bulat                            mailto:[hidden email]



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Tomasz Zielonka
On Thu, Dec 29, 2005 at 11:22:11AM +0300, Bulat Ziganshin wrote:
> stack overflows AFAIK is never occur because of laziness, but only
> because your recursion is not tail-optimized.

AFAIK, evaluating a thunk in GHC-compiled code does use the stack -
update frames are being put on it.

There is a nice description in this paper, at the bottom of page 3:
http://research.microsoft.com/Users/simonpj/papers/parallel/multiproc.pdf

Best regards
Tomasz

--
I am searching for a programmer who is good at least in some of
[Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Christian Maeder
In reply to this post by David Roundy-2
David Roundy wrote:
>> stats elems = foldl add_elem Map.empty elems
>> add_elem m x = Map.insertWith (+) x 1 m
>
>> main = print $ stats $ take 1000000 $ repeat 1
>
> This program has a space leak and runs out of stack space.  I'm guessing
> that I'm being bit here by an unnatural amount of laziness in
> Map.insertWith, but I can't see how to fix this.  :(  I'm sure it's
> something elementary...

try:

stats = foldl' add_elem Map.empty
add_elem m x = myinsertWith (+) x 1 m

myinsertWith f k v m =
   case Map.lookup k m of
         Nothing -> Map.insert k v m
         Just w -> let r = f v w in r `seq` Map.insert k r m

main = print $ stats $ take 1000000 $ repeat 1
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

David Roundy
On Thu, Dec 29, 2005 at 01:42:29PM +0100, Christian Maeder wrote:

> David Roundy wrote:
> >>stats elems = foldl add_elem Map.empty elems
> >>add_elem m x = Map.insertWith (+) x 1 m
> >
> >>main = print $ stats $ take 1000000 $ repeat 1
> >
> >This program has a space leak and runs out of stack space.  I'm guessing
> >that I'm being bit here by an unnatural amount of laziness in
> >Map.insertWith, but I can't see how to fix this.  :(  I'm sure it's
> >something elementary...
>
> try:
>
> stats = foldl' add_elem Map.empty
> add_elem m x = myinsertWith (+) x 1 m
>
> myinsertWith f k v m =
>   case Map.lookup k m of
>         Nothing -> Map.insert k v m
>         Just w -> let r = f v w in r `seq` Map.insert k r m

Thanks, that does it! I had tried something like that, but without the
foldl', and had tried foldl', but without the myinsertWith.  The reason I
didn't spend much time using this implementation is that for large maps, it
will take twice as long as Map.insertWith, so I assumed (incorrectly) that
Map.insertWith should be written so that it behaves in a non-leaky manner.

Should the Map modules have an additional Map.insertWith' that behaves
strictly, or might it be the case that you always want strict behavior when
using insertWith?
--
David Roundy
http://www.darcs.net
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Jean-Philippe Bernardy
On 12/29/05, David Roundy <[hidden email]> wrote:

> On Thu, Dec 29, 2005 at 01:42:29PM +0100, Christian Maeder wrote:
> > David Roundy wrote:
> > >>stats elems = foldl add_elem Map.empty elems
> > >>add_elem m x = Map.insertWith (+) x 1 m
> > >
> > >>main = print $ stats $ take 1000000 $ repeat 1
> > >
> > >This program has a space leak and runs out of stack space.  I'm guessing
> > >that I'm being bit here by an unnatural amount of laziness in
> > >Map.insertWith, but I can't see how to fix this.  :(  I'm sure it's
> > >something elementary...
> > try:
> >
> > stats = foldl' add_elem Map.empty
> > add_elem m x = myinsertWith (+) x 1 m
> >
> > myinsertWith f k v m =
> >   case Map.lookup k m of
> >         Nothing -> Map.insert k v m
> >         Just w -> let r = f v w in r `seq` Map.insert k r m
>
> Thanks, that does it! I had tried something like that, but without the
> foldl', and had tried foldl', but without the myinsertWith.  The reason I
> didn't spend much time using this implementation is that for large maps, it
> will take twice as long as Map.insertWith, so I assumed (incorrectly) that
> Map.insertWith should be written so that it behaves in a non-leaky manner.

I'm not sure I understand this ... Do you say that the leaky program
is faster than the non-leaky one ?

> Should the Map modules have an additional Map.insertWith' that behaves
> strictly, or might it be the case that you always want strict behavior when
> using insertWith?

IMO, there is always a possibility for needing the lazy version. I
also have to say that making a strict version for each "...With" does
not fill my heart with joy. :) Such cases call for cheapness analysis,
or optimistic evaluation, or some other kind of solution implemented
at the compiler level, IMHO.

Cheers,
JP.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Udo Stenzel
In reply to this post by David Roundy
David Roundy wrote:
> Should the Map modules have an additional Map.insertWith' that behaves
> strictly, or might it be the case that you always want strict behavior when
> using insertWith?

I think so.  Once strictness is lost, there's nothing the user of a
library could do about it.  If a container is too strict however,
lazyness can be recovered by wrapping values in an additional data type.
So strict variants of updating functions would be a big win, and if two
versions of every function are deemed too much of a burden, I'd rather
get rid of the lazy version.


Udo.
--
Ours is a world where people don't know what they want and are willing
to go through hell to get it. -- Don Marquis

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (196 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

David Roundy
In reply to this post by Jean-Philippe Bernardy
On Thu, Dec 29, 2005 at 04:24:02PM +0100, Jean-Philippe Bernardy wrote:

> On 12/29/05, David Roundy <[hidden email]> wrote:
> > On Thu, Dec 29, 2005 at 01:42:29PM +0100, Christian Maeder wrote:
> > > David Roundy wrote:
> > > >>stats elems = foldl add_elem Map.empty elems
> > > >>add_elem m x = Map.insertWith (+) x 1 m
> > > >
> > > >>main = print $ stats $ take 1000000 $ repeat 1
> > > >
> > > >This program has a space leak and runs out of stack space.  I'm guessing
> > > >that I'm being bit here by an unnatural amount of laziness in
> > > >Map.insertWith, but I can't see how to fix this.  :(  I'm sure it's
> > > >something elementary...
> > > try:
> > >
> > > stats = foldl' add_elem Map.empty
> > > add_elem m x = myinsertWith (+) x 1 m
> > >
> > > myinsertWith f k v m =
> > >   case Map.lookup k m of
> > >         Nothing -> Map.insert k v m
> > >         Just w -> let r = f v w in r `seq` Map.insert k r m
> >
> > Thanks, that does it! I had tried something like that, but without the
> > foldl', and had tried foldl', but without the myinsertWith.  The reason I
> > didn't spend much time using this implementation is that for large maps, it
> > will take twice as long as Map.insertWith, so I assumed (incorrectly) that
> > Map.insertWith should be written so that it behaves in a non-leaky manner.
>
> I'm not sure I understand this ... Do you say that the leaky program
> is faster than the non-leaky one ?

No, what I mean is that Map.insertWith only traverses the Map once, while
myinsertWith has to traverse it twice, so for a large map, each call to
myinsertWith will take twice as long as Map.insertWith.  Or to put it a
different way, if myinsertWith were part of the Map module, it could be
twice as fast.

> > Should the Map modules have an additional Map.insertWith' that behaves
> > strictly, or might it be the case that you always want strict behavior when
> > using insertWith?
>
> IMO, there is always a possibility for needing the lazy version. I
> also have to say that making a strict version for each "...With" does
> not fill my heart with joy. :) Such cases call for cheapness analysis,
> or optimistic evaluation, or some other kind of solution implemented
> at the compiler level, IMHO.

I agree, there is a possibility for wanting the lazy version.  If you only
want to provite one insertWith, I suppose you'd want to figure out whether
the lazy or strict version is more commonly needed (or "safer"), and make
people wanting the other version implement it themselves.  In my opinion,
the strict version seems nicer, largely because HNF is cheap for most
"large" computations I do, so strictness wouldn't cost much.

On the subject of excessive functions, what are the uses of insertWithKey?
It seems like anything you do with insertWithKey you could just as
efficiently do with insertWith... it seems pretty trivial to write

insertWithKey f k x = insertWith (f k) k x

There may be a use to all the WithKey variants, but I'd much rather have
strict varients...
--
David Roundy
http://www.darcs.net
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Cale Gibbard
In reply to this post by David Roundy-2
Although it's already been solved, I'd like to point out here that
foldl is (or may be) getting tail optimised, but that the stack
overflow isn't from the foldl itself, but from the evaluation of the
huge expression which that foldl builds. Evaluating the left
associative expression involves immediately pushing 1000000 items on
the stack.

Note that:
foldl (flip (:)) [] (replicate 1000000 1)
works fine after a short pause, due to the fact that the result can be
lazily evaluated and printed one piece at a time, whereas
foldl (+) 0 (replicate 1000000 1)
causes a stack overflow.

As was mentioned, the solution is to use the stricter foldl' to keep
the accumulated expression small.

 - Cale

On 28/12/05, David Roundy <[hidden email]> wrote:

> Hi all,
>
> I've got a problem that I'm seeing using either Data.Map or Data.IntMap.
>
> > module Main where
> > import Data.List
> > import qualified Data.IntMap as Map
>
> > stats elems = foldl add_elem Map.empty elems
> > add_elem m x = Map.insertWith (+) x 1 m
>
> > main = print $ stats $ take 1000000 $ repeat 1
>
> This program has a space leak and runs out of stack space.  I'm guessing
> that I'm being bit here by an unnatural amount of laziness in
> Map.insertWith, but I can't see how to fix this.  :(  I'm sure it's
> something elementary...
>
> I tried defining
>
> add_elem m x = let m' = Map.insertWith (+) x 1 m
>                    Just num = Map.lookup x m'
>                in seq num m'
> to force the (+) to be evaluated strictly, but that didn't help.
> --
> David Roundy
> http://www.darcs.net
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

David Mercer-2
In reply to this post by Udo Stenzel
On 12/29/05, Udo Stenzel <[hidden email]> wrote:
***snipped a bunch of use case specific stuff**

>and if two
> versions of every function are deemed too much of a burden, I'd rather
> get rid of the lazy version.

Huh, then why use haskell, there are strict purely functional languages?

/snarky philosophical comment

-David Mercer
Tucson, AZ
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: strange stack overflow with Data.Map

Cale Gibbard
In reply to this post by Udo Stenzel
Laziness and strictness are both important depending on the situation.
I'd recommend keeping both variants around. Having to wrap values in
an extra data type just to keep laziness sort of defeats the point of
Haskell being lazy in the first place. It also makes it somewhat
awkward to use in the cases where laziness is important, which is, I
suspect, quite a lot of the time where the codomain of the Map is a
recursive datatype.

Having lazy and strict variants of data structure libraries (i.e.
splitting the modules into *.Lazy and *.Strict) seems like a good idea
though, but I can imagine some cases arising where it would be
convenient to swap between the two, so it would be good to keep the
type the same anyway. The parent module would then reexport one of the
variants. (Obviously the lazy one, since this is Haskell, right? ;)

 - Cale

On 29/12/05, Udo Stenzel <[hidden email]> wrote:

> David Roundy wrote:
> > Should the Map modules have an additional Map.insertWith' that behaves
> > strictly, or might it be the case that you always want strict behavior when
> > using insertWith?
>
> I think so.  Once strictness is lost, there's nothing the user of a
> library could do about it.  If a container is too strict however,
> lazyness can be recovered by wrapping values in an additional data type.
> So strict variants of updating functions would be a big win, and if two
> versions of every function are deemed too much of a burden, I'd rather
> get rid of the lazy version.
>
>
> Udo.
> --
> Ours is a world where people don't know what they want and are willing
> to go through hell to get it. -- Don Marquis
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.1 (GNU/Linux)
>
> iD8DBQFDtAMdc1ZCC9bsOpURAs7ZAJ9pS/L7pf9tpRoTGi3HDR0gyindOQCfakED
> Xdpm15vvyqF51QAmCJeCm1U=
> =tZhs
> -----END PGP SIGNATURE-----
>
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries
>
>
>
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries