"containing" memory-consuming computations
Herbert Valerio Riedel
Hello GHC Devs,

One issue that's been bothering me when writing Haskell programs meant
to be long-running processes performing computations on external input
data in an event/request loop (think web services, SQL servers, or
REPLs) is that it's desirable to limit resource usage and to "contain"
the effects of computations that exhaust the resource limits (i.e.
without crashing and burning the whole process).

For the time-dimension, I'm already using functions such as
System.Timeout.timeout which I can use to make sure that even a (forced)
pure computation doesn't require (significantly) more wall-clock time
than I expect it to.
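A minimal sketch of that pattern, using only base (for deeply
structured results you would also want `force` from the deepseq
package; an `Integer` is fully evaluated at WHNF, so plain `evaluate`
suffices here):

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- Bound the wall-clock time of forcing a pure computation.
-- The budget is in microseconds; Nothing means the deadline passed.
main :: IO ()
main = do
  fast <- timeout 1000000 (evaluate (sum [1 .. 100000 :: Integer]))
  print fast    -- Just 5000050000
  slow <- timeout 1000 (evaluate (sum [1 .. 10000000 :: Integer]))
  print slow    -- Nothing: 1 ms is not enough for 10^7 additions
```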

But I'm missing a similar facility for constraining the
space-dimension. In some other languages such as C, I have (more or
less) the ability to check for /local/ out-of-memory conditions (e.g.
by checking the return value of malloc(3) for heap allocations, or by
handling an OOM exception), roll back the computation, and skip to the
next computation request (which hopefully requires less memory...)

So, is there already any such facility provided by GHC or its libraries
that I've missed so far?

...and if not, would such a memory-limiting facility be reconcilable
with the GHC RTS architecture?

Cheers,
  hvr
--

_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Re: "containing" memory-consuming computations

Ryan Newton
Hi Herbert,

It sounds like you're interested in running just one client computation
at a time?  Hence you don't have a disambiguation problem -- if the
total memory footprint crosses a threshold, you know who to blame.

At least this seems easier than needing per-computation or
per-IO-thread caps.  By the way, the folks who implemented Second Life
did an interesting job of that -- they hacked Mono to be able to
execute untrusted code with resource bounds.

Cheers,
  -Ryan

On Thu, Apr 19, 2012 at 6:45 AM, Herbert Valerio Riedel <[hidden email]> wrote:
> [...]

Re: "containing" memory-consuming computations

Bardur Arantsson
On 04/19/2012 12:45 PM, Herbert Valerio Riedel wrote:
> Hello GHC Devs,
>
> But I'm missing a similar facility for constraining the
> space-dimension. In some other languages such as C, I have (more or
> less) the ability to check for /local/ out-of-memory conditions (e.g. by
> checking the return value of e.g. malloc(3) for heap-allocations, or by
> handling an OOM exception), rollback the computation, and be able to

Just a minor point which may be of interest: many operating systems
(including Linux, by default) overcommit memory, so there is no
guarantee that memory returned by malloc() will actually be usable
under memory pressure.

So, getting a non-NULL pointer from malloc() is not a guarantee.

(There are good reasons for this behavior.)

Regards,



Re: "containing" memory-consuming computations

Herbert Valerio Riedel
Ryan Newton <[hidden email]> writes:

> It sounds like you're interested in running just one client computation at
> once?

Actually, I was hoping for a more general solution, which is also
applicable to e.g. a web-server running with `+RTS -N8`, where each HTTP
request spawns a new thread, and multiple requests are supposed to run
in parallel and concurrently.
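To make the setting concrete, a hypothetical request loop of that shape
might look as follows, with only the time dimension capped (all names
here are illustrative; a per-worker heap budget is exactly the missing
piece):

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- One worker thread per request; each gets a 1-second wall-clock
-- budget.  There is no analogous way to give each worker a heap budget.
handleRequest :: Integer -> IO ()
handleRequest n = do
  r <- timeout 1000000 (evaluate (sum [1 .. n]))
  putStrLn (maybe "request timed out" show r)

main :: IO ()
main = do
  mapM_ (forkIO . handleRequest) [10, 20]
  threadDelay 100000  -- crude: wait for the workers before exiting
```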

> Hence you don't have a disambiguation problem -- if the total memory
> footprint crosses a threshold you know who to blame.

I could use that, but then I'd have to clone the process (or start the
process multiple times, duplicating all in-memory data structures), and
the amount of parallelism/concurrency would be limited by the number of
cloned Unix processes. Alas, this works only for some use-cases (like
e.g. a single-user GHCi REPL).

> At least this seems easier than needing a per-computation or
> per-IO-thread caps.

How hard would per-IO-thread caps be?

> By the way, the folks who implement Second Life did an interesting job
> of that -- they hacked Mono to be able to execute untrusted code with
> resource bounds.


Re: "containing" memory-consuming computations

Simon Marlow
On 19/04/2012 11:45, Herbert Valerio Riedel wrote:

> For the time-dimension, I'm already using functions such as
> System.Timeout.timeout which I can use to make sure that even a (forced)
> pure computation doesn't require (significantly) more wall-clock time
> than I expect it to.

Note that timeout uses wall-clock time, but you're really interested in
CPU time (presumably).  If there are other threads running, then using
timeout will not do what you want.

You could track allocation and CPU usage per thread, but note that
laziness could present a problem: if a thread evaluates a lazy
computation created by another thread, it will be charged to the thread
that evaluated it, not the thread that created it.  To get around this
you would need to use the profiling system, which tracks costs
independently of lazy evaluation.

On 19/04/2012 17:04, Herbert Valerio Riedel wrote:

>> At least this seems easier than needing a per-computation or
>> per-IO-thread caps.
>
> How hard would per-IO-thread caps be?

For tracking "memory use", which I think is what you're asking for, it
would be quite hard.  One problem is sharing: when a data structure is
shared between multiple threads, which one should it be charged to?  Both?

To calculate the amount of memory use per thread you would need to run
the GC multiple times, once per thread, and observe how much data is
reachable.  I can't think of any fundamental difficulties with doing
that, but it could be quite expensive.  There might be some tricky
interactions with the reachability property of threads themselves: a
blocked thread is only reachable if the object it is blocked on is also
reachable.

Cheers,
        Simon




Re: "containing" memory-consuming computations

Edward Z. Yang
So, it would be pretty interesting if we could have an ST s style
mechanism, where the data structure is not allowed to escape.
But I wonder if this would be too cumbersome for anyone to use.

Edward
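For reference, the escape-prevention trick alluded to here is the
rank-2 phantom type already used by runST -- the `s` in `ST s` keeps
region-local state (STRefs) from leaking out:

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- runST :: (forall s. ST s a) -> a
-- The quantified 's' means an 'STRef s Int' cannot appear in the
-- result type 'a', so the mutable state cannot escape its region.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

main :: IO ()
main = print (sumST [1 .. 10])   -- 55
```

A memory-bounded region would need the same quantifier trick plus RTS
support for per-region accounting, which is the hard part.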

Excerpts from Simon Marlow's message of Fri Apr 20 06:07:20 -0400 2012:
> [...]

Re: "containing" memory-consuming computations

Brandon Allbery
On Fri, Apr 20, 2012 at 12:56, Edward Z. Yang <[hidden email]> wrote:
> So, it would be pretty interesting if we could have an ST s style
> mechanism, where the data structure is not allowed to escape.
> But I wonder if this would be too cumbersome for anyone to use.

Isn't this what monadic regions are for?

--
brandon s allbery                                      [hidden email]
wandering unix systems administrator (available)     (412) 475-9364 vm/sms



Re: "containing" memory-consuming computations

Edward Z. Yang
Excerpts from Brandon Allbery's message of Fri Apr 20 19:31:54 -0400 2012:
> > So, it would be pretty interesting if we could have an ST s style
> > mechanism, where the data structure is not allowed to escape.
> > But I wonder if this would be too cumbersome for anyone to use.
>
> Isn't this what monadic regions are for?

That's right!  But we have a hard enough time convincing people it's
worth it, just for file handles.

Edward


Re: "containing" memory-consuming computations

Ryan Newton
Simon mentioned a system of doing multiple GCs to measure actual live
data.

But wouldn't a more limited alternative be capping allocation rather
than live data?  GHC already has a mechanism to preempt IO threads
based on an allocation trip wire; in fact, that's *the* preemption
mechanism.  Isn't the only missing piece a primitive similar to Chez
Scheme's "make-engine", which would transfer control to a child
computation, but return control to the parent (along with a
continuation) when its allocation budget is exhausted?

make-engine + Safe Haskell + timeouts should be everything one needs to
resist an adversarial untrusted program.  Maybe?

  -Ryan

P.S. Chez Scheme engines are actually metered by the number of
procedure calls, not by allocation, as far as I know.
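Such an allocation cap can be sketched with the per-thread allocation
counter that later versions of GHC expose (this is a sketch assuming
GHC >= 8.0, where GHC.Conc provides setAllocationCounter and
enableAllocationLimit; note it caps cumulative allocation, not live
heap):

```haskell
import Control.Exception (AllocationLimitExceeded, evaluate, try)
import Data.Int (Int64)
import GHC.Conc (enableAllocationLimit, setAllocationCounter)

-- Give the current thread an allocation budget (in bytes).  When the
-- budget is spent, the RTS throws AllocationLimitExceeded to the
-- thread, which we turn into a Left so a request loop can carry on.
withAllocCap :: Int64 -> IO a -> IO (Either AllocationLimitExceeded a)
withAllocCap budget act = do
  setAllocationCounter budget
  enableAllocationLimit
  try act

main :: IO ()
main = do
  r <- withAllocCap (64 * 1024) (evaluate (sum [1 .. 1000000 :: Integer]))
  putStrLn (either (const "allocation budget exhausted") show r)
```

Note this charges allocation to whichever thread forces a thunk, which
is exactly the laziness caveat Simon raises above.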


On Fri, Apr 20, 2012 at 7:35 PM, Edward Z. Yang <[hidden email]> wrote:
> [...]

