Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Eyal Lotem
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Gregory Collins-3
Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.

The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.

G


On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Gregory Collins <[hidden email]>

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Merijn Verstraaten
Handles use MVar’s internally, therefore any operation on a Handle can potentially block, making the operation interruptible.

FWIW, I’m +1 on this. It’s been a pain to get this correct in my code.

Cheers,
Merijn

On 03 Sep 2014, at 17:01 , Gregory Collins <[hidden email]> wrote:

> Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.
>
> The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.
>
> G
>
>
> On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
> I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.
>
> For example, bracket is defined as:
> bracket before after thing =
>   mask $ \restore -> do
>     a <- before
>     r <- restore (thing a) `onException` after a
>     _ <- after a
>     return r
> This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.
>
> This means bracket does not in fact guarantee the cleanup occurs.
>
> For example:
>
> readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!
>
> withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!
>
> Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.
>
> I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:
>
> bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException
>
> should all simply wrap their exception/cancellation handler with uninterruptibleMask.
>
> The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.
>
> --
> Eyal
>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries
>
>
>
>
> --
> Gregory Collins <[hidden email]>
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries

signature.asc (817 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Eyal Lotem
In reply to this post by Gregory Collins-3

In addition to hClose, the majority of cleanup handlers in my programs turned out to be interruptible. Moreover, whether something is interruptible is unclear. You can easily add a putStrLn to a cleanup handler and now it is accidentally interruptible.

I'd love to see examples of some code where interruptible cleanup handlers are not a bug.

Every single one in my programs that I examined was a bug.

Is withMVar also a primop? Because it's buggy in the same way as withFile currently is.

The current situation is that virtually all uses of bracket in the entire Haskell ecosystem are subtle bugs.

On Sep 4, 2014 3:01 AM, "Gregory Collins" <[hidden email]> wrote:
Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.

The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.

G


On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Gregory Collins <[hidden email]>

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

John Lato-2
Being a primop or not has nothing to do with whether an MVar operation is interruptible.  What matters is whether or not the operation will block.  withMVar (or readMVar in any incarnation) on an empty MVar is interruptible.  If the MVar happened to be full, it's not interruptible.

I agree this is a problem.  I don't think the proposed solution is perfect, but I do think it's possibly better than the status quo.  Perhaps the user should be required to use uninterruptibleMask_ on the cleanup action if necessary?  I've long thought that using an interruptible operation in a cleanup handler is a programming error.

John L.


On Thu, Sep 4, 2014 at 12:29 AM, Eyal Lotem <[hidden email]> wrote:

In addition to hClose, the majority of cleanup handlers in my programs turned out to be interruptible. Moreover, whether something is interruptible is unclear. You can easily add a putStrLn to a cleanup handler and now it is accidentally interruptible.

I'd love to see examples of some code where interruptible cleanup handlers are not a bug.

Every single one in my programs that I examined was a bug.

Is withMVar also a primop? Because it's buggy in the same way as withFile currently is.

The current situation is that virtually all uses of bracket in the entire Haskell ecosystem are subtle bugs.

On Sep 4, 2014 3:01 AM, "Gregory Collins" <[hidden email]> wrote:
Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.

The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.

G


On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Gregory Collins <[hidden email]>

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Eyal Lotem
The problem with requiring the user to use uninterruptibleMask_ on their cleanup is how error-prone it is.

If we examine some uses of bracket in the GHC repo:

compiler/main/GhcMake.hs:992:        let withSem sem = bracket_ (waitQSem sem) (signalQSem sem)
signalQSem is interruptible, this is a bug!

bracket_ (hSetBinaryMode h False) (hSetBinaryMode h True)
Is the hSetBinaryMode operation interruptible? Is this a bug? Hard to tell!

withExtendedLinkEnv new_env action
    = gbracket (liftIO $ extendLinkEnv new_env)
               (\_ -> reset_old_env)
               (\_ -> action)

reset_old_env uses MVars so is probably interruptible, probably a bug.

Lots of manual openFile/hClose duplications of withFile, which are all buggy due to hClose being interruptible.

InteractiveUI.doLoad:
  gbracket (liftIO $ do hSetBuffering stdout LineBuffering
                        hSetBuffering stderr LineBuffering)
           (\_ ->
            liftIO $ do hSetBuffering stdout NoBuffering
                        hSetBuffering stderr NoBuffering) $ \_ -> do

hSetBuffering uses hFlush and is thus interruptible(?).

If many uses of bracket by the GHC developers are broken, is it not a clear indication that the default behavior is too error prone?

If someone wants an interruptible cleanup handler (Very weird thing to want!) they can use "mask" and "onException" directly. Or variants of all the cleanup functions with an "interruptible" suffix can be exported too (bracketInterruptible, catchInterruptible, etc).

Keeping the current behavior has only one benefit: Use cases that need to have interruptible cleanups.
Does anyone have any such use case at all? I can't imagine one...


On Thu, Sep 4, 2014 at 10:46 AM, John Lato <[hidden email]> wrote:
Being a primop or not has nothing to do with whether an MVar operation is interruptible.  What matters is whether or not the operation will block.  withMVar (or readMVar in any incarnation) on an empty MVar is interruptible.  If the MVar happened to be full, it's not interruptible.

I agree this is a problem.  I don't think the proposed solution is perfect, but I do think it's possibly better than the status quo.  Perhaps the user should be required to use uninterruptibleMask_ on the cleanup action if necessary?  I've long thought that using an interruptible operation in a cleanup handler is a programming error.

John L.


On Thu, Sep 4, 2014 at 12:29 AM, Eyal Lotem <[hidden email]> wrote:

In addition to hClose, the majority of cleanup handlers in my programs turned out to be interruptible. Moreover, whether something is interruptible is unclear. You can easily add a putStrLn to a cleanup handler and now it is accidentally interruptible.

I'd love to see examples of some code where interruptible cleanup handlers are not a bug.

Every single one in my programs that I examined was a bug.

Is withMVar also a primop? Because it's buggy in the same way as withFile currently is.

The current situation is that virtually all uses of bracket in the entire Haskell ecosystem are subtle bugs.

On Sep 4, 2014 3:01 AM, "Gregory Collins" <[hidden email]> wrote:
Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.

The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.

G


On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Gregory Collins <[hidden email]>

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries





--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Roman Cheplyaka-2
I find your arguments quite convincing. Count that as +1 from me.

Roman

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Eric Mertens
In reply to this post by John Lato-2

It's hasn't been noted or I haven't noticed that putMVar is only interruptible when the MVar is full.

This means that there isn't a problem in cases like withMVar because the  MVar is empty due to the establishing takeMVar, and there is no leak.

For hClose if you are closing the handle, presumably the contained MVar is full and there is no block and therefore no interruption. Only if another thread is actively using the handle and you are trying to close the handle out from under it is there potential for failure.

In the exotic cases like this there is already a way to make an interruptible operation uninterruptible.

I'm -1 until it becomes clear that it is actually an issue in common code.

On Sep 4, 2014 12:47 AM, "John Lato" <[hidden email]> wrote:
Being a primop or not has nothing to do with whether an MVar operation is interruptible.  What matters is whether or not the operation will block.  withMVar (or readMVar in any incarnation) on an empty MVar is interruptible.  If the MVar happened to be full, it's not interruptible.

I agree this is a problem.  I don't think the proposed solution is perfect, but I do think it's possibly better than the status quo.  Perhaps the user should be required to use uninterruptibleMask_ on the cleanup action if necessary?  I've long thought that using an interruptible operation in a cleanup handler is a programming error.

John L.


On Thu, Sep 4, 2014 at 12:29 AM, Eyal Lotem <[hidden email]> wrote:

In addition to hClose, the majority of cleanup handlers in my programs turned out to be interruptible. Moreover, whether something is interruptible is unclear. You can easily add a putStrLn to a cleanup handler and now it is accidentally interruptible.

I'd love to see examples of some code where interruptible cleanup handlers are not a bug.

Every single one in my programs that I examined was a bug.

Is withMVar also a primop? Because it's buggy in the same way as withFile currently is.

The current situation is that virtually all uses of bracket in the entire Haskell ecosystem are subtle bugs.

On Sep 4, 2014 3:01 AM, "Gregory Collins" <[hidden email]> wrote:
Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.

The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.

G


On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Gregory Collins <[hidden email]>

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Eyal Lotem


On Thu, Sep 4, 2014 at 6:16 PM, Eric Mertens <[hidden email]> wrote:

It's hasn't been noted or I haven't noticed that putMVar is only interruptible when the MVar is full.

This means that there isn't a problem in cases like withMVar because the  MVar is empty due to the establishing takeMVar, and there is no leak.

It is plausible that another thread will put into the MVar in the mean time.

For hClose if you are closing the handle, presumably the contained MVar is full and there is no block and therefore no interruption. Only if another thread is actively using the handle and you are trying to close the handle out from under it is there potential for failure.

Only in rare cases is this a bug, I agree. But rare bugs are in many ways even worse than frequent bugs. They are harder to debug, detect, etc. 

In the exotic cases like this there is already a way to make an interruptible operation uninterruptible.

I'm -1 until it becomes clear that it is actually an issue in common code.

I think you're missing an important point.

A) Cases that were not interruptible will remain the same.
B) Cases that were interruptible were bugs and will be fixed.

You're claiming that B is rare, but I don't think it is a valid argument against this change, because whether or not you agree B is frequent or not -- the change only affects B and not A. So the question is whether the change is desirable in this circumstance.
 
As an empirical point: My project (buildsome) was very buggy with a lot of leaks and deadlocks, until I replaced my use of Control.Exception with a wrapper that properly used uninterruptible-mask.
 
On Sep 4, 2014 12:47 AM, "John Lato" <[hidden email]> wrote:
Being a primop or not has nothing to do with whether an MVar operation is interruptible.  What matters is whether or not the operation will block.  withMVar (or readMVar in any incarnation) on an empty MVar is interruptible.  If the MVar happened to be full, it's not interruptible.

I agree this is a problem.  I don't think the proposed solution is perfect, but I do think it's possibly better than the status quo.  Perhaps the user should be required to use uninterruptibleMask_ on the cleanup action if necessary?  I've long thought that using an interruptible operation in a cleanup handler is a programming error.

John L.


On Thu, Sep 4, 2014 at 12:29 AM, Eyal Lotem <[hidden email]> wrote:

In addition to hClose, the majority of cleanup handlers in my programs turned out to be interruptible. Moreover, whether something is interruptible is unclear. You can easily add a putStrLn to a cleanup handler and now it is accidentally interruptible.

I'd love to see examples of some code where interruptible cleanup handlers are not a bug.

Every single one in my programs that I examined was a bug.

Is withMVar also a primop? Because it's buggy in the same way as withFile currently is.

The current situation is that virtually all uses of bracket in the entire Haskell ecosystem are subtle bugs.

On Sep 4, 2014 3:01 AM, "Gregory Collins" <[hidden email]> wrote:
Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.

The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.

G


On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Gregory Collins <[hidden email]>

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries



_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Bardur Arantsson-2
On 2014-09-04 18:03, Eyal Lotem wrote:
> On Thu, Sep 4, 2014 at 6:16 PM, Eric Mertens <[hidden email]> wrote:
>
[--snip--]

>> I'm -1 until it becomes clear that it is actually an issue in common code.
>>
> I think you're missing an important point.
>
> A) Cases that were not interruptible will remain the same.
> B) Cases that were interruptible *were bugs* and will be fixed.
>
> You're claiming that B is rare, but I don't think it is a valid argument
> against this change, because whether or not you agree B is frequent or not
> -- the change only affects B and not A. So the question is whether the
> change is desirable in this circumstance.
>

I'm in *no* way an expert on the finer points of concurrency in Haskell,
but if the above is true, then +1.

Regards,




_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Felipe Lessa
On 04-09-2014 13:09, Bardur Arantsson wrote:
> I'm in *no* way an expert on the finer points of concurrency in Haskell,
> but if the above is true, then +1.

My thoughts, exactly.  Eyal's arguments look solid to me, I can't see
any flaw, so +1 with the caveat that I'm not an expert here.

Assuming he's right, there is a *lot* of broken Haskell code out there!

--
Felipe.


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

John Lato-2
In reply to this post by Eyal Lotem
I suspect the main reason someone would want an interruptible cleanup handler is because if the action blocks, it's the only way to kill the thread.  I'm not sure how common this case is though.

IMHO I think the real problem is that the current exception implementation is far too complex to reason about.  Do other languages have this interruptible/uninterruptible distinction?  If so, I've yet to encounter it.  I'm starting to think that throwing an exception to a thread is the wrong way to kill it, and it should use something other than the exception mechanism.  Then I think we could discard the notion of interruptible actions entirely.

I guess count me as +0.5 for this proposal.

John L.


On Thu, Sep 4, 2014 at 10:01 PM, Eyal Lotem <[hidden email]> wrote:
The problem with requiring the user to use uninterruptibleMask_ on their cleanup is how error-prone it is.

If we examine some uses of bracket in the GHC repo:

compiler/main/GhcMake.hs:992:        let withSem sem = bracket_ (waitQSem sem) (signalQSem sem)
signalQSem is interruptible, this is a bug!

bracket_ (hSetBinaryMode h False) (hSetBinaryMode h True)
Is the hSetBinaryMode operation interruptible? Is this a bug? Hard to tell!

withExtendedLinkEnv new_env action
    = gbracket (liftIO $ extendLinkEnv new_env)
               (\_ -> reset_old_env)
               (\_ -> action)

reset_old_env uses MVars so is probably interruptible, probably a bug.

Lots of manual openFile/hClose duplications of withFile, which are all buggy due to hClose being interruptible.

InteractiveUI.doLoad:
  gbracket (liftIO $ do hSetBuffering stdout LineBuffering
                        hSetBuffering stderr LineBuffering)
           (\_ ->
            liftIO $ do hSetBuffering stdout NoBuffering
                        hSetBuffering stderr NoBuffering) $ \_ -> do

hSetBuffering uses hFlush and is thus interruptible(?).

If many uses of bracket by the GHC developers are broken, is it not a clear indication that the default behavior is too error prone?

If someone wants an interruptible cleanup handler (Very weird thing to want!) they can use "mask" and "onException" directly. Or variants of all the cleanup functions with an "interruptible" suffix can be exported too (bracketInterruptible, catchInterruptible, etc).

Keeping the current behavior has only one benefit: Use cases that need to have interruptible cleanups.
Does anyone have any such use case at all? I can't imagine one...


On Thu, Sep 4, 2014 at 10:46 AM, John Lato <[hidden email]> wrote:
Being a primop or not has nothing to do with whether an MVar operation is interruptible.  What matters is whether or not the operation will block.  withMVar (or readMVar in any incarnation) on an empty MVar is interruptible.  If the MVar happened to be full, it's not interruptible.

I agree this is a problem.  I don't think the proposed solution is perfect, but I do think it's possibly better than the status quo.  Perhaps the user should be required to use uninterruptibleMask_ on the cleanup action if necessary?  I've long thought that using an interruptible operation in a cleanup handler is a programming error.

John L.


On Thu, Sep 4, 2014 at 12:29 AM, Eyal Lotem <[hidden email]> wrote:

In addition to hClose, the majority of cleanup handlers in my programs turned out to be interruptible. Moreover, whether something is interruptible is unclear. You can easily add a putStrLn to a cleanup handler and now it is accidentally interruptible.

I'd love to see examples of some code where interruptible cleanup handlers are not a bug.

Every single one in my programs that I examined was a bug.

Is withMVar also a primop? Because it's buggy in the same way as withFile currently is.

The current situation is that virtually all uses of bracket in the entire Haskell ecosystem are subtle bugs.

On Sep 4, 2014 3:01 AM, "Gregory Collins" <[hidden email]> wrote:
Unless I'm mistaken, here the "mask" call inside bracket already makes sure you don't receive asynchronous exceptions unless you call a function that is interruptible (i.e. goes back into the runtime system). The hClose example you give doesn't fall in this category, as something inside the RTS needs to call "allowInterrupt" (or otherwise unmask exceptions) in order for async exceptions to be delivered. The "readMVar" example you give *does* have this issue (because putMVar does an implicit allowInterrupt) but in recent GHC readMVar has been redefined as a primop.

The danger of deadlock is *not* minimal here, doing what you suggest will transform many valid programs (i.e. if you block on a "takeMVar" in the cleanup action) into ones that have unkillable orphan threads.

G


On Wed, Sep 3, 2014 at 1:56 PM, Eyal Lotem <[hidden email]> wrote:
I'd like to propose a change in the behavior of Control.Exception to help guarantee cleanups are not accidentally lost.

For example, bracket is defined as:
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
This definition has a serious problem: "after a" (in either the exception handling case, or the ordinary case) can include interruptible actions which abort the cleanup.

This means bracket does not in fact guarantee the cleanup occurs.

For example:

readMVar = bracket takeMVar putMVar -- If async exception occurs during putMVar, MVar is broken!

withFile .. = bracket (openFile ..) hClose -- Async exception during hClose leaks the file handle!

Interruptible actions during "before" are fine (as long as "before" handles them properly).  Interruptible actions during "after" are virtually always a bug -- at best leaking a resource, and at worst breaking the program's invariants.

I propose changing all the cleanup handlers to run under uninterruptibleMask, specifically:

bracket, bracketOnError, bracket_, catch, catchJust, finally, handle, handleJust, onException

should all simply wrap their exception/cancellation handler with uninterruptibleMask.

The danger of a deadlock is minimal when compared with the virtually guaranteed buggy incorrect handling of async exceptions during cleanup.

--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Gregory Collins <[hidden email]>

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries





--
Eyal


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Herbert Valerio Riedel
In reply to this post by Eyal Lotem
Hello,

I'm forwarding this on behalf of Edsko:

-------------------- Start of forwarded message --------------------
[...]

Hi Herbert,

> ...and I'd like to point your attention to yet another related proposal
> that came in yesterday:
>
>  Subject: Proposal: Use uninterruptibleMask for cleanup actions in
>           Control.Exception
>
>  http://thread.gmane.org/gmane.comp.lang.haskell.libraries/22775

Hmmm, I thought I was subscribed to libraries@. It seems I wasn’t. I
subscribed earlier today but haven’t got the confirmation email yetI
(?). Perhaps you can forward my reply for now?

Anyhow, I’m not convinced about this proposal. uninterruptibleMask is
rather crude. It would be one thing to block, say, timeout signals
during resource cleanup (perhaps), but uninterruptibleMask blocks _all_
asynchronous exceptions, including stack overflow and heap overflow. I
don’t think that the _default_ behaviour of bracket should be to disable
those.

Note also that in the

bracket takeMVar putMVar

example, the putMVar is _NOT_ interruptible. By definition, putMVar is
interruptible _only_ if the resource is not available. Since the
takeMVar succeeded, the MVar must be empty at the time of the putMVar
and hence it cannot be interrupted. (This is assuming of course that
there are no putMVars in the body of the bracket, but if you’re doing
that, you have bigger problems..). Incidentally, if that behavior of
putMVar is not in fact the case, then that is a bug in putMVar, not in
bracket.

The hClose example ironically _is_ an example, though it doesn’t look
like one — Merijn is right, it uses an MVar under the hood and in fact
it does a withMVar, which is interruptible, of course — in case somebody
else happens to have the file handle at that time. But again, I’m not
sure what the right answer is. I sort of feel that these file operations
should _not_ be interruptible. The fact that they use MVars under the
hood makes them so, but perhaps that could be considered a bug (and they
should do some uninterruptibleMasking of their own) — certainly, as far
as I know, it’s nowhere actually _documented_ that these operations are
interruptible and most people don’t think of them as such.

I’m not sure to be honest. I would like Simon Marlow’s opinion here. But
my feeling right now is that I’d vote against this, but to try and make
sure that the operations we use in cleanup handlers are, in fact, not
interruptible.

-E
-------------------- End of forwarded message --------------------
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Eyal Lotem



On Fri, Sep 5, 2014 at 11:37 AM, Herbert Valerio Riedel <[hidden email]> wrote:
Hello,

I'm forwarding this on behalf of Edsko:

-------------------- Start of forwarded message --------------------
[...]

Hi Herbert,

> ...and I'd like to point your attention to yet another related proposal
> that came in yesterday:
>
>  Subject: Proposal: Use uninterruptibleMask for cleanup actions in
>           Control.Exception
>
http://thread.gmane.org/gmane.comp.lang.haskell.libraries/22775

Hmmm, I thought I was subscribed to libraries@. It seems I wasn’t. I
subscribed earlier today but haven’t got the confirmation email yetI
(?). Perhaps you can forward my reply for now?

Anyhow, I’m not convinced about this proposal. uninterruptibleMask is
rather crude. It would be one thing to block, say, timeout signals
during resource cleanup (perhaps), but uninterruptibleMask blocks _all_
asynchronous exceptions, including stack overflow and heap overflow. I
don’t think that the _default_ behaviour of bracket should be to disable
those.

I just tested a little program that stack-overflows within an uninterruptibleMask_. The exception doesn't seem to be blocked.
Can you(he?) give an example of the actual problem?


Note also that in the

bracket takeMVar putMVar

example, the putMVar is _NOT_ interruptible. By definition, putMVar is
interruptible _only_ if the resource is not available. Since the
takeMVar succeeded, the MVar must be empty at the time of the putMVar

Of course not! Different threads may have putMVar during the bracket. Typical MVar use will not do this, but it's certainly a possible use case that is broken.

I think the fact people are missing this bug is more evidence that this behavior is a problem.

and hence it cannot be interrupted. (This is assuming of course that
there are no putMVars in the body of the bracket, but if you’re doing
that, you have bigger problems..). Incidentally, if that behavior of
putMVar is not in fact the case, then that is a bug in putMVar, not in
bracket.

The hClose example ironically _is_ an example, though it doesn’t look
like one — Merijn is right, it uses an MVar under the hood and in fact
it does a withMVar, which is interruptible, of course — in case somebody
else happens to have the file handle at that time. But again, I’m not
sure what the right answer is. I sort of feel that these file operations
should _not_ be interruptible. The fact that they use MVars under the
hood makes them so, but perhaps that could be considered a bug (and they
should do some uninterruptibleMasking of their own) — certainly, as far
as I know, it’s nowhere actually _documented_ that these operations are
interruptible and most people don’t think of them as such.

But this seems like a weird position to have. On the one hand, you are claiming that making the cleanup uninterruptible is too crude. On the other hand, you're claiming that all cleanup handlers that happen to be interruptible are a bug, each.

These positions contradict.
 

I’m not sure to be honest. I would like Simon Marlow’s opinion here. But
my feeling right now is that I’d vote against this, but to try and make
sure that the operations we use in cleanup handlers are, in fact, not
interruptible.

The question is what do you want to do with cleanup handlers that are still interruptible?
I think the only valid solutions are:

A) Mask them uninterruptible as I suggest

B) Raise a runtime error (or at least a warning) that this is a bug that should be fixed by making sure that the cleanup handlers are not interruptible, manually.

Leaving the current behavior is the worst of all worlds.
 

-E
-------------------- End of forwarded message --------------------



--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Eyal Lotem
In reply to this post by Roman Cheplyaka-2
Hey Simon, thanks for the reply!


On Fri, Sep 5, 2014 at 6:39 PM, Simon Marlow <[hidden email]> wrote:
Eyal, thanks for bringing up this issue.  It's been at the back of my mind for a while, but I've never really thought through the issues and consequences of changes.  So this is a good opportunity to do that.  You point out (in another email in the thread) that:

A) Cases that were not interruptible will remain the same.
B) Cases that were interruptible were bugs and will be fixed.

However,

C) Some bugs will turn into deadlocks (unkillable threads)

Being able to recover from bugs is an important property in large long-running systems.  So this is a serious problem.  Hence why I always treat uninterruptibleMask with the deepest suspicion.

Recovering from various kinds of failures makes a lot of sense. But how can you recover from arbitrary invariants of the program being broken?

For example, if you use a bracket on some semaphore monitoring a global resource. How do you recover from a bug of leaking semaphore tokens?

Recovering from crashes of whole processes whose internal state can be recovered to a fresh, usable state, is a great feature.
Recovering from thread crashes that share arbitrary mutable state with other threads is not practical, I believe.
 
Let's consider the case where we have an interruptible operation in the handler, and divide it into two (er three):

 1. it blocks for a short bounded amount of time.
 2. It blocks for a long time
 3. It blocks indefinitely

These are all buggy, but in different ways.  Only (1) is fixed by adding uninterruptibleMask.  (2) is "fixed", but in exchange for an unresponsive thread - also undesirable.  (3) was a bug in the application code, and turns into a deadlock with uninterruptibleMask, which is undesirable.

I think that (1) is by far the most common and is very prevalent. I think 0-time interruptible (that can block but almost never do) operations are the most common cleanup handlers.

For (2) and (3), we need to choose the lesser evil:

A) Deadlocks and/or unresponsiveness
B) Arbitrary invariants being broken and leaks

In my experience, A tends to manifest exactly where the bug is, and is therefore easy to debug and mostly a "performance bug" .
B tends to manifest as difficult to explain behavior elsewhere from where the bug actually is, and is usually a "correctness bug", which is almost always worse.

Therefore, I think A is a far far lesser evil than B, when (2) and (3) are involved. 

I'd like to reemphasize that this change will almost always fix the problem completely since the most common case is (1), and in rare cases, it will convert B to A, which is also, IMO, very desirable.


This is as far as I've got thinking through the issues so far.  I wonder to what extent the programmer can and should mitigate these cases, and how much we can help them.  I don't want unkillable threads, even when caused by buggy code. 
 

Cheers,
Simon


On 04/09/2014 16:46, Roman Cheplyaka wrote:
I find your arguments quite convincing. Count that as +1 from me.

Roman



_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries




--
Eyal

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Bertram Felgenhauer-2
In reply to this post by Eyal Lotem
Eyal Lotem wrote:
> The problem with requiring the user to use uninterruptibleMask_ on their
> cleanup is how error-prone it is.
>
> If we examine some uses of bracket in the GHC repo:
>
> compiler/main/GhcMake.hs:992:        let withSem sem = bracket_ (waitQSem
> sem) (*signalQSem sem*)
> *signalQSem is interruptible, this is a bug!*

Not really, but signalQSem is a rare exception from the rule.

signalQSem :: QSem -> IO ()
signalQSem (QSem m) =
  uninterruptibleMask_ $ do -- Note [signal uninterruptible]
    r <- takeMVar m
    r' <- signal r
    putMVar m r'

It's tempting to suggest that every cleanup type function should just
mask exceptions itself like this. However, that does not solve the
problem when several cleanup actions are required: at the end of the
first uninterruptibleMask_, any pending exceptions will be delivered.

Overall I agree with your assessment that a general uninterruptibleMask_
for cleanup handlers avoids more trouble than it causes.

Cheers,

Bertram
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Simon Marlow-7
In reply to this post by Eyal Lotem
Ok, sorry for the delay, we still need a resolution on this one.

So thanks to your persuasive comments I think I'm convinced.  What
finally tipped me over the edge was this:

https://phabricator.haskell.org/diffusion/GHC/browse/master/libraries/base/Control/Concurrent/QSem.hs;165072b334ebb2ccbef38a963ac4d126f1e08c96$103-112

It turns out I've been a victim of this "bug" myself :-)  So let's fix it.

But what is the cost? Adding an uninterruptibleMask won't be free.

In the case of `catch`, since the mask is already built in to the
primitive, we can just change it to be an uninterruptibleMask, and that
applies to handle and onException too.  For `finally` we can replace the
mask with an uninterruptibleMask, but for `bracket` we have to add a new
layer of uninterruptibleMask.

Lots of documentation probably needs to be updated.  Any chance you
could make a patch and upload it to Phabricator?

Cheers,
Simon

On 05/09/2014 18:34, Eyal Lotem wrote:

> Hey Simon, thanks for the reply!
>
>
> On Fri, Sep 5, 2014 at 6:39 PM, Simon Marlow <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     Eyal, thanks for bringing up this issue.  It's been at the back of
>     my mind for a while, but I've never really thought through the
>     issues and consequences of changes.  So this is a good opportunity
>     to do that.  You point out (in another email in the thread) that:
>
>     A) Cases that were not interruptible will remain the same.
>     B) Cases that were interruptible were bugs and will be fixed.
>
>     However,
>
>     C) Some bugs will turn into deadlocks (unkillable threads)
>
>     Being able to recover from bugs is an important property in large
>     long-running systems.  So this is a serious problem.  Hence why I
>     always treat uninterruptibleMask with the deepest suspicion.
>
>
> Recovering from various kinds of failures makes a lot of sense. But how
> can you recover from arbitrary invariants of the program being broken?
>
> For example, if you use a bracket on some semaphore monitoring a global
> resource. How do you recover from a bug of leaking semaphore tokens?
>
> Recovering from crashes of whole processes whose internal state can be
> recovered to a fresh, usable state, is a great feature.
> Recovering from thread crashes that share arbitrary mutable state with
> other threads is not practical, I believe.
>
>     Let's consider the case where we have an interruptible operation in
>     the handler, and divide it into two (er three):
>
>       1. it blocks for a short bounded amount of time.
>       2. It blocks for a long time
>       3. It blocks indefinitely
>
>     These are all buggy, but in different ways.  Only (1) is fixed by
>     adding uninterruptibleMask.  (2) is "fixed", but in exchange for an
>     unresponsive thread - also undesirable.  (3) was a bug in the
>     application code, and turns into a deadlock with
>     uninterruptibleMask, which is undesirable.
>
>
> I think that (1) is by far the most common and is very prevalent. I
> think 0-time interruptible (that can block but almost never do)
> operations are the most common cleanup handlers.
>
> For (2) and (3), we need to choose the lesser evil:
>
> A) Deadlocks and/or unresponsiveness
> B) Arbitrary invariants being broken and leaks
>
> In my experience, A tends to manifest exactly where the bug is, and is
> therefore easy to debug and mostly a "performance bug" .
> B tends to manifest as difficult to explain behavior elsewhere from
> where the bug actually is, and is usually a "correctness bug", which is
> almost always worse.
>
> Therefore, I think A is a far far lesser evil than B, when (2) and (3)
> are involved.
>
> I'd like to reemphasize that this change will almost always fix the
> problem completely since the most common case is (1), and in rare cases,
> it will convert B to A, which is also, IMO, very desirable.
>
>
>     This is as far as I've got thinking through the issues so far.  I
>     wonder to what extent the programmer can and should mitigate these
>     cases, and how much we can help them.  I don't want unkillable
>     threads, even when caused by buggy code.
>
>
>     Cheers,
>     Simon
>
>
>     On 04/09/2014 16:46, Roman Cheplyaka wrote:
>
>         I find your arguments quite convincing. Count that as +1 from me.
>
>         Roman
>
>
>
>         _________________________________________________
>         Libraries mailing list
>         [hidden email] <mailto:[hidden email]>
>         http://www.haskell.org/__mailman/listinfo/libraries
>         <http://www.haskell.org/mailman/listinfo/libraries>
>
>
>
>
> --
> Eyal
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Merijn Verstraaten
Hi Simon,

Well, another issue that was raised (I forget by whom!) was the fact that stackoverflows are currently thrown externally from the executing thread and that this should really be changed into being thrown to the thread by itself, thereby avoiding the mask.

Cheers,
Merijn

On 24 Sep 2014, at 05:51 , Simon Marlow <[hidden email]> wrote:

> Ok, sorry for the delay, we still need a resolution on this one.
>
> So thanks to your persuasive comments I think I'm convinced.  What finally tipped me over the edge was this:
>
> https://phabricator.haskell.org/diffusion/GHC/browse/master/libraries/base/Control/Concurrent/QSem.hs;165072b334ebb2ccbef38a963ac4d126f1e08c96$103-112
>
> It turns out I've been a victim of this "bug" myself :-)  So let's fix it.
>
> But what is the cost? Adding an uninterruptibleMask won't be free.
>
> In the case of `catch`, since the mask is already built in to the primitive, we can just change it to be an uninterruptibleMask, and that applies to handle and onException too.  For `finally` we can replace the mask with an uninterruptibleMask, but for `bracket` we have to add a new layer of uninterruptibleMask.
>
> Lots of documentation probably needs to be updated.  Any chance you could make a patch and upload it to Phabricator?
>
> Cheers,
> Simon
>
> On 05/09/2014 18:34, Eyal Lotem wrote:
>> Hey Simon, thanks for the reply!
>>
>>
>> On Fri, Sep 5, 2014 at 6:39 PM, Simon Marlow <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>    Eyal, thanks for bringing up this issue.  It's been at the back of
>>    my mind for a while, but I've never really thought through the
>>    issues and consequences of changes.  So this is a good opportunity
>>    to do that.  You point out (in another email in the thread) that:
>>
>>    A) Cases that were not interruptible will remain the same.
>>    B) Cases that were interruptible were bugs and will be fixed.
>>
>>    However,
>>
>>    C) Some bugs will turn into deadlocks (unkillable threads)
>>
>>    Being able to recover from bugs is an important property in large
>>    long-running systems.  So this is a serious problem.  Hence why I
>>    always treat uninterruptibleMask with the deepest suspicion.
>>
>>
>> Recovering from various kinds of failures makes a lot of sense. But how
>> can you recover from arbitrary invariants of the program being broken?
>>
>> For example, if you use a bracket on some semaphore monitoring a global
>> resource. How do you recover from a bug of leaking semaphore tokens?
>>
>> Recovering from crashes of whole processes whose internal state can be
>> recovered to a fresh, usable state, is a great feature.
>> Recovering from thread crashes that share arbitrary mutable state with
>> other threads is not practical, I believe.
>>
>>    Let's consider the case where we have an interruptible operation in
>>    the handler, and divide it into two (er three):
>>
>>      1. it blocks for a short bounded amount of time.
>>      2. It blocks for a long time
>>      3. It blocks indefinitely
>>
>>    These are all buggy, but in different ways.  Only (1) is fixed by
>>    adding uninterruptibleMask.  (2) is "fixed", but in exchange for an
>>    unresponsive thread - also undesirable.  (3) was a bug in the
>>    application code, and turns into a deadlock with
>>    uninterruptibleMask, which is undesirable.
>>
>>
>> I think that (1) is by far the most common and is very prevalent. I
>> think 0-time interruptible (that can block but almost never do)
>> operations are the most common cleanup handlers.
>>
>> For (2) and (3), we need to choose the lesser evil:
>>
>> A) Deadlocks and/or unresponsiveness
>> B) Arbitrary invariants being broken and leaks
>>
>> In my experience, A tends to manifest exactly where the bug is, and is
>> therefore easy to debug and mostly a "performance bug" .
>> B tends to manifest as difficult to explain behavior elsewhere from
>> where the bug actually is, and is usually a "correctness bug", which is
>> almost always worse.
>>
>> Therefore, I think A is a far far lesser evil than B, when (2) and (3)
>> are involved.
>>
>> I'd like to reemphasize that this change will almost always fix the
>> problem completely since the most common case is (1), and in rare cases,
>> it will convert B to A, which is also, IMO, very desirable.
>>
>>
>>    This is as far as I've got thinking through the issues so far.  I
>>    wonder to what extent the programmer can and should mitigate these
>>    cases, and how much we can help them.  I don't want unkillable
>>    threads, even when caused by buggy code.
>>
>>
>>    Cheers,
>>    Simon
>>
>>
>>    On 04/09/2014 16:46, Roman Cheplyaka wrote:
>>
>>        I find your arguments quite convincing. Count that as +1 from me.
>>
>>        Roman
>>
>>
>>
>>        _________________________________________________
>>        Libraries mailing list
>>        [hidden email] <mailto:[hidden email]>
>>        http://www.haskell.org/__mailman/listinfo/libraries
>>        <http://www.haskell.org/mailman/listinfo/libraries>
>>
>>
>>
>>
>> --
>> Eyal
> _______________________________________________
> Libraries mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries

signature.asc (817 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Simon Marlow-7
On 24/09/2014 17:58, Merijn Verstraaten wrote:

> Well, another issue that was raised (I forget by whom!) was the fact
> that stackoverflows are currently thrown externally from the
> executing thread and that this should really be changed into being
> thrown to the thread by itself, thereby avoiding the mask.

Yeah, a stack overflow is an asynchronous exception.  Tt is not thrown
by any particular thread, but it is treated as an async exception from
the point of view of the receiving thread.  And this is the right thing
to do, since a stack overflow can occur absolutely anywhere, so it has
exactly the characteristics of an async exception.

So it is already the case that a stack overflow inside mask does not
cause an exception.  Instead the exception is deferred until the thread
exits the mask.  This proposal wouldn't change anything in that respect.

Cheers,
Simon

>
> Cheers, Merijn
>
> On 24 Sep 2014, at 05:51 , Simon Marlow <[hidden email]> wrote:
>> Ok, sorry for the delay, we still need a resolution on this one.
>>
>> So thanks to your persuasive comments I think I'm convinced.  What
>> finally tipped me over the edge was this:
>>
>> https://phabricator.haskell.org/diffusion/GHC/browse/master/libraries/base/Control/Concurrent/QSem.hs;165072b334ebb2ccbef38a963ac4d126f1e08c96$103-112
>>
>>
>>
>>
It turns out I've been a victim of this "bug" myself :-)  So let's fix it.

>>
>> But what is the cost? Adding an uninterruptibleMask won't be free.
>>
>> In the case of `catch`, since the mask is already built in to the
>> primitive, we can just change it to be an uninterruptibleMask, and
>> that applies to handle and onException too.  For `finally` we can
>> replace the mask with an uninterruptibleMask, but for `bracket` we
>> have to add a new layer of uninterruptibleMask.
>>
>> Lots of documentation probably needs to be updated.  Any chance
>> you could make a patch and upload it to Phabricator?
>>
>> Cheers, Simon
>>
>> On 05/09/2014 18:34, Eyal Lotem wrote:
>>> Hey Simon, thanks for the reply!
>>>
>>>
>>> On Fri, Sep 5, 2014 at 6:39 PM, Simon Marlow <[hidden email]
>>>  <mailto:[hidden email]>> wrote:
>>>
>>> Eyal, thanks for bringing up this issue.  It's been at the back
>>> of my mind for a while, but I've never really thought through the
>>> issues and consequences of changes.  So this is a good
>>> opportunity to do that.  You point out (in another email in the
>>> thread) that:
>>>
>>> A) Cases that were not interruptible will remain the same. B)
>>> Cases that were interruptible were bugs and will be fixed.
>>>
>>> However,
>>>
>>> C) Some bugs will turn into deadlocks (unkillable threads)
>>>
>>> Being able to recover from bugs is an important property in large
>>> long-running systems.  So this is a serious problem.  Hence why I
>>> always treat uninterruptibleMask with the deepest suspicion.
>>>
>>>
>>> Recovering from various kinds of failures makes a lot of sense.
>>> But how can you recover from arbitrary invariants of the program
>>> being broken?
>>>
>>> For example, if you use a bracket on some semaphore monitoring a
>>> global resource. How do you recover from a bug of leaking
>>> semaphore tokens?
>>>
>>> Recovering from crashes of whole processes whose internal state
>>> can be recovered to a fresh, usable state, is a great feature.
>>> Recovering from thread crashes that share arbitrary mutable
>>> state with other threads is not practical, I believe.
>>>
>>> Let's consider the case where we have an interruptible operation
>>> in the handler, and divide it into two (er three):
>>>
>>> 1. it blocks for a short bounded amount of time. 2. It blocks
>>> for a long time 3. It blocks indefinitely
>>>
>>> These are all buggy, but in different ways.  Only (1) is fixed by
>>> adding uninterruptibleMask.  (2) is "fixed", but in exchange for
>>> an unresponsive thread - also undesirable.  (3) was a bug in the
>>> application code, and turns into a deadlock with
>>> uninterruptibleMask, which is undesirable.
>>>
>>>
>>> I think that (1) is by far the most common and is very
>>> prevalent. I think 0-time interruptible (that can block but
>>> almost never do) operations are the most common cleanup
>>> handlers.
>>>
>>> For (2) and (3), we need to choose the lesser evil:
>>>
>>> A) Deadlocks and/or unresponsiveness B) Arbitrary invariants
>>> being broken and leaks
>>>
>>> In my experience, A tends to manifest exactly where the bug is,
>>> and is therefore easy to debug and mostly a "performance bug" .
>>> B tends to manifest as difficult to explain behavior elsewhere
>>> from where the bug actually is, and is usually a "correctness
>>> bug", which is almost always worse.
>>>
>>> Therefore, I think A is a far far lesser evil than B, when (2)
>>> and (3) are involved.
>>>
>>> I'd like to reemphasize that this change will almost always fix
>>> the problem completely since the most common case is (1), and in
>>> rare cases, it will convert B to A, which is also, IMO, very
>>> desirable.
>>>
>>>
>>> This is as far as I've got thinking through the issues so far. I
>>> wonder to what extent the programmer can and should mitigate
>>> these cases, and how much we can help them.  I don't want
>>> unkillable threads, even when caused by buggy code.
>>>
>>>
>>> Cheers, Simon
>>>
>>>
>>> On 04/09/2014 16:46, Roman Cheplyaka wrote:
>>>
>>> I find your arguments quite convincing. Count that as +1 from
>>> me.
>>>
>>> Roman
>>>
>>>
>>>
>>> _________________________________________________ Libraries
>>> mailing list [hidden email] <mailto:[hidden email]>
>>>  http://www.haskell.org/__mailman/listinfo/libraries
>>> <http://www.haskell.org/mailman/listinfo/libraries>
>>>
>>>
>>>
>>>
>>> -- Eyal
>> _______________________________________________ Libraries mailing
>> list [hidden email]
>> http://www.haskell.org/mailman/listinfo/libraries
>
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries