addFinalizer in GHC 7.10

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

addFinalizer in GHC 7.10

Kazu Yamamoto (山本和彦)
Hello cafe,

We, the network library maintainers, have questions about
System.Mem.Weak.addFinalizer in GHC 7.10.

We are trying to make the library extensible especially for the socket
type and the socket address type. For this purpose, a proposed
definition for Socket in version 3.0.0.0 is a just CInt:

    newtype Socket = Socket CInt deriving (Eq, Show)

One problem is that unreachable Socket cannot be GCed if CInt (a file
descriptor) is not closed. To let GC work, we are trying to add a
finalizer via addFinalizer:

    socket family stype protocol = do
       fd <- c_socket ...
       ...
       let s = Socket fd
       addFinalizer s $ close s
       ruturn s

This works well for many cases. Unfortunately, we *sometime* got the
following error with GHC 7.10:

    tests/SimpleSpec.hs:56:
      1) Simple.sendMany works well
           uncaught exception: IOException of type InvalidArgument (threadWait: invalid argument (Bad file descriptor))

It seems to us that Socket is GCed too early in GHC 7.10. We don't
see this behavior in other versions of GHC.

So, here are our questions:

Q1) Do we use addFinalizer correctly?
Q2) Is this a bug of GHC 7.10?
Q3) If this is a bug of GHC 7.10, are there any workarounds?

--Kazu
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: addFinalizer in GHC 7.10

Viktor Dukhovni


> On Jan 29, 2018, at 10:25 PM, Kazu Yamamoto (山本和彦) <[hidden email]> wrote:
>
>    socket family stype protocol = do
>       fd <- c_socket ...
>       ...
>       let s = Socket fd
>       addFinalizer s $ close s
>       ruturn s

For the record, I think I've convinced Kazu Yamamoto that this is
an anti-pattern.  Such a finalizer would easily end up closing
already closed sockets, whose file-descriptors may already be
associated with other open files or sockets.  That way lie all
sorts of difficult to isolate race-condition bugs.  To make this
safe, the close function would need to mutate the socket,
invalidating the enclosed file-descriptor, and would then need to
be a NOP or just raise an exception if the socket is closed again
(the finalizer should invoke a close variant that just returns
without raising exceptions if the socket is already closed).

There is, AFAIK still an unresolved bug along these lines somewhere
in http-client and its dependencies.  So far no reproducing cases have
been provided.  No very recent reports either, perhaps it went away,
or people have just been more lucky lately:

   https://github.com/snoyberg/http-client/issues/252
   https://github.com/vincenthz/hs-tls/issues/179

All that said, the original question about addFinalizer vs. GHC 7.10
may still be worth exploring, even if its use-case for Sockets goes
away.  So, please don't take this poset to mean that the original
question should be ignored.

--
        Viktor.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: addFinalizer in GHC 7.10

Kazu Yamamoto (山本和彦)
Hi,

I registered this issue to network:

        https://github.com/haskell/network/issues/302

Viktor, thank you!

--Kazu

>> On Jan 29, 2018, at 10:25 PM, Kazu Yamamoto (山本和彦) <[hidden email]> wrote:
>>
>>    socket family stype protocol = do
>>       fd <- c_socket ...
>>       ...
>>       let s = Socket fd
>>       addFinalizer s $ close s
>>       ruturn s
>
> For the record, I think I've convinced Kazu Yamamoto that this is
> an anti-pattern.  Such a finalizer would easily end up closing
> already closed sockets, whose file-descriptors may already be
> associated with other open files or sockets.  That way lie all
> sorts of difficult to isolate race-condition bugs.  To make this
> safe, the close function would need to mutate the socket,
> invalidating the enclosed file-descriptor, and would then need to
> be a NOP or just raise an exception if the socket is closed again
> (the finalizer should invoke a close variant that just returns
> without raising exceptions if the socket is already closed).
>
> There is, AFAIK still an unresolved bug along these lines somewhere
> in http-client and its dependencies.  So far no reproducing cases have
> been provided.  No very recent reports either, perhaps it went away,
> or people have just been more lucky lately:
>
>    https://github.com/snoyberg/http-client/issues/252
>    https://github.com/vincenthz/hs-tls/issues/179
>
> All that said, the original question about addFinalizer vs. GHC 7.10
> may still be worth exploring, even if its use-case for Sockets goes
> away.  So, please don't take this poset to mean that the original
> question should be ignored.
>
> --
> Viktor.
>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: addFinalizer in GHC 7.10

Edward Kmett-2
Attaching a finalizer to a regular data type like that is a bit of a hazard prone process. If GHC unpack your Socket in another data constructor it will get "freed" and then your socket can get finalized while you still have access to the CInt! 

data MySockets = MySockets {-# unpack #-} !Socket {-# unpack #-} !Socket

will happily 'eat' your data constructor wrapper and if there are no references remaining to a copy of it on the heap, their finalizers will fire. Worker-wrapper transforms can do this even without any "containing" data type if you just increase your optimization level!

It would be much, much safer to attach the finalizer to something that has a "presence" all its own, like Weak# () as is done in ForeignPtr. This would result in something like:

data Socket = Socket !CInt (Weak# ())

Then when it gets unpacked into another data constructor, then the Weak# () still exists. This isn't free, it comes at the cost that your sockets take a couple of words each (plus finalizer space), but the approach you are taking now isn't free either as it isn't really sound. ;)

tl;dr don't attach finalizers to regular Haskell data types if you can help it

-Edward

On Tue, Jan 30, 2018 at 12:18 AM, Kazu Yamamoto <[hidden email]> wrote:
Hi,

I registered this issue to network:

        https://github.com/haskell/network/issues/302

Viktor, thank you!

--Kazu

>> On Jan 29, 2018, at 10:25 PM, Kazu Yamamoto (山本和彦) <[hidden email]> wrote:
>>
>>    socket family stype protocol = do
>>       fd <- c_socket ...
>>       ...
>>       let s = Socket fd
>>       addFinalizer s $ close s
>>       ruturn s
>
> For the record, I think I've convinced Kazu Yamamoto that this is
> an anti-pattern.  Such a finalizer would easily end up closing
> already closed sockets, whose file-descriptors may already be
> associated with other open files or sockets.  That way lie all
> sorts of difficult to isolate race-condition bugs.  To make this
> safe, the close function would need to mutate the socket,
> invalidating the enclosed file-descriptor, and would then need to
> be a NOP or just raise an exception if the socket is closed again
> (the finalizer should invoke a close variant that just returns
> without raising exceptions if the socket is already closed).
>
> There is, AFAIK still an unresolved bug along these lines somewhere
> in http-client and its dependencies.  So far no reproducing cases have
> been provided.  No very recent reports either, perhaps it went away,
> or people have just been more lucky lately:
>
>    https://github.com/snoyberg/http-client/issues/252
>    https://github.com/vincenthz/hs-tls/issues/179
>
> All that said, the original question about addFinalizer vs. GHC 7.10
> may still be worth exploring, even if its use-case for Sockets goes
> away.  So, please don't take this poset to mean that the original
> question should be ignored.
>
> --
>       Viktor.
>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: addFinalizer in GHC 7.10

Viktor Dukhovni


> On Jan 30, 2018, at 4:50 PM, Edward Kmett <[hidden email]> wrote:
>
> It would be much, much safer to attach the finalizer to something that has a "presence" all its own, like Weak# () as is done in ForeignPtr. This would result in something like:
>
> data Socket = Socket !CInt (Weak# ())
>
> Then when it gets unpacked into another data constructor, then the Weak# () still exists. This isn't free, it comes at the cost that your sockets take a couple of words each (plus finalizer space), but the approach you are taking now isn't free either as it isn't really sound. ;)
>
> tl;dr don't attach finalizers to regular Haskell data types if you can help it

THanks, good to know.  I gather the unpacking can/will happen even if
Socket internals are [made] opaque to other modules?

And of course in this case, in addition to avoiding
running the finalizer too early, it is critical that each socket be closed
at most once.  Therefore, to support finalization, and make the API safe
for multiple close (as seems to be the case with System.IO Handle's for
example) there's a need for additional mutable state in the Socket, to
keep track of whether it has or has not yet been closed.

I am curious as to what your suggestion would be as to how to best keep
track of such state.

   1. Employ a separate MVar to keep track of socket state, and update
      it on close to ensure at most once close.

   2. Wrap the file descriptor in an IORef, and set it to an invalid
      value (-1 on Unix, INVALID_SOCKET on Windows) on close.  This
      avoids misuse not only with close, but also with attempts at
      read/write/... I/O after close.  However it does not avoid
      (far less likely I think) races to close the socket from
      multiple threads.

   3. Move the state to a wrapper structure managed in FFI code
      so that all socket operations are via a foreign pointer to
      a C-structure in which the file descriptor is invalidated on
      close.

   4.  Other suggestions...

--
        Viktor.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: addFinalizer in GHC 7.10

Edward Kmett-2
Yes, this can happen even if you're opaque and don't export stuff.

You could attach the finalizer to an (unpacked) IORef that holds the Bool that says whether you've been finalized. An IORef holds onto a MutVar# under the hood, and so it also maintains the same sort of stable presence Weak# offered above. Similarly you _could_ just stuff the info in an MVar or ForeignPtr. Each of those has support for attaching a finalizer directly to the heap allocated part that lives in # and therefore isn't vulnerable to being inlined or unpacked.

There probably should be able to be similar options for attaching the finalizer to a MutableByteArray#, even if the combinators don't exist right now, etc.

I'd personally go with an IORef and then atomicModifyIORef during the "close" and "finalize" operations to extract and set the flag in one operation. That way there isn't any communication overhead, but IORef, ForeignPtr, MVar, etc. could all work.

-Edward

On Tue, Jan 30, 2018 at 5:09 PM, Viktor Dukhovni <[hidden email]> wrote:


> On Jan 30, 2018, at 4:50 PM, Edward Kmett <[hidden email]> wrote:
>
> It would be much, much safer to attach the finalizer to something that has a "presence" all its own, like Weak# () as is done in ForeignPtr. This would result in something like:
>
> data Socket = Socket !CInt (Weak# ())
>
> Then when it gets unpacked into another data constructor, then the Weak# () still exists. This isn't free, it comes at the cost that your sockets take a couple of words each (plus finalizer space), but the approach you are taking now isn't free either as it isn't really sound. ;)
>
> tl;dr don't attach finalizers to regular Haskell data types if you can help it

THanks, good to know.  I gather the unpacking can/will happen even if
Socket internals are [made] opaque to other modules?

And of course in this case, in addition to avoiding
running the finalizer too early, it is critical that each socket be closed
at most once.  Therefore, to support finalization, and make the API safe
for multiple close (as seems to be the case with System.IO Handle's for
example) there's a need for additional mutable state in the Socket, to
keep track of whether it has or has not yet been closed.

I am curious as to what your suggestion would be as to how to best keep
track of such state.

   1. Employ a separate MVar to keep track of socket state, and update
      it on close to ensure at most once close.

   2. Wrap the file descriptor in an IORef, and set it to an invalid
      value (-1 on Unix, INVALID_SOCKET on Windows) on close.  This
      avoids misuse not only with close, but also with attempts at
      read/write/... I/O after close.  However it does not avoid
      (far less likely I think) races to close the socket from
      multiple threads.

   3. Move the state to a wrapper structure managed in FFI code
      so that all socket operations are via a foreign pointer to
      a C-structure in which the file descriptor is invalidated on
      close.

   4.  Other suggestions...

--
        Viktor.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: addFinalizer in GHC 7.10

Kazu Yamamoto (山本和彦)
Hi Edward,

Thank you for your excellent explanation!
I'm now being convinced by the IORef approach.
I should write PoC for that.

--Kazu

> Yes, this can happen even if you're opaque and don't export stuff.
>
> You could attach the finalizer to an (unpacked) IORef that holds the Bool
> that says whether you've been finalized. An IORef holds onto a MutVar#
> under the hood, and so it also maintains the same sort of stable presence
> Weak# offered above. Similarly you _could_ just stuff the info in an MVar
> or ForeignPtr. Each of those has support for attaching a finalizer directly
> to the heap allocated part that lives in # and therefore isn't vulnerable
> to being inlined or unpacked.
>
> There probably should be able to be similar options for attaching the
> finalizer to a MutableByteArray#, even if the combinators don't exist right
> now, etc.
>
> I'd personally go with an IORef and then atomicModifyIORef during the
> "close" and "finalize" operations to extract and set the flag in one
> operation. That way there isn't any communication overhead, but IORef,
> ForeignPtr, MVar, etc. could all work.
>
> -Edward
>
> On Tue, Jan 30, 2018 at 5:09 PM, Viktor Dukhovni <[hidden email]>
> wrote:
>
>>
>>
>> > On Jan 30, 2018, at 4:50 PM, Edward Kmett <[hidden email]> wrote:
>> >
>> > It would be much, much safer to attach the finalizer to something that
>> has a "presence" all its own, like Weak# () as is done in ForeignPtr. This
>> would result in something like:
>> >
>> > data Socket = Socket !CInt (Weak# ())
>> >
>> > Then when it gets unpacked into another data constructor, then the Weak#
>> () still exists. This isn't free, it comes at the cost that your sockets
>> take a couple of words each (plus finalizer space), but the approach you
>> are taking now isn't free either as it isn't really sound. ;)
>> >
>> > tl;dr don't attach finalizers to regular Haskell data types if you can
>> help it
>>
>> THanks, good to know.  I gather the unpacking can/will happen even if
>> Socket internals are [made] opaque to other modules?
>>
>> And of course in this case, in addition to avoiding
>> running the finalizer too early, it is critical that each socket be closed
>> at most once.  Therefore, to support finalization, and make the API safe
>> for multiple close (as seems to be the case with System.IO Handle's for
>> example) there's a need for additional mutable state in the Socket, to
>> keep track of whether it has or has not yet been closed.
>>
>> I am curious as to what your suggestion would be as to how to best keep
>> track of such state.
>>
>>    1. Employ a separate MVar to keep track of socket state, and update
>>       it on close to ensure at most once close.
>>
>>    2. Wrap the file descriptor in an IORef, and set it to an invalid
>>       value (-1 on Unix, INVALID_SOCKET on Windows) on close.  This
>>       avoids misuse not only with close, but also with attempts at
>>       read/write/... I/O after close.  However it does not avoid
>>       (far less likely I think) races to close the socket from
>>       multiple threads.
>>
>>    3. Move the state to a wrapper structure managed in FFI code
>>       so that all socket operations are via a foreign pointer to
>>       a C-structure in which the file descriptor is invalidated on
>>       close.
>>
>>    4.  Other suggestions...
>>
>> --
>>         Viktor.
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> To (un)subscribe, modify options or view archives go to:
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>> Only members subscribed via the mailman list are allowed to post.
>>
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.