Compact Normal Form and ST

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Compact Normal Form and ST

Andrew Martin
In the primops file where the compact normal form functions are
documented (https://github.com/ghc/ghc/blob/adcd1c62b6d372f100ccf1d5d7cd94f79aaac194/compiler/prelude/primops.txt.pp#L2487),
I noticed that all of the functions have type signatures that constrain
them to only being used in IO. For example:

    compactAdd# :: Compact# -> a -> State# RealWorld -> (# State# RealWorld, a #)

I would like to know if generalizing these to allow them to work in ST would
be sound. That is, changing the type signature to:

    compactAdd# :: Compact# -> a -> State# s -> (# State# s, a #)

I'm not requesting that this change actually be made. I only want to
know if using unsafeCoerce to create the second function for my own
project would actually be sound.

For those interested in knowing why I want this, it's because there are
situation where I'm interested in building up a giant structure in a
compact region, but in a way that doesn't actually require IO. I think
it's a pity to have to use a type signature with IO and then call
unsafePerformIO at the end instead of using the more constrained ST,
and runST, which makes it clear that I'm not doing anything observable
form the outside.

As a minor bonus, the ST version of the Compact data type shouldn't need
the lock that the IO version does, since concurrent calls to compactAdd
are not possible.

-Andrew Martin

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Compact Normal Form and ST

Edward Z. Yang
There are two senses of referential transparency here which should be
considered.  First is whether or not you will get the same value results
if you use the compact functionality in ST.  Here, the answer is yes.
Compact normal form has very trivial semantics in this domain, and
it would have been OK even to make compact normal forms be pure
functions.

Second is whether or not the performance characteristics are preserved.
Here, the situation is different.  Most notably, pure expressions and
invocations of the same runST block may be commoned up (via an
optimization pass like CSE.)  In that case, what was previously two
separate compact blocks may be commoned up into a single one.  This
could be disaster if you were planning to use these blocks as separate
allocation buffers for subsequent modifications.

This motivated specializing compact to IO.  It won't segfault if you
put it in ST, but the performance characteristics might change.

Edward

Excerpts from Andrew Martin's message of 2017-06-18 10:24:09 -0400:

> In the primops file where the compact normal form functions are
> documented (https://github.com/ghc/ghc/blob/adcd1c62b6d372f100ccf1d5d7cd94f79aaac194/compiler/prelude/primops.txt.pp#L2487),
> I noticed that all of the functions have type signatures that constrain
> them to only being used in IO. For example:
>
>     compactAdd# :: Compact# -> a -> State# RealWorld -> (# State# RealWorld, a #)
>
> I would like to know if generalizing these to allow them to work in ST would
> be sound. That is, changing the type signature to:
>
>     compactAdd# :: Compact# -> a -> State# s -> (# State# s, a #)
>
> I'm not requesting that this change actually be made. I only want to
> know if using unsafeCoerce to create the second function for my own
> project would actually be sound.
>
> For those interested in knowing why I want this, it's because there are
> situation where I'm interested in building up a giant structure in a
> compact region, but in a way that doesn't actually require IO. I think
> it's a pity to have to use a type signature with IO and then call
> unsafePerformIO at the end instead of using the more constrained ST,
> and runST, which makes it clear that I'm not doing anything observable
> form the outside.
>
> As a minor bonus, the ST version of the Compact data type shouldn't need
> the lock that the IO version does, since concurrent calls to compactAdd
> are not possible.
>
> -Andrew Martin
>
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Compact Normal Form and ST

David Feuer
Can you explain how things could go wrong in ST, perhaps with an example? It's hard to see the potential problem.

On Jun 18, 2017 11:49 PM, "Edward Z. Yang" <[hidden email]> wrote:
There are two senses of referential transparency here which should be
considered.  First is whether or not you will get the same value results
if you use the compact functionality in ST.  Here, the answer is yes.
Compact normal form has very trivial semantics in this domain, and
it would have been OK even to make compact normal forms be pure
functions.

Second is whether or not the performance characteristics are preserved.
Here, the situation is different.  Most notably, pure expressions and
invocations of the same runST block may be commoned up (via an
optimization pass like CSE.)  In that case, what was previously two
separate compact blocks may be commoned up into a single one.  This
could be disaster if you were planning to use these blocks as separate
allocation buffers for subsequent modifications.

This motivated specializing compact to IO.  It won't segfault if you
put it in ST, but the performance characteristics might change.

Edward

Excerpts from Andrew Martin's message of 2017-06-18 10:24:09 -0400:
> In the primops file where the compact normal form functions are
> documented (https://github.com/ghc/ghc/blob/adcd1c62b6d372f100ccf1d5d7cd94f79aaac194/compiler/prelude/primops.txt.pp#L2487),
> I noticed that all of the functions have type signatures that constrain
> them to only being used in IO. For example:
>
>     compactAdd# :: Compact# -> a -> State# RealWorld -> (# State# RealWorld, a #)
>
> I would like to know if generalizing these to allow them to work in ST would
> be sound. That is, changing the type signature to:
>
>     compactAdd# :: Compact# -> a -> State# s -> (# State# s, a #)
>
> I'm not requesting that this change actually be made. I only want to
> know if using unsafeCoerce to create the second function for my own
> project would actually be sound.
>
> For those interested in knowing why I want this, it's because there are
> situation where I'm interested in building up a giant structure in a
> compact region, but in a way that doesn't actually require IO. I think
> it's a pity to have to use a type signature with IO and then call
> unsafePerformIO at the end instead of using the more constrained ST,
> and runST, which makes it clear that I'm not doing anything observable
> form the outside.
>
> As a minor bonus, the ST version of the Compact data type shouldn't need
> the lock that the IO version does, since concurrent calls to compactAdd
> are not possible.
>
> -Andrew Martin
>
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Compact Normal Form and ST

Andrew Martin
I too am struggling to find a scenario in which this causes something to
go wrong. Ending up with only one block instead of two copies of it
seems like it could only be a problem, as Edward says, "if you were
planning to use these blocks as separate allocation buffers for
subsequent modifications". But, mutable byte arrays allocated by
newByteArray# cannot escape runST, so I don't see how this could happen.
Well, I guess if you returned a frozen array, and then you thawed it
for subsequent modification outside of runST, that would be problematic,
but CSE already makes that unsafe, even without involving compact
regions. It's good to know that, for my purposes, I'm in the clear, but
I would also like to hear from Edward clarifying the original statement.

-Andrew Martin

On Mon, Jun 19, 2017 at 4:41 AM, David Feuer <[hidden email]> wrote:
Can you explain how things could go wrong in ST, perhaps with an example? It's hard to see the potential problem.

On Jun 18, 2017 11:49 PM, "Edward Z. Yang" <[hidden email]> wrote:
There are two senses of referential transparency here which should be
considered.  First is whether or not you will get the same value results
if you use the compact functionality in ST.  Here, the answer is yes.
Compact normal form has very trivial semantics in this domain, and
it would have been OK even to make compact normal forms be pure
functions.

Second is whether or not the performance characteristics are preserved.
Here, the situation is different.  Most notably, pure expressions and
invocations of the same runST block may be commoned up (via an
optimization pass like CSE.)  In that case, what was previously two
separate compact blocks may be commoned up into a single one.  This
could be disaster if you were planning to use these blocks as separate
allocation buffers for subsequent modifications.

This motivated specializing compact to IO.  It won't segfault if you
put it in ST, but the performance characteristics might change.

Edward

Excerpts from Andrew Martin's message of 2017-06-18 10:24:09 -0400:
> In the primops file where the compact normal form functions are
> documented (https://github.com/ghc/ghc/blob/adcd1c62b6d372f100ccf1d5d7cd94f79aaac194/compiler/prelude/primops.txt.pp#L2487),
> I noticed that all of the functions have type signatures that constrain
> them to only being used in IO. For example:
>
>     compactAdd# :: Compact# -> a -> State# RealWorld -> (# State# RealWorld, a #)
>
> I would like to know if generalizing these to allow them to work in ST would
> be sound. That is, changing the type signature to:
>
>     compactAdd# :: Compact# -> a -> State# s -> (# State# s, a #)
>
> I'm not requesting that this change actually be made. I only want to
> know if using unsafeCoerce to create the second function for my own
> project would actually be sound.
>
> For those interested in knowing why I want this, it's because there are
> situation where I'm interested in building up a giant structure in a
> compact region, but in a way that doesn't actually require IO. I think
> it's a pity to have to use a type signature with IO and then call
> unsafePerformIO at the end instead of using the more constrained ST,
> and runST, which makes it clear that I'm not doing anything observable
> form the outside.
>
> As a minor bonus, the ST version of the Compact data type shouldn't need
> the lock that the IO version does, since concurrent calls to compactAdd
> are not possible.
>
> -Andrew Martin
>
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries




--
-Andrew Thaddeus Martin

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Compact Normal Form and ST

Edward Z. Yang
Actually, this is more interesting than I thought!  Let me explain.

Let's take the compact API and make a hypothetical ST style
interface.  Here's one possibility:

    data Compact s a
    compact :: a -> ST s (Compact s a)
    getCompact :: Compact a -> a
    runST :: (forall s. ST s a) -> a

Let's consider the following code:

    let x1 = runST (fmap getCompact (compact big_value))
        x2 = runST (fmap getCompact (compact big_value))

GHC may CSE x1 and x2 into a single expression, which means that
what was previously two compact expressions turns into a single
one.

"But Edward", you may say, "That's fine, I only cared about the compact
region being distinct within a single ST block. If you really wanted
x1 and x2 to be distinct, you should have written it this way:"

    let (x1, x2) = runST $ do x1 <- fmap getCompact (compact big_value)
                              x2 <- fmap getCompact (compact big_value)
                              return (x1, x2)

which is indeed true!  So I suppose it depends on what your expectations
are.

Edward

Excerpts from Andrew Martin's message of 2017-06-19 09:06:39 -0400:

> I too am struggling to find a scenario in which this causes something to
> go wrong. Ending up with only one block instead of two copies of it
> seems like it could only be a problem, as Edward says, "if you were
> planning to use these blocks as separate allocation buffers for
> subsequent modifications". But, mutable byte arrays allocated by
> newByteArray# cannot escape runST, so I don't see how this could happen.
> Well, I guess if you returned a frozen array, and then you thawed it
> for subsequent modification outside of runST, that would be problematic,
> but CSE already makes that unsafe, even without involving compact
> regions. It's good to know that, for my purposes, I'm in the clear, but
> I would also like to hear from Edward clarifying the original statement.
>
> -Andrew Martin
>
> On Mon, Jun 19, 2017 at 4:41 AM, David Feuer <[hidden email]> wrote:
>
> > Can you explain how things could go wrong in ST, perhaps with an example?
> > It's hard to see the potential problem.
> >
> > On Jun 18, 2017 11:49 PM, "Edward Z. Yang" <[hidden email]> wrote:
> >
> > There are two senses of referential transparency here which should be
> > considered.  First is whether or not you will get the same value results
> > if you use the compact functionality in ST.  Here, the answer is yes.
> > Compact normal form has very trivial semantics in this domain, and
> > it would have been OK even to make compact normal forms be pure
> > functions.
> >
> > Second is whether or not the performance characteristics are preserved.
> > Here, the situation is different.  Most notably, pure expressions and
> > invocations of the same runST block may be commoned up (via an
> > optimization pass like CSE.)  In that case, what was previously two
> > separate compact blocks may be commoned up into a single one.  This
> > could be disaster if you were planning to use these blocks as separate
> > allocation buffers for subsequent modifications.
> >
> > This motivated specializing compact to IO.  It won't segfault if you
> > put it in ST, but the performance characteristics might change.
> >
> > Edward
> >
> > Excerpts from Andrew Martin's message of 2017-06-18 10:24:09 -0400:
> > > In the primops file where the compact normal form functions are
> > > documented (https://github.com/ghc/ghc/blob/adcd1c62b6d372f100ccf1d5d7c
> > d94f79aaac194/compiler/prelude/primops.txt.pp#L2487),
> > > I noticed that all of the functions have type signatures that constrain
> > > them to only being used in IO. For example:
> > >
> > >     compactAdd# :: Compact# -> a -> State# RealWorld -> (# State#
> > RealWorld, a #)
> > >
> > > I would like to know if generalizing these to allow them to work in ST
> > would
> > > be sound. That is, changing the type signature to:
> > >
> > >     compactAdd# :: Compact# -> a -> State# s -> (# State# s, a #)
> > >
> > > I'm not requesting that this change actually be made. I only want to
> > > know if using unsafeCoerce to create the second function for my own
> > > project would actually be sound.
> > >
> > > For those interested in knowing why I want this, it's because there are
> > > situation where I'm interested in building up a giant structure in a
> > > compact region, but in a way that doesn't actually require IO. I think
> > > it's a pity to have to use a type signature with IO and then call
> > > unsafePerformIO at the end instead of using the more constrained ST,
> > > and runST, which makes it clear that I'm not doing anything observable
> > > form the outside.
> > >
> > > As a minor bonus, the ST version of the Compact data type shouldn't need
> > > the lock that the IO version does, since concurrent calls to compactAdd
> > > are not possible.
> > >
> > > -Andrew Martin
> > >
> > _______________________________________________
> > Libraries mailing list
> > [hidden email]
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> >
> >
> >
>
_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries