Template Haskell determinism

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

Template Haskell determinism

Bartosz Nitka
Template Haskell with its ability to do arbitrary IO is non-deterministic by
design. You could for example embed the current date in a file. There is
however one kind of non-deterministic behavior that you can trigger 
accidentally. It has to do with how Names are reified. If you take a look at 
the definition of reifyName you can see that it puts the assigned Unique in a
NameU:

  reifyName :: NamedThing n => n -> TH.Name
  reifyName thing
    | isExternalName name = mk_varg pkg_str mod_str occ_str
    | otherwise           = TH.mkNameU occ_str (getKey (getUnique name))
    ...
NameFlavour which NameU is a constructor of has a default Ord instance, meaning
that it ends up comparing the Uniques. The relative ordering of Uniques is not
guaranteed to be stable across recompilations [1], so this can lead to 
ABI-incompatible binaries.

This isn't an abstract problem and it actually happens in practice. The 
microlens package keeps Names in a Set and later turns that set into a list.
The results have different orders of TyVars resulting in different ABI hashes
and can potentially be optimized differently.

I believe it's worth to handle this case in a deterministic way and I have a
solution in mind. The idea is to extend NameU (and potentially NameL) with an
ordering key. To be more concrete:

-   | NameU !Int
+   | NameU !Int !Int

This way the Ord instance can use a stable key and the problem reduces to
ensuring the keys are stable. To generate stable keys we can use the fact that
reify traverses the expressions in the same order every time and sequentially
allocate new keys based on traversal order. The way I have it implemented now 
is to add a new field in TcGblEnv which maps Uniques to allocated keys:

+        tcg_th_names :: TcRef (UniqFM Int, Int),

Then the reifyName and qNewName do the necessary bookkeeping and translate the
Uniques on the fly.

This is a breaking change and it doesn't fix the problem that NameFlavour is
not abstract and leaks the Uniques. It would break at least:

- singletons
- th-lift
- haskell-src-meta
- shakespeare
- distributed-closure

I'd like to get feedback if this is an acceptable solution and if the problem
is worth solving.

Cheers,
Bartosz


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Richard Eisenberg-2

On May 31, 2016, at 9:54 AM, Bartosz Nitka <[hidden email]> wrote:

> I'd like to get feedback if this is an acceptable solution and if the problem
> is worth solving.

I don't have an opinion about "worth solving". While I understand your description of the problem and believe you that it crops up in practice, I don't have a grasp on the net effect of this all. Bottom line on this front: I trust your judgment.

As to the breaking change: go for it. Template Haskell churns a good deal between releases (though not within a single major release) and so breaking changes are common. And NameFlavour really should be abstract (it can't be due to the linkage between the template-haskell and ghc packages) and anyone who uses it (including me) is doing something fishy. Accordingly, I'd personally be OK with a breaking change in a minor release around NameFlavour, as long as nothing exported from Language.Haskell.TH is changed. (We do have a way of using CPP to detect minor version bumps, right?)

I hope this helps,
Richard

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Michael Sloan
In reply to this post by Bartosz Nitka
+1 to solving this.  Not sure about the approach, but assuming the following concerns are addressed, I'm (+1) on it too:

This solution is clever!  However, I think there is some difficulty to determining this ordering key.  Namely, what happens when I construct the (Set Name) using results from multiple reifies?

One solution is to have the ordering key be a consecutive supply that's initialized on a per-module basis.  There is still an issue there, though, which is that you might store one of these names in a global IORef that's used by a later TH splice.  Or, similarly, serialize the names to a file and later load them.  At least in those cases you need to use 'runIO' to break determinism.

If names get different ordering keys when reified from different modules (seems like they'd have to, particularly given ghc's "-j"), then we end up with an unpleasant circumstance where these do not compare as equal.  How about having the Eq instance ignore the ordering key?  I think that mostly resolves this concern.  This implies that the Ord instance should also yield EQ and ignore the ordering key, when the unique key matches.

One issue with this is that switching the order of reify could unexpectedly vary the behavior.

Does the map in TcGblEnv imply that a reify from a later module will get the same ordering key?  So does this mean that the keys used in a given reify depend on which things have already been reified?  In that case, then this is also an issue with your solution.  Now, it's not a big problem at all, just surprising to the user.


If the internal API for Name does change, may as well address https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's suggested solution of having both the traditional package identifier and package keys in 'Name'.  

-Michael

On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]> wrote:
Template Haskell with its ability to do arbitrary IO is non-deterministic by
design. You could for example embed the current date in a file. There is
however one kind of non-deterministic behavior that you can trigger 
accidentally. It has to do with how Names are reified. If you take a look at 
the definition of reifyName you can see that it puts the assigned Unique in a
NameU:

  reifyName :: NamedThing n => n -> TH.Name
  reifyName thing
    | isExternalName name = mk_varg pkg_str mod_str occ_str
    | otherwise           = TH.mkNameU occ_str (getKey (getUnique name))
    ...
NameFlavour which NameU is a constructor of has a default Ord instance, meaning
that it ends up comparing the Uniques. The relative ordering of Uniques is not
guaranteed to be stable across recompilations [1], so this can lead to 
ABI-incompatible binaries.

This isn't an abstract problem and it actually happens in practice. The 
microlens package keeps Names in a Set and later turns that set into a list.
The results have different orders of TyVars resulting in different ABI hashes
and can potentially be optimized differently.

I believe it's worth to handle this case in a deterministic way and I have a
solution in mind. The idea is to extend NameU (and potentially NameL) with an
ordering key. To be more concrete:

-   | NameU !Int
+   | NameU !Int !Int

This way the Ord instance can use a stable key and the problem reduces to
ensuring the keys are stable. To generate stable keys we can use the fact that
reify traverses the expressions in the same order every time and sequentially
allocate new keys based on traversal order. The way I have it implemented now 
is to add a new field in TcGblEnv which maps Uniques to allocated keys:

+        tcg_th_names :: TcRef (UniqFM Int, Int),

Then the reifyName and qNewName do the necessary bookkeeping and translate the
Uniques on the fly.

This is a breaking change and it doesn't fix the problem that NameFlavour is
not abstract and leaks the Uniques. It would break at least:

- singletons
- th-lift
- haskell-src-meta
- shakespeare
- distributed-closure

I'd like to get feedback if this is an acceptable solution and if the problem
is worth solving.

Cheers,
Bartosz


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

RE: Template Haskell determinism

Simon Peyton Jones

If names get different ordering keys when reified from different modules (seems like they'd have to, particularly given ghc's "-j"), then we end up with an unpleasant circumstance where these do not compare as equal

 

The I believe that global, top level names (NameG) are not subject to this ordering stuff, so I don’t think this problem can occur.

 

This is a breaking change and it doesn't fix the problem that NameFlavour is

not abstract and leaks the Uniques. It would break at least:

 

But why is NameU exposed to clients?   GHC needs to know, but clients don’t.  What use are these packages making of it?

 

S

 

 

From: ghc-devs [mailto:[hidden email]] On Behalf Of Michael Sloan
Sent: 02 June 2016 02:07
To: Bartosz Nitka <[hidden email]>
Cc: ghc-devs Devs <[hidden email]>
Subject: Re: Template Haskell determinism

 

+1 to solving this.  Not sure about the approach, but assuming the following concerns are addressed, I'm (+1) on it too:

 

This solution is clever!  However, I think there is some difficulty to determining this ordering key.  Namely, what happens when I construct the (Set Name) using results from multiple reifies?

 

One solution is to have the ordering key be a consecutive supply that's initialized on a per-module basis.  There is still an issue there, though, which is that you might store one of these names in a global IORef that's used by a later TH splice.  Or, similarly, serialize the names to a file and later load them.  At least in those cases you need to use 'runIO' to break determinism.

 

If names get different ordering keys when reified from different modules (seems like they'd have to, particularly given ghc's "-j"), then we end up with an unpleasant circumstance where these do not compare as equal.  How about having the Eq instance ignore the ordering key?  I think that mostly resolves this concern.  This implies that the Ord instance should also yield EQ and ignore the ordering key, when the unique key matches.

 

One issue with this is that switching the order of reify could unexpectedly vary the behavior.

 

Does the map in TcGblEnv imply that a reify from a later module will get the same ordering key?  So does this mean that the keys used in a given reify depend on which things have already been reified?  In that case, then this is also an issue with your solution.  Now, it's not a big problem at all, just surprising to the user.

 

 

If the internal API for Name does change, may as well address https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's suggested solution of having both the traditional package identifier and package keys in 'Name'.  

 

-Michael

 

On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]> wrote:

Template Haskell with its ability to do arbitrary IO is non-deterministic by

design. You could for example embed the current date in a file. There is

however one kind of non-deterministic behavior that you can trigger 

accidentally. It has to do with how Names are reified. If you take a look at 

the definition of reifyName you can see that it puts the assigned Unique in a

NameU:

 

  reifyName :: NamedThing n => n -> TH.Name

  reifyName thing

    | isExternalName name = mk_varg pkg_str mod_str occ_str

    | otherwise           = TH.mkNameU occ_str (getKey (getUnique name))

    ...

NameFlavour which NameU is a constructor of has a default Ord instance, meaning

that it ends up comparing the Uniques. The relative ordering of Uniques is not

guaranteed to be stable across recompilations [1], so this can lead to 

ABI-incompatible binaries.

 

This isn't an abstract problem and it actually happens in practice. The 

microlens package keeps Names in a Set and later turns that set into a list.

The results have different orders of TyVars resulting in different ABI hashes

and can potentially be optimized differently.

 

I believe it's worth to handle this case in a deterministic way and I have a

solution in mind. The idea is to extend NameU (and potentially NameL) with an

ordering key. To be more concrete:

 

-   | NameU !Int

+   | NameU !Int !Int

 

This way the Ord instance can use a stable key and the problem reduces to

ensuring the keys are stable. To generate stable keys we can use the fact that

reify traverses the expressions in the same order every time and sequentially

allocate new keys based on traversal order. The way I have it implemented now 

is to add a new field in TcGblEnv which maps Uniques to allocated keys:

 

+        tcg_th_names :: TcRef (UniqFM Int, Int),

 

Then the reifyName and qNewName do the necessary bookkeeping and translate the

Uniques on the fly.

 

This is a breaking change and it doesn't fix the problem that NameFlavour is

not abstract and leaks the Uniques. It would break at least:

 

- singletons

- th-lift

- haskell-src-meta

- shakespeare

- distributed-closure

 

I'd like to get feedback if this is an acceptable solution and if the problem

is worth solving.

 

Cheers,

Bartosz

 


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

 


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Richard Eisenberg-2

On Jun 2, 2016, at 7:12 AM, Simon Peyton Jones <[hidden email]> wrote:

But why is NameU exposed to clients?   GHC needs to know, but clients don’t.  What use are these packages making of it?

singletons uses NameU in two places:

1. To generate unique numbers. It would be easy enough for me to put this functionality in my own monad, though.

2. More importantly, to work around GHC's #11812, caused by the fact that `NameU`s don't always work when other Names would. So I have to squeeze out `NameU`s in one spot.

Richard


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Michael Sloan
In reply to this post by Simon Peyton Jones
On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <[hidden email]> wrote:

If names get different ordering keys when reified from different modules (seems like they'd have to, particularly given ghc's "-j"), then we end up with an unpleasant circumstance where these do not compare as equal

 

The I believe that global, top level names (NameG) are not subject to this ordering stuff, so I don’t think this problem can occur.


True, top level names are NameG.  The reified Info for a top level Dec may include NameU, though.  For example, the type variables in 'Maybe' are NameU:

$(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
     lift (show nf))

The resulting expression is something like "NameU 822083586"
 

This is a breaking change and it doesn't fix the problem that NameFlavour is

not abstract and leaks the Uniques. It would break at least:

 

But why is NameU exposed to clients?   GHC needs to know, but clients don’t.  What use are these packages making of it?


It's being leaked in the public inteface via Ord.  The Eq instance is fine, because these are Uniques, so the results should be consistent.  

There are two goals in contention here:

1) Having some ordering on Names so that they can be used in Map or Set
2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to really handle these well.  In that case, the ordering would be based on everything but the NameU int, but 'Eq' would still follow it

A few ideas for different approaches to resolving this:

1) Document it.  Less appealing than fixing it in the API, but still would be good.  

2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd' newtype (partial ord on the non-unique info), or 'UnstableNameOrd' newtype (current behavior).  A trickyness of this approach is that you'd need containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases people are using the 'Ord' instance with 'Name's that are not 'NameU', so this would break a lot of code that was already deterministic.

3) Some approaches like this ordering key, but I'm not sure how it will help when comparing NameUs from different modules?
 

S

 

 

From: ghc-devs [mailto:[hidden email]] On Behalf Of Michael Sloan
Sent: 02 June 2016 02:07
To: Bartosz Nitka <[hidden email]>
Cc: ghc-devs Devs <[hidden email]>
Subject: Re: Template Haskell determinism

 

+1 to solving this.  Not sure about the approach, but assuming the following concerns are addressed, I'm (+1) on it too:

 

This solution is clever!  However, I think there is some difficulty to determining this ordering key.  Namely, what happens when I construct the (Set Name) using results from multiple reifies?

 

One solution is to have the ordering key be a consecutive supply that's initialized on a per-module basis.  There is still an issue there, though, which is that you might store one of these names in a global IORef that's used by a later TH splice.  Or, similarly, serialize the names to a file and later load them.  At least in those cases you need to use 'runIO' to break determinism.

 

If names get different ordering keys when reified from different modules (seems like they'd have to, particularly given ghc's "-j"), then we end up with an unpleasant circumstance where these do not compare as equal.  How about having the Eq instance ignore the ordering key?  I think that mostly resolves this concern.  This implies that the Ord instance should also yield EQ and ignore the ordering key, when the unique key matches.

 

One issue with this is that switching the order of reify could unexpectedly vary the behavior.

 

Does the map in TcGblEnv imply that a reify from a later module will get the same ordering key?  So does this mean that the keys used in a given reify depend on which things have already been reified?  In that case, then this is also an issue with your solution.  Now, it's not a big problem at all, just surprising to the user.

 

 

If the internal API for Name does change, may as well address https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's suggested solution of having both the traditional package identifier and package keys in 'Name'.  

 

-Michael

 

On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]> wrote:

Template Haskell with its ability to do arbitrary IO is non-deterministic by

design. You could for example embed the current date in a file. There is

however one kind of non-deterministic behavior that you can trigger 

accidentally. It has to do with how Names are reified. If you take a look at 

the definition of reifyName you can see that it puts the assigned Unique in a

NameU:

 

  reifyName :: NamedThing n => n -> TH.Name

  reifyName thing

    | isExternalName name = mk_varg pkg_str mod_str occ_str

    | otherwise           = TH.mkNameU occ_str (getKey (getUnique name))

    ...

NameFlavour which NameU is a constructor of has a default Ord instance, meaning

that it ends up comparing the Uniques. The relative ordering of Uniques is not

guaranteed to be stable across recompilations [1], so this can lead to 

ABI-incompatible binaries.

 

This isn't an abstract problem and it actually happens in practice. The 

microlens package keeps Names in a Set and later turns that set into a list.

The results have different orders of TyVars resulting in different ABI hashes

and can potentially be optimized differently.

 

I believe it's worth to handle this case in a deterministic way and I have a

solution in mind. The idea is to extend NameU (and potentially NameL) with an

ordering key. To be more concrete:

 

-   | NameU !Int

+   | NameU !Int !Int

 

This way the Ord instance can use a stable key and the problem reduces to

ensuring the keys are stable. To generate stable keys we can use the fact that

reify traverses the expressions in the same order every time and sequentially

allocate new keys based on traversal order. The way I have it implemented now 

is to add a new field in TcGblEnv which maps Uniques to allocated keys:

 

+        tcg_th_names :: TcRef (UniqFM Int, Int),

 

Then the reifyName and qNewName do the necessary bookkeeping and translate the

Uniques on the fly.

 

This is a breaking change and it doesn't fix the problem that NameFlavour is

not abstract and leaks the Uniques. It would break at least:

 

- singletons

- th-lift

- haskell-src-meta

- shakespeare

- distributed-closure

 

I'd like to get feedback if this is an acceptable solution and if the problem

is worth solving.

 

Cheers,

Bartosz

 


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

 



_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Edward Z. Yang
I must admit, I am a bit confused by this discussion.

It is true that every Name is associated with a Unique.  But you don't
need the Unique to equality/ordering tests; the names also contain
enough (stable) information for stable comparisons of that sort.  So
why don't we expose that instead of the Unique?

Edward

Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:

> On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <[hidden email]>
> wrote:
>
> > If names get different ordering keys when reified from different modules
> > (seems like they'd have to, particularly given ghc's "-j"), then we end up
> > with an unpleasant circumstance where these do not compare as equal
> >
> >
> >
> > The I believe that global, top level names (NameG) are not subject to this
> > ordering stuff, so I don’t think this problem can occur.
> >
>
> True, top level names are NameG.  The reified Info for a top level Dec may
> include NameU, though.  For example, the type variables in 'Maybe' are
> NameU:
>
> $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
>      lift (show nf))
>
> The resulting expression is something like "NameU 822083586"
>
> > This is a breaking change and it doesn't fix the problem that NameFlavour
> > is
> >
> > not abstract and leaks the Uniques. It would break at least:
> >
> >
> >
> > But why is NameU exposed to clients?   GHC needs to know, but clients
> > don’t.  What use are these packages making of it?
> >
>
> It's being leaked in the public inteface via Ord.  The Eq instance is fine,
> because these are Uniques, so the results should be consistent.
>
> There are two goals in contention here:
>
> 1) Having some ordering on Names so that they can be used in Map or Set
> 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
> really handle these well.  In that case, the ordering would be based on
> everything but the NameU int, but 'Eq' would still follow it
>
> A few ideas for different approaches to resolving this:
>
> 1) Document it.  Less appealing than fixing it in the API, but still would
> be good.
>
> 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
> newtype (partial ord on the non-unique info), or 'UnstableNameOrd' newtype
> (current behavior).  A trickyness of this approach is that you'd need
> containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
> people are using the 'Ord' instance with 'Name's that are not 'NameU', so
> this would break a lot of code that was already deterministic.
>
> 3) Some approaches like this ordering key, but I'm not sure how it will
> help when comparing NameUs from different modules?
>
> > S
> >
> >
> >
> >
> >
> > *From:* ghc-devs [mailto:[hidden email]] *On Behalf Of *Michael
> > Sloan
> > *Sent:* 02 June 2016 02:07
> > *To:* Bartosz Nitka <[hidden email]>
> > *Cc:* ghc-devs Devs <[hidden email]>
> > *Subject:* Re: Template Haskell determinism
> >
> >
> >
> > +1 to solving this.  Not sure about the approach, but assuming the
> > following concerns are addressed, I'm (+1) on it too:
> >
> >
> >
> > This solution is clever!  However, I think there is some difficulty to
> > determining this ordering key.  Namely, what happens when I construct the
> > (Set Name) using results from multiple reifies?
> >
> >
> >
> > One solution is to have the ordering key be a consecutive supply that's
> > initialized on a per-module basis.  There is still an issue there, though,
> > which is that you might store one of these names in a global IORef that's
> > used by a later TH splice.  Or, similarly, serialize the names to a file
> > and later load them.  At least in those cases you need to use 'runIO' to
> > break determinism.
> >
> >
> >
> > If names get different ordering keys when reified from different modules
> > (seems like they'd have to, particularly given ghc's "-j"), then we end up
> > with an unpleasant circumstance where these do not compare as equal.  How
> > about having the Eq instance ignore the ordering key?  I think that mostly
> > resolves this concern.  This implies that the Ord instance should also
> > yield EQ and ignore the ordering key, when the unique key matches.
> >
> >
> >
> > One issue with this is that switching the order of reify could
> > unexpectedly vary the behavior.
> >
> >
> >
> > Does the map in TcGblEnv imply that a reify from a later module will get
> > the same ordering key?  So does this mean that the keys used in a given
> > reify depend on which things have already been reified?  In that case, then
> > this is also an issue with your solution.  Now, it's not a big problem at
> > all, just surprising to the user.
> >
> >
> >
> >
> >
> > If the internal API for Name does change, may as well address
> > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
> > suggested solution of having both the traditional package identifier and
> > package keys in 'Name'.
> >
> >
> >
> > -Michael
> >
> >
> >
> > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]> wrote:
> >
> > Template Haskell with its ability to do arbitrary IO is non-deterministic
> > by
> >
> > design. You could for example embed the current date in a file. There is
> >
> > however one kind of non-deterministic behavior that you can trigger
> >
> > accidentally. It has to do with how Names are reified. If you take a look
> > at
> >
> > the definition of reifyName you can see that it puts the assigned Unique
> > in a
> >
> > NameU:
> >
> >
> >
> >   reifyName :: NamedThing n => n -> TH.Name
> >
> >   reifyName thing
> >
> >     | isExternalName name = mk_varg pkg_str mod_str occ_str
> >
> >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique name))
> >
> >     ...
> >
> > NameFlavour which NameU is a constructor of has a default Ord instance,
> > meaning
> >
> > that it ends up comparing the Uniques. The relative ordering of Uniques is
> > not
> >
> > guaranteed to be stable across recompilations [1], so this can lead to
> >
> > ABI-incompatible binaries.
> >
> >
> >
> > This isn't an abstract problem and it actually happens in practice. The
> >
> > microlens package keeps Names in a Set and later turns that set into a
> > list.
> >
> > The results have different orders of TyVars resulting in different ABI
> > hashes
> >
> > and can potentially be optimized differently.
> >
> >
> >
> > I believe it's worth to handle this case in a deterministic way and I have
> > a
> >
> > solution in mind. The idea is to extend NameU (and potentially NameL) with
> > an
> >
> > ordering key. To be more concrete:
> >
> >
> >
> > -   | NameU !Int
> >
> > +   | NameU !Int !Int
> >
> >
> >
> > This way the Ord instance can use a stable key and the problem reduces to
> >
> > ensuring the keys are stable. To generate stable keys we can use the fact
> > that
> >
> > reify traverses the expressions in the same order every time and
> > sequentially
> >
> > allocate new keys based on traversal order. The way I have it implemented
> > now
> >
> > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
> >
> >
> >
> > +        tcg_th_names :: TcRef (UniqFM Int, Int),
> >
> >
> >
> > Then the reifyName and qNewName do the necessary bookkeeping and translate
> > the
> >
> > Uniques on the fly.
> >
> >
> >
> > This is a breaking change and it doesn't fix the problem that NameFlavour
> > is
> >
> > not abstract and leaks the Uniques. It would break at least:
> >
> >
> >
> > - singletons
> >
> > - th-lift
> >
> > - haskell-src-meta
> >
> > - shakespeare
> >
> > - distributed-closure
> >
> >
> >
> > I'd like to get feedback if this is an acceptable solution and if the
> > problem
> >
> > is worth solving.
> >
> >
> >
> > Cheers,
> >
> > Bartosz
> >
> >
> >
> > [1]
> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
> >
> >
> > _______________________________________________
> > ghc-devs mailing list
> > [hidden email]
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> > <https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d>
> >
> >
> >
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Michael Sloan
Hey, sorry for not getting back to this sooner!

Perhaps I should have added the following to my list of goals in contention:

(3) (==) shouldn't yield True for Names that have different unique ids.

We can only have stable comparisons if goal (3) isn't met, and two different unique Names would be considered to be equivalent based on the nameBase.  This is because Ord is a total order, not a partial order.  As described in my prior email, PartialOrd could be added, but it'd be inconvenient to use with existing Ord based containers.

-Michael

On Sun, Jun 5, 2016 at 10:15 AM, Edward Z. Yang <[hidden email]> wrote:
I must admit, I am a bit confused by this discussion.

It is true that every Name is associated with a Unique.  But you don't
need the Unique to equality/ordering tests; the names also contain
enough (stable) information for stable comparisons of that sort.  So
why don't we expose that instead of the Unique?

Edward

Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:
> On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <[hidden email]>
> wrote:
>
> > If names get different ordering keys when reified from different modules
> > (seems like they'd have to, particularly given ghc's "-j"), then we end up
> > with an unpleasant circumstance where these do not compare as equal
> >
> >
> >
> > The I believe that global, top level names (NameG) are not subject to this
> > ordering stuff, so I don’t think this problem can occur.
> >
>
> True, top level names are NameG.  The reified Info for a top level Dec may
> include NameU, though.  For example, the type variables in 'Maybe' are
> NameU:
>
> $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
>      lift (show nf))
>
> The resulting expression is something like "NameU 822083586"
>
> > This is a breaking change and it doesn't fix the problem that NameFlavour
> > is
> >
> > not abstract and leaks the Uniques. It would break at least:
> >
> >
> >
> > But why is NameU exposed to clients?   GHC needs to know, but clients
> > don’t.  What use are these packages making of it?
> >
>
> It's being leaked in the public inteface via Ord.  The Eq instance is fine,
> because these are Uniques, so the results should be consistent.
>
> There are two goals in contention here:
>
> 1) Having some ordering on Names so that they can be used in Map or Set
> 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
> really handle these well.  In that case, the ordering would be based on
> everything but the NameU int, but 'Eq' would still follow it
>
> A few ideas for different approaches to resolving this:
>
> 1) Document it.  Less appealing than fixing it in the API, but still would
> be good.
>
> 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
> newtype (partial ord on the non-unique info), or 'UnstableNameOrd' newtype
> (current behavior).  A trickyness of this approach is that you'd need
> containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
> people are using the 'Ord' instance with 'Name's that are not 'NameU', so
> this would break a lot of code that was already deterministic.
>
> 3) Some approaches like this ordering key, but I'm not sure how it will
> help when comparing NameUs from different modules?
>
> > S
> >
> >
> >
> >
> >
> > *From:* ghc-devs [mailto:[hidden email]] *On Behalf Of *Michael
> > Sloan
> > *Sent:* 02 June 2016 02:07
> > *To:* Bartosz Nitka <[hidden email]>
> > *Cc:* ghc-devs Devs <[hidden email]>
> > *Subject:* Re: Template Haskell determinism
> >
> >
> >
> > +1 to solving this.  Not sure about the approach, but assuming the
> > following concerns are addressed, I'm (+1) on it too:
> >
> >
> >
> > This solution is clever!  However, I think there is some difficulty to
> > determining this ordering key.  Namely, what happens when I construct the
> > (Set Name) using results from multiple reifies?
> >
> >
> >
> > One solution is to have the ordering key be a consecutive supply that's
> > initialized on a per-module basis.  There is still an issue there, though,
> > which is that you might store one of these names in a global IORef that's
> > used by a later TH splice.  Or, similarly, serialize the names to a file
> > and later load them.  At least in those cases you need to use 'runIO' to
> > break determinism.
> >
> >
> >
> > If names get different ordering keys when reified from different modules
> > (seems like they'd have to, particularly given ghc's "-j"), then we end up
> > with an unpleasant circumstance where these do not compare as equal.  How
> > about having the Eq instance ignore the ordering key?  I think that mostly
> > resolves this concern.  This implies that the Ord instance should also
> > yield EQ and ignore the ordering key, when the unique key matches.
> >
> >
> >
> > One issue with this is that switching the order of reify could
> > unexpectedly vary the behavior.
> >
> >
> >
> > Does the map in TcGblEnv imply that a reify from a later module will get
> > the same ordering key?  So does this mean that the keys used in a given
> > reify depend on which things have already been reified?  In that case, then
> > this is also an issue with your solution.  Now, it's not a big problem at
> > all, just surprising to the user.
> >
> >
> >
> >
> >
> > If the internal API for Name does change, may as well address
> > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
> > suggested solution of having both the traditional package identifier and
> > package keys in 'Name'.
> >
> >
> >
> > -Michael
> >
> >
> >
> > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]> wrote:
> >
> > Template Haskell with its ability to do arbitrary IO is non-deterministic
> > by
> >
> > design. You could for example embed the current date in a file. There is
> >
> > however one kind of non-deterministic behavior that you can trigger
> >
> > accidentally. It has to do with how Names are reified. If you take a look
> > at
> >
> > the definition of reifyName you can see that it puts the assigned Unique
> > in a
> >
> > NameU:
> >
> >
> >
> >   reifyName :: NamedThing n => n -> TH.Name
> >
> >   reifyName thing
> >
> >     | isExternalName name = mk_varg pkg_str mod_str occ_str
> >
> >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique name))
> >
> >     ...
> >
> > NameFlavour which NameU is a constructor of has a default Ord instance,
> > meaning
> >
> > that it ends up comparing the Uniques. The relative ordering of Uniques is
> > not
> >
> > guaranteed to be stable across recompilations [1], so this can lead to
> >
> > ABI-incompatible binaries.
> >
> >
> >
> > This isn't an abstract problem and it actually happens in practice. The
> >
> > microlens package keeps Names in a Set and later turns that set into a
> > list.
> >
> > The results have different orders of TyVars resulting in different ABI
> > hashes
> >
> > and can potentially be optimized differently.
> >
> >
> >
> > I believe it's worth to handle this case in a deterministic way and I have
> > a
> >
> > solution in mind. The idea is to extend NameU (and potentially NameL) with
> > an
> >
> > ordering key. To be more concrete:
> >
> >
> >
> > -   | NameU !Int
> >
> > +   | NameU !Int !Int
> >
> >
> >
> > This way the Ord instance can use a stable key and the problem reduces to
> >
> > ensuring the keys are stable. To generate stable keys we can use the fact
> > that
> >
> > reify traverses the expressions in the same order every time and
> > sequentially
> >
> > allocate new keys based on traversal order. The way I have it implemented
> > now
> >
> > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
> >
> >
> >
> > +        tcg_th_names :: TcRef (UniqFM Int, Int),
> >
> >
> >
> > Then the reifyName and qNewName do the necessary bookkeeping and translate
> > the
> >
> > Uniques on the fly.
> >
> >
> >
> > This is a breaking change and it doesn't fix the problem that NameFlavour
> > is
> >
> > not abstract and leaks the Uniques. It would break at least:
> >
> >
> >
> > - singletons
> >
> > - th-lift
> >
> > - haskell-src-meta
> >
> > - shakespeare
> >
> > - distributed-closure
> >
> >
> >
> > I'd like to get feedback if this is an acceptable solution and if the
> > problem
> >
> > is worth solving.
> >
> >
> >
> > Cheers,
> >
> > Bartosz
> >
> >
> >
> > [1]
> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
> >
> >
> > _______________________________________________
> > ghc-devs mailing list
> > [hidden email]
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> > <https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d>
> >
> >
> >


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Edward Z. Yang
No, nameBase is not the right thing to use here; you also need the
unit ID (in GHC 8.0 parlance; package key in GHC 7.10; package id
in GHC 7.8 and before).  If you have that information, then
GHC establishes an invariant that if two names compare stably equal,
then the uniques associated with them are the same.

Edward

Excerpts from Michael Sloan's message of 2016-06-10 17:16:44 -0400:

> Hey, sorry for not getting back to this sooner!
>
> Perhaps I should have added the following to my list of goals in contention:
>
> (3) (==) shouldn't yield True for Names that have different unique ids.
>
> We can only have stable comparisons if goal (3) isn't met, and two
> different unique Names would be considered to be equivalent based on the
> nameBase.  This is because Ord is a total order, not a partial order.  As
> described in my prior email, PartialOrd could be added, but it'd be
> inconvenient to use with existing Ord based containers.
>
> -Michael
>
> On Sun, Jun 5, 2016 at 10:15 AM, Edward Z. Yang <[hidden email]> wrote:
>
> > I must admit, I am a bit confused by this discussion.
> >
> > It is true that every Name is associated with a Unique.  But you don't
> > need the Unique to equality/ordering tests; the names also contain
> > enough (stable) information for stable comparisons of that sort.  So
> > why don't we expose that instead of the Unique?
> >
> > Edward
> >
> > Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:
> > > On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <
> > [hidden email]>
> > > wrote:
> > >
> > > > If names get different ordering keys when reified from different
> > modules
> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
> > end up
> > > > with an unpleasant circumstance where these do not compare as equal
> > > >
> > > >
> > > >
> > > > The I believe that global, top level names (NameG) are not subject to
> > this
> > > > ordering stuff, so I don’t think this problem can occur.
> > > >
> > >
> > > True, top level names are NameG.  The reified Info for a top level Dec
> > may
> > > include NameU, though.  For example, the type variables in 'Maybe' are
> > > NameU:
> > >
> > > $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
> > >      lift (show nf))
> > >
> > > The resulting expression is something like "NameU 822083586"
> > >
> > > > This is a breaking change and it doesn't fix the problem that
> > NameFlavour
> > > > is
> > > >
> > > > not abstract and leaks the Uniques. It would break at least:
> > > >
> > > >
> > > >
> > > > But why is NameU exposed to clients?   GHC needs to know, but clients
> > > > don’t.  What use are these packages making of it?
> > > >
> > >
> > > It's being leaked in the public inteface via Ord.  The Eq instance is
> > fine,
> > > because these are Uniques, so the results should be consistent.
> > >
> > > There are two goals in contention here:
> > >
> > > 1) Having some ordering on Names so that they can be used in Map or Set
> > > 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
> > > really handle these well.  In that case, the ordering would be based on
> > > everything but the NameU int, but 'Eq' would still follow it
> > >
> > > A few ideas for different approaches to resolving this:
> > >
> > > 1) Document it.  Less appealing than fixing it in the API, but still
> > would
> > > be good.
> > >
> > > 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
> > > newtype (partial ord on the non-unique info), or 'UnstableNameOrd'
> > newtype
> > > (current behavior).  A trickyness of this approach is that you'd need
> > > containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
> > > people are using the 'Ord' instance with 'Name's that are not 'NameU', so
> > > this would break a lot of code that was already deterministic.
> > >
> > > 3) Some approaches like this ordering key, but I'm not sure how it will
> > > help when comparing NameUs from different modules?
> > >
> > > > S
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > *From:* ghc-devs [mailto:[hidden email]] *On Behalf Of
> > *Michael
> > > > Sloan
> > > > *Sent:* 02 June 2016 02:07
> > > > *To:* Bartosz Nitka <[hidden email]>
> > > > *Cc:* ghc-devs Devs <[hidden email]>
> > > > *Subject:* Re: Template Haskell determinism
> > > >
> > > >
> > > >
> > > > +1 to solving this.  Not sure about the approach, but assuming the
> > > > following concerns are addressed, I'm (+1) on it too:
> > > >
> > > >
> > > >
> > > > This solution is clever!  However, I think there is some difficulty to
> > > > determining this ordering key.  Namely, what happens when I construct
> > the
> > > > (Set Name) using results from multiple reifies?
> > > >
> > > >
> > > >
> > > > One solution is to have the ordering key be a consecutive supply that's
> > > > initialized on a per-module basis.  There is still an issue there,
> > though,
> > > > which is that you might store one of these names in a global IORef
> > that's
> > > > used by a later TH splice.  Or, similarly, serialize the names to a
> > file
> > > > and later load them.  At least in those cases you need to use 'runIO'
> > to
> > > > break determinism.
> > > >
> > > >
> > > >
> > > > If names get different ordering keys when reified from different
> > modules
> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
> > end up
> > > > with an unpleasant circumstance where these do not compare as equal.
> > How
> > > > about having the Eq instance ignore the ordering key?  I think that
> > mostly
> > > > resolves this concern.  This implies that the Ord instance should also
> > > > yield EQ and ignore the ordering key, when the unique key matches.
> > > >
> > > >
> > > >
> > > > One issue with this is that switching the order of reify could
> > > > unexpectedly vary the behavior.
> > > >
> > > >
> > > >
> > > > Does the map in TcGblEnv imply that a reify from a later module will
> > get
> > > > the same ordering key?  So does this mean that the keys used in a given
> > > > reify depend on which things have already been reified?  In that case,
> > then
> > > > this is also an issue with your solution.  Now, it's not a big problem
> > at
> > > > all, just surprising to the user.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > If the internal API for Name does change, may as well address
> > > > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
> > > > suggested solution of having both the traditional package identifier
> > and
> > > > package keys in 'Name'.
> > > >
> > > >
> > > >
> > > > -Michael
> > > >
> > > >
> > > >
> > > > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]>
> > wrote:
> > > >
> > > > Template Haskell with its ability to do arbitrary IO is
> > non-deterministic
> > > > by
> > > >
> > > > design. You could for example embed the current date in a file. There
> > is
> > > >
> > > > however one kind of non-deterministic behavior that you can trigger
> > > >
> > > > accidentally. It has to do with how Names are reified. If you take a
> > look
> > > > at
> > > >
> > > > the definition of reifyName you can see that it puts the assigned
> > Unique
> > > > in a
> > > >
> > > > NameU:
> > > >
> > > >
> > > >
> > > >   reifyName :: NamedThing n => n -> TH.Name
> > > >
> > > >   reifyName thing
> > > >
> > > >     | isExternalName name = mk_varg pkg_str mod_str occ_str
> > > >
> > > >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique
> > name))
> > > >
> > > >     ...
> > > >
> > > > NameFlavour which NameU is a constructor of has a default Ord instance,
> > > > meaning
> > > >
> > > > that it ends up comparing the Uniques. The relative ordering of
> > Uniques is
> > > > not
> > > >
> > > > guaranteed to be stable across recompilations [1], so this can lead to
> > > >
> > > > ABI-incompatible binaries.
> > > >
> > > >
> > > >
> > > > This isn't an abstract problem and it actually happens in practice. The
> > > >
> > > > microlens package keeps Names in a Set and later turns that set into a
> > > > list.
> > > >
> > > > The results have different orders of TyVars resulting in different ABI
> > > > hashes
> > > >
> > > > and can potentially be optimized differently.
> > > >
> > > >
> > > >
> > > > I believe it's worth to handle this case in a deterministic way and I
> > have
> > > > a
> > > >
> > > > solution in mind. The idea is to extend NameU (and potentially NameL)
> > with
> > > > an
> > > >
> > > > ordering key. To be more concrete:
> > > >
> > > >
> > > >
> > > > -   | NameU !Int
> > > >
> > > > +   | NameU !Int !Int
> > > >
> > > >
> > > >
> > > > This way the Ord instance can use a stable key and the problem reduces
> > to
> > > >
> > > > ensuring the keys are stable. To generate stable keys we can use the
> > fact
> > > > that
> > > >
> > > > reify traverses the expressions in the same order every time and
> > > > sequentially
> > > >
> > > > allocate new keys based on traversal order. The way I have it
> > implemented
> > > > now
> > > >
> > > > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
> > > >
> > > >
> > > >
> > > > +        tcg_th_names :: TcRef (UniqFM Int, Int),
> > > >
> > > >
> > > >
> > > > Then the reifyName and qNewName do the necessary bookkeeping and
> > translate
> > > > the
> > > >
> > > > Uniques on the fly.
> > > >
> > > >
> > > >
> > > > This is a breaking change and it doesn't fix the problem that
> > NameFlavour
> > > > is
> > > >
> > > > not abstract and leaks the Uniques. It would break at least:
> > > >
> > > >
> > > >
> > > > - singletons
> > > >
> > > > - th-lift
> > > >
> > > > - haskell-src-meta
> > > >
> > > > - shakespeare
> > > >
> > > > - distributed-closure
> > > >
> > > >
> > > >
> > > > I'd like to get feedback if this is an acceptable solution and if the
> > > > problem
> > > >
> > > > is worth solving.
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Bartosz
> > > >
> > > >
> > > >
> > > > [1]
> > > >
> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
> > > >
> > > >
> > > > _______________________________________________
> > > > ghc-devs mailing list
> > > > [hidden email]
> > > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> > > > <
> > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d
> > >
> > > >
> > > >
> > > >
> >
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Michael Sloan
No, NameU and NameL both lack package key / package id.

-Michael

On Wed, Jun 29, 2016 at 7:34 AM, Edward Z. Yang <[hidden email]> wrote:

> No, nameBase is not the right thing to use here; you also need the
> unit ID (in GHC 8.0 parlance; package key in GHC 7.10; package id
> in GHC 7.8 and before).  If you have that information, then
> GHC establishes an invariant that if two names compare stably equal,
> then the uniques associated with them are the same.
>
> Edward
>
> Excerpts from Michael Sloan's message of 2016-06-10 17:16:44 -0400:
>> Hey, sorry for not getting back to this sooner!
>>
>> Perhaps I should have added the following to my list of goals in contention:
>>
>> (3) (==) shouldn't yield True for Names that have different unique ids.
>>
>> We can only have stable comparisons if goal (3) isn't met, and two
>> different unique Names would be considered to be equivalent based on the
>> nameBase.  This is because Ord is a total order, not a partial order.  As
>> described in my prior email, PartialOrd could be added, but it'd be
>> inconvenient to use with existing Ord based containers.
>>
>> -Michael
>>
>> On Sun, Jun 5, 2016 at 10:15 AM, Edward Z. Yang <[hidden email]> wrote:
>>
>> > I must admit, I am a bit confused by this discussion.
>> >
>> > It is true that every Name is associated with a Unique.  But you don't
>> > need the Unique to equality/ordering tests; the names also contain
>> > enough (stable) information for stable comparisons of that sort.  So
>> > why don't we expose that instead of the Unique?
>> >
>> > Edward
>> >
>> > Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:
>> > > On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <
>> > [hidden email]>
>> > > wrote:
>> > >
>> > > > If names get different ordering keys when reified from different
>> > modules
>> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
>> > end up
>> > > > with an unpleasant circumstance where these do not compare as equal
>> > > >
>> > > >
>> > > >
>> > > > The I believe that global, top level names (NameG) are not subject to
>> > this
>> > > > ordering stuff, so I don’t think this problem can occur.
>> > > >
>> > >
>> > > True, top level names are NameG.  The reified Info for a top level Dec
>> > may
>> > > include NameU, though.  For example, the type variables in 'Maybe' are
>> > > NameU:
>> > >
>> > > $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
>> > >      lift (show nf))
>> > >
>> > > The resulting expression is something like "NameU 822083586"
>> > >
>> > > > This is a breaking change and it doesn't fix the problem that
>> > NameFlavour
>> > > > is
>> > > >
>> > > > not abstract and leaks the Uniques. It would break at least:
>> > > >
>> > > >
>> > > >
>> > > > But why is NameU exposed to clients?   GHC needs to know, but clients
>> > > > don’t.  What use are these packages making of it?
>> > > >
>> > >
>> > > It's being leaked in the public inteface via Ord.  The Eq instance is
>> > fine,
>> > > because these are Uniques, so the results should be consistent.
>> > >
>> > > There are two goals in contention here:
>> > >
>> > > 1) Having some ordering on Names so that they can be used in Map or Set
>> > > 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
>> > > really handle these well.  In that case, the ordering would be based on
>> > > everything but the NameU int, but 'Eq' would still follow it
>> > >
>> > > A few ideas for different approaches to resolving this:
>> > >
>> > > 1) Document it.  Less appealing than fixing it in the API, but still
>> > would
>> > > be good.
>> > >
>> > > 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
>> > > newtype (partial ord on the non-unique info), or 'UnstableNameOrd'
>> > newtype
>> > > (current behavior).  A trickyness of this approach is that you'd need
>> > > containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
>> > > people are using the 'Ord' instance with 'Name's that are not 'NameU', so
>> > > this would break a lot of code that was already deterministic.
>> > >
>> > > 3) Some approaches like this ordering key, but I'm not sure how it will
>> > > help when comparing NameUs from different modules?
>> > >
>> > > > S
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > *From:* ghc-devs [mailto:[hidden email]] *On Behalf Of
>> > *Michael
>> > > > Sloan
>> > > > *Sent:* 02 June 2016 02:07
>> > > > *To:* Bartosz Nitka <[hidden email]>
>> > > > *Cc:* ghc-devs Devs <[hidden email]>
>> > > > *Subject:* Re: Template Haskell determinism
>> > > >
>> > > >
>> > > >
>> > > > +1 to solving this.  Not sure about the approach, but assuming the
>> > > > following concerns are addressed, I'm (+1) on it too:
>> > > >
>> > > >
>> > > >
>> > > > This solution is clever!  However, I think there is some difficulty to
>> > > > determining this ordering key.  Namely, what happens when I construct
>> > the
>> > > > (Set Name) using results from multiple reifies?
>> > > >
>> > > >
>> > > >
>> > > > One solution is to have the ordering key be a consecutive supply that's
>> > > > initialized on a per-module basis.  There is still an issue there,
>> > though,
>> > > > which is that you might store one of these names in a global IORef
>> > that's
>> > > > used by a later TH splice.  Or, similarly, serialize the names to a
>> > file
>> > > > and later load them.  At least in those cases you need to use 'runIO'
>> > to
>> > > > break determinism.
>> > > >
>> > > >
>> > > >
>> > > > If names get different ordering keys when reified from different
>> > modules
>> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
>> > end up
>> > > > with an unpleasant circumstance where these do not compare as equal.
>> > How
>> > > > about having the Eq instance ignore the ordering key?  I think that
>> > mostly
>> > > > resolves this concern.  This implies that the Ord instance should also
>> > > > yield EQ and ignore the ordering key, when the unique key matches.
>> > > >
>> > > >
>> > > >
>> > > > One issue with this is that switching the order of reify could
>> > > > unexpectedly vary the behavior.
>> > > >
>> > > >
>> > > >
>> > > > Does the map in TcGblEnv imply that a reify from a later module will
>> > get
>> > > > the same ordering key?  So does this mean that the keys used in a given
>> > > > reify depend on which things have already been reified?  In that case,
>> > then
>> > > > this is also an issue with your solution.  Now, it's not a big problem
>> > at
>> > > > all, just surprising to the user.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > If the internal API for Name does change, may as well address
>> > > > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
>> > > > suggested solution of having both the traditional package identifier
>> > and
>> > > > package keys in 'Name'.
>> > > >
>> > > >
>> > > >
>> > > > -Michael
>> > > >
>> > > >
>> > > >
>> > > > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]>
>> > wrote:
>> > > >
>> > > > Template Haskell with its ability to do arbitrary IO is
>> > non-deterministic
>> > > > by
>> > > >
>> > > > design. You could for example embed the current date in a file. There
>> > is
>> > > >
>> > > > however one kind of non-deterministic behavior that you can trigger
>> > > >
>> > > > accidentally. It has to do with how Names are reified. If you take a
>> > look
>> > > > at
>> > > >
>> > > > the definition of reifyName you can see that it puts the assigned
>> > Unique
>> > > > in a
>> > > >
>> > > > NameU:
>> > > >
>> > > >
>> > > >
>> > > >   reifyName :: NamedThing n => n -> TH.Name
>> > > >
>> > > >   reifyName thing
>> > > >
>> > > >     | isExternalName name = mk_varg pkg_str mod_str occ_str
>> > > >
>> > > >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique
>> > name))
>> > > >
>> > > >     ...
>> > > >
>> > > > NameFlavour which NameU is a constructor of has a default Ord instance,
>> > > > meaning
>> > > >
>> > > > that it ends up comparing the Uniques. The relative ordering of
>> > Uniques is
>> > > > not
>> > > >
>> > > > guaranteed to be stable across recompilations [1], so this can lead to
>> > > >
>> > > > ABI-incompatible binaries.
>> > > >
>> > > >
>> > > >
>> > > > This isn't an abstract problem and it actually happens in practice. The
>> > > >
>> > > > microlens package keeps Names in a Set and later turns that set into a
>> > > > list.
>> > > >
>> > > > The results have different orders of TyVars resulting in different ABI
>> > > > hashes
>> > > >
>> > > > and can potentially be optimized differently.
>> > > >
>> > > >
>> > > >
>> > > > I believe it's worth to handle this case in a deterministic way and I
>> > have
>> > > > a
>> > > >
>> > > > solution in mind. The idea is to extend NameU (and potentially NameL)
>> > with
>> > > > an
>> > > >
>> > > > ordering key. To be more concrete:
>> > > >
>> > > >
>> > > >
>> > > > -   | NameU !Int
>> > > >
>> > > > +   | NameU !Int !Int
>> > > >
>> > > >
>> > > >
>> > > > This way the Ord instance can use a stable key and the problem reduces
>> > to
>> > > >
>> > > > ensuring the keys are stable. To generate stable keys we can use the
>> > fact
>> > > > that
>> > > >
>> > > > reify traverses the expressions in the same order every time and
>> > > > sequentially
>> > > >
>> > > > allocate new keys based on traversal order. The way I have it
>> > implemented
>> > > > now
>> > > >
>> > > > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
>> > > >
>> > > >
>> > > >
>> > > > +        tcg_th_names :: TcRef (UniqFM Int, Int),
>> > > >
>> > > >
>> > > >
>> > > > Then the reifyName and qNewName do the necessary bookkeeping and
>> > translate
>> > > > the
>> > > >
>> > > > Uniques on the fly.
>> > > >
>> > > >
>> > > >
>> > > > This is a breaking change and it doesn't fix the problem that
>> > NameFlavour
>> > > > is
>> > > >
>> > > > not abstract and leaks the Uniques. It would break at least:
>> > > >
>> > > >
>> > > >
>> > > > - singletons
>> > > >
>> > > > - th-lift
>> > > >
>> > > > - haskell-src-meta
>> > > >
>> > > > - shakespeare
>> > > >
>> > > > - distributed-closure
>> > > >
>> > > >
>> > > >
>> > > > I'd like to get feedback if this is an acceptable solution and if the
>> > > > problem
>> > > >
>> > > > is worth solving.
>> > > >
>> > > >
>> > > >
>> > > > Cheers,
>> > > >
>> > > > Bartosz
>> > > >
>> > > >
>> > > >
>> > > > [1]
>> > > >
>> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
>> > > >
>> > > >
>> > > > _______________________________________________
>> > > > ghc-devs mailing list
>> > > > [hidden email]
>> > > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>> > > > <
>> > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d
>> > >
>> > > >
>> > > >
>> > > >
>> >
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Michael Sloan
Also, revisiting this issue, I don't think it is worth solving, just
worth documenting.

Why?  Because TH already lets you do lots of incorrect things.  TH
already allows you to shoot yourself in the foot all over the place,
and that's ok.  I'd much rather it be a dangerous power tool than a
weak safe tool.

When writing TH, you are writing something that's part of the
compiler, and the code can often resemble the sort of involved in the
middle bits of a compiler.  For the cases where stable Ord on Name
matters even for local names, you're definitely writing something
that's a fairly fancy, nearly compiler-like transformation.  Concerns
like the stability of Names should be handled by the TH user.

-Michael

On Wed, Jun 29, 2016 at 10:41 AM, Michael Sloan <[hidden email]> wrote:

> No, NameU and NameL both lack package key / package id.
>
> -Michael
>
> On Wed, Jun 29, 2016 at 7:34 AM, Edward Z. Yang <[hidden email]> wrote:
>> No, nameBase is not the right thing to use here; you also need the
>> unit ID (in GHC 8.0 parlance; package key in GHC 7.10; package id
>> in GHC 7.8 and before).  If you have that information, then
>> GHC establishes an invariant that if two names compare stably equal,
>> then the uniques associated with them are the same.
>>
>> Edward
>>
>> Excerpts from Michael Sloan's message of 2016-06-10 17:16:44 -0400:
>>> Hey, sorry for not getting back to this sooner!
>>>
>>> Perhaps I should have added the following to my list of goals in contention:
>>>
>>> (3) (==) shouldn't yield True for Names that have different unique ids.
>>>
>>> We can only have stable comparisons if goal (3) isn't met, and two
>>> different unique Names would be considered to be equivalent based on the
>>> nameBase.  This is because Ord is a total order, not a partial order.  As
>>> described in my prior email, PartialOrd could be added, but it'd be
>>> inconvenient to use with existing Ord based containers.
>>>
>>> -Michael
>>>
>>> On Sun, Jun 5, 2016 at 10:15 AM, Edward Z. Yang <[hidden email]> wrote:
>>>
>>> > I must admit, I am a bit confused by this discussion.
>>> >
>>> > It is true that every Name is associated with a Unique.  But you don't
>>> > need the Unique to equality/ordering tests; the names also contain
>>> > enough (stable) information for stable comparisons of that sort.  So
>>> > why don't we expose that instead of the Unique?
>>> >
>>> > Edward
>>> >
>>> > Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:
>>> > > On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <
>>> > [hidden email]>
>>> > > wrote:
>>> > >
>>> > > > If names get different ordering keys when reified from different
>>> > modules
>>> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
>>> > end up
>>> > > > with an unpleasant circumstance where these do not compare as equal
>>> > > >
>>> > > >
>>> > > >
>>> > > > The I believe that global, top level names (NameG) are not subject to
>>> > this
>>> > > > ordering stuff, so I don’t think this problem can occur.
>>> > > >
>>> > >
>>> > > True, top level names are NameG.  The reified Info for a top level Dec
>>> > may
>>> > > include NameU, though.  For example, the type variables in 'Maybe' are
>>> > > NameU:
>>> > >
>>> > > $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
>>> > >      lift (show nf))
>>> > >
>>> > > The resulting expression is something like "NameU 822083586"
>>> > >
>>> > > > This is a breaking change and it doesn't fix the problem that
>>> > NameFlavour
>>> > > > is
>>> > > >
>>> > > > not abstract and leaks the Uniques. It would break at least:
>>> > > >
>>> > > >
>>> > > >
>>> > > > But why is NameU exposed to clients?   GHC needs to know, but clients
>>> > > > don’t.  What use are these packages making of it?
>>> > > >
>>> > >
>>> > > It's being leaked in the public inteface via Ord.  The Eq instance is
>>> > fine,
>>> > > because these are Uniques, so the results should be consistent.
>>> > >
>>> > > There are two goals in contention here:
>>> > >
>>> > > 1) Having some ordering on Names so that they can be used in Map or Set
>>> > > 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
>>> > > really handle these well.  In that case, the ordering would be based on
>>> > > everything but the NameU int, but 'Eq' would still follow it
>>> > >
>>> > > A few ideas for different approaches to resolving this:
>>> > >
>>> > > 1) Document it.  Less appealing than fixing it in the API, but still
>>> > would
>>> > > be good.
>>> > >
>>> > > 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
>>> > > newtype (partial ord on the non-unique info), or 'UnstableNameOrd'
>>> > newtype
>>> > > (current behavior).  A trickyness of this approach is that you'd need
>>> > > containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
>>> > > people are using the 'Ord' instance with 'Name's that are not 'NameU', so
>>> > > this would break a lot of code that was already deterministic.
>>> > >
>>> > > 3) Some approaches like this ordering key, but I'm not sure how it will
>>> > > help when comparing NameUs from different modules?
>>> > >
>>> > > > S
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > > *From:* ghc-devs [mailto:[hidden email]] *On Behalf Of
>>> > *Michael
>>> > > > Sloan
>>> > > > *Sent:* 02 June 2016 02:07
>>> > > > *To:* Bartosz Nitka <[hidden email]>
>>> > > > *Cc:* ghc-devs Devs <[hidden email]>
>>> > > > *Subject:* Re: Template Haskell determinism
>>> > > >
>>> > > >
>>> > > >
>>> > > > +1 to solving this.  Not sure about the approach, but assuming the
>>> > > > following concerns are addressed, I'm (+1) on it too:
>>> > > >
>>> > > >
>>> > > >
>>> > > > This solution is clever!  However, I think there is some difficulty to
>>> > > > determining this ordering key.  Namely, what happens when I construct
>>> > the
>>> > > > (Set Name) using results from multiple reifies?
>>> > > >
>>> > > >
>>> > > >
>>> > > > One solution is to have the ordering key be a consecutive supply that's
>>> > > > initialized on a per-module basis.  There is still an issue there,
>>> > though,
>>> > > > which is that you might store one of these names in a global IORef
>>> > that's
>>> > > > used by a later TH splice.  Or, similarly, serialize the names to a
>>> > file
>>> > > > and later load them.  At least in those cases you need to use 'runIO'
>>> > to
>>> > > > break determinism.
>>> > > >
>>> > > >
>>> > > >
>>> > > > If names get different ordering keys when reified from different
>>> > modules
>>> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
>>> > end up
>>> > > > with an unpleasant circumstance where these do not compare as equal.
>>> > How
>>> > > > about having the Eq instance ignore the ordering key?  I think that
>>> > mostly
>>> > > > resolves this concern.  This implies that the Ord instance should also
>>> > > > yield EQ and ignore the ordering key, when the unique key matches.
>>> > > >
>>> > > >
>>> > > >
>>> > > > One issue with this is that switching the order of reify could
>>> > > > unexpectedly vary the behavior.
>>> > > >
>>> > > >
>>> > > >
>>> > > > Does the map in TcGblEnv imply that a reify from a later module will
>>> > get
>>> > > > the same ordering key?  So does this mean that the keys used in a given
>>> > > > reify depend on which things have already been reified?  In that case,
>>> > then
>>> > > > this is also an issue with your solution.  Now, it's not a big problem
>>> > at
>>> > > > all, just surprising to the user.
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > > If the internal API for Name does change, may as well address
>>> > > > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
>>> > > > suggested solution of having both the traditional package identifier
>>> > and
>>> > > > package keys in 'Name'.
>>> > > >
>>> > > >
>>> > > >
>>> > > > -Michael
>>> > > >
>>> > > >
>>> > > >
>>> > > > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]>
>>> > wrote:
>>> > > >
>>> > > > Template Haskell with its ability to do arbitrary IO is
>>> > non-deterministic
>>> > > > by
>>> > > >
>>> > > > design. You could for example embed the current date in a file. There
>>> > is
>>> > > >
>>> > > > however one kind of non-deterministic behavior that you can trigger
>>> > > >
>>> > > > accidentally. It has to do with how Names are reified. If you take a
>>> > look
>>> > > > at
>>> > > >
>>> > > > the definition of reifyName you can see that it puts the assigned
>>> > Unique
>>> > > > in a
>>> > > >
>>> > > > NameU:
>>> > > >
>>> > > >
>>> > > >
>>> > > >   reifyName :: NamedThing n => n -> TH.Name
>>> > > >
>>> > > >   reifyName thing
>>> > > >
>>> > > >     | isExternalName name = mk_varg pkg_str mod_str occ_str
>>> > > >
>>> > > >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique
>>> > name))
>>> > > >
>>> > > >     ...
>>> > > >
>>> > > > NameFlavour which NameU is a constructor of has a default Ord instance,
>>> > > > meaning
>>> > > >
>>> > > > that it ends up comparing the Uniques. The relative ordering of
>>> > Uniques is
>>> > > > not
>>> > > >
>>> > > > guaranteed to be stable across recompilations [1], so this can lead to
>>> > > >
>>> > > > ABI-incompatible binaries.
>>> > > >
>>> > > >
>>> > > >
>>> > > > This isn't an abstract problem and it actually happens in practice. The
>>> > > >
>>> > > > microlens package keeps Names in a Set and later turns that set into a
>>> > > > list.
>>> > > >
>>> > > > The results have different orders of TyVars resulting in different ABI
>>> > > > hashes
>>> > > >
>>> > > > and can potentially be optimized differently.
>>> > > >
>>> > > >
>>> > > >
>>> > > > I believe it's worth to handle this case in a deterministic way and I
>>> > have
>>> > > > a
>>> > > >
>>> > > > solution in mind. The idea is to extend NameU (and potentially NameL)
>>> > with
>>> > > > an
>>> > > >
>>> > > > ordering key. To be more concrete:
>>> > > >
>>> > > >
>>> > > >
>>> > > > -   | NameU !Int
>>> > > >
>>> > > > +   | NameU !Int !Int
>>> > > >
>>> > > >
>>> > > >
>>> > > > This way the Ord instance can use a stable key and the problem reduces
>>> > to
>>> > > >
>>> > > > ensuring the keys are stable. To generate stable keys we can use the
>>> > fact
>>> > > > that
>>> > > >
>>> > > > reify traverses the expressions in the same order every time and
>>> > > > sequentially
>>> > > >
>>> > > > allocate new keys based on traversal order. The way I have it
>>> > implemented
>>> > > > now
>>> > > >
>>> > > > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
>>> > > >
>>> > > >
>>> > > >
>>> > > > +        tcg_th_names :: TcRef (UniqFM Int, Int),
>>> > > >
>>> > > >
>>> > > >
>>> > > > Then the reifyName and qNewName do the necessary bookkeeping and
>>> > translate
>>> > > > the
>>> > > >
>>> > > > Uniques on the fly.
>>> > > >
>>> > > >
>>> > > >
>>> > > > This is a breaking change and it doesn't fix the problem that
>>> > NameFlavour
>>> > > > is
>>> > > >
>>> > > > not abstract and leaks the Uniques. It would break at least:
>>> > > >
>>> > > >
>>> > > >
>>> > > > - singletons
>>> > > >
>>> > > > - th-lift
>>> > > >
>>> > > > - haskell-src-meta
>>> > > >
>>> > > > - shakespeare
>>> > > >
>>> > > > - distributed-closure
>>> > > >
>>> > > >
>>> > > >
>>> > > > I'd like to get feedback if this is an acceptable solution and if the
>>> > > > problem
>>> > > >
>>> > > > is worth solving.
>>> > > >
>>> > > >
>>> > > >
>>> > > > Cheers,
>>> > > >
>>> > > > Bartosz
>>> > > >
>>> > > >
>>> > > >
>>> > > > [1]
>>> > > >
>>> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
>>> > > >
>>> > > >
>>> > > > _______________________________________________
>>> > > > ghc-devs mailing list
>>> > > > [hidden email]
>>> > > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>> > > > <
>>> > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d
>>> > >
>>> > > >
>>> > > >
>>> > > >
>>> >
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Template Haskell determinism

Edward Z. Yang
In reply to this post by Michael Sloan
Oh drat, that's right, local names don't get given a package
key / package id, and externally visible local names aren't given a
deterministic name until we tidy (which is too late to help
Template Haskell.)  So I suppose there is not much we can do here.

Edward

Excerpts from Michael Sloan's message of 2016-06-29 13:41:13 -0400:

> No, NameU and NameL both lack package key / package id.
>
> -Michael
>
> On Wed, Jun 29, 2016 at 7:34 AM, Edward Z. Yang <[hidden email]> wrote:
> > No, nameBase is not the right thing to use here; you also need the
> > unit ID (in GHC 8.0 parlance; package key in GHC 7.10; package id
> > in GHC 7.8 and before).  If you have that information, then
> > GHC establishes an invariant that if two names compare stably equal,
> > then the uniques associated with them are the same.
> >
> > Edward
> >
> > Excerpts from Michael Sloan's message of 2016-06-10 17:16:44 -0400:
> >> Hey, sorry for not getting back to this sooner!
> >>
> >> Perhaps I should have added the following to my list of goals in contention:
> >>
> >> (3) (==) shouldn't yield True for Names that have different unique ids.
> >>
> >> We can only have stable comparisons if goal (3) isn't met, and two
> >> different unique Names would be considered to be equivalent based on the
> >> nameBase.  This is because Ord is a total order, not a partial order.  As
> >> described in my prior email, PartialOrd could be added, but it'd be
> >> inconvenient to use with existing Ord based containers.
> >>
> >> -Michael
> >>
> >> On Sun, Jun 5, 2016 at 10:15 AM, Edward Z. Yang <[hidden email]> wrote:
> >>
> >> > I must admit, I am a bit confused by this discussion.
> >> >
> >> > It is true that every Name is associated with a Unique.  But you don't
> >> > need the Unique to equality/ordering tests; the names also contain
> >> > enough (stable) information for stable comparisons of that sort.  So
> >> > why don't we expose that instead of the Unique?
> >> >
> >> > Edward
> >> >
> >> > Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:
> >> > > On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <
> >> > [hidden email]>
> >> > > wrote:
> >> > >
> >> > > > If names get different ordering keys when reified from different
> >> > modules
> >> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
> >> > end up
> >> > > > with an unpleasant circumstance where these do not compare as equal
> >> > > >
> >> > > >
> >> > > >
> >> > > > The I believe that global, top level names (NameG) are not subject to
> >> > this
> >> > > > ordering stuff, so I don’t think this problem can occur.
> >> > > >
> >> > >
> >> > > True, top level names are NameG.  The reified Info for a top level Dec
> >> > may
> >> > > include NameU, though.  For example, the type variables in 'Maybe' are
> >> > > NameU:
> >> > >
> >> > > $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
> >> > >      lift (show nf))
> >> > >
> >> > > The resulting expression is something like "NameU 822083586"
> >> > >
> >> > > > This is a breaking change and it doesn't fix the problem that
> >> > NameFlavour
> >> > > > is
> >> > > >
> >> > > > not abstract and leaks the Uniques. It would break at least:
> >> > > >
> >> > > >
> >> > > >
> >> > > > But why is NameU exposed to clients?   GHC needs to know, but clients
> >> > > > don’t.  What use are these packages making of it?
> >> > > >
> >> > >
> >> > > It's being leaked in the public inteface via Ord.  The Eq instance is
> >> > fine,
> >> > > because these are Uniques, so the results should be consistent.
> >> > >
> >> > > There are two goals in contention here:
> >> > >
> >> > > 1) Having some ordering on Names so that they can be used in Map or Set
> >> > > 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
> >> > > really handle these well.  In that case, the ordering would be based on
> >> > > everything but the NameU int, but 'Eq' would still follow it
> >> > >
> >> > > A few ideas for different approaches to resolving this:
> >> > >
> >> > > 1) Document it.  Less appealing than fixing it in the API, but still
> >> > would
> >> > > be good.
> >> > >
> >> > > 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
> >> > > newtype (partial ord on the non-unique info), or 'UnstableNameOrd'
> >> > newtype
> >> > > (current behavior).  A trickyness of this approach is that you'd need
> >> > > containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
> >> > > people are using the 'Ord' instance with 'Name's that are not 'NameU', so
> >> > > this would break a lot of code that was already deterministic.
> >> > >
> >> > > 3) Some approaches like this ordering key, but I'm not sure how it will
> >> > > help when comparing NameUs from different modules?
> >> > >
> >> > > > S
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > *From:* ghc-devs [mailto:[hidden email]] *On Behalf Of
> >> > *Michael
> >> > > > Sloan
> >> > > > *Sent:* 02 June 2016 02:07
> >> > > > *To:* Bartosz Nitka <[hidden email]>
> >> > > > *Cc:* ghc-devs Devs <[hidden email]>
> >> > > > *Subject:* Re: Template Haskell determinism
> >> > > >
> >> > > >
> >> > > >
> >> > > > +1 to solving this.  Not sure about the approach, but assuming the
> >> > > > following concerns are addressed, I'm (+1) on it too:
> >> > > >
> >> > > >
> >> > > >
> >> > > > This solution is clever!  However, I think there is some difficulty to
> >> > > > determining this ordering key.  Namely, what happens when I construct
> >> > the
> >> > > > (Set Name) using results from multiple reifies?
> >> > > >
> >> > > >
> >> > > >
> >> > > > One solution is to have the ordering key be a consecutive supply that's
> >> > > > initialized on a per-module basis.  There is still an issue there,
> >> > though,
> >> > > > which is that you might store one of these names in a global IORef
> >> > that's
> >> > > > used by a later TH splice.  Or, similarly, serialize the names to a
> >> > file
> >> > > > and later load them.  At least in those cases you need to use 'runIO'
> >> > to
> >> > > > break determinism.
> >> > > >
> >> > > >
> >> > > >
> >> > > > If names get different ordering keys when reified from different
> >> > modules
> >> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
> >> > end up
> >> > > > with an unpleasant circumstance where these do not compare as equal.
> >> > How
> >> > > > about having the Eq instance ignore the ordering key?  I think that
> >> > mostly
> >> > > > resolves this concern.  This implies that the Ord instance should also
> >> > > > yield EQ and ignore the ordering key, when the unique key matches.
> >> > > >
> >> > > >
> >> > > >
> >> > > > One issue with this is that switching the order of reify could
> >> > > > unexpectedly vary the behavior.
> >> > > >
> >> > > >
> >> > > >
> >> > > > Does the map in TcGblEnv imply that a reify from a later module will
> >> > get
> >> > > > the same ordering key?  So does this mean that the keys used in a given
> >> > > > reify depend on which things have already been reified?  In that case,
> >> > then
> >> > > > this is also an issue with your solution.  Now, it's not a big problem
> >> > at
> >> > > > all, just surprising to the user.
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > If the internal API for Name does change, may as well address
> >> > > > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
> >> > > > suggested solution of having both the traditional package identifier
> >> > and
> >> > > > package keys in 'Name'.
> >> > > >
> >> > > >
> >> > > >
> >> > > > -Michael
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <[hidden email]>
> >> > wrote:
> >> > > >
> >> > > > Template Haskell with its ability to do arbitrary IO is
> >> > non-deterministic
> >> > > > by
> >> > > >
> >> > > > design. You could for example embed the current date in a file. There
> >> > is
> >> > > >
> >> > > > however one kind of non-deterministic behavior that you can trigger
> >> > > >
> >> > > > accidentally. It has to do with how Names are reified. If you take a
> >> > look
> >> > > > at
> >> > > >
> >> > > > the definition of reifyName you can see that it puts the assigned
> >> > Unique
> >> > > > in a
> >> > > >
> >> > > > NameU:
> >> > > >
> >> > > >
> >> > > >
> >> > > >   reifyName :: NamedThing n => n -> TH.Name
> >> > > >
> >> > > >   reifyName thing
> >> > > >
> >> > > >     | isExternalName name = mk_varg pkg_str mod_str occ_str
> >> > > >
> >> > > >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique
> >> > name))
> >> > > >
> >> > > >     ...
> >> > > >
> >> > > > NameFlavour which NameU is a constructor of has a default Ord instance,
> >> > > > meaning
> >> > > >
> >> > > > that it ends up comparing the Uniques. The relative ordering of
> >> > Uniques is
> >> > > > not
> >> > > >
> >> > > > guaranteed to be stable across recompilations [1], so this can lead to
> >> > > >
> >> > > > ABI-incompatible binaries.
> >> > > >
> >> > > >
> >> > > >
> >> > > > This isn't an abstract problem and it actually happens in practice. The
> >> > > >
> >> > > > microlens package keeps Names in a Set and later turns that set into a
> >> > > > list.
> >> > > >
> >> > > > The results have different orders of TyVars resulting in different ABI
> >> > > > hashes
> >> > > >
> >> > > > and can potentially be optimized differently.
> >> > > >
> >> > > >
> >> > > >
> >> > > > I believe it's worth to handle this case in a deterministic way and I
> >> > have
> >> > > > a
> >> > > >
> >> > > > solution in mind. The idea is to extend NameU (and potentially NameL)
> >> > with
> >> > > > an
> >> > > >
> >> > > > ordering key. To be more concrete:
> >> > > >
> >> > > >
> >> > > >
> >> > > > -   | NameU !Int
> >> > > >
> >> > > > +   | NameU !Int !Int
> >> > > >
> >> > > >
> >> > > >
> >> > > > This way the Ord instance can use a stable key and the problem reduces
> >> > to
> >> > > >
> >> > > > ensuring the keys are stable. To generate stable keys we can use the
> >> > fact
> >> > > > that
> >> > > >
> >> > > > reify traverses the expressions in the same order every time and
> >> > > > sequentially
> >> > > >
> >> > > > allocate new keys based on traversal order. The way I have it
> >> > implemented
> >> > > > now
> >> > > >
> >> > > > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
> >> > > >
> >> > > >
> >> > > >
> >> > > > +        tcg_th_names :: TcRef (UniqFM Int, Int),
> >> > > >
> >> > > >
> >> > > >
> >> > > > Then the reifyName and qNewName do the necessary bookkeeping and
> >> > translate
> >> > > > the
> >> > > >
> >> > > > Uniques on the fly.
> >> > > >
> >> > > >
> >> > > >
> >> > > > This is a breaking change and it doesn't fix the problem that
> >> > NameFlavour
> >> > > > is
> >> > > >
> >> > > > not abstract and leaks the Uniques. It would break at least:
> >> > > >
> >> > > >
> >> > > >
> >> > > > - singletons
> >> > > >
> >> > > > - th-lift
> >> > > >
> >> > > > - haskell-src-meta
> >> > > >
> >> > > > - shakespeare
> >> > > >
> >> > > > - distributed-closure
> >> > > >
> >> > > >
> >> > > >
> >> > > > I'd like to get feedback if this is an acceptable solution and if the
> >> > > > problem
> >> > > >
> >> > > > is worth solving.
> >> > > >
> >> > > >
> >> > > >
> >> > > > Cheers,
> >> > > >
> >> > > > Bartosz
> >> > > >
> >> > > >
> >> > > >
> >> > > > [1]
> >> > > >
> >> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
> >> > > >
> >> > > >
> >> > > > _______________________________________________
> >> > > > ghc-devs mailing list
> >> > > > [hidden email]
> >> > > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >> > > > <
> >> > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d
> >> > >
> >> > > >
> >> > > >
> >> > > >
> >> >
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs