Again: Uniques in GHC

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Dear all,


I *finally* had some time again to look at how to properly redo Uniques in GHC. A few things stood out to me:



- The export-list of Unique has some comments stating that function X is only exported for module Y, yet is used elsewhere. This may be because these comments do not show up in haddock etc. leading some people to think they're up for general use. In my refactoring, I'm sticking the restriction in the function name, so it's no longer mkUniqueGrimily, but rather mkUniqueOnlyForUniqSupply (making the name even longer should discourage their use more). If at all possible, these should be removed altogether asap.


- The implementation is based on FastInts, which, on most machines nowadays, is a 64-bit thing. The serialisation in BinIface is explicitly based on Word32s. Aside from the obvious potential (albeit with a low probability) for errors, this lead me to wonder about 32/64-bitness. Is there a reason for 64-bit versions of GHC to write Word32s, or is this a historic thing? Must the interface-files be bit-compatible between different versions (32/64-bits) of the compiler? Lastly, is the choice of whether "this" is a 32 or 64-bit version completely determined by WORD_SIZE_IN_BITS (MachDeps.h)?


- There are a few libraries that are explicitly dependent on GHC, yet have their own versions of Unique. So far, I've seen this for Hoopl and Template Haskell. They have unboxed ints to do the job. I have not researched whether they manipulate them in any way, or just store things. If the former; I would like to call to reconsider this, because it seems like a poor separation of concerns toe me. If the latter, I think refactoring them to use Unique instead of Int# should be straightforward.


?

The point of refactoring Unique is to no longer have low-level optimisations of manually unpacking (and repacking using the MkUnique constructor), which should serve as a lovely test of how far the optimisations have come. Furthermore, it seemed that the use of characters to encode the domain was somewhat awkward, but I refer anyone interested to earlier posts on this list.


Thoughts? Comments? Suggestions? Objections?


Regards,

Philip


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141006/03c65091/attachment.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Johan Tibell-2
On Mon, Oct 6, 2014 at 11:58 AM, <p.k.f.holzenspies at utwente.nl> wrote:

>  - The export-list of Unique has some comments stating that function X is
> only exported for module Y, yet is used elsewhere. This may be because
> these comments do not show up in haddock etc. leading some people to think
> they're up for general use. In my refactoring, I'm sticking the restriction
> in the function name, so it's no longer mkUniqueGrimily, but rather
> mkUniqueOnlyForUniqSupply (making the name even longer should discourage
> their use more). If at all possible, these should be removed altogether
> asap.
>

Since you're touching this code base it would be a terrific time to add
some Haddocks! (We recently decided, on the ghc-devs@ list, that all new
top-level entities, i.e. functions, data types, and classes, should have
Haddocks.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141006/6cb27d3f/attachment.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Very much part of my plan, Johan! I was a fervent "+1" on that recommendation.

Ph.
?




________________________________
From: Johan Tibell <johan.tibell at gmail.com>
Sent: 06 October 2014 12:06
To: Holzenspies, P.K.F. (EWI)
Cc: ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC

On Mon, Oct 6, 2014 at 11:58 AM, <p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl>> wrote:

- The export-list of Unique has some comments stating that function X is only exported for module Y, yet is used elsewhere. This may be because these comments do not show up in haddock etc. leading some people to think they're up for general use. In my refactoring, I'm sticking the restriction in the function name, so it's no longer mkUniqueGrimily, but rather mkUniqueOnlyForUniqSupply (making the name even longer should discourage their use more). If at all possible, these should be removed altogether asap.

Since you're touching this code base it would be a terrific time to add some Haddocks! (We recently decided, on the ghc-devs@ list, that all new top-level entities, i.e. functions, data types, and classes, should have Haddocks.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141006/b8c9c531/attachment-0001.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Joachim Breitner-2
In reply to this post by p.k.f.holzenspies@utwente.nl
Hi,


Am Montag, den 06.10.2014, 09:58 +0000 schrieb
p.k.f.holzenspies at utwente.nl:

> - The implementation is based on FastInts, which, on most machines
> nowadays, is a 64-bit thing. The serialisation in BinIface is
> explicitly based on Word32s. Aside from the obvious potential (albeit
> with a low probability) for errors, this lead me to wonder about
> 32/64-bitness. Is there a reason for 64-bit versions of GHC to write
> Word32s, or is this a historic thing? Must the interface-files be
> bit-compatible between different versions (32/64-bits) of the
> compiler? Lastly, is the choice of whether "this" is a 32 or 64-bit
> version completely determined by WORD_SIZE_IN_BITS (MachDeps.h)?
>
A while ago we had problems with haddock in Debian when the
serialization became bit-dependent.? I suggest to keep the specification
of any on-disk format independent of architecture specifics.

Greetings,
Joachim

? http://bugs.debian.org/586723#15

--
Joachim ?nomeata? Breitner
  mail at joachim-breitner.de ? http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  ? GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141006/86165015/attachment.sig>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Dear Joachim,


Although I can't quite get what you're saying from the posts on that link, I'm not immediately sure what you're saying should extend to hi-files. These files are very much specific to the compiler version you're using, as in, new GHCs add stuff to them all the time and their binary format does not (seem to) provision for being able to skip unknown things (i.e. it doesn't say how large the next semantic block is in the hi-file).


If we're going to keep the formats the same for any architecture, we're going to have to limit 64-bit machines to 32-bit (actually 30-bits, another thing I don't quite understand in BinIface) Uniques. There seem to be possibilities to alleviate the issues with parallel generation of fresh Uniques in a parallel version of GHC. The idea is that, since 64-bits is more than we'll ever assign anyway, to use a few for thread-ids, so we would guarantee non-conflicting Uniques generated by different threads.


Anyway, maybe someone a tad more knowledgeable about Uniques could maybe tell me on what scale Uniques in the hi-files should be unique? Must they only be non-conflicting in a Module? In a Package? If I first compile a file with GHC and then, in a separate invocation of GHC, compile another, surely their hi-files will have some of the same Uniques for their own, different things? Where are these conflicts resolved when linking multiple independently compiled files? Are they ever?


Regards,

Philip




?

________________________________
From: Joachim Breitner <mail at joachim-breitner.de>
Sent: 06 October 2014 12:36
Subject: Re: Again: Uniques in GHC

<snip>

A while ago we had problems with haddock in Debian when the
serialization became bit-dependent.^1 I suggest to keep the specification
of any on-disk format independent of architecture specifics.

Greetings,
Joachim

^1 http://bugs.debian.org/586723#15


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141006/a826dac1/attachment-0001.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Joachim Breitner-2
Hi,

Am Montag, den 06.10.2014, 19:55 +0000 schrieb p.k.f.holzenspies at utwente.nl:
> Although I can't quite get what you're saying from the posts on that
> link, I'm not immediately sure what you're saying should extend to
> hi-files. These files are very much specific to the compiler version
> you're using, as in, new GHCs add stuff to them all the time and their
> binary format does not (seem to) provision for being able to skip
> unknown things (i.e. it doesn't say how large the next semantic block
> is in the hi-file).

Some of this may not be true, but to my knowledge (part of) that
interface reading code is (or was?) used by haddock when generating
its .haddock file.

> If we're going to keep the formats the same for any architecture,
> we're going to have to limit 64-bit machines to 32-bit (actually
> 30-bits, another thing I don't quite understand in BinIface) Uniques.

Why? You can just serialize Uniques always as 64 bit numbers, even on
32-bit systems. This way, the data format is the same across
architectures, with little cost.

>  There seem to be possibilities to alleviate the issues with parallel
> generation of fresh Uniques in a parallel version of GHC. The idea is
> that, since 64-bits is more than we'll ever assign anyway, to use a
> few for thread-ids, so we would guarantee non-conflicting Uniques
> generated by different threads.
>
But that would only work on 64 bit systems, right?

Greetings,
Joachim

--
Joachim ?nomeata? Breitner
  mail at joachim-breitner.de ? http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  ? GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141006/fdfc0343/attachment.sig>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Dear Joachim,

> Some of this may not be true, but to my knowledge (part of) that
> interface reading code is (or was?) used by haddock when generating
> its .haddock file.

Ah, well, I didn't know this. How does haddock use this code? If Haddock uses the GHC-API to do this; problem solved, because we're back at the specific compiler version that generated it. Otherwise... we may be in trouble.

> Why? You can just serialize Uniques always as 64 bit numbers, even on
> 32-bit systems. This way, the data format is the same across
> architectures, with little cost.

Ah, the cost is this; if we start relying on the 64-bitness for uniqueness (which may well happen; there are categories - currently characters - used only for four compile-time known Uniques, waisting 30 - 8 - 2 = 20 bits), this will break the 32-bit compilers. Arguably, their breakage should reject the change leading to the waisted Uniques. Seems a risk, though. Similarly to how currently Uniques are 64-bits, but serialised as 30. Alternatively, 32-bit GHC could use 64-bit Uniques, but that is going to give you quite the performance hit (speculating here).

> But that would only work on 64 bit systems, right?

Yes, this approach to a parallel GHC would only work on 64-bit machines. The idea is, I guess, that we're not going to see a massive demand for parallel GHC running on multi-core 32-bit systems. In other words; 32-bit systems wouldn't get a parallel GHC.

Regards,
Philip

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141007/630c59ae/attachment.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Isaac Dupree-3
On 10/07/2014 02:32 AM, p.k.f.holzenspies at utwente.nl wrote:
>> But that would only work on 64 bit systems, right?
>
> Yes, this approach to a parallel GHC would only work on 64-bit machines.
> The idea is, I guess, that we're not going to see a massive demand for
> parallel GHC running on multi-core 32-bit systems. In other words;
> 32-bit systems wouldn't get a parallel GHC.

On ARM, 32-bit quad-cores are common.  But maybe no one will care to run
GHC on (32-bit) ARM.


Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Austin Seipp-5
In reply to this post by p.k.f.holzenspies@utwente.nl
On Tue, Oct 7, 2014 at 1:32 AM,  <p.k.f.holzenspies at utwente.nl> wrote:
> Yes, this approach to a parallel GHC would only work on 64-bit machines. The
> idea is, I guess, that we're not going to see a massive demand for parallel
> GHC running on multi-core 32-bit systems. In other words; 32-bit systems
> wouldn't get a parallel GHC.

Let me make sure I'm understanding this correctly: in this particular
proposed solution, the side effect would be that we no longer have a
capable 32bit runtime which supports multicore parallelism?

Sorry, but I'm afraid this approach is pretty much unacceptable IMO,
for precisely the reason outlined in your last sentence. 32bit systems
are surprisingly commen. I have several multicore 32bit ARMv7 machines
on my desk right now, for example. And there are a lot more of those
floating around than you might think.

If that's the 'cure', I think I (and other users) would consider it
far worse than the disease.

> Regards,
> Philip
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs
>

--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Wait, wait, wait! I wasn't talking about a parallel *runtime*. Nothing changes there. All I'm talking about is something that is a very old issue that never got added / solved / resolved. Somewhere on the commentary, or the mailing list, I seem to recall that the generation of Uniques was the bottleneck for the parallelisation of GHC *Itself*. It's about having a compiler using multiple threads and says nothing about programs coming out of it.

I'm all with you on embedded processors and that kind of stuff, but I don't see a pressing need to compile *on* them. Isn't all ARM-stuff assuming cross-compilation?

Ph.


________________________________________
From: mad.one at gmail.com <mad.one at gmail.com> on behalf of Austin Seipp <austin at well-typed.com>
Sent: 07 October 2014 17:46
To: Holzenspies, P.K.F. (EWI)
Cc: ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC

On Tue, Oct 7, 2014 at 1:32 AM,  <p.k.f.holzenspies at utwente.nl> wrote:
> Yes, this approach to a parallel GHC would only work on 64-bit machines. The
> idea is, I guess, that we're not going to see a massive demand for parallel
> GHC running on multi-core 32-bit systems. In other words; 32-bit systems
> wouldn't get a parallel GHC.

Let me make sure I'm understanding this correctly: in this particular
proposed solution, the side effect would be that we no longer have a
capable 32bit runtime which supports multicore parallelism?

Sorry, but I'm afraid this approach is pretty much unacceptable IMO,
for precisely the reason outlined in your last sentence. 32bit systems
are surprisingly commen. I have several multicore 32bit ARMv7 machines
on my desk right now, for example. And there are a lot more of those
floating around than you might think.

If that's the 'cure', I think I (and other users) would consider it
far worse than the disease.

> Regards,
> Philip
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs
>

--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Austin Seipp-5
On Tue, Oct 7, 2014 at 11:26 AM,  <p.k.f.holzenspies at utwente.nl> wrote:
> Wait, wait, wait! I wasn't talking about a parallel *runtime*. Nothing changes there. All I'm talking about is something that is a very old issue that never got added / solved / resolved. Somewhere on the commentary, or the mailing list, I seem to recall that the generation of Uniques was the bottleneck for the parallelisation of GHC *Itself*. It's about having a compiler using multiple threads and says nothing about programs coming out of it.

OK, cool. Just making sure. :)

> I'm all with you on embedded processors and that kind of stuff, but I don't see a pressing need to compile *on* them. Isn't all ARM-stuff assuming cross-compilation?

No, not all ARM builds assume cross compilation. In fact, if you want
a fully working GHC, cross compilation is impossible: you cannot cross
compile GHCi, meaning you can't use Template Haskell (as well as some
of the linker features, I believe). However, I don't think this change
impacts building GHC at all, since we get parallelism through 'make',
not through GHC itself (and on low-end systems, parallelism in the
build system is crucial and really necessary.)

So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first though before getting to that point.

> Ph.
>
>
> ________________________________________
> From: mad.one at gmail.com <mad.one at gmail.com> on behalf of Austin Seipp <austin at well-typed.com>
> Sent: 07 October 2014 17:46
> To: Holzenspies, P.K.F. (EWI)
> Cc: ghc-devs at haskell.org
> Subject: Re: Again: Uniques in GHC
>
> On Tue, Oct 7, 2014 at 1:32 AM,  <p.k.f.holzenspies at utwente.nl> wrote:
>> Yes, this approach to a parallel GHC would only work on 64-bit machines. The
>> idea is, I guess, that we're not going to see a massive demand for parallel
>> GHC running on multi-core 32-bit systems. In other words; 32-bit systems
>> wouldn't get a parallel GHC.
>
> Let me make sure I'm understanding this correctly: in this particular
> proposed solution, the side effect would be that we no longer have a
> capable 32bit runtime which supports multicore parallelism?
>
> Sorry, but I'm afraid this approach is pretty much unacceptable IMO,
> for precisely the reason outlined in your last sentence. 32bit systems
> are surprisingly commen. I have several multicore 32bit ARMv7 machines
> on my desk right now, for example. And there are a lot more of those
> floating around than you might think.
>
> If that's the 'cure', I think I (and other users) would consider it
> far worse than the disease.
>
>> Regards,
>> Philip
>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
>
> --
> Regards,
>
> Austin Seipp, Haskell Consultant
> Well-Typed LLP, http://www.well-typed.com/
>



--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Austin Seipp-5
On Tue, Oct 7, 2014 at 11:40 AM, Austin Seipp <austin at well-typed.com> wrote:
> I still consider this a big limitation, one which is only due to an
> implementation detail.

To make this point more clear: I'm generally hesitant about basing the
availability of architecture-independent features ('ghc supporting
-j', which almost entirely was fixed in the frontend compiler
pipeline) on platform details, or the underlying hardware or execution
model GHC runs on. Unless the changes are very significant (like, -j
gets significantly faster), I think we should be very hesitant about
gating frontend features behind architectural/implementation details.

--
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
In reply to this post by Austin Seipp-5

________________________________________
From: mad.one at gmail.com <mad.one at gmail.com> on behalf of Austin Seipp <austin at well-typed.com>

So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first though before getting to that point.




Yes, that's what I'm saying.

Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the "fully deterministic behaviour" bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details).

Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on.

Ph.

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Carter Schonwald
in some respects, having fully deterministic builds is a very important
goal: a lot of tooling for eg, caching builds of libraries works much much
better if you have that property :)

On Tue, Oct 7, 2014 at 12:45 PM, <p.k.f.holzenspies at utwente.nl> wrote:

>
> ________________________________________
> From: mad.one at gmail.com <mad.one at gmail.com> on behalf of Austin Seipp <
> austin at well-typed.com>
>
> So I assume your change would mean 'ghc -j' would not work for 32bit.
> I still consider this a big limitation, one which is only due to an
> implementation detail. But we need to confirm this will actually fix
> any bottlenecks first though before getting to that point.
>
>
>
>
> Yes, that's what I'm saying.
>
> Let me just add that what I'm proposing by no means prohibits or hinders
> making 32-bit GHC-versions be parallel later on, it just doesn't solve the
> problem. It depends to what extent the "fully deterministic behaviour" bug
> is considered a priority (there was something about parts of the hi-files
> being non-deterministic across different executions of GHC; don't recall
> the details).
>
> Anyhow, the work I'm doing now exposes a few things about Uniques that
> confuse me a little and that could have been bugs (that maybe never acted
> up). Extended e-mail to follow later on.
>
> Ph.
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141007/1f947a5d/attachment.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Dear Carter, Simon, et al,


(CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques)


I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of:


https://phabricator.haskell.org/D323



Here are my open issues:


- There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely. I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains.

?

- Some UniqueDomains only have a handful of instances and seem a bit wasteful.


- Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there?


- What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations?


- There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this?


- How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded GHC easier (for what that's worth). Also, this may help solve the non-determinism issues.


- Missing haddocks, failing lints (lines too long) and a lot of cosmetics will be met when the above points have become a tad more clear. I'm more than happy to document a lot of the answers to the above stuff in Unique and/or commentary.


Regards,

Philip




________________________________
From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: 07 October 2014 21:30
To: Holzenspies, P.K.F. (EWI)
Cc: Austin Seipp; ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC

in some respects, having fully deterministic builds is a very important goal: a lot of tooling for eg, caching builds of libraries works much much better if you have that property :)

On Tue, Oct 7, 2014 at 12:45 PM, <p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl>> wrote:

________________________________________
From: mad.one at gmail.com<mailto:mad.one at gmail.com> <mad.one at gmail.com<mailto:mad.one at gmail.com>> on behalf of Austin Seipp <austin at well-typed.com<mailto:austin at well-typed.com>>

So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first though before getting to that point.




Yes, that's what I'm saying.

Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the "fully deterministic behaviour" bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details).

Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on.

Ph.
_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
http://www.haskell.org/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141007/63e45081/attachment-0001.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Simon Peyton Jones
One of the things I'm finding difficult about this Phab stuff is that I get presented with lots of code without enough supporting text saying

  *   What problem is this patch trying to solve?
  *   What is the user-visible design (for language features)?
  *   What are the main ideas in the implementation?

The place we usually put such design documents is on the GHC Trac Wiki.  Email is ok for discussion, but the wiki is FAR better for stating clearly the current state of play.  Philip, might you make such a page for this unique stuff?


To answer some of you specific questions (please include the answers in the wiki page in some form):

  *   Uniques are never put in .hi files (as far as I know).  They do not survive a single invocation of GHC.
  *   However with ghc --make, or ghci, uniques do survive for the entire invocation of GHC.  For example in ghc --make, uniques assigned when compiling module A should not clash with those for module B
  *   Yes, TyCons and DataCons must have separate uniques.  We often form sets of Names, which contain both TyCons and DataCons.  Let's not mess with this.
  *   Having unique-supply-splitting as a pure function is so deeply embedded in GHC that I could not hazard a guess as to how difficult it would be to IO-ify it.  Moreover, I would regret doing so because it would force sequentiality where none is needed.
  *   Template Haskell is a completely independent Haskell library.  It does not import GHC.  If uniques were in their own package, then TH and GHC could share them.  Ditto Hoopl.
  *   You say that Uniques are serialised as Word32. I'm not sure why they are serialised at all!
  *   Enforcing determinacy everywhere is a heavy burden.  Instead I suppose that you could run a pass at the end to give everything a more determinate name TidyPgm does this for the name strings, so it would probably be easy to do so for the uniques too.

Simon


________________________________
From: ghc-devs [ghc-devs-bounces at haskell.org] on behalf of p.k.f.holzenspies at utwente.nl [p.k.f.holzenspies at utwente.nl]
Sent: 07 October 2014 22:03
To: carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC


Dear Carter, Simon, et al,


(CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques)


I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of:


https://phabricator.haskell.org/D323



Here are my open issues:


- There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely. I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains.

?

- Some UniqueDomains only have a handful of instances and seem a bit wasteful.


- Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there?


- What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations?


- There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this?


- How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded GHC easier (for what that's worth). Also, this may help solve the non-determinism issues.


- Missing haddocks, failing lints (lines too long) and a lot of cosmetics will be met when the above points have become a tad more clear. I'm more than happy to document a lot of the answers to the above stuff in Unique and/or commentary.


Regards,

Philip




________________________________
From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: 07 October 2014 21:30
To: Holzenspies, P.K.F. (EWI)
Cc: Austin Seipp; ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC

in some respects, having fully deterministic builds is a very important goal: a lot of tooling for eg, caching builds of libraries works much much better if you have that property :)

On Tue, Oct 7, 2014 at 12:45 PM, <p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl>> wrote:

________________________________________
From: mad.one at gmail.com<mailto:mad.one at gmail.com> <mad.one at gmail.com<mailto:mad.one at gmail.com>> on behalf of Austin Seipp <austin at well-typed.com<mailto:austin at well-typed.com>>

So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first though before getting to that point.




Yes, that's what I'm saying.

Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the "fully deterministic behaviour" bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details).

Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on.

Ph.
_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
http://www.haskell.org/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141007/2c5bb076/attachment.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Dear Simon, et al,


I've created the wiki-page about the Unique-patch [1].

Should it be linked to from the KeyDataTypes [2]?


Regards,

Philip



[1] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Unique

[2] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/KeyDataTypes






________________________________
From: Simon Peyton Jones <simonpj at microsoft.com>
Sent: 07 October 2014 23:23
To: Holzenspies, P.K.F. (EWI); carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC

One of the things I'm finding difficult about this Phab stuff is that I get presented with lots of code without enough supporting text saying

  *   What problem is this patch trying to solve?
  *   What is the user-visible design (for language features)?
  *   What are the main ideas in the implementation?

The place we usually put such design documents is on the GHC Trac Wiki.  Email is ok for discussion, but the wiki is FAR better for stating clearly the current state of play.  Philip, might you make such a page for this unique stuff?


To answer some of you specific questions (please include the answers in the wiki page in some form):

  *   Uniques are never put in .hi files (as far as I know).  They do not survive a single invocation of GHC.
  *   However with ghc --make, or ghci, uniques do survive for the entire invocation of GHC.  For example in ghc --make, uniques assigned when compiling module A should not clash with those for module B
  *   Yes, TyCons and DataCons must have separate uniques.  We often form sets of Names, which contain both TyCons and DataCons.  Let's not mess with this.
  *   Having unique-supply-splitting as a pure function is so deeply embedded in GHC that I could not hazard a guess as to how difficult it would be to IO-ify it.  Moreover, I would regret doing so because it would force sequentiality where none is needed.
  *   Template Haskell is a completely independent Haskell library.  It does not import GHC.  If uniques were in their own package, then TH and GHC could share them.  Ditto Hoopl.
  *   You say that Uniques are serialised as Word32. I'm not sure why they are serialised at all!
  *   Enforcing determinacy everywhere is a heavy burden.  Instead I suppose that you could run a pass at the end to give everything a more determinate name TidyPgm does this for the name strings, so it would probably be easy to do so for the uniques too.

Simon


________________________________
From: ghc-devs [ghc-devs-bounces at haskell.org] on behalf of p.k.f.holzenspies at utwente.nl [p.k.f.holzenspies at utwente.nl]
Sent: 07 October 2014 22:03
To: carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC


Dear Carter, Simon, et al,


(CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques)


I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of:


https://phabricator.haskell.org/D323



Here are my open issues:


- There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely. I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains.

?

- Some UniqueDomains only have a handful of instances and seem a bit wasteful.


- Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there?


- What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations?


- There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this?


- How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded GHC easier (for what that's worth). Also, this may help solve the non-determinism issues.


- Missing haddocks, failing lints (lines too long) and a lot of cosmetics will be met when the above points have become a tad more clear. I'm more than happy to document a lot of the answers to the above stuff in Unique and/or commentary.


Regards,

Philip




________________________________
From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: 07 October 2014 21:30
To: Holzenspies, P.K.F. (EWI)
Cc: Austin Seipp; ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC

in some respects, having fully deterministic builds is a very important goal: a lot of tooling for eg, caching builds of libraries works much much better if you have that property :)

On Tue, Oct 7, 2014 at 12:45 PM, <p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl>> wrote:

________________________________________
From: mad.one at gmail.com<mailto:mad.one at gmail.com> <mad.one at gmail.com<mailto:mad.one at gmail.com>> on behalf of Austin Seipp <austin at well-typed.com<mailto:austin at well-typed.com>>

So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first though before getting to that point.




Yes, that's what I'm saying.

Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the "fully deterministic behaviour" bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details).

Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on.

Ph.
_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
http://www.haskell.org/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141009/2d5671af/attachment-0001.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

Simon Peyton Jones
Thank you.  A most helpful beginning.  I have added some comments and queries, as well as clarifying some points.

Simon

________________________________
From: p.k.f.holzenspies at utwente.nl [p.k.f.holzenspies at utwente.nl]
Sent: 09 October 2014 12:39
To: Simon Peyton Jones; carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC


Dear Simon, et al,


I've created the wiki-page about the Unique-patch [1].

Should it be linked to from the KeyDataTypes [2]?


Regards,

Philip



[1] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Unique

[2] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/KeyDataTypes






________________________________
From: Simon Peyton Jones <simonpj at microsoft.com>
Sent: 07 October 2014 23:23
To: Holzenspies, P.K.F. (EWI); carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC

One of the things I'm finding difficult about this Phab stuff is that I get presented with lots of code without enough supporting text saying

  *   What problem is this patch trying to solve?
  *   What is the user-visible design (for language features)?
  *   What are the main ideas in the implementation?

The place we usually put such design documents is on the GHC Trac Wiki.  Email is ok for discussion, but the wiki is FAR better for stating clearly the current state of play.  Philip, might you make such a page for this unique stuff?


To answer some of you specific questions (please include the answers in the wiki page in some form):

  *   Uniques are never put in .hi files (as far as I know).  They do not survive a single invocation of GHC.
  *   However with ghc --make, or ghci, uniques do survive for the entire invocation of GHC.  For example in ghc --make, uniques assigned when compiling module A should not clash with those for module B
  *   Yes, TyCons and DataCons must have separate uniques.  We often form sets of Names, which contain both TyCons and DataCons.  Let's not mess with this.
  *   Having unique-supply-splitting as a pure function is so deeply embedded in GHC that I could not hazard a guess as to how difficult it would be to IO-ify it.  Moreover, I would regret doing so because it would force sequentiality where none is needed.
  *   Template Haskell is a completely independent Haskell library.  It does not import GHC.  If uniques were in their own package, then TH and GHC could share them.  Ditto Hoopl.
  *   You say that Uniques are serialised as Word32. I'm not sure why they are serialised at all!
  *   Enforcing determinacy everywhere is a heavy burden.  Instead I suppose that you could run a pass at the end to give everything a more determinate name TidyPgm does this for the name strings, so it would probably be easy to do so for the uniques too.

Simon


________________________________
From: ghc-devs [ghc-devs-bounces at haskell.org] on behalf of p.k.f.holzenspies at utwente.nl [p.k.f.holzenspies at utwente.nl]
Sent: 07 October 2014 22:03
To: carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC


Dear Carter, Simon, et al,


(CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques)


I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of:


https://phabricator.haskell.org/D323



Here are my open issues:


- There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely. I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains.

?

- Some UniqueDomains only have a handful of instances and seem a bit wasteful.


- Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there?


- What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations?


- There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this?


- How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded GHC easier (for what that's worth). Also, this may help solve the non-determinism issues.


- Missing haddocks, failing lints (lines too long) and a lot of cosmetics will be met when the above points have become a tad more clear. I'm more than happy to document a lot of the answers to the above stuff in Unique and/or commentary.


Regards,

Philip




________________________________
From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: 07 October 2014 21:30
To: Holzenspies, P.K.F. (EWI)
Cc: Austin Seipp; ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC

in some respects, having fully deterministic builds is a very important goal: a lot of tooling for eg, caching builds of libraries works much much better if you have that property :)

On Tue, Oct 7, 2014 at 12:45 PM, <p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl>> wrote:

________________________________________
From: mad.one at gmail.com<mailto:mad.one at gmail.com> <mad.one at gmail.com<mailto:mad.one at gmail.com>> on behalf of Austin Seipp <austin at well-typed.com<mailto:austin at well-typed.com>>

So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first though before getting to that point.




Yes, that's what I'm saying.

Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the "fully deterministic behaviour" bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details).

Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on.

Ph.
_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
http://www.haskell.org/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141009/fef37182/attachment-0001.html>

Reply | Threaded
Open this post in threaded view
|

Again: Uniques in GHC

p.k.f.holzenspies@utwente.nl
Some of those clarifying points helped a *great* deal. Thanks.


I've addressed comments / questions and linked from KeyTypes.


Ph.


?

________________________________
From: Simon Peyton Jones <simonpj at microsoft.com>
Sent: 09 October 2014 22:36
To: Holzenspies, P.K.F. (EWI); carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC

Thank you.  A most helpful beginning.  I have added some comments and queries, as well as clarifying some points.

Simon

________________________________
From: p.k.f.holzenspies at utwente.nl [p.k.f.holzenspies at utwente.nl]
Sent: 09 October 2014 12:39
To: Simon Peyton Jones; carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC


Dear Simon, et al,


I've created the wiki-page about the Unique-patch [1].

Should it be linked to from the KeyDataTypes [2]?


Regards,

Philip



[1] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Unique

[2] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/KeyDataTypes






________________________________
From: Simon Peyton Jones <simonpj at microsoft.com>
Sent: 07 October 2014 23:23
To: Holzenspies, P.K.F. (EWI); carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC

One of the things I'm finding difficult about this Phab stuff is that I get presented with lots of code without enough supporting text saying

  *   What problem is this patch trying to solve?
  *   What is the user-visible design (for language features)?
  *   What are the main ideas in the implementation?

The place we usually put such design documents is on the GHC Trac Wiki.  Email is ok for discussion, but the wiki is FAR better for stating clearly the current state of play.  Philip, might you make such a page for this unique stuff?


To answer some of you specific questions (please include the answers in the wiki page in some form):

  *   Uniques are never put in .hi files (as far as I know).  They do not survive a single invocation of GHC.
  *   However with ghc --make, or ghci, uniques do survive for the entire invocation of GHC.  For example in ghc --make, uniques assigned when compiling module A should not clash with those for module B
  *   Yes, TyCons and DataCons must have separate uniques.  We often form sets of Names, which contain both TyCons and DataCons.  Let's not mess with this.
  *   Having unique-supply-splitting as a pure function is so deeply embedded in GHC that I could not hazard a guess as to how difficult it would be to IO-ify it.  Moreover, I would regret doing so because it would force sequentiality where none is needed.
  *   Template Haskell is a completely independent Haskell library.  It does not import GHC.  If uniques were in their own package, then TH and GHC could share them.  Ditto Hoopl.
  *   You say that Uniques are serialised as Word32. I'm not sure why they are serialised at all!
  *   Enforcing determinacy everywhere is a heavy burden.  Instead I suppose that you could run a pass at the end to give everything a more determinate name TidyPgm does this for the name strings, so it would probably be easy to do so for the uniques too.

Simon


________________________________
From: ghc-devs [ghc-devs-bounces at haskell.org] on behalf of p.k.f.holzenspies at utwente.nl [p.k.f.holzenspies at utwente.nl]
Sent: 07 October 2014 22:03
To: carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC


Dear Carter, Simon, et al,


(CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques)


I agree, but to that end, a few parameters need to become clear. To this end, I've created a Phabricator-thing that we can discuss things off of:


https://phabricator.haskell.org/D323



Here are my open issues:


- There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know what they're for precisely. I generally went for the module name as a starting point. I did, however, make a point of having different invocations of mkSplitUniqSupply et al all have different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If there are people that are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never cause conflict, they maybe can reduce the number of UniqueDomains.

?

- Some UniqueDomains only have a handful of instances and seem a bit wasteful.


- Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there?


- What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations?


- There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this?


- How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded GHC easier (for what that's worth). Also, this may help solve the non-determinism issues.


- Missing haddocks, failing lints (lines too long) and a lot of cosmetics will be met when the above points have become a tad more clear. I'm more than happy to document a lot of the answers to the above stuff in Unique and/or commentary.


Regards,

Philip




________________________________
From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: 07 October 2014 21:30
To: Holzenspies, P.K.F. (EWI)
Cc: Austin Seipp; ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC

in some respects, having fully deterministic builds is a very important goal: a lot of tooling for eg, caching builds of libraries works much much better if you have that property :)

On Tue, Oct 7, 2014 at 12:45 PM, <p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl>> wrote:

________________________________________
From: mad.one at gmail.com<mailto:mad.one at gmail.com> <mad.one at gmail.com<mailto:mad.one at gmail.com>> on behalf of Austin Seipp <austin at well-typed.com<mailto:austin at well-typed.com>>

So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first though before getting to that point.




Yes, that's what I'm saying.

Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC-versions be parallel later on, it just doesn't solve the problem. It depends to what extent the "fully deterministic behaviour" bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; don't recall the details).

Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on.

Ph.
_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
http://www.haskell.org/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20141010/a0b95d55/attachment-0001.html>