Removing Hoopl dependency?

Removing Hoopl dependency?

Michal Terepeta
Hi all,

I was looking at removing the `BlockId` type synonym in favor of
Hoopl's `Label` (there was already a TODO and it is a bit confusing).
But once I started making the changes, I realized that in a
bunch of places this makes the code *less* readable. Mostly because of
`CLabel` (sounds similar but is something quite different and having
to rename local variables from `label` to `clabel` is not great).

I started to look at alternatives and noticed that in general the
interface between GHC and Hoopl is quite noisy and confusing:
- Hoopl has `Label` which is GHC's `BlockId` but different than
  GHC's `CLabel`
- Hoopl has `Unique` which is different than GHC's `Unique`
- Hoopl has `Unique{Map,Set}` which are different than GHC's
  `Uniq{FM,Set}`
- GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
  needed just to filter the exposed functions (filter out some of the
  Hoopl's and add the GHC ones).
- Working in `cmm/` requires constant switching between GHC code and
  Hoopl (`CmmNode`/`CmmGraph`/`CmmBlock` and dataflow stuff is in GHC,
  the actual implementations of `Block`/`Graph` are defined in Hoopl,
  etc.)
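
To make the naming overlap concrete, here is a minimal sketch. All three types are simplified stand-ins written for illustration (the real definitions in GHC and Hoopl are richer), and `entrySymbol` is a hypothetical helper, not an existing function.

```haskell
-- Hypothetical, simplified stand-ins; NOT the real GHC or Hoopl types.

-- Hoopl-style label: an opaque identifier for a basic block.
newtype Label = Label Int
  deriving (Eq, Ord, Show)

-- GHC-style synonym: the same type under a second name.
type BlockId = Label

-- GHC-style CLabel: a symbol name for the assembler/linker,
-- which has nothing to do with basic-block identity.
newtype CLabel = CLabel String
  deriving (Eq, Show)

-- Code that handles both at once is where a local variable simply
-- called `label` stops being self-explanatory.
entrySymbol :: BlockId -> CLabel
entrySymbol (Label n) = CLabel ("block_" ++ show n ++ "_entry")
```

Dropping the synonym in favor of `Label` would leave `Label` and `CLabel` side by side in functions like `entrySymbol`, which is the readability concern described above.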

GHC is actually using only a small subset of Hoopl (e.g., the fixpoint
computation is copied/specialized: `cmm/Hoopl/Dataflow`). So I was
wondering - maybe it's worth simply dropping the dependency on Hoopl?
(and copy the code that is actually necessary in GHC)
I've done an experiment in [1] (to see how much we'd need to actually
copy) and I really like the result:
- We can remove one external dependency and git submodule at the
  cost of only 5 new modules in `cmm/Hoopl` (net gain of only 4
  modules: we add 5 new but can remove `cmm/Hoopl`, which is no longer
  needed)
- We should be able to fix all of the above issues and make the code
  easier to understand (less code, everything in one repo, fewer
  concepts).
- It's going to be easier to change things since we don't need to
  worry about changing the public interface of Hoopl (it's a
  standalone package on Hackage and other people already depend on the
  current behavior).

What do you think? Does anyone think we shouldn't do this?

Thanks,
Michal

    For now I just copied the code/updated imports and didn't do any
    cleanups, but I'd be happy to do them in subsequent PRs


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: Removing Hoopl dependency?

Herbert Valerio Riedel
On 2017-05-27 at 19:58:11 +0200, Michal Terepeta wrote:

[...]

> I've done an experiment in [1] (to see how much we'd need to actually
> copy) and I really like the result:
> - We can remove one external dependency and git submodule at the
>   cost of only 5 new modules in `cmm/Hoopl` (net gain of only 4
>   modules: we add 5 new but can remove `cmm/Hoopl`, which is no longer
>   needed)
> - We should be able to fix all of the above issues and make the code
>   easier to understand (less code, everything in one repo, fewer
>   concepts).
> - It's going to be easier to change things since we don't need to
>   worry about changing the public interface of Hoopl (it's a
>   standalone package on Hackage and other people already depend on the
>   current behavior).
>
> What do you think? Does anyone think we shouldn't do this?

It appears to me that in this case, the benefits in gained flexibility
outweigh the cost of independent development and potential loss of
synergies. So I'm +1 on this.

Re: Removing Hoopl dependency?

Ben Gamari
In reply to this post by Michal Terepeta
Michal Terepeta <[hidden email]> writes:

> Hi all,
>
...
>
> What do you think? Does anyone think we shouldn't do this?
>
I think this seems quite reasonable. Given that hoopl will need changes
to be truly useful to GHC, it seems quite reasonable to take the parts
we need and iterate independently on the rest.

Cheers,

- Ben



Re: Removing Hoopl dependency?

Erik de Castro Lopo
In reply to this post by Michal Terepeta
Michal Terepeta wrote:

> What do you think? Does anyone think we shouldn't do this?

Makes sense. I'm +1 on this.

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Re: Removing Hoopl dependency?

Michal Terepeta
Cool, thanks for quick replies!

Cheers,
Michal


RE: Removing Hoopl dependency?

Simon Peyton Jones (via GHC - devs mailing list)
In reply to this post by Michal Terepeta

Is there really a compelling case for forking Hoopl?  I was talking to Kavon last week about doing exactly the opposite: using Hoopl more wholeheartedly!

Before going ahead with this, let’s remember the downsides:

- If we fork Hoopl, improvements in one place will not be seen in the other.  GHC originally used its own containers library but now uses ‘containers’, most of which is irrelevant to GHC, just to pick up the work that has been done to make ‘containers’ fast.  Similarly, GHC has a clone of ‘pretty’, but someone is working (I think) to make GHC use ‘pretty’.

- It’s not clear to me why GHC has a clone of parts of Hoopl.  Would it not be better just to make Hoopl faster?

If anything I’d like to use Hoopl more in Cmm optimisation passes in GHC, so we may want to use more of Hoopl’s facilities.

The main reason you suggest for forking is that there are some awkward name clashes.  Surely we could resolve these? E.g. we could change CLabel in GHC, or agree with Hoopl maintainers that BlockId would be more helpful than Label.

You mention that Hoopl uses Unique set/map.  Why not use ‘containers’ for that?  (Like GHC!)
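
As a rough illustration of the ‘containers’ suggestion, label-keyed maps and sets can be thin newtypes over `IntMap` and `IntSet`. This is a minimal sketch under assumptions: `Label`, `LabelMap`, `LabelSet` and the helper functions are hypothetical names chosen here, not the definitions used by Hoopl or GHC.

```haskell
import qualified Data.IntMap.Strict as IM
import qualified Data.IntSet as IS

-- Hypothetical label type: a block identifier backed by an Int.
newtype Label = Label Int
  deriving (Eq, Ord, Show)

-- Label-keyed containers as wrappers around the 'containers' types.
newtype LabelMap v = LabelMap (IM.IntMap v)
  deriving (Eq, Show)

newtype LabelSet = LabelSet IS.IntSet
  deriving (Eq, Show)

emptyMap :: LabelMap v
emptyMap = LabelMap IM.empty

insertMap :: Label -> v -> LabelMap v -> LabelMap v
insertMap (Label k) v (LabelMap m) = LabelMap (IM.insert k v m)

lookupMap :: Label -> LabelMap v -> Maybe v
lookupMap (Label k) (LabelMap m) = IM.lookup k m

emptySet :: LabelSet
emptySet = LabelSet IS.empty

insertSet :: Label -> LabelSet -> LabelSet
insertSet (Label k) (LabelSet s) = LabelSet (IS.insert k s)

memberSet :: Label -> LabelSet -> Bool
memberSet (Label k) (LabelSet s) = IS.member k s
```

The wrappers keep `Label` keys from being mixed up with other `Int`-keyed maps while all the performance work stays in ‘containers’.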

 

Let’s discuss this a bit more before executing.

I’m also interested to know:

- who is actively working on Hoopl (Michael, Sophie, …)?

- how are you using it (within GHC, or somewhere else)?

It’d be good to review and update https://ghc.haskell.org/trac/ghc/wiki/Hoopl/Cleanup.  Are there any other improvements planned?

Simon

 

From: ghc-devs [mailto:[hidden email]] On Behalf Of Michal Terepeta
Sent: 27 May 2017 18:58
To: ghc-devs <[hidden email]>
Subject: Removing Hoopl dependency?

 

[...]

Re: Removing Hoopl dependency?

Michal Terepeta
On Sun, May 28, 2017 at 11:30 PM Simon Peyton Jones <[hidden email]> wrote:

[...]


Hi Simon,
 
Thanks for chiming in! Let me try to clarify the current situation and
the motivation for my changes.
 
1) Initial fork of Hoopl
 
Note that what I’m actually advocating is to *finish* forking Hoopl. The
fork really started in ~2012 when the “new Cmm backend” was being
finished.
IIRC the main reason was the unacceptable performance and it seems that
even Simon Marlow had trouble making it run fast enough:
The end result is pretty sad: GHC has its own forked/specialized
`Hoopl.Dataflow` module and is using Hoopl only for definitions of
`Block`/`Graph` and maps/sets (if you look at my commit, it’s pretty
clear what I’m copying). In particular it’s not using *any* of dataflow
analysis or rewriting capabilities of the Hoopl package.
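
For readers less familiar with what a dataflow module computes, here is a minimal sketch of a forward fixpoint computation using a worklist. It is a generic textbook-style illustration under assumptions, not GHC's `Hoopl.Dataflow` and not Hoopl's implementation: `Label`, `CFG` and `forwardFixpoint` are hypothetical names, and facts are restricted to sets joined by union.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical, simplified types: a block is identified by an Int and
-- the control-flow graph maps each block to its successors.
type Label = Int
type CFG   = M.Map Label [Label]

-- Forward analysis with set-valued facts joined by union.
-- `transfer l inFact` gives the fact at the end of block `l`.
-- The result maps every reachable block to the fact at its entry.
forwardFixpoint
  :: Ord a
  => CFG
  -> (Label -> S.Set a -> S.Set a)
  -> Label
  -> M.Map Label (S.Set a)
forwardFixpoint cfg transfer entry = go [entry] (M.singleton entry S.empty)
  where
    go [] facts = facts
    go (l : rest) facts =
      let out = transfer l (M.findWithDefault S.empty l facts)
          -- Join `out` into each successor; remember those that changed.
          push (fs, changed) s =
            let old = M.lookup s fs
                new = maybe out (S.union out) old
            in if Just new == old
                 then (fs, changed)
                 else (M.insert s new fs, s : changed)
          (facts', todo) = foldl push (facts, []) (M.findWithDefault [] l cfg)
      in go (rest ++ todo) facts'
```

Entry facts only ever grow, so the loop terminates as long as the set of possible fact elements is finite. A block is revisited only when the fact flowing into it changes.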
 
2) Reasons to finish forking
 
The reasons I listed in my previous email already assumed that we have
the forked `Hoopl.Dataflow` module in GHC. But if we want to discuss
the reasons for forking in general, then apart from the performance
(as noted above), there’s the issue of Hoopl’s interface. IMHO the
node-oriented approach taken by Hoopl is both not flexible enough and it
makes it harder to optimize it. That’s why I’ve already changed GHC’s
`Hoopl.Dataflow` module to operate “block-at-a-time”
Some concrete examples:
- For proc-point analysis it was necessary to introduce a hack to GHC’s
  `Dataflow` module to expose a separate analysis function that
  *ignores* the middle nodes (since for proc-points they’re irrelevant).
  My change to go “block-at-a-time” allowed us to remove that hack.
- I’m trying to fix non-linearity of `CmmLayoutStack` in
  (https://phabricator.haskell.org/D3586) and again the block-oriented
  interface is useful - I want to do different rewrites based on
  which block is being considered (whether it’s a proc-point or not).
  This is not easily possible if I don’t know which block I’m in (which
  is the case for the node-oriented interface).
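
The difference between the two interface styles can be sketched as follows. This is a toy model under assumptions: the types and function names are hypothetical and far simpler than anything in Hoopl or GHC, and the "fact" is just a counter of calls seen since the last proc-point.

```haskell
-- Hypothetical, heavily simplified types; NOT the Hoopl or GHC API.
newtype Label = Label Int
  deriving (Eq, Show)

data Node = Assign String | Call String
  deriving (Eq, Show)

-- A basic block: its label plus its straight-line nodes.
data Block = Block { blockLabel :: Label, blockNodes :: [Node] }

type Fact = Int

-- Node-oriented: the client describes one node at a time and never
-- learns which block the node belongs to.
type NodeTransfer = Node -> Fact -> Fact

-- Block-oriented: the client receives the whole block.
type BlockTransfer = Block -> Fact -> Fact

-- Any node-oriented transfer can be lifted to a block-oriented one.
liftNode :: NodeTransfer -> BlockTransfer
liftNode t b f0 = foldl (flip t) f0 (blockNodes b)

countCalls :: NodeTransfer
countCalls (Call _) n = n + 1
countCalls _        n = n

-- Only expressible block-at-a-time: behave differently depending on
-- whether the block is a proc-point, skipping its nodes entirely if so.
procPointAware :: (Label -> Bool) -> BlockTransfer
procPointAware isProcPoint b f
  | isProcPoint (blockLabel b) = 0
  | otherwise                  = liftNode countCalls b f
```

`liftNode` shows that the block-oriented style loses nothing, while `procPointAware` needs the block's label and so has no direct node-oriented counterpart.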
 
I also don’t think that name clashes and the tension between Hoopl’s
interface and GHC are easy to solve. Hoopl is a public, stand-alone
package, so we can’t just change things without considering
compatibility. For instance, we can’t use GHC’s `Unique` in Hoopl. But
should we switch all of GHC to use Hoopl’s? Also having closely related
concepts spread around GHC and Hoopl is not helping when trying to
understand what’s happening. Finally, any changes to both GHC & Hoopl
have much higher overhead than just changing GHC.
 
In general, it really seems to me that Hoopl has been released simply
too early, with not enough real-world usage and testing. When you say
that we should “just fix Hoopl”, it sounds to me that we’d really need
to rewrite it from scratch. And it’s much easier to do that if we can
just experiment within GHC without worrying about breaking other
existing Hoopl users. Only once we’re happy with the result should we
consider separating it into a stand-alone package.
 
3) Difference between pretty/containers and Hoopl
 
I also think that the situation with pretty/containers is quite
different than Hoopl. They are much more general-purpose libraries,
*far* more widely used and with more contributors. Take containers - the
package is still very actively developed and constantly improved.
Whereas Hoopl hasn’t really seen much activity in the last 5 years. So
the benefit-cost ratio is much better - yes there is some cost in having
containers as a dependency, but the benefits from the regular stream of
improvements easily outweigh it. I don’t think that’s the case for
Hoopl.
 
Does this help understand my motivation? Let me know if anything is
still unclear!
 
Thanks,
Michal
 


RE: Removing Hoopl dependency?

Simon Peyton Jones (via GHC - devs mailing list)

Michael

Sorry to be slow.

> Note that what I’m actually advocating is to *finish* forking Hoopl. The
> fork really started in ~2012 when the “new Cmm backend” was being
> finished.

Yes, I know.  But what I’m suggesting is to revisit the reasons for that fork, and re-join if possible.  Eg if Hoopl is too slow, can’t we make it faster?  Why is GHC’s version faster?

> apart from the performance
> (as noted above), there’s the issue of Hoopl’s interface. IMHO the
> node-oriented approach taken by Hoopl is both not flexible enough and it
> makes it harder to optimize it. That’s why I’ve already changed GHC’s
> `Hoopl.Dataflow` module to operate “block-at-a-time”

Well that sounds like an argument to re-engineer Hoopl’s API, rather than an argument to fork it.  If it’s a better API, can’t we make it better for everyone?  I don’t yet understand what the “block-oriented” API is, or how it differs, but let’s have the conversation.

> When you say
> that we should “just fix Hoopl”, it sounds to me that we’d really need
> to rewrite it from scratch. And it’s much easier to do that if we can
> just experiment within GHC without worrying about breaking other
> existing Hoopl users

Fine.  But then let’s call it hoopl2, make it a separate package (perhaps with GHC as its only client for now), and declare that it’s intended to supersede hoopl.

But do we even need to do that much?  After all, a major version bump on a package is allowed to introduce breaking changes to the API.  Anyone who wants the old API can use the old package.

I wonder if you could start a wiki page somewhere (eg on the GHC wiki) listing all the changes you’d like to make in a “rewrite from scratch” story?  That would help to “ground” the conversation.

Thanks

Simon

 

 

From: Michal Terepeta [mailto:[hidden email]]
Sent: 29 May 2017 12:53
To: Simon Peyton Jones <[hidden email]>; ghc-devs <[hidden email]>
Subject: Re: Removing Hoopl dependency?

 

[...]

Re: Removing Hoopl dependency?

Michal Terepeta
> On Wed, Jun 7, 2017 at 7:05 PM Simon Peyton Jones <[hidden email]> wrote:
> Michael
>  
> Sorry to be slow.
>  
> > Note that what I’m actually advocating is to *finish* forking Hoopl. The
> > fork really started in ~2012 when the “new Cmm backend” was being
> > finished.
>  
> Yes, I know.  But what I’m suggesting is to revisit the reasons for that fork, and re-join if possible.  Eg if Hoopl is too slow, can’t we make it faster?  Why is GHC’s version faster?
>  
> > apart from the performance
> > (as noted above), there’s the issue of Hoopl’s interface. IMHO the
> > node-oriented approach taken by Hoopl is both not flexible enough and it
> > makes it harder to optimize it. That’s why I’ve already changed GHC’s
> > `Hoopl.Dataflow` module to operate “block-at-a-time”
>  
> Well that sounds like an argument to re-engineer Hoopl’s API, rather than an argument to fork it.  If it’s a better API, can’t we make it better for everyone?  I don’t yet understand what the “block-oriented” API is, or how it differs, but let’s have the conversation.

Sure, but re-engineering the API of a publicly used package has significant
cost for everyone involved:
- GHC: we might need to wait longer for any improvements and spend
  more time discussing various options (and compromises - what makes
  sense for GHC might not make sense for other people)
- Hoopl users: will need to migrate to the new APIs potentially
  multiple times
- Hoopl maintainers: might need to maintain more than one branch of
  Hoopl for a while

And note that just bumping a version number might not be enough.  IIRC
Stackage only allows one version of each package and since Hoopl is a
boot package for GHC, the new version will move to Stackage along with
GHC. So any users of Hoopl that want to use the old package will not
be able to use that version of Stackage.

> > When you say
> > that we should “just fix Hoopl”, it sounds to me that we’d really need
> > to rewrite it from scratch. And it’s much easier to do that if we can
> > just experiment within GHC without worrying about breaking other
> > existing Hoopl users
>  
> Fine.  But then let’s call it hoopl2, make it a separate package (perhaps with GHC as its only client for now), and declare that it’s intended to supersede hoopl.

Maybe this is the core of our disagreement - why is it a good idea to
have Hoopl as a separate package in the first place?

I've pointed multiple reasons why I think it has a significant cost.
But I don't really see any major benefits. Looking at the commit
history of Hoopl there hasn't been much development on it since 2012
when Simon M was trying to get the new GHC backend working (since
then, it's mostly maintenance patches to keep up with changes in
`base`, etc).
Extracting a core part of any project to a shared library has some
real costs, so there should be equally real benefits that outweigh
that cost. (If I proposed extracting parts of Core optimizer to a
separate package, wouldn't you expect some really good reasons for
doing this?)
I also do think this is quite different than a dependency on, say,
`binary`, `containers` or `pretty`, where the API of the library is
smaller (at least conceptually), much better understood and
established.

Cheers,
Michal



Re: Removing Hoopl dependency?

Ben Gamari
Michal Terepeta <[hidden email]> writes:

> Maybe this is the core of our disagreement - why is it a good idea to
> have Hoopl as a separate package in the first place?
>
> I've pointed multiple reasons why I think it has a significant cost.
> But I don't really see any major benefits. Looking at the commit
> history of Hoopl there hasn't been much development on it since 2012
> when Simon M was trying to get the new GHC backend working (since
> then, it's mostly maintenance patches to keep up with changes in
> `base`, etc).
> Extracting a core part of any project to a shared library has some
> real costs, so there should be equally real benefits that outweigh
> that cost. (If I proposed extracting parts of Core optimizer to a
> separate package, wouldn't you expect some really good reasons for
> doing this?)
One way forward here would be to ask those who would be affected by an
API rework whether they would be open to change. I don't believe there
are too many hoopl users at the moment but I recall that previous
efforts to change the library's interface were met with some resistance.

However, even if we found that hoopl's current user-base is agreeable to
change we would still need to account for the fact that advancing GHC
in lockstep with an out-of-tree hoopl will take more effort than
advancing it under Michal's merge proposal. Admittedly, with submodules
this additional effort isn't too large, but it's still more than having
hoopl and GHC under one tree.

Cheers,

- Ben



RE: Removing Hoopl dependency?

Simon Peyton Jones (via GHC - devs mailing list)
In reply to this post by Michal Terepeta

> Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?

One reason only: because it makes Hoopl usable by compilers other than GHC.  And, dually, efforts by others to improve Hoopl will benefit GHC.

> If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?

A re-usable library should be

a) a significant chunk of code,

b) that can plausibly be re-purposed by others

c) and that has an explicable API

I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold.  But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds.  It’s designed to be re-usable.  Whether it is actually re-used is another matter, of course.  But if it’s part of GHC, it can’t be.
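
What "agnostic about the node types" means can be sketched in a few lines. Hoopl indexes nodes by whether control flow is open or closed at entry and exit; the code below is a deliberately reduced illustration of that idea, with a hypothetical `Block` type and a toy `Insn` client, and it does not reproduce the package's actual definitions.

```haskell
{-# LANGUAGE GADTs #-}

-- Shape indices: Open (control falls through) or Closed (it does not).
data O
data C

-- A basic block parameterised over the client's node type `n`:
-- one entry node, some middle nodes, one exit node.
data Block n where
  Block :: n C O -> [n O O] -> n O C -> Block n

-- A toy client instruction set; a compiler would plug in its own.
data Insn e x where
  Entry :: Int -> Insn C O      -- block label
  Move  :: String -> Insn O O   -- straight-line instruction
  Jump  :: Int -> Insn O C      -- transfer to another block

-- Generic code: works for every node type.
middleCount :: Block n -> Int
middleCount (Block _ ms _) = length ms

-- Client-specific code: only an exit node can name successors.
successors :: Insn O C -> [Int]
successors (Jump l) = [l]

blockSuccessors :: Block Insn -> [Int]
blockSuccessors (Block _ _ x) = successors x
```

`middleCount` never inspects a node, so it works for any intermediate language, while the shape indices let the type checker reject, say, a `Jump` in the middle of a block.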

 

> Stackage only allows one version of each package

I didn’t know that, but I can see it makes sense.  That makes a strong case for re-doing it as a new package hoopl2, if the API needs to change substantially (something we have yet to discuss).

> I've pointed multiple reasons why I think it has a significant cost.

Can you just summarise them again briefly for me?  If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.

Thanks!

Simon

 

 

 

From: Michal Terepeta [mailto:[hidden email]]
Sent: 08 June 2017 19:59
To: Simon Peyton Jones <[hidden email]>; ghc-devs <[hidden email]>
Cc: Kavon Farvardin <[hidden email]>
Subject: Re: Removing Hoopl dependency?

 

> On Wed, Jun 7, 2017 at 7:05 PM Simon Peyton Jones <[hidden email]> wrote:

> Michael

>  

> Sorry to be slow.

>  

> > Note that what I’m actually advocating is to *finish* forking Hoopl. The

> > fork really started in ~2012 when the “new Cmm backend” was being

> > finished.

>  

> Yes, I know.  But what I’m suggesting is to revisit the reasons for that fork, and re-join if possible.  Eg if Hoopl is too slow, can’t we make it faster?  Why is GHC’s version faster?

>  

> > apart from the performance

> > (as noted above), there’s the issue of Hoopl’s interface. IMHO the

> > node-oriented approach taken by Hoopl is both not flexible enough and it

> > makes it harder to optimize it. That’s why I’ve already changed GHC’s

> > `Hoopl.Dataflow` module to operate “block-at-a-time”

>  

> Well that sounds like an argument to re-engineer Hoopl’s API, rather an argument to fork it.  If it’s a better API, can’t we make it better for everyone?  I don’t yet understand what the “block-oriented” API is, or how it differs, but let’s have the conversation.

 

Sure, but re-engineering the API of a publicly use package has significant

cost for everyone involved:

- GHC: we might need to wait longer for any improvements and spend

  more time discussing various options (and compromises - what makes

  sense for GHC might not make sense for other people)

- Hoopl users: will need to migrate to the new APIs potentially

  multiple times

- Hoopl maintainers: might need to maintain more than one branches of

  Hoopl for a while

 

And note that just bumping a version number might not be enough.  IIRC

Stackage only allows one version of each package and since Hoopl is a

boot package for GHC, the new version will move to Stackage along with

GHC. So any users of Hoopl that want to use the old package, will not

be able to use that version of Stackage.

 

> > When you say
> > that we should “just fix Hoopl”, it sounds to me that we’d really need
> > to rewrite it from scratch. And it’s much easier to do that if we can
> > just experiment within GHC without worrying about breaking other
> > existing Hoopl users
>
> Fine.  But then let’s call it hoopl2, make it a separate package (perhaps with GHC as its only client for now), and declare that it’s intended to supersede hoopl.

 

Maybe this is the core of our disagreement - why is it a good idea to
have Hoopl as a separate package in the first place?

I've pointed out multiple reasons why I think it has a significant cost.
But I don't really see any major benefits. Looking at the commit
history of Hoopl, there hasn't been much development on it since 2012
when Simon M was trying to get the new GHC backend working (since
then, it's mostly maintenance patches to keep up with changes in
`base`, etc).

Extracting a core part of any project to a shared library has some
real costs, so there should be equally real benefits that outweigh
that cost. (If I proposed extracting parts of the Core optimizer to a
separate package, wouldn't you expect some really good reasons for
doing this?)

I also do think this is quite different than a dependency on, say,
`binary`, `containers` or `pretty`, where the API of the library is
smaller (at least conceptually), much better understood and
established.

Cheers,
Michal


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: Removing Hoopl dependency?

Merijn Verstraaten
Lemme toss in my 2 cents as an outsider who likes to dabble in programming languages and compilers: I would *love* to be able to just drop (parts of) GHC's optimisation into my toy compilers. Optimisation is complicated, lots of work, and not really the part I care about when toying with languages. I wasn't really aware of Hoopl before this thread, so now that I am, I'm kinda sad about the idea of this reusable infrastructure being tossed out. I don't really have any vested interest in, or opinion on, how to deal with the current Hoopl situation, so if it's decided to write a Hoopl2.0 instead, without backwards compatibility, I would still consider that a win.

Cheers,
Merijn

> On 9 Jun 2017, at 9:50, Simon Peyton Jones via ghc-devs <[hidden email]> wrote:
>
> > Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
>
> One reason only: because it makes Hoopl usable by compilers other than GHC.  And, dually, efforts by others to improve Hoopl will benefit GHC.
>
> > If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?
>
> A re-usable library should be
> a)      a significant chunk of code,
> b)      that can plausibly be re-purposed by others
> c)      and that has an explicable API
>
> I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold.  But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds.  It’s designed to be re-usable.  Whether it is actually re-used is another matter, of course.  But if it’s part of GHC, it can’t be.
>
> > Stackage only allows one version of each package
>
> I didn’t know that, but I can see it makes sense.  That makes a strong case for re-doing it as a new package hoopl2, if the API needs to change substantially (something we have yet to discuss).
>
> > I've pointed multiple reasons why I think it has a significant cost.
>
> Can you just summarise them again briefly for me?  If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.
>
> Thanks!
>
> Simon
>
> From: Michal Terepeta [mailto:[hidden email]]
> Sent: 08 June 2017 19:59
> To: Simon Peyton Jones <[hidden email]>; ghc-devs <[hidden email]>
> Cc: Kavon Farvardin <[hidden email]>
> Subject: Re: Removing Hoopl dependency?
>  
> [...]


Re: Removing Hoopl dependency?

Herbert Valerio Riedel-3
In reply to this post by GHC - devs mailing list
Hi Simon,

On 2017-06-09 at 09:50:51 +0200, Simon Peyton Jones via ghc-devs wrote:

[...]

>> Stackage only allows one version of each package
>
> I didn’t know that, but I can see it makes sense.  That makes a strong
> case for re-doing it as a new package hoopl2

The limitations of Stackage's design shouldn't drive or limit
library design. Cabal has been moving to finally allow us to have
multiple versions, and even multiple configurations/instances of the same
version, of a package registered in the package db at the same time.
Subjecting ourselves to Stackage's limitations *now*, after all the work
done to that effect (and with more in that direction being considered to
push the boundaries even further), seems quite backward to me.

If we push the idea to its conclusion, that we should publish a
new package rather than release a new major version of a package to
work around Stackage, you'd see a proliferation of number-suffixed
packages on Hackage.  Moreover, packages which can easily support
multiple major versions of a package would have to use conditional logic
boilerplate in their .cabal files (which again would be incompatible
with Stackage's inherent limitations, as it allows only *one
configuration* of a given package version).

We should build upon the facilities we already have in place: major
versions are there to encode the epoch/generation of an API. Moreover, as
a big advantage over classic SemVer, we also have this 2-component major
version, which gives us more flexibility for versioning while developing
two or more epochs of an API in parallel. So hoopl-1.* and hoopl-2.*
could keep evolving independently, each branch being able to perform
major version increments in their respective version namespace.
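
The 2-component major version can be made concrete with a small sketch using `Data.Version` from `base`. The helper names (`pvpMajor`, `epoch`, `isMajorBump`) and the version numbers are hypothetical, and treating the first component as the "epoch" is only one reading of the paragraph above, not something the PVP itself mandates.

```haskell
module Main where

import Data.Version (Version, makeVersion, versionBranch)

-- Under the Haskell Package Versioning Policy the major version is the
-- first two components (A.B). Missing components are padded with zeros.
pvpMajor :: Version -> [Int]
pvpMajor v = take 2 (versionBranch v ++ [0, 0])

-- Assumption for this sketch: the first component names the API epoch,
-- e.g. hoopl-1.* versus hoopl-2.*.
epoch :: Version -> Int
epoch v = case pvpMajor v of
  (a : _) -> a
  []      -> 0

-- A release is a major (potentially breaking) bump if A.B increases.
isMajorBump :: Version -> Version -> Bool
isMajorBump old new = pvpMajor new > pvpMajor old

main :: IO ()
main = do
  let v103 = makeVersion [1, 0, 3]
      v110 = makeVersion [1, 1, 0]
      v200 = makeVersion [2, 0, 0]
  -- A breaking release that stays inside the hoopl-1.* epoch.
  print (isMajorBump v103 v110, epoch v103 == epoch v110)
  -- A breaking release that starts a new epoch.
  print (isMajorBump v110 v200, epoch v110 == epoch v200)
```

The point of the sketch is that 1.0.3 to 1.1.0 is already a major bump, so the 1.* line can keep making breaking releases without ever colliding with the 2.* line.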

Cheers,
  HVR

Re: Removing Hoopl dependency?

Alan & Kim Zimmerman
But equally, Stackage is a major part of the Haskell ecosystem.

As such, implications and paths forward need to be considered.

Alan

On 9 June 2017 at 11:16, Herbert Valerio Riedel <[hidden email]> wrote:
[...]

Re: Removing Hoopl dependency?

Michal Terepeta
In reply to this post by GHC - devs mailing list
> On Fri, Jun 9, 2017 at 9:50 AM Simon Peyton Jones <[hidden email]> wrote:
> > Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
>
>  
> One reason only: because it makes Hoopl usable by compilers other than GHC.  And, dually, efforts by others to improve Hoopl will benefit GHC.
>  
> > If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?
>
>  
> A re-usable library should be
> a)      a significant chunk of code,
> b)      that can plausibly be re-purposed by others
> c)      and that has an explicable API
>  
> I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold.  But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds.  It’s designed to be re-usable.  Whether it is actually re-used is another matter, of course.  But if it’s part of GHC, it can’t be.

I agree with your characterization of a re-usable library and that
the Core optimizer would not be a good fit. But I do think that Hoopl also
has some problems with b) and c) (although smaller):
- Using an optimizer-as-a-library is not really common (I'm not aware
  of any compilers doing this; LLVM is to some degree close, but it
  exposes the whole language as the interface, so it's closer to the
  idea of extracting the whole Cmm backend). So I don't think the API
  for such a project is well understood.
- The API is pretty wide and does put serious constraints on the IR
  (after all it defines blocks and graphs), making reusability
  potentially more tricky.
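
To illustrate both sides of this point, here is a toy sketch of a library that is agnostic about the node type yet still dictates the shape of blocks and graphs. The types `Block` and `Graph` and the function `reachable` are hypothetical simplifications for illustration; Hoopl's real `Block` and `Graph` types are considerably richer.

```haskell
module Main where

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Label = Int

-- The library owns the shape of blocks and graphs ...
data Block n = Block
  { blockBody  :: [n]       -- ... while the node type n is the client's
  , blockSuccs :: [Label]   -- labels this block may jump to
  }

type Graph n = Map.Map Label (Block n)

-- A generic traversal: it never inspects a node, so it works for any n.
reachable :: Label -> Graph n -> Set.Set Label
reachable entry g = go Set.empty [entry]
  where
    go seen [] = seen
    go seen (l : ls)
      | l `Set.member` seen = go seen ls
      | otherwise =
          let succs = maybe [] blockSuccs (Map.lookup l g)
          in  go (Set.insert l seen) (succs ++ ls)

-- A client-defined node type (hypothetical).
data ToyNode = Nop | Print String

main :: IO ()
main = do
  let g = Map.fromList
        [ (0, Block [Nop] [1])
        , (1, Block [Print "hi"] [])
        , (2, Block [Nop] [0])        -- not reachable from block 0
        ]
  print (Set.toList (reachable 0 g))
```

Any IR can plug in its own node type, which is the reusability argument. But an IR that does not naturally fit "labelled blocks with successor lists" has to be bent into that shape first, which is the constraint described in the second bullet.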

So I think I understand your argument and we just disagree on whether
this is worth the effort of having a separate package.

>  
> [...]
>  
> > I've pointed multiple reasons why I think it has a significant cost.
>
> Can you just summarise them again briefly for me?  If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.

Having even Hoopl2 as a separate package would still entail
additional work:
- Hoopl2 would still need to duplicate some concepts (e.g., `Unique`,
  etc., since it needs to be standalone)
- Understanding the code (esp. by newcomers) would be harder: the Cmm
  backend would be split between GHC and Hoopl2, with the latter
  necessarily being far more general/polymorphic than needed by GHC.
- Getting the right performance in the presence of all this additional
  generality/polymorphism will likely require a fair amount of
  additional work.
- If Hoopl2 is used by other compilers, then we need to be more
  careful about changing anything in incompatible ways; this will require
  more discussions & release coordination.

Considering that Hoopl was never actually picked up by other
compilers, I'm not convinced that this cost is justified. But I
understand that other people might have a different opinion.
So how about a compromise:
- decouple GHC from the current Hoopl (i.e., go ahead with my diff),
- keep everything Hoopl-related only in `compiler/cmm/Hoopl` with the
  long-term intention of creating a separate package,
- experiment with and improve the code,
- once (if?) we're happy with the results, discuss what/how to
  extract to a separate package.
That gives us the freedom to try things out and see what works well
(I simply don't have ready solutions for anything; being able to
experiment is IMHO quite important). And once we reach the right
performance/representation/abstraction/API we can work on extracting
that.

What do you think?

Cheers,
Michal



Re: Removing Hoopl dependency?

Sophie Taylor
Hello, fellow workers!

So, I'll pop in here with my thoughts.

I'm writing an independent intermediate language library for functional languages, and I looked at using Hoopl. I would use it, but there are several reasons why I'm not currently doing so:

1) Combining facts from different domains through fancy lattice algorithms. This is fairly straightforward to add to Hoopl with minimal extra API change.  

2) I wanted to write my data facts as a type-level list, `freer-effects` style, in order to be more explicit in my types about dependencies between analyses. This would require significantly altering the API.

3) Its own custom graph code. This is the biggest reason why I decided not to. Some problems: 
  * It seems impossible to change the topology of the graph in a rewriting step.
  * I wanted to use term hypergraphs/hyperjungles due to some pretty nifty properties
  * The intermediate language I'm implementing, a derivative of Graph Reduction Intermediate Notation, aka GRIN from UHC, is, as its name implies, intrinsically graph-based. Thus, graph manipulation has to be pretty easy to do.
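
As a rough illustration of point 1, the simplest way to combine facts from two domains is a product lattice with a component-wise join. The sketch below is self-contained and deliberately does not use Hoopl's own lattice type; the `Lattice` class and the `Live` and `Const` domains are hypothetical names, and the "fancy lattice algorithms" mentioned above may well go further than a plain product.

```haskell
module Main where

import qualified Data.Set as Set

-- A join-semilattice with a bottom element (hypothetical class for this
-- sketch; not Hoopl's lattice representation).
class Lattice a where
  bottom :: a
  join   :: a -> a -> a

-- Domain 1: variables that may be live; join is set union.
newtype Live = Live (Set.Set String)
  deriving (Eq, Show)

instance Lattice Live where
  bottom = Live Set.empty
  join (Live a) (Live b) = Live (Set.union a b)

-- Domain 2: a flat constant-propagation lattice.
data Const = Unknown | Known Int | Conflict
  deriving (Eq, Show)

instance Lattice Const where
  bottom = Unknown
  join Unknown c = c
  join c Unknown = c
  join (Known a) (Known b) | a == b = Known a
  join _ _ = Conflict

-- Combining two domains: the product lattice joins component-wise.
instance (Lattice a, Lattice b) => Lattice (a, b) where
  bottom = (bottom, bottom)
  join (a1, b1) (a2, b2) = (join a1 a2, join b1 b2)

main :: IO ()
main = do
  let f1 = (Live (Set.fromList ["x"]), Known 3)
      f2 = (Live (Set.fromList ["y"]), Known 4)
  print (join f1 f2)
```

Because the pair instance is generic, any two analyses that already have `Lattice` instances can be run side by side without changing either of them, which fits the remark that this needs only a minimal API change.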

So instead, I've decided to optimise another hypergraph library (`graph-rewriting` - I'm going to be rewriting it to use an inductive representation a la FGL) and implement a generic, Hoopl-esque analysis library on top of that. (Or more accurately, that is my plan for the next six months - I've been sidetracked getting parsing to work nicely with an effect-based stack!)

So, if Hoopl2 does become a thing, I'd be very keen on working on it, but if I were to actually use it myself, it'd probably require a complete rewrite. Fortunately, it's a pretty small library; and for GHC, its current usage is a pretty straightforward use case which shouldn't be affected too much. That being said, if GHC were to make better use of Hoopl (e.g. moving some of the optimisations on Core to be Hoopl-based passes) then it would be a different story.

So I guess I'm volunteering to do the rewrite for a potential Hoopl2 if it's wanted, as I'm about to do pretty much that anyway.

Cheers,
Sophie



On Fri, 9 Jun 2017 at 22:31 Michal Terepeta <[hidden email]> wrote:
[...]


RE: Removing Hoopl dependency?

GHC - devs mailing list

Interesting!

Maybe there are a couple of different alternatives:

A.      A rewrite of Hoopl, with all the same basic ideas and data structures, but with a better API (I’m not sure exactly in what way, but Michal has some idea, as does Sophie), and a more efficient implementation.

B.      A more radical change to use hypergraphs, type-level lists etc.   This sounds interesting, but it’s a more substantial change, and before using it for GHC we’d need to discuss the new proposed API in some detail.

There’s no reason we couldn’t do (A) and (B) in parallel.

Michal is suggesting doing (A) in GHC’s tree, but with a clearly-declared intent to bring it out as a separate library.   (I’d advocate making it a separate library in GHC’s tree; we already have a number of those.)

That would leave Sophie free to do (B) free of the constraints of GHC depending on it; but we could always use it later.

Does that sound plausible?  Do we know of any other Hoopl users?

Simon

 

From: Sophie Taylor [mailto:[hidden email]]
Sent: 11 June 2017 14:09
To: Michal Terepeta <[hidden email]>; Simon Peyton Jones <[hidden email]>; ghc-devs <[hidden email]>
Cc: Kavon Farvardin <[hidden email]>
Subject: Re: Removing Hoopl dependency?

 

[...]

 


Re: Removing Hoopl dependency?

Sophie Taylor
I don't see why not, other than possible duplication of effort when it comes to some of the basic algorithms.

Speaking of which, what policies are there on bringing in new dependencies to GHC, both compile-time and run-time (e.g. possible SMT solver support)?



On Mon, 12 Jun 2017 at 17:07 Simon Peyton Jones <[hidden email]> wrote:

Interesting!

 

Maybe there are a couple of different alternatives:

 

A.      A rewrite of Hoopl, with all the same basic ideas and data structures, but with a better API (I’m not sure exactly in what way, but Michael has some idea, as does Sophie), and a more efficient implementation.

B.      A more radical change to use hypergraphs, type-level lists etc.   This sounds interesting, but it’s a more substantial change and before using it for GHC we’d need to discuss the new proposed API in some detail

 

There’s no reason we couldn’t do (A) and (B) in parallel.

 

Michael is suggesting doing (A) in GHC’s tree, but with a clearly-declared intent to bring it out as a separate library.   (I’d advocate making it a separate library in GHC’s tree; we already have a number of those.

 

That would leave Sophie free to do (B) free of the constraints of GHC depending on it; but we could always use it later.

 

Does that sound plausible?  Do we know of any other Hoopl users?

 

Simon

 

From: Sophie Taylor [mailto:[hidden email]]
Sent: 11 June 2017 14:09
To: Michal Terepeta <[hidden email]>; Simon Peyton Jones <[hidden email]>; ghc-devs <[hidden email]>


Cc: Kavon Farvardin <[hidden email]>
Subject: Re: Removing Hoopl dependency?

 

Hello, fellow workers!

 

So, I'll pop in here with my thoughts.

 

I'm writing an independent intermediate language library for functional languages, and I looked at using Hoopl. I would use it, but there are several reasons why I'm not currently doing so:

 

1) Combining facts from different domains through fancy lattice algorithms. This is fairly straightforward to add to Hoopl with minimal extra API change.  

 

2) I wanted to write my data facts as a type-level list, `freer-effects` style, in order to be more explicit in my types about dependencies between analyses. This would require significantly altering the API.

 

3) Its own custom graph code. This is the biggest reason why I decided not to. Some problems: 

  * It seems impossible to change the topology of the graph in a rewriting step.

  * I wanted to use term hypergraphs/hyperjungles due to some pretty nifty properties

  * The intermediate language I'm implementing, a derivative of Graph Reduction Intermediate Notation, aka GRIN from UHC, is, as its name implies, intrinsically graph-based. Thus, graph manipulation has to be pretty easy to do.
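
Point 1 above (combining facts from different domains) can be sketched as a product lattice: if each analysis has its own join, a pair of facts is joined componentwise. This is only a minimal sketch; the `Lattice` class and the `Const` and `Reached` types are hypothetical names made up for illustration, not Hoopl's `DataflowLattice` API.

```haskell
-- Hypothetical join-semilattice class, for illustration only.
class Lattice a where
  bottom :: a            -- least element: "no information yet"
  lub    :: a -> a -> a  -- least upper bound (join) of two facts

-- Fact for a constant-propagation-style analysis of one variable.
data Const = Unknown | Known Int | Conflict
  deriving (Eq, Show)

instance Lattice Const where
  bottom = Unknown
  lub Unknown y = y
  lub x Unknown = x
  lub (Known a) (Known b) | a == b = Known a
  lub _ _ = Conflict

-- Fact for a reachability-style analysis.
newtype Reached = Reached Bool
  deriving (Eq, Show)

instance Lattice Reached where
  bottom = Reached False
  lub (Reached a) (Reached b) = Reached (a || b)

-- Facts from two domains combine componentwise.
instance (Lattice a, Lattice b) => Lattice (a, b) where
  bottom = (bottom, bottom)
  lub (a1, b1) (a2, b2) = (lub a1 a2, lub b1 b2)
```

For example, `lub (Known 1, Reached False) (Known 1, Reached True)` evaluates to `(Known 1, Reached True)`, while two different constants join to `Conflict` in the first component.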

 

So instead, I've decided to optimise another hypergraph library (`graph-rewriting` - I'm going to be rewriting it to use an inductive representation a la FGL) and implement a generic, Hoopl-esque analysis library on top of that. (Or more accurately, that is my plan for the next six months - I've been sidetracked getting parsing to work nicely with an effect-based stack!)

 

So, if Hoopl2 does become a thing, I'd be very keen on working on it, but if I were to actually use it myself, it'd probably require a complete rewrite. Fortunately, it's a pretty small library; and for GHC, its current usage is a pretty straightforward use case which shouldn't be affected too much. That being said, if GHC were to make more use of Hoopl (e.g. moving some of the optimisations on Core to be Hoopl-based passes) then it would be a different story.

 

So I guess I'm volunteering to do the rewrite for a potential Hoopl2 if it's wanted, as I'm about to do pretty much that anyway.

 

Cheers,

Sophie

 

 

 

On Fri, 9 Jun 2017 at 22:31 Michal Terepeta <[hidden email]> wrote:

> On Fri, Jun 9, 2017 at 9:50 AM Simon Peyton Jones <[hidden email]> wrote:

> > Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?

> 

>  

> One reason only: because it makes Hoopl usable by compilers other than GHC.  And, dually, efforts by others to improve Hoopl will benefit GHC.

>  

> > If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?

> 

>  

> A re-usable library should be

> a)      a significant chunk of code,

> b)      that can plausibly be re-purposed by others

> c)      and that has an explicable API

>  

> I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold.  But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds.  It’s designed to be re-usable.  Whether it is actually re-used is another matter, of course.  But if it’s part of GHC, it can’t be.

 

I agree with your characterization of a re-usable library and that

Core optimizer would not be a good fit. But I do think that Hoopl also

has some problems with b) and c) (although smaller):

- Using an optimizer-as-a-library is not really common (I'm not aware

  of any compilers doing this, LLVM is to some degree close but it

  exposes the whole language as the interface so it's closer to the

  idea of extracting the whole Cmm backend). So I don't think the API

  for such a project is well understood.

- The API is pretty wide and does put serious constraints on the IR

  (after all it defines blocks and graphs), making reusability

  potentially more tricky.
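
Both sides of this exchange can be seen in a small sketch. A library can stay agnostic about the node type while still fixing how nodes are shaped (open or closed at entry and exit) and what they must expose, which is the kind of constraint on the IR mentioned above. The code is modeled loosely on Hoopl's `O`/`C` indices and `NonLocal` class; the `Node` type, the `Int` labels, and `edges` are assumptions made up for illustration, not code from GHC or Hoopl.

```haskell
{-# LANGUAGE GADTs #-}

-- Shape indices: a node is open (O) or closed (C) at its entry and exit.
data O
data C

-- Assumption: plain Int labels keep the sketch short.
type Label = Int

-- What the library needs from any client node type.
class NonLocal n where
  entryLabel :: n C x -> Label    -- label of a block's first node
  successors :: n e C -> [Label]  -- targets of a block's last node

-- A toy client IR; the generic code below never inspects these constructors.
data Node e x where
  Entry  :: Label -> Node C O
  Assign :: String -> Int -> Node O O
  Jump   :: Label -> Node O C
  Branch :: String -> Label -> Label -> Node O C

instance NonLocal Node where
  entryLabel (Entry l) = l
  successors (Jump l)       = [l]
  successors (Branch _ t f) = [t, f]

-- Generic code: control-flow edges of a block, from its first and last node.
edges :: NonLocal n => n C O -> n O C -> [(Label, Label)]
edges first lst = [(entryLabel first, s) | s <- successors lst]
```

Here `edges` works for any IR with a `NonLocal` instance, but only if that IR is expressed with the same open/closed indexing.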

 

So I think I understand your argument and we just disagree on whether

this is worth the effort of having a separate package.

 

>  

> [...]

>  

> > I've pointed multiple reasons why I think it has a significant cost.

> 

> Can you just summarise them again briefly for me?  If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.

 

Having even Hoopl2 as a separate package would still entail

additional work:

- Hoopl2 would still need to duplicate some concepts (e.g., `Unique`,

  etc. since it needs to be standalone)

- Understanding code (esp. by newcomers) would be harder: the Cmm

  backend would be split between GHC and Hoopl2, with the latter

  necessarily being far more general/polymorphic than needed by GHC.

- Getting the right performance in the presence of all this additional

  generality/polymorphism will likely require a fair amount of

  additional work.

- If Hoopl2 is used by other compilers, then we need to be more

  careful changing anything in incompatible ways; this will require

  more discussions & release coordination.

 

Considering that Hoopl was never actually picked up by other

compilers, I'm not convinced that this cost is justified. But I

understand that other people might have a different opinion.

So how about a compromise:

- decouple GHC from the current Hoopl (i.e., go ahead with my diff),

- keep everything Hoopl related only in `compiler/cmm/Hoopl` with the

  long-term intention of creating a separate package,

- experiment with and improve the code,

- once (if?) we're happy with the results, discuss what/how to

  extract to a separate package.

That gives us the freedom to try things out and see what works well

(I simply don't have ready solutions for anything, being able to

experiment is IMHO quite important). And once we reach the right

performance/representation/abstraction/API we can work on extracting

that.

 

What do you think?

 

Cheers,

Michal

 


RE: Removing Hoopl dependency?

Ben Gamari-2
In reply to this post by GHC - devs mailing list
Simon Peyton Jones via ghc-devs <[hidden email]> writes:

Snip
>
> That would leave Sophie free to do (B) free of the constraints of GHC
> depending on it; but we could always use it later.
>
> Does that sound plausible?  Do we know of any other Hoopl users?

CCing Ning, who is currently maintaining hoopl and I believe has some
projects using it.

Ning, you may want to have a look through this thread if you haven't
already seen it. You can find the previous messages in the list archive [1].

Cheers,

- Ben


[1] May messages: https://mail.haskell.org/pipermail/ghc-devs/2017-May/014255.html
    June messages: https://mail.haskell.org/pipermail/ghc-devs/2017-June/014293.html


Re: Removing Hoopl dependency?

Michal Terepeta
> On Mon, Jun 12, 2017 at 8:05 PM Ben Gamari <[hidden email]> wrote:
> Simon Peyton Jones via ghc-devs <[hidden email]> writes:
>
> Snip
> >
> > That would leave Sophie free to do (B) free of the constraints of GHC
> > depending on it; but we could always use it later.
> >
> > Does that sound plausible?  Do we know of any other Hoopl users?
>
> CCing Ning, who is currently maintaining hoopl and I believe has some
> projects using it.
>
> Ning, you may want to have a look through this thread if you haven't
> already seen it. You can find the previous messages in the list archive [1].
>
> Cheers,
>
> - Ben

Based on [1] there are four public packages:
- ethereum-analyzer,
- linearscan-hoopl,
- llvm-analysis,
- text-show-instances

But there might be more that are not open-source or not uploaded to Hackage/Stackage.

Cheers,
Michal


