Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs


Shayan Najd Javadipour-2
In this thread, I am going to raise a topic for discussion. Please share your opinions and related experiences.

Evaluating the type families within HsSyn ASTs, such as `PostTc`, at a fixed phase index, such as `GhcPs`, gives us distinct ASTs at compile time.
However, when programming with these ASTs, we use patterns that are shared among phases, such as `HsMultiIf :: PostTc p Type -> [LGRHS p (LHsExpr p)] -> HsExpr p`.
We can
(1) introduce a layer of abstraction providing a set of type and pattern synonyms specific to each phase, such as `PsMultiIf :: [LPsGRHS  LPsExpr] -> PsExpr`; and
(2) update code working on ASTs of a specific phase to use the interface specific to that phase, such as by changing prefixes from `Hs` to `Ps` and by removing unused variables and placeholders; and
(3) leave untouched the code working uniformly on ASTs of different phases (i.e., the generic functions in Trees that Grow terminology), such as the existing functions whose types are polymorphic in the phase index.
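
As a rough illustration of step (1), here is a minimal sketch on a toy AST, not GHC's real HsSyn. The phase indices `Ps`, `Rn`, `Tc`, the `PlaceHolder` type, the simplified `HsExpr`, and the helpers `psAltCount` and `tcResultType` are all assumptions made up for this example; only the names `PostTc`, `HsMultiIf` and `PsMultiIf` come from the message above, and their shapes here are simplified.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE TypeFamilies #-}

-- Toy phase indices standing in for GhcPs, GhcRn and GhcTc (assumption).
data Ps
data Rn
data Tc

-- Toy stand-ins for GHC's Type and for the placeholder value (assumption).
newtype Type = Type String deriving (Eq, Show)
data PlaceHolder = PlaceHolder

-- In the spirit of PostTc: a real value exists only after type checking.
type family PostTc p a where
  PostTc Ps a = PlaceHolder
  PostTc Rn a = PlaceHolder
  PostTc Tc a = a

-- One constructor set shared by all phases, indexed by the phase.
data HsExpr p
  = HsLit Int
  | HsMultiIf (PostTc p Type) [(HsExpr p, HsExpr p)]

type PsExpr = HsExpr Ps
type TcExpr = HsExpr Tc

-- Phase-specific views of the shared constructor. Both synonyms are
-- implicitly bidirectional, so they work as patterns and as builders.
pattern PsMultiIf :: [(PsExpr, PsExpr)] -> PsExpr
pattern PsMultiIf alts = HsMultiIf PlaceHolder alts

pattern TcMultiIf :: Type -> [(TcExpr, TcExpr)] -> TcExpr
pattern TcMultiIf ty alts = HsMultiIf ty alts

-- Parser-side code never mentions the placeholder (hypothetical helper).
psAltCount :: PsExpr -> Int
psAltCount (PsMultiIf alts) = length alts
psAltCount _                = 0

-- Type-checker-side code sees the real type (hypothetical helper).
tcResultType :: TcExpr -> Maybe Type
tcResultType (TcMultiIf ty _) = Just ty
tcResultType _                = Nothing
```

Code that still matches on `HsMultiIf` directly keeps compiling alongside the synonyms, which is what makes the interface optional in this sketch.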

Some comments:

- It can be done gradually and smoothly: we add three separate files in HsSyn (one per phase) containing the phase-specific interfaces, then gradually import them and make the changes module by module.
- Using the interfaces is optional: code using the current method (e.g., using `HsMultiIf`) should work just fine.
- It introduces a layer of indirection and three more files to maintain.
- It makes code working on HsSyn ASTs, such as the renamer, appear cleaner, as placeholders and similar machinery are abstracted away by the interfaces (e.g., no need to import bits and pieces of `HsExtension`).
- In theory, there should be zero impact on GHC's runtime performance.

I am myself undecided about its benefit-cost ratio, but I am willing to at least implement the phase-specific interfaces.
For me, abstracting away all the `PostRn` stuff, the `Out`-prefixed constructors, and the dummy placeholders from the front-end code is the most valuable part.

Yours,
  Shayan

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

RE: Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

GHC - devs mailing list

I’m keen NOT to introduce these layers of indirection.  I think they make the code harder to understand.

Can you give an example function or two, and what it would look like under the different approaches?

(1)-(3) appear to be three different approaches, but I don’t think that’s what you intend.  I think there are only two: add the indirection layer or not?

 

S

 

From: Shayan Najd [mailto:[hidden email]]
Sent: 23 August 2017 13:26
To: [hidden email]
Cc: Simon Peyton Jones <[hidden email]>; Alan & Kim Zimmerman <[hidden email]>
Subject: Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

 


Re: Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

Shayan Najd Javadipour-2
> (1)-(3) appear to be three different approaches, but I don’t think that’s what you intend.  I think there are only two: add the indirection layer or not?

(1)-(3) are just the steps to take if we do choose to add the indirection layer: add the layer, and make the changes when desired.
If we choose not to add the indirection layer, nothing needs to be changed and the internals of the encoding (`PostTc`, placeholders, etc.) remain visible in the code.

> Can you give an example function or two, and what it would look like under the different approaches?
 
For example, the function clause 

> rnExpr (HsMultiIf _ty alts)
>  = do { (alts', fvs) <- mapFvRn (rnGRHS IfAlt rnLExpr) alts
>       ; return (HsMultiIf placeHolderType alts', fvs) }

becomes

> rnExpr (PsMultiIf alts)
>  = do { (alts', fvs) <- mapFvRn (rnGRHS IfAlt rnLExpr) alts
>       ; return (RnMultiIf alts', fvs) }

I hope it clarifies what I mean a bit.

There is always a choice about how distinct we want the phases to be.
The more distinct they are, the stronger the static guarantees. The code also gets clearer in a way, e.g., `RnMultiIf` is talking about a renamed expression, `PsMultiIf` about a parsed expression, while `HsMultiIf` is talking about an expression of any phase.
At the same time, distinctness means more work for the programmer.
Also, such a distinction sometimes implies a pedagogic burden, as readers now have to learn about more than one AST. However, this burden is very low here thanks to the prefixing convention: `PsMultiIf` and `RnMultiIf` are easily understood to represent the same thing in different phases.
Finally, such distinctions often lead to code duplication. But in our case, the Trees that Grow machinery saves us from such duplication: we have the same base ASTs, and we can write generic programs over the base ASTs anytime we want (point/step (3) above).
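
To make the trade-off concrete, the following is a small self-contained sketch, again on a toy AST rather than GHC's real one. It puts a clause in the current style (`rnExprHs`), the same clause in the proposed style (`rnExprPs`), and a phase-polymorphic function (`exprSize`, in the sense of step (3)) side by side. All names other than `PostTc`, `HsMultiIf`, `PsMultiIf` and `RnMultiIf` are hypothetical, and the "renamer" here only collects variable names as a stand-in for free variables.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE TypeFamilies #-}

-- Toy phase indices and placeholder (assumptions, not GHC's definitions).
data Ps
data Rn
data PlaceHolder = PlaceHolder
newtype Type = Type String

type family PostTc p a where
  PostTc Ps a = PlaceHolder
  PostTc Rn a = PlaceHolder

data HsExpr p
  = HsVar String
  | HsMultiIf (PostTc p Type) [(HsExpr p, HsExpr p)]

type PsExpr = HsExpr Ps
type RnExpr = HsExpr Rn

pattern PsMultiIf :: [(PsExpr, PsExpr)] -> PsExpr
pattern PsMultiIf alts = HsMultiIf PlaceHolder alts

pattern RnMultiIf :: [(RnExpr, RnExpr)] -> RnExpr
pattern RnMultiIf alts = HsMultiIf PlaceHolder alts

-- Shared helper: rename every alternative and collect the variables seen.
rnAlts :: (PsExpr -> (RnExpr, [String]))
       -> [(PsExpr, PsExpr)]
       -> ([(RnExpr, RnExpr)], [String])
rnAlts rn alts =
  let renamed = [ (rn c, rn e) | (c, e) <- alts ]
  in ( [ (c', e') | ((c', _), (e', _)) <- renamed ]
     , concat [ fc ++ fe | ((_, fc), (_, fe)) <- renamed ] )

-- Current style: the placeholder is matched and rebuilt by hand.
rnExprHs :: PsExpr -> (RnExpr, [String])
rnExprHs (HsVar v) = (HsVar v, [v])
rnExprHs (HsMultiIf _ty alts) =
  let (alts', fvs) = rnAlts rnExprHs alts
  in (HsMultiIf PlaceHolder alts', fvs)

-- Proposed style: the phase-specific synonyms hide the placeholder.
rnExprPs :: PsExpr -> (RnExpr, [String])
rnExprPs (HsVar v) = (HsVar v, [v])
rnExprPs (PsMultiIf alts) =
  let (alts', fvs) = rnAlts rnExprPs alts
  in (RnMultiIf alts', fvs)

-- Generic code: polymorphic in the phase index, untouched by the proposal.
exprSize :: HsExpr p -> Int
exprSize (HsVar _)          = 1
exprSize (HsMultiIf _ alts) =
  1 + sum [ exprSize c + exprSize e | (c, e) <- alts ]
```

One caveat of this sketch: GHC's pattern-match checker treats pattern synonyms opaquely, so a function like `rnExprPs` may be reported as non-exhaustive under `-Wincomplete-patterns` unless a `COMPLETE` pragma is supplied.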

Thanks,
  Shayan

On Thu, Aug 24, 2017 at 3:35 PM, Simon Peyton Jones <[hidden email]> wrote:


RE: Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

GHC - devs mailing list

Hmm I see. I still prefer the concrete form (no intermediate layer).   Many constructors use record fields, so you can just omit the ones that aren’t valid.
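
One possible reading of the record-field suggestion, sketched on a toy AST: the field names `mi_ty` and `mi_alts` are hypothetical (they are not taken from GHC), as are the phase indices and the `PlaceHolder` type. In a record pattern the phase-dependent field can simply be left out; in this sketch the constructor application still supplies the placeholder explicitly.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Toy phase indices and placeholder (assumptions, not GHC's definitions).
data Ps
data Rn
data PlaceHolder = PlaceHolder
newtype Type = Type String

type family PostTc p a where
  PostTc Ps a = PlaceHolder
  PostTc Rn a = PlaceHolder

data HsExpr p
  = HsVar String
  | HsMultiIf
      { mi_ty   :: PostTc p Type            -- hypothetical field name
      , mi_alts :: [(HsExpr p, HsExpr p)]   -- hypothetical field name
      }

-- The pattern names only the field it needs; mi_ty is not mentioned.
rnExpr :: HsExpr Ps -> HsExpr Rn
rnExpr (HsVar v) = HsVar v
rnExpr (HsMultiIf { mi_alts = alts }) =
  HsMultiIf { mi_ty   = PlaceHolder
            , mi_alts = [ (rnExpr c, rnExpr e) | (c, e) <- alts ] }
```

Leaving a lazy field such as `mi_ty` out of the construction as well would still compile, but GHC warns about it (`-Wmissing-fields`) and the field is undefined if it is ever forced, so the sketch keeps the placeholder there.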

 

Simon

 

 

From: Shayan Najd [mailto:[hidden email]]
Sent: 24 August 2017 15:39
To: Simon Peyton Jones <[hidden email]>
Cc: Alan & Kim Zimmerman <[hidden email]>; [hidden email]
Subject: Re: Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

 
