I recently learned about the code for HIE files. This is quite a substantial new development in GHC, judging from the amount of code. I understand broadly why it's here, but I'd like to learn more specifics. For example, if I'm adding a new bit of syntax, how should I update HIE generation? What if I'm changing a bit of syntax? Is there a primer to all this?
ghc-devs mailing list
There is no proper write up for this yet, I will add my comments here to the
wiki page on HIE files (<https://gitlab.haskell.org/ghc/ghc/wikis/hie-files>)
and also as a Note in HieAst
When adding new syntax or changing a bit of syntax in HIE files, you need
to pay attention to the following things:
1) Symbols (Names/Vars/Modules) in the following categories:
a) Symbols that appear in the source file that directly correspond to
something the user typed
b) Symbols that don't appear in the source, but should be in some sense
"visible" to a user, particularly via IDE tooling or the like. This
includes things like the names introduced by RecordWildcards (We record
all the names introduced by a (..) in HIE files), and will include implicit
parameters and evidence variables after one of my pending MRs lands.
2) Subtrees that may contain such symbols
For 1), you need to call `toHie` for one of the following instances
instance ToHie (Context (Located Name)) where ...
instance ToHie (Context (Located Var)) where ...
instance ToHie (IEContext (Located ModuleName)) where ...
`Context` is a data type that looks like:
data Context a = C ContextInfo a -- Used for names and bindings
`ContextInfo` is defined in `HieTypes`, and looks like
= Use -- ^ regular variable
| IEThing IEType -- ^ import/export
-- | Value binding
BindType -- ^ whether or not the binding is in an instance
Scope -- ^ scope over which the value is bound
(Maybe Span) -- ^ span of entire binding
It is used to annotate symbols in the .hie files with some extra information on
the context in which they occur and should be fairly self explanatory. You need
to select one that looks appropriate for the symbol usage. In very rare cases,
you might need to extend this sum type if none of the cases seem appropriate.
If you select one that corresponds to a binding site, you will need to
provide a `Scope` and a `Span` for your binding. Both of these are basically
The `SrcSpan` in the `Scope` is supposed to span over the part of the source
where the symbol can be legally allowed to occur. For more details on how to
calculate this, see Note [Capturing Scopes and other non local information]
The binding `Span` is supposed to be the span of the entire binding for
For a function definition `foo`:
foo x = x + y
where y = x^2
This is the span of the entire function definition from `foo x` to `x^2`.
For a class definition, this is the span of the entire class, and so on.
If this isn't well defined for your bit of syntax (like a variable bound by
a lambda), then you can just supply a `Nothing`
There is a test that checks that all symbols in the resulting HIE file
occur inside their stated `Scope`. This can be turned on by passing the
-fvalidate-ide-info flag to ghc along with -fwrite-ide-info to generate the
You may also want to provide a test in testsuite/test/hiefile that includes
a file containing your new construction, and tests that the calculated scope
is valid (by using -fvalidate-ide-info)
For subtrees in the AST that may contain symbols, the procedure is fairly
straightforward. If you are extending the GHC AST, you will need to provide a
`ToHie` instance for any new types you may have introduced in the AST.
Here are is an extract from the `ToHie` instance for (LHsExpr (GhcPass p)):
toHie e@(L mspan oexpr) = concatM $ getTypeNode e : case oexpr of
HsVar _ (L _ var) ->
[ toHie $ C Use (L mspan var)
-- Patch up var location since typechecker removes it
HsConLikeOut _ con ->
[ toHie $ C Use $ L mspan $ conLikeName con
HsApp _ a b ->
[ toHie a
, toHie b
If your subtree is `Located` or has a `SrcSpan` available, the output list
should contain a HieAst `Node` corresponding to the subtree. You can use
either `makeNode` or `getTypeNode` for this purpose, depending on whether it
makes sense to assign a `Type` to the subtree. After this, you just need
to concatenate the result of calling `toHie` on all subexpressions and
appropriately annotated symbols contained in the subtree.
If your subtree doesn't have a span available, you can omit the `makeNode`
call and just recurse directly in to the subexpressions.
I can clarify any remaining questions you might have. If you are satisfied
with this write up, I can proceed to add it as a Note to HieAst.
Hope this helps,
On 19/07/22 18:09, Richard Eisenberg wrote:
> Hi devs,
> I recently learned about the code for HIE files. This is quite a substantial new development in GHC, judging from the amount of code. I understand broadly why it's here, but I'd like to learn more specifics. For example, if I'm adding a new bit of syntax, how should I update HIE generation? What if I'm changing a bit of syntax? Is there a primer to all this?
ghc-devs mailing list
This has been very helpful -- thanks! There are a few questions that I have, but this email contains the general info I'm looking for, and I think going through the normal review process will be better than creating an email chain here. My one overall suggestion is that it might be most useful for the Note to begin with instructions to those who edit the GHC AST -- that will be your main audience. Then, perhaps, someone will need to extend the HIE AST as well, and there should be further instructions.
I have made #16975 (https://gitlab.haskell.org/ghc/ghc/issues/16975) to track progress of adding this Note.
ghc-devs mailing list
|Free forum by Nabble||Edit this page|