Hadrian Transitive Dependencies

Hadrian Transitive Dependencies

David Eichmann

Hello Shake/Hadrian contributors and the like,

Recently I've been putting Hadrian's fsatrace linting feature to good use, tracking down missing dependencies in Hadrian. Ultimately, we want to use Shake's cloud build / shared cache feature and ensure it works across CI builds, which is very desirable for improving CI build times. Unfortunately the feature isn't working smoothly with Hadrian yet: see #16295. It is my understanding that in order to get caching to work:

1. All accessed files must be declared with `need` AND
2. All created files must be declared with `produces` (or be the target of the build rule)

Is my understanding correct? Or is there a weaker condition (perhaps only 2 is necessary)?

If I'm correct, this amounts to fixing all fsatrace lint errors. See here for a breakdown of lint errors / missing dependencies. A large portion of these are Haskell interface files (i.e. *.hi files). Before building a Haskell object file, dependencies are discovered via `ghc` using the `-M -include-pkg-deps` options. Unfortunately, Shake's fsatrace linting complains about other *.hi files being accessed! For example, when building `stage1/libraries/mtl/build/Control/Monad/RWS/Class.o` we get the following dependencies from ghc:

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : libraries/mtl/Control/Monad/RWS/Class.hs
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Prelude.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Monoid.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Strict.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Lazy.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Identity.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Maybe.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Except.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Error.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Writer/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/State/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Reader/Class.hi
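
To make the shape of this output concrete, here is a minimal Python sketch of how such makefile-style `target : dependency` lines could be read into a map from each target to its direct dependencies. It is only an illustration and not Hadrian's actual parser; the function name `parse_ghc_m` is hypothetical. Paths are normalised so that segments like `lib/../lib` collapse.

```python
import posixpath
from collections import defaultdict


def parse_ghc_m(text):
    """Parse makefile-style 'target : dependency' lines into {target: [deps]}.

    Hypothetical helper for illustration only; not Hadrian's parser.
    """
    deps = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and makefile comments
        target, sep, dep = line.partition(" : ")
        if not sep:
            continue  # not a dependency line
        # normpath collapses redundant segments such as 'lib/../lib'
        deps[posixpath.normpath(target)].append(posixpath.normpath(dep.strip()))
    return dict(deps)


sample = """\
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : libraries/mtl/Control/Monad/RWS/Class.hs
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Prelude.hi
"""
parsed = parse_ghc_m(sample)
```

Normalising the paths matters because the lint output would otherwise never match a dependency written as `lib/../lib/...`.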

And Shake complains about the following missing dependencies:

_build/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _build/stage1/lib/package.conf.d' '-this-unit-id mtl-2.2.2' '-package-id base-4.13.0.0' '-package-id transformers-0.5.5.0' -i -i_build/stage1/libraries/mtl/build -i_build/stage1/libraries/mtl/build/autogen -ilibraries/mtl/. -Iincludes -I_build/generated -I_build/stage1/libraries/mtl/build -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/rts-1.0/include -I_build/generated -optc-I_build/generated -optP-include -optP_build/stage1/libraries/mtl/build/autogen/cabal_macros.h -outputdir _build/stage1/libraries/mtl/build -Wnoncanonical-monad-instances -optc-Werror=unused-but-set-variable -optc-Wno-error=inline -c libraries/mtl/Control/Monad/RWS/Class.hs -o _build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o -O2 -H32m -Wall -fno-warn-unused-imports -fno-warn-warnings-deprecations -Wcompat -Wnoncanonical-monad-instances -Wnoncanonical-monadfail-instances -XHaskell2010 -XSafe -ghcversion-file=/home/david/MEGA/File_Dump/Well-Typed/GHC/_nosync_git/ghc/_build/generated/ghcversion.h -Wno-deprecated-flags
Lint checking error - _build/HEAD_default/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o - 22 values were used but not depended upon:
  Used:  _build/HEAD_default/stage0/lib/settings
  Used:  _build/HEAD_default/stage0/lib/platformConstants
  Used:  _build/HEAD_default/stage0/lib/llvm-targets
  Used:  _build/HEAD_default/stage0/lib/llvm-passes
  Used:  _build/HEAD_default/stage0/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Float.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Base.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Types.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Maybe.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Reader.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/List.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Cont.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Tuple.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/IO/Exception.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/GHC/Integer/Type.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Either.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Natural.hi
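
Conceptually, a lint error like the one above is a set difference: the files the traced command actually read, minus the files the rule declared with `need`. The Python sketch below shows that idea only. The function name and the shortened file names are hypothetical, and it does not reflect how Shake implements its lint check internally.

```python
def used_but_not_needed(needed, accessed):
    """Return accessed paths that were never declared, preserving trace order.

    `needed` is what the rule declared; `accessed` is what a file-access
    tracer reported. Both are plain lists of paths (hypothetical model).
    """
    needed_set = set(needed)
    return [path for path in accessed if path not in needed_set]


# Abbreviated, hypothetical paths standing in for the real ones in the log.
needed = ["base/Prelude.hi", "base/Data/Monoid.hi"]
accessed = [
    "base/Prelude.hi",
    "base/GHC/Base.hi",      # read by GHC, not reported by `ghc -M`
    "base/Data/Monoid.hi",
    "ghc-prim/GHC/Types.hi",  # read by GHC, not reported by `ghc -M`
]
missing = used_but_not_needed(needed, accessed)
```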

GHC is reading *.hi files that are not reported as dependencies by `ghc -M -include-pkg-deps`. This is because they are not direct, but transitive dependencies! How do we fix these lint errors (again with the goal of using Shake's shared cache feature)? Some ideas:

* Wildly over-approximate dependencies. This may be easier to implement but may cause unneeded recompilation (when a false dependency changes). Either:
    * `need` all dependent packages' interface files recursively, as well as the transitive dependencies reported by `ghc -M -include-pkg-deps` within the current package, OR
    * `need` all transitive dependencies reported by `ghc -M -include-pkg-deps`. This will likely result in fewer dependencies but requires a bit more work in recovering dependent packages' dependency graphs.
* Perhaps transitive dependencies are not important for shared caching to work. Change Shake's linting feature to allow (untracked?) transitive dependencies to be accessed.
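
The second over-approximation amounts to taking the transitive closure of the direct-dependency graph. A rough Python sketch, assuming the direct dependencies are already available as a map from each file to the files it directly depends on (the module names here are hypothetical):

```python
def transitive_deps(direct, root):
    """Return every file reachable from `root` through direct dependencies.

    `direct` maps a file to the list of files it directly depends on.
    The root itself is not included in the result.
    """
    seen = set()
    stack = list(direct.get(root, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited; also protects against cycles
        seen.add(node)
        stack.extend(direct.get(node, ()))
    return seen


# Hypothetical graph: M imports A and D; A's interface depends on B; B on C.
direct = {
    "M.o": ["A.hi", "D.hi"],
    "A.hi": ["B.hi"],
    "B.hi": ["C.hi"],
}
closure = transitive_deps(direct, "M.o")
```

Here `closure` contains `C.hi` even if compiling M never opens it, which is exactly the kind of false dependency that would cause unneeded recompilation.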

Feedback would be greatly appreciated.

David Eichmann


-- 
David Eichmann, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com

Registered in England & Wales, OC335890
118 Wymering Mansions, Wymering Road, London W9 2NF, England 

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
RE: Hadrian Transitive Dependencies

Andrey Mokhov

Hi David,

 

We had a discussion about this with Neil some time ago, and I think we had the following list of progressively more complex invariants for different types of build systems:

 

* Non-cloud build systems: *all direct inputs must be declared*. If you miss a direct input dependency then a build may complete successfully but with an incorrect result.
* Cloud build systems: *all direct inputs and direct outputs must be declared*. If you miss a direct output then a build may fail because the cloud will not be able to restore the corresponding output.
* Cloud build systems with shallow (deferred) materialisation of build artefacts: *all transitive inputs and direct outputs must be declared*. Let’s say you’d like to download the resulting GHC binary directly, without materialising any intermediate artefacts. Then you’ll need to know GHC’s ultimate transitive inputs.
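
The first two invariants can be illustrated with a toy shared cache in Python. This is a sketch under loud assumptions: the cache key is modelled as a hash of the command line plus the contents of the *declared* inputs, and only *declared* outputs are stored. All names (`cache_key`, `SharedCache`, the file names) are hypothetical, and this is not a description of how Shake's shared cache is implemented.

```python
import hashlib


def cache_key(command, declared_inputs, files):
    """Hash the command and the contents of declared inputs only."""
    h = hashlib.sha256(command.encode())
    for path in sorted(declared_inputs):
        h.update(b"\0" + path.encode() + b"\0" + files[path])
    return h.hexdigest()


class SharedCache:
    """Toy cache mapping a key to the declared outputs of a rule."""

    def __init__(self):
        self._store = {}

    def put(self, key, outputs):
        self._store[key] = dict(outputs)

    def get(self, key):
        return self._store.get(key)


command = "ghc -c M.hs"
declared = ["M.hs", "A.hi"]  # B.hi is read too, but never declared
files = {"M.hs": b"module M", "A.hi": b"iface A v1", "B.hi": b"iface B v1"}

cache = SharedCache()
key_before = cache_key(command, declared, files)
# The rule wrote M.o and M.hi, but only M.o was declared as an output.
cache.put(key_before, {"M.o": b"object code"})

# Missing input: B.hi changes, yet the key is unchanged (stale hit).
files["B.hi"] = b"iface B v2"
key_after = cache_key(command, declared, files)

# Missing output: a restore brings back M.o but not M.hi.
restored = cache.get(key_after)
```

In this model a missing input gives a stale cache hit, and a missing output gives a restore that lacks `M.hi`, so a later rule that needs it would fail.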

 

I think for now we are really keen to make Hadrian a cloud build system, but whether shallow builds are valuable enough is not clear. Maybe not. Therefore, I’d say we don’t need to track transitive inputs right now. Furthermore, if we were to track all transitive inputs, we would lose the desirable early cutoff property, which prevents rebuilding after adding a comment in a file on which a lot of other files transitively depend.

 

Having said that, if we really access a file during compilation, then I think it is *not* a transitive dependency by definition! Any file which is accessed during a build rule is a direct dependency.

 

> GHC is reading *.hi files that are not reported as dependencies by
> `ghc -M -include-pkg-deps`. This is because they are not direct, but transitive
> dependencies!

 

So, here I’m confused. If we read a file A when compiling a file B, then it’s by definition a direct dependency. Perhaps we just read too much? Maybe the solution is to switch to fine-grained `ghc -M` mode, to analyse import dependencies for a single module instead of doing it transitively, which I believe was discussed in a ticket some time ago? I can’t find this ticket, but I think Alp was looking into it at some point. Alp: do you remember it?

 

Thank you for all your work on Hadrian!

 

Cheers,

Andrey

 

From: David Eichmann [mailto:[hidden email]]
Sent: 27 March 2019 12:54
To: Neil Mitchell <[hidden email]>; Andrey Mokhov <[hidden email]>; GHC developers <[hidden email]>
Subject: Hadrian Transitive Dependencies

 

RE: Hadrian Transitive Dependencies

GHC - devs mailing list
In reply to this post by David Eichmann

The underlying thing I don’t understand is this:

 

  • I assume that the shared cloud cache (SCC) maps input files + command line to outputs.

 

  • To have a SCC we really have to list all the input files that a compilation step consults – and that may be many more than the direct imports of the module.   If we compile M which imports A, we will read A.hi; but then we may now have to read B.hi and so on.  Let’s say that M has a deep dependency on B.hi.

 

Listing all the inputs is a soundness issue: if the SCC is simply a cached map from those inputs to outputs, then missing out an input would be fatal (i.e. unsound).  It’s sound to list too many inputs, but doing so will reduce the usefulness of the cache.

 

  • In contrast, to get a correct Hadrian build, it suffices to list all the direct imports (= shallow dependencies) of the thing being compiled.  We’ll bring those up to date and, by implication, all the things they depend on will now also be up to date.

 

Listing only the direct imports is much less onerous; that’s what ghc -M does.

 

  • Early cutoff has something in common with the SCC.   E.g. if we compile A and produce an identical A.hi, we still need to recompile M because B.hi has changed.   GHC already accommodates this by putting B.hi’s fingerprint in A.hi, so if B.hi changes then so will A.hi.

 

So maybe we don’t need to record those transitive dependencies in the SCC?
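
The fingerprint argument can be sketched in Python as follows. The model is deliberately crude and is an assumption for illustration: an interface's fingerprint hashes its own exported declarations together with the fingerprints of the interfaces it refers to. GHC's real fingerprinting is finer-grained than this, and the function name and declarations below are hypothetical.

```python
import hashlib


def interface_fingerprint(exports, dep_fingerprints):
    """Hash an interface's exports together with its dependencies' fingerprints."""
    h = hashlib.sha256(exports.encode())
    for name in sorted(dep_fingerprints):
        h.update(name.encode() + b"=" + dep_fingerprints[name].encode())
    return h.hexdigest()


# B's interface before and after a real change to the type T.
b_v1 = interface_fingerprint("data T = MkT Int", {})
b_v2 = interface_fingerprint("data T = MkT Int Bool", {})
# B's source gained only a comment: the exports, and so the fingerprint, are unchanged.
b_comment = interface_fingerprint("data T = MkT Int", {})

# A's own exports never change, but A.hi records B's fingerprint.
a_exports = "f :: B.T -> Int"
a_v1 = interface_fingerprint(a_exports, {"B": b_v1})
a_v2 = interface_fingerprint(a_exports, {"B": b_v2})
a_comment = interface_fingerprint(a_exports, {"B": b_comment})
```

Under this model, a cache keyed only on the shallow dependency A.hi still notices a change in the deep dependency B.hi (`a_v1` differs from `a_v2`), while a comment-only edit leaves everything unchanged and early cutoff is preserved.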

 

It would be extremely onerous to write Hadrian code to make all deep dependencies explicit.  Do we really need to?

 

Simon

 

RE: Hadrian Transitive Dependencies

GHC - devs mailing list
In reply to this post by Andrey Mokhov

> Having said that, if we really access a file during compilation, then I think it is *not* a transitive dependency by definition! Any file which is accessed during a build rule is a direct dependency.

 

Well, ok, then we need a new definition!  Suppose

  • Module M imports A
  • A.hi mentions (in a type or unfolding) B.T
  • Hence, to find out about B.T we need to read B.hi

So you would say that B is a direct dependency of M.   Fine.  I’ll call B.hi a “deep dependency” of M, but A.hi is a “shallow dependency” of M.

 

But it’s very hard to predict deep dependencies.  Whether M deep-depends on B.hi depends on

  • the contents of A.hi
  • the optimisation level at which M is compiled
  • perhaps even the contents of some apparently unrelated X.hi, directly imported by M, which influences the extent to which GHC’s optimiser dives into A’s exports

 

So no ghc -M is going to do the job.  To get the real answer you must compile M!
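
A toy Python model of why the set of files read cannot be predicted from M's imports alone. Everything here is a hypothetical simplification: each interface is modelled as listing the modules mentioned in its types and in its unfoldings, and the "compiler" follows unfoldings only when optimising. Real GHC behaviour is far more subtle.

```python
def hi_files_read(imports, interfaces, optimise):
    """Simulate which .hi files a compile of M would open.

    `imports` are M's direct imports; `interfaces` maps a module name to
    {"types": [...], "unfoldings": [...]}, the modules its interface mentions.
    """
    read = []
    todo = list(imports)
    while todo:
        mod = todo.pop(0)
        if mod in read:
            continue
        read.append(mod)
        iface = interfaces.get(mod, {})
        todo.extend(iface.get("types", []))
        if optimise:
            # Only an optimising compile looks inside unfoldings.
            todo.extend(iface.get("unfoldings", []))
    return [mod + ".hi" for mod in read]


# M imports only A; A's unfoldings mention B.
interfaces = {
    "A": {"types": [], "unfoldings": ["B"]},
    "B": {"types": [], "unfoldings": []},
}
reads_unoptimised = hi_files_read(["A"], interfaces, optimise=False)
reads_optimised = hi_files_read(["A"], interfaces, optimise=True)
```

With identical imports, the optimised compile reads B.hi and the unoptimised one does not, so the deep dependency is only known after the compile has run.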

 

Happily,

  • we already list the deep dependencies in M.hi
  • and, as I mentioned earlier, if a deep dependency changes (such as B.hi) then GHC guarantees that a shallow dependency (such as A.hi) will change.

 

Does that help?

 

Simon

From: ghc-devs <[hidden email]> On Behalf Of Andrey Mokhov
Sent: 27 March 2019 15:06
To: David Eichmann <[hidden email]>; Neil Mitchell <[hidden email]>
Cc: GHC developers <[hidden email]>
Subject: RE: Hadrian Transitive Dependencies

 

Hi David,

 

We had a discussion about this with Neil some time ago, and I think we had the following list of progressively more complex invariants for different types of build systems:

 

  • Non-cloud build systems: *all direct inputs must be declared*. If you miss a direct input dependency then a build may complete successfully but with an incorrect result.

 

  • Cloud build systems: *all direct inputs and direct outputs must be declared*. If you miss a direct output then a build may fail because the cloud will not be able to restore the corresponding output.

 

  • Cloud build systems with shallow (deferred) materialisation of build artefacts: *all transitive inputs and direct outputs must be declared*. Let’s say you’d like to download the resulting GHC binary directly, without materialising any intermediate artefacts. Then you’ll need to know GHC’s ultimate transitive inputs.

 

I think for now we are really keen to make Hadrian a cloud build system, but whether shallow builds are valuable enough is not clear. Maybe not. Therefore, I’d say we don’t need to track transitive inputs right now. Furthermore, if we were to track all transitive inputs, we would lose the desirable early cutoff property, which prevents rebuilding after adding a comment in a file on which a lot of other files transitively depend on.

 

Having said that, if we really access a file during compilation, then I think it is *not* a transitive dependency by definition! Any file which is accessed during a build rule is a direct dependency.

 

> GHC is reading *.hi files that are not reported as dependencies by

> `ghc -M  -include-pkg-deps`. This is because they are not direct, but transitive

> dependencies!

 

So, here I’m confused. If we read a file A when compiling a file B, then it’s by definition a direct dependency. Perhaps we just read too much? Maybe the solution is to switch to fine-grained `ghc -M` mode, to analyse import dependencies for a single module instead of doing it transitively, which I believe was discussed in a ticket some time ago? I can’t find this ticket, but I think Alp was looking into it at some point. Alp: do you remember it?

 

Thank you for all your work on Hadrian!

 

Cheers,

Andrey

 

From: David Eichmann [[hidden email]]
Sent: 27 March 2019 12:54
To: Neil Mitchell <[hidden email]>; Andrey Mokhov <[hidden email]>; GHC developers <[hidden email]>
Subject: Hadrian Transitive Dependencies

 

Hello Shake/Hadrian contributors and the like,

Recently I've been putting Hadrian's fsatrace linting feature to good use, tracking down missing dependencies in Hadrian. Ultimately, we want to use shake's cloud build / shared cache feature and ensure it works across CI builds. Unfortunately the feature isn't working smoothly with Hadrian: see #16295. This is very desirable to improve CI build times. It is my understanding that in order to get caching to work:

1. All accessed files must be declared with `need` AND
2. All created files must be declared with `produces` (or be the target of the build rule)

Is my understanding correct? Or is there a weaker condition (perhaps only 2 is necessary)?

If I'm correct, this amounts to fixing all fsatrace lint errors. See here for a breakdown of lint errors / missing dependencies. A large portion of these are Haskell interface files (i.e. *.hi files). Before building a Haskell object file, dependencies are discovered via `ghc` using the `-M  -include-pkg-deps` options. Unfortunately, shake's fsatrace linting complains about other *.hi files being accessed! For example when building `stage1/libraries/mtl/build/Control/Monad/RWS/Class.o` we get the following dependencies from ghc:

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : libraries/mtl/Control/Monad/RWS/Class.hs
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Prelude.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Monoid.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Strict.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Lazy.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Identity.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Maybe.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Except.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Error.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Writer/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/State/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Reader/Class.hi

And shake complains of the following missing deps:

_build/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _build/stage1/lib/package.conf.d' '-this-unit-id mtl-2.2.2' '-package-id base-4.13.0.0' '-package-id transformers-0.5.5.0' -i -i_build/stage1/libraries/mtl/build -i_build/stage1/libraries/mtl/build/autogen -ilibraries/mtl/. -Iincludes -I_build/generated -I_build/stage1/libraries/mtl/build -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/rts-1.0/include -I_build/generated -optc-I_build/generated -optP-include -optP_build/stage1/libraries/mtl/build/autogen/cabal_macros.h -outputdir _build/stage1/libraries/mtl/build -Wnoncanonical-monad-instances -optc-Werror=unused-but-set-variable -optc-Wno-error=inline -c libraries/mtl/Control/Monad/RWS/Class.hs -o _build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o -O2 -H32m -Wall -fno-warn-unused-imports -fno-warn-warnings-deprecations -Wcompat -Wnoncanonical-monad-instances -Wnoncanonical-monadfail-instances -XHaskell2010 -XSafe -ghcversion-file=/home/david/MEGA/File_Dump/Well-Typed/GHC/_nosync_git/ghc/_build/generated/ghcversion.h -Wno-deprecated-flags
Lint checking error - _build/HEAD_default/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o - 22 values were used but not depended upon:
  Used:  _build/HEAD_default/stage0/lib/settings
  Used:  _build/HEAD_default/stage0/lib/platformConstants
  Used:  _build/HEAD_default/stage0/lib/llvm-targets
  Used:  _build/HEAD_default/stage0/lib/llvm-passes
  Used:  _build/HEAD_default/stage0/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Float.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Base.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Types.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Maybe.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Reader.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/List.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Cont.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Tuple.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/IO/Exception.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/GHC/Integer/Type.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Either.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Natural.hi

GHC is reading *.hi files that are not reported as dependencies by `ghc -M  -include-pkg-deps`. This is because they are not direct, but transitive dependencies! How do we fix these lint errors (again with the goal of using Shake's shared cache feature)? Some ideas:

* Wildly over-approximate dependencies. This may be easier to implement but would cause unneeded recompilation (when a false dependency changes). Either:
    * `need` all dependent packages' interface files recursively as well as transitive dependencies reported by `ghc -M  -include-pkg-deps` within the current package, OR
    * `need` all transitive dependencies reported by `ghc -M  -include-pkg-deps`. This will likely result in fewer dependencies but requires a bit more work in recovering dependent packages' dependency graphs.
* Perhaps transitive dependencies are not important for shared caching to work. Change Shake's linting feature to allow (untracked?) transitive dependencies to be accessed.

Feedback would be greatly appreciated.

David Eichmann

 

-- 
David Eichmann, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com
 
Registered in England & Wales, OC335890
118 Wymering Mansions, Wymering Road, London W9 2NF, England 

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: Hadrian Transitive Dependencies

Alp Mestanogullari-2
In reply to this post by Andrey Mokhov

https://gitlab.haskell.org/dashboard/issues?scope=all&utf8=%E2%9C%93&state=opened&search=%22ghc+-M%22 seems to suggest this never made it into a ticket. I searched the Hadrian GitHub repo as well, with no luck there either. It certainly came up in various discussions I've had, though.

-- 
Alp Mestanogullari, Haskell Consultant
Well-Typed LLP, https://www.well-typed.com/

Registered in England and Wales, OC335890
118 Wymering Mansions, Wymering Road, London, W9 2NF, England


Re: Hadrian Transitive Dependencies

Matthew Pickering
I also remember this discussion. I think it was in the context of optimising Hadrian.

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Hadrian Transitive Dependencies

David Eichmann
In reply to this post by Andrey Mokhov

Hello,

To reiterate some definitions, consider this scenario:

  • A.hs imports B.hs and B.hs imports C.hs
  • `ghc -M A.hs` reports that A.o depends on: A.hs, B.hi
  • `ghc -c A.hs` produces A.o and accesses A.hs, B.hi, and C.hi

There seems to be some confusion about the term "Direct Dependency", so I'll use these definitions:

"Shallow Dependency": With respect to a Haskell object file X.o, the shallow dependencies are the source file X.hs and the interface files Y.hi for all modules Y imported by X.

  • These are the dependencies of X.o as reported by `ghc -M X.hs`
  • In the above scenario:
    • A.o depends on: A.hs, B.hi

"Deep Dependency": With respect to a Haskell object file X.o, the deep dependencies are all .hi files required by ghc to build X.o, excluding the shallow dependencies:

  • This is a subset of modules transitively imported by X
  • These dependencies are NOT reported by `ghc -M X.hs`

"Direct Dependency": if the command to create file X accesses file Y, then X directly depends on Y (= Y is a direct dependency of X).

  • In the above scenario:
    • A.o directly depends on: A.hs, B.hi, and C.hi
  • SPJ noted that .hi files list direct dependencies.
  • The direct dependencies of a Haskell object file are the union of its shallow and deep dependencies.

"Direct Output": All files created by a rule.
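
To make the three definitions concrete, here is a minimal Haskell sketch of the A/B/C scenario. It is only a model: the `imports` table is hard-coded, and `direct` assumes that ghc opens the .hi file of every transitively imported module, whereas in practice it opens only a subset of them.

```haskell
import Data.List (nub, sort, (\\))

-- Toy import graph from the scenario: A imports B, and B imports C.
imports :: String -> [String]
imports "A" = ["B"]
imports "B" = ["C"]
imports _   = []

-- Shallow dependencies of X.o: X.hs plus Y.hi for every module Y imported by X.
-- This mirrors what `ghc -M X.hs` reports.
shallow :: String -> [FilePath]
shallow x = sort ((x ++ ".hs") : [y ++ ".hi" | y <- imports x])

-- Every module reachable through imports (assumes the graph is acyclic).
reachable :: String -> [String]
reachable x = nub (concat [y : reachable y | y <- imports x])

-- Direct dependencies of X.o: every file the compile command accesses.
-- Modelling assumption: all transitively imported interfaces are read.
direct :: String -> [FilePath]
direct x = sort ((x ++ ".hs") : [y ++ ".hi" | y <- reachable x])

-- Deep dependencies of X.o: accessed during compilation but not reported by `ghc -M`.
deep :: String -> [FilePath]
deep x = direct x \\ shallow x
```

For module A this gives A.hs and B.hi as shallow dependencies and C.hi as the only deep dependency, matching the scenario above.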

With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared" (where this agrees with the definitions above), can we do the following for the build rule of a Haskell object file X.o?

  1. `need` the shallow dependencies as reported by `ghc -M`. This guarantees that all shallow and deep dependencies (i.e. all direct dependencies) are built.
  2. build X.o and X.hi
  3. Inspect X.hi to derive the direct dependencies (and hence deep dependencies)
  4. `needed` the deep dependencies

Is there already an easy way to inspect *.hi files in this way? Is this use of `needed` valid?
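
As a rough illustration of the four steps, a Shake rule could look something like the sketch below. This is not Hadrian's actual rule. `interfaceDeps` is a hypothetical helper that is stubbed out here (one conceivable implementation would parse the output of `ghc --show-iface`), and the simple `"//*.o"` pattern, the plain `ghc` invocation, and the assumption that `ghc -M` names the target exactly as Shake does are all simplifications.

```haskell
import Development.Shake
import Development.Shake.FilePath ((-<.>))
import Development.Shake.Util (parseMakefile)

-- Hypothetical helper, not part of Shake or Hadrian: return the .hi files
-- that the freshly written interface file records as dependencies.
-- Stubbed out in this sketch, so step 4 below is a no-op as written.
interfaceDeps :: FilePath -> Action [FilePath]
interfaceDeps _hi = pure []

main :: IO ()
main = shakeArgs shakeOptions $
  "//*.o" %> \obj -> do
    let src = obj -<.> "hs"
        hi  = obj -<.> "hi"
    need [src]
    -- Step 1: ask `ghc -M` for the shallow dependencies and `need` them.
    makefile <- withTempFile $ \mk -> do
      cmd_ "ghc" ["-M", src, "-dep-makefile", mk]
      text <- liftIO (readFile mk)
      length text `seq` pure text  -- force the lazy read before the temp file is removed
    -- Assumes the target name written by ghc matches `obj` exactly.
    let shallow = concat [deps | (target, deps) <- parseMakefile makefile, target == obj]
    need shallow
    -- Step 2: build X.o and X.hi, declaring X.hi as an extra output.
    cmd_ "ghc" ["-c", src, "-o", obj, "-ohi", hi]
    produces [hi]
    -- Steps 3 and 4: derive the deep dependencies from X.hi and record them
    -- after the fact with `needed`.
    recorded <- interfaceDeps hi
    needed (filter (`notElem` shallow) recorded)
```

Whether `needed` is appropriate in step 4 is exactly the open question above; the sketch only shows where the call would sit.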

- David E


On 3/27/19 3:05 PM, Andrey Mokhov wrote:

Hi David,

We had a discussion about this with Neil some time ago, and I think we had the following list of progressively more complex invariants for different types of build systems:

  • Non-cloud build systems: *all direct inputs must be declared*. If you miss a direct input dependency then a build may complete successfully but with an incorrect result.

  • Cloud build systems: *all direct inputs and direct outputs must be declared*. If you miss a direct output then a build may fail because the cloud will not be able to restore the corresponding output.

  • Cloud build systems with shallow (deferred) materialisation of build artefacts: *all transitive inputs and direct outputs must be declared*. Let’s say you’d like to download the resulting GHC binary directly, without materialising any intermediate artefacts. Then you’ll need to know GHC’s ultimate transitive inputs.

I think for now we are really keen to make Hadrian a cloud build system, but whether shallow builds are valuable enough is not clear. Maybe not. Therefore, I’d say we don’t need to track transitive inputs right now. Furthermore, if we were to track all transitive inputs, we would lose the desirable early cutoff property, which prevents rebuilding after adding a comment to a file on which a lot of other files transitively depend.

Having said that, if we really access a file during compilation, then I think it is *not* a transitive dependency by definition! Any file which is accessed during a build rule is a direct dependency.

> GHC is reading *.hi files that are not reported as dependencies by `ghc -M  -include-pkg-deps`. This is because they are not direct, but transitive dependencies!

So, here I’m confused. If we read a file A when compiling a file B, then it’s by definition a direct dependency. Perhaps we just read too much? Maybe the solution is to switch to fine-grained `ghc -M` mode, to analyse import dependencies for a single module instead of doing it transitively, which I believe was discussed in a ticket some time ago? I can’t find this ticket, but I think Alp was looking into it at some point. Alp: do you remember it?

Thank you for all your work on Hadrian!

Cheers,
Andrey


RE: Hadrian Transitive Dependencies

GHC - devs mailing list

> With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared"

But I question that assumption. As I mentioned, with GHC at least, if a deep dependency changes then one of the shallow dependencies will change. So I claim that even for cloud build it should be enough to depend only on shallow dependencies.

This is only true because GHC offers this guarantee. We’d need to be sure that every deep dependency was either ‘needed’ or was reflected in the contents (perhaps via a fingerprint) of another ‘needed’ thing.
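
The guarantee can be pictured with a toy model in which each interface carries a fingerprint that folds in the fingerprints of the interfaces it imports. This is only a sketch: the `hash` function, the `Module` record and `ifaceFingerprint` are made up for illustration and do not reflect GHC's real, finer-grained fingerprinting, which only propagates changes that are visible through the importing module's interface.

```haskell
import Data.Char (ord)
import Data.List (foldl')

type Fingerprint = Int

-- Toy hash; a stand-in for GHC's real interface fingerprints.
hash :: String -> Fingerprint
hash = foldl' (\h c -> (h * 31 + ord c) `mod` 1000003) 7

-- A module: its name, its source text, and the modules it imports.
data Module = Module
  { modName    :: String
  , body       :: String
  , modImports :: [String]
  }

-- Fingerprint of M.hi: the module's own body combined with the fingerprints
-- of the interfaces it imports (assumes an acyclic import graph).
ifaceFingerprint :: [Module] -> String -> Fingerprint
ifaceFingerprint mods name =
  case [m | m <- mods, modName m == name] of
    (m : _) -> hash (body m ++ concatMap (show . ifaceFingerprint mods) (modImports m))
    []      -> 0

-- The A/B/C scenario before and after an edit to C only.
before, after :: [Module]
before = [Module "A" "a" ["B"], Module "B" "b" ["C"], Module "C" "c" []]
after  = [Module "A" "a" ["B"], Module "B" "b" ["C"], Module "C" "c2" []]

-- Does A's shallow dependency B.hi change when only C changes?
shallowChanged :: Bool
shallowChanged = ifaceFingerprint before "B" /= ifaceFingerprint after "B"
```

In this model a rule for A.o that depends only on B.hi still reruns after a change to C, because the change shows up in B.hi's fingerprint. The model propagates every change, so it over-approximates what GHC does.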

 

Simon

 


RE: Hadrian Transitive Dependencies

Andrey Mokhov
In reply to this post by David Eichmann
Simon's insight is great: if deep dependencies are captured by shallow dependencies then the cloud build system is correct even if only direct shallow inputs are tracked.

That's a very non-trivial invariant, and I guess this means we can't rely on fsatrace linting for GHC compilation rules, because all deep dependencies will be reported as untracked. 
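
The consequence for linting can be sketched as follows. This models the lint check as a plain set difference, which is an assumption about its behaviour rather than a description of Shake's implementation, and `lintErrorsIgnoringHi` is a purely hypothetical relaxation.

```haskell
import Data.List (isSuffixOf, (\\))

-- Files a rule accessed (as a tracer such as fsatrace would record) that were
-- never declared as dependencies: "used but not depended upon".
lintErrors :: [FilePath] -> [FilePath] -> [FilePath]
lintErrors accessed declared = accessed \\ declared

-- Hypothetical relaxation: exempt interface files from the check.
-- Note that this would also hide a genuinely missing shallow .hi dependency.
lintErrorsIgnoringHi :: [FilePath] -> [FilePath] -> [FilePath]
lintErrorsIgnoringHi accessed declared =
  filter (not . (".hi" `isSuffixOf`)) (lintErrors accessed declared)
```

With the A/B/C scenario, declaring only the shallow dependencies A.hs and B.hi while ghc also reads C.hi makes `lintErrors` report C.hi, which is the kind of report seen for the mtl example.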

Cheers, 
Andrey

On 27 Mar 2019 18:27, Simon Peyton Jones <[hidden email]> wrote:

With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared"

 

But I question that assumption.   As I mentioned, with GHC at least, the if a deep dependency changes then one of the shallow dependencies will change.  So I claim that even for cloud build it should be enough to depend only on shallow dependencies.

 

This is only true because GHC offers this guarantee.  We’d need to be sure that every deep dependency was either ‘needed’ or was reflected in the contents (perhaps via a fingerprint) another ‘needed’ thing.

 

Simon

 

From: ghc-devs <[hidden email]> On Behalf Of David Eichmann
Sent: 27 March 2019 17:12
To: Andrey Mokhov <[hidden email]>; Neil Mitchell <[hidden email]>
Cc: GHC developers <[hidden email]>
Subject: Re: Hadrian Transitive Dependencies

 

Hello,

To reiterate some definitions consider this scenario:

  • A.hs imports B.hs and B.hs imports C.hs
  • `ghc -M A.hs` reports that A.o depends on: A.hs, B.hi
  • `ghc -c A.hs` produces A.o and accesses A.hs, B.hi, and C.hi

There seems to be some confusion about the term "Direct Dependency" I'll use these definitions:

"Shallow Dependency": With respect to a haskell object file X.o, the shallow dependencies are the source file X.hs and interface files Y.hi for all modules Y imported by X.

  • These are the dependencies of X.o as reported by `ghc -M X.hs`
  • In the above scenario:
    • A.o depends on: A.hs, B.hi

"Deep Dependency": With respect to a haskell object file X.o, the deep dependencies are all hi files required by ghc to build X.o excluding direct dependencies:

  • This is a subset of modules transitively imported by X
  • These dependencies are NOT reported by `ghc -M X.hs`

"Direct Dependency": if the command to create file X accesses file Y, then X directly depends on Y (= Y is a direct dependency of X).

  • In the above scenario:
    • A.o directly depends on: A.hs, B.hi, and C.hi
  • SPJ noted that .hi files list direct dependencies.
  • The direct dependencies of a haskell object file is the union of its shallow and deep dependencies.

"Direct Output": All files created by a rule.

With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared" (where this agrees with the definitions above) can we do the following for the build rule of a haskell object file X.o?

  1. `need` the shallow dependencies as reported by `ghc -M`. This guarantees that all shallow and deep dependencies (i.e. all direct dependencies) are built.
  2. build X.o and X.hi
  3. Inspect X.hi to derive the direct dependencies (and hence deep dependencies)
  4. `needed` the deep dependencies

Is there already an easy way to inspect *.hi files in this way? Is this use of `needed` valid?

- David E

 

On 3/27/19 3:05 PM, Andrey Mokhov wrote:

Hi David,

 

We had a discussion about this with Neil some time ago, and I think we had the following list of progressively more complex invariants for different types of build systems:

 

  • Non-cloud build systems: *all direct inputs must be declared*. If you miss a direct input dependency then a build may complete successfully but with an incorrect result.

 

  • Cloud build systems: *all direct inputs and direct outputs must be declared*. If you miss a direct output then a build may fail because the cloud will not be able to restore the corresponding output.

 

  • Cloud build systems with shallow (deferred) materialisation of build artefacts: *all transitive inputs and direct outputs must be declared*. Let’s say you’d like to download the resulting GHC binary directly, without materialising any intermediate artefacts. Then you’ll need to know GHC’s ultimate transitive inputs.

 

I think for now we are really keen to make Hadrian a cloud build system, but whether shallow builds are valuable enough is not clear. Maybe not. Therefore, I’d say we don’t need to track transitive inputs right now. Furthermore, if we were to track all transitive inputs, we would lose the desirable early cutoff property, which prevents rebuilding after adding a comment in a file on which a lot of other files transitively depend on.

 

Having said that, if we really access a file during compilation, then I think it is *not* a transitive dependency by definition! Any file which is accessed during a build rule is a direct dependency.

 

> GHC is reading *.hi files that are not reported as dependencies by

> `ghc -M  -include-pkg-deps`. This is because they are not direct, but transitive

> dependencies!

 

So, here I’m confused. If we read a file A when compiling a file B, then it’s by definition a direct dependency. Perhaps we just read too much? Maybe the solution is to switch to fine-grained `ghc -M` mode, to analyse import dependencies for a single module instead of doing it transitively, which I believe was discussed in a ticket some time ago? I can’t find this ticket, but I think Alp was looking into it at some point. Alp: do you remember it?

 

Thank you for all your work on Hadrian!

 

Cheers,

Andrey

 

From: David Eichmann [[hidden email]]
Sent: 27 March 2019 12:54
To: Neil Mitchell [hidden email]; Andrey Mokhov [hidden email]; GHC developers [hidden email]
Subject: Hadrian Transitive Dependencies

 

Hello Shake/Hadrian contributors and the like,

Recently I've been putting Hadrian's fsatrace linting feature to good use, tracking down missing dependencies in Hadrian. Ultimately, we want to use shake's cloud build / shared cache feature and ensure it works across CI builds. Unfortunately the feature isn't working smoothly with Hadrian: see #16295. This is very desirable to improve CI build times. It is my understanding that in order to get caching to work:

1. All accessed files must be declared with `need` AND
2. All created files must be declared with `produces` (or be the target of the build rule)

Is my understanding correct? Or is there a weaker condition (perhaps only 2 is necessary)?

If I'm correct, this amounts to fixing all fsatrace lint errors. See here for a breakdown of lint errors / missing dependencies. A large portion of these are Haskell interface files (i.e. *.hi files). Before building a Haskell object file, dependencies are discovered via `ghc` using the `-M  -include-pkg-deps` options. Unfortunately, shake's fsatrace linting complains about other *.hi files being accessed! For example when building `stage1/libraries/mtl/build/Control/Monad/RWS/Class.o` we get the following dependencies from ghc:

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : libraries/mtl/Control/Monad/RWS/Class.hs
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Prelude.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Monoid.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Strict.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Lazy.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Identity.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Maybe.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Except.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Error.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Writer/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/State/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Reader/Class.hi

And shake complains of the following missing deps:

_build/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _build/stage1/lib/package.conf.d' '-this-unit-id mtl-2.2.2' '-package-id base-4.13.0.0' '-package-id transformers-0.5.5.0' -i -i_build/stage1/libraries/mtl/build -i_build/stage1/libraries/mtl/build/autogen -ilibraries/mtl/. -Iincludes -I_build/generated -I_build/stage1/libraries/mtl/build -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/rts-1.0/include -I_build/generated -optc-I_build/generated -optP-include -optP_build/stage1/libraries/mtl/build/autogen/cabal_macros.h -outputdir _build/stage1/libraries/mtl/build -Wnoncanonical-monad-instances -optc-Werror=unused-but-set-variable -optc-Wno-error=inline -c libraries/mtl/Control/Monad/RWS/Class.hs -o _build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o -O2 -H32m -Wall -fno-warn-unused-imports -fno-warn-warnings-deprecations -Wcompat -Wnoncanonical-monad-instances -Wnoncanonical-monadfail-instances -XHaskell2010 -XSafe -ghcversion-file=/home/david/MEGA/File_Dump/Well-Typed/GHC/_nosync_git/ghc/_build/generated/ghcversion.h -Wno-deprecated-flags
Lint checking error - _build/HEAD_default/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o - 22 values were used but not depended upon:
  Used:  _build/HEAD_default/stage0/lib/settings
  Used:  _build/HEAD_default/stage0/lib/platformConstants
  Used:  _build/HEAD_default/stage0/lib/llvm-targets
  Used:  _build/HEAD_default/stage0/lib/llvm-passes
  Used:  _build/HEAD_default/stage0/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Float.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Base.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Types.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Maybe.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Reader.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/List.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Cont.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Tuple.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/IO/Exception.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/GHC/Integer/Type.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Either.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Natural.hi

GHC is reading *.hi files that are not reported as dependencies by `ghc -M  -include-pkg-deps`. This is because they are not direct, but transitive dependencies! How do we fix these lint errors (again with the goal of using Shake's shared cache feature)? Some ideas:

* Wildly over-approximate dependencies. This may be easier to implement but would cause unneeded recompilation (when a false dependency changes). Either:
    * `need` all dependent packages' interface files recursively as well as transitive dependencies reported by `ghc -M  -include-pkg-deps` within the current package, OR
    * `need` all transitive dependencies reported by `ghc -M  -include-pkg-deps`. This will likely result in fewer dependencies but requires a bit more work in recovering dependent packages' dependency graphs.
* Perhaps transitive dependencies are not important for shared caching to work. Change Shake's linting feature to allow (untracked?) transitive dependencies to be accessed.

Feedback would be greatly appreciated.

David Eichmann

 

-- 
David Eichmann, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com
 
Registered in England & Wales, OC335890
118 Wymering Mansions, Wymering Road, London W9 2NF, England 


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: Hadrian Transitive Dependencies

David Eichmann

Ah! I see. This is a bit disappointing as it reduces the utility of fsatrace linting: the programmer is forced to decide whether shallow dependencies are sufficient (i.e. whether changes in deep dependencies always change shallow dependencies). Hopefully similar scenarios are rare. Perhaps the best step forward is to simply silence the linting using `trackAllow ["//*.hi"]` for Haskell object rules. Then I can continue tracking down other missing dependencies in Hadrian with fsatrace linting.
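
As a rough illustration of what silencing would leave behind, here is a minimal sketch in plain Haskell (no Shake involved). It takes a handful of the "Used:" paths from the lint report quoted further down and splits them into interface files and everything else. The names `usedButNotNeeded`, `isInterfaceFile` and `splitLint` are made up for this example, and the suffix test is only a crude stand-in for the `"//*.hi"` pattern, not how Shake matches patterns.

```haskell
import Data.List (isSuffixOf, partition)

-- A few of the "Used:" paths from the lint report quoted in this thread.
usedButNotNeeded :: [FilePath]
usedButNotNeeded =
  [ "_build/HEAD_default/stage0/lib/settings"
  , "_build/HEAD_default/stage0/lib/platformConstants"
  , "_build/HEAD_default/stage1/lib/package.conf.d/package.cache"
  , "_build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Base.hi"
  , "_build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Types.hi"
  ]

-- Hypothetical stand-in for the "//*.hi" pattern: any path ending in ".hi".
isInterfaceFile :: FilePath -> Bool
isInterfaceFile = (".hi" `isSuffixOf`)

-- Split a lint report into (silenced interface files, still-reported files).
splitLint :: [FilePath] -> ([FilePath], [FilePath])
splitLint = partition isInterfaceFile
```

On this sample, the second component of `splitLint usedButNotNeeded` keeps only `settings`, `platformConstants` and `package.cache`, which are the kind of missing dependencies that would still need a real fix.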

On 3/27/19 5:38 PM, Andrey Mokhov wrote:
Simon's insight is great: if deep dependencies are captured by shallow dependencies then the cloud build system is correct even if only direct shallow inputs are tracked.

That's a very non-trivial invariant, and I guess this means we can't rely on fsatrace linting for GHC compilation rules, because all deep dependencies will be reported as untracked. 

Cheers, 
Andrey

On 27 Mar 2019 18:27, Simon Peyton Jones [hidden email] wrote:

> With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared"

 

But I question that assumption.   As I mentioned, with GHC at least, if a deep dependency changes then one of the shallow dependencies will change.  So I claim that even for cloud build it should be enough to depend only on shallow dependencies.

 

This is only true because GHC offers this guarantee.  We’d need to be sure that every deep dependency was either ‘needed’ or was reflected in the contents (perhaps via a fingerprint) of another ‘needed’ thing.
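
To make the shape of that guarantee concrete, here is a deliberately crude toy model in plain Haskell. It is not how GHC computes interface fingerprints (GHC's real hashing is finer-grained); `Mod`, `fingerprint`, `moduleB` and `shallowInputChanged` are all hypothetical names. Each module's "fingerprint" is its own interface text followed by the fingerprints of its imports, so a change in C is visible in B's fingerprint even though a rule for A.o only ever looks at B.

```haskell
-- A toy module: its own interface text and the modules it imports.
data Mod = Mod { ownInterface :: String, importsOf :: [Mod] }

-- Hypothetical fingerprint: own interface text plus the fingerprints
-- of all imported modules, nested in parentheses.
fingerprint :: Mod -> String
fingerprint m =
  ownInterface m ++ "(" ++ concatMap fingerprint (importsOf m) ++ ")"

-- Module B from the scenario in this thread (A imports B, B imports C),
-- parameterised by the interface text of C.
moduleB :: String -> Mod
moduleB cText = Mod "B" [Mod cText []]

-- A.o's rule only depends on B's interface. Does a change to C still
-- show up in that shallow input?
shallowInputChanged :: String -> String -> Bool
shallowInputChanged oldC newC =
  fingerprint (moduleB oldC) /= fingerprint (moduleB newC)
```

In this model `shallowInputChanged "C v1" "C v2"` is `True`, which is the property the argument relies on: tracking B alone is enough to notice the change in C.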

 

Simon

 

From: ghc-devs [hidden email] On Behalf Of David Eichmann
Sent: 27 March 2019 17:12
To: Andrey Mokhov [hidden email]; Neil Mitchell [hidden email]
Cc: GHC developers [hidden email]
Subject: Re: Hadrian Transitive Dependencies

 

Hello,

To reiterate some definitions consider this scenario:

  • A.hs imports B.hs and B.hs imports C.hs
  • `ghc -M A.hs` reports that A.o depends on: A.hs, B.hi
  • `ghc -c A.hs` produces A.o and accesses A.hs, B.hi, and C.hi

There seems to be some confusion about the term "Direct Dependency". I'll use these definitions:

"Shallow Dependency": With respect to a haskell object file X.o, the shallow dependencies are the source file X.hs and interface files Y.hi for all modules Y imported by X.

  • These are the dependencies of X.o as reported by `ghc -M X.hs`
  • In the above scenario:
    • A.o depends on: A.hs, B.hi

"Deep Dependency": With respect to a haskell object file X.o, the deep dependencies are all hi files required by ghc to build X.o, excluding shallow dependencies:

  • This is a subset of modules transitively imported by X
  • These dependencies are NOT reported by `ghc -M X.hs`

"Direct Dependency": if the command to create file X accesses file Y, then X directly depends on Y (= Y is a direct dependency of X).

  • In the above scenario:
    • A.o directly depends on: A.hs, B.hi, and C.hi
  • SPJ noted that .hi files list direct dependencies.
  • The direct dependencies of a haskell object file are the union of its shallow and deep dependencies.

"Direct Output": All files created by a rule.

With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared" (where this agrees with the definitions above) can we do the following for the build rule of a haskell object file X.o?

  1. `need` the shallow dependencies as reported by `ghc -M`. This guarantees that all shallow and deep dependencies (i.e. all direct dependencies) are built.
  2. build X.o and X.hi
  3. Inspect X.hi to derive the direct dependencies (and hence deep dependencies)
  4. `needed` the deep dependencies

Is there already an easy way to inspect *.hi files in this way? Is this use of `needed` valid?
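
For reference, the shallow / deep / direct definitions above can be spelled out on the A, B, C scenario with a small sketch in plain Haskell. The names `imports`, `shallowDeps`, `transitiveImports`, `deepDepsUpperBound` and `directDepsUpperBound` are invented for this example. Note the assumption: deep dependencies are only a subset of the transitively imported modules, so the transitive closure used here is an upper bound, not what ghc actually reads.

```haskell
import Data.List (nub, sort, (\\))

type Module = String

-- Import graph for the scenario: A imports B, and B imports C.
imports :: Module -> [Module]
imports "A" = ["B"]
imports "B" = ["C"]
imports _   = []

-- Shallow dependencies of X.o: X.hs plus Y.hi for every imported module Y.
shallowDeps :: Module -> [FilePath]
shallowDeps m = (m ++ ".hs") : map (++ ".hi") (imports m)

-- All modules reachable through imports (assumes an acyclic import graph).
transitiveImports :: Module -> [Module]
transitiveImports m =
  nub (concatMap (\i -> i : transitiveImports i) (imports m))

-- Upper bound on the deep dependencies: transitively imported interface
-- files that are not already shallow dependencies.
deepDepsUpperBound :: Module -> [FilePath]
deepDepsUpperBound m =
  map (++ ".hi") (transitiveImports m) \\ shallowDeps m

-- Upper bound on the direct dependencies: shallow and deep together.
directDepsUpperBound :: Module -> [FilePath]
directDepsUpperBound m = sort (shallowDeps m ++ deepDepsUpperBound m)
```

For module A this gives A.hs and B.hi as shallow, C.hi as deep, and all three as direct, matching the scenario.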

- David E

 



RE: Hadrian Transitive Dependencies

GHC - devs mailing list

> This is a bit disappointing

 

But it’s also Absolutely Great because it means that zillions of very-hard-to-predict dependencies don’t need to be explicitly ‘needed’.  

 

Perhaps the remaining un-tracked dependencies will be fewer and easier to nail?

 

Simon

 

From: David Eichmann <[hidden email]>
Sent: 27 March 2019 18:04
To: Andrey Mokhov <[hidden email]>; Simon Peyton Jones <[hidden email]>
Cc: Neil Mitchell <[hidden email]>; GHC developers <[hidden email]>
Subject: Re: Hadrian Transitive Dependencies

 

Ah! I see. This is a bit disappointing as it reduces the utility of fsatrace linting: the programmer is forced to decide whether shallow dependencies are sufficient (i.e. whether changes in deep dependencies always change shallow dependencies). Hopefully similar scenarios are rare. Perhaps the best step forward is to simply silence the linting using `trackAllow ["//*.hi"]` for Haskell object rules. Then I can continue tracking down other missing dependencies in Hadrian with fsatrace linting.

On 3/27/19 5:38 PM, Andrey Mokhov wrote:

Simon's insight is great: if deep dependencies are captured by shallow dependencies then the cloud build system is correct even if only direct shallow inputs are tracked.

 

That's a very non-trivial invariant, and I guess this means we can't rely on fsatrace linting for GHC compilation rules, because all deep dependencies will be reported as untracked. 

 

Cheers, 

Andrey

 

On 27 Mar 2019 18:27, Simon Peyton Jones [hidden email] wrote:

> With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared"

 

But I question that assumption.   As I mentioned, with GHC at least, if a deep dependency changes then one of the shallow dependencies will change.  So I claim that even for cloud build it should be enough to depend only on shallow dependencies.

 

This is only true because GHC offers this guarantee.  We’d need to be sure that every deep dependency was either ‘needed’ or was reflected in the contents (perhaps via a fingerprint) of another ‘needed’ thing.

 

Simon

 

From: ghc-devs [hidden email] On Behalf Of David Eichmann
Sent: 27 March 2019 17:12
To: Andrey Mokhov [hidden email]; Neil Mitchell [hidden email]
Cc: GHC developers [hidden email]
Subject: Re: Hadrian Transitive Dependencies

 

Hello,

To reiterate some definitions consider this scenario:

  • A.hs imports B.hs and B.hs imports C.hs
  • `ghc -M A.hs` reports that A.o depends on: A.hs, B.hi
  • `ghc -c A.hs` produces A.o and accesses A.hs, B.hi, and C.hi

There seems to be some confusion about the term "Direct Dependency". I'll use these definitions:

"Shallow Dependency": With respect to a haskell object file X.o, the shallow dependencies are the source file X.hs and interface files Y.hi for all modules Y imported by X.

  • These are the dependencies of X.o as reported by `ghc -M X.hs`
  • In the above scenario:
    • A.o depends on: A.hs, B.hi

"Deep Dependency": With respect to a haskell object file X.o, the deep dependencies are all hi files required by ghc to build X.o, excluding shallow dependencies:

  • This is a subset of modules transitively imported by X
  • These dependencies are NOT reported by `ghc -M X.hs`

"Direct Dependency": if the command to create file X accesses file Y, then X directly depends on Y (= Y is a direct dependency of X).

  • In the above scenario:
    • A.o directly depends on: A.hs, B.hi, and C.hi
  • SPJ noted that .hi files list direct dependencies.
  • The direct dependencies of a haskell object file are the union of its shallow and deep dependencies.

"Direct Output": All files created by a rule.

With that in mind, and considering a cloud build system where "all direct inputs and direct outputs must be declared" (where this agrees with the definitions above) can we do the following for the build rule of a haskell object file X.o?

  1. `need` the shallow dependencies as reported by `ghc -M`. This guarantees that all shallow and deep dependencies (i.e. all direct dependencies) are built.
  2. build X.o and X.hi
  3. Inspect X.hi to derive the direct dependencies (and hence deep dependencies)
  4. `needed` the deep dependencies

Is there already an easy way to inspect *.hi files in this way? Is this use of `needed` valid?

- David E

 

On 3/27/19 3:05 PM, Andrey Mokhov wrote:

Hi David,

 

We had a discussion about this with Neil some time ago, and I think we had the following list of progressively more complex invariants for different types of build systems:

 

  • Non-cloud build systems: *all direct inputs must be declared*. If you miss a direct input dependency then a build may complete successfully but with an incorrect result.

 

  • Cloud build systems: *all direct inputs and direct outputs must be declared*. If you miss a direct output then a build may fail because the cloud will not be able to restore the corresponding output.

 

  • Cloud build systems with shallow (deferred) materialisation of build artefacts: *all transitive inputs and direct outputs must be declared*. Let’s say you’d like to download the resulting GHC binary directly, without materialising any intermediate artefacts. Then you’ll need to know GHC’s ultimate transitive inputs.

 

I think for now we are really keen to make Hadrian a cloud build system, but whether shallow builds are valuable enough is not clear. Maybe not. Therefore, I’d say we don’t need to track transitive inputs right now. Furthermore, if we were to track all transitive inputs, we would lose the desirable early cutoff property, which prevents rebuilding after adding a comment in a file on which a lot of other files transitively depend.

 

Having said that, if we really access a file during compilation, then I think it is *not* a transitive dependency by definition! Any file which is accessed during a build rule is a direct dependency.

 

> GHC is reading *.hi files that are not reported as dependencies by

> `ghc -M  -include-pkg-deps`. This is because they are not direct, but transitive

> dependencies!

 

So, here I’m confused. If we read a file A when compiling a file B, then it’s by definition a direct dependency. Perhaps we just read too much? Maybe the solution is to switch to fine-grained `ghc -M` mode, to analyse import dependencies for a single module instead of doing it transitively, which I believe was discussed in a ticket some time ago? I can’t find this ticket, but I think Alp was looking into it at some point. Alp: do you remember it?

 

Thank you for all your work on Hadrian!

 

Cheers,

Andrey

 

From: David Eichmann [[hidden email]]
Sent: 27 March 2019 12:54
To: Neil Mitchell [hidden email]; Andrey Mokhov [hidden email]; GHC developers [hidden email]
Subject: Hadrian Transitive Dependencies

 

Hello Shake/Hadrian contributors and the like,

Recently I've been putting Hadrian's fsatrace linting feature to good use, tracking down missing dependencies in Hadrian. Ultimately, we want to use shake's cloud build / shared cache feature and ensure it works across CI builds. Unfortunately the feature isn't working smoothly with Hadrian: see #16295. This is very desirable to improve CI build times. It is my understanding that in order to get caching to work:

1. All accessed files must be declared with `need` AND
2. All created files must be declared with `produces` (or be the target of the build rule)

Is my understanding correct? Or is there a weaker condition (perhaps only 2 is necessary)?

If I'm correct, this amounts to fixing all fsatrace lint errors. See here for a breakdown of lint errors / missing dependencies. A large portion of these are Haskell interface files (i.e. *.hi files). Before building a Haskell object file, dependencies are discovered via `ghc` using the `-M  -include-pkg-deps` options. Unfortunately, shake's fsatrace linting complains about other *.hi files being accessed! For example when building `stage1/libraries/mtl/build/Control/Monad/RWS/Class.o` we get the following dependencies from ghc:

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : libraries/mtl/Control/Monad/RWS/Class.hs
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Prelude.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Monoid.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Strict.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Lazy.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Identity.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Maybe.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Except.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Error.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Writer/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/State/Class.hi
_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Reader/Class.hi

And shake complains of the following missing deps:

_build/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _build/stage1/lib/package.conf.d' '-this-unit-id mtl-2.2.2' '-package-id base-4.13.0.0' '-package-id transformers-0.5.5.0' -i -i_build/stage1/libraries/mtl/build -i_build/stage1/libraries/mtl/build/autogen -ilibraries/mtl/. -Iincludes -I_build/generated -I_build/stage1/libraries/mtl/build -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/rts-1.0/include -I_build/generated -optc-I_build/generated -optP-include -optP_build/stage1/libraries/mtl/build/autogen/cabal_macros.h -outputdir _build/stage1/libraries/mtl/build -Wnoncanonical-monad-instances -optc-Werror=unused-but-set-variable -optc-Wno-error=inline -c libraries/mtl/Control/Monad/RWS/Class.hs -o _build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o -O2 -H32m -Wall -fno-warn-unused-imports -fno-warn-warnings-deprecations -Wcompat -Wnoncanonical-monad-instances -Wnoncanonical-monadfail-instances -XHaskell2010 -XSafe -ghcversion-file=/home/david/MEGA/File_Dump/Well-Typed/GHC/_nosync_git/ghc/_build/generated/ghcversion.h -Wno-deprecated-flags
Lint checking error - _build/HEAD_default/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o - 22 values were used but not depended upon:
  Used:  _build/HEAD_default/stage0/lib/settings
  Used:  _build/HEAD_default/stage0/lib/platformConstants
  Used:  _build/HEAD_default/stage0/lib/llvm-targets
  Used:  _build/HEAD_default/stage0/lib/llvm-passes
  Used:  _build/HEAD_default/stage0/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/package.conf.d/package.cache
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Float.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Base.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Types.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Maybe.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Lazy.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Strict.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Reader.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/List.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Cont.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Tuple.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/IO/Exception.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/GHC/Integer/Type.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Either.hi
  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Natural.hi

GHC is reading *.hi files that are not reported as dependencies by `ghc -M  -include-pkg-deps`. This is because they are not direct, but transitive dependencies! How do we fix these lint errors (again with the goal of using Shake's shared cache feature)? Some ideas:

* Wildly over-approximate dependencies. This may be easier to implement but causes unneeded recompilation (when a false dependency changes). Either:
    * `need` all dependent packages' interface files recursively as well as transitive dependencies reported by `ghc -M  -include-pkg-deps` within the current package, OR
    * `need` all transitive dependencies reported by `ghc -M  -include-pkg-deps`. This will likely result in fewer dependencies but requires a bit more work in recovering dependent packages' dependency graphs.
* Perhaps transitive dependencies are not important for shared caching to work. Change Shake's linting feature to allow (untracked?) transitive dependencies to be accessed.

Feedback would be greatly appreciated.

David Eichmann

 

-- 
David Eichmann, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com
 
Registered in England & Wales, OC335890
118 Wymering Mansions, Wymering Road, London W9 2NF, England 


RE: Hadrian Transitive Dependencies

Andrey Mokhov

Simon:

 

> Perhaps the remaining un-tracked dependencies will be fewer and easier to nail?

 

There are a few other cases where we `need` only a subset of direct dependencies that "cover" all others. For example, `setup-config` files produced when configuring a package are used to signal the completion of package configuration, which might produce a few other hard-to-predict files, such as “build/include/HsBaseConfig.h”.
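The "covering" idea can be sketched with a small Shake build. This is an illustration only: the file names mirror the ones mentioned above, but the rule bodies are stand-ins (the real configuration step runs Cabal's configure, not `writeFile'`). The consumer rule `need`s only the marker file, and relies on the fact that building the marker also writes the header.

```haskell
import Development.Shake

main :: IO ()
main = shake shakeOptions $ do
    want ["_build/out.txt"]

    -- Stand-in for package configuration: writes a side file first,
    -- then the marker file that signals completion.
    "_build/setup-config" %> \out -> do
        writeFile' "_build/include/HsBaseConfig.h" "/* generated */\n"
        writeFile' out "configured\n"

    -- The consumer declares only the marker; the header is "covered" by it.
    "_build/out.txt" %> \out -> do
        need ["_build/setup-config"]
        hdr <- liftIO (readFile "_build/include/HsBaseConfig.h")  -- untracked read
        writeFile' out hdr
```

With linting enabled and a real compiler command in place of `readFile`, the header read would be reported as used but not depended upon, which is exactly the class of lint error discussed in this thread.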

 

I just realised that if one wants to have “hermetic builds”, where build commands are executed in a sandbox, then it is absolutely necessary to declare all direct dependencies without exception, because they will need to be made available in the sandbox for the build command to succeed.

 

David:

 

> Perhaps the best step forward is to simply silence the linting
> using `trackAllow ["//*.hi"]` for haskell object rules. Then
> I can continue tracking down other missing dependencies in
> Hadrian with fsatrace linting.

 

Yes, this looks like the best approach for now.
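As a rough illustration of this workaround, the sketch below shows where a `trackAllow` call could sit in an object-file rule. It is not Hadrian's actual rule: the source layout, the rule pattern and the compiler invocation are simplified assumptions, and `LintFSATrace` requires the `fsatrace` executable to be on the PATH.

```haskell
import Development.Shake
import Development.Shake.FilePath

main :: IO ()
main = shake shakeOptions { shakeLint = Just LintFSATrace } $ do
    want ["_build/Main.o"]

    "_build//*.o" %> \out -> do
        -- Hypothetical layout: the source path mirrors the object path.
        let src = dropDirectory1 out -<.> "hs"
        need [src]
        -- Exempt interface files from tracking checks in this rule only;
        -- other untracked accesses are still reported by the lint.
        trackAllow ["//*.hi"]
        cmd_ "ghc" ["-c", src, "-o", out]
```

Because `trackAllow` is called inside the rule, the exemption is scoped to that action, so lint errors for other file types (for example `settings` or `package.cache`) remain visible.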

 

Cheers,

Andrey



Re: Hadrian Transitive Dependencies

David Eichmann

I've started a wiki page to document some of this discussion: https://gitlab.haskell.org/ghc/ghc/wikis/Developing-Hadrian. We'll also need to deal with outputs as well as inputs (dependencies) soon. Hence I've expanded a bit on my understanding of rule *outputs* and added the terms "vital input" and "vital output" (do they make sense? Is there a better name?). I've slightly changed the definition of "shallow dependencies" so that, by definition, they are a (minimal) subset of direct dependencies such that "no change in the shallow dependencies -> no change in the vital output". The important part to take away is that rules must:

  • `need` all shallow dependencies.
  • `produces` all vital outputs (excluding the rule target).
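The two requirements above might look like this in a single Shake rule. This is a minimal single-module sketch under stated assumptions, not Hadrian code: the flat source layout and the `_shared-cache` directory name are hypothetical, and the interface file is treated as a vital output that is not the rule target.

```haskell
import Development.Shake
import Development.Shake.FilePath

main :: IO ()
main = shake shakeOptions
        { shakeLint  = Just LintFSATrace      -- report untracked reads and writes
        , shakeShare = Just "_shared-cache"   -- hypothetical shared cache directory
        } $ do
    want ["_build/Main.o"]

    "_build/*.o" %> \out -> do
        let src = takeBaseName out <.> "hs"   -- hypothetical flat source layout
            hi  = out -<.> "hi"
        need [src]                            -- declare the inputs
        cmd_ "ghc" ["-c", src, "-o", out, "-ohi", hi]
        produces [hi]                         -- declare the extra output
```

The target itself (`out`) does not need a `produces` call, since Shake already knows the rule creates it; only the additional file is declared.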

David E


