Edward Z. Yang edited this page Sep 29, 2015 · 4 revisions

Backpack in Cabal

This page describes the proposed modifications to Cabal and cabal-install in order to support Backpack, a new module system for Haskell.

It has a companion page on the GHC Trac: https://ghc.haskell.org/trac/ghc/wiki/Backpack

Motivation: Why does Backpack affect Cabal/cabal-install?

The bulk of the implementation of Backpack lives in GHC, which in principle allows us to write a specification for Backpack without any reference to Cabal. However, there are a few important points at which Backpack interacts with Cabal:

  1. Cabal must be told where the Backpack file lives (via a backpack-file field) and must invoke GHC differently (with --backpack) to actually compile it.

  2. Backpack supports "private libraries", which are components that are not visible outside the package, but need to be installed and can be depended upon. Cabal packages can specify and build multiple components, but currently only one component can be installed: the (unique) library component. To support Backpack, we must lift this restriction and support installing all components. (This also means the InstalledPackageId identifier must be appropriately adjusted.)

  3. Backpack supports installing a partially compiled package, which is only fully compiled later, once its missing dependencies are filled in. First, these "indefinite" units must be installable: they have interface files, which are used for typechecking other indefinite units. Second, if a unit is instantiated the same way twice, the resulting types and code should be shared; Cabal is responsible for ensuring that this sharing takes place, by installing each compiled unit in the database and building a library from the object files of each unit built this way.

Introduce ComponentInstanceId, replacing InstalledPackageId and PackageKey

PR https://github.com/haskell/cabal/pull/2846 replaces InstalledPackageId and PackageKey with a new identifier called ComponentInstanceId, which has the following properties:

  • It is computed per-component, and consists of a package name, package version, hash of the ComponentInstanceIds of the dependencies it is built against, and the name of the component. For example, "foo-0.1-abcdef" continues to identify the library of package foo-0.1, but "foo-0.1-123455-foo.exe" would identify the executable, and "foo-0.1-abcdef-bar" would identify a private sub-library named bar.

  • It is passed to GHC to be used for linker symbols and type equality (replacing PackageKey). So as far as GHC is concerned (prior to Backpack), this is the be-all and end-all identifier.

Prior to Backpack, ComponentInstanceId is the primary key in the installed package database.
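
As a rough illustration of the identifier structure described above, here is a minimal Haskell sketch. The type, constructor, and function names are hypothetical and are not taken from the Cabal source or from the PR; the dependency hash is treated as an opaque string, and the rendering rules are inferred only from the three example identifiers given in the list.

```haskell
import Data.List (intercalate)

-- Which component of a package is being identified.
-- (Hypothetical constructors, invented for this sketch.)
data Component
  = MainLibrary        -- the (unique) public library of the package
  | SubLibrary String  -- a private sub-library, e.g. "bar"
  | Executable String  -- an executable, e.g. "foo"
  deriving (Eq, Show)

-- The four pieces of information the identifier is built from.
data ComponentInstanceId = ComponentInstanceId
  { cidPackageName :: String
  , cidVersion     :: [Int]
  , cidDepsHash    :: String    -- hash of the dependencies' ids, kept opaque here
  , cidComponent   :: Component
  } deriving (Eq, Show)

-- Render an identifier in the style of the examples in the text.
-- Assumption: the main library adds no suffix, a sub-library appends its
-- name, and an executable appends its name followed by ".exe".
renderComponentInstanceId :: ComponentInstanceId -> String
renderComponentInstanceId cid = intercalate "-" (base ++ suffix)
  where
    base =
      [ cidPackageName cid
      , intercalate "." (map show (cidVersion cid))
      , cidDepsHash cid
      ]
    suffix = case cidComponent cid of
      MainLibrary  -> []
      SubLibrary n -> [n]
      Executable n -> [n ++ ".exe"]

-- The three examples from the text.
libFoo, exeFoo, subBar :: ComponentInstanceId
libFoo = ComponentInstanceId "foo" [0, 1] "abcdef" MainLibrary
exeFoo = ComponentInstanceId "foo" [0, 1] "123455" (Executable "foo")
subBar = ComponentInstanceId "foo" [0, 1] "abcdef" (SubLibrary "bar")
```

Loaded into GHCi, renderComponentInstanceId libFoo gives "foo-0.1-abcdef", exeFoo gives "foo-0.1-123455-foo.exe", and subBar gives "foo-0.1-abcdef-bar".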

Make install paths determined per component

See https://github.com/haskell/cabal/issues/2836

Introduce indefinite/instantiated entries in the installed package database

Backpack requires two new types of entry in the installed package database:

  1. An installed indefinite unit is the result of typechecking/desugaring a Backpack unit which has holes. There's no generated Haskell object code associated with such a unit (however, if there were any C files, those would be compiled and stored with the indefinite unit). A ComponentInstanceId uniquely identifies an installed indefinite unit.

  2. An instantiated unit is the result of taking an indefinite unit and filling in all of its holes, so that we can now generate object code for the unit. In general, there may be multiple instantiations of a single indefinite unit, so a ComponentInstanceId DOES NOT uniquely identify such instantiated units. Instead, GHC's UnitKey (InstalledUnitId) is used to identify these.

When Cabal does dependency resolution, it looks at a subset of the database that includes installed indefinite units and installed units with no holes; instantiated units are ignored, since they are covered by the corresponding indefinite unit. Put differently, as far as Cabal is concerned, ComponentInstanceIds are the primary key of the database and InstalledUnitIds don't exist (except insofar as Cabal is a build tool and has to know how to register units). For GHC's part, when it is compiling packages, it only cares about InstalledUnitIds; when it is typechecking, however, it cares about ComponentInstanceIds.
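
The two views of the database can be sketched in Haskell as follows. This is only a model of the idea, not Cabal's or GHC's actual data structures: the Entry type, the function names, and the example identifiers are all invented for illustration, and both kinds of identifier are treated as opaque strings.

```haskell
-- Both identifiers are modelled as opaque strings in this sketch.
type ComponentInstanceId = String
type InstalledUnitId     = String

-- The kinds of entry in the installed package database.
-- (Hypothetical constructors, invented for this sketch.)
data Entry
  = Definite ComponentInstanceId
    -- ^ an ordinary installed unit with no holes
  | Indefinite ComponentInstanceId
    -- ^ typechecked only; still has holes
  | Instantiated ComponentInstanceId InstalledUnitId
    -- ^ an indefinite unit with all holes filled; the ComponentInstanceId
    --   names the indefinite unit it was instantiated from
  deriving (Eq, Show)

-- What dependency resolution sees: definite and indefinite units,
-- keyed by ComponentInstanceId. Instantiated units are dropped.
resolverView :: [Entry] -> [ComponentInstanceId]
resolverView db = [ cid | e <- db, Just cid <- [resolverKey e] ]
  where
    resolverKey (Definite cid)     = Just cid
    resolverKey (Indefinite cid)   = Just cid
    resolverKey (Instantiated _ _) = Nothing

-- All instantiations of one indefinite unit. Several may exist, which is
-- why a ComponentInstanceId alone cannot identify an instantiated unit.
instantiationsOf :: ComponentInstanceId -> [Entry] -> [InstalledUnitId]
instantiationsOf cid db = [ uid | Instantiated c uid <- db, c == cid ]

-- A made-up database; every identifier here is a placeholder.
exampleDb :: [Entry]
exampleDb =
  [ Definite "q-0.1-fedcba"
  , Indefinite "p-0.1-abcdef"
  , Instantiated "p-0.1-abcdef" "unit-1"
  , Instantiated "p-0.1-abcdef" "unit-2"
  ]
```

Here resolverView exampleDb yields ["q-0.1-fedcba", "p-0.1-abcdef"], while instantiationsOf "p-0.1-abcdef" exampleDb yields ["unit-1", "unit-2"]: one ComponentInstanceId, two distinct instantiated units.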

Instantiated units get special handling for install paths: their installation is rooted in the indefinite package's install dir, but with an extra directory hierarchy per InstalledUnitId.
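
One possible reading of this layout is sketched below in Haskell. The text does not specify the exact shape of the extra directory hierarchy, so the single subdirectory named after the InstalledUnitId is an assumption, as are the function name and the example paths.

```haskell
import System.FilePath

-- Modelled as an opaque string in this sketch.
type InstalledUnitId = String

-- Install directory for an instantiated unit: rooted in the install
-- directory of the indefinite package it came from, with one extra level
-- per InstalledUnitId. (Assumption: a single subdirectory; the real
-- hierarchy may be deeper.)
instantiatedInstallDir :: FilePath -> InstalledUnitId -> FilePath
instantiatedInstallDir indefiniteDir unitId = indefiniteDir </> unitId
```

With this reading, two instantiations of the same indefinite package, say with the placeholder ids "unit-1" and "unit-2", would be installed side by side under that package's directory.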