Resolving dependencies in Elm 0.19 package projects


#1

I am the author of an Elm language plugin, and I have run into an obstacle when implementing go-to-declaration for Elm 0.19 package projects. Imagine that you are developing an Elm package (not an application) and want to go to the declaration of a function provided by one of your dependencies. That dependency will exist somewhere in ~/.elm/0.19.0/package/. However, there may be multiple versions of the same package installed (e.g. if you are using it in another project). Which version of the dependency should the language plugin search for on disk? You can’t just use the elm.json file because, unlike in application projects, the version of each dependency is specified as a constraint of the form 1.0.0 <= v < 2.0.0. There may be multiple versions on disk that satisfy that constraint.
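
To make the ambiguity concrete, here is a minimal sketch that parses a constraint of the form shown above and filters a list of installed versions. The helper names and the list of installed versions are hypothetical; this is not how the Elm compiler picks a version, it only shows why the constraint alone is not enough.

```python
import re

# Matches constraints such as "1.0.0 <= v < 2.0.0"
CONSTRAINT = re.compile(r"^\s*(\d+\.\d+\.\d+)\s*<=\s*v\s*<\s*(\d+\.\d+\.\d+)\s*$")

def parse_version(text):
    """Turn "1.0.2" into the comparable tuple (1, 0, 2)."""
    return tuple(int(part) for part in text.split("."))

def satisfying(constraint, installed):
    """Return the installed versions inside the constraint, sorted ascending."""
    match = CONSTRAINT.match(constraint)
    if match is None:
        raise ValueError(f"unrecognised constraint: {constraint!r}")
    lower, upper = (parse_version(group) for group in match.groups())
    hits = [v for v in installed if lower <= parse_version(v) < upper]
    return sorted(hits, key=parse_version)

# Hypothetical directory names found under ~/.elm/0.19.0/package/elm/json/
installed = ["1.0.0", "1.1.2", "2.0.0"]
candidates = satisfying("1.0.0 <= v < 2.0.0", installed)
print(candidates)  # more than one candidate, so the plugin cannot choose
```

Two versions survive the filter, and nothing in the package’s own elm.json says which one the compiler actually resolved.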

Selecting the correct version requires solving the set of dependency constraints. The Elm compiler already does this whenever you compile your package. It would be nice if there was a way to get that information from the compiler. In Elm 0.18, elm-stuff/exact-dependencies.json provided all of the information that a plugin author would need. But in Elm 0.19 we don’t have that anymore.

Would it be possible to have the Elm compiler provide this information?

Possible solutions:

Option A: store exact dependency info on disk

Basically, do what Elm 0.18 did. For all direct and indirect dependencies, write out the name of the dependency and the version. The Elm compiler must keep this up-to-date whenever a dependency is installed, upgraded or deleted.

Option B: ask Elm CLI to list the exact dependencies

Add a new CLI command to the elm binary which dumps the dependency info to stdout. For example, one might run elm metadata and it would emit JSON to stdout similar to Elm 0.18’s exact-dependencies.json.

One nice thing about this approach is that since the data is not persisted and cannot be stale, it can include additional things that may depend on the version of the elm binary. For instance, you could include information like the path to the dependency’s manifest (e.g. ~/.elm/0.19.0/package/foo/1.0.0/elm.json). It also allows for the output schema to be versioned (the caller could provide the schema version that it knows how to interpret).

The biggest downside is that it adds clutter to the Elm binary CLI which most people wouldn’t need.

Prior Art

Node/npm

npm keeps a “lock” file on disk which includes the exact dependencies. This file supplements package.json which uses version constraints.

Rust

Rust’s package manager, Cargo, implemented option B. You run cargo metadata from the command-line and it emits structured information. (docs)

One notable difference is that cargo metadata returns the full dependency graph, but I think a flat list similar to Elm 0.18’s exact-dependencies.json would be sufficient for now.
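
As a rough illustration of what a tool does with Cargo’s output, the sketch below reduces parsed cargo metadata JSON to a flat list of (name, version) pairs. The helper names and the sample data are made up; the command line and the "packages", "name" and "version" keys are taken from Cargo’s documented output. Pairs are used rather than a name-to-version map because Cargo can resolve more than one version of the same crate.

```python
import json
import subprocess

def flatten_packages(metadata):
    """Reduce parsed `cargo metadata` output to sorted (name, version) pairs."""
    return sorted({(p["name"], p["version"]) for p in metadata["packages"]})

def cargo_metadata(project_dir):
    """Run cargo in project_dir and parse its JSON output (needs cargo on PATH)."""
    result = subprocess.run(
        ["cargo", "metadata", "--format-version", "1"],
        cwd=project_dir, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

# Trimmed, made-up sample in the shape cargo emits; real output has many more fields
sample = {"packages": [
    {"name": "serde", "version": "1.0.0"},
    {"name": "libc", "version": "0.2.0"},
]}
flat = flatten_packages(sample)
```

In a real project one would call flatten_packages(cargo_metadata(".")) instead of using the sample.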


That’s the extent of the research that I’ve done so far. I would be happy to look into it more if that would be useful.


#2

It would be awesome if more information for tooling, or for the language server protocol directly, were baked into Elm.

You could try to reimplement the Elm compiler’s dependency resolution and solve the constraints in an identical manner.
Sources: https://github.com/elm/compiler/blob/6086fd18f8be05cbd4be1938d258f167e650321d/builder/src/Deps/Verify.hs

I recently started gathering information on “jump to definition” for the work-in-progress Elm language server implementation: https://github.com/elm-tooling/elm-language-server/issues/10#issuecomment-440885071


#3

Thank you for writing it up this way, that’s really helpful!

Option B seems more promising to me. Can you share a code block with an example of exactly what you’d like elm metadata to produce on stdout? And @klazuka, if you had a wishlist of what elm metadata would output besides this dependency info, what would it include? If you think of anything, maybe make a second example of the ideal output?


#4

Similarly in Haskell, Cabal nowadays generates plan.json that contains all information for building a package set. https://hackage.haskell.org/package/cabal-plan


#5

This sort of thing would also be helpful for build tools that run in isolated environments and therefore must prefetch all dependencies. If a command like elm metadata could produce a json document describing exact versions + checksums of all dependencies that should be prefetched, that would be awesome.

Here are two examples of the json I’ve used in nix expressions when prefetching elm dependencies:

  1. elm itself
  2. elbum

And the nix functions that process that json are fetchElmDeps and makeDotElm.


#6

Minor correction: those two links aren’t actually JSON, they’re in the Nix language; the differences are purely cosmetic though, and Nix can easily work with actual JSON.


#7

@evancz I propose that a new “metadata” command be added to the Elm CLI. When invoked, it will:

  • look for an elm.json file in current working directory
  • compute the project’s exact dependencies, including direct, indirect, and test dependencies
  • print a human-readable list of those dependencies and versions to stdout

The human-readable output would be similar to npm ls. For example:

$ elm metadata
Project Dependencies
└─┬ elm/core@1.0.2
  ├ elm/json@1.0.0
  | └ elm/core@1.0.2 (deduped)
  ├ elm/html@1.0.0
  | ├ elm/virtual-dom@1.0.2
  | ├ elm/json@1.0.0 (deduped)
  | └ elm/core@1.0.2 (deduped)
  └ elm/time@1.0.0
    └ elm/core@1.0.2 (deduped)

Additional Test Dependencies
└── elm-explorations/test@1.0.0
    └ elm/random@1.0.0

Aside: one complication is that everything depends on elm/core, so if we show the true graph there would be a lot of noise. We could filter out elm/core from the human-readable output, but that would be a lie.
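
To see how much noise elm/core adds, here is a small sketch that renders a dependency graph as indented lines, marks repeats as “(deduped)”, and optionally hides named packages. The function name and the hide parameter are hypothetical, and the sample edges are only illustrative.

```python
def render_tree(roots, deps, hide=frozenset()):
    """Return indented lines; a package is expanded only the first time it is seen."""
    lines, seen = [], set()

    def visit(name, depth):
        if name in hide:
            return
        repeated = name in seen
        lines.append("  " * depth + name + (" (deduped)" if repeated else ""))
        if repeated:
            return  # do not expand a package twice (this also guards against cycles)
        seen.add(name)
        for child in deps.get(name, []):
            visit(child, depth + 1)

    for root in roots:
        visit(root, 0)
    return lines

# Illustrative edges only
deps = {
    "elm/html": ["elm/core", "elm/json", "elm/virtual-dom"],
    "elm/json": ["elm/core"],
    "elm/virtual-dom": ["elm/core", "elm/json"],
    "elm/time": ["elm/core"],
}
roots = ["elm/html", "elm/time"]
full = render_tree(roots, deps)
quiet = render_tree(roots, deps, hide={"elm/core"})
print("\n".join(quiet))
```

On this sample the full tree has nine lines and the filtered one has five, which shows the trade-off between a readable tree and an honest one.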

Machine-readable Output

A --report=json option will cause it to instead output the summary in JSON format. This option would be intended for Elm language plugins and other tools, and backwards compatibility would, ideally, be maintained.

The bare minimum output would look something like this:

{
    "elm/browser": "1.0.1",
    "elm/core": "1.0.2",
    "elm/html": "1.0.0",
    "elm/json": "1.1.2",
    "elm/time": "1.0.0",
    "elm/url": "1.0.0",
    "elm/virtual-dom": "1.0.2"
}

But I think we should aim a little higher in order to:

  • provide some wiggle room for backwards-compatible changes
  • include enough information so that the caller does not need to read any elm.json files
  • make implicit things explicit (locations of packages on disk)

So I propose the following:

{
    "project": {
        "type": "package",                // either "application" or "package"
        "source-directories": [ "src" ],  // "application" projects only
        "name": "klazuka/elm-foo",        // "package" projects only
        "version": "1.0.1",               // "package" projects only
        "path": "~/dev/elm-foo",
        "elm-version": "0.19.0",
        "direct-dependencies": [ "elm/core", "elm/html" ],
        "direct-test-dependencies": [ "elm-explorations/test" ]
    },
    "packages": [
        {
            "name": "elm/core",
            "version": "1.0.2",
            "path": "~/.elm/0.19.0/package/elm/core/1.0.2",
            "exposed-modules": [ "Basics", "List", "String" /* , ... */ ],
            "dependencies": [],
            "test-dependencies": []
        },
        {
            "name": "elm/html",
            "version": "1.0.0",
            "path": "~/.elm/0.19.0/package/elm/html/1.0.0",
            "exposed-modules": [ "Html", "Html.Attributes", "Html.Events" ],
            "dependencies": [ "elm/core", "elm/json", "elm/virtual-dom" ],
            "test-dependencies": []
        },
        {
            "name": "elm/json",
            "version": "1.0.0",
            "path": "~/.elm/0.19.0/package/elm/json/1.0.0",
            "exposed-modules": [ "Json.Decode", "Json.Encode" ],
            "dependencies": [ "elm/core" ],
            "test-dependencies": []
        },
        {
            "name": "elm/virtual-dom",
            "version": "1.0.2",
            "path": "~/.elm/0.19.0/package/elm/virtual-dom/1.0.2",
            "exposed-modules": [ "VirtualDom" ],
            "dependencies": [ "elm/core", "elm/json" ],
            "test-dependencies": []
        },
        {
            "name": "elm-explorations/test",
            "version": "1.0.0",
            "path": "~/.elm/0.19.0/package/elm-explorations/test/1.0.0",
            "exposed-modules": [ "Test", "Test.Runner", "Expect" /* , ... */ ],
            "dependencies": ["elm/core", "elm/random"],
            "test-dependencies": []
        },
        {
            "name": "elm/random",
            "version": "1.0.0",
            "path": "~/.elm/0.19.0/package/elm/random/1.0.0",
            "exposed-modules": [ "Random" ],
            "dependencies": ["elm/core", "elm/time"],
            "test-dependencies": []
        }
    ]
}

The idea is that since the Elm binary will be crawling the dependency graph and doing a bunch of file I/O, we may as well collect all of the information from the elm.json files and include it in the output. This way the caller (i.e. a language plugin) can make a single call to elm metadata and obtain all of the information it needs to locate modules relative to a given source file. It also decouples Elm tools from both the elm.json format and the directory structure inside the global Elm package cache.
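
As a sketch of how a language plugin might consume this output, the function below maps an imported module name to the package that exposes it and to the expected source file. It assumes the hypothetical schema proposed above (the "packages", "path" and "exposed-modules" fields) and the implicit “src” directory of Elm 0.19 packages. For brevity it searches every package rather than only those visible from the importing file.

```python
import os

def find_module(metadata, module_name):
    """Return (package name, source file path) for an exposed module, or None."""
    for package in metadata["packages"]:
        if module_name in package["exposed-modules"]:
            # "Html.Attributes" lives at src/Html/Attributes.elm inside the package
            relative = os.path.join("src", *module_name.split(".")) + ".elm"
            root = os.path.expanduser(package["path"])
            return package["name"], os.path.join(root, relative)
    return None

# Made-up sample in the proposed shape; the path is a placeholder
sample = {"packages": [{
    "name": "elm/html",
    "version": "1.0.0",
    "path": "/cache/elm/html/1.0.0",
    "exposed-modules": ["Html", "Html.Attributes", "Html.Events"],
}]}
hit = find_module(sample, "Html.Attributes")
```

The caller never reads an elm.json file or guesses at the cache layout, which is the decoupling described above.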

Aside: When presenting the human-readable dependency graph, there is no need to show the test-dependencies of your project’s dependencies (because no one cares). However, in the JSON output, the test-dependencies of each package must be included so that things like go-to-declaration can work. For example, I might open in my editor the tests for one of my project’s dependencies. That test will import some modules, and we need to know which packages provide those modules.

Bonus Information

Additional fields could be added to the package object.

  • sha256 hash
    • this was suggested by @jerith in order to verify package integrity
    • I’m not sure where this comes from (maybe it’s the Git tag sha?)
  • package source-directories
    • Elm 0.19 implicitly assumes that there is a single source-directory for a package and it is called “src”
    • we could include a field like "source-directories": [ "src" ] to make this explicit
    • if we were to include it, then the project section should also include source-directories for package projects
    • this is probably a bad idea, but I wanted to put it out there

Complications

It’s possible that an elm.json file has been modified without actually running elm install, in which case a stated dependency may not actually exist on disk. Similarly, if the user installed some Elm packages using a custom ELM_HOME environment variable, but now runs elm metadata without that environment variable set, the Elm CLI will be unable to locate the installed dependencies. The latter situation may be particularly likely in the case where an Elm language plugin or other tool is running elm metadata from a non-shell environment and thus ELM_HOME may not be set.

In such cases the Elm CLI should print a message to stderr and exit with an error code.
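
The check could look roughly like the sketch below. The function names are hypothetical, and both the ~/.elm fallback and the author/project/version directory layout are assumptions based on the paths quoted in this thread, not on the compiler’s source.

```python
import os

def package_cache_dir(environ, elm_version="0.19.0"):
    """Honour ELM_HOME if it is set; otherwise fall back to ~/.elm (an assumption)."""
    home = environ.get("ELM_HOME") or os.path.join(os.path.expanduser("~"), ".elm")
    return os.path.join(home, elm_version, "package")

def missing_packages(cache_dir, exact):
    """List "author/project@version" entries that have no directory in the cache."""
    missing = []
    for name, version in sorted(exact.items()):
        author, project = name.split("/")
        if not os.path.isdir(os.path.join(cache_dir, author, project, version)):
            missing.append(f"{name}@{version}")
    return missing
```

A tool would call missing_packages(package_cache_dir(os.environ), exact) and, if the list is non-empty, print it to stderr and exit with a non-zero status.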

Schema Compatibility

Seeing as how Elm is not yet a 1.0 product, it’s probably sufficient to assume that any changes to the output will be purely additive without breaking backward compatibility. JSON is good at this. Once Elm begins to make compatibility guarantees, this should be revisited so that (1) the caller can specify the schema version that it can handle and (2) the output includes a field describing the schema version that it was written with.
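
For the later, versioned stage, a caller-side check might look like this sketch. The "schema-version" field name is purely hypothetical, as is treating a missing field as version 1.

```python
SUPPORTED_SCHEMA = 1  # highest schema version this (hypothetical) tool understands

def check_schema(metadata, supported=SUPPORTED_SCHEMA):
    """Accept output whose schema version is not newer than the supported one."""
    found = metadata.get("schema-version", 1)  # a missing field is treated as version 1
    if found > supported:
        raise ValueError(f"schema version {found} is newer than supported {supported}")
    return metadata
```

Unknown fields are simply ignored by the caller, which is what keeps purely additive changes backward compatible.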

Naming

Calling the CLI command “metadata” is based on a similar command in Rust’s cargo package manager. It seems like a good idea to choose a suitably broad name so that if other tooling needs arise, they can be shoehorned into this command. Alternative names to consider: “project”, “structure”, “info”, or “plan” (following Cabal)…

But seeing as how the above proposal is primarily about dependencies, it may be better to consider more concrete names such as “dependencies” or “build-plan”.


#8

Having seen a few languages go through this phase, I think this is a very sensible way of going forward.

I’d like to add:

a) Please do include sha256. In Nix everything is checksummed by design, and in my current implementation of elm2nix I have to download all of the tarballs to reproduce information that Elm already has.

b) Specify the format (a JSON Schema, or just a plain document, would do), since that removes ambiguity in the future.

c) Ideally, Elm itself would use the format to perform the build; that way at least one tool is guaranteed to be able to build from the information.

