Dependency solvers are algorithms that take as input the direct dependencies of an app or a package and compute the version of every package that will be necessary for the code to run (including indirect dependencies).
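To make that definition concrete, here is a minimal sketch of a naive backtracking solver in Python. It is not PubGrub, which is considerably smarter about learning from conflicts. It only illustrates the input (direct dependencies) and the output (one version per required package). The toy registry, the package names and the helpers `version_key`, `merge` and `solve` are all made up for this illustration.

```python
from typing import Dict, Optional, Set

# package -> set of allowed versions
Constraints = Dict[str, Set[str]]
# package -> version -> direct dependencies of that version
Registry = Dict[str, Dict[str, Constraints]]

# Hypothetical toy registry, not real package data.
REGISTRY: Registry = {
    "a": {"1.0.0": {}, "2.0.0": {"c": {"2.0.0"}}},
    "b": {"1.0.0": {"c": {"1.0.0"}}},
    "c": {"1.0.0": {}, "2.0.0": {}},
}


def version_key(version: str):
    """Turn "1.10.0" into (1, 10, 0) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def merge(constraints: Constraints, deps: Constraints,
          chosen: Dict[str, str]) -> Optional[Constraints]:
    """Intersect new dependencies with known constraints; None on conflict."""
    merged = dict(constraints)
    for dep, allowed in deps.items():
        narrowed = merged[dep] & allowed if dep in merged else set(allowed)
        if not narrowed:
            return None  # no version satisfies everyone
        if dep in chosen and chosen[dep] not in narrowed:
            return None  # contradicts a version we already picked
        merged[dep] = narrowed
    return merged


def solve(registry: Registry, constraints: Constraints,
          chosen: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """Pick one version per required package, or return None if impossible."""
    chosen = chosen or {}
    pending = sorted(p for p in constraints if p not in chosen)
    if not pending:
        return chosen  # every required package has a version
    pkg = pending[0]
    # Try the newest versions first, backtracking on conflict.
    for version in sorted(registry.get(pkg, {}), key=version_key, reverse=True):
        if version not in constraints[pkg]:
            continue
        merged = merge(constraints, registry[pkg][version], chosen)
        if merged is None:
            continue
        solution = solve(registry, merged, {**chosen, pkg: version})
        if solution is not None:
            return solution
    return None


# Direct dependencies of a hypothetical app.
app_deps = {"a": {"1.0.0", "2.0.0"}, "b": {"1.0.0"}}
solution = solve(REGISTRY, app_deps)
```

In this toy registry the newest version of `a` needs `c` 2.0.0 while `b` needs `c` 1.0.0, so the solver has to back off to `a` 1.0.0 and ends up pulling in `c` as an indirect dependency.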
For quite some time now I’ve been working on implementations of a state-of-the-art dependency solver called PubGrub, originally created by Natalie Weizenbaum for the Dart programming language. I prototyped my first implementation of PubGrub in elm; the result was elm-pubgrub, which I presented here 4 months ago. Beware that there is a bug in that implementation that I’ve not fixed yet (sorry).
But I have more plans for it, and to that end, I’ve reimplemented PubGrub, in Rust this time! Together with Jacob and Alex, we have improved on every part of my initial draft and are working on making it a better alternative to the solver embedded in Cargo, Rust’s package manager.
A few days ago, I created an index, a registry of all 11079 published elm package versions (to date) between elm 0.14.0 and elm 0.19.1 with their direct dependencies, all in one file. And today, I’m proud to say that we can solve the dependencies of all those 11079 package versions with pubgrub in less than 1 second (0.842s on an i7-10750H)! I’m now very confident that it will be a central piece of another project of mine, with hopefully more exciting news before Christmas!
Doing that analysis of elm packages also brought a lot of insight and some surprises. I’ll start with some statistics, followed by surprises (invalid packages in the elm registry).
Statistics on elm packages
It’s always fun to explore data and to share some insight on that data. So for this, I’ve generated a CSV file containing the following fields per package version: id, author, package, version, elm-version, license, direct_dep_count, total_dep_count. Once you plug that data into a tool like vega data voyager, you can start exploring it and look for interesting patterns.
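If you would rather poke at such a file from a script than from a visual tool, a few lines of Python are enough to get the same kind of counts. This is only a sketch: the column names are the ones listed above, but the three rows are invented placeholders (not real registry data), and the `count_by` helper is hypothetical.

```python
import csv
import io
from collections import Counter

# Placeholder rows with the columns described above; not real registry data.
SAMPLE_CSV = """id,author,package,version,elm-version,license,direct_dep_count,total_dep_count
0,alice,one,1.0.0,0.18.0,BSD-3-Clause,2,3
1,alice,one,1.0.1,0.19.1,BSD-3-Clause,2,4
2,bob,two,1.0.0,0.19.1,MIT,1,1
"""


def count_by(rows, field):
    """Count package versions grouped by the value of one column."""
    return Counter(row[field] for row in rows)


# With a real file, replace io.StringIO(...) with open(path, newline="").
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

per_elm_version = count_by(rows, "elm-version")
per_author = count_by(rows, "author")
per_license = count_by(rows, "license")

# Authors sorted by number of published package versions, most prolific first.
top_authors = per_author.most_common()

# Largest total dependency count seen in the sample.
max_total_deps = max(int(row["total_dep_count"]) for row in rows)
```

Each `Counter` corresponds to one bar chart: versions per elm release, versions per author, and versions per license.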
Here are some interesting plots. We can start with the number of package versions published per version of elm.
Now the most prolific authors, sorted by number of package versions published.
Here are the different licenses used in elm packages.
It is also interesting to note that most packages have a low count of direct and indirect dependencies.