Maybe dependencies should actually be checked into your repo?

In his Oslo Elm Day keynote, Richard Feldman talked about reducing dependencies for long-term robustness.

I’m thinking, what if the package manager just stored packages in your project folder instead of a system folder, and you checked them into Git?

Pros

  • Ensure old Elm apps can still be built from source 20+ years in the future
  • Reduce Elm’s susceptibility to a “leftpad incident”
  • Prevent package server outages from causing CI build failures
  • Reduce the load on the package server from CI setups that don’t cache packages
  • Since packages will only be downloaded once per project, even when switching computers or adding team members, maybe it would reduce the load enough that Elm wouldn’t need to rely on GitHub at all for hosting.
  • Keeping everything in a single folder instead of storing packages in a system folder reduces potential access permission issues, and makes it easier for beginners to understand

Cons

  • Increased repo size
  • Increased package server load from people who create many apps
    • I’m guessing the total load would still be less though. Also, Elm could still use a system-wide cache in addition, if you really wanted to minimize server load.
  • Noisy Git diffs when updating dependencies
    • They’d be limited to the package folder though
    • Could even be considered a pro that you get diffs of internal package code, if you want to check them.

People who for some reason don’t want this behavior could just add the package folder to .gitignore

I don’t know about this, I think I’d much prefer an offline mode by default if traffic is an issue. I’m also already sponsoring $10/month for Elm-related projects and I wouldn’t mind adding $1 more to help finance a package server.
Versioning in Elm works really well thanks to the enforced semantic versioning. And storing packages in a central location (which is not a system directory by the way, it’s in your home) in an immutable way is a very appropriate solution in my opinion (that’s also the approach of NixOS).

@mattpiz I’m not suggesting changing much else than the storage location though, it would still use semantic versioning and checksums for packages etc.
The main issue is that other people’s GitHub repos aren’t actually immutable, and neither is the Elm package server. They could change or disappear altogether over the years.

Does anyone know if you can build an Elm 0.16 or 0.17 project today? Are packages and tools still available?
Imagine you start working at a new place and they have an Elm 0.15 project that nobody has touched in 5 years. Your task is to change/fix something. Would you be helped if packages were included in the repository? My guess is yes, but I don’t know.

We used to commit all packages into the repo back in 0.18, when they were stored in elm-stuff, for this exact reason. We also do it with the equivalent in PHP (vendor dir) in our backend projects.

I like this idea for its practicality, despite not liking the idea of checking in dependencies.

Another way might be if we had a 3rd party package management tool - such a tool would:

  • Run on your local network but fetch from the central repo as its upstream.
  • Enforce immutability and tell you if it is violated upstream (not sure how you would resolve that though…).
  • Act as a local cache to reduce dependency on the central repo.
  • Allow you to have private packages inside your org.
  • Allow you to trial publish locally, before pushing upstream.

I’ve thought for a long time that this is something that Elm really needs.

It’s a bit of a shame that the packaging tool is bundled in the compiler monolith. It would be better if they were separate CLI tools, in my view.

I strongly disagree. Having a separate cli tool for the package manager would add unnecessary complexity. Having the compiler and the package manager in the same program ensures that they work correctly together.

“Another way might be if we had a 3rd party package management tool”

Noooooo! I don’t want a third party package manager. Having one package manager that just works and is integrated with the rest of the compiler is a big advantage that Elm has.

It’s the Unix way: one job, one CLI tool.

But it is turning into a disadvantage because there are no plans to change it. That is why it would be better if it were separate; then the community could develop something better.

@rupert A system like you’re talking about would certainly be nice for organizations to manage internal packages, and is more robust than relying on other people’s GitHub repos. Setting up and maintaining your own package server is not for everyone though, and thinking from the 20+ year perspective, it’s not as robust as keeping everything you need in the same place.

Couldn’t you have a script in your repo to set ELM_HOME to be inside the repo and then check in your dependencies that way? At work we usually run a script when switching to work on another project that sets up a bunch of env variables specific to that project.

An idea I’ve kicked around from time to time is a language community that lives entirely in a single monorepo and does away with the whole idea of packages. It would be a great way of forcing all packages to play nicely with each other and also allow for plenty of experimentation and hotfixes for any individual app.

There are reasons why this doesn’t work well with git (which is the primary reason this is a pie-in-the-sky thought for now), but there are other VCSes that could potentially handle this better (see e.g. some of the ideas behind Pijul). There’s also a whole host of weird things that can happen around permissioning.

But with the right tools, it would be a very interesting experiment.

Overall, I don’t really see the benefit over just caching ~/.elm.

If you really want to use git as a cache, you can set the ELM_HOME variable to move the ~/.elm directory wherever you like. So you can run commands like

ELM_HOME=/whatever/you/want/ elm make

Then you can commit the resulting directory however you like. Maybe within the project you have a little bash script that sets ELM_HOME before calling elm. Then the contents of that directory will be specific to the project.
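
A rough sketch of what such a wrapper might look like, assuming a POSIX shell. Only the ELM_HOME variable itself comes from the discussion above; the function name elm_local, the PROJECT_ROOT variable, and the elm-home directory name are made up for illustration.

```shell
# Hypothetical wrapper: run elm with a package cache kept inside the project.
# PROJECT_ROOT (assumed name) defaults to the current directory.
# The body runs in a subshell, so nothing leaks into the calling shell.
elm_local() (
  root="${PROJECT_ROOT:-$PWD}"
  # "elm-home" is an assumed directory name; pick whatever suits the repo.
  ELM_HOME="$root/elm-home" elm "$@"
)
```

With this in a file that gets sourced when you start working on the project, something like `elm_local make src/Main.elm` would fill ./elm-home instead of ~/.elm, and that directory could then be committed.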

Seems like an easy way to get all the benefits you want, but without any changes except in your own workflow.

Aside: Package Facts

  1. If the compiler cannot reach the package website, it will try to build with the contents of ~/.elm so that people can begin new projects offline, as long as they have used the packages they want once before.

  2. I believe the package website can build any project for any past version. I recall looking at the logs to assess this, and I believe there are even some 0.16 projects out there still. But people can just save elm-stuff/ or ~/.elm as needed for their particular version.

  3. The best way to reduce load on the servers is to cache ~/.elm on CI. That will also make your CI faster and more reliable. Lots of things can go wrong if your builds need to go out to the internet. Dreamhost DNS went down recently. GitHub goes down sometimes. Etc. Whether you cache with git commits or some other way, the best protection is to not need to make HTTP requests outside of your own system!
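
For CI systems without a built-in cache step, one way to sketch the "cache ~/.elm" idea is to pack the directory into an archive at the end of a build and unpack it at the start of the next. The function names and the archive path below are assumptions for illustration, not part of any Elm tooling; only tar and the ELM_HOME variable are real.

```shell
# Hypothetical helpers for saving and restoring the Elm package cache.
# Both use ELM_HOME if it is set and fall back to ~/.elm otherwise.

save_elm_cache() {
  # $1: archive file to write (should live outside the cache directory)
  tar -czf "$1" -C "${ELM_HOME:-$HOME/.elm}" .
}

restore_elm_cache() {
  # $1: archive file previously written by save_elm_cache
  mkdir -p "${ELM_HOME:-$HOME/.elm}"
  tar -xzf "$1" -C "${ELM_HOME:-$HOME/.elm}"
}
```

A build could call `restore_elm_cache` before `elm make` when an archive from an earlier run exists, and `save_elm_cache` afterwards. Where the archive is stored between runs depends entirely on the CI system.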

It’s not so much about using git as a cache, more about using it as a backup. Following Joe Armstrong’s quote:

Code I wrote 25 years ago with zero dependencies still works today. Code I wrote 5 years ago with external dependencies often fails.

And also that if this was the default, it would prevent repeated downloads from CI. Changing the system seems more reliable than telling everyone to configure their caches correctly. Maybe that’s not a huge problem anyway though, especially with GitHub providing free hosting.

I was thinking more of having the most robust option as the default, but neat trick with ELM_HOME, might use that for myself!

Reducing dependencies has a lot of benefits, even outside of build failures. I think that can be separated from the suggestion here though.

If your suggestion was the default, it would still be up to everyone to “configure their version control correctly” such that they are (1) using version control in the first place and (2) committing this directory and managing changes through that system.

So the suggestion here also relies on recommendations and “correct configuration” to work well, but I suspect far fewer people would “correctly configure” their system like this. Putting aside people who wouldn’t want to do this through version control, many students in high school and college are not using version control at all. I was not, at least! From there, “misconfigured” systems would result in a higher volume of requests to various services since downloads are per-project.

It looks like you can experiment with ELM_HOME to see if the approach you suggest works well for you. If so, perhaps it would be useful to compare to caching ~/.elm so others can better understand the tradeoffs that you faced when evaluating different options.

All the code you write, you will have to maintain and upgrade with every Elm version.

I’d suggest that focusing on what makes your app unique is much better than reinventing the wheel and not standing on the shoulders of others.

A much better way would be not to use elm install but git submodule.

The current system has an “opt-in” for the scenario you describe. This means that if the person wants to get the pros you described they can use the ELM_HOME facilities only in the projects where this makes sense.

I understand the need but I do not want countless copies of various libraries in my system by default. Elm used to do this and I am very happy that the package cache moved to ~/.elm by default.

Wow, topical, just now our builds started failing because something has broken at NPM and the package @uirouter/angularjs is no longer available :sweat_smile:

I know we used to have our own NPM proxy for CI but I guess it must have stopped working as well.

Would this expose your project to internal package modules that the package author has chosen not to expose, which you might accidentally use in your project? Furthermore, possibly introducing naming collisions between some packages’ internal modules (e.g., I’ve seen Utils in a few packages)?

GitHub and Elm packaging are exactly the same in this respect, so there is no difference there: the ~/.elm/ directory is basically an exact copy of your GitHub repository.
