About Private Packages

The other day someone asked here about having private code within their company. Here are some ideas that make sense to me.

Idea

It sounds like the root thing people want is not that intense. Imagine you have a file called multi-repo.json like this:

{
    "https://github.com/company/thing.git": "a41fee"
}

And then you have a little bash script called setup-multi-repo that takes the contents of this record and:

  1. Makes sure everything is downloaded properly / lets you know if there are any local changes.
  2. Reads elm.json and adds in the local directories as "source-directories"

Now you can call elm and it can figure everything out.
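
For anyone who wants to try this, here is a rough sketch of the two steps, written in Python rather than bash to keep the JSON handling short. Nothing here comes from an existing tool: the `multi-repo` checkout folder, the function names, and the choice to point at each checkout's `src` folder are all assumptions made for illustration.

```python
import json
import subprocess
from pathlib import Path

CHECKOUT_ROOT = Path("multi-repo")  # hypothetical folder that holds the clones


def local_dir(url: str) -> Path:
    """Map a clone URL to a local checkout directory (.../thing.git -> multi-repo/thing)."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return CHECKOUT_ROOT / name


def sync_repo(url: str, commit: str) -> None:
    """Step 1: clone if missing, stop on local changes, check out the pinned commit."""
    target = local_dir(url)
    if not target.exists():
        subprocess.run(["git", "clone", url, str(target)], check=True)
    status = subprocess.run(
        ["git", "-C", str(target), "status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    if status.stdout.strip():
        raise SystemExit(f"{target} has local changes; commit or stash them first")
    subprocess.run(["git", "-C", str(target), "fetch", "--quiet"], check=True)
    subprocess.run(["git", "-C", str(target), "checkout", "--quiet", commit], check=True)


def add_source_directories(elm_json: dict, repos: dict) -> dict:
    """Step 2: return a copy of elm.json with each checkout's src folder listed once."""
    dirs = list(elm_json.get("source-directories", []))
    for url in repos:
        src = (local_dir(url) / "src").as_posix()  # assumes the code lives in src/
        if src not in dirs:
            dirs.append(src)
    return {**elm_json, "source-directories": dirs}


def main() -> None:
    """Run both steps against multi-repo.json and elm.json in the current directory."""
    repos = json.loads(Path("multi-repo.json").read_text())
    for url, commit in repos.items():
        sync_repo(url, commit)
    elm_path = Path("elm.json")
    updated = add_source_directories(json.loads(elm_path.read_text()), repos)
    elm_path.write_text(json.dumps(updated, indent=4) + "\n")
```

Calling `main()` from the project root would run both steps. Running it twice should leave elm.json unchanged the second time, since a directory is only added if it is not already listed.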

Considerations

Given what I have heard so far, different companies want different things from this script. Maybe some people use GitHub, but maybe others use something else. Maybe some people already pay npm money, and can just go off that. Maybe some people want versions and some people want commit hashes. It seems hard for me personally to design for those situations. I’m not in those situations, and people seem to really dislike my intuitions about what makes sense.

From the very earliest designs on the compiler and packages, I knew that I do not want to be in the build tool business. There is a slow process where you add things one at a time until you have accidentally made a Turing-complete language for build scripts. Oopsie! Now you have an accidentally designed language that feels complex for even the simplest task because everybody has different needs and they all turned into features. So I have been pretty specific about making sure elm.json is just about data so far.

Furthermore, the fact that we have a GitHub integration for packages already makes people mad. They want integrations with all the other things. And all the future other things. I don’t like having integrations in the first place, so my goal would be to have zero rather than N. This project seems like another “we want N integrations” path, and I don’t have resources to have final responsibility for that sort of project.

So I think folks should just write scripts exploring this and share them if they want to. My only ask is that any efforts along these lines are marked as “for companies” or “for professional use” such that people do not come to Elm under the impression that they need a fancy build tool for anything. For a company that uses a monorepo (and studies how and why Google uses that strategy) you do not need any of this. This is for a specific use-case, and my main anxiety is that “beginner tutorials” would point folks to tools for specific professional users.


Oh, and if you want to simulate versioned packages, you could have a little file in each private repo like:

{
    "1.0.0": "a41fee",
    "1.1.0": "88c21a"
}

I don’t think you can get elm bump to do the version computation for you, but my point is just that if you want to do something other than commit hashes, it is not too tough to create a little file to do that.
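
As an illustration of how little code such a version file needs, here is a small Python sketch that resolves a version to its commit hash and finds the newest listed version. The function names are made up for this example, and it assumes every key is a plain MAJOR.MINOR.PATCH string.

```python
import json


def parse_version(text: str) -> tuple:
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically, not as strings."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def commit_for(versions: dict, wanted: str) -> str:
    """Look up the commit hash recorded for an exact version."""
    if wanted not in versions:
        raise KeyError(f"version {wanted} is not listed")
    return versions[wanted]


def latest(versions: dict) -> str:
    """Return the highest version listed in the file."""
    return max(versions, key=parse_version)


versions = json.loads('{"1.0.0": "a41fee", "1.1.0": "88c21a"}')
print(latest(versions), commit_for(versions, latest(versions)))  # prints: 1.1.0 88c21a
```

Comparing parsed tuples instead of raw strings matters once a component reaches two digits: as strings, "1.9.0" sorts after "1.10.0".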

Note: It seems really easy to just keep adding ideas and features on top of this, so I encourage anyone who pursues this to try to cut things down as much as possible. Start with less than what you think you might want and see how it goes. If you can do without a file, don’t have it. Etc. Cut everything you can.

I’ve begun working on a tool that solves the needs at our company. I’m just getting started, but I’ll hopefully have something useful soon: https://github.com/Skinney/elm-git-install


Hi Robin,

You should probably use another file than elm.json, since that file is rewritten on elm install x/x, unless your script blocks that command usage entirely?

Yes, there’s already an issue about that in the github repo.

That being said, let’s not discuss my solution in this thread. Let’s use either github or slack for that purpose.

It seems to me that the Elm compiler could enable extensions for package repositories without doing much work at all itself.

From my investigations of how 0.19 works, there are three things the compiler needs:

  1. Get a JSON representation of all known packages and their versions. This currently comes from https://package.elm-lang.org/all-packages

  2. Get a JSON representation of packages that have been published since the last locally known package. This currently comes from, e.g. https://package.elm-lang.org/all-packages/since/6800

  3. Download version <x.y.z> of package <owner>/<package>. This currently comes from tag <x.y.z> of github.com/<owner>/<package>.git

Information for 1 and 2 is stored in ~/.elm/0.19.0/package/versions.dat.

The source code is stored in ~/.elm/0.19.0/package/<owner>/<package>/<x.y.z>, and compiled versions of that source are stored in the files cached.dat, ifaces.dat, and objs.dat.

The relevant parts of elm.json look like:

"dependencies": {
    "elm/core": "1.0.0 <= v < 2.0.0",
    "elm/json": "1.0.0 <= v < 2.0.0",
    "elm-community/list-extra": "8.0.0 <= v < 9.0.0"
}

This could be extended by putting a prefix on the <owner>/<package> strings:

"dependencies": {
    "elm/core": "1.0.0 <= v < 2.0.0",
    "elm/json": "1.0.0 <= v < 2.0.0",
    "elm-community/list-extra": "8.0.0 <= v < 9.0.0",
    "elm-gitlab:spisemisu/elm-utf8": "1.0.0 <= v < 2.0.0"
}

This denotes that there is an executable named elm-gitlab in the user’s PATH, and it can be called three ways, to perform the three functions needed by the package system:

  1. elm-gitlab all-packages
  2. elm-gitlab all-packages --since 6800
  3. elm-gitlab fetch spisemisu/elm-utf8 --version 1.0.1 --output ~/.elm/0.19.0/elm-gitlab/package/spisemisu/elm-utf8/1.0.1

Suitably-named subdirectories of ~/.elm could be used to store the downloads and compiler output for packages from each extension executable.

This is pretty minimal work in the compiler, and allows a wide range of extension types, including pretty much all other repository systems, and local files.
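
To make the proposal concrete, here is a hedged Python sketch of the dispatch the compiler would need: split a dependency key into an optional extension prefix plus owner and package, then build the argument list for the fetch call. This is not how the Elm compiler works today; the `Dependency` type and both function names are invented for illustration, and the output path simply follows the example given in call 3.

```python
from typing import List, NamedTuple, Optional


class Dependency(NamedTuple):
    extension: Optional[str]  # e.g. "elm-gitlab"; None means the default package site
    owner: str
    package: str


def parse_dependency(key: str) -> Dependency:
    """Split "elm-gitlab:spisemisu/elm-utf8" or plain "elm/core" into its parts."""
    extension, colon, name = key.rpartition(":")
    owner, slash, package = name.partition("/")
    if not slash or not owner or not package:
        raise ValueError(f"expected [extension:]owner/package, got {key!r}")
    return Dependency(extension if colon else None, owner, package)


def fetch_command(dep: Dependency, version: str, elm_home: str = "~/.elm/0.19.0") -> List[str]:
    """Build the argv for the proposed fetch call; only prefixed dependencies have one."""
    if dep.extension is None:
        raise ValueError("no extension prefix; the built-in package site handles this one")
    name = f"{dep.owner}/{dep.package}"
    # The caller would still need to expand "~" before running the command.
    output = f"{elm_home}/{dep.extension}/package/{name}/{version}"
    return [dep.extension, "fetch", name, "--version", version, "--output", output]


dep = parse_dependency("elm-gitlab:spisemisu/elm-utf8")
print(" ".join(fetch_command(dep, "1.0.1")))
```

The printed line should match call 3 in the list above. Unprefixed keys such as "elm/core" parse with `extension` set to `None`, so existing elm.json files would keep their current meaning.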


From what you gathered @billstclair, and what @evancz wrote up, I distinguish a few options available if the scripts path is chosen:

  • Allowing custom fields in elm.json, or encouraging elm-*.json additional files?
  • Should the .dat (or even ~/.elm/) files be modified by those scripts, or should the compiler provide methods to do that? Maybe install locally any repos, like sbt’s publishLocal?

No, never.

I want to very strongly discourage people from modifying the cache. It is just a cache.

If you want to do things fancier than what I described (which I discourage) please do not mess with ~/.elm.

I think modifying "source-directories" and having extra elm-whatever.json files is a much better path.


I really don’t understand why the tight integration with github is necessary. I think a tight integration with git is sufficient.

The package manager could be designed to just interpret each specified dependency as a mapping of git cloneable stuff to a git commitish.

As git clone can accept different protocols and even paths to local folders containing a git repository, this would solve all use cases I have. Instead of only accepting a version number, the package manager could accept anything that git accepts in a git checkout: this could be a tag (maybe in valid version number format) or the sha of a commit.

By adding shorthands, e.g. rewriting cloneables in the format “author/project” to “https://github.com/author/project.git” this would even be 100% backwards compatible.

To download a package, instead of getting the zipball from github, just make a shallow fetch with --depth 1 of the git repository URL.
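
A rough Python sketch of this model, under assumptions of my own: the function names are hypothetical, and the shorthand rule (exactly one slash, no colon, not starting with "." or "~") is just one plausible way to tell "author/project" apart from URLs and local paths. It only builds the git commands rather than running them.

```python
from typing import List


def expand_cloneable(spec: str) -> str:
    """Rewrite the "author/project" shorthand to a GitHub URL; pass anything else through."""
    parts = spec.split("/")
    looks_like_shorthand = (
        len(parts) == 2
        and all(parts)
        and ":" not in spec                     # rules out https://... and git@host:...
        and not spec.startswith((".", "~"))     # rules out ./repo and ~/repo
    )
    # Caveat: a relative local path such as "vendor/thing" is also read as shorthand.
    return f"https://github.com/{spec}.git" if looks_like_shorthand else spec


def shallow_fetch_commands(spec: str, commitish: str, dest: str) -> List[List[str]]:
    """Commands that fetch only the requested tag, branch, or commit into dest."""
    url = expand_cloneable(spec)
    return [
        ["git", "init", "--quiet", dest],
        ["git", "-C", dest, "remote", "add", "origin", url],
        ["git", "-C", dest, "fetch", "--depth", "1", "origin", commitish],
        ["git", "-C", dest, "checkout", "--quiet", "FETCH_HEAD"],
    ]


for command in shallow_fetch_commands("elm/core", "1.0.0", "deps/elm-core"):
    print(" ".join(command))
```

One thing to check before relying on this: fetching by commit hash generally needs the full hash and a server that allows fetching arbitrary commits, whereas tags and branches work everywhere.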

Are there any downsides to this model compared to how it currently works?

A tight integration with git can also be used to keep a snapshot of the Elm package database.
Instead of always getting the whole state of the package ecosystem using an HTTP request, the official Rust package manager, for example, gets the newest list of published packages by updating its local checkout of https://github.com/rust-lang/crates.io-index. This is really ingenious, since you get diffing and compression for free.


Edit: seems like Skinney/elm-git-install follows exactly this idea :)


What if it was possible to create a private Elm package repo, so you could have http://elm-package.mycompany.com and it would work the same as the normal repo, but would require some form of authentication?

This way the company gets semantic versioning and all the other features of the normal Elm package repo, but internally.

My idea supports that and more, with only small changes to the elm binary.

I wanted a solution with as few changes as possible, so I just added ".github:norpan:elm-html5-drag-drop:3.0.0" to source-directories in elm.json, and then made a little script to download from GitHub directly into that directory, so that Elm can use the files.

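#!/bin/bash
set -euo pipefail

# Every "source-directories" entry of the form .github:<owner>:<repo>:<version>
# is downloaded from GitHub, unless that directory already exists.
jq -r '."source-directories"[]' elm.json | while IFS= read -r f; do
  if [[ $f == .github:* ]] && ! [ -d "$f" ]; then
    IFS=: read -r -a fields <<<"$f"
    mkdir -p "$f"
    # Extract only <repo>-<version>/src and drop those two leading path components.
    # Naming the directory (no glob) works with both GNU tar and bsdtar.
    curl -fsSL "https://codeload.github.com/${fields[1]}/${fields[2]}/tar.gz/${fields[3]}" |
      tar -C "$f" -xz --strip-components=2 "${fields[2]}-${fields[3]}/src"
  fi
done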

Now, you have to add dependencies yourself, and the source must be in the src directory, but it can of course be improved to copy dependencies from the package’s elm.json etc. But the main point is that no extra files are needed.
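
As a first step toward that improvement, here is a small Python sketch that reports which of a downloaded package's dependencies are not yet direct dependencies of the application, so they can be installed by hand with elm install. The function name is hypothetical. It assumes the standard 0.19 layouts: a package's elm.json maps names to version ranges under "dependencies", and an application's elm.json splits them into "direct" and "indirect".

```python
def missing_dependencies(app_elm_json: dict, package_elm_json: dict) -> list:
    """Names the package depends on that the application does not list as direct."""
    # Only "direct" counts: source files may only import from direct dependencies.
    direct = app_elm_json.get("dependencies", {}).get("direct", {})
    needed = package_elm_json.get("dependencies", {})
    return sorted(name for name in needed if name not in direct)


app = {
    "type": "application",
    "dependencies": {
        "direct": {"elm/core": "1.0.0", "elm/html": "1.0.0"},
        "indirect": {"elm/json": "1.0.0"},
    },
}
package = {
    "type": "package",
    "dependencies": {
        "elm/core": "1.0.0 <= v < 2.0.0",
        "elm/html": "1.0.0 <= v < 2.0.0",
        "elm/json": "1.0.0 <= v < 2.0.0",
    },
}
print(missing_dependencies(app, package))  # prints: ['elm/json']
```

The sketch deliberately stops at reporting names. Picking exact versions that satisfy the package's ranges is the hard part, and that is probably best left to elm install.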

