Roadmap for internal packages?

Some answers pulled from this talk: https://www.youtube.com/watch?v=tISy7EJQPzI

  • stuff that isn’t tested deserves breaking
  • “live at head” (i.e. code that is left behind is abandoned). The principle is that code is built and tested frequently. Google has the means to store external stuff in a vendor folder, and maintain it if it becomes a problem.

Here is an example from our own use where it seems like a private package mechanism would have been better.

We were building a page with a datepicker. As a small company we don’t have the resources (time) to build this from scratch ourselves (especially as we are still learning Elm), so we wanted to use a package.

This was great; however, we weren’t able to do everything we wanted with the package API. There were a few modifications we had to make because we were using the datepicker in a slightly non-standard way.

We therefore forked the package and submitted our changes upstream as a PR, but we couldn’t wait around for them to be merged (and it may not even have been right for them to be merged), so we were left with the dilemma of how to distribute the altered package within our codebase.

The easiest solution (and the one we went for) was to publish our forked version of the repo as a separate package. This solved our issue but felt bad, because we were essentially cluttering up the Elm package list just so we could distribute our own code easily.

2 Likes

We use private packages to distribute application-independent concepts like shared API types and UI components. We have multiple teams in different parts of an organization, so things that rarely change (e.g. UI components) are shared across the board, while API types can be local to just a few apps.

Some apps and packages live together in multi-project repos while others are independent (different things play into this: dev preferences, history and bureaucracy). Versioning is necessary to prevent slow-moving apps from holding back progress elsewhere (like wanting to upgrade to 0.19 but being blocked by a dependency), so we tag versions for release and distribute them privately with npm.
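For concreteness, a minimal package.json for such a private package might look something like this (the scope, package name, and registry URL are invented for illustration): the Elm code is shipped purely as source files, and consuming applications point their source-directories at it.

    {
      "name": "@acme/elm-ui-components",
      "version": "2.3.0",
      "description": "Shared Elm UI components, distributed as source via a private npm registry",
      "files": [
        "src"
      ],
      "publishConfig": {
        "registry": "https://npm.internal.acme.example"
      }
    }

A consumer would then pin a version with something like npm install @acme/elm-ui-components@2.3.0 and add node_modules/@acme/elm-ui-components/src to its own elm.json.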

I’m a consultant, so I can sometimes influence how the client organises their repos, but mostly I’m limited to giving advice. I’m quite often in a position to suggest Elm, though. Very often clients have a Java/JVM stack, with Maven repos for sharing internal JARs and private npm repos for sharing JS-related stuff. In addition, many companies use proxies (local caches) for these repos, using tools like Nexus or Artifactory.

Given that clients instinctively want to be able to create shared code that they do not want to publish to the global Elm package repo, I think it’s a hard sell to convince them to use Elm and at the same time tell them that they need to reorganise their repos into a monorepo.

6 Likes

Have you attempted sharing Elm code using npm? What issues have you run into? We have started using a private npm package holding shared UI code, which seems to work fine.

Transitive dependencies can be a pain. As far as the compiler is concerned you are just including additional source directories, so there isn’t really an elm.json for the private package to speak of. You can create one for your own sake while developing the package, but you have to make sure your applications also declare all the direct and indirect dependencies of the private packages they use.
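To illustrate, a consuming application’s elm.json ends up looking roughly like this (the package path and the exact dependency versions are assumptions, not taken from a real project): the private package’s source directory is listed alongside the app’s own, and the private package’s dependencies have to be copied in by hand.

    {
        "type": "application",
        "source-directories": [
            "src",
            "node_modules/@acme/elm-ui-components/src"
        ],
        "elm-version": "0.19.1",
        "dependencies": {
            "direct": {
                "elm/browser": "1.0.2",
                "elm/core": "1.0.5",
                "elm/html": "1.0.0"
            },
            "indirect": {
                "elm/json": "1.1.3",
                "elm/time": "1.0.0",
                "elm/url": "1.0.0",
                "elm/virtual-dom": "1.0.3"
            }
        },
        "test-dependencies": {
            "direct": {},
            "indirect": {}
        }
    }

If the private package later starts importing from a new package, each consuming application has to add that package as a direct dependency itself, which is the manual bookkeeping described above.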

You also have to do manual versioning. That is less of a pain, but automatic semver would still be a nice-to-have.

Yes, a local proxy can be very important to cover the situation where the public package repo goes down and you have 20 developers sitting around unable to get on with their work. Ideally we would create plugins for Nexus and Artifactory, plus a modified toolchain that can pull from them.

1 Like

Someone showed me this link that gets into why large companies use monorepos. I encourage folks to check it out.

Separately, I made a plan for how you can do a multi-repo setup here, but I still encourage you to check out both links. They offer counterpoints to each other :wink:

I like the use of monorepos as well, but I wanted to point out for the “store everything in one repo” crowd that Google has their own DVCS and lots of tooling for it. They can (aka have no other choice than to) check out slices, since the full repo is too large for a single machine and the data is stored in their big-data infrastructure. They also have user permissions for subtrees, which Git doesn’t offer by design, as well as tooling for sweeping changes over large parts of the repo, rollbacks, and so on. I’ve read that their deployment pipelines always point to the latest master of their libraries, so everybody is pretty careful not to check in broken builds and to keep tests green :slight_smile: . Git, or your Git repo host, may also have problems with huge repos in the long run since it wasn’t designed for that use case: Bitbucket seems to have hit this, GitHub has a soft limit of 1 GB, and Facebook also struggled.

I’m also wondering if something fairly simple could be set up using Squid to proxy the package server and GitHub onto my local network:

http://www.squid-cache.org/

For example, I already set up a VM to run squid-deb-proxy, so every time I commission a new box installing packages is really fast, and I can do it without an internet connection too.
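As a rough sketch of the idea (untested, and note that package.elm-lang.org and GitHub are served over HTTPS, so actually caching their responses would also need SSL bumping or a dedicated mirror rather than a plain forward proxy), a caching squid.conf for the office LAN might start out like this:

    # /etc/squid/squid.conf -- illustrative sketch only
    http_port 3128

    # On-disk cache, large enough for package zipballs
    cache_dir ufs /var/spool/squid 20000 16 256
    maximum_object_size 200 MB

    # Only machines on the office LAN (assumed address range) may use the proxy
    acl localnet src 192.168.0.0/16
    http_access allow localnet
    http_access deny all

    # Hold on to fetched package archives for a long time
    refresh_pattern -i \.(zip|tar\.gz|tgz)$ 10080 90% 525600
    refresh_pattern . 0 20% 4320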

When? I think you are writing this as a warning to people going down this path, but I wanted to point out that they ran into this stuff once they reached a certain size. That threshold may be quite high. Does it happen at 200 engineers? Or 1000? Or 100? At any of those sizes having some people do this work doesn’t seem like that big a deal.

Relative to what? I got to watch some companies go through the “grow from 50 to 200 employees” transition, and the amount of time and energy spent on getting microservices and multi-repos working was really high. There was a dedicated team, lots of projects had to integrate with that work, and the integrations weren’t equally friendly for all the different languages. The point is, a serious amount of work exists on this path as well.

In summary: it seems like companies go from monorepo to multi-repo and back to monorepo as they grow, and I’m not convinced that means it is the fastest, cheapest, or easiest path.

Point taken; like I said, the monorepo feels like the best option to me as well, and it’s good to think about the questions you posed. That’s very useful information, by the way :slight_smile:

Google wrote their own distributed VCS at around 50,000 full-time employees.

Here’s a pretty good talk from Google about their use of a monorepo.

1 Like
