Can folks try out monorepos?

We recently had two threads about having multiple repos for a single project. Some interesting discussion about monorepos occurred as a result. I think it is worth exploring that more in practice!

Core Idea

Big companies like Google and Facebook all seem to use monorepos. They could surely introduce more repos if they wanted, but they do not. If a monorepo is nicer when things are small and nicer when things are massive, what if it is nicer when things are in between as well? What if the particular design of common tools just makes it seem otherwise?

@rtfeldman shared a very nice link Advantages of monorepos explaining the logic behind this a bit more. Very interesting!

Point is, I think it is worth exploring this path as a community and seeing where it goes!

Discussion

Many commercial users shared their concerns about monorepos in this thread, and one of the chief ones I saw was that a monorepo may require some technical investment. One poster said:

And my questions were:

@dta confirmed afterwards that Google made their own distributed version control system (DVCS) when they had 50k employees! So it seems the ceiling is quite high for more traditional tools. That poster also shared this talk about how the monorepo works at Google.

Goals

I would be very interested to hear how things go for anyone using a monorepo with Elm at work. I wager it is not very common, but if it sounds interesting to you, I encourage you to learn more about it and see if it might be a good path for you! And then to let us know! :smiley:

7 Likes

We recently had two threads about having multiple repos for a single project.

This isn’t really the level of granularity that monorepos operate at. If you want to be able to share code across a company (given the limitations of a package manager like Elm’s), then everything has to be in the monorepo, not just a “single project.” If there’s a package that is specific to a particular project, then it fits your simplified use-case, but it doesn’t fit the general use-case of reusing private packages across an entire org unless they go all-in on monorepos. There are a lot of reasons, both technical and practical, why companies can’t or won’t do this, illustrated in part by your linked Google talk:

  • It requires custom tooling that represents years of dev effort.
  • The decision to use a monorepo (or to keep one, in Google’s case) wasn’t a forethought; it was an afterthought, made in large part because of the significant effort involved in moving to a more traditional model.
1 Like

Agreed. I have no interest in trying a monorepo. I want to use standard dependencies, but have the option of storing the known package list somewhere other than package.elm-lang.org and the code somewhere other than GitHub. My proposal to add this to the elm executable didn’t get much traction. I think it will satisfy lots of people, allow lots of customization, and won’t be hard to add to the elm executable, since almost all the work is done by the extension.

I’m even willing to do the work myself to add it, if Evan is open to considering a pull request.

6 Likes

I’m a big fan of monorepos in general. Another great article that I often use to explain them is Why you should use a single repository for all your company’s projects | David R. MacIver

Unfortunately I’ve only used Elm in a monorepo in a situation where there was a single Elm project and not multiple with shared internal dependencies, so I don’t have a ton of context on that issue. I do think the package manager should be maximally flexible in this case, however. If the goal is to make it easy for many different kinds of organizations to use Elm in production, I don’t think it makes sense to presume or enforce any particular organizational structure/process.

In my experience the tooling provided by GitHub/GitLab plus some simple shell scripts is more than adequate for small to mid-size organizations using monorepos. In terms of headcount and total size of their source base, Google is operating at a very different scale than all but a handful of other companies, so pretty much any tool or process they use would represent years of effort. That doesn’t mean there aren’t benefits to the monorepo paradigm for different kinds of organizations.
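As a concrete illustration of the kind of "simple shell script" meant here, below is a minimal sketch of a monorepo CI helper that builds only the subprojects touched by a diff. Everything about the layout is an assumption for the sake of the example: it imagines each directory under `apps/` holding its own elm.json, and a `BASE_REF` pointing at the last green commit.

```shell
#!/usr/bin/env sh
# Sketch of a changed-projects CI helper for a monorepo.
# Assumed (hypothetical) layout: each subproject lives in apps/<name>/
# with its own elm.json and a src/Main.elm entry point.
set -eu

BASE_REF="${BASE_REF:-origin/main}"

# Given newline-separated changed file paths on stdin, print the unique
# subproject directories they belong to (first two path components).
projects_from_paths() {
  cut -d/ -f1-2 | sort -u
}

# Guarded so that sourcing this file for its functions has no side effects.
if [ "${CI_RUN:-0}" = "1" ]; then
  git diff --name-only "$BASE_REF" -- 'apps/' | projects_from_paths |
    while read -r project; do
      echo "building $project"
      (cd "$project" && elm make src/Main.elm --output=/dev/null)
    done
fi
```

Real setups would also need to rebuild reverse dependencies of a changed shared library, which is where Google-scale tooling comes in; for a handful of apps, a hardcoded mapping is usually enough.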

Whether or not something could work, and whether or not it’s practical or reasonable are two different things. There certainly won’t be any consensus on the matter.

What we can say for absolute certain is that if a monorepo is required to share private Elm packages across a company, it will be a hard stop for a lot of folks. It’s not reasonable to simply say “you should be using a monorepo.”

6 Likes

Yep, I completely agree.

A bit of perspective on monorepos and open source, based on working with them at Google a while back:

  • Google’s build system mostly doesn’t use version numbers. When an app or library specifies a dependency on a shared library, it doesn’t say which version. Instead, all apps are expected to use a common set of shared libraries that are all at the same version.

  • Upgrading a shared library means upgrading it for everyone. Conflicts are detected by building downstream libraries and apps and running tests, not by comparing version numbers.

  • Where possible, changes to a shared library and its dependents are done in a single commit, but often this isn’t practical. It can be easier to add a new API, upgrade the callers, then remove the old one.

  • Upgrading open source shared libraries can be a lot of work and Google can be stuck on an old version for a long time.

  • Where multiple versions of a library are needed, they are checked in under different directories. The build system treats them as different packages. Not all languages support this well.

  • Google has a lot of code in a monorepo, but open source projects often don’t use it. Where they do, there is the problem of copying new releases into the monorepo versus making changes internally and then releasing them externally. It can be painful working with two different build systems and version control tools that have different philosophies.

It would be interesting to hear how people moving away from one repo per Elm package handle versioning issues. Do you want to upgrade shared libraries as an atomic transaction? How do you plan to divide up the work for doing a tricky library upgrade that requires changes to multiple apps?

7 Likes

Some remarks about this monorepo discussion:

  • It wouldn’t feel right if Elm were forcing development practices that go way beyond the language on its users, in the same way that it shouldn’t decide which editor its users must use.

  • Be careful: encouraging corporate users to adopt a monorepo style could harm the development of open source Elm libraries (because you would then need to manage a separate repository, which goes against your habits).

  • Are the advantages of monorepos the same for SaaS companies (like Google and Facebook) and for traditional software companies that ship software to their clients / users? Of course, given the targets of the Elm language, the perspective will be mostly on what is good for SaaS but be careful not to alienate other (potential future) usages of the Elm language because of the recommended practices…

4 Likes

It seems like a number of folks are quite concerned by the possibility of monorepos being suggested as the one true way to share Elm code privately. I think it’s worth taking a step back to point out that nobody has said that. I can’t speak for Evan, but all I see here is him asking for some experience reports around using Elm in monorepos. It sounds like he knows some folks who have been successful with that approach, and I have certainly heard people speak well of it, so I think it’s perfectly reasonable to ask the community to collect more information.

While I’m personally skeptical that I could get organizational buy-in for a monorepo approach at my large but not google-massive organization, I’m open to the possibility that it might not be as difficult as I think, or that it might work better than I expect. I’d be glad to hear how attempts to adopt the practice go at other organizations. In particular, I’d like to know about:

  • Getting buy-in across different org structures
  • The adoption process for a large organization - gradual or all at once?
  • Handling shared dependencies across different products with differing release cadences.

While I can’t think of elegant solutions to some of these problems, maybe someone else can.

In the meantime, as Evan suggested here, I can experiment with building separate tooling to link dependencies as source directories. That’s great! I was already considering that approach, and I’m pretty sure it can handle about 90% of my use cases without a high investment in tooling or a need to get organizational support. I look forward to hearing how that works out for other teams as well. We can collect information about more than one approach at once!
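For reference, "linking dependencies as source directories" needs no extra tooling at all in the simplest case: an application's elm.json can point straight at sibling packages inside the same repository. A minimal excerpt, where the `../shared/...` directory names are hypothetical and the file's other fields are omitted:

```json
{
    "type": "application",
    "source-directories": [
        "src",
        "../shared/ui-kit/src",
        "../shared/api-client/src"
    ]
}
```

The tooling investment then goes into keeping those checkouts in sync, not into the compiler.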

The only thing I’d be truly disappointed to see would be a rush to add a quick fix for private packages into the compiler and package manager. I can’t stress how much I appreciate that Elm only adds features after a good deal of consideration, even if that means things are a little awkward sometimes in the short term. I recently added TypeScript to a project and it required two additional configuration files and about a dozen additional build options on top of what I already had. I don’t want that in the Elm ecosystem.

10 Likes

It seems like a number of folks are quite concerned by the possibility of monorepos being suggested as the one true way to share Elm code privately. I think it’s worth taking a step back to point out that nobody has said that.

It seems like a reasonable concern given the context in which the question is being asked. The tweet linking to this topic states outright that a “monorepo is nicer when things are small and nicer when things are massive” which aside from being a false premise, seems to be setting the stage for simply telling people they should be using monorepos. This is analogous to saying “The problem we’re trying to solve goes away when people use emacs. A lot of people prefer emacs, can people experiment with emacs?”

While I’m personally skeptical that I could get organizational buy-in for a monorepo approach at my large but not google-massive organization

This stands to reason for most people. There needs to be justification beyond “I want to experiment,” and the anecdotal evidence surrounding monorepos is not compelling enough to make that case, especially when the single-repo approach works great for the overwhelming majority of folks (outside of the context of private packages in Elm).

I recently added TypeScript to a project and it required two additional configuration files and about a dozen additional build options on top of what I already had. I don’t want that in the Elm ecosystem.

How is this equivalent? Adding an entirely new language to a project is very different from adding a dependency to an existing language in an existing project. Virtually all modern package managers support this in the form of git dependencies. At the very least, it doesn’t seem like a problem that is without precedent.

I’m not OP, but I believe we’re reading this very differently.

I don’t think it is making any claims; rather, I see a direct question there: “If it is good in X and Y situations, could it be good in between as well?” I don’t think it’s setting the stage for something. The original thread had a series of discussions that went beyond the context of Elm. From what I have read and experienced in the community, it is very common to encourage exploration and experimentation, and they don’t necessarily tie into Elm. I think this is the case here as well: an interesting (either way) discussion came up, and Evan would encourage anyone so interested to explore further and let everyone else know.

(Original thread: Roadmap for internal packages?)

When I initially saw this thread I thought “I love that we can have explorations and projects spinning off discussions!”. I did not see much reason to share that, but now I think it is a good time. Anyway, I’m sure there are many other ways to read this, but I for one wouldn’t attribute intent beyond curiosity and an exchange of ideas :slight_smile:

2 Likes

While reading up on the Aurelia blog I’ve stumbled upon some relevant information https://aurelia.io/blog/2018/08/05/aurelia-vnext/

We’ve mentioned a few advantages of our vNext work above, but we wanted to take the opportunity to call them out in more detail here as well:

  • Increased Team Agility - As previously discussed, use of Lerna 3 and TypeScript 3 projects is going to make it super easy to develop across packages, do cross-package refactoring, perform simpler integration testing, and have pain-free publishing. The entire Aurelia Core team will be able to better maintain the code and assist our community by using the new setup and architecture.
  • Easier Community Contributions - With one repo, there will be no confusion on where to post issues or how to find the code in question. Anyone who wants to help fix a bug or add a new feature can checkout one repo, run a couple of simple commands, and be able to test their contributions against the full Aurelia framework and plugin set. Our new setup also has improved CI and code coverage reporting. We’ll be adding additional automated code analysis and nightly builds in the near future as well.

This passage of an earlier post is also kind of cute (emphasis mine):

Monorepo Project Structure - Aurelia is and will remain a modular framework. Three years ago, the best way to do this seemed to be to create each module in its own GitHub repository. In practice, this has caused us a lot of problems, one of which is the “scattering” of our stars across many repos, which today causes our primary repo to show almost 4,000 less stars for Aurelia than it actually has. Fortunately, in the time since we launched the project, tools such as Lerna have emerged as superior ways to handle multi-module projects like Aurelia. We recently switched our Aurelia UX library to a monorepo and have really enjoyed the benefits that come with this. For the next version of Aurelia, we’re planning to move all core modules into a monorepo. This will make it easier to develop, report issues, contribute, test and much more.

So it seems they initially considered the monorepo in part because they thought… they could… boost their GitHub star popularity… talk about strange decisions.

https://aurelia.io/blog/2018/01/03/aurelia-2018-roadmap/

Off-topic but interesting nonetheless: their core development seems to rely on Patreon and OpenCollective sponsorship now, which is kind of telling considering that they started their venture with Enterprise support in mind from day one.

What about training, consulting and support?
We’ve partnered with several companies to begin providing training and consulting around Aurelia. You’re going to begin seeing workshops show up at major conferences this year and soon we’ll have virtual training available as well. Need commercial support? We’ll be providing that too.

I can understand the concerns - as currently framed, it would appear that the only supported solution currently available is monorepos (or else submodules), and between the nature of status quos and Evan’s stated preference not to make package managers, it’s easy to feel that this is going to be the only “correct” way to do things, which could be damaging to getting widespread community support.

However, Evan also suggested (in a separate post I believe), that it would be possible to make simple scripts that would handle pulling versions of code from whatever source you want - I’ll probably be doing a light “package manager” of this sort at my company since monorepos are unlikely to happen, and while a full dependency-matching system could be difficult, for this purpose we only need explicit version references, and matching peer dependencies could be done programmatically.
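The "light package manager" described above can be surprisingly small if you only need explicit version references. Below is one possible sketch, assuming a hypothetical manifest file (`elm-private.deps`) that pins each private package to an exact git ref; the URLs, file names, and layout are all illustrative, not an existing tool. The fetched directories would then be listed in elm.json's source-directories.

```shell
#!/usr/bin/env sh
# Sketch of a minimal script-based package manager for private Elm code.
# Manifest format (hypothetical), one dependency per line:
#   <target-dir> <git-url> <ref>
# Each pinned ref is cloned into $VENDOR_DIR/<target-dir>.
set -eu

MANIFEST="${MANIFEST:-elm-private.deps}"
VENDOR_DIR="${VENDOR_DIR:-vendor}"
GIT="${GIT:-git}"    # override with GIT=echo for a dry run

sync_all() {
  while read -r dir url ref; do
    [ -n "$dir" ] || continue            # skip blank lines
    rm -rf "$VENDOR_DIR/$dir"            # always re-fetch the pinned ref
    "$GIT" clone --depth 1 --branch "$ref" "$url" "$VENDOR_DIR/$dir"
  done < "$MANIFEST"
}

# Guarded so that sourcing this file for its functions has no side effects.
if [ "${SYNC_DEPS:-0}" = "1" ]; then
  sync_all
fi
```

Matching peer dependencies across vendored packages is the part this sketch punts on; as noted above, with explicit refs that check can be done programmatically on top.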

It seems to me that as long as the Elm compiler explicitly compiles all dependency code as “part of the app”, then a package manager can be 100% independent from Elm itself, save for the following:

  1. Source references - Elm automatically picks up packages within the dependencies list connected to the official Elm package manager. We can add our own sources, but have to do so explicitly. This can be automated, however.
  2. Official Elm version bumping support - it’s a nice tool, but currently seems to rely on comparison between a local codebase and a codebase existing on the official elm package manager server. If this tool could instead compare a codebase against a more arbitrary source, that could pave the way to any non-official package manager to integrate the tool and get consistent versioning.

So perhaps an actionable step to enable the private repo support that many are looking for without enforcing git submodules or monorepos would be to tweak the bump tool to be more arbitrary in its comparison sources? That would allow the development of any level of package manager the community wants without having to add hacks to the language itself - the only required integration with Elm would be the ability to declare multiple source directories, which seems to be a safe and supported portion of the language currently. It would also liberate the core team from having to worry so much about package manager development (though I think we all appreciate having an “official” place similar to npm where new users can find useful open source packages).

Edit: I suppose matching peer dependencies between this “custom package manager” and the official Elm one could be difficult, provided one was trying to use both simultaneously.

1 Like

I think there has been some miscommunication here. There are two posts:

  1. About Private Packages describes an easy path for people to use multiple repos without blocking other projects or committing to rushed designs. @robin.heggelund started a project called elm-git-install (robinheghan/elm-git-install), a tool for installing private Elm packages from any git URL, that explores this direction.
  2. This post is about exploring an alternate path. The hope is that maybe there is a way that is even simpler. I am encouraging people who are interested in this to explore this other direction.

One poster says “I have no interest in trying a monorepo.” That is okay. There’s a whole thread about exploring the multi-repo path. You posted in it. This thread is asking to hear from folks who use a monorepo. That is why I had a big heading called Goals that said the following:

In the time everyone spent posting in a thread that is not about them, they probably could have written the bash script needed for their multi-repo builds. But instead of writing that code, many people want to immediately push the work into elm without carefully considering the long-term implications. I would like to take a more conservative path. In developing this language, I have found it works well to explore many options before making choices with significant implications for maintenance, security, and simplicity. But this thread is not about us agreeing on any of those things. This thread is about hearing from people who use monorepos.

If you want a language that jumps into serious design choices, such languages exist out there. If you don’t like them, you should wonder if it’s because they jump into serious design choices.

9 Likes

My team uses a “monorepo”. Let me clarify why I quote the word: our company doesn’t share one large repository. Rather, our team decided that all of the projects we care for should live in one place.

Let me give an example of why monorepos are great: we use elm-export. This lets our backend generate the part of the Elm code that our frontend uses, and all the encoders and decoders that you can imagine for them. Having both our backend and our frontend in the same repository means that the shared CI tool guarantees that updates to our API are implemented on both sides. No versioning needed here.

On the other hand, another unrelated team also uses some Elm, and they’ve had to patch elm-export for some reason at some point. Had they been in the same repository as us, one of three things would have happened:

  • They would have had to fix our code in order to be able to move on,
  • We would have been pulled out of what we believe is our priority to fix our code,
  • We would have used different build targets for the two projects, effectively making virtual separate spaces inside the monorepo.

I guess everyone agrees that the first two options aren’t very attractive, so let’s focus on the third. Having one repo guarantees a working head only if the CI is applied on the whole repo. Making a diff system is essentially similar to building a versioning system. Building the mother-of-all-CI has a cost. For a collection of separate repositories, it is kind of a waste.

So if the company has separate repos, are we not a monorepo? Well, I would beg to differ. All the bits that are relevant to our domain and that we can afford to maintain are in our repository. The remaining bits are either unmaintainable by our team, or they don’t change enough to make it valuable to source them.

The point I’m getting at here is that monorepositoryness is a spectrum. On one end, you have people making repos and packages out of everything, and living with the resulting dependency hell; on the other end, you have Google, with the means (both organizational and technical) to live up to the dream (or do they?).

When we decide to unite a piece of code and its dependency, we essentially assert coupling between these pieces. Coupling comes with a cost in terms of CI length, in terms of the complexity of the system, and in terms of the cost of a single modification. On the other hand, when we segregate a dependency, there is a risk: when we cut apart two things that are coupled, we increase the risk that CI won’t catch mistakes. A good team may make good coupling decisions, but there is no recipe to avoid mistakes.

The balance between risk and cost is not something that people will universally agree on. If you have Google’s deep pockets, you will probably not want to risk bugs. On the other hand, if you have less company buy-in, or simply less money, you may pick the opposite solution.

2 Likes

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.