Hoping GHA ubuntu-latest-4-cores will stop CI crashes

As I have noted in the past, some large Elm apps that I contribute to regularly crash during compilation on GHA, sometimes with an explicit OOM. We’re going to try ubuntu-latest-4-cores (see “Using larger runners” in the GitHub Docs) and see if that stops the crashes. I’ll update this thread in a week to report whether the crashes stopped after the switch.

Right now, somewhere between 5% and 10% of our Elm CI builds crash during compilation.

Hey @kanishka - I know you’ve posted before about these issues and some folks suggested workarounds - couldn’t find the threads at a glance.

What’s the current state of affairs regarding flags you’re using with the compiler?

Do you also have any sense as to which part of the codebase might be causing the problem?

I think we started by applying GHC memory limits, which seemed to reduce the failure rate a bit, but we still had failures. Then we switched to a different garbage collection mode in GHC, which reduced the failure rate further, but failures still persist (possibly because the number of modules keeps growing). I will add a comment with the exact settings later.

When I get some downtime, I can take our largest app and attempt a binary search over the number of pages in the app until I get a smaller reproducing example. I haven’t attempted that yet. The annoying part is that I need to run each configuration 20 times to determine whether the failure rate has changed. I will reflect on whether I can force the error to reproduce at a higher rate through some artificial constraint, like bounding the memory artificially low, but I am unsure whether that will change the nature of the problem. I think this would be much easier to diagnose if the compiler were implemented in a strict language. (I wonder if Unison has hit these issues and how they mitigate them.)
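As a rough sanity check on the "20 runs per configuration" figure, here is a minimal Python sketch of the arithmetic. It assumes each CI build fails independently with a fixed probability, which may not hold for real builds (for example, if failures cluster on particular runners). The function names `prob_no_failures` and `runs_needed` are made up for illustration.

```python
import math


def prob_no_failures(failure_rate: float, runs: int) -> float:
    """Chance that `runs` independent builds all pass at the given per-build failure rate."""
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of consecutive clean builds such that a streak this long
    would happen with probability <= 1 - confidence if the per-build failure
    rate were still `failure_rate`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # 5% and 10% are the failure rates mentioned earlier in the thread.
    for rate in (0.05, 0.10):
        print(
            f"failure rate {rate:.0%}: "
            f"P(20 clean runs) = {prob_no_failures(rate, 20):.3f}, "
            f"clean runs needed for 95% confidence = {runs_needed(rate)}"
        )
```

Under that independence assumption, 20 clean runs in a row would still happen about 36% of the time at a 5% failure rate, so a clean batch of 20 is weak evidence that a configuration fixed anything. It takes roughly 59 clean runs at 5% (or 29 at 10%) before a clean streak becomes unlikely by chance.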

> I think this would be much easier to diagnose if the compiler was implemented in a strict language.

I’m curious what’s making you think this is related to laziness specifically?

If you’re interested in some compiler-level support in debugging this further with access to your codebase under commercial NDA I’d be happy to help – feel free to DM me :slight_smile:

I have no idea if it’s related to laziness. I am just unsure whether changing constraints like available memory or the number of modules compiled will cause the shape of the problem to change in a non-strict compiler implementation. If it were strict, I would have less hesitation about changing such variables and would be more confident that I am observing the same problem.

I’m going to tentatively claim success with this configuration. I still intend to update this thread with the history of elm compiler / GHC runtime options that we have used along the way.
