Sooo… my CI build failed this morning because the package server is down. This is my team's fault, not the community's.
What options are available to help mitigate this issue?
Thanks!
Merging these two topics together.
This is known, and the quickest way to get updates will probably be Slack.
The package site is back up, hooray!
Using my modly powers to change the topic so people know at a glance. Thanks for bringing it up, folks.
Event Summary
Sorry for the downtime everyone! I did a more complete explanation elsewhere, but here’s the quick summary:
DigitalOcean needed to restart some servers for Spectre/Meltdown mitigation. The package website server was one of them. I think it was possible for the disruption to be shorter, but I do not think it could have been avoided entirely.
I hope it has not been too disruptive to your work, and I don’t think this indicates a larger issue with the reliability of the site. I hope that CPU bugs will be quite rare.
Future Mitigation
@nerdyworm, one of the things I have implemented for the next release is a per-user cache of build artifacts. When you download a package, it is saved in ~/.elm. And then the first time you build it, all the build artifacts are cached there as well. That means that each package will be downloaded once per user and built once per user.
For users, this means you can start new projects or use the REPL outside of existing projects without internet. (As long as you do not need packages you have never used before.) I am excited about that!
But more importantly for the CI case, I think it’ll be possible to set up scripts to persist the ~/.elm/0.19.0/ directory between builds, which will have two benefits. (1) No external website can take you down (package website or GitHub!) and (2) you do not actually have to compile your packages, only the files you actually wrote, so it should be a bit faster.
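For anyone who wants to plan ahead, here is a rough sketch of what such a persistence script might look like. It is not an official tool, just one way to copy the cache directory to and from wherever your CI keeps files between builds. The `CI_CACHE_DIR` environment variable, the fallback path, and the function names are all made up for illustration, and it persists the whole `~/.elm` directory rather than one versioned subdirectory; the real layout may differ once the release is out.

```python
import os
import shutil
import sys
from pathlib import Path


def save_cache(elm_home: Path, saved: Path) -> bool:
    """Copy the per-user cache to persistent storage after a build.

    Returns False if there is no cache directory to save yet.
    """
    if not elm_home.is_dir():
        return False
    # dirs_exist_ok (Python 3.8+) lets us refresh an existing saved copy.
    shutil.copytree(elm_home, saved, dirs_exist_ok=True)
    return True


def restore_cache(saved: Path, elm_home: Path) -> bool:
    """Copy a previously saved cache back into place before a build.

    Returns False if nothing was saved (e.g. the very first build).
    """
    if not saved.is_dir():
        return False
    shutil.copytree(saved, elm_home, dirs_exist_ok=True)
    return True


if __name__ == "__main__":
    # Assumption: the cache lives in ~/.elm, as described in the post.
    elm_home = Path.home() / ".elm"
    # Hypothetical variable naming a directory your CI keeps between builds.
    saved = Path(os.environ.get("CI_CACHE_DIR", "/tmp/ci-cache")) / "elm"

    action = sys.argv[1] if len(sys.argv) == 2 else ""
    if action == "restore":
        print("restored" if restore_cache(saved, elm_home) else "no saved cache")
    elif action == "save":
        print("saved" if save_cache(elm_home, saved) else "nothing to save")
    else:
        print("usage: elm_cache.py [restore|save]")
```

The idea would be to run it with `restore` before the build and `save` after it. Many CI services also have a built-in setting for caching directories between builds, which would do the same job without a script.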
I have always felt that “have zero downloads” is better than “have an elaborate infrastructure to allow infinite downloads”, so this design is the realization of that feeling. I believe I outlined the per-user cache details a while back on elm-dev, and it feels weird to give updates that say “everything that was true in the last update is still true”, but it feels appropriate given the circumstances today. So @nerdyworm, I hope that information helps you and your team plan what would be the most effective moves, and again, I am sorry for the trouble the downtime caused!
Goals for this Thread
I am in the cleanup phase with my work, so I really want to focus on that so my next update can just be the full outline of an alpha release. So I cannot do a Q&A with folks in this medium to go in more depth. I’m very eager to share my work and set up productive conversations, so I think the alpha period (which will be quite long, a month or more) will be a much better time to have meta discussions about things.
This sounds great! I’ve wanted many times to dev offline and had to mess around copying and pasting folders and existing projects. Looking forward to it!
Thanks a lot for the post-mortem, much appreciated.