ezyang’s blog

the arc of software bends towards understanding

Online/offline continuous integration

Raise your hand if you've ever put one of these commands in your continuous integration scripts:

  • apt install somepackage
  • pip install -r requirements.txt or pip install somepkg
  • conda install blah
  • cabal update or cabal install blah
  • git clone https://github.com/someguy/somerepo
  • wget http://some-website/thingy-latest.tgz

Can you tell what the problem is? These commands are not reproducible: depending on when you run them, they may give different results. More insidiously, most of the time they give you the same result (or, perhaps, a different result that still works for your use case).
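To make this concrete, here is a hypothetical illustration (the version numbers are made up for the example) of how the same command can drift out from under you:

    # The same command, run months apart, resolves to different versions:
    $ pip install requests     # January: installs requests 2.25.0
    $ pip install requests     # June:    installs requests 2.26.0

    # Pinning the direct dependency helps...
    $ pip install requests==2.25.0
    # ...but transitive dependencies (urllib3, certifi, ...) can still float.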

I know, we need a reproducible build! The prevailing answer to this problem by tooling authors has been to seize the means of production and replace it with something that is reproducible. If you live in the npm/yarn ecosystem, lockfiles ensure all of your dependencies redownload the same way every time (except when they don't). If you live in the Stack ecosystem, Stackage distributions ensure that you get the same Hackage package every time you build (except when they don't...). If you live in the Nix ecosystem, it means literally replacing the packaging system for everything on your system to achieve reproducibility.

So, it seems:

  1. If you can live entirely within the walled garden of the tools you use, things are pretty reproducible, but you're still on your own when it comes to taking updates on your dependencies.
  2. As soon as you step outside of the garden, it's entirely up to you to ensure reproducibility. The "easy way" out is usually not reproducible.

What if we change the question? We have entered this discussion under the assumption that reproducibility is our terminal value. But it's not: it's the mechanism by which we can achieve other goals. In the setting of continuous integration, what we really care about is a system that gives us signal about whether or not a given change set is correct or breaks things. A non-reproducible build interferes with this goal only in the sense that it's harder to tell if a change set has broken things when some random dependency has updated itself and broken your build. If this happens, you are blocked: you won't get clean signal until you fix the dependency problem. Broken window theory demands you drop everything and fix the build.

Clearly, we don't care if our dependencies are getting silently upgraded as development proceeds; in fact, we might prefer it, because "automatic" is less friction than "manual", at least when it works. What we do care about is the ability to block the upgrade if it is known to break us or revert the upgrade if we find out later that it caused some breakage.

Online/offline continuous integration. We traditionally think of the continuous integration build as a single pipeline which, when run from beginning to end, gives us signal about whether or not our code works. But I think it is better to think of a CI pipeline as dividing into two phases:

  1. Online environment configuration. In this stage, you download all of the external software from that fiddly third-party world, setting up a complete build environment. Once you are done, you snapshot this environment by some mechanism (e.g., take a filesystem snapshot or build a Docker image).
  2. Offline actual build and test. In this stage, within the snapshotted environment from step (1), turn off your Internet connection and run the actual build and test. (A sketch of both phases follows this list.)
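
To sketch what this might look like with Docker (the image name, Dockerfile.env, and make targets are placeholders for whatever your project actually uses): the online phase bakes all the fiddly downloads into an immutably tagged image, and the offline phase runs inside it with networking disabled.

    #!/bin/sh
    # Phase 1 (online): build an environment image containing all of the
    # third-party dependencies, and give it an immutable, dated tag.
    # Dockerfile.env is where the apt/pip/cabal/wget commands live.
    TAG="myproject-env:$(date +%Y%m%d)"
    docker build -f Dockerfile.env -t "$TAG" .

    # Phase 2 (offline): run the actual build and tests inside the snapshot,
    # with networking turned off to keep you honest.
    docker run --network=none --rm -v "$PWD:/src" -w /src "$TAG" \
        sh -c "make && make test"

Because the tag is dated rather than :latest, every environment you have ever built stays addressable; "reverting a dependency upgrade" is just pointing your build back at an older tag.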

The key is that you don't have to run step (1) every build (you didn't want to anyway, for performance reasons). Instead, the series of immutable snapshots of build environments generated by step (1) gives you the ability to revert or peg to a particular version of all of your dependencies, without having to go and make the universe reproducible. You can have a weekly cronjob rebuilding your environment, running the tests, and only deciding to push the active snapshot forward if everything passes. You don't have to actually turn off the Internet when you run step (2), but it might help keep you honest.
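
A minimal sketch of that cronjob, continuing the hypothetical Docker setup from above: the "active" tag is what ordinary CI runs build against, and it only advances when the freshly rebuilt environment passes the tests.

    #!/bin/sh
    # Weekly cron: rebuild the environment, test against it, and promote
    # the snapshot only if everything passes.
    TAG="myproject-env:$(date +%Y%m%d)"
    docker build -f Dockerfile.env -t "$TAG" . || exit 1

    if docker run --network=none --rm -v "$PWD:/src" -w /src "$TAG" \
           sh -c "make && make test"; then
        # Green: point the active tag at the new snapshot.
        docker tag "$TAG" myproject-env:active
    else
        # Red: leave myproject-env:active alone; day-to-day CI keeps
        # getting clean signal against the last known-good environment.
        echo "environment refresh broke the build; keeping old snapshot" >&2
    fi

Reverting later is symmetric: retag an older snapshot as active.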

Think offline. In today's connected world, it's easy to build systems with the assumption that you are always connected to the Internet. Doing so, however, leaves your tool at the mercy of the sound and fury of the real world. By applying a simple principle, "what can I do offline; what must I do online?", we reverse-engineer a design for continuous integration that gives you something almost as good as reproducibility, without forcing you to rewrite the universe. Surely that's worth something.