ezyang's blog

the arc of software bends towards understanding

2016

A tale of backwards compatibility in ASTs

Those who espouse the value of backwards compatibility often claim that it is simply a matter of never removing things. But anyone who has published APIs that involve data structures knows that the story is not so simple. I’d like to describe my thought process on a recent BC problem I’m grappling with in the Cabal file format. As usual, I’m always interested in any insights and comments you might have.

Read more...

Backpack and the PVP

In the PVP, you increment the minor version number if you add functions to a module, and the major version number if you remove functions from a module. Intuitively, this is because adding functions is a backwards compatible change, while removing functions is a breaking change; to put it more formally, if the new interface is a subtype of the old interface, then only a minor version number bump is necessary.
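
To make the rule concrete, here is a sketch of my own (the package name foo and the version numbers are made up, not taken from the post):

module Data.Foo
  ( foo  -- exported since foo-1.0
  , bar  -- added in foo-1.1: a pure addition, so a minor bump suffices
  ) where

foo :: Int -> Int
foo = (+ 1)

bar :: Int -> Bool
bar = even

-- Deleting foo (or changing its type) would make the new interface no longer
-- a subtype of the old one, so the next release would have to be foo-2.0.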

Read more...

Left-recursive parsing of Haskell imports and declarations

Suppose that you want to parse a list separated by newlines, but you want to automatically ignore extra newlines (just as import declarations in a Haskell file can be separated by one or more newlines). Historically, GHC has used a curious grammar to perform this parse (here, semicolons represent newlines):

decls : decls ';' decl
      | decls ';'
      | decl
      | {- empty -}

It takes a bit of squinting, but what this grammar does is accept a list of decls, interspersed with one or more semicolons, with zero or more leading/trailing semicolons. For example, ;decl;;decl; parses as:
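
Tracing the grammar by hand (a derivation of my own, expanding the remaining decls at each step and labeling it with the production used):

decls
  => decls ';'                          -- decls : decls ';'        (trailing ';')
  => decls ';' decl ';'                 -- decls : decls ';' decl   (second decl)
  => decls ';' ';' decl ';'             -- decls : decls ';'        (extra ';')
  => decls ';' decl ';' ';' decl ';'    -- decls : decls ';' decl   (first decl)
  => ';' decl ';' ';' decl ';'          -- decls : {- empty -}      (leading ';')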

Read more...

The problem of reusable and composable specifications

It’s not too hard to convince people that version bounds are a poor approximation of the particular API that we depend on. What do we mean when we say >= 1.0 && < 1.1? A version bound is a proxy for some set of modules and functions, with some particular semantics, that a library needs in order to build. Version bounds are imprecise; what does a change from 1.0 to 1.1 mean? Clearly, we should instead write down the actual specification (either types or contracts) of what we need.
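
To give a flavour of what “writing down what we need” could look like, here is a sketch of my own in the spirit of a Backpack-style signature (the Data.Queue module and its entities are hypothetical; the post itself is not committed to this particular mechanism):

signature Data.Queue where

data Queue a

empty :: Queue a
push  :: a -> Queue a -> Queue a
pop   :: Queue a -> Maybe (a, Queue a)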

Read more...

Thoughts about Spec-ulation (Rich Hickey)

Rich Hickey recently gave a keynote at Clojure/conj 2016, meditating on the problems of versioning, specification and backwards compatibility in language ecosystems. In it, Rich considers the “extremist” view: what if we built a language ecosystem where you never, ever broke backwards compatibility?

A large portion of the talk is spent grappling with the ramifications of this perspective. For example:

  1. Suppose you want to make a backwards-compatibility breaking change to a function. Don’t mutate the function, Rich says; give the function another name.
  2. OK, but how about if there is some systematic change you need to apply to many functions? That’s still not an excuse: create a new namespace, and put all the functions there.
  3. What if there’s a function you really don’t like, and you really want to get rid of it? No, don’t remove it, create a new namespace with that function absent.
  4. Does this sound like a lot of work to remove things? Yeah. So don’t remove things!

In general, Rich wants us to avoid breakage by turning all changes into accretion, where the old and new can coexist. “We need to bring functional programming [immutability] to the library ecosystem,” he says; “dependency hell is just mutability hell.” And to do this, there need to be tools that let you make a commitment to what it is that a library provides and requires, and that keep you from accidentally breaking this commitment when you release new versions of your software.
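
In Haskell terms, the accretion discipline might look something like this (a sketch of my own, with hypothetical module and function names; Rich’s own examples are in Clojure):

module Text.Parse
  ( parse        -- the original: its name, type, and behaviour never change
  , parseStrict  -- the "breaking" revision accretes alongside it, under a new name
  ) where

data AST = Leaf String

newtype ParseError = ParseError String

-- Old callers keep compiling and keep their old behaviour.
parse :: String -> Maybe AST
parse "" = Nothing
parse s  = Just (Leaf s)

-- New behaviour is added rather than substituted; a systematic change to many
-- functions would go into a fresh module instead (say, Text.Parse.V2).
parseStrict :: String -> Either ParseError AST
parseStrict "" = Left (ParseError "empty input")
parseStrict s  = Right (Leaf s)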

Read more...

Try Backpack: ghc --backpack

Backpack, a new system for mix-in packages in Haskell, has been released with GHC 8.2. Although Backpack is closely integrated with the Cabal package system, it’s still possible to play around with toy examples using a new command ghc --backpack. Before you get started, make sure you have a recent enough version of GHC:

ezyang@sabre:~$ ghc-8.2 --version
The Glorious Glasgow Haskell Compilation System, version 8.2.1
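
To give a flavour of what a toy example looks like, here is a minimal mixin sketch of my own (the file name hello.bkp and all unit and module names are made up, not the example from this post):

-- hello.bkp
unit impl where
  module Str where
    str :: String
    str = "Hello, Backpack!"

unit app where
  -- app is indefinite: it knows only a signature for Str, not an implementation.
  signature Str where
    str :: String
  module App where
    import Str
    greeting :: String
    greeting = str ++ " (filled in at instantiation time)"

unit main where
  -- Instantiate app's hole with the concrete Str from impl.
  dependency app[Str=impl:Str]
  module Main where
    import App
    main :: IO ()
    main = putStrLn greeting

In principle you then compile the whole file with something like ghc-8.2 --backpack hello.bkp.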

By the way, if you want to jump straight into Backpack for real (with Cabal packages and everything), skip this tutorial and jump to Try Backpack: Cabal packages.

Read more...

Seize the Means of Production (of APIs)

There’s a shitty API and it’s ruining your day. What do you do?

Without making a moral judgment, I want to remark that there is something very different about these two approaches. In Dropbox’s case, Dropbox has no (direct) influence on what APIs Apple provides for its operating system. So it has no choice but to work within the boundaries of the existing API. (When Apple says jump, you say, “How high?”) But in Adam’s case, POSIX is implemented by an open source project, Linux, and with some good ideas, Adam could implement his new interface on top of Linux (avoiding the necessity of writing an operating system from scratch).

Read more...

The Edit-Recompile Manager

A common claim I keep seeing repeated is that there are too many language-specific package managers, and that we should use a distribution’s package manager instead. As an example, I opened the most recent HN discussion related to package managers, and sure enough the third comment was on this (very) dead horse. (But wait! There’s more.) But it rarely feels like there is any forward progress on these threads. Why?

Here is my hypothesis: these two camps of people are talking past each other, because the term “package manager” has been overloaded to mean two things:

Read more...

Backpack and separate compilation

When building a module system which supports parametrizing code over multiple implementations (i.e., functors), you run into an important implementation question: how do you compile said parametric code? Existing language implementations fall into three major schools of thought:

  1. The separate compilation school says that you should compile your functors independently of their implementations. This school values compilation time over performance: once a functor is built, you can freely swap out implementations of its parameters without needing to rebuild the functor, leading to fast compile times. Pre-Flambda OCaml works this way. The downside is that it’s not possible to optimize the functor body based on implementation knowledge (unless, perhaps, you have a just-in-time compiler handy).
  2. The specialize at use school says, well, you can get performance by inlining functors at their use sites, where the implementations are known. If the functor body is not too large, you can transparently get good performance benefits without needing to change the architecture of your compiler in major ways. Post-Flambda OCaml and C++ templates in the Borland model both work this way. The downside is that the code must be re-optimized at each use site, and there may end up being substantial duplication of code (though this can be reduced at link time).
  3. The repository of specializations school says that it’s dumb to keep recompiling the instantiations: instead, the compiled code for each instantiation should be cached somewhere globally; the next time the same instance is needed, it should be reused (see the sketch after this list). C++ templates in the Cfront model and Backpack work this way.
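
To ground the terminology, here is a hypothetical Backpack-style sketch (unit and module names are mine, not from the post): one parametric unit and two implementations it can be instantiated with.

unit counter where
  signature Elem where
    data Elem
    zero :: Elem
    next :: Elem -> Elem
  module Counter where
    import Elem
    two :: Elem
    two = next (next zero)

unit int-elem where
  module Elem where
    type Elem = Int
    zero :: Elem
    zero = 0
    next :: Elem -> Elem
    next = (+ 1)

unit char-elem where
  module Elem where
    type Elem = Char
    zero :: Elem
    zero = 'a'
    next :: Elem -> Elem
    next = succ

-- Two distinct instantiations of the same unit:
--   counter[Elem=int-elem:Elem]   and   counter[Elem=char-elem:Elem]
-- School 1 compiles Counter once, abstractly; school 2 re-optimizes it at each
-- use site; school 3 compiles each distinct instantiation once and caches it.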

The repository perspective sounds nice, until you realize that it requires major architectural changes to the way your compiler works: most compilers don’t try to write intermediate results into some shared cache, and adding support for this can be quite complex and error-prone.

Read more...