ezyang's blog

the arc of software bends towards understanding

Posts

Proposal: Suggest explicit type application for Foldable length and friends

tl;dr If you use a Foldable function like length or null, where instance selection is solely determined by the input argument, you should make your code more robust by introducing an explicit type application specifying which instance you want. This isn’t necessary for a function like fold, where the return type can cross-check if you’ve gotten it right or not. If you don’t provide this type application, GHC should give a warning suggesting you annotate it explicitly, in much the same way it suggests adding explicit type signatures to top-level functions.

Read more...

Prio: Private, Robust, and Scalable Computation of Aggregate Statistics

I want to take the opportunity to advertise some new work from a colleague of mine, Henry Corrigan-Gibbs (in collaboration with the venerable Dan Boneh) on the subject of preserving privacy when collecting aggregate statistics. Their new system is called Prio and will be appearing at this year’s NSDI.

The basic problem they tackle is this: suppose you’re Google and you want to collect some statistics on your users to compute some aggregate metrics, e.g., averages or a linear regression fit:

Read more...

Designing the Backpack signature ecosystem

Suppose you are a library writer interested in using Backpack. Backpack says that you can replace a direct dependency on a function, type or package with one or more signatures. You typecheck against a signature and your end user picks how they want to eventually implement the signature.

Sounds good right? But there’s a dirty little secret: to get all of this goodness, you have to write a signature–you know, a type signature for each function and type that you want to use in your library. And we all know how much Haskellers hate writing signatures. But Backpack has a solution to this: rather than repeatedly rewrite signatures for all your packages, a conscientious user can put a signature in a package for reuse in other packages.

Read more...

How to integrate GHC API programs with Cabal

GHC is not just a compiler: it is also a library, which provides a variety of functionality that anyone interested in doing any sort of analysis on Haskell source code. Haddock, hint and ghc-mod are all packages which use the GHC API.

One of the challenges for any program that wants to use the GHC API is integration with Cabal (and, transitively, cabal-install and Stack). The most obvious problem that, when building against packages installed by Cabal, GHC needs to be passed appropriate flags telling it which package databases and actual packages should be used. At this point, people tend to adopt some hacky strategy to get these flags, and hope for the best. For commonly used packages, this strategy will get the job done, but for the rare package that needs something extra–preprocessing, extra GHC flags, building C sources–it is unlikely that it will be handled correctly.

Read more...

Try Backpack: Cabal packages

This post is part two of a series about how you can try out Backpack, a new mixin package system for Haskell. In the previous post, we described how to use a new ghc --backpack mode in GHC to quickly try out Backpack’s new signature features. Unfortunately, there is no way to distribute the input files to this mode as packages on Hackage. So in this post, we walk through how to assemble equivalent Cabal packages which have the same functionality.

Read more...

A tale of backwards compatibility in ASTs

Those that espouse the value of backwards compatibility often claim that backwards compatibility is simply a matter of never removing things. But anyone who has published APIs that involve data structures know that the story is not so simple. I’d like to describe my thought process on a recent BC problem I’m grappling with on the Cabal file format. As usual, I’m always interested in any insights and comments you might have.

Read more...

Backpack and the PVP

In the PVP, you increment the minor version number if you add functions to a module, and the major version number if you remove function to a module. Intuitively, this is because adding functions is a backwards compatible change, while removing functions is a breaking change; to put it more formally, if the new interface is a subtype of the older interface, then only a minor version number bump is necessary.

Read more...

Left-recursive parsing of Haskell imports and declarations

Suppose that you want to parse a list separated by newlines, but you want to automatically ignore extra newlines (just in the same way that import declarations in a Haskell file can be separated by one or more newlines.) Historically, GHC has used a curious grammar to perform this parse (here, semicolons represent newlines):

decls : decls ';' decl
      | decls ';'
      | decl
      | {- empty -}

It takes a bit of squinting, but what this grammar does is accept a list of decls, interspersed with one or more semicolons, with zero or more leading/trailing semicolons. For example, ;decl;;decl; parses as:

Read more...

The problem of reusable and composable specifications

It’s not too hard to convince people that version bounds are poor approximation for a particular API that we depend on. What do we mean when we say >= 1.0 && < 1.1? A version bound is a proxy some set of modules and functions with some particular semantics that a library needs to be built. Version bounds are imprecise; what does a change from 1.0 to 1.1 mean? Clearly, we should instead write down the actual specification (either types or contracts) of what we need.

Read more...

Thoughts about Spec-ulation (Rich Hickey)

Rich Hickey recently gave a keynote at Clojure/conj 2016, meditating on the problems of versioning, specification and backwards compatibility in language ecosystems. In it, Rich considers the “extremist” view, what if we built a language ecosystem, where you never, ever broke backwards compatibility.

A large portion of the talk is spent grappling with the ramifications of this perspective. For example:

  1. Suppose you want to make a backwards-compatibility breaking change to a function. Don’t mutate the function, Richard says, give the function another name.
  2. OK, but how about if there is some systematic change you need to apply to many functions? That’s still not an excuse: create a new namespace, and put all the functions there.
  3. What if there’s a function you really don’t like, and you really want to get rid of it? No, don’t remove it, create a new namespace with that function absent.
  4. Does this sound like a lot of work to remove things? Yeah. So don’t remove things!

In general, Rich wants us to avoid breakage by turning all changes into accretion, where the old and new can coexist. “We need to bring functional programming [immutability] to the library ecosystem,” he says, “dependency hell is just mutability hell.” And to do this, there need to be tools for you to make a commitment to what it is that a library provides and requires, and not accidentally breaking this commitment when you release new versions of your software.

Read more...