Inside 736-131

Existential Pontification and Generalized Abstract Digressions

Two bugs in the borrow checker every Rust developer should know about

Apologies in advance: this post assumes familiarity with Rust.

Anyone who has done some coding in Rust may be familiar with the dreaded borrow checker, famous for obstructing the compilation of otherwise “perfectly reasonable code.” In many cases, the borrow checker is right: you’re writing your code wrong, and there is another, clearer way to write your code that will appease the borow checker. But sometimes, even after you’ve skimmed the tutorial, memorized the mantra “a &mut pointer is the only way to mutate the thing that it points at” and re-read the borrowed pointers tutorial, the borrow-checker might still stubbornly refuse to accept your code.

If that’s the case, you may have run into one of the two (in)famous bugs in the borrow-checker. In this post, I want to describe these two bugs, give situations where they show up and describe some workarounds. This is the kind of post which I hope becomes obsolete quickly, but the fixes for them are pretty nontrivial, and you are inevitably going to run into these bugs if you try to program in Rust today.

Mutable borrows are too eager (#6268)

Summary. When you use &mut (either explicitly or implicitly), Rust immediately treats the lvalue as borrowed and imposes its restrictions (e.g. the lvalue can’t be borrowed again). However, in many cases, the borrowed pointer is not used until later, so imposing the restrictions immediately results in spurious errors. This situation is most likely to occur when there is an implicit use of &mut. (Bug #6268)

Symptoms. You are getting the error “cannot borrow `foo` as immutable because it is also borrowed as mutable”, but the reported second borrow is an object dispatching a method call, or doesn’t seem like it should have been borrowed at the time the flagged borrow occured.

Examples. The original bug report describes the situation for nested method calls, where the outer method call has &mut self in its signature:

fn main() {
  let mut map = std::hashmap::HashMap::new();
  map.insert(1, 2);
  map.insert(2, *map.get(&1)); // XXX
} 4:20 error: cannot borrow `map` as immutable because it is also borrowed as mutable   map.insert(2, *map.get(&1)); // XXX
                           ^~~ 4:5 note: second borrow of `map` occurs here   map.insert(2, *map.get(&1)); // XXX

This code would like to retrieve the value at key 1 and store it in key 2. Why does it fail? Consider the signature fn insert(&mut self, key: K, value: V) -> bool: the insert method invocation immediately takes out a &mut borrow on map before attempting to evaluate its argument. If we desugar the method invocation, the order becomes clear: HashMap::insert(&mut map, 2, *map.get(&1)) (NB: this syntax is not implemented yet). Because Rust evaluates arguments left to right, this is equivalent to:

let x_self : &mut HashMap<int> = &mut map;
let x_arg1 : int = 2;
let x_arg2 : int = *map.get(&1); // XXX
HashMap::insert(x_self, x_arg1, x_arg2);

meaning there is an active borrow by the time we call map.get. A minor rewrite resolves the problem:

fn main() {
  let mut map = std::hashmap::HashMap::new();
  map.insert(1, 2);
  let x = *map.get(&1);
  map.insert(2, x);

Sensitivity to order of arguments even when no method invocation is involved. Here is another example in which there is no method invocation:

fn g(x: &mut int) -> int { *x }
fn f(x: &mut int, y: int) { *x += y; }
fn main() {
    let mut a = 1;
    f(&mut a, g(&mut a));

Discussion. Fortunately, this bug is pretty easy to work around, if a little annoying: move all of your sub-expressions to let-bindings before the ill-fated mutable borrow (see examples for a worked example). Note: the borrows that occur in these sub-expressions really do have to be temporary; otherwise, you have a legitimate “cannot borrow mutable twice” error on your hands.

Borrow scopes should not always be lexical (#6393)

Summary. When you borrow a pointer, Rust assigns it a lexical scope that constitutes its lifetime. This scope can be as small as a single statement, or as big as an entire function body. However, Rust is unable to calculate lifetimes that are not lexical, e.g. a borrowed pointer is only live until halfway through a function. As a result, borrows may live longer than users might expect, causing the borrow checker to reject some statements. (Bug #6393)

Symptoms. You are getting a “cannot borrow foo as immutable/mutable because it is also borrowed as immutable/mutable”, but you think the previous borrow should have already expired.

Examples. This problem shows up in a variety of situations. The very simplest example which tickles this bug can be seen here:

fn main() {
    let mut x = ~1;
    let y = &mut *x;
    *y = 1;
    let z = &mut *x;
    *z = 1;
} 5:19 error: cannot borrow `*x` as mutable more than once at a time     let z = &mut *x;
                      ^~~~~~~ 3:19 note: second borrow of `*x` as mutable occurs here     let y = &mut *x;

Clearly y is dead after *y = 1, but the borrow checker can’t see that. Fortunately, in this case it is very easy to add a new lexical scope to solve the problem:

fn main() {
    let mut x = ~1;
        let y = &mut *x;
        *y = 1;
    let z = &mut *x;
    *z = 1;

So, when does this actually become a problem? The usual culprit is match statements. Here is some common code involving maps that you might want to write:

extern mod extra;
fn main() {
    let mut table = extra::treemap::TreeMap::new();
    let key = ~"test1";
    match table.find_mut(&key) {
        None    => table.insert(key.clone(), ~[1]), // XXX
        Some(v) => { v.push(1); false }
} 6:24 error: cannot borrow `table` as mutable more than once at a time         None    => table.insert(key.clone(), ~[1]), // XXX
                             ^~~~~ 5:15 note: second borrow of `table` as mutable occurs here     match table.find_mut(&key) {

table is a map of integer keys to vectors. The code performs an insert at key: if the map has no entry, then we create a new singleton vector and insert it in that location; otherwise, it just pushes the value 1 onto the existing vector. Why is table borrowed in the None branch? Intuitively, the borrow for table.find_mut should be dead, since we no longer are using any of the results; however, to Rust, the only lexical scope it can assign the borrowed pointer encompasses the entire match statement, since the borrowed pointer continues to be used in the Some branch (note that if the Some branch is removed, this borrow checks). Unfortunately, it’s not possible to insert a new lexical scope, as was possible in the previous example. (At press time, I wasn’t able to find a small example that only used if.)

Sometimes, the lifetime associated with a variable can force it to be assigned to a lexical scope that is larger than you would expect. Issue #9113 offers a good example of this (code excerpted below):

pub fn read1<'a>(&'a mut self, key: int) -> Option<&'a Data> {
    match self.cache.find(&key) {
        Some(data) => return Some(data),
        None => ()
    match self.db.find(&key) {
        Some(data) => {
            let result: &Data = self.cache.find_or_insert(key, data.clone());
        None => None
} 22:46 error: cannot borrow `(*self).cache` as mutable because it is also borrowed as immutable                 let result: &Data = self.cache.find_or_insert(key, data.clone());
                                               ^~~~~~~~~~ 15:24 note: second borrow of `(*self).cache` occurs here         match self.cache.find(&key) {

This code is attempting to perform a database lookup; it first consults the cache and returns a cached entry if available. Otherwise, it looks for the value in the database, caching the value in the process. Ordinarily, you would expect the borrow of self.cache in the first match to extend only for the first expression. However, the return statement throws a spanner in the works: it forces the lifetime of data to be 'a, which encompasses the entire function body. The borrow checker then concludes that there is a borrow everywhere in the function, even though the function immediately returns if it takes out this borrow.

Discussion. The workaround depends on the nature of the scope that is causing trouble. When match is involved, you can usually arrange for the misbehaving borrow to be performed outside of the match statement, in a new, non-overlapping lexical scope. This is easy when the relevant branch does not rely on any variables from the pattern-match by using short-circuiting control operators:

extern mod extra;
use extra::treemap::TreeMap;
fn main() {
    let mut table: TreeMap<~str,~[int]> = TreeMap::new();
    let key = ~"test1";
    match table.find_mut(&key) {
        None    => {},
        Some(v) => { v.push(1); return }
    table.insert(key.clone(), ~[1]); // None-case

Alternately, instead of directly returning, the match can assign a boolean to indicate whether or not the None-case should be run:

extern mod extra;
use extra::treemap::TreeMap;
fn main() {
    let mut table: TreeMap<~str,~[int]> = TreeMap::new();
    let key = ~"test1";
    let is_none = match table.find_mut(&key) {
        None    => true,
        Some(v) => { v.push(1); false }
    if is_none {
        table.insert(key.clone(), ~[1]);

The boolean can be elaborated into an enum that holds any non-references from the pattern-match you might need. Note that this will not work for borrowed references; but in that case, the borrow truly was still live!

It is a bit more difficult to workaround problems regarding lifetimes, since there is nowhere in the function the pointer is not “borrowed”. One trick which can work in some situations is to convert the function to continuation passing style: that is, instead of returning the borrowed pointer, accept a function argument which gets invoked with the function. pnkfelix describes how you might go about fixing the third example. This removes the lifetime constraint on the variable and resolves the problem.

The lexical scope assigned to a borrow can be quite sensitive to code pertubation, since removing a use of a borrow can result in Rust assigning a (much) smaller lexical scope to the borrow, which can eliminate the error. Sometimes, you can avoid the problem altogether by just avoiding a borrow.


To sum up:

  1. Bug #6268 can cause borrows to start too early (e.g. in method invocations), work around it by performing temporary borrows before you do the actual borrow.
  2. Bug #6393 can cause borrows to end too late (e.g. in match statements), work around it by deferring operations that need to re-borrow until the original lexical scope ends.

Keep these in mind, and you should be able to beat the borrow checker into submission. That is, until Niko fixes these bugs.

  • December 17, 2013

Visualizing a block allocator

GHC’s block allocator is a pretty nifty piece of low-level infrastructure. It offers a much more flexible way of managing a heap, rather than trying to jam it all in one contiguous block of memory, and is probably something that should be of general interest to anyone who is implementing low-level code like a runtime. The core idea behind it is quite old (BIBOP: Big Bag of Pages), and is useful for any situation where you have a number of objects that are tagged with the same descriptor, and you don’t want to pay the cost of the tag on each object.

Managing objects larger than pages is a bit tricky, however, and so I wrote a document visualizing the situation to help explain it to myself. I figured it might be of general interest, so you can get it here:

Some day I’ll convert it into wikiable form, but I don’t feel like Gimp'ing the images today...

  • October 30, 2013

Xmonad and media keys on Saucy

Ubuntu continues on its rampage of breaking perfectly good software, and on my most recent upgrade to Saucy Salamander, I discovered to my dismay that my media keys (e.g. volume keys, fn (function) keys, suspend button, etc) had stopped working. Of course, it worked fine if I logged into my user using Unity, but who wants to use a silly window manager like that...

The root problem, according to these Arch Linux forum posts is that Gnome has moved media-key support out of gnome-settings-daemon (which any self-respecting Xmonad user is sure to spawn) and into their window manager proper. Which, of course, is no good because I don’t want to use their window manager!

For now, it seems the simplest method of bringing back this functionality is to run a 3.6 version of gnome-settings-daemon. Fortunately, at least for Saucy, there are a few builds of 3.6 available before they upgraded to 3.8. So, all you need to do is grab these two deb files appropriate for your architecture (you need gnome-control-center too, because it has a dependency on gnome-settings-daemon):

Once you've downloaded the appropriate deb files, a dpkg -i $DEBFILE and then apt-mark hold gnome-control-center gnome-settings-daemon should do the trick. You should run an aptitude upgrade to make sure you haven't broken any other dependencies (for example, gnome-shell). (Power-users can add the debs to a local repo and then downgrade explicitly from apt-get.)

Moving forward, we will probably be forced to reimplement media key bindings in some other software package, and it would be nice if this could be standardized in some way. Linux Mint has already forked gnome-settings-daemon, with their cinnamon-settings-daemon, but I've not tried it, and have no idea how well it works.

Update. Trusty has an updated version of this package which restores support, so I am providing backports via my PPA.

  • October 27, 2013

If you’re using lift, you’re doing it wrong (probably)

David Darais asked me to make this public service announcement: If you're using lift, you're doing it wrong. This request was prompted by several talks at ICFP about alternatives to monad transformers in Haskell, which all began their talk with the motivation, "Everyone hates lifting their operations up the monad stack; therefore, we need another way of organizing effects." This StackOverflow question describes the standard technique that mtl uses to remove the use of lift in most monadic code.

Now, as most things go, the situation is a bit more nuanced than just "never use lift", and a technically incorrect quip at the beginning of a talk does not negate the motivation behind other effect systems. Here are some of the nuances:

  • As everyone is well aware, when a monad transformer shows up multiple times in the monad stack, the automatic type class resolution mechanism doesn't work, and you need to explicitly say which monad transformer you want to interact with.
  • This mechanism only works if the monadic operations you are interacting with are suitably generalized to begin with, e.g. MonadReader a m => m a rather than Monad m => ReaderT m a or Reader a. This is especially evident for the IO monad, where most people have not generalized their definitions to MonadIO. Fortunately, it is generally the case that only one liftIO is necessary.

And of course, there are still many reasons why you would want to ditch monad transformers:

  • Type-class instances are inherently unordered, and thus a generalized MonadCont m, MonadState m => m a monadic value says nothing about what order the two relevant monads are composed. But the order of this composition has an important semantic effect on how the monad proceeds (does the state transfer or reset over continuation jumps). Thus, monad transformers can have subtle interactions with one another, when sometimes you want non-interfering effects that are truly commutative with one another. And indeed, when you are using the type class approach, you usually use only monads that commute with one another.
  • The interference between different monad transformers makes it difficult to lift certain functions. For example, the type of mask :: ((forall a. IO a -> IO a) -> IO b) -> IO b. If we think operationally what has to happen when IO is composed with State, the lifter has to some how arrange for the state to transfer all the way into the code that runs with exceptions restored. That's very tricky to do in a general way. It gets even worse when these callbacks are invoked multiple times.
  • At the end of the day, while the use of type classes makes the monad stack somewhat abstract and allows the elision of lifts, most of this code is written with some specific monad stack in mind. Thus, it is very rare for nontrivial programs to make use of multiple effects in a modular way, or for effects to be instantiated (i.e. a concrete monad selected) without concretizing the rest of the monad stack.

Monad transformers have problems, let's argue against them for the right reasons!

  • September 26, 2013

Of Monadic Fixpoints and Heap Offsets

Here at ICFP, sometimes the so-called “hallway track” is sometimes just as important as the ordinary track. Johan Tibell was wanting to avoid an out-of-line call to allocate function in GHC when a small array of statically known size was allocated. But he found the way that GHC's new code generator handles heap allocation a bit confusing, and so we skipped out of one session today to work it out. In this post, I would like to explain how the code generation monad figures out what the heap offsets in the code are, by way of a kind of cute (and also slightly annoying) trick involving a “monadic” fixpoint.

First, some background about the code generator. The big overall pattern of a function that GHC has to generate code for is something like:

  1. Check if there is enough heap space, if not GC,
  2. Write a bunch of data to the heap,
  3. Push some things to the stack,
  4. Jump to the appropriate continuation.

Concretely, the code will be along the lines of:

    _s28e::P64 = R2;
    // Check if there is enough heap space
    Hp = Hp + 40;
    if (Hp > HpLim) goto c2ET; else goto c2ES;
    // If not enough space, GC
    HpAlloc = 40;
    R2 = _s28e::P64;
    R1 = withEmpty_riC_static_closure;
    call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
    // Write a bunch of data to the heap
    I64[Hp - 32] = sat_s28f_info;
    _c2EG::P64 = Hp - 32;
    I64[Hp - 16] = :_con_info;
    P64[Hp - 8] = _c2EG::P64;
    P64[Hp] = _s28e::P64;
    _c2EO::P64 = Hp - 14;
    R1 = _c2EO::P64;
    // No stack updates this time
    // Jump to the continuation
    call (P64[Sp])(R1) args: 8, res: 0, upd: 8;

This seems reasonable, but how does one go about actually generating this code? The code is generated in order, but the amount of heap that needs to be checked is not known until we've finished laying out the rest of the code. If we put on our mutation hats, we might say, “Well, leave it out for now, and then mutate it in when you know the actual value”, but there is still the knotty question of what the offsets should be when we are writing values to the heap. Notice that in the above code, we only bump the heap pointer once; if we repeatedly bump the heap pointer, then the offsets are easy to calculate, but we are wasting instructions; x86 addressing modes support writing to a register plus some offset directly.

Let’s take a look what GHC does when it allocates a dynamic closure to the heap (simplified):

allocDynClosureCmm info_tbl args_offsets
  = do  { virt_hp <- getVirtHp
        ; let rep = cit_rep info_tbl -- cit = c info table
              info_offset = virt_hp + 1 -- virtual heap offset of first word of new object
              info_ptr = CmmLit (CmmLabel (cit_lbl info_tbl))
        ; base <- getHpRelOffset (virt_hp + 1)
        ; emitSetDynHdr base info_ptr
        ; let (args, offsets) = unzip args_offsets
        ; hpStore base args offsets
        ; setVirtHp (virt_hp + heapClosureSize rep)
        ; getHpRelOffset info_offset

In words, it:

  1. Retrieves a “virtual heap pointer” (more on this later),
  2. Gets the true Hp - n expression (base) using the virtual heap pointer (getHpRelOffset, N.B. the off-by-one),
  3. Emits a bunch of writes to the memory at base (emitSetDynHdr and hpStore),
  4. Bumps the virtual Hp up with the size of the just allocated closure,
  5. Returns the Hp - n expression.

As it turns out, the virtual heap pointer is just an ordinary state variable in the code generation monad FCode (it’s good to take a look at the implementation of the monad you’re using!):

newtype FCode a = FCode (CgInfoDownwards -> CgState -> (# a, CgState #))

data CgState
  = MkCgState { ...
     cgs_hp_usg  :: HeapUsage,
     ... }

data HeapUsage =
  HeapUsage {
        virtHp :: VirtualHpOffset, -- Virtual offset of highest-allocated word
                                   --   Incremented whenever we allocate
        realHp :: VirtualHpOffset  -- realHp: Virtual offset of real heap ptr
                                   --   Used in instruction addressing modes

So virtHp just marches upwards as we allocate things; it is, in effect, the contents of the Hp register in our inefficient, rebumping implementation.

Which leaves us with the pressing question, what is realHp? Well, it starts off as zero (since the offset of the real heap pointer is just zero), but once we bump the heap pointer to do the stack check, it is now precisely the amount of heap we did the heap check for. Calling back our example:

    _s28e::P64 = R2;
    // Check if there is enough heap space
    // virtHp = 0; realHp = 0
    Hp = Hp + 40;
    // virtHp = 0; realHp = 40
    if (Hp > HpLim) goto c2ET; else goto c2ES;
    // If not enough space, GC
    HpAlloc = 40;
    R2 = _s28e::P64;
    R1 = withEmpty_riC_static_closure;
    call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
    // Write a bunch of data to the heap
    // First closure
    // virtHp = 0; realHp = 40
    I64[Hp - 32] = sat_s28f_info;
    _c2EG::P64 = Hp - 32;
    // virtHp = 8; realHp = 40
    I64[Hp - 16] = :_con_info;
    P64[Hp - 8] = _c2EG::P64;
    P64[Hp] = _s28e::P64;
    _c2EO::P64 = Hp - 14;
    // virtHp = 32; realHp = 40
    R1 = _c2EO::P64;
    // No stack updates this time
    // Jump to the continuation
    call (P64[Sp])(R1) args: 8, res: 0, upd: 8;

(Actually, internally the offsets are recorded as words, so, this being 64-bit code, divide everything by eight. BTW, virtHp + 8 == realHp, and that's where the off-by-one comes from.) The math is a little fiddly, but getHpRelOffset will calculate the offsets for you; you just have to make sure the virtual offset is right!

OK, but we still haven’t figured out how we get this magic number 40 from in the first place! The key is to look at the code generator responsible for doing the heap check, heapCheck, which is wraps the call to code, which is actually responsible for the code generation:

heapCheck :: Bool -> Bool -> CmmAGraph -> FCode a -> FCode a
heapCheck checkStack checkYield do_gc code
  = getHeapUsage $ \ hpHw ->

Hey, what's that magic getHeapUsage function?

-- 'getHeapUsage' applies a function to the amount of heap that it uses.
-- It initialises the heap usage to zeros, and passes on an unchanged
-- heap usage.
-- It is usually a prelude to performing a GC check, so everything must
-- be in a tidy and consistent state.
-- Note the slightly subtle fixed point behaviour needed here

getHeapUsage :: (VirtualHpOffset -> FCode a) -> FCode a
getHeapUsage fcode
  = do  { info_down <- getInfoDown
        ; state <- getState
        ; let   fstate_in = state { cgs_hp_usg  = initHpUsage }
                (r, fstate_out) = doFCode (fcode hp_hw) info_down fstate_in
                hp_hw = heapHWM (cgs_hp_usg fstate_out)        -- Loop here!

        ; setState $ fstate_out { cgs_hp_usg = cgs_hp_usg state }
        ; return r }

And here, we see the monadic fixpoint. In order to provide the heap usage to fcode, GHC writes itself a check: hp_hw. The check is borrowed from the result of generating fcode, and the string attached is this: “As long as you don’t cash this check before you finish generating the code, everything will be OK!” (It’s a bit like a big bank in that respect.) Cute—and we only need to do the code generation once!

This technique is not without its dark side. hp_hw is dangerous; if you force it in the wrong place, you will chunder into an infinite loop. There are two uses of this variable, both in compiler/codeGen/StgCmmLayout.hs, which are careful not to force it. What would be nice is if one could explicitly mark hp_hw as blackholed, and attach a custom error message, to be emitted in the event of an infinite loop. How this might be accomplished is left as an exercise for the reader.

BTW, in case you aren't aware, I've been live-tumblr'ing coverage of ICFP at — the coverage is not 100%, and the editing is rough, but check it out!

  • September 24, 2013

Induction and logical relations

Logical relations are a proof technique which allow you to prove things such as normalization (all programs terminate) and program equivalence (these two programs are observationally equivalent under all program contexts). If you haven't ever encountered these before, I highly recommend Amal Ahmed's OPLSS lectures on the subject; you can find videos and notes from yours truly. (You should also be able to access her lectures from previous years.) This post is an excuse to talk about a formalization of two logical relations proofs in Agda I worked on during OPLSS and the weeks afterwards. I'm not going to walk through the code, but I do want expand on two points about logical relations:

  1. They work when simple induction would not, and
  2. The logical relation is not an inductive definition.

The full development is in the lr-agda repository on GitHub. Many thanks to Dan Licata for providing the initial development as a homework assignment for his OPLSS Agda course and for bushwhacking the substitution lemmas which are often the greatest impediment to doing proofs about the lambda calculus.

If you didn't know any better, you might try to prove normalization inductively, as follows:

To show that all programs normalize to a value, let us proceed inductively on the typing derivations. For example, in the application case, we need to show e1 e2 normalizes to some value v, given that e1 normalizes to v1 and e2 normalizes to v2. Well, the type of v1 is t1 -> t2, which means v1 = λx. e'. Uh oh: this should step to e'[v2/x], but I don’t know anything about this expression (e' could be anything). Stuck!

What is the extra oomph that logical relations gives you, that allows you to prove what was previously unprovable by usual induction? Let's think about our second proof sketch: the problem was that we didn't know anything e'. If we knew something extra about it, say, "Well, for some appropriate v, e'[v/x] will normalize," then we'd be able to make the proof go through. So if this definition of WN was our old proof goal:

WN : (τ : Tp) → [] ⊢ τ → Set
WN τ e = e ⇓

then what we'd like to do is extend this definition to include that "extra stuff":

WN : (τ : Tp) → [] ⊢ τ → Set
WN τ e = e ⇓ × WN' τ e
At this point, it would be good to make some remarks about how to read the Agda code here. WN is a type family, that is, WN τ e is the unary logical relation of type τ for expression e. The type of a type is Tp, which is a simple inductive definition; the type of a term is a more complicated [] ⊢ τ (utilizing Agda's mixfix operators; without them, you might write it Expr [] τ), which not only tells us that e is an expression, but well-typed, having the type τ under the empty context []. (This is an instance of the general pattern, where an inductive definition coincides with a well-formedness derivation, in this case the typing derivation.) e ⇓ is another mixfix operator, which is defined to be a traditional normalization (there exists some value v such that e reduces to v, e.g. Σ (λ v → value v × e ↦* v)).

But what is this extra stuff? In the case of simple types, e.g. booleans, we don't actually need anything extra, since we're never going to try to apply it like a function:

WN' : (τ : Tp) → [] ⊢ τ → Set
WN' bool e = Unit

For a function type, let us say that a function is WN (i.e. in the logical relation) if, when given a WN argument, it produces a WN result. (Because we refer to WN, this is in fact a mutually recursive definition.) This statement is in fact the key proof idea!

WN' (τ1 ⇒ τ2) e = (e1 : [] ⊢ τ1) → WN τ1 e1 → WN τ2 (app e e1)

There are a number of further details, but essentially, when you redo the proof, proving WN instead of plain old normalization, you no longer get stuck on the application case. Great! The flip-side, however, is that the proof in the lambda case is no longer trivial; you have to do a bit of work to show that the extra stuff (WN') holds. This was described to me as the "balloon" principle.

The two sides of the balloon are the "use of the inductive hypothesis" and the "proof obligation". When you have a weak inductive hypothesis, it doesn't give you very much, but you don't have to work as hard to prove it either. When you strengthen the inductive hypothesis, you can prove more things with it; however, your proof obligation is correspondingly increased. In the context of a normalization proof, the "use of the inductive hypothesis" shows up in the application case, and the "proof obligation" shows up in the lambda case. When you attempt the straightforward inductive proof, the lambda case is trivial, but the inductive hypothesis is so weak that the application case is impossible. In the logical relations proof, the application case falls out easily from the induction hypothesis, but you have to do more work in the lambda case.

We now briefly take a step back, to remark about the way we have defined the WN' type family, on our way to the discussion of why WN' is not an inductive definition. In Agda, there are often two ways of defining a type family: it can be done as a recursive function, or it can be done as an inductive definitions. A simple example of this is in the definition of a length indexed list. The standard inductive definition goes like this:

data Vec : Set → Nat → Set where
  vnil : {A : Set} → Vec A 0
  vcons : {A : Set} → A → Vec A n → Vec A (S n)

But I could also build the list out of regular old products, using a recursive function on the index:

Vec : Set → Nat → Set
Vec A 0 = Unit
Vec A (S n) = A × Vec A n

The two different encodings have their own strengths and weaknesses: using a recursive function often means that certain equalities are definitional (whereas you'd have to prove a lemma with the inductive definition), but an inductive definition lets you do case-analysis on the different possibilities.

Sometimes, it is simply not possible to do the inductive definition, and this is the case for a logical relation. This strengthening of the inductive hypothesis is to blame:

data WN' : τ → [] ⊢ τ → Set where
  WN/bool : {e : [] ⊢ bool} → WN' bool e
  WN/⇒ : {e1 : [] ⊢ τ1} → (WN τ1 e1 → WN τ2 (app e e1)) → WN' (τ1 ⇒ τ2) e

Agda doesn't complain about inductive-recursive definitions (though one should beware: they are not too well metatheoretically grounded at the moment), but it will complain about this definition. The problem is a familiar one: WN does not occur in strictly positive positions; in particular, it shows up as an argument to a function contained by the WN/⇒ constructor. So we can't use this!

As it turns out, the inability to define the logical relation inductively is not a big deal for normalization. However, it causes a bit more headaches for more complicated logical relations proofs, e.g. for equivalence of programs. When considering program equivalence, you need a binary relation relating values to values, saying when two values are equal. This can be stated very naturally inductively:

data V'⟦_⟧ : (τ : Tp) → [] ⊢ τ → [] ⊢ τ → Set where
   V/bool-#t : V'⟦ bool ⟧ #t #t
   V/bool-#f : V'⟦ bool ⟧ #f #f
   V/⇒ : {τ₁ τ₂ : Tp} {e₁ e₂ : [] ,, τ₁ ⊢ τ₂}
     → ((v₁ v₂ : [] ⊢ τ₁) → V'⟦ τ₁ ⟧ v₁ v₂ → E⟦ τ₂ ⟧ (subst1 v₁ e₁) (subst1 v₂ e₂))
     → V'⟦ τ₁ ⇒ τ₂ ⟧ (lam e₁) (lam e₂)

We define the relation by type. If a value is a boolean, then we say that #t (true) is related to itself, and #f is related to itself. If the value is a function, then we say that a lambda term is related to another lambda term if, when applied to two related values, the results are also related. This "function" is directly analogous to the extra stuff we added for the normalization proof. (If you like, you can mentally replace "related" with "equal", but this is misleading since it doesn't capture what is going on in the function case). But this fails the strict positivity check, so we have to define it recursively instead:

V⟦_⟧ : (τ : Tp) → [] ⊢ τ → [] ⊢ τ → Set
V⟦ bool ⟧    #t #t = Unit
V⟦ bool ⟧    #f #f = Unit
V⟦ bool ⟧ _ _  = Void
V⟦ τ₁ ⇒ τ₂ ⟧ (lam e) (lam e') = (v v' : [] ⊢ τ₁) → V⟦ τ₁ ⟧ v v' → E⟦ τ₂ ⟧ (subst1 v e) (subst1 v' e')
V⟦ τ₁ ⇒ τ₂ ⟧ _ _ = Void

Notice that the definition here is much less nice than the inductive definition: we need two fall through cases which assert a contradiction when two things could not possibly be equal, e.g. #t could not possibly be equal to #f. Furthermore, supposing that we are given V as a hypothesis while performing a proof, we can no longer just case-split on it to find out what kind of information we have; we have to laboriously first case split over the type and expression, at which point the function reduces. To give you a sense of how horrible this is, consider this function which converts from a inductive definition to the recursive definitions:

pV : {τ : Tp} → {e e' : [] ⊢ τ} → V⟦ τ ⟧ e e' → V'⟦ τ ⟧ e e'
pV {bool} {#t} {#t} V = V/bool-#t
pV {bool} {#t} {#f} ()
pV {bool} {#t} {if _ then _ else _} ()
pV {bool} {#t} {var _} ()
pV {bool} {#t} {app _ _} ()
pV {bool} {#f} {#t} ()
pV {bool} {#f} {#f} V = V/bool-#f
pV {bool} {#f} {if _ then _ else _} ()
pV {bool} {#f} {var _} ()
pV {bool} {#f} {app _ _} ()
pV {bool} {if _ then _ else _} ()
pV {bool} {var _} ()
pV {bool} {app _ _} ()
pV {_ ⇒ _} {if _ then _ else _} ()
pV {_ ⇒ _} {var _} ()
pV {_ ⇒ _} {lam _} {if _ then _ else _} ()
pV {_ ⇒ _} {lam _} {var _} ()
pV {_ ⇒ _} {lam _} {lam _} f = V/⇒ (\ v v' V → pE (f v v' V))
pV {_ ⇒ _} {lam _} {app _ _} ()
pV {_ ⇒ _} {app _ _} ()

Good grief! Perhaps the situation could be improved by improving how Agda handles wildcards in pattern matching, but at the moment, all of this is necessary.

"But wait, Edward!" you might say, "Didn't you just say you couldn't define it inductively?" Indeed, this function does not operate on the inductive definition I presented previously, but a slightly modified one, which banishes the non-strictly-positive occurrence by replacing it with V, the recursive definition:

V/⇒ : {τ₁ τ₂ : Tp} {e e' : [] ,, τ₁ ⊢ τ₂}
  → (V : (v v' : [] ⊢ τ₁) → V⟦ τ₁ ⟧ v v' {- the critical position! -} → E'⟦ τ₂ ⟧ (subst1 v e) (subst1 v' e'))
  → V'⟦ τ₁ ⇒ τ₂ ⟧ (lam e) (lam e')

This conversion function helps a lot, because agda-mode interacts a lot more nicely with inductive definitions (C-c C-c works!) than with recursive definitions in cases like this.

Why do logical relations in Agda? (or any proof assistant, for that matter?) Proofs using logical relations often follow the pattern of defining an appropriate logical relation for your problem, and then a lot of book-keeping to actually push the relation through a proof. Computers are great for doing book-keeping, and I think it is hugely informative to work through a logical relation proof in a proof assistant. An interesting challenge would be to extend this framework to a non-terminating language (adding step-indexes to the relation: the very pinnacle of book-keeping) or extending the lambda calculus with polymorphism (which requires some other interesting techniques for logical relations).

  • September 18, 2013

Cost semantics for STG in modern GHC

One of the problems with academic publishing is that it’s hard to keep old papers up-to-date. This is the certainly case for this 1995 Sansom paper on profiling non-strict, higher-order functional languages. While the basic ideas of the paper still hold, the actual implementation of cost centers in GHC has changed quite a bit, perhaps the most dramatic change being the introduction of cost center stacks. So while the old paper is good for giving you the basic idea of how profiling in GHC works, if you really want to know the details, the paper offers little guidance.

So what do you do when your cost semantics are out-of-date? Why, update them of course! I present an updated cost-semantics for STG in modern GHC (PDF) (GitHub). Eventually these will go in the GHC repository proper, alongside core-spec which is a similar document for Core. However, I haven't done any proofs with these semantics yet, so they are probably a little buggy.

Despite the lack of proofs, the formalization has been helpful already: I’ve already spotted one bug in the current implementation (remarked upon in the document). I’ve also identified a potential refactoring based on the way the rules are currently setup. Please let me know about any other bugs you find!

  • September 7, 2013

Blame Trees

I just presented Blame Trees at the 13th Algorithms and Data Structures Symposium. Blame trees are a functional data structure which support an efficient merge operation by incorporating information about the “blame” (think git blame) of any given part of the structure. It’s a theory paper, so the constant factors are not so good, but the asymptotics are much better than traditional merge algorithms used by modern VCSes.

This was joint work with David A. Wilson, Pavel Panchekha and Erik D. Demaine. You can view the paper or check out the slides. I also have a slightly older version of the talk recorded on YouTube (20 minutes) which I had used to help get feedback from my out-of-town collaborators before actually giving the talk. Thanks also to David Mazières for giving useful comments on the presentation in person.

  • August 12, 2013

OPLSS lecture notes

I write in from sunny Oregon, where the sun floods into your room at seven in the morning and the Oregon Programming Languages Summer School is in session. So far, it’s been a blast—with lectures covering Coq, Agda, homotopy type theory, linear logic, logical relations—and we’ve still got another week to go!

If you were not able to make it, fear not: you can go to the curriculum page and pick up not only videos (I hear they are still quite large; we’ve been trying to convince the organizers to upload them to YouTube) but lecture notes from yours truly. (Sample from the logical relations lectures.) The earlier notes are a little iffy, but I get a bit more detailed on the later ones.

One technical note: I have heard that some folks on Macs are having trouble viewing the PDFs on the default PDF viewers; the text doesn’t show up. I have also heard that Chrome’s PDF viewer as well as Acrobat Reader are able to read the PDFs. Whatever the problem is, it is likely some trouble with Xournal’s PDF export functionality. If someone has a method of fixing this without losing metadata (e.g. the ability to copy/paste the text), I’d be all ears.

  • July 29, 2013

No grammar? No problem!

One day, you’re strolling along fields of code, when suddenly you spot a syntax construct that you don’t understand.

Perhaps you’d ask your desk-mate, who’d tell you in an instant what it was.

Perhaps your programming toolchain can tell you. (Perhaps the IDE would you mouse over the construct, or you’re using Coq which let’s you Locate custom notations.)

Perhaps you’d pull up the manual (or, more likely, one of many tutorials) and scan through looking for the syntax construct in question.

But when all this fails, what is one to do? What if the code in question is written in an internal language for a compiler, whose details have changed since it was last documented, for which the documentation is out of date?

No problem. As long as you’re willing to roll up your sleeves and take a look at the source code of the compiler in question, you can frequently resolve your question for less effort than it would have taken to look up the syntax in the manual (and it’s guaranteed to be up-to-date too!) The key is that modern compilers all use parser generators, and the input to these are essentially executable specifications.

I’ll give two examples from GHC. The first is from C--, GHC’s high-level assembly language. Consider this function:

INFO_TABLE_RET(stg_maskUninterruptiblezh_ret, RET_SMALL, W_ info_ptr)
    return (P_ ret)
    StgTSO_flags(CurrentTSO) =
          | TSO_BLOCKEX)

    return (ret);

Some aspects of this definition are familiar to someone who has written C before, but there are some mysterious bits. For example, what does the return (P_ ret) mean in the preamble?

The first order of business is to find the relevant file. When the code in question has very distinctive keywords (as this one does), a grep will often do the trick:

ezyang@javelin:~/Dev/ghc-clean/rts$ grep -R INFO_TABLE_RET ../compiler/
../compiler/cmm/CmmParse.y:INFO_TABLE_RET ( label, FRAME_TYPE, info_ptr, field1, ..., fieldN )
../compiler/cmm/CmmParse.y:        'INFO_TABLE_RET'{ L _ (CmmT_INFO_TABLE_RET) }
../compiler/cmm/CmmParse.y:        | 'INFO_TABLE_RET' '(' NAME ',' INT ')'
../compiler/cmm/CmmParse.y:        | 'INFO_TABLE_RET' '(' NAME ',' INT ',' formals0 ')'
../compiler/cmm/CmmParse.y:-- is.  That is, for an INFO_TABLE_RET we want the return convention,
../compiler/cmm/CmmLex.x:  | CmmT_INFO_TABLE_RET
../compiler/cmm/CmmLex.x:   ( "INFO_TABLE_RET",     CmmT_INFO_TABLE_RET ),

File extensions can also be dead giveaways; GHC uses a parser generator named Happy, and the file extension of Happy files is .y:

ezyang@javelin:~/Dev/ghc-clean/rts$ find ../compiler -name *.y

From there, we can search the file for keywords or symbols (check for the string token name if a lexer is used; also, make sure to quote alphanumeric literals). A symbol can show up in multiple places, as it does for return:

maybe_conv :: { Convention }
           : {- empty -}        { NativeNodeCall }
           | 'return'           { NativeReturn }


stmt    :: { CmmParse () }
        : ';'                                   { return () }
        | 'goto' NAME ';'
                { do l <- lookupLabel $2; emit (mkBranch l) }
        | 'return' '(' exprs0 ')' ';'
                { doReturn $3 }

Guessing from the names of the productions and the contexts, it seems more likely that maybe_conv is the relevant production. It is used here:

cmmproc :: { CmmParse () }
        : info maybe_conv maybe_formals maybe_body
                { do ((entry_ret_label, info, stk_formals, formals), agraph) <-
                       getCodeR $ loopDecls $ do {
                         (entry_ret_label, info, stk_formals) <- $1;
                         formals <- sequence (fromMaybe [] $3);
                         return (entry_ret_label, info, stk_formals, formals) }
                     let do_layout = isJust $3
                     code (emitProcWithStackFrame $2 info
                                entry_ret_label stk_formals formals agraph
                                do_layout ) }

Now, if you really need to know exactly how it is lade out, you can go and checkout how emitProcWithStackFrame is implemented. Alternately, you might hope there is a useful comment in the source file which explains what is up:

A stack frame is written like this:

INFO_TABLE_RET ( label, FRAME_TYPE, info_ptr, field1, ..., fieldN )
               return ( arg1, ..., argM )
  ... code ...

where field1 ... fieldN are the fields of the stack frame (with types)
arg1...argN are the values returned to the stack frame (with types).
The return values are assumed to be passed according to the
NativeReturn convention.

The second example is for STG, which you can ask GHC to print out using -ddump-stg. Now, there is no parser for STG, so instead you’ll have to look at the pretty-printer. Not too difficult. Take this simple function:

Gnam.$WKST =
    \r [tpl_sl4 tpl_sl6]
        case tpl_sl4 of tpl_sl8 {
          __DEFAULT ->
              case tpl_sl6 of tpl_sl9 {
                __DEFAULT -> Gnam.KST [tpl_sl8 tpl_sl9];

Some aspects are familiar. But what does the \r mean?

Once again, we have to find the relevant source file. Since STG is printed out only when we pass the -ddump-stg flag, a good start is to trace the flag through the source code:

ezyang@javelin:~/Dev/ghc-clean/compiler$ grep -R ddump-stg .
./main/DynFlags.hs:  , Flag "ddump-stg"               (setDumpFlag Opt_D_dump_stg)
ezyang@javelin:~/Dev/ghc-clean/compiler$ grep -R Opt_D_dump_stg .
./main/DynFlags.hs:   | Opt_D_dump_stg
./main/DynFlags.hs:  , Flag "ddump-stg"               (setDumpFlag Opt_D_dump_stg)
./simplStg/SimplStg.lhs:        ; dumpIfSet_dyn dflags Opt_D_dump_stg "STG syntax:"

That’s a good sign! Popping open SimpleStg.lhs gives us:

; dumpIfSet_dyn dflags Opt_D_dump_stg "STG syntax:"
                (pprStgBindings un_binds)

And the location of pprStgBindings (compiler/stgSyn/StgSyn.lhs) is in fact the ticket.

STG is pretty small, and as it turns out if you just do a quick scan of the file you’re likely to find what you need. But in case you don’t, you can still figure things out deliberately. Suppose we search for a quoted backslash:

pprStgExpr (StgLam bndrs body)
  = sep [ char '\\' <+> ppr_list (map (pprBndr LambdaBind) bndrs)
            <+> ptext (sLit "->"),
         pprStgExpr body ]
  where ppr_list = brackets . fsep . punctuate comma


-- general case
pprStgRhs (StgRhsClosure cc bi free_vars upd_flag srt args body)
  = sdocWithDynFlags $ \dflags ->
    hang (hsep [if gopt Opt_SccProfilingOn dflags then ppr cc else empty,
                pp_binder_info bi,
                ifPprDebug (brackets (interppSP free_vars)),
                char '\\' <> ppr upd_flag, pprMaybeSRT srt, brackets (interppSP args)])
         4 (ppr body)

Which is it? As it turns out:

StgLam is used *only* during CoreToStg's work. Before CoreToStg has
finished it encodes (\x -> e) as (let f = \x -> e in f)

Since -ddump-stg is post-CoreToSTG, we must be looking at StgRhsClosure, and ppr upd_flag looks like the ticket. r must be an upd_flag, whatever that is. An UpdateFlag, as it turns out:

data UpdateFlag = ReEntrant | Updatable | SingleEntry

instance Outputable UpdateFlag where
    ppr u = char $ case u of
                       ReEntrant   -> 'r'
                       Updatable   -> 'u'
                       SingleEntry -> 's'

The r indicates the function is re-entrant! (Of course, as for what that means, you’ll have to consult other documentation.)

Of course, in an ideal world, all of this would be documented. But even if it is not, there is no reason why you can’t help yourself. If your codebase is as nice as GHC’s, there will be plenty of breadcrumbs and comments to help you out. I hope this gives some insight into one possible thought process when you encounter something you don’t know, and don’t know how to learn. (Of course, sometimes it’s just best to ignore it!)

  • July 2, 2013