Inside 206-105

Existential Pontification and Generalized Abstract Digressions

Induction and logical relations

Logical relations are a proof technique which allow you to prove things such as normalization (all programs terminate) and program equivalence (these two programs are observationally equivalent under all program contexts). If you haven't ever encountered these before, I highly recommend Amal Ahmed's OPLSS lectures on the subject; you can find videos and notes from yours truly. (You should also be able to access her lectures from previous years.) This post is an excuse to talk about a formalization of two logical relations proofs in Agda I worked on during OPLSS and the weeks afterwards. I'm not going to walk through the code, but I do want expand on two points about logical relations:

  1. They work when simple induction would not, and
  2. The logical relation is not an inductive definition.

The full development is in the lr-agda repository on GitHub. Many thanks to Dan Licata for providing the initial development as a homework assignment for his OPLSS Agda course and for bushwhacking the substitution lemmas which are often the greatest impediment to doing proofs about the lambda calculus.

If you didn't know any better, you might try to prove normalization inductively, as follows:

To show that all programs normalize to a value, let us proceed inductively on the typing derivations. For example, in the application case, we need to show e1 e2 normalizes to some value v, given that e1 normalizes to v1 and e2 normalizes to v2. Well, the type of v1 is t1 -> t2, which means v1 = λx. e'. Uh oh: this should step to e'[v2/x], but I don’t know anything about this expression (e' could be anything). Stuck!

What is the extra oomph that logical relations gives you, that allows you to prove what was previously unprovable by usual induction? Let's think about our second proof sketch: the problem was that we didn't know anything e'. If we knew something extra about it, say, "Well, for some appropriate v, e'[v/x] will normalize," then we'd be able to make the proof go through. So if this definition of WN was our old proof goal:

WN : (τ : Tp) → [] ⊢ τ → Set
WN τ e = e ⇓

then what we'd like to do is extend this definition to include that "extra stuff":

WN : (τ : Tp) → [] ⊢ τ → Set
WN τ e = e ⇓ × WN' τ e
At this point, it would be good to make some remarks about how to read the Agda code here. WN is a type family, that is, WN τ e is the unary logical relation of type τ for expression e. The type of a type is Tp, which is a simple inductive definition; the type of a term is a more complicated [] ⊢ τ (utilizing Agda's mixfix operators; without them, you might write it Expr [] τ), which not only tells us that e is an expression, but well-typed, having the type τ under the empty context []. (This is an instance of the general pattern, where an inductive definition coincides with a well-formedness derivation, in this case the typing derivation.) e ⇓ is another mixfix operator, which is defined to be a traditional normalization (there exists some value v such that e reduces to v, e.g. Σ (λ v → value v × e ↦* v)).

But what is this extra stuff? In the case of simple types, e.g. booleans, we don't actually need anything extra, since we're never going to try to apply it like a function:

WN' : (τ : Tp) → [] ⊢ τ → Set
WN' bool e = Unit

For a function type, let us say that a function is WN (i.e. in the logical relation) if, when given a WN argument, it produces a WN result. (Because we refer to WN, this is in fact a mutually recursive definition.) This statement is in fact the key proof idea!

WN' (τ1 ⇒ τ2) e = (e1 : [] ⊢ τ1) → WN τ1 e1 → WN τ2 (app e e1)

There are a number of further details, but essentially, when you redo the proof, proving WN instead of plain old normalization, you no longer get stuck on the application case. Great! The flip-side, however, is that the proof in the lambda case is no longer trivial; you have to do a bit of work to show that the extra stuff (WN') holds. This was described to me as the "balloon" principle.

The two sides of the balloon are the "use of the inductive hypothesis" and the "proof obligation". When you have a weak inductive hypothesis, it doesn't give you very much, but you don't have to work as hard to prove it either. When you strengthen the inductive hypothesis, you can prove more things with it; however, your proof obligation is correspondingly increased. In the context of a normalization proof, the "use of the inductive hypothesis" shows up in the application case, and the "proof obligation" shows up in the lambda case. When you attempt the straightforward inductive proof, the lambda case is trivial, but the inductive hypothesis is so weak that the application case is impossible. In the logical relations proof, the application case falls out easily from the induction hypothesis, but you have to do more work in the lambda case.

We now briefly take a step back, to remark about the way we have defined the WN' type family, on our way to the discussion of why WN' is not an inductive definition. In Agda, there are often two ways of defining a type family: it can be done as a recursive function, or it can be done as an inductive definitions. A simple example of this is in the definition of a length indexed list. The standard inductive definition goes like this:

data Vec : Set → Nat → Set where
  vnil : {A : Set} → Vec A 0
  vcons : {A : Set} → A → Vec A n → Vec A (S n)

But I could also build the list out of regular old products, using a recursive function on the index:

Vec : Set → Nat → Set
Vec A 0 = Unit
Vec A (S n) = A × Vec A n

The two different encodings have their own strengths and weaknesses: using a recursive function often means that certain equalities are definitional (whereas you'd have to prove a lemma with the inductive definition), but an inductive definition lets you do case-analysis on the different possibilities.

Sometimes, it is simply not possible to do the inductive definition, and this is the case for a logical relation. This strengthening of the inductive hypothesis is to blame:

data WN' : τ → [] ⊢ τ → Set where
  WN/bool : {e : [] ⊢ bool} → WN' bool e
  WN/⇒ : {e1 : [] ⊢ τ1} → (WN τ1 e1 → WN τ2 (app e e1)) → WN' (τ1 ⇒ τ2) e

Agda doesn't complain about inductive-recursive definitions (though one should beware: they are not too well metatheoretically grounded at the moment), but it will complain about this definition. The problem is a familiar one: WN does not occur in strictly positive positions; in particular, it shows up as an argument to a function contained by the WN/⇒ constructor. So we can't use this!

As it turns out, the inability to define the logical relation inductively is not a big deal for normalization. However, it causes a bit more headaches for more complicated logical relations proofs, e.g. for equivalence of programs. When considering program equivalence, you need a binary relation relating values to values, saying when two values are equal. This can be stated very naturally inductively:

data V'⟦_⟧ : (τ : Tp) → [] ⊢ τ → [] ⊢ τ → Set where
   V/bool-#t : V'⟦ bool ⟧ #t #t
   V/bool-#f : V'⟦ bool ⟧ #f #f
   V/⇒ : {τ₁ τ₂ : Tp} {e₁ e₂ : [] ,, τ₁ ⊢ τ₂}
     → ((v₁ v₂ : [] ⊢ τ₁) → V'⟦ τ₁ ⟧ v₁ v₂ → E⟦ τ₂ ⟧ (subst1 v₁ e₁) (subst1 v₂ e₂))
     → V'⟦ τ₁ ⇒ τ₂ ⟧ (lam e₁) (lam e₂)

We define the relation by type. If a value is a boolean, then we say that #t (true) is related to itself, and #f is related to itself. If the value is a function, then we say that a lambda term is related to another lambda term if, when applied to two related values, the results are also related. This "function" is directly analogous to the extra stuff we added for the normalization proof. (If you like, you can mentally replace "related" with "equal", but this is misleading since it doesn't capture what is going on in the function case). But this fails the strict positivity check, so we have to define it recursively instead:

V⟦_⟧ : (τ : Tp) → [] ⊢ τ → [] ⊢ τ → Set
V⟦ bool ⟧    #t #t = Unit
V⟦ bool ⟧    #f #f = Unit
V⟦ bool ⟧ _ _  = Void
V⟦ τ₁ ⇒ τ₂ ⟧ (lam e) (lam e') = (v v' : [] ⊢ τ₁) → V⟦ τ₁ ⟧ v v' → E⟦ τ₂ ⟧ (subst1 v e) (subst1 v' e')
V⟦ τ₁ ⇒ τ₂ ⟧ _ _ = Void

Notice that the definition here is much less nice than the inductive definition: we need two fall through cases which assert a contradiction when two things could not possibly be equal, e.g. #t could not possibly be equal to #f. Furthermore, supposing that we are given V as a hypothesis while performing a proof, we can no longer just case-split on it to find out what kind of information we have; we have to laboriously first case split over the type and expression, at which point the function reduces. To give you a sense of how horrible this is, consider this function which converts from a inductive definition to the recursive definitions:

pV : {τ : Tp} → {e e' : [] ⊢ τ} → V⟦ τ ⟧ e e' → V'⟦ τ ⟧ e e'
pV {bool} {#t} {#t} V = V/bool-#t
pV {bool} {#t} {#f} ()
pV {bool} {#t} {if _ then _ else _} ()
pV {bool} {#t} {var _} ()
pV {bool} {#t} {app _ _} ()
pV {bool} {#f} {#t} ()
pV {bool} {#f} {#f} V = V/bool-#f
pV {bool} {#f} {if _ then _ else _} ()
pV {bool} {#f} {var _} ()
pV {bool} {#f} {app _ _} ()
pV {bool} {if _ then _ else _} ()
pV {bool} {var _} ()
pV {bool} {app _ _} ()
pV {_ ⇒ _} {if _ then _ else _} ()
pV {_ ⇒ _} {var _} ()
pV {_ ⇒ _} {lam _} {if _ then _ else _} ()
pV {_ ⇒ _} {lam _} {var _} ()
pV {_ ⇒ _} {lam _} {lam _} f = V/⇒ (\ v v' V → pE (f v v' V))
pV {_ ⇒ _} {lam _} {app _ _} ()
pV {_ ⇒ _} {app _ _} ()

Good grief! Perhaps the situation could be improved by improving how Agda handles wildcards in pattern matching, but at the moment, all of this is necessary.

"But wait, Edward!" you might say, "Didn't you just say you couldn't define it inductively?" Indeed, this function does not operate on the inductive definition I presented previously, but a slightly modified one, which banishes the non-strictly-positive occurrence by replacing it with V, the recursive definition:

V/⇒ : {τ₁ τ₂ : Tp} {e e' : [] ,, τ₁ ⊢ τ₂}
  → (V : (v v' : [] ⊢ τ₁) → V⟦ τ₁ ⟧ v v' {- the critical position! -} → E'⟦ τ₂ ⟧ (subst1 v e) (subst1 v' e'))
  → V'⟦ τ₁ ⇒ τ₂ ⟧ (lam e) (lam e')

This conversion function helps a lot, because agda-mode interacts a lot more nicely with inductive definitions (C-c C-c works!) than with recursive definitions in cases like this.

Why do logical relations in Agda? (or any proof assistant, for that matter?) Proofs using logical relations often follow the pattern of defining an appropriate logical relation for your problem, and then a lot of book-keeping to actually push the relation through a proof. Computers are great for doing book-keeping, and I think it is hugely informative to work through a logical relation proof in a proof assistant. An interesting challenge would be to extend this framework to a non-terminating language (adding step-indexes to the relation: the very pinnacle of book-keeping) or extending the lambda calculus with polymorphism (which requires some other interesting techniques for logical relations).

  • September 18, 2013

Cost semantics for STG in modern GHC

One of the problems with academic publishing is that it’s hard to keep old papers up-to-date. This is the certainly case for this 1995 Sansom paper on profiling non-strict, higher-order functional languages. While the basic ideas of the paper still hold, the actual implementation of cost centers in GHC has changed quite a bit, perhaps the most dramatic change being the introduction of cost center stacks. So while the old paper is good for giving you the basic idea of how profiling in GHC works, if you really want to know the details, the paper offers little guidance.

So what do you do when your cost semantics are out-of-date? Why, update them of course! I present an updated cost-semantics for STG in modern GHC (PDF) (GitHub). Eventually these will go in the GHC repository proper, alongside core-spec which is a similar document for Core. However, I haven't done any proofs with these semantics yet, so they are probably a little buggy.

Despite the lack of proofs, the formalization has been helpful already: I’ve already spotted one bug in the current implementation (remarked upon in the document). I’ve also identified a potential refactoring based on the way the rules are currently setup. Please let me know about any other bugs you find!

  • September 7, 2013

Blame Trees

I just presented Blame Trees at the 13th Algorithms and Data Structures Symposium. Blame trees are a functional data structure which support an efficient merge operation by incorporating information about the “blame” (think git blame) of any given part of the structure. It’s a theory paper, so the constant factors are not so good, but the asymptotics are much better than traditional merge algorithms used by modern VCSes.

This was joint work with David A. Wilson, Pavel Panchekha and Erik D. Demaine. You can view the paper or check out the slides. I also have a slightly older version of the talk recorded on YouTube (20 minutes) which I had used to help get feedback from my out-of-town collaborators before actually giving the talk. Thanks also to David Mazières for giving useful comments on the presentation in person.

  • August 12, 2013

OPLSS lecture notes

I write in from sunny Oregon, where the sun floods into your room at seven in the morning and the Oregon Programming Languages Summer School is in session. So far, it’s been a blast—with lectures covering Coq, Agda, homotopy type theory, linear logic, logical relations—and we’ve still got another week to go!

If you were not able to make it, fear not: you can go to the curriculum page and pick up not only videos (I hear they are still quite large; we’ve been trying to convince the organizers to upload them to YouTube) but lecture notes from yours truly. (Sample from the logical relations lectures.) The earlier notes are a little iffy, but I get a bit more detailed on the later ones.

One technical note: I have heard that some folks on Macs are having trouble viewing the PDFs on the default PDF viewers; the text doesn’t show up. I have also heard that Chrome’s PDF viewer as well as Acrobat Reader are able to read the PDFs. Whatever the problem is, it is likely some trouble with Xournal’s PDF export functionality. If someone has a method of fixing this without losing metadata (e.g. the ability to copy/paste the text), I’d be all ears.

  • July 29, 2013

No grammar? No problem!

One day, you’re strolling along fields of code, when suddenly you spot a syntax construct that you don’t understand.

Perhaps you’d ask your desk-mate, who’d tell you in an instant what it was.

Perhaps your programming toolchain can tell you. (Perhaps the IDE would you mouse over the construct, or you’re using Coq which let’s you Locate custom notations.)

Perhaps you’d pull up the manual (or, more likely, one of many tutorials) and scan through looking for the syntax construct in question.

But when all this fails, what is one to do? What if the code in question is written in an internal language for a compiler, whose details have changed since it was last documented, for which the documentation is out of date?

No problem. As long as you’re willing to roll up your sleeves and take a look at the source code of the compiler in question, you can frequently resolve your question for less effort than it would have taken to look up the syntax in the manual (and it’s guaranteed to be up-to-date too!) The key is that modern compilers all use parser generators, and the input to these are essentially executable specifications.

I’ll give two examples from GHC. The first is from C--, GHC’s high-level assembly language. Consider this function:

INFO_TABLE_RET(stg_maskUninterruptiblezh_ret, RET_SMALL, W_ info_ptr)
    return (P_ ret)
    StgTSO_flags(CurrentTSO) =
          | TSO_BLOCKEX)

    return (ret);

Some aspects of this definition are familiar to someone who has written C before, but there are some mysterious bits. For example, what does the return (P_ ret) mean in the preamble?

The first order of business is to find the relevant file. When the code in question has very distinctive keywords (as this one does), a grep will often do the trick:

ezyang@javelin:~/Dev/ghc-clean/rts$ grep -R INFO_TABLE_RET ../compiler/
../compiler/cmm/CmmParse.y:INFO_TABLE_RET ( label, FRAME_TYPE, info_ptr, field1, ..., fieldN )
../compiler/cmm/CmmParse.y:        'INFO_TABLE_RET'{ L _ (CmmT_INFO_TABLE_RET) }
../compiler/cmm/CmmParse.y:        | 'INFO_TABLE_RET' '(' NAME ',' INT ')'
../compiler/cmm/CmmParse.y:        | 'INFO_TABLE_RET' '(' NAME ',' INT ',' formals0 ')'
../compiler/cmm/CmmParse.y:-- is.  That is, for an INFO_TABLE_RET we want the return convention,
../compiler/cmm/CmmLex.x:  | CmmT_INFO_TABLE_RET
../compiler/cmm/CmmLex.x:   ( "INFO_TABLE_RET",     CmmT_INFO_TABLE_RET ),

File extensions can also be dead giveaways; GHC uses a parser generator named Happy, and the file extension of Happy files is .y:

ezyang@javelin:~/Dev/ghc-clean/rts$ find ../compiler -name *.y

From there, we can search the file for keywords or symbols (check for the string token name if a lexer is used; also, make sure to quote alphanumeric literals). A symbol can show up in multiple places, as it does for return:

maybe_conv :: { Convention }
           : {- empty -}        { NativeNodeCall }
           | 'return'           { NativeReturn }


stmt    :: { CmmParse () }
        : ';'                                   { return () }
        | 'goto' NAME ';'
                { do l <- lookupLabel $2; emit (mkBranch l) }
        | 'return' '(' exprs0 ')' ';'
                { doReturn $3 }

Guessing from the names of the productions and the contexts, it seems more likely that maybe_conv is the relevant production. It is used here:

cmmproc :: { CmmParse () }
        : info maybe_conv maybe_formals maybe_body
                { do ((entry_ret_label, info, stk_formals, formals), agraph) <-
                       getCodeR $ loopDecls $ do {
                         (entry_ret_label, info, stk_formals) <- $1;
                         formals <- sequence (fromMaybe [] $3);
                         return (entry_ret_label, info, stk_formals, formals) }
                     let do_layout = isJust $3
                     code (emitProcWithStackFrame $2 info
                                entry_ret_label stk_formals formals agraph
                                do_layout ) }

Now, if you really need to know exactly how it is lade out, you can go and checkout how emitProcWithStackFrame is implemented. Alternately, you might hope there is a useful comment in the source file which explains what is up:

A stack frame is written like this:

INFO_TABLE_RET ( label, FRAME_TYPE, info_ptr, field1, ..., fieldN )
               return ( arg1, ..., argM )
  ... code ...

where field1 ... fieldN are the fields of the stack frame (with types)
arg1...argN are the values returned to the stack frame (with types).
The return values are assumed to be passed according to the
NativeReturn convention.

The second example is for STG, which you can ask GHC to print out using -ddump-stg. Now, there is no parser for STG, so instead you’ll have to look at the pretty-printer. Not too difficult. Take this simple function:

Gnam.$WKST =
    \r [tpl_sl4 tpl_sl6]
        case tpl_sl4 of tpl_sl8 {
          __DEFAULT ->
              case tpl_sl6 of tpl_sl9 {
                __DEFAULT -> Gnam.KST [tpl_sl8 tpl_sl9];

Some aspects are familiar. But what does the \r mean?

Once again, we have to find the relevant source file. Since STG is printed out only when we pass the -ddump-stg flag, a good start is to trace the flag through the source code:

ezyang@javelin:~/Dev/ghc-clean/compiler$ grep -R ddump-stg .
./main/DynFlags.hs:  , Flag "ddump-stg"               (setDumpFlag Opt_D_dump_stg)
ezyang@javelin:~/Dev/ghc-clean/compiler$ grep -R Opt_D_dump_stg .
./main/DynFlags.hs:   | Opt_D_dump_stg
./main/DynFlags.hs:  , Flag "ddump-stg"               (setDumpFlag Opt_D_dump_stg)
./simplStg/SimplStg.lhs:        ; dumpIfSet_dyn dflags Opt_D_dump_stg "STG syntax:"

That’s a good sign! Popping open SimpleStg.lhs gives us:

; dumpIfSet_dyn dflags Opt_D_dump_stg "STG syntax:"
                (pprStgBindings un_binds)

And the location of pprStgBindings (compiler/stgSyn/StgSyn.lhs) is in fact the ticket.

STG is pretty small, and as it turns out if you just do a quick scan of the file you’re likely to find what you need. But in case you don’t, you can still figure things out deliberately. Suppose we search for a quoted backslash:

pprStgExpr (StgLam bndrs body)
  = sep [ char '\\' <+> ppr_list (map (pprBndr LambdaBind) bndrs)
            <+> ptext (sLit "->"),
         pprStgExpr body ]
  where ppr_list = brackets . fsep . punctuate comma


-- general case
pprStgRhs (StgRhsClosure cc bi free_vars upd_flag srt args body)
  = sdocWithDynFlags $ \dflags ->
    hang (hsep [if gopt Opt_SccProfilingOn dflags then ppr cc else empty,
                pp_binder_info bi,
                ifPprDebug (brackets (interppSP free_vars)),
                char '\\' <> ppr upd_flag, pprMaybeSRT srt, brackets (interppSP args)])
         4 (ppr body)

Which is it? As it turns out:

StgLam is used *only* during CoreToStg's work. Before CoreToStg has
finished it encodes (\x -> e) as (let f = \x -> e in f)

Since -ddump-stg is post-CoreToSTG, we must be looking at StgRhsClosure, and ppr upd_flag looks like the ticket. r must be an upd_flag, whatever that is. An UpdateFlag, as it turns out:

data UpdateFlag = ReEntrant | Updatable | SingleEntry

instance Outputable UpdateFlag where
    ppr u = char $ case u of
                       ReEntrant   -> 'r'
                       Updatable   -> 'u'
                       SingleEntry -> 's'

The r indicates the function is re-entrant! (Of course, as for what that means, you’ll have to consult other documentation.)

Of course, in an ideal world, all of this would be documented. But even if it is not, there is no reason why you can’t help yourself. If your codebase is as nice as GHC’s, there will be plenty of breadcrumbs and comments to help you out. I hope this gives some insight into one possible thought process when you encounter something you don’t know, and don’t know how to learn. (Of course, sometimes it’s just best to ignore it!)

  • July 2, 2013

HoTT exercises in Coq (in progress)

I spent some of my plane ride yesterday working on Coq versions of the exercises in The HoTT book. I got as far as 1.6 (yeah, not very far, perhaps I should make a GitHub repo if other folks are interested in contributing skeletons. Don't know what to do about the solutions though). All of these have been test solved.

You will need HoTT/coq in order to run this development; instructions on how to install it are here.

Update. Solutions and more exercises can be found at the HoTT-coqex repository. I’ve done all of the nontrivial homotopy-specific exercises, and left out some of the more standard type theory exercises (which aren’t really homotopy specific). Some of the solutions are really terrible and could use some sprucing up.

Require Import HoTT.

Definition admit {T: Type} : T. Admitted.

(* Exercise 1.1 *)
Definition mycompose {A B C : Type} (g : B -> C) (f : A -> B) : A -> C := admit.

Goal forall (A B C D : Type) (f : A -> B) (g : B -> C) (h : C -> D),
       mycompose h (mycompose g f) = mycompose (mycompose h g) f.

(* Exercise 1.2 *)
Section ex_1_2_prod.
  Variable A B : Type.
  Check @fst.
  Check @snd.
  Definition my_prod_rec (C : Type) (g : A -> B -> C) (p : A * B) : C := admit.
  Goal fst = my_prod_rec A (fun a => fun b => a). Admitted.
  Goal snd = my_prod_rec B (fun a => fun b => b). Admitted.
End ex_1_2_prod.

Section ex_1_2_sig.
  Variable A : Type.
  Variable B : A -> Type.
  Check @projT1.
  Check @projT2.
  Definition my_sig_rec (C : Type) (g : forall (x : A), B x -> C) (p : exists (x : A), B x) : C := admit.
  Goal @projT1 A B = my_sig_rec A (fun a => fun b => a). Admitted.
  (* What goes wrong when you try to prove this for projT2? *)
End ex_1_2_sig.

(* Exercise 1.3 *)

Definition refl {A : Type} (x : A) : x = x := 1%path.

Section ex_1_3_prod.
  Variable A B : Type.
  (* Given by the book *)
  Definition uppt : forall (x : A * B), ((fst x, snd x) = x) :=
    fun p => match p with (a,b) => refl (a,b) end.
  Definition my_prod_ind (C : A * B -> Type) (g : forall (x : A) (y : B), C (x, y)) (x : A * B) : C x := admit.
  Goal forall C g a b, my_prod_ind C g (a, b) = g a b. Admitted.
End ex_1_3_prod.

Section ex_1_3_sig.
  Variable A : Type.
  Variable B : A -> Type.
  Definition sig_uppt : forall (x : exists (a : A), B a), ((projT1 x; projT2 x) = x) := admit.
  Definition mysig_ind (C : (exists (a : A), B a) -> Type) (g : forall (a : A) (b : B a), C (a; b)) (x : exists (a : A), B a) : C x := admit.
  Goal forall C g a b, mysig_ind C g (a; b) = g a b. Admitted.
End ex_1_3_sig.

(* Exercise 1.4 *)
Fixpoint iter (C : Type) (c0 : C) (cs : C -> C) (n : nat) : C :=
  match n with
    | 0 => c0
    | S n' => cs (iter C c0 cs n')
Definition mynat_rec (C : Type) : C -> (nat -> C -> C) -> nat -> C := admit.
Eval compute in mynat_rec (list nat) nil (@cons nat) 2.
Eval compute in nat_rect (fun _ => list nat) nil (@cons nat) 2.

(* Exercise 1.5 *)
Definition mycoprod (A B : Type) := exists (x : Bool), Bool_rect (fun _ => Type) A B x.

Section ex_1_5.
  Variable A B : Type.
  Definition inl := existT (Bool_rect (fun _ => Type) A B) true.
  Definition inr := existT (Bool_rect (fun _ => Type) A B) false.
  Definition mycoprod_ind (C : mycoprod A B -> Type)
                          (l : forall (a : A), C (inl a))
                          (r : forall (b : B), C (inr b))
                          (x : mycoprod A B) : C x := admit.
  Goal forall C l r x, mycoprod_ind C l r (inl x) = l x. Admitted.
  Goal forall C l r x, mycoprod_ind C l r (inr x) = r x. Admitted.
End ex_1_5.

(* Exercise 1.6 *)

Definition myprod (A B : Type) := forall (x : Bool), Bool_rect (fun _ => Type) A B x.
Section ex_1_6.
  Context `{Funext}.
  Variable A B : Type.
  Definition mypr1 (p : myprod A B) := p true.
  Definition mypr2 (p : myprod A B) := p false.
  Definition mymkprod (a : A) (b : B) : myprod A B := Bool_rect (Bool_rect (fun _ => Type) A B) a b.
  Definition myprod_ind (C : myprod A B -> Type)
                        (g : forall (x : A) (y : B), C (mymkprod x y)) (x : myprod A B) : C x := admit.
  Goal forall C g a b, myprod_ind C g (mymkprod a b) = g a b. Admitted.
End ex_1_6.

Actually, I lied. I haven't proved the last goal in exercise 1.6; my trouble is I don't know how to get function extensionality to compute, but I’m sure it’s something simple...

  • July 1, 2013

(Homotopy) Type Theory: Chapter One

In what is old news by now, the folks at the Institute for Advanced Study have released Homotopy Type Theory: Univalent Foundations of Mathematics. There has been some (meta)commentary (Dan Piponi, Bob Harper, Andrej Bauer, François G. Dorais, Steve Awodey, Carlo Angiuli, Mike Shulman, John Baez) on the Internet, though, of course, it takes time to read a math textbook, so don’t expect detailed technical commentary from non-authors for a while.

Of course, being a puny grad student, I was, of course, most interested in the book’s contribution of yet another Martin-Löf intuitionistic type theory introduction, e.g. chapter one. The classic introduction is, of course, the papers that Martin Löf wrote (nota bene: there were many iterations of this paper, so it’s a little hard to find the right one, though it seems Giovanni Sambin’s notes are the easiest to find), but an introduction of type theory for homotopy type theory has to make certain adjustments, and this makes for some novel presentation. In particular, the chapter’s discussion of identity types is considerably more detailed than I have seen elsewhere (this is not surprising, since identity is of central importance to homotopy type theory). There is also a considerable bit of pedantry/structure in the discussion of the types that make up the theory, reminiscent of the PFPL (though I believe that this particular chapter was mostly written by others). And, of course, there are many little variations in how the theory is actually put together, expounded upon in some detail in the chapter notes.

In more detail:

Definitional and propositional equality. The chapter spends a little bit of time carefully distinguishing between definitional equality (a purely syntactic notion up to computation) and propositional equality (which involves evidence), which I appreciated. The difference between connectives which show up inside and outside the deductive system was a major point of confusion for me when I was originally learning logic.

The general pattern of the introduction of a new kind of type. The modern style for introducing logical connectives is to classify the rules into various kinds, such as introduction rules and elimination rules, and then hew to this regularity in the presentation. Often, readers are expected to “see it”, but this book makes a helpful remark laying out the style. I found a useful exercise was to take the rules and reorganize them so that, for example, all of the elimination rules are together and compare them.

Recursion and induction. I’ve written about this subject before, arguing that recursion and induction aren’t the same thing, since induction needs to work over indexed types. This is true, but there is an important point I did not make: induction is generalized recursion. This is because when you specify your type family P to be the constant type family which ignores its index, the dependence is erased and you have an ordinary recursor. In fact, this is a CPDT exercise; I think it clarifies things to see this in both Coq and informal mathematics, as the informal presentation makes the dimension of generalization clearer.

Identity types. I won’t lie: I had a difficult time with this section, and I don’t think I fully understand why path induction works, even after a very long remark at the end of the section. (Additionally, while the notes point to some prior literature about the subject, I took a look at the papers and I did not see anything that resembled their presentation of path induction.) By default, Coq thinks the inductive principle for equality types should be what is referred to in this book as the indiscernability of identicals:

> Check eq_rect.
     : forall (A : Type) (x : A) (P : A -> Type),
       P x -> forall y : A, x = y -> P y

(As a tangent, the use of family C is confusingly overloaded; when discussing the generalization of the previous principlem the reader is required to imagine C(x) -> C(y)  ===  C(x, y)—the C’s of course being distinct.) Path induction asks for more:

     : forall (A : Type), forall (C : forall (x y : A), x = y -> Type),
       (forall (x : A), C x x (eq_refl x)) -> forall (x y : A), forall (p : x = y), C x y p

This is perhaps not too surprising, since this machinery is principally motivated by homotopy type theory. Additionally, the inductive principle follows the same pattern as the other inductive principles defined for the other types. The trouble is a frustrating discussion of why this inductive principle valid, even when you might expect, in a HoTT setting, that not all equality was proven using reflexivity. My understanding of the matter is that is has to do with the placement of the forall (x : A) quantifier. It is permissible to move one of the x's to the top level (based path induction), but not both. (This is somewhat obscured by the reuse of variable names.) There is also a geometric intuition, which is that when both or one endpoints of the path are free (inner-quantification), then I can contract the path into nothingness. But I have a difficult time mapping this onto any sort of rigorous argument. Perhaps you can help me out.

As an aside, I have some general remarks about learning type theory from a functional programming background. I have noticed that it is not too hard to use Coq without knowing much type theory, and even easier to miss the point of why the type theory might be helpful. But in the end, it is really useful to understand what is going on, and so it’s well worth studying why dependent products and sums generalize the way they do. It also seems that people find the pi and sigma notation confusing: it helps if you realize that they are algebraic puns. Don’t skip the definition of the inductive principles.

I apologize if any of this post has been inaccurate or misleadingly skewed. My overall impression is that this first chapter is a very crisp introduction to type theory, but that the segments on identity types may be a little difficult to understand. Now, onwards to chapter two!

  • June 24, 2013

The AST Typing Problem

This Lambda the Ultimate post (dated 2010) describes a rather universal problem faced by compiler writers: how does one go about adding “extra information” (e.g. types) to an AST? (The post itself divides the problem into three components: adding the information to the data types, using the information to inform the construction of the node, and using the information to inform the destruction of a node—but I’m really only interested in the question of how you define your data type, not do things to it.) In this post, I want to sum up ways of solving the problem which were described in this post, and also take a look at what some real world compilers do. The running example lambda calculus looks like the following:

data Exp = Num Int
         | Bool Bool
         | Var Var
         | If Exp Exp Exp
         | Lambda Var Exp
         | App Exp Exp
data Type = TyInt | TyBool | TyArrow Type Type

Separate IR where nodes are decorated with types

The low-tech solution: if you need a new version of the IR with more information, just define a new IR type where each node can also carry the information. A trick to make these definitions more concise is to make a mutually recursive data structure. [1]

type TExp = (TExp', Type)
data TExp' = TNum Int
           | TBool Bool
           | TVar Var
           | TIf TExp TExp TExp
           | TLambda Var TExp
           | TApp TExp TExp

Despite (or perhaps because of) it’s simplicity, this approach is extremely popular among many compilers, especially in the ML community. A few examples include OCaml (parsetree/typedtree), MLton (AST/CoreML) and Ikarus Scheme. Part of the reason for this is that the transition from frontend language to typed language also comes with some other changes, and when a new AST is defined those changes can be combined in too.

Nullable field

The unprincipled solution: use one AST, but have an optional field in which you can slot in the information. [2]

type TExp = (TExp', Maybe Type)
data TExp' = TNum Int
           | TBool Bool
           | TVar Var
           | TIf TExp TExp TExp
           | TLambda Var TExp
           | TApp TExp TExp

Presented without further comment.

Explicit typing

While closely related to the separate IR solution, an explicitly typed IR takes the approach of not decorating each node with a type, but arranging that the type of any given node can be quickly computed using only local information. [3]

data TExp = TNum Int
          | TBool Bool
          | TVar Var
          | TIf TExp TExp TExp
          | TLambda Var Type TExp
          | TApp TExp TExp

Here, the difference between TExp and Exp is very slight; the TLambda is annotated with an explicit type for the binder. As far as type-checking is concerned, this makes a world of difference: we no longer need to look outside a lambda to figure out what the binder could be.

Forcing your IR to be explicitly typed is often a good idea for metatheoretic reasons, as complicated type systems often don’t have decidable inference algorithms. Both GHC’s core IR, Ur/Web's core and Coq are explicitly typed in this way.

Two-level types

By deferring when you tie the knot of a recursive data-structure, you can arrange for the base functor to do double-duty for the untyped and typed representations. [4]

data ExpF a = Num Int
            | Bool Bool
            | Var Var
            | If a a a
            | Lambda Var a
            | App a a
newtype Exp = Exp (ExpF Exp)
newtype TExp = TExp (ExpF TExp, Type)

The Coq kernel uses this to define its expression type, although it doesn’t use it to define an untyped variant.

(Lazy) Attribute grammars

I don’t claim to understand this approach too well, but essentially it is a programming model distinct from usual algebraic data types which associates attributes over nodes of a tree. In some sense, it can be thought as a memoized function from AST nodes to the attributes. Many compilers do utlize maps, but only for top-level declarations. [5]

Closing remarks

There were a few things that I did not mention here which came up in the discussion. One participant suggested using polymorphic variants to define the data type; this doesn’t help much with adding extra information but allows for different ways of writing traversal functions. Indeed, traversal is one of the big concerns, and the mention of generic programming also is targeted at this problem.

As for my preference? It’s hard to say. I’ve worked with compilers mostly written in the “define a new IR style”, and while the initial outlay of defining two data structures is quite annoying, it is mostly a fixed cost. What’s yours?

Also, a question. Is there a presentation of the conventional set of annotations needed to get explicitly typed System F?

  • May 28, 2013

Anatomy of an MVar operation

Adam Belay (of Dune fame) was recently wondering why Haskell’s MVars are so slow. “Slow?” I thought, “aren’t Haskell’s MVars supposed to be really fast?” So I did some digging around how MVars worked, to see if I could explain.

Let’s consider the operation of the function takeMVar in Control.Concurrent.MVar. This function is very simple, it unpacks MVar to get the underlying MVar# primitive value, and then calls the primop takeMVar#:

takeMVar :: MVar a -> IO a
takeMVar (MVar mvar#) = IO $ \ s# -> takeMVar# mvar# s#

Primops result in the invocation of stg_takeMVarzh in PrimOps.cmm, which is where the magic happens. For simplicity, we consider only the multithreaded case.

The first step is to lock the closure:

("ptr" info) = ccall lockClosure(mvar "ptr");

Objects on the GHC heap have an info table header which indicates what kind of object they are, by pointing to the relevant info table for the object. These headers are also used for synchronization: since they are word-sized, they can be atomically swapped for other values. lockClosure is in fact a spin-lock on the info table header:

EXTERN_INLINE StgInfoTable *lockClosure(StgClosure *p)
    StgWord info;
    do {
        nat i = 0;
        do {
            info = xchg((P_)(void *)&p->, (W_)&stg_WHITEHOLE_info);
            if (info != (W_)&stg_WHITEHOLE_info) return (StgInfoTable *)info;
        } while (++i < SPIN_COUNT);
    } while (1);

lockClosure is used for some other objects, namely thread state objects (stg_TSO_info, via lockTSO) and thread messages i.e. exceptions (stg_MSG_THROWTO_info, stg_MSG_NULL_info).

The next step is to apply a GC write barrier on the MVar:

if (info == stg_MVAR_CLEAN_info) {
    ccall dirty_MVAR(BaseReg "ptr", mvar "ptr");

As I’ve written before, as the MVar is a mutable object, it can be mutated to point to objects in generation 0; thus, when a mutation happens, it has to be added to the root set via the mutable list. Since mutable is per capability, this boils down into a bunch of pointer modifications, and does not require any synchronizations. Note that we will need to add the MVar to the mutable list, even if we end up blocking on it, because the MVar is a retainer of the thread (TSO) which is blocked on it! (However, I suspect in some cases we can get away with not doing this.)

Next, we case split depending on whether or not the MVar is full or empty. If the MVar is empty, we need to block the thread until the MVar is full:

/* If the MVar is empty, put ourselves on its blocking queue,
 * and wait until we're woken up.
if (StgMVar_value(mvar) == stg_END_TSO_QUEUE_closure) {

    // We want to put the heap check down here in the slow path,
    // but be careful to unlock the closure before returning to
    // the RTS if the check fails.
         unlockClosure(mvar, stg_MVAR_DIRTY_info);
         GC_PRIM_P(stg_takeMVarzh, mvar));

    q = Hp - SIZEOF_StgMVarTSOQueue + WDS(1);

    StgMVarTSOQueue_link(q) = END_TSO_QUEUE;
    StgMVarTSOQueue_tso(q)  = CurrentTSO;

    if (StgMVar_head(mvar) == stg_END_TSO_QUEUE_closure) {
        StgMVar_head(mvar) = q;
    } else {
        StgMVarTSOQueue_link(StgMVar_tail(mvar)) = q;
        ccall recordClosureMutated(MyCapability() "ptr",
    StgTSO__link(CurrentTSO)       = q;
    StgTSO_block_info(CurrentTSO)  = mvar;
    StgTSO_why_blocked(CurrentTSO) = BlockedOnMVar::I16;
    StgMVar_tail(mvar)             = q;

    jump stg_block_takemvar(mvar);

A useful thing to know when decoding C-- primop code is that StgTSO_block_info(...) and its kin are how we spell field access on objects. C-- doesn’t know anything about C struct layout, and so these “functions” are actually macros generated by utils/deriveConstants. Blocking a thread consists of three steps:

  1. We have to add the thread to the blocked queue attached to the MVar (that’s why blocking on an MVar mutates the MVar!) This involves performing a heap allocation for the linked list node as well as mutating the tail of the old linked list.
  2. We have to mark the thread as blocked (the StgTSO modifications).
  3. We need to setup a stack frame for the thread so that when the thread wakes up, it performs the correct action (the invocation to stg_block_takemvar). This invocation is also responsible for unlocking the closure. While the machinery here is pretty intricate, it’s not really in scope for this blog post.

If the MVar is full, then we can go ahead and take the value from the MVar.

/* we got the value... */
val = StgMVar_value(mvar);

But that’s not all. If there are other blocked putMVars on the MVar (remember, when a thread attempts to put an MVar that is already full, it blocks until the MVar empties out), then we should immediately unblock one of these threads so that the MVar can always be left in a full state:

    q = StgMVar_head(mvar);
    if (q == stg_END_TSO_QUEUE_closure) {
        /* No further putMVars, MVar is now empty */
        StgMVar_value(mvar) = stg_END_TSO_QUEUE_closure;
        unlockClosure(mvar, stg_MVAR_DIRTY_info);
        return (val);
    if (StgHeader_info(q) == stg_IND_info ||
        StgHeader_info(q) == stg_MSG_NULL_info) {
        q = StgInd_indirectee(q);
        goto loop;

There is one interesting thing about the code that checks for blocked threads, and that is the check for indirectees (stg_IND_info). Under what circumstances would a queue object be stubbed out with an indirection? As it turns out, this occurs when we delete an item from the linked list. This is quite nice, because on a singly-linked list, we don't have an easy way to delete items unless we also have a pointer to the previous item. With this scheme, we just overwrite out the current item with an indirection, to be cleaned up next GC. (This, by the way, is why we can't just chain up the TSOs directly, without the extra linked list nodes. [1])

When we find some other threads, we immediately run them, so that the MVar never becomes empty:

// There are putMVar(s) waiting... wake up the first thread on the queue

tso = StgMVarTSOQueue_tso(q);
StgMVar_head(mvar) = StgMVarTSOQueue_link(q);
if (StgMVar_head(mvar) == stg_END_TSO_QUEUE_closure) {
    StgMVar_tail(mvar) = stg_END_TSO_QUEUE_closure;

ASSERT(StgTSO_why_blocked(tso) == BlockedOnMVar::I16); // note: I16 means this is a 16-bit integer
ASSERT(StgTSO_block_info(tso) == mvar);

// actually perform the putMVar for the thread that we just woke up
W_ stack;
stack = StgTSO_stackobj(tso);
PerformPut(stack, StgMVar_value(mvar));

There is one detail here: PerformPut doesn’t actually run the thread, it just looks at the thread’s stack to figure out what it was going to put. Once the MVar is put, we then wake up the thread, so it can go on its merry way:

// indicate that the MVar operation has now completed.
StgTSO__link(tso) = stg_END_TSO_QUEUE_closure;

// no need to mark the TSO dirty, we have only written END_TSO_QUEUE.

ccall tryWakeupThread(MyCapability() "ptr", tso);

unlockClosure(mvar, stg_MVAR_DIRTY_info);
return (val);

To sum up, when you takeMVar, you pay the costs of:

  • one spinlock,
  • on order of several dozen memory operations (write barriers, queue twiddling), and
  • when the MVar is empty, a (small) heap allocation and stack write.

Adam and I puzzled about this a bit, and then realized the reason why the number of cycles was so large: our numbers are for roundtrips, and even with such lightweight synchronization (and lack of syscalls), you still have to go through the scheduler when all is said and done, and that blows up the number of cycles.

[1] It wasn’t always this way, see:

commit f4692220c7cbdadaa633f50eb2b30b59edb30183
Author: Simon Marlow <>
Date:   Thu Apr 1 09:16:05 2010 +0000

    Change the representation of the MVar blocked queue

    The list of threads blocked on an MVar is now represented as a list of
    separately allocated objects rather than being linked through the TSOs
    themselves.  This lets us remove a TSO from the list in O(1) time
    rather than O(n) time, by marking the list object.  Removing this
    linear component fixes some pathalogical performance cases where many
    threads were blocked on an MVar and became unreachable simultaneously
    (nofib/smp/threads007), or when sending an asynchronous exception to a
    TSO in a long list of thread blocked on an MVar.

    MVar performance has actually improved by a few percent as a result of
    this change, slightly to my surprise.

    This is the final cleanup in the sequence, which let me remove the old
    way of waking up threads (unblockOne(), MSG_WAKEUP) in favour of the
    new way (tryWakeupThread and MSG_TRY_WAKEUP, which is idempotent).  It
    is now the case that only the Capability that owns a TSO may modify
    its state (well, almost), and this simplifies various things.  More of
    the RTS is based on message-passing between Capabilities now.
  • May 19, 2013

HotOS “Unconference” report:
Verifying Systems

Ariel Rabkin has some code he'd like to verify, and at this year’s HotOS he appealed to participants of one “unconference” (informal breakout sessions to discuss various topics) to help him figure out what was really going on as far as formal verification went.

He had three questions: "What can we verify? What is impossible to verify? How can we tell that the verification is correct?" They seemed like impossibly large questions, so we drilled in a little bit, and found that Ariel had the very understandable question, "Say I have some C++ code implementing a subtle network protocol, and I'd like to prove that the protocol doesn't deadlock; how do I do that?"

I wish the formal verification community had a good answer to a question like this, but unfortunately we don't. The largest verification projects include things like verified kernels, which are written in fully specified subsets of C; which assume the translation performed by the compiler is correct, formalize C in a theorem prover, and then verify there. This is the "principled approach". It's just not feasible to take C or C++ in its entirety and try to formalize it; it's too complicated and too ill-specified. The easiest thing to do is formalize a small fragment of your algorithm and then make a hand-wavy argument that your implementation is adequate.

Martin Abadi remarked that before you embark on a verification project, you have to figure out where you'll get the most bang for your buck. Most of the time, a formalization won't get you "full correctness"; the "electrons may be faulty", as the case may be. But even a flawed verification forces you to state your assumptions explicitly, which is a good thing.

We then circled around to the subject of, well, what can be verified. Until the 90s, the formal verification community limited itself to only complete and sound analyses—and failed. The relaxation of this restriction lead to a renaissance of formal verification work. We talked about who was using formal verification, and the usual suspects showed up: safety critical software, cache coherence protocols (but one participant remarked that this was only a flash in the pan, as far as applications goes—he asserted that these companies are likely not going to use these methods any longer in the future), etc. Safety critical software is also likely to use coprocessors (since hardware failure is a very real issue), but Gernot Heiser noted that these folks are trying to get away from physical separation: it is expensive in terms of expense, weight and energy. Luckily, the costs of verification, as he recounted, are within a factor of two of normal industrial assurance, and half the cost of military assurance (though, he cautioned that this was for a specific project, and for a specific size of code.) He also remarked that as far as changes to code requiring changes to the proofs, the changes in the proofs seemed to be linear in the complexity (conceptual or implementation-wise) of the change, which is a good sign!

Well, supposing that you decide that you actually want to verify your software, how do you go about doing it? Unfortunately, it takes a completely different set of skills to build verified software versus normal software. Everyone agreed, "Yes, you need to hire a formal methods guy" if you're going to make any progress. But that's just not enough. The formal methods guy has to talk to the systems guy. Heiser recounted a very good experience hiring a formal methods person who was able to communicate with the other systems researchers working on the project; without this line of communication, he said, the project likely would have failed. And he mentioned another project, which had three times as much funding, but didn't accomplish nearly as much their team had. (Names not mentioned to protect the guilty.)

In the end, it seemed that we didn’t manage to give Ari a quite satisfactory answer. As one participant said, “You’ll probably learn the most by just sitting down and trying to formalize the thing you are interested in.” This is probably true, though I fear most will be scared off by the realization of how much work it actually takes to formalize software.

Hey guys, I’m liveblogging HotOS at my research Tumblr. The posts there are likely to be more fragmented than this, but if people are interested in any particular topics I can inflate them into full posts.

  • May 14, 2013