An Eq instance for non de Bruijn terms : ezyang’s blog

An Eq instance for non de Bruijn terms

tl;dr A non-nameless term equipped with a map specifying a de Bruijn numbering can support an efficient equality without needing a helper function. More abstractly, quotients are not just for proofs: they can help efficiency of programs too.

The cut. You're writing a small compiler, which defines expressions as follows:

type Var = Int
data Expr = Var Var
          | App Expr Expr
          | Lam Var Expr

Where Var is provided from some globally unique supply. But while working on a common sub-expression eliminator, you find yourself needing to define equality over expressions.

You know the default instance won’t work, since it will not say that Lam 0 (Var 0) is equal to Lam 1 (Var 1). Your colleague Nicolaas teases you that the default instance would have worked if you used a nameless representation, but de Bruijn levels make your head hurt, so you decide to try to write an instance that does the right thing by yourself. However, you run into a quandary:

instance Eq Expr where
  Var v == Var v'          = n == n'
  App e1 e2 == App e1' e2' = e1 == e1' && e2 == e2'
  Lam v e == Lam v' e'     = _what_goes_here

If v == v', things are simple enough: just check if e == e'. But if they're not... something needs to be done. One possibility is to rename e' before proceeding, but this results in an equality which takes quadratic time. You crack open the source of one famous compiler, and you find that in fact: (1) there is no Eq instance for terms, and (2) an equality function has been defined with this type signature:

eqTypeX :: RnEnv2 -> Type -> Type -> Bool

Where RnEnv2 is a data structure containing renaming information: the compiler has avoided the quadratic blow-up by deferring any renaming until we need to test variables for equality.

“Well that’s great,” you think, “But I want my Eq instance, and I don’t want to convert to de Bruijn levels.” Is there anything to do?

Perhaps a change of perspective in order:

The turn. Nicolaas has the right idea: a nameless term representation has a very natural equality, but the type you've defined is too big: it contains many expressions which should be equal but structurally are not. But in another sense, it is also too small.

Here is an example. Consider the term x, which is a subterm of λx. λy. x. The x in this term is free; it is only through the context λx. λy. x that we know it is bound. However, in the analogous situation with de Bruijn levels (not indexes—as it turns out, levels are more convenient in this case) we have 0, which is a subterm of λ λ 0. Not only do we know that 0 is a free variable, but we also know that it binds to the outermost enclosing lambda, no matter the context. With just x, we don’t have enough information!

If you know you don’t know something, you should learn it. If your terms don’t know enough about their free variables, you should equip them with the necessary knowledge:

import qualified Data.Map as Map
import Data.Map (Map)

data DeBruijnExpr = D Expr NEnv

type Level = Int
data NEnv = N Level (Map Var Level)

lookupN :: Var -> NEnv -> Maybe Level
lookupN v (N _ m) = Map.lookup v m

extendN :: Var -> NEnv -> NEnv
extendN v (N i m) = N (i+1) (Map.insert v i m)

and when you do that, things just might work out the way you want them to:

instance Eq DeBruijnExpr where
  D (Var v) n == D (Var v') n' =
    case (lookupN v n, lookupN v' n') of
      (Just l, Just l')  -> l == l'
      (Nothing, Nothing) -> v == v'
      _ -> False
  D (App e1 e2) n == D (App e1' e2') n' =
    D e1 n == D e1' n' && D e2 n == D e2' n'
  D (Lam v e) n == D (Lam v' e') n' =
    D e (extendN v n) == D e' (extendN v' n')

(Though perhaps Coq might not be able to tell, unassisted, that this function is structurally recursive.)

Exercise. Define a function with type DeBruijnExpr -> DeBruijnExpr' and its inverse, where:
data DeBruijnExpr' = Var' Var
                   | Bound' Level
                   | Lam' DeBruijnExpr'
                   | App' DeBruijnExpr' DeBruijnExpr'

The conclusion. What have we done here? We have quotiented a type—made it smaller—by adding more information. In doing so, we recovered a simple way of defining equality over the type, without needing to define a helper function, do extra conversions, or suffer quadratically worse performance.

Sometimes, adding information is the only way to get the minimal definition. This situation occurs in homotopy type theory, where equivalences must be equipped with an extra piece of information, or else it is not a mere proposition (has the wrong homotopy type). If you, gentle reader, have more examples, I would love to hear about them in the comments. We are frequently told that “less is more”, that the route to minimalism lies in removing things: but sometimes, the true path lies in adding constraints.

Postscript. In Haskell, we haven’t truly made the type smaller: I can distinguish two expressions which should be equivalent by, for example, projecting out the underlying Expr. A proper type system which supports quotients would oblige me to demonstrate that if two elements are equivalent under the quotienting equivalence relation, my elimination function can't observe it.

Postscript 2. This technique has its limitations. Here is one situation where I have not been able to figure out the right quotient: suppose that the type of my expressions are such that all free variables are implicitly universally quantified. That is to say, there exists some ordering of quantifiers on a and b such that a b is equivalent to b a. Is there a way to get the quantifiers in order on the fly, without requiring a pre-pass on the expressions using this quotienting technique? I don’t know!

Anonymous says:

January 31, 2015 at 1:14 pm

Very nice!

For the very same idea, check out http://www.math.harvard.edu/~lurie/papers/DAG.pdf, top of the page 3. Quite a different context but related to HoTT in the sense of being a “higher” theory.

November 15, 2015 at 10:25 pm

Another way to do this is pass a `Map Int Int` that maps variables from the left side of (==) to the right side (or the other way around), then

eq :: Map Int Int -> Expr -> Expr -> Bool
eq m (Lam x a) (Lam y b) = eq (M.insert x y m) a b
eq m (Var a) (Var b) = M.lookup a m == b

Edward Z. Yang says:

November 16, 2015 at 5:42 pm

Anonymous 2: Yes, and this is the standard method; the problem is that you can’t write an Eq instance with this implementation!