An Eq instance for non de Bruijn terms
tl;dr A non-nameless term equipped with a map specifying a de Bruijn numbering can support an efficient equality without needing a helper function. More abstractly, quotients are not just for proofs: they can help efficiency of programs too.
The cut. You're writing a small compiler, which defines expressions as follows:
type Var = Int data Expr = Var Var | App Expr Expr | Lam Var Expr
Where Var is provided from some globally unique supply. But while working on a common sub-expression eliminator, you find yourself needing to define equality over expressions.
You know the default instance won’t work, since it will not say that Lam 0 (Var 0) is equal to Lam 1 (Var 1). Your colleague Nicolaas teases you that the default instance would have worked if you used a nameless representation, but de Bruijn levels make your head hurt, so you decide to try to write an instance that does the right thing by yourself. However, you run into a quandary:
instance Eq Expr where Var v == Var v' = n == n' App e1 e2 == App e1' e2' = e1 == e1' && e2 == e2' Lam v e == Lam v' e' = _what_goes_here
If v == v', things are simple enough: just check if e == e'. But if they're not... something needs to be done. One possibility is to rename e' before proceeding, but this results in an equality which takes quadratic time. You crack open the source of one famous compiler, and you find that in fact: (1) there is no Eq instance for terms, and (2) an equality function has been defined with this type signature:
eqTypeX :: RnEnv2 -> Type -> Type -> Bool
Where RnEnv2 is a data structure containing renaming information: the compiler has avoided the quadratic blow-up by deferring any renaming until we need to test variables for equality.
“Well that’s great,” you think, “But I want my Eq instance, and I don’t want to convert to de Bruijn levels.” Is there anything to do?
Perhaps a change of perspective in order:
The turn. Nicolaas has the right idea: a nameless term representation has a very natural equality, but the type you've defined is too big: it contains many expressions which should be equal but structurally are not. But in another sense, it is also too small.
Here is an example. Consider the term x, which is a subterm of λx. λy. x. The x in this term is free; it is only through the context λx. λy. x that we know it is bound. However, in the analogous situation with de Bruijn levels (not indexes—as it turns out, levels are more convenient in this case) we have 0, which is a subterm of λ λ 0. Not only do we know that 0 is a free variable, but we also know that it binds to the outermost enclosing lambda, no matter the context. With just x, we don’t have enough information!
If you know you don’t know something, you should learn it. If your terms don’t know enough about their free variables, you should equip them with the necessary knowledge:
import qualified Data.Map as Map import Data.Map (Map) data DeBruijnExpr = D Expr NEnv type Level = Int data NEnv = N Level (Map Var Level) lookupN :: Var -> NEnv -> Maybe Level lookupN v (N _ m) = Map.lookup v m extendN :: Var -> NEnv -> NEnv extendN v (N i m) = N (i+1) (Map.insert v i m)
and when you do that, things just might work out the way you want them to:
instance Eq DeBruijnExpr where D (Var v) n == D (Var v') n' = case (lookupN v n, lookupN v' n') of (Just l, Just l') -> l == l' (Nothing, Nothing) -> v == v' _ -> False D (App e1 e2) n == D (App e1' e2') n' = D e1 n == D e1' n' && D e2 n == D e2' n' D (Lam v e) n == D (Lam v' e') n' = D e (extendN v n) == D e' (extendN v' n')
(Though perhaps Coq might not be able to tell, unassisted, that this function is structurally recursive.)
Exercise. Define a function with type DeBruijnExpr -> DeBruijnExpr' and its inverse, where:data DeBruijnExpr' = Var' Var | Bound' Level | Lam' DeBruijnExpr' | App' DeBruijnExpr' DeBruijnExpr'
The conclusion. What have we done here? We have quotiented a type—made it smaller—by adding more information. In doing so, we recovered a simple way of defining equality over the type, without needing to define a helper function, do extra conversions, or suffer quadratically worse performance.
Sometimes, adding information is the only way to get the minimal definition. This situation occurs in homotopy type theory, where equivalences must be equipped with an extra piece of information, or else it is not a mere proposition (has the wrong homotopy type). If you, gentle reader, have more examples, I would love to hear about them in the comments. We are frequently told that “less is more”, that the route to minimalism lies in removing things: but sometimes, the true path lies in adding constraints.
Postscript. In Haskell, we haven’t truly made the type smaller: I can distinguish two expressions which should be equivalent by, for example, projecting out the underlying Expr. A proper type system which supports quotients would oblige me to demonstrate that if two elements are equivalent under the quotienting equivalence relation, my elimination function can't observe it.
Postscript 2. This technique has its limitations. Here is one situation where I have not been able to figure out the right quotient: suppose that the type of my expressions are such that all free variables are implicitly universally quantified. That is to say, there exists some ordering of quantifiers on a and b such that a b is equivalent to b a. Is there a way to get the quantifiers in order on the fly, without requiring a pre-pass on the expressions using this quotienting technique? I don’t know!