Inside 206-105

Existential Pontification and Generalized Abstract Digressions

The Difference between Recursion & Induction

Recursion and induction are closely related. When you were first taught recursion in an introductory computer science class, you were probably told to use induction to prove that your recursive algorithm was correct. (For the purposes of this post, let us exclude hairy recursive functions like the one in the Collatz conjecture which do not obviously terminate.) Induction suspiciously resembles recursion: the similarity comes from the fact that the inductive hypothesis looks a bit like the result of a “recursive call” to the theorem you are proving. If an ordinary recursive computation returns plain old values, you might wonder if an “induction computation” returns proof terms (which, by the Curry-Howard correspondence, could be thought of as a value).

As it turns out, however, when you look at recursion and induction categorically, they are not equivalent! Intuitively, the difference lies in the fact that when you are performing induction, the data type you are performing induction over (e.g. the numbers) appears at the type level, not the term level. In the words of a category theorist, both recursion and induction have associated initial algebras, but the carrier sets and endofunctors are different. In this blog post, I hope to elucidate precisely what the difference between recursion and induction is. Unfortunately, I need to assume some familiarity with initial algebras: if you don’t know what the relationship between a fold and an initial algebra is, check out this derivation of lists in initial algebra form.

When dealing with generalized abstract nonsense, the most important first step is to use a concrete example! So let us go with the simplest nontrivial data type one can gin up: the natural numbers (our examples are written in both Coq and Haskell, when possible):

Inductive nat : Set := (* defined in standard library *)
  | 0 : nat
  | S : nat -> nat.

data Nat = Z | S Nat

Natural numbers are a pretty good example: even the Wikipedia article on F-algebras uses them. To recap, an F-algebra (or sometimes simply “algebra”) has three components: an (endo)functor f, a type a and a reduction function f a -> a. For simple recursion over natural numbers, we need to define a functor NatF which “generates” the natural numbers; then our type a is Nat and the reduction function is type NatF Nat -> Nat. The functor is defined as follows:

Inductive NatF (x : Set) : Set :=
  | F0 : NatF x.
  | FS : x -> NatF x.

data NatF x = FZ | FS x

Essentially, take the original definition but replace any recursive occurrence of the type with a polymorphic variable. As an exercise, show that NatF Nat -> Nat exists: it is the (co)product of () -> Nat and Nat -> Nat. The initiality of this algebra implies that any function of type NatF x -> x (for some arbitrary type x) can be used in a fold Nat -> x: this fold is the homomorphism from the initial algebra (NatF Nat -> Nat) to another algebra (NatF x -> x). The take-away point is that the initial algebra of natural numbers consists of an endofunctor over sets.

/img/nat-recursion.png

Let’s look at the F-algebra for induction now. As a first try, let’s try to use the same F-algebra and see if an appropriate homomorphism exists with the “type of induction”. (We can’t write this in Haskell, so now the examples will be Coq only.) Suppose we are trying to prove some proposition P : nat -> Prop holds for all natural numbers; then the type of the final proof term must be forall n : nat, P n. We can now write out the morphism of the algebra: NatF (forall n : nat, P n) -> forall n : nat, P n. But this “inductive principle” is both nonsense and not true:

Hint Constructors nat NatF.
Goal ~ (forall (P : nat -> Prop), (NatF (forall n : nat, P n) -> forall n : nat, P n)).
  intro H; specialize (H (fun n => False)); auto.
Qed.

(Side note: you might say that this proof fails because I’ve provided a predicate which is false over all natural numbers. But induction still “works” even when the predicate you’re trying to prove is false: you should fail when trying to provide the base case or inductive hypothesis!)

We step back and now wonder, “So, what’s the right algebra?” It should be pretty clear that our endofunctor is wrong. Fortunately, we can get a clue for what the right endofunctor might be by inspecting the type the induction principle for natural numbers:

(* Check nat_ind. *)
nat_ind : forall P : nat -> Prop,
  P 0 -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n

P 0 is the type of the base case, and forall n : nat, P n -> P (S n) is the type of the inductive case. In much the same way that we defined NatF nat -> nat for natural numbers, which was the combination of zero : unit -> nat and succ : nat -> nat, we need to define a single function which combines the base case and the inductive case. This seems tough: the result types are not the same. But dependent types come to the rescue: the type we are looking for is:

fun (P : nat -> Prop) => forall n : nat, match n with 0 => True | S n' => P n' end -> P n

You can read this type as follows: I will give you a proof object of type P n for any n. If n is 0, I will give you this proof object with no further help (True -> P 0). However, if n is S n', I will require you to furnish me with P n' (P n' -> P (S n')).

We’re getting close. If this is the morphism of an initial algebra, then the functor IndF must be:

fun (P : nat -> Prop) => forall n : nat, match n with 0 => True | S n' => P n' end

What category is this a functor over? Unfortunately, neither this post nor my brain has the space to give a rigorous treatment, but roughly the category can be thought of as nat-indexed propositions. Objects of this category are of the form forall n : nat, P n, morphisms of the category are of the form forall n : nat, P n -> P' n. [1] As an exercise, show that identity and composition exist and obey the appropriate laws.

Something amazing is about to happen. We have defined our functor, and we are now in search of the initial algebra. As was the case for natural numbers, the initial algebra is defined by the least fixed point over the functor:

Fixpoint P (n : nat) : Prop :=
  match n with 0 => True | S n' => P n' end.

But this is just True!

Hint Unfold P.
Goal forall n, P n = True.
  induction n; auto.
Qed.

Drawing out our diagram:

/img/nat-ind.png

The algebras of our category (downward arrows) correspond to inductive arguments. Because our morphisms take the form of forall n, P n -> P' n, one cannot trivially conclude forall n, P' n simply given forall n, P n; however, the presence of the initial algebra means that True -> forall n, P n whenever we have an algebra forall n, IndF n -> P n. Stunning! (As a side note, Lambek’s lemma states that Mu P is isomorphic to P (Mu P), so the initial algebra is in fact really really trivial.)

In conclusion:

  • Recursion over the natural numbers involves F-algebras with the functor unit + X over the category of Sets. The least fixed point of this functor is the natural numbers, and the morphism induced by the initial algebra corresponds to a fold.
  • Induction over the natural numbers involves F-algebras with the functor fun n => match n with 0 => True | S n' => P n' over the category of nat-indexed propositions. The least fixed point of this functor is True, and the morphism induced by the initial algebra establishes the truth of the proposition being inductively proven.

So, the next time someone tells asks you what the difference between induction and recursion is, tell them: Induction is just the unique homomorphism induced by an initial algebra over indexed propositions, what’s the problem?


Acknowledgements go to Conor McBride, who explained this shindig to me over ICFP. I promised to blog about it, but forgot, and ended up having to rederive it all over again.

[1] Another plausible formulation of morphisms goes (forall n : nat, P n) -> (forall n : nat, P' n). However, morphisms in this category are too strong: they require you to go and prove the result for all n... which you would do with induction, which misses the point. Plus, this category is a subcategory of the ordinary category of propositions.

  • April 27, 2013

Kindle is not good for textbooks

Having attempted to read a few textbooks on my Kindle, I have solemnly concluded that the Kindle is in fact a terrible device for reading textbooks. The fundamental problem is that, due to technological limitations, the Kindle is optimized for sequential reading. This can be seen in many aspects:

  • Flipping a page in the Kindle is not instantaneous (I don't have a good setup to time how long the screen refresh takes, but there is definitely a perceptible lag before when you swipe, and when the Kindle successfully redraws the screen—and it’s even worse if you try to flip backwards).
  • Rapidly flipping through pages in order to scan for a visual feature compounds the delay problem.
  • There is no way to take the “finger” approach to random access (i.e. wedge your finger between two pages to rapidly switch between them); jumping between bookmarks requires four presses with the current Kindle interface!
  • The screen size of the Kindle is dramatically smaller than that of an average textbook, which reduces the amount of information content that can be placed on one screen and further exacerbates slow page turns.

A textbook cannot be read as a light novel. So, while the Kindle offers the tantalizing possibility of carrying a stack of textbooks with you everywhere, in fact, you’re better off getting the actual dead tree version if you’re planning on doing some serious studying from it. That is not to say textbook ebooks are not useful; in fact, having a searchable textbook on your laptop is seriously awesome—but this is when you’re using the textbook as a reference material, and not when you’re trying to actually learn the material.

  • April 15, 2013

A Zerocoin puzzle

I very rarely post linkspam, but given that I’ve written on the subject of anonymizing Bitcoins in the past, this link seems relevant: Zerocoin: making Bitcoin anonymous. Their essential innovation is to have a continuously operating mixing pool built into the block chain itself; they pull this off using zero-knowledge proofs. Nifty!

Here is a puzzle for the readers of this blog. Suppose that I am a user who wants to anonymize some Bitcoins, and I am willing to wait expected time N before redeeming my Zerocoins. What is the correct probability distribution for me to pick my wait time from? Furthermore, suppose a population of Zerocoin participants, all of which are using this probability distribution. Furthermore, suppose that each participant has some utility function trading off anonymity and expected wait time (feel free to make assumptions that make the analysis easy). Is this population in Nash equilibrium?

  • April 11, 2013

A classical logic fairy tale

(Selinger) Here is a fairy tale: The evil king calls the poor shepherd and gives him these orders. “You must bring me the philosophers stone, or you have to find a way to turn the philosopher’s stone to gold. If you don’t, your head will be taken off tomorrow!” What can the poor shepherd do to save his life?

Hat tip to Chris for originally telling me a different variant of this story. Unfortunately, this quote from Lectures on the Curry-Howard Isomorphism was the only reference I could find. What should the shepherd do? Is there something a little odd about this story?

  • April 7, 2013

Resource limits for Haskell

Last week, I made my very first submission to ICFP! The topic? An old flame of mine: how to bound space usage of Haskell programs.

We describe the first iteration of a resource limits system for Haskell, taking advantage of the key observation that resource limits share semantics and implementation strategy with profiling. We pay special attention to the problem of limiting resident memory usage: we describe a simple implementation technique for carrying out incremental heap censuses and describe a novel information-flow control solution for handling forcible resource reclamation. This system is implemented as a set of patches to GHC.

You can get a copy of the submission here. I've reproduced below the background section on how profiling Haskell works; if this tickles your fancy, check out the rest of the paper!


Profiling in Haskell is performed by charging the costs of computation to the “current cost center.” A cost center is an abstract, programmer-specified entity to which costs can be charged; only one is active per thread at any given time, and the cost semantics determines how the current cost center changes as the program executes. For example, the scc cc e expression (set-cost-center) modifies the current cost center during evaluation of e to be cc. Cost centers are defined statically at compile time.

A cost semantics for Haskell was defined by Sansom et al. (1995) Previously, there had not been a formal account for how to attribute costs in the presence of lazy evaluation and higher-order functions; this paper resolved these questions. The two insights of their paper were the following: first, they articulated that cost attribution should be independent of evaluation order. For the sake of understandability, whether a thunk is evaluated immediately or later should not affect who is charged for it. Secondly, they observed that there are two ways of attributing costs for functions, in direct parallel to the difference between lexical scoping and dynamic scoping.

The principle of order-independent cost-attribution can be seen by this program:

f x = scc "f" (Just (x * x))
g x = let Just y = f x in scc "g" y

When g 4 is invoked, who is charged the cost of evaluating x * x? With strict evaluation, it is easy to see that f should be charged, since x * x is evaluated immediately inside the scc expression. Order-independence dictates, then, that even if the execution of x * x is deferred to the inside of scc "g" y, the cost should still be attributed to f. In general, scc "f" x on a variable x is a no-op. In order to implement such a scheme, the current cost-center at the time of the allocation of the thunk must be recorded with the thunk and restored when the thunk is forced.

The difference between lexical scoping and dynamic scoping for function cost attribution can be seen in this example:

f = scc "f" (\x -> x * x)
g = \x -> scc "g" (x * x)

What is the difference between these two functions? We are in a situation analogous to the choice for thunks: should the current cost-center be saved along with the closure, and restored upon invocation of the function? If the answer is yes, we are using lexical scoping and the functions are equivalent; if the answer is no, we are using dynamic scoping and the scc in f is a no-op. The choice GHC has currently adopted for scc is dynamic scoping.

  • April 2, 2013

The single export pattern

From the files of the ECMAScript TC39 proceedings

Single export refers to a design pattern where a module identifier is overloaded to also represent a function or type inside the module. As far as I can tell, the term “single export” is not particularly widely used outside the ECMAScript TC39 committee; however, the idea shows up in other contexts, so I’m hoping to popularize this particular name (since names are powerful).

The basic idea is very simple. In JavaScript, a module is frequently represented as an object:

var sayHello = require('./sayhello.js');
sayHello.run();

The methods of sayHello are the functions exported by the module. But what about sayHello itself? Because functions are objects too, we could imagine that sayHello was a function as well, and thus:

sayHello()

would be a valid fragment of code, perhaps equivalent to sayHello.run(). Only one symbol can be exported this way, but in many modules, there is an obvious choice (think of jQuery’s $ object, etc).

This pattern is also commonly employed in Haskell, by taking advantage of the fact that types and modules live in different namespaces:

import qualified Data.Map as Map
import Data.Map (Map)

Map is now overloaded to be both a type and a module.

  • March 31, 2013

The duality of weak maps and private symbols

From the files of the ECMAScript TC39 proceedings

I want to talk about an interesting duality pointed out by Mark Miller between two otherwise different language features: weak maps and private symbols. Modulo implementation differences, they are the same thing!

A weak map is an ordinary associative map, with the twist that if the key for any entry becomes unreachable, then the value becomes unreachable too (though you must remember to ignore references to the key from the value itself!) Weak maps have a variety of use-cases, including memoization, where we’d like to remember results of a computation, but only if it will ever get asked for again! A weak map supports get(key) and set(key, value) operations.

A private symbol is an unforgeable identifier of a field on an object. Symbols are useful because they can be generated “fresh”; that is, they are guaranteed not to conflict with existing fields that live on an object (whereas one might get unlucky with _private_identifier_no_really); a private symbol has the extra stipulation that one cannot discover that it exists on an object without actually having the symbol—e.g. an object will refuse to divulge the existence of a private symbol while enumerating the properties of an object. A private symbol psym might be created, and then used just like an ordinary property name to get (obj[psym]) and set (obj[psym] = value) values.

To see why these are the same, lets implement weak maps in terms of private symbols, and vice versa (warning, pseudocode ahead):

function WeakMap() {
  var psym = PrivateSymbol();
  return {
    get: function(key) { return key[psym]; },
    set: function(key, value) { key[psym] = value; }
  }
}

function PrivateSymbol() {
  return WeakMap();
}
// pretend that get/set are magical catch-all getters and setters
Object.prototype.get = function(key) {
  if (key instanceof PrivateSymbol) { return key.get(this); }
  else { return this.old_get(key); }
}
Object.prototype.set = function(key, value) {
  if (key instanceof PrivateSymbol) { return key.get(this, value); }
  else { return this.old_set(key, value); }
}

Notice, in particular, that it wouldn’t make sense to enumerate all of the entries of a weak map; such an enumeration would change arbitrarily depending on whether or not a GC had occurred.

If you look at this more closely, there is something rather interesting going on: the implementation strategies of weak maps and private symbols are the opposite of each other. With a weak map, you might imagine a an actual map-like data structure collecting the mappings from keys to values (plus some GC trickery); with a private symbol, you would expect the values to be stored on the object itself. That is to say, if we say “WeakMap = PrivateSymbol” and “key = this”, then the primary difference is whether or not the relationships are stored on the WeakMap/PrivateSymbol, or on the key/this. WeakMap suggests the former; PrivateSymbol suggests the latter.

Is one implementation or the other better? If objects in your system are immutable or not arbitrarily extensible, then the private symbol implementation may be impossible to carry out. But if both implementations are possible, then which one is better depends on the lifetimes of the objects in question. Garbage collecting weak references is expensive business (it is considerably less efficient than ordinary garbage collection), and so if you can arrange for your weak mappings to die through normal garbage collection, that is a win. Therefore, it’s better to store the mapping on the shorter-lived object. In the case of a memotable, the key is going to be a bit more ephemeral than the map, which results in a very strange consequence: the best implementation strategy for a weak map doesn’t involve creating a map at all!

Alas, as with many elegant results, there are some difficulties stemming from complexities in other parts of the ECMAScript specification. In particular, it is not at all clear what it means to have an “read-only weak map”, whereas an read-only private symbol has an obvious meaning. Furthermore, collapsing these two rather distinct concepts into a single language may only serve to confuse web developers; a case of a proposal being too clever for its own good. And finally, there is an ongoing debate about how to reconcile private state with proxies. This proposal was introduced to solve one particular aspect of this problem, but to our understanding, it only addresses a specific sub-problem, and only works when the proxy in question is a membrane.

  • March 19, 2013

What is a membrane?

If you hang out long enough with a certain crowd (in my case, it was the ECMAScript TC39 committee), you will probably hear the term membrane tossed around. And eventually, you will start to wonder, “Well, what is a membrane, anyway?”

As is the case with many clever but simple ideas, membranes were first introduced as a footnote [1] in a PhD thesis. Suppose that you are building distributed system, in which you pass references to objects between two separate nodes. If I want to pass a reference to foo in process A to process B, I can hardly just hand over an address—the memory spaces are not the same! So instead, I need to create a wrapper object wrappedFoo representing foo in B, which knows how to access the original object in A. So far so good.

Now here’s the catch: what if I pass a reference to wrappedFoo back to process A? If I were not very clever, I’d do the same thing as I did originally: create a new wrapper object wrappedWrappedFoo in A which knows how to access wrappedFoo in B. But this is silly; really, when I cross back over to A, I want to get back the original foo object.

This wrap-unwrap behavior is precisely what a membrane is. We consider the original object foo to be “inside” the membrane (a so-called wet object), and as it exits the membrane, it is wrapped with its own little membrane. However, when the object returns to its original membrane, the wrapper goes away. Just like in biology!

http://upload.wikimedia.org/wikipedia/commons/thumb/1/1a/Endocytosis_types.svg/400px-Endocytosis_types.svg.png

There is one last operation, called a “gate”: this occurs when you invoke a method on a wrapped object. Since the wrapper cannot actually perform the method, it has to forward the request to the original object. However, the arguments of the method need to get wrapped (or unwrapped) as they get forwarded; as you might expect.

While I used an RPC-like system to demonstrate the basic principle of membranes, a more conventional use is to enforce access control. Membranes are quite important; Mozilla relies on them extensively in order to enforce access restriction between objects from different websites which may interact with each other, but need security checks. (Actually, did you know that Mozilla is using a capability-based system for their security? Kind of neat!) It’s important to notice that when we unwrap, we are skipping security checks—the only reason this is acceptable is because the only objects that will be able to access the unwrapped object are precisely those objects in the same domain. For a more modern treatment of the subject, check out a more recent paper, Trustworthy Proxies: Virtualizing Objects with Invariants, which includes a lucid explanation of membranes.

[1] Well, actually it was a figure; figure 9.3 on page 71, to be precise!

  • March 15, 2013

Kindle Paperwhite notes

Along with a Nexus 7, I also acquired a Kindle Paperwhite over winter break. (Wi-Fi only) I have been quite pleased by this purchase, though in an unexpected way: while I have not increased the number of books I read, the Kindle has materially changed how I read articles on the Internet. Not via their web browser, which is essentially unusable except for the simplest tasks, but via tools which take articles on the Internet and convert them into ebook form.

For blog posts I use Calibre with the Google Reader source. This has been revolutionary: I no longer read blog posts in my browser; instead, I bundle them up on my Kindle and read them at my leisure. This change has also meant that I read blog posts much more carefully (interfluidity used to only get a skim; now I can actually make it through the posts). This setup is a little nontrivial so I’ll probably describe it in a later blog post. (I used to use Klip.me, which had easy setup, but (1) it could not handle images (so equations and graphs are right out), and (2) it botched formatting of pre formatted text.)

For longform articles I use Longform, which serves up a interesting mix of in-depth reporting and nonfiction articles. They have a very convenient “Send to Kindle” button which I’m told is served by Readability; I wonder if I should add a button like that to my blog. I am also using Amazon’s Send to Kindle Firefox extension for my paywalled articles (primarily LWN.net), although its results seem a bit spottier than Readability.

For academic papers, the going has been a bit rough, but I have gotten decent results on two-column papers with cut2col, which manages to make text large enough to be readable in Kindle’s PDF reader. Landscape mode also helps a bit with reading PDFs. Generally, however, managing which PDFs are on my device is a bit of an open problem. I haven’t started using Calibre yet for this purpose, although I have used it to perform some conversions between ebook formats. These conversions don’t work particularly well, although they are usable. It’s well worth getting the latest version of Calibre, since there are some bugs in the current Quantal version; I use ppa:n-muench/calibre.

For textbooks, ebook editions are still extremely expensive. I would love to carry around all of the famous computer science textbooks in my pocket, but to date I don’t have any good way of doing so without breaking the bank. Books suffer similarly: it turns out I did most of my pre-Kindle reading borrowing books from libraries; however, I hear the Palo Alto public library does Kindle lending, and I intend to check them out at some point in time. The typesetting for public domain books put out by places like Project Gutenberg are notoriously spotty; Readability and similar services seem to do a much better job! Alas, the Stanford library system does not appear to carry ebooks. The Amazon free samples of books have also been good fun, and I’ve collected quite a few of them.

Some annoyances with the Kindle itself:

  • Amazon Special Offers, which I cannot seem to get rid of (apparently you can only pay the $20 to remove them if the Kindle is associated with the original Amazon account that purchased the device; which is not me)
  • Re-sorting behavior of the Recent view: if you have gone home after reading a book, the Kindle will briefly flash the old order of items before it reshuffles and moves the item you just finished reading to the front. If you attempt to click on a different item during this period, the click will not register until the new reshuffle has happened; this has frustratingly caused me to accidently click on the wrong item multiple times!
  • With hypertext articles, they tend to contain links, which means that if you are using a "tap" to advance to the next page, and a link appears under your finger and you will accidentally bring up your browser. This is quite annoying and I would just like to turn off links entirely. (Yes, I know you can just swipe, but that is annoying and I do not want to bother retraining myself.)
  • For certain formats, especially PDFs, page refresh speed is rather slow; this makes it difficult to rapidly flip through pages like you might in a real book. This is probably the primary downside of a Kindle as opposed to a traditional book; the other being the inability to see and rapidly flip to bookmarks (it takes more than one tap to move to a bookmark on a Kindle).
  • I have also jailbroken my Kindle, but there does not seem to be any interesting software to run with it.

All-in-all, I am quite pleased with this device, and I would recommend it to anyone interested in reducing the amount of time they spend staring at a computer monitor.

  • January 30, 2013