ezyang's blog

the arc of software bends towards understanding

Ely Cycles

Yesterday I cycled from Cambridge to Ely, and back again. The route is a glorious 38 miles (round trip) of British towns and countryside. The cycling around Cambridge is quite good, because there aren’t very many hills, and in the farmland areas you get to see the tractors rolling by. The longest I’d ever cycled before was along the Springwater Corridor in Portland, and the segment I did was only about 10 miles.

*Non sequitur.* ICFP submission deadline is today! Functional programmers around the world have reported a sharp increase in draft paper sightings.

Interleaving stream processors

Ghost in the state machine

A long time ago (circa 2007-2008), I wrote perhaps the single most complicated piece of code in HTML Purifier—one of those real monsters that you don’t think anyone else could ever understand and that you are really grateful you have a comprehensive test suite for. The idea was this: I had a state machine that modified a stream of tokens (since this was a stream of HTML tags and text, the state machine maintained information such as the current nesting stack), and I wanted to allow users to add extra functionality on top of this stream processor (the very first processor inserted paragraph tags when double-newlines were encountered) in a modular way.

The simplest thing I could have done was to abstract out the basic state machine, create a separate processor for every transformation I wanted to make, and run them one pass after another. But for whatever reason, this idea never occurred to me, and I didn’t want to take the performance hit of having to iterate over the list of tokens multiple times (premature optimization, much?). Instead, I decided to add hooks at various points in the original state machine, which plugins could latch onto to do their own stream transformations.

Even before I had more than one “injector”, as they are called, I had decided that I would have to do something sensible when one injector created a token that another injector could process. Suppose you ran injector A and then injector B: any tokens that injector A created would be picked up by injector B, but not vice versa. This seemed to me a completely arbitrary decision: could I make it so that order didn’t matter? The way I implemented this was to have the managing state machine figure out which new tokens any given injector had created, and then pass them to all the other injectors (being careful not to pass a token back to the originating injector). Getting this right was tricky; I originally stored the ranges of “already seen” tokens separately from the token stream itself, but as other injectors made changes it was extremely easy for these ranges to get out of sync, so I ended up storing the information in the tokens themselves. Another difficulty was preventing A from creating a token which B converts into another token which A converts yet again, and so on forever; the skip information therefore had to be carried over to derived tokens. (It seems possible that this excluded some beneficial interactions between two injectors, but I decided termination was more important.)
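To make the skip-set idea concrete, here is a minimal Haskell sketch (HTML Purifier itself is PHP, and its real injector API is considerably more involved; the names Token, Injector, runInjector and process are mine, invented for illustration). Every token carries the set of injectors that must skip it, and derived tokens inherit their parent’s skip set, which is exactly the trade-off described above: guaranteed termination at the cost of some potentially useful interactions.

import Data.Set (Set)
import qualified Data.Set as Set

type InjectorId = Int

data Token = Token { payload :: String, skipFor :: Set InjectorId }
  deriving (Show, Eq)

-- An injector rewrites one token into a replacement stream (possibly just itself).
type Injector = Token -> [Token]

-- Run one injector over a token.  New and rewritten tokens inherit the
-- parent's skip set plus the id of the injector that produced them, so a
-- token is never offered back to its creator and skip sets only ever grow.
runInjector :: InjectorId -> Injector -> Token -> [Token]
runInjector i inj t
  | i `Set.member` skipFor t = [t]        -- this injector has already seen it
  | otherwise =
      [ t' { skipFor = Set.insert i (skipFor t `Set.union` skipFor t') }
      | t' <- inj t ]

-- Keep offering every token to every injector until nothing changes.
-- Termination follows from the skip sets growing monotonically.
process :: [(InjectorId, Injector)] -> [Token] -> [Token]
process injectors = go
  where
    go ts = let ts' = foldl pass ts injectors
            in if ts' == ts then ts else go ts'
    pass ts (i, inj) = concatMap (runInjector i inj) ts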

Extra features also increased complexity. One particular feature needed the ability to rewind back to an earlier part of the stream and reprocess all of those tokens; since most other injectors wouldn’t be expecting to go back in time, I decided that it would be simplest to suspend all the other injectors while a rewind was in progress. I doubt there is any sort of formal semantics for what the system as a whole is supposed to do, but it seems to work in practice. After all, complexity isn’t created in one day: it evolves over time.

There are several endings to this story. One ending was that I was amused and delighted to discover that the problem of clients making changes, which are then recursively propagated to other clients, which can make further changes, is a fairly common occurrence in distributed systems. If you compress this into algorithmic form, you get (gasp) research papers like Lerner, Grove, and Chambers’s Composing Dataflow Analyses and Transformations, which I discovered when brushing against Hoopl. (Note: I am not belittling their work: dataflow analysis for programming languages is a lot more complicated than hierarchy analysis for HTML, and they allow a change made by A to affect a change B makes that can affect A again; they cut the knot by bounding the information lattice so that their analysis eventually terminates—probably something to talk about in another post.)

Another ending is that, fascinatingly enough, this complex system was actually the basis for the first external contribution to HTML Purifier. That contributor had the following to say about the system: “I have to say, I’m impressed with the design I see in HTML Purifier; this has been pretty easy to jump into and understand, once I got pointed to the right spot.” The evolution of systems with complex internal state is, apparently, quite well understood by practicing programmers, and I have seen experienced developers tackle this subsystem, usually with success. From an experience standpoint, I don’t find this too surprising—years after I originally wrote the code, it doesn’t take me too long to recall what was going on. But I do wonder if this is just the byproduct of many long hacking sessions on systems with lots of state.

The final ending is a what-if: “What if the older Edward came back and decided to rewrite the system?” It seems strange, but I probably wouldn’t have the patience to do it over again. Or I might have recognized the impending complexity and avoided it. But probably the thing that has driven me the most crazy over the years, and which is a technical problem in its own right, is that despite the stream-based representation, everything HTML Purifier processes is loaded into memory, and we don’t take advantage of streaming tokens at all. Annoying!

The return of Hellenistic reasoning

I recently attended a talk which discussed extending proof assistants with diagrammatic reasoning support, helping to break the hegemony of symbolic systems that is predominant in this field. While the work is certainly novel in some respects, I can’t help but think that we’ve come back full circle to the Ancient Greeks, who were big fans of geometry and its correspondingly visual form of reasoning. The thought came up again while I was reading a mathematics text and marveling at the multiple methods of presenting a single concept. In this essay, I’d like to look at this return to older, more “intuitive” forms of reasoning: I’ve called it “Hellenistic reasoning” because geometry and the Socratic method nicely sum up the visual and interactive reasoning that I’d like to discuss. I argue that this resurgence is a good thing, and that though these forms of reasoning may not be as powerful or general as symbolic reasoning, they will be critical to the application and communication of abstract mathematical results.

Symbolic reasoning involves syntactic manipulation of abstract symbols on a page; as Knuth puts it, it’s the sort of thing we do when we have a symbol that “appears in both the upper and the lower position, and we wish to have it appear in one place rather than two”. It’s a rather unnatural and mechanical mode of reasoning that most people have to be trained to do, but it is certainly one of the most popular and effective modes of reasoning. While I suspect that the deep insights that lead to new mathematical discoveries lie outside this realm, symbolic reasoning derives its power from being a common language which can be used as a communication channel to share these insights. Unlike natural language, it’s compact, precise and easy to write.

While symbols are an imperfect but serviceable basis for communication among the mathematically inclined, they strike fear and terror in the hearts of everyone else. A conference room of mathematicians will sigh in relief when the slide of axioms and equations shows up; a conference room of systems researchers will zone out when the same slide rolls around. Is this a sign to just give up and start using UML? My answer should be obvious: No! The benefits of a precise formalism are too great to give up. (What are those benefits? Unfortunately, that’s out of the scope of this particular essay.) Can we toss out the symbols but not the formalism? Maybe…

First on the list is visual reasoning. If you can find a visual proof for your problem, it can be quite elegant. Unfortunately, the key word is if: proofs with visual equivalents are the exception, not the rule. But there are some encouraging theorems and other efforts that show classes of statements can be “drawn as pictures on paper.” Here are some examples:

  • Theorem. Let $\mathcal{H}$ be the algebra of all open subsets (for the non-topologically inclined, you can think of these as sets minus their “edges”: the interior (Int) of arbitrary subsets) of the set $\mathbb{R}$ or $\mathbb{R}^2$ (or any Cartesian product of the above). Then $\mathcal{H} \vdash p$ if and only if $p$ is intuitionistically valid. What this means is that it is always possible to give a counterexample for an invalid formula over the real line or Euclidean plane. One example is Peirce’s law $((p \rightarrow q) \rightarrow p) \rightarrow p$: the counterexample on the real line is as follows: let the valuation of p be $\mathbb{R} - \lbrace 0\rbrace$ and the valuation of q be $\emptyset$. The recipe is to repeatedly apply our rules for combining these subsets, and then see if the result covers all of the real line (if it doesn’t, the formula is not intuitionistically valid). The rule for $A \rightarrow B$ is $\mathrm{Int}(-A \cup B)$, so we find that $p \rightarrow q$ is $\emptyset$, $(p \rightarrow q) \rightarrow p$ is $\mathbb{R}$, and the full expression is $\mathbb{R} - \lbrace 0\rbrace$, just one point shy of the full number line. (A small sketch of this calculation appears after this list.)
  • von Neumann famously remarked: “I do not believe absolutely in Hilbert space no more.” He was referring to the growing pains of the Hilbert space formalism for quantum mechanics. If you’ve ever played around with tensor products and qubits, you’ll know that it takes a lot of work to carry out even the simplest calculations. The quantum picturalism movement attempts to recast quantum computation in terms of a graphical language, with the goal of making simple theorems, well, simple. No word yet on whether they will actually be successful, but it’s a very interesting vein of investigation.
  • Commutative diagrams in category theory make proving properties a cinch: one simply needs to understand what it is that is being proved and draw the appropriate diagram! One particularly elegant proof of the Church-Rosser theorem (confluence of the untyped lambda calculus) uses a diagram chase after addressing one small technical detail.
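Here is the small sketch promised above, in Haskell (the representation and names are mine, not from any library). The three opens involved in the Peirce’s law counterexample happen to be closed under all the connectives, so we can evaluate the formula in the tiny Heyting algebra they form:

-- The three open sets in the counterexample, ordered by inclusion:
-- ∅ ⊂ ℝ∖{0} ⊂ ℝ.
data Open = Empty | RMinusZero | R
  deriving (Show, Eq, Ord)

-- Union and intersection are just max and min on this chain.
(\/), (/\) :: Open -> Open -> Open
(\/) = max
(/\) = min

-- Implication is Int(complement a ∪ b); on a chain of opens this works
-- out to: the whole space when a ⊆ b, and b otherwise.
(==>) :: Open -> Open -> Open
a ==> b | a <= b    = R
        | otherwise = b

-- Peirce's law under the valuation from the text: p = ℝ∖{0}, q = ∅.
peirce :: Open
peirce = ((p ==> q) ==> p) ==> p
  where p = RMinusZero
        q = Empty

-- ghci> peirce
-- RMinusZero   -- one point short of the whole line, so not valid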

We proceed on to interactive reasoning. These are methods that involve dialogues and games: they find their roots in game theory, but have since then popped up in a variety of contexts.

  • The act of creating a construction for a formula can be thought of as a dialogue between a prover (who claims to know how to produce the construction) and a skeptic (who claims the construction doesn’t exist). We can bundle up all of the information content of a proof by specifying a “strategy” for the prover, but the little bit of specificity that a particular instance of the game provides can be exceedingly enlightening. These games also highlight various “interesting” logical twists that may not necessarily be obvious from applying inference rules: ex falso sequitur quodlibet corresponds to tricking the skeptic into providing contradictory information, and classical logic permits Catch-22 tricks (see section 6.5 of Lectures on the Curry-Howard Isomorphism).
  • Game semantics gives meaning to program execution in terms of games. Some interesting results include increased expressiveness over the more traditional denotational semantics (game semantics is able to “see” if a function requests the value of one of its arguments), but in my opinion it is perhaps the most natural way to talk about laziness: when a thunk is forced, I am asking someone for an answer; if I never force the thunk, this conversation never happens.
  • In a much more mundane sense, traditional debuggers are an interactive reasoning system: the programmer converses with the program, asking questions and getting answers.

(It is that last sense that makes me wonder if interactive reasoning is something that could become very widely used among software engineers: if you want to reason about the correctness of a system as a whole using a game, you still need to prove facts about strategies in general, but in an adversarial context (for example, a compiler error) it could be very simple and useful. This is idle speculation: interactive error systems have been built before, e.g. LaTeX, and unfortunately, they’re not very good. One wonders why.)

I could end this essay with an admonition to “draw pictures and describe interactions”, but that would be very silly, since people do that already. What I would like to suggest is that there is rich theory that formalizes these two very informal activities, and this formalism is a good thing, because it makes our tools more precise, and thereby more effective.

Spring Reading: 2011 edition

Books are expensive, but by the power of higher education (also expensive, but differently so), vast oceans of books are available to an enterprising compsci. Here’s my reading list for the spring break lending period (many of which were recommended on #haskell):

image

  • Concepts, Techniques, and Models of Computer Programming by Peter Van Roy and Seif Haridi. Wonderfully iconoclastic book, and probably one of the easier reads on the list.
  • Types and Programming Languages by Benjamin Pierce. I’ve been working on this one for a while; this break I’m focusing on the proof strategies for preservation, progress and safety, and also using it to complement a self-study course summed up by the next book.
  • Lectures on the Curry-Howard Isomorphism by M.H. Sørensen and P. Urzyczyn. Very good; I’ve skimmed the first three chapters and I’m working on the exercises in chapter 2. I’ve been prone to making silly mis-assertions about the Curry-Howard Isomorphism (or is it?), so I’m looking forward to more firmly grounding my understanding of this correspondence. The sections on intuitionistic logic have already been very enlightening.
  • Type Theory and Functional Programming by Simon Thompson. Haven’t looked at it yet, but fits into the general course of the previous two books.
  • Purely Functional Data Structures by Chris Okasaki. Also one I’ve been working on for a while. Working on compressing all the information mentally.
  • Basic Category Theory for Computer Scientists by Benjamin Pierce. I’ve got two items on category theory; I got this one on a whim. Haven’t looked at it yet.
  • Pearls of Functional Algorithm Design by Richard Bird. Something like a collection of puzzles. I think I will enjoy reading through them and working out the subtleties. I probably won’t get to the information compression stage this time around.
  • Category Theory by Steve Awodey. I was working on the exercises in this textbook, and think I might get past the first chapter.

On expressivity

Wherein I make fun of functional programming advocates.

In this essay, I’d like to discuss the ideologies of “imperative programming” and “functional programming” in terms of the language features they lean on: in particular, the mechanisms by which they allow developers to express themselves in less code. I propose that the set of features that make up imperative programming constitute a dominant programming monoculture that is partially incompatible with functional programming’s favored features, requiring functional programming advocates to do funny things to gain the attention of the programmers.

To first give a flavor of expressiveness, here are some frequently seen language features that increase expressiveness:

  • Macros
  • Concurrency
  • Mutation
  • Indirection
  • Laziness
  • Dynamic typing
  • Polymorphism
  • Higher-order functions
  • Exceptions
  • Eval
  • Input/Output
  • Continuations
  • Anonymous functions

A few of those entries might make you laugh, because you might not understand how you could program without them. You may recognize a few that your favorite language supports well, a few that your language supports less well, and a few your language has no support for. The culture around the language will also have its folklore about what kinds of features are acceptable for use and which ones are not (think Pythonic or JavaScript: The Good Parts). The language you choose determines which features you get to know well.

Being expressive has a cost, which most developers figure out with a little experience in the field. There is a sort of natural selection going on here: language features that are well supported by languages, that other programmers know how to use, and that allow the job to get done are favored—in particular, the community effect reinforces the winner. As a result, we have a developer monoculture that is mostly comfortable with mutation, input/output, indirection, exceptions, polymorphism, etc. But even the bread-and-butter of current programming practice doesn’t come without cost: think about the famed division between “those who get pointers and those who don’t”, or the runtime costs of using exceptions in C++, or the representation complications of polymorphism (e.g. autoboxing in Java and Go’s lack thereof).

When someone does functional programming advocacy, what they’re really doing is asking you to look more closely at some of the other mechanisms we have for increasing expressiveness. You might feel like these are the only voices you hear, because there’s not much point advocating something that everyone uses already. And you might feel like the enthusiasm is unjustified, because the feature seems bizarrely complicated (continuations, anyone?) or you’ve tried using it in your favorite language, and there’s nothing more painful than seeing someone try to do functional programming in Python. Fact of the matter is, it’s not easy to add these extra language features to the existing monoculture. The new features interact in very complex and subtle ways.

This is why a functional programming advocate will often ask you to give up some of your old tools of expression. They will ask you to give up shared-state mutation, because otherwise handling concurrency is really damn hard. They will ask you to give up dynamic typing, because otherwise higher-order functions become much harder to reason about. The rhetoric will edge on the side of “stop doing that!” because it’s the common practice—they don’t actually mean “stop it entirely” but to the poor functional programming advocate it seems like a little bit of hyperbole is necessary to get the point across—and you can do some pretty amazing things with these extra features.

I encourage programmers to learn about as many ways to express themselves as possible, even if their language or workplace won’t necessarily allow them to use the method. The reasons are manifold:

  1. “Any complex enough program eventually contains a poorly written implementation of Lisp.” Like it or not, eventually you will be faced with a hard problem that is handily dispatched by one of these well studied language features, and if you’re going to have to implement it by hand, you might as well know what it’s going to look like from the start. As the Gang of Four once said, language features that are not actually supported by the language often show up as design patterns; knowing the pattern makes your code clearer and cleaner.
  2. Conversely, if you are reading someone else’s code and they resort to using one of these patterns, knowing how the feature should work will greatly aid comprehension.
  3. Libraries and frameworks are considered essential to the working developer’s toolbox, yet they seem to grow and be obsoleted at a dizzying rate. Language features are eternal: the anonymous functions of 1936 (when Alonzo Church invented the lambda calculus) are still the anonymous functions of today.
  4. Language features are fun to learn about! Unlike “yet another API to memorize”, a language feature will tickle your brain and make you think very hard about what is going on.

tl;dr Certain programming language features increase developer expressiveness, and the “imperative programming methodology” captures the dominant monoculture containing those language features in wide use. But there are other ways of expressing oneself, and programmers are encouraged to explore these methods even when practical considerations force them to give up some of their favorite expressive tools.

Someone is wrong on the Internet

The perverse incentive structure of the Internet

Suppose you have a Random Question™ that you would like answered. For example, “Did Sir Francis Galton have any children?” It’s the type of thing you’d type into Google.

image

Answers.com is not a terribly good source, but it is at least somewhere to see if you can scrounge up some extra information. So suppose you dig a little deeper and discover that Francis Galton married only once and that the marriage was barren. So unless some historian’s account of Galton suffering from venereal disease (Kevles 12) indicates that he knocked up some ladies in Syria and managed to pop out four offspring, who conveniently happened to be named “Francis, Harriet, Matilda and Mark,” Answers.com is wrong.

Not really surprising. But that’s not what this essay is about.

What this essay is about is what you’re supposed to do when you encounter flagrant and unabashed misinformation on the Internet: that is, someone is wrong on the Internet. There are a few obvious answers:

  • Disseminate corrective information,
  • Fix it, or
  • Ignore it.

image

Disseminating corrective information is really just a glorified term for “arguing”. And everyone knows that arguing on the Internet is perhaps one of the most undignified occupations one can assume. Nevertheless, it is an important mechanism for correcting incorrect information. As Djerassi once said (and I paraphrase), only a masochist would get into an argument with a fundamentalist about evolution, yet we would be much worse off if we had no such masochists willing to engage in this debate. For less contentious issues, the formula of the “dispelling myths” essay is a popular mechanism of presenting interesting, corrective information to a receptive audience.

I think the primary difficulties with this approach can be summed up in three words: cost, effectiveness and reach.

  • Cost. Writing up an effective rebuttal takes time. Arguing is a time-consuming activity. If it’s not something you particularly care about, it makes much more sense to update your own belief structure, and not bother attempting to convince other people about your updated belief.
  • Effectiveness. The truism is that “arguing with someone on the Internet is like arguing with a pig: it frustrates you and irritates the pig.” But social psychology research paints an even more troubling picture: even if you manage to convince someone that you’re right, their old (falsified) belief can still affect their decision-making process. Not only that, but true information may cause a person to believe more strongly in their false beliefs. Why bother?
  • Reach. Even if you write a stunning argument that convinces everyone who reads it, there is still the problem of dissemination. The Internet makes it extremely easy to pick and choose what you want to read, to aggregate news from like-minded individuals, a phenomenon Ethan Zuckerman calls homophily. The fear is that we get caught up in our own comfortable information loops, making us much dumber.

The notion that you might be able to directly fix it seems to be a relatively recent one, which came with the advent of wikis and the social web. Perhaps the ability to edit an incorrect information source without restriction is new, but the ability to go to the source, to write a letter to the editor of a textbook, has been around for a long time. There is something of a glorious tradition here, including things like Knuth’s reward checks, which offer fame and minuscule amounts of money to those who find errors. Sure, you might need to argue with one person, but that’s all you need, and afterwards, the original infelicity is corrected.

When you go to the source, you partially resolve the reach problem: new readers instantly have access to the corrected text (though this doesn’t do much for existing dead-tree copies). But cost and effectiveness are still present; perhaps even exacerbated. You have to convince a single person that you’re right; if you can’t manage that, you’ll have to fall back to a separate rebuttal. Sometimes content is abandoned, with the original author having no intention of updating it.

Unrestricted edit access has also given rise to edit warring. In this situation, you have to actively defend the modification you have made, and unfortunately, the winner is usually the one who has more time and resources. Edit warring is war by attrition.

Aside. There are some parallels with the world of open-source software. Fixing it is analogous to submitting a bug report or a patch; the need to convince the maintainers you are right translates into the act of writing a good bug report. If a project is abandoned, you’re out of luck; even if it’s not abandoned, you need to make the maintainer care about your specific problem. To fork is to disseminate corrective information, although there is an element of unrestricted edits mixed in with social coding websites that make it easy to find related copies.


Or you can not bother, and just ignore it. Maybe vent a little about the problem to your friends, who have no particular stake in the matter, file it away in your brain, and let life go on. In fact, this is precisely the disposition I see adopted by most of my colleagues. They don’t write for the public. They are more than happy to share their vast warehouse of knowledge and insights with their friends, with whom the exchange is usually effective and the cost not very high (they prefer informal mediums like face-to-face conversations and instant-messaging).

In fact, I’d hazard that only a fraction of what you might call the literati and experts of the world prolifically utilize this sort of broad-net information dissemination, that sites like Wikipedia, StackOverflow and Quora attempt to exploit. The incentive structure is simply not there! For the academic, communication with the general public is secondary to communication with your peers and your funders. For the entrepreneur, communication with your customers, and not the sort of communication we see here, is what is valued. For the software engineer, communication with upstream is beneficial insofar as it lets you stop maintaining your bug fixes. The mind boggles at how much untapped expertise there is out there (perhaps that’s what sites like Quora are going for.)

It would be folly to try to tell them all to behave otherwise. After all, from their point of view, this is the most rational use of their time. We can try to trick individuals into pouring this information out, in the form of social incentives and addictive triggers, but by and large I think these individuals will not bite. They’re too clever for that.

Better hope you catch them in a fit of irrationality.

Many-valued logics and bottom

I was flipping through An Introduction to Non-Classical Logic by Graham Priest and the section on many-valued logics caught my eye. Many-valued logics are logics with more than the usual two truth values true and false. The (strong) Kleene 3-valued logic K3 sets up the following truth tables over 0, 1 and x (which is thought of as some value that is neither true nor false):

NOT
    1 0
    x x
    0 1

AND
      1 x 0
    1 1 x 0
    x x x 0
    0 0 0 0

OR
      1 x 0
    1 1 1 1
    x 1 x x
    0 1 x 0

IMPLICATION
      1 x 0
    1 1 x 0
    x 1 x x
    0 1 1 1

I’ve always thought many-valued logics were a bit of a “hack” to deal with the self-referentiality paradoxes, but in fact, Kleene invented his logic by thinking about what happens when partial functions are applied to values they are not defined for: a sort of denotation failure. So it’s not surprising that these truth tables correspond to the parallel-or and parallel-and operators predicted by denotational semantics.

The reader is invited to consider whether or not one could use this logic for a Curry-Howard style correspondence; in particular, the law of the excluded middle is not valid in K3.
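For the curious, here is a tiny Haskell sketch of K3 (the type and names are mine): conjunction and disjunction come out as min and max under the ordering 0 < x < 1, which reproduces the tables above, and evaluating p ∨ ¬p shows the excluded middle failing at x.

-- Strong Kleene three-valued logic; the constructor order gives the
-- truth ordering 0 < x < 1.
data K3 = Zero | X | One
  deriving (Show, Eq, Ord)

k3not :: K3 -> K3
k3not One  = Zero
k3not Zero = One
k3not X    = X

-- Conjunction and disjunction are min and max on the truth ordering,
-- and implication is ¬a ∨ b; this reproduces the tables above.
k3and, k3or, k3imp :: K3 -> K3 -> K3
k3and = min
k3or  = max
k3imp a b = k3or (k3not a) b

-- The law of the excluded middle is not valid: p ∨ ¬p is x when p is x.
-- ghci> [ k3or p (k3not p) | p <- [Zero, X, One] ]
-- [One,X,One]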

Killer mutants attack (mutation gone wrong)

This is a collection of WTFs due to misuse of mutable state.


We’ll start off with some Java. What do you expect this snippet of code to do?

Sensor Accel = sm.getDefaultSensor(Sensor.TYPE_ACCELEROMETER);
Sensor Orient = sm.getDefaultSensor(Sensor.TYPE_ORIENTATION);
sm.registerListener((SensorEventListener) this, Accel, sm.SENSOR_DELAY_FASTEST);

Ostensibly, it registers the current object to receive just accelerometer updates. But what if I told you getDefaultSensor was implemented like this:

public Sensor getDefaultSensor (int type){
        if(sensors == null) {
                sensors = new Sensor(mContext, type);
                return sensors;                 
        }else if(sensors.checkList(type)){
                sensors.addSensor(type);
                return sensors;
        }else{
                sensors.removeSensor(type);
                return sensors;
        }
}

This code completely fails to implement the expected semantics: there is a single, SensorManager-wide Sensor object (stored in sensors) that accumulates sensor types as getDefaultSensor is called. So in fact, this listener will receive events from both the accelerometer and the orientation sensor. The only saving grace is that when we register event listeners, we usually do want them to receive all events, so we might not notice if we weren’t looking too closely. This is real code from OpenIntents SensorSimulator.


Lest you think I only make fun of other people’s code, here is a diff from a project of my own:

@@ -197,13 +197,7 @@ def upload_all(tree, ftp, base):

     ftp.cwd(base)
     for blob in tree.blobs:
-        logging.info('Uploading ' + '/'.join((base, blob.name)))
-        try:
-            ftp.delete(blob.name)
-        except ftplib.error_perm:
-            pass
-        ftp.storbinary('STOR ' + blob.name, blob.data_stream)
-        ftp.voidcmd('SITE CHMOD ' + format_mode(blob.mode) + ' ' + blob.name)
+        upload_blob(blob, ftp, base)

@@ -260,11 +254,25 @@ def upload_diff(diff, tree, ftp, base):
             node = subtree/components[-1]
             assert isinstance(node, Blob)

-            logging.info('Uploading ' + full_path)
-            ftp.storbinary('STOR ' + file, node.data_stream)
-            ftp.voidcmd('SITE CHMOD ' + format_mode(node.mode) + ' ' + file)
-            # Don't do anything if there isn't any item; maybe it
-            # was deleted.
+            upload_blob(node, ftp, base)

It looks fairly plausible: I’ve factored out some common storbinary logic. Can you tell what the bug is? Here’s a hint.

The problem is that upload_all changes the current working directory on the FTP connection (mutable state!), while upload_diff does not (it works entirely from the base working directory). The factored-out upload_blob assumed upload_all-style working directory changes, and so all upload_diff uploads were dumped in the base directory. Mutability hurts modularity! The fix was to get rid of this mutation and manually calculate the full path; this also removed some delicate invariant preservation in the original upload_all implementation.


Paradoxically enough, though Haskell encourages you not to use mutation, when you do use it, Haskell’s expressive static type system gives you the unusual ability to statically encode complicated invariants about your mutation—invariants that would not have been necessary if you hadn’t used mutation. A small example of this is the ST monad, which uses rank-2 types to ensure that references to mutable memory cannot escape runST, the isolated “mutation thread.”
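As a small, self-contained illustration (standard library only, nothing project-specific), here is the usual counting example; the commented-out line is the kind of escape that the rank-2 type of runST rules out.

import Control.Monad.ST
import Data.STRef

-- Honest mutation, sealed inside runST: sum a list with a mutable accumulator.
sumST :: Num a => [a] -> a
sumST xs = runST $ do
  acc <- newSTRef 0
  mapM_ (\x -> modifySTRef acc (+ x)) xs
  readSTRef acc

-- This, however, is rejected by the type checker: the reference's phantom
-- type 's' would have to escape the forall in runST's type.
-- leak = runST (newSTRef 0)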

In the limit, you may find yourself knee-deep in advanced type system features if you try to statically rule out incorrect usages of a mutable API. I found this out when I worked with abcBridge and tried very hard to use types to prevent improper use of the underlying C library. Here is one relevant code quote:

-- | When you duplicate a network, node indexes may change, so if you
-- would like to use old references, they need to be translated first.
-- You can read this type as @Node n -> NT n2 (Node n2)@ or @Node n ->
-- NQ n2 (Node n2)@, with the condition that the @n2@ index was one
-- generated by 'branch' or 'speculate'.
translate :: (G.NetworkMonad m AIG (Dup n n2)) => Node n -> m AIG (Dup n n2) (Node (Dup n n2))

This truly is some WTF, rank-2-phantom-types code, but it grew out of a very specific bug I stumbled onto and was unconvinced I’d remember to avoid in the future (can you guess what it was?). A curious reader may ask: why do I need to duplicate networks in the first place? Because some operations the underlying library provides are destructive, and the only way I can provide the illusion of persistent networks is to duplicate before destroying.


In summary:

  • Mutation is frequently not what people expect,
  • Mutation is not modular, and
  • Mutation is complicated.

Avoid it when you can!

Type Technology Tree

They say that one doesn’t discover advanced type system extensions: rather, the type system extensions discover you! Nevertheless, it’s worthwhile to know what the tech tree for GHC’s type extensions looks like, so you can decide how much power (and the correspondingly headache-inducing error messages) you need. I’ve organized the relations in the following diagrams with these criteria in mind:

  1. Some extensions automatically enable other extensions (implies);
  2. Some extensions offer all the features another extension offers (subsumes);
  3. Some extensions work really nicely with other extensions (synergy);
  4. Some extensions offer equivalent (but differently formulated) functionality to another extension (equiv).

It’s also worth noting that the GHC manual divides these extensions into “Extensions to data types and type synonyms”, “Class and instances declarations”, “Type families” and “Other type system extensions”. I have them organized here a little differently.

Rank and data

Our first tech tree brings together two extensions: arbitrary-rank polymorphism and generalized algebraic data types.

image

Briefly:

  • GADTSyntax permits ordinary data types to be written GADT-style (with explicit constructor signatures): data C where C :: Int -> C
  • ExplicitForall allows you to explicitly state the quantifiers in polymorphic types: forall a. a -> a
  • ExistentialQuantification allows types to be hidden inside a data constructor: data C = forall e. C e
  • GADTs permits explicit constructor signatures whose result types refine the type parameters: data C a where C :: C a -> C b -> C (a, b). Subsumes ExistentialQuantification because existentially quantified data types are simply polymorphic constructors for which the type variable isn’t in the result.
  • PolymorphicComponents allows you to write forall inside data type fields: data C = C (forall a. a)
  • Rank2Types allows polymorphic arguments: f :: (forall a. a -> a) -> Int -> Int. This with GADTs subsumes PolymorphicComponents because data type fields with forall within them correspond to data constructors with rank-2 types.
  • RankNTypes allows foralls to be nested to arbitrary depth on the left of arrows: f :: ((forall a. a -> a) -> Int) -> Int
  • ImpredicativeTypes allows polymorphic functions and data structures to be parametrized over polymorphic types: Maybe (forall a. a -> a)
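To make a few of these concrete, here is a small compilable sketch (the types and names are mine, chosen only for illustration): an existential written in both the classic and the GADT style, plus a rank-2 function.

{-# LANGUAGE GADTs, RankNTypes, ExistentialQuantification #-}

-- An existential: the element type is hidden, so all we can do is show it.
data Showable = forall a. Show a => MkShowable a

-- The same thing in GADT syntax: the type variable simply does not occur
-- in the result type of the constructor.
data Showable' where
  MkShowable' :: Show a => a -> Showable'

-- A rank-2 type: the argument must itself be polymorphic, so we can use
-- it at two different types in the body.
applyBoth :: (forall a. a -> a) -> (Int, Bool) -> (Int, Bool)
applyBoth f (n, b) = (f n, f b)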

Instances

Our next tech tree deals with type class instances.

image

Briefly:

  • TypeSynonymInstances permits macro-like usage of type synonyms in instance declarations: instance X String
  • FlexibleInstances allows more instances for more interesting type expressions, with restrictions to preserve decidability: instance MArray (STArray s) e (ST s) (frequently seen with multi-parameter type classes, which are not in the diagram)
  • UndecidableInstances allows instances for more interesting type expressions with no restrictions, at the cost of decidability. See Oleg for a legitimate example.
  • FlexibleContexts allows more type expressions in constraints of functions and instance declarations: g :: (C [a], D (a -> b)) => [a] -> b
  • OverlappingInstances allows instances to overlap if there is a most specific one: instance C a; instance C Int
  • IncoherentInstances allows instances to overlap arbitrarily.

Perhaps conspicuously missing from this diagram is MultiParamTypeClasses, which appears below.
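A small example of the instance-head extensions in action (the Pretty class is mine, invented only for illustration):

{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}

class Pretty a where
  pretty :: a -> String

-- String is a synonym for [Char], which is neither a bare variable nor a
-- constructor applied to distinct variables, so this instance head needs
-- both of the extensions above.
instance Pretty String where
  pretty = id

instance Pretty Int where
  pretty = show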

Type families and functional dependencies

Our final tech tree addresses programming with types:

image

Briefly:

  • KindSignatures permits stating the kind of a type variable: m :: * -> *
  • MultiParamTypeClasses allow type classes to range over multiple type variables: class C a b
  • FunDeps allow restricting instances of multi-parameter type classes, helping resolve ambiguity: class C a b | a -> b
  • TypeFamilies allow “functions” on types: data family Array e

The correspondence between functional dependencies and type families is well known, though not perfect (type families can be more wordy and can’t express certain equalities, but play more nicely with GADTs).
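Here is a sketch of that correspondence, with the same toy collection class written both ways (class and method names are mine):

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, TypeFamilies,
             FlexibleInstances #-}

-- With a functional dependency, the element type is an extra class
-- parameter that is determined by the collection type.  (The bare element
-- variable in the instance head is why FlexibleInstances is enabled.)
class CollectionFD c e | c -> e where
  emptyFD  :: c
  insertFD :: e -> c -> c

instance CollectionFD [a] a where
  emptyFD  = []
  insertFD = (:)

-- With an (associated) type family, the element type is a function on types.
class CollectionTF c where
  type Elem c
  emptyTF  :: c
  insertTF :: Elem c -> c -> c

instance CollectionTF [a] where
  type Elem [a] = a
  emptyTF  = []
  insertTF = (:)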

Petri net concurrency

A petri net is a curious little graphical modeling language for control flow in concurrency. They came up in this talk a few weeks ago: Petri-nets as an Intermediate Representation for Heterogeneous Architectures, but what I found interesting was how I could describe some common concurrency structures using this modeling language.

Here is, for example, the well venerated lock:

image

The way to interpret the graph is thus: each circle is a “petri dish” (place) that may contain some number of tokens. The square boxes (transitions) are actions that would like to fire, but in order to do so all of the petri dishes feeding into them must have tokens. It’s the sort of representation that you could make into a board game of sorts!

If multiple transitions can fire, we pick one of them and only that one succeeds; the ability for a token to flow down one arrow or another encodes nondeterminism in this model. In the lock diagram, only one branch can grab the lock token in the middle, but it returns the token once it exits the critical area (unlock).

Here is a semaphore:

image

It’s exactly the same, except that the middle place may contain more than one token. Of course, no one said that separate processes must wait before signalling. We can implement a simple producer-consumer chain like this:

image

Note that petri net places are analogous to MVar (), though it takes a little care to ensure we are not manufacturing tokens out of thin air in Haskell, due to the lack of linear types. You may also notice that petri nets say little about data flow; we can imagine the tokens as data, but the formalism doesn’t say much about what the tokens actually represent.
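To make the MVar () analogy concrete, here is a minimal Haskell sketch of the lock and the counting semaphore (the helper names newLock, withLock, newSemaphore, wait and signal are mine): a token is just a (), and a transition firing corresponds to taking or putting one.

import Control.Concurrent
import Control.Monad (replicateM_)

-- A lock is a place holding exactly one token: taking the token is "lock",
-- putting it back is "unlock".  (No exception safety here; withMVar or
-- bracket would handle that in real code.)
type Lock = MVar ()

newLock :: IO Lock
newLock = newMVar ()

withLock :: Lock -> IO a -> IO a
withLock l act = do
  takeMVar l        -- consume the token; blocks if another branch holds it
  r <- act          -- critical section
  putMVar l ()      -- return the token to the place
  return r

-- A counting semaphore is a place that may hold several tokens; a Chan ()
-- preloaded with n tokens plays that role.  Nothing stops a rogue thread
-- from signalling without a matching wait, which is the "manufacturing
-- tokens out of thin air" problem that the lack of linear types leaves open.
newSemaphore :: Int -> IO (Chan ())
newSemaphore n = do
  c <- newChan
  replicateM_ n (writeChan c ())
  return c

wait, signal :: Chan () -> IO ()
wait   = readChan           -- consume a token
signal = flip writeChan ()  -- add a token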