ezyang's blog

Tension of patch and tree version control

September 20, 2010

This post is not meant as a rag on Darcs, just a observation of the difference between two philosophies of version control. Also, I’m a bit new to Darcs, so there might be some factual inaccuracies. Please let me know about them!

At some point, I would like to write a Darcs for Git users guide, distilling my experiences as an advanced Git user wrangling with Darcs. But perhaps the most important take away point is this: don’t try to superimpose Git’s underlying storage model on Darcs! Once I realized this point, I found Darcs fit rather nicely with my preferred Git development style—constant rebasing of local patches until they hit the official repository.

How does this rebasing workflow work? Despite the funny name, it’s a universal workflow that predates version control: the core operation is submit a patch. That is, after you’re done hacking and recompiling and you’ve cleaned up your changes, you pull up the original copy of the repository, generate a unified diff, and send it off to the official mailing list. If the unified diff doesn’t apply cleanly to whatever the official development version is, upstream will ask you to make the patch apply to the newer version of the software.

Git streamlines this workflow with rebases. As the name suggests, you are changing the base commit that your patches are applied to. The identity of the patch is more important than the actual “history” of the repository. Interactive rebases allow you to reorder patches, and slice and dice history into something pretty for upstream to read.

Because Git supports both tree-based and patch-based workflows, the tension between the two schools of thought is fairly evident. Old commit objects become unreachable when you rebase, and you have to rely on mechanisms like the reflog to retrieve your old trees. Good practice is to never rebase published repositories, because once you’ve published a consistent history is more important than a pretty one.

Darcs only supports the patch-based workflow. It’s hard to keep your patches nicely ordered like you must do when you rebase, but there’s no need to: darcs send --dry-run will let you know what local patches that haven’t been put into the upstream repository are floating around, and essentially every interesting command asks you to explicitly specify what patch you are referring to with -p. Darcs makes it easy to merge and split patches, and edit old patches even if they’re deep within your darcs changes log.

However, there are times when I really do miss the tree-based model: in particular, while it’s easy to get close, there’s no easy way to get precisely the makeup of the repository as it was two days ago (when, say, your build was still working). The fact that Git explicitly reifies any given state your repository is in as a tree object makes the patch abstraction less fluid, but means you will never ever lose committed data. Unfortunately, with Darcs, there is no shorthand for “this particular repository state”; you might notice the patches that darcs send have to explicitly list every patch that came before the particular patch you’re sending out. In this way, I think Darcs is doing too much work: while the most recent N changes should be thought of as patches and not tree snapshots, I probably don’t care about the ancient history of the project. Darcs already supports this with tags, but my experience with fast moving repositories like GHC indicates to me that you also want a timeline of tags tracking the latest “official” repository HEAD.

There is also the subject of conflict resolution, but as I have not run into any of the complicated cases, I have little to say here.

Session types, subtyping and dependent types

September 17, 2010

While I was studying session type encodings, I noticed something interesting: the fact that session types, in their desire to capture protocol control flow, find themselves implementing something strongly reminiscent of dependent types.

Any reasonable session type encoding requires the ability to denote choice: in Simon Gay’s paper this is the T-Case rule, in Neubauer and Thiemann’s work it is the ALT operator, in Pucella and Tov’s implementation it is the :+: type operator, with the offer, sel1 and sel2 functions. There is usually some note that a binary alternation scheme is—in terms of user interface—inferior to some name-based alternation between an arbitrary number of cases, but that the latter is much harder to implement.

What the authors of these papers were really asking for was support for something that smells like dependent types. This becomes far more obvious when you attempt to write a session type encoding for an existing protocol. Consider the following tidbit from Google’s SPDY:

Once a stream is created, it can be used to send arbitrary amounts of data. Generally this means that a series of data frames will be sent on the stream until a frame containing the FLAG_FIN flag is set. The FLAG_FIN can be set on a SYN_STREAM, SYN_REPLY, or a DATA frame. Once the FLAG_FIN has been sent, the stream is considered to be half-closed.

The format for a data frame is:

+----------------------------------+
|C|       Stream-ID (31bits)       |
+----------------------------------+
| Flags (8)  |  Length (24 bits)   |
+----------------------------------+
|               Data               |
+----------------------------------+

Whereas offer is implemented by transmitting a single bit across the network, here, the critical bit that governs whether or not the stream will be closed is embedded deep inside the data. Accordingly, if I even want to consider writing a session type encoding, I have to use a data definition with an extra phantom type in it, and not the obvious one:

data DataFrame fin = DataFrame StreamId FlagFin Data

I’ve had to promote FlagFin from a regular term into a type fitting into the fin hole, something that smells suspiciously of dependent types. Fortunately, the need for dependent types is averted by the fact that the session type will immediately do a case split on the type, accounting for both the case in which it is true and the case in which it is false. We don’t know at compile time what the value will actually be, but it turns out we don’t care! And if we are careful to only permit fin to be TrueTy when FlagFin is actually True, we don’t even need to have FlagFin as a field in the record.

This observation is what I believe people are alluding to when they say that you can go pretty far with type tricks without resorting to dependent types. Pushing compile-time known values into types is one obvious example (Peano integers, anyone?), but in this case we place compile-time unknown values into the types just by dealing with all possible cases!

Alas, actually doing this in Haskell is pretty awkward. Consider some real-world algebraic data type, a simplified version of the SPDY protocol that only allows one stream at a time:

data ControlFrame = InvalidControlFrame
                  | SynStream FlagFin FlagUnidirectional Priority NameValueBlock
                  | SynReply FlagFin NameValueBlock
                  | RstStream StatusCode
                  | Settings FlagSettingsClearPreviouslyPersistedSettings IdValuePairs
                  | NoOp
                  | Ping Word32
                  | Headers NameValueBlock
                  | WindowUpdate DeltaWindowSize

Each constructor needs to be turned into a type, as do the FlagFin, but it turns out the other data doesn’t matter for the session typing. So we end up writing a data declaration for each constructor, and no good way of stitching them back together:

data RstStream
data SynStream fin uni = SynStream Priority NameValueBlock
data SynReply fin = SynReply NameValueBlock
...

The thread we are looking for here is subtyping, specifically the more exotic sum-type subtyping (as opposed to product-type subtyping, under the more usual name record subtyping). Another way of thinking about this is that our type now represents a finite set of possible terms that may inhabit a variable: as our program evolves, more and more terms may inhabit this variable, and we need to do case-splits to cut down the possibilities to a more manageable size.

Alas, I hear that subtyping gunks up inference quite a bit. And, alas, this is about as far as I have thought it through. Doubtless there is a paper that exists out there somewhere that I ought to read that would clear this up. What do you think?

Evolution of a Shared Web Host

September 15, 2010

Edward continues his spree of systems posts. Must be something in the Boston air.

Yesterday, I gave a SIPB cluedump on the use and implementation of scripts.mit.edu, the shared host service that SIPB provides to the MIT community. I derive essentially all of my sysadmin experience points from helping maintain this service.

Scripts is SIPB’s shared hosting service for the MIT community. However, it does quite a bit more than your usual $10 host: what shared hosting services integrate directly with your Athena account, replicate your website on a cluster of servers managed by Linux-HA, let you request hostnames on *.mit.edu, or offer automatic installs of common web software, let you customize it, and still upgrade it for you? Scripts is a flourishing development platform, with over 2600 users and many interesting technical problems.

I ended up splitting up the talk into two segments: a short Scripts for Power Users presentation, and a longer technical piece named Evolution of a Shared Web Host. There was also a cheatsheet handout passed out in the talk.

Among the technologies discussed in this talk include Apache, MySQL, OpenAFS, Kerberos, LDAP and LVS.

Keyword arguments in Haskell

September 13, 2010

Keyword arguments are generally considered a good thing by language designers: positional arguments are prone to errors of transposition, and it’s absolutely no fun trying to guess what the 37 that is the third argument of a function actually means. Python is one language that makes extensive use of keyword arguments; they have the following properties:

Functions are permitted to be a mix of positional and keyword arguments (a nod to the compactness of positional arguments),
Keywords are local to any given function; you can reuse a named function argument for another function,
In Python 3.0, you can force certain arguments to only be specifiable with a keyword.

Does Haskell have keyword arguments? In many ways, they’re much less necessary due to the static type system: if you accidentally interpose an Int and Bool your compiler will let you know about it. The type signature guides you!

Still, if we were to insist (perhaps our function took many arguments of the same type), one possibility is to pass a record data type in as the sole argument, but this is a little different than Python keyword arguments in the following ways:

There is a strict delineation between positional and keywords: either you can specify your record entirely with keywords or entirely with positional arguments, but you can’t do both,
Record fields go into the global namespace, so you have to prefix/suffix them with some unique identifier, and
Even with named records, a user can still choose to construct the record without specifying keyword arguments. For large argument lists, this is not as much of an issue, but for short argument lists, the temptation is great.

I find issue two to be the reason why I don’t really employ this trick; I would find it quite annoying to have to make a data structure for every function that I wanted to use named arguments with.

I’d like to suggest another trick to simulate named arguments: use newtypes! Consider this undertyped function:

renderBox :: Int -> Int -> Int -> Int -> IO ()
renderBox x y width height = undefined

main = renderBox 2 4 50 60

We can convert it to use newtypes like this:

newtype XPos = XPos Int
newtype YPos = YPos Int
newtype Width = Width Int
newtype Height = Height Int

renderBox :: XPos -> YPos -> Width -> Height -> IO ()
renderBox (XPos x) (YPos y) (Width width) (Height height) = undefined

main = renderBox (XPos 2) (YPos 4) (Width 50) (Height 60)

Unlike the usual use of newtypes, our newtypes are extremely short-lived: they last just long enough to get into the body of renderBox and then they are pattern matched to oblivion: the function body can rely on good local variable names to do the rest. But it still manages to achieve the goals of keyword arguments: any call to renderBox makes it crystal clear what each integer means. We also maintain the following good properties:

If the type already says all you need to say about an argument, there’s no need to newtype it again. Thus, you can have a mix of regular and newtype arguments.
Newtypes can be reused. Even further, they are only to be reused when the semantic content of their insides is the same, which encourages good naming practices.
The user is forced to do the newtype wrapping: there’s no way around it. If you publish smart constructors instead of the usual constructors, you can factor out validation too.

Newtypes are so versatile!

Towards platform-agnostic interruptibility

September 10, 2010

Last post, I talked about some of the notable difficulties in emulating pthread_cancel on Windows. Today, I want to talk about what a platform agnostic compiler like GHC actually ought to do. Recall our three design goals:

GHC would like to be able to put blocking IO calls on a worker thread but cancel them later; it can currently do this on Linux but not on Windows,
Users would like to write interrupt friendly C libraries and have them integrate seamlessly with Haskell’s exception mechanism, and
We’d like to have the golden touch of the IO world, instantly turning blocking IO code into nice, well-behaved, non-blocking, interruptible code.

I am going to discuss these three situations, described concisely as blocking system calls, cooperative libraries and blocking libraries. I propose that, due to the lack of a cross-platform interruption mechanism, the correct interruptibility interface is to permit user defined handlers for asynchronous exceptions.

Interruptible blocking system calls. In the past, GHC has had some bugs in which a foreign call to a blocking IO system call caused Windows to stop being interruptible. This is a long standing difference in asynchronous IO philosophy between POSIX and Windows: POSIX believes in functions that look blocking but can be interrupted by signals, while Windows believes in callbacks. Thus, calls that seem harmless enough actually break interruptibility and have to be rewritten manually into a form amenable to both the POSIX and Windows models of asynchrony.

While it is theoretically and practically possible to manually convert every blocking call (which, by the way, works perfectly fine on Linux, because you can just send it a signal) into the asynchronous version, but this is very annoying and subverts the idea that we can simply ship blocking calls onto another thread to pretend that they’re nonblocking.

Since Windows Vista, we can interrupt blocking IO calls using a handy new function CancelSynchronousIo. Notice that cancelling IO is not the same as cancelling a thread: in particular, the synchronous operation merely returns with a failure with the last error set to ERROR_OPERATION_ABORTED, so the system call needs to have been performed directly by GHC (which can then notice the aborted operation and propagate the interrupt further) or occur in C code that can handle this error condition. Unfortunately, this function does not exist on earlier versions of Windows.

Aside. Is there anything we can do for pre-Vista Windows? Not obviously: the under-the-hood changes that were made to Windows Vista were partially to make a function like CancelSynchronousIo possible. If we were to enforce extremely strong invariants on when we call TerminateThread; that is, we’d have to manually vet every function we consider for termination, and at that point, you might as well rewrite it asynchronous style.

Interruptible cooperative libraries. This is the situation where we have a C library that we have a high level of control over: it may be our own library or we may be writing a intermediate C layer between GHC and an expressive, asynchronous underlying library. What we would like to do is have GHC seamlessly convert its asynchronous exceptions into some for that our C can notice and act gracefully upon.

As you may have realized by now, there are a lot of ways to achieve this:

Signals. POSIX only, signals can be temporarily blocked with sigprocmask or pthread_sigmask and a signal handler can be installed with sigaction to cleanup and possible exit the thread or long jump.
Pthread cancellation. POSIX only, cancellation can be temporarily blocked with pthread_setcanceltype and a cancellation handler can be installed with pthread_cleanup_push.
Select/poll loop. Cancellation requests are sent over a socket which is being polled, handler can choose to ignore them.
Event objects. Windows only, threads can receive cancellation requests from the handle from OpenEvent but choose to ignore them.
IO cancellation. Windows Vista only, as described above.
Completion queue. Windows only, similar to select/poll loop.

It doesn’t make much sense to try to implement all of these mechanisms natively. Therefore, my proposal: have GHC call a user-defined function in a different thread upon receipt of an asynchronous function and let the user figure out what to do. In many ways, this is not much of a decision at all: in particular, we’ve asked the programmer to figure it out for themselves. Libraries that only worked with POSIX will still only work with POSIX. However, this is still an advance from the current state, which is that asynchronous exceptions necessarily behave differently for Haskell and FFI code.

Interruptible blocking libraries. Because blocking IO is much easier to program than non-blocking IO, blocking interfaces tend to be more prevalent and better tested. (A friend of mine who spent the summer working on Chromium had no end of complaints about the bugginess of the non-blocking interface to NSS.) It might be practical to rewrite a few system calls into asynchronous style, but when you have a blob of existing C code that you want to interface with, the maintenance cost of such a rewrite quickly becomes untenable. What is one to do?

Alas, there is no magic bullet: if the library never had any consideration for interruptibility, forcibly terminating it is more than likely to leave your program in a corrupted state. For those who would like to play it fast and loose, however, the user-defined function approach would still let you call TerminateThread if you really wanted to.

In conclusion, I propose that the interruptibility patch be extended beyond just a simple interruptible keyword to allow a user-defined asynchronous exception handler, compiled against the RTS, as well as providing a few built-in handlers which provide sensible default behaviors (both platform specific and non-platform specific, though I expect the latter to give much weaker guarantees).

pthread_cancel on Windows

September 8, 2010

Edward, I’m afraid I have some bad news. Your interruptible GHC patch; it was involved in a terrible accident on the way to Windows portability. I hope you understand: we’re doing our best to patch it up, but there have been some complications…

Pop quiz! What does this pthreads code do? :

#include <pthread.h>
#include <stdio.h>

void *thread1(void *arg) { sleep(10000); }
void *thread2(void *arg) { while (1) {} }

void *psycho_killer(void *arg) {
    pthread_t *id = (pthread_t*)arg;
    pthread_cancel(*id);
    printf("[%p] Psycho killer...\n", id);
    pthread_join(*id, NULL);
    printf("[%p] ...qu'est-ce que c'est.\n", id);
}

int main(char* argv, int argc) {
    pthread_t t1, t2, k1, k2;
    pthread_create(&t1, NULL, thread1, NULL);
    printf("[%p] I can't sleep 'cause my bed's on fire\n", &t1);
    pthread_create(&t2, NULL, thread2, NULL);
    printf("[%p] Don't touch me I'm a real live wire\n", &t2);
    pthread_create(&k1, NULL, psycho_killer, &t1);
    pthread_create(&k2, NULL, psycho_killer, &t2);
    pthread_join(k1, NULL);
    pthread_join(k2, NULL);
    printf("Run run run away!\n");
    return 0;
}

It never manages to terminate the second thread…

ezyang@javelin:~/Desktop$ ./test 
[0xbf900b4c] I can't sleep 'cause my bed's on fire
[0xbf900b48] Don't touch me I'm a real live wire
[0xbf900b4c] Psycho killer...
[0xbf900b4c] ...qu'est-ce que c'est.
[0xbf900b48] Psycho killer...
^C

If you just had the pthread_cancel and the pthread_setcancelstate manpages, this might seem a little mysterious. The pthreads page, however, makes things clear: sleep is among one-hundred and two “cancellable” functions, which pthread_cancel must terminate within if a thread’s cancellability status is PTHREAD_CANCEL_DEFERRED (there are another two-hundred and forty-two which may or may not be cancelled). If the thread is stuck in userspace, it has to explicitly allow a deferred cancellation with pthread_testcancel. Previous versions of the POSIX spec were a little unclear whether or not cancellation should take place upon entry to the system call, or while the system call was running, but the 2008 spec is fairly clear:

Cancellation points shall occur when a thread is executing the following functions…

The million-dollar question is: “Can we implement the same semantics on Windows?” Actually, since it seems that a lot of people would have wanted pthreads functionality on Windows, you would think that this has been already been implemented by pthreads-win32. We turn to the source! :

if (tp->cancelType == PTHREAD_CANCEL_ASYNCHRONOUS
    && tp->cancelState == PTHREAD_CANCEL_ENABLE
    && tp->state < PThreadStateCanceling)
  {
    /* snip  */
  }
else
  {
    /*
     * Set for deferred cancellation.
     */
    if (tp->state < PThreadStateCancelPending)
  {
    tp->state = PThreadStateCancelPending;
    if (!SetEvent (tp->cancelEvent))
      {
        result = ESRCH;
      }
  }
    else if (tp->state >= PThreadStateCanceling)
  {
    result = ESRCH;
  }

    (void) pthread_mutex_unlock (&tp->cancelLock);
  }

Interestingly enough, pthreads-win32 doesn’t seem to do anything special: when we translate our test program and run it with pthreads-win32, it gets stuck on the Sleep call as well:

C:\Users\ezyang\pthreads-win32\Pre-built.2\lib>test.exe
[0022FF40] I can't sleep 'cause my bed's on fire
[0022FF38] Don't touch me I'm a real live wire
[0022FF40] Psycho killer...
[0022FF38] Psycho killer...
^C

At this point, it’s worth stepping back for a moment and asking, “What are we really trying to do here?” If you were to ask how to terminate threads on, say, Stack Overflow, you’d get a bunch of responses telling you, “Stop that and do it the right way”; namely, by explicitly handling thread termination on the thread itself via another message passing mechanism.

So there are number of different needs for interruptible calls:

GHC would like to be able to put blocking IO calls on a worker thread but cancel them later; it can currently do this on Linux but not on Windows,
Users would like to write interrupt friendly C libraries and have them integrate seamlessly with Haskell’s exception mechanism, and
We’d like to have the golden touch of the IO world, instantly turning blocking IO code into nice, well-behaved non-blocking code.

Next time I’ll talk about what different approaches might be needed for each of these goals.

Embracing Windows

September 6, 2010

Some things come round full circle.

As a high schooler, I was a real Windows enthusiast. A budding programmer, I accumulated a complete development environment out of necessity, a mix of Cygwin, handwritten batch scripts, PuTTY, LogMeIn, a homegrown set of PHP build scripts and Notepad++. I was so devoted to the cause I even got a single patch into Git, for the purpose of making Git play nicely with plink on Windows. The setup worked, but it always felt like a patchwork of different components, all not quite seeing eye-to-eye with each other. When I discovered that Linux was able to offer me an unbelievably coherent development environment, I jumped ship and said goodbye to Windows.

Some things come round full circle. Windows has a way of coming back to you eventually. The product I worked on over the summer at Galois had to support Windows, and I consequently devoted days of effort getting my changes to build properly on Windows. I then went on to hacking GHC, and Simon Marlow asked me to implement the equivalent feature in Windows.

I’ve decided that I should stop shunning Microsoft Windows as the developer’s black sheep of the operating systems. Like it or not, Windows is here to stay; even if I never boot my laptop into Windows, as a developer it is good practice to think about and test my code on Windows. It might even be the case that Windows is a perfectly reasonable underlying platform to develop on.

There seem to be two reasons why developers might find targeting other platforms to be annoying:

They don’t have access to a computer running that operating system, which makes debugging the problems extremely annoying—after all, this is why a reproduceable test-case is the gold standard of bug reporting. We should have easy to access and easy to use build servers setup to let people play in these different environments. This involves putting down some money to buy the appropriate licenses, which open-source authors might be reluctant to do: people at places with site licenses might be able to help by donating boxes for these people to play in (the same way companies and universities donate disk space and bandwidth for mirrors).
They have to learn about another platform, with all of its intricacies and gotchas. On the one hand, this is annoying because “I already know how to do this in Unix, and now I have to spend N minutes to figure out how to do it on Windows, and spend another N minutes figuring out why it doesn’t work in some edge case.” On the other hand, learning a platform that does something you already know how to do can be kind of fun: you get to see different design decisions and develop multiple perspectives on the same problem, which I have found has always helped me out for problem solving.

There remain parts of Windows programming that I continue to have no interest in: for example, I find the vagaries of manifest files to be fairly uninteresting. But then again, I find packaging in Linux distributions to be uninteresting. Stop blaming Windows!

Annotating slides

September 3, 2010

A little trick for your toolbox: after you’ve generated your slide deck and printed it out to PDF, you might want to annotate the slides with comments. These is a good idea for several reasons:

If you’ve constructed your slides to be text light, they might be optimized for presentation but not for reading later on. (“Huh, here is this diagram, I sure wish I knew what the presenter was saying at that point.”)
Writing out a dialog to go along the slides is a nonvocal way of practicing your presentation!

But how do you interleave the slide pages with your annotations? With the power of enscript and pdftk, you can do this entirely automatically, without even having to leave your terminal! Here’s the recipe.

Create an “annotations” text file (we’ll refer to it as annot.txt). This will contain your text commentary to accompany the slides. Write the text explaining your first slide, and then insert a form feed (^L, you can do so by pressing C-l in vim (insert mode) or C-q C-l in emacs.) Write the text for your second slide. Rinse and repeat.
We now want to render this into a PDF file, with the same dimensions as your slide deck. Figure out what the size of your slides are in pixels, and then edit your ~/.enscriptrc to contain the following line:
```
Media: Slide width height llx lly urx ury
```
where ll stands for lower left and ur stands for upper right: these four numbers denote the bounding box for the text. One possible combination for these might be:
```
Media: Slide 576 432 18 17 558 415
```
We can now invoke enscript to generate a nicely formatted PostScript file of our annotations in the right dimensions, with enscript annot.txt -p annot.ps -M Slide -B -f Palatino-Roman14 (pick a different font, if you like.)
Convert the resulting PostScript file into a PDF, with ps2pdf annot.ps.
Now, with pdftk, we will split our annotations PDF and our slides PDF into individual pages, and then merge them back together into one PDF. We can use burst to output the pages, suggestively naming the output files so they interleave correctly:
```
mkdir stage
pdftk slides.pdf burst output stage/%02da.pdf
pdftk annot.pdf burst output stage/%02db.pdf
```
and then we join them back together:
```
pdftk stage/*.pdf cat output annotated-slides.pdf
```

Here’s the full script:

#!/bin/sh
set -e
ANNOT="$1"
SLIDES="$2"
OUTPUT="$3"
if [ -z "$3" ]
then
    echo "usage: $0 annot.txt slides.pdf output.pdf"
    exit 1
fi
TMPDIR="$(mktemp -d)"
enscript "$ANNOT" -p "$ANNOT.ps" -M Slide -B -f Palatino-Roman14
ps2pdf "$ANNOT.ps" "$ANNOT.pdf"
pdftk "$SLIDES" burst output "$TMPDIR/%03da.pdf"
pdftk "$ANNOT.pdf" burst output "$TMPDIR/%03db.pdf"
pdftk "$TMPDIR"/*.pdf cat output "$OUTPUT"
rm -Rf "$TMPDIR"

Don’t forget to define Slide in your .enscriptrc, and happy annotating!

My type signature overfloweth

September 1, 2010

I’ve recently started researching the use of session types for practical coding, a thought that has been in the back of my mind ever since I was part of a team that built a networked collaborative text editor and spent a lot of time closely vetting the server and the client to ensure that they had implemented the correct protocols. The essence of such protocols is often relatively simple, but can quickly become complicated in the presence of error flow (for example, resynchronizing after a disconnection). Error conditions also happen to be difficult to automatically test! Thus, static types seem like an attractive way of tackling this task.

There are three implementations of session types in Haskell: sessions, full-sessions and simple-sessions. If you were feeling particularly naive, you might try going to the Haddock page to get a feel for what the API looks like. Before you continue reading, please inspect that page.

Done gouging your eyes out? Let’s proceed.

In an interview in Coders at Work, Simon Peyton Jones mentioned that one of the notable benefits of types is that it gives a concise, crisp description of what a function might do. That API is anything from concise and crisp, and it’s certainly not something that I could figure out just by looking at the corresponding function definition. Accordingly, one of the key selling points of current encodings of session types is that they do not break type inference: we give up on our user understanding what the gaggle of typeclasses means, and only expect to transfer one bit of information, “Do the protocols match?”

This is not a problem that is fundamental to session types: any functionality that makes extensive use typeclasses can easily fall prey to these long type signatures. I have two (rather half-baked) thoughts on how this complexity might be rendered more nicely to the user, although not eliminated:

A favorite pastime of type system hackers is a type-level encoding of naturals, using Peano numbers Z and S a, attached to something like Vector (S (S Z)). Vector is a type constructor of kind * -> *. However, since there is only one primitive kind in Haskell, we could actually pass any type to Vector, say Vector Int, which would be a nonsensical. One way to prevent this from occurring is to declare our Peano numbers instances of a typeclass Nat, and then declare Nat a => Vector a. But, since a is used precisely once in any such a statement, wouldn’t it be great if instead we could write Vector :: Nat -> *? If you need to specify type equality, you could imagine some sort of type pattern matching concat :: Vector a -> Vector b -> Vector c with c ~ a :+: b. Collapsing types and kinds is an interesting step in this direction.
When mathematicians present proofs, they might explicitly specify “for all F such that F is a field…”, but more frequently, they’ll say something like, “in the following proof, assume the following variable naming conventions.” With this, they get to avoid having to repeatedly explicitly redeclare what all of their variable names mean. An analogous system for type variables would go a long way towards reducing long type signatures.

But actually, that has nothing to do with what I’m currently looking at.

Here’s what I am looking at: session types suffer from another type signature explosion phenomenon: any function in the protocol contains, in its type, a complete specification of the entire protocol continuing from that point in time. As Neubauer and Thiemann admit (PDF), the “session type corresponding to full SMTP is quite unreadable.” The two lines of inquiry I am pursuing are as follows:

Can building exception support into session types (currently an open problem) allow for much simpler session types by allowing most cases to elide the session types corresponding to error cases?
Can we use type to permit a single global specification of the protocol, which individual functions then simply refer to? Do we need something a little more powerful?

At this point, I’ve just been doing thinking and paper reading, but I hope to start hacking on code soon. I’d love to hear your thoughts though.

Defining “Haskelly”

August 30, 2010

At risk of sounding like a broken record, the topic of this post also sprang from abcBridge. John Launchbury asked a question during my presentation that got me thinking about API design in Haskell. (By the way, the video for the talk is out! Unfortunately, the second half had to be cut out due to technical difficulties, but you can still check out the slides.)

His question was this:

You’ve presented this in a very imperative style, where you’ve got this AIG structure in the ABC tool, and what you’ve really done is given me a nicely typed Haskell typed interface that allows you to go in a put a new gate or grab a structure, and I’m left wondering, what is the reason for needing this tight tie-in with what’s going on in that space? Here is a thought experiment: I could imagine myself having a purely functional data structure that is describing the data structure…and you end up with a functional description of what you want your graph to look like, and then you tell ABC to go and build the graph in one go.

I had claimed that abcBridge was a “functional API” for manipulating and-inverter graphs; perhaps I was lying! Is abcBridge—with its close correspondence to the underlying imperative code—truly “functional?” Or, if it’s not functional, does it at least have a “Haskelly” API? (What does it even mean for an API to be Haskelly?) Why does the purely functional interface seem morally better than the imperative interface? It’s a question of philosophical import as well as practical import—why do we prefer the functional interface which might require a more complex underlying implementation?

My conjecture is that the faithfulness of an API to its host language is based on how much it utilizes particular features that a language makes easy to use. Haskell is frequently introduced as a “purely functional, lazy, strongly statically typed programming language.” Looking at each of these terms in turn (informally)…

Purely functional indicates APIs that eschew destructive updates, instead opting for immutability and persistent data structures. Language features that make it easier to write in this style include the final and const keywords, algebraic data types, pattern-matching and a library of persistent data structures to write more persistent data structures with.
Lazy indicates APIs that utilize laziness to build custom control structures and generate infinite data structures. The poster child language feature for laziness is, well, lazy evaluation by default, but explicit laziness annotations in a strict language or even a convenient lambda abstraction encourages lazy style. (Python does not have a convenient lambda, which is why asynchronous frameworks like Twisted are so painful!)
Strongly statically typed indicates APIs that encode invariants of all shapes and sizes into the static type system, so that programmer errors can be caught at compile time, not run time. The type system is the obvious language feature here, with its expressiveness defining what you can easily add to your system.

We associate programs that take advantage of these language features as “Haskelly” because Haskell makes it easy—both syntactically and conceptually—to use them! But at the same time, these are all (mostly) orthogonal language features, and for any given API you might write, you may opt to ditch some of these properties for others: maybe the feature just doesn’t matter in your problem domain, maybe the feature imposes an unacceptable performance penalty or is an insufficiently sealed abstraction.

With abcBridge as our concrete example, here is how you might make such classifications in practice:

The monadic interface for constructing networks is about as far from purely functional as you can get, which was an explicit design choice in the name of performance and control. (Fortunately, we can build a nicer API on top of this one—in fact, I did an experimental implementation of one.) However, when you’re dealing with fully constructed networks the API takes a purely functional style, doing copying and unsafePerformIO behind the scenes to preserve this illusion.
abcBridge does not directly use laziness: in particular the monadic code is very structured and doesn’t have a lot of flow control in it.
The static type system is a huge part of abcBridge, since it is operating so close to the bare metal: it has two monads, with an intricate set of functions for running and converting the monads, and the low level FFI bindings make every attempt to enhance the existing C-based type system. Notice the interesting play between the types and a functional interface: if we had a purely functional interface, we probably could have ditched most of these complicated types! (Imperative code, it seems, needs stronger type system tricks.)

As far as pure Haskell libraries go, abcBridge is very un-Haskelly: I would certainly expect more from an equivalent library implemented in pure Haskell. But it is leaps and bounds better than the C library it sprang from. How far should one push the envelope? It is all about striking the right balance—that is why API design is an art.

« Newer Posts Older Posts »