August 27, 2012As software engineers, we are trained to be a little distrustful of marketing copy like this:
OfflineIMAP is SAFE; it uses an algorithm designed to prevent mail loss at all costs. Because of the design of this algorithm, even programming errors should not result in loss of mail. I am so confident in the algorithm that I use my own personal and work accounts for testing of OfflineIMAP pre-release, development, and beta releases.
What is this algorithm? Why does it work? Where is the correctness proof? Unfortunately, no where in OfflineIMAP’s end user documentation is the algorithm described in any detail that would permit a software engineer to convince himself of OfflineIMAP’s correctness. Fortunately for us, OfflineIMAP is open source, so we can find out what this mysterious algorithm is. In fact, OfflineIMAP’s synchronization algorithm is very simple and elegant. (Nota bene: for simplicity’s sake, we don’t consider message flag synchronization.)
Preliminaries
Define our local and remote repositories (Maildir and IMAP, respectively) to consist of sets over messages L and R. In a no-delete synchronization scheme, we would like to perform some set of operations such that end states of the repositories L’ and R’ are L ∪ R.
However, no-delete synchronization schemes work poorly for email, where we would like the ability to delete messages and have those changes be propagated too. To this end, OfflineIMAP defines a third repository called the status repository, also a set over messages, which says whether or not a message has been synchronized in the past without an intervening synchronized delete. There are now seven possible states for a message to have, based on which repositories it is a member:

Considering all possible combinations:
- Synchronized (L,R,S): The message is fully synchronized and needs no further processing.
- New Local (L): The message was newly added to the local repository and needs to be uploaded.
- New Remote (R): The message was newly added to the remote repository and needs to be downloaded.
- Status Missing (L,R): The message is synchronized but our status is out-of-date.
- Remote Removed (L,S): The message was synchronized, but since then was removed from the remote; it should now be removed from local.
- Local Removed (R,S): The message was synchronized, but since then was removed from the local; it should now be removed from remote.
- Missing (S): The message has been deleted everywhere and our status has a stale entry for it.
The green-shaded region of the Venn diagram is what we would like L, R and S to cover at the end of synchronization.
Algorithm
Define a synchronization operation on a source, destination and status repository syncto(src, dst, status) to be these two steps:
- Calculate the set difference
src - status, and copy these messages to dst and status. - Calculate the set difference
status - src, and delete these messages from dst and status.
The full synchronization algorithm is then:
syncto(R, L, S) (download changes)syncto(L, R, S) (upload changes)
How it works
In the absence of crashes, the correctness proof only involves verifying that the status repository invariant (that messages in status have been synchronized in the past without an intervening synchronized delete) is preserved over all four operations, and that the set differences are, in fact, precisely the sets of messages we want to copy and delete. However, we can also try and look at how the local, remote and status repositories change as the algorithm progresses. In particular, the contents of the status repository in the first syncto is slightly surprising as it evolves differently from local, despite having the same operations applied to it (it then evolves in lockstep with remote).

Another important correctness claim is that OfflineIMAP never “loses mail”. Under what conditions is mail deleted? When it is present in status repository, but not in the local or remote repository. So it is easy to see that when the status repository is “lost” (either corrupted, or deleted as the instructions tell you to if you delete the contents of your local folders), OfflineIMAP will conservatively perform a full, no-delete synchronization between the two sources. So long as the status repository never contains data for more messages than it ought to, OfflineIMAP will not delete your mail.
Variations
Suppose that I have more disk space available on local disk for Maildir than my remote IMAP server. Eventually, you will end up in the awkward position of wanting to delete messages from your remote IMAP server without correspondingly nuking them from your local mail store. OfflineIMAP provides the maxage option (in which OfflineIMAP refuses to acknowledge the existence of messages older than some sliding window), but what if we really wanted to be sure that OfflineIMAP would never ever delete messages from my local repository?
Simple: Skip step 1-2.

Conclusion
By utilizing a third repository, for which data loss results in a conservative action on the part of the program, OfflineIMAP achieves its claims of an algorithm designed to prevent mail loss at all costs. It is also a simple algorithm, and I hope that any computer scientist or software engineer using this software will take the time to convince themselves of its correctness, rather than relying on the hearsay of some marketing material.
August 25, 2012After a long delay and a lot of editing, Issue 20 of The Monad Reader is finally out. Check it out!
August 16, 2012On the importance of primary sources.
(Introductory material ahead.) Most readers of this blog should have at least a passing familiarity with applicative functors:
class Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
This interface is quite convenient for day-to-day programming (in particular, it makes for the nice f <$> a <*> b <*> c idiom), but the laws it obeys are quite atrocious:
[identity] pure id <*> v = v
[composition] pure (.) <*> u <*> v <*> w = u <*> (v <*> w)
[homomorphism] pure f <*> pure x = pure (f x)
[interchange] u <*> pure y = pure ($ y) <*> u
So, if you (like me twenty-four hours ago) haven’t seen it already, you should show that this interface is equivalent to Applicative:
class Functor f => Monoidal f where
unit :: f ()
(**) :: f a -> f b -> f (a,b)
(By the way, if you haven’t shown that join :: m (m a) -> m a for monads is equivalent to bind :: m a -> (a -> m b) -> m b, you should do that too.) The laws for this formulation are much nicer:
[naturality] fmap (f *** g) (u ** v) = fmap f u ** fmap g v
[left identity] unit ** v ≅ v
[right identity] u ** unit ≅ u
[associativity] u ** (v ** w) ≅ (u ** v) ** w
Where f *** g = \(x,y) -> (f x, g y). I’ve prettied things up a bit by using “is isomorphic to” in order to suppress the differences between ((), a) and a, as well as (a,(b,c)) and ((a,b),c), for strict equalities you’ll need some extra functions to massage the results into the right types. It seems that there is a general pattern where the API which has nice formulations of laws is not convenient to program with, and the formulation which is nice to program with does not have nice laws. C’est la vie… but at least they’re equivalent!
With this formulation, it becomes trivial to state what laws commutative applicative functors obey:
[commutativity] u ** v ≅ v ** u
The original paper Applicative Programming With Effects is well worth a read. Check it out! That concludes this public service announcement.
August 15, 2012Robert Harper has (somewhat) recently released a pre-print of a book (PDF) that he has been working on, Practical Foundations for Programming Languages. I downloaded a copy when it initially came out, but I was guilty of putting off actually digging into the book’s 590-some pages. It was only until Harper successfully baited me with one of his most recent blog posts that I finally sat down and skimmed it a bit more thoroughly.
The immediate temptation is to compare PFPL to Benjamin Pierce’s seminal Types and Programming Languages. At first glance, there would seem to be quite a bit of overlap, both in terms of content and in terms of presentation. Both books starting with a very simple programming language and successively add features to explain successively more complex topics in programming languages design. But PFPL consciously differs from TAPL in many aspects. For ideological reasons, Harper completely skips the untyped language, jumping straight to a typed language with variable let-bindings, in order to immediately introduce types, contexts and safety. This presentation is substantially more terse, and newcomers with no programming languages experience may perceive that PFPL feels more like a reference manual than a textbook—`one commenter <http://existentialtype.wordpress.com/2012/08/06/there-and-back-again/#comment-949>`_ likened it to a math textbook. (Apropos of nothing, Harper’s introductory class 15-312 Principles of Programming Languages, which uses PFPL, does start with the untyped lambda calculus.)
Nevertheless, this terseness is an asset for PFPL; for one thing, it permits Harper to cover a lot of ground, covering topics that TAPL did not handle at all. Nor does the terseness mean that Harper has “left anything out”, each chapter is self-contained and comprehensive on the topics it chooses to cover. It also makes it a fun read for people like me who do have familiarity with the topics discussed but benefit from seeing and thinking about a different treatment.
Harper has been blogging about his book, and I think his blog posts are a good indication of what parts of the book are particularly worth looking at. Harper has taken the style of going “all intuition” in his blog posts, and relegating most of the formalism to his book. I think this is a shame, since the formalisms he defines are quite accessible and would make things a lot clearer for many in his audience (judging from the comments section, at least!) Here are some of the pairings:
- Dynamic Languages are Static Languages is a companion to Chapter 18, “Dynamic Typing”. There, he develops Dynamic PCF (essentially the core of Lisp) and shows how the usual concrete syntax masks the tagging that occurs, and the usual dynamics masks the wasteful and repetitive checks that are endemic to dynamically typed languages. There is always a temptation, in these holy wars, to expand the scope of the argument, but if you accept Dynamic PCF as a valid way of framing one aspect of the debate, it is extremely precise.
- Referential transparency is a companion to Chapter 32, “Symbols”. Symbols are a little weird, because most languages don’t even have a way of even acknowledging this concept exists. You might think of it as an identifier for a “generalized mutable cell” apart from how you actually access it, but really you should just read the formal treatment, since it is very simple.
- Words matter is a companion to Chapter 36, “Assignable References”. It’s a simple terminology split, motivated by how Harper defines the term “variable”, way in Chapter 1 of his book.
- Haskell is Exceptionally Unsafe is a companion to Chapter 34, “Dynamic Classification”. It argues that it is important to be able to generate exception classes at runtime (the term “class” here has a very precise meaning, namely, it is an index of a finite sum, in this case the exception type; this is discussed in Chapter 12). At least in the Haskell community, this is not a particularly common usage of the term “dynamic” (though I agree with Harper that it is a correct usage), and PFPL spells exactly what it means, no more, no less.
All-in-all, Practical Foundations for Programming Languages is well worth checking out. It is a not too widely kept secret that no one really reads textbooks from tip to tail, but if you found yourself reading one of Harper’s blog posts and being puzzled, do give the companion chapter a chance. Even with just the small bits of the book I have read, PFPL has taught me new things and clarified my thinking.
August 10, 2012Steve Yegge has posted a fun article attempting to apply the liberal and conservative labels to software engineering. It is, of course, a gross oversimplification (which Yegge admits). For example, he concludes that Haskell must be “extreme conservative”, mostly pointing at its extreme emphasis on safety. This completely misses one of the best things about Haskell, which is that we do crazy shit that no one in their right mind would do without Haskell’s safety features.
So I thought I’d channel some Yegge and take a walk through the criteria proposed for assessing how conservative a user of a language is, and try to answer them to the best of my ability with my ”Haskell hat” on:
- Software should aim to be bug free before it launches. Yes. Though, “Beware of bugs in the above code; I have only proved it correct, not tried it.”
- Programmers should be protected from errors. Yes. But, Yegge then adds the sentence: “Many language features are inherently error-prone and dangerous, and should be disallowed for all the code we write.” This is not the approach that Haskell takes: if you want continuations with mutable state, Haskell will give them to you. (Try doing that in Python.) It doesn’t disallow language features, just make them more wordy (
unsafePerformIO) or harder to use. Haskell has a healthy belief in escape hatches. - Programmers have difficulty learning new syntax. No. Haskell is completely on the wrong side of the fence here, with arbitrary infix operators; and even more extremist languages (e.g. Coq) go even further with arbitrary grammar productions. Of course, the reason for this is not syntax for its own sake, but syntax for the sake of closely modeling existing syntax that mathematicians and other practitioners already use. So we allow operator overloading, but only when it is backed up by algebraic laws. We allow metaprogramming, though I suspect it’s currently used sparingly only because it’s so unwieldy (but culturally, I think the Haskell community is very open to the idea of metaprogramming).
- Production code must be safety-checked by a compiler. Yes. But, anyone who has used a dependently typed language has a much higher standard of what “safety-checked” means, and we regularly play fast and loose with invariants that we decided would be too annoying to statically encode. Note that Yegge claims the opposite of compiler safety-checking is succinctness, which is a completely false myth perpetuated by non-Hindley Milner type systems with their lack of type inference.
- Data stores must adhere to a well-defined, published schema. Well-defined? Yes. Published? No. The emphasis that Haskell has on static checking mean that people writing data types are a lot more willing to update them as the needs of the application change, and don’t really mind global refactoring of the database because it’s so damn easy to get right.
- Public interfaces should be rigorously modeled. Yes. (though cough “ideally object oriented” cough)
- Production systems should never have dangerous or risky back-doors. Accidental. The lack of tooling here means that it’s pretty difficult to snoop into a running compiled executable and fiddle around with internal data: this is a big sore point for the current Haskell ecosystem. But in the abstract, we’re pretty flexible: XMonad, for example, can be restarted to run arbitrary new code while preserving the entirety of your working state.
- If there is ANY doubt as to the safety of a component, it cannot be allowed in production. This is something of a personal question, and really depends on your project, and not so much on the language itself. Haskell is great for safety critical projects, but I also use it for one-off scripts.
- Fast is better than slow. No. Haskell code has the opportunity to be really fast, and it tends to be quite zippy from the get go. But we’ve emphasized features (laziness and abstraction) which are known to cause performance problems, and most Haskellers take the approach of only optimizing when our (very awesome) profiler yells at us. Some Haskellers reflexively add
! {-# UNPACK #-} to their data types, but I don’t—at least, not until I decide my code is too slow.
Haskell has a lot of features which show up in Yegge’s “Liberal Stuff”. Here are some of them:
- Eval: We love coding up interpreters, which are like type-safe evals.
- Metaprogramming: Template Haskell.
- Dynamic scoping: Reader monad.
- all-errors-are-warnings: We can delay type errors to runtime!.
- Reflection and dynamic invocation:
class Data. - RTTI: I hear it’s called a “dictionary”.
- The C preprocessor: Indispensable, begrudgingly.
- Lisp macros: Why use macros when you can do it properly in Template Haskell!
- Domain-specific languages: Haskell eats EDSLs for lunch.
- Optional parameters: It’s called combinator libraries.
- Extensible syntax: Fuck yeah infix!
- Auto-casting: Numeric literals, anyone?
- Automatic stringification:
class Show and deriving. - Sixty-pass compilers: GHC does a lot of passes.
- Whole-namespace imports: Yep (and it’s both convenient and kind of annoying).
The feeling I get from this conversation is that most people think “Haskell” and “static typing” and while thinking about how horrible it is to write traditional dynamically typed code in Haskell, forget that Haskell is actually a surprisingly liberal language prizing understandability, succinctness and risk-taking. Is Haskell liberal or conservative? I think of it as an interesting point in the design space which treats some conservative viewpoints as foundational, and then sees how far it can run from there. It’s folded so far right, it came around left again.
August 4, 2012A common simplification when discussing many divide and conquer algorithms is the assumption that the input list has a size which is a power of two. As such, one might wonder: how do we encode lists that have power of two sizes, in a way that lists that don’t have this property are unrepresentable? One observation is that such lists are perfect binary trees, so if we have an encoding for perfect binary trees, we also have an encoding for power of two lists. Here are two well-known ways to do such an encoding in Haskell: one using GADTs and one using nested data-types. We claim that the nested data-types solution is superior.
This post is literate, but you will need some type system features:
{-# LANGUAGE ScopedTypeVariables, GADTs, ImpredicativeTypes #-}
GADTs
One approach is to encode the size of the tree into the type, and then assert that the sizes of two trees are the same. This is pretty easy to do with GADTs:
data Z
data S n
data L i a where
L :: a -> L Z a
N :: L i a -> L i a -> L (S i) a
By reusing the type variable i, the constructor of N ensures that we any two trees we combine must have the same size. These trees can be destructed like normal binary trees:
exampleL = N (N (L 1) (L 2)) (N (L 3) (L 4))
toListL :: L i a -> [a] -- type signature is necessary!
toListL (L x) = [x]
toListL (N l r) = toListL l ++ toListL r
Creating these trees from ordinary lists is a little delicate, since the i type variable needs to be handled with care. Existentials over lists work fairly well:
data L' a = forall i. L' { unL' :: L i a }
data Ex a = forall i. Ex [L i a]
fromListL :: [a] -> L' a
fromListL xs = g (Ex (map L xs))
where
g (Ex [x]) = L' x
g (Ex xs) = g (Ex (f xs))
f (x:y:xs) = (N x y) : f xs
f _ = []
Nested data-types
Another approach is to literally build up a type isomorphic to a 2^n size tuple (modulo laziness). For example, in the case of a 4-tuple, we’d like to just say ((1, 2), (3, 4)). There is still, however, the pesky question of how one does recursion over such a structure. The technique to use here is bootstrapping, described in Adam Buchsbaum in his thesis and popularized by Chris Okasaki:
data B a = Two (B (a, a)) | One a
deriving Show
Notice how the recursive mention of B does not hold a, but (a, a): this is so-called “non-uniform” recursion. Every time we apply a Two constructor, the size of our tuple doubles, until we top it off:
exampleB = Two (Two (One ((1,2), (3,4))))
fromListB :: [a] -> B a
fromListB [x] = One x
fromListB xs = Two (fromListB (pairs xs))
where pairs (x:y:xs) = (x,y) : pairs xs
pairs _ = []
toListB :: B a -> [a]
toListB (One x) = [x]
toListB (Two c) = concatMap (\(x,y) -> [x,y]) (toListB c)
Which is better?
At first glance, the GADT approach seems more appealing, since when destructing it, the data type looks and feels a lot like an ordinary binary tree. However, it is much easier to parse user data into nested data types than GADTs (due to the fact that Haskell is not a dependently typed language). Ralf Hinze, in his paper Perfect Trees and Bit-reversal Permutations, gives another argument in favor of nested datatypes:
Comparing [perfect trees and the usual definition of binary trees] it is fairly obvious that the first representation is more concise than the second one. If we estimate the space usage of an k-ary constructor at k+1 cells, we have that a perfect tree of rank n consumes (2^n-1)3+(n+1)2 cells with the first and (2^n-1)3+22^n* with the second. [The difference coming from all of the extra leaf nodes.]
Nevertheless, destructing the nested data type tree is very weird, and we might feel better about the “exotic” nested data type if there was an efficient transformation from the catamorphism (n :: t a -> t a -> t a , z :: a -> t a) on traditional trees:
cataL :: (t a -> t a -> t a, a -> t a) -> L i a -> t a
cataL (n,z) (N l r) = n (cataL (n,z) l) (cataL (n,z) r)
cataL (n,z) (L x) = z x
to a catamorphism (f :: a -> t a, g :: t (a, a) -> t a) on our nested data-type tree:
cataB :: (forall a. a -> t a, forall a. t (a, a) -> t a) -> B a -> t a
cataB (f,g) (One a) = f a
cataB (f,g) (Two t) = g (cataB (f,g) t)
This conversion is possible, though, alas, it is not a catamorphism:
cataLB :: forall t a. (t a -> t a -> t a, a -> t a) -> B a -> t a
cataLB (n,z) t = f t z
where
f :: forall b. B b -> (b -> t a) -> t a
f (One a) z = z a
f (Two t) z = f t (\(l,r) -> n (z l) (z r))
The idea is to create a function (a -> t a) -> t a, which we then pass z in order to get the final result. This is the time honored difference list/continuation passing trick, where we build up a chain of function invocations rather than attempt to build up the result directly, since ordinarily the catamorphism on nested data-type trees proceeds in the wrong direction. But now, we can easily perform any fold we would have done on ordinary trees on our nested data-type trees, which resolves any lingering concerns we may have had. Nested data types are superior… from a representation size perspective, in any case. (See Jeremy’s comment for another take on the issue, though.)
For further reading, check out Generalised folds for nested datatypes (Richard Bird, Ross Paterson).
July 29, 2012This document explains how polymorphic variants in Ur/Web work. It was written because the official tutorial has no mention of them, the manual only devotes a paragraph to the topic, and there are some useful tricks for dealing with them that I picked up while wrangling with them in Logitext.
What are polymorphic variants?
Polymorphic variants may be familiar to OCaml users: they permit you to use the tags of variants in multiple types, and not just in the original algebraic data type a constructor was defined in. Instead having to keep names unique:
datatype yn = Yes1 | No1
datatype ynm = Yes2 | No2 | Maybe2
We can just reuse them:
con yn = variant [Yes = unit, No = unit]
con ynm = variant [Yes = unit, No = unit, Maybe = unit]
This is extremely convenient if you have a lot of constructors which you want to share between multiple logical types. Unfortunately, they have a number of nasty effects, largely stemming from the fact that they introduce subtyping to the language. In Ur, this inherent subtyping is modulated by the same row types that power Ur’s record system, so handling polymorphic variants is quite like handling records, and both are based off of Ur/Web’s type-level records.
How do I make a polymorphic variant?
To make a polymorphic variant type, instead of applying the $ operator, instead apply the variant to a type-level record. So:
$[A = int, B = bool]
generates a record with two fields, A containing an int and B containing a bool, whereas:
variant [A = int, B = bool]
generates a variant with two constructors, A containing just an int, or B containing just a bool.
To make a polymorphic variant value, use the make function, which requires a label (indicating the constructor) and the value:
make [#A] 2
Technically, during the construction of a polymorphic variant, you also need to know what the full set of constructors this value will be used with are. Normally Ur/Web will infer this for you, but this is an important restriction which will affect code that operates on variants. The full signature of make is this:
val make : nm :: Name
-> t ::: Type
-> ts ::: {Type}
-> [[nm] ~ ts]
=> t -> variant ([nm = t] ++ ts)
The function of nm and t should be self-evident, and ts is the type-level record for the rest of the values in the variant, concatenated with [nm = t] to produce a type-level record which is guaranteed to contain nm.
How do I destruct a polymorphic variant?
Use the match function, which takes a variant and a record of functions which indicate how to process each possible constructor of the variant:
match t { A = fn a => a + 2,
B = fn b => if b then 3 else 6 }
Indeed, the variant and the record use the same type-level record, though the types of the record are a little different, as seen in the type of match:
val match : ts ::: {Type} (* the type-level record *)
-> t ::: Type
-> variant ts (* the variant *)
-> $(map (fn t' => t' -> t) ts) (* the record *)
-> t
make and match are the only primitives you need: everything else can derived. However, the meta library has a Variant module which contains of useful derived functions for operating with variants. For example, this pair of functions:
val read : r ::: {Unit} -> t ::: Type -> folder r
-> $(mapU t r) -> variant (mapU {} r) -> t
val write : r ::: {Unit} -> t ::: Type -> folder r
-> $(mapU t r) -> variant (mapU {} r) -> t -> $(mapU t r)
Allow you to use variants as labels, to project and edit values from homogeneously typed records. The signatures are not too difficult to read: r is the type-level record which defines the variant, t is the type of the homogeneous record, folder r is a folder for the record (which usually will get inferred), $(mapU t r) is the type of a homogeneous record (we didn’t write $r because that would be a record containing only unit) and variant (mapU {} r) is the variant serving as a “label”. Here are some example uses of some of the simpler functions in this library:
read {A = 1, B = 2} (make [#A] ())
== 1
write {A = 1, B = 2} (make [#B] ()) 3
== {A = 1, B = 3}
search (fn v => match v {A = fn () => None, B = fn () => Some 2})
== Some 2
find (fn v => match v {A = fn () => True, B = fn () => False})
== Some (make [#A] ())
test [#A] (make [#A] 2)
== Some 2
weaken (make [#A] 2 : variant [A = int])
== make [#A] 2 : variant [A = int, B = int]
eq (make [#A] ()) (make [#B] ())
== False
mp (fn v => match v {A = fn () => 2, B = fn () => True})
== {A = 2, B = True}
fold (fn v i => match v {A = fn () => i + 1, B = fn () => i + 2}) 0
== 3
mapR (fn v x => match v {A = fn i => i * 2, B = fn i => i * 3}) { A = 2 , B = 3 }
== { A = 4, B = 9 }
What does destrR do?
This function has a somewhat formidable type:
val destrR : K --> f :: (K -> Type) -> fr :: (K -> Type) -> t ::: Type
-> (p :: K -> f p -> fr p -> t)
-> r ::: {K} -> folder r -> variant (map f r) -> $(map fr r) -> t
But really, it’s just a more general match. match can easily be implemented in terms of destrR:
match [ts] [t] v fs =
destrR [ident] [fn p => p -> t] (fn [p ::_] f x => f x) v fs
destrR affords more flexibility when the record is not quite a function, but a record containing a function, or a type class, or even if the variant is the function and the record the data.
Is there a more concise way to match over multiple constructors?
Polymorphic variants frequently have a lot of constructors, all of which look basically the same:
con tactic a =
[Cut = logic * a * a,
LExact = int,
LConj = int * a,
LDisj = int * a * a,
LImp = int * a * a,
LIff = int * a,
LBot = int,
LTop = int * a,
LNot = int * a,
LForall = int * universe * a,
LExists = int * a,
LContract = int * a,
LWeaken = int * a,
RExact = int,
RConj = int * a * a,
RDisj = int * a,
RImp = int * a,
RIff = int * a * a,
RTop = int,
RBot = int * a,
RNot = int * a,
RForall = int * a,
RExists = int * universe * a,
RWeaken = int * a,
RContract = int * a]
Filling out records to match against quickly gets old, especially if the functionality for any two constructors with the same type of data is the same:
let fun empty _ = True
fun single (_, a) = proofComplete a
fun singleQ (_, _, a) = proofComplete a
fun double (_, a, b) = andB (proofComplete a) (proofComplete b)
in match t {Cut = fn (_, a, b) => andB (proofComplete a) (proofComplete b),
LExact = empty,
LBot = empty,
RExact = empty,
RTop = empty,
LConj = single,
LNot = single,
LExists = single,
LContract = single,
LWeaken = single,
LTop = single,
RDisj = single,
RImp = single,
LIff = single,
RNot = single,
RBot = single,
RForall = single,
RContract = single,
RWeaken = single,
LForall = singleQ,
RExists = singleQ,
LDisj = double,
LImp = double,
RIff = double,
RConj = double
}
end
Adam Chlipala and I developed a nice method for reducing this boilerplate by abusing local type classes, which allow us to lean of Ur/Web’s inference engine to automatically fill in the function to handle an element of a particular type. Here is that recursive traversal again, using our new method:
let val empty = declareCase (fn _ (_ : int) => True)
val single = declareCase (fn _ (_ : int, a) => proofComplete a)
val singleQ = declareCase (fn _ (_ : int, _ : Universe.r, a) => proofComplete a)
val double = declareCase (fn _ (_ : int, a, b) => andB (proofComplete a) (proofComplete b))
val cut = declareCase (fn _ (_ : Logic.r, a, b) => andB (proofComplete a) (proofComplete b))
in typeCase t end
For every “type” in the variant, you write a declareCase which takes that type and reduces it to the desired return type. (You also get, as the first constructor, a constructor function to create the original constructor; e.g. declareCase (fn f x => f x) is the identity transformation. Then you run typeCase and watch the magic happen. There are more detailed usage instructions in variant.urs.
How do I make the type of my variant bigger?
When writing metaprograms that create variant types, a common problem is that the variant you just created is too narrow: that is, the ts in variant ts doesn’t have enough entries in it. This is especially a problem when ts is the record you are folding over. Consider a simple example where we would like to write this function, which generates a record of constructors for each constructor of the variant:
fun ctors : ts ::: {Type} -> fl : folder ts -> $(map (fn t => t -> variant ts) ts)
Ur/Web is not clever enough to figure out the naive approach:
fun ctors [ts] fl =
@fold [fn ts' => $(map (fn t => t -> variant ts) ts')]
(fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] n => n ++ {nm = make [nm]})
{} fl
because it has no idea that nm is a member of the type-level record ts (Ur/Web doesn’t directly have an encoding of field inclusion.)
The way to fix this problem is to use a trick that shows up again and again in variant metaprograms: make the accumulator polymorphic in the fields that are already processed. It is the same trick that is employed in value level programs, when you reverse a list by foldr by observing that you want to make the accumulator a function:
con accum r = s :: {Type} -> [r ~ s] => $(map (fn t => t -> variant (r ++ s)) r)
fun ctors [ts] fl =
@fold [accum]
(fn [nm::_] [v::_] [r::_] [[nm] ~ r]
(k : accum r)
[s::_] [[nm = v] ++ r ~ s] => k [[nm = v] ++ s] ++ {nm = make [nm]})
(fn [s::_] [[] ~ s] => {}) fl [[]] !
accum is the type of the accumulator, and we can see it has a new type argument s :: {Type}. This argument is concatenated with the fields that are to be processed r and the current field nm in order to provide the full set of fields ts. During a fold over a record like [A = int, B = bool, C = string], we see:
r = [], nm = A, s = [B = bool, C = string]
r = [A = int], nm = B, s = [C = string]
r = [A = int, B = bool], nm = C, s = []
r builds up fields as usual in a fold, but s builds up its fields in reverse, because, like list reverse, s is not determined until we’ve folded over the entire structure, and now evaluate the pile of type functions outside-in. Thus, it’s easy to see k [[nm = v] ++ s] will always have the correct type.
Conclusion
Polymorphic variants in Ur/Web are quite useful, and avoid many of the problems associated with unrestricted subtyping. Logitext wasn’t originally intending on using polymorphic variants, but we adopted them when they were found to be the most reliable method by which we could quickly implement JSON serialization via metaprogramming, and we’ve come to appreciate their metaprogrammability in a variety of other contexts too. Probably their biggest downside over traditional algebraic data types is the lack of recursion, but that too can be simulated by manually implementing the mu operator using Ur/Web’s module system. I hope this tutorial has given you enough meat to use polymorphic variants on your own, and maybe do a little metaprogramming with them too.
July 25, 2012The holy grail of web application development is a single language which runs on both the server side and the client side. The reasons for this are multifarious: a single language promotes reuse of components that no longer need to be reimplemented in two languages and allows for much more facile communication between the server and the client. Web frameworks that explicitly strive to handle both the server and client include Meteor, Ur/Web, Opa and Google Web Toolkit.
One of the biggest implementation difficulties facing anyone wishing to build such a system is the fact that there are multiple runtimes: the server runtime and the browser runtime, each with an accordingly different set of primitives and APIs available. Furthermore, some code we might wish to only live on the server, and never be sent to the client. When a language feature can be implemented on both runtimes, we maintain the illusion that client and server are indistinguishable; when it cannot, the illusion is broken.
Thus, in order to support runtime-specific FFI calls in such an integrated language, the following questions must be answered:
- When is code sent to the client and when is code kept on the server? This information must be exposed to the user (rather than be kept as an “implementation detail”).
- How do I force execution on the server?
- How do I force execution on the client?
In this blog post, I’ll discuss how Ur/Web addresses these questions. The answers are rather simple (any with any luck generalize to other similar systems), but they are rather difficult to come by if you treat the compiler as a black box.
1. Client/server split
An obvious solution to the client/server division problem is to label entry points (e.g. the main function or an onClick handler) as starting from the server (main) or client (onClick), and then conducting reachability analysis to label all other functions. Thus, in the following Ur/Web code, txn : transaction unit would execute on the server here:
fun main () =
txn;
return <xml><body>Done!</body></xml>
while it would execute on the client here:
fun main () =
return <xml><body><a onclick={txn}>Click me!</a></body></xml>
When given a fragment like this:
fun foo body =
r <- txn body;
return <xml>{r}</xml>
it is not possible to know whether or not txn will be needed on the client side or server side without analyzing all of the callers and checking if they are client side or server side. Situations like this are the most important for forcing server-side or client-side behavior.
2. Forcing server-side
Suppose that we wanted to force txn to be executed on the server-side. If we’re already on the server, there is nothing more to do. However, if we’re on the client, we need to make an RPC call back to the server. In Ur/Web, this is easily done:
fun fooClient body =
r <- rpc (txn body);
return <xml>{r}</xml>
However, as rpc is a client-side only function in Ur/Web, we can no longer use this function for server-side computation. One consequence of this choice is it forces us to be explicit about when an RPC occurs, which is good news for understanding and security.
3. Forcing client-side
Suppose we wanted to force txn to be executed on the client-side. This is tricky: if we’re already on the client we can go ahead as normal, but if we’re executing in the server side, what does it mean to execute some code on the client?
One interpretation is this: since we are building some HTML that is to be shown to the client, txn should be run when the client actually displays the HTML. Ur/Web recently added the active tag which achieves just this effect:
fun fooServer body =
return <xml><active code={txn body} /></xml>
The code attribute acts much like onclick and other similar attributes, in that it defines an entry point which happens to automatically get run when shown in the browser. It is still an event handler, in the sense that if someone invokes fooServer, but then doesn’t end up using the HTML, txn never gets called: active can be thought of a sort of lazy execution.
If we would truly like the client to execute some code immediately, our best bet is to shove an active tag down a source which is hooked up to an active dyn element:
fun fooServer body source =
set source <xml><active code={txn body} /></xml>;
return <xml>???</xml>
but in this case it is not really possible to ask the client what the result of the computation was (the server is not permitted to block!) This method of mobile code delivery can even be done asynchronously, using channels:
fun fooServerAsync body channel =
send channel <xml><active code={txn body} /></xml>;
return <xml>???</xml>
4. Interaction with the optimizer
The code within HTML event handlers (e.g. onclick={...} and active code={...}) can have free variables which bind to variables in their lexical scope, which may have been calculated from the server. As such, you might expect in this case, foo : int -> xbody would be executed on the server:
fun main n =
let val x = foo n
in return <xml><body><active code={txn; return x} /></body></xml>
However, Ur/Web’s optimizer is too smart for it’s own good: since foo is pure and thus referentially transparent, it can always be safely inlined (especially when there is only one use-site):
fun main n =
return <xml><body><active code={txn; return (foo n)} /></body></xml>
Written this way, it is clear that foo is run from the client. Thus, an innocent transformation can break your code, if foo was a server FFI called that was unimplemented on the client.
The troubling conclusion is that variable substitution can make valid programs invalid. Of course, in an eagerly evaluated, impure language, variable substitution is not valid. But we might expect it to be true for a pure language like Ur/Web. In any case, we can teach Ur/Web not to inline by marking foo as being benignEffectful in our urp file.
5. Conclusions
In general, when writing Ur/Web applications, here are some useful guidelines:
- Always mark server/client-only identifiers with
serverOnly and clientOnly in your urp files. Ur/Web will generally handle one-sided FFI functions appropriately, but if you have code that takes advantage of language features that are only implemented on one side (e.g. closures on the server side), be sure to mark those functions appropriately. - Use
rpc to move from client-to-server, and active to move from server-to-client. Because of the “rpc must refer to named function” invariant, the general structure of Ur/Web applications will be blobs of server code with client-side code embedded internally. - If you are interested in generating client code that includes pure server-computed data, make sure the functions computing that data are marked
benignEffectful. - In general, don’t worry about the server/client split! Ur/Web will warn you if you need to move things around, but for the most part, things should just work.
One last word about security in a shared server/client model: how does Ur/Web ensure that input validation ends up on the server-side and not the client-side? It is rather simple: the only parts of your program that care about input validation are ones that involve persistent data, and all of these functions are server only. Thus, any user-data that makes it to any of these functions necessarily passed through a top-level page handler or an rpc, making it a relatively simple job to ensure that the validation is on the “right” side of the pond. If you use a data-structure that is correct-by-construction, you’re automatically done!
July 20, 2012Abstract. We describe how secure multi-party sorting can serve as the basis for a Bitcoin anonymization protocol which improves over current centralized “mixing” designs.
Bitcoin is a pseudonymous protocol: while Bitcoin addresses are in principle completely anonymous, all traffic into and out of a wallet is publicly visible. With some simple network analysis collections of addresses can be linked together and identified.
The current state of the art for anonymizing Bitcoins is a mixing service, which is trusted third-party wallet which accepts incoming transactions, and in random increments scheduled at random times in the future, transfers a corresponding quantity to a new wallet of your choice. The result is given any Bitcoin that is distributed from this service, there exist a large number of identities from whom the Bitcoin may have originated.
Mixing services of this kind have a number of notable problems. First, the mixing service must be trusted not to keep logs or otherwise monitor the mixing: if they are compromised, the path of any given Bitcoin can be fully traced. The usual advice for this scenario is to use multiple mixing services, so that all of these services must be compromised before anonymity is lost. Second, the mixing service must be trusted not to turn around and refuse to give you back your funds; this makes such mixing services risky for anonymizing large quantities of Bitcoins. Finally, most mixing services charge a processing fee on top of the usual transaction fee one might expect to pay out for arranging for a Bitcoin transfer.
We propose a decentralized, secure multiparty protocol for implementing a mixing protocol. Such a system has been proposed in abstract (also here); in this post, we describe precisely how to implement it, in particular showing that multi-party sorting (a relatively well-studied algorithmic problem) is sufficient to implement this protocol. This protocol does not require a trusted third party (except to assist in discovery of participants interested in performing the mixing protocol), does not require you to reveal your input-output addresses beyond the ultimate transaction, and can be performed atomically using Bitcoin’s transaction language.
Protocol description
First some preliminaries: the multi-party sorting problem is usually formulated as follows: each party i contributes an input element A[i]. After the protocol is carried out, all parties learn the sorted sequence A in secret shared form, but do not learn who contributed any particular element A[i] of the sequence. Referring to the work of Jónsson, Kreitz and Uddin (2011), we assume this protocol as a primitive: the essential idea behind any multi-party sorting is to construct a fixed size sorting circuit for the inputs, and then use the general framework of multi-party computation on the resulting circuit description.
We now describe the mixing problem. Assume that some number of parties are assembled to mix 1 BTC of coins among themselves. (For now, we assume that every participant has the same number of Bitcoins; we will generalize this later.) In particular, each party i has a input wallet A[i] (with balance of at least 1 BTC) and an output wallet B[i], and will only sign a transaction in which 1 BTC is transferred to its output wallet B[i]. Any adversary participating in this transaction should not be able to learn the B[i] corresponding to A[i], except that it is among the set of output wallets taking part in the transaction.
The protocol proceeds as follows:
- Every participant declares the input wallet it will be participating in the protocol with, and produces a signature to show that they own the wallet. These wallets are publically sorted into A.
- With ordering defined as the numeric value of the public key of each participating output wallet, conduct a secure multi-party sort of all of the output wallets. We now have a sorted list of output wallets B, with no member of the transaction having learned who contributed any given output wallet. Each participant should check if their output wallet is contained in this list (to protect against Byzantine failure); if it is not, they should abort the protocol and destroy their output wallet (its identity has been leaked).
- Construct a transaction transferring 1 BTC from A[0] to B[0] (from the sorted lists), A[1] to B[1], and so forth and broadcast it to all participants. Clients sign the transaction with their input wallet.
- Once all signatures have arrived, the transaction is valid and is broadcast for incorporation into the block chain.
Mixing pools only work if all participants are attempting to mix identical amounts of Bitcoins. In order to manage participants who would like to mix larger amounts of Bitcoins, we suggest maintaining discovery channels for power of two sizes of Bitcoins, e.g. …1/4 BTC, 1/2 BTC, 1 BTC, 2 BTC, 4 BTC…
Analysis
Step (1) does not mention output wallets, and thus cannot leak information about the input-output correspondence. By definition, step (2) does not leak information about who contributed the output wallet. Furthermore, we don’t even require the sorted result to be secret shared: this sorted list will become public information once the transaction is published in the block chain. The case of aborting the transaction when your output wallet is not present in the result (in the case of Byzantine failure) is delicate: aborting does leak information, and thus you must not use the output wallet in any further transactions. In step (3), assuming that an attacker knows that a mixing transaction is taking place, the deterministic mapping of input to output wallets gives all participants no further bits of information. Thus, this protocol clearly fulfills its security requirements.
One odd thing about this protocol is that no random permutation between participants is explicitly constructed. This might seem unusual, since a natural security property one might expect is for an output wallet to receive its 1 BTC from a uniformly randomly chosen input wallet. However, this condition, while sufficient for anonymity, is not necessary. Just as adding a random permutation destroys all information about the original permutation, replacing the original permutation with a new constant permutation also destroys all information about the original permutation. Furthermore, honest participants will have picked their addresses uniformly at random, so the process of sorting automatically constructs a random permutation between these participants (dishonest participants must be excluded, since they can generate addresses from a skewed probability distributions).
What is the amount of anonymity granted by one round of this protocol? Consider the case where I have 1 BTC in an input wallet tied to my identity, and I participate in a mixing round with n honest participants with a fresh output wallet. Since no information about the source of this output wallet was leaked to the participants, an adversary in the transaction would have a 1/n-1 probability of guessing which output wallet was mine: call this the anonymity factor. In the case of a Sybil attack, the amount of anonymity conferred decreases. If the fraction of attackers is less than some fraction of the participants (for many secret sharing protocols, the magic numbers are 1/2 for passive attackers, and 1/3 for active attackers), then the anonymity factor is still 1/n-1, where n is the number of honest participants; but n is smaller than the number of visible participants in the protocol: the size of the transaction is not necessarily correlated with how anonymous it is! If the number of attackers is above this fraction, then the secret sharing scheme may leak information and no anonymity is gained. This allows for a denial of service attack against a mixing protocol; we describe some mitigation strategies against this attack later. (Note, however, that no matter how many attackers there are, you are guaranteed to not lose any Bitcoins, due to the verification step in (2).)
In practice
As with traditional mixing, a number of precautions should be taken in order to avoid accidentally revealing information about the source of Bitcoins via a side-channel. Consider the case where Alice has 2 BTC tied to her real-world identity, and she would like to anonymously pay 3/4 BTC to Bob.
Alice first prepare by creating a pool of empty wallets, which she will use to carry the anonymized Bitcoins. Alice connects to a tracker for 1 BTC mixing over Tor. (1 BTC is the amount she would like to pay to Bob, rounded up.) She waits a time window to expire for the next mixing, and then as the protocol takes place submits her (public) input wallet and her (anonymous) output wallet. If the protocol fails, she throws out her output wallet and submits a new one next time, and blacklists the misbehaving node if she can figure out who it is.
Once Alice has successfully carried out a mixing, she flips a coin (which comes up tails with probability 1/m). If the coin comes up heads, she waits for another mixing. The number of mixing transactions she is expected to perform is m (obeying the geometric distribution, so selected because it makes all output wallets behave identically with regards to remixing or exiting). Once Alice exits mixing, she now has a wallet containing an anonymous Bitcoin (more precisely, this Bitcoin could be attributable with equal probability to any of the other wallets that she participated in mixes with). She transfers 3/4 BTC to Bob, leaving 1/4 BTC in her wallet.
The remaining Bitcoins in the wallet should now be considered tainted (as they now have a direct relationship to Bob, who may have a public wallet). These Bitcoins should be split into mixable amounts and reanonymized, before used for any other purposes. Even after anonymization, these coins must still be used with care: in particular, they must not be transferred back to the original, public Bitcoin account. In such a situation, the graph structure of mixing transactions looks like this:

(The green node is your public node, the red node is your anonymous node). Network analysis that looks for cycles in Bitcoin transfers will be able to identify any transactions, even if the anonymizing pool has a large amount of turnover (though, amusingly enough, if many participants generate cycles, this attack is harder to carry out). To assist in the tracking of these coins, we suggest the development of wallet management software that can handle thousands of private keys, sort by “size of coin”, and track the easily traceable transaction history associated with any given coin.
In order to protect herself against Sybil attacks, Alice may wish to select her mixing tracker with care. Some mixing trackers could charge fees for listing: with sufficient volume, these charges would make it more expensive to carry out a sustained Sybil attack. (The fees can then be turned around and used to pay for the processing of the complicated mixing transactions, which have the social expectation of being accompanied with a transaction fee.) Every mixing should be conducted with a different IP address; if Alice is using Tor for anonymity she needs to reanonymize her connection each time.
Conclusion
Secure multi-party computation has always been in the eye of users of Bitcoin seeking anonymization, but to date there has not been any plausible implementation strategy for any such computation. We hope that this document describes such an implementation strategy and leads the way to a better ecosystem of Bitcoin anonymizers. As the fiascos at Bitcoinica and other exchanges have demonstrated, relying on third party wallets is dangerous. Fortunately, they are also unnecessary.
Acknowledgments
I would like to thank Eric Price for playing an instrumental role in the formulation of this protocol.
June 23, 2012Correctness is overrated. After all, no one knows what it means for any reasonably complicated system to be “correct”, and even when we do, the mileposts move around on a daily basis. With the raison d’être of formal verification stripped away, can we still consider it a worthy goal?
Perhaps verification results in higher quality code. But this is not obviously true: correctness is not quality. We might hope that high quality code is readable and easily understood, that it should be as self-contained and independent from the rest of the system, that it is efficient and economical. There is no a priori reason to believe that verification would grant us any of these properties. No matter how horrible some code is, as long as it is correct, there exists a proof which vouches for its correctness.
But as anyone who has gone through the sweat and tears of verifying a program can tell you, formal verification really does make your code better. Here’s the secret: proving theorems is really hard. If we want any hope to successfully prove something about a program, we must to make reasoning about the code as easy as possible. A program under verification irresistibly gravitates towards it’s most “reasonable” form, because otherwise the proofs are simply too arduous to carry out. And in this form, the tenets of high quality code follow.
Take for example Bedrock, a system for building verified low-level programs which manipulate pointers and registers. These are programs that deal with mutable state, a feature which is well known to dramatically increase the difficulty of reasoning. Bedrock, and many systems like it, would be dead out of the water if not for the development of an important system called separation logic. The central idea behind it is so obvious to any experienced practitioner it is barely worth stating: private local state is easier to reason about than public global state—**modularity** is good. It enforces this through a clever formalism, the star operator, which combines two assertions about two regions of memory while assuring that the regions are disjoint. Regardless, the end result is this: if your components are independent, the theorem proving is easy; if your components are tangled together, the theorem proving is hard. You do the math.
But it doesn’t stop there. When different components do interact, the principle of encapsulation says that I do not want to know all of the gory details of a component, just its high-level interface. In theorem prover land, “all of the gory details” means unmanageably large facts about many pointers, and a “high-level interface” is an abstract predicate which rolls up all of these facts into a single, cohesive logical fact (“this is a linked list.”) Developing these predicates is critical to keeping your theorem statements concise and understandable, and in higher-order provers like Bedrock, they can apply not only to data but also to code, i.e. function pointers.
The tenets of high quality code speak for code that is written for humans to understand, and not just machines to execute. But code that is written for machines to understand have many of the same properties that are valued by humans, for if they do not, getting the machine to “understand” becomes an impossible task. Computers may be simple-minded, but all that means is code a computer can understand is code that you can understand too. And that is high quality code.