ezyang's blog

the arc of software bends towards understanding

Posts

Hails: Protecting Data Privacy in Untrusted Web Applications

This post is adapted from the talk which Deian Stefan gave for Hails at OSDI 2012.

It is a truth universally acknowledged that any website (e.g. Facebook) is in want of a web platform (e.g. the Facebook API). Web platforms are awesome, because they allow third-party developers to build apps which operate on our personal data.

But web platforms are also scary. After all, they allow third-party developers to build apps which operate on our personal data. For all we know, they could be selling our email addresses to spamlords or snooping on our personal messages. With the ubiquity of third-party applications, it’s nearly trivial to steal personal data. Even if we assumed that all developers had our best interests at heart, we’d still have to worry about developers who don’t understand (or care about) security.

Read more...

Visualizing satisfiability, validity & entailment

So you’re half bored to death working on your propositional logic problem set (after all, you know what AND and OR are, being a computer scientist), and suddenly the problem set gives you a real stinker of a question:

Is it true that Γ ⊢ A implies that Γ ⊢ ¬A is false?

and you think, “Double negation, no problem!” and say “Of course!” Which, of course, is wrong: right after you turn it in, you think, “Aw crap, if Γ contains a contradiction, then I can prove both A and ¬A.” And then you wonder, “Well crap, I have no intuition for this shit at all.”

Read more...

GET /browser.exe

Jon Howell dreams of a new Internet. In this new Internet, cross-browser compatibility checking is a distant memory and new features can be unilaterally be added to browsers without having to convince the world to upgrade first. The idea which makes this Internet possible is so crazy, it just might work.

What if a web request didn’t just download a web page, but the browser too?

“That’s stupid,” you might say, “No way I’m running random binaries from the Internet!” But you’d be wrong: Howell knows how to do this, and furthermore, how to do so in a way that is safer than the JavaScript your browser regularly receives and executes. The idea is simple: the code you’re executing (be it native, bytecode or text) is not important, rather, it is the system API exposed to the code that determines the safety of the system.

Read more...

Generalizing the programmable semicolon

Caveat emptor: half-baked research ideas ahead.

What is a monad? One answer is that it is a way of sequencing actions in a non-strict language, a way of saying “this should be executed before that.” But another answer is that it is programmable semicolon, a way of implementing custom side-effects when doing computation. These include bread and butter effects like state, control flow and nondeterminism, to more exotic ones such as labeled IO. Such functionality is useful, even if you don’t need monads for sequencing!

Read more...

Template project for GHC plugins

There is a bit of scaffolding involved with making Core-to-Core transforming GHC plugins, so I made a little project, based off of Max Bolingbroke’s examples, which is a nice, clean template project which you can use to create your own GHC plugins. In particular, it has documentation and pointers to the GHC source as well as a handy-dandy shell script rename.sh MyProjectName which will let you easily rename the template project into whatever name you want. You can find it on GitHub. I’ll probably be adding more to it as I go along; let me know about any bugs too.

Read more...

"This is really the End."

Done. This adjective rarely describes any sort of software project—there are always more bugs to fix, more features to add. But on September 20th, early one afternoon in France, Georges Gonthier did just that: he slotted in the last component of a six year project, with the laconic commit message, “This is really the End.”

It was complete: the formalization of the Feit-Thompson theorem.

If you’re not following the developments in interactive theorem proving or the formalization of mathematics, this achievement may leave you scratching your head a little. What is the Feit-Thompson theorem? What does it mean for it to have been formalized? What was the point of this exercise? Unlike the four coloring theorem which Gonthier and his team tackled previously in 2005, the Feit-Thompson theorem (also known as the odd order theorem) is not easily understandable by non-mathematician without a background in group theory (I shall not attempt to explain it). Nor are there many working mathematicians whose lives will be materially impacted by the formalization of this particular theorem. But there is a point; one that can be found between the broad motivation behind the computer-assisted theorem proving and the fascinating social context surrounding the Feit-Thompson theorem.

Read more...

Unintuitive facts about Safe Haskell

Safe Haskell is a new language pragma for GHC which allows you to run untrusted code on top of a trusted code base. There are some common misconceptions about how Safe Haskell works in practice. In this post, I’d like to help correct some of these misunderstandings.

[system 'rm -Rf /' :: IO ExitCode] is accepted by Safe Haskell

Although an IO action here is certainly unsafe, it is not rejected by Safe Haskell per se, because the type of this expression clearly expresses the fact that the operation may have arbitrary side effects. Your obligation in the trusted code base is to not run untrusted code in the IO monad! If you need to allow limited input/output, you must define a restricted IO monad, which is described in the manual.

Read more...

The Y Combinator and strict positivity

One of the most mind-bending features of the untyped lambda calculus is the fixed-point combinator, which is a function fix with the property that fix f == f (fix f). Writing these combinators requires nothing besides lambdas; one of the most famous of which is the Y combinator λf.(λx.f (x x)) (λx.f (x x)).

Now, if you’re like me, you saw this and tried to implement it in a typed functional programming language like Haskell:

Read more...

So you want to hack on IMAP...

(Last IMAP themed post for a while, I promise!)

Well, first off, you’re horribly misinformed: you do not actually want to hack on IMAP. But supposing, for some masochistic reason, you need to dig in the guts of your mail synchronizer and fix a bug or add some features. There are a few useful things to know before you start your journey…

  • Read your RFCs. RFC 3501 is the actual specification, while RFC 2683 gives a lot of helpful tips for working around the hairy bits of the many IMAP servers out there in practice. You should also know about the UIDPLUS extension, RFC 4315, which is fairly well supported and makes a client implementor’s life a lot easier.
  • IMAP is fortunately a text-based protocol, so you can and should play around with it on the command line. A great tool to use for this is imtest, which has all sorts of fancy features such as SASL authentication. (Don’t forget to rlwrap it!) Make sure you prefix your commands with an identifier (UID is a valid identifier, so typing UID FETCH ... will not do what you want.)
  • It is generally a better idea to use UIDs over sequence numbers, since they are more stable, but be careful: as per the specification, UID prefixed commands never fail, so you will need to check untagged data in the response to see if anything actually happened. (If you have a shitty IMAP library, it may not clear out untagged data between requests, so watch out for stale data!) Oh, and look up UIDVALIDITY.
  • There exist a lot of software that interfaces with IMAP, all of which has accreted special cases for buggy IMAP servers over the years. It is well worth sourcediving a few to get a sense for what kinds of things you will need to handle.

OfflineIMAP sucks

I am going to share a dirty little secret with you, a secret that only someone who uses and hacks on OfflineIMAP could reasonably know: OfflineIMAP sucks. Of course, you can still use software that sucks (I do all the time), but it’s useful to know what some of its deficiencies are, so that you can decide if you’re willing to put up with the suckage. So why does OfflineIMAP suck?

Read more...