ezyang’s blog

the arc of software bends towards understanding

Compilers

The hidden problem(?) with basic block procedures in SSA

Years ago, Nadav Rotem related to me this story about why basic block procedures in Swift are not as good as they seem. Nelson Elhage reminded me about this on Twitter and so I thought this should be put into the public record. Basic block procedures make certain optimizations more difficult. Consider this program: block […]

  • October 24, 2020

Optimizing incremental compilation

When you run make to build software, you expect a build on software that has been previously built to take less time than software we are building from scratch. The reason for this is incremental compilation: by caching the intermediate results of ahead-of-time compilation, the only parts of a program that must be recompiled are […]

  • August 27, 2016

The convergence of compilers, build systems and package managers

Abstract. The traditional abstraction barriers between compiler, build system and package manager are increasingly ill-suited for IDEs, parallel build systems, and modern source code organization. Recent compilers like go and rustc are equipped with a fully-fledged build systems; semantic build systems like Bazel and Gradle also expect to manage the packaging of software. Does this […]

  • December 7, 2015

No grammar? No problem!

One day, you’re strolling along fields of code, when suddenly you spot a syntax construct that you don’t understand. Perhaps you’d ask your desk-mate, who’d tell you in an instant what it was. Perhaps your programming toolchain can tell you. (Perhaps the IDE would you mouse over the construct, or you’re using Coq which let’s […]

  • July 2, 2013

The AST Typing Problem

This Lambda the Ultimate post (dated 2010) describes a rather universal problem faced by compiler writers: how does one go about adding “extra information” (e.g. types) to an AST? (The post itself divides the problem into three components: adding the information to the data types, using the information to inform the construction of the node, […]

  • May 28, 2013

Category theory for loop optimizations

Christopher de Sa and I have been working on a category theoretic approach to optimizing MapReduce-like pipelines. Actually, we didn’t start with any category theory—we were originally trying to impose some structure on some of the existing loop optimizations that the Delite compiler performed, and along the way, we rediscovered the rich relationship between category […]

  • May 12, 2013

Purpose of proof: semi-formal methods

In which the author muses that “semi-formal methods” (that is, non computer-assisted proof writing) should take a more active role in allowing software engineers to communicate with one another. C++0x has a lot of new, whiz-bang features in it, one of which is the atomic operations library. This library has advanced features that enable compiler […]

  • October 20, 2010

Data is Code

Yesterday I had the pleasure of attending a colloquium given by Chung-Chieh Shan on Embedding Probabilistic Languages. A full account for the talk can be found in this paper, so I want to focus in on one specific big idea: the idea that data is code. Lispers are well acquainted with the mantra, “code is […]

  • September 22, 2010