Why can’t I just be a little lazy?

by Edward Z. Yang

You can. Imagine a version of Haskell where every constructor was strict, i.e. every field had a ! prefix. The semantics of this language are well defined, and in fact the fine folks at CMU have known about this for some time:

Up to this point we have frequently encountered arbitrary choices in the dynamics of various language constructs. For example, when specifying the dynamics of pairs, we must choose, rather arbitrarily, between the lazy dynamics, in which all pairs are values regardless of the value status of their components, and the eager dynamics, in which a pair is a value only if its components are both values. We could even consider a half-eager (or, equivalently, half-lazy) dynamics, in which a pair is a value only if, say, the first component is a value, but without regard to the second.

Similar questions arise with sums (all injections are values, or only injections of values are values), recursive types (all folds are values, or only folds of values are values), and function types (functions should be called by-name or by-value). Whole languages are built around adherence to one policy or another. For example, Haskell decrees that products, sums, and recursive types are to be lazy, and functions are to be called by name, whereas ML decrees the exact opposite policy. Not only are these choices arbitrary, but it is also unclear why they should be linked. For example, we could very sensibly decree that products, sums, and recursive types are lazy, yet impose a call-by-value discipline on functions. Or we could have eager products, sums, and recursive types, yet insist on call-by-name. It is not at all clear which of these points in the space of choices is right; each has its adherents, and each has its detractors.

Are we therefore stuck in a tarpit of subjectivity? No! The way out is to recognize that these distinctions should not be imposed by the language designer, but rather are choices that are to be made by the programmer. This may be achieved by recognizing that differences in dynamics reflect fundamental type distinctions that are being obscured by languages that impose one policy or another. We can have both eager and lazy pairs in the same language by simply distinguishing them as two distinct types, and similarly we can have both eager and lazy sums in the same language, and both by-name and by-value function spaces, by providing sufficient type distinctions as to make the choice available to the programmer.

This is from the Polarization chapter of Harper’s Practical Foundations for Programming Languages. Personally, I think call-by-name with eager-by-default data types is an under-appreciated point in the design space: with this combination, you still get lazy bindings and the ability to implement your own control-flow structures like if (just not on data structures), but you no longer have to worry about a large class of space leaks. Of course, this regime doesn't eliminate all problems: for example, if you use foldl instead of foldl', you will still build up a long chain of unevaluated function applications and risk a stack overflow. It’s not clear to me if there is an alternative form of fix which dodges this bullet.
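To make the combination concrete, here is a minimal sketch in ordinary Haskell that approximates the regime by hand: data fields are marked strict with !, while function arguments stay lazy. The names StrictPair, LazyPair, myIf, strictFst and lazyFst are made up for illustration; only foldl', evaluate and try come from base.

```haskell
module Main where

import Control.Exception (ErrorCall, evaluate, try)
import Data.List (foldl')

-- Hypothetical example types: one pair with strict fields, one with lazy fields.
data StrictPair a b = StrictPair !a !b
data LazyPair a b = LazyPair a b

-- A user-defined conditional. It works because function arguments are
-- still lazy: only the branch that is selected gets evaluated.
myIf :: Bool -> a -> a -> a
myIf True  t _ = t
myIf False _ e = e

strictFst :: StrictPair a b -> a
strictFst (StrictPair a _) = a

lazyFst :: LazyPair a b -> a
lazyFst (LazyPair a _) = a

main :: IO ()
main = do
  -- Lazy arguments: the untaken branch is never forced.
  print (myIf True 1 (error "never forced") :: Int)           -- prints 1

  -- Lazy fields: the second component is never forced.
  print (lazyFst (LazyPair 1 (error "never forced")) :: Int)  -- prints 1

  -- Strict fields: building the pair forces both components,
  -- so the error surfaces even though we only ask for the first one.
  r <- try (evaluate (strictFst (StrictPair 1 (error "forced")))) :: IO (Either ErrorCall Int)
  putStrLn (either (const "strict field was forced") show r)

  -- Strict fields do nothing for accumulators in function position:
  -- foldl would still build a chain of (+) applications here, which is
  -- why foldl' (which forces the accumulator at each step) is used.
  print (foldl' (+) 0 [1 .. 100000 :: Int])                    -- prints 5000050000
```

In GHC, the StrictData language extension gives you the "every field has a !" default for a whole module, so the annotations above would not need to be written out by hand.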