February 22, 2010

> {-# LANGUAGE GeneralizedNewtypeDeriving #-}
> import Control.Comonad
> import Data.List
That scary Control.Comonad import from category-extras is going to be the subject of today’s post. We’re going to look at one possible implementation of comonads for non-empty lists that model causal time-invariant systems, systems whose outputs depend only on inputs that are in the past. We will see that computation in these systems follows a comonadic structure and that one instance of this structure strongly enforces causality and weakly enforces time-invariance.
Our causal lists are simply a newtype of list with the added restriction that they are non-empty; causal is a “smart constructor” that enforces this restriction. We use GeneralizedNewtypeDeriving to get the Functor instance for free.
> newtype Causal a = Causal [a]
>     deriving (Functor, Show)
>
> causal :: a -> [a] -> Causal a
> causal x xs = Causal (x:xs)
>
> unCausal :: Causal a -> [a]
> unCausal (Causal xs) = xs
>
> type Voltage = Float
Background. (If you’re already familiar with signal processing, feel free to skip this section.) One such system models point-to-point communication of voltage samples across an imperfect wire channel. In an ideal world, we would very much like to be able to pretend that any voltage I put into this channel is instantly and perfectly transmitted to the other end of the channel. In practice, we’ll see any number of imperfections, including rise and fall time, delay, ringing and noise. Noise is a party pooper, so we’re going to ignore it for the purposes of this post.
To a first approximation, we can impose the following important conditions on our system:
- Causality. Our wire can’t peek into the future and transmit some voltage before it has even gotten it.
- Time-invariance. Any signal will get the same response whether or not it gets sent now or later.
- Linearity. A simple and useful approximation for wires, which states this mathematical property: if an input x1 results in an output y1, and an input x2 results in an output y2, then the input Ax1 + Bx2 results in the output Ay1 + By2. This also means we get superposition, which is an important technique that we’ll use soon.
When you see a linear time-invariant system, it means that we get to use a favorite mathematical tool, the convolution.
Discrete convolutions. The overall structure of the discretized computation that a channel performs is [Voltage] -> [Voltage]; that is, we put in a sequence of input voltage samples, and get out another sequence of output voltage samples. On the other hand, the discrete convolution is the function calculated by (with variables suggestively named):
(u ∗ f)[n] = Σ_{m=−∞}^{∞} f[m]u[n−m]
It’s not quite obvious why the convolution is the mathematical abstraction we’re looking for here, so we’ll sketch a brief derivation.
One special case of our computation is when the input corresponds to [1, 0, 0, 0 ...], called the unit sample. In fact, due to linearity and time-invariance, the output that our system gives when posed with the unit sample, the unit sample response, precisely specifies the behavior of a system for all inputs: any possible input sequence could be composed of any number of delayed and scaled unit samples, and linearity says we can sum all of the results together to get a result.
A list is actually a function ℕ → a, and we can extend the domain to be over integers if we propose the convention f[n] = 0 for n < 0. Suppose that f[n] represents our input samples varying over time, δ[n] represents a unit sample (δ[0] = 1, δ[n] = 0 for all other n; you’ll commonly see δ[n-t], which is a unit sample at time t), and u[n] represents our unit sample response. Then, we decompose f[n] into a series of unit samples:
f[n] = f[0]δ[n] + f[1]δ[n-1] + ...
and then use linearity to retrieve our response g[n]:
g[n] = f[0]u[n] + f[1]u[n-1] + ...
     = Σ_{m=0}^{∞} f[m]u[n−m]
which looks just like the discrete convolution, just without the -∞ bound. Remember that we defined f[m] = 0 for m < 0, so the two are actually equivalent.
I’d like to linger on that final mathematical definition for a moment, before writing out the equivalent Haskell. We originally stated that the input-response computation had the type [Voltage] -> [Voltage]; however, in our math, we’ve actually defined a function [Voltage] -> Voltage, a channel-specific function that takes all of the inputs up to time n, i.e. f[0]..f[n], and returns a single output g[n]. I’ve written the following definition in a suggestive curried form to reflect this:
> ltiChannel :: [Voltage] -> Causal Voltage -> Voltage
> ltiChannel u = \(Causal f) -> sum $ zipWith (*) (reverse f) u
The unit sample response may be a finite or infinite list; for reasons of efficiency, a finite list is recommended:
> usr :: [Voltage]
> usr = [1,2,5,2,1]
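As a quick sanity check (my own addition, not part of the original derivation): with the two-sample history [1,1], the most recent sample is weighted by u[0] and the previous one by u[1], so the output should be 1*1 + 1*2 = 3.

> -- checkLti is a hypothetical helper, just confirming the arithmetic above
> checkLti :: Bool
> checkLti = ltiChannel usr (causal 1 [1]) == 3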
Comonads. By now, it should be clear what we’ve been working towards: we have ltiChannel usr :: Causal Voltage -> Voltage and we want Causal Voltage -> Causal Voltage. This is precisely the form of computation that the comonad induces! For your convenience, here is the definition of the Copointed and Comonad type classes:
class Functor f => Copointed f where
    extract :: f a -> a

class Copointed w => Comonad w where
    duplicate :: w a -> w (w a)
    extend :: (w a -> b) -> w a -> w b
The Copointed instance is straightforward, but demonstrates why Causal must contain a non-empty list:
> instance Copointed Causal where
>     extract (Causal xs) = head xs
The Comonad instance can be defined using either duplicate or extend; both have default implementations defined in terms of each other. Deriving these default implementations is left as an exercise to the reader; we’ll define both here:
> instance Comonad Causal where
>     extend f  = Causal . map (f . Causal) . tail . inits . unCausal
>     duplicate = Causal . map Causal . tail . inits . unCausal
The intent of the code is somewhat obscured by the unwrapping and wrapping of Causal; for a pure list the instance would look like this:
instance Comonad [] where
    extend f  = map f . tail . inits
    duplicate = tail . inits
The function duplicate really gets to the heart of what this comonad instance does: we take our input list and transform it into a list of histories, each one extending one step further than the last. The tail tags along to drop the first value of inits, which is the empty list. duplicate builds up w (w a), and then the user-supplied function tears it back down to w b (if you think of monads, the lifted user function builds up m (m b), and then join tears it back down to m b).
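To make duplicate concrete, here is a small illustrative value of my own (not from the original post); each element of the result is one of those histories:

> duplicateExample :: Causal (Causal Int)
> duplicateExample = duplicate (causal 1 [2, 3])
> -- Causal [Causal [1], Causal [1,2], Causal [1,2,3]]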
One quick test to make sure it works:
> unitStep :: Causal Voltage
> unitStep = Causal (repeat 1)
>
> result :: Causal Voltage
> result = unitStep =>> ltiChannel usr
and sure enough, the result is:
Causal [1.0, 3.0, 8.0, 10.0, 11.0, 11.0, ...]
=>> is a flipped extend, and the comonadic equivalent of the monadic >>=.
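We can double-check those numbers by hand: since the step input is constantly 1, each output is just a partial sum of the unit sample response, and the output settles at 11 once the whole response is covered. (This check is my addition.)

> -- 1, 1+2, 1+2+5, 1+2+5+2, 1+2+5+2+1, and then 11 forever after
> stepResponseCheck :: Bool
> stepResponseCheck = take 6 (unCausal result) == [1, 3, 8, 10, 11, 11]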
Enforced invariants. Structuring our computation in this form (as opposed to writing the darn convolution out explicitly) gives us some interesting enforced invariants in our code. Our channels need not be linear; I could have squared all of the inputs before convolving them with the unit sample response, and that certainly would not be linear. However, any channel we write must be causal and will usually be time-invariant: it must be causal because we never pass any values from the future to the user function, and it is weakly time-invariant because we don’t explicitly tell the user function how far along the input stream it is. In practice, with our implementation, it could divine this information using length; we could get stronger guarantees by employing a combinator that reverses the list and then appends repeat 0:
> tiChannel :: ([Voltage] -> Voltage) -> Causal Voltage -> Voltage
> tiChannel f (Causal xs) = f (reverse xs ++ repeat 0)
>
> ltiChannel' :: [Voltage] -> Causal Voltage -> Voltage
> ltiChannel' u = tiChannel (\xs -> sum $ zipWith (*) u xs)
u in this case must be finite; an infinite unit sample response would have to be truncated at some point, which also determines how precise our computation should be.
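For example, a hypothetical infinite, exponentially decaying response could be truncated like this, with the cutoff picking the precision:

> usrDecay :: [Voltage]
> usrDecay = take 20 (iterate (/ 2) 1)  -- 1, 0.5, 0.25, ..., cut off at 20 samples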
Open question. The unit sample response has been expressed in our sample code as [Voltage], but it really is Causal Voltage. Unfortunately, the comonad doesn’t seem to specify mechanisms for combining comonadic values the same way the list monad automatically combines the results of computations for each of the values of a list. I’m kind of curious how something like that might work.
February 19, 2010

I present here a few pieces of folklore, well known to those practiced in Haskell, that I’ve found to be really useful techniques when analyzing code whose types don’t seem to make any sense. We’ll build practical techniques for reasoning about types, to be able to derive the type of fmap fmap fmap by ourselves. Note that you could just ask GHCi what the type is, but that would spoil the fun! (More seriously, working out the example by hand, just like a good problem set, helps develop intuition for what might be happening.)
Currying and types. Three type signatures that have superficial similarities are a -> b -> c, (a -> b) -> c and a -> (b -> c). If you don’t have a visceral feel for Haskell’s automatic currying, it can be easy to confuse the three. In this particular case, a -> b -> c, which reads as “takes two arguments a and b and returns c”, is equivalent to a -> (b -> c), read “takes a and returns a function that takes b and returns c”. These are distinct from (a -> b) -> c, read “takes a function of a -> b and returns c”. A visual rule you can apply, in these cases, is that parentheses that are flush with the right side of the type signature can be freely added or removed, whereas any other parentheses cannot be.
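A minimal illustration of the difference, with definitions of my own invention purely for demonstration:

twoArgs :: Int -> Int -> Int        -- the same type as Int -> (Int -> Int)
twoArgs x y = x + y

addX :: Int -> (Int -> Int)         -- identical to the type above
addX x = \y -> x + y

applyToZero :: (Int -> Int) -> Int  -- genuinely different: takes a function
applyToZero f = f 0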
Higher-order functions. If I pass an Int to id :: a -> a, it’s reasonably obvious that id takes the shape of Int -> Int. If I pass a function a -> a to id :: a -> a, id then takes the shape (a -> a) -> a -> a. Personally, I find the overloading of type parameters kind of confusing, so if I have a cadre of functions that I’m trying to derive the type of, I’ll give them all unique names. Since id id is a tad trivial, we’ll consider something a little nastier: (.) (.). Recall that (.) :: (b -> c) -> (a -> b) -> a -> c. We’re not actually going to use those letters for our manipulation: since our expression has two instances of (.), we’ll name the first a and the second b, and we’ll number them from one to three. Then:
(.) :: (a2 -> a3) -> (a1 -> a2) -> a1 -> a3
(.) :: (b2 -> b3) -> (b1 -> b2) -> b1 -> b3
Slightly less aesthetically pleasing, but we don’t have any more conflicting types. The next step is to identify what equivalences are present in the type variables, and eliminate redundancy. Since we’re passing the second (.) to the first (.) as an argument:
(a2 -> a3) == (b2 -> b3) -> (b1 -> b2) -> b1 -> b3
to which you might say, “those function signatures don’t look anything alike!” which leads us to our next point:
Currying and type substitution. If your function’s type is n-ary, and the type you’re trying to match it against is m-ary, curry so that your function is m-ary too! So, if you have a -> b -> c, and you want to pass it as d -> e, then you actually have a -> (b -> c), and thus d == a and e == (b -> c). A curious case is when it’s in the other direction, in which case d -> e is actually restricted to be d -> (e1 -> e2), where e == (e1 -> e2) and the obvious equalities hold.
To go back to our original example, the second (.) would be grouped as such:
(.) :: (b2 -> b3) -> ((b1 -> b2) -> b1 -> b3)
and we achieve the type equalities:
a2 == (b2 -> b3)
a3 == (b1 -> b2) -> b1 -> b3
Now, let’s substitute in these values for the first (.):
(.) :: ((b2 -> b3) -> (b1 -> b2) -> b1 -> b3) ->
       (a1 -> b2 -> b3) -> a1 -> (b1 -> b2) -> b1 -> b3
and drop the first argument, since it’s been applied:
(.) (.) :: (a1 -> b2 -> b3) -> a1 -> (b1 -> b2) -> b1 -> b3
You might be wondering what that monstrous type signature does…
Interpreting type signatures. A great thing about polymorphic types is that there’s not very much non-pathological behavior that can be specified: because the type is fully polymorphic, we can’t actually stick our hand in the box and use the fact that it’s actually an integer. This property makes programs like Djinn, which automatically derive a function’s contents given a type signature, possible, and with a little practice, you can figure it out too.
Working backwards: we first take a look at b3. There’s no way for our function to magically generate a value of type b3 (excluding undefined or bottom, which counts as pathological), so there’s got to be something else among our arguments that generates it. And sure enough, it’s the first argument, but we need to pass it a1 and b2 first:
(.) (.) w x y z = w undefined undefined
We repeat the process for each of those types in turn: where is a1 specified? Well, we pass it in as the second argument. Where is b2 specified? Well, we have another function y :: b1 -> b2, but we need a b1 which is z. Excellent, we now have a full implementation:
(.) (.) w x y z = w x (y z)
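As a quick check of what we just derived, here is a concrete instantiation of my own (easy to verify in GHCi):

(.) (.) (+) 3 (*2) 4
-- = (+) 3 ((*2) 4)
-- = 11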
Pointfree style as operator composition. So, we now know what (.) (.) does, but we don’t really have a good motivation for why this might be the case. (By motivation, I mean, look at (.) (.) taking function composition at face value, and then realizing, “oh yes, it should do that.”) So what we’d really like to focus on is the semantics of (.), namely function composition, and the fact that we’re currying it. One line of thought might be:
- Function composition is defined to be (f . g) x = f (g x).
- We’re partially applying the composition, so actually we have (f.) g x, but g is missing. (If the (f.) looks funny to you, compare it to (2+), which is partially applied addition. Note that addition is commutative, so you’re more likely to see (+2), which becomes (x+2) when applied.) Here f is actually another composition operator. Since function composition is single-argument oriented, we want to focus on the curried version of (.), which takes a function and returns a function (1) that takes another function (2) and a value, and returns the first function applied to the result of the second function applied to the value.
- Read out the arguments. Since (f.) is on the outside, the first argument completes the curry. The next argument is what will actually get passed through the first argument, and the result of that will get passed through f. The return value of that is another function, but (barring the previous discussion) we haven’t figured out what that would be yet. Still, we’ve figured out what the first two arguments might look like.
If we now cheat and look at the type signature, we see our hypotheses are verified:
(.) (.) :: (a1 -> b2 -> b3) -> a1 -> (b1 -> b2) -> b1 -> b3
The first argument g :: a1 -> b2 -> b3 completes the curry, and then the next argument is fed straight into it, so it would have to be a1. The resulting value b2 -> b3 is fed into the next composition operator (notice that it’s not a single variable, since the next composition forces it to be a 1-ary function) and is now waiting for another function to complete the curry, which is the next argument b1 -> b2 (i.e. b1 -> b2 -> b3). Then it’s a simple matter of supplying the remaining arguments.
I find thinking of functions as partially applied and waiting to be “completed” leads to a deeper intuitive understanding of what a complex chain of higher order functions might do.
Putting it together. It is now time to work out the types for fmap fmap fmap. We first write out the types for each fmap:
fmap :: (Functor f) => (a1 -> a2) -> f a1 -> f a2
fmap :: (Functor g) => (b1 -> b2) -> g b1 -> g b2
fmap :: (Functor h) => (c1 -> c2) -> h c1 -> h c2
Perform application and we see:
(a1 -> a2) == (b1 -> b2) -> g b1 -> g b2
f a1 == (c1 -> c2) -> h c1 -> h c2
Luckily, we have enough arguments to fill up the first fmap, so that’s one less layer of complexity. We can also further break these down:
-- from the first argument
a1 == b1 -> b2
a2 == g b1 -> g b2
-- from the second argument
a1 == h c1 -> h c2
f == (->) (c1 -> c2)
The last equality stems from the fact that there’s only one reasonable functor instance for (c1 -> c2) -> h c1 -> h c2; the functor for functions i.e. the reader monad, taking (c1 -> c2) as its “read-in”.
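For reference, that instance is the one from the standard libraries, and its fmap is just function composition:

instance Functor ((->) r) where
    fmap = (.)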
We can do a few more simplifications:
h c1 -> h c2 == b1 -> b2
b1 == h c1
b2 == h c2
Substitute everything in, and now we see:
fmap fmap fmap :: (Functor g, Functor h) =>
                  (c1 -> c2) -> g (h c1) -> g (h c2)
Interpret the types and we realize that fmap fmap fmap does a “double” lift of a function c1 -> c2 to two functors. So we can run fmap fmap fmap (+2) [Just 3] and get back [Just 5] (utilizing the functor instance for the outer list and the inner maybe).
We also notice that the f functor dropped out; this is because it was forced to a specific form, so really fmap fmap fmap == fmap . fmap. This makes it even more obvious that we’re doing a double lift: the function is fmap‘ed once, and then the result is fmap‘ed again.
We can even use this result to figure out what (.) (.) (.) (or (.) . (.)) might do; for the function functor, fmap = (.), so a normal function is lifted into one reader context by the first fmap, and another reader context by the second fmap. So we’d expect (.) . (.) :: (a -> b) -> (r2 -> r1 -> a) -> (r2 -> r1 -> b) (recall that f a if f = (->) r becomes r -> a) and indeed, that is the case. Compose composed with compose is merely a compose that can take a 2-ary function as its second argument and “do the right thing!”
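And a quick concrete check of that, again my own example:

((.) . (.)) (+1) (+) 2 3
-- = (+1) ((+) 2 3)
-- = 6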
February 17, 2010

I was having a discussion a few nights ago about attention, and someone mentioned the fact that contiguous blocks of time are precious. It’s obvious once it’s been pointed out to you, and with it in mind I’ve noticed my tendency to bucket useful activities into various categories: checking email, reading news feeds and simple tasks fall into the “less than an hour” time bucket, while really actually creating software, fixing hard bugs or reading code fall into the “more than an hour, preferably several hours” time bucket. I’ve recognized that attempting to tackle the “more than an hour” bucket in the snatches of time between classes and extracurriculars is simply wasteful.
But unlike the programmer working a day job that Paul Graham describes in his essay, I am a college student. I mostly don’t do my work during the day; that’s the time to imbibe information from lectures and recitations. In particular, I’m an MIT student, which means that there’s no 5:00 “oh, time to go home and relax” period; it’s work all the time (and you grab precious moments of relief with flights of procrastination when you can.) My mornings and evenings are saturated by meetings: they require me to physically relocate myself and pay attention to something else on someone else’s schedule. And throw in orchestra rehearsal from 7:30-10PM or small ensemble rehearsal from 5-7PM or SIPB meetings from 7:30-8:30PM and you’ve got a recipe for fragmentation.
When can you get that uninterrupted time? Only late at night, because when you are working on that problem set at two in the morning, there is one good thing: no one has scheduled a meeting at 2:30 AM. And while it’s definitely a shame that these benefits may be outweighed by the intoxication of sleep deprivation, you’re not going to find this sort of contiguous block of time anywhere else. And that’s why we stay up late.
February 15, 2010

> import Data.List
> import Control.Monad
The following question appeared as part of a numbers-based puzzle in the 2010 MIT Mystery Hunt:
His final level on Wizard of Wor was equal to the smallest number that can be written as the sum of 4 non-zero squares in exactly 9 ways
I’d like to explore constraint search in Haskell to solve this problem. The hope is to find a (search) program that directly reflects the problem as posed, and also gives us an answer!
Because we are looking for the smallest number, it makes sense to start testing from a small number and start counting up. We’ll assume that the answer to this question won’t overflow Int.
Now, we need to test if it can be written as the sum of 4 non-zero squares in exactly 9 ways. This problem reduces to “how many ways can n be written as the sum of squares”, which is another search problem.
We’ll assume that 4+1+1+1 and 1+4+1+1 don’t count as distinct for the purposes of our nine ways. This results in the first piece of cleverness: if we impose an ordering on our roots (each at least as large as the previous), we once again get uniqueness.
We also need to bound our search space; while fair search can help to some degree with infinite failure, our implementation will be much simpler if we can do some early termination. A very simple condition to terminate on is if the sum of the squares exceeds the number we’re looking for.
Consider the case where we are matching for x, and we have candidate roots a, b and c. Then, the maximum the remaining square can be is x - a^2 - b^2 - c^2, and the maximum value for d is the floor of its square root. Square roots are cheap, and we’re using machine-size integers, so things are good.
> floorSqrt :: Int -> Int
> floorSqrt = floor . sqrt . fromIntegral
>
> sumSquares :: [Int] -> Int
> sumSquares as = sum (map (^2) as)
>
> rootMax :: Int -> [Int] -> Int
> rootMax x as = floorSqrt (x - sumSquares as)
From there, we just write out the search for distinct sums of squares of a number:
> searchSumFourSquares :: Int -> [(Int, Int, Int, Int)]
> searchSumFourSquares x = do
> a <- [1..(rootMax x [])]
> b <- [a..(rootMax x [a])]
> c <- [b..(rootMax x [a,b])]
> d <- [c..(rootMax x [a,b,c])]
> guard $ sumSquares [a,b,c,d] == x
> return (a,b,c,d)
And from there, the solution falls out:
> search :: Maybe Int
> search = findIndex (==9) (map (length . searchSumFourSquares) [0..])
We cleverly use [0..] so that the index is the same as the number itself. Alternative methods might use tuples.
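For completeness, here is a main action that runs the search; this is my addition, and I leave the numeric answer to the program:

> main :: IO ()
> main = print search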
February 12, 2010

Environments are first-class objects in MIT/GNU Scheme. This is neat, because an integral part of the lexical structure of a Scheme is also a data-structure in its own right, able to encode data and behavior. In fact, the environment data structure is precisely what Yegge calls property lists, maps that can be linked up with inheritance. So not only is it a data structure, it’s a highly general one too.
Even without the ability to pass around an environment as a first-class variable, you can still leverage its syntactic ties to stash away local state in an object-oriented manner. The data is only accessible inside procedures that point to the appropriate environment frame, and traditionally the closure returns a lambda (with its double-bubble pointed to the newly born environment frame) that acts as the only interface into the enclosed state. This requires a fair amount of boilerplate, since this lambda has to support every possible operation you might want to do to the innards of the function.
With a first-class environment object, however, you can futz around with the closure’s bindings arbitrarily. Unfortunately, there’s no way to directly get-current-environment (except for the top-level REPL environment, which doesn’t count), so we resort to the following trick:
(procedure-environment (lambda () '()))
procedure-environment can grab the environment pointer from the double-bubble for some procedure. So, we force the creation of a double-bubble pointing to the environment we care about with the empty lambda.
I most recently used this technique in a 6.945 project, in which I used a lambda to generate a bunch of procedures with various parameters swapped out (encouraging code-reuse), something akin to the time-honored trick of including C files multiple times with different macro definitions. Instead of returning these procedures as a hash-table which then people would have to explicitly call, I just returned the environment, and thus any consumer could enter “a different universe” by using an appropriate environment.
Scheme is pretty unique in its celebration of environments as first-class objects. I could try this trick in Python, but the func_closure attribute on functions is read-only, and also Python has some pretty lame scoping rules. A shame, since this technique allows for some lovely syntactic simplifications.
Comment. Oleg Kiselyov writes in to mention that “*MIT Scheme* specifically is unique in its celebration of environments as first class objects,” and notes that even some developers of MIT Scheme have second thoughts about the feature. It makes code difficult to optimize, and is both theoretically and practically dangerous: theoretically dangerous since environments are really an implementation detail, and practically dangerous because it makes it very difficult to reason about code.
From in-person discussions, I’m not surprised that Sussman’s favored dialect of Scheme allows such a dangerous feature; Sussman has always been in favor of letting people have access to dangerous toys and trusting them to use them correctly.
February 10, 2010

I love listening to music, especially new and interesting pieces that I’ve never heard before. Unfortunately, being a somewhat frugal spender my own personal collection of music grows very slowly; perhaps a bit too slowly for my own tastes. When I need new music, where do I turn?
- MixApp is a collaborative music listening application. At its worst, it can be simply used as an extension to your current music library: you can search for music in the library of anyone who is your friend and who is online, and queue it up for yourself. However, the serendipitous part of MixApp is when you’ve dropped into a room of people you don’t know and music you don’t know, but man, it sounds good and suddenly you’re being taken on a sonic adventure across artists you’ve never heard of and a genre you’ve just discovered and wham: you just got MixApp’ed. Update: MixApp is dead (the founders went on to build Meteor), though there are replacements popping up like turntable.fm
- Pandora and last.fm are both pretty reliable methods to get a stream of genre appropriate singles, one after another. The serendipity level is not as nice as MixApp, though, so I don’t find myself turning to these much.
- There’s not really much that can beat a good radio host. People like David Garland and John Schaefer have such a diverse and deep palette of musical knowledge, and they’ve had every evening for who knows how many years to hone the craft of sharing this with the listeners of public radio. I was very pleased when WQXR finally managed to get a high-quality internet stream back online.
- I was room-skipping on MixApp one evening, and was caught by the Kleptones’ latest album Uptime/Downtime. I have nothing against mix artists: the whole tradition of music has been founded upon the borrowing, stealing, and building upon of earlier work, and in many cases, an adept mix artist can improve the “popular music” material it was founded upon. Or sometimes the source material is just really awesome, and should be listened to in its own right: one of the most interesting musical adventures I’ve had recently was taking the samples list for Uptime/Downtime and listening to each source piece in turn.
- Rehearsal with an orchestra, wind ensemble, small ensemble, or really any type of ensemble affords several months to get intimately familiar with a particular piece of music. I would never have gotten the chance to fully appreciate contemporary works such as Bells for Stokowski or Persichetti’s Masquerade for Band without this really in-depth exploration into a piece.
I should consider myself extremely lucky to be living in an era where new music is constantly at my fingertips. How do you seek out new and interesting music?
February 8, 2010

Editorial. Today we interrupt our regularly scheduled programming to bring you a guest post by Oleg Kiselyov, reinterpreting our previous post about nested loops and continuations with exceptions.
Hello!
I noticed your recent article about nested loops and continuations. I should have commented on it using the provided form, but I was not sure how formatting would come out. The comment includes a lot of code. Please feel free to post the code in whole or in part, or do anything else with it.
The thesis of my comment is that callCC is not necessary for the implementation of break and continue in single and nested loops. We observe that the continuations of each iteration and of the entire loop are invoked either 0 or 1 time (but never more than once). That is the pattern of exceptions. So, the problem posed by your article can be solved with exceptions. Here are several variations of the solution.
First, a few preliminaries: this message is the complete literate Haskell code.
> import Prelude hiding (break, catch)
> import Control.Monad
> import Control.Monad.Trans
Alas, ErrorT in Control.Monad.Error has the stupid Error constraint. So, we have to write our own Exception monad transformer. The code below is standard.
> newtype ExT e m a = ExT{unExT :: m (Either e a)}
>
> instance Monad m => Monad (ExT e m) where
>     return = ExT . return . Right
>     m >>= f = ExT $ unExT m >>= either (return . Left) (unExT . f)
>
> instance MonadTrans (ExT e) where
>     lift m = ExT $ m >>= return . Right
>
> instance MonadIO m => MonadIO (ExT e m) where
>     liftIO m = ExT $ liftIO m >>= return . Right
>
> raise :: Monad m => e -> ExT e m a
> raise = ExT . return . Left
>
> catch :: Monad m => ExT e m a -> (e -> ExT e' m a) -> ExT e' m a
> catch m h = ExT $ unExT m >>= either (unExT . h) (return . Right)
>
> runExT :: Monad m => ExT e m a -> m a
> runExT m = unExT m >>= either (const $ fail "Unhandled exc") return
We are ready to code the first solution, for simple, non-nested loops. The idea is to treat ‘break’ and ‘continue’ as exceptions. After all, both control operators cause computations to be skipped—which is what exceptions do. We define the datatype of our ’exceptions’:
> data BC = Break | Cont
>
> break, continue :: Monad m => ExT BC m a
> break = raise Break
> continue = raise Cont
Here is the code for the loop: it catches exceptions at some points:
> for_in :: Monad m => [a] -> (a -> ExT BC m ()) -> m ()
> for_in xs f = runExT $ mapM_ iter xs `catch` hbreak
>   where
>     iter x = catch (f x) hcont
>     hcont Cont = return ()    -- handle Cont, re-raise Break
>     hcont e    = raise e
>     hbreak Break = return ()
>     hbreak Cont  = return ()  -- Shouldn't happen actually
Here is your test:
> loopLookForIt1 :: IO ()
> loopLookForIt1 =
>     for_in [0..100] $ \x -> do
>         when (x `mod` 3 == 1) $ continue
>         when (x `div` 17 == 2) $ break
>         lift $ print x
Running it:
> tf1 = loopLookForIt1 :: IO ()
prints 23 numbers starting with 0, 2, 3 and ending with 30, 32, 33.
We have to generalize to nested loops. Two solutions are apparent. I would call the first one ‘dynamic’. We index the exceptions by levels, which are natural numbers. Level 0 pertains to the current loop, level 1 is for the parent loop, etc.
> data BCN = BCN BC Int -- Add the level of breaking
Operators break and continue now take a number: how many loop scopes to break out of (0 meaning the current loop). I think Perl has a similar breaking-with-number operator.
> breakN, continueN :: Monad m => Int -> ExT BCN m a
> breakN    = raise . BCN Break
> continueN = raise . BCN Cont
The new iterator:
> for_inN :: Monad m => [a] -> (a -> ExT BCN m ()) -> ExT BCN m ()
> for_inN xs f = mapM_ iter xs `catch` hbreak
>   where
>     iter x = catch (f x) hcont
>     hcont (BCN Cont 0) = return ()  -- handle Cont, re-raise Break
>     hcont e            = raise e
>     -- If the exception is for a parent, re-raise it, decrementing its level
>     hbreak (BCN Break 0) = return ()
>     hbreak (BCN Cont 0)  = return ()  -- Shouldn't happen actually
>     hbreak (BCN exc n)   = raise (BCN exc (n-1))
The single-loop test now looks as follows.
> loopLookForItN :: ExT BCN IO ()
> loopLookForItN =
>     for_inN [0..100] $ \x -> do
>         when (x `mod` 3 == 1) $ continueN 0
>         when (x `div` 17 == 2) $ breakN 0
>         lift $ print x
>
> tfN = runExT loopLookForItN :: IO ()
We can now write the nested loop test. I took the liberty of enhancing the example in your article so as to exercise all cases:
> loopBreakOuter1 :: ExT BCN IO ()
> loopBreakOuter1 =
>     for_inN [1,2,3] $ \x -> do
>         lift $ print x
>         for_inN [4,5,6] $ \y -> do
>             lift $ print y
>             when (y == 4) $ continueN 0
>             when (x == 1) $ breakN 0
>             when (x == 3) $ breakN 1
>             when (y == 5) $ continueN 1
>             breakN 1
>         lift $ print x
>
> tbN1 = runExT loopBreakOuter1 :: IO ()
The result is the sequence of numbers: 1 4 5 1 2 4 5 3 4 5
There exists another solution for the nested-loop problem, which I call ‘static’. What if we just iterate the single-loop solution? We can nest ExT BC monad transformers to any given depth. To refer to a particular layer in the transformer stack, we use lift. We can use the for_in iterator and the operators break, continue defined earlier. We write the nested test as follows:
> loopBreakOuterS1 :: IO ()
> loopBreakOuterS1 =
>     for_in [1,2,3] $ \x -> do
>         liftIO $ print x
>         for_in [4,5,6] $ \y -> do
>             liftIO $ print y
>             when (y == 4) $ continue
>             when (x == 1) $ break
>             when (x == 3) $ lift $ break
>             when (y == 5) $ lift $ continue
>             lift $ break
>         liftIO $ print x
>
> tbS1 = loopBreakOuterS1 :: IO ()
I guess the lesson here might be that callCC is often not needed (I would argue that callCC is never needed, but that’s the argument for another time). Here is another example of simple exceptions sufficing where call/cc was thought to be required:
http://okmij.org/ftp/Computation/lem.html
Cheers,
Oleg
February 5, 2010

The bread and butter of an imperative programmer is the loop. Coming from a C/assembly perspective, a loop is simply a structured goto which jumps back to a set location if some condition is not met. Frequently, this loop ranges over the elements of some list data structure. In C, you might be doing pointer arithmetic over the elements of an array or following pointers on a linked list until you get NULL; in Python and other higher-level languages you get the for x in xs construct which neatly abstracts this functionality. Inside of a loop, you also have access to the flow control operators break and continue, which are also highly structured gotos. An even more compact form of loops and nested loops are list comprehensions, which don’t permit those flow operators.
Haskell encourages you to use the higher order forms such as map and fold, which even further restrict what may happen to the data. You’ll certainly not see a for loop anywhere in Haskell… However, as a pernicious little exercise, and also a way to get a little more insight into what callCC might be good for, I decided to implement for...in loops with both the continue and break keywords. The end hope is to be able to write code such as:
import Prelude hiding (break)
import Control.Monad
import Control.Monad.Cont

loopLookForIt :: ContT () IO ()
loopLookForIt =
    for_in [0..100] $ \loop x -> do
        when (x `mod` 3 == 1) $ continue loop
        when (x `div` 17 == 2) $ break loop
        lift $ print x
as well as:
loopBreakOuter :: ContT () IO ()
loopBreakOuter =
    for_in [1,2,3] $ \outer x -> do
        for_in [4,5,6] $ \inner y -> do
            lift $ print y
            break outer
        lift $ print x
the latter solving the classic “nested loops” problem by explicitly labeling each loop. We might run these pieces of code using:
runContT loopBreakOuter return :: IO ()
Since continuations represent, well, “continuations” to the program flow, we should have some notion of a continuation that functions as break, as well as a continuation that functions as continue. We will store the continuations that correspond to breaking and continuing inside a loop “label”, which is the first argument of our hanging lambda:
data (MonadCont m) => Label m = Label {
    continue :: m (),
    break    :: m ()
}
It’s sufficient then to call continue label or break label inside the monad to extract and follow the continuation.
The next bit is to implement the actual for_in construct. If we didn’t have to supply any of the continuations, this is actually just a flipped mapM_:
for_in' :: (Monad m) => [a] -> (a -> m ()) -> m ()
for_in' xs f = mapM_ f xs
Of course, in our sample code, f has the type Label m -> a -> m (), so this won’t do! Consider this first transformation:
for_in'' :: (MonadCont m) => [a] -> (a -> m ()) -> m ()
for_in'' xs f = callCC $ \c -> mapM_ f xs
This function does the same thing as for_in', but we placed it inside the continuation monad and made explicit a variable c. What does the current continuation c correspond to in this context? Well, it’s in the very outer context, which means the “current continuation” is completely out of the loop. That must mean it’s the break continuation. Cool.
Consider this second alternative transformation:
for_in''' :: (MonadCont m) => [a] -> (a -> m ()) -> m ()
for_in''' xs f = mapM_ (\x -> callCC $ \c -> f x) xs
This time, we’ve replaced f with a wrapper lambda that uses callCC before actually calling f, and the current continuation results in the next step of mapM_ being called. This is the continue continuation.
All that remains is to stick them together, and package them into the Label datatype:
for_in :: (MonadCont m) => [a] -> (Label m -> a -> m ()) -> m ()
for_in xs f = callCC $ \breakCont ->
mapM_ (\x -> callCC $ \continueCont -> f (Label (continueCont ()) (breakCont ())) x) xs
Et voila! Imperative looping constructs in Haskell. (Not that you’d ever want to use them, nudge nudge wink wink.)
Addendum. Thanks to Nelson Elhage and Anders Kaseorg for pointing out a stylistic mistake: storing the continuations as () -> m () is unnecessary because Haskell is a lazy language (in my defense, the imperative paradigm was leaking in!)
Addendum 2. Added type signatures and code for running the initial two examples.
Addendum 3. Sebastian Fischer points out a mistake introduced by addendum 1. That’s what I get for not testing my edits!
February 3, 2010

And so classes begin this Spring Term of 2010. The classes that I am currently signed up to take are:
- 6.005: Software Construction
- 6.02: Intro to EECS II
- 6.045: Automata, Computing and Complexity
- 6.945: Large-scale Symbolic Systems
- 21M.283: Musicals of Stage and Screen
6.945 is the “fun” class of the semester; I expect to have to sink a lot of time into it and get a lot out of it in return. 6.005 and 6.02 are strictly being taken because my degree requires it (I’ve scheduled 6.02 as a conflict class on top of 6.945, so I’ll probably have to do a little more legwork to make sure I get all the material for that class.) 6.045 is my mathematical class for the semester; no pure course 18 class for me, unfortunately! And on the advice of Robert Jacobs, 21M.283 is my HASS-ish class (I’m quite pleased that I’ve gotten the HASS-D requirement out of the way).
Among the classes I’m not taking this term: 6.006 (goodness knows I do need the algorithms knowledge), 7.01X (punt punt punt) and 6.824 (sounds like lots of fun, but would result in a triple conflict, which I’m not willing to do.)
February 1, 2010

A classic stylistic tip given to C programmers is that inline functions should be preferred over macros, when possible. This advice stems from the fact that a macro and an inline function can achieve the same effect, but the inline function also gets type checking.
As it turns out, you can achieve static type checking with macros, if you’re willing to resort to the same cute trick that this following snippet from the Linux kernel uses:
#define module_param_named(name, value, type, perm) \
    param_check_##type(name, &(value)); \
    module_param_call(name, param_set_##type, param_get_##type, &value, perm); \
    __MODULE_PARM_TYPE(name, #type)
Hmm… I wonder what that param_check_##type call is all about. Digging through a few more macro definitions, we see:
#define __param_check(name, p, type) \
    static inline type *__check_##name(void) { return(p); }
So there you go. A throw-away inline function named __check_##name enforces that p is the same type as type. A comment is also given, explaining what’s going on:
/* The macros to do compile-time type checking stolen from Jakub
Jelinek, who IIRC came up with this idea for the 2.4 module init code. */