Haskell : ezyang's blog

Haskell

On type synonyms June 15, 2011

I recently had to remove a number of type synonyms from the GHC code base which were along the lines of type CmmActuals = [CmmActual]. The process made me wonder a little about when type synonyms are appropriate for Haskell code. The Wikibooks article says type synonyms are “for making the roles of types clearer or providing an alias to, for instance, a complicated list or tuple type” and Learn You a Haskell says they “make more sense to someone reading our code and documentation.” But under what circumstances is this actually true?

Tail recursion makes your loops cleaner May 23, 2011

Recursion is one of those things that functional programming languages shine at—but it seems a bit disappointing that in many cases, you have to convert your beautiful recursive function back into iterative form. After all, iteration is what imperative languages do best, right?

Actually, explicitly tail-recursive functions in functional programming languages can be fairly beautiful: in fact, in the cases of complicated loops, they can be even prettier than their imperative counterparts. Take this midpoint line-drawing algorithm as an example:

Reified laziness May 13, 2011

Short post, longer ones in progress.

One of the really neat things about the Par monad is how it explicitly reifies laziness, using a little structure called an IVar (also known in the literature as I-structures). An IVar is a little bit like an MVar, except that once you’ve put a value in one, you can never take it out again (and you’re not allowed to put another value in.) In fact, this precisely corresponds to lazy evaluation.

Calling all space leaks May 11, 2011

I’m currently collecting non-stack-overflow space leaks, in preparation for a future post in the Haskell Heap series. If you have any interesting space leaks, especially if they’re due to laziness, send them my way.

Here’s what I have so far (unverified: some of these may not leak or may be stack overflows. I’ll be curating them soon).

import Control.Concurrent.MVar

-- http://groups.google.com/group/fa.haskell/msg/e6d1d5862ecb319b
main1 = do file <- getContents
           putStrLn $ show (length $ lines file) ++ " " ++
                      show (length $ words file) ++ " " ++
                      show (length file) 

-- http://www.haskell.org/haskellwiki/Memory_leak
main2 = let xs = [1..1000000::Integer]
        in print (sum xs * product xs)

-- http://hackage.haskell.org/trac/ghc/ticket/4334
leaky_lines                   :: String -> [String]
leaky_lines ""                =  []
leaky_lines s                 =  let (l, s') = break (== '\n') s
                                 in  l : case s' of
                                              []      -> []
                                              (_:s'') -> leaky_lines s''

-- http://stackoverflow.com/questions/5552433/how-to-reason-about-space-complexity-in-haskell
data MyTree = MyNode [MyTree] | MyLeaf [Int]

makeTree :: Int -> MyTree
makeTree 0 = MyLeaf [0..99]
makeTree n = MyNode [ makeTree (n - 1)
                  , makeTree (n - 1) ]

count2 :: MyTree -> MyTree -> Int
count2 r (MyNode xs) = 1 + sum (map (count2 r) xs)
count2 r (MyLeaf xs) = length xs

-- http://stackoverflow.com/questions/2777686/how-do-i-write-a-constant-space-length-function-in-haskell
leaky_length xs = length' xs 0
  where length' [] n = n
        length' (x:xs) n = length' xs (n + 1)

-- http://stackoverflow.com/questions/3190098/space-leak-in-list-program
leaky_sequence [] = [[]]
leaky_sequence (xs:xss) = [ y:ys | y <- xs, ys <- leaky_sequence xss ]

-- http://hackage.haskell.org/trac/ghc/ticket/917
initlast :: (()->[a]) -> ([a], a)
initlast xs = (init (xs ()), last (xs ()))

main8 = print $ case initlast (\()->[0..1000000000]) of
                 (init, last) -> (length init, last)

-- http://hackage.haskell.org/trac/ghc/ticket/3944
waitQSem :: MVar (Int,[MVar ()]) -> IO ()
waitQSem sem = do
   (avail,blocked) <- takeMVar sem
   if avail > 0 then
     putMVar sem (avail-1,[])
    else do
     b <- newEmptyMVar
     putMVar sem (0, blocked++[b])
     takeMVar b

-- http://hackage.haskell.org/trac/ghc/ticket/2607
data Tree a = Tree a [Tree a] deriving Show
data TreeEvent = Start String
                | Stop
                | Leaf String
                deriving Show
main10 = print . snd . build $ Start "top" : cycle [Leaf "sub"]
type UnconsumedEvent = TreeEvent        -- Alias for program documentation
build :: [TreeEvent] -> ([UnconsumedEvent], [Tree String])
build (Start str : es) =
        let (es', subnodes) = build es
            (spill, siblings) = build es'
        in (spill, (Tree str subnodes : siblings))
build (Leaf str : es) =
        let (spill, siblings) = build es
        in (spill, Tree str [] : siblings)
build (Stop : es) = (es, [])
build [] = ([], [])

Unraveling the mystery of the IO monad May 4, 2011

When we teach beginners about Haskell, one of the things we handwave away is how the IO monad works. Yes, it’s a monad, and yes, it does IO, but it’s not something you can implement in Haskell itself, giving it a somewhat magical quality. In today’s post, I’d like to unravel the mystery of the IO monad by describing how GHC implements the IO monad internally in terms of primitive operations and the real world token. After reading this post, you should be able to understand the resolution of this ticket as well as the Core output of this Hello World! program:

Haskell: Not pure enough? May 2, 2011

It is well known that unsafePerformIO is an evil tool by which impure effects can make their way into otherwise pristine Haskell code. But is the rest of Haskell really that pure? Here are a few questions to ask:

What is the value of maxBound :: Int?
What is the value of \x y -> x / y == (3 / 7 :: Double) with 3 and 7 passed in as arguments?
What is the value of os :: String from System.Info?
What is the value of foldr (+) 0 [1..100] :: Int?

The answers to each of these questions are ill-defined—or you might say they’re well defined, but you need a little extra information to figure out what the actual result is.

Tracing the compilation of Hello Factorial! April 13, 2011

It is often said that the factorial function is the “Hello World!” of the functional programming language world. Indeed, factorial is a singularly useful way of testing the pattern matching and recursive facilities of FP languages: we don’t bother with such “petty” concerns as input-output. In this blog post, we’re going to trace the compilation of factorial through the bowels of GHC. You’ll learn how to read Core, STG and Cmm, and hopefully get a taste of what is involved in the compilation of functional programs. Those who would like to play along with the GHC sources can check out the description of the compilation of one module on the GHC wiki. We won’t compile with optimizations to keep things simple; perhaps an optimized factorial will be the topic of another post!

Bugs from using the wrong variable April 11, 2011

I was supposed to have another post about Hoopl today, but it got scuttled when an example program I had written triggered what I think is a bug in Hoopl (if it’s not a bug, then my mental model of how Hoopl works internally is seriously wrong, and I ought not write about it anyway.) So today’s post will be about the alleged bug Hoopl was a victim of: bugs from using the wrong variable.

Killer mutants attack (mutation gone wrong) March 9, 2011

This is a collection of WTFs due to misuse of mutable state.

We’ll start off with some Java. What do you expect this snippet of code to do? :

Sensor Accel = sm.getDefaultSensor(Sensor.TYPE_ACCELEROMETER);
Sensor Orient = sm.getDefaultSensor(Sensor.TYPE_ORIENTATION);
sm.registerListener((SensorEventListener) this, Accel, sm.SENSOR_DELAY_FASTEST);

Ostensibly, it registers the current object to receive just accelerometer updates. But what if I told you getDefaultSensor was implemented like this:

public Sensor getDefaultSensor (int type){
        if(sensors == null) {
                sensors = new Sensor(mContext, type);
                return sensors;                 
        }else if(sensors.checkList(type)){
                sensors.addSensor(type);
                return sensors;
        }else{
                sensors.removeSensor(type);
                return sensors;
        }
}

This code completely fails to manage the expected semantics: there is a single sm wide Sensor object (stored in sensors) that accumulates sensor values as getDefaultSensor is called. So in fact, this will receive events from both the accelerometer and the magnetometer. The only saving grace is that when we register event listeners, we usually do want them to receive all events, so we might not notice if we weren’t looking too closely. This is real code from OpenIntents SensorSimulator.

Type Technology Tree March 7, 2011

They say that one doesn’t discover advanced type system extensions: rather, the type system extensions discover you! Nevertheless, it’s worthwhile to know what the tech tree for GHC’s type extensions are, so you can decide how much power (and the correspondingly headache inducing error messages) you need. I’ve organized the relations in the following diagram with the following criterion in mind:

Some extensions automatically enable other extensions (implies);
Some extensions offer all the features another extension offers (subsumes);
Some extensions work really nicely with other extensions (synergy);
Some extensions offer equivalent (but differently formulated) functionality to another extension (equiv).

It’s also worth noting that the GHC manual divides these extensions into “Extensions to data types and type synonyms”, “Class and instances declarations”, “Type families” and “Other type system extensions”. I have them organized here a little differently.

« Newer Posts Older Posts »