Recursion is one of those things that functional programming languages shine at—but it seems a bit disappointing that in many cases, you have to convert your beautiful recursive function back into iterative form. After all, iteration is what imperative languages do best, right?
Actually, explicitly tail-recursive functions in functional programming languages can be fairly beautiful: in the case of complicated loops, they can be even prettier than their imperative counterparts. Take this midpoint line-drawing algorithm as an example:
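To give a flavor of what such a loop looks like, here is a sketch of my own (a reconstruction of the midpoint/Bresenham algorithm for the first octant, not necessarily the code from the post), written as an explicitly tail-recursive loop with an accumulator:

-- Midpoint line drawing for 0 <= dy <= dx, as an explicit tail-recursive loop.
drawLine :: (Int, Int) -> (Int, Int) -> [(Int, Int)]
drawLine (x0, y0) (x1, y1) = go x0 y0 (2 * dy - dx) []
  where
    dx = x1 - x0
    dy = y1 - y0
    -- d is the midpoint decision variable; acc collects plotted points.
    go x y d acc
      | x > x1    = reverse acc
      | d > 0     = go (x + 1) (y + 1) (d + 2 * (dy - dx)) ((x, y) : acc)
      | otherwise = go (x + 1) y       (d + 2 * dy)        ((x, y) : acc)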
Read more...
This is an addendum to my second example in Anatomy of a thunk leak, in which I’d like to propose another solution to the space leak, which involves computing the composition of all of these thunks. This solution is particularly notable because it preserves the denotation of the original function, that is, that f l (undefined, undefined) = (undefined, undefined). This should be surprising, because I claimed that it would be impossible for GHC to optimize a function that had this denotation into one without the space leak by more eagerly evaluating some thunks. There is no contradiction: the optimization we would like to apply here is one of partial evaluation. Didn’t understand that? Don’t worry, a concrete example is coming soon.
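To fix ideas before the full post, here is a reconstruction of my own (not necessarily the post’s exact code) of the kind of function under discussion:

-- The loop threads a pair of accumulators lazily, so unevaluated (+) thunks
-- pile up in both components of the pair.
f :: [Int] -> (Int, Int) -> (Int, Int)
f []       c = c
f (x : xs) c = f xs (fst c + x, snd c + x)

-- Because neither component is ever forced inside the loop,
-- f l (undefined, undefined) denotes (undefined, undefined) for any finite l;
-- a variant that eagerly evaluated the components would turn the whole
-- result into undefined, changing the denotation.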
Read more...
In this post, we discuss the characteristics of a thunk leak, the leak that has come to symbolize the difficulties of “reasoning about space usage” in Haskell. I’ll consider a few examples of this type of leak and argue that these leaks are actually trivial to fix; the real difficulty arises when a thunk leak gets confused with other types of leaks (which we will cover in later posts).
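The canonical instance, together with its one-line fix, looks something like this (a minimal sketch of my own, not one of the post’s examples):

import Data.List (foldl')

-- Without optimizations, foldl builds a chain of (+) thunks as long as the
-- list before anything is forced: a thunk leak.
leaky :: Integer
leaky = foldl (+) 0 [1 .. 10000000]

-- The trivial fix: force the accumulator at each step.
fixed :: Integer
fixed = foldl' (+) 0 [1 .. 10000000]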
Description
I’ll be describing the various leaks in two ways: I will first give an informal, concrete description using the metaphor I developed in the Haskell Heap series, and then I will give a more direct, clinical treatment at the end. If you can’t stand one form of explanation or the other, feel free to skip around.
Read more...
A big thanks to everyone who sent in space leak specimens. All of the leaks have been inspected and cataloged by our experts, and we are quite pleased to open the doors of the space leak zoo to the public!
There are a few different types of space leak here, and a visitor would do well not to confuse them: the methods for handling each if encountered in the wild vary, and using the wrong technique could exacerbate the situation.
Read more...
Short post, longer ones in progress.
One of the really neat things about the Par monad is how it explicitly reifies laziness, using a little structure called an IVar (known in the literature as an I-structure). An IVar is a little bit like an MVar, except that once you’ve put a value in one, you can never take it out again (and you’re not allowed to put another value in). In fact, this precisely corresponds to lazy evaluation.
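Here is a tiny sketch of the IVar interface in action, using the Control.Monad.Par API from the monad-par package (my own example, not the post’s):

import Control.Monad.Par  -- from the monad-par package

-- Each IVar is written exactly once by a forked computation; 'get' blocks
-- until that single write has happened, much like demanding a thunk.
parSum :: Int
parSum = runPar $ do
  a <- new
  b <- new
  fork (put a (sum [1 .. 1000000 :: Int]))
  fork (put b (sum [1000001 .. 2000000 :: Int]))
  x <- get a
  y <- get b
  return (x + y)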
Read more...
I’m currently collecting non-stack-overflow space leaks, in preparation for a future post in the Haskell Heap series. If you have any interesting space leaks, especially if they’re due to laziness, send them my way.
Here’s what I have so far (unverified: some of these may not leak or may be stack overflows. I’ll be curating them soon).
import Control.Concurrent.MVar

-- http://groups.google.com/group/fa.haskell/msg/e6d1d5862ecb319b
main1 = do file <- getContents
           putStrLn $ show (length $ lines file) ++ " " ++
                      show (length $ words file) ++ " " ++
                      show (length file)

-- http://www.haskell.org/haskellwiki/Memory_leak
main2 = let xs = [1..1000000::Integer]
        in print (sum xs * product xs)

-- http://hackage.haskell.org/trac/ghc/ticket/4334
leaky_lines :: String -> [String]
leaky_lines "" = []
leaky_lines s  = let (l, s') = break (== '\n') s
                 in l : case s' of
                          []        -> []
                          (_ : s'') -> leaky_lines s''

-- http://stackoverflow.com/questions/5552433/how-to-reason-about-space-complexity-in-haskell
data MyTree = MyNode [MyTree] | MyLeaf [Int]

makeTree :: Int -> MyTree
makeTree 0 = MyLeaf [0..99]
makeTree n = MyNode [ makeTree (n - 1)
                    , makeTree (n - 1) ]

count2 :: MyTree -> MyTree -> Int
count2 r (MyNode xs) = 1 + sum (map (count2 r) xs)
count2 r (MyLeaf xs) = length xs

-- http://stackoverflow.com/questions/2777686/how-do-i-write-a-constant-space-length-function-in-haskell
leaky_length xs = length' xs 0
  where length' []       n = n
        length' (x : xs) n = length' xs (n + 1)

-- http://stackoverflow.com/questions/3190098/space-leak-in-list-program
leaky_sequence []         = [[]]
leaky_sequence (xs : xss) = [ y : ys | y <- xs, ys <- leaky_sequence xss ]

-- http://hackage.haskell.org/trac/ghc/ticket/917
initlast :: (() -> [a]) -> ([a], a)
initlast xs = (init (xs ()), last (xs ()))

main8 = print $ case initlast (\() -> [0..1000000000]) of
                  (init, last) -> (length init, last)

-- http://hackage.haskell.org/trac/ghc/ticket/3944
waitQSem :: MVar (Int, [MVar ()]) -> IO ()
waitQSem sem = do
  (avail, blocked) <- takeMVar sem
  if avail > 0
    then putMVar sem (avail - 1, [])
    else do
      b <- newEmptyMVar
      putMVar sem (0, blocked ++ [b])
      takeMVar b

-- http://hackage.haskell.org/trac/ghc/ticket/2607
data Tree a = Tree a [Tree a] deriving Show

data TreeEvent = Start String
               | Stop
               | Leaf String
               deriving Show

main10 = print . snd . build $ Start "top" : cycle [Leaf "sub"]

type UnconsumedEvent = TreeEvent  -- Alias for program documentation

build :: [TreeEvent] -> ([UnconsumedEvent], [Tree String])
build (Start str : es) =
  let (es', subnodes)   = build es
      (spill, siblings) = build es'
  in (spill, Tree str subnodes : siblings)
build (Leaf str : es) =
  let (spill, siblings) = build es
  in (spill, Tree str [] : siblings)
build (Stop : es) = (es, [])
build [] = ([], [])
Today, we discuss how presents on the Haskell Heap are named, whether by top-level bindings, let-bindings or arguments. We introduce the Expression-Present Equivalent Exchange, which highlights the fact that expressions are also thunks on the Haskell heap. Finally, we explain how let-bindings inside functions can result in the creation of more presents, as opposed to constant applicative forms (CAFs), which exist on the Haskell Heap from the very beginning of execution.
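For a rough picture of that last distinction, here is a sketch of my own (not the post’s example):

-- A constant applicative form (CAF): a top-level binding with no arguments,
-- represented by a single closure that is on the heap from the start of the
-- program and, once evaluated, stays there.
bigCAF :: [Int]
bigCAF = [1 .. 1000000]

-- A let-binding inside a function: every call to mkList allocates a fresh
-- thunk (a new "present") for xs.
mkList :: Int -> [Int]
mkList n = let xs = [1 .. n] in xs ++ xs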
Read more...
One of the persistent myths about Aristotelean physics (the physics proposed by the Ancient Greeks, which held up until Newton and Galileo came along) is that Aristotle thought that “heavier objects fall more quickly than light objects”, the canonical example being that of a cannonball and a feather. Although some fraction of contemporary human society may indeed believe this “fact,” Aristotle had a far more subtle and well-thought-out view on the matter. If you don’t believe me, an English translation of his original text (Part 8) makes this clear enough. Here is a relevant quote (emphasis mine):
Read more...
When we teach beginners about Haskell, one of the things we handwave away is how the IO monad works. Yes, it’s a monad, and yes, it does IO, but it’s not something you can implement in Haskell itself, giving it a somewhat magical quality. In today’s post, I’d like to unravel the mystery of the IO monad by describing how GHC implements the IO monad internally in terms of primitive operations and the real world token. After reading this post, you should be able to understand the resolution of this ticket as well as the Core output of this Hello World! program:
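For reference, the definition at the heart of that story looks roughly like this (cf. GHC.Types in the ghc-prim package); the snippet below is a self-contained paraphrase rather than GHC’s actual source:

{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (State#, RealWorld)
import Prelude hiding (IO)

-- An IO action is a function from a "real world" state token to a new
-- token paired, in an unboxed tuple, with the result; threading the token
-- is what sequences the effects.
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))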
Read more...
It is well known that unsafePerformIO is an evil tool by which impure effects can make their way into otherwise pristine Haskell code. But is the rest of Haskell really that pure? Here are a few questions to ask:
- What is the value of maxBound :: Int?
- What is the value of \x y -> x / y == (3 / 7 :: Double) with 3 and 7 passed in as arguments?
- What is the value of os :: String from System.Info?
- What is the value of foldr (+) 0 [1..100] :: Int?
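For instance, here is a small check of my own (not from the post) whose output depends on the machine and compiler it is run with; on a 64-bit Linux GHC it prints 9223372036854775807 and "linux", and something else elsewhere, which is exactly the point:

import System.Info (os)

-- Two of the questions above, made concrete: both answers depend on the
-- platform (word size, operating system) rather than on the code alone.
main :: IO ()
main = do
  print (maxBound :: Int)   -- 2^63 - 1 on a 64-bit GHC, 2^31 - 1 on 32-bit
  putStrLn os               -- e.g. "linux", "darwin", or "mingw32"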
The answer to each of these questions is ill-defined; or, you might say, they are well defined, but you need a little extra information to figure out what the actual result is.
Read more...