ezyang's blog

the arc of software bends towards understanding

Posts

How the Grinch stole the Haskell Heap

New to the series? Go to the beginning.

Today, we introduce the Grinch.

image

A formerly foul and unpleasant character, the Grinch has reformed his ways. He still has a penchant for stealing presents, but these days he does it ethically: he only takes a present if no one cares about it anymore. He is the garbage collector of the Haskell Heap, and he plays a very important role in keeping the Haskell Heap small (and thus our memory usage low)—especially since functional programs generate a lot of garbage. We’re not a particularly eco-friendly bunch.

Read more...

Functions produce the Haskell Heap

New to the series? Go to the beginning.

We’ve talked about how we open (evaluate) presents (thunks) in the Haskell Heap: we use IO. But where do all of these presents come from? Today we introduce where all these presents come from, the Ghost-o-matic machine (a function in a Haskell program).

image

Using a function involves three steps.

image

We can treat the machine as a black box that takes present labels and pops out presents, but you can imagine the inside as having an unlimited supply of identical ghosts and empty present boxes: when you run the machine, it puts a copy of the ghost in the box.

Read more...

Implementing the Haskell Heap in Python, v1

New to the series? Go to the beginning.

Here is a simple implementation of all of the parts of the Haskell heap we have discussed up until now, ghosts and all.

heap = {} # global

# ---------------------------------------------------------------------#

class Present(object): # Thunk
    def __init__(self, ghost):
        self.ghost    = ghost # Ghost haunting the present
        self.opened   = False # Evaluated?
        self.contents = None
    def open(self):
        if not self.opened:
            # Hey ghost! Give me your present!
            self.contents = self.ghost.disturb()
            self.opened   = True
            self.ghost    = None
        return self.contents

class Ghost(object): # Code and closure
    def __init__(self, *args):
        self.tags = args # List of names of presents (closure)
    def disturb(self):
        raise NotImplemented

class Inside(object):
    pass
class GiftCard(Inside): # Constructor
    def __init__(self, name, *args):
        self.name  = name  # Name of gift card
        self.items = args # List of presents on heap you can redeem!
    def __str__(self):
        return " ".join([self.name] + map(str, self.items))
class Box(Inside): # Boxed, primitive data type
    def __init__(self, prim):
        self.prim = prim # Like an integer
    def __str__(self):
        return str(self.prim)

# ---------------------------------------------------------------------#

class IndirectionGhost(Ghost):
    def disturb(self):
        # Your present is in another castle!
        return heap[self.tags[0]].open()
class AddingGhost(Ghost):
    def disturb(self):
        # Gotta make your gift, be back in a jiffy!
        item_1 = heap[self.tags[0]].open()
        item_2 = heap[self.tags[1]].open()
        result = item_1.prim + item_2.prim
        return Box(result)
class UnsafePerformIOGhost(Ghost):
    def disturb(self):
        print "Fire ze missiles!"
        return heap[self.tags[0]].open()
class PSeqGhost(Ghost):
    def disturb(self):
        heap[self.tags[0]].open() # Result ignored!
        return heap[self.tags[1]].open()
class TraceGhost(Ghost):
    def disturb(self):
        print "Tracing %s" % self.tags[0]
        return heap[self.tags[0]].open()
class ExplodingGhost(Ghost):
    def disturb(self):
        print "Boom!"
        raise Exception

# ---------------------------------------------------------------------#

def evaluate(tag):
    print "> evaluate %s" % tag
    heap[tag].open()

def makeOpenPresent(x):
    present = Present(None)
    present.opened = True
    present.contents = x
    return present

# Let's put some presents in the heap (since we can't make presents
# of our own yet.)

heap['bottom']  = Present(ExplodingGhost())
heap['io']      = Present(UnsafePerformIOGhost('just_1'))
heap['just_1']  = makeOpenPresent(GiftCard('Just', 'bottom'))
heap['1']       = makeOpenPresent(Box(1))
heap['2']       = makeOpenPresent(Box(2))
heap['3']       = makeOpenPresent(Box(3))
heap['traced_1']= Present(TraceGhost('1'))
heap['traced_2']= Present(TraceGhost('2'))
heap['traced_x']= Present(TraceGhost('x'))
heap['x']       = Present(AddingGhost('traced_1', '3'))
heap['y']       = Present(PSeqGhost('traced_2', 'x'))
heap['z']       = Present(IndirectionGhost('traced_x'))

print """$ cat src.hs
import Debug.Trace
import System.IO.Unsafe
import Control.Parallel
import Control.Exception

bottom = error "Boom!"
io = unsafePerformIO (putStrLn "Fire ze missiles" >> return (Just 1))
traced_1 = trace "Tracing 1" 1
traced_2 = trace "Tracing 2" 2
traced_x = trace "Tracing x" x
x = traced_1 + 3
y = pseq traced_2 x
z = traced_x

main = do
    putStrLn "> evaluate 1"
    evaluate 1
    putStrLn "> evaluate z"
    evaluate z
    putStrLn "> evaluate z"
    evaluate z
    putStrLn "> evaluate y"
    evaluate y
    putStrLn "> evaluate io"
    evaluate io
    putStrLn "> evaluate io"
    evaluate io
    putStrLn "> evaluate bottom"
    evaluate bottom
$ runghc src.hs"""

# Evaluating an already opened present doesn't do anything
evaluate('1')

# Evaluating an indirection ghost forces us to evaluate another present
evaluate('z')

# Once a present is opened, it stays opened
evaluate('z')

# Evaluating a pseq ghost may mean extra presents get opened
evaluate('y')

# unsafePerformIO can do anything, but maybe only once..
evaluate('io')
evaluate('io')

# Exploding presents may live in the heap, but they're only dangerous
# if you evaluate them...
evaluate('bottom')

Technical notes. You can already see some resemblance of the ghosts’ Python implementations and the actual Core GHC might spit out. Here’s an sample for pseq:

Read more...

IO evaluates the Haskell Heap

New to the series? Go to the beginning.

In today’s post, we focus on you, the unwitting person rooting around the Haskell heap to open a present. After all, presents in the Haskell heap do not spontaneously unwrap themselves.

image

Someone has to open the first present.

image

If the Haskell heap doesn’t interact with the outside world, no presents need to be opened: thus IO functions are the ones that will open presents. What presents they will open is not necessarily obvious for many functions, so we’ll focus on one function that makes it particularly obvious: evaluate. Which tells you to…

Read more...

Evaluation on the Haskell Heap

New to the series? Go to the beginning.

image

The Ghost of the Christmas Present

In today’s post, we’ll do a brief survey of the various things that can happen when you open haunted presents in the Haskell heap. Asides from constants and things that have already been evaluated, mostly everything on the Haskell heap is haunted. The real question is what the ghost haunting the present does.

In the simplest case, almost nothing!

Read more...

The Haskell Heap

image

The Haskell heap is a rather strange place. It’s not like the heap of a traditional, strictly evaluated language…

image

…which contains a lot of junk! (Plain old data.)

In the Haskell heap, every item is wrapped up nicely in a box: the Haskell heap is a heap of presents (thunks).

image

When you actually want what’s inside the present, you open it up (evaluate it).

image

Presents tend to have names, and sometimes when you open a present, you get a gift card (data constructor). Gift cards have two traits: they have a name (the Just gift card or the Right gift card), and they tell you where the rest of your presents are. There might be more than one (the tuple gift card), if you’re a lucky duck!

Read more...

Mailbox: Advice for a sup first-timer

What is the Mailbox? It’s a selection of interesting email conversations from my mailbox, which I can use in place of writing original content when I’m feeling lazy. I got the idea from Matt Might, who has a set of wonderful suggestions for low-cost academic blogging.

From: Brent Yorgey

I see you are a contributor to the sup mail client. At least I assume it is you, I doubt there are too many Edward Z. Yangs in the world. =) I’m thinking of switching from mutt. Do you still use sup? Any thoughts/encouragements/cautions to share?

Read more...

Tracing the compilation of Hello Factorial!

It is often said that the factorial function is the “Hello World!” of the functional programming language world. Indeed, factorial is a singularly useful way of testing the pattern matching and recursive facilities of FP languages: we don’t bother with such “petty” concerns as input-output. In this blog post, we’re going to trace the compilation of factorial through the bowels of GHC. You’ll learn how to read Core, STG and Cmm, and hopefully get a taste of what is involved in the compilation of functional programs. Those who would like to play along with the GHC sources can check out the description of the compilation of one module on the GHC wiki. We won’t compile with optimizations to keep things simple; perhaps an optimized factorial will be the topic of another post!

Read more...

Bugs from using the wrong variable

I was supposed to have another post about Hoopl today, but it got scuttled when an example program I had written triggered what I think is a bug in Hoopl (if it’s not a bug, then my mental model of how Hoopl works internally is seriously wrong, and I ought not write about it anyway.) So today’s post will be about the alleged bug Hoopl was a victim of: bugs from using the wrong variable.

Read more...

Hoopl: Dataflow analysis

Once you’ve determined what dataflow facts you will be collecting, the next step is to write the transfer function that actually performs this analysis for you!

image

Remember what your dataflow facts mean, and this step should be relatively easy: writing a transfer function usually involves going through every possible statement in your language and thinking about how it changes your state. We’ll walk through the transfer functions for constant propagation and liveness analysis.

Read more...