ezyang’s blog

the arc of software bends towards understanding

Calling all space leaks

I’m currently collecting non-stack-overflow space leaks, in preparation for a future post in the Haskell Heap series. If you have any interesting space leaks, especially if they’re due to laziness, send them my way.

Here’s what I have so far (unverified: some of these may not leak or may be stack overflows. I’ll be curating them soon).

import Control.Concurrent.MVar

-- http://groups.google.com/group/fa.haskell/msg/e6d1d5862ecb319b
main1 = do file <- getContents
           putStrLn $ show (length $ lines file) ++ " " ++
                      show (length $ words file) ++ " " ++
                      show (length file)

-- http://www.haskell.org/haskellwiki/Memory_leak
main2 = let xs = [1..1000000::Integer]
        in print (sum xs * product xs)

-- http://hackage.haskell.org/trac/ghc/ticket/4334
leaky_lines                   :: String -> [String]
leaky_lines ""                =  []
leaky_lines s                 =  let (l, s') = break (== '\n') s
                                 in  l : case s' of
                                              []      -> []
                                              (_:s'') -> leaky_lines s''

-- http://stackoverflow.com/questions/5552433/how-to-reason-about-space-complexity-in-haskell
data MyTree = MyNode [MyTree] | MyLeaf [Int]

makeTree :: Int -> MyTree
makeTree 0 = MyLeaf [0..99]
makeTree n = MyNode [ makeTree (n - 1)
                  , makeTree (n - 1) ]

count2 :: MyTree -> MyTree -> Int
count2 r (MyNode xs) = 1 + sum (map (count2 r) xs)
count2 r (MyLeaf xs) = length xs

-- http://stackoverflow.com/questions/2777686/how-do-i-write-a-constant-space-length-function-in-haskell
leaky_length xs = length' xs 0
  where length' [] n = n
        length' (x:xs) n = length' xs (n + 1)

-- http://stackoverflow.com/questions/3190098/space-leak-in-list-program
leaky_sequence [] = [[]]
leaky_sequence (xs:xss) = [ y:ys | y <- xs, ys <- leaky_sequence xss ]

-- http://hackage.haskell.org/trac/ghc/ticket/917
initlast :: (()->[a]) -> ([a], a)
initlast xs = (init (xs ()), last (xs ()))

main8 = print $ case initlast (\()->[0..1000000000]) of
                 (init, last) -> (length init, last)

-- http://hackage.haskell.org/trac/ghc/ticket/3944
waitQSem :: MVar (Int,[MVar ()]) -> IO ()
waitQSem sem = do
   (avail,blocked) <- takeMVar sem
   if avail > 0 then
     putMVar sem (avail-1,[])
    else do
     b <- newEmptyMVar
     putMVar sem (0, blocked++[b])
     takeMVar b

-- http://hackage.haskell.org/trac/ghc/ticket/2607
data Tree a = Tree a [Tree a] deriving Show
data TreeEvent = Start String
                | Stop
                | Leaf String
                deriving Show
main10 = print . snd . build $ Start "top" : cycle [Leaf "sub"]
type UnconsumedEvent = TreeEvent        -- Alias for program documentation
build :: [TreeEvent] -> ([UnconsumedEvent], [Tree String])
build (Start str : es) =
        let (es', subnodes) = build es
            (spill, siblings) = build es'
        in (spill, (Tree str subnodes : siblings))
build (Leaf str : es) =
        let (spill, siblings) = build es
        in (spill, Tree str [] : siblings)
build (Stop : es) = (es, [])
build [] = ([], [])

14 Responses to “Calling all space leaks”

  1. Don Stewart says:

    How about the little known, but somewhat surprising fact, that you can’t use lazy `modify*`-style functions without leaking space.


  2. Hee hee hee. That’s a good one. :-)

    On a related note, I don’t think this code in Data.IORef.Strict actually works…

    atomicModifyIORef :: (NFData sa, NFData sb) => IORef sa -> (sa -> (sa, sb)) -> SIO sb
    atomicModifyIORef ref f = SIO $ do x < - IO.atomicModifyIORef ref (rnf' . f)
                                       rnf x `seq` return x
            -- since the result of 'f' is a pair which has to forced 
      where rnf' x = rnf x `seq` x
  3. elaforge says:

    A simple one:

    update rec = rec { rec_value = if x then new else rec_value rec, … other fields }

    The problem is that if rec_value is not forced for a long time, it builds up thunks, even if ‘x’ was False. I wound up not setting rec_value at all if x was not true, but nowadays I think I’d just strictify the record fields. It’s just a case of the lazy record update leak, but I wasn’t expecting it because ‘x’ was usually False.

  4. Tyler Breisacher says:

    I’m still pretty new to Haskell, slowly working my way through Real World Haskell. But they do mention space leaks in that book, and I *think* this code I wrote recently is probably space-leaky because `solve` will return an expression which reduces to 1+(1+(1+(1+(1+(1+…)))…)))


  5. fuzxxl says:

    This should stream nicely, although it doesn’t. https://gist.github.com/969253

  6. yachris says:

    Seems like this code:

    leaky_length xs = length' xs 0
    where length' [] n = n
    length' (x:xs) n = length' xs (n + 1)

    should be absolutely straightforward tail recursion… does Haskell not do tail recursion optimization?

  7. yachris: Yep, that function is tail recursive. The problem is more insidious: n + 1 is a thunk, and you may notice that n is not forced until the list is empty, so the chain of thunks builds up on the heap instead.

  8. ivan says:

    f _ = (a, b)
    a = space !! 10000 -- time intensive
    b = space !! 10001 -- time intensive
    space = [0..10001] -- space intensive

    main = do
    let list = fmap f [1..]
    print $ fmap fst list -- a is forced, not forcing b means space is not freed
    print $ list

  9. Tyler: Your code doesn’t “space leak” per se; the real problem is that you fail to manage a tail-call, which means that the stack will grow as you evaluate that function.

  10. fuzxxl: I diagnosed your space leak: the problem is that fromListWith is not strict w.r.t. the combining function. So if you swap that out with:

    fromListWith' :: Ord k => (a -> a -> a) -> [(k,a)] -> Map k a
    fromListWith' f xs
    = foldl' ins empty xs
    ins t (k,x) = insertWith' f k x t

    it uses constant space. Note that the original does stream nicely (the hint is that the heap profile shows lots of thunks, *not* cons cells.

  11. ivan says:

    Interesting. So my http://blog.ezyang.com/2011/05/calling-all-space-leaks/#comment-2418 looks like a selector leak. Should I expect ghc to do the right strictness annotation here?

  12. No, I actually think your leak is due to the full laziness transformation. But I haven’t experimentally verified yet.

  13. ivan says:

    In fact MFE transformation makes it not leak. My original problem was more complex with no MFE transformation possible. Without optimization, this simple example leaks. For every member of the `list’, in the pair `(a, b)’ only `a’ is forced and because `b’ is not, the whole intermediate datastructure (`space’) has to be kept in memory.

Leave a Comment