Existential Pontification and Generalized Abstract Digressions

## Calling all space leaks

I’m currently collecting non-stack-overflow space leaks, in preparation for a future post in the Haskell Heap series. If you have any interesting space leaks, especially if they’re due to laziness, send them my way.

Here’s what I have so far (unverified: some of these may not leak or may be stack overflows. I’ll be curating them soon).

```import Control.Concurrent.MVar

main1 = do file <- getContents
putStrLn \$ show (length \$ lines file) ++ " " ++
show (length \$ words file) ++ " " ++
show (length file)

main2 = let xs = [1..1000000::Integer]
in print (sum xs * product xs)

leaky_lines                   :: String -> [String]
leaky_lines ""                =  []
leaky_lines s                 =  let (l, s') = break (== '\n') s
in  l : case s' of
[]      -> []
(_:s'') -> leaky_lines s''

data MyTree = MyNode [MyTree] | MyLeaf [Int]

makeTree :: Int -> MyTree
makeTree 0 = MyLeaf [0..99]
makeTree n = MyNode [ makeTree (n - 1)
, makeTree (n - 1) ]

count2 :: MyTree -> MyTree -> Int
count2 r (MyNode xs) = 1 + sum (map (count2 r) xs)
count2 r (MyLeaf xs) = length xs

leaky_length xs = length' xs 0
where length' [] n = n
length' (x:xs) n = length' xs (n + 1)

-- http://stackoverflow.com/questions/3190098/space-leak-in-list-program
leaky_sequence [] = [[]]
leaky_sequence (xs:xss) = [ y:ys | y <- xs, ys <- leaky_sequence xss ]

initlast :: (()->[a]) -> ([a], a)
initlast xs = (init (xs ()), last (xs ()))

main8 = print \$ case initlast (\()->[0..1000000000]) of
(init, last) -> (length init, last)

waitQSem :: MVar (Int,[MVar ()]) -> IO ()
waitQSem sem = do
(avail,blocked) <- takeMVar sem
if avail > 0 then
putMVar sem (avail-1,[])
else do
b <- newEmptyMVar
putMVar sem (0, blocked++[b])
takeMVar b

data Tree a = Tree a [Tree a] deriving Show
data TreeEvent = Start String
| Stop
| Leaf String
deriving Show
main10 = print . snd . build \$ Start "top" : cycle [Leaf "sub"]
type UnconsumedEvent = TreeEvent        -- Alias for program documentation
build :: [TreeEvent] -> ([UnconsumedEvent], [Tree String])
build (Start str : es) =
let (es', subnodes) = build es
(spill, siblings) = build es'
in (spill, (Tree str subnodes : siblings))
build (Leaf str : es) =
let (spill, siblings) = build es
in (spill, Tree str [] : siblings)
build (Stop : es) = (es, [])
build [] = ([], [])
```

### 14 Responses to “Calling all space leaks”

1. Don Stewart says:

How about the little known, but somewhat surprising fact, that you can’t use lazy `modify*`-style functions without leaking space.

2. Hee hee hee. That’s a good one. :-)

On a related note, I don’t think this code in Data.IORef.Strict actually works…

```atomicModifyIORef :: (NFData sa, NFData sb) => IORef sa -> (sa -> (sa, sb)) -> SIO sb
atomicModifyIORef ref f = SIO \$ do x < - IO.atomicModifyIORef ref (rnf' . f)
rnf x `seq` return x
-- since the result of 'f' is a pair which has to forced
where rnf' x = rnf x `seq` x```
3. elaforge says:

A simple one:

update rec = rec { rec_value = if x then new else rec_value rec, … other fields }

The problem is that if rec_value is not forced for a long time, it builds up thunks, even if ‘x’ was False. I wound up not setting rec_value at all if x was not true, but nowadays I think I’d just strictify the record fields. It’s just a case of the lazy record update leak, but I wasn’t expecting it because ‘x’ was usually False.

4. Tyler Breisacher says:

I’m still pretty new to Haskell, slowly working my way through Real World Haskell. But they do mention space leaks in that book, and I *think* this code I wrote recently is probably space-leaky because `solve` will return an expression which reduces to 1+(1+(1+(1+(1+(1+…)))…)))

https://gist.github.com/968189

5. fuzxxl says:

This should stream nicely, although it doesn’t. https://gist.github.com/969253

6. yachris says:

Seems like this code:

``` leaky_length xs = length' xs 0 where length' [] n = n length' (x:xs) n = length' xs (n + 1) ```

should be absolutely straightforward tail recursion… does Haskell not do tail recursion optimization?

7. yachris: Yep, that function is tail recursive. The problem is more insidious: n + 1 is a thunk, and you may notice that n is not forced until the list is empty, so the chain of thunks builds up on the heap instead.

8. yachris says:

Whoops, sorry, should have google’d first:

9. ivan says:

``` f _ = (a, b) where a = space !! 10000 -- time intensive b = space !! 10001 -- time intensive space = [0..10001] -- space intensive```

``` ```

```main = do let list = fmap f [1..] print \$ fmap fst list -- a is forced, not forcing b means space is not freed print \$ list ```

10. Tyler: Your code doesn’t “space leak” per se; the real problem is that you fail to manage a tail-call, which means that the stack will grow as you evaluate that function.

11. fuzxxl: I diagnosed your space leak: the problem is that fromListWith is not strict w.r.t. the combining function. So if you swap that out with:

```fromListWith' :: Ord k => (a -> a -> a) -> [(k,a)] -> Map k a fromListWith' f xs = foldl' ins empty xs where ins t (k,x) = insertWith' f k x t```

it uses constant space. Note that the original does stream nicely (the hint is that the heap profile shows lots of thunks, *not* cons cells.

12. ivan says:

Interesting. So my http://blog.ezyang.com/2011/05/calling-all-space-leaks/#comment-2418 looks like a selector leak. Should I expect ghc to do the right strictness annotation here?

13. No, I actually think your leak is due to the full laziness transformation. But I haven’t experimentally verified yet.

14. ivan says:

In fact MFE transformation makes it not leak. My original problem was more complex with no MFE transformation possible. Without optimization, this simple example leaks. For every member of the `list’, in the pair `(a, b)’ only `a’ is forced and because `b’ is not, the whole intermediate datastructure (`space’) has to be kept in memory.