August 1, 2011

As it stands, it is impossible to define certain value-strict operations on IntMaps with the current containers API. The reader is invited, for example, to try efficiently implementing map :: (a -> b) -> IntMap a -> IntMap b in such a way that for a non-bottom, non-empty map m, Data.IntMap.map (\_ -> undefined) m == undefined.
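(“Efficiently” is the operative word. Here is a sketch, my own and not from the post, of the O(n log n) workaround that rebuilds the map; the challenge is getting this strictness in O(n), which the public API does not permit:)

import qualified Data.IntMap as IM

-- Force each new value before inserting it, so that forcing the
-- resulting map forces every value. Correct, but an O(n log n)
-- rebuild instead of an O(n) traversal.
mapStrict :: (a -> b) -> IM.IntMap a -> IM.IntMap b
mapStrict f =
    IM.foldrWithKey (\k x acc -> let y = f x in y `seq` IM.insert k y acc) IM.empty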
Now, we could have just added a lot of apostrophe-suffixed operations to the existing API, which would have greatly bloated it, but following discussion on libraries@haskell.org, we’ve decided to split the module in two: Data.IntMap.Strict and Data.IntMap.Lazy. For backwards compatibility, Data.IntMap will be the lazy version of the module, and the value-strict functions currently residing in it will be deprecated.
The details of what happened are a little subtle. Here is the reader’s digest version:
- The IntMap in Data.IntMap.Strict and the IntMap in Data.IntMap.Lazy are exactly the same map; there is no runtime or type-level difference between the two. The user can swap between “implementations” by importing one module or the other, but we won’t prevent you from using lazy functions on strict maps. You can convert lazy maps to strict ones using seqFoldable.
- Similarly, if you pass a map with lazy values to a strict function, the function will do the maximally lazy operation on the map that still results in correct operation in the strict case. Usually, this means the lazy value probably won’t get evaluated… unless it is.
- Most type class instances remain valid for both strict and lazy maps; however, Functor and Traversable do not have valid “strict” versions obeying the appropriate laws, so we’ve selected the lazy implementations for them.
- The lazy and strict folds both remain, because whether a fold is strict is independent of whether the data structure is value-strict or spine-strict.
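To make the first two points concrete, here is a small sketch of the intended behavior (my own example; module names as proposed, with evaluate used only to force the result to weak head normal form):

import qualified Data.IntMap.Lazy as L
import qualified Data.IntMap.Strict as S
import Control.Exception (evaluate)

main :: IO ()
main = do
    let m = L.fromList [(1, 'a'), (2, 'b')] -- one and the same IntMap type
    -- The lazy map never forces the new values, so this prints [1,2]:
    print (L.keys (L.map (\_ -> undefined :: Char) m))
    -- The strict map forces each new value as the result is built,
    -- so forcing the result throws:
    _ <- evaluate (S.map (\_ -> undefined :: Char) m)
    putStrLn "not reached"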
I hacked up a first version for the strict module at Hac Phi on Sunday, you can see it here. The full implementation can be found here.
July 29, 2011

Fall is coming, and with it come hordes of ravenous freshmen arriving on MIT’s campus. I’ll be doing three food events… all of them functional programming puns. Whee!
Dumpling Hylomorphism
Anamorphism: the building up of a structure. Catamorphism: the consumption of a structure. Hylomorphism: both an anamorphism and a catamorphism. This event? A hylomorphism on dumplings. Come learn the basic fold, or just perform a metabolic reduction on food.
I’ve done this event several times in the past and it’s always good (if a little sweaty) fun. Dumpling making, as it turns out, is an extremely parallelizable task: you can have multiple people rolling out wraps, wrapping the dumplings, and a few brave cooks actually boiling or pan-frying them. (Actually, in China, no one makes their own wraps anymore, because the store bought ones are so good. Not so in the US…)
The WOK Combinator
The Y combinator is familiar to computer scientists, but less well known to food scientists. A closely guarded secret among the Chinese is the WOK combinator, guaranteed to combine vegetable and meat in record runtime. (Vegan-friendly too.)
Dumplings are a little impractical for an MIT student in the thick of term; they tend to get reserved for special events near the beginning of term, if at all. However, stir fry is quick, cheap and easy, and an essential healthy component of any college student’s diet. My personal mainstay is broccoli and chicken (which is nearly impossible to get wrong), but I’ve diversified and I’m quite a fan of bell peppers and chorizo these days too (see below). One difficulty with running this event is making sure there are enough rice cookers… running out of rice is no fun, since it takes so long to cook up a new batch!
Polyroastism
Roast(X) where X = {Broccoli, Garlic, Pork, Bell Peppers, Chorizo, Carrots, Onions, Asparagus, Sweet Potatoes}.
This is a new one. Really, I just wanted to do roast broccoli with garlic. It’s sooooo good. I’ve never roasted chorizo before, but there didn’t seem to be enough meat on the menu, so I tossed it in. I did a lot of roasting when I was in Portland, because I’d frequently buy random vegetables at the farmers markets, get home, and then have to figure out how to cook them. Roasting is a pretty good bet, in many cases! I forgot to put beets in the event description; maybe I’ll get some…
July 27, 2011

This post was adapted from a post I made to the glasgow-haskell-users list.
According to Control.Exception, the BlockedIndefinitelyOnMVar exception (and the related exception BlockedIndefinitelyOnSTM) is thrown when “the thread is blocked on an MVar, but there are no other references to the MVar so it can’t ever continue.” The description is actually reasonably precise, but it is easy to misinterpret. Fully understanding how this exception works requires some extra documentation from Control.Concurrent, as well as an intuitive feel for how garbage collection in GHC works with respect to Haskell’s green threads.
Here’s the litmus test: can you predict what these three programs will do?
-- Imports shared by all three programs:
import Control.Concurrent
import Control.Exception
import System.Mem (performGC)

main1 = do
    lock <- newMVar ()
    forkIO $ takeMVar lock
    forkIO $ takeMVar lock
    threadDelay 1000 -- let threads run
    performGC        -- trigger exception
    threadDelay 1000
main2 = do
    lock <- newEmptyMVar
    complete <- newEmptyMVar
    forkIO $ takeMVar lock `finally` putMVar complete ()
    takeMVar complete
main3 = do
    lock <- newEmptyMVar
    forkIO $ takeMVar lock `finally` putMVar lock ()
    let loop = do
            b <- isEmptyMVar lock
            if b
                then yield >> performGC >> loop
                else return ()
    loop
Try not to peek. For a hint, check the documentation for forkIO.
The first program gives no output, even though the threadDelay ostensibly lets both forked threads get scheduled, run, and deadlocked. In fact, BlockedIndefinitelyOnMVar is raised, and the reason you don’t see it is because forkIO installs an exception handler that mutes this exception, along with BlockedIndefinitelyOnSTM and ThreadKilled. You can install your own exception handler using catch and co.
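For instance, here is a small sketch (my own variant, in the spirit of the first program) where the forked thread installs a visible handler:

handlerDemo :: IO ()
handlerDemo = do
    lock <- newEmptyMVar
    forkIO $ takeMVar lock `catch` \BlockedIndefinitelyOnMVar ->
        putStrLn "caught: blocked indefinitely on an MVar"
    threadDelay 1000
    performGC -- let the GC notice the MVar is unreachable
    threadDelay 1000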
There is an interesting extra set of incantations at the end of this program that ensures, with high probability, that the threads get scheduled and the BlockedIndefinitelyOnMVar exception gets thrown. Notice that the exception only gets thrown when “no references are left to the MVar.” Since Haskell is a garbage-collected language, the only time it finds out that references are gone is when a garbage collection happens, so you need to make sure one occurs before you see one of these errors.
One implication of this is that GHC does not magically know which thread to throw the exception at to “unwedge” the program: instead, it will just throw BlockedIndefinitelyOnMVar at all of the deadlocked threads, including (if applicable) the main thread. This behavior is demonstrated in the second program, where the program terminates with BlockedIndefinitelyOnMVar because the main thread gets a copy of the exception, even though the finally handler of the child thread would have resolved the deadlock. Try replacing the last line with takeMVar complete `catch` \BlockedIndefinitelyOnMVar -> takeMVar complete >> putStrLn "done". It’s pretty hilarious.
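Spelled out (calling the variant main2'), that replacement looks like this:

main2' :: IO ()
main2' = do
    lock <- newEmptyMVar
    complete <- newEmptyMVar
    forkIO $ takeMVar lock `finally` putMVar complete ()
    takeMVar complete `catch` \BlockedIndefinitelyOnMVar ->
        takeMVar complete >> putStrLn "done"

Both blocked threads receive the exception; the child’s finally handler fills complete, so the main thread’s retry succeeds.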
The last program considers what it means for an MVar to be “reachable.” Since it deadlocks silently, the MVar must have stayed reachable; and indeed, our call to isEmptyMVar prevents the MVar from ever going dead, and thus we loop infinitely, even though there was no possibility of the MVar getting filled in. GHC only considers a thread garbage (which results in the exception being thrown) if there are no references to the thread. Who is holding a reference to the thread? The MVar, as the thread is blocked on it and has added itself to the MVar’s list of blocked threads. Who is keeping the MVar alive? Why, our closure that contains a call to isEmptyMVar. So the thread stays. The general rule is as follows: if a thread is blocked on an MVar which is reachable from a non-blocked thread, the blocked thread sticks around. There are some cases (which GHC doesn’t handle) where the MVar is obviously dead even though references to it remain, but figuring this out in general is undecidable. (Exercise: write a program that would solve the halting problem if GHC could figure this out in general.)
To conclude, without a bit of work (which would be, by the way, quite interesting to see), BlockedIndefinitelyOnMVar is not an obviously useful mechanism for giving your Haskell programs deadlock protection. Instead, you are invited to think of it as a way of garbage collecting threads that would have otherwise languished around forever: by default, a deadlocked thread is silent (except in memory usage.) The fact that an exception shows up was convenient, operationally speaking, but should not be relied on.
July 25, 2011

What do these problems have in common: recursive equality/ordering checks, printing string representations, serializing/unserializing binary protocols, hashing, generating getters/setters? They are repetitive, boilerplate pieces of code with a strong dependence on the structure of the data they operate over. Since programmers love automating things away, various schools of thought have emerged on how to do this:
1. Let your IDE generate this boilerplate for you: you right-click on the context menu, click “Generate hashCode()”, and your IDE does the necessary program analysis for you;
2. Create a custom metadata format (usually XML), and run another program on that description to convert it into code;
3. Add sufficiently strong macro/higher-order capabilities to your language, so you can write programs which generate implementations in-program;
4. Add sufficiently strong reflective capabilities to your language, so you can write a fully generic dynamic implementation of this functionality;
5. Be a compiler, and do static analysis over abstract syntax trees in order to figure out how to implement the relevant operations.
It hadn’t struck me how prevalent the fifth option was until I ran into wide-scale use of one particular facet of the camlp4 system. While camlp4 describes itself as a “macro system,” the macros used in sexplib and bin-prot are not those of the C tradition (which would be good for implementing option 3); rather, they are in the Lisp tradition, with access to the full syntax tree of OCaml and the ability to modify OCaml’s grammar. Unlike most Lisps, however, camlp4 has access to the abstract syntax tree of data type definitions (in untyped languages these are usually implicit), which it can then transform into code.
One question that I’m interested in is whether or not this sort of metaprogramming can be made popular with casual users of languages. If I write code to convert a data structure into a Lisp-like version, is a logical next step to generalize this code into metaprogramming code, or is that a really large leap only to be done by extreme power users? At least from a user standpoint, camlp4 is extremely unobtrusive. In fact, I didn’t even realize I was using it until a month later! Using sexplib, for example, is a simple matter of writing:
type t = Bar | Baz of int * int
with sexp
Almost magically, sexp_of_t and t_of_sexp spring into existence.
But defining new transformations is considerably more involved. Part of the problem is that the abstract syntax tree you are operating over is quite complex, an unavoidable side effect of making a language nice to program in. I could theoretically define all of the types I cared about using sums and products, but real OCaml programs use labeled constructors, records, anonymous types, anonymous variants, mutable fields, etc. So I have to write cases for all of these, and that’s difficult if I’m not already a language expert.
A possible solution for this is to define a simpler core language to operate over, much the same way GHC Haskell compiles down to Core prior to code generation. You can then make the extra information available through an annotations system (which is desirable even when you have access to the full AST.) If the idea is fundamentally simple, don’t force the end-user to handle all of the incidental complexity that comes along with making a nice programming language. Unless, of course, they want to.
Postscript. One of the things I’m absolutely terrible at is literature search. As with most ideas, it’s a pretty safe bet to assume someone else has done this already. But I couldn’t find any prior art here. Maybe I need a better search query than “intermediate language for metaprogramming.”
July 22, 2011

OCaml supports anonymous variant types, of the form type a = [`Foo of int | `Bar of bool], with the appropriate subtyping relations. Subtyping is, in general, kind of tricky, so I have been using these variant types fairly conservatively. (Even if a feature gives you too much rope, it can be manageable and useful if you use discipline.) Indeed, they are remarkably handy for one particular use-case for which I would have normally deployed GADTs: the “combining multiple sum types into a single sum type” use-case.
Consider the following program in Haskell:
data A = Foo Int | Bar Bool
data B = Baz Char | Qux
If one would like to define the moral equivalent of A plus B, the most naive way to do this is:
data AorB = A A | B B
But this kind of sucks: I would have preferred some kind of flat namespace by which I could refer to A and B (also, this encoding is not equivalent to data AorB = Foo Int | Bar Bool | Baz Char | Qux in the presence of laziness.) If you use normal sum types in OCaml, you’re similarly out of luck. However, you can handily manage this if you use variant types:
type a = [`Foo of int | `Bar of bool]
type b = [`Baz of char | `Qux]
type a_or_b = [a | b]
Sweet! Note that we’re not using the full generality of variant types: I will only ever refer to these variant constructors in the context of a, b or a_or_b: anonymous variant types are right out. This prevents coercion messes.
I can actually pull this off in Haskell with GADTs, although it’s certainly not obvious for a beginning programmer:
{-# LANGUAGE GADTs, EmptyDataDecls #-}

data A
data B
data AorB t where
Foo :: Int -> AorB A
Bar :: Bool -> AorB A
Baz :: Char -> AorB B
Qux :: AorB B
To pattern match against all constructors, I specify the type AorB t; to only do A I use AorB A, and to only do B I use AorB B. Don’t ask me how to specify arbitrary subsets of more than two combined sum types. (Solutions in the comment section welcome, though they will be graded on clarity.)
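For concreteness, here is a minimal sketch of what consuming these types looks like (my function names, not from the post):

-- Works on the full combined type: t stays polymorphic, and each
-- branch refines it.
describe :: AorB t -> String
describe (Foo i) = "Foo " ++ show i
describe (Bar b) = "Bar " ++ show b
describe (Baz c) = "Baz " ++ [c]
describe Qux = "Qux"

-- Works on the A half only: the Baz/Qux cases are ill-typed here
-- and can be omitted.
fromA :: AorB A -> Int
fromA (Foo i) = i
fromA (Bar b) = if b then 1 else 0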
The Haskell approach does have one advantage, which is that the sum type is still closed. Since OCaml can make no such guarantee, things like bin-prot need to use up a full honking four bytes to specify which variant it is (they hash the constructor name and use that as a unique identifier), rather than the two bits (but more likely, one byte) needed here. This also makes for more efficient generated code.
July 20, 2011

I’ll be at this year’s Hac Phi (coming up in a week and a half), and I am planning to work on in-program garbage collector statistics for GHC. There is nothing really technically difficult about this task (we just need to expose some functions in the RTS), but it hasn’t been done yet, and I know quite a few performance-minded and long-running-server-minded folks who would love to see this functionality.
My question for you is this: what would you like such an API to look like? What things should it offer, how would you like to interact with it?
Here’s one sample API to get the ball rolling:
module GHC.RTS.Stats where
-- Info is not collected unless you run with certain RTS options. If
-- you are planning on using this on a long-running server, costs of the
-- options would be good to have (we also probably need to add extra
-- options which record, but have no outwardly visible effect.)
-- Read out static parameters that were provided via +RTS
generations :: IO Int
---------------------------------------------------------------------
-- Full statistics
-- Many stats are internally collected as words. Should we publish
-- words?
-- Names taken from the machine-readable format
bytesAllocated :: IO Int64
numGCs :: IO Int64
numByteUsageSamples :: IO Int64
averageBytesUsed :: IO Int64 -- cumulativeBytesUsed / numByteUsageSamples
maxBytesUsed :: IO Int64
-- peakMemoryBlocksAllocated :: IO Int64
peakMegabytesAllocated :: IO Int64
initCpuSeconds :: IO Double
initWallSeconds :: IO Double
mutatorCpuSeconds :: IO Double
mutatorWallSeconds :: IO Double
gcCpuSeconds :: IO Double
gcWallSeconds :: IO Double
-- Wouldn't be too unreasonable to offer a data structure with all of
-- this? Unclear. At least, it would prevent related data from
-- desynchronizing.
data GlobalStats = GlobalStats
{ g_bytes_allocated :: Int64
, g_num_GCs :: Int64
, g_num_byte_usage_samples :: Int64
, g_average_bytes_used :: Int64
, g_max_bytes_used :: Int64
, g_peak_megabytes_allocated :: Int64
, g_init_cpu_seconds :: Double
, g_init_wall_seconds :: Double
, g_mutator_cpu_seconds :: Double
, g_mutator_wall_seconds :: Double
, g_gc_cpu_seconds :: Double
, g_gc_wall_seconds :: Double
}
globalStats :: IO GlobalStats
generationStats :: Int -> IO GlobalStats
---------------------------------------------------------------------
-- GC statistics
-- We can't offer a realtime stream of GC events, because they come
-- too fast. (Test? The eventlog comes too fast; maybe GC is manageable,
-- but you don't want to trigger GC in your handler.)
data GCStats = GCStats
{ gc_alloc :: Int64
, gc_live :: Int64
, gc_copied :: Int64
, gc_gen :: Int
, gc_max_copied :: Int64
, gc_avg_copied :: Int64
, gc_slop :: Int64
, gc_wall_seconds :: Int64
, gc_cpu_seconds :: Int64
, gc_faults :: Int
}
lastGC :: IO GCStats
lastMajorGC :: IO GCStats
allocationRate :: IO Double
---------------------------------------------------------------------
-- Parallel GC statistics
data ParGCStats = ParGCStats
{ par_avg_copied :: Int64
, par_max_copied :: Int64
}
parGCStats :: IO ParGCStats
parGCNodes :: IO Int64
---------------------------------------------------------------------
-- Threaded runtime statistics
data TaskStats = TaskStats
-- Inconsistent naming convention here: mut_time or mut_cpu_seconds?
-- mut_etime or mut_wall_seconds? Hmm...
{ task_mut_time :: Int64
, task_mut_etime :: Int64
, task_gc_time :: Int64
, task_gc_etime :: Int64
}
---------------------------------------------------------------------
-- Spark statistics
data SparkStats = SparkStats
{ s_created :: Int64
, s_dud :: Int64
, s_overflowed :: Int64
, s_converted :: Int64
, s_gcd :: Int64
, s_fizzled :: Int64
}
sparkStats :: IO SparkStats
sparkStatsCapability :: Int -> IO SparkStats
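To give a flavor of usage, here is a hypothetical polling loop against the sketched API (none of this exists yet; the names are the ones proposed above):

import Control.Concurrent (threadDelay)
import GHC.RTS.Stats -- hypothetical module, as sketched above

monitor :: IO ()
monitor = do
    n <- numGCs
    peak <- peakMegabytesAllocated
    gc <- gcWallSeconds
    putStrLn $ "GCs: " ++ show n
            ++ ", peak MB: " ++ show peak
            ++ ", GC wall seconds: " ++ show gc
    threadDelay 5000000 -- poll every five seconds
    monitor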
July 18, 2011

In theory, Git supports custom low-level merge drivers via the merge configuration properties. In practice, no one actually wants to write their own merge driver from scratch. Fortunately, for many cases where a custom merge driver would come in handy, you don’t have to write your merge driver from scratch! Consider these cases:
- You want to merge files which have differing newline styles,
- You want to merge files where one side had lots of trailing whitespace removed,
- You want to merge files where one branch has replaced certain strings with custom strings (for example, a configuration file which instantiated PASSWORD, or a file that needs to be anonymized if there is a merge conflict),
- You want to merge a binary file that has a stable textual format, or
- You want to merge with knowledge about specific types of conflicts and how to resolve them (a super-smart rerere).
For all of these cases, you can instead perform a synthetic Git merge by modifying the input files (constructing synthetic merge inputs), calling Git’s git merge-file to do the actual merge, and then possibly editing the result, before handing it back off to the original invoker of your merge driver. It’s really simple. Here’s an example driver that handles files with differing newline styles by canonicalizing them to UNIX:
#!/bin/sh
# Git invokes a merge driver with the current version (%A), the
# common ancestor (%O), and the other branch's version (%B); see the
# configuration below.
CURRENT="$1"
ANCESTOR="$2"
OTHER="$3"
# Canonicalize all three inputs to UNIX newlines before merging.
dos2unix "$CURRENT"
dos2unix "$ANCESTOR"
dos2unix "$OTHER"
exec git merge-file "$CURRENT" "$ANCESTOR" "$OTHER"
You can then set it up by frobbing your .git/config:
[merge.nl]
name = Newline driver
driver = /home/ezyang/merge-fixnewline.sh %A %O %B
And your .git/info/attributes:
*.txt merge=nl
In Wizard, we implemented (more clever) newline canonicalization, configuration value de-substitution (this reduces the diff between upstream and downstream, reducing the number of conflicts due to proximity), and custom rerere behavior. I’ve also seen a coworker of mine use this technique manually to handle merge conflicts involving trailing whitespace (in Mercurial, no less!)
Actually, we took this concept further: rather than only create synthetic files, we create entirely synthetic trees, and then call git merge on them proper. This has several benefits:
- We can now pick an arbitrary ancestor commit to perform the merge from (this, surprisingly enough, really comes in handy for our use-case),
- Git has an easier time detecting when files moved and changed newline style, etc, and
- It’s a bit easier to use, since you just call a custom command, rather than having to remember how to set up your Git config and attributes properly (and keep them up to date!)
Merges are just metadata—multiple parent commits. Git doesn’t care how you get the contents of your merge commit. Happy merging!
July 15, 2011

It is not too difficult (scroll to “Non sequitur”) to create a combinator which combines two folds into a single fold that operates on a single input list in one pass. This matters when your input list is big, since doing the folds separately can result in a space leak, as in the famous “average” space leak:
import Data.List
big = [1..10000000]
sum' = foldl' (+) 0
average xs = fromIntegral (sum' xs) / fromIntegral (length xs)
main = print (average big)
(I’ve redefined sum so we don’t stack overflow.) I used to think combining functions for folds were pretty modular, since they had a fairly regular interface, could be combined together, and really represented the core notion of when it was possible to eliminate such a space leak: obviously, if you have two functions that require random access to elements of the list, they’ll retain the entire list all the way through.
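For concreteness, here is a minimal sketch of such a combining combinator (my own names, not from the post linked above):

import Data.List (foldl')

-- Combine two left folds into a single pass, forcing both
-- accumulators at each step so neither builds up thunks.
both :: (a -> x -> a) -> (b -> x -> b) -> (a, b) -> x -> (a, b)
both f g (a, b) x =
    let a' = f a x
        b' = g b x
    in a' `seq` b' `seq` (a', b')

average2 :: [Integer] -> Double
average2 xs =
    let (s, l) = foldl' (both (+) (\n _ -> n + 1)) (0, 0) xs
    in fromIntegral s / fromIntegral l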
Of course, a coworker of mine complained, “No! That’s not actually modular!” He wanted to write the nice version of the code, not some horrible gigantic fold function. This got me thinking: is it actually true that the compiler can’t figure out when two computations on a streaming data structure can be run in parallel?
But wait! We can tell the compiler to run these in parallel:
import Data.List
import Control.Parallel
big = [1..10000000]
sum' = foldl' (+) 0
average' xs =
    let s = sum' xs
        l = length xs
    in s `par` l `par` fromIntegral s / fromIntegral l

main = print (average' big)
And lo and behold, the space leak goes away (don’t forget to compile with -threaded and run with at least -N2). With the power of multiple threads, both operations can run at the same time, and thus there is no unnecessary retention.
It is perhaps not too surprising that par can plug space leaks, given that seq can do so too. But seq has denotational content; par does not, and indeed, does nothing when you are single-threaded. This makes this solution very fragile: at runtime, we may or may not decide to evaluate the other thunk in parallel depending on core availability. But we can still profitably use par in a single-threaded context, if it can manage pre-emptive switching between two consumers of a stream. This would be a pretty interesting primitive to have, and it would also be interesting to see some sort of semantics which makes clear the beneficial space effects of such a function. Another unbaked idea is that we already have a notion of good producers and consumers for stream fusion. It doesn’t seem like too far a stretch that we could use this analysis to determine when consumers could be merged together, improving space usage.
July 13, 2011

This one’s for the MIT crowd. This morning, I finished my Facebook module for BarnOwl to my satisfaction (my satisfaction being asynchronous support for Facebook API calls, i.e. no more random freezing!) Getting it to run on Linerva was a bit involved, however, so here is the recipe.
1. Set up a local CPAN installation using the instructions at sipb.mit.edu, using local::lib. Don’t forget to add the setup code to .bashrc.mine, not .bashrc, and then source it. Don’t forget to configure CPAN to follow prerequisites: otherwise, it will give a lot of prompts.
2. Install all of the CPAN dependencies you need. For the Facebook module, this means Facebook::Graph and AnyEvent::HTTP. I suggest using notest, since Any::Moose seems to fail a harmless test on Linerva. Facebook::Graph fails several tests, but don’t worry about it, since we’ll be using a pre-packaged version. If you want to use other modules, you will need to install them in CPAN as well.
3. Clone BarnOwl to a local directory (git clone git://github.com/ezyang/barnowl.git barnowl), then run ./autogen.sh, configure and make.
4. Run using ./barnowl, then type the command :facebook-auth and follow the instructions!
Happy Facebooking!
Postscript. I am really, really surprised that there is not a popular imperative language that has green threads and pre-emptive scheduling, allowing you to actually write code that looks blocking, although it uses an event loop under the hood. Maybe it’s because being safe while being pre-emptive is hard…
Known bugs. The read/write authentication bug has been fixed. We seem to be tickling some bugs in BarnOwl’s event loop implementation, which cause crashes on the order of once a day (making them tough to debug). Keep a backup instance of BarnOwl handy.
July 11, 2011

It still feels a little strange how this happened. Not a year ago, I was pretty sure I was going to do the Master of Engineering program at MIT, to make up for a “missed year” that was spent abroad (four years at MIT plus one at Cambridge, not a bad deal.)
But a combination of conversations with various researchers whom I greatly respect, nagging from my Dad, and an inability to really figure out who I would actually do my Master’s thesis under while at MIT meant that at some point about a month and a half ago, I decided to go for the graduate school admissions cycle this fall. Oh my. It feels like undergraduate admissions all over again (which was not really a pleasant experience), though this time around, what I need to write is the “Research Statement.”
One of the reasons I blogged recently about Mendeley was because I was hoping that Mendeley would give me some insights about the kinds of papers I found interesting, and would let me easily figure out the affiliations of those researchers. (Oops, not quite there yet.) Actually, a conversation with Simon Marlow was much more fruitful: I’m currently actively looking into Ramsey (Tufts), Morrisett (Harvard), Harper (CMU), Pierce/Weirich (UPenn) and Mazieres (Stanford). Of course, I can’t help but think that I’ve missed some key players around topics that I regularly discuss on my blog, so if any of you have some ideas, please do shout out.
The process (well, what little of it I’ve started) has been quite bipolar. I frequently switch between thinking, “Oh, look at this grad student, he didn’t start having any publications till he started grad school—so I’m OK for not having any either” to “Wow, this person had multiple papers out while an undergraduate, solved multiple open problems with his thesis, and won an NSF fellowship—what am I supposed to do!” I’m still uncertain as to whether or not I’m cut out to do research—it’s certainly not for everyone. But I do know I greatly enjoyed the two times I worked at industrial research shops, and I do know that I love teaching, and I definitely know I do not want a conventional industry job. Grad school it is.