2010/11 : ezyang's blog

2010/11

My first proof in Isabelle November 29, 2010

One of the distinctive differences between academic institutions in the United States and in Great Britain is the supplementary learning outside of lectures. We have recitations in the US, which are something like extra lectures, while in the UK they have tutorials, or supervisions as they are called in Cambridge parlance. As always, they are something of a mixed bag: some supervisors are terrible, others are merely competent, and others inspire and encourage a sort of interest in the subject far beyond the outlines of the course.

Reflexivity. Qed. November 26, 2010

In which Mendeley, Software Foundations and Coq are discussed.

I was grousing on #haskell-blah one day about how annoying it was to organize all of the papers that I have downloaded (and, of course, not read yet.) When you download a paper off the Internet, it will be named all sorts of tremendously unhelpful things like 2010.pdf or icfp10.pdf or paper.pdf. So to have any hope of finding that paper which you skimmed a month ago and vaguely recall the title of, you’ll need some sort of organization system. Pre-Mendeley, I had adopted the convention of AuthorName-PaperTitle.pdf, but I’d always feel a bit bad picking an author out of a list of five people to stick at the beginning, and I still couldn’t ever find the paper I was looking for.

Integer sequences every computer scientist should know? November 24, 2010

The On-Line Encyclopedia of Integer Sequences is quite a nifty website. Suppose that you’re solving a problem, and you come up with the following sequence of integers: 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2... and you wonder to yourself: “huh, what’s that sequence?” Well, just type it in and the answer comes back: A007814, along with all sorts of tasty tidbits like constructions, closed forms, mathematical properties, and more. Even simple sequences like powers of two have a bazillion alternate interpretations and generators.

Is multiply-carry strongly universal? November 22, 2010

I’ve been wanting to implement a count-min sketch for some time now; it’s a little less widely known than the bloom filter, a closely related sketch data structure (that is, a probabilistic data structure that approximates answers to certain queries), but it seems like a pretty practical structure and has been used in some interesting ways.

Alas, when you want to implement a data structure that was proposed less than a decade ago and hasn’t found its way into textbooks yet, there are a lot of theoretical vagaries that get in the way. In this particular case, the theoretical vagary was selection of a universal hash family. Having not taken a graduate-level algorithms course yet, I did not know what a universal hash family was, so it was off to the books for me.

Cambridge potpourri November 19, 2010

In which Edward tells some stories about Cambridge and posts lots of pictures.

Apparently, Alyssa B. Hacker (sic) went on the Cambridge-MIT exchange.

This is perhaps a phenomenon of having been around MIT for too long, a campus which has a reputation for not being too picturesque (I can probably count the actually pretty spots on campus with one hand), but it’s real easy to stop appreciating how nice the buildings and architecture are around here. Look!

Haskell.org committee November 17, 2010

Ever try to go to haskell.org and notice that it was down? Well, now you have someone to complain to: the haskell.org committee has formed and apparently I’m on it. 8-)

One of the first things we’ll be doing is moving haskell.org from a server being hosted at Yale (the hosting has been good, but what will happen is the server will go down during the weekend and there will be no one to kick it until Monday) to some dedicated hardware. I must admit, I do feel a bit sheepish for being on the committee but not having any bits (or direct experience) to help do the maintenance work—hopefully that will change soon.

It's just a longjmp to the left November 15, 2010

And then a signal to the ri-i-i-ight.

One notable wart with readline is that if you ^C during the prompt, nothing happens, and if you do a second ^C (and weren’t buggering around with the signal handlers) your entire program unceremoniously terminates. That’s not very nice! Fortunately, readline appears to be one of the rare C libraries that actually put some work into making sure that you could longjmp out of a signal handler and not completely break the library’s internal state (they do this with liberal masking and unmasking, and their own signal handler which cleans up and then rethrows the signal).

You are in a maze of twisty little passages, all alike... (a GHC hacking post) November 12, 2010

About a month ago I decided that it would be cool if I could solve the bug GHC’s runtime never terminates unused worker threads. Well, I just got around to looking at it today, and after wandering aimlessly around the twisty maze that is the GHC RTS for an hour or so, I finally found a light at the end of a tunnel, in the form of a heart-warmingly simple patch. I’ve sent mail off to Simon Marlow to make sure the light isn’t actually a train, but it occurred to me that it would be interesting to look at my command history and blog about the process by which I came to the conclusion that line 464 of Capability.c was the correct place to add my change, since this sort of mental journey is not the one that is really ever recorded anywhere in any shape or form.

mod_fcgid 2.3 is broken (fixed in 2.3.6) November 10, 2010

This is a post to get some Google juice for a problem that basically prevented Scripts from being able to cut over from Fedora 11 to Fedora 13. The cluster of new machines kept falling over from load, and we kept scratching our heads, wondering, “Why?”

Turns out, the following commit broke mod_fcgid in a pretty terrifying way: essentially, mod_fcgid is unable to manage the pools of running FastCGI processes, so it keeps spawning new ones until the system runs out of memory. This is especially obvious in systems with large amounts of generated virtual hosts, i.e. people using mod_vhost_ldap. It got fixed in mod_fcgid 2.3.6, which was released last weekend.

Medieval medicine and computers November 8, 2010

This is a bit of a provocative post, and its impressions (I dare not promote them to the level of conclusions) should be taken with the amount of salt found in a McDonald’s Happy Meal. Essentially, I was doing some reading about medieval medicine and was struck by some of the similarities between it and computer engineering, which I attempt to describe below.

Division between computer scientists and software engineers. The division between those who studied medicine, the physics (a name derived from physica or natural science) and those who practiced medicine, the empirics, was extremely pronounced. There was a mutual distrust—Petrarch wrote, “I have never believed in doctors nor ever will” (Porter 169)—that stemmed in part from the social division between the physics and empirics. Physics would have obtained a doctorate from a university, having studied one of the highest three faculties possible (the others being theology and law), and tended to be among the upper strata of society. In fact, the actual art of medicine was not considered “worthy of study,” though the study of natural science was. (Cook 407).

Older Posts »