Posts : ezyang's blog

Posts

On type synonyms June 15, 2011

I recently had to remove a number of type synonyms from the GHC code base which were along the lines of type CmmActuals = [CmmActual]. The process made me wonder a little about when type synonyms are appropriate for Haskell code. The Wikibooks article says type synonyms are “for making the roles of types clearer or providing an alias to, for instance, a complicated list or tuple type” and Learn You a Haskell says they “make more sense to someone reading our code and documentation.” But under what circumstances is this actually true?

A Taxonomy of Socialization on the Internet June 13, 2011

There’s networking, and then there’s socialization. Networking is establishing the link, knowing how to find and talk to the person if the need arises. Socialization is communication for the sake of communication; it strengthens networks and keeps people in touch. In many ways, it is the utility a social networking site provides. Trouble is, there are so many different ways to communicate on the Internet these days. In this blog post, I try to explain the essential differences of socialization over the Internet. I believe what is called one-to-many socialization is the fundamental characteristic of the newest social patterns (facilitated by Facebook and Twitter), and describe some of the challenges in designing methods for consuming and aggregating this information.

Pinpointing space leaks in big programs June 10, 2011

What is the biggest possible Haskell program that you could try debugging a space leak in? One very good choice is GHC, weighing in nearly a 100k lines of code (though, thankfully, 25% of that figure is comments.) Today, I’m going to describe one such space leak that I have fixed in GHC. This is not a full story: my code was ultimately the culprit, so not covered is how to debug code you did not write. However, I still like to think this story will cover some of the major points:

Measurement, quantification and reduction June 8, 2011

Today we continue the theme, “What can Philosophy of Science say for Software Engineering,” by looking at some topics taken from the Philosophy of Physical Sciences.

Measurement and quantification

Quantification is an activity that is embedded in modern society. We live by numbers, whether they are temperature readings, velocity, points of IQ, college rankings, safety ratings, etc. Some of these are uncontroversial, others, very much so, and a software engineer must always be careful about numbers they deal in, for quantification is a very tricky business.

What Philosophy of Science Can Say for Software Engineers June 6, 2011

I spent part of my year in Cambridge reading the History and Philosophy of Science course. It has been a thrilling and enlightening course, and I cannot recommend it highly enough for anyone lucky enough to take the HPS strand at Cambridge. Of course, I was a bit of an odd one out, since the course is designed for Natural Science majors, and I am, of course, a Computer Scientist.

The Cryptography of Bitcoin June 3, 2011

It is actually surprisingly difficult for a layperson to find out precisely what cryptography Bitcoin uses, without consulting the source of Bitcoin directly. For example, the opcode OP_CHECKSIG, ostensibly checks the signature of something… but there is no indication what kind of signature it checks! (What are opcodes in Bitcoin? Well it turns out that the protocol has a really neat scripting system built in for building transactions. You can read more about it here.) So in fact, I managed to get some factual details wrong on my post Bitcoin is not decentralized, which I realized when commenter cruzer claimed that a break in the cryptographic hash would only reduce mining difficulty, and not allow fake transactions.

Bitcoin is not decentralized June 1, 2011

Bitcoin was designed by Satoshi Nakamoto, and the primary client is developed by a bunch of folks at bitcoin.org. Do you care who these people are? In theory, you shouldn’t: all they do is develop an open source client for an open source protocol. Anyone else can develop their own client (and some people have) and no one, save the agreement of everyone in the Bitcoin network, can change the protocol. This is because the Bitcoin network is designed to be decentralized.

Chain Rule + Dynamic Programming <br />= Neural Networks May 30, 2011

(Guess what Edward has in a week: Exams! The theming of these posts might have something to do with that…)

At this point in my life, I’ve taken a course on introductory artificial intelligence twice. (Not my fault: I happened to have taken MIT’s version before going to Cambridge, which also administers this material as part of the year 2 curriculum.) My first spin through 6.034 was a mixture of disbelief at how simple the algorithms were, indignation at their examination methods, and the vague sense at the end that I really should have paid more attention. My second time through, I managed to distill a lot more algorithmic content out of the course, since I wasn’t worrying as much about the details.

An insufficiently lazy map May 27, 2011

Another common thunk leak arises from mapping functions over containers, which do not execute their combining function strictly. The usual fix is to instead use a strict version of the function, ala foldl' or insertWith', or perhaps using a completely strict version of the structure. In today’s post, we’ll look at this situation more closely. In particular, the questions I want to answer are as follows:

Why do we need to create strict and lazy versions of these functions—why can’t these leaks be fixed by the user adding appropriate bang-patterns to some functions?
Though introducing a stricter API is usually the correct fix, in some circumstances, the problem is not that the API is insufficiently strict, but that the data structure is too insufficiently lazy (that is, inappropriately spine strict.) That is to say, what do I mean by an insufficiently lazy map?
For data structures in which spine-strictness is necessary, is there any reason that this strictness should not extend to the values themselves? I want to argue that in fact, all spine strict data structures should also be value strict. This may be a bit controversial.

Example

Our example is a very simple data structure, the spine-strict linked list:

If it has lots of comments, it's probably buggy May 25, 2011

Yesterday we had guest speaker Byron Cook come in to give a talk about SLAM, a nice real-world example of theorem proving technology being applied to device drivers.

Having worked in the trenches, Byron had some very hilarious (and interesting) quips about device driver development. After all, when a device driver crashes, it’s not the device driver writer that gets blamed: it’s Microsoft. He pointed out that, in a hardware company, “If you’re not so smart, you get assigned to write software drivers. The smart people go work on hardware”, and that when you’re reading device driver code, “If there are a lot of comments and they’re misspelled, there’s probably a bug.” Zing! We’re always used to extolling the benefits of commenting your code, but it certainly is indisputable that writing comments can help clarify confusing code to yourself, whereas if the code wasn’t confusing in the first place you wouldn’t have felt the need to write comments anyway. Thus, one situation is some guru from the days of yore wrote very clever code, and then you came along and weren’t quite clever enough to fully understand what was going on, so you wrote lots of comments to explain the code to yourself as you went along. Well, it’s not the comment’s fault, but the fact that the code was too clever for you probably means you introduced a bug when you made your modifications.

« Newer Posts Older Posts »