Posts : ezyang's blog

Posts

Refactoring Haskell code? May 12, 2010

I have to admit, refactoring Haskell code (or perhaps even just functional code) is a bit of a mystery to me. A typical refactoring session for me might look like this: sit down in front of code, reread code. Run hlint on the code, fix the problems it gives you. Look at the code some more. Make some local transformations to make a pipeline tighter or give a local subexpression a name. Decide the code is kind of pretty and functional and go do something else.

Nested Data Parallelism versus Creative Catamorphisms May 10, 2010

I got to watch (unfortunately not in person) Simon Peyton Jones’ excellent talk (no really, if you haven’t seen it, you should carve out the hour necessary to watch it) on Data Parallel Haskell (slides). The talk got me thinking about a previous talk about parallelism given by Guy Steele I had seen recently.

What’s the relationship between these two talks? At first I thought, “Man, Guy Steele must be advocating a discipline for programmers, while Simon Peyton Jones is advocating a discipline for compilers.” But this didn’t really seem to fit right: even if you have a clever catamorphism for the problem, the overhead of fully parallelizing everything can be prohibitive. As Steele notes, we need “hybrid sequential/parallel strategies,” the simplest of which is “parallelize it until it’s manageable and run the fast sequential algorithm on it,” à la flat data parallelism. Nor is nested data parallelism a silver bullet; while it has wider applicability, there are still domains it fits poorly.
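To make the “parallelize it until it’s manageable” strategy concrete, here is a minimal sketch of my own (not code from either talk). It assumes a plain list sum as the problem, uses `par` and `pseq` from `GHC.Conc` in base, and the name `hybridSum` and its threshold parameter are hypothetical.

```haskell
import GHC.Conc (par, pseq)

-- Split the input in half and spark the left half in parallel,
-- until a chunk is no longer than the threshold; then fall back
-- to the ordinary sequential sum.
hybridSum :: Int -> [Int] -> Int
hybridSum threshold xs
  | n <= cutoff = sum xs                              -- fast sequential case
  | otherwise   = left `par` (right `pseq` (left + right))
  where
    cutoff   = max 1 threshold                        -- guard against non-terminating splits
    n        = length xs
    (ls, rs) = splitAt (n `div` 2) xs
    left     = hybridSum threshold ls
    right    = hybridSum threshold rs
```

The sparks only pay off when the program is built with `-threaded` and run with `+RTS -N`; without that, the result is the same and the work is simply done sequentially. Picking the threshold is exactly the overhead trade-off described above.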

Omnipresent Cabal May 7, 2010

A short public service announcement: you might think you don’t need Cabal. Oh, you might be just whipping up a tiny throw-away script, or a small application that you never intend on distributing. Cabal? Isn’t that what you do if you’re planning on sticking your package on Hackage? But the Cabal always knows. The Cabal is always there. And you should embrace the Cabal, even if you think you’re too small to care. Here’s why:

Name conflicts on Hackage May 5, 2010

Attention Conservation Notice. Unqualified identifiers that are used the most on Hackage.

Perhaps you dread the error message:

Ambiguous occurrence `lookup'
It could refer to either `Prelude.lookup', imported from Prelude
                      or `Data.Map.lookup', imported from Data.Map

It is the message of the piper that has come to collect his dues for your unhygienic unqualified unrestricted module import style.
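For reference, one common way to avoid that error is a qualified import. This is a small sketch with made-up data (`ages`, `pairs` and the two helper functions are hypothetical names), assuming the `containers` package that ships with GHC:

```haskell
import qualified Data.Map as Map   -- Data.Map's functions now need the Map. prefix
import Data.Map (Map)              -- bring only the type name in unqualified

-- Hypothetical example data.
ages :: Map String Int
ages = Map.fromList [("alice", 30), ("bob", 25)]

pairs :: [(String, Int)]
pairs = [("carol", 41)]

-- Map.lookup is unambiguously Data.Map's lookup.
ageInMap :: String -> Maybe Int
ageInMap name = Map.lookup name ages

-- The bare lookup is unambiguously the Prelude's association-list lookup.
ageInList :: String -> Maybe Int
ageInList name = lookup name pairs
```

With the import qualified, both `lookup` functions can live in the same module without GHC complaining.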

Or perhaps you’re a library writer trying to think up a new symbol for your funky infix combinator, but you aren’t sure what other libraries have used already.

Design Patterns in Haskell May 3, 2010

Attention Conservation Notice. A listing of how Gang of Four design patterns might be equivalently implemented in Haskell. A phrasebook for object-oriented programmers dealing with functional programming concepts.

In the introduction to their seminal work Design Patterns, the Gang of Four say, “The choice of programming language is important because it influences one’s point of view. Our patterns assume Smalltalk/C++-level language features, and that choice determines what can and cannot be implemented easily. If we assumed procedural languages, we might have included design patterns called ‘Inheritance,’ ‘Encapsulation,’ and ‘Polymorphism.’”

Art. Code. Math. (And mit-scheme) April 30, 2010

I was in rehearsal today, doodling away at second oboe for Saint-Saëns’ Organ Symphony for the nth time, and it occurred to me: I’ve listened to and played this piece of music enough times to know the full overall flow as well as a good chunk of the orchestral parts, not just mine. So when the hymnal calls give way to the triumphant entrance of the organ in the last movement, or when the tempos start shifting, simultaneously speeding up and slowing down, at the end of the piece, it’s not surprising; it’s almost inevitable. Couldn’t have it any other way.

Inessential guide to fclabels April 28, 2010

Last time I did an Inessential guide to data-accessor and everyone told me, “You should use fclabels instead!” So here’s the partner guide, the inessential guide to fclabels. As with data-accessor, the goal is to make record access and editing not suck. However, it gives you some more useful abstractions. It uses Template Haskell on top of your records, so it is not compatible with data-accessor.
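To show what “not suck” is up against, here is a rough sketch of nested record editing in plain Haskell, next to a hand-rolled getter/setter pair. To be clear, this is not the fclabels API: `Label`, `composeL` and the `Person`/`Address` records are hypothetical names invented for illustration, and fclabels generates its labels with Template Haskell rather than by hand.

```haskell
-- Hypothetical records for illustration.
data Address = Address { city :: String, zipCode :: String }
  deriving (Eq, Show)

data Person = Person { name :: String, address :: Address }
  deriving (Eq, Show)

-- Plain record syntax: every layer has to be unpacked and rebuilt by hand.
setCityPlain :: String -> Person -> Person
setCityPlain c p = p { address = (address p) { city = c } }

-- A hand-rolled label: a getter paired with a setter.
data Label r a = Label
  { getL :: r -> a
  , setL :: a -> r -> r
  }

-- Compose an outer label with an inner one to reach a nested field.
composeL :: Label a b -> Label b c -> Label a c
composeL outer inner = Label
  { getL = getL inner . getL outer
  , setL = \v r -> setL outer (setL inner v (getL outer r)) r
  }

addressL :: Label Person Address
addressL = Label address (\a p -> p { address = a })

cityL :: Label Address String
cityL = Label city (\c a -> a { city = c })

-- One composed label now reads and writes the nested field.
personCityL :: Label Person String
personCityL = composeL addressL cityL
```

With these definitions, `setL personCityL "Paris" p` does the same work as `setCityPlain "Paris" p`, but the path to the field is an ordinary value that can be reused and composed further.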

Identification. There are three tell-tale signs:

The Problem with xUnit April 26, 2010

Tagline: Assertions considered not ideal.

I think automated tests are great. I used two particular flavors of test, the unit test and the integration test, extensively in HTML Purifier, and they’re the only reason why I feel comfortable making changes to code that I first wrote in high school. The automated tests let me hack and then figure out if I broke anything with the single press of a button, rather than manually shove a few inputs in and see if they “look alright.” They’re also an informal specification of “what I wanted the code to do” when I originally wrote it, in the fine tradition of specification by example.

Creative catamorphisms April 23, 2010

The bag of programming tricks that has served us so well for the last 50 years is the wrong way to think going forward and must be thrown out.

Last week, Guy Steele came in and gave a guest lecture, “The Future is Parallel: What’s a Programmer to Do?”, for my advanced symbolic class (6.945). It’s a really excellent talk; so excellent that I had already gone through the slides beforehand. However, hearing Guy Steele give it in person really helped set things in context for me.

Association maps in mit-scheme April 21, 2010

I recently did some benchmarking of persistent data structures in mit-scheme for my UROP. There were a few questions we were interested in:

For what association sizes does a fancier data structure beat out your plain old association list?
What is the price of persistence? That is, how many times slower are persistent data structures as compared to your plain old hash table?
What is the best persistent data structure?

These are by no means authoritative results; I still need to carefully comb through the harness and code for correctness. But they already have some interesting implications, so I thought I’d share. The implementations tested are:
