Association maps in mit-scheme

by Edward Z. Yang

I recently some did some benchmarking of persistent data structures in mit-scheme for my UROP. There were a few questions we were interested in:

  1. For what association sizes does a fancier data structure beat out your plain old association list?
  2. What is the price of persistence? That is, how many times slower are persistent data structures as compared to your plain old hash table?
  3. What is the best persistent data structure?

These are by no means authoritative results; I still need to carefully comb through the harness and code for correctness. But they already have some interesting implications, so I thought I'd share. The implementations tested are:

All implementations use eq? for key comparison.

/img/scheme-hamt/insert.png

Unsurprisingly, assoc beats out everyone else, since all it has to do is a simple cons. However, there are some strange spikes at regular intervals, which I am not sure of the origin; it might be the garbage collector kicking in.

/img/scheme-hamt/lookup-hit.png

Of course, you pay back the cheap updates in assoc with a linear lookup time; the story also holds true for weight-balanced trees, which have fast inserts but the slowest lookups.

/img/scheme-hamt/lookup-miss.png

The hamt really flies when the key isn't present, even beating out hash-tables until 15 elements or so.

Source code for running the benchmarks, our home-grown implementations, and graphing can be found at the scheme-hamt repository.