Pretend This Optimization Doesn't Exist

In any modern discussion of algorithms, there's mention of being cache-friendly, of organizing data in a way that's a good match for the memory architectures of CPUs. There's an inevitable attempt at making the concepts concrete with a benchmark manipulating huge (1000x1000) matrices. When the matrix is traversed in the same order its rows are laid out in memory, no worries, but walk it column by column instead, and there's a very real slowdown. This is used to drive home the impressive gains to be had if you keep cache-friendliness in mind.
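For reference, the kind of benchmark being described looks roughly like the sketch below. It's a minimal illustration, not anyone's canonical version: the flat row-major buffer and the names sum_row_order, sum_column_order, and timed are all made up for this example. In an interpreted language the interpreter's own overhead may hide much of the cache effect, so treat any timings as machine-dependent.

```python
import time
from array import array

N = 1000  # the 1000x1000 size mentioned above

# One flat, contiguous buffer of doubles, stored row by row:
# element (r, c) lives at index r * N + c. (Layout is an assumption of this sketch.)
matrix = array("d", range(N * N))


def sum_row_order(m, n):
    """Visit elements in the order they sit in memory (stride of 1)."""
    total = 0.0
    for r in range(n):
        base = r * n
        for c in range(n):
            total += m[base + c]
    return total


def sum_column_order(m, n):
    """Visit elements column by column (stride of n between accesses)."""
    total = 0.0
    for c in range(n):
        for r in range(n):
            total += m[r * n + c]
    return total


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    row_sum, row_secs = timed(sum_row_order, matrix, N)
    col_sum, col_secs = timed(sum_column_order, matrix, N)
    # Both traversals touch the same elements, so the sums agree.
    print("row order:   ", row_secs, "seconds")
    print("column order:", col_secs, "seconds")
```

Both functions do the same work and return the same answer. The only difference is the order in which memory gets touched, which is the whole point of the benchmark.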

Now forget all about that and get on with your projects.

It's difficult to design code for non-trivial problems. Beautiful code quickly falls apart, and it takes effort to keep things both organized and correct. Now add in another constraint: that the solution needs to access memory in linear patterns and avoid chasing pointers to parts unknown.

You'll go mad trying to write code that way. It's like writing a short story without using the letter "t."

If you fixate on the inner workings of caches, fundamental and useful techniques suddenly turn horrible. Reading a single global byte loads an entire cache line. Think objects are better? Querying a byte-sized field is just as bad. Spreading the state of a program across objects scattered throughout memory is guaranteed to set off alarms when you run a hardware-level performance analyzer.

Linked lists are a worst case, potentially jumping to a new cache line for each element. That's damning evidence against languages like Haskell, Clojure, and Erlang. Yet some naive developers insist on using Haskell, Clojure, and Erlang, and they cavalierly disregard the warnings of the hardware engineers and use lists as their primary data structure...

...and they manage to write code where performance is not an issue.

(If you liked this, you might enjoy Impressed by Slow Code.)

January 31, 2012
