30 October 2015, 09:35

You can’t build a real-life system without caching.

That being said, it’s often the case that parts of the system you think are going to be slow aren’t. I’ve noticed a tendency to build out a huge stack of components (“we’ll have PostgreSQL, and Redis, and Celery, and Varnish, and…”) without actually measuring where the bottlenecks are.

Example: A counter.

Suppose you need a global counter for something. It needs to be super-fast, and available across all of the web front ends. It’s not transactional (you never “uncount” based on a rollback), but you do want it to be persistent.

Option 1: Drop Redis into the stack, use INCR to keep the counter, have some other process that reads the counter and spills it into PostgreSQL, have yet another process that reads the count back and initializes Redis when it starts (or be smart enough to read from both places and add them when you need the total), and accept that there are windows in which you might lose counts.
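To make that concrete, here is a minimal sketch of Option 1. The client libraries (redis-py, psycopg2), the "hits" key, and the counters table are illustrative assumptions, not part of the original setup, and the restart-recovery process is omitted:

```python
# A sketch of Option 1, not a production design: redis-py is assumed,
# "hits" and the counters table are hypothetical names, and the
# restart-recovery process described above is omitted.
import redis

r = redis.Redis()  # local instance

def count_hit():
    # Atomic and fast, but not durable on its own.
    return r.incr("hits")

def spill_to_postgres(conn):
    # Periodic job: atomically read-and-reset the Redis counter, then
    # fold the delta into PostgreSQL. A crash between these two steps
    # is one of the windows in which counts can be lost.
    delta = int(r.getset("hits", 0) or 0)
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE counters SET value = value + %s WHERE name = %s",
            (delta, "hits"),
        )
    conn.commit()
```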

Option 2: Use SERIAL in PostgreSQL (under the hood, just a sequence).
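And the whole of Option 2, under the same kind of hypothetical naming (a sequence created with CREATE SEQUENCE hits_seq, which is all a SERIAL column is underneath):

```python
# A sketch of Option 2, assuming psycopg2 and a hypothetical sequence:
#     CREATE SEQUENCE hits_seq;
# (A SERIAL column is just an integer column backed by such a sequence.)
import psycopg2

conn = psycopg2.connect("dbname=app")  # connection string is illustrative

def count_hit():
    # nextval() is atomic across all clients, persists across restarts,
    # and is never rolled back, which matches the "never uncount"
    # requirement above.
    with conn.cursor() as cur:
        cur.execute("SELECT nextval('hits_seq')")
        return cur.fetchone()[0]
```

Sequences are deliberately non-transactional: a rolled-back transaction still consumes its value, which is exactly the behavior this counter calls for.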

But Option 2 is really, really, really slow compared to super-ultra-fast Redis, right?

Not really (tested on an Amazon i2.2xlarge instance, Python client over local sockets):

Ten million INCRs in Redis: 824 seconds.

Ten million SELECT nextval('') in PostgreSQL: 892 seconds.
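The post doesn't include the harness, but a timing loop along these lines would make the same comparison (redis-py and psycopg2 again assumed, plus a hypothetical "bench" key and bench_seq sequence):

```python
# Hypothetical harness; not the actual benchmark script.
import time

import psycopg2
import redis

N = 10 * 1000 * 1000  # ten million increments

r = redis.Redis()
start = time.time()
for _ in range(N):
    r.incr("bench")
print("Redis INCR: %.0f seconds" % (time.time() - start))

conn = psycopg2.connect("dbname=bench")
cur = conn.cursor()
start = time.time()
for _ in range(N):
    cur.execute("SELECT nextval('bench_seq')")
    cur.fetchone()
print("PostgreSQL nextval: %.0f seconds" % (time.time() - start))
```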

So: slower, but only by 6.8 microseconds per increment (68 seconds spread across ten million operations). And with none of the elaborate Redis tooling.

So: build for correct operation first, apply realistic load, and then decide what to cache. Human intuition about what might be slow is almost certainly wrong.