Sharding is a database technique where you break up a big database into many smaller ones. Instead of having 1 million customers on a single, big iron machine, you perhaps have 100,000 customers on 10 different, smaller machines.
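
To make the idea concrete, here is a minimal sketch of how an application might pick a shard for each customer. The names `shard_for` and `shard_counts` are hypothetical, and plain modulo routing is an assumption chosen for simplicity; real setups often use a lookup table or consistent hashing instead, so that shards can be added without moving most of the data.

```python
NUM_SHARDS = 10  # mirrors the 10-machine example above


def shard_for(customer_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Return the index of the shard that holds this customer (hypothetical helper)."""
    return customer_id % num_shards


def shard_counts(num_customers: int, num_shards: int = NUM_SHARDS) -> list:
    """Count how many customers land on each shard for sequential IDs 0..num_customers-1."""
    counts = [0] * num_shards
    for customer_id in range(num_customers):
        counts[shard_for(customer_id, num_shards)] += 1
    return counts
```

With sequential IDs, 1 million customers spread evenly at 100,000 per shard. The catch is that every query now has to know which shard to talk to, and anything that spans customers has to touch several machines.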

The general advice on sharding is that you don’t until you have to. It’s similar to Martin Fowler’s First Law of Distributed Object Design: Don’t distribute your objects! Sharding is still relatively hard, has relatively poor tool support, and will definitely complicate your setup.

Now I always knew that the inevitable day would come when we would have no choice. We would simply have to shard because there was no more vertical scaling to be done. But that day seems to get pushed further and further into the future.

Bigger caches, more reads

Our read performance is to some extent being taken care of by the fact that you can get machines with 256GB RAM now. We upgraded the Basecamp database server from 32GB to 128GB RAM a while back and we thought that would be the end of it.

The box was maxed out and going beyond 128GB at the time was stupid expensive. But now there’s 256GB to be had at a reasonable price and I’m starting to think that by the time we reach that, there’ll be reasonably priced 512GB machines.

So as long as Moore’s law can give us capacity jumps like that, we can keep the entire working set in memory and all will be good. And even if we should hit a ceiling there, we can still go to active read slaves before worrying about sharding.
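
As a rough illustration of the read-slave step, the sketch below sends writes to a single primary and spreads SELECTs across replicas in round-robin order. The class name `ReadWriteRouter` and the server labels are hypothetical, and routing on the first SQL keyword is a simplifying assumption, not how any particular framework does it.

```python
import itertools


class ReadWriteRouter:
    """Pick a server for each statement: replicas for reads, primary for everything else."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas forever; None means there are no replicas to read from.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str) -> str:
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # Writes, and reads when no replica exists, go to the primary.
        return self.primary
```

Even this much adds a wrinkle: with asynchronous replication a replica can lag behind the primary, so a read issued right after a write may return stale data. It is still a far smaller tax than sharding, because every server holds the full data set.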

The bigger problem is writes

Traditionally it hasn’t been read performance that caused people to shard anyway. It has been write performance. Our applications are still very heavy on the reads vs writes, so it’s less of a problem than it is for many others.

But with the rise of SSDs, like Fusion-IO’s ioDrive that can do 120K IOPS, it seems that we’re going to be saved by the progress of technology once again by the time we need it.

Punt on sharding

So where does that leave sharding? For us, we’re in the same position we’ve been in for the past few years. We just don’t need to pay the complexity tax yet, so we don’t. That’s not to say that sharding has no benefits beyond making possible what otherwise wouldn’t be, but the trade is not yet good enough.

One point of real pain we’ve suffered, though, is that migrating a database schema in MySQL on a huge table takes forever and a day. That’s a very real problem if you want to avoid an enterprisey schema full of kludges put in place to avoid adding, renaming, or dropping columns on big tables. Or avoid long scheduled maintenance windows.

I really hope that the clever chaps at MySQL come up with something more reasonable for that problem. I’m told that PostgreSQL is a lot more accommodating in this regard, so hopefully competition will raise all boats.

Don’t try to preempt tomorrow

I guess the conclusion is that there’s no use in preempting the technological progress of tomorrow. Machines will get faster and cheaper all the time, but you’ll still only have the same limited programming resources that you had yesterday.

If you can spend them on adding stuff that users care about instead of prematurely optimizing for the future, you stand a better chance of being in business when that tomorrow finally rolls around.