Not Everyone Using noSQL is a Rails-Lovin’ Ass-Clown

so, i guess there’s this guy – ted dziuba – who’s apparently tired of all this nosql talk. i’m thinking to myself “wow, does this guy have an axe to grind or what? did he invent the rdbms or something?”

turns out, what he does is work on backend systems for read-heavy applications. stuff that’s easily aggregated and cached and cubed and slaved out to read servers. he was at google working on their intranet (i’d assume something like 20k monthly uniques), this pressflip search thing (and it’s, like, 5k monthly uniques) and now (the big one!) milo, a product search thing, with 200k monthly uniques (down from 400k).

oh, and i guess he likes to bash on valley entrepreneurs… whatever. i guess he just feels all “big fish in a little pond.” *shrug*

anyway. the way i see it, there are really two reasons why he hates nosql:

1. it doesn’t look like he’s ever done anything for a large, mainstream audience. i bet he still thinks getting slashdotted or techcrunched is the definition of “a lot of users.”

2. he’s only worked on read-heavy, search and reporting type sites.

ding! lightbulb!

he just doesn’t have any experience with the problem set nosql solves. so just ignore him and move along, folks.

if you still wanna hang out, let’s look at his issues separately.

1. lack of large audience perspective.

“Developing the app for Google-sized scale is a waste of your time, plus, there is no way you will get it right. Absolutely none.”

this is especially obvious to me. he thinks there’s a tiny audience, then his audience size, and beyond that it’s just waaaaayyy up there at google or facebook, with nothing in between.

um. no.

for example, there are over 600 facebook apps with at least 200k monthly uniques. and 300 of those have over half a million monthly uniques. (oh, and half a dozen of these facebook games have as many or more monthly users than the website for the “real” business of walmart.)

my point is, this isn’t 1998 anymore.

speaking from personal experience, getting slashdotted is about half the traffic of getting linked on boing boing. which is about half the traffic of getting linked off the bbc. throw in that farmville is about 10x the number of uniques as the bbc, and you start to see how wide the gap is and how much stuff there is in between.

i mean, name just about any brand your mom has heard of and their site gets 1,000,000+ uniques a month. ted – you need some perspective on just how big “mass market” really is.

and that, my friends, is a perfect representation of today’s web traffic. niche -> popular -> mainstream. those of us targeting mainstream users are dealing with an n^2 problem when it comes to scaling. there are more people online and those people are doing more things on more sites for more time than ever before in history.

any nielsen ratings data can tell you that. and with the gamification of everything these days, all of them are looking dead-center at my next point…

2. not big on write-heavy applications.

… and no, i don’t mean how your search engine spiders crawl and process and write metric tons of data. i’m not talking about the sphinx index you build every hour.

no, i’m talking about the real-time web. high availability of data, not just services. email, im, facebook – all of these have ruined offline aggregate processing for mainstream users. if their friend bob posts something, they expect it to *ping* pop up on their screen.

i’m talking about the raw lust for real time data. that’s not going anywhere. there’s not a web professional in the world who disagrees with that.

but, you ask, “what does that have to do with write-heavy applications?” well, if there’s no new data, then real time isn’t necessary. and the website publishers sure as heck aren’t the ones publishing all of this data in real time. it’s gotta come from somewhere, right? it comes from the users, and every post, like and comment is a write.

that defines the old days right there. it was a publisher’s dream. you’d just push out your new content to the rdbms, cache it in boring html on the front end, round-robin your web servers and done! there were no comments. no “liking”. no friends of friends. no content suggestions. no leaderboards. no viral. no earned virtual currency. no reputation systems. no “social media.”

none of that is easily cacheable.

because none of that is static.

because all of that is constantly updated and written back to the data store.

“Besides, did you know that Google Adwords is implemented on top of MySQL? What, that business critical code that operates at massive scale doesn’t use BigTable?”

in the old days, reads used to outnumber writes to the database 100 to 1. these days, it’s more like 7 to 1. seems like people don’t just consume information – they update and republish it too.
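to put rough numbers on that shift, here’s a back-of-the-envelope sketch. it assumes a deliberately simple toy model (mine, not anything measured): one cached item, every write invalidates the cached copy, and reads arrive evenly between writes. the function names are made up for illustration.

```python
def write_share(reads_per_write: float) -> float:
    """fraction of all database operations that are writes."""
    return 1.0 / (reads_per_write + 1.0)


def naive_cache_hit_rate(reads_per_write: int) -> float:
    """hit rate for one cached item under the toy model: each write
    invalidates the cached copy, so the first read after a write
    misses and the remaining reads before the next write hit."""
    if reads_per_write < 1:
        return 0.0
    return (reads_per_write - 1) / reads_per_write


# ratios taken from the paragraph above: 100 to 1 then, 7 to 1 now
for label, ratio in (("old days", 100), ("these days", 7)):
    print(f"{label}: {write_share(ratio):.1%} writes, "
          f"{naive_cache_hit_rate(ratio):.1%} cache hits")
# old days: 1.0% writes, 99.0% cache hits
# these days: 12.5% writes, 85.7% cache hits
```

under those assumptions, going from 100 to 1 down to 7 to 1 takes cache misses from about 1 in 100 reads to about 1 in 7. real workloads are messier than this, but the direction is the point.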

and sharing at internet speed is kinda useless if “internet speed” means waiting 3 hours for your stream processor to mash up the data and store it somewhere as an olap cube. because i totally believe that google adwords runs on mysql. IT’S READ-ONLY! that’s what mysql is good for – lots of read-heavy, cacheable data you can map against other read-heavy cacheable data.

look, ted. you seem like a really, really smart cookie at what you do. a helluva lot better at it than i am, i’m sure. i sure as hell wouldn’t assign you to design the next hit facebook game, but i’d totally assign you to build the metrics system powering it.

it’s all about the right tool for the job. that terribly named “web 2.0” thing changed the job description. that means we need to find a new tool.

mysql is not it. nosql is.

which flavor of key/value store? which messaging queue? meh. i don’t care. that’s like asking which brand of screwgun do you wanna use? i don’t care, but every single one of them is better for driving in 3" deck screws than that old school phillips screwdriver you’ve got sitting over there. but that doesn’t mean your screwdriver’s not still useful, tho.

nobody is trying to kill your puppy, ted. i promise.

now, here’s where you’re thinking to yourself “where does this guy get off? who is this dork calling out ted?”

well, i’ve been building web and interactive stuff for “real” companies from universal pictures to hasbro. from hp to conagra to sybase. and doing it for 15 some-odd years. i’m currently working on one of the top 15 emerging facebook games where in the last 14 days we’ve painfully had to scale from 44,000 daily uniques to 104,000 daily uniques. believe me. not everyone using a nosql solution is a rails-lovin’ ass-clown, man.

so, i’m pretty comfortable saying: hey, ted. you’re gonna be waiting a long, long time. everyone else? ignore him and just use the right tool for the job.

m3mnoch.

p.s. ted. sorry about calling you out, but, if you’re gonna troll about silly crap, you have to expect it.