SCALE: Can you walk back through the inception of MongoDB about 8 years ago, and why you and your co-founders created the technology?

ELIOT HOROWITZ: We started working on the idea in the summer of 2007. We were actually working on some other products and realized that, once again, we would have to work around databases. In roughly the decade before we started working on MongoDB, we realized that between Dwight Merriman, the other co-founder, and myself, we had probably built 12 custom solutions to work around database problems. Whether it was Oracle or BerkeleyDB or MySQL, nothing ever quite worked the way we needed it to. There were a number of reasons for this. One was scalability and the other, frankly, was just developer productivity and how much we could get done with the database.

So we really set out to design something that solves two huge problems. One is developer productivity; you want a data model that makes more sense for developers. Tables and rows are great for what they’re designed for. Relational data models are great for what they’re designed for. They were designed for accounting, for bookkeeping — anything you would use Excel for. And for those problems, they work unbelievably well.

But if you look at the kinds of things people are storing in databases today and the programming languages they’re using today, relational databases don’t quite fit that model. Most people are developing in languages that have objects or structures, whether it’s Java or Ruby or Python. People develop classes, they develop rich structures. What they do then is take those rich structures, attach some object-relational mapping to them, and try to store them in a relational database.

“We really wanted to take distributed systems to the next level, where we could build a distributed database system that was both accessible and easy enough for every company out there to use.”

This has a number of problems. One, it’s complicated: mapping a complex structure in a programming language back to a relational data model is complicated; it’s fraught with problems and inconsistencies; and it’s not very human. If you go look at a deconstructed model inside of Oracle, it is very complicated. For a typical enterprise application, a user profile takes around 75 tables.

Whereas in Mongo, you can usually store that as a single document. It’s just simpler to work with, and it lets people more intuitively understand what’s going on inside the database. When people understand it more intuitively and when it maps better to their programming languages, it’s a more productive tool.
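
[Ed. note: As a rough illustration of the contrast described above, here is a minimal Python sketch that uses plain dictionaries rather than any real database. The field names, the three-table split, and the `load_profile` helper are hypothetical, chosen only to show the same profile stored as separate rows versus as a single nested document. It is not MongoDB's API or a real schema.]

```python
# Hypothetical user profile; all field names are illustrative assumptions.

# Relational-style layout: one logical profile split across several "tables",
# linked by user_id and reassembled by hand (standing in for joins).
users = [{"user_id": 1, "name": "Ada"}]
emails = [
    {"user_id": 1, "email": "ada@example.com"},
    {"user_id": 1, "email": "ada@work.example.com"},
]
addresses = [{"user_id": 1, "city": "New York", "country": "US"}]


def load_profile(user_id):
    """Reassemble one profile from the separate row collections."""
    user = next(u for u in users if u["user_id"] == user_id)
    return {
        "user_id": user["user_id"],
        "name": user["name"],
        "emails": [e["email"] for e in emails if e["user_id"] == user_id],
        "addresses": [
            {"city": a["city"], "country": a["country"]}
            for a in addresses
            if a["user_id"] == user_id
        ],
    }


# Document-style layout: the same profile kept as one nested structure,
# close to the object the application code already works with.
profile_document = {
    "user_id": 1,
    "name": "Ada",
    "emails": ["ada@example.com", "ada@work.example.com"],
    "addresses": [{"city": "New York", "country": "US"}],
}

# Both layouts describe the same data; only the shape differs.
print(load_profile(1) == profile_document)
```

With only three tables the reassembly step is already most of the code; the mapping layer grows with every table added to the relational layout.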

The other big thing is distributed systems. A lot of very big companies like Google and Facebook have spent a ton of time building unbelievably good distributed systems inside their organizations. At that point, no one had really designed a distributed system for most companies that gave them enough flexibility to actually take advantage of all the great stuff in distributed systems, but that was also simple enough to operate. That’s really what we wanted to do.

For example, horizontal scalability is becoming more and more important. Single machines can no longer quite handle the workloads. With cloud computing, you want to have lots of the same machines rather than having a special machine, so you really care about horizontal scalability. You also care about multi-datacenter and geographic issues. You don’t want to have to have users from Australia wait to go back to New York to talk to the database. You’d like to have databases spread around the world. You care about things like data governance, where you want to be able to keep certain data in certain countries. You want to be able to do things like auto-archiving data that’s always on, but on cheaper machines. All these things are distributed systems problems.
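
[Ed. note: The following is a minimal Python sketch of two of the ideas mentioned above: spreading data across many identical machines, and keeping certain data in certain countries. The region names, machine names, and the `shard_for` function are hypothetical, and the hash-then-modulo routing is a generic technique used here for illustration. It is not a description of how MongoDB itself places data.]

```python
import hashlib

# Hypothetical regions and machine names, purely for illustration.
REGION_SHARDS = {
    "AU": ["au-1", "au-2"],
    "US": ["us-1", "us-2", "us-3"],
}
DEFAULT_REGION = "US"  # assumed fallback for countries without a region


def shard_for(user_id: str, country: str) -> str:
    """Choose a machine for a user's data.

    Step 1 (data locality/governance): restrict to machines in the
    user's country, falling back to the default region.
    Step 2 (horizontal scaling): hash the user id and spread users
    across the machines in that region.
    """
    shards = REGION_SHARDS.get(country, REGION_SHARDS[DEFAULT_REGION])
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# An Australian user's data stays on an Australian machine.
print(shard_for("user-42", "AU") in REGION_SHARDS["AU"])
```

A plain modulo like this reshuffles most keys whenever a machine is added, which is one reason production systems use more elaborate placement schemes.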

“[W]hen we started designing MongoDB, it really was a research experiment in some ways. We were designing something that we thought we wanted but we didn’t know if anyone else wanted.”

When did you realize, “Hey, we might have something here. We might be able to make a company out of this”?

It took us about a year and a half. We started working on it roughly in the fall of 2007. The first public release was in February 2009. No one knew what we were; we just put out some stuff and we talked on the blog. We basically started talking at any user group, any meeting of developers we possibly could.

At that point, the first user that took it on for real was actually SourceForge. SourceForge, at that point, was still pretty large, and they were rebuilding their entire system and they wanted to do it in a very modern way that would be more flexible. They built it on top of MongoDB, and they wrote a blog post saying how great it was. And they were using it pre-1.0 — they were making a very big bet on a very early technology. It was incredibly successful for them, and they started writing about it and people started catching on. That was the summer of 2009.

The spring of 2010 was when we had our first MongoDB Day in San Francisco, which was way more successful and way more crowded than we ever thought possible. At that point I think there were probably around 10 or 15 people at the company, and that’s when we sort of realized that “Wow, this is a real thing.” [Ed. note: Here’s my Gigaom story about the MongoSV conference in 2011.]

It’s also very interesting because when we started designing MongoDB, it really was a research experiment in some ways. We were designing something that we thought we wanted but we didn’t know if anyone else wanted. We did a lot of things that we thought were the right things. We focused on things like the data model and distributed systems.

And everything else that was good from relational databases we took—indexes, the idea of a query language, the idea of a shell. We really tried to keep all the good things in relational databases, of which there are many, but change the two things that we think really needed to change. That let us move a lot faster than if we were trying to reinvent everything from scratch.

“By being open source, by being able to leverage this huge community, it lets us move very quickly and in very interesting ways without being omniscient—because we definitely are not.”

Around 2010 or 2011, it seemed MongoDB was suddenly everywhere. Did you capture lightning in a bottle, with cloud computing hitting critical mass and a new class of developers building web and mobile apps?

I think adoption was because of the latter of the things you said, the whole new generation of developers. A document model is a simpler, more intuitive model for developers. There is a whole new class of developers with whole new types of applications, where people want to move faster and faster on the product side, and where six months is way too long to get a new version of their product out. When you want the next version of your iPad app three months later, you need a database that can be as agile and as flexible as your product teams, and as intuitive. You don’t want technologies where, as you add more and more features, they get more and more complicated, such that at some point you are stuck.

One of our big early customers actually told us they were 18 months behind on their product roadmap entirely because they couldn’t design their way out of the relational maps. They actually spent a year porting their entire application to MongoDB and ended up being able to catch up on their product roadmap. That’s not to say they couldn’t have done the same thing if they had known what they knew at the end, and they had started to design their relational application from Day One knowing that.

But that’s not how applications are built. Applications start and they evolve over the course of a year, 5 years, 10 years, 20 years, 40 years. Being able to maintain the ability to innovate and to adapt is what documents really bring to the table.