Last week I wrote about our 10x insertion performance increase with MongoDB. We’ve continued our experimental integration of Fractal Tree® Indexes into MongoDB, adding support for clustered indexes. A clustered index stores all non-index fields as the “value” portion of the index, as opposed to a standard MongoDB index that stores a pointer to the document data. The benefit is that indexed lookups can immediately return any requested values instead of needing to do an additional lookup (and potential disk IOs) for the requested fields.
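To make the difference concrete, here is a toy in-memory sketch of the two index styles. It is not how MongoDB or Fractal Tree Indexes are implemented; the `heap`, `standardIndex`, and `clusteredIndex` names and the sample documents are made up for illustration, and a plain Map stands in for a sorted index. It only shows why the clustered lookup needs one step where the standard lookup needs two.

```javascript
// Toy model only: document storage is a Map keyed by a made-up location number.
const heap = new Map([
  [0, { URI: "/a", name: "alpha", origin: "us" }],
  [1, { URI: "/b", name: "beta", origin: "eu" }],
]);

// Standard index: key -> pointer (location) into document storage.
const standardIndex = new Map();
// Clustered index: key -> all non-index fields, stored as the index "value".
const clusteredIndex = new Map();

for (const [loc, doc] of heap) {
  const { URI, ...rest } = doc;
  standardIndex.set(URI, loc);
  clusteredIndex.set(URI, rest);
}

function lookupStandard(uri) {
  const loc = standardIndex.get(uri); // lookup 1: the index
  if (loc === undefined) return { doc: null, lookups: 1 };
  const doc = heap.get(loc); // lookup 2: the document itself (a potential disk IO)
  return { doc, lookups: 2 };
}

function lookupClustered(uri) {
  const rest = clusteredIndex.get(uri); // single lookup: the values live in the index
  if (rest === undefined) return { doc: null, lookups: 1 };
  return { doc: { URI: uri, ...rest }, lookups: 1 };
}
```

Both functions return the same fields for a given URI; the only difference is the `lookups` count, which stands in for the extra document fetch a standard index requires.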

To create a clustered index, add “clustering:true” to the index options, as in the following example (note that version 2 indexes are Fractal Tree Indexes):

db.tokubench.ensureIndex({URI : 1}, {v : 2, clustering : true})

In this benchmark I measured the performance of a single-threaded insertion workload combined with a range query retrieving 1,000 documents whose URI is greater than or equal to a randomly chosen URI. The range query runs on a separate thread and sleeps 60 seconds after each completed query.
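As a rough sketch of what such a range query could look like, the helper below builds the filter, sort, and limit for it. The `buildRangeQuery` name, the explicit ascending sort, and the example URI are assumptions for illustration; the actual benchmark client may construct its query differently.

```javascript
// Hypothetical helper: builds the pieces of a range query that returns up to
// `limit` documents whose URI is >= startUri.
function buildRangeQuery(startUri, limit = 1000) {
  return {
    filter: { URI: { $gte: startUri } }, // "greater than or equal to" the start URI
    sort: { URI: 1 },                    // ascending URI order (an assumption)
    limit: limit,                        // the benchmark retrieves 1,000 documents
  };
}

// In the mongo shell the pieces would be applied roughly like this:
//   var q = buildRangeQuery("http://example.com/some/uri");
//   db.tokubench.find(q.filter).sort(q.sort).limit(q.limit);
```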

The inserted documents contained the following: URI (character), name (character), origin (character), creation date (timestamp), and expiration date (timestamp). We created a total of four secondary indexes: URI (clustered), name, origin, and creation date.
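For readers who want to picture the schema, here is a minimal sketch of one such document and the four index definitions. The field names other than URI, the generated values, and the choice of `v : 2` for the three non-clustered indexes are all assumptions; only the clustered URI index options come from the example earlier in this post.

```javascript
// Hypothetical document generator; field names and values are illustrative only.
function makeDocument(i) {
  const created = new Date();
  return {
    URI: "http://example.com/page/" + i,
    name: "name-" + i,
    origin: "origin-" + (i % 10),
    creationDate: created,
    // Expire one day after creation (arbitrary choice for the sketch).
    expirationDate: new Date(created.getTime() + 24 * 60 * 60 * 1000),
  };
}

// Four secondary indexes: URI (clustered), name, origin, creation date.
const indexSpecs = [
  { keys: { URI: 1 }, options: { v: 2, clustering: true } },
  { keys: { name: 1 }, options: { v: 2 } },         // v : 2 here is an assumption
  { keys: { origin: 1 }, options: { v: 2 } },       // v : 2 here is an assumption
  { keys: { creationDate: 1 }, options: { v: 2 } }, // v : 2 here is an assumption
];

// In the mongo shell, against the experimental build:
//   indexSpecs.forEach(function (s) { db.tokubench.ensureIndex(s.keys, s.options); });
```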

We ran the benchmark with journaling disabled and with WriteConcern left at its default setting (disabled).

My benchmark client is available here.

Benchmark Environment

Sun x4150, (2) Xeon 5460, 8GB RAM, StorageTek Controller (256MB, write-back), 4x10K SAS/RAID 0

Ubuntu 10.04 Server (64-bit), ext4 filesystem

MongoDB v2.2.RC0

Benchmark Results

Standard MongoDB had an exit velocity of 1,092 inserts per second at 38 million document insertions, versus 12,241 inserts per second at 49 million document insertions for MongoDB with Fractal Tree Indexes: an improvement of 1,020%.

More interesting is the query performance. Note that this is a latency graph, where lower is better, and that the Y-axis is on a log scale to make comparison easier. Standard MongoDB exited with an average of 16,668 milliseconds per query, versus an average of 62 milliseconds for MongoDB with Fractal Tree Indexes: a 26,816% improvement.

As I said in my last post, we’re not MongoDB experts by any stretch, but we wanted to share these results with the community and get people’s thoughts on applications where this might help, suggestions for next steps, and any other feedback. Also, if you are interested in learning more about TokuDB, please stop by to hear us speak at StrangeLoop, MySQL Connect, or Percona Live, or join our introductory webinar next week.

By the way, MongoDB also supports covered indexes, which I will talk about in my next post. Covered indexes can provide some of the benefits of a clustered index, but can have significant drawbacks as well.