Both Couchbase Server and MongoDB allow you to divide your dataset into groups of documents: Couchbase has buckets and MongoDB has collections.

While MongoDB collections are of an equivalent scope to relational tables, Couchbase Server buckets are perhaps more the equivalent of a relational database.

That distinction matters because usually you’d want no more than ten buckets in a single Couchbase cluster. This makes them unsuitable as namespaces. Instead they serve as a way to share configuration and modeling decisions between similar types of documents.

When to use multiple buckets

First, we need to think about how resources are allocated to buckets. The two big considerations are:

RAM

Views and indexes

When you create a bucket, you allocate to it a portion of each machine’s RAM. The RAM you give to a bucket should be large enough to hold the working set of that data plus the per-document metadata that Couchbase keeps in memory.

This means you can allocate different amounts of RAM appropriately to different datasets based on how you access them.
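As a back-of-the-envelope illustration of that sizing decision, here is a minimal Python sketch. The ~56-byte metadata overhead, the document sizes, and the headroom multiplier are all assumptions for illustration, not official sizing figures.

```python
# Rough sizing sketch for a bucket's RAM quota.
# ASSUMPTION: ~56 bytes of in-memory metadata per document; real overhead
# varies by Couchbase version, so treat every number here as a placeholder.

METADATA_BYTES = 56  # assumed per-document metadata overhead

def bucket_ram_quota(doc_count, avg_doc_bytes, avg_key_bytes,
                     working_set_ratio=1.0, headroom=1.3):
    """Estimate RAM (in bytes) to allocate to a bucket.

    working_set_ratio: fraction of documents expected to stay resident.
    headroom: safety multiplier for growth and rebalancing.
    """
    per_doc = avg_doc_bytes + avg_key_bytes + METADATA_BYTES
    return int(doc_count * working_set_ratio * per_doc * headroom)

# A hypothetical sessions bucket: 50,000 live sessions, all resident.
sessions_quota = bucket_ram_quota(50_000, avg_doc_bytes=2_048, avg_key_bytes=36)
print(f"sessions bucket: {sessions_quota / 1024**2:.0f} MB")
```

Because each dataset gets its own quota, a sessions bucket can be sized to hold every live session in RAM while a rarely-read archive bucket keeps only a small working set resident.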

Similarly, Couchbase views and indexes run across the documents inside a bucket, much as a MongoDB map-reduce query runs across a single collection.

Some documents don’t need indexing at all, because you only ever access them through their key; other groups of documents change at different velocities. It would be prudent never to run the indexers over the first set of data, and to run them at appropriate intervals across the rest.
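To make the bucket-wide scan concrete, here is a small Python simulation of how a view’s map function visits every document in a bucket. Real Couchbase views are written in JavaScript; the document shapes, the "type" field, and the key convention below are hypothetical.

```python
# Simulation of a Couchbase view: a map function run over every
# document in a bucket, emitting (key, value) pairs into an index.
# ASSUMPTION: documents carry a "type" field; real views are JavaScript.

def orders_by_customer(doc, meta_id, emit):
    # Only order documents contribute index entries...
    if doc.get("type") == "order":
        emit(doc["customer_id"], meta_id)

def run_view(bucket, map_fn):
    index = []
    for meta_id, doc in bucket.items():
        # ...but the indexer still has to visit every document in the bucket.
        map_fn(doc, meta_id, lambda k, v: index.append((k, v)))
    return sorted(index)

bucket = {
    "order::1001": {"type": "order", "customer_id": "c42", "total": 99.0},
    "order::1002": {"type": "order", "customer_id": "c7", "total": 15.5},
    "session::abc": {"type": "session", "user": "c42"},  # scanned, never emitted
}
print(run_view(bucket, orders_by_customer))
```

Note that the session document is scanned on every index update even though it never produces an entry; keeping key-value-only data in a separate, unindexed bucket avoids that wasted work entirely.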

Dividing our data into different buckets lets us make good use both of RAM and of the CPU time consumed by the indexers.

Let’s look at an example of an ecommerce application: the data you’d store, its profile and how you respond to that in your bucket configuration.

Sessions
Data profile: Fast responses, key-value access, predictable concurrent sessions
Bucket profile: RAM to fit typical number of live sessions, no indexing

User profiles
Data profile: Fast responses while users active, data changes slowly
Bucket profile: RAM to fit user profiles for typical number of live sessions, indexing on

Order data
Data profile: Read-heavy after initial creation, short lifetime
Bucket profile: RAM to fit orders for typical number of live sessions, indexing on

Product data
Data profile: Fast responses needed, read heavy
Bucket profile: RAM to fit entire catalogue, indexing on
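The table above can be expressed as plain configuration data, which is a convenient way to review a capacity plan before creating the buckets. All of the sizes here are hypothetical placeholders, not recommendations.

```python
# The four bucket profiles as configuration data.
# ASSUMPTION: every ram_mb figure is a made-up placeholder; a real plan
# would come from measured document sizes and session counts.

buckets = {
    "sessions": {"ram_mb": 512,  "indexing": False},
    "profiles": {"ram_mb": 1024, "indexing": True},
    "orders":   {"ram_mb": 1024, "indexing": True},
    "products": {"ram_mb": 2048, "indexing": True},  # whole catalogue resident
}

total_mb = sum(b["ram_mb"] for b in buckets.values())
indexed = [name for name, b in buckets.items() if b["indexing"]]
print(f"cluster bucket quota: {total_mb} MB; indexed buckets: {indexed}")
```

Summing the quotas like this also makes it easy to check the plan against the memory actually available on each node of the cluster.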