This post is part 3 of a 3-part series about tuning Elasticsearch Indexing. Part 1 can be found here and Part 2 can be found here.

This tutorial series focuses specifically on tuning Elasticsearch to achieve maximum indexing throughput and reduce monitoring and management load.

Elasticsearch provides sharding and replication as the recommended way of scaling and increasing the availability of an index. A little over-allocation is good, but a bazillion shards is bad. It is difficult to define what constitutes too many shards, as it depends on their size and how they are being used. A hundred shards that are seldom used may be fine, while two shards experiencing very heavy usage could be too many. Monitor your nodes to ensure that they have enough spare capacity to deal with exceptional conditions.
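A quick way to keep an eye on this is the cat APIs; for example, _cat/allocation lists the shard count, disk usage, and free space for each node:

curl 'localhost:9200/_cat/allocation?v'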

For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click "Get Started" in the header navigation. If you need help setting up, refer to "Provisioning a Qbox Elasticsearch Cluster."

Scaling out should be done in phases. Build in enough capacity to get to the next phase. Once you reach the next phase, you have time to think about the changes needed to reach the phase after that. You can also gain a lot by optimizing the way you send indexing requests to Elasticsearch: do you have to send a separate request for each document, or can you buffer documents and use the bulk API to index multiple documents in a single request?

We previously looked at indexing performance metrics and settings such as refresh, flushing, segment merging, and auto-throttling. This tutorial lists a collection of ideas for increasing the indexing throughput of Elasticsearch, with reference to sharding and replication, requests, clients, and storage.

Scale Out The Elasticsearch Cluster

Elasticsearch is built to scale. It will run very happily on your machine or in a cluster containing hundreds of nodes, and the experience is almost identical. Growing from a small cluster to a large cluster is almost entirely automatic and painless. Growing from a large cluster to a very large cluster requires a bit more planning and design, but it is still relatively painless.


The default settings in Elasticsearch will take you a long way, but to get the most bang for your buck, you need to think about how data flows through your system. It can be time-based data (such as log events or social network streams, where relevance is driven by recency) or user-based data (where a large document collection can be subdivided by user or customer).

The create index API allows you to instantiate an index. Elasticsearch provides support for multiple indices, including executing operations across several indices. Each index created can have specific settings associated with it. The number of shards of an index must be set at index creation and cannot be changed later. If you do not know exactly how much data to expect, consider over-allocating a few shards (but not too many, they are not free!) to have some spare capacity available. The number of replicas, however, can be changed later.

curl -XPUT 'localhost:9200/my_index' -d '{ "settings" : { "index" : { "number_of_shards" : 3, "number_of_replicas" : 2 } } }'

Index aliases may also provide a way (with limitations) of scaling out an index at a later point in time. The index aliases API allows you to alias an index with a name, with all APIs automatically converting the alias name to the actual index name. An alias can also be mapped to more than one index, and when it is specified, the alias will automatically expand to the aliased indices. An alias can also be associated with a filter that is automatically applied when searching, and with routing values. An alias cannot have the same name as an index.

Here is a sample of associating the alias alias1 with index test1:

curl -XPOST 'localhost:9200/_aliases' -d '{ "actions" : [ { "add" : { "index" : "test1", "alias" : "alias1" } } ] }'

And here is removing that same alias:

curl -XPOST 'localhost:9200/_aliases' -d '{ "actions" : [ { "remove" : { "index" : "test1", "alias" : "alias1" } } ] }'

Renaming an alias is a simple remove-then-add operation within the same API. This operation is atomic, so there is no need to worry about a short period of time during which the alias does not point to an index:

curl -XPOST 'localhost:9200/_aliases' -d '{ "actions" : [ { "remove" : { "index" : "test1", "alias" : "alias1" } }, { "add" : { "index" : "test2", "alias" : "alias1" } } ] }'

Replication

Replication is important for two primary reasons:

It provides high availability in case a shard or node fails. For this reason, it is important to note that a replica shard is never allocated on the same node as the original/primary shard that it was copied from.

It allows you to scale out your search volume/throughput since searches can be executed on all replicas in parallel.

Replication is an important feature for being able to cope with failure, but the more replicas you have, the longer indexing will take. Thus, for raw indexing throughput, it would be best to have no replicas at all. Luckily, in contrast to the number of shards, you can change the number of replicas of an index at any time, which gives you some additional options.

In certain situations, such as populating a new index initially, or migrating data from one index to another, it may prove beneficial to start without replication and only add replicas later, once the time-critical initial indexing has been completed.

The number of replicas can be updated via the update index settings API:

curl -XPUT 'localhost:9200/my_index/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'

Once you are done with the indexing operation, the number of replicas can be set back to the relevant value.
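For example, to restore a single replica per shard once the bulk load has finished (use whatever replica count your availability requirements call for):

curl -XPUT 'localhost:9200/my_index/_settings' -d '{ "index" : { "number_of_replicas" : 1 } }'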

Use Dedicated Data Nodes

Data nodes hold the shards that contain the documents you have indexed. Data nodes handle data-related operations like CRUD, search, and aggregations. These operations are I/O-, memory-, and CPU-intensive. It is important to monitor these resources and to add more data nodes if they are overloaded.

The main benefit of having dedicated data nodes is the separation of the master and data roles.

To create a dedicated data node, set:

node.master: false
node.data: true
node.ingest: false

When aggregator (coordinating) nodes handle search queries and only contact data nodes as needed, they take load off the data nodes, which then have more capacity for handling indexing requests.
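If you want dedicated coordinating nodes for this purpose, the usual approach is to disable every other role in elasticsearch.yml, for example:

node.master: false
node.data: false
node.ingest: false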

If all of your data nodes are running low on disk space, you will need to add more data nodes to your cluster. You will also need to make sure that your indices have enough primary shards to be able to balance their data across all those nodes. Elasticsearch's disk-based shard allocation takes available disk space into account when allocating shards to nodes. By default, it will not assign shards to nodes that have more than 85 percent of their disk in use.
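These watermarks can be adjusted dynamically through the cluster settings API; the values below are purely illustrative:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{ "transient" : { "cluster.routing.allocation.disk.watermark.low" : "90%", "cluster.routing.allocation.disk.watermark.high" : "95%" } }'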

There are two remedies for low disk space. One is to remove outdated data and store it off the cluster. This may not be a viable option for all users, but, if you are storing time-based data, you can snapshot older indices to off-cluster storage for backup and update the index settings to turn off replication for those indices.
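As a sketch, assuming a snapshot repository named my_backup has already been registered and using a placeholder index name, the two steps might look like this:

curl -XPUT 'localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true' -d '{ "indices" : "logs-2016.06" }'

curl -XPUT 'localhost:9200/logs-2016.06/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'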


The second approach is the only option for you if you need to continue storing all of your data on the cluster: scaling vertically or horizontally. If you choose to scale vertically, that means upgrading your hardware. However, to avoid having to upgrade again down the line, you should take advantage of the fact that Elasticsearch was designed to scale horizontally. To better accommodate future growth, you may be better off reindexing the data and specifying more primary shards in the newly created index.
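If your Elasticsearch version includes the _reindex API, a minimal sketch looks like the following, assuming new_index has already been created with a higher number_of_shards (both index names are placeholders):

curl -XPOST 'localhost:9200/_reindex' -d '{ "source" : { "index" : "my_index" }, "dest" : { "index" : "new_index" } }'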

Optimizing Bulk Requests

The Bulk API makes it possible to perform many index/delete operations in a single API call. This can greatly increase the indexing speed. Each subrequest is executed independently, so the failure of one subrequest won’t affect the success of the others. If any of the requests fail, the top-level error flag is set to true and the error details will be reported under the relevant request.

The possible actions are index, create, delete, and update. index and create expect a source on the next line and have the same semantics as the op_type parameter to the standard index API (i.e., index will add or replace a document, whereas create will fail if a document with the same index and type already exists). delete does not expect a source on the following line and has the same semantics as the standard delete API. update expects the partial doc, upsert, and script and its options to be specified on the next line.
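A minimal bulk request mixing these actions might look like the following (index, type, field names, and IDs are placeholders; the body is newline-delimited JSON and must end with a newline):

curl -XPOST 'localhost:9200/_bulk' -H 'Content-Type: application/x-ndjson' -d '
{ "index" : { "_index" : "my_index", "_type" : "doc", "_id" : "1" } }
{ "title" : "first document" }
{ "create" : { "_index" : "my_index", "_type" : "doc", "_id" : "2" } }
{ "title" : "second document" }
{ "update" : { "_index" : "my_index", "_type" : "doc", "_id" : "1" } }
{ "doc" : { "title" : "first document, updated" } }
{ "delete" : { "_index" : "my_index", "_type" : "doc", "_id" : "2" } }
'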

The entire bulk request needs to be loaded into memory by the node that receives our request, so the bigger the request, the less memory available for other requests. There is an optimal size of bulk request. Above that size, performance no longer improves and may even drop off. The optimal size, however, is not a fixed number. It depends entirely on your hardware, your document size and complexity, and your indexing and search load.

Fortunately, it is easy to find this sweet spot: Try indexing typical documents in batches of increasing size. When performance starts to drop off, your batch size is too big. A good place to start is with batches of 1,000 to 5,000 documents or, if your documents are very large, with even smaller batches. Bulk sizing is dependent on your data, analysis, and cluster configuration, but a good starting point is 5–15 MB per bulk. Note that this is physical size. Document count is not a good metric for bulk size. For example, if you are indexing 1,000 documents per bulk, keep the following in mind:

1,000 documents at 1 KB each is 1 MB.

1,000 documents at 100 KB each is 100 MB.

Those are drastically different bulk sizes. Bulks need to be loaded into memory at the coordinating node, so it is the physical size of the bulk that is more important than the document count.

Start with a bulk size around 5–15 MB and slowly increase it until you do not see performance gains anymore. Then start increasing the concurrency of your bulk ingestion (multiple threads, and so forth).
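One rough way to find that sweet spot is to time bulk requests of increasing size against a test index. Here is a sketch in bash, where docs.ndjson is a hypothetical file of pre-generated action/source line pairs:

#!/bin/bash
# Time bulk requests of increasing batch size against a test index.
for batch in 1000 2000 5000 10000; do
  head -n $((batch * 2)) docs.ndjson > batch.ndjson   # two lines (action + source) per document
  printf '%s docs: ' "$batch"
  curl -s -o /dev/null -w '%{time_total}s\n' -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/my_index/_bulk' --data-binary @batch.ndjson
done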

Monitor your nodes with Marvel and/or tools such as iostat, top, and ps to see when resources start to bottleneck. If you start to receive EsRejectedExecutionException, your cluster can no longer keep up: at least one resource has reached capacity. Either reduce concurrency, provide more of the limited resource (such as switching from spinning disks to SSDs), or add more nodes.
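You can also watch the bulk thread pool's queue and rejection counters directly (on newer versions the pool is named write instead of bulk):

curl 'localhost:9200/_cat/thread_pool/bulk?v&h=node_name,active,queue,rejected'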

When ingesting data, make sure bulk requests are round-robined across all your data nodes. Do not send all requests to a single node, since that single node will need to store all the bulks in memory while processing.

Storage

If you’ve been following the normal development path, you’ve probably been playing with Elasticsearch on your machine or on a small cluster of machines lying around. But when it comes time to deploy Elasticsearch to production, there are a few recommendations that you should consider. Nothing is a hard-and-fast rule; Elasticsearch is used for a wide range of tasks and on a bewildering array of machines. But these recommendations provide good starting points based on our experience with production clusters.

Disks are usually the bottleneck of any modern server. Elasticsearch heavily uses disks, and the more throughput your disks can handle, the more stable your nodes will be. Here are some tips for optimizing disk I/O:

If you can afford SSDs, they are by far superior to any spinning media. SSD-backed nodes see boosts in both query and indexing performance. Alternatively, if you use spinning media, try to obtain the fastest disks possible (high-performance server disks, 15k RPM drives).

Use RAID 0. Striped RAID will increase disk I/O, at the obvious expense of potential failure if a drive dies. There is no need to use mirroring or parity variants of RAID, since high availability is built into Elasticsearch via replicas. Using RAID 0 is an effective way to increase disk speed, for both spinning disks and SSD.

Avoid using EFS, as the sacrifices made to offer durability, shared storage, and grow/shrink capabilities come at a performance cost. Such file systems have been known to cause corruption of indices, and because Elasticsearch is distributed and has built-in replication, the benefits that EFS offers are not needed.

Do not use remote-mounted storage, such as NFS or SMB/CIFS. The latency introduced here is directly opposed to performance.

If you are on EC2, beware of EBS. Even the SSD-backed EBS options are often slower than local instance storage. EBS works well for running a small cluster (1-2 nodes) but cannot tolerate the load of larger searching and indexing infrastructure. If EBS is used, then leverage provisioned IOPS to ensure performance.

Finally, avoid network-attached storage (NAS). People routinely claim their NAS solution is faster and more reliable than local drives. Despite these claims, we have never seen NAS live up to its hype. NAS is often slower, displays larger latencies with a wider deviation in average latency, and is a single point of failure.

Give It a Whirl!

It's easy to spin up a standard hosted Elasticsearch cluster on any of our 47 Rackspace, Softlayer, Amazon or Microsoft Azure data centers. And you can now provision your own AWS Credits on Qbox Private Hosted Elasticsearch.

Questions? Drop us a note, and we'll get you a prompt response.

Not yet enjoying the benefits of a hosted ELK-stack enterprise search on Qbox? We invite you to create an account today and discover how easy it is to manage and scale your Elasticsearch environment in our cloud hosting service.