Thanks to my colleague Andreas Chatzakis for this great post on how to use Elasticsearch with AWS OpsWorks.

AWS OpsWorks is an application management service that makes it easy to deploy and operate applications of all shapes and sizes. OpsWorks uses Chef to install and configure software on Amazon EC2 instances and lets you automate any task that can be scripted. In this blog post, we will show a step-by-step example of installing Elasticsearch on Amazon EC2 instances using OpsWorks.

Elasticsearch is a popular open source, distributed, real-time search and analytics engine with a lot of functionality and a thriving community. It is architected from the ground up to operate in a distributed architecture which allows it to take advantage of the scalability and reliability characteristics of the EC2 platform.

While it is easy to get started and install Elasticsearch, configuring and running a multi-node production environment requires careful planning and continuous monitoring. This is where an application management service like OpsWorks can help by automating and simplifying the deployment and the operations involved in running a distributed search layer on EC2.

OpsWorks & Chef

Because OpsWorks utilizes the Chef framework, we can leverage community-built cookbooks that implement some of the best practices out of the box instead of starting from scratch. These cookbooks will install Elasticsearch and its dependencies and set up a service to control the Elasticsearch process on each node. They will also configure the Elasticsearch AWS Cloud plugin, which allows Elasticsearch processes running on different EC2 instances to discover each other and form a single cluster. This is great for the scalability and reliability of our cluster: we can simply launch additional instances when we need them and they will automatically join the cluster. In case of an instance failure, OpsWorks' auto healing feature can replace that node, and the new instance will join the cluster automatically with no need for manual intervention (e.g. there is no need to maintain a file with the list of IP addresses that form the cluster). The community cookbook also follows some of the best practices around increasing the OS open files limits, allocating half of the instance's RAM to the JVM, etc.
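To illustrate the discovery mechanism, the EC2-related settings the cookbook renders into elasticsearch.yml look roughly like the fragment below; the region and the tag name/value shown are assumptions, not the cookbook's exact output:

```yaml
# Sketch of the AWS Cloud plugin settings in elasticsearch.yml
# (region and tag name/value are placeholders).
cloud:
  aws:
    region: us-east-1
discovery:
  type: ec2
  ec2:
    tag:
      "opsworks:stack": elasticsearch
```

With settings like these, each node queries the EC2 API for instances carrying the given tag and joins them into a cluster, with no static host list to maintain.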

Step 1: Store cookbooks on S3

AWS OpsWorks can retrieve custom cookbooks from a variety of repository types. In our case we will use an S3 bucket which is typically the preferred option for a private archive.

We are basing our example on version 0.3.7 of the excellent public cookbook that the community created and maintains at https://github.com/elasticsearch/cookbook-elasticsearch. This will install Elasticsearch version 0.90.5. We will also need to include the following dependencies:

https://github.com/opscode-cookbooks/apt (v 2.3.8)

https://github.com/opscode-cookbooks/ark (v 0.4.2)

https://github.com/opscode-cookbooks/bluepill (v 2.3.1)

https://github.com/opscode-cookbooks/build-essential (v 1.4.2)

https://github.com/opscode-cookbooks/java (v 1.21.2)

https://github.com/apsoto/monit (v 0.6)

https://github.com/opscode-cookbooks/nginx (v 2.2.2)

https://github.com/opscode-cookbooks/ohai (v 1.1.12)

https://github.com/poise/python (v 1.4.6)

https://github.com/opscode-cookbooks/rsyslog (v 1.11.0)

https://github.com/opscode-cookbooks/windows (v 1.30.0)

https://github.com/opscode-cookbooks/yum (v 3.1.0)

https://github.com/opscode-cookbooks/yum-epel (v 0.3.4)

In addition, we have created a few custom recipes, which you can find here.

The esplugins recipe (layer-custom/recipes/esplugins.rb) automates the installation of the elasticsearch-head plugin, which adds a nice GUI for monitoring the state of the cluster. The version of the community cookbook we tested had an issue with the installation of plugins, which is why we built our own custom recipe.
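A minimal sketch of what such a recipe can look like (this is Chef DSL, so it only runs under chef-client; the install path is an assumption):

```ruby
# Hypothetical sketch of layer-custom/recipes/esplugins.rb: install the
# elasticsearch-head plugin with Elasticsearch's own plugin tool.
execute "install elasticsearch-head plugin" do
  command "bin/plugin -install mojombo/elasticsearch-head"
  cwd     "/usr/local/elasticsearch"   # assumed Elasticsearch install dir
  not_if  { ::File.directory?("/usr/local/elasticsearch/plugins/head") }
end
```

The `not_if` guard keeps the recipe idempotent, so repeated Chef runs do not attempt to reinstall the plugin.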

A custom recipe was also required to configure monit so that it automatically restarts the elasticsearch service if it fails. The way the community cookbook sets up monit is not compatible with the OpsWorks configuration (OpsWorks itself uses monit to ensure its own agent is running). So our custom recipe simply copies the cookbook’s monit configuration file for the elasticsearch process to /etc/monit.d (layer-custom/recipes/esmonit.rb).
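A sketch of the idea, assuming the cookbook ships a monit configuration file for the elasticsearch process (the file name shown is a placeholder):

```ruby
# Hypothetical sketch of layer-custom/recipes/esmonit.rb: drop the
# elasticsearch monit configuration into /etc/monit.d, where the monit
# instance managed by OpsWorks picks up service definitions.
cookbook_file "/etc/monit.d/elasticsearch.monitrc" do
  source "elasticsearch.monitrc"   # bundled file name is an assumption
  owner  "root"
  group  "root"
  mode   "0644"
end

execute "reload monit" do
  command "monit reload"
  action  :nothing
  subscribes :run, "cookbook_file[/etc/monit.d/elasticsearch.monitrc]", :delayed
end
```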

Another custom recipe (layer-custom/recipes/allocation-awareness.rb) makes Elasticsearch aware of each instance's availability zone and spreads shards and their replicas across zones, so that even an Availability Zone disruption does not eliminate both a shard and its replica(s). It uses OpsWorks' stack configuration JSON to retrieve the availability zone in which the particular instance is deployed and passes this information to the Elasticsearch configuration.
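The mechanics might look roughly like this; the attribute paths shown are assumptions based on the OpsWorks stack configuration JSON and the community cookbook's defaults:

```ruby
# Hypothetical sketch of layer-custom/recipes/allocation-awareness.rb: read
# the availability zone from the OpsWorks stack configuration JSON and pass
# it to Elasticsearch as the rack_id used for shard allocation awareness.
az = node[:opsworks][:instance][:availability_zone]

node.override[:elasticsearch][:custom_config] = {
  "node.rack_id" => az,
  "cluster.routing.allocation.awareness.attributes" => "rack_id"
}
```

With rack_id set per node, Elasticsearch will avoid placing a shard and its replica(s) on nodes that share the same rack_id value, i.e. the same availability zone.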

Finally we created a custom recipe (layer-custom/recipes/cloudwatch-custom.rb) that pushes custom metrics into Amazon CloudWatch. In the example we simply send the number of Elasticsearch data nodes in the cluster. Based on that, we could create an alarm to notify us when this falls below the expected value. It would be simple to expand on this and add additional metrics.
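One possible shape for such a recipe is a cron job that polls the cluster health API and forwards the value; the namespace, metric name, and use of the AWS CLI below are all assumptions for illustration:

```ruby
# Hypothetical sketch of layer-custom/recipes/cloudwatch-custom.rb: every
# five minutes, read the data-node count from the cluster health API and
# push it to CloudWatch (namespace and metric name are placeholders).
cron "elasticsearch-data-node-count" do
  minute "*/5"
  command "count=$(curl -s http://localhost:9200/_cluster/health | " \
          "grep -o '\"number_of_data_nodes\":[0-9]*' | cut -d: -f2); " \
          "aws cloudwatch put-metric-data --namespace 'Elasticsearch' " \
          "--metric-name DataNodes --value \"$count\""
end
```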

We placed all the above cookbooks under a folder called ‘cookbooks’, which we zipped and uploaded to an S3 bucket. We also created an AWS IAM user to give OpsWorks access to that bucket. See the OpsWorks documentation for more information about cookbook repositories.

Step 2: Create an IAM Role for node discovery & custom CloudWatch metrics

We created an AWS IAM role called 'elasticsearch' that will allow our EC2 instances to discover each other through the Elasticsearch AWS cloud plugin (which relies on the AWS API) and push custom metrics to CloudWatch.

Step 3: Add the OpsWorks Stack

The next step was to create our OpsWorks stack. We selected the region to deploy in and set Amazon Linux as our operating system of choice. Under the advanced options we selected 11.4 as our Chef version and specified the S3 object that contains our cookbooks. We then used the custom JSON functionality of OpsWorks to override some of the default attributes of the community cookbook. Note: make sure to replace the username and password entries for nginx. The above attributes will:

install Oracle's JVM

set up nginx to act as a password-protected proxy (in a real production environment you would want to lock this down further, e.g. by deploying in a VPC and restricting access via security groups and network ACLs)

specify parameters around the local gateway (so that data can be automatically recovered from the Amazon EBS volumes if a full restart of the cluster is performed)

configure the AWS Cloud plugin to automate the discovery of the rest of the cluster based on a specific tag

define the parameter (in our case rack_id=availability zone) that will help Elasticsearch spread data redundantly in separate facilities.

set the minimum_master_nodes value to (N+1)/2 = 2 (so that we avoid the split-brain scenario with N = 3 data nodes in total)

Step 4: Add the Elasticsearch layer

We then created the Elasticsearch layer and added the following recipes to its Setup event: elasticsearch, elasticsearch::aws (for the discovery of nodes via the AWS API), elasticsearch::proxy (to provide a password-protected web proxy), java (to install Oracle's Java runtime), apt & ark (required dependencies), layer-custom::esplugins (a custom recipe we wrote to overcome the issue with the community cookbook), layer-custom::allocation-awareness (to pass the availability zone of the node as the rack_id parameter used by Elasticsearch for location-aware distribution of shards), layer-custom::esmonit (to add Elasticsearch to the list of services monit will monitor and restart if they fail), and layer-custom::cloudwatch-custom (to push custom metrics into CloudWatch).

We used OpsWorks' Amazon EBS volumes functionality to mark our instances as EBS-optimized and to add a PIOPS volume for the Elasticsearch data folder. This configuration setting is not available when you create a layer, so you will need to edit the layer after its creation. In case of an instance failure, the replacement instance will inherit the original instance's volume. In combination with Elasticsearch's local gateway configuration, this allows automatic recovery of data from disk, e.g. even in the scenario where a multi-instance failure affects both a shard and its replica(s). This configuration is just an example; we encourage customers to measure their storage requirements both in terms of size and IOPS. By using EBS volumes we also get an easy way to perform backups, through the ability to store EBS snapshots to S3. A different option would be to use the SSD ephemeral storage of some of the newer EC2 instance types, as long as the durability characteristics of EBS are not required.
This might be the case if you are deploying Elasticsearch version 1, which provides incremental point-in-time snapshot and restore functionality. Finally, we added an IAM instance profile to improve security (since we will not need to add or rotate AWS IAM credentials in our custom JSON).

Step 5: Launch the instances

We were now ready to launch our instances. In our example we launched 3 instances in 3 distinct availability zones, and a few minutes later they were up and running. We can verify the cluster has been set up as expected via the GUI of the elasticsearch-head plugin (which also allows us to create e.g. a test index): http://instance-ip-address/_plugin/head

Launching in your production environment

We have created a CloudFormation template to simplify the OpsWorks stack creation process in the US East (Northern Virginia) region. To use this template:

We recommend that you download and store the Oracle JDK and Elasticsearch bundle on S3 to avoid dependencies on those repositories. You can point to those assets in the custom JSON.

Download the CloudFormation template and run it from the CloudFormation console to create your OpsWorks stack. It uses a sample GitHub repository with the layer-custom cookbook provided and a Berksfile with all the referenced community cookbooks. It will create an OpsWorks stack and an Elasticsearch layer.

Once the CloudFormation template creation is complete, open the OpsWorks console. You will need to edit the stack and enable Berkshelf. Then create and start three instances.

When you are done, you can delete your instances and then delete the CloudFormation stack to clean up your resources.

The above should serve as a starting point for the configuration of Elasticsearch on EC2 with OpsWorks. For a real production environment you would want to implement additional monitoring, autoscaling, scheduled backups to a regional service like S3, tighter security, etc.
You would also want to frequently benchmark your application to make sure you are running on the optimal instance type and EBS configuration. If you prefer to focus on building your application without having to manage the search service, you might want to consider Amazon Elasticsearch Service instead, a managed service that makes it easy to deploy, operate, and scale Elasticsearch in the AWS Cloud.

References:

http://www.elasticsearch.org/tutorials/deploying-elasticsearch-with-chef-solo/

http://asquera.de/opensource/2012/11/25/elasticsearch-pre-flight-checklist/

https://blog.liip.ch/archive/2013/07/19/on-elasticsearch-performance.html

— Andreas Chatzakis, Solutions Architect