It is not a good idea to store backups on your production servers. It is better to use a cloud storage service like AWS S3, or a self-hosted open-source object store such as MinIO on ZFS. In this post I am going to demonstrate how to back up Elasticsearch data to S3 and set a retention policy on it. As I mentioned before, I used to manage the backup lifecycle in S3 with two scripts: one for creating snapshots and another for deleting old ones. Snapshot Lifecycle Management (SLM), one of the great features introduced in Elasticsearch 7.5, replaces both. This post demonstrates how to configure S3 as a snapshot repository and manage the backup lifecycle with SLM. If you still need to install and configure Elasticsearch, just follow the previous documentation.

Create S3 Bucket

I hope you have already created an AWS account and IAM user credentials for S3. So let's start by creating an S3 bucket to store the Elasticsearch snapshots. Go to the AWS S3 service page, click the 'Create bucket' button, set a name for the bucket, and follow the steps.

Configure S3 Access in Elasticsearch keystore

Next, add the IAM user credentials to the Elasticsearch keystore so it can access the S3 bucket we just created. Use the following commands to set these credentials.

bin/elasticsearch-keystore add s3.client.default.access_key

bin/elasticsearch-keystore add s3.client.default.secret_key

Install S3 Plugin

You need to install the Elasticsearch repository-s3 plugin; just run the command:

bin/elasticsearch-plugin install repository-s3

You can check that repository-s3 is installed:

curl -X GET "localhost:9200/_cat/plugins?v&s=component&h=component,version,description&pretty"

The response should look like:

component version description

repository-s3 7.5.0 The S3 repository plugin adds S3 repositories

Configure Backup Repository

Set up the backup repository with the S3 bucket.
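A repository registration call might look like this (the bucket name `my-es-backups` is a placeholder -- use the bucket you created earlier; the repository name `my_s3_repository` is the one used in the restore section below):

```shell
# Register an S3 repository named "my_s3_repository".
# "my-es-backups" is an example bucket name -- replace it with your own.
curl -X PUT "localhost:9200/_snapshot/my_s3_repository?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "my-es-backups"
  }
}'
```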

The response should be "acknowledged" : true if the repository creation succeeded.

Configure SLM policy

After that, set the snapshot lifecycle management policy for the S3 repository: set the backup schedule, snapshot name, repository, indices, and retention policy.
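A minimal policy definition might look like the following (the policy name `my-s3-snapshots` matches the one executed below; the cron schedule, snapshot name pattern, and retention values are example settings -- adjust them to your needs):

```shell
# Create an SLM policy that snapshots all indices daily at 01:30
# and keeps between 5 and 50 snapshots for up to 30 days.
curl -X PUT "localhost:9200/_slm/policy/my-s3-snapshots?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snapshot-{now/d}>",
  "repository": "my_s3_repository",
  "config": {
    "indices": ["*"]
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}'
```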

Finally, just run the following command to take a backup to S3 immediately.

curl -X POST "localhost:9200/_slm/policy/my-s3-snapshots/_execute?pretty"

You will get a response containing the snapshot name if it succeeded, and the snapshot files will appear in your S3 bucket.

Now we can check the policy.

curl -X GET "localhost:9200/_slm/policy/my-s3-snapshots?human&pretty"

The response contains the number of snapshots taken, the next scheduled snapshot time, and a lot of other details about the SLM policy.

Restore from Snapshot

The following command returns the full list of snapshots; note down the name of the one you want to restore.

curl -X GET "localhost:9200/_snapshot/my_s3_repository/_all?pretty"

Remove one index from the Elasticsearch cluster to test the restore process.

First, list the available indices in your cluster.
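The cat indices API gives a quick overview:

```shell
# List all indices with health, uuid, and document counts.
curl -X GET "localhost:9200/_cat/indices?v&pretty"
```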

Delete the index 'hashtags' to demonstrate restoring it from the S3 backup.

curl -X DELETE "localhost:9200/hashtags?pretty"

Run the indices list command above again to verify the deletion. Now it is time to run the restore command:
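A restore call might look like this (the snapshot name here is a placeholder -- use one from the `_all` snapshot listing above):

```shell
# Restore only the deleted "hashtags" index from a snapshot.
# Replace "daily-snapshot-2019.12.20" with an actual snapshot name.
curl -X POST "localhost:9200/_snapshot/my_s3_repository/daily-snapshot-2019.12.20/_restore?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "indices": "hashtags"
}'
```

Omitting the request body restores all indices in the snapshot.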

There is no need to specify index names if you are restoring the data into a new cluster. Otherwise, you will get an 'index with the same name already exists' error if some of the indices are already present in the Elasticsearch cluster.

Now verify the restore with the index list command again; you can see that the index UUID differs before and after the restore.

This is a very convenient way to manage the backup lifecycle in Elasticsearch. You can also use Kibana to configure the lifecycle policy, and you can restore a snapshot directly into a new Elasticsearch cluster using the same repository configuration.

Cheers,