The configuration for both workloads is similar, but let’s take some time to appreciate the differences between the two node types.

All of the Pods are created from the same Docker image, with Elasticsearch configured through environment variables. Four environment variables define the capabilities of the node: node.master, node.ingest, node.data, and search.remote.connect. Learn more about what each of these options means here.
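For reference, the node's role is set right in the container's environment. A minimal sketch of the containers entry for a dedicated master node (the image tag and values here are illustrative, not copied from the actual manifests):

- name: elasticsearch
  image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.2   # example tag
  env:
    - name: node.master
      value: "true"
    - name: node.ingest
      value: "false"
    - name: node.data
      value: "false"
    - name: search.remote.connect
      value: "false"

A data node flips node.data to "true" and node.master to "false", so it stores shards but never stands for master election.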

There are two kinds of health checks, but the master nodes only use one of them. The livenessProbe ensures Elasticsearch is listening on port 9300, confirming the application is running. The readinessProbe performs an Elasticsearch cluster health check and expects a 200 response. The master nodes only use the livenessProbe, so they are added to the DNS entry for elasticsearch-master as soon as Elasticsearch is running, regardless of whether the Elasticsearch cluster is actually available. The data nodes also have a readinessProbe, which means clients connecting to elasticsearch.default.svc.cluster.local will not see those nodes until they are able to join the cluster through their discovery mechanism.
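As a rough sketch of those two probes on a data node (the ports come from the description above; the endpoint and timings are assumptions, so check the actual manifests):

livenessProbe:
  tcpSocket:
    port: 9300               # transport port: is the Elasticsearch process up at all?
  initialDelaySeconds: 30
readinessProbe:
  httpGet:
    path: /_cluster/health   # only answers successfully once the node has joined a cluster
    port: 9200
  initialDelaySeconds: 30
  periodSeconds: 10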

Of course, a big difference between a data node and a master node is the attached data volume. Kubernetes creates volumes according to your StorageClass, which for me means the volumes are created as Google Compute Engine Disks.
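The data nodes run as a StatefulSet (hence the elasticsearch-data-0 and elasticsearch-data-1 names you will see in kubectl get pods), so the volume typically comes from a volumeClaimTemplate. A minimal sketch, with the size and storage class name as assumptions:

volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard   # on GKE, "standard" provisions a GCE Persistent Disk
      resources:
        requests:
          storage: 10Gi            # illustrative size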

Start Kibana

Kibana’s official Docker image can also be configured with simple environment variables, so all we need to do is create a Service and a Deployment for it.
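For example, pointing Kibana at the cluster can be as little as one environment variable in the Deployment's container spec (a sketch; the image tag is illustrative):

- name: kibana
  image: docker.elastic.co/kibana/kibana-oss:6.4.2   # example tag
  env:
    - name: ELASTICSEARCH_URL
      value: http://elasticsearch:9200   # the Service that fronts the data nodes
  ports:
    - containerPort: 5601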

$ kubectl apply -f 3_kibana
service/kibana created
deployment.apps/kibana created

The Service we created this time gets a cluster IP, and Kubernetes load-balances requests across the available Pods for us instead of creating a unique DNS A record for each one. One benefit is that the cluster IP is stable: even if a Pod is replaced, the IP address does not change.
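A sketch of that Service (the selector label is an assumption); contrast it with the Elasticsearch Services, which are headless (clusterIP: None) and therefore publish one DNS A record per Pod:

apiVersion: v1
kind: Service
metadata:
  name: kibana
spec:
  selector:
    app: kibana              # illustrative label
  ports:
    - port: 5601
      targetPort: 5601
  # no clusterIP: None here: this Service gets a stable virtual IP and
  # load-balances across whichever Kibana Pods match the selector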

Check on the cluster

Check and see whether all your Pods are ready.

$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
elasticsearch-data-0              1/1     Running   0          33m
elasticsearch-data-1              1/1     Running   0          32m
elasticsearch-master-66c5597...   1/1     Running   0          33m
elasticsearch-master-66c5597...   1/1     Running   0          33m
elasticsearch-master-66c5597...   1/1     Running   0          33m
kibana-5c9767dc4-dqmwt            1/1     Running   0          1m

Hint: If you chose a namespace, use -n NAMESPACE or change your default namespace to see your Pods.
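For example, with NAMESPACE standing in for whatever namespace you picked:

$ kubectl get pods -n NAMESPACE
# or make it the default for your current context:
$ kubectl config set-context $(kubectl config current-context) --namespace=NAMESPACE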

At this point, we should have successfully started a new Elasticsearch cluster and pointed Kibana at it. To test it, we can use port forwarding to access the cluster from our own machine.

$ kubectl port-forward service/elasticsearch 9200
Forwarding from 127.0.0.1:9200 -> 9200
Forwarding from [::1]:9200 -> 9200

And in a separate shell:

$ curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "my-es-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Set up Index Templates and Kibana dashboards

Now that the Elasticsearch cluster is running, it is a good idea to configure the index templates you intend to use. I am going to be using Filebeat to collect logs and Metricbeat to collect metrics from Kubernetes. The 4_beats_init folder contains four configs for Jobs that either set up an index template or install Kibana dashboards. These Jobs typically run once, unless they do not exit cleanly. Run the ones you want individually, or run them all by applying the folder.
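As an illustration of what one of these Jobs boils down to (the real manifests are in 4_beats_init; the image tag, host, and restart policy below are assumptions), the Filebeat template Job essentially runs filebeat setup --template once against the cluster:

apiVersion: batch/v1
kind: Job
metadata:
  name: filebeat-template-init
spec:
  template:
    spec:
      restartPolicy: OnFailure          # retry until the template loads cleanly
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:6.4.2   # example tag
          args:
            - setup
            - --template
            - -E
            - 'output.elasticsearch.hosts=["elasticsearch:9200"]'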

$ kubectl apply -f 4_beats_init
job.batch/filebeat-dashboard-init created
job.batch/filebeat-template-init created
job.batch/metricbeat-dashboard-init created
job.batch/metricbeat-template-init created

$ kubectl get jobs
NAME                        DESIRED   SUCCESSFUL   AGE
filebeat-dashboard-init     1         1            1m
filebeat-template-init      1         1            1m
metricbeat-dashboard-init   1         1            1m
metricbeat-template-init    1         1            1m

Once the jobs are successful, your Elasticsearch cluster is ready to use.

Start Filebeat and Metricbeat Daemons

This step is optional, but if you would like to add extra monitoring and log collection to your Kubernetes cluster, Filebeat and Metricbeat make that possible. Kubernetes DaemonSets are another kind of workload, one that ensures every Kubernetes node runs a copy of a Pod, which is exactly what we want for these agents. Apply the next directory to launch them*.
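A heavily trimmed sketch of the Filebeat DaemonSet is below; the full manifests in 5_beats_agents also wire up the ConfigMaps, ServiceAccount, and RBAC objects you will see in the output, and the paths and tag here are assumptions:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      serviceAccountName: filebeat
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:6.4.2   # example tag
          volumeMounts:
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers       # container logs on every node
              readOnly: true
      volumes:
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers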

$ kubectl apply -f 5_beats_agents
configmap/filebeat-config created
configmap/filebeat-inputs created
daemonset.extensions/filebeat created
clusterrolebinding.rbac.authorization.k8s.io/filebeat created
clusterrole.rbac.authorization.k8s.io/filebeat created
serviceaccount/filebeat created
configmap/metricbeat-config created
configmap/metricbeat-daemonset-modules created
daemonset.extensions/metricbeat created
configmap/metricbeat-deployment-modules created
deployment.apps/metricbeat created
clusterrolebinding.rbac.authorization.k8s.io/metricbeat created
clusterrole.rbac.authorization.k8s.io/metricbeat created
serviceaccount/metricbeat created

*Note: If you are running in GKE, you will need to use the following command to grant yourself the cluster-admin role before you can fully deploy the daemons.

kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole=cluster-admin \
  --user $(gcloud config get-value account)

Second Note: In order for Metricbeat to collect all metrics, you also need to install kube-state-metrics on your cluster.



git clone https://github.com/kubernetes/kube-state-metrics.git
kubectl apply -f kube-state-metrics/kubernetes

Connect to Kibana

For now, we can connect to Kibana the same way we did with Elasticsearch: through Kubernetes port forwarding. To open the tunnel, run kubectl port-forward service/kibana 5601. Then open http://localhost:5601 in your browser to get to Kibana. Set up a default index pattern to begin exploring the data from the Beats agents.

If Kibana loads and you can see the data coming in from the Beats agents, things are going well.

Next…

Congratulations on making it this far! Next up, we learn how to expose our cluster on the internet by using Logstash to authenticate data sources and oauth2_proxy to secure Kibana. Continue on to part two of this guide by following the link.