Apache Kafka on Kubernetes series:

Kafka on Kubernetes - using etcd



This is the second blog post in the Spark on Kubernetes series, so I hope you’ll bear with me as I recap a few items of interest from our only previous one.

Containerization and cluster management technologies are constantly evolving within the cluster computing community. Apache Spark currently supports Apache Hadoop YARN and Apache Mesos, in addition to offering its own standalone cluster manager. In 2014, Google announced the development of Kubernetes which has its own feature set and differentiates itself from YARN and Mesos. In a nutshell, Kubernetes is an open source container orchestration framework that can run on local machines, on-premise or in the cloud. It supports multiple container frameworks (docker, rkt and clear containers), which allows users to select whichever they prefer.

Although support for standalone Spark clusters on k8s currently exists, there are still major advantages to, and significant interest in, native execution support due to the limitations of standalone mode and the advantages of Kubernetes. This is what inspired the spark-on-k8s project, which we at Banzai Cloud are also contributing to, while simultaneously building our PaaS, Pipeline.

Let’s take a look at a simple example of how to upscale Spark automatically while it executes a basic SparkPI job that’s running on EC2 instances.

NAME STATUS ROLES AGE VERSION ip-10-0-100-40.eu-west-1.compute.internal Ready <none> 15m v1.8.3

No pods are running except a shuffle service.

$ kubectl get po -o wide NAME READY STATUS RESTARTS AGE IP NODE shuffle-lckjg 1/1 Running 0 19s 10.46.0.1 ip-10-0-100-40.eu-west-1.compute.internal

Let’s submit SparkPi with dynamic allocation enabled, and pass 50000 as an input parameter in order to see what’s going on in the k8s cluster.

bin/spark-submit \ --deploy-mode cluster \ --class org.apache.spark.examples.SparkPi \ --master k8s://https://34.242.4.60:443 \ --kubernetes-namespace default \ --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \ --conf spark.driver.cores="400m" \ --conf spark.dynamicAllocation.enabled=true \ --conf spark.shuffle.service.enabled=true \ --conf spark.kubernetes.shuffle.namespace=default \ --conf spark.kubernetes.shuffle.labels="app=spark-shuffle-service,spark-version=2.2.0" \ --conf spark.local.dir=/tmp/spark-local \ --conf spark.app.name=spark-pi \ --conf spark.kubernetes.driver.docker.image=banzaicloud/spark-driver-py:v2.2.0-k8s-1.0.179 \ --conf spark.kubernetes.executor.docker.image=banzaicloud/spark-executor-py:v2.2.0-k8s-1.0.179 \ --conf spark.kubernetes.authenticate.submission.caCertFile=my-k8s-aws-ca.crt \ --conf spark.kubernetes.authenticate.submission.clientKeyFile=my-k8s-aws-client.key \ --conf spark.kubernetes.authenticate.submission.clientCertFile=my-k8s-aws-client.crt \ local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar 50000

$ kubectl get po NAME READY STATUS RESTARTS AGE spark-pi-1511535479452-driver 1/1 Running 0 2m spark-pi-1511535479452-exec-1 0/1 Pending 0 38s

The driver is running inside the k8s cluster; so far so good. But, as we can see, only one executor is scheduled and it’s not going anywhere.

Let’s take a look at it:

$ kubectl describe po spark-pi-1511535479452-exec-1 ... ... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedScheduling 11s (x8 over 1m) default-scheduler No nodes are available that match all of the predicates: Insufficient cpu (1), PodToleratesNodeTaints (1).

This warning tells us that there is no node in the k8s cluster with enough resources to run spark executors.

After adding a new EC2 node and making more resources available in the k8s cluster executors start running, since now, with the addition of the new node, there are enough resources.

NAME READY STATUS RESTARTS AGE shuffle-lckjg 1/1 Running 0 10m shuffle-ncprp 1/1 Running 0 47s spark-pi-1511535479452-driver 1/1 Running 0 7m spark-pi-1511535479452-exec-1 0/1 ContainerCreating 0 4m

NAME READY STATUS RESTARTS AGE shuffle-lckjg 1/1 Running 0 11m shuffle-ncprp 1/1 Running 0 56s spark-pi-1511535479452-driver 1/1 Running 0 7m spark-pi-1511535479452-exec-1 1/1 Running 0 5m

All up-scaling is handled transparently, as k8s began to automatically execute pending executors as soon as the required resource became available. We didn’t have to do anything in Spark, no reconfiguration and no installation of components whatsoever.

The above process still involves manual steps if you want to operate the cluster in a manual way - however, our end goal and the aim of our Apache Spark Spotguide is to automate the whole process. Pipeline uses Prometheus to extrernalize monitoring information from cluster infrastructure, the cloud and the running Spark application, itself, then wires that information back to Kubernetes to anticipate and make upscales/downscales as practical as possible while maintaining predefined SLA rules.

As usual, we’ve open sourced all the Spark images and charts necessary to run Spark natively on Kubernetes, and made them available in our Banzai Cloud GitHub repository.

In our next post in this series we’ll discuss the internals of Spark on Kubernetes. For a brief preview, check out this sequence diagram that highlights the internal flow of events and illustrates how a Spark cluster works with/inside Kubernetes. Sure, it might not be completely straightforward now, but rest assured that after finishing this series, the details will have been made crystal clear.