Typical cloud application requirements such as deployment, versioning, scaling, and monitoring are also among the common operational challenges of Machine Learning (ML) services.

This post will focus on building an ML serving infrastructure that can continuously update, version, and deploy models.

Infrastructure Stack

In building our ML serving infrastructure, we will set up a Kubernetes cluster in a cloud environment and leverage Istio to handle service-level operations. Next, we will use TensorFlow Serving to deploy and serve a ResNet model hosted on an S3 bucket. Lastly, we will look at how to perform staged canary rollouts of newer model versions and eventually automate the rollout process with Flagger.

At a high-level, our infrastructure stack includes:

Kubernetes: open-source container orchestration system for application infrastructure and management.

Istio: open-source “service-mesh” to enable operational management of micro-services in distributed environments.

TensorFlow Serving: open-source high-performance ML model serving system.

S3 Storage: AWS cloud object storage.

Flagger: open-source automated canary deployment manager, packaged as a Kubernetes operator.

Kubernetes Cluster

Kubernetes has done wonders in re-shaping the cloud infrastructure landscape. Spinning up a cluster is supported on multiple environments, with almost all major cloud providers offering managed Kubernetes as hosted solutions.

For this post, we will take the opportunity to test-drive one of the newest solutions on the block, DigitalOcean’s managed Kubernetes offering. Get started by creating a new DigitalOcean Kubernetes cluster with your chosen datacenter and node pool configuration.
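If you prefer the command line, the same cluster can be created with DigitalOcean’s doctl CLI. Below is a minimal sketch assuming doctl is installed and authenticated; the cluster name, region, Kubernetes version, node size, and node count are placeholders to adjust to your own setup.

# create a managed Kubernetes cluster (all values below are placeholders)
doctl kubernetes cluster create ml-serving \
  --region sfo2 \
  --version 1.13.5-do.1 \
  --node-pool "name=default-pool;size=s-2vcpu-4gb;count=1"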

Download the cluster configuration file and export it in your bash session.

export KUBECONFIG=k8s-1-13-5-do-1-sfo2-1555861262145-kubeconfig.yaml
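To confirm that kubectl is now pointed at the new cluster, check the active context and cluster endpoint:

kubectl config current-context
kubectl cluster-info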

Check the status of the node and verify that it is healthy and ready to accept workloads.

kubectl get nodes
NAME                                        STATUS   ROLES    AGE   VERSION
k8s-1-13-5-do-1-sfo2-1555861262145-1-msa8   Ready    <none>   57s   v1.13.5

Istio

Istio is an open-source “service mesh” that layers itself transparently onto existing distributed infrastructure.

A “service mesh” is an abstraction for inter-connected services interacting with each other. This abstraction helps reduce the complexity of managing connectivity, security, and observability of applications in a distributed environment.

Istio helps tackle these problems by providing a complete solution with insights and operational control over the connected services within the “mesh”. Some of Istio’s core features include:

Load balancing on HTTP, gRPC, TCP connections

Traffic management control with routing, retry, and failover capabilities (a routing sketch follows this feature list)

A monitoring infrastructure that includes metrics, tracing and observability components

End-to-end TLS security
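To make the traffic management feature concrete, here is a minimal sketch of an Istio DestinationRule and VirtualService that split traffic between two versions of a hypothetical resnet-serving service; the service name, subsets, and weights are illustrative placeholders, not resources we have deployed yet.

kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: resnet-serving
spec:
  host: resnet-serving
  subsets:
  - name: v1           # stable version, selected by the pod label version=v1
    labels:
      version: v1
  - name: v2           # canary version, selected by the pod label version=v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: resnet-serving
spec:
  hosts:
  - resnet-serving
  http:
  - route:
    - destination:
        host: resnet-serving
        subset: v1
      weight: 90       # 90% of requests go to the stable version
    - destination:
        host: resnet-serving
        subset: v2
      weight: 10       # 10% of requests go to the canary
EOF

Shifting these weights over time is exactly the kind of staged canary rollout that Flagger will automate for us later in this post.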

Installing Istio on an existing Kubernetes cluster is pretty simple. For an installation guide, take a look at this excellent post by @nethminiromina:

Create the custom resource definitions (CRDs) from the downloaded Istio package directory:

kubectl apply -f install/kubernetes/helm/istio/templates/crds.yaml
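To verify that the CRDs were registered, count the Istio resource definitions; the exact number varies by Istio release, so expect a few dozen entries.

kubectl get crds | grep 'istio.io' | wc -l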

Next, deploy the Istio components to the cluster from the packaged “all-in-one” demo manifest:

kubectl apply -f install/kubernetes/istio-demo.yaml
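Once the manifest is applied, wait for the control-plane pods in the istio-system namespace to reach a Running (or Completed, for one-off setup jobs) state before moving on:

kubectl get pods -n istio-system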