Banzai Cloud’s Pipeline platform is an operating system that allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security - multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communication between components using TLS, vulnerability scans, static code analysis, etc. - is a tier-zero feature of the Pipeline platform, which we strive to automate and enable for all enterprises.

- The Pipeline platform automatically scans images for vulnerabilities
- We switched from Clair to Anchore Engine to gain multiple vulnerability backends, better multi-tenancy and policy validation
- We open sourced a Helm chart to deploy Anchore
- We open sourced a Kubernetes admission webhook to scan images
- Pipeline automates all of these steps

In this post we’d like to go into detail about how container image vulnerability scans work, focusing on catching vulnerabilities at the moment deployments are submitted to the cluster.

Key aspects of container image vulnerability scans

- Every image should be scanned, no matter where it comes from (i.e. deployment, operator, etc.)
- It should be possible to set up policies with rules that allow or reject a pod
- These policies should be associated with clusters
- If the policy result is a rejection, creation of the pod should be blocked
- There should be an easy way to whitelist a Helm deployment

Admission webhooks and Anchore Engine

A few months back our vulnerability scans were based on Clair, but we ended up switching to Anchore Engine, due to the multi-tenant nature of our platform and a host of new requirements from our users.

Anchore Engine is an open source project that provides a centralized service for the inspection, analysis, and certification of container images. It requires a PostgreSQL database for persistent storage, and can be accessed directly through its RESTful API or via the Anchore CLI.

If you want to try it out for yourself, we open sourced the Helm chart that we built and are using on our Pipeline platform. It supports PostgreSQL and Google’s CloudSQL as database backends. Needless to say, the whole process is automated thanks to Pipeline.

Anchore Image Validator

The Anchore Image Validator works as an admission server. After it is registered to the Kubernetes cluster as a Validating Webhook, it validates every Pod deployed to the cluster. The server checks the images defined in the PodSpec against the configured Anchore Engine endpoint. Based on the response, the admission hook decides whether to accept or reject the deployment.
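The registration itself happens through a ValidatingWebhookConfiguration object. The actual object is generated by the Helm chart; the sketch below only illustrates the general shape, and the names, namespace, path and CA bundle in it are placeholders:

```yaml
# Illustrative sketch only - the real configuration is created by the
# Helm chart; names, namespace, path and caBundle are placeholders.
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: anchore-policy-validator
webhooks:
  - name: validator.admission.anchore.io
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]      # validate pods as they are created
        resources: ["pods"]
    failurePolicy: Fail             # reject pods if the webhook is unreachable
    clientConfig:
      service:
        name: anchore-policy-validator   # placeholder service name
        namespace: default
        path: /apis/v1beta1.admission.anchore.io
      caBundle: <base64-encoded-CA-cert>
```

With this in place, the API server calls the admission server on every Pod CREATE request before the object is persisted.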

If you want to learn more about validating admission webhooks, you can find a detailed description in one of our previous blog posts: in-depth introduction to admission webhooks.

Anchore Image Validator was inspired by Vic Iglesias’ kubernetes-anchore-image-validator, which leverages the Generic Admission Server for most of the heavy lifting of implementing the admission webhook API. We redesigned and extended it with whitelist and scan log features. For flexibility, it uses Custom Resource Definitions to store and evaluate those extensions.

Using the Anchore Image Validator

The Helm deployment of the Anchore Policy Validator contains all the necessary resources, including the CRDs.

```
$ kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io
NAME                                                      AGE
validator-anchore-policy-validator.admission.anchore.io   1d
```

```
$ kubectl get apiservices.apiregistration.k8s.io
NAME                                   AGE
v1.                                    16d
v1.apps                                16d
v1.authentication.k8s.io               16d
v1.authorization.k8s.io                16d
v1.autoscaling                         16d
v1.batch                               16d
v1.networking.k8s.io                   16d
v1.rbac.authorization.k8s.io           16d
v1.storage.k8s.io                      16d
v1alpha1.security.banzaicloud.com      1d
v1beta1.admission.anchore.io           1d
v1beta1.admissionregistration.k8s.io   16d
v1beta1.apiextensions.k8s.io           16d
v1beta1.apps                           16d
v1beta1.authentication.k8s.io          16d
v1beta1.authorization.k8s.io           16d
v1beta1.batch                          16d
v1beta1.certificates.k8s.io            16d
v1beta1.extensions                     16d
v1beta1.metrics.k8s.io                 16d
v1beta1.policy                         16d
v1beta1.rbac.authorization.k8s.io      16d
v1beta1.storage.k8s.io                 16d
v1beta2.apps                           16d
v2beta1.autoscaling                    16d
```

After deploying these CRDs, you can access them via the Kubernetes API server:

```
$ curl http://<k8s-apiserver>/apis/security.banzaicloud.com/v1alpha1
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "security.banzaicloud.com/v1alpha1",
  "resources": [
    {
      "name": "whitelistitems",
      "singularName": "whitelistitem",
      "namespaced": false,
      "kind": "WhiteListItem",
      "verbs": [ ... ],
      "shortNames": [ "wl" ]
    },
    {
      "name": "audits",
      "singularName": "audit",
      "namespaced": false,
      "kind": "Audit",
      "verbs": [ ... ]
    }
  ]
}
```

These resources are also accessible with the kubectl command:

```
$ kubectl get crd
NAME                                      AGE
audits.security.banzaicloud.com           21h
whitelistitems.security.banzaicloud.com   21h

$ kubectl get whitelistitems -o custom-columns=NAME:.metadata.name,CREATOR:.spec.creator,REASON:.spec.reason
NAME              CREATOR      REASON
test-whiltelist   pbalogh-sa   just-testing

$ kubectl get audits -o custom-columns=NAME:.metadata.name,RELEASE:.spec.releaseName,IMAGES:.spec.image,RESULT:.spec.result
NAME                        RELEASE                IMAGES    RESULT
replicaset-test-b468ccf8b   test-b468ccf8b-2s6tj   [nginx]   [reject]
```

While the Anchore Engine itself has a way of whitelisting, such whitelists only apply to attributes like:

- image name, tag or hash,
- concrete CVEs,
- libraries, files or other filesystem-based matches.

Our approach to filtering is based on Helm deployments. However, covering whitelists at the deployment level with CVEs or image names is simply not feasible. To manage whitelisted deployments we use a custom resource definition, so the admission hook accepts deployments that match any whitelist element, no matter what the scan result is.

Note: All resources included in a Helm Deployment must have the release-name label.
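For example, a Pod that belongs to a whitelisted release would carry the label like this (a sketch only; the Pod name, image and label value are placeholders, and in a real Helm release the label is set by the chart templates):

```yaml
# Illustrative Pod sketch carrying the release-name label that the
# validator uses to match whitelist entries; all values are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  labels:
    release-name: test-whitelist   # matched against WhiteListItem names
spec:
  containers:
    - name: app
      image: nginx
```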

The CRD structure includes the following fields:

- name: name of the whitelisted release
- creator: the Pipeline user who created the rule
- reason: the reason for whitelisting

Example whitelist:

```
$ kubectl get whitelist test-whitelist -o yaml
apiVersion: security.banzaicloud.com/v1alpha1
kind: WhiteListItem
metadata:
  clusterName: ""
  creationTimestamp: 2018-09-25T06:44:49Z
  name: test-whitelist
  namespace: ""
  resourceVersion: "1981225"
  selfLink: /apis/security.banzaicloud.com/v1alpha1/test-whiltelist
  uid: 7f9a094d-c08e-11e8-b34e-42010a8e010f
spec:
  creator: pbalogh-sa
  reason: just-testing
```

This approach will allow the investigation of problems while not disturbing production services.
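Since WhiteListItem is an ordinary custom resource, a whitelist entry can also be created by applying a minimal manifest with kubectl (the name, creator and reason below are placeholders):

```yaml
# whitelist.yaml - minimal WhiteListItem sketch; values are placeholders.
apiVersion: security.banzaicloud.com/v1alpha1
kind: WhiteListItem
metadata:
  name: test-whitelist       # name of the whitelisted release
spec:
  creator: pbalogh-sa        # the Pipeline user creating the rule
  reason: just-testing       # reason for whitelisting
```

Applied with `kubectl apply -f whitelist.yaml`, after which deployments of the matching release are accepted regardless of scan results.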

Scan Events (Audit logs)

Finding the result of an admission hook decision can be troublesome, so we introduced the Audit custom resource, which makes it easy to track the result of each scan. Instead of searching through events, you can easily filter these resources with kubectl. The CRD structure includes the following fields:

- releaseName: the scanned release
- resource: the scanned resource (Pod)
- image: the scanned images (in the Pod)
- result: the scan results (per image)
- action: the admission action (allow, reject)

During image scans, the admission server logs the results to audits.security.banzaicloud.com and sets their ownerReferences to the scanned Pod’s parent. This provides a compact overview of the resources running on the cluster. Because these events are bound to Kubernetes resources, the cluster can clean them up when the original resource (the Pod) is no longer present.

Example audit log:

```
$ kubectl get audits replicaset-test-b468ccf8b -o yaml
apiVersion: security.banzaicloud.com/v1alpha1
kind: Audit
metadata:
  clusterName: ""
  creationTimestamp: 2018-09-24T09:06:31Z
  labels:
    fakerelease: "true"
  name: replicaset-test-b468ccf8b
  namespace: ""
  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: test-b468ccf8b
    uid: 1c20ed8d-bfd9-11e8-b34e-42010a8e010f
  resourceVersion: "1857033"
  selfLink: /apis/security.banzaicloud.com/v1alpha1/replicaset-test-b468ccf8b
  uid: 20e75829-bfd9-11e8-b34e-42010a8e010f
spec:
  action: allowed
  image:
  - postgres
  releaseName: test-b468ccf8b-2s6tj
  resource: Pod
  result:
  - 'Image passed policy check: postgres'
status:
  state: ""
```

A core feature of the Pipeline Platform

These building blocks are great on their own; however, many steps would ordinarily be left to perform manually. We have tightly integrated these tasks into Pipeline to help manage your cluster's security. We automated the following:

1. Generate an Anchore user with credentials (one technical user per cluster)
2. Save the generated credentials to Vault - Pipeline’s main secret store (we persist these credentials for later use)
3. Set up the Anchore user's policy bundles. The user can choose one of a number of predefined policy bundles or create a custom one
4. Deploy the Validating Admission Webhook using the credentials and the Anchore Engine service URL
5. Provide a RESTful API for all these resources through Pipeline

Predefined policy bundles

To simplify bootstrapping, we have predefined basic policy bundles for Anchore:

- Allow all: the most permissive policy. One can deploy anything, but receives feedback about all deployed images
- Reject Critical: prevents deploying containers with critical CVEs
- Reject High: prevents deploying containers with high severity CVEs
- Block root: prevents deploying containers whose apps run with root privileges
- Deny all: the most restrictive policy. Only explicitly whitelisted releases are accepted
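To give a feel for what such a bundle contains, here is a trimmed sketch of a "Reject Critical"-style Anchore policy bundle. This is not the exact bundle shipped with Pipeline; the IDs and names are placeholders, and only the severity rule is shown:

```json
{
  "id": "reject-critical-example",
  "version": "1_0",
  "name": "Reject Critical (sketch)",
  "policies": [
    {
      "id": "default-policy",
      "name": "default",
      "version": "1_0",
      "rules": [
        {
          "id": "rule-1",
          "gate": "vulnerabilities",
          "trigger": "package",
          "action": "STOP",
          "params": [
            { "name": "package_type", "value": "all" },
            { "name": "severity_comparison", "value": ">=" },
            { "name": "severity", "value": "critical" }
          ]
        }
      ]
    }
  ]
}
```

A STOP action on this gate makes the policy evaluation fail, which the admission webhook translates into rejecting the Pod.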

Next steps