Kubernetes is the de facto standard for container orchestration (deploying workloads on distributed systems). Google Kubernetes Engine (GKE) is the managed Kubernetes service provided by Google Cloud Platform.

Currently, GKE is still your best choice compared to other managed Kubernetes services, e.g., Azure Kubernetes Service (AKS) and Amazon Elastic Container Service for Kubernetes (EKS).

ref:

https://kubernetes.io/

https://cloud.google.com/kubernetes-engine/

You can find the sample project on GitHub.

https://github.com/vinta/simple-project-on-k8s

Installation

Install gcloud to create Kubernetes clusters on Google Cloud Platform.

Install kubectl to interact with any Kubernetes cluster.

$ brew install kubernetes-cli
# or
$ gcloud components install kubectl
$ gcloud components update

ref:

https://cloud.google.com/sdk/docs/

https://kubernetes.io/docs/tasks/tools/install-kubectl/

Some useful tools that appear later in this guide: kubetail (tail logs from multiple Pods at once), kubespy (observe Kubernetes resources in real time), and fubectl (interactive helpers around kubectl, including context switching).

Concepts

Nodes

Cluster: A set of machines, called nodes, that run containerized applications.

Node: A single virtual or physical machine that provides hardware resources.

Edge Node: The node which is exposed to the Internet.

Master Node: The node which is responsible for managing the whole cluster.

Objects

Pod: A group of tightly related containers. Each Pod is like a logical host that has its own IP, hostname, and storage.

PodPreset: A set of pre-defined configurations that can be injected into Pods automatically.

Service: A load balancer for a set of Pods selected by labels; this is also how service discovery works in Kubernetes.

Ingress: A reverse proxy that acts as an entry point to the cluster, allowing domain-based and path-based routing to different Services.

ConfigMap: Key-value configuration data that can be mounted into containers or consumed as environment variables.

Secret: Similar to ConfigMap but for storing sensitive data only.

Volume: An ephemeral file system whose lifetime is the same as the Pod's.

PersistentVolume: A persistent file system that can be mounted to the cluster, without being associated with any particular node.

PersistentVolumeClaim: A user's request for storage which binds to a PersistentVolume; Pods mount PersistentVolumeClaims as Volumes.

StorageClass: A storage provisioner which allows users to request storage dynamically.

Namespace: The way to partition a single cluster into multiple virtual groups.

Controllers

ReplicationController: Ensures that a specified number of Pods are always running.

ReplicaSet: The next-generation ReplicationController.

Deployment: The recommended way to deploy stateless Pods.

StatefulSet: Similar to Deployment but provides guarantees about the ordering and unique names of Pods.

DaemonSet: Ensures a copy of a Pod is running on every node.

Job: Creates Pods that run to completion (exit with 0).

CronJob: A Job which can run at a specific time or run regularly.

HorizontalPodAutoscaler: Automatically scales the number of Pods based on CPU and memory utilization or custom metric targets.

ref:

https://kubernetes.io/docs/concepts/

https://kubernetes.io/docs/reference/glossary/?all=true

Setup Google Cloud Accounts

Make sure you use the right Google Cloud Platform account.

$ gcloud init
# or
$ gcloud config configurations list
$ gcloud config configurations activate default
$ gcloud config set project simple-project-198818
$ gcloud config set compute/region asia-east1
$ gcloud config set compute/zone asia-east1-a
$ gcloud config list

Create Clusters

Create a regional cluster in the asia-east1 region which has 1 node in each of the asia-east1 zones using --region=asia-east1 --num-nodes=1. By default, a cluster only creates its cluster master and nodes in a single compute zone.

# show available OSs and versions of Kubernetes
$ gcloud container get-server-config

# show available CPU platforms in the desired zone
$ gcloud compute zones describe asia-east1-a
availableCpuPlatforms:
- Intel Skylake
- Intel Broadwell
- Intel Haswell
- Intel Ivy Bridge

$ gcloud container clusters create demo \
    --cluster-version=1.11.6-gke.6 \
    --node-version=1.11.6-gke.6 \
    --scopes=gke-default,cloud-platform,storage-full,compute-ro,pubsub,https://www.googleapis.com/auth/cloud_debugger \
    --region=asia-east1 \
    --num-nodes=1 \
    --enable-autoscaling --min-nodes=1 --max-nodes=10 \
    --maintenance-window=20:00 \
    --machine-type=n1-standard-4 \
    --min-cpu-platform="Intel Skylake" \
    --enable-ip-alias \
    --create-subnetwork="" \
    --image-type=UBUNTU \
    --node-labels=custom.kubernetes.io/fs-type=xfs

$ gcloud container clusters describe demo --region=asia-east1

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-04T04:48:55Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.5-gke.5", GitCommit:"9aba9c1237d9d2347bef28652b93b1cba3aca6d8", GitTreeState:"clean", BuildDate:"2018-12-11T02:36:50Z", GoVersion:"go1.10.3b4", Compiler:"gc", Platform:"linux/amd64"}

$ kubectl get nodes -o wide

You can only get a regional cluster by creating a whole new cluster; Google currently won't allow you to turn an existing cluster into a regional one.

ref:

https://cloud.google.com/sdk/gcloud/reference/container/clusters/create

https://cloud.google.com/compute/docs/machine-types

https://cloud.google.com/kubernetes-engine/docs/concepts/regional-clusters

https://cloud.google.com/kubernetes-engine/docs/how-to/min-cpu-platform

https://cloud.google.com/kubernetes-engine/docs/how-to/alias-ips

Google Kubernetes Engine clusters running Kubernetes version 1.8+ enable Role-Based Access Control (RBAC) by default. Therefore, you must explicitly provide the --enable-legacy-authorization option to disable RBAC.
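
Before you can create Roles and ClusterRoles on GKE, you might need to grant your own Google account the cluster-admin role first. A minimal sketch following the GKE RBAC docs; the account is read from your gcloud config:

$ kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole=cluster-admin \
    --user=$(gcloud config get-value account)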

ref:

https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control

Delete the cluster. After you delete the cluster, you might also need to manually delete persistent disks (under Compute Engine), load balancers (under Network services) and static IPs (under VPC network) which belong to the cluster on Google Cloud Platform Console.

$ gcloud container clusters delete demo --region=asia-east1

Create Node Pools

Create a node pool with preemptible VMs, which are much cheaper than regular instances, using --preemptible.

You might receive a The connection to the server x.x.x.x was refused - did you specify the right host or port? error while the cluster master is upgrading, which also happens when you add new node pools.

$ gcloud container node-pools create n1-standard-4-pre \
    --cluster=demo \
    --node-version=1.11.6-gke.6 \
    --scopes=gke-default,storage-full,compute-ro,pubsub,https://www.googleapis.com/auth/cloud_debugger \
    --region=asia-east1 \
    --num-nodes=1 \
    --enable-autoscaling --min-nodes=1 --max-nodes=10 \
    --machine-type=n1-standard-4 \
    --min-cpu-platform="Intel Skylake" \
    --node-labels=custom.kubernetes.io/scopes-storage-full=true \
    --enable-autorepair \
    --preemptible

$ gcloud container node-pools list --cluster=demo --region=asia-east1
$ gcloud container operations list

ref:

https://cloud.google.com/sdk/gcloud/reference/container/node-pools/create

https://cloud.google.com/kubernetes-engine/docs/concepts/preemptible-vm

https://cloud.google.com/compute/docs/regions-zones/

Build Docker Images

You could use Google Cloud Build or any Continuous Integration (CI) service to automatically build Docker images and push them to Google Container Registry.

Furthermore, you need to tag your Docker images appropriately with the registry name format: region_name.gcr.io/your_project_id/your_image_name:version .

ref:

https://cloud.google.com/container-builder/

https://cloud.google.com/container-registry/

An example of cloudbuild.yaml :

substitutions:
  _REPO_NAME: simple-api

steps:
  - id: pull-image
    name: gcr.io/cloud-builders/docker
    entrypoint: "/bin/sh"
    args: [
      "-c",
      "docker pull asia.gcr.io/$PROJECT_ID/$_REPO_NAME:$BRANCH_NAME || true"
    ]
    waitFor: ["-"]
  - id: build-image
    name: gcr.io/cloud-builders/docker
    args: [
      "build",
      "--cache-from", "asia.gcr.io/$PROJECT_ID/$_REPO_NAME:$BRANCH_NAME",
      "--label", "git.commit=$SHORT_SHA",
      "--label", "git.branch=$BRANCH_NAME",
      "--label", "ci.build-id=$BUILD_ID",
      "-t", "asia.gcr.io/$PROJECT_ID/$_REPO_NAME:$SHORT_SHA",
      "simple-api/"
    ]
    waitFor: ["pull-image"]

images:
  - asia.gcr.io/$PROJECT_ID/$_REPO_NAME:$SHORT_SHA

ref:

https://cloud.google.com/container-builder/docs/build-config

https://cloud.google.com/container-builder/docs/create-custom-build-steps

Of course, you could also manually push Docker images to Google Container Registry.

$ gcloud auth configure-docker && \
  gcloud config set project simple-project-198818 && \
  export PROJECT_ID="$(gcloud config get-value project -q)"

$ docker build --rm -t asia.gcr.io/${PROJECT_ID}/simple-api:v1 simple-api/
$ gcloud docker -- push asia.gcr.io/${PROJECT_ID}/simple-api:v1
$ gcloud container images list --repository=asia.gcr.io/${PROJECT_ID}

ref:

https://cloud.google.com/container-registry/docs/pushing-and-pulling

Moreover, you should always adopt multi-stage builds for your Dockerfiles.

FROM python:3.6.8-alpine3.7 AS builder

ENV PATH=$PATH:/root/.local/bin
ENV PIP_DISABLE_PIP_VERSION_CHECK=1

WORKDIR /usr/src/app/

RUN apk add --no-cache --virtual .build-deps \
    build-base \
    linux-headers \
    openssl-dev \
    zlib-dev

COPY requirements.txt .
RUN pip install --user -r requirements.txt && \
    find $(python -m site --user-base) -type f -name "*.pyc" -delete && \
    find $(python -m site --user-base) -type f -name "*.pyo" -delete && \
    find $(python -m site --user-base) -type d -name "__pycache__" -delete

###

FROM python:3.6.8-alpine3.7

ENV PATH=$PATH:/root/.local/bin
ENV FLASK_APP=app.py

WORKDIR /usr/src/app/

RUN apk add --no-cache --virtual .run-deps \
    ca-certificates \
    curl \
    openssl \
    zlib

COPY --from=builder /root/.local/ /root/.local/
COPY . .

EXPOSE 8000

CMD ["uwsgi", "--ini", "config/uwsgi.ini", "--single-interpreter", "--enable-threads", "--http", ":8000"]

ref:

https://medium.com/@tonistiigi/advanced-multi-stage-build-patterns-6f741b852fae

Create Pods

No, you should never create Pods directly — so-called naked Pods. Use a Deployment instead.

ref:

https://kubernetes.io/docs/concepts/workloads/pods/pod-overview/

Pods go through the following lifecycle phases (states); see the sketch after this list for a quick way to inspect them:

Pending

Running

Succeeded

Failed

Unknown
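
A quick way to inspect Pod phases, using kubectl's custom-columns output (the column names here are arbitrary):

$ kubectl get pods -o custom-columns=NAME:.metadata.name,PHASE:.status.phase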

ref:

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/

Inspect Pods

Show information about Pods.

$ kubectl get all
$ kubectl get deploy
$ kubectl get pods
$ kubectl get pods -l app=simple-api

$ kubectl describe pod simple-api-5bbf4dd4f9-8b4c9
$ kubectl get pod simple-api-5bbf4dd4f9-8b4c9 -o yaml

ref:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#describe

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get

Execute a command in a container.

$ kubectl exec -i -t simple-api-5bbf4dd4f9-8b4c9 -- sh

ref:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#exec

Tail Pod logs. It is also recommended to use kubetail .

$ kubectl logs simple-api-5bbf4dd4f9-8b4c9 -f
$ kubectl logs deploy/simple-api -f
$ kubectl logs statefulset/mongodb-rs0 -f

$ kubetail simple-api
$ kubetail simple-worker
$ kubetail mongodb-rs0 -c db

ref:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#logs

https://github.com/johanhaleby/kubetail

List all Pods on a certain node.

$ kubectl describe node gke-demo-default-pool-fb33ac26-frkw
...
Non-terminated Pods: (7 in total)
  Namespace    Name                                            CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----                                            ------------  ----------  ---------------  -------------
  default      mongodb-rs0-1                                   2100m (53%)   4 (102%)    4G (30%)         4G (30%)
  default      simple-api-84554476df-w5b5g                     500m (25%)    1 (51%)     1G (16%)         1G (16%)
  default      simple-worker-6495b6b74b-rqplv                  500m (25%)    1 (51%)     1G (16%)         1G (16%)
  kube-system  fluentd-gcp-v3.0.0-848nq                        100m (2%)     0 (0%)      200Mi (1%)       300Mi (2%)
  kube-system  heapster-v1.5.3-6447d67f78-7psb2                138m (3%)     138m (3%)   301856Ki (2%)    301856Ki (2%)
  kube-system  kube-dns-788979dc8f-5zvfk                       260m (6%)     0 (0%)      110Mi (0%)       170Mi (1%)
  kube-system  kube-proxy-gke-demo-default-pool-3c058fcf-x7cv  100m (2%)     0 (0%)      0 (0%)           0 (0%)
...

$ kubectl get pods --all-namespaces -o wide --sort-by="{.spec.nodeName}"

Check resource usage.

$ kubectl top pods
$ kubectl top nodes

ref:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#top

https://kubernetes.io/docs/tasks/debug-application-cluster/

Restart Pods.

# you could simply kill Pods, which would restart automatically if they are managed by a Deployment
$ kubectl delete pods -l app=simple-worker

# you could replace a resource by providing a manifest
$ kubectl replace --force -f simple-api/

ref:

https://stackoverflow.com/questions/40259178/how-to-restart-kubernetes-pods

Completely delete resources.

$ kubectl delete -f simple-api/ -R
$ kubectl delete deploy simple-api
$ kubectl delete deploy -l app=simple,role=worker

# delete a Pod forcefully
$ kubectl delete pod simple-api-668d465985-886h5 --grace-period=0 --force
$ kubectl delete deploy simple-api --grace-period=0 --force

# delete all resources under a namespace
$ kubectl delete daemonsets,deployments,services,statefulset,pvc,pv --all --namespace tick

ref:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#delete

Create ConfigMaps

Create an environment-variable-like ConfigMap.

kind: ConfigMap
apiVersion: v1
metadata:
  name: simple-api
data:
  FLASK_ENV: production
  MONGODB_URL: mongodb://mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local,mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local,mongodb-rs0-3.mongodb-rs0.default.svc.cluster.local/demo?readPreference=secondaryPreferred&maxPoolSize=10
  CACHE_URL: redis://redis-cache.default.svc.cluster.local/0
  CELERY_BROKER_URL: redis://redis-broker.default.svc.cluster.local/0
  CELERY_RESULT_BACKEND: redis://redis-broker.default.svc.cluster.local/1

Load environment variables from a ConfigMap:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: simple-api
  labels:
    app: simple-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-api
  template:
    metadata:
      labels:
        app: simple-api
    spec:
      containers:
        - name: simple-api
          image: asia.gcr.io/simple-project-198818/simple-api:4fc4199
          command: ["uwsgi", "--ini", "config/uwsgi.ini", "--single-interpreter", "--enable-threads", "--http", ":8000"]
          envFrom:
            - configMapRef:
                name: simple-api
          ports:
            - containerPort: 8000

Create a file-like ConfigMap.

kind: ConfigMap
apiVersion: v1
metadata:
  name: redis-cache
data:
  redis.conf: |-
    maxmemory-policy allkeys-lfu
    appendonly no
    save ""

Mount files from a ConfigMap:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: redis-cache
  labels:
    app: redis-cache
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-cache
  template:
    metadata:
      labels:
        app: redis-cache
    spec:
      volumes:
        - name: config
          configMap:
            name: redis-cache
      containers:
        - name: redis
          image: redis:4.0.10-alpine
          command: ["redis-server"]
          args: ["/etc/redis/redis.conf", "--loglevel", "verbose", "--maxmemory", "1g"]
          volumeMounts:
            - name: config
              mountPath: /etc/redis
          ports:
            - containerPort: 6379

ref:

https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/

Only mount a single file with subPath .

kind: Deployment
apiVersion: apps/v1
metadata:
  name: redis-cache
  labels:
    app: redis-cache
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-cache
  template:
    metadata:
      labels:
        app: redis-cache
    spec:
      volumes:
        - name: config
          configMap:
            name: redis-cache
      containers:
        - name: redis
          image: redis:4.0.10-alpine
          command: ["redis-server"]
          args: ["/etc/redis/redis.conf", "--loglevel", "verbose", "--maxmemory", "1g"]
          volumeMounts:
            - name: config
              mountPath: /etc/redis/redis.conf
              subPath: redis.conf
          ports:
            - containerPort: 6379

ref:

https://github.com/kubernetes/kubernetes/issues/44815#issuecomment-297077509

It is worth noting that changing a ConfigMap or Secret won't trigger a re-deployment of a Deployment. A workaround is to change the name of the ConfigMap every time you change its content. If you consume a ConfigMap as environment variables, you must trigger a re-deployment explicitly.
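
Another common workaround, popularized by Helm charts, is to store a hash of the ConfigMap content in a Pod template annotation, so any content change also changes the Pod template and triggers a rolling update. A hypothetical sketch, assuming your deploy script computes and injects the checksum:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: simple-api
spec:
  template:
    metadata:
      labels:
        app: simple-api
      annotations:
        # hypothetical value, re-computed by your deploy script, e.g., from
        # kubectl get configmap simple-api -o yaml | sha256sum
        checksum/config: "REPLACED_BY_DEPLOY_SCRIPT"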

ref:

https://github.com/kubernetes/kubernetes/issues/22368

Create Secrets

First of all, Secrets are only base64 encoded, not encrypted.

Encode and decode a Secret value.

$ echo -n 'YOUR_SECRET_KEY' | base64
WU9VUl9TRUNSRVRfS0VZ

$ echo 'WU9VUl9TRUNSRVRfS0VZ' | base64 --decode
YOUR_SECRET_KEY

Create an environment-variable-like Secret.

kind: Secret
apiVersion: v1
metadata:
  name: simple-api
data:
  SECRET_KEY: WU9VUl9TRUNSRVRfS0VZ
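
Alternatively, you could let kubectl do the base64 encoding for you:

$ kubectl create secret generic simple-api --from-literal=SECRET_KEY=YOUR_SECRET_KEY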

Export data (base64-encoded) from a Secret.

$ kubectl get secret simple-project-com --export=true -o yaml

ref:

https://kubernetes.io/docs/concepts/configuration/secret/

Create Deployments With Probes

Deployments are designed for stateless (or nearly stateless) services. A Deployment manages ReplicaSets, and a ReplicaSet manages Pods.

ref:

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

livenessProbe can be used to determine when an application must be restarted by Kubernetes, while readinessProbe can be used to determine when a container is ready to accept traffic.

ref:

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

It is also a best practice to always specify resource requests and limits: resources.requests and resources.limits.

ref:

https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/

Create a Deployment with probes.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: simple-api
  labels:
    app: simple-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-api
  template:
    metadata:
      labels:
        app: simple-api
    spec:
      containers:
        - name: simple-api
          image: asia.gcr.io/simple-project-198818/simple-api:4fc4199
          command: ["uwsgi", "--ini", "config/uwsgi.ini", "--single-interpreter", "--enable-threads", "--http", ":8000"]
          envFrom:
            - configMapRef:
                name: simple-api
          ports:
            - containerPort: 8000
          livenessProbe:
            exec:
              command: ["curl", "-fsS", "-m", "0.1", "-H", "User-Agent: KubernetesHealthCheck/1.0", "http://127.0.0.1:8000/health"]
            initialDelaySeconds: 5
            periodSeconds: 1
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            exec:
              command: ["curl", "-fsS", "-m", "0.1", "-H", "User-Agent: KubernetesHealthCheck/1.0", "http://127.0.0.1:8000/health"]
            initialDelaySeconds: 3
            periodSeconds: 1
            successThreshold: 1
            failureThreshold: 3
          resources:
            requests:
              cpu: 500m
              memory: 1G
            limits:
              cpu: 1000m
              memory: 1G

Create another Deployment of Celery workers.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: simple-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: simple-worker
  template:
    metadata:
      labels:
        app: simple-worker
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: simple-worker
          image: asia.gcr.io/simple-project-198818/simple-api:4fc4199
          command: ["celery", "-A", "app:celery", "worker", "--without-gossip", "-Ofair", "-l", "info"]
          envFrom:
            - configMapRef:
                name: simple-api
          readinessProbe:
            exec:
              command: ["sh", "-c", "celery inspect -q -A app:celery -d celery@$(hostname) --timeout 10 ping"]
            initialDelaySeconds: 15
            periodSeconds: 15
            timeoutSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          resources:
            requests:
              cpu: 500m
              memory: 1G
            limits:
              cpu: 1000m
              memory: 1G

$ kubectl apply -f simple-api/ -R
$ kubectl get pods

The minimum value of timeoutSeconds is 1, so you might need to use exec.command to run arbitrary shell commands with custom (sub-second) timeout settings, as in the curl -m 0.1 probes above.

ref:

https://cloudplatform.googleblog.com/2018/05/Kubernetes-best-practices-Setting-up-health-checks-with-readiness-and-liveness-probes.html

Create Deployments With InitContainers

If multiple Init Containers are specified for a Pod, those Containers are run one at a time in sequential order. Each must succeed before the next can run. When all of the Init Containers have run to completion, Kubernetes initializes regular containers as usual.

kind: Service
apiVersion: v1
metadata:
  name: gcs-proxy-media-simple-project-com
spec:
  type: NodePort
  selector:
    app: gcs-proxy-media-simple-project-com
  ports:
    - name: http
      port: 80
      targetPort: 80
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: google-cloud-storage-proxy
data:
  nginx.conf: |-
    worker_processes auto;
    http {
        include mime.types;
        default_type application/octet-stream;
        server {
            listen 80;
            if ( $http_user_agent ~* (GoogleHC|KubernetesHealthCheck) ) {
                return 200;
            }
            root /usr/share/nginx/html;
            open_file_cache max=10000 inactive=10m;
            open_file_cache_valid 1m;
            open_file_cache_min_uses 1;
            open_file_cache_errors on;
            include /etc/nginx/conf.d/*.conf;
        }
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcs-proxy-media-simple-project-com
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gcs-proxy-media-simple-project-com
  template:
    metadata:
      labels:
        app: gcs-proxy-media-simple-project-com
    spec:
      volumes:
        - name: nginx-config
          configMap:
            name: google-cloud-storage-proxy
        - name: nginx-config-extra
          emptyDir: {}
      initContainers:
        - name: create-robots-txt
          image: busybox
          command: ["sh", "-c"]
          args:
            - |
              set -euo pipefail
              cat << 'EOF' > /etc/nginx/conf.d/robots.txt
              User-agent: *
              Disallow: /
              EOF
          volumeMounts:
            - name: nginx-config-extra
              mountPath: /etc/nginx/conf.d/
        - name: create-nginx-extra-conf
          image: busybox
          command: ["sh", "-c"]
          args:
            - |
              set -euo pipefail
              cat << 'EOF' > /etc/nginx/conf.d/extra.conf
              location /robots.txt {
                  alias /etc/nginx/conf.d/robots.txt;
              }
              EOF
          volumeMounts:
            - name: nginx-config-extra
              mountPath: /etc/nginx/conf.d/
      containers:
        - name: http
          image: swaglive/openresty:gcsfuse
          imagePullPolicy: Always
          args: ["nginx", "-c", "/usr/local/openresty/nginx/conf/nginx.conf", "-g", "daemon off;"]
          ports:
            - containerPort: 80
          securityContext:
            privileged: true
            capabilities:
              add: ["CAP_SYS_ADMIN"]
          env:
            - name: GCSFUSE_OPTIONS
              value: "--debug_gcs --implicit-dirs --stat-cache-ttl 1s --type-cache-ttl 24h --limit-bytes-per-sec -1 --limit-ops-per-sec -1 -o ro,allow_other"
            - name: GOOGLE_CLOUD_STORAGE_BUCKET
              value: asia.contents.simple-project.com
          volumeMounts:
            - name: nginx-config
              mountPath: /usr/local/openresty/nginx/conf/nginx.conf
              subPath: nginx.conf
              readOnly: true
            - name: nginx-config-extra
              mountPath: /etc/nginx/conf.d/
              readOnly: true
          readinessProbe:
            httpGet:
              port: 80
              path: /
              httpHeaders:
                - name: User-Agent
                  value: "KubernetesHealthCheck/1.0"
            timeoutSeconds: 1
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1
            successThreshold: 1
          resources:
            requests:
              cpu: 0m
              memory: 500Mi
            limits:
              cpu: 1000m
              memory: 500Mi

$ kubectl exec -i -t simple-api-5968cfc48d-8g755 -- sh
(gke_simple-project-198818_asia-east1_demo/default) > curl http://gcs-proxy-media-simple-project-com/robots.txt
User-agent: *
Disallow: /

ref:

https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340

Create Deployments With Canary Deployment

TODO
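
In the meantime, here is a minimal sketch of the pattern described in the Kubernetes docs: run a stable Deployment and a canary Deployment whose Pods share the same app label but differ in a track label, and let a single Service select only the shared label, so roughly canary replicas / total replicas of the traffic hits the canary.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: simple-api-canary
spec:
  replicas: 1  # vs. N replicas in the stable Deployment
  selector:
    matchLabels:
      app: simple-api
      track: canary
  template:
    metadata:
      labels:
        app: simple-api  # matched by the existing simple-api Service
        track: canary
    spec:
      containers:
        - name: simple-api
          image: asia.gcr.io/simple-project-198818/simple-api:canary  # hypothetical tag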

ref:

https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#canary-deployments

https://medium.com/google-cloud/kubernetes-canary-deployments-for-mere-mortals-13728ce032fe

Rollback A Deployment

Yes, you could publish a deployment with kubectl apply --record and roll it back with kubectl rollout undo. However, the simplest way might be to just git checkout the previous commit and deploy again with kubectl apply.

The formal way.

$ kubectl apply -f simple-api/ -R --record
$ kubectl rollout history deploy/simple-api
$ kubectl rollout undo deploy/simple-api --to-revision=2

The git way.

$ git checkout b7ed8d5
$ kubectl apply -f simple-api/ -R
$ kubectl get pods

ref:

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-back-a-deployment

Scale A Deployment

Simply increase the number of spec.replicas and deploy again.

$ kubectl apply -f simple-api/ -R
# or
$ kubectl scale --replicas=10 deploy/simple-api

$ kubectl get pods

ref:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#scale

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#scaling-a-deployment

Create HorizontalPodAutoscalers (HPA)

The Horizontal Pod Autoscaler automatically scales the number of Pods in a Deployment based on observed CPU utilization, memory usage, or custom metrics. HPA only applies to scalable objects such as Deployments, ReplicaSets, StatefulSets, and ReplicationControllers — not to objects that cannot be scaled, like DaemonSets.

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: simple-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: simple-api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 80
    - type: Resource
      resource:
        name: memory
        targetAverageValue: 800M
---
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: simple-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: simple-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 80
    - type: Resource
      resource:
        name: memory
        targetAverageValue: 500M

$ kubectl apply -f simple-api/hpa.yaml
$ kubectl get hpa --watch
NAME            REFERENCE                   TARGETS                   MINPODS   MAXPODS   REPLICAS   AGE
simple-api      Deployment/simple-api       18685952/800M, 4%/80%     2         20        3          10m
simple-worker   Deployment/simple-worker    122834944/500M, 11%/80%   2         10        3          10m

ref:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/

You could run some load testing to watch the autoscaler react.
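
For instance, a throwaway load generator adapted from the HPA walkthrough on kubernetes.io, assuming the simple-api Service from the following sections:

$ kubectl run -i -t load-generator --image=busybox --restart=Never --rm /bin/sh
> while true; do wget -q -O- http://simple-api.default.svc.cluster.local; done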

ref:

https://medium.com/@jonbcampos/kubernetes-horizontal-pod-scaling-190e95c258f5

There is also Cluster Autoscaler in Google Kubernetes Engine.

$ gcloud container clusters update demo \
    --enable-autoscaling --min-nodes=1 --max-nodes=10 \
    --node-pool=default-pool

ref:

https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler

Create VerticalPodsAutoscalers (VPA)

TODO

ref:

https://medium.com/@Mohamed.ahmed/kubernetes-autoscaling-101-cluster-autoscaler-horizontal-pod-autoscaler-and-vertical-pod-2a441d9ad231

Create PodDisruptionBudget (PDB)

Voluntary disruptions: actions initiated by application owners or admins.

Involuntary disruptions: unavoidable cases like hardware failures or system software error.

PodDisruptionBudgets only constrain voluntary disruptions; a PDB cannot prevent an involuntary disruption, such as a hardware failure, from occurring. However, involuntary disruptions do count against the budget.

Create a PodDisruptionBudget for a stateless application.

kind: PodDisruptionBudget
apiVersion: policy/v1beta1
metadata:
  name: simple-api
spec:
  minAvailable: 90%
  selector:
    matchLabels:
      app: simple-api

Create a PodDisruptionBudget for a multiple-instance stateful application.

kind: PodDisruptionBudget
apiVersion: policy/v1beta1
metadata:
  name: mongodb-rs0
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: mongodb-rs0

$ kubectl apply -f simple-api/pdb.yaml
$ kubectl apply -f mongodb/pdb.yaml

$ kubectl get pdb
NAME          MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
mongodb-rs0   2               N/A               1                     48m
simple-api    90%             N/A               0                     48m

ref:

https://kubernetes.io/docs/concepts/workloads/pods/disruptions/

https://kubernetes.io/docs/tasks/run-application/configure-pdb/

Actually, you could also achieve similar functionality using .spec.strategy.rollingUpdate.

maxUnavailable : The maximum number of Pods that can be unavailable during the update process.

maxSurge : The maximum number of Pods that can be created over the desired number of Pods.

These settings make sure that total ready Pods >= total desired Pods - maxUnavailable, and that total Pods <= total desired Pods + maxSurge.
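
A fragmentary sketch of these fields on the simple-api Deployment (the numbers are arbitrary; selector and template are omitted):

kind: Deployment
apiVersion: apps/v1
metadata:
  name: simple-api
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # at most 1 Pod below the desired 10 during an update
      maxSurge: 2        # at most 12 Pods in total during an update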

ref:

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#writing-a-deployment-spec

https://cloud.google.com/kubernetes-engine/docs/how-to/updating-apps

Create Services

A Service is basically a load balancer for a set of Pods selected by labels. Since you can't rely on a Pod's IP, which changes every time the Pod is recreated, you should always provide a Service as the entry point for your Pods, a so-called microservice.

Typically, containers you run in the cluster are not accessible from the Internet, because they do not have external IP addresses. You must explicitly expose your application by creating a Service or an Ingress.

There are the following Service types:

ClusterIP : A virtual IP which is only reachable from within the cluster. This is the default Service type.

NodePort : Opens a specific port on every node; any traffic sent to that port on any node is forwarded to the Service.

LoadBalancer : Builds on NodePort by additionally configuring the cloud provider to create an external load balancer.

ExternalName : Maps the Service to an external CNAME record, e.g., your MySQL RDS on AWS (see the sketch below).
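
A minimal sketch of an ExternalName Service; the RDS hostname is a hypothetical placeholder:

kind: Service
apiVersion: v1
metadata:
  name: mysql
spec:
  type: ExternalName
  externalName: your-db-instance.ap-northeast-1.rds.amazonaws.com

Pods can then connect to mysql.default.svc.cluster.local, and DNS resolves it to the external hostname.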

Create a Service.

kind: Service
apiVersion: v1
metadata:
  name: simple-api
spec:
  type: NodePort
  selector:
    app: simple-api
  ports:
    - name: http
      port: 80
      targetPort: 8000

type: NodePort is enough in most cases. spec.selector must match the labels defined in the corresponding Deployment, and spec.ports.targetPort and spec.ports.protocol must match the port and protocol the container actually exposes.

$ kubectl apply -f simple-api/ -R
$ kubectl get svc,endpoints

$ kubespy trace service simple-api
[ADDED v1/Service]  default/simple-api
[ADDED v1/Endpoints]  default/simple-api
    Directs traffic to the following live Pods:
    - [Ready] simple-api-6b4b4c4bfb-g5dln @ 10.28.1.42
    - [Ready] simple-api-6b4b4c4bfb-h66dg @ 10.28.8.24

ref:

https://kubernetes.io/docs/concepts/services-networking/service/

https://medium.com/google-cloud/kubernetes-nodeport-vs-loadbalancer-vs-ingress-when-should-i-use-what-922f010849e0

After a Service is created, kube-dns creates a corresponding DNS A record named your-service.your-namespace.svc.cluster.local which resolves to an internal IP in the cluster. In this case: simple-api.default.svc.cluster.local. Headless Services (without a cluster IP) are also assigned a DNS A record of the same form. Unlike with normal Services, this A record resolves directly to the set of IPs of the Pods selected by the Service; clients should be expected to consume the whole set, or use round-robin selection from it.

You should always prefer DNS names of a Service over injected environment variables, e.g., FOO_SERVICE_HOST and FOO_SERVICE_PORT .
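
A quick sketch to verify the DNS record from inside the cluster, assuming the simple-api Service from above:

$ kubectl run -i -t --image busybox dns-test --restart=Never --rm /bin/sh
> nslookup simple-api.default.svc.cluster.local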

ref:

https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/

For more detail about Kubernetes networking, go to:

https://github.com/hackstoic/kubernetes_practice/blob/master/%E7%BD%91%E7%BB%9C.md

https://containerops.org/2017/01/30/kubernetes-services-and-ingress-under-x-ray/

https://www.safaribooksonline.com/library/view/kubernetes-up-and/9781491935668/ch07.html

Configure Services With Google Cloud CDN

kind: BackendConfig
apiVersion: cloud.google.com/v1beta1
metadata:
  name: cdn
spec:
  cdn:
    enabled: true
    cachePolicy:
      includeHost: false
      includeProtocol: false
      includeQueryString: false
---
kind: Service
apiVersion: v1
metadata:
  name: gcs-proxy-media-simple-project-com
  annotations:
    beta.cloud.google.com/backend-config: '{"ports": {"http":"cdn"}}'
    cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: gcs-proxy-media-simple-project-com
  ports:
    - name: http
      port: 80
      targetPort: 80

ref:

https://cloud.google.com/kubernetes-engine/docs/concepts/backendconfig

Configure Services With Network Endpoint Groups (NEGs)

To use container-native load balancing, you must create a cluster with the --enable-ip-alias flag and then just add an annotation to your Services. However, the load balancer is not created until you create an Ingress for the Service.

kind: Service
apiVersion: v1
metadata:
  name: simple-api
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: simple-api
  ports:
    - name: http
      port: 80
      targetPort: 8000

ref:

https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing

Create An Internal Load Balancer
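
A minimal sketch, assuming the simple-api labels from previous sections: on GKE, annotating a LoadBalancer Service with cloud.google.com/load-balancer-type: "Internal" provisions an internal load balancer whose IP is only reachable from within your VPC network.

kind: Service
apiVersion: v1
metadata:
  name: simple-api-internal
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: simple-api
  ports:
    - name: http
      port: 80
      targetPort: 8000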

ref:

https://medium.com/@johnjjung/creating-an-inter-kubernetes-cluster-services-using-an-internal-loadbalancer-137f768bb3fc

Use Port Forwarding

Access a Service or a Pod on your local machine with port forwarding.

# 8080 is the local port and 80 is the remote port
$ kubectl port-forward svc/simple-api 8080:80

# port forward to a Pod directly
$ kubectl port-forward mongo-rs0-0 27017:27017

$ open http://127.0.0.1:8080/

ref:

https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/

Create An Ingress

Pods in Kubernetes are not reachable from outside the cluster, so you need a way to expose your Pods to the Internet. Even though you could associate Pods with a Service of the right type, i.e., NodePort or LoadBalancer , the recommended way to expose services is using Ingress. You can do a lot of different things with an Ingress, and there are many types of Ingress controllers that have different capabilities.

There are some reasons to choose Ingress over Service:

A Service is an internal load balancer, while an Ingress is a gateway for external access to Services

A Service is an L4 load balancer, while an Ingress is an L7 (HTTP) load balancer

An Ingress allows domain-based and path-based routing to different Services

It is not efficient to create a cloud provider's load balancer for each Service you want to expose

Create an Ingress which is implemented using Google Cloud Load Balancing (L7 HTTP load balancer). You should make sure Services exist before creating the Ingress.

kind: Ingress
apiVersion: extensions/v1beta1
metadata:
  name: simple-project
  annotations:
    kubernetes.io/ingress.class: "gce"
    # kubernetes.io/tls-acme: "true"
    # ingress.kubernetes.io/ssl-redirect: "true"
spec:
  # tls:
  #   - secretName: simple-project-com-tls
  #     hosts:
  #       - simple-project.com
  #       - www.simple-project.com
  #       - api.simple-project.com
  rules:
    - host: simple-project.com
      http:
        paths:
          - path: /*
            backend:
              serviceName: simple-frontend
              servicePort: 80
    - host: www.simple-project.com
      http:
        paths:
          - path: /*
            backend:
              serviceName: simple-frontend
              servicePort: 80
    - host: api.simple-project.com
      http:
        paths:
          - path: /*
            backend:
              serviceName: simple-api
              servicePort: 80
    - host: asia.contents.simple-project.com
      http:
        paths:
          - path: /*
            backend:
              serviceName: gcs-proxy-media-simple-project-com
              servicePort: 80
  backend:
    serviceName: simple-api
    servicePort: 80

It might take several minutes to spin up a Google HTTP load balancer (including acquiring the public IP), and at least 5 minutes before the GCE API starts health-checking backends. After getting your public IP, you could go to your domain provider and create new DNS records which point to that IP.

$ kubectl apply -f ingress.yaml
$ kubectl describe ing simple-project

ref:

https://kubernetes.io/docs/concepts/services-networking/ingress/

https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/

https://www.joyfulbikeshedding.com/blog/2018-03-26-studying-the-kubernetes-ingress-system.html

To read more about Google Load balancer, go to:

https://cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer

https://cloud.google.com/compute/docs/load-balancing/http/backend-service

Setup The Ingress With TLS Certificates

To automatically obtain and renew HTTPS certificates for your domains, you could use a tool such as cert-manager (the successor of kube-lego), which works with the kubernetes.io/tls-acme annotation shown in the Ingress example above.

Create Ingress Controllers

Kubernetes supports multiple Ingress controllers, for instance, the GCE Ingress controller (the GKE default), the NGINX Ingress Controller, and Traefik.

ref:

https://container-solutions.com/production-ready-ingress-kubernetes/

Create StorageClasses

StorageClass provides a way to define different available storage types, for instance, ext4 SSD, XFS SSD, CephFS, NFS. You could specify what you want in PersistentVolumeClaim or StatefulSet.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ssd-xfs
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  fsType: xfs
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ssd-regional
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zones: asia-east1-a, asia-east1-b, asia-east1-c
  replication-type: regional-pd

$ kubectl apply -f storageclass.yaml
$ kubectl get sc
NAME                 PROVISIONER            AGE
ssd                  kubernetes.io/gce-pd   5s
ssd-regional         kubernetes.io/gce-pd   4s
ssd-xfs              kubernetes.io/gce-pd   3s
standard (default)   kubernetes.io/gce-pd   1h

ref:

https://kubernetes.io/docs/concepts/storage/storage-classes/#gce

Create PersistentVolumeClaims

A Volume is just a directory which you could mount into containers, and it is shared by all containers inside the same Pod. It has an explicit lifetime: the same as the Pod that encloses it. A Volume's source can be various things: a remote Git repo, a file path on the host machine, a PersistentVolumeClaim, or data from a ConfigMap or Secret.

PersistentVolumes are used to manage durable storage in a cluster. Unlike Volumes, PersistentVolumes have a lifecycle independent of any individual Pod. On Google Kubernetes Engine, PersistentVolumes are typically backed by Google Compute Engine Persistent Disks. Typically, you don't have to create PersistentVolumes explicitly. In Kubernetes 1.6 and later versions, you only need to create PersistentVolumeClaim, and the corresponding PersistentVolume would be dynamically provisioned with StorageClasses. Pods use PersistentVolumeClaims as Volumes.

Be careful when creating a Deployment with a PersistentVolumeClaim: in most cases, you don't want multiple replicas of a Deployment writing data to the same PersistentVolumeClaim.
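
A minimal PersistentVolumeClaim sketch using the ssd StorageClass defined in the previous section (the claim name and size are arbitrary):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: simple-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ssd
  resources:
    requests:
      storage: 10G

A Pod can then mount it through spec.volumes with persistentVolumeClaim.claimName: simple-data.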

ref:

https://kubernetes.io/docs/concepts/storage/volumes/

https://kubernetes.io/docs/concepts/storage/persistent-volumes/

https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes

Also, IOPS depends on the disk size and node size. You need to claim a large disk if you want high IOPS, even if you only use a small portion of it.

ref:

https://cloud.google.com/compute/docs/disks/performance

On Kubernetes v1.10+, it is possible to create local PersistentVolumes for your StatefulSets. Previously, PersistentVolumes only supported remote volume types, for instance, GCE's Persistent Disk and AWS's EBS. However, using local storage ties your applications to that specific node, making your application harder to schedule.
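
A sketch of the StorageClass used with local PersistentVolumes, following the Kubernetes local volume docs: there is no dynamic provisioner, and volume binding is delayed until a Pod is scheduled:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer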

ref:

https://kubernetes.io/blog/2018/04/13/local-persistent-volumes-beta/

Create A StatefulSet

Pods created under a StatefulSet have a few unique attributes: the name of the Pod is not random; instead, each Pod gets an ordinal name. In addition, Pods are created one at a time instead of all at once, which can help when bootstrapping a stateful system. A StatefulSet also deletes/updates one Pod at a time, in reverse order with respect to its ordinal index, and it waits for each to be completely shut down before deleting the next.

Rule of thumb: once you find out that you need a PersistentVolume for a component, consider using a StatefulSet.

ref:

https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/

https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

https://akomljen.com/kubernetes-persistent-volumes-with-deployment-and-statefulset/

Create a StatefulSet of a three-node MongoDB replica set.

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: default-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    name: default
    namespace: default
---
kind: Service
apiVersion: v1
metadata:
  name: mongodb-rs0
spec:
  clusterIP: None
  selector:
    app: mongodb-rs0
  ports:
    - port: 27017
      targetPort: 27017
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: mongodb-rs0
spec:
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  serviceName: mongodb-rs0
  selector:
    matchLabels:
      app: mongodb-rs0
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ssd-xfs
        resources:
          requests:
            storage: 100G
  template:
    metadata:
      labels:
        app: mongodb-rs0
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: custom.kubernetes.io/fs-type
                    operator: In
                    values:
                      - "xfs"
                  - key: cloud.google.com/gke-preemptible
                    operator: NotIn
                    values:
                      - "true"
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - mongodb-rs0
      terminationGracePeriodSeconds: 10
      containers:
        - name: db
          image: mongo:3.6.5
          command: ["mongod"]
          args: ["--bind_ip_all", "--replSet", "rs0"]
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
          readinessProbe:
            exec:
              command: ["mongo", "--eval", "db.adminCommand('ping')"]
          resources:
            requests:
              cpu: 2
              memory: 4G
            limits:
              cpu: 4
              memory: 4G
        - name: sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: app=mongodb-rs0
            - name: KUBE_NAMESPACE
              value: default
            - name: KUBERNETES_MONGO_SERVICE_NAME
              value: mongodb-rs0

$ kubectl apply -f storageclass.yaml
$ kubectl apply -f mongodb/ -R

$ kubectl get pods
$ kubetail mongodb -c db
$ kubetail mongodb -c sidecar

$ kubectl scale statefulset mongodb-rs0 --replicas=4

The purpose of cvallance/mongo-k8s-sidecar is to automatically add new Pods to the replica set and remove Pods from it when you scale the MongoDB StatefulSet up or down.

ref:

https://github.com/cvallance/mongo-k8s-sidecar

https://kubernetes.io/blog/2017/01/running-mongodb-on-kubernetes-with-statefulsets/

https://medium.com/@thakur.vaibhav23/scaling-mongodb-on-kubernetes-32e446c16b82

Create A Headless Service For A StatefulSet

Headless Services ( clusterIP: None ) are just like normal Kubernetes Services, except they don’t do any load balancing for you. For a typical StatefulSet component, for instance, a database with Master-Slave replication, you don't want Kubernetes load balancing in order to prevent writing data to slaves accidentally.

When headless Services combine with StatefulSets, they can give you unique DNS addresses which return A records that point directly to Pods themselves. DNS names are in the format of static-pod-name.headless-service-name.namespace.svc.cluster.local .

kind: Service
apiVersion: v1
metadata:
  name: redis-broker
spec:
  clusterIP: None
  selector:
    app: redis-broker
  ports:
    - port: 6379
      targetPort: 6379
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: redis-broker
spec:
  replicas: 1
  serviceName: redis-broker
  selector:
    matchLabels:
      app: redis-broker
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ssd
        resources:
          requests:
            storage: 32Gi
  template:
    metadata:
      labels:
        app: redis-broker
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: cloud.google.com/gke-preemptible
                    operator: NotIn
                    values:
                      - "true"
      volumes:
        - name: config
          configMap:
            name: redis-broker
      containers:
        - name: redis
          image: redis:4.0.10-alpine
          command: ["redis-server"]
          args: ["/etc/redis/redis.conf", "--loglevel", "verbose", "--maxmemory", "1g"]
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: data
              mountPath: /data
            - name: config
              mountPath: /etc/redis
          readinessProbe:
            exec:
              command: ["sh", "-c", "redis-cli -h $(hostname) ping"]
            initialDelaySeconds: 5
            timeoutSeconds: 1
            periodSeconds: 1
            successThreshold: 1
            failureThreshold: 3
          resources:
            requests:
              cpu: 250m
              memory: 1G
            limits:
              cpu: 1000m
              memory: 1G

If redis-broker has 2 replicas, nslookup redis-broker.default.svc.cluster.local returns multiple A records; returning multiple A records for a single DNS lookup is commonly known as round-robin DNS.

$ kubectl run -i -t --image busybox dns-test --restart=Never --rm /bin/sh
> nslookup redis-broker.default.svc.cluster.local
Server:    10.63.240.10
Address 1: 10.63.240.10 kube-dns.kube-system.svc.cluster.local

Name:      redis-broker.default.svc.cluster.local
Address 1: 10.60.6.2 redis-broker-0.redis-broker.default.svc.cluster.local
Address 2: 10.60.6.7 redis-broker-1.redis-broker.default.svc.cluster.local

> nslookup redis-broker-0.redis-broker.default.svc.cluster.local
Server:    10.63.240.10
Address 1: 10.63.240.10 kube-dns.kube-system.svc.cluster.local

Name:      redis-broker-0.redis-broker.default
Address 1: 10.60.6.2 redis-broker-0.redis-broker.default.svc.cluster.local

ref:

https://kubernetes.io/docs/concepts/services-networking/service/#headless-services

https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#services

https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#using-stable-network-identities

Moreover, there is no port re-mapping for a headless Service, since the DNS name resolves directly to Pod IPs.

kind: Service
apiVersion: v1
metadata:
  namespace: tick
  name: influxdb
spec:
  clusterIP: None
  selector:
    app: influxdb
  ports:
    - name: api
      port: 4444
      targetPort: 8086
    - name: admin
      port: 8083
      targetPort: 8083

$ kubectl apply -f tick/ -R
$ kubectl get svc --namespace tick
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
influxdb   ClusterIP   None         <none>        4444/TCP,8083/TCP   1h

$ curl http://influxdb.tick.svc.cluster.local:4444/ping
curl: (7) Failed to connect to influxdb.tick.svc.cluster.local port 4444: Connection refused

$ curl -I http://influxdb.tick.svc.cluster.local:8086/ping
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: 7fc09a56-8538-11e8-8d1d-000000000000

Create A DaemonSet

Create a DaemonSet which changes OS kernel configurations on each node.

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: thp-disabler
spec:
  selector:
    matchLabels:
      app: thp-disabler
  template:
    metadata:
      labels:
        app: thp-disabler
    spec:
      hostPID: true
      containers:
        - name: configurer
          image: gcr.io/google-containers/startup-script:v1
          securityContext:
            privileged: true
          env:
            - name: STARTUP_SCRIPT
              value: |
                #! /bin/bash
                set -o errexit
                set -o pipefail
                set -o nounset
                echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
                echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag

ref:

https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/

Create A CronJob

Backup your MongoDB database every hour.

kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: backup-mongodb-rs0
spec:
  suspend: false
  schedule: "30 * * * *"
  startingDeadlineSeconds: 600
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: custom.kubernetes.io/scopes-storage-full
                        operator: In
                        values:
                          - "true"
          volumes:
            - name: backups-dir
              emptyDir: {}
          initContainers:
            - name: clean
              image: busybox
              command: ["rm", "-rf", "/backups/*"]
              volumeMounts:
                - name: backups-dir
                  mountPath: /backups
            - name: backup
              image: vinta/mongodb-tools:4.0.1
              workingDir: /backups
              command: ["sh", "-c"]
              args:
                - mongodump --host=$MONGODB_URL --readPreference=secondaryPreferred --oplog --gzip --archive=$(date +%Y-%m-%dT%H-%M-%S).tar.gz
              env:
                - name: MONGODB_URL
                  value: mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local,mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local,mongodb-rs0-3.mongodb-rs0.default.svc.cluster.local
              volumeMounts:
                - name: backups-dir
                  mountPath: /backups
              resources:
                requests:
                  cpu: 2
                  memory: 2G
          containers:
            - name: upload
              image: google/cloud-sdk:alpine
              workingDir: /backups
              command: ["sh", "-c"]
              args:
                - gsutil -m cp -r . gs://$(GOOGLE_CLOUD_STORAGE_BUCKET)
              env:
                - name: GOOGLE_CLOUD_STORAGE_BUCKET
                  value: simple-project-backups
              volumeMounts:
                - name: backups-dir
                  mountPath: /backups
                  readOnly: true

Note: The environment variable appears in parentheses, $(VAR) , and it is required for the variable to be expanded in the command or args field.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: simple-api-send-email
spec:
  schedule: "*/30 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: simple-api-send-email
              image: asia.gcr.io/simple-project-198818/simple-api:4fc4199
              command: ["flask", "shell", "-c"]
              args:
                - |
                  from bar.tasks import send_email
                  send_email.delay('Hey!', 'Stand up!', to=['[email protected]'])
              envFrom:
                - configMapRef:
                    name: simple-api

You could just write a simple Python script as a CronJob since everything is containerized.

ref:

https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/

Define NodeAffinity And PodAffinity

Prevent Pods from being scheduled on preemptible nodes. Also, you should always prefer nodeAffinity over nodeSelector.

kind: StatefulSet
apiVersion: apps/v1
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: cloud.google.com/gke-preemptible
                    operator: NotIn
                    values:
                      - "true"

ref:

https://medium.com/google-cloud/using-preemptible-vms-to-cut-kubernetes-engine-bills-in-half-de2481b8e814

spec.affinity.podAntiAffinity ensures that Pods of the same Deployment or StatefulSet do not co-locate on a single node.

kind: StatefulSet
apiVersion: apps/v1
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - mongodb-rs0

ref:

https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

Migrate Pods from Old Nodes to New Nodes

Cordon marks old nodes as unschedulable

Drain evicts all Pods on old nodes

$ for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=n1-standard-4-pre -o=name); do kubectl cordon "$node"; done
$ for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=n1-standard-4-pre -o=name); do kubectl drain --ignore-daemonsets --delete-local-data --grace-period=2 "$node"; done

$ kubectl get nodes
NAME                                       STATUS                     ROLES    AGE   VERSION
gke-demo-default-pool-3c058fcf-x7cv        Ready                      <none>   2h    v1.11.6-gke.6
gke-demo-default-pool-58da1098-1h00        Ready                      <none>   2h    v1.11.6-gke.6
gke-demo-default-pool-fc34abbf-9dwr        Ready                      <none>   2h    v1.11.6-gke.6
gke-demo-n1-standard-4-pre-1a54e45a-0m7p   Ready,SchedulingDisabled   <none>   58m   v1.11.6-gke.6
gke-demo-n1-standard-4-pre-1a54e45a-mx3h   Ready,SchedulingDisabled   <none>   58m   v1.11.6-gke.6
gke-demo-n1-standard-4-pre-1a54e45a-qhdz   Ready,SchedulingDisabled   <none>   58m   v1.11.6-gke.6

ref:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#cordon

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#drain

https://cloud.google.com/kubernetes-engine/docs/tutorials/migrating-node-pool

Show Objects' Events

$ kubectl get events -w
$ kubectl get events -w --sort-by=.metadata.creationTimestamp
$ kubectl get events -w --sort-by=.metadata.creationTimestamp | grep mongo

ref:

https://kubernetes.io/docs/tasks/debug-application-cluster/

You could find more comprehensive logs on Google Cloud Stackdriver Logging if you are using GKE.

View Pods' Logs on Stackdriver Logging

You could use the following search formats.

textPayload:"OBJECT_FINALIZE"
logName="projects/simple-project-198818/logs/worker"

textPayload:"Added media preset"
logName="projects/simple-project-198818/logs/beat"

textPayload:"backend_cleanup"
resource.labels.pod_id="simple-api-6744bf74db-529qf"

textPayload:"5adb2bd460d6487649fe82ea"
timestamp>="2018-04-21T12:00:00Z"
timestamp<="2018-04-21T16:00:00Z"

resource.type="k8s_container"
resource.labels.cluster_name="production"
resource.labels.namespace_id="default"
resource.labels.pod_id:"simple-worker"
textPayload:"ConcurrentObjectUseError"

resource.type="k8s_node"
resource.labels.location="asia-east1"
resource.labels.cluster_name="production"
logName="projects/simple-project-198818/logs/node-problem-detector"

# see a Pod's logs
resource.type="k8s_container"
resource.labels.cluster_name="production"
resource.labels.namespace_id="default"
resource.labels.pod_name="cache-redis-0"
"start"

# see a Node's logs
resource.type="k8s_node"
resource.labels.location="asia-east1"
resource.labels.cluster_name="production"
resource.labels.node_name="gke-production-n1-highmem-32-p0-2bd334ec-v4ng"
"start"

ref:

https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/

https://cloud.google.com/logging/docs/view/advanced-filters

Best Practices

ref:

https://cloud.google.com/solutions/best-practices-for-building-containers

https://medium.com/@sachin.arote1/kubernetes-best-practices-9b1435a4cb53

https://medium.com/@brendanrius/scaling-kubernetes-for-25m-users-a7937e3536a0

Common Issues

Switch Contexts

Get authentication credentials to allow your kubectl to interact with the cluster.

$ gcloud container clusters get-credentials demo --project simple-project-198818

ref:

https://cloud.google.com/sdk/gcloud/reference/container/clusters/get-credentials

https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/

A Context is roughly a configuration profile which indicates the cluster, the namespace, and the user you use. Contexts are stored in ~/.kube/config .

$ kubectl config get-contexts
$ kubectl config use-context gke_simple-project-198818_asia-east1_demo
$ kubectl config view

ref:

https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/

The recommended way to switch contexts is using fubectl .

$ kcs

ref:

https://github.com/kubermatic/fubectl

Pending Pods

One of the most common reasons for Pending Pods is a lack of resources.

$ kubectl describe pod mongodb-rs0-1
...
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  3m (x739 over 1d)  default-scheduler  0/3 nodes are available: 1 ExistingPodsAntiAffinityRulesNotMatch, 1 MatchInterPodAffinity, 1 NodeNotReady, 2 NoVolumeZoneConflict, 3 Insufficient cpu, 3 Insufficient memory, 3 MatchNodeSelector.
...

You could resize nodes in the cluster.

$ gcloud container clusters resize demo --node-pool=n1-standard-4-pre --size=5 --region=asia-east1

ref:

https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/

Init:Error Pods

$ kubectl describe pod mongodump-sh0-1543978800-bdkhl
$ kubectl logs mongodump-sh0-1543978800-bdkhl -c mongodump

ref:

https://kubernetes.io/docs/tasks/debug-application-cluster/debug-init-containers/#accessing-logs-from-init-containers

CrashLoopBackOff Pods

CrashLoopBackOff means the Pod is starting, then crashing, then starting again and crashing again.

When in doubt, kubectl describe .

$ kubectl describe pod the-pod-name
$ kubectl logs the-pod-name --previous

ref:

https://www.krenger.ch/blog/crashloopbackoff-and-how-to-fix-it/

https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/