Background

When Kubernetes pods start up, their health state is assessed by probes. This lets K8s perform management operations: it ensures that pods which aren't up or ready do not receive traffic, and that unhealthy pods can be restarted to restore a healthy state.

The liveness (health) probe asks the pod if it is healthy, usually meaning that the server is up and its endpoints can be hit.

The readiness probe asks the pod if it is ready. An application may be healthy but still busy with another process, such as data loading or dependency checking, and so not yet fully operational.
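A quick sketch of the difference is below: a container spec can point the two probes at separate endpoints. The paths /live and /ready are assumptions here; your application may expose different ones, and the example later in this post uses a single /health endpoint for both.

# Hypothetical fragment of a container spec; /live and /ready are placeholder paths
livenessProbe:
  httpGet:
    path: /live     # is the process up at all?
    port: 8080
readinessProbe:
  httpGet:
    path: /ready    # has data loading / dependency checking finished?
    port: 8080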

The problem

Recently, I hit an error where pods could not start. When pods are failing to start, you can query the reason:

kubectl describe pod <pod-name>

From this we found a 'getsockopt' error, with Reason: 'Error' and Exit Code: 137.

After googling around, I found that exit code 137 usually appears when the pod runs out of memory. However, I didn't think that was the cause here: whenever my pods had actually run out of memory, they showed this exit code together with the reason 'OOMKilled'.
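If you want to check the raw termination state rather than scan the describe output, something like this also works (assuming the pod has a single container):

kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'

It prints the reason ('OOMKilled' or 'Error') and the exit code in one place.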

I then found some issue tickets raised for 'getsockopt', and amidst the various root causes, one was around probes failing.

The Race Condition

The probes create a race condition for the pod on start-up.

A 'race condition' is where a system has dependent events: certain events must occur before others, but there is no execution control enforcing that order. The race is that event A must complete before event B is executed, yet nothing guarantees it will.

Our race condition is the pod booting up: it must finish booting and be able to indicate a healthy state before the probe fires.

As such, you have two choices.

The Solution

1: Meet the Race Condition

Allocate more CPU to your Deployments so that the boot process is faster and the pod is up in time for the liveness and readiness checks.
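A sketch of what that could look like in the container spec; the values are placeholders rather than a recommendation, and you would tune them to your own boot profile:

resources:
  requests:
    cpu: "500m"     # placeholder: enough CPU to get through boot quickly
    memory: "256Mi"
  limits:
    cpu: "1"        # placeholder upper bound
    memory: "256Mi"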

2: Change the Race Condition

Set a longer initial wait time on the liveness and readiness probes (initialDelaySeconds) and extend the failure threshold and the check interval (failureThreshold and periodSeconds). This should give your service plenty of time to boot up and become ready.

Raising CPU can lead to an over-allocation of CPU after boot. This could be handled with vertical scaling, allowing the pod to request more CPU when needed, but most of us will already be using Horizontal Pod Autoscaling, which is easier to manage with a static CPU configuration (see my post on k8s sizing: https://medium.com/pareture/k8s-reusable-cross-env-microservice-sizing-template-d03dd8bfebf2 ).
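For reference, a minimal HorizontalPodAutoscaler sketch scaling a Deployment on CPU; the names and the 70% utilisation target are assumptions you would replace with your own:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-service          # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service        # placeholder Deployment name
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # placeholder target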

Below is a deployment.yaml snippet which waits 1 minute before the first check and then performs up to 5 further checks 30 seconds apart, giving the pod roughly 3.5 minutes (60 s + 5 × 30 s = 210 s) to become ready before the liveness probe triggers a restart. A single successful check is enough (successThreshold: 1), and each probe times out after 3 seconds. The readiness and liveness probes are the same in this case.

readinessProbe:
  httpGet:
    scheme: HTTP
    path: /health
    port: 8080
  initialDelaySeconds: 60
  timeoutSeconds: 3
  periodSeconds: 30
  successThreshold: 1
  failureThreshold: 5
livenessProbe:
  httpGet:
    scheme: HTTP
    path: /health
    port: 8080
  initialDelaySeconds: 60
  timeoutSeconds: 3
  periodSeconds: 30
  successThreshold: 1
  failureThreshold: 5
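After rolling out the change you can watch the probes do their work; the label selector is an assumption about how your Deployment labels its pods:

# watch READY flip from 0/1 to 1/1 once /health starts answering
kubectl get pods -l app=my-service -w

# any remaining probe failures show up as 'Unhealthy' events
kubectl get events --field-selector involvedObject.name=<pod-name> --sort-by=.lastTimestamp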

Even if the CPU allocation does need to be higher, it is important to know these settings and what they are doing for you.

Obviously, there could be other reasons for the probes failing and ending up with this type of error, but this is what solved it for me. I hope it can be useful for others as well.

Resources

K8s probes: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

getsockopt issue ticket & comment which inspired this post: https://github.com/kubernetes/kubernetes/issues/62594#issuecomment-42068573

Race Condition: https://en.wikipedia.org/wiki/Race_condition