At Banzai Cloud we run and deploy containerized applications to Pipeline, our PaaS. Those of you who (like us) run Java applications inside Docker have probably already come across the problem of JVMs inaccurately detecting the available memory when running inside a container. Instead of detecting the memory available to the Docker container, the JVM sees the available memory of the host machine. This can lead to cases where applications running inside containers are killed whenever they try to use an amount of memory that exceeds the Docker container's limit.


The JVM's incorrect detection of available memory stems from the fact that the Linux tools and libraries that return system resource information (e.g. /proc/meminfo, /proc/vmstat) were created before cgroups even existed. They return the resource information of the host, whether that host is a physical or a virtual machine.
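You can observe this discrepancy directly: /proc/meminfo reports the host's memory, while the cgroup limit file reflects the container's actual cap. A minimal sketch (the MemInfo class and firstLine helper are ours, and the cgroup path assumes cgroup v1; on cgroup v2 hosts the limit lives in /sys/fs/cgroup/memory.max instead):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MemInfo {
    // Hypothetical helper: returns the first line of a file, or null if
    // the file is unreadable or empty.
    static String firstLine(String file) {
        try {
            return Files.readAllLines(Paths.get(file)).get(0);
        } catch (IOException | IndexOutOfBoundsException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // /proc/meminfo shows the host's total memory, even inside a container...
        String hostMem = firstLine("/proc/meminfo");
        // ...while the cgroup (v1) limit file holds the container's real cap.
        String cgroupLimit = firstLine("/sys/fs/cgroup/memory/memory.limit_in_bytes");
        System.out.println("Host view   : " + hostMem);
        System.out.println("cgroup limit: " + (cgroupLimit != null ? cgroupLimit + " bytes" : "not available"));
    }
}
```

Run inside a memory-limited container, the two numbers differ wildly, which is exactly the gap the pre-cgroup tooling falls into.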

Let’s explore this process in action by observing how a simple Java application allocates a percentage of memory while running inside a Docker container. We’re going to deploy the application as a Kubernetes pod (using Minikube) to illustrate how the issue is also present on Kubernetes, which is unsurprising, since Kubernetes uses Docker as a container engine.

```java
package com.banzaicloud;

import java.util.Vector;

public class MemoryConsumer {
    private static final float CAP = 0.8f; // 80%
    private static final int ONE_MB = 1024 * 1024;
    private static final Vector<byte[]> cache = new Vector<>();

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long maxMemBytes = rt.maxMemory();
        long usedMemBytes = rt.totalMemory() - rt.freeMemory();
        long freeMemBytes = rt.maxMemory() - usedMemBytes;
        int allocBytes = Math.round(freeMemBytes * CAP);

        System.out.println("Initial free memory: " + freeMemBytes / ONE_MB + "MB");
        System.out.println("Max memory: " + maxMemBytes / ONE_MB + "MB");
        System.out.println("Reserve: " + allocBytes / ONE_MB + "MB");

        // Allocate 1MB chunks until we have reserved CAP of the free memory
        for (int i = 0; i < allocBytes / ONE_MB; i++) {
            cache.add(new byte[ONE_MB]);
        }

        usedMemBytes = rt.totalMemory() - rt.freeMemory();
        freeMemBytes = rt.maxMemory() - usedMemBytes;
        System.out.println("Free memory: " + freeMemBytes / ONE_MB + "MB");
    }
}
```

We use a Docker build file to create the image that contains the jar that’s built from the Java code above. We need this Docker image in order to deploy the application as a Kubernetes Pod.

Dockerfile

```dockerfile
FROM openjdk:8-alpine
ADD memory_consumer.jar /opt/local/jars/memory_consumer.jar
CMD java $JVM_OPTS -cp /opt/local/jars/memory_consumer.jar com.banzaicloud.MemoryConsumer
```

```
docker build -t memory_consumer .
```

Now that we have the Docker image, we need to create a pod definition to deploy the application to Kubernetes:

memory-consumer.yaml

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-consumer
spec:
  containers:
  - name: memory-consumer-container
    image: memory_consumer
    imagePullPolicy: Never
    resources:
      requests:
        memory: "64Mi"
      limits:
        memory: "256Mi"
  restartPolicy: Never
```

This pod definition ensures that the container is scheduled to a node that has at least 64MB of free memory and that it will not be allowed to use more than 256MB of memory.

```
$ kubectl create -f memory-consumer.yaml
pod "memory-consumer" created
```

Output of the pod:

```
$ kubectl logs memory-consumer
Initial free memory: 877MB
Max memory: 878MB
Reserve: 702MB
Killed

$ kubectl get po --show-all
NAME              READY     STATUS      RESTARTS   AGE
memory-consumer   0/1       OOMKilled   0          1m
```

The Java application running inside the container detected 877MB of free memory and consequently attempted to reserve 702MB of it (80% of the free memory). Since we previously limited maximum memory usage to 256MB, the container was killed.

To avoid this outcome, we need to instruct the JVM as to the correct maximum amount of memory it can reserve. We do that via the -Xmx option. We need to modify our pod definition to pass an -Xmx setting through the JVM_OPTS env variable to the Java application in the container.

memory-consumer.yaml

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-consumer
spec:
  containers:
  - name: memory-consumer-container
    image: memory_consumer
    imagePullPolicy: Never
    resources:
      requests:
        memory: "64Mi"
      limits:
        memory: "256Mi"
    env:
    - name: JVM_OPTS
      value: "-Xms64M -Xmx256M"
  restartPolicy: Never
```

```
$ kubectl delete pod memory-consumer
pod "memory-consumer" deleted

$ kubectl get po --show-all
No resources found.

$ kubectl create -f memory-consumer.yaml
pod "memory-consumer" created

$ kubectl logs memory-consumer
Initial free memory: 227MB
Max memory: 228MB
Reserve: 181MB
Free memory: 50MB

$ kubectl get po --show-all
NAME              READY     STATUS      RESTARTS   AGE
memory-consumer   0/1       Completed   0          1m
```

This time the application ran successfully; it detected the correct maximum memory we passed via -Xmx256M and thus did not hit the memory: "256Mi" limit specified in the pod definition.

While this solution works, it requires that the memory limit be specified in two places: once as a limit for the container memory: "256Mi" , and once in the option that is passed to -Xmx256M . It would be much more convenient if the JVM accurately detected the maximum amount of available memory based on the memory: "256Mi" setting, wouldn’t it?
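Before looking at the JVM-side fix, note that the duplication could also be removed by deriving the -Xmx value from the cgroup limit at startup. A sketch of that idea (the HeapSizer class and heapFor helper are ours; the 3/4 ratio is an assumption to leave room for metaspace, thread stacks, and other non-heap memory, and the path assumes cgroup v1):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HeapSizer {
    // Hypothetical policy: give the heap three quarters of the container's
    // memory limit, keeping the rest for non-heap usage. The ratio is our
    // assumption, not an official recommendation.
    static long heapFor(long cgroupLimitBytes) {
        return cgroupLimitBytes * 3 / 4;
    }

    public static void main(String[] args) throws IOException {
        // cgroup v1 path; cgroup v2 uses /sys/fs/cgroup/memory.max instead.
        String raw = new String(Files.readAllBytes(
                Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes"))).trim();
        long limit = Long.parseLong(raw);
        // Print a flag suitable for appending to JVM_OPTS
        System.out.println("-Xmx" + heapFor(limit) / (1024 * 1024) + "M");
    }
}
```

In practice this logic would live in a small entrypoint wrapper, so the container computes its own -Xmx and the limit is declared in exactly one place. But the JVM can now do this for us.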

Well, there's a change in Java 9 that makes the JVM Docker-aware, and it has been backported to Java 8.

In order to make use of this feature, our pod definition has to look like this:

memory-consumer.yaml

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-consumer
spec:
  containers:
  - name: memory-consumer-container
    image: memory_consumer
    imagePullPolicy: Never
    resources:
      requests:
        memory: "64Mi"
      limits:
        memory: "256Mi"
    env:
    - name: JVM_OPTS
      value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=1 -Xms64M"
  restartPolicy: Never
```

```
$ kubectl delete pod memory-consumer
pod "memory-consumer" deleted

$ kubectl get pod --show-all
No resources found.

$ kubectl create -f memory-consumer.yaml
pod "memory-consumer" created

$ kubectl logs memory-consumer
Initial free memory: 227MB
Max memory: 228MB
Reserve: 181MB
Free memory: 54MB

$ kubectl get po --show-all
NAME              READY     STATUS      RESTARTS   AGE
memory-consumer   0/1       Completed   0          50s
```

Please note the -XX:MaxRAMFraction=1 option, through which we tell the JVM what fraction of the available (cgroup-limited) memory to use as its max heap size; a value of 1 means all of it.
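The arithmetic behind the flag can be sketched as follows (the maxHeap helper is ours, and real JVM ergonomics also reserve some memory for the JVM itself, which is why the pod above reported ~228MB rather than the full 256MB):

```java
public class MaxRamFractionDemo {
    // Models how the JVM derives the max heap from available memory:
    // heap = availableRam / MaxRAMFraction. On JDK 8 the default
    // fraction is 4, i.e. a quarter of available memory.
    static long maxHeap(long availableRamBytes, int maxRamFraction) {
        return availableRamBytes / maxRamFraction;
    }

    public static void main(String[] args) {
        long limit = 256L * 1024 * 1024; // the 256Mi container limit
        System.out.println("MaxRAMFraction=4 -> "
                + maxHeap(limit, 4) / (1024 * 1024) + "MB heap");
        System.out.println("MaxRAMFraction=1 -> "
                + maxHeap(limit, 1) / (1024 * 1024) + "MB heap");
    }
}
```

With the default fraction of 4, a 256Mi container would get only a 64MB heap, which is why setting MaxRAMFraction=1 matters here.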

Having a max heap size that takes the container's memory limit into account, whether set through -Xmx or derived dynamically with UseCGroupMemoryLimitForHeap, is important because it tells the JVM when memory usage is approaching the limit, so it can free up space through garbage collection. If the max heap size is incorrect (exceeds the container's memory limit), the JVM may blindly hit the limit without attempting to free memory, and the process will be OOMKilled.

A java.lang.OutOfMemoryError is a different matter: it indicates that the max heap size is not large enough to hold all live objects in memory. In that case the max heap size needs to be increased via -Xmx or, if UseCGroupMemoryLimitForHeap is in use, via the container's memory limit.