TL;DR: Java and cgroups/Docker memory constraints don’t always behave as you might expect. Always explicitly specify JVM heap sizes. Also be aware that kernel features may not be enabled. And Linux… lies.

I’ve recently discovered an interesting “quirk” in potential interactions between Java, cgroups, Docker, and the kernel which can cause some surprising results.

Unless you explicitly state heap sizes, the JVM makes guesses about sizing based on the host on which it runs. In general, on any “server class” machine — which now refers to just about anything other than a Windows desktop or a Raspberry Pi — the JVM by default specifies a maximum heap size of approximately 1/4 of the RAM on the host. Where this becomes interesting is that specifying the amount of memory available to a container does not affect what the JVM believes is available.
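You can ask a running JVM what it decided for itself. The following small class is my own illustration (the name `HeapCheck` is mine, not from any tooling): it reports the heap ceiling via the standard `Runtime` API, which — absent an explicit `-Xmx` — is derived from host RAM, not from any cgroup limit.

```java
public class HeapCheck {
    // Heap ceiling the JVM settled on. With no -Xmx flag, this is derived
    // from the host's physical RAM (roughly 1/4 on "server class" machines),
    // not from any container/cgroup constraint.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMb());
    }
}
```

Running this inside a memory-limited container is a quick way to confirm whether your JVM honors the limit.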

Last year I wrote in Looking Inside a JVM: -XX:+PrintFlagsFinal about finding the values configured in the JVM at runtime. By not specifying a heap size, I get the following on a host with 12G of RAM:

```
$ java -XX:+PrintFlagsFinal -version | grep -i heapsize | egrep 'Initial|Max'
java version "1.8.0_74"
Java(TM) SE Runtime Environment (build 1.8.0_74-b02)
Java HotSpot(TM) 64-Bit Server VM (build 25.74-b02, mixed mode)
    uintx InitialHeapSize := 188743680    {product}
    uintx MaxHeapSize    := 2988441600    {product}
```

Notice that the MaxHeapSize is ~3GB.

Ever look inside of Java… in Docker?

```
$ docker run --rm java java -XX:+PrintFlagsFinal -version | grep -i heapsize | egrep 'Initial|Max'
openjdk version "1.8.0_72-internal"
OpenJDK Runtime Environment (build 1.8.0_72-internal-b15)
OpenJDK 64-Bit Server VM (build 25.72-b15, mixed mode)
    uintx InitialHeapSize := 188743680    {product}
    uintx MaxHeapSize    := 2988441600    {product}
```

It’s the same. Ok, let’s set the max memory size of the container to 256m ( `-m 256m` ) and try again:

```
$ docker run -m 256m --rm java java -XX:+PrintFlagsFinal -version | grep -i heapsize | egrep 'Initial|Max'
WARNING: Your kernel does not support swap limit capabilities, memory limited without swap.
openjdk version "1.8.0_72-internal"
OpenJDK Runtime Environment (build 1.8.0_72-internal-b15)
OpenJDK 64-Bit Server VM (build 25.72-b15, mixed mode)
    uintx InitialHeapSize := 188743680    {product}
    uintx MaxHeapSize    := 2988441600    {product}
```

Note the warning…. we’ll come back to it later (much later).

And… it’s the same.

Fabio Kung has written an interesting discussion of Memory inside Linux containers and the reasons why system calls do not return the amount of memory inside a container. In short, the various tools and system calls (including those which the JVM invokes) were created before cgroups and have no concept that such limits might exist.
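The limit is still knowable — it just has to be read from the cgroup filesystem rather than from the usual system calls. Here is a sketch of my own (not from Fabio Kung's post); it assumes the cgroup v1 path Docker used at the time, while cgroup v2 hosts expose `/sys/fs/cgroup/memory.max` instead:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CgroupLimit {
    // A cgroup limit file contains a single number (newline-terminated).
    // An "unlimited" v1 cgroup reports a huge sentinel value near Long.MAX_VALUE.
    static long parseLimit(String raw) {
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) throws Exception {
        // cgroup v1 path; cgroup v2 hosts use /sys/fs/cgroup/memory.max.
        Path limitFile = Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes");
        if (Files.exists(limitFile)) {
            long limit = parseLimit(new String(Files.readAllBytes(limitFile)));
            System.out.println("cgroup memory limit (bytes): " + limit);
        } else {
            System.out.println("cgroup v1 memory limit file not found on this host");
        }
    }
}
```

Inside a container started with `-m 256m`, the v1 file would read 268435456.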

So, how much memory is actually available to the JVM? Let’s start with a class which eats memory. I found the following code at Java memory test – How to consume all the memory (RAM) on a computer:

```java
import java.util.Vector;

// The following is from:
// http://alvinalexander.com/blog/post/java/java-program-consume-all-memory-ram-on-computer
public class MemoryEater {
    public static void main(String[] args) {
        Vector v = new Vector();
        while (true) {
            byte b[] = new byte[1048576];
            v.add(b);
            Runtime rt = Runtime.getRuntime();
            System.out.println("free memory: " + rt.freeMemory());
        }
    }
}
```

We can use the Docker Java container to compile it:

```
docker run --rm -v "$PWD":/usr/src/myapp -w /usr/src/myapp java javac MemoryEater.java
```

Now that it is compiled, let’s test:

```
docker run --name memory_eater -d -v "$PWD":/usr/src/myapp -w /usr/src/myapp -m 256m java java -XX:+PrintFlagsFinal -XX:OnOutOfMemoryError="echo Out of Memory" -XX:ErrorFile=fatal.log MemoryEater
```

There are a few interesting flags:

| Flag | Explanation |
| --- | --- |
| `-XX:OnOutOfMemoryError="echo Out of Memory"` | Instruct the JVM to run a command (here, outputting a message) on [OutOfMemoryError](https://docs.oracle.com/javase/7/docs/api/java/lang/OutOfMemoryError.html) |
| `-XX:ErrorFile=fatal.log` | When a fatal error occurs, an error log is created with information and the state obtained at the time of the fatal error. ([Fatal Error Log – Troubleshooting Guide for Java SE 6 with HotSpot VM](http://www.oracle.com/technetwork/java/javase/felog-138657.html)) |

Betwixt the two flags, we should get some indication of an error….

Testing, Testing….

The tests were performed in a variety of scenarios:

| Environment | Docker Version | RAM | Swap | Docker Memory Constraint | Note(s) |
| --- | --- | --- | --- | --- | --- |
| 4 Core, OpenStack Instance | 1.8.3 | 24G | 0 | `--memory=256m` | [HCF](https://en.wikipedia.org/wiki/Halt_and_Catch_Fire) within seconds — the OOMKiller kills the process. |
| 4 Core, Physical | 1.10.3 | 12G | 15G | `--memory=256m` | Runs for a while and ends with OutOfMemoryError |
| 8 Core, Physical | 1.9.1 | 32G | 32G | `--memory=256m` | Runs for about 5 minutes and exits with OutOfMemoryError |
| 8 Core, Physical | 1.9.1 | 32G | 32G | `--memory=255m --memory-swap=256m` | Runs for about 5 minutes and exits with OutOfMemoryError |
| 8 Core, Physical | 1.9.1 | 32G | 32G | `--memory=255m --memory-swap=256m` | Kernel-level swap accounting turned on. OOMKiller strikes almost immediately. |

In each case, the OS is Ubuntu 14.04 and the Docker container is `java:latest`.

I was expecting that the jvm would quickly attempt to grow beyond the container constraints and be killed. In the first test, it behaved as I expected. The container starts and then the logs abruptly end:

```
.....
free memory: 185915048
free memory: 184866456
free memory: 183817864
```

Upon inspection of the container, I see that it was killed by the OOMKiller:

.... "State": { "Running": false, "Paused": false, "Restarting": false, "OOMKilled": true, "Dead": false, "Pid": 0, "ExitCode": 137, "Error": "", "StartedAt": "2016-03-15T21:21:48.845032635Z", "FinishedAt": "2016-03-15T21:21:49.140794192Z" }, .... 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 . . . . "State" : { "Running" : false , "Paused" : false , "Restarting" : false , "OOMKilled" : true , "Dead" : false , "Pid" : 0 , "ExitCode" : 137 , "Error" : "" , "StartedAt" : "2016-03-15T21:21:48.845032635Z" , "FinishedAt" : "2016-03-15T21:21:49.140794192Z" } , . . . .

Odd behavior, but just as I expected. cgroups is enforcing the amount of space used by a container, but when the JVM or any other program queries for the available memory, it doesn’t interfere:

```
matt@nimbus:~/memory_eater$ free
             total       used       free     shared    buffers     cached
Mem:      32414832    1228224   31186608        948     262900     451720
-/+ buffers/cache:     513604   31901228
Swap:     33013756          0   33013756
matt@nimbus:~/memory_eater$ docker run --rm --memory=256m -it ubuntu /bin/bash
root@e584d1c56f32:/# free
             total       used       free     shared    buffers     cached
Mem:      32414832    1237240   31177592       1012     262956     451844
-/+ buffers/cache:     522440   31892392
Swap:     33013756          0   33013756
root@e584d1c56f32:/# exit
```
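The same host-level numbers are visible from Java itself. This sketch is my own illustration; it uses the `com.sun.management` extension of the standard `OperatingSystemMXBean`, so it is HotSpot/OpenJDK-specific. Inside a container, the "physical memory" it reports matches the host, not the cgroup limit:

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

public class HostMemory {
    // The com.sun.management MXBean exposes what the OS tells the JVM.
    // Inside a container this is the host's physical memory, because the
    // underlying calls predate cgroups and know nothing of such limits.
    static long physicalMb() {
        OperatingSystemMXBean os = (OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        return os.getTotalPhysicalMemorySize() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("JVM sees physical memory (MB): " + physicalMb());
    }
}
```

On the 32G host above, this would print roughly 31000+ MB even when run under `--memory=256m`.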

At this point I decided that I had an interesting enough topic to write about. Little did I know that I was about to go…..

Down the Rabbit Hole

I set down to diligently write about my findings; re-running the test on my laptop (the second entry in the table above), I was surprised to find that it behaved differently.

At first I thought it might be due to differences in Docker versions, so I tried on the 3rd host, where it ran even longer than on the laptop!

```
269.180: [Full GC (Ergonomics) [PSYoungGen: 1252864K->1252371K(1274368K)] [ParOldGen: 5401989K->5401989K(5402624K)] 6654853K->6654360K(6676992K), [Metaspace: 2574K->2574K(1056768K)], 3.7960775 secs] [Times: user=11.39 sys=0.91, real=3.80 secs]
272.978: [Full GC (Allocation Failure) [PSYoungGen: 1252371K->1252371K(1274368K)] [ParOldGen: 5401989K->5401977K(5402624K)] 6654360K->6654349K(6676992K), [Metaspace: 2574K->2574K(1056768K)], 87.2372140 secs] [Times: user=529.11 sys=34.78, real=87.24 secs]
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at MemoryEater.main(MemoryEater.java:15)
Heap
 PSYoungGen      total 1274368K, used 1252864K [0x000000071b200000, 0x000000076c400000, 0x00000007c0000000)
  eden space 1252864K, 100% used [0x000000071b200000,0x0000000767980000,0x0000000767980000)
  from space 21504K, 0% used [0x0000000768e80000,0x0000000768e80000,0x000000076a380000)
  to   space 21504K, 0% used [0x0000000767980000,0x0000000767980000,0x0000000768e80000)
 ParOldGen       total 5402624K, used 5401978K [0x00000005d1600000, 0x000000071b200000, 0x000000071b200000)
  object space 5402624K, 99% used [0x00000005d1600000,0x000000071b15ea50,0x000000071b200000)
 Metaspace       used 2604K, capacity 4486K, committed 4864K, reserved 1056768K
  class space    used 273K, capacity 386K, committed 512K, reserved 1048576K
```

(note the insane length of the garbage collection; this should have been my clue that something was seriously weird!)

I didn’t find anything indicating that memory constraints behaved differently between the 1.8.3 and more current versions.

I then wondered if it might be related to HugePageTables. As of 2011, Documentation/cgroups/memory.txt [LWN.net] states:

Kernel memory and Hugepages are not under control yet. We just manage pages on LRU.

Ok… let’s see if it’s enabled:

```
sudo hugeadm --explain
```

Yup… I had them.

I then disabled HugePages on the 8 core host:

```
sudo hugeadm --thp-never
```

Ok, disabled. I rebooted for paranoia and re-ran my test. Still failed. Grump.

It was time to….



The Docker Run Reference section on memory constraints specifies that there are four scenarios for setting user memory usage:

1. No memory limits; the container can use as much as it likes. (Default behavior)
2. Specify `memory`, but no `memory-swap` — the container RAM is limited and it may use an equivalent amount of swap as memory.
3. Specify `memory` and infinite (-1) `memory-swap` — the container is limited in RAM, but not in swap.
4. Specify `memory` and `memory-swap` to set the total amount. In this case, `memory-swap` needs to be larger than `memory`:

```
$ docker run --rm --memory=255m --memory-swap=128m -it ubuntu /bin/bash
Error response from daemon: Minimum memoryswap limit should be larger than memory limit, see usage.
```

The total amount is denoted by memory-swap .
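The rule behind that error fits in a few lines. The following is a toy re-implementation of the daemon's validation, written for illustration only — Docker's actual check lives in its own (Go) codebase:

```java
public class SwapCheck {
    // Mimics Docker's validation of the memory flags: when both are set,
    // memory-swap (the *total* of RAM + swap) must exceed memory.
    // A memory-swap of -1 means "unlimited swap" and is always accepted.
    static boolean validLimits(long memoryBytes, long memorySwapBytes) {
        if (memorySwapBytes == -1) {
            return true; // infinite swap
        }
        return memorySwapBytes > memoryBytes;
    }

    public static void main(String[] args) {
        // 255m of RAM with a 128m total is rejected, as in the error above.
        System.out.println(validLimits(255L << 20, 128L << 20)); // false
        System.out.println(validLimits(255L << 20, 256L << 20)); // true
    }
}
```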

Aha! I’ll just set these flags and run my container again…. Drat.

It still isn’t working.

And swap keeps growing and growing….

By now, it’s going on 3AM, but I’m definitely going to figure this out.

At this point I remembered the warning:

```
WARNING: Your kernel does not support swap limit capabilities, memory limited without swap.
```

A little bit of googling and I find that I need to set a kernel parameter. This can be done via grub .

You will need to edit /etc/default/grub — it is owned by root, so you will likely need to sudo.

Edit the `GRUB_CMDLINE_LINUX` line to add:

cgroup_enable=memory swapaccount=1

If there are no other arguments, it will look like this:

GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"

If there are other arguments, then just add the above; you’ll end up with something along the lines of:

GRUB_CMDLINE_LINUX="acpi=off noapic cgroup_enable=memory swapaccount=1"

Next, run `sudo update-grub && sudo reboot`.

Once the host reboots, the warning disappears and the JVM is killed as expected:

"State": { "Status": "exited", "Running": false, "Paused": false, "Restarting": false, "OOMKilled": true, "Dead": false, "Pid": 0, "ExitCode": 137, "Error": "", "StartedAt": "2016-03-16T07:06:51.254992071Z", "FinishedAt": "2016-03-16T07:06:51.724280821Z" }, 1 2 3 4 5 6 7 8 9 10 11 12 13 14 "State" : { "Status" : "exited" , "Running" : false , "Paused" : false , "Restarting" : false , "OOMKilled" : true , "Dead" : false , "Pid" : 0 , "ExitCode" : 137 , "Error" : "" , "StartedAt" : "2016-03-16T07:06:51.254992071Z" , "FinishedAt" : "2016-03-16T07:06:51.724280821Z" } ,

Conclusion

The reason it behaved as expected on the OpenStack instance was that there is no swap on the instance. Since there is no swap to be had, the container is, by necessity, limited to the size of the memory specified, and the JVM instance was reaped by the OOMKiller, as I’d expected it would be.

This was definitely an instance of accidental success!

The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ but ‘That’s funny…’ (Isaac Asimov)

I’m glad I went down the rabbit hole on this one; I learned a good bit even if it took considerably longer than I’d expected.

A few caveats with which to leave you:

- It is best to always specify heap sizes when using the JVM. Don’t depend on heuristics. They can, have, and do change from version to version, let alone operating system and a host of other variables.
- Assume that the OS lies and there’s less memory than it tells you. I haven’t even mentioned Linux’s “optimistic malloc” yet.
- Know thy system. Understand how the different pieces work together.
- And remember…. No software, just like no plan, survives contact with the …. user.
