But does that mean the size of Java reference is the same as the machine pointer width? Not necessarily. Java objects are usually quite reference-heavy, and there is pressure for runtimes to employ the optimizations that make the references smaller. The most ubiquitous trick is to compress the references: make their representation smaller than the machine pointer width. In fact, the example above was executed with that optimization explicitly disabled.

Since Java runtime environment is in full control of internal representation, this can be done without changing any user programs. It is possible to do in other environments, but you would need to handle the leakage through ABIs, etc, see for example X32 ABI .

In Hotspot, due to a historical accident, the internal names had leaked to the VM options list that control this optimization. In Hotspot, the references to Java objects are called "ordinary object pointers", or "oops", which is why Hotspot VM options have these weird names: -XX:+UseCompressedOops , -XX:+PrintCompressedOopsMode , -Xlog:gc+heap+coops . In this post we would try to use the proper nomenclature, where possible.

See, the access is still in the same form, that is because the hardware itself just accepts the 32-bit pointer and extends it to 64 bits when doing the access. We have got this optimization for almost free.

Now, the reference field only takes 4 bytes and the instance size is down to 16 bytes:

This whole shebang is obviously possible when heap size is less than 4 GB (or, 2 32 bytes). Technically, the heap start address might be far away from zero address, and so the actual limit is lower than 4 GB. See the "Heap Address" in logging above. It says that heap starts at 0x0000000080000000 mark, closer to 2 GB.

On most heap sizes, the higher bits of 64-bit machine pointer are usually zero. On the heap that can be mapped over the first 4 GB of virtual memory, higher 32 bits are definitely zero. In that case, we can just use the lower 32-bit to store the reference in 32-bit machine pointer. In Hotspot, this is called "32-bit" mode, as can be seen with logging:

"Zero-Based" Mode

But what if we cannot fit the untreated reference into 32 bits? There is a way out as well, and it exploits the fact that objects are aligned: objects always start at some multiple of alignment. So, the lowest bits of untreated reference representation are always zero. This opens up the way to use those bits for storing significant bits that did not fit into 32 bits. The easiest way to do that is to bit-shift-right the reference bits, and this gives us 2(32+shift) bytes of heap encodeable into 32 bits.

Graphically, it can be sketched like this:

With default object alignment of 8 bytes, shift is 3 (23 = 8), therefore we can represent the references to 235 = 32 GB heap. Again, the same problem with base heap address surfaces here and makes the actual limit a bit lower.

In Hotspot, this mode is called "zero based compressed oops", see for example:

$ java -Xmx20g -Xlog:gc+heap+coops ... [0.010s][info][gc,heap,coops] Heap address: 0x0000000300000000, size: 20480 MB, Compressed Oops mode: Zero based, Oop shift amount: 3

The access via the reference is now a bit more complicated:

....[Hottest Region 3]..................................................... c2, level 4, org.openjdk.CompressedRefs::access, version 715 (36 bytes) [Verified Entry Point] 0.94% ...40: mov %eax,-0x14000(%rsp) ; prolog 7.43% ...47: push %rbp 0.52% ...48: sub $0x10,%rsp 1.26% ...4c: mov 0xc(%rsi),%r11d ; get field "o" 6.08% ...50: mov 0xc(%r12,%r11,8),%eax ; get field "o.x" 6.94% ...55: add $0x10,%rsp ; epilog 0.54% ...59: pop %rbp 0.27% ...5a: mov 0x108(%r15),%r10 ; thread-local handshake 0.57% ...61: test %eax,(%r10) 6.50% ...64: retq

Getting the field o.x involves executing mov 0xc(%r12,%r11,8),%eax : "Taketh the ref’rence from %r11, multiplyeth the ref’rence by 8, addeth the heapeth base from %r12, and that wouldst be the objecteth that you can now readeth at offset 0xc ; putteth that value into %eax , please". In other words, this instruction combines the decoding of the compressed reference with the access through it, and it is done in one sway. In zero-based mode, %r12 is zero, but it is easier on code generator to emit the access involving %r12 nevertheless. The fact that %r12 is zero in this mode can be used by code generator in other places too.