Virtual memory is taken for granted now. Only a few now remember, let alone do, some "real mode" programming, where you are exposed to the actual physical memory. Instead, every process has its own virtual memory space, and that space is mapped onto actual memory. That allows, for instance, two processes to have distinct data at the same virtual address 0x42424242, which will be backed by different physical memory. Now, when a program accesses that address, something should translate that virtual address to a physical one.

This is normally achieved by the OS maintaining the "page table", and hardware doing the "page table walk" through that table to translate the address. The whole thing gets easier when translations are maintained at page granularity. But it is nevertheless not very cheap, and it needs to happen for every memory access! Therefore, there is also a small cache of the latest translations, the Translation Lookaside Buffer (TLB). The TLB is usually very small, below 100 entries, because it needs to be at least as fast as the L1 cache, if not faster. For many workloads, TLB misses and the associated page table walks take significant time.
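To make the translation path concrete, here is a toy model of it, not how any real MMU is built. It assumes 4K pages, a flat single-level page table held in a `HashMap`, and a hypothetical 64-entry LRU TLB; the class name `ToyTranslation` and the frame number `0x7` are made up for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class ToyTranslation {
    static final int PAGE_SHIFT = 12;                    // 4K pages: 2^12 bytes
    static final long PAGE_MASK = (1L << PAGE_SHIFT) - 1;
    static final int TLB_ENTRIES = 64;                   // assumption: a small TLB

    // "Page table": virtual page number -> physical frame number.
    final Map<Long, Long> pageTable = new HashMap<>();

    // "TLB": an access-ordered map that evicts the least recently used entry.
    final Map<Long, Long> tlb =
        new LinkedHashMap<Long, Long>(TLB_ENTRIES, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, Long> eldest) {
                return size() > TLB_ENTRIES;
            }
        };

    long tlbMisses;

    long translate(long vaddr) {
        long vpn = vaddr >>> PAGE_SHIFT;                 // which page
        Long frame = tlb.get(vpn);
        if (frame == null) {
            tlbMisses++;                                 // slow path: "walk" the table
            frame = pageTable.get(vpn);
            if (frame == null) {
                throw new IllegalStateException(
                    "page fault at 0x" + Long.toHexString(vaddr));
            }
            tlb.put(vpn, frame);
        }
        // Same offset within the page, different page base.
        return (frame << PAGE_SHIFT) | (vaddr & PAGE_MASK);
    }

    public static void main(String[] args) {
        ToyTranslation mmu = new ToyTranslation();
        mmu.pageTable.put(0x42424L, 0x7L);               // hypothetical mapping
        System.out.printf("0x%x%n", mmu.translate(0x42424242L));
        System.out.printf("0x%x%n", mmu.translate(0x42424242L));
        System.out.println("TLB misses: " + mmu.tlbMisses);
    }
}
```

The second access to the same page is served from the toy TLB, so only the first one pays for the table lookup.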

Since we cannot make the TLB larger, we can do something else: make larger pages! Most hardware has 4K basic pages, and 2M/4M/1G "large pages". Having larger pages to cover the same region also makes the page tables themselves smaller, lowering the cost of the page table walk.
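A bit of back-of-the-envelope arithmetic shows why this helps. The sketch below assumes a 64-entry TLB and a 1G region, both illustrative numbers rather than figures for any particular CPU, and computes how much memory the TLB can cover ("reach") and how many pages are needed to map the region.

```java
public class TlbReach {
    static final long K = 1024, M = 1024 * K, G = 1024 * M;

    // Memory covered when every TLB entry maps one page.
    static long reachBytes(int entries, long pageSize) {
        return entries * pageSize;
    }

    // Pages needed to map a region, rounding up.
    static long pagesNeeded(long regionBytes, long pageSize) {
        return (regionBytes + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        int entries = 64;   // assumption: TLB size
        long region = 1 * G; // assumption: size of the region we care about

        for (long pageSize : new long[] {4 * K, 2 * M}) {
            System.out.printf("page=%dK reach=%dK pages=%d%n",
                pageSize / K,
                reachBytes(entries, pageSize) / K,
                pagesNeeded(region, pageSize));
        }
    }
}
```

Under these assumptions, 4K pages give a reach of 256K and need 262144 pages for the region, while 2M pages give a reach of 128M and need only 512 pages.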

In the Linux world, there are at least two distinct ways to get this in applications:

hugetlbfs. Cut out a part of system memory, expose it as a virtual filesystem, and let applications mmap(2) from it. This is a peculiar interface that requires both OS configuration and application changes to use. This is also an "all or nothing" kind of deal: the space allocated for (the persistent part of) hugetlbfs cannot be used by regular processes.

Transparent Huge Pages (THP). Let the application allocate memory as usual, but try to provide large-pages-backed storage transparently to the application. Ideally, no application changes are needed, but we will see how applications can benefit from knowing THP is available. In practice, though, there are memory overheads (because you will allocate an entire large page for something small), or time overheads (because sometimes THP needs to defrag memory to allocate pages). The good part is that there is a middle ground: madvise(2) lets the application tell Linux where to use THP.
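Before relying on THP, it helps to know which mode the system is in. On Linux, the current mode is exposed in `/sys/kernel/mm/transparent_hugepage/enabled`, a single line such as `always [madvise] never` where the brackets mark the active setting. The sketch below reads and parses that file; the class name `ThpMode` and the `"unknown"` fallback are choices made for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ThpMode {
    // Extracts the bracketed word, e.g. "madvise" from "always [madvise] never".
    static String activeMode(String line) {
        int open = line.indexOf('[');
        int close = line.indexOf(']', open + 1);
        if (open < 0 || close < 0) {
            return "unknown";
        }
        return line.substring(open + 1, close);
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("/sys/kernel/mm/transparent_hugepage/enabled");
        if (!Files.isReadable(path)) {
            // Not Linux, or a kernel built without THP support.
            System.out.println("THP status file is not available");
            return;
        }
        String line = new String(Files.readAllBytes(path)).trim();
        System.out.println("THP mode: " + activeMode(line));
    }
}
```

In "madvise" mode, only the regions an application explicitly asks for are candidates for THP, which is the middle ground described above.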

Why the nomenclature uses "large" and "huge" interchangeably is beyond me. Anyway, OpenJDK supports both modes:

$ java -XX:+PrintFlagsFinal 2>&1 | grep Huge
     bool UseHugeTLBFS               = false      {product} {default}
     bool UseTransparentHugePages    = false      {product} {default}
$ java -XX:+PrintFlagsFinal 2>&1 | grep LargePage
     bool UseLargePages              = false   {pd product} {default}

-XX:+UseHugeTLBFS mmaps the Java heap into hugetlbfs, which should be prepared separately.

-XX:+UseTransparentHugePages just uses madvise(2) to tell the OS that the Java heap should use THP. This is a convenient option, because we know that the Java heap is large, mostly contiguous, and probably benefits from large pages the most.

-XX:+UseLargePages is a generic shortcut that enables anything available. On Linux, it enables hugetlbfs, not THP. I guess that is for historical reasons, because hugetlbfs came first.
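To get a rough feel for the effect of these flags, one can walk a large array with a page-sized stride, so that nearly every access lands on a different 4K page. This is a crude sketch, not a proper benchmark: the class name `HeapWalk`, the 256M array size, and the 4096-byte stride are all illustrative choices, and a single timed loop is subject to JIT warmup and other noise.

```java
public class HeapWalk {
    // Touches one byte per stride and returns the sum of the touched bytes,
    // so the JIT cannot discard the loop as dead code.
    static long touch(byte[] data, int stride) {
        long sum = 0;
        for (int i = 0; i < data.length; i += stride) {
            data[i]++;
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int size = 256 * 1024 * 1024;   // assumption: 256M array
        int stride = 4096;              // assumption: 4K base page size
        byte[] data = new byte[size];

        long start = System.nanoTime();
        long sum = 0;
        for (int pass = 0; pass < 100; pass++) {
            sum += touch(data, stride);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("sum=" + sum + " time=" + elapsedMs + " ms");
    }
}
```

Running it as `java -Xmx1g HeapWalk` and then as `java -Xmx1g -XX:+UseTransparentHugePages HeapWalk` gives two timings to compare; whether the difference is visible depends on the hardware and on how THP is configured on the machine.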