Here’s my mental model for memory accesses. First, if virtual memory is turned OFF:

The CPU issues a request to read or write a certain address. The request goes to the chipset. In most cases, the chipset forwards the request to RAM. For certain addresses, the chipset forwards the request to a memory-mapped hardware device.
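That routing decision can be sketched as a lookup on fixed physical address ranges. This is a toy illustration, not any real chipset's memory map; the device window below is made up:

```python
# Toy sketch of chipset routing with paging off: the chipset forwards
# each physical address either to RAM or to a memory-mapped device,
# based on fixed address ranges. The ranges here are hypothetical.

def route(phys_addr):
    if 0xF000 <= phys_addr <= 0xFFFF:  # hypothetical device window
        return "device"
    return "ram"

print(route(0x0042))  # ram
print(route(0xF010))  # device
```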

Two examples of memory requests with paging disabled. The chipset routes the request to RAM or to another hardware device.

To turn on virtual memory, the kernel first has to set up a couple of data structures in RAM called page tables. These are big tables that map “virtual” addresses to “physical” addresses. There may be multiple levels of page tables, in which case the highest-order bits of the virtual address determine which entry to look up in the first-level page table. That entry contains the physical address of another page table. The next-highest bits of the virtual address determine which entry to look up in the second-level page table. That entry contains the physical address of a page (section) of memory. The lowest-order bits of the virtual address determine the offset inside the physical page to fetch.
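The walk described above can be sketched in a few lines. The bit widths and table layout here are made up for illustration (4-bit indices at each level, an 8-bit page offset), not any real architecture's format:

```python
# Toy two-level page table walk. Hypothetical layout: a 16-bit virtual
# address split into a 4-bit first-level index, a 4-bit second-level
# index, and an 8-bit page offset.

L1_BITS, L2_BITS, OFFSET_BITS = 4, 4, 8

def translate(virtual_addr, l1_table):
    """Translate a virtual address to a physical address."""
    l1_index = (virtual_addr >> (L2_BITS + OFFSET_BITS)) & ((1 << L1_BITS) - 1)
    l2_index = (virtual_addr >> OFFSET_BITS) & ((1 << L2_BITS) - 1)
    offset = virtual_addr & ((1 << OFFSET_BITS) - 1)

    l2_table = l1_table[l1_index]   # first lookup: finds the second-level table
    page_base = l2_table[l2_index]  # second lookup: finds the physical page
    return page_base + offset       # offset lands inside that page

# One first-level entry pointing at one second-level table, which maps
# virtual page 0 to the physical page starting at 0xC100.
page_tables = {0: {0: 0xC100}}
print(hex(translate(0x0042, page_tables)))  # 0xc142
```

Note that each level of page table costs one extra memory lookup, which is exactly the overhead discussed below.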

Diagram taken from https://61creview.wordpress.com/tag/tlb/

So when virtual memory is turned ON:

The CPU issues a request to read or write a certain address. The memory management unit (MMU) intercepts the request. It considers the request to be a virtual address and translates it to a physical address using page tables (which are also stored in RAM). The MMU then forwards the physical address to the chipset like normal.

An example memory request with paging enabled. The MMU translates the virtual address 0x01 to the physical address 0xC1. Page tables are stored somewhere in RAM but aren’t shown.

Note that every request from the CPU now requires an additional request to look up an entry in one or more page tables. This does not cause an infinite loop because page tables store physical addresses, so they don’t need to be translated.

Also note that this would seem to increase the number of requests to RAM by 2x (or more, with multiple levels of page tables). Why does this not make all our programs twice as slow? Because caching. The MMU caches the results of these translations in a small, fast hardware cache called the TLB (translation lookaside buffer).
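The TLB's job can be sketched as a tiny dictionary in front of the page-table walk. Everything here (page size, the walk function, the hit/miss counters) is illustrative, not how real hardware is organized:

```python
# Toy TLB: cache virtual-page -> physical-page translations so that
# repeated accesses to the same page skip the (slow) page-table walk.

PAGE_SIZE = 256  # hypothetical 8-bit page offset

class TinyTLB:
    def __init__(self, walk_fn):
        self.walk = walk_fn  # fallback: the full page-table walk
        self.cache = {}      # virtual page number -> physical page base
        self.hits = self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.cache:
            self.hits += 1            # cached: no extra RAM requests
        else:
            self.misses += 1          # not cached: walk the page tables
            self.cache[vpn] = self.walk(vpn)
        return self.cache[vpn] + offset

# A made-up one-entry mapping standing in for the page tables in RAM.
tlb = TinyTLB(walk_fn=lambda vpn: {0: 0xC100}[vpn])
tlb.translate(0x01)  # miss: pays for the page-table walk
tlb.translate(0x02)  # hit: same page, translation already cached
print(tlb.hits, tlb.misses)  # 1 1
```

Because programs tend to touch the same pages over and over (locality), most translations hit the TLB and the extra page-table lookups rarely happen.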