16-bit days

There were days when computers had 16-bit registers and a 20-bit address space. That is a total of 1MB of memory, and some claimed that it ought to be enough for anybody. The memory address space was flat and not protected by anything; this is now known as real mode. How was it possible to address 20 bits of memory with only 16-bit registers?



Well, it was necessary to use two registers. One, the segment, pointed to the high bits of the physical location; the other held a relative offset. The physical memory location was computed as segment*16 + offset. By the way, it wasn’t possible to access more than 64KB of data through a single segment value, because that is the most a 16-bit offset can cover.
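The segment*16 + offset rule can be written down as a few lines of C. This is only a sketch: the function name real_mode_phys is made up for illustration, and the final mask assumes a machine with exactly 20 address lines, where addresses past 1MB wrap around.

```c
#include <stdint.h>

/* Real-mode address calculation: the 16-bit segment is multiplied by 16
 * (shifted left by 4 bits) and the 16-bit offset is added to it.
 * The mask models a CPU with only 20 address lines (an assumption of this
 * sketch): results above 1MB wrap around to low memory. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    uint32_t addr = ((uint32_t)segment << 4) + offset;
    return addr & 0xFFFFFu;
}
```

For example, real_mode_phys(0xB800, 0x0000) gives 0xB8000. Note that many segment:offset pairs name the same byte: 0x1000:0x0010 and 0x1001:0x0000 both resolve to 0x10010.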



32-bit days

Years passed and we got 32-bit processors. The 16-bit AX became the 32-bit EAX, and memory wasn’t flat any more (for simplicity, let’s skip the 286). Protected mode with paging and proper access restrictions replaced good old flat real mode. Using 32-bit registers you could address up to 4GB of RAM, which sounded like more than enough. Almost no one cared about segment registers any more. The funny thing is that they are still there, deep in your CPU, but forgotten. For years they were used only by the OS kernel, but recently they started to play a completely new role.

First let me explain how they behave in protected mode. Unlike the general-purpose registers, segment registers haven’t been extended to 32 bits; they still hold only a 16-bit value. In protected mode, instead of pointing to the segment’s memory location directly, they store a descriptor number, which points to memory via the Local Descriptor Table (LDT) data structure. From our point of view this data structure has two interesting fields: base_addr and limit. The former stores the address of the beginning of the segment, the latter sets the segment length.

There is a syscall which allows access to this data structure on a per-process basis: modify_ldt on Linux (BSD and Mac OS X offer equivalents under other names). It doesn’t require any special permissions; every process can freely play with its own LDT descriptors (this is safe due to paging).

Even though segment registers changed their meaning, they are still used by the processor: every access to memory is affected by the currently selected descriptor and the LDT entry it points to.

By default, when a process starts, the segment registers point to a descriptor that sets up a flat view of the process memory. That’s how Unix processes work and what compilers assume.

But the flat view of memory is just a default; it can be modified on a per-process basis via the segment machinery.
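To make the role of base_addr and limit concrete, here is a small model of what the processor does on every memory access. It is a simulation under simplifying assumptions, not real LDT code: the struct layout and the name translate are invented for this sketch, and limit is treated as the highest valid offset within the segment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of a segment descriptor, keeping only the two fields
 * discussed in the text. Real descriptors carry more (type, privilege,
 * granularity) and are packed differently. */
struct descriptor {
    uint32_t base_addr; /* address where the segment begins */
    uint32_t limit;     /* highest valid offset (assumption of this model) */
};

/* Translate a segment-relative offset into an address.
 * Returns false when the offset lies beyond the limit; real hardware
 * raises a fault in that case instead of returning a value. */
bool translate(const struct descriptor *d, uint32_t offset, uint32_t *out)
{
    if (offset > d->limit)
        return false;
    *out = d->base_addr + offset;
    return true;
}
```

With a flat descriptor (base_addr 0, limit 0xFFFFFFFF) every offset maps to itself, which is the default view described above. With base_addr 0x10000000 and limit 0xFFFF, offset 0x10 lands at 0x10000010, while offset 0x10000 is rejected.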

Boxes of sand

The segmentation mechanism was recently rediscovered as a way to do intra-process sandboxing, in other words, to run foreign binary code safely inside a trusted process.

Segmentation is a very nice mechanism to restrict memory access without the burden of the whole operating system machinery. To restrict the memory access of some code fragment to a certain “subspace”, it is enough to modify the descriptors via the modify_ldt syscall and run the code with the segment registers pointing at them. While running, such a confined code fragment cannot affect any of the neighbouring code inside the same OS process.

Unfortunately a real sandbox implementation is more complicated. Apart from setting up the LDT, it needs to go through the machine code and check for processor instructions that could escape from the sandbox. For example, it’s necessary to disallow modification of the segment registers. This technique leads to a stable mechanism which can be used to run arbitrary i386 machine code without noticeable performance degradation.
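The checking step might look roughly like the sketch below. It assumes a hypothetical decoder has already split the machine code into instructions and stripped prefixes; struct decoded_insn and both function names are invented here. The opcode list covers only direct loads of segment registers and is not exhaustive (far jumps, far calls and far returns, which change CS, are left out), so this is an illustration of the idea and not a usable validator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder output: one entry per instruction. */
struct decoded_insn {
    bool    two_byte_map; /* opcode was preceded by the 0x0F escape byte */
    uint8_t opcode;       /* opcode byte, prefixes already stripped */
};

/* True for instructions that load a segment register directly.
 * Incomplete on purpose: far jmp/call/ret are not modelled. */
static bool writes_segment_register(const struct decoded_insn *in)
{
    if (in->two_byte_map) {
        switch (in->opcode) {
        case 0xA1: /* pop fs */
        case 0xA9: /* pop gs */
        case 0xB2: /* lss */
        case 0xB4: /* lfs */
        case 0xB5: /* lgs */
            return true;
        default:
            return false;
        }
    }
    switch (in->opcode) {
    case 0x8E: /* mov Sreg, r/m16 */
    case 0x07: /* pop es */
    case 0x17: /* pop ss */
    case 0x1F: /* pop ds */
    case 0xC4: /* les */
    case 0xC5: /* lds */
        return true;
    default:
        return false;
    }
}

/* Reject the whole fragment if any instruction touches a segment register. */
bool code_is_acceptable(const struct decoded_insn *code, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (writes_segment_register(&code[i]))
            return false;
    return true;
}
```

Working on decoded instructions instead of raw bytes matters: the byte 0x8E can also appear as an immediate operand, or after the 0x0F escape as a conditional jump, and a plain byte scan would flag both by mistake.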

This idea is used by Google Native Client to run foreign binaries natively, despite the fact that they’re downloaded from an untrusted source. Apart from this browser-specific project, there’s also a very nice and lightweight VX32 library.