All programmers know about multicore parallelism: your CPU is made of several nearly independent processors (called cores) that can run instructions in parallel. However, our processors are parallel in many different ways. I am interested in a particular form of parallelism called “memory-level parallelism” where the same processor can issue several memory requests. This is an important form of parallelism because current memory subsystems have high latency: it can take dozens of nanoseconds or more between the moment the processor asks for data and the time the data comes back from RAM. The general trend has not been a positive one in this respect: in many cases, the more advanced and expensive the processor, the higher the latency. To compensate for the high latency, we have parallelism: you can ask for many data elements from the memory subsystems at the same time.

In earlier work, we showed that current Intel processors (Skylake microarchitecture) are limited to about ten concurrent memory requests whereas Apple’s A12 processor scale to 40 or more memory requests.

Intel just released a more recent microarchitecture (cannonlake) and we have been putting it to the test. Is Intel improving?

It seems so. In a benchmark where you randomly access a large array, using a number of separate paths (which I call “lanes”), we find that the cannonlake processor appears to support twice as many concurrent memory requests as the skylake processors.

The Skylake processor has lower latency (70 ns/query) compared to the Cannonlake processor (110 ns/query). Nevertheless, the Cannonlake is eventually able to beat the Skylake processor in bandwidth by a wide margin (12 GB/s vs 9 GB/s).

The story is similar to the Apple A12 experiments.

This suggests that even though future processors may not have lower latency when accessing memory, we might be better able to hide this latency through more parallelism.

Even if you are writing single-threaded code, you ought to think more and more about parallelism.

Our code is available.

Credit: Though all the mistakes are mine, this is joint work with Travis Downs.