
Yesterday I put the new Radeon Pro W5700 through its paces in the launch article “AMD Radeon Pro W5700 Review – price and performance are right, but it’s enough for the Quadro RTX 4000?”, but it was still unclear to me where the advantage of the new test platform, with the Ryzen 9 3950X, an X570 motherboard and PCIe Generation 4, actually lies. Today I know more, and I think it is important to publish what I have learned. The results range from “it’s no use” to “well, nice to have” to “wow, look at that”. In short, it is a mixed bag: sometimes PCIe 4.0 is quite useful, and sometimes it is not.

To have a direct comparison and to rule out a motherboard bug when switching modes, I first cross-checked both interface settings of the X570 motherboard with an Nvidia Quadro RTX 6000, even though it is a PCIe 3.0 card. Then I ran the same CPU on an X470 motherboard and again got consistent results with the RTX 6000. With that, nothing stood in the way of the very extensive test run, which takes almost 5 hours.

But what does PCI Express 4.0 really bring, or what could it bring? In theory, PCIe 4.0 doubles the transfer rate per lane from 8 GT/s to 16 GT/s compared to 3.0, which corresponds to about 2 GByte/s per lane. That extra bandwidth is definitely useful for pure PCIe connections to storage and network devices. As we know, NVMe SSDs with M.2 and U.2 connectors are limited to four PCIe lanes, which caps them at roughly 4 GByte/s with PCIe 3.0. However, there is not yet much to read about how graphics cards actually benefit, especially not from daily practice.
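The per-lane figures above can be checked with a little arithmetic. The following is only a rough sketch: it assumes that nothing but the 128b/130b line encoding used by PCIe 3.0 and 4.0 is subtracted from the raw transfer rate, and it ignores packet and protocol overhead, so real-world throughput will be lower. The function names are made up for this illustration.

```python
# Back-of-the-envelope PCIe bandwidth estimate (illustrative sketch only).
# Assumption: only the 128b/130b line encoding is subtracted; packet and
# protocol overhead, which lower real-world throughput further, are ignored.

TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}   # raw transfer rate per lane
ENCODING_EFFICIENCY = 128 / 130          # 128b/130b encoding (PCIe 3.0 and 4.0)


def lane_gbyte_per_s(generation: int) -> float:
    """Approximate usable bandwidth of one lane in GByte/s, per direction."""
    return TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY / 8  # 8 bits per byte


def link_gbyte_per_s(generation: int, lanes: int) -> float:
    """Approximate usable bandwidth of a link with the given lane count."""
    return lane_gbyte_per_s(generation) * lanes


if __name__ == "__main__":
    # Print a small table for the common link widths of both generations.
    for generation in (3, 4):
        for lanes in (1, 4, 8, 16):
            bandwidth = link_gbyte_per_s(generation, lanes)
            print(f"PCIe {generation}.0 x{lanes:<2}: {bandwidth:6.2f} GByte/s")
```

By this estimate, a four-lane PCIe 3.0 link for an NVMe SSD comes out at a little under 4 GByte/s, and a PCIe 4.0 x8 link has the same theoretical bandwidth as a PCIe 3.0 x16 link.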

Of course, if you pair a small graphics card like the Radeon RX 5500 XT, which can only use an x8 connection in the PCIe slot, with a suitable PCIe 4.0 board, you will measure a clear boost, at the latest when textures have to be swapped out, and you will notice it subjectively as well. But not everybody uses such a cut-down entry-level card; many use fully connected hardware. That is exactly where today’s test starts, building on the launch article, in which I created demanding workloads that often generate more load than current games do.

To keep things manageable, I selected for this review only those benchmarks or sub-benchmarks from yesterday’s article in which larger differences were actually measurable. Applications like Solidworks or Creo do not put enough load on the interface to show real differences. Their results fall within measurement tolerance, which is why I left these comparisons out. The same applies to pure compute tasks, where bandwidth as such is not the bottleneck. If you still want to compare, you can go back to the launch article with its many tests.

Test System

The benchmark system is the same as in the launch article. The benchmarks were run exclusively on the workstation system, because the power consumption has not changed significantly.

I have also summarized the individual components of the test system in tabular form: