Alongside today’s NVIDIA Studio branding announcement, NVIDIA is also using Computex to update their lineup of Quadro GPUs for notebooks and mobile workstations. Besides bringing some of the existing Quadro RTX desktop parts to the mobile space, the company is launching a new sub-series of parts under the Quadro T name, and finally a pair of new Quadro P series graphics adapters for the low end.

Starting things off, we have the mobile Quadro RTX parts, which are all new for the mobile space. Like NVIDIA’s GeForce mobile counterparts, these Quadro RTX mobile parts are essentially the same chip configurations as their desktop siblings, but put into a mobile form factor and with their TDPs and clockspeeds turned down accordingly. As a result the mobile Quadro RTX parts pack all the features and VRAM of the desktop parts that NVIDIA has previously launched, while retaining a good deal of their performance and all of the Turing architecture's functionality.

NVIDIA Mobile Quadro RTX Spec Comparison

| | RTX 5000 | RTX 4000 | RTX 3000 | P5200 |
|---|---|---|---|---|
| CUDA Cores | 3072 | 2560 | 2304 | 2560 |
| Boost Clock | ~1.53GHz | ~1.56GHz | ~1.39GHz | ~1.74GHz |
| Memory Clock | 14Gbps GDDR6 | 14Gbps GDDR6 | 14Gbps GDDR6 | 8Gbps GDDR5 |
| Memory Bus Width | 256-bit | 256-bit | 192-bit | 256-bit |
| VRAM | 16GB | 8GB | 6GB | 16GB |
| Single Precision Perf. | 9.4 TFLOPs | 8 TFLOPs | 6.4 TFLOPs | 8.9 TFLOPs |
| Tensor Perf. (FP16) | 75.2 TOPs | 63.9 TOPs | 51.4 TOPs | N/A |
| TGP Max Power | 80-110W | 80-110W | 60-80W | 150W |
| GPU | TU104 | TU104 | TU106 | GP104 |
| Transistor Count | 13.6B | 13.6B | 10.8B | 7.2B |
| Architecture | Turing | Turing | Turing | Pascal |
| Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 16nm |
| Launch Date | 05/27/2019 | 05/27/2019 | 05/27/2019 | N/A |

Owing to the tighter TDPs of mobile, NVIDIA’s mobile Quadro RTX stack doesn’t go quite as high as it does on the desktop. For mobile the fastest part is the Quadro RTX 5000, which is based on the same TU104 GPU as the desktop version. This part replaces the Quadro P5200 as NVIDIA’s flagship mobile Quadro part. Meanwhile below that we have the Quadro RTX 4000 and RTX 3000, which appear to be based on a cut-down TU104 and full-fledged TU106 GPU respectively.

In terms of performance, the RTX 5000 will top out at 9.4 TFLOPs, followed by 8 TFLOPs for the RTX 4000 and 6.4 TFLOPs for the RTX 3000. NVIDIA’s peak clockspeeds seem to vary a bit depending on the processor – we’re estimating anywhere from 1.39GHz to 1.56GHz – though these are still fairly aggressive for a mobile part. Sustained performance will be lower, of course, with that varying with the cooling capabilities of the host laptop.
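For readers who want to check these figures, the peak FP32 numbers follow directly from the core counts and our estimated boost clocks. The short Python sketch below assumes the usual convention of one fused multiply-add (two floating-point operations) per CUDA core per clock; the function name and the dictionary of parts are ours, purely for illustration.

```python
def peak_fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPs.

    Assumption: each CUDA core retires one fused multiply-add
    (counted as 2 floating-point operations) per clock.
    """
    return cuda_cores * 2 * boost_ghz / 1000.0


# Core counts and estimated boost clocks (GHz) from the spec table.
parts = {
    "RTX 5000": (3072, 1.53),
    "RTX 4000": (2560, 1.56),
    "RTX 3000": (2304, 1.39),
}

for name, (cores, ghz) in parts.items():
    print(f"{name}: {peak_fp32_tflops(cores, ghz):.1f} TFLOPs")
```

These are peak values at the estimated boost clock only; as noted, sustained throughput in a given laptop will be lower.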

Meanwhile in terms of memory, the situation is again a mirror of the desktop. The RTX 5000 gets 16GB of GDDR6 – a full complement of memory for a mobile TU104 part – while RTX 4000 and RTX 3000 drop down to 8GB and 6GB respectively. NVIDIA continues to treat memory capacity as a feature differentiator between the Quadro and GeForce families and even among Quadro cards, so the 16GB RTX 5000 is a halo part in this respect. The flip side, however, is that RTX 5000 doesn’t improve on its predecessor here as far as capacity goes, as both the old and new cards are 16GB.

It is interesting to note that while performance has gone up and memory capacities have at least held even, power consumption is actually down generation-over-generation. Starting with the mobile Quadro RTX series, NVIDIA is providing a range of max power values instead of a single value, but even at the top of this range, none of these cards passes 110W, well below the 150W that the older P5200 peaked at. The RTX 4000 and RTX 3000 parts don’t see quite the same savings versus their own predecessors, but the range is still there. NVIDIA seems increasingly focused on getting high-end GPUs into ever thinner and lighter notebooks, so bringing down their TDPs is a huge component of how they’re going to get there.
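As a rough illustration of that generational shift, we can divide peak FP32 throughput by the top of each part's power range. This is only a paper comparison built from the table values: peak TFLOPs and maximum TGP are not measured under the same conditions, and the helper name below is ours.

```python
def gflops_per_watt(peak_tflops: float, max_tgp_watts: float) -> float:
    """Peak FP32 GFLOPs per watt, using the maximum TGP as the divisor."""
    return peak_tflops * 1000.0 / max_tgp_watts


# Peak FP32 (TFLOPs) and top-of-range TGP (W) from the spec table.
rtx_5000 = gflops_per_watt(9.4, 110)
p5200 = gflops_per_watt(8.9, 150)

print(f"RTX 5000: {rtx_5000:.1f} GFLOPs/W")
print(f"P5200:    {p5200:.1f} GFLOPs/W")
print(f"Ratio:    {rtx_5000 / p5200:.2f}x")
```

On paper, then, the RTX 5000 delivers noticeably more peak throughput per watt than the P5200, though real-world efficiency will depend on where in the 80-110W range an OEM configures the part.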

Overall, the Quadro RTX series is the flagship series in terms of features. Of particular note here, all of these parts include NVIDIA’s ray tracing hardware acceleration – hence the RTX moniker – so they benefit the most from all of NVIDIA’s efforts to get ray tracing incorporated into various content creation applications. They also have a full tensor core complement for their size, which along with helping RT performance also means they can hold their own in neural network simulations and other tensor-related tasks.

Quadro T Series – T2000 and T1000

Also new to the mobile Quadro family are the Quadro T series parts, the Quadro T2000 and Quadro T1000. These parts slot in below the Quadro RTX parts in terms of performance, power consumption, and features, providing a clear progression downward in terms of price versus functionality.

NVIDIA Mobile Quadro T Spec Comparison

| | T2000 | T1000 | P2000 | P1000 |
|---|---|---|---|---|
| CUDA Cores | 1024 | 768 | 768 | 512 |
| Boost Clock | ~1.71GHz | ~1.69GHz | ~1.56GHz | ~1.56GHz |
| Memory Clock | 8Gbps GDDR5 | 8Gbps GDDR5 | 6Gbps GDDR5 | 6Gbps GDDR5 |
| Memory Bus Width | 128-bit | 128-bit | 128-bit | 128-bit |
| VRAM | 4GB | 4GB | 4GB | 4GB |
| Single Precision Perf. | 3.5 TFLOPs | 2.6 TFLOPs | 2.4 TFLOPs | 1.6 TFLOPs |
| TGP Max Power | 40-60W | 40-50W | 50W | 40W |
| GPU | TU117 | TU117 | GP107 | GP107 |
| Transistor Count | 4.7B | 4.7B | 3.3B | 3.3B |
| Architecture | Turing | Turing | Pascal | Pascal |
| Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | GloFo 14nm | GloFo 14nm |
| Launch Date | 05/27/2019 | 05/27/2019 | N/A | N/A |

Both of the new Quadro T series parts are based on the same TU117 GPU, which is NVIDIA’s smallest Turing architecture GPU. As a result there’s a pretty significant gap in performance between the RTX 3000 and the T2000; peak FP32 performance drops by around 45%. At peak clockspeeds, the T2000 and T1000 deliver around 3.5 TFLOPs and 2.6 TFLOPs of FP32 performance respectively.
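The size of that gap is simple arithmetic on the peak FP32 figures from the two spec tables. A minimal Python sketch (the helper name is ours) makes the calculation explicit:

```python
def relative_drop(higher: float, lower: float) -> float:
    """Fractional drop going from `higher` to `lower`."""
    return (higher - lower) / higher


# Peak FP32 throughput in TFLOPs, from the spec tables.
rtx_3000, t2000, t1000 = 6.4, 3.5, 2.6

print(f"RTX 3000 -> T2000: {relative_drop(rtx_3000, t2000):.0%}")
print(f"T2000 -> T1000:    {relative_drop(t2000, t1000):.0%}")
```

The same calculation puts the step from the T2000 down to the T1000 at roughly a quarter of peak FP32 throughput.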

In terms of memory, both cards come with 4GB of GDDR5, which is clocked at 8Gbps and attached to a 128-bit memory bus. These are very much low-end cards, so it looks like NVIDIA is aiming to be cost-efficient rather than offer more memory, which would start undercutting the RTX 3000 and its 6GB of VRAM. Meanwhile TDPs are down to a max of 60W for the T2000, and a max of 50W for the T1000. These are again ranges, depending on what the laptop OEM designs for, and performance will scale accordingly.

Overall these are Turing parts, but they are based on what we’ve been calling NVIDIA’s “Turing Minor” GPUs. Turing Minor parts have the same core architecture as Turing, and with it all of the performance optimizations and new rasterization/shading features that come with the Turing architecture, but they forgo the ray tracing hardware acceleration and tensor cores. As a result these parts are leaner and meaner, but they are hardly the parts of choice if ray tracing acceleration is needed. This is a level of feature differentiation that past generations of the Quadro family have lacked, since they’ve typically been based on a single, unified GPU architecture (outside of the very cheapest parts).

Quadro P Series Expanded – P620 and P520

Finally, bringing up the rear of the new mobile Quadro product stack are the Quadro P620 and Quadro P520. As hinted at by the name, these parts aren’t Turing based at all. Instead, they are minor refreshes of the existing Pascal-based P600/P500 parts. Since NVIDIA’s Turing GPU stack doesn’t go below the TU117 used in the T2000/T1000, for these smallest and cheapest of parts, NVIDIA instead relies on their bottom-tier Pascal GPUs.

NVIDIA Mobile Quadro P Spec Comparison

| | P620 | P520 | P600 | P500 |
|---|---|---|---|---|
| CUDA Cores | 512 | 384 | 384 | 256 |
| Boost Clock | ~1.46GHz | ~1.43GHz | ~1.56GHz | ~1.46GHz |
| Memory Clock | 6Gbps GDDR5 | 6Gbps GDDR5 | 5Gbps GDDR5 | 5Gbps GDDR5 |
| Memory Bus Width | 128-bit | 64-bit | 128-bit | 64-bit |
| VRAM | 4GB | 2GB | 4GB | 2GB |
| Single Precision Perf. | 1.5 TFLOPs | 1.1 TFLOPs | 1.2 TFLOPs | 0.75 TFLOPs |
| TGP Max Power | 25W | 18W | 25W | 18W |
| GPU | GP107 | GP108 | GP107 | GP108 |
| Transistor Count | 3.3B | 1.8B | 3.3B | 1.8B |
| Architecture | Pascal | Pascal | Pascal | Pascal |
| Manufacturing Process | GloFo 14nm | GloFo 14nm | GloFo 14nm | GloFo 14nm |
| Launch Date | 05/27/2019 | 05/27/2019 | N/A | N/A |

Relative to their immediate predecessors, both the P620 and P520 do see some fairly decent performance bumps, thanks to NVIDIA enabling more CUDA cores this time around. Still, with the fastest part topping out at 1.5 TFLOPs, there’s a clear gap in performance between the P series parts and the new T series parts.

In terms of memory speeds, both the P620 and P520 are receiving GDDR5 clocked at 6Gbps, up from 5Gbps the generation prior. However the P5xx parts all retain their 64-bit memory bus, so unlike its faster 4GB sibling, the cheaper P520 gets just 2GB of VRAM and all of 48GB/sec of memory bandwidth. Meanwhile power consumption is being held constant from the last generation, at 25W and 18W respectively for the P620 and P520.
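The bandwidth figure comes from multiplying the bus width by the per-pin data rate and converting bits to bytes, which gives a result in gigabytes per second. A small Python sketch of that arithmetic, using the values from the spec table (the function name is ours):

```python
def memory_bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/sec.

    bus_width_bits: width of the memory bus in bits (one pin per bit).
    data_rate_gbps: per-pin data rate in Gbit/sec.
    """
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


# (bus width in bits, data rate in Gbps) from the spec table.
parts = {
    "P620": (128, 6),
    "P520": (64, 6),
    "P600": (128, 5),
    "P500": (64, 5),
}

for name, (width, rate) in parts.items():
    print(f"{name}: {memory_bandwidth_gb_per_s(width, rate):.0f} GB/sec")
```

By this math the move from 5Gbps to 6Gbps memory is worth a 20% bandwidth increase on both parts, with the P520 still at half the bandwidth of the P620 because of its narrower bus.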

On the whole, the refreshed Quadro P series parts are meant to serve as the entry-level parts in NVIDIA’s mobile Quadro product stack, and it shows. They will be cheap and will become a common feature in low-end productivity laptops, but they bring equally limited performance, and they don’t come with any of Turing’s new features.