NVIDIA’s most anticipated chip, GP100 aka ‘Big Pascal’, could feature a massive performance upgrade compared to the previous generation.

NVIDIA Pascal GP100: 12 TFLOPS SP, 4 TFLOPS DP?

A presentation dated June 2015, created by Manuel Ujaldon, a Spanish university professor and ‘CUDA Fellow’, reveals that NVIDIA is planning a new chip based on the Pascal architecture that could feature 4 TFLOPS of double precision computing performance and three times higher single precision performance (12 TFLOPS).

This means Pascal could feature a 1/3 DP to SP ratio, similar to the Kepler architecture, but different from the ‘Single Precision Oriented’ Maxwell.

NVIDIA Pascal GP100 Computing Performance Ratio

| GPU | Fabrication Process | Die Size | Native FP64 Rate |
|---|---|---|---|
| GP100 (Big Pascal) | FinFET | ? | 1/3 |
| GM200 (Big Maxwell) | 28 nm | 601 mm² | 1/32 |
| GK110 (Big Kepler) | 28 nm | 551 mm² | 1/3 |
| GF110 (Big Fermi) | 40 nm | 520 mm² | 1/2 |
| GT200 (Big Tesla) | 55 nm | 576 mm² | 1/8 |

More importantly, the computing performance gives us a hint at how many CUDA cores Big Pascal could have. Assuming GP100 has a 1000 MHz core clock and each CUDA core delivers 2 FLOPs per clock (one fused multiply-add), the CUDA core count would be 6144, twice as many as GM200. Of course, the numbers shared by the professor are probably not very accurate, but we do believe they are based on information shared by NVIDIA itself, so it won’t hurt to analyze them further.
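As a rough sanity check, the arithmetic behind these figures can be sketched in a few lines of Python. The function names are ours, and the 2 FLOPs per core per clock figure is an assumption (one fused multiply-add per cycle), not something stated in the presentation.

```python
from fractions import Fraction

# Assumption: each CUDA core retires one fused multiply-add (2 FLOPs) per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_sp_gflops(cuda_cores: int, clock_mhz: int) -> Fraction:
    """Theoretical single precision peak in GFLOPS."""
    return Fraction(cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz, 1000)


def peak_dp_gflops(sp_gflops: Fraction, fp64_rate: Fraction) -> Fraction:
    """Double precision peak, given the native FP64 rate (e.g. 1/3)."""
    return sp_gflops * fp64_rate


# Hypothetical GP100 configuration discussed in the text: 6144 cores at 1000 MHz.
sp = peak_sp_gflops(6144, 1000)
dp = peak_dp_gflops(sp, Fraction(1, 3))
print(f"SP: {float(sp):.0f} GFLOPS, DP: {float(dp):.0f} GFLOPS")
```

With 6144 cores at 1000 MHz this lands on roughly 12.3 TFLOPS single precision and 4.1 TFLOPS double precision, in line with the rounded 12 and 4 TFLOPS figures from the slides.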

Here’s an overview of how many cores GP100 could have, assuming:

12 TFLOPS computing performance,

various GPU clock speeds,

Pascal Streaming Multiprocessors featuring 128 CUDA cores each (similar to Maxwell).

NVIDIA Pascal GP100 CUDA Cores prediction (SP Performance = 12288 GFLOPS)

| GPU Base Clock | CUDA Core Count | SMX Count |
|---|---|---|
| 800 MHz | 7680 | 60 |
| 850 MHz | 7168 | 56 |
| 900 MHz | 6784 | 53 |
| 950 MHz | 6400 | 50 |
| 1000 MHz | 6144 | 48 |
| 1050 MHz | 5760 | 45 |
| 1100 MHz | 5504 | 43 |
| 1150 MHz | 5248 | 41 |
| 1200 MHz | 5120 | 40 |
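The prediction above can be reproduced with a short script. This is only a sketch of how we read the numbers: it assumes 2 FLOPs per core per clock and rounds the multiprocessor count down to a whole number, which is the rounding that matches the figures listed. The constant and function names are illustrative.

```python
# Target single precision throughput: 12.288 TFLOPS, expressed in MFLOPS so
# that dividing by a clock in MHz yields FLOPs per clock directly.
TARGET_SP_MFLOPS = 12_288_000
FLOPS_PER_CORE_PER_CLOCK = 2   # assumption: one fused multiply-add per clock
CORES_PER_SM = 128             # assumption from the text: same as Maxwell


def predict(clock_mhz: int) -> tuple[int, int]:
    """Return (cuda_cores, sm_count) needed to reach the target at this clock."""
    # Round down to whole multiprocessors, since partial SMs cannot exist.
    sm_count = TARGET_SP_MFLOPS // (
        FLOPS_PER_CORE_PER_CLOCK * clock_mhz * CORES_PER_SM
    )
    return sm_count * CORES_PER_SM, sm_count


# Sweep base clocks from 800 MHz to 1200 MHz in 50 MHz steps.
rows = [(clock, *predict(clock)) for clock in range(800, 1201, 50)]

for clock, cores, sms in rows:
    print(f"{clock:>5} MHz  {cores:>5} cores  {sms:>3} SMs")
```

Because the multiprocessor count is rounded down, only the 800, 1000 and 1200 MHz rows hit 12288 GFLOPS exactly; the other clocks come in slightly under the target.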

In our predictions we have assumed that 5120 CUDA cores is the most plausible count for GP100. Of course, such a core count would be reserved for the professional line at launch, with a new TITAN successor coming at a later date.

The GP104 could therefore have 20 SMX with 2560 CUDA cores on board, which should be enough to compete against the GTX 980 Ti, as we are also getting HBM and FinFET with next generation GPUs.

Source: 3DCenter, PDF