At the Supercomputing 15 (SC15) conference, NVIDIA representatives revealed new details about the company's next-generation graphics processor, code-named Pascal. The details include the manufacturing process and peak computing power, along with new information on the following generation of NVIDIA GPUs based on the Volta architecture, which is planned for 2017 or 2018 (depending on the market). The new information seems to confirm recent rumors that the first GPU based on the Pascal architecture may appear early next year.

NVIDIA’s Pascal and Volta GPUs Peak Compute Performance Revealed – Volta To Push Memory Bandwidth To 1.2 TB/s

For a long time, reports have suggested that NVIDIA's Pascal chips would use a 16-nanometer manufacturing process. NVIDIA has now officially confirmed that the new GPU will be manufactured on a 16-nanometer FinFET process. Although the company did not name the manufacturer at SC15, it has been confirmed to be TSMC (Taiwan Semiconductor Manufacturing Company). The confirmation of the process is not a great revelation, especially since it was reported in the middle of this year that work on the GP100 GPU was complete and that the design had been handed to TSMC to produce the first engineering samples. All of this is consistent with earlier information that Pascal GPUs will be available from the first half of next year. NVIDIA also mentioned that the new generation will be its most complex GPU design yet, with twice as many transistors. Given that the most powerful Maxwell chip consists of 8 billion transistors, it is easy to calculate that its successor should have at least 16 billion.

Back at GTC 2015, NVIDIA CEO Jen-Hsun Huang said that one of Pascal's major architectural features would be mixed-precision support, which doubles FP16 throughput relative to FP32 by running 16-bit calculations at twice the FP32 rate. Pascal supports FP16, FP32 and FP64, and at SC15 NVIDIA presented peak compute figures, probably for the GP100, which should be the most advanced chip of the new generation. It should be noted that the Maxwell architecture was developed primarily with consumer applications in mind: NVIDIA gave up dedicated FP64 hardware and focused on FP32 performance. For this reason, Maxwell-based graphics processors are not aimed at the HPC (High-Performance Computing) market, where NVIDIA's offering still consists of solutions based on the Kepler architecture.
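To make the precision trade-off behind mixed precision concrete, here is a minimal Python sketch. It is not NVIDIA's implementation: the helper names (`to_fp16`, `to_fp32`, `sum_fp16_accumulator`, `sum_fp32_accumulator`) are hypothetical, and the rounding is emulated on the CPU with the standard `struct` module. It only illustrates why FP16 data is commonly paired with a wider accumulator.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_fp16_accumulator(values):
    """Store the inputs AND the running total in FP16."""
    total = 0.0
    for v in values:
        total = to_fp16(total + to_fp16(v))
    return total


def sum_fp32_accumulator(values):
    """Store the inputs in FP16 but keep the running total in FP32."""
    total = 0.0
    for v in values:
        total = to_fp32(total + to_fp16(v))
    return total


# Illustrative input (an assumption, not data from the article).
values = [1.0] * 4096

fp16_total = sum_fp16_accumulator(values)
fp32_total = sum_fp32_accumulator(values)

# FP16 has an 11-bit significand, so a pure FP16 running sum stops
# growing once adding 1.0 no longer changes the rounded total.
print("FP16 accumulator:", fp16_total)
print("FP32 accumulator:", fp32_total)
```

The FP16-only sum falls short of the true total, while the version with an FP32 accumulator stays exact for this input. That is the general motivation for mixing precisions rather than dropping to 16 bits everywhere.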

What we know so far about the NVIDIA GP100 chip based on the Pascal architecture:

Pascal microarchitecture.

DirectX 12 feature level 12_1 or higher.

Successor to the GM200 GPU found in the GTX Titan X and GTX 980 Ti.

Built on the 16FF+ manufacturing process from TSMC.

Allegedly has a total of 17 billion transistors, more than twice that of GM200.

Taped out in June 2015.

Will feature four 4-Hi HBM2 stacks, for a total of 16GB of VRAM for the consumer variant and 32GB for the professional variant.

Features a 4096-bit memory interface.

Features NVLink and support for mixed-precision FP16 compute tasks at twice the rate of FP32, along with full FP64 support.

2016 release.
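The memory figures in the list can be cross-checked with some simple arithmetic. The sketch below assumes the capacity and bus width are split evenly across the four stacks, and it borrows the 1 TB/s bandwidth figure that NVIDIA quoted for Pascal at SC15. The variable names are illustrative only.

```python
# Figures taken from the list above.
STACKS = 4                  # four HBM2 stacks
DIES_PER_STACK = 4          # "4-Hi" means four DRAM dies per stack
CONSUMER_CAPACITY_GB = 16   # consumer variant
BUS_WIDTH_BITS = 4096       # total memory interface width

# Bandwidth quoted for Pascal at SC15 (1 TB/s, decimal units assumed).
BANDWIDTH_BYTES_PER_S = 1.0e12

# Assumption: capacity and bus width are divided evenly between stacks.
capacity_per_stack_gb = CONSUMER_CAPACITY_GB / STACKS
capacity_per_die_gb = capacity_per_stack_gb / DIES_PER_STACK
bus_bits_per_stack = BUS_WIDTH_BITS // STACKS

# Data rate each pin must sustain to reach the quoted bandwidth.
pin_rate_gbps = BANDWIDTH_BYTES_PER_S * 8 / BUS_WIDTH_BITS / 1e9

print(f"Capacity per stack: {capacity_per_stack_gb:.0f} GB")
print(f"Capacity per die:   {capacity_per_die_gb:.0f} GB")
print(f"Bus per stack:      {bus_bits_per_stack} bits")
print(f"Per-pin data rate:  {pin_rate_gbps:.2f} Gbit/s")
```

Under these assumptions each stack provides 4 GB over a 1024-bit interface, and each pin has to run at just under 2 Gbit/s to reach 1 TB/s.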

Stephen W. Keckler of NVIDIA argues that, where possible, it is best to use only single precision, which brings tangible gains in energy efficiency. This is why mixed precision is so important in the Pascal architecture: it offers a way to balance precision against efficiency. Pascal was designed to bring this capability not only to consumer products but, above all, to the HPC market. That is also why it introduces the new, very fast NVLink interface and UVM (Unified Virtual Memory) addressing, solutions aimed at platforms and software that combine the computing power of the GPU and the CPU.

At Supercomputing 15, NVIDIA announced that Pascal's peak double-precision performance should exceed 4 TFLOPS on a single GPU, with HBM2 memory stacks connected over an interface with a total bandwidth of 1 TB/s. It is worth noting that NVIDIA's current HPC flagship, the Tesla K80, is equipped with not one but two GK210 processors and offers peak double-precision performance of a little over 2 TFLOPS at standard clocks and up to 2.91 TFLOPS at increased frequency (GPU Boost). The Tesla K40 professional card, with a single GK180 processor, reaches 1.43 TFLOPS in double precision. For comparison, AMD's current flagship, the FirePro S9170 with 32 GB of RAM, reaches 2.62 TFLOPS in double precision (FP64). In single precision, NVIDIA Pascal should comfortably cross the 10 TFLOPS barrier.
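The double-precision figures above are easier to compare as ratios. The following sketch treats Pascal's "over 4 TFLOPS" and "over 10 TFLOPS" as lower bounds, so the printed ratios are minimums rather than measured results. The FP16 estimate simply applies the "twice the rate of FP32" claim and is an extrapolation, not a figure NVIDIA quoted.

```python
# Lower bounds quoted for Pascal (peak, single GPU).
PASCAL_FP64_TFLOPS = 4.0
PASCAL_FP32_TFLOPS = 10.0

# Peak FP64 figures for current cards, as given in the article.
current_fp64_tflops = {
    "Tesla K80 (2x GK210, GPU Boost)": 2.91,
    "Tesla K40": 1.43,
    "FirePro S9170": 2.62,
}

# Minimum FP64 advantage of a single Pascal GPU over each card.
speedups = {
    name: PASCAL_FP64_TFLOPS / tflops
    for name, tflops in current_fp64_tflops.items()
}

for name, ratio in speedups.items():
    print(f"Pascal vs {name}: at least {ratio:.2f}x in FP64")

# Extrapolation: FP16 at twice the FP32 rate (an assumption applied here).
pascal_fp16_tflops = 2 * PASCAL_FP32_TFLOPS
print(f"Implied FP16 lower bound: {pascal_fp16_tflops:.0f} TFLOPS")
```

Even against the dual-GPU Tesla K80 running with GPU Boost, a single Pascal GPU would come out ahead on these numbers.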

At SC15, NVIDIA also presented the peak performance of GPUs based on the Volta architecture, which should reach up to 7 TFLOPS in double precision (FP64). Volta will be a big step towards systems offering many PFLOPS of computing power, such as the supercomputers planned for Oak Ridge National Laboratory (Summit) and Lawrence Livermore National Laboratory (Sierra). Both supercomputers are to exceed 100 PFLOPS and will consist of a few thousand nodes, each with more than 40 TFLOPS of computing power.
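The supercomputer figures can be sanity-checked in a few lines. This sketch uses the round numbers from the paragraph above (100 PFLOPS per system, 40 TFLOPS per node, 7 TFLOPS FP64 per Volta GPU). It also assumes, purely for illustration, that all of a node's performance comes from its GPUs, which the article does not state.

```python
import math


def nodes_needed(system_pflops: float, node_tflops: float) -> int:
    """Number of nodes required to reach a system-level peak."""
    return math.ceil(system_pflops * 1000 / node_tflops)


def gpus_per_node(node_tflops: float, gpu_tflops: float) -> int:
    """GPUs per node, ASSUMING the GPUs supply all of the node's FLOPS."""
    return math.ceil(node_tflops / gpu_tflops)


SYSTEM_PFLOPS = 100.0   # "in excess of 100 PFLOPS"
NODE_TFLOPS = 40.0      # "over 40 TFLOPS" per node
VOLTA_FP64_TFLOPS = 7.0  # "up to 7 TFLOPS" per GPU

node_count = nodes_needed(SYSTEM_PFLOPS, NODE_TFLOPS)
gpu_count = gpus_per_node(NODE_TFLOPS, VOLTA_FP64_TFLOPS)

print(f"Nodes for {SYSTEM_PFLOPS:.0f} PFLOPS: {node_count}")
print(f"Volta GPUs per node (rough estimate): {gpu_count}")
```

The result, about 2,500 nodes, matches the article's "a few thousand". The GPUs-per-node figure is only a rough estimate under the stated assumption.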

In explaining the energy efficiency of next-generation GPUs, Stephen W. Keckler also showed how important a role HBM memory plays, as it will be used in both the Pascal and Volta architectures. GPUs based on the latter will use second-generation HBM with bandwidth of up to 1.2 TB/s. Keckler pointed out that moving to HBM2 will increase power consumption: HBM2 memory running at 1.2 TB/s would add 60 W to the graphics processor's TDP, whereas the HBM1 used in AMD's Fiji consumes about 25 W. Designs like this would hardly be energy-efficient, not only in the consumer market but even in HPC. NVIDIA therefore wants to get ahead of the problem and is analyzing new ways to meet growing bandwidth demand without the rising power cost of HBM. In 2020, the company may use a completely new memory architecture that solves the problem.
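Keckler's point can be expressed as energy per transferred bit. The sketch below treats the quoted wattages as the power spent on memory traffic at full bandwidth, which is a simplification. The 512 GB/s figure for Fiji's HBM1 is not from the article and is added here only for comparison, and the 4 TB/s target is purely hypothetical.

```python
def picojoules_per_bit(power_watts: float, bandwidth_bytes_per_s: float) -> float:
    """Energy cost of moving one bit, given power at full bandwidth."""
    bits_per_second = bandwidth_bytes_per_s * 8
    return power_watts / bits_per_second * 1e12


def memory_power_watts(bandwidth_bytes_per_s: float, pj_per_bit: float) -> float:
    """Power needed for a given bandwidth at a fixed energy per bit."""
    return bandwidth_bytes_per_s * 8 * pj_per_bit * 1e-12


# HBM2 at 1.2 TB/s adding 60 W (figures from the article).
hbm2_pj_per_bit = picojoules_per_bit(60.0, 1.2e12)

# HBM1 on AMD Fiji: about 25 W (article) at 512 GB/s (assumption, not in the article).
hbm1_pj_per_bit = picojoules_per_bit(25.0, 512e9)

# Hypothetical future target: 4 TB/s at the same energy per bit as HBM2.
projected_watts = memory_power_watts(4.0e12, hbm2_pj_per_bit)

print(f"HBM2: {hbm2_pj_per_bit:.2f} pJ/bit")
print(f"HBM1: {hbm1_pj_per_bit:.2f} pJ/bit")
print(f"4 TB/s at HBM2 efficiency: {projected_watts:.0f} W")
```

On these rough numbers the energy per bit barely changes between the two HBM generations, so power grows almost in step with bandwidth. That would explain why a different memory architecture may be needed to keep scaling.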

It seems that the company has a very specific plan, not only for the consumer market but also for strengthening its position in the HPC market. In the latter case, it appears that the CPU no longer stands a chance, certainly not x86. As for GPUs, NVIDIA already has a clearly dominant position in the consumer and workstation markets. Large-scale supercomputers are apparently its next goal.

Source: Wccftech, Nvidia, 3DCenter, ed. incl.