NVIDIA® Tesla® GPUs deliver supercomputing performance at a lower power, lower cost, and using many fewer servers than standard CPU-only compute systems.

NVIDIA GPUs power the world’s leading supercomputers. Microway designs customized GPU clusters, servers, and WhisperStations based on NVIDIA Tesla and NVIDIA Quadro® GPUs. We have been selected as the vendor of choice for a number of NVIDIA GPU Research Centers, including Carnegie Mellon University, Harvard, Johns Hopkins and Massachusetts General Hospital.

NVIDIA A100 – A Massive Leap for Every Workload
Integrated in Microway Navion and NumberSmasher GPU Servers

Specifications:

Up to 9.7 TFLOPS double- and 19.5 TFLOPS single-precision floating-point performance

Up to 19.5 TFLOPS Tensor Core double-precision (FP64) and 156 TFLOPS Tensor Core single-precision (TF32) floating-point performance

Up to 312 Tensor TFLOPS of Deep Learning Performance (FP16 Tensor Core)

NVIDIA Ampere™ GPU architecture

3rd Generation NVIDIA NVLink interface, with 600GB/sec bi-directional bandwidth to the GPU (12 NVLink bricks)

6912 FP32 CUDA cores, 432 Ampere Tensor Cores

40GB of on-package HBM2 GPU memory

Memory bandwidth up to 1,555 GB/s

NVIDIA NVLink™ GPU-to-GPU interface and PCI-E Gen4 link to system

Available in double-width PCI-E card and proprietary SXM4 form factors

Passive heatsink only, suitable for specially-designed GPU servers

Tesla V100 – Advanced Datacenter GPU for AI & HPC
Integrated in Microway NumberSmasher and OpenPOWER GPU Servers & GPU Clusters

Specifications:

Up to 7.8 TFLOPS double- and 15.7 TFLOPS single-precision floating-point performance

Up to 125 Tensor TFLOPS of Deep Learning Performance (FP16 Tensor Core)

NVIDIA Volta™ GPU architecture

5120 CUDA cores, 640 Tensor Cores

16GB or 32GB of on-package HBM2 GPU memory

Memory bandwidth up to 900GB/s

NVIDIA NVLink™ or PCI-E x16 Gen3 interface to system

Available with enhanced NVLink interface, with 300GB/sec bi-directional bandwidth to the GPU

Passive heatsink only, suitable for specially-designed GPU servers

Tesla T4 – Price/Performance for AI and Single Precision
Integrated in Microway NumberSmasher and Navion GPU Servers & GPU Clusters

Specifications:

Up to 8.1 TFLOPS single-precision floating-point performance

Up to 65 Tensor TFLOPS of Deep Learning Training Performance; 260 INT4 TOPS of Inference Performance

NVIDIA “Turing” TU104 graphics processing unit (GPU)

2560 CUDA cores, 320 Tensor Cores

16GB of GDDR6 GPU memory

Memory bandwidth up to 320GB/s

PCI-E x16 Gen3 interface to system

Passive heatsink only, suitable for specially-designed GPU servers

Unique features available in the latest NVIDIA GPUs include:

High-speed, on-package HBM GPU memory

NVLink interconnect – speeds up data transfers up to 10X over PCI-Express

Unified Memory – allows applications to directly access the memory of all GPUs and all of system memory

Direct CPU-to-GPU NVLink connectivity – on OpenPOWER systems, supports NVLink transfers between the CPUs and GPUs

ECC memory error protection – meets a critical requirement for computing accuracy and reliability in data centers and supercomputing centers

System monitoring features – integrate the GPU subsystem with the host system’s monitoring and management capabilities, such as IPMI. IT staff can manage the GPU processors in the computing system with widely-used cluster/grid management tools.

Many of the most popular applications already feature GPU support. Your own applications may take advantage of GPU acceleration through several different avenues:

“Drop-in” GPU-accelerated libraries – provide high-speed implementations of the functions your application currently executes on CPUs.

OpenACC / OpenMP compiler directives – allow you to quickly add GPU acceleration to the most performance-critical sections of your application while maintaining portability.

CUDA integrated with C, C++ or Fortran – provides maximum performance and flexibility for your applications. Third-party language extensions are available for a host of languages, including Java, Mathematica, MATLAB, Perl and Python.

Tesla GPU computing solutions fit seamlessly into your existing workstation or HPC infrastructure, enabling you to solve problems orders of magnitude faster.

Call a Microway Sales Engineer for assistance at 508.746.7341, or Click Here to Request More Information.