Cray has new XC40 and CS400 superduper computers using Haswell processors and DataWarp burst buffer tech to keep the Haswell cores crammed with data to process.

The XC40 runs at twice the speed of the existing XC30, courtesy of its Intel Xeon E5-2600 v3 ("Haswell") processors, and scales past a million cores. The architecture implements two processor engines (sockets) per compute node, with four compute nodes per blade. Blades stack in eight pairs (16 to a chassis), and each cabinet can be populated with up to three chassis, meaning 2 x 4 x 16 x 3 = 384 sockets per cabinet.

This delivers up to 6,144 cores and 226 teraflops of performance per cabinet. Cray says: "Future processor upgrades will boost clock frequency and bump the number of embedded cores, accelerating overall system performance."
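For anyone who wants to check the cabinet sums, here is a rough sketch in Python. The packaging counts come from the figures above. The 16 cores per socket is simply 6,144 divided by 384; the 2.3GHz clock and 16 double-precision flops per core per cycle are our assumptions, not numbers Cray has given here, picked because they land on the quoted 226 teraflops.

```python
# Packaging figures quoted in the article.
SOCKETS_PER_NODE = 2
NODES_PER_BLADE = 4
BLADES_PER_CHASSIS = 16
CHASSIS_PER_CABINET = 3

# Assumptions (not stated by Cray in this article).
CORES_PER_SOCKET = 16      # inferred: 6,144 cores / 384 sockets
CLOCK_GHZ = 2.3            # assumed base clock
FLOPS_PER_CYCLE = 16       # assumed double-precision flops per core per cycle


def sockets_per_cabinet():
    """Multiply out the node, blade and chassis counts."""
    return (SOCKETS_PER_NODE * NODES_PER_BLADE
            * BLADES_PER_CHASSIS * CHASSIS_PER_CABINET)


def cores_per_cabinet():
    return sockets_per_cabinet() * CORES_PER_SOCKET


def peak_teraflops_per_cabinet():
    """Peak = cores x clock (GHz) x flops per cycle, converted to teraflops."""
    gigaflops = cores_per_cabinet() * CLOCK_GHZ * FLOPS_PER_CYCLE
    return gigaflops / 1000.0


if __name__ == "__main__":
    print(sockets_per_cabinet(), "sockets")
    print(cores_per_cabinet(), "cores")
    print(round(peak_teraflops_per_cabinet()), "peak teraflops (approx)")
```

Change the assumed clock or core count and the peak figure moves in proportion, which is the gist of Cray's line about future processor upgrades.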

DataWarp is an application I/O accelerator using flash memory directly connected to the XC40 compute nodes - PCIe flash caching, basically. It gets data from storage and feeds it fast to the hungry Haswells. Cray claims this means it meets "the worst case data I/O surge needs".

DataWarp's PCIe-connected I/O blades, fitted with SSDs, slot into the XC40's banks of compute blades, and all are connected via the Aries HPC interconnect. Cray says: "High bandwidth can be delivered with virtually no impact on other I/O executing in the system, ensuring quality of service and sustained bandwidth to specific applications."

Gary Grider, the High Performance Computing Division Leader at Los Alamos National Lab, said: "The Cray XC40 Trinity system will provide the first multi-petabyte, multi-terabyte-per-second burst handling capability ever."

Both DataDirect Networks and EMC have been working on burst buffer tech at Los Alamos, and it looks like Cray has leapfrogged them.

The XC40 and CS400 systems will be available with NVIDIA Tesla GPU accelerators and Intel Xeon Phi coprocessors to provide even more processing punch.

Cray XC40

Other XC40 features include:

Aries ASIC-based system interconnect for compute and I/O nodes on the XC40 base blades using a PCIe Gen3 host interface

Aries' Dragonfly network topology, with which all processors are linked to all other processors (with no more than five hops between any two)

A better cooling system

HPC-optimised programming environment
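
To see where a five-hop ceiling can come from, here is a toy model in Python. It is a sketch under our own assumptions, not Cray's actual wiring: routers inside a group sit on a small grid and link directly to everything in their row and column, and every pair of groups shares one global link. The group count and grid size are made up for illustration. The worst case is then two local hops to reach the right global link, one global hop, and two local hops at the far end.

```python
from collections import deque
from itertools import combinations


def build_toy_dragonfly(groups=4, rows=2, cols=3):
    """Return an adjacency dict for a toy dragonfly-style network.

    Routers are (group, row, col) tuples. All sizes are hypothetical.
    """
    adj = {(g, r, c): set()
           for g in range(groups)
           for r in range(rows)
           for c in range(cols)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    # Local links: all-to-all along each row and each column of a group.
    for a, b in combinations(list(adj), 2):
        if a[0] == b[0] and (a[1] == b[1] or a[2] == b[2]):
            link(a, b)

    # Global links: one per pair of groups, spread over the routers
    # of each group in an arbitrary but deterministic way.
    per_group = rows * cols
    for ga, gb in combinations(range(groups), 2):
        ia, ib = gb % per_group, ga % per_group
        link((ga, ia // cols, ia % cols), (gb, ib // cols, ib % cols))
    return adj


def max_hops(adj):
    """Breadth-first search from every router; return the longest shortest path."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        if len(dist) != len(adj):
            raise ValueError("network is not fully connected")
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    network = build_toy_dragonfly()
    print("routers:", len(network))
    print("worst-case hops within five:", max_hops(network) <= 5)
```

The point of the layout is that the hop bound holds however many groups are added, as long as every pair of groups keeps a direct global link.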

The CS400 clustered supercomputers use industry-standard server building blocks - blades or rackmount - and come with air or liquid cooling in the CS400-AC and CS400-LC systems respectively. They scale past 11,000 compute nodes and 40 peak petaflops.

Cray CS400-AC

Cray says the CS400s "can be tailored to multiple purposes - from an all-purpose massively parallel HPC cluster, to one suited for shared memory parallel tasks, to a cluster optimised for hybrid compute and data-intensive workloads."

They are integrated with Cray’s HPC software stack and include tools compatible with most open source and commercial compilers, schedulers, and libraries. Cray's Advanced Cluster Engine (CACE) is a management software suite providing network, server, cluster and storage management capabilities.

Cray HPC software stack

Cray has won a contract to provide an XC40 supercomputer to the Swiss National Supercomputing Centre (CSCS) in Lugano.

Get a CS400-AC brochure here (PDF) and corresponding liquid-cooled system brochure here (PDF). Still hungry for info? Get a DataWarp brochure here (PDF).

The Cray XC40 and CS400 systems are available now, likely priced in the tens of millions. We'll update if we get clarity on pricing. What's a few million between friends, right? ®