Amazon has announced a new cloud product aimed at bringing cloud-style scaling and flexibility to high-performance computing (HPC) applications. In short, the company has released a classic grid computing product that's based on its cloud offerings. That is roughly the reverse of the usual grid-to-cloud evolution that we described in our introduction to cloud computing. A quick look at the contrast between a cloud and an HPC-style grid will make it clear what Amazon has done and why.

The cloud model that Amazon uses for EC2 consists of multiple compute nodes, loosely coupled, with each node running a collection of tasks from different clients. Here's a depiction of this model, drawn from the aforementioned introduction.

The cloud. Different colored jobs belong to different clients. (One of those jobs belongs to your 18-year-old nephew.)

Cloud tasks are short-lived, lightweight with respect to compute intensity, and agnostic about the underlying node hardware. Clients bring up multiple tasks on different nodes, and multiple clients often share each node.

The cloud model described above works well for a large range of workloads, especially Web applications. But it's not the only way to use the underlying infrastructure. Amazon CTO Werner Vogels, in his blog post announcing the new Cluster Compute Instances (CCI), says that some larger clients have been using EC2 in a grid-like manner for high-performance computing since it first launched. Specifically, companies in financial services, pharmaceuticals, and a few other sectors have been running large, multinode tasks on EC2, even though the underlying infrastructure is better suited to the inverse (i.e., multitask nodes).

Take a look at the grid model below, and contrast it to the cloud model above:

The grid. Different colored jobs belong to different clients. (One of those jobs belongs to the Department of Defense.)

You can see from the diagram that grid tasks are very large and utilize multiple nodes, ganged together. Where cloud tasks are short-lived, lightweight, and hardware-agnostic, grid tasks have the opposite characteristics—they're longer-lived (often running as batch processes), heavy with respect to compute intensity, and benefit greatly from being optimized for the underlying node hardware.
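For readers who think in code, the two placement styles can be sketched as a toy model. This is only an illustration, not how EC2 actually schedules anything: the `Node` class, the four-slot capacity, and both placement functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    slots: int = 4  # hypothetical capacity: task slots per node (an assumption)
    tasks: list = field(default_factory=list)

    def free(self) -> int:
        return self.slots - len(self.tasks)


def place_cloud_task(nodes, client):
    """Cloud style: drop one small task on the node with the most free slots."""
    node = max(nodes, key=Node.free)  # ties go to the earliest node in the list
    if node.free() == 0:
        raise RuntimeError("no free slot")
    node.tasks.append(client)
    return node.name


def place_grid_job(nodes, client, width):
    """Grid style: gang `width` completely idle nodes together for one job."""
    idle = [n for n in nodes if not n.tasks]
    if len(idle) < width:
        raise RuntimeError("not enough idle nodes")
    chosen = idle[:width]
    for node in chosen:
        node.tasks = [client] * node.slots  # the job owns every slot on the node
    return [n.name for n in chosen]


# Cloud: many small tasks from several clients end up sharing each node.
cloud = [Node(f"c{i}") for i in range(2)]
for client in ["red", "blue", "green", "red", "blue"]:
    place_cloud_task(cloud, client)

# Grid: one large job takes over several whole nodes at once.
grid = [Node(f"g{i}") for i in range(4)]
big_job_nodes = place_grid_job(grid, "dod", width=3)
```

In the cloud half, each node ends up holding tasks from more than one client, and each client's tasks are spread across nodes. In the grid half, a single job claims three whole nodes and leaves the fourth idle for someone else.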

To make a grid work really well, the nodes need to be much more tightly coupled than they are in a cloud model, which means that they need higher-bandwidth links between them. This is why Amazon's CCI nodes have 10Gbps non-blocking I/O links connecting them. Amazon is also guaranteeing that CCI nodes will consist of a pair of Xeon X5570s, so that HPC developers can optimize for that specific hardware.
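To get a feel for why link speed matters, here is a back-of-the-envelope calculation. It is a rough sketch only: it assumes an ideal link with no latency or protocol overhead, the 10GB payload is a made-up figure, and the 1Gbps comparison link is a hypothetical baseline rather than a claim about any particular EC2 instance type.

```python
def transfer_seconds(gigabytes, link_gbps):
    """Ideal time to move `gigabytes` (decimal GB) over a `link_gbps` link."""
    bits = gigabytes * 8e9  # 1 GB = 8 * 10^9 bits
    return bits / (link_gbps * 1e9)


payload_gb = 10  # hypothetical amount of data exchanged between two nodes

fast = transfer_seconds(payload_gb, 10)  # the 10Gbps link quoted for CCI nodes
slow = transfer_seconds(payload_gb, 1)   # hypothetical 1Gbps link for comparison

print(f"10Gbps: {fast:.0f}s, 1Gbps: {slow:.0f}s")
```

A tightly coupled job that exchanges data between nodes at every step pays that cost over and over, which is why a tenfold difference in bandwidth can dominate the total run time.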

Amazon is pitching this CCI for EC2 product as a kind of "mid-range HPC" platform, for companies that are currently running dedicated HPC clusters in-house but would like to cut down on their overhead costs by getting those cycles from a service provider. Not only can the companies migrate to EC2 from their in-house hardware, but if they're already using EC2, they'll benefit from the fact that CCI instances are managed exactly like regular EC2 instances.