AMD has announced the release of the first OpenCL SDK for x86 CPUs, which will enable developers to target x86 processors with the kind of OpenCL code that's normally written for GPUs. In a way, this is a reversal of the normal "GPGPU" trend, in which programs that run on a CPU are modified to run in whole or in part on a GPU.

Why would you want to run GPU programs on a CPU? Debugging is one reason, if you don't have access to an OpenCL-compliant GPU. And for now, that's essentially what you'll be doing, since the new SDK doesn't yet appear to be able to target GPUs. But eventually, developers will be able to write in OpenCL and target multicore x86 CPUs alongside GPUs from NVIDIA, AMD, and Intel. Of course, when you can write once and target a variety of parallel hardware types, the fact that Larrabee runs x86 will be irrelevant; so Intel had better be able to scale up Larrabee's performance, because its x86 support will not be a selling point (at least for Larrabee as a GPU, though an HPC coprocessor might be a different story).
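To make the "same code, different hardware" idea concrete, here is a minimal conceptual sketch in Python. It is not OpenCL and not AMD's SDK; the names `saxpy_kernel`, `run_serial`, and `run_chunked` are hypothetical, chosen only for illustration. The kernel is written the way an OpenCL kernel is, as the work for a single element identified by an index (in OpenCL C that index would come from `get_global_id`), and two stand-in "runtimes" then decide how to spread those indices over the hardware.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(gid, a, x, y, out):
    """Work for ONE element, analogous to the body of an OpenCL kernel."""
    out[gid] = a * x[gid] + y[gid]


def run_serial(kernel, global_size, *args):
    """Hypothetical single-core runtime: visit every index in a plain loop."""
    for gid in range(global_size):
        kernel(gid, *args)


def run_chunked(kernel, global_size, workers, *args):
    """Hypothetical multicore runtime: each worker takes a strided slice of indices."""
    def run_chunk(start):
        for gid in range(start, global_size, workers):
            kernel(gid, *args)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(run_chunk, range(workers)))


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out_serial = [0.0] * len(x)
out_chunked = [0.0] * len(x)

run_serial(saxpy_kernel, len(x), 2.0, x, y, out_serial)
run_chunked(saxpy_kernel, len(x), 2, 2.0, x, y, out_chunked)

print(out_serial)
print(out_serial == out_chunked)
```

The kernel itself never changes; only the dispatcher does. That is roughly the property that makes a CPU-only implementation useful for debugging: the per-element logic can be checked on whatever hardware happens to be available.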

Note that you can already write once, run anywhere for GPUs and multicore x86, but you'd have to use RapidMind's proprietary middleware layer. Because RapidMind is more than just an API (the middleware does just-in-time compilation targeting whatever hardware is in the system, dynamic load-balancing, and real-time optimization), an OpenCL vs. RapidMind comparison is a little bit apples-to-oranges, but only just a bit.

In reality, few workloads are such that you can break them up in the design phase into parallel chunks so that a middleware layer can dynamically map them to hardware resources at run-time. Certainly there are some problem domains that this works for—finance is one that comes to mind at the moment—but these are very specialized (though profitable) niches. Most of the stuff that ordinary developers will want to do with GPGPU in the medium-term is more mundane and application-specific, like using the GPU to speed up some specific part of a common application in order to give a performance boost vs. the CPU alone. In other words, these common apps don't solve data-parallel, compute-intensive problems—rather, they have specific parts that need acceleration, and if there's a capable GPU available then they can use OpenCL to hand off that part to it.

Note that Snow Leopard will come with an OpenCL implementation that works on both CPUs and GPUs. Ars will have a review when it launches, so stay tuned.