Snow Leopard has been out just a few days, and folks are already trying to benchmark its OpenCL capabilities. While the available tools are barely a few days old, the numbers so far reveal some interesting details about Snow Leopard's OpenCL implementation.

OpenCL, as you may know, is a framework for writing general-purpose (as opposed to graphics-specific) code that can run on the fast, multicore GPUs in today's computers. Unlike NVIDIA's CUDA and ATI's Stream APIs, which are designed to enable GPGPU programming on each company's own GPU hardware, OpenCL is designed to be hardware-agnostic. Code can run on whatever computing resources are available in a given system. That includes integrated GPUs, discrete GPUs, the main CPU, and even other specialized processors.

Apple proposed the spec last year, noting that it planned to build it into Snow Leopard. Apple then joined with the Khronos Group to create a working group to define the spec as an open standard. That work wrapped up last fall, and the 1.0 version was finalized last December. Apple released Snow Leopard last Friday with the first implementation of OpenCL.

In just a couple of days' time, two benchmarking utilities designed to test OpenCL have already appeared. Developer Andreas Michalak has put together a command line utility called OpenCL Benchmark, while Japanese developer "kloku" is porting the AO Bench floating point benchmark to OpenCL with mixed results. Some of the early testing with OpenCL Benchmark, though, shows promise of massive speed-ups in the kinds of calculations that OpenCL is designed for.

Here is one example run of OpenCL Benchmark V205.

The example run shown on the download page for OpenCL Benchmark has the benchmark running 12 times slower on a 3.2GHz Core 2 Duo than on an NVIDIA GeForce 9600M GT. That's fast. However, more typical results show OpenCL code running about four to five times faster on a GPU than on a Core 2 Duo. One result with a Nehalem-based Mac Pro shows the code can run slightly faster on those systems' CPUs. Even so, given a large enough set of parallel tasks, the OS can spread the work around as needed, and in real-world use some of the CPU's cores would likely be busy with other tasks.
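To make the ratios above concrete, here is a minimal Python sketch of how a speed-up figure is derived from two run times. The timings are hypothetical values picked only to reproduce the quoted ratios; they are not measurements from OpenCL Benchmark, and the `speedup` helper is an illustrative name rather than part of any tool mentioned here.

```python
def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if baseline_seconds <= 0 or accelerated_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / accelerated_seconds


# Hypothetical timings (assumptions, not benchmark output).
cpu_seconds = 60.0          # imagined Core 2 Duo run
gpu_seconds_best = 5.0      # imagined GPU run matching the 12x example
gpu_seconds_typical = 13.0  # imagined GPU run in the "four to five times" range

print(f"best case: {speedup(cpu_seconds, gpu_seconds_best):.1f}x")
print(f"typical:   {speedup(cpu_seconds, gpu_seconds_typical):.1f}x")
```

Saying the CPU is "12 times slower" and saying the GPU is "12 times faster" describe the same ratio, just read from opposite ends.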

Early tests also reveal an interesting tidbit about Snow Leopard's implementation. Though Snow Leopard doesn't seem to enable dual GPUs or on-the-fly GPU switching for machines using the NVIDIA GeForce 9400M chipset (a limitation carried over from Leopard), it does appear that the OS can use both GPUs as OpenCL resources simultaneously. So even if you have the 9600M GT enabled on your MacBook Pro, if an application contains OpenCL code, Snow Leopard can send that code to be processed by the 16 GPU cores sitting pretty much dormant in the 9400M. The converse is not true, though: when running a MacBook Pro with just the 9400M enabled, the 9600M GT is shut down entirely to save power and can't be used as an OpenCL resource.
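The asymmetry is easy to lose track of, so here is a toy Python model of the behavior the early tests suggest. It is only a sketch of the rule as described in this article: `opencl_gpus` is a hypothetical helper, not an Apple or OpenCL API, and it deliberately ignores the CPU, which OpenCL can also target.

```python
from typing import List

INTEGRATED = "GeForce 9400M"
DISCRETE = "GeForce 9600M GT"


def opencl_gpus(active_gpu: str) -> List[str]:
    """Return the GPUs that appear usable for OpenCL, given the active GPU."""
    if active_gpu == DISCRETE:
        # The 9400M stays powered, so both GPUs can take OpenCL work.
        return [DISCRETE, INTEGRATED]
    if active_gpu == INTEGRATED:
        # The 9600M GT is shut down to save power and is unavailable.
        return [INTEGRATED]
    raise ValueError(f"unknown GPU: {active_gpu}")


for gpu in (DISCRETE, INTEGRATED):
    print(f"{gpu} active -> OpenCL GPUs: {', '.join(opencl_gpus(gpu))}")
```

In other words, the higher-performance graphics setting is the only one in which both GPUs show up as OpenCL resources.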

Of course, OpenCL is still in its infancy, and developers will have to update their software to take advantage of the benefits that OpenCL can offer. But the trend in computing seems to be to throw more and more processors at increasingly complex computing problems, so OpenCL will be a handy tool for developers to have at their disposal to make better use of all this processing power.
