Joining us today is James McCombe, the technical visionary behind Caustic Graphics and one of the company's three founders. Most recently James was the chief architect of Apple's next-generation embedded rasterization algorithms, the basis of the rendering and compositing technology used in the iPhone and iPod. He was also a lead architect for Apple's OpenGL graphics system, and worked with the OpenGL standards committee to create early specifications for programmable shading languages.

Dr. Dobb's: James, what is ray tracing and why is it important?

JM: Ray tracing is a method of using "virtual light rays" to determine visibility between two points in a geometric scene. This allows for the modeling of physically accurate lighting phenomena such as reflection, refraction, soft shadowing, and global illumination, just to name a few.

The simplest example involves starting with a pixel on the display. We trace a ray outward until we encounter the first object in the scene that the ray intersects. Then, that object sets the color of the pixel to the color of the object. The object could then cast more rays into the scene based on the properties of the material that the object is simulating. These additional rays could be used to determine if the object is in shadow, to calculate a glossy reflection, to evaluate translucency, etc.
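
The primary-ray step described above can be sketched in a few lines. This is only an illustration, not Caustic's implementation: the scene is assumed to be a list of spheres, and the names `hit_sphere`, `trace`, and `BACKGROUND` are invented for this example. A fuller tracer would spawn the shadow, reflection, and translucency rays from the hit point.

```python
import math

BACKGROUND = (0.0, 0.0, 0.0)  # assumed color for rays that hit nothing

def hit_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the sphere's near surface, or None.

    Only the near root of the quadratic is considered, so a ray that starts
    inside a sphere is treated as a miss in this sketch.
    """
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t > 1e-6 else None

def trace(origin, direction, spheres):
    """Return the color of the first sphere the ray intersects, else BACKGROUND."""
    nearest_t, color = None, BACKGROUND
    for center, radius, sphere_color in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (nearest_t is None or t < nearest_t):
            nearest_t, color = t, sphere_color
    return color
```

Calling `trace` once per pixel, with a direction derived from the pixel's position, gives the flat-colored image that the secondary rays would then refine.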

After many millions of rays bounce around the scene every which way, a beautiful image results.

Ray tracing is a much more intuitive rendering paradigm, and it gives the artist and programmer high-quality images with relative ease compared to rasterization-based techniques, which require many strange tricks to create seemingly natural phenomena. Unfortunately, ray tracing has traditionally been orders of magnitude slower, so it wasn't fit for interactive applications. In addition, ray tracing has never had a general-purpose API to abstract it from the underlying hardware, as OpenGL and DirectX have provided for rasterization.

We believe ray tracing will gradually supersede rasterization as the most popular rendering technique once it can be performed at acceptable speeds and has a familiar, hardware-independent programming interface.

Dr. Dobb's: How does ray tracing differ from rasterization?

JM: As I mentioned earlier, ray tracing can allow all of the objects in the scene to interact with each other by casting light rays between them. These complex interactions create the difference between visually pleasing images and images that are obviously "computer generated". For example, light that bounces off of a red book on a shelf can produce a reddish tint on a white wall behind it.

Rasterization, by contrast, streams all of the geometry through the pipeline one object at a time. This means that the red book has no way to know that it should tint the white wall, and the white wall has no way to know about the red book.

To overcome some of the limitations of rasterization, many content creators use tricks and pre-baking. These can yield good results, but at a cost of labor and scene dynamism. Tens of thousands of hours and millions of dollars are spent creating the complex multi-pass renderers behind high-production-value video games. And each additional interaction needs to be coded into the game engine. Nothing just "falls out". Nothing is free.

And these hacks are heavily specialized to the scene they are rendering. Many of the supporting assets for an object (shadow maps, reflection maps, etc.) need to be recreated for every environment that the object may find itself inside of. You could not transplant a car in a racing game from an Alpine track to a City track without doing substantial work to regenerate many of its assets.

Dr. Dobb's: Caustic Graphics, the company you founded, has developed a ray-tracing co-processor. What are the advantages of the co-processor approach, as opposed to a software-only ray tracer?

JM: In a word, efficiency.

Today's GPGPUs have a tremendous amount of compute power, but their architectures are not suited to ray tracing. This is because, unlike rasterizers, ray tracers cannot just stream geometry through a pipeline. The complex visual effects that make the difference between "game" quality graphics and photorealistic quality are created by incoherent rays bouncing around the scene in a chaotic way.

Each of these rays could access a different part of the scene and run a different program to simulate the material it is interacting with. GPGPUs, which amount to incredibly wide SIMD machines, are optimized for running many instances of the same program kernel and accessing data that is fairly nearby in memory. This means that GPGPU ray tracers don't utilize most of the compute available: adjacent threads take divergent branching paths, which causes stalls, and they frequently wait on global memory access because of the small caches inherent in GPGPU architectures.
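
The cost of divergence can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a description of any real GPU: every shader is assumed to take one step, and a group of `width` lanes is assumed to run each distinct shader in the group serially while the other lanes sit idle. The function name `simd_utilization` is invented for this example.

```python
def simd_utilization(shader_ids, width):
    """Fraction of lane-slots doing useful work when lanes run in lockstep.

    shader_ids[i] is the shader that ray i needs; rays are assigned to
    lanes in order, `width` rays per group.
    """
    useful = 0
    issued = 0
    for start in range(0, len(shader_ids), width):
        group = shader_ids[start:start + width]
        useful += len(group)                    # one useful step per ray
        issued += width * len(set(group))       # one full-width pass per distinct shader
    return useful / issued if issued else 0.0
```

Under this model, eight rays that all need the same shader keep an eight-wide machine fully busy, while eight rays spread evenly over four shaders use only a quarter of the issued lane-slots.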

The Caustic co-processor offloads the scattered database operations inherent in ray-tracing and schedules results so that many rays accessing similar objects can be evaluated at the same time. This relatively small chip unlocks the massive compute available in a GPGPU and allows it to run highly complex shaders in a ray-tracing system.
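
The idea of scheduling rays so that those touching the same object are evaluated together can be sketched in software. This is not the algorithm used in the Caustic hardware, which the interview does not describe. It assumes hit records are simple `(ray_id, object_id)` pairs, and `batch_by_object` is a name invented for this example.

```python
from collections import defaultdict

def batch_by_object(hits):
    """Group hit records so that all rays hitting one object are shaded together.

    hits is an iterable of (ray_id, object_id) pairs in arrival order.
    Returns a dict mapping each object_id to the list of ray_ids that hit it.
    """
    batches = defaultdict(list)
    for ray_id, object_id in hits:
        batches[object_id].append(ray_id)
    return dict(batches)
```

Each resulting batch runs a single shader over data for a single object, which is the access pattern that wide SIMD machines handle well.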

Dr. Dobb's: Does it make sense for your co-processor to be implemented in one core of a multicore processor?

JM: The internal structure of the Caustic hardware is not similar to anything in general-purpose processors today. We do not believe it is possible to achieve comparable performance on today's CPUs.

However, just as floating-point instructions have made their way into almost all CPUs available, we believe the fundamental capabilities of the Caustic hardware could be integrated into future GPGPU architectures, enabling even better performance. With Caustic's help, of course.

Another alternative would be to take the Caustic co-processor IP as-is and have it replace one of the existing general-purpose cores in a multicore processor. As I said earlier, the Caustic co-processor is highly silicon-efficient and would unlock the potential of the other general-purpose cores to perform shading operations in an efficient and ordered fashion.

Dr. Dobb's: In what way did you extend the OpenGL APIs?

JM: We have diverged from OpenGL ES in two significant ways. First, we allow shaders to cast rays. A shader author can programmatically define the way a material interacts with light by using a ray shader. In addition to all of the shading techniques available in a traditional GLSL fragment shader, rays become the most powerful tool in the toolbox, enabling interactions between ray shaders on multiple different objects in the scene.

Second, we changed the rendering pipeline to require "retain mode" operation. Because ray tracing requires every part of the scene to be accessible when rays are being evaluated, we have created "Primitive Objects" to encapsulate all of the state needed to render an object in the scene. A primitive object can reference shaders, textures, uniforms, vertex data, and anything else that is specific to a single object.
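
The retained-mode idea can be sketched as a data structure. The code below is not the CausticGL API: `PrimitiveObject`, `Scene`, and all of their fields and methods are hypothetical names chosen to mirror the description above, in which each object carries its own shader, textures, uniforms, and vertex data, and the whole scene stays resident so that any ray can reach any object.

```python
from dataclasses import dataclass, field

@dataclass
class PrimitiveObject:
    """Hypothetical bundle of all state needed to render one object."""
    name: str
    vertices: list                      # vertex data for this object
    shader: str                         # identifier of the ray shader to run on a hit
    textures: list = field(default_factory=list)
    uniforms: dict = field(default_factory=dict)

class Scene:
    """Hypothetical retained scene: objects persist after submission."""

    def __init__(self):
        self._primitives = {}

    def add(self, primitive):
        self._primitives[primitive.name] = primitive

    def lookup(self, name):
        # A ray that hits `name` can fetch everything needed to shade it.
        return self._primitives[name]

    def __len__(self):
        return len(self._primitives)
```

In an immediate-mode pipeline the state for the book would be gone once the book was drawn. Here a ray leaving the wall can still look up the book's shader and uniforms.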

Dr. Dobb's: Where can readers go for more information on the Caustic co-processor and SDK?

JM: For more information, you can read our technical brief on CausticGL published here. Also, our technical F.A.Q. contains lots of information about CausticOne published here.