These are truly exciting times for ray tracing. Recent advances like AI-accelerated denoising, Microsoft announcing native support in DirectX 12 and Peter Shirley releasing his books on a pay-what-you-want basis make it look like ray tracing finally has a chance to go mainstream. It might be too early to speak of the beginning of a revolution, but it is certainly a good time to learn about the topic and build up knowledge.

In this article, we’re going to write a very simple ray tracer from scratch using compute shaders in Unity. The languages we will use are C# for the scripts and HLSL for the shaders. All code is also hosted on Bitbucket.

Follow along and you will end up with a rendering like this:

Ray tracing theory

I would like to start by quickly reviewing the basic ray tracing theory. If you are familiar, please feel free to skip ahead.

Let’s think about how photographs emerge in the real world – highly simplified, but for the purpose of rendering this should be fine. It all starts with a light source emitting photons. A photon flies in a straight line until it hits a surface, at which point it is reflected or refracted and continues its journey minus some energy that has been absorbed by the surface. Eventually, some photons will hit the camera’s image sensor which in turn produces the resulting image. Ray tracing basically simulates these steps to create photorealistic images.

In practice, only a tiny fraction of the photons emitted by a light source will ever hit the camera. Therefore, applying the principle of Helmholtz reciprocity, calculations are commonly reversed: Instead of shooting photons from light sources, rays are shot from the camera into the scene, reflected or refracted and eventually hit a light source.

The ray tracer we are going to build is based on a 1980 paper by Turner Whitted. We will be able to simulate hard shadows and perfect reflections. It will also serve as a basis for more advanced effects like refraction, diffuse global illumination, glossy reflections and soft shadows.

Basic setup

Let’s start by creating a new Unity project. Create a C# script RayTracingMaster.cs and a compute shader RayTracingShader.compute . Fill the C# script with some basic code:

using UnityEngine;

public class RayTracingMaster : MonoBehaviour
{
    public ComputeShader RayTracingShader;

    private RenderTexture _target;

    private void OnRenderImage(RenderTexture source, RenderTexture destination)
    {
        Render(destination);
    }

    private void Render(RenderTexture destination)
    {
        // Make sure we have a current render target
        InitRenderTexture();

        // Set the target and dispatch the compute shader
        RayTracingShader.SetTexture(0, "Result", _target);
        int threadGroupsX = Mathf.CeilToInt(Screen.width / 8.0f);
        int threadGroupsY = Mathf.CeilToInt(Screen.height / 8.0f);
        RayTracingShader.Dispatch(0, threadGroupsX, threadGroupsY, 1);

        // Blit the result texture to the screen
        Graphics.Blit(_target, destination);
    }

    private void InitRenderTexture()
    {
        if (_target == null || _target.width != Screen.width || _target.height != Screen.height)
        {
            // Release render texture if we already have one
            if (_target != null)
                _target.Release();

            // Get a render target for Ray Tracing
            _target = new RenderTexture(Screen.width, Screen.height, 0,
                RenderTextureFormat.ARGBFloat, RenderTextureReadWrite.Linear);
            _target.enableRandomWrite = true;
            _target.Create();
        }
    }
}

The OnRenderImage function is automatically called by Unity whenever the camera has finished rendering. To render, we first create a render target of appropriate dimensions and tell the compute shader about it. The 0 is the index of the compute shader’s kernel function – we have only one.

Next, we dispatch the shader. This means that we are telling the GPU to get busy with a number of thread groups executing our shader code. Each thread group consists of a number of threads which is set in the shader itself. The size and number of thread groups can be specified in up to three dimensions, which makes it easy to apply compute shaders to problems of either dimensionality. In our case, we want to spawn one thread per pixel of the render target. The default thread group size as defined in the Unity compute shader template is [numthreads(8,8,1)] , so we’ll stick to that and spawn one thread group per 8×8 pixels. Finally, we write our result to the screen using Graphics.Blit .
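The ceiling division for the group counts is easy to get wrong at resolutions that are not a multiple of 8. Here is a minimal Python sketch of the same arithmetic for checking outside Unity (the helper name is made up for illustration):

```python
import math

def thread_groups(pixels, group_size=8):
    # One thread per pixel; round up so edge pixels are still covered
    return math.ceil(pixels / group_size)

# A 1920x1080 target needs 240x135 groups of 8x8 threads
print(thread_groups(1920), thread_groups(1080))  # 240 135
```

Note that rounding up means the last row and column of groups may contain threads outside the texture; those writes are simply discarded.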

Let’s give it a try. Add the RayTracingMaster component to the scene’s camera (this is important for OnRenderImage to be called), assign your compute shader and enter play mode. You should see the output of Unity’s compute shader template in the form of a beautiful triangle fractal.

Camera

Now that we can display things on screen, let’s generate some camera rays. Since Unity gives us a fully working camera, we will just use the calculated matrices to do this. Start by setting the matrices on the shader. Add the following lines to the script RayTracingMaster.cs :

private Camera _camera;

private void Awake()
{
    _camera = GetComponent<Camera>();
}

private void SetShaderParameters()
{
    RayTracingShader.SetMatrix("_CameraToWorld", _camera.cameraToWorldMatrix);
    RayTracingShader.SetMatrix("_CameraInverseProjection", _camera.projectionMatrix.inverse);
}

Call SetShaderParameters from OnRenderImage before rendering.

In the shader, we define the matrices, a Ray structure and a function for construction. Please note that in HLSL, unlike in C#, a function or variable declaration needs to appear before it is used. For each screen pixel’s center, we calculate the origin and direction of the ray, and output the latter as color. Here is the full shader:

#pragma kernel CSMain

RWTexture2D<float4> Result;
float4x4 _CameraToWorld;
float4x4 _CameraInverseProjection;

struct Ray
{
    float3 origin;
    float3 direction;
};

Ray CreateRay(float3 origin, float3 direction)
{
    Ray ray;
    ray.origin = origin;
    ray.direction = direction;
    return ray;
}

Ray CreateCameraRay(float2 uv)
{
    // Transform the camera origin to world space
    float3 origin = mul(_CameraToWorld, float4(0.0f, 0.0f, 0.0f, 1.0f)).xyz;

    // Invert the perspective projection of the view-space position
    float3 direction = mul(_CameraInverseProjection, float4(uv, 0.0f, 1.0f)).xyz;

    // Transform the direction from camera to world space and normalize
    direction = mul(_CameraToWorld, float4(direction, 0.0f)).xyz;
    direction = normalize(direction);

    return CreateRay(origin, direction);
}

[numthreads(8,8,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    // Get the dimensions of the RenderTexture
    uint width, height;
    Result.GetDimensions(width, height);

    // Transform pixel to [-1,1] range
    float2 uv = float2((id.xy + float2(0.5f, 0.5f)) / float2(width, height) * 2.0f - 1.0f);

    // Get a ray for the UVs
    Ray ray = CreateCameraRay(uv);

    // Write some colors
    Result[id.xy] = float4(ray.direction * 0.5f + 0.5f, 1.0f);
}

Try rotating the camera in the inspector. You should see that the ‘colorful sky’ behaves accordingly.

Now let’s replace the colors with an actual skybox. I am using HDRI Haven’s Cape Hill in my examples, but you can of course use any one that you like. Download and drop it into Unity. In the import settings, remember to increase the maximum resolution if you downloaded a higher resolution than 2048. Now add a public Texture SkyboxTexture to the script, assign your texture in the inspector and set it on the shader by adding this line to the SetShaderParameters function:

RayTracingShader.SetTexture(0, "_SkyboxTexture", SkyboxTexture);

In the shader, define the texture and a corresponding sampler, and a π constant that we’ll use in a minute:

Texture2D<float4> _SkyboxTexture;
SamplerState sampler_SkyboxTexture;
static const float PI = 3.14159265f;

Now instead of writing the direction as color, we’ll sample the skybox. To do this, we transform our cartesian direction vector to spherical coordinates and map these to texture coordinates. Replace the last bit of CSMain with this:

// Sample the skybox and write it
float theta = acos(ray.direction.y) / -PI;
float phi = atan2(ray.direction.x, -ray.direction.z) / -PI * 0.5f;
Result[id.xy] = _SkyboxTexture.SampleLevel(sampler_SkyboxTexture, float2(phi, theta), 0);
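The mapping itself is easy to verify outside the shader. Here is a minimal Python sketch of the same conversion (the function name is illustrative, not part of the project); negative coordinates are fine because the sampler wraps around:

```python
import math

def skybox_uv(direction):
    # Mirrors the HLSL mapping above; direction is a normalized (x, y, z) tuple
    x, y, z = direction
    theta = math.acos(y) / -math.pi
    phi = math.atan2(x, -z) / -math.pi * 0.5
    return (phi, theta)

# Looking along -z maps to the horizontal center of the skybox
u, v = skybox_uv((0.0, 0.0, -1.0))
print(round(u, 3), round(v, 3))  # 0.0 -0.5
```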

Tracing

So far so good. Now we’re getting to the actual tracing of our rays. Mathematically, we will calculate the intersection between our ray and our scene geometry, and store the hit parameters (position, normal and distance along the ray). If our ray hits multiple objects, we will pick the closest one. Let’s define the struct RayHit in the shader:

struct RayHit
{
    float3 position;
    float distance;
    float3 normal;
};

RayHit CreateRayHit()
{
    RayHit hit;
    hit.position = float3(0.0f, 0.0f, 0.0f);
    hit.distance = 1.#INF;
    hit.normal = float3(0.0f, 0.0f, 0.0f);
    return hit;
}

Commonly, scenes are comprised of many triangles, but we will start simple: intersecting an infinite ground plane and a handful of spheres!

Ground Plane

Intersecting a line with an infinite plane at \(y=0\) is pretty simple. We only accept hits in positive ray direction though, and reject any hit that is not closer than a potential previous hit.

By default, parameters in HLSL are passed by value and not by reference, so we would only be able to work on a copy and not propagate changes to the calling function. We pass RayHit bestHit with the inout qualifier to be able to modify the original struct. Here’s the shader code:

void IntersectGroundPlane(Ray ray, inout RayHit bestHit)
{
    // Calculate distance along the ray where the ground plane is intersected
    float t = -ray.origin.y / ray.direction.y;
    if (t > 0 && t < bestHit.distance)
    {
        bestHit.distance = t;
        bestHit.position = ray.origin + t * ray.direction;
        bestHit.normal = float3(0.0f, 1.0f, 0.0f);
    }
}
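The same intersection test, re-expressed as a small standalone Python sketch for experimentation (the helper name is made up; it returns the ray parameter t instead of mutating a hit struct):

```python
def intersect_ground_plane(origin, direction):
    # Returns the ray parameter t of the hit with the y=0 plane, or None
    oy, dy = origin[1], direction[1]
    if dy == 0.0:
        return None  # ray parallel to the plane, no hit
    t = -oy / dy
    return t if t > 0 else None

# A ray 10 units up, pointing straight down, hits after t = 10
print(intersect_ground_plane((0, 10, 0), (0, -1, 0)))  # 10.0
```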

To use it, let’s add a framework Trace function (we will extend it in a minute):

RayHit Trace(Ray ray)
{
    RayHit bestHit = CreateRayHit();
    IntersectGroundPlane(ray, bestHit);
    return bestHit;
}

Furthermore, we need a basic shading function. Again, we pass the Ray with inout – we will modify it later on when we talk about reflection. For debug purposes, we return the normal if geometry was hit, and fall back to our skybox sampling code otherwise:

float3 Shade(inout Ray ray, RayHit hit)
{
    if (hit.distance < 1.#INF)
    {
        // Return the normal
        return hit.normal * 0.5f + 0.5f;
    }
    else
    {
        // Sample the skybox and write it
        float theta = acos(ray.direction.y) / -PI;
        float phi = atan2(ray.direction.x, -ray.direction.z) / -PI * 0.5f;
        return _SkyboxTexture.SampleLevel(sampler_SkyboxTexture, float2(phi, theta), 0).xyz;
    }
}

We will use both functions down in CSMain . Remove the skybox sampling code if you haven’t already, and add the following lines to trace the ray and shade the hit:

// Trace and shade
RayHit hit = Trace(ray);
float3 result = Shade(ray, hit);
Result[id.xy] = float4(result, 1);

Sphere

A plane is not the most exciting thing in the world, so let’s add a sphere right away. The math for a line-sphere intersection can be found on Wikipedia. This time there can be two ray hit candidates: the entry point p1 - p2 , and the exit point p1 + p2 . We will check the entry point first, and only use the exit point if the other one is not valid. A sphere in our case is defined as a float4 comprised of position (xyz) and radius (w). Here’s the code:

void IntersectSphere(Ray ray, inout RayHit bestHit, float4 sphere)
{
    // Calculate distance along the ray where the sphere is intersected
    float3 d = ray.origin - sphere.xyz;
    float p1 = -dot(ray.direction, d);
    float p2sqr = p1 * p1 - dot(d, d) + sphere.w * sphere.w;
    if (p2sqr < 0)
        return;
    float p2 = sqrt(p2sqr);
    float t = p1 - p2 > 0 ? p1 - p2 : p1 + p2;
    if (t > 0 && t < bestHit.distance)
    {
        bestHit.distance = t;
        bestHit.position = ray.origin + t * ray.direction;
        bestHit.normal = normalize(bestHit.position - sphere.xyz);
    }
}
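If you want to check the candidate selection on the CPU, here is the same math as a standalone Python sketch (names are illustrative; it assumes a normalized ray direction, just like the shader does):

```python
import math

def intersect_sphere(origin, direction, center, radius):
    # Returns the closest positive hit distance along the ray, or None on a miss
    d = [o - c for o, c in zip(origin, center)]
    p1 = -sum(di * wi for di, wi in zip(direction, d))
    p2sqr = p1 * p1 - sum(di * di for di in d) + radius * radius
    if p2sqr < 0:
        return None  # ray misses the sphere entirely
    p2 = math.sqrt(p2sqr)
    # Prefer the entry point p1 - p2, fall back to the exit point p1 + p2
    t = p1 - p2 if p1 - p2 > 0 else p1 + p2
    return t if t > 0 else None

# Unit sphere 5 units ahead of the origin: entry point at t = 4
print(intersect_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # 4.0
```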

To add a sphere, just call this function from Trace , for example:

// Add a floating unit sphere
IntersectSphere(ray, bestHit, float4(0, 3.0f, 0, 1.0f));

Anti-Aliasing

There is one problem with the current approach: We’re only testing the center of each pixel, so you can see nasty aliasing effects (the dreaded ‘jaggies’) in the result. To circumvent this, we are going to trace not one but multiple rays per pixel. Each ray gets a random offset inside the pixel’s region. To keep an acceptable frame rate, we’re doing progressive sampling, meaning that we will trace one ray per pixel each frame and average the result over time if the camera didn’t move. Every time the camera moves (or any other parameter like field of view, scene geometry or scene lighting is changed), we need to start all over.

Let’s create a very simple image effect shader that we will use for adding up several results. Name your shader AddShader , make sure the first line reads Shader "Hidden/AddShader" . After Cull Off ZWrite Off ZTest Always add Blend SrcAlpha OneMinusSrcAlpha to enable alpha blending. Next, replace the default frag function with the following lines:

float _Sample;

float4 frag (v2f i) : SV_Target
{
    return float4(tex2D(_MainTex, i.uv).rgb, 1.0f / (_Sample + 1.0f));
}

This shader will now just draw the first sample with an opacity of \(1\), the next one with \(\frac{1}{2}\), then \(\frac{1}{3}\) and so on, averaging all samples with equal contribution.
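To convince yourself that this blending recurrence really computes a plain average, here is a small Python sketch with scalars instead of colors (outside Unity, names made up):

```python
def blend(samples):
    # Alpha-blend sample n over the accumulated value with weight 1/(n+1),
    # exactly like the AddShader does frame by frame
    acc = 0.0
    for n, s in enumerate(samples):
        alpha = 1.0 / (n + 1)
        acc = s * alpha + acc * (1.0 - alpha)
    return acc

samples = [0.2, 0.8, 0.5, 0.9]
# The running blend converges to the arithmetic mean of all samples
print(round(blend(samples), 6))  # 0.6
```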

In the script, we still need to count the samples and make use of the newly created image effect shader:

private uint _currentSample = 0;
private Material _addMaterial;

You should also reset _currentSample to 0 when the render target is rebuilt in InitRenderTexture , and add an Update function that detects camera transform changes:

private void Update()
{
    if (transform.hasChanged)
    {
        _currentSample = 0;
        transform.hasChanged = false;
    }
}

To use our custom shader, we need to initialize a material, tell it about the current sample and use it for blitting to the screen in the Render function:

// Blit the result texture to the screen
if (_addMaterial == null)
    _addMaterial = new Material(Shader.Find("Hidden/AddShader"));
_addMaterial.SetFloat("_Sample", _currentSample);
Graphics.Blit(_target, destination, _addMaterial);
_currentSample++;

So we’re doing progressive sampling, but we’re still always using the pixel center. In the compute shader, define a float2 _PixelOffset and use that in CSMain instead of the hard float2(0.5f, 0.5f) offset. Back in the script, create a random offset by adding this line to SetShaderParameters :

RayTracingShader.SetVector("_PixelOffset", new Vector2(Random.value, Random.value));

If you move the camera, you should see that the image still shows aliasing, but it quickly vanishes once you stand still for a couple of frames. Here is a side-by-side comparison of the improvement:

Reflection

The groundwork for our ray tracer is now done, so we can start dealing with the fancy things that actually set ray tracing apart from other rendering techniques. Perfect reflections are the first item on our list. The idea is simple: Whenever we hit a surface, we reflect the ray according to the law of reflection that you will probably remember from school (angle of incidence = angle of reflection), reduce its energy, and repeat until we either hit the sky, run out of energy, or reach a fixed maximum number of bounces.

In the shader, add a float3 energy to the ray and initialize it in the CreateRay function as ray.energy = float3(1.0f, 1.0f, 1.0f) . The ray starts with full throughput on all color channels, and will diminish with each reflection.

Now we’re going to execute a maximum number of 8 traces (the original ray plus 7 bounces), and add up the results of the Shade function calls, but multiplied with the ray’s energy. As an example, imagine a ray that has been reflected once and lost \(\frac{3}{4}\) of its energy. Now it travels on and hits the sky, so we only transfer \(\frac{1}{4}\) of the energy of the sky hit to the pixel. Adjust your CSMain like this, replacing the previous Trace and Shade calls:

// Trace and shade
float3 result = float3(0, 0, 0);
for (int i = 0; i < 8; i++)
{
    RayHit hit = Trace(ray);
    result += ray.energy * Shade(ray, hit);

    if (!any(ray.energy))
        break;
}
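The energy bookkeeping of this loop can be sketched in Python with scalars instead of float3 (a simplified illustration with made-up names, not the shader itself):

```python
def trace_pixel(bounce_colors, reflectivity, max_bounces=8):
    # bounce_colors[i] is what Shade would return on bounce i (e.g. the sky on a miss);
    # reflectivity is the per-bounce energy multiplier (element-wise in the shader)
    result = 0.0
    energy = 1.0
    for i in range(min(max_bounces, len(bounce_colors))):
        result += energy * bounce_colors[i]
        energy *= reflectivity
        if energy == 0.0:
            break  # nothing left to contribute, stop bouncing
    return result

# First hit returns nothing but keeps 1/4 of the energy; the second hit is the
# sky at intensity 1.0, so the pixel receives 0.25
print(trace_pixel([0.0, 1.0], 0.25))  # 0.25
```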

Our Shade function is now also responsible for updating the energy and generating the reflected ray, so here’s where the inout becomes important. To update the energy, we perform an element-wise multiplication with the specular color of the surface. For example, gold has a specular reflectivity of roughly float3(1.0f, 0.78f, 0.34f) , so it will reflect 100% of red light, 78% of green light, but only 34% of blue light, giving the reflection its distinct golden tint. Be careful not to go over 1 with any of those values, since you would create energy out of nowhere. Also, the reflectivity is often lower than you would think. See e.g. slide 64 in Physics and Math of Shading by Naty Hoffman for some values.

HLSL has a built-in function to reflect a ray using a given normal, which is great. Due to floating point inaccuracy, it can happen that a reflected ray is blocked by the very surface it is reflected on. To prevent this self-occlusion, we offset the position just a bit along the normal direction. Here’s the new Shade function:

float3 Shade(inout Ray ray, RayHit hit)
{
    if (hit.distance < 1.#INF)
    {
        float3 specular = float3(0.6f, 0.6f, 0.6f);

        // Reflect the ray and multiply energy with specular reflection
        ray.origin = hit.position + hit.normal * 0.001f;
        ray.direction = reflect(ray.direction, hit.normal);
        ray.energy *= specular;

        // Return nothing
        return float3(0.0f, 0.0f, 0.0f);
    }
    else
    {
        // Erase the ray's energy - the sky doesn't reflect anything
        ray.energy = 0.0f;

        // Sample the skybox and write it
        float theta = acos(ray.direction.y) / -PI;
        float phi = atan2(ray.direction.x, -ray.direction.z) / -PI * 0.5f;
        return _SkyboxTexture.SampleLevel(sampler_SkyboxTexture, float2(phi, theta), 0).xyz;
    }
}

You might want to increase the intensity of the skybox a little by multiplying it with a factor greater than one. Now play around with your Trace function. Put some spheres in a loop and you will end up with a result like this:

Directional Light

So we can trace mirror-like reflections, which allows us to render smooth metallic surfaces, but for non-metals we need one more thing: diffuse reflection. In brief, metals will only reflect incoming light tinted with their specular color, while non-metals allow light to refract into the surface, scatter and leave it in a random direction tinted with their albedo color. In the case of an ideal Lambertian surface, which is commonly assumed, the probability is proportional to the cosine of the angle between that direction and the surface normal. A more in-depth discussion of the topic can be found here.

To get started with diffuse lighting, let’s add a public Light DirectionalLight to our RayTracingMaster and assign the scene’s directional light. You might also want to detect the light’s transform changes in the Update function, just like we already do it for the camera’s transform. Now add the following lines to your SetShaderParameters function:

Vector3 l = DirectionalLight.transform.forward;
RayTracingShader.SetVector("_DirectionalLight", new Vector4(l.x, l.y, l.z, DirectionalLight.intensity));

Back in the shader, define float4 _DirectionalLight . In the Shade function, define the albedo color right below the specular color:

float3 albedo = float3(0.8f, 0.8f, 0.8f);

Replace the previously black return with a simple diffuse shading:

// Return a diffuse-shaded color
return saturate(dot(hit.normal, _DirectionalLight.xyz) * -1) * _DirectionalLight.w * albedo;

Remember that the dot product is defined as \(a \cdot b = ||a||\ ||b|| \cos \theta\). Since both our vectors (the normal and the light direction) are of unit length, the dot product is exactly what we are looking for: the cosine of the angle. The ray and the light are pointing in opposite directions, so for head-on lighting the dot product returns -1 instead of 1. We need to flip the sign to make up for this. Finally, we saturate this value (i.e. clamp it to \([0,1]\) range) to prevent negative energy.
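The whole shading term can be checked with a quick Python sketch (scalar albedo for simplicity; the helper name is made up):

```python
def lambert(normal, light_dir, intensity, albedo):
    # normal and light_dir are unit vectors; light_dir points from the light
    # toward the surface, hence the sign flip on the dot product
    cos_theta = -sum(n * l for n, l in zip(normal, light_dir))
    # saturate: clamp to [0, 1] so back-facing light contributes nothing
    return max(0.0, min(1.0, cos_theta)) * intensity * albedo

# Head-on light from straight above on an upward-facing surface
print(lambert((0, 1, 0), (0, -1, 0), 1.0, 0.8))  # 0.8
```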

For the directional light to cast shadows, we will trace a shadow ray. It starts at the surface position in question (again with a very small displacement to avoid self-shadowing), and points in the direction the light comes from. If anything blocks the way to infinity, we won’t use any diffuse light. Add these lines above the diffuse return statement:

// Shadow test ray
bool shadow = false;
Ray shadowRay = CreateRay(hit.position + hit.normal * 0.001f, -1 * _DirectionalLight.xyz);
RayHit shadowHit = Trace(shadowRay);
if (shadowHit.distance != 1.#INF)
{
    return float3(0.0f, 0.0f, 0.0f);
}

Now we can trace some glossy plastic spheres with hard shadows! Setting 0.04 for specular and 0.8 for albedo yields the following image:

Scene and Materials

As today’s crescendo, let’s create some more complex and colorful scenes! Instead of hard-coding everything in the shader, we will define the scene in C# for more flexibility.

First we are going to extend the RayHit structure in the shader. Instead of globally defining the material properties in the Shade function, we will define them per object and store them in the RayHit . Add float3 albedo and float3 specular to the struct, and initialize them to float3(0.0f, 0.0f, 0.0f) in CreateRayHit . Also adjust the Shade function to use these values from hit instead of the hard-coded ones.

To establish a common understanding of what a sphere is on the CPU and the GPU, define a struct Sphere both in your shader and in the C# script. On the shader side, it looks like this:

struct Sphere
{
    float3 position;
    float radius;
    float3 albedo;
    float3 specular;
};

Mirror this structure in your C# script.

In the shader, we need to make the IntersectSphere function work with our custom struct instead of the float4 . This is simple to do:

void IntersectSphere(Ray ray, inout RayHit bestHit, Sphere sphere)
{
    // Calculate distance along the ray where the sphere is intersected
    float3 d = ray.origin - sphere.position;
    float p1 = -dot(ray.direction, d);
    float p2sqr = p1 * p1 - dot(d, d) + sphere.radius * sphere.radius;
    if (p2sqr < 0)
        return;
    float p2 = sqrt(p2sqr);
    float t = p1 - p2 > 0 ? p1 - p2 : p1 + p2;
    if (t > 0 && t < bestHit.distance)
    {
        bestHit.distance = t;
        bestHit.position = ray.origin + t * ray.direction;
        bestHit.normal = normalize(bestHit.position - sphere.position);
        bestHit.albedo = sphere.albedo;
        bestHit.specular = sphere.specular;
    }
}

Also set bestHit.albedo and bestHit.specular in the IntersectGroundPlane function to adjust its material.

Next, define StructuredBuffer<Sphere> _Spheres . This is the place where the CPU will store all spheres that comprise the scene. Remove all hardcoded spheres from your Trace function and add the following lines:

// Trace spheres
uint numSpheres, stride;
_Spheres.GetDimensions(numSpheres, stride);
for (uint i = 0; i < numSpheres; i++)
    IntersectSphere(ray, bestHit, _Spheres[i]);

Now we will fill the scene with some life. Back in C#, let’s add some public parameters to control sphere placement and the actual compute buffer:

public Vector2 SphereRadius = new Vector2(3.0f, 8.0f);
public uint SpheresMax = 100;
public float SpherePlacementRadius = 100.0f;

private ComputeBuffer _sphereBuffer;

Set up the scene in OnEnable , and release the buffer in OnDisable . This way, a random scene will be generated every time you enable the component. The SetUpScene function will try to position spheres in a certain radius, and reject those that would intersect spheres already in existence. Half of the spheres are metallic (black albedo, colored specular), the other half is non-metallic (colored albedo, 4% specular):

// Requires 'using System.Collections.Generic;' at the top of the file for List<T>

private void OnEnable()
{
    _currentSample = 0;
    SetUpScene();
}

private void OnDisable()
{
    if (_sphereBuffer != null)
        _sphereBuffer.Release();
}

private void SetUpScene()
{
    List<Sphere> spheres = new List<Sphere>();

    // Add a number of random spheres
    for (int i = 0; i < SpheresMax; i++)
    {
        Sphere sphere = new Sphere();

        // Radius and position
        sphere.radius = SphereRadius.x + Random.value * (SphereRadius.y - SphereRadius.x);
        Vector2 randomPos = Random.insideUnitCircle * SpherePlacementRadius;
        sphere.position = new Vector3(randomPos.x, sphere.radius, randomPos.y);

        // Reject spheres that are intersecting others
        foreach (Sphere other in spheres)
        {
            float minDist = sphere.radius + other.radius;
            if (Vector3.SqrMagnitude(sphere.position - other.position) < minDist * minDist)
                goto SkipSphere;
        }

        // Albedo and specular color
        Color color = Random.ColorHSV();
        bool metal = Random.value < 0.5f;
        sphere.albedo = metal ? Vector3.zero : new Vector3(color.r, color.g, color.b);
        sphere.specular = metal ? new Vector3(color.r, color.g, color.b) : Vector3.one * 0.04f;

        // Add the sphere to the list
        spheres.Add(sphere);

    SkipSphere:
        continue;
    }

    // Assign to compute buffer
    _sphereBuffer = new ComputeBuffer(spheres.Count, 40);
    _sphereBuffer.SetData(spheres);
}

The magic number 40 in new ComputeBuffer(spheres.Count, 40) is the stride of our buffer, i.e. the byte size of one sphere in memory. To calculate it, count the number of floats in the Sphere struct and multiply it by float’s byte size (4 bytes). Finally, set the buffer on the shader in the SetShaderParameters function:

RayTracingShader.SetBuffer(0, "_Spheres", _sphereBuffer);
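The stride arithmetic is easy to verify with a tiny Python sketch (the helper name is illustrative):

```python
def sphere_stride(float_counts=(3, 1, 3, 3), float_size=4):
    # position (3 floats) + radius (1) + albedo (3) + specular (3),
    # each float occupying 4 bytes
    return sum(float_counts) * float_size

print(sphere_stride())  # 40
```

If you later add fields to the Sphere struct, remember to update this number, or the GPU will read garbage from the buffer.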

Results

Congratulations, you made it! You now have a working GPU-powered Whitted ray tracer, able to render a plane and lots of spheres with mirror-like reflections, simple diffuse lighting and hard shadows. The full source code can be found on Bitbucket. Play around with the sphere placement parameters and enjoy the beautiful view:

What’s next?

We achieved quite something today, but there’s still a lot of ground to cover: diffuse global illumination, glossy reflections, soft shadows, non-opaque materials with refraction, and obviously using triangle meshes instead of spheres. In the next article, we will extend our Whitted ray tracer into a path tracer to conquer a number of the mentioned phenomena.

Thank you for taking the time to work through this article! Stay tuned, the follow-up is in the works.