Programming for the GPU — It’s Hard

I took on this project never having written a single program for the GPU before (unless you count this). In fact, I had never used Unity or had any experience with C#, the programming language Unity uses. I was diving straight into the deep end, which is hard when you don’t even have your water simulation ready yet!

One thing worth pointing out is that the difficulty came not from the theory behind the algorithms involved, but from the faff of setting everything up to work. The algorithms themselves weren’t hard to grasp, but getting them to run on the hardware was an entirely different matter!

The code below shows the GPU compute kernel — code that runs on the GPU — that ‘splats’ Wave Particles to the water’s surface.
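What follows is a minimal sketch rather than the original listing: the `WaveParticle` layout, the `_Particles`, `_ParticleCount`, and `_WaterHeight` names, and the raised-cosine falloff are illustrative assumptions, but the overall shape is representative.

```hlsl
// Sketch of a particle-splatting kernel (names, struct layout, and
// the falloff function are illustrative assumptions).
#pragma kernel SplatParticles

struct WaveParticle
{
    float2 position;   // particle centre, in texels
    float  amplitude;  // height contribution at the centre
    float  radius;     // radius of influence, in texels
};

StructuredBuffer<WaveParticle> _Particles;
uint _ParticleCount;
RWTexture2D<float> _WaterHeight;

[numthreads(8, 8, 1)]
void SplatParticles(uint3 id : SV_DispatchThreadID)
{
    float height = 0.0;

    // Sum the contribution of every particle at this texel.
    for (uint i = 0; i < _ParticleCount; i++)
    {
        WaveParticle p = _Particles[i];
        float d = distance(float2(id.xy), p.position);
        if (d < p.radius)
        {
            // Raised-cosine falloff: full amplitude at the centre,
            // fading smoothly to zero at the particle's radius.
            height += 0.5 * p.amplitude * (1.0 + cos(3.14159265 * d / p.radius));
        }
    }

    _WaterHeight[id.xy] = height;
}
```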

The code that ran on the GPU to ‘splat’ Wave Particles to a 2D texture representing the water’s surface. As can be seen, it’s very simple.

Despite the simplicity of the code itself, getting it to run wasn’t the most straightforward thing in the world.

Boilerplate

This is where boilerplate code comes into play, setting everything up for this simple piece of code to run, and it isn’t pretty. I won’t go into detail about what is going on (it also handles some other GPU code beyond the above) but will highlight some of the more aggravating parts.

Recognize these strings from the previous snippet?
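They looked something like this (illustrative, matching the names in the kernel sketch above):

```csharp
// Illustrative: the names the CPU side uses to locate kernel parameters.
private const string KernelName        = "SplatParticles";
private const string ParticlesName     = "_Particles";
private const string ParticleCountName = "_ParticleCount";
private const string WaterHeightName   = "_WaterHeight";
```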

One of the first things the boilerplate does is define a load of strings similar to the above. Why? Because the variables in the compute kernel above (which are essentially its parameters) need to be identified by the CPU-side code somehow, and string literals are the only option available! This also means that any programmer who wants to rename a variable that is set by the CPU has to remember to edit these strings too, because you won’t get a compile-time error. That in and of itself would be relatively manageable if it weren’t for aggravating point #2:

Kernel initialization

The actual setting of the variables before we launch our compute shader.
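A sketch of that setup, again with illustrative names; `waveCompute`, `particleBuffer`, and `waterHeightTexture` are assumed to be created elsewhere.

```csharp
// Illustrative: bind the kernel's parameters by string, then dispatch.
int kernel = waveCompute.FindKernel(KernelName);

waveCompute.SetBuffer(kernel, ParticlesName, particleBuffer);
waveCompute.SetInt(ParticleCountName, particleCount);
waveCompute.SetTexture(kernel, WaterHeightName, waterHeightTexture);

// One thread per texel, in 8x8 groups to match [numthreads(8, 8, 1)].
waveCompute.Dispatch(kernel,
    waterHeightTexture.width / 8,
    waterHeightTexture.height / 8,
    1);
```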

The code above is how the variables are initialized and the compute kernel itself is invoked. Again, this doesn’t look unreasonable, but it would have been nice to be able to write something like the following, complete with compile-time type-checking and generated boilerplate:
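Something in this spirit (pure pseudocode; no such generated wrapper exists in Unity):

```csharp
// Pseudocode: a typed wrapper generated from the shader source, so a
// renamed or mistyped parameter becomes a compile-time error.
var kernel = waveCompute.GetKernel<SplatParticlesKernel>();

kernel.Dispatch(
    particles:     particleBuffer,       // StructuredBuffer<WaveParticle>
    particleCount: (uint)particleCount,  // uint
    waterHeight:   waterHeightTexture,   // RWTexture2D<float>
    threadGroups:  (waterHeightTexture.width / 8,
                    waterHeightTexture.height / 8,
                    1));
```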

Pseudocode for how GPU compute kernel invocation could be nicer.

However, the real difficulty didn’t come from the code itself: if there is any issue at all with the above, for example you forget an initialization, you misname a parameter, or there is a type mismatch, you don’t get an error, just incorrect behavior. I was lucky in that it never bit me too hard, but I did wrestle with the following issue for two weeks with no idea as to what was causing it.