Hey all!

So image updates have been sparse since the last livestream. Why? I've been rearranging some of the ScrumbleShip backend. Let's dig into that a little.

Currently, the way rendering works is something like this:

One thread locates and caches nearby blocks within your ship, storing the result inside the ship memory structure.

Another thread picks through your ship, locating and assembling blocks into a list. Then it orders this list and copies the Cache from its location in the ship memory structure into a temporary Voxel List. Once it's done with that, it goes through every other ship and does something similar.

The render thread uploads the entire Voxel List from system memory to GPU memory every single frame, then renders it with a single OpenGL draw call.

There's a lot of weird behavior too, like backwards sharing of ordered ship-lists, which causes threads to wait for each other.

The biggest problem with this system is that it requires me to re-send ALL the voxel data EVERY frame. As the ScrumbleShip engine gets better at rendering voxels, I can both send more voxels to the GPU and send more data about each of those voxels. So I'm starting to use too much bandwidth between the computer's memory and its graphics card. For example, the extra data sent by the lighting change needlessly costs us around 10fps.

So reducing the number of sends is a high priority on my list. But as I was studying how to fix the problem, I noticed several interesting facts about the current rendering pipeline, and I developed a plan to make it better. Here's an experiment I did to prove a particular point:

This proved to me that I COULD render efficiently using multiple OpenGL draw calls, which opens up a lot of interesting optimizations.

So the new plan is:

One thread maintains lists of ships in distance order and hands them out as needed.

Another thread maintains lists of ship chunks in distance order and hands them out as needed.

Another thread uses these ordered chunk lists to create a list of visible blocks.

The caching thread goes through this list of visible blocks, preparing nearby blocks for rendering and storing the result in the ship chunk list.

The rendering thread goes through the list of nearby blocks, uploading their data to the GPU as they get close enough to render. Then it goes through the list of visible blocks, rendering them in batches using separate OpenGL draw calls.

What advantages does the newer pipeline have?

It'll render the nearest blocks, regardless of which ship they're in. No more weird block-unloading when two ships are close together.

It'll run more smoothly on multi-core machines, with less time wasted waiting for other threads.

It'll render very large ships and asteroids, up to 1km in diameter. The current system can't handle asteroids much bigger than 100m across.

It'll use less system memory to do the rendering.

I'll only need to upload or remove block data from the GPU when a block's render status changes, rather than every frame. Potentially a 10-15fps boost.

It'll open the way for the following potential optimizations:

GPU-based Occlusion Culling, potentially a 2x-5x rendering performance boost.

Voxel-face rendering instead of entire-voxel rendering, for a potential 2x performance boost.

Distant ships can be rendered as billboards, for a solid performance boost.

Geometry-shader based dynamic voxel creation, with an unknown (1x-5x) performance boost.

Any disadvantages?

Rendering in multiple OpenGL draw calls costs anywhere from 2 to 10fps.

It causes Dirkson to work for a couple weeks on features that don't yet make good screenshots, videos, or bleeding edge releases.

At this point, I estimate I've got maybe another week left to switch to the new pipeline. Then I should start putting out cool videos like this one again:

I'm aiming for the next full release sometime in February. It'll contain the lighting changes, lots of performance improvements, the mining torch, a complete UI overhaul, and a building-system overhaul.

Cheers!

-Dirk