I've recently been working on an offline Vulkan renderer/compositor. Our initial implementation was a one-shot renderer: spawn the process, render the image, and exit. However, to amortize startup costs, we are converting it to a multi-shot renderer with an HTTP API. The first implementation simply used vkQueueWaitIdle(), but in a multi-threaded environment this is less than optimal: when multiple command buffers are submitted to the same queue, it waits for all of them to finish, not just the one the caller cares about.

Using a fence allows the CPU to wait for the GPU to complete a specific command buffer; in this case, the one that renders the image and saves it to host memory.

In our renderer, we originally tried using a fence with a timeout of 0, incorrectly assuming that meant waiting indefinitely. We couldn't get it to work, so we reverted to vkQueueWaitIdle(), which was fine for a one-shot renderer. However, after implementing our multi-threaded renderer and attempting to use fences (again incorrectly), we experienced corrupt output:

[Images: sample output; corrupt output]

After checking the documentation, I found out we were doing it wrong:

"If timeout is zero, then vkWaitForFences does not wait, but simply returns the current state of the fences. VK_TIMEOUT will be returned in this case if the condition is not satisfied, even though no actual wait was performed."
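The timeout is a plain 64-bit count of nanoseconds, so nothing in the type tells you whether a value means "poll", "wait a while", or "wait forever". Here is a minimal sketch of how that intent could be spelled out with std::chrono. The names kPollOnly, kWaitForever and fence_timeout are hypothetical; they are not part of Vulkan or of our renderer. Passing the largest 64-bit value is the usual way to ask for an effectively unbounded wait.

```cpp
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical names, for illustration only.

// A timeout of zero never blocks: the call just reports the current fence state.
constexpr std::uint64_t kPollOnly = 0;

// The largest value asks for an effectively unbounded wait.
constexpr std::uint64_t kWaitForever = std::numeric_limits<std::uint64_t>::max();

// Convert a duration to the nanosecond count that the wait call expects.
// Zero or negative durations collapse to a poll (an assumption of this sketch).
constexpr std::uint64_t fence_timeout(std::chrono::nanoseconds d)
{
    return d.count() <= 0 ? kPollOnly : static_cast<std::uint64_t>(d.count());
}

// 10 ms expressed in nanoseconds is the literal 10000000.
static_assert(fence_timeout(std::chrono::milliseconds(10)) == 10000000ULL,
              "10 ms should be 10,000,000 ns");
```

With something like this, a call site reads as fence_timeout(std::chrono::milliseconds(10)) rather than a bare literal, and a stray 0 stands out as a poll rather than a wait.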

A correct implementation waits with a non-zero timeout in a loop, which also lets us issue a warning if the job appears to be taking longer than expected:

    Console::debug("Submitting command buffer to GPU...");

    // The vulkan device:
    vk::Device & device = _context->device();

    // The command buffer we want to submit:
    auto submits = vk::SubmitInfo()
        .setCommandBufferCount(1).setPCommandBuffers(&commands);

    // The queue we are going to submit to:
    auto queue = device.getQueue(graphics_queue, 0);

    // Generate a temporary fence:
    auto fence = device.createFenceUnique({});

    // Submit the command buffer to the queue with the fence:
    queue.submit(1, &submits, *fence);

    // Loop until the fence is signalled:
    while (true) {
        // Wait for 10ms for the render to complete:
        auto result = device.waitForFences(*fence, true, 10000000);

        // Check the result - if it's successful we are done:
        if (result == vk::Result::eSuccess) break;

        // Otherwise, we took longer than 10ms to render:
        Console::warn("Wait for fence: ", vk::to_string(result));

        // If the result wasn't a timeout (e.g. error), we fail:
        if (result != vk::Result::eTimeout)
            throw std::runtime_error("renderer failed");
    }
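The wait loop is the part that is easy to get subtly wrong, and it is awkward to test because it needs a GPU. One possible approach, sketched here under assumptions, is to pass the wait call in as a function so the loop logic can be exercised with a fake. WaitResult and wait_in_slices are hypothetical names, not Vulkan or vulkan-hpp API. The sketch also adds an upper bound on the number of timed-out slices, which the loop in our renderer does not have.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the three outcomes the loop cares about
// (success, timeout, anything else).
enum class WaitResult { Success, Timeout, Error };

// Calls wait(slice_ns) until it reports Success and returns how many slices
// timed out first. Throws on Error, or once max_slices slices have timed out.
inline unsigned wait_in_slices(
    const std::function<WaitResult(std::uint64_t)>& wait,
    std::uint64_t slice_ns,
    unsigned max_slices)
{
    // A zero timeout only polls, which would turn this loop into a busy spin.
    assert(slice_ns > 0);

    unsigned timeouts = 0;
    while (true) {
        const WaitResult result = wait(slice_ns);

        // Signalled: report how many slices we had to sit through.
        if (result == WaitResult::Success) return timeouts;

        // Anything other than a timeout is treated as a failure.
        if (result != WaitResult::Timeout)
            throw std::runtime_error("renderer failed");

        // Timed out: give up once the (assumed) budget is exhausted.
        if (++timeouts >= max_slices)
            throw std::runtime_error("renderer timed out");
    }
}
```

In the renderer, the function passed in would call device.waitForFences(*fence, true, slice_ns) and map vk::Result::eSuccess and vk::Result::eTimeout onto the corresponding WaitResult values. The returned count can then drive the "taking longer than expected" warning.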

In hindsight, this was a relatively trivial problem, but it highlighted the fact that Vulkan can be hard to comprehend in its entirety. I didn't write the original fence code, so, not knowing any better, I initially suspected a problem with image barriers. When code gets bulky, refactoring and the subsequent debugging become harder.