A key difference between the new DirectX 12 mode (-s dx12) and the older DirectX 11 mode (-s dx11, previously named -s hlsl) is that the DirectX 12 mode uses the live driver and follows the same compilation path as a real-world DirectX 12 application. As a result, the disassembly and hardware resource usage statistics it generates are the closest to the real-world case, which helps you make better performance optimization decisions.

To compile a DirectX 12 graphics pipeline, you need to provide the following inputs to the tool, in addition to the HLSL source files:

Root signature: The root signature can be either defined in the HLSL source code or provided in a pre-compiled binary file, as described in our previous article.

.gpso file: For compute pipelines, the HLSL source code together with a valid root signature is enough to compile the pipeline successfully. For graphics, however, a subset of the D3D12 graphics pipeline state is required as well. Without that additional data, RGA would not be able to properly set the pipeline state for your shaders, and the compilation would fail. The subset of the graphics pipeline state that RGA requires is defined in a custom .gpso file of the following format:

    # schemaVersion
    1.0

    # InputLayoutNumElements: Number of D3D12_INPUT_ELEMENT_DESC elements in the D3D12_INPUT_LAYOUT_DESC structure.
    # Must match the following "InputLayout" section.
    2

    # InputLayout
    # { SemanticName, SemanticIndex, Format, InputSlot, AlignedByteOffset, InputSlotClass, InstanceDataStepRate }
    { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "COLOR", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, 12, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 }

    # PrimitiveTopologyType: The D3D12_PRIMITIVE_TOPOLOGY_TYPE value to be used when creating the PSO.
    D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE

    # NumRenderTargets: The number of formats in the upcoming RTVFormats section.
    1

    # RTVFormats: An array of DXGI_FORMAT-typed values for the render target formats.
    # The number of items in the array should match the above NumRenderTargets section.
    { DXGI_FORMAT_R8G8B8A8_UNORM }

You can generate a template .gpso file and then edit it manually to match your pipeline by running:

rga -s dx12 --gpso-template "full path to output file"
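Because the two count fields in a .gpso file must agree with the sections they describe, it can be handy to check a hand-edited file before passing it to RGA. The following is only a minimal sketch of such a check, not RGA's own parser: the names CheckGpsoCounts and ParseCount are hypothetical, and the code assumes the sections appear in the same order as in the format shown above, with '#' starting a comment line.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Parses a non-negative integer; returns false if the text is not all digits.
bool ParseCount(const std::string& text, std::size_t& out) {
    const std::size_t end = text.find_last_not_of(" \t\r");
    if (end == std::string::npos) return false;
    out = 0;
    for (std::size_t i = 0; i <= end; ++i) {
        if (text[i] < '0' || text[i] > '9') return false;
        out = out * 10 + static_cast<std::size_t>(text[i] - '0');
    }
    return true;
}

// Returns an empty string when the counts are consistent, otherwise a short
// description of the first problem found. Assumes the section order of the
// format shown in the article; this is NOT how RGA itself reads the file.
std::string CheckGpsoCounts(const std::string& gpso_text) {
    // Collect the value lines: everything that is not blank and not a comment.
    std::vector<std::string> values;
    std::istringstream in(gpso_text);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos || line[first] == '#') continue;
        values.push_back(line.substr(first));
    }

    // Expected order: version, element count, N input layout rows,
    // topology type, render target count, RTVFormats row.
    if (values.size() < 2) return "file is too short";
    std::size_t num_elements = 0;
    if (!ParseCount(values[1], num_elements)) {
        return "InputLayoutNumElements is not a number";
    }

    // Input layout rows are the consecutive lines that start with '{'.
    std::size_t rows = 0;
    while (2 + rows < values.size() && values[2 + rows][0] == '{') ++rows;
    if (rows != num_elements) {
        return "InputLayoutNumElements does not match the InputLayout rows";
    }

    if (values.size() < 2 + rows + 3) return "missing sections after InputLayout";
    std::size_t num_targets = 0;
    if (!ParseCount(values[2 + rows + 1], num_targets)) {
        return "NumRenderTargets is not a number";
    }

    // Count the formats listed in the RTVFormats row.
    const std::string& rtv = values[2 + rows + 2];
    const std::string prefix = "DXGI_FORMAT_";
    std::size_t formats = 0;
    for (std::size_t pos = rtv.find(prefix); pos != std::string::npos;
         pos = rtv.find(prefix, pos + 1)) {
        ++formats;
    }
    if (formats != num_targets) return "NumRenderTargets does not match RTVFormats";
    return "";
}
```

Feeding it the contents of a .gpso file gives either an empty string or a hint about which count to fix, which is usually quicker than working backwards from a failed compilation.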

Example

In the following example, we use the D3D12HelloTriangle sample from Microsoft’s DirectX Graphics Samples. The pipeline has two very simple shaders, both defined in shaders.hlsl: VSMain is the vertex shader and PSMain is the pixel shader.

Let’s start by generating a template .gpso file:

rga -s dx12 --gpso-template C:\shaders\hellotriangle.gpso

Now, we will tweak the file’s contents to match our source code. Let’s have a look at D3D12HelloTriangle.cpp where we can find the input layout definition:



    // Define the vertex input layout.
    D3D12_INPUT_ELEMENT_DESC inputElementDescs[] =
    {
        { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
        { "COLOR", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, 12, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 }
    };

Let’s copy the two input layout lines under the InputLayout section and adjust the InputLayoutNumElements value to 2.
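If your own pipeline uses a different vertex layout, the AlignedByteOffset column is the value most likely to need recomputing. In the sample, COLOR sits at offset 12 because the POSITION element before it is three 32-bit floats (3 × 4 bytes). The sketch below illustrates that arithmetic for elements packed back to back in one input slot. The helper names FormatSizeBytes and PackedOffsets are hypothetical, and the size table covers only a few formats.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Byte size of one element for a handful of DXGI formats.
// Extend the table with the formats your layout uses.
std::size_t FormatSizeBytes(const std::string& format) {
    static const std::map<std::string, std::size_t> kSizes = {
        {"DXGI_FORMAT_R32_FLOAT", 4},
        {"DXGI_FORMAT_R32G32_FLOAT", 8},
        {"DXGI_FORMAT_R32G32B32_FLOAT", 12},
        {"DXGI_FORMAT_R32G32B32A32_FLOAT", 16},
        {"DXGI_FORMAT_R8G8B8A8_UNORM", 4},
    };
    const auto it = kSizes.find(format);
    return it == kSizes.end() ? 0 : it->second;  // 0 means "not in the table"
}

// Offsets for elements that are tightly packed in a single input slot:
// each element starts where the previous one ends.
std::vector<std::size_t> PackedOffsets(const std::vector<std::string>& formats) {
    std::vector<std::size_t> offsets;
    std::size_t next = 0;
    for (const std::string& format : formats) {
        offsets.push_back(next);
        next += FormatSizeBytes(format);
    }
    return offsets;
}
```

For the two formats of the sample, in order, PackedOffsets returns 0 and 12, matching the values in the input layout above.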

Now, another quick look at the .cpp file shows that there is a single render target with a format of DXGI_FORMAT_R8G8B8A8_UNORM:



    psoDesc.NumRenderTargets = 1;
    psoDesc.RTVFormats[0] = DXGI_FORMAT_R8G8B8A8_UNORM;

Let’s update the NumRenderTargets and RTVFormats sections accordingly, so that we end up with a .gpso file that looks like this:



    # schemaVersion
    1.0

    # InputLayoutNumElements: Number of D3D12_INPUT_ELEMENT_DESC elements in the D3D12_INPUT_LAYOUT_DESC structure.
    # Must match the following "InputLayout" section.
    2

    # InputLayout
    # { SemanticName, SemanticIndex, Format, InputSlot, AlignedByteOffset, InputSlotClass, InstanceDataStepRate }
    { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "COLOR", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, 12, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 }

    # PrimitiveTopologyType: The D3D12_PRIMITIVE_TOPOLOGY_TYPE value to be used when creating the PSO.
    D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE

    # NumRenderTargets: The number of formats in the upcoming RTVFormats section.
    1

    # RTVFormats: An array of DXGI_FORMAT-typed values for the render target formats.
    # The number of items in the array should match the above NumRenderTargets section.
    { DXGI_FORMAT_R8G8B8A8_UNORM }

All we have to do now is run the RGA command line tool with the following command:



rga -s dx12 --vs C:\shaders\shaders.hlsl --ps C:\shaders\shaders.hlsl --vs-model "vs_6_0" --ps-model "ps_6_0" --vs-entry VSMain --ps-entry PSMain --isa C:\output\isa.txt --rs-bin C:\RootSignatures\hellotriangle.rs.fxo --gpso C:\shaders\hellotriangle.gpso

Here, --rs-bin points to the pre-compiled root signature binary file. For more information about root signatures in RGA, see our previous article.

Since both the vertex and pixel shaders are defined in the same file and use the same shader model, we can use the --all-hlsl and --all-model options to make our command a bit less verbose:



rga -s dx12 --all-hlsl C:\shaders\shaders.hlsl --all-model "6_0" --vs-entry VSMain --ps-entry PSMain --isa C:\output\isa.txt --rs-bin C:\RootSignatures\hellotriangle.rs.fxo --gpso C:\shaders\hellotriangle.gpso
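If you drive RGA from a build script, the choice between the per-stage options and the --all-hlsl / --all-model shorthand can be made automatically. The sketch below shows one way to assemble the command string. StageInput and BuildRgaCommand are hypothetical names, only the options shown in this article are used, and paths are assumed to contain no spaces (otherwise they would need quoting).

```cpp
#include <string>

// Hypothetical description of one pipeline stage.
struct StageInput {
    std::string hlsl_file;  // e.g. C:\shaders\shaders.hlsl
    std::string model;      // version part only, e.g. "6_0"
    std::string entry;      // e.g. VSMain
};

// Builds the rga command line for a VS + PS pipeline. When both stages share
// the same file and shader model, the shorter --all-hlsl / --all-model form
// is used; otherwise each stage gets its own options.
std::string BuildRgaCommand(const StageInput& vs, const StageInput& ps,
                            const std::string& isa_out,
                            const std::string& rs_bin,
                            const std::string& gpso) {
    std::string cmd = "rga -s dx12";
    if (vs.hlsl_file == ps.hlsl_file && vs.model == ps.model) {
        cmd += " --all-hlsl " + vs.hlsl_file;
        cmd += " --all-model \"" + vs.model + "\"";
    } else {
        cmd += " --vs " + vs.hlsl_file + " --ps " + ps.hlsl_file;
        cmd += " --vs-model \"vs_" + vs.model + "\"";
        cmd += " --ps-model \"ps_" + ps.model + "\"";
    }
    cmd += " --vs-entry " + vs.entry + " --ps-entry " + ps.entry;
    cmd += " --isa " + isa_out;
    cmd += " --rs-bin " + rs_bin;
    cmd += " --gpso " + gpso;
    return cmd;
}
```

With the HelloTriangle inputs used in this example, where both stages live in shaders.hlsl and target model 6_0, the function produces the shorter of the two commands shown above.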

That’s it. After a successful build, we get the disassembly in the output folder:



    ; -------- Disassembly --------------------
    shader main
      asic(GFX10)
      type(PS)
      sgpr_count(6)
      vgpr_count(8)
      wave_size(64)
      s_inst_prefetch 0x0003                  // 000000000000: BFA00003
      s_mov_b32 m0, s2                        // 000000000004: BEFC0302
      v_interp_p1_f32 v2, v0, attr0.x         // 000000000008: C8080000
      v_interp_p1_f32 v3, v0, attr0.y         // 00000000000C: C80C0100
      v_interp_p1_f32 v4, v0, attr0.z         // 000000000010: C8100200
      v_interp_p1_f32 v0, v0, attr0.w         // 000000000014: C8000300
      v_interp_p2_f32 v2, v1, attr0.x         // 000000000018: C8090001
      v_interp_p2_f32 v3, v1, attr0.y         // 00000000001C: C80D0101
      v_interp_p2_f32 v4, v1, attr0.z         // 000000000020: C8110201
      v_interp_p2_f32 v0, v1, attr0.w         // 000000000024: C8010301
      v_cvt_pkrtz_f16_f32 v2, v2, v3          // 000000000028: 5E040702
      v_cvt_pkrtz_f16_f32 v3, v4, v0          // 00000000002C: 5E060104
      exp mrt0, v2, v2, v3, v3 done compr vm  // 000000000030: F8001C0F 00000302
      s_endpgm                                // 000000000038: BF810000
      s_code_end                              // 00000000003C: BF9F0000

In addition to the --isa option that generates the disassembly, you can use the -a option that generates the hardware resource usage statistics for each shader in the pipeline, or the --livereg option that creates a live VGPR analysis report based on the generated disassembly.
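Even without the -a option, the header of the disassembly shown earlier already carries a few register counts (sgpr_count, vgpr_count, wave_size) that a script can pick up, for example to flag a shader whose VGPR usage grows between builds. The following is a minimal sketch that reads such a field from the disassembly text. ReadHeaderField is a hypothetical helper, and the code assumes the name(value) header style seen in this example's output, which may differ for other targets or tool versions.

```cpp
#include <cstddef>
#include <string>

// Returns the integer inside "name(...)", e.g. 8 for "vgpr_count(8)".
// Returns -1 when the field is missing or its value is not a plain number.
int ReadHeaderField(const std::string& disassembly, const std::string& name) {
    const std::size_t start = disassembly.find(name + "(");
    if (start == std::string::npos) return -1;

    std::size_t pos = start + name.size() + 1;  // first character after '('
    int value = 0;
    bool any_digit = false;
    while (pos < disassembly.size() &&
           disassembly[pos] >= '0' && disassembly[pos] <= '9') {
        value = value * 10 + (disassembly[pos] - '0');
        any_digit = true;
        ++pos;
    }
    // The digits must be followed directly by the closing parenthesis.
    if (!any_digit || pos >= disassembly.size() || disassembly[pos] != ')') {
        return -1;
    }
    return value;
}
```

For the pixel shader above, ReadHeaderField(text, "vgpr_count") yields 8 and ReadHeaderField(text, "sgpr_count") yields 6, where text holds the contents of the generated disassembly file.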

For more information about the available options, run rga -s dx12 -h .

Acknowledgements

Code samples used herein are from Microsoft’s DirectX Graphics Samples and are © Microsoft 2015 and subject to the MIT License.