This is the second part of a tutorial series about creating a custom scriptable render pipeline. It covers the writing of shaders and drawing multiple objects efficiently.

This tutorial is made with Unity 2019.2.9f1.

Shaders

To draw something the CPU has to tell the GPU what to draw and how. What is drawn is usually a mesh. How it is drawn is defined by a shader, which is a set of instructions for the GPU. Besides the mesh, the shader needs additional information to do its work, including the object's transformation matrices and material properties.

Unity's LW/Universal and HD RPs allow you to design shaders with the Shader Graph package, which generates shader code for you. But our custom RP doesn't support that, so we have to write the shader code ourselves. This gives us full control over and understanding of what a shader does.

Unlit Shader

Our first shader will simply draw a mesh with a solid color, without any lighting. A shader asset can be created via one of the options in the Assets / Create / Shader menu. The Unlit Shader would be the most appropriate starting point, but we're going to start fresh, by deleting all the default code from the created shader file. Name the asset Unlit and put it in a new Shaders folder under Custom RP.

Unlit shader asset.

Shader code looks like C# code for the most part, but it consists of a mix of different approaches, including some archaic bits that made sense in the past but no longer do. The shader is defined like a class, but with just the Shader keyword followed by a string that is used to create an entry for it in the Shader dropdown menu of materials. Let's use Custom RP/Unlit. It's followed by a code block, which contains more blocks with keywords in front of them. There's a Properties block to define material properties, followed by a SubShader block that needs to have a Pass block inside it, which defines one way to render something. Create that structure with otherwise empty blocks.

```
Shader "Custom RP/Unlit" {

	Properties {}

	SubShader {

		Pass {}
	}
}
```

That defines a minimal shader that compiles and allows us to create a material that uses it.

Custom unlit material.

The default shader implementation renders the mesh solid white. The material shows a default property for the render queue, which it takes from the shader automatically and is set to 2000, the default for opaque geometry. It also has a toggle to enable double-sided global illumination, but that's not relevant for us.

HLSL Programs

The language that we use to write shader code is the High-Level Shading Language, HLSL for short. We have to put it in the Pass block, in between HLSLPROGRAM and ENDHLSL keywords. We have to do that because it's possible to put other non-HLSL code inside the Pass block as well.

```
		Pass {
			HLSLPROGRAM
			ENDHLSL
		}
```

What about CG programs? Unity also supports writing CG instead of HLSL programs, but we'll use HLSL exclusively, just like Unity's modern RPs.

To draw a mesh the GPU has to rasterize all its triangles, converting them to pixel data. It does this by transforming the vertex coordinates from 3D space to 2D visualization space and then filling all pixels that are covered by the resulting triangle. These two steps are controlled by separate shader programs, both of which we have to define. The first is known as the vertex kernel/program/shader and the second as the fragment kernel/program/shader. A fragment corresponds to a display pixel or texture texel, although it might not represent the final result, as it could be overwritten when something gets drawn on top of it later.

We have to identify both programs with a name, which is done via pragma directives. These are single-line statements beginning with #pragma, followed by either vertex or fragment plus the relevant name. We'll use UnlitPassVertex and UnlitPassFragment.

```
			HLSLPROGRAM
			#pragma vertex UnlitPassVertex
			#pragma fragment UnlitPassFragment
			ENDHLSL
```

What does pragma mean? The word pragma comes from Greek and refers to an action, or something that needs to be done. It's used in many programming languages to issue special compiler directives.

The shader compiler will now complain that it cannot find the declared shader kernels. We have to write HLSL functions with the same names to define their implementation. We could do this directly below the pragma directives, but we'll put all HLSL code in a separate file instead. Specifically, we'll use an UnlitPass.hlsl file in the same asset folder.
We can instruct the shader compiler to insert the contents of that file by adding an #include directive with the relative path to the file.

```
			HLSLPROGRAM
			#pragma vertex UnlitPassVertex
			#pragma fragment UnlitPassFragment
			#include "UnlitPass.hlsl"
			ENDHLSL
```

Unity doesn't have a convenient menu option to create an HLSL file, so you'll have to do something like duplicate the shader file, rename it to UnlitPass, change its file extension to hlsl externally, and clear its contents.

UnlitPass HLSL asset file.

Include Guard

HLSL files are used to group code just like C# classes, although HLSL doesn't have the concept of a class. There is only a single global scope, besides the local scopes of code blocks. So everything is accessible everywhere. Including a file is also not the same as using a namespace. It inserts the entire contents of the file at the point of the include directive, so if you include the same file more than once you'll get duplicate code, which will most likely lead to compiler errors. To prevent that we'll add an include guard to UnlitPass.hlsl.

It is possible to use the #define directive to define any identifier, which is usually done in uppercase. We'll use this to define CUSTOM_UNLIT_PASS_INCLUDED at the top of the file.

```
#define CUSTOM_UNLIT_PASS_INCLUDED
```

This is an example of a simple macro that just defines an identifier. If it exists then it means that our file has been included. So we don't want to include its contents again. Phrased differently, we only want to insert the code when it hasn't been defined yet. We can check that with the #ifndef directive. Do this before defining the macro.

```
#ifndef CUSTOM_UNLIT_PASS_INCLUDED
#define CUSTOM_UNLIT_PASS_INCLUDED
```

All code following the #ifndef will be skipped and thus won't get compiled if the macro has already been defined. We have to terminate its scope by adding an #endif directive at the end of the file.

```
#ifndef CUSTOM_UNLIT_PASS_INCLUDED
#define CUSTOM_UNLIT_PASS_INCLUDED

#endif
```

Now we can be sure that all relevant code of the file will never be inserted multiple times, even if we end up including it more than once.

Shader Functions

We define our shader functions inside the scope of the include guard. They're written just like C# methods, but without any access modifiers. Begin with simple void functions that do nothing.

```
#ifndef CUSTOM_UNLIT_PASS_INCLUDED
#define CUSTOM_UNLIT_PASS_INCLUDED

void UnlitPassVertex () {}

void UnlitPassFragment () {}

#endif
```

This is enough to get our shader to compile. The result will be a default cyan shader.

Cyan sphere.

We can change the color by making our fragment function return a different one. The color is defined with a four-component float4 vector containing its red, green, blue, and alpha components. We can define solid black via float4(0.0, 0.0, 0.0, 0.0), but we can also write a single zero, as single values automatically get expanded to a full vector. The alpha value doesn't matter because we're creating an opaque shader, so zero is fine.

```
float4 UnlitPassFragment () {
	return 0.0;
}
```

Why write 0.0 instead of just 0? It's to indicate that we mean a floating-point value and not an integer, although it makes no difference to the compiler.

Should we use float or half precision? Most mobile GPUs support both precision types, half being more efficient. So if you're optimizing for mobiles it makes sense to use half as much as possible. The rule of thumb is to use float for positions and texture coordinates only and half for everything else, provided that the results are good enough. When not targeting mobile platforms, precision isn't an issue, because the GPU always uses float, even if we write half. I'll consistently use float in this tutorial series. There's also the fixed type, but it's only really supported by old hardware that you wouldn't target for modern apps. It's usually equivalent to half.

At this point the shader compiler will fail because our function is missing semantics. We have to indicate what we mean with the value that we return, because we could potentially produce lots of data with different meanings.
In this case we provide the default system value for the render target, indicated by writing a colon followed by SV_TARGET after the parameter list of UnlitPassFragment.

```
float4 UnlitPassFragment () : SV_TARGET {
	return 0.0;
}
```

UnlitPassVertex is responsible for transforming vertex positions, so it should return a position. That's also a float4 vector, because it must be defined as a homogeneous clip space position, but we'll get to that later. Again we begin with the zero vector, and in this case we have to indicate that its meaning is SV_POSITION.

```
float4 UnlitPassVertex () : SV_POSITION {
	return 0.0;
}
```

Space Transformation

When all vertices are set to zero the mesh collapses to a point and nothing gets rendered. The main job of the vertex function is to convert the original vertex position to the correct space. When invoked, the function is provided with the available vertex data, if we ask for it. We do that by adding parameters to UnlitPassVertex. We need the vertex position, which is defined in object space, so we'll name it positionOS, using the same convention as Unity's new RPs. The position's type is float3, because it's a 3D point. Let's initially return it, adding 1 as the fourth required component via float4(positionOS, 1.0).

```
float4 UnlitPassVertex (float3 positionOS) : SV_POSITION {
	return float4(positionOS, 1.0);
}
```

Isn't the vertex position a float4? Often points in 3D space are defined with 4D vectors, with their fourth component set to 1, while direction vectors have it set to zero instead. That makes it possible to transform both positions and directions correctly with the same transformation matrix. However, this technique is only needed when positions and directions are mixed, which is usually never the case. Instead, different code is used for rotation transformations that require fewer calculations. Positions are originally 3D vectors but automatically get expanded to 4D vectors with the fourth component set to 1. So we could define the position as float4, but it is not needed. This behavior also applies to other input data. Specifically, missing XYZ values are set to zero and W always gets set to 1.

We also have to add semantics to the input, because vertex data can contain more than just a position. We need POSITION in this case, added with a colon directly after the parameter name.

```
float4 UnlitPassVertex (float3 positionOS : POSITION) : SV_POSITION {
	return float4(positionOS, 1.0);
}
```

Using object-space position.

The mesh shows up again, but incorrectly, because the position that we output is in the wrong space.
Space conversion requires matrices, which are sent to the GPU when something gets drawn. We have to add these matrices to our shader, but because they're always the same we'll put the standard input provided by Unity in a separate HLSL file, both to keep code structured and to be able to include the code in other shaders. Add a UnityInput.hlsl file and put it in a ShaderLibrary folder directly under Custom RP, to mirror the folder structure of Unity's RPs.

ShaderLibrary folder with UnityInput file.

Begin the file with a CUSTOM_UNITY_INPUT_INCLUDED include guard and then define a float4x4 matrix named unity_ObjectToWorld in the global scope. In a C# class this would define a field, but here it's known as a uniform value. It's set by the GPU once per draw, remaining constant—uniform—for all invocations of the vertex and fragment functions during that draw.

```
#ifndef CUSTOM_UNITY_INPUT_INCLUDED
#define CUSTOM_UNITY_INPUT_INCLUDED

float4x4 unity_ObjectToWorld;

#endif
```

We can use the matrix to convert from object space to world space. As this is common functionality, let's create a function for it and put it in yet another file, this time Common.hlsl in the same ShaderLibrary folder. We include UnityInput there and then declare a TransformObjectToWorld function, with a float3 as both input and output.

```
#ifndef CUSTOM_COMMON_INCLUDED
#define CUSTOM_COMMON_INCLUDED

#include "UnityInput.hlsl"

float3 TransformObjectToWorld (float3 positionOS) {
	return 0.0;
}

#endif
```

The space conversion is done by invoking the mul function with a matrix and a vector. In this case we do need a 4D vector, but as its fourth component is always 1 we can add it ourselves by using float4(positionOS, 1.0). The result is again a 4D vector, with always 1 as its fourth component. We can extract the first three components from it by accessing the xyz property of the vector, which is known as a swizzle operation.
```
float3 TransformObjectToWorld (float3 positionOS) {
	return mul(unity_ObjectToWorld, float4(positionOS, 1.0)).xyz;
}
```

We can now convert to world space in UnlitPassVertex. First include Common.hlsl directly above the function. As it exists in a different folder we can reach it via the relative path ../ShaderLibrary/Common.hlsl. Then use TransformObjectToWorld to calculate a positionWS variable and return it instead of the object-space position.

```
#include "../ShaderLibrary/Common.hlsl"

float4 UnlitPassVertex (float3 positionOS : POSITION) : SV_POSITION {
	float3 positionWS = TransformObjectToWorld(positionOS.xyz);
	return float4(positionWS, 1.0);
}
```

The result is still wrong, because we need a position in homogeneous clip space. That space defines a cube containing everything that is in view of the camera, distorted into a trapezoid in the case of a perspective camera. Transforming from world space to this space can be done by multiplying with the view-projection matrix, which accounts for the camera's position, orientation, projection, field-of-view, and near-far clipping planes. It's made available via the unity_MatrixVP matrix, so add it to UnityInput.hlsl.

```
float4x4 unity_ObjectToWorld;

float4x4 unity_MatrixVP;
```

Add a TransformWorldToHClip function to Common.hlsl, which works the same as TransformObjectToWorld, except that its input is in world space, it uses the other matrix, and it produces a float4.

```
float3 TransformObjectToWorld (float3 positionOS) {
	return mul(unity_ObjectToWorld, float4(positionOS, 1.0)).xyz;
}

float4 TransformWorldToHClip (float3 positionWS) {
	return mul(unity_MatrixVP, float4(positionWS, 1.0));
}
```

Have UnlitPassVertex use that function to return the position in the correct space.

```
float4 UnlitPassVertex (float3 positionOS : POSITION) : SV_POSITION {
	float3 positionWS = TransformObjectToWorld(positionOS.xyz);
	return TransformWorldToHClip(positionWS);
}
```

Correct black sphere.

Core Library

The two functions that we just defined are so common that they're also included in the Core RP Pipeline package. The core library defines many more useful and essential things, so let's install that package, remove our own definitions, and instead include the relevant file, in this case Packages/com.unity.render-pipelines.core/ShaderLibrary/SpaceTransforms.hlsl.

```
//float3 TransformObjectToWorld (float3 positionOS) {
//	return mul(unity_ObjectToWorld, float4(positionOS, 1.0)).xyz;
//}

//float4 TransformWorldToHClip (float3 positionWS) {
//	return mul(unity_MatrixVP, float4(positionWS, 1.0));
//}

#include "Packages/com.unity.render-pipelines.core/ShaderLibrary/SpaceTransforms.hlsl"
```

That fails to compile, because the code in SpaceTransforms.hlsl doesn't assume the existence of unity_ObjectToWorld. Instead, it expects that the relevant matrix is defined as UNITY_MATRIX_M by a macro, so let's do that before including the file, by writing #define UNITY_MATRIX_M unity_ObjectToWorld on a separate line. After that, all occurrences of UNITY_MATRIX_M will get replaced by unity_ObjectToWorld. There's a reason for this that we'll discover later.

```
#define UNITY_MATRIX_M unity_ObjectToWorld

#include "Packages/com.unity.render-pipelines.core/ShaderLibrary/SpaceTransforms.hlsl"
```

The same is true for the inverse matrix, unity_WorldToObject, which should be defined via UNITY_MATRIX_I_M, the unity_MatrixV matrix via UNITY_MATRIX_V, and unity_MatrixVP via UNITY_MATRIX_VP. Finally, there's also the projection matrix, defined via UNITY_MATRIX_P, which is made available as glstate_matrix_projection. We don't need these extra matrices, but the code won't compile if we don't include them.

```
#define UNITY_MATRIX_M unity_ObjectToWorld
#define UNITY_MATRIX_I_M unity_WorldToObject
#define UNITY_MATRIX_V unity_MatrixV
#define UNITY_MATRIX_VP unity_MatrixVP
#define UNITY_MATRIX_P glstate_matrix_projection
```

Add the extra matrices to UnityInput as well.
```
float4x4 unity_ObjectToWorld;
float4x4 unity_WorldToObject;

float4x4 unity_MatrixV;
float4x4 unity_MatrixVP;
float4x4 glstate_matrix_projection;
```

The last thing missing is something other than a matrix: unity_WorldTransformParams, which contains some transform information that we again don't need here. It is a vector defined as real4, which isn't a valid type itself but an alias for either float4 or half4, depending on the target platform.

```
float4x4 unity_ObjectToWorld;
float4x4 unity_WorldToObject;
real4 unity_WorldTransformParams;
```

That alias and a lot of other basic macros are defined per graphics API, and we can get all that by including Packages/com.unity.render-pipelines.core/ShaderLibrary/Common.hlsl. Do so in our Common.hlsl file, before including UnityInput.hlsl. You can inspect those files in the imported package if you're curious about their contents.

```
#include "Packages/com.unity.render-pipelines.core/ShaderLibrary/Common.hlsl"
#include "UnityInput.hlsl"
```