Hopefully you’ve seen the 2018 Graphics post in the Unity Blog, and if not, go look at it now. SRP obviously has me very excited at the potential because a majority of my day-to-day frustrations come from hitting annoying render bugs or quirks with the built-in pipeline which prevent us from doing cool stuff, or at least make it very painful. A CBuffer not correctly populated at a Camera Event no longer requires hours to days of workarounds.

I’ve been keeping a close eye on the progress with this since I first heard about it during a Unity office visit a GDC-or-two ago. Their GitHub for it has been and continues to be very active, though if you don’t have access to newer-than-bleeding-edge Unity pre-alphas, chances are you can’t really use more than a 2018.1 beta branch. On the upside, they’re updating things very frequently now, which means they’re very likely approaching the full release of 2018.1.

I’ve been peeking at it quite frequently since late last year (2017.3 beta) and have been checking out some other interesting implementations like Keijiro’s to get the gist.

I’m keeping all my experimentation in a GitHub Project with an MIT license; feel free to use anything I’ve posted. Much credit goes to Google and Valve for their initial works.

Disclaimer: As I’m mostly using this blog to record my thoughts as I work through things, some blurbs will more than likely seem disparate as my stream of consciousness flows.

Assets Credits: Free HDRI maps 2.0 – ProAssets

Also I’m too lazy to edit things.

SRP at a Glance

Unity’s Blog has a good overview of the basic setup, so I’ll only retread what little bits I need to.

The Basics

First, there exists a ScriptableObject derived class called RenderPipelineAsset which serves as the Editor handle for your SRP. It’s here that you’d implement some Inspector code for modifying settings of your pipeline. I’d imagine this can work in harmony with some of Unity’s Quality and Graphics Settings, as certain controls in those settings windows get hidden when an SRP is in use. I’m unsure whether what gets hidden is configurable at this time or, if not, whether that is planned for future iterations.

The RenderPipelineAsset will then create a RenderPipeline derived class that is the meat of your SRP. This is what executes the actual render loop by way of the Render function.

public override void Render(ScriptableRenderContext context, Camera[] cameras)

There also exists something called a ScriptableRenderContext. This is kind of like a CommandBuffer of CommandBuffers, since they both function by way of delayed execution. Again, all of this is explained better in Unity’s overview of SRP. But basically, you’re going to do something like this with varying complexity:

    foreach Camera
        set up render targets
        do culling
        set up scaffolding for a Pass   // populate constant buffers, etc.
        build your display list         // Renderers to be drawn in Pass
        submit your context             // actually execute your loop

There are built-in data structures for things like querying culling results, which makes sense because Unity already implements Frustum and Occlusion culling. There doesn’t appear to be a way to significantly extend the culling process at this time, though there is a ScriptableCullingParameters struct that provides for some fine-tuning that would otherwise just come from the Camera. I’d be curious as to how much trouble we’d have here for projects that require oblique or off-axis projection.

The CullResults are going to give you info like Visible Lights and Reflection Probes, as well as the visible Renderers you’ll need for your context’s DrawRenderers command.

Executing Passes is essentially just a DrawRenderers command after setting up which Shader Pass (by name) to execute. Since this is done by string, I can see a lot of headaches the more complex things get. My intuition here is to have constants handle this to reduce issues from renames or ye olde typo.
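A rough sketch of what I mean by constants, in plain C# so it runs outside Unity. The class name and the pass names here are placeholders I made up for illustration; inside the pipeline each string would be wrapped in a ShaderPassName once rather than retyped at every call site.

```csharp
using System;

// Hypothetical central registry of Shader Pass names. Call sites refer to
// PassNames.Transparent etc., so a mistyped reference fails to compile
// instead of silently drawing nothing.
public static class PassNames
{
    public const string Basic = "Basic";
    public const string ZPrime = "ZPrime";
    public const string Transparent = "Transparent";

    // Every pass this (hypothetical) render loop knows about.
    public static readonly string[] All = { Basic, ZPrime, Transparent };

    // True if the given name is a pass this pipeline draws.
    public static bool IsKnown(string passName)
    {
        return Array.IndexOf(All, passName) >= 0;
    }
}
```

A check like IsKnown could also run in an Editor validation step to flag Shaders whose pass names have drifted from the pipeline.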

Unity has a ShaderPassName struct which seems to serve as an identifier for a given Pass, although it is still created by string name.

There’s a FilterRenderersSettings struct that exists as a means of further filtering your visible Renderers. It gives you 3 options: renderQueueRange (e.g. 2000-2449 for Geometry), layerMask (like the Camera’s culling layers), and a uint renderingLayerMask. I cannot find any example of that last one at this time, but it appears to just be an arbitrary value which can be assigned to a Renderer. The Inspector for Renderer just visualizes it as an enum flag dropdown.

At first glance, it appeared like that was basically it for filtering our Renderers. One thing I was expecting was some sort of scripted heuristic to do some further customization, such as whether or not Receive Shadows is enabled. The way Unity does this in their current built-in pipeline is by way of switching Shader Variants. In SRP, things seem to be geared much more towards Passes. Since you get all Renderers with a Material that have a given ShaderPass, it stands to reason that Passes are the key groupings of the type of things you are drawing at that instant.

Diving In

I’ve played around with the examples in the SRP Git Project and saw some really interesting features. Unfortunately, without much existing material at the moment to stress test them, I’m kind of at a loss of where to go with what they’ve done so far. So I think I’m going to start implementing my own pipeline as an opportunity to gain a solid understanding of its limitations and strengths.

What To Build?

In an attempt not to get too complicated too fast, I want to build something around rigid constraints. Mobile VR is a great place to look because of how important performance is, so the constraints are everywhere. I’ve worked on several products for this platform since the release of Gear VR and Google Daydream, so I feel like I have a fairly good understanding of what the hardware is capable of and where the majority of effort goes when optimizing.

My general guidelines will be based off of the Daydream Renderer which piggybacks off of Unity’s built-in Legacy Vertex-Lit Passes (e.g. Vertex, VertexLM). This gives us up to 8 real-time per-pixel Lights that support Normal Maps, with their Shaders and a custom real-time shadow system for ‘hero’ objects. There is also the custom shadow system in Valve’s Lab Renderer. These both work, but personally I haven’t found them to be very intuitive, likely because they are inherently a workaround of Unity’s built-in Renderer. Since I’ll be defining the Render loop directly now, I’d like to try and have something that feels more seamless to the naive content creator.

Another reference I have is Valve’s Half-Life 2 Basis, which is a highly efficient means of normal mapping from multiple light sources uniformly. The Daydream Renderer uses this implementation as well.

Key Features

I’ll end up modifying this as time goes by, but here is what I’m thinking of so far:

Multiple pixel Lights in a single Pass

Support Normal maps!

Shadow Maps

Lightmaps

Configurable Shader

Basic Lighting

In SRP, Lights come into the Render loop through the CullResults object in the form of a List<VisibleLight>. What determines whether a Light is visible? Well, it seems that a Light will be culled if its influence does not fall within the Camera’s frustum, which makes total sense.

It stands to reason here that Directional Lights will always be ‘visible’ since their influence is global.

At the point where the VisibleLight collection is available, these will be in World Space, and in no particular order. It’s up to you, the implementer, to sort and/or filter these Lights by some heuristic. In my case, I’m sorting the Lights from closest to farthest by their squared distance to the Camera, which gives the same order as true distance without the square root. Next, I’ll use the rendering Camera’s View Matrix to transform each Light’s position before packing it into a Vector Array for use in my Shaders.
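Here is a minimal sketch of that sort-then-transform step. It uses System.Numerics instead of Unity’s math types so it can run standalone, and it works on bare positions rather than VisibleLight; the class and method names are mine, not part of SRP. This version uses a lambda and allocates, so treat it as an illustration of the idea rather than something tuned for a per-frame loop.

```csharp
using System;
using System.Numerics;

public static class LightSorting
{
    // Sorts world-space light positions in place, nearest to the camera first.
    // Squared distance preserves the ordering and avoids a square root per Light.
    public static void SortByDistance(Vector3[] lights, Vector3 cameraPos)
    {
        Array.Sort(lights, (a, b) =>
            Vector3.DistanceSquared(a, cameraPos)
                .CompareTo(Vector3.DistanceSquared(b, cameraPos)));
    }

    // Transforms each position by the view matrix and packs it as xyz, w = 1,
    // ready to be uploaded as a vector array.
    public static Vector4[] ToViewSpace(Vector3[] lights, Matrix4x4 view)
    {
        var packed = new Vector4[lights.Length];
        for (int i = 0; i < lights.Length; i++)
        {
            packed[i] = new Vector4(Vector3.Transform(lights[i], view), 1f);
        }
        return packed;
    }
}
```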

Some things the Shader will need to know about each Light:

Light Type – Directional, Point, Spot

Position/Direction

Attenuation

Colour

From here it was easy to follow the Valve document to implement Radiosity Normal Mapping. The lighting is calculated per-vertex for speed, which also means a fair amount of tessellation is required to hide artifacts from interpolation and low precision.
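For reference, the core of the Half-Life 2 basis can be sketched in a few lines. The three basis directions are the ones from Valve’s notes; the weighting (clamped dot products, normalized to sum to one) is just one common variant and not necessarily what my Shader or the Daydream Renderer does. The class and method names are made up for this example.

```csharp
using System;
using System.Numerics;

public static class Hl2Basis
{
    // The three tangent-space basis directions from Valve's Half-Life 2
    // shading notes. They are unit length and mutually orthogonal.
    public static readonly Vector3[] Basis =
    {
        new Vector3((float)Math.Sqrt(2.0 / 3.0), 0f, (float)(1.0 / Math.Sqrt(3.0))),
        new Vector3((float)(-1.0 / Math.Sqrt(6.0)), (float)(1.0 / Math.Sqrt(2.0)), (float)(1.0 / Math.Sqrt(3.0))),
        new Vector3((float)(-1.0 / Math.Sqrt(6.0)), (float)(-1.0 / Math.Sqrt(2.0)), (float)(1.0 / Math.Sqrt(3.0))),
    };

    // Blends three per-basis light colours using the tangent-space normal.
    // Weights are clamped dot products, normalized so they sum to one.
    public static Vector3 Shade(Vector3 normalTS, Vector3 c0, Vector3 c1, Vector3 c2)
    {
        float w0 = Math.Max(0f, Vector3.Dot(normalTS, Basis[0]));
        float w1 = Math.Max(0f, Vector3.Dot(normalTS, Basis[1]));
        float w2 = Math.Max(0f, Vector3.Dot(normalTS, Basis[2]));
        float sum = w0 + w1 + w2;
        if (sum <= 0f) return Vector3.Zero;
        return (c0 * w0 + c1 * w1 + c2 * w2) / sum;
    }
}
```

With a flat normal of (0, 0, 1) all three weights are equal, so the result is simply the average of the three colours.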

I’ve also taken the initiative to reduce ALU cost of some calculations by using some neat tricks for Normals that can be found here and here, with a few slight modifications for context, of course.

Adding an option to do these calculations per-fragment will be trivial, so I’ll likely revisit this later.

With diffuse and specular terms handled, I moved on to adding in ambient lighting. This surprisingly gave me a little bit of trouble, as I was expecting Unity’s default ambient properties to be available. I guess it makes sense, but it does show that there is some inconsistency as to what Unity is going to continue to handle and what they’ve handed off to us.

Okay, fine. I’ll push a CBuffer for ambient lighting, no big deal, only it looks like the Lighting Editor is completely ignored. RenderSettings doesn’t appear to be updated, either. I double-checked against the Unity examples, which all had the same result (or lack thereof). So here I decided to just roll a Component to deal with it for now. C’est la vie.

There are a couple reasons why I hate this solution though, the main being that there is a whole Lighting Editor window, that Content Creators are used to, that will effectively be rendered useless. This will likely cause confusion, and I’ll surely receive bug reports for the Unity Lighting Editor ‘not working’. Perhaps this could be circumvented with a little Editor magic, but that is something to tackle later.

To factor in ambient lighting, I opted to do this per-vertex just like the diffuse. In the HL2 document, they outline an ambient cube, but I decided to go a different route with a tri-point gradient since Unity has used that method.
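A tri-point gradient boils down to picking between sky, equator, and ground colours based on how far the normal points up or down. The sketch here assumes a simple linear blend on the normal’s Y component, which is my simplification for illustration and not necessarily how Unity evaluates its gradient ambient mode. It uses System.Numerics so it runs outside Unity.

```csharp
using System;
using System.Numerics;

public static class GradientAmbient
{
    // Evaluates a sky/equator/ground gradient for a world-space unit normal.
    // Up-facing normals blend equator -> sky, down-facing blend equator -> ground.
    public static Vector3 Evaluate(Vector3 normalWS, Vector3 sky, Vector3 equator, Vector3 ground)
    {
        // Clamp in case the interpolated normal is slightly off unit length.
        float up = Math.Max(-1f, Math.Min(1f, normalWS.Y));
        return up >= 0f
            ? Vector3.Lerp(equator, sky, up)
            : Vector3.Lerp(equator, ground, -up);
    }
}
```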

I’m fairly happy with the results.

Fog

Fog is going to be controlled through the same controls as the ambient data for now. I’m only going to bother to support linear fog, but exponential and exp2 will fit into the Vector4 I’m passing into the CBuffer when I get around to it.
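To show how all three fog modes can share one Vector4, here is a sketch modelled on the layout the built-in pipeline documents for unity_FogParams. Whether my own CBuffer ends up with exactly this layout is an assumption; the class and method names are made up for the example.

```csharp
using System;
using System.Numerics;

public static class FogParams
{
    // Packs the parameters for all three fog modes into one Vector4:
    //   x = density / sqrt(ln 2)   (exp2)
    //   y = density / ln 2         (exp)
    //   z = -1 / (end - start)     (linear)
    //   w = end / (end - start)    (linear)
    public static Vector4 Pack(float density, float start, float end)
    {
        float ln2 = (float)Math.Log(2.0);
        float range = end - start;
        return new Vector4(
            density / (float)Math.Sqrt(ln2),
            density / ln2,
            -1f / range,
            end / range);
    }

    // Linear fog factor: 1 at 'start' (no fog), 0 at 'end' (fully fogged).
    public static float LinearFactor(Vector4 p, float distance)
    {
        float f = distance * p.Z + p.W;
        return Math.Max(0f, Math.Min(1f, f));
    }
}
```

In the Shader, the linear case is then a single multiply-add followed by a saturate, with the final colour being a lerp from fog colour to surface colour by that factor.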

Lightmaps

Precomputed lighting will be very important for maintaining a good balance of visuals and performance, so that’s up next. Unity does basically all the work for us here, so the only thing we need to do in the render loop is to enable Lightmaps. You do this through RendererConfiguration. There are a number of flags we can set here, but for now I’m just going to set the RendererConfiguration.PerObjectLightmaps flag to enable per-Object Lightmaps, which will behave just the way Unity has handled them, by enabling the LIGHTMAP_ON keyword in the Shader.

settings.rendererConfiguration = RendererConfiguration.PerObjectLightmaps;

This solution is the bare minimum to getting the Lightmapper working with this SRP. I could (and eventually will) go much deeper by overriding the ‘Meta’ Pass in my Shader to have more control over what is being written into the Lightmap, but for now I don’t see the need.

Mixed Lighting

Blending Lightmaps and Dynamic lights at this stage is fairly trivial. I’ll simply use the alpha channel of each Light’s color and either pass a 0 or 1, multiplying out any contribution to the diffuse term.

    lightColor.a = vl.light.bakingOutput.lightmapBakeType == LightmapBakeType.Mixed ? 0f : 1f;
    lightColors[lightCount] = lightColor;

    float4 lightColor = LIGHT_COLOR(index);
    #if defined(LIGHTING_MIXED)
        lightColor.rgb *= lightColor.a;
    #endif

I’m setting up a separate Pass to handle the mixed lighting to be rendered separately from fully dynamic objects. This way, I’ll be able to process the lighting differently as an optimization.

I’ll check whether the Renderer’s Lightmap Index is greater than the default of -1 and, if so, set a renderingLayerMask flag on it to be used with the FilterRenderersSettings.

    filterSettings = new FilterRenderersSettings(true)
    {
        renderQueueRange = RenderQueueRange.opaque,
        layerMask = camera.cullingMask,
        renderingLayerMask = ShaderLib.RenderLayers.BakedLightmaps
    };

Note that these renderingLayerMask values are actually just an unsigned integer which can be thought of as a 32-bit mask. This is a nice addition because the 24 Layers Unity had prior to this needed to be shared across the Physics Engine and Cameras. That means we’ll be able to define up to 32 unique Layers in our pipeline for whatever we want, without worrying about running into conflicts.
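Treating the mask as plain bit flags looks something like the following. The enum members other than BakedLightmaps, and the helper class itself, are hypothetical names for illustration; in the pipeline the resulting uint would be written to the Renderer’s renderingLayerMask.

```csharp
using System;

// Hypothetical names for bits of the 32-bit rendering layer mask.
[Flags]
public enum RenderLayers : uint
{
    None = 0,
    BakedLightmaps = 1u << 0,
    ShadowCaster = 1u << 1,
    Reflective = 1u << 2,
}

public static class LayerMaskUtil
{
    // Sets the given bits, leaving all others untouched.
    public static uint Add(uint mask, RenderLayers flags)
    {
        return mask | (uint)flags;
    }

    // Clears the given bits, leaving all others untouched.
    public static uint Remove(uint mask, RenderLayers flags)
    {
        return mask & ~(uint)flags;
    }

    // True only if every requested bit is set.
    public static bool Has(uint mask, RenderLayers flags)
    {
        return (mask & (uint)flags) == (uint)flags;
    }
}
```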

Light/Reflection Probes

Adding probes is just as simple as piping some flags together in our RendererConfiguration.

    settings.rendererConfiguration =
        RendererConfiguration.PerObjectLightmaps |
        RendererConfiguration.PerObjectLightProbe |
        RendererConfiguration.PerObjectReflectionProbes;

There do exist flags for defining these Probe indices manually, which appear to be in use in the HD Render Pipeline. I’ll look into refining this in the future but for now I will allow for an override that can take a manually defined Cubemap.

Light Probes are only factored in with Dynamically lit objects, so I’m wrapping them into lights per-vertex with the ambient contribution. This means that they will be subject to the Radiosity Normal Mapping as well, but this is just for lack of Interpolators.

Shadows

On to real-time shadow maps. Unity’s ScriptableRenderContext has a DrawShadows API which will work very similarly to DrawRenderers. But since Shadow mapping is expensive, especially in the case of Mobile VR, I’m going to forgo the built-in method and build something completely custom.

For this case, I’m going to blend the Valve Lab and Daydream techniques, make a shadow atlas, but limit the samples to keep fragment costs low.

In the case of some of the Snapdragon 820-series devices (Pixel, Pixel XL), the Bus is quite slow so texture sampler latency can be a real performance killer. This sort of thing won’t be obvious, either, unless you dig in with a Graphics Debugger. I’m unsure at this time if the 830-series made drastic changes to the hardware architecture to remedy this.

I’m limiting the number of Shadow Casters to 4, which will easily translate to 4 quadrants of a Shadow Atlas Texture and can be packed into 2 Texture Interpolators. Additionally, I will make the concession that there can only be a single shadow-casting Light, which in the render loop will be the first in the sorted list (i.e. closest) that has shadows enabled. Since I’m implementing a custom Shadow filter, it doesn’t matter whether Hard or Soft is selected.
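Mapping a caster into its quadrant is just a scale and offset on its shadow UVs. This sketch assumes a 2x2 layout with slot 0 at the bottom-left; the slot order and the names are my own choices for illustration.

```csharp
using System.Numerics;

public static class ShadowAtlas
{
    // Returns (scaleX, scaleY, offsetX, offsetY) mapping a caster's own
    // 0..1 shadow UVs into its quadrant of a 2x2 atlas.
    // Slot layout (assumed): 0 = bottom-left, 1 = bottom-right,
    // 2 = top-left, 3 = top-right.
    public static Vector4 QuadrantScaleOffset(int slot)
    {
        float offsetX = (slot % 2) * 0.5f;
        float offsetY = (slot / 2) * 0.5f;
        return new Vector4(0.5f, 0.5f, offsetX, offsetY);
    }

    // Applies the scale/offset to a UV inside the caster's own projection.
    public static Vector2 ToAtlasUV(Vector2 uv, Vector4 scaleOffset)
    {
        return new Vector2(
            uv.X * scaleOffset.X + scaleOffset.Z,
            uv.Y * scaleOffset.Y + scaleOffset.W);
    }
}
```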

The Daydream Renderer has a clever implementation for their shadows which is very low cost. I’m going to copy what I can to minimize shadow operations but account for my modifications like the atlas and a shadow mask. This isn’t Unity’s Shadow Mask, however, it is a flag which I will use to mask out the shadow from leaving artifacts due to depth imprecision in the 16-bit Shadow Texture. Since there can only be 4 shadows, the only possible values for the mask are 1, 2, 4, and 8. The mask value will be passed into the caster’s Renderer via a MaterialPropertyBlock and evaluated in the Shader; if the shadow that is currently being shaded has the same flag as the fragment, it fails the bitwise AND and is ignored. No more self-shadow artifacts.
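The mask test itself is tiny. Written out in C# (the real thing lives in the Shader), and assuming slot i of the atlas is assigned bit 1 << i, it looks like this; the class and method names are placeholders.

```csharp
public static class ShadowMask
{
    public const int MaxShadows = 4;

    // Bit assigned to the caster in atlas slot 0..3: 1, 2, 4 or 8.
    public static uint BitForSlot(int slot)
    {
        return 1u << slot;
    }

    // A fragment skips the shadow in 'slot' when its own mask carries that
    // slot's bit, i.e. when it belongs to the caster that drew the shadow.
    public static bool ShouldSample(uint fragmentMask, int slot)
    {
        return (fragmentMask & BitForSlot(slot)) == 0u;
    }
}
```

A Renderer that casts no shadow carries a mask of 0 and therefore samples every slot.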

Since it doesn’t look like there is a good way to get select Renderers in the Render loop, I won’t be using Rendering.ShadowCastingMode to denote which Renderers write to the Shadow Maps, but instead will rely on a custom ShadowCaster Component. I hope that as Unity iterates on SRP it becomes easier to leverage or even repurpose their built-in Components. For now I’ll have to hide some of the options with custom Editor work. I’m going to put down those thoughts in another section later on.

To fit all 4 shadows, I need 3 full interpolators. I can pack the x,y coordinates for all projections into the first two mapping to .xy and .zw appropriately, and the normalized z (accounting for perspective) into the third.
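Spelled out, the packing is the following. This is a CPU-side C# illustration of what the vertex Shader would write into its interpolators; the names are made up, and it assumes exactly four projected coordinates.

```csharp
using System.Numerics;

public static class ShadowPacking
{
    // Packs four projected shadow coordinates (x, y, normalized depth)
    // into three Vector4 "interpolators":
    //   xy01  = (s0.x, s0.y, s1.x, s1.y)
    //   xy23  = (s2.x, s2.y, s3.x, s3.y)
    //   depth = (s0.z, s1.z, s2.z, s3.z)
    // 'coords' must hold exactly four entries.
    public static void Pack(Vector3[] coords, out Vector4 xy01, out Vector4 xy23, out Vector4 depth)
    {
        xy01 = new Vector4(coords[0].X, coords[0].Y, coords[1].X, coords[1].Y);
        xy23 = new Vector4(coords[2].X, coords[2].Y, coords[3].X, coords[3].Y);
        depth = new Vector4(coords[0].Z, coords[1].Z, coords[2].Z, coords[3].Z);
    }
}
```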

There currently exist some issues with the Spot implementation using this method, which I’ll aim to address later. I’m wondering if doing an off-axis projection to take a ‘slice’ of the view cone that intersects with each Renderer’s AABB will work?

Transparency

For now, I’m only supporting a very minimalist transparent queue. I’ll wrap this in a special Pass called ‘Transparent’. The Render Loop will execute these Shader Passes only when the Material has the right Shader Pass and the correct RenderQueue (Transparent/3000+). This makes supporting a Transparent version of my Basic Shader very trivial to implement.

I also quickly made a very simple ‘Transparent’ Shader which is basically just an Unlit, Textured Shader with configurable Blend modes and ZTest.

Since both the Basic and Transparent Shaders have a ‘Transparent’ Pass, they’ll both be drawn at the same point in the Render loop.

Note that I don’t have to do a DisableShaderPass here. Since it’s all gated by the Transparent RenderQueue, I’ll avoid any unnecessary draw calls. This is distinctly different from the Reflective Passes, which could potentially draw multiple times since they’d pass through multiple filters.

Supporting fog for transparent objects was trivial, however one thing that always bugs me about the singular fog colour is that it fails where the BlendMode doesn’t agree with it. To alleviate this, fog colour can be overridden on a per-Material basis by way of a variant.

Things like Blend DstColor Zero (Multiply) will appear to fade correctly over distance and blend in with the fog.

Another thing that I’ve added was the ability to selectively ZPrime any transparent geometry on a per-Material basis.

    settings = new DrawRendererSettings(camera, ShaderLib.Passes.ZPrime);
    settings.SetShaderPassName(
        ShaderLib.Passes.TRANSPARENT_PASS_INDEX,
        ShaderLib.Passes.Transparent);
    settings.sorting.flags = SortFlags.CommonTransparent;
    filterSettings.renderQueueRange = RenderQueueRange.transparent;
    context.DrawRenderers(cull.visibleRenderers, ref settings, filterSettings);

Optimizations

Some low-hanging optimizations here are to simply ‘massage’ my Render order a little so the more expensive fragment Shaders execute last. On chips with early Z culling/discard/reject, this can be a huge boost in the opaque queue. I’ll be able to do some of this through Passes, having certain Material options toggle Passes On/Off. I’ve done this already with the Reflective option so that Cubemapped geometry executes later than others.

Other future work on this should include options for no Normal Maps and Diffuse-only Materials.

Downsampling

A quick addition was simply rendering to a lower-resolution target and then doing a final Blit to the main framebuffer. The rationale here is that doing the lighting calculations on the lower resolution buffer will reduce fragment overhead at the expense of quality. Setting this up was as easy as changing the SetTarget in the initial ‘Clear’ CommandBuffer, then adding a final CommandBuffer where we Blit to the main Camera target.

    // clear frame buffer
    cmd = CommandBufferPool.Get();
    cmd.name = "Clear Framebuffer";
    cmd.GetTemporaryRT(
        ShaderLib.Variables.Global.id_TempFrameBuffer,
        framebufferDescriptor);
    cmd.SetRenderTarget(framebufferID);
    cmd.ClearRenderTarget(true, false, Color.clear);
    context.ExecuteCommandBuffer(cmd);
    CommandBufferPool.Release(cmd);

    // Final Blit
    cmd = CommandBufferPool.Get();
    cmd.name = "Blit Framebuffer";
    cmd.Blit(framebufferID, BuiltinRenderTextureType.CameraTarget);
    context.ExecuteCommandBuffer(cmd);
    CommandBufferPool.Release(cmd);

I’ve used this in the case where a device might have the same GPU across multiple screen resolutions. To compensate, we can reduce the framebuffer size on the larger resolution proportionally to match the smaller one, theoretically equalizing fragment performance.
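The proportional reduction is easy to get wrong because the scale applies to both axes. A small sketch of the arithmetic, with a made-up helper name and example resolutions that are only illustrative:

```csharp
using System;

public static class Downsample
{
    // Uniform scale that makes a (width x height) target shade roughly
    // 'targetPixels' fragments. Scaling both axes by s scales the pixel
    // count by s * s, hence the square root. Never upscales past 1.
    public static float ScaleFor(int width, int height, long targetPixels)
    {
        double native = (double)width * height;
        double s = Math.Sqrt(targetPixels / native);
        return (float)Math.Min(1.0, s);
    }
}
```

For example, matching a 2560x1440 panel to the fragment count of a 1920x1080 one gives a scale of 0.75 per axis, not 0.5625.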

GCAlloc

There is a fair amount of garbage generated by the pipeline at this point, and seeing that this executes every frame, it’s going to add up to a bad time. A quick profiling session of the Lightweight and HD Render Pipeline example scenes shows that they generate garbage as well, at least as of the 1.1.2-preview Tag, which is current as I write this.

So at this point I am going to accept that some amount of garbage is going to get allocated, but I am going to do some work to eliminate any garbage that I’m making by simply being lazy.

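The first thing I’ll attack in this matter will be the List<VisibleLight>.Sort call, which allocates every frame. List<T>.Sort itself sorts in place, so the garbage most likely comes from the comparison I pass in rather than from a new List. There are some zero-allocation sorting algorithms that I can work with.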
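One candidate is a plain insertion sort keyed on a reusable array of squared distances. This is a sketch of the idea and not what the pipeline currently does; the names are hypothetical, and it assumes the caller keeps the keys array around between frames and fills it before sorting.

```csharp
using System.Collections.Generic;

public static class LightSort
{
    // In-place insertion sort keyed on a parallel array of squared distances.
    // No delegates, closures, or temporary collections are created, and for
    // a handful of Lights the O(n^2) worst case is not a concern.
    public static void SortByKey<T>(List<T> items, float[] keys, int count)
    {
        for (int i = 1; i < count; i++)
        {
            T item = items[i];
            float key = keys[i];
            int j = i - 1;

            // Shift larger entries one slot right to make room.
            while (j >= 0 && keys[j] > key)
            {
                items[j + 1] = items[j];
                keys[j + 1] = keys[j];
                j--;
            }
            items[j + 1] = item;
            keys[j + 1] = key;
        }
    }
}
```

Insertion sort is also stable and close to linear when the list is already nearly sorted, which is the common case when the Camera moves a little between frames.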

Extending the Editor

While peeking at the inner-workings of the HD Render Pipeline I’ve seen that there have been some really amazing features added to the Editor side, and some neat tricks that you could probably have always done that I hadn’t really thought of.

Custom Editors in the HD Render Pipeline

First of all, there is a new concept of the CustomEditor but for a given RenderPipeline, which is aptly named CustomEditorForRenderPipeline. This is super powerful because it allows for Components to easily change their user-facing behavior based on what pipeline is active.

Unity isn’t allowing you to write over Light, and it is a sealed class so inheritance is out, but that’s not a problem since Unity’s whole design paradigm is built off of the Composite Design Pattern anyway. They are simply creating a complementary Component that requires a Light be present and pulling data from that Component during the render loop when needed. It appears that they make use of hideFlags to obfuscate the composite and draw the additional properties in their custom Light Inspector. This makes everything look like it’s all just a feature of the Light, which is nice because it reduces clutter and also prevents a lot of end-user assembly.

Where this goes really, really bad though is the fact that Unity isn’t really playing by its own rules, while still conceding to its arbitrary limitations. We can see that by simply trying to remove the Light Component from its GameObject.

So this obviously needs more work put into it. I hope that this becomes more of a supported feature in the future.

My Take on Custom Editors

So far, as a part of my attempt to make things mostly streamlined, I’ve got to be able to set some custom renderingLayerMask values on Renderers at different points. So far I’m mainly only tracking for Mixed vs. Dynamic lighting. For this I’ve created a class for listening to Lightmapping events to flag any Renderers in the Scene correctly.

    uint lightmapFlag = ShaderLib.RenderLayers.BakedLightmaps;
    Renderer[] allRenderers = Transform.FindObjectsOfType<Renderer>();
    foreach (Renderer renderer in allRenderers)
    {
        // Not lightmapped (index -1): make sure it doesn't have the flag.
        if (renderer.lightmapIndex == -1)
            RendererUtilities.RemoveFlagsFromMask(renderer, lightmapFlag);
        else
            RendererUtilities.AddFlagsToMask(renderer, lightmapFlag);
    }

For future work, Shadow Casting could potentially be obfuscated to be more transparent to the user between baked and this pipeline’s real-time solution.

I built a custom ShaderGUI to handle some of the toggling of Passes and Keywords more easily. If reflections are enabled, the correct Pass is enabled. This makes things completely transparent to the Artist, which is a big deal for me. There also isn’t a point to showing the Cubemap input if we don’t want the Material to be reflective, so I handle that too.

The ShaderGUI will also provide for controlling whether or not the Material is Transparent, which was fairly easy to add.

It seems like a sizable amount of the work that is going to go into creating a functional SRP is going to be Editor work as well, which is something to definitely keep in mind. Content Creators will be interfacing with this, after all.

Quality-of-Life Improvements

When building a real SRP for use in a real production, as many of Unity’s existing quality-of-life bits as possible should transfer over. One such nice-to-have feature is establishing a default Shader for the pipeline, much like the way that Unity uses the ‘Standard’ as the default now. This is actually really simple: just override GetDefaultShader in the RenderPipelineAsset.

    // New Materials will use this Shader automatically
    public override Shader GetDefaultShader()
    {
        return ShaderLib.Shaders.SafeFind(ShaderLib.Shaders.BASIC);
    }

There are a number of other things that can be overridden there as well, like the default Material when creating a new ParticleSystem.

Stress Testing

I brought the pipeline into an updated version of the Viking Village demo from the Asset Store and after some amount of work, like removing PostProcess scripts and other bits, as well as porting Materials and recomputing Lightmaps, I had my SRP rendering correctly.

There are many improvements to be made here. One is local AO, which could be packed into the alpha of the Normal texture, or perhaps into vertex colour, since this pipeline requires fairly high vertex density anyway. But considering the tight constraints I arbitrarily placed on myself, I am quite pleased with the result.

Wrapping Up… For Now

I’ve got a good start on this and I think I’ll stop here and share my findings. I’ll be working more and more with SRP with the official release of 2018.1 coming very soon, so I will no doubt be taking more notes.

Some things I’d like to work on next are:

Supporting Emissive and AO in the ‘Basic’ Shader

Optimizations to GCAlloc

Revisiting Spot Shadows

Custom Meta Pass for the Lightmapper

Benchmarks for the ‘Target’ device

As for the GitHub project, I plan to continue iterating on it, but maybe not immediately. I would like to spend some time investigating things like ShaderGraph and perhaps extending one of Unity’s official SRPs.

Feedback is welcome and appreciated on both the SRP project as well as this post. Until next time.