As developers, we’re always aware of performance, both in terms of CPU and GPU. Maintaining good performance gets more challenging as scenes get larger and more complex, especially as we add more and more characters. My colleague in Shanghai and I often come across this problem when helping customers, so we decided to dedicate a few weeks to a project aimed at improving performance when instancing characters. We call the resulting technique Animation Instancing.

We often render outdoor scene elements, such as grass and trees, with GPU Instancing. But for a SkinnedMeshRenderer (a character, for example), we can’t use instancing, because the skinning is calculated on the CPU and each mesh is submitted to the GPU one by one. In general, we can’t draw all characters in a single submission. When there are lots of SkinnedMeshRenderers in the scene, this results in lots of draw calls and animation calculations.

We have found a way to reduce CPU cost and supplement GPU Instancing in Unity with Animation Instancing. You can get our code on GitHub. Be aware that this is a custom experimental solution. Until recently, we had only shared it with a few of our enterprise support customers. Now we’re ready for more feedback, so please let us know what you think directly in the project comments!

Goals

Our initial goals for this experimental project were:

Instancing SkinnedMeshRenderer

Implement as many animation features as possible

LOD

Support mobile platforms

Culling

Not all of our goals were reached due to time constraints. The supported animation features are Root Motion, Attachment and Animation Events. Transitions and Animation Layers are not yet supported. Also, bear in mind that on mobile platforms this only works with OpenGL ES 3.0 and newer.

However, we felt that the experiment was successful in proving that this approach can have interesting results. Let’s dig into some of the details.

Animation Generation

Before using instancing for characters, we need to generate the animations. We bake a character’s animations into textures, which we call Animation Textures. These textures are used for skinning on the GPU.

The generator collects animations from the Animator component attached to the GameObject in question. It collects the animation events as well, which makes it convenient to move from the Mecanim system to Animation Instancing. If you want to attach something to a character, you need to specify the bones it can attach to in the Attachment settings.

Once the Animation Texture has been generated, the Animation Instancing script loads the animation information at runtime. Note that this animation information is not the animation clip files.

Instancing

It’s simple to apply Animation Instancing. Let’s add the Animation Instancing script to our generated GameObject. The Bone Per Vertex parameter controls the number of bones that are calculated per vertex. The important thing to be aware of here is that using fewer bones improves performance, but decreases accuracy.
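To illustrate the trade-off behind Bone Per Vertex, here is a minimal Python sketch of one common way to reduce bone influences: keep the strongest weights and renormalize them so they still sum to one. The function name `limit_bone_influences` and the renormalization step are assumptions made for illustration, not necessarily what the Animation Instancing script does internally.

```python
def limit_bone_influences(influences, max_bones):
    """Keep the max_bones strongest (bone_index, weight) pairs.

    Hypothetical helper: the remaining weights are renormalized so they
    sum to 1.0, which is an assumption about how truncation is handled.
    """
    # Sort by weight, strongest first, and drop the weakest influences.
    strongest = sorted(influences, key=lambda pair: pair[1], reverse=True)[:max_bones]
    total = sum(weight for _, weight in strongest)
    if total == 0:
        return strongest
    return [(bone, weight / total) for bone, weight in strongest]


# A vertex influenced by four bones, reduced to two bones per vertex.
vertex = [(7, 0.5), (2, 0.25), (9, 0.125), (4, 0.125)]
print(limit_bone_influences(vertex, 2))
```

With two bones per vertex, the shader fetches and blends half as many matrices, but the contribution of bones 9 and 4 is lost, which is where the loss of accuracy comes from.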

Next, we need to modify the shader to support instancing. Basically, all you need to do is add the following lines to your shaders. This doesn’t affect your shading, but it adds a vertex shader that performs the skinning.

#include "AnimationInstancingBase.cginc"
#pragma vertex vert

Performance Analysis

We used a slightly modified version of a demo scene from the Mecanim Example Scenes and tested its performance on an iPhone 6. Let’s take a closer look at the profiler views for both the original and the instancing examples.

CPU

The original project spawns 300 characters, and the frame rate is around 15 FPS. To get to at least 30 FPS, we have to limit the number of characters to about 150. In the Animation Instancing version, we can spawn 900 characters while maintaining 30 FPS.

As you can see, calculations on the CPU slow the project down.

In the instancing project, we greatly reduced the animation calculations (skeleton, skinning and so on) on the CPU. That way, we can spawn five or six times as many characters!

In the test scene, drawing the environment requires around 80 draw calls. The character has three materials, so rendering one character takes three draw calls.

Without instancing, spawning 250 characters requires around 1100 draw calls (3 * 250 characters, plus their shadows).

When using Animation Instancing, after spawning 800 characters, the number of draw calls increases by only about 50. As you can see, there are 4800 batched draw calls in the instancing column and 48 batches (3 materials * 8 character batches + 3 materials * 8 shadow batches). That is because we submit 100 characters per batch.
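The batch arithmetic above can be reproduced with a short Python sketch. The function `instanced_batches` and its parameter names are hypothetical. The sketch assumes one instanced batch per material per group of up to 100 characters, and that the shadow pass doubles the work.

```python
import math


def instanced_batches(characters, materials, per_batch, shadows=True):
    """Return (batches, batched_draw_calls) for an instanced crowd.

    Hypothetical helper: assumes one batch per material per group of
    up to per_batch characters, repeated once more for the shadow pass.
    """
    groups = math.ceil(characters / per_batch)
    passes = 2 if shadows else 1
    batches = groups * materials * passes
    batched_draw_calls = characters * materials * passes
    return batches, batched_draw_calls


print(instanced_batches(800, 3, 100))  # -> (48, 4800)
```

For 800 characters with three materials each, this gives the 48 batches and 4800 batched draw calls reported for the instancing version.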

GPU

This technique increases GPU cost a little, because we put skinning on the GPU. If the characters have shadows, we have to skin the characters again in the shadow pass. However, it improves the overall frame rate because it reduces CPU cost. Usually CPU cost is the biggest issue in crowd simulations in games.

Memory

The additional memory is used to store the Animation Textures, which hold the skinning matrices. We use the RGBAHalf texture format, which takes 8 bytes per pixel (four 16-bit channels). Let’s assume a character has N bones, each bone takes four pixels (one matrix), and we bake one animation as M keyframes. One animation then occupies N * 4 * M = 4NM pixels, or 4NM * 8 = 32NM bytes. If a character has 50 bones and we generate 30 keyframes, one animation takes 50 * 4 * 30 = 6000 pixels. So a 1024*1024 texture can store up to 174 animations.
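As a worked example of this estimate, the following Python sketch computes the pixel and byte footprint of one baked animation and how many such animations fit into one texture. The function names are hypothetical. The sketch counts 8 bytes per RGBAHalf pixel (four 16-bit channels) and assumes animations are packed back to back with no padding.

```python
PIXELS_PER_BONE = 4   # one 4x4 matrix stored as four RGBA pixels
BYTES_PER_PIXEL = 8   # RGBAHalf: four channels of 16 bits each


def animation_pixels(bones, keyframes):
    """Pixels needed to store one baked animation."""
    return bones * PIXELS_PER_BONE * keyframes


def animation_bytes(bones, keyframes):
    """Bytes needed to store one baked animation in RGBAHalf."""
    return animation_pixels(bones, keyframes) * BYTES_PER_PIXEL


def animations_per_texture(width, height, bones, keyframes):
    """Whole animations that fit in one texture, assuming tight packing."""
    return (width * height) // animation_pixels(bones, keyframes)


print(animation_pixels(50, 30))                    # -> 6000
print(animation_bytes(50, 30))                     # -> 48000
print(animations_per_texture(1024, 1024, 50, 30))  # -> 174
```

A real generator may align each animation to texture rows, in which case the capacity would be somewhat lower than this tight-packing estimate.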

Conclusion

We’ve found that Animation Instancing can significantly reduce CPU cost if you have lots of SkinnedMeshRenderers. It’s well suited to crowds of similar characters, such as hordes of zombies.

We hope this experimental project sheds some light on your own project’s performance challenges and gives you the ability to build more elaborate scenes. Certainly, there are many avenues for future work, such as support for transitions and animation layers.

Please check out the code on GitHub and post any comments or issues you have directly to the project!