Hi, I’m Yan, and for the past two years I’ve been a Toolsmith at Unity. We have grown quite a lot recently, and so have our test suites and the number of unstable tests, slow tests, and failures that cannot be reproduced locally. In this blog post, I’ll talk about what we’re doing about it, but first let me tell you a little about our automation environment, just to give you a better understanding of the challenges we are dealing with.

At Unity, we have many different kinds of test frameworks (Figure 1) and test suites:

Runtime Tests verify Unity’s public runtime API on all of Unity’s supported platforms.

Integration Tests cover things that are not easily expressed as runtime tests. They can test Unity Editor features as well as integration with components like the Cache Server and Bug Reporter.

Native C++ Tests exercise native code directly, without going through the scripting layer.

Graphics Tests verify rendering features by comparing the resulting image with a reference image that is considered “correct”.

Many others (Performance Tests, Load Tests, IMGUI Tests, etc.).
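The graphics-test idea above can be illustrated with a toy sketch: compare a rendered image against a reference image, pixel by pixel, allowing a small per-channel tolerance. Real graphics tests are far more sophisticated (and the tolerance value here is purely illustrative), but this shows the core comparison.

```python
# Toy illustration of a graphics test: compare a rendered image against a
# "correct" reference image. Images are lists of rows of (r, g, b) tuples;
# the tolerance of 2 is an arbitrary example value, not Unity's actual one.

def images_match(rendered, reference, tolerance=2):
    """Return True if every pixel differs by at most `tolerance` per channel."""
    if len(rendered) != len(reference):
        return False
    for row_a, row_b in zip(rendered, reference):
        if len(row_a) != len(row_b):
            return False
        for px_a, px_b in zip(row_a, row_b):
            if any(abs(a - b) > tolerance for a, b in zip(px_a, px_b)):
                return False
    return True

reference = [[(255, 0, 0), (0, 255, 0)]]
rendered = [[(254, 1, 0), (0, 255, 1)]]  # small differences, within tolerance
print(images_match(rendered, reference))
```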

At the highest level, all tests are grouped into subsets by test framework. They are further divided by platform, run frequency, execution time, and some other criteria. These divisions produce an enormous number of testing points. We’ll discuss these numbers later on.

Having so many frameworks and runners is not easy, so about a year ago we started working on the Unified Test Runner (UTR): a single entry point for running all tests. It serves as a facade (see Figure 2) for all of our test runners and frameworks, enabling anyone to run any of our test suites from the command line.

All the artifacts produced by a test run are copied to the same place and are grouped and organized according to the same conventions everywhere. UTR also provides other services:

tests can be filtered the same way everywhere with -testfilter=TestName

execution progress is reported the same way for all the test suites
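The facade idea can be sketched in a few lines: one entry point that dispatches to a framework-specific runner and applies the same filtering semantics to every suite. All the class names and internals below are hypothetical; the post doesn’t describe UTR’s actual implementation.

```python
# A minimal sketch of a test-runner facade, in the spirit of UTR.
# RuntimeRunner / NativeRunner and the result format are invented for
# illustration; only the "single entry point + common filter" idea is real.

class RuntimeRunner:
    def run(self, tests):
        return [(t, "passed") for t in tests]

class NativeRunner:
    def run(self, tests):
        return [(t, "passed") for t in tests]

class UnifiedTestRunner:
    """Single entry point: picks the right runner, applies a common filter."""

    def __init__(self):
        self._runners = {"runtime": RuntimeRunner(), "native": NativeRunner()}

    def run(self, suite, tests, testfilter=None):
        # The same filter semantics apply to every suite.
        if testfilter:
            tests = [t for t in tests if testfilter in t]
        return self._runners[suite].run(tests)

utr = UnifiedTestRunner()
results = utr.run("runtime", ["CameraTest", "PhysicsTest"], testfilter="Camera")
print(results)  # only CameraTest survives the filter
```

The point of the facade is that callers (humans or build-farm configurations) never talk to a framework-specific runner directly, so filtering, progress reporting, and artifact handling stay uniform.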

Initially, UTR was mostly used to run tests locally. Then we shifted focus to our Build Farm configurations: we wanted to use the Unified Test Runner there as well, so that tests would run the same way locally and on the build farm. In other words, if something fails on the Build Farm, it should be easy to reproduce locally.

Slowly but surely, UTR became the single entry point we use to run tests at Unity. That made it a perfect candidate for another task: collecting test execution data from both local and Build Farm test runs. Whenever a test run finishes, UTR reports data to a web service. That is how our test data analytics solution, Hoarder, was born. Hoarder’s responsibility is to collect, store, and provide access to test execution data. It can present aggregated statistics with the ability to drill down to individual test runs. See Figure 3.
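To make the idea concrete, here is a toy sketch of the kind of aggregation a service like Hoarder might perform: collect per-run results and compute a pass rate for each test, which is one way to surface unstable tests. The record shape and field names are invented; the post doesn’t document Hoarder’s actual schema.

```python
# Toy aggregation over test execution records, Hoarder-style.
# The record format ("test", "result", "source") is hypothetical.
from collections import defaultdict

runs = [
    {"test": "CameraTest", "result": "passed", "source": "build-farm"},
    {"test": "CameraTest", "result": "failed", "source": "local"},
    {"test": "PhysicsTest", "result": "passed", "source": "build-farm"},
]

def pass_rates(records):
    """Aggregate per-test pass rates from individual run records."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for r in records:
        totals[r["test"]] += 1
        if r["result"] == "passed":
            passed[r["test"]] += 1
    return {t: passed[t] / totals[t] for t in totals}

print(pass_rates(runs))  # CameraTest passes half the time: a flakiness signal
```

A test that sits well below a 100% pass rate across many runs, with no corresponding code change, is exactly the kind of instability this data makes visible.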

We discovered a lot of interesting things in the data, which led to a few important decisions. I’m going to talk about how we make informed decisions based on this data in the next blog post.