Paul Hammant is an independent consultant helping clients with Continuous Delivery and DevOps. He played a leading role in developing tools and techniques for Trunk-Based Development and Dependency Injection. He's been a senior staff member in several technology organizations, including a twelve-year stint at ThoughtWorks.

Test Impact Analysis (TIA) is a modern way of speeding up the test automation phase of a build. It works by analyzing the call-graph of the source code to work out which tests should be run after a change to production code. Microsoft has done some extensive work on this approach, but it's also possible for development teams to implement something useful quite cheaply.

One curse of modern software development is having "too many" tests to be able to run all of them prior to check-in. When that becomes true, developers fall into a costly coping strategy of not running any tests on their local workstation, relying instead on tests running later on an integration server. And quite often even those fall into disrepair, which is inevitable when 'shift right' becomes normal for a dev team.

Of course, everything that you test pre-integrate should immediately be tested post-integrate in the Continuous Integration (CI) infrastructure. Even the highest-functioning development teams might experience breakages born of timing alone, with commits landing in real time. Those teams might also harbor someone who sometimes wants to 'economize' on the agreed integration process and NOT run tests. Luckily, CI servers and a decent series of build steps quickly catch those moments.

Various techniques exist to speed up test execution, including running tests in parallel over many machines and using test doubles for slow remote services. But this article will focus on reducing the number of tests to run, by identifying those most likely to catch a newly introduced bug. With a test-pyramid structure, we run unit tests more frequently because they usually run faster, are less brittle, and give more specific feedback. In particular, we frame a suite of tests that should be run as part of CI - pre-integrate and post-integrate - and then create a deployment pipeline to run slower tests later.

The same problem restated: if tests ran infinitely quickly, we would run all the tests all the time; they do not, so we need to balance cost versus value when running them.

In this article, I detail an emerging field of testing-related computer science where Microsoft is leading the way - a field that companies with long-running test automation suites should take note of. If you are in the .NET ecosystem, you may be able to benefit from Microsoft's advances around "Test Impact Analysis" immediately. If you are not doing .NET, you may have to engineer something yourself, but that can be done fairly cheaply. A former employer of mine engineered something themselves based on proof-of-concept work that I share below.

Conventional strategies to shorten test automation

To complete the picture, I will recap the traditional "run a subset of the tests" strategies that remain dominant in the industry - now alongside the newer reality of parallel test execution and service virtualization.

Pre-calculated graphs of source vs tests

[Figure: 276 tests with their notional 'size' designations, and the ones executed for given commits, with two failures resulting. As it happens, some of those turn out to be small, some medium, and some large. A tree/fractal is depicted only because it helps explain the concepts; real codebases are not really like that.]

Google's fabled internal build system, Blaze, has been copied into a few open-source technologies over the years. Most notable are Buck from Facebook, Bazel from Google, and Pants from Twitter, Foursquare and Square. Inside Google, Blaze navigates a single directed graph across their entire monorepo. Blaze has a mechanism of direct association of test to production code: a fine-grained directory tree of production sources and associated test sources, with explicit dependency declarations via BUILD files that are checked in too. Those BUILD files can be maintained and evolved by the developers, but can also be verified as correct or incorrect by automated tooling. That process, repeated over time, goes a long way toward making the directed graphs correct and efficient. Importantly, that tooling can point out redundant claims about dependencies. Thus, for a given directory/package/namespace, a developer can quite easily kick off a subset of the tests - but just the ones that are reachable via the directed graphs from the BUILD files. The ultimate time saver, both for the developer pre-integrate and for the scaled CI infrastructure 'Forge' (later TAP), was the automated subsetting of tests to run per commit based on this baked-in intelligence. There are a bunch of mind-blowing stats in Taming Google-Scale Continuous Testing.
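To make the explicit-dependency mechanism concrete, here is a hedged sketch of what a checked-in BUILD file for one such directory might look like. The target names, file names and dependency paths are all hypothetical, not from any real project:

```starlark
# Hypothetical BUILD file for a single package in a monorepo.
# Each target declares its sources and dependencies explicitly,
# which is what lets tooling build the directed graph.

java_library(
    name = "orders",
    srcs = ["Order.java", "OrderValidator.java"],
    deps = ["//shared/money"],  # explicit, tool-verifiable dependency claim
)

java_test(
    name = "orders_test",
    srcs = ["OrderValidatorTest.java"],
    deps = [":orders"],  # ties this test directly to the production code
)
```

Given declarations like these, the build tool can walk the graph in reverse: a change under one package selects only the test targets that transitively depend on it, which is the automated subsetting described above.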
In my opinion, this stuff has cost Google tens of millions but made them tens of billions over the years - a ratio perhaps far greater than their earnings-to-wages ratio.

Test Impact Analysis

Test Impact Analysis (TIA) is a technique that helps determine which subset of tests to run for a given set of changes.

[Figure: A similar depiction of the tests to run for a hypothetical change.]

The key to the idea is that not all tests exercise every production source file (or the classes made from that source file). Code coverage or instrumentation, while tests are running, is the mechanism by which that intelligence is gleaned (details below). That intelligence ends up as a map of production sources to the tests that would exercise them, but begins as a map of which production sources a test would exercise.

[Figure: One test (of many) exercises a subset of the production sources; one production source is exercised by a subset of the tests (whether unit, integration or functional).]

You will note that the stylized diagram of executed tests is the same as for the directed-graph build technologies above. It is effectively the same, as the curation of the BUILD files over time leads to more or less the same outcome as TIA.

The TIA maps can only really be used for changes versus a reference point. That can be as simple as the work the developer would commit or has committed. It could also be a bunch of commits: say, everything that was committed today (nightly build), or since the last release.

One realization from using a TIA approach is that you have too many tests covering the same sections of production code. If those are straight duplicates, then deleting tests - after analysis of each test and the paths through production code that it exercises - is a possibility. Often they are not, though, and working out how to focus testing on what you want to test, and not at all on transitive dependencies in the code, is a different focus area that inevitably rests on the established practice of using test doubles and, more recently, Service Virtualization (for integration tests).
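The two maps just described can be sketched in a few lines of Python. The per-test coverage data below is hand-written to stand in for what a real instrumented test run would capture; all test and file names are invented:

```python
# Minimal sketch of the two TIA maps. The coverage data is hypothetical;
# in practice it would be gathered by running each test under instrumentation.

# Map 1 (gathered first): which production sources each test exercises.
test_to_sources = {
    "test_checkout": {"cart.py", "pricing.py"},
    "test_pricing": {"pricing.py", "tax.py"},
    "test_login": {"auth.py"},
}

# Map 2 (derived): which tests exercise each production source.
source_to_tests = {}
for test, sources in test_to_sources.items():
    for src in sources:
        source_to_tests.setdefault(src, set()).add(test)

# Tests to run for a change touching only pricing.py:
impacted = source_to_tests["pricing.py"]
print(sorted(impacted))  # ['test_checkout', 'test_pricing']
```

Note that `test_login` is never selected for that change - that omission, multiplied across a large suite, is where the time saving comes from.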
The minimal level of creating a list of what has changed is "changed production sources", but the ideal would be to determine which methods/functions have changed, and to further subset to only the tests that would exercise those. Right now, though, there is one ready-to-go technology from Microsoft that works at the source-file level, and one reusable technique (by me). Read on.

Microsoft's extensive TIA work

Microsoft has put in the longest concerted effort to productionize Test Impact Analysis ideas, and gave the concept that name and its acronym. They have a current series of blog entries that span March to August of 2017, so far: Accelerated Continuous Testing with Test Impact Analysis - Part 1, Part 2, Part 3, and Part 4. Their older articles on this go back eight years:

Test Impact Analysis in Visual Studio 2010 (2009)
Streamline Testing Process with Test Impact Analysis (2010)
Which tests should be run since a previous build? (2010)
How to: Collect Data to Check Which Tests Should be Run After Code Changes (2010)
Test Impact Analysis (2011)

Microsoft's Pratap Lakshman detailed the evolution of their implementation. Concerning the current evolution of their TIA tech, Pratap says:

The map of the impacted tests versus production code is recalculated when a build is triggered. The job to do that runs as part of the VSTest task within a VSTS build definition. Our TIA implementation collects the dynamic dependencies of each test method as the test method is executing. At a high level, here is what we do: as the test is executing it will cover various methods - the source files in which those methods reside are the dynamic dependencies we track. So we end up with a mapping like the following:

Testcasemethod1 <--> a.cs, b.cs, d.cs
Testcasemethod2 <--> a.cs, k.cs, z.cs

and so on. Now when a commit comes in to, say, a.cs, we run all Testcasemethod(s) that had a.cs as a dynamic dependency.
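The lookup just described can be sketched in a few lines of Python. The `a.cs`-style file names and the two Testcasemethod names come from the quote; `Testcasemethod3` and the hard-coded map are my invention - the real map is collected by the VSTest task while tests execute:

```python
# Sketch of the dynamic-dependency lookup described above.
# The map would really be built during instrumented test runs;
# here it is hard-coded for illustration.

dependency_map = {
    "Testcasemethod1": {"a.cs", "b.cs", "d.cs"},
    "Testcasemethod2": {"a.cs", "k.cs", "z.cs"},
    "Testcasemethod3": {"m.cs"},  # invented: a test that never touches a.cs
}

def impacted_tests(changed_files):
    """Return the tests whose dynamic dependencies intersect the commit."""
    changed = set(changed_files)
    return sorted(t for t, deps in dependency_map.items() if deps & changed)

print(impacted_tests(["a.cs"]))  # ['Testcasemethod1', 'Testcasemethod2']
```

A commit touching only `m.cs` would select just `Testcasemethod3`, and a commit touching none of the mapped files would select nothing at all.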
We, of course, take care of newly introduced tests (that might come in as part of the commit) and carry forward previously failing tests as well. Our TIA implementation does not do test prioritization yet (e.g. most-often-broken first). That is on our radar, and we will consider it if there is sufficient interest from the community. The actual TIA map data is stored in TFS, in a SQL Server database. When a commit comes in, TIA uses the TFVC/Git APIs to open up the commit and see which files the changes are going into. Once it knows the files, TIA then consults the mapping to know which tests to run. Of course, usage of this TIA technology is supported in pull request (PR) and regular CI workflows, as well as pre-integrate on the developer's workstation. We want our users to embrace Shift Left and move as many of their tests as possible to earlier in the pipeline. In the past we have seen customers a little concerned about doing that, because it would mean running more tests at every commit, thereby making the CI builds take longer. With TIA, we want our users to Shift Left and let TIA take care of running only the relevant tests - thereby taking care of the performance aspect.

Concerning the first few years of TIA inside TFS and Visual Studio, he says:

The TIA technology at that time was quite different in many ways:

It would only identify impacted tests; it was left to the user to explicitly run them.

It used block-level code coverage as the approach to generate the test <--> source mapping. In the subsequent build, it would do an IL-diff with the earlier build to find out which blocks had changed, and then use the mapping to identify and list the impacted tests. Note that it would not run them for you.

The above approach made it slow (compared to the current implementation), and required far more storage for the mapping information (compared to the current implementation).

The above approach also made it less safe than the current implementation: it would miss identifying impacted tests in some cases.

It did not support the VSTS build workflow (it was only supported in the older XAML build system).

VectorCAST/QA - application

Vector Software has made a product called VectorCAST/QA, a one-stop-shop application that leverages code coverage in the same way to run fewer, impacted tests (and more). Their technology is mostly sold to the automotive (and related) industries that embed C, C++ and Ada software. VectorCAST working in this mode of operation also predates my kitchen-sink experiments. I have to work on my googling skills!

NCrunch for Visual Studio

NCrunch for .NET teams was launched in 2011, after a couple of years of development. It is a sophisticated plugin for Visual Studio that can optimize the run order of tests based on algorithms that predict which are most likely to break for a change. In 2014, extra features were added to allow it to subset tests to just the ones impacted by a change. Also in 2014, NCrunch became compatible with CI usage generally: specifically, it was able to orchestrate executions of MSBuild outside of the Visual Studio UI, with the same elapsed-time savings that you would hope for. The raw impact-map data isn't stored in source control, as it is binary, but it can be shared between developers and CI infrastructure on a network share. NCrunch is commercial, but for a reasonable per-developer (and per-test-engineer) license fee. CI nodes are free, and NCrunch's creator, Remco Mulder, agrees that nobody should pay twice for something or be penalized for having scaled their CI infrastructure via Docker and the like in 2017.