Kill the Clones: How Change Coupling helps you identify Design Problems in large-scale Systems

Your Code as a Crime Scene presents around 15 different software analyses. Of all those analyses, Change Coupling is the most exciting one. In this article we'll put the analysis to work on a real-world system under heavy development: Microsoft's ASP.NET MVC framework. You'll see why and how Change Coupling helps us design better software as we detect a number of possible quality issues in ASP.NET MVC. In addition, you get a preview of CodeScene, a new tool to automate software analyses.

What's Change Coupling?

Change Coupling means that two (or more) modules change together over time. In this study I configured CodeScene to consider two modules coupled in time if they are modified in the same commit (there are other more elaborate options too).
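To make the "modified in the same commit" rule concrete, here is a minimal Python sketch that counts how often each pair of files shows up in the same commit. It is not how CodeScene is implemented. The function name `co_change_counts` and the sample history are assumptions made up for illustration; in practice the per-commit file lists could come from something like `git log --name-only`.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files appears in the same commit."""
    pair_counts = Counter()
    for files in commits:
        # De-duplicate and sort so each pair always has the same key order.
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical history: each inner list holds the files touched by one commit.
commits = [
    ["LinkTagHelper.cs", "ScriptTagHelper.cs"],
    ["LinkTagHelper.cs", "ScriptTagHelper.cs", "Startup.cs"],
    ["Startup.cs"],
]

counts = co_change_counts(commits)
for (file_a, file_b), shared in counts.most_common():
    print(f"{file_a} <-> {file_b}: {shared} shared commits")
```

Pairs with a high shared-commit count are the candidates worth a closer look.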

The fascinating thing with Change Coupling is that the more experience we get with the analysis, the more use cases there seem to be. For example, you can use Change Coupling results to:

Detect software clones (aka copy-paste code).
Evaluate the relevance of your unit tests.
Detect architectural decay.
Find hidden dependencies in your codebase.

In this article we focus on detecting software clones and uncovering hidden dependencies.

Explore your Physical Couples

Let's fire up CodeScene and run an analysis of ASP.NET MVC. Since we're interested in exploring Change Coupling, we click on the corresponding button. Here's what the result looks like:

CodeScene presents us with multiple views so that we can investigate different aspects of the results. The default view above is based on a visualization technique called Hierarchical Edge Bundling. You see the names of all involved files as nodes, with links between the ones that are coupled. If we hover over a node, its change couples light up in red.

So, why do two source code files change together over time? Well, the most common reason is that they have a dependency between them: one module is the client of the other.

You see a few examples of physical coupling in the picture above: a unit test tends to change together with the code under test. This is expected. In fact, we'd be surprised if the change coupling was absent. That would be a warning sign, since it indicates that your tests aren't being kept up to date or aren't relevant.

A physical dependency like this is something you can detect from the code alone. But remember that Change Coupling isn't measured from code; Change Coupling is measured from the evolution of the code. That means you'll sometimes make unexpected findings.

Look for the Unexpected

There's one main heuristic to keep in mind as you analyze a software system: always look for the unexpected. Look for surprises, since a surprise in a software design is bound to be expensive from a cognitive perspective and therefore expensive to maintain.

As soon as you find a logical dependency that you cannot explain, make sure to investigate it. Let me clarify with an example from ASP.NET MVC:

The visualization above shows change coupling between a LinkTagHelper.cs and a ScriptTagHelper.cs. You also see that their unit tests tend to be changed together.

While those two classes seem to solve related aspects of the same problem, there's no good reason why a change to one of them should imply that the other one has to be changed as well. Let's get more information on this potential problem by switching to the detailed view:

The data above confirms that there's a strong degree of coupling between LinkTagHelper.cs and ScriptTagHelper.cs, and also between their unit tests. Nine out of ten changes we make to one of the classes result in a predictable change to its coupled peer.
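A figure like "9 out of 10" can be expressed as a percentage. The sketch below shows one plausible way to compute it: shared commits divided by the average number of revisions of the two files. Treat both the formula and the numbers as assumptions for illustration; the article doesn't state how CodeScene calculates its degree of coupling.

```python
def coupling_degree(shared_commits, revisions_a, revisions_b):
    """Return the coupling between two files as a percentage.

    Assumed formula: shared commits relative to the average number of
    revisions of the two files.
    """
    average_revisions = (revisions_a + revisions_b) / 2
    if average_revisions == 0:
        return 0.0
    return 100.0 * shared_commits / average_revisions


# Hypothetical numbers: each file changed 10 times, 9 of them together.
degree = coupling_degree(shared_commits=9, revisions_a=10, revisions_b=10)
print(f"Degree of coupling: {degree:.0f}%")
```

Averaging the revision counts keeps a file that changes all the time from looking strongly coupled to everything it happens to touch.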

When you find an unexpected change pattern like this you need to dig into the code and understand why. It's particularly interesting in this case since there is not any direct physical dependency between the logically coupled classes! Let's have a look at the LinkTagHelper.cs and the ScriptTagHelper.cs to see if we can spot any patterns:

So you have the ScriptTagHelper.cs to the left and the LinkTagHelper.cs to your right. Do you see any pattern? Indeed, and this is what I tend to find on a regular basis as I inspect logical coupling: a dear old friend, copy-paste.

If you take a detailed look you'll note something rare. The variable names and, even rarer, the comments have been updated, so this is more like gold-plated copy-paste:

Break the Logical Dependencies

When you have an unexpected change dependency, you'll often find that there's some duplication of both code and knowledge. Extracting that common knowledge into a module of its own breaks the change coupling and makes your code a bit easier to maintain. You see, change coupling often suggests refactoring candidates.

At this point it's important to note that duplicated code in itself isn't always a problem. Just because two pieces of code look the same, that doesn't mean they have to be refactored to use a shared abstraction. Instead we want to focus on what the code expresses.

In the case of ASP.NET MVC it's clear that the two classes model the same process. That is, it's indeed a duplication of knowledge and it's likely that the code would benefit from a refactoring. This is even more important since, as the change coupling results indicate, we have the same amount of duplication between their corresponding unit tests. Avoiding expensive change patterns makes software maintenance so much easier. And duplication of knowledge is always an expensive pattern: it's so easy to forget to update one of the copies once the business rules change or when a bug gets fixed.
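As a rough illustration of extracting shared knowledge into a module of its own, here is a small Python sketch. Every name and the rule itself are hypothetical. This is not the ASP.NET MVC code, just the shape of the refactoring: two renderers that used to each carry a copy of the same rule now delegate to a single function.

```python
# Hypothetical example; none of these names come from ASP.NET MVC.

def resolve_source(primary, fallback, primary_available):
    """The shared rule, kept in exactly one place.

    Use the primary URL when it is available, otherwise the fallback.
    """
    return primary if primary_available else fallback


def render_link_tag(href, fallback_href, available):
    """Render a stylesheet link using the shared fallback rule."""
    return f'<link href="{resolve_source(href, fallback_href, available)}" />'


def render_script_tag(src, fallback_src, available):
    """Render a script tag using the same shared fallback rule."""
    url = resolve_source(src, fallback_src, available)
    return f'<script src="{url}"></script>'


print(render_link_tag("cdn/site.css", "local/site.css", available=False))
print(render_script_tag("cdn/app.js", "local/app.js", available=True))
```

When the fallback rule changes, only `resolve_source` and its test need to be touched, so the two renderers no longer have to change together.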

Complement your Intuition

If you're an experienced developer who has contributed a lot of code to a particular project, then you probably have a good feeling for where the most significant maintenance problems will show up. You may still get surprised when you run an analysis, but in general several code analysis findings will match your intuitive guess. Change Coupling is different. We developers seem to lack any intuitive sense when it comes to Change Coupling.

Conclusion

In this article we explored how Change Coupling detects a DRY violation in ASP.NET MVC. We ran an analysis with CodeScene and looked for unexpected temporal dependencies to identify expensive change patterns in our code. Used that way, Change Coupling suggests both refactoring candidates and the need for new modular boundaries.

The same analysis principle also helps you catch architectural decay. All you have to do is look out for change dependencies that span architectural boundaries. Change Coupling is like a bumpy journey - it gets worse with the distance you have to travel - and it makes a big difference if we need to modify two files located in the same package versus modifying files in different parts of the system. So make it a habit to investigate the change coupling dependencies in your repository on a regular basis. Your code will thank you for it.
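If you want to try the boundary check yourself, a minimal Python sketch could look like the one below. It assumes that the top-level directory of a path stands in for an architectural component, which won't hold for every codebase, and all paths and function names are made up for illustration.

```python
from pathlib import PurePosixPath


def top_level_component(path):
    """Treat the first directory of a path as its architectural component."""
    parts = PurePosixPath(path).parts
    return parts[0] if len(parts) > 1 else ""


def cross_boundary_pairs(coupled_pairs):
    """Keep only the coupled pairs whose files live in different components."""
    return [
        (file_a, file_b)
        for file_a, file_b in coupled_pairs
        if top_level_component(file_a) != top_level_component(file_b)
    ]


# Hypothetical coupled pairs, e.g. the output of a co-change analysis.
pairs = [
    ("web/TagHelpers/Link.cs", "web/TagHelpers/Script.cs"),
    ("web/Controllers/Order.cs", "storage/OrderTable.cs"),
]

suspects = cross_boundary_pairs(pairs)
for file_a, file_b in suspects:
    print(f"Crosses a boundary: {file_a} <-> {file_b}")
```

Pairs that survive the filter are the long-distance journeys: changes that force you to work in two different parts of the system at once.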