So, You’ve Inherited a Legacy Codebase

Editorial Note: I originally wrote this post for the SubMain blog. You can check out the original here, at their site. While you’re there, have a look around at some of the other posts and sign up for the feed.

During my younger days, I worked for a company that made a habit of strategic acquisition. They didn’t participate in Time Warner style mergers, but periodically they would purchase a smaller competitor or a related product. And on more than one occasion, I inherited the lead role for assimilating the software from one of these organizations. Lucky me, right?

If I think in terms of how to describe this to someone, a plumbing analogy comes to mind. Over the years, I have learned enough about plumbing to handle most tasks myself. And this has exposed me to the irony of discovering a small leak in a fitting plugged by grit or debris. I find this ironic because two wrongs make a right. A dirty, leaky fitting reaches sub-optimal equilibrium and you spring a leak when you clean it.

Legacy codebases have this issue as well. You inherit some acquired codebase, fix a tiny bug, and suddenly the defect flood gates open. And then you realize the perilousness of your situation.

While you might not have come by it in the same way that I did, I imagine you can relate. At some point or another, just about every developer has been thrust into supporting some creaky codebase. How should you handle this?

Put Your Outrage in Check

First, take some deep breaths. Seriously, I mean it. As software developers, we seem to hate code written by others. In fact, we seem to hate our own code if we wrote it more than a few months ago. So when you see the legacy codebase for the first time, you will feel a natural bias toward disgust.

But don’t indulge it. Don’t sit there cursing the people that wrote the code, and don’t take screenshots to send to the Daily WTF. Not only will it do you no good, but I’d go so far as to say that this is actively counterproductive. Deciding that the code offers nothing worth salvaging makes you less inclined to try to understand it.

The people that wrote this code dealt with older languages, older tooling, older frameworks, and generally less knowledge than we have today. And besides, you don’t know what constraints they faced. Perhaps bosses heaped delivery pressure on them like crazy. Perhaps someone forced them to convert to writing in a new, unfamiliar language. Whatever the case may be, you simply didn’t walk in their shoes. So take a breath, assume they did their best, and try to understand what you have under the hood.

Get a Visualization of the Architecture

Once you’ve settled in mentally for this responsibility, seek understanding quickly. You won’t achieve this by cracking open the code and looking through random source files. But, beyond that, you also won’t achieve it by looking at their architecture documents or folder structures. Reality gets out of sync with intention, and those things start to lie. You need to see the big picture, but in a way that lines up with reality.

Look for tools that map dependencies and can generate a visual of the codebase. Plenty of such tools exist, and they can produce these depictions automatically. Find one and employ it. This will tell you whether the architecture actually resembles the neat diagram given to you. And, more importantly, it will get you to a broad understanding much more quickly.
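To make the idea concrete, here is a minimal sketch of what such a tool does under the hood, assuming a codebase made of Python files. The function names (imports_of, dependency_graph, to_dot) are hypothetical and written for illustration only; a real tool for your language will do far more. It reads each file, records which sibling modules it imports, and emits Graphviz DOT text that you can render into a picture.

```python
import ast
from pathlib import Path


def imports_of(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped to keep the sketch small.
            found.add(node.module.split(".")[0])
    return found


def dependency_graph(root: str) -> dict[str, set[str]]:
    """Map each module under root to the other modules under root it imports.

    Simplification: modules are keyed by file stem, so two files with the
    same name in different folders would collide.
    """
    files = {path.stem: path for path in Path(root).rglob("*.py")}
    graph = {}
    for name, path in files.items():
        deps = imports_of(path.read_text(encoding="utf-8"))
        graph[name] = {dep for dep in deps if dep in files and dep != name}
    return graph


def to_dot(graph: dict[str, set[str]]) -> str:
    """Render the graph as Graphviz DOT text, one edge per dependency."""
    lines = ["digraph deps {"]
    for module in sorted(graph):
        lines.append(f'  "{module}";')
        for dep in sorted(graph[module]):
            lines.append(f'  "{module}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)
```

Calling to_dot(dependency_graph("path/to/code")) and feeding the result to Graphviz gives you a picture drawn from the code itself rather than from anyone’s intentions.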

Characterize

Once you have the picture you need of the codebase and the right frame of mind, you can start doing things to it. And the first thing you should do is to start writing characterization tests.

If you have not heard of them before, characterization tests have the purpose of, well, characterizing the codebase. You don’t worry about correct or incorrect behaviors. Instead, you accept at face value what the code does, and document those behaviors with tests. You do this because you want to get a safety net in place that tells you when your changes alter the outputs the code produces for given inputs.

As this XKCD cartoon ably demonstrates, someone will come to depend on the application’s production behavior, however problematic. So with legacy code, you cannot simply decide to improve a behavior and assume your users will thank you. You need to exercise caution.

But characterization tests do more than just provide a safety net. As an exercise, they help you develop a deeper understanding of the codebase. If the architectural visualization gives you a skeleton understanding, this starts to put meat on the bones.
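As a rough illustration, a characterization test might look like the sketch below. The function legacy_shipping_fee is a hypothetical stand-in for inherited code, invented here for the example. Notice that the expected values come from running the code and writing down what it returned, not from any specification.

```python
import unittest


def legacy_shipping_fee(weight_kg, country):
    """Hypothetical stand-in for inherited code with quirks we do not yet understand."""
    if country == "US":
        fee = 5 + weight_kg * 2
    else:
        fee = 12 + weight_kg * 3
    if weight_kg > 20:
        fee = fee - 10  # Looks like a bug, but someone may depend on it.
    return round(fee)


class CharacterizeShippingFee(unittest.TestCase):
    def test_pins_observed_outputs(self):
        # Values recorded by observing the current behavior, right or wrong.
        observed = {
            (0, "US"): 5,
            (1, "US"): 7,
            (1, "CA"): 15,
            (21, "US"): 37,  # The odd discount for heavy parcels is pinned too.
        }
        for (weight, country), fee in observed.items():
            with self.subTest(weight=weight, country=country):
                self.assertEqual(legacy_shipping_fee(weight, country), fee)
```

Run it with python -m unittest. If a later change makes one of these cases fail, you have not necessarily broken anything, but you have changed behavior that somebody may rely on, and now you know about it before your users do.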

Isolate Problems

With a reliable safety net in place, you can begin making strategic changes to the production code beyond simple break/fix. I recommend that you start by finding and isolating problematic chunks of code. In essence, this means identifying sources of technical debt and looking to improve, gradually.

This can mean pockets of global state or extreme complexity that make for risky change. But it might also mean dependencies on outdated libraries, frameworks, or APIs. In order to extricate yourself from such messes, you must start to isolate them from business logic and important plumbing code. Once you have it isolated, fixes will come more easily.
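Here is one way that isolation can look, sketched under invented assumptions: a global settings dictionary (_LEGACY_SETTINGS) that business logic used to read directly. Every name in the example is hypothetical. The point is that only one small adapter touches the problematic global, and the business logic depends on a narrow interface instead.

```python
from typing import Protocol

# Hypothetical global state of the kind that accumulates in older code.
_LEGACY_SETTINGS = {"tax_rate": "0.07"}


class TaxRateSource(Protocol):
    """The narrow interface that business logic is allowed to depend on."""

    def tax_rate(self) -> float: ...


class LegacySettingsAdapter:
    """The only place that is allowed to touch the legacy global."""

    def tax_rate(self) -> float:
        return float(_LEGACY_SETTINGS["tax_rate"])


class FixedTaxRate:
    """A replacement source, handy in tests and later as the modern path."""

    def __init__(self, rate: float) -> None:
        self._rate = rate

    def tax_rate(self) -> float:
        return self._rate


def total_with_tax(subtotal: float, rates: TaxRateSource) -> float:
    """Business logic that no longer knows where the rate comes from."""
    return round(subtotal * (1 + rates.tax_rate()), 2)
```

Production code passes LegacySettingsAdapter() for now, while tests pass FixedTaxRate(0.5) or whatever they need. When the day comes to retire the global, you change one class rather than hunting through the whole codebase.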

Evolve Toward Modernity

Once you’ve isolated problematic areas and archaic dependencies, it certainly seems logical to subsequently eliminate them. And I suggest you do just that as a general rule. Of course, sometimes isolating them gives you enough of a win since it helps you mitigate risk. But I would consider this the exception and not the rule. You want to remove problem areas.

I do not say this idly, nor do I say it because I have some kind of early adopter drive for the latest and greatest. Rather, being stuck with old tooling and infrastructure prevents you from taking advantage of modern efficiencies and gains. When some old library prevents you from upgrading to a more modern language version, you wind up writing more code, and less efficient code at that. Being stuck in the past will cost you money.

The Fate of the Codebase

As you get comfortable and take ownership of the legacy codebase, never stop contemplating its fate. Clearly, in the beginning, someone decided that the application’s value outweighed its liability factor, but that may not always continue to be true. Keep your finger on the pulse of the codebase, while considering options like migration, retirement, evolution, and major rework.

And, finally, remember that taking over a legacy codebase need not be onerous. As initially shocked as I found myself with the state of some of those acquisitions, some of them turned into rewarding projects for me. You can derive a certain satisfaction from taking over a chaotic situation and gradually steering it toward sanity. So if you find yourself thrown into this situation, smile, roll up your sleeves, own it, and make the best of it.