The Myth of the Software Rewrite

Editorial note: this post was originally written for the NDepend blog, and you can read the original here. If you like the topics of static analysis and code metrics, there’s a lot you’ll love over there.

Editorial update: due to the unanticipated popularity of this post and the fact that I’m buried in work the first half of this week, I’m planning to write a detailed follow-up addressing some of the sentiments I’m seeing in the comments. That will be on the NDepend blog, where the original appeared. Stay tuned if you’re interested in the follow-up, and thanks for reading and commenting!

“We can’t go on like this. We need to rewrite this thing from scratch.”

The Writing is on the Wall

These words infuriate CIOs and terrify managers and directors of software engineering. They’re uttered haltingly, reluctantly, by architects and team leads. The developers working on the projects on a day-to-day basis, however, often make these statements emphatically and heatedly.

All of these positions are understandable. The CIO views a standing code base as an asset with sunk cost, much the way that you’d view a car that you’ve paid off. It’s not pretty, but it gets the job done. So you don’t want to hear a mechanic telling you that it’s totaled and that you need to spend a lot of money on a new one. Managers reporting to these CIOs are afraid of being that mechanic and delivering the bad news.

Those are folks whose lives are meetings, PowerPoints, and spreadsheets, though. If you’re a developer, you live the day-to-day reality of your code base. And, to soldier on with the metaphor a bit, it’s pretty awful if your day-to-day reality is driving around a clunker that leaves car parts on the road after every pothole. You don’t just start to daydream about how nice it would be to ride around in a reliable, new car. You become justifiably convinced that doing anything less is a hazard to your well-being.

And so it comes to pass that hordes of developers storm the castle with torches and pitchforks, demanding a rewrite. What do we want? A rewrite! When do we want it? Now!

At first, management tries to ignore them, but after a while that’s not possible. The next step is usually bribery — bringing in ping pong tables or having a bunch of morale-building company lunches. If the carrot doesn’t work, sometimes the stick is trotted out and developers are ordered to stop complaining about the code. But, sooner or later, as milestones slip further and further and the defect count starts to mount, management gives in. If the problem doesn’t go away on its own, and neither carrots nor sticks seem to work, there’s no choice, right? And, after all, aren’t you just trusting the experts and shouldn’t you, maybe, have been doing that all along?

There’s just one nagging problem. Is there any reason to think the rewrite will turn out better than the current system?

Would the Rewrite Go Well?

Let’s do a dispassionate play-by-play of the situation. A software group starts writing a piece of software and they’re productive at it. Over the course of time, as they hustle to get features out the door, they make a mess, always vowing to clean it up later, when they have the time. But they never have the time, because with every delivery cycle they ship fewer things, thanks to all the problems that have developed in the code. Eventually, features slow to a crawl, developers are increasingly miserable, and the group suffers attrition as people start heading to other groups or companies for greener (field) pastures. Things are in a downward spiral and something must be done. The developers want that something to be a total rewrite. “This time,” they say, “we know so many things we didn’t know when we started the current system, so this time we’ll get it right.”

Insanity: doing the same thing over and over again and expecting different results. (Commonly attributed to Albert Einstein)

Sure, they know things now that they didn’t know when they started on this code 3 years ago. But won’t the same thing be true in 3 years? Won’t the developers then be looking at the code and saying, “This is a mess. If only we knew in 2015 what we now know in 2018!” And, beyond that, what makes you think that giving the same group of people the same marching orders won’t result in the same kind of code?

The “big rewrite from scratch because this is a mess” is a losing strategy.

Don’t get me wrong. There are certainly times when old software needs to be phased out in favor of more modern stuff. If you have code specific to hardware that is no longer manufactured, you’re better served building new software than scrounging eBay for resellers of the old server model. If you have a line-of-business application written in a defunct language that no one knows anymore, you’ll probably need to bite the bullet and commission something modern. But you don’t need to rewrite software because developers made a mess of it while hurrying to meet deadlines.

It’s a long road back from a mess, but the road exists. You can use automated tooling to identify the most dangerous parts of the code and start improving them. Automated tests are your friend: characterize the system’s current behavior with lots of automated tests and then work on refactoring. Bring in coaches or developers who are used to legacy rescues. Shift the team’s priorities and help the business understand that it’s time to pay the piper on the accumulated technical debt. They’re going to have to deal with a slowdown in the short term to go faster, sustainably, over the long term. And they’re in no position to complain, because that’s exactly what a rewrite would mean too. It’s just that this approach is an actual game changer and not just more of the same.
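
To make the idea of characterizing current behavior concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `legacy_shipping_cost` stands in for a tangled function from the existing code base, `refactored_shipping_cost` for its cleaned-up replacement, and the prices are invented for illustration. The point is only the shape of the technique: record what the old code does today, quirks included, and then check that the refactored code agrees on every recorded case.

```python
import itertools


def legacy_shipping_cost(weight_kg, express, country):
    # Hypothetical stand-in for messy legacy code. We do not judge it;
    # we only record what it returns.
    cost = 0
    if country == "US":
        if weight_kg > 10:
            cost = 20
        else:
            cost = 10
        if express:
            cost = cost * 2
    else:
        if weight_kg > 10:
            cost = 45
        else:
            cost = 25
        if express:
            cost = cost * 2
    return cost


def refactored_shipping_cost(weight_kg, express, country):
    # Hypothetical cleaned-up version that must preserve existing behavior.
    light, heavy = (10, 20) if country == "US" else (25, 45)
    base = heavy if weight_kg > 10 else light
    return base * 2 if express else base


def characterize(func, cases):
    # Record the function's current outputs, whatever they are, as the baseline.
    return {case: func(*case) for case in cases}


def find_regressions(func, baseline):
    # Return every case where the new code disagrees with the recorded baseline,
    # mapped to an (expected, actual) pair.
    return {
        case: (expected, func(*case))
        for case, expected in baseline.items()
        if func(*case) != expected
    }


# Cover the boundary (a weight of exactly 10) as well as typical inputs.
CASES = list(itertools.product([0, 5, 10, 10.5, 30], [False, True], ["US", "CA"]))
BASELINE = characterize(legacy_shipping_cost, CASES)

if __name__ == "__main__":
    regressions = find_regressions(refactored_shipping_cost, BASELINE)
    print("regressions:", regressions)
```

In a real rescue, the baseline would typically be captured once and stored alongside the tests, so that each small refactoring step can be checked against it before moving on to the next.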

When everyone on the project is at their wits’ end and people are finally past the shock of the notion of a write-off, the rewrite is tempting. It’s like making peace with having a car payment and starting to get excited about the newfangled dashboard computer and leather seats of the luxury thing you’re going to buy next. But software isn’t a car. The software is a mess because the group made it a mess, and it’ll only get and stay clean if the group cleans it.