The Rule of Three is a fundamental technique for discovering the right reusable piece of code. It can also be applied at many scales, including Distributed Systems.

A civil engineer, a mathematician, and a programmer are driving down the road when the car breaks down.

Once they get out, the civil engineer says:

Before I can solve the problem, I need time to disassemble the whole car and understand how each part fits.

The mathematician says:

Before I can solve the problem, I need to model all the variables that led to this outcome.

The programmer says:

I have an idea: let's all get back in the car and then get out again!

The Rule of Three can make a joke land, enable the discovery of a mathematical pattern, or improve the chances of success for a project like StackOverflow. In refactoring, it's the number of times you recreate a piece of code to solve the same problem before you're likely to discover the best pattern. Once you do, you can decide to remove the duplication and build a more efficient reusable abstraction, if it makes sense.
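As a minimal sketch of the refactoring version of the rule (all function names here are hypothetical), three independent solutions to the same problem can reveal the pattern worth extracting:

```python
# Three hypothetical functions written independently, each solving the
# same problem: a required string field that must be trimmed and lowercased.

def normalize_email(value):
    if not value:
        raise ValueError("email is required")
    return value.strip().lower()

def normalize_username(value):
    if not value:
        raise ValueError("username is required")
    return value.strip().lower()

def normalize_tag(value):
    if not value:
        raise ValueError("tag is required")
    return value.strip().lower()

# After the third duplication, the pattern is clear enough to extract
# one reusable abstraction and remove the copies:

def normalize_required(field_name, value):
    if not value:
        raise ValueError(f"{field_name} is required")
    return value.strip().lower()
```

Only after seeing the third copy do you know that the field name is the single thing that varies, which is what makes the abstraction safe to build.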

The Rule of Three enables you to create better reusable abstractions, but it's not cost-free. If you need to produce three unique samples of the code before you can discover a pattern, that also means you have to bear the cost of solving the same problem three times.

Copy/paste is not allowed.

It’s not a big deal to build the same thing 3 times in a system where it’s cheap to produce each part. You can accept the cost to recreate code and use that duplication in your favor. However, when the cost to recreate each part of the system is enormous, things get scary. That's when the benefits of the Rule of Three start to become debatable.

The Rule of Three comes with a cost: you have to solve the same problem three times.

A network of computers represents a conglomerate of machines running services. The services talk to each other to produce an outcome. Let's call it a Distributed System.

A system running in a single service represents a conglomerate of small functions or classes. They also talk to each other to produce an outcome. For this post, let's call it an Isolated System.

The difference between a Distributed System and an Isolated System is the distance between their internal components and the effects that distance can cause. If you assume both of them are decently modularized and extensible, the fundamentals and techniques used to build a Distributed System are the same as the ones used to build an Isolated System.

Here's a thought.

Suppose the construction of a component has a cost c. The Cost Of Recreation with the Rule of Three is 3c. If you add the Cost Of Refactoring r to remove the duplication, the total cost of the component is 3c + r. The Cost Of Recreation, 3c, becomes more prominent as the price to recreate the component increases. However, whether you're working in an Isolated System or a Distributed System, the formula remains the same.
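The formula is trivial to write down, which is the point: the same arithmetic covers both kinds of systems, and only the magnitude of c changes. A toy model, with hypothetical numbers:

```python
# Toy model of the cost formula above: 3c + r.
# All figures are hypothetical, chosen only to show the scale difference.

def rule_of_three_cost(c, r):
    """Cost of building a component three times, plus one refactoring."""
    return 3 * c + r

# In an Isolated System, c is small relative to the budget:
print(rule_of_three_cost(c=1_000, r=500))          # 3500

# In a Distributed System, the formula is identical; only c is bigger:
print(rule_of_three_cost(c=1_000_000, r=200_000))  # 3200000
```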

If you recreate a small service 3 times, once there's enough evidence you have found a pattern, you can build a single service that removes the duplication and becomes reusable.

However, just because you can, that doesn't mean you should.

You can apply the Rule of Three in the context of a Distributed System.

43 years ago, in 1975, Fred Brooks introduced the idea of the Second-System Effect:

An architect’s first work is apt to be spare and clean. He knows he doesn’t know what he’s doing, so he does it carefully and with great restraint. […] The general tendency is to over-design the second system, using all the ideas and frills that were cautiously sidetracked on the first one. The result […] is a “big pile.” — Fred Brooks, The Mythical Man-Month (1975), page 55.

According to the Rule of Three, you should not remove the duplication after you create the second system. You need to create a third system to triangulate and prove you’re not under the influence of the Second-System Effect.

When you apply the Rule of Three in the context of a Distributed System where each service costs $1 million to recreate, you need to fight against that tendency to over-design the second system. In an Isolated System, that tendency is much weaker because the cost to recreate its parts is lower relative to the budget of the project.

Let's say there's a project where the budget is $1 million. In that project, the team has decided it's ok to recreate a service that costs up to $1k, but not ok if it costs $20k. Using the same logic, if the team works on another project where the budget is $1 billion, it's ok to recreate a service that costs up to $1 million, but not ok if it costs $20 million. Still, when significant figures are involved, it's easy to become biased by the magnitude of the numbers and ignore the relative nature of the issue.
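The thresholds in that scenario are ratios of the budget, not absolute numbers. Using the hypothetical figures from the text:

```python
# The recreation threshold is relative: the same ratio of cost to budget
# is acceptable in both projects, even though the absolute numbers differ
# by three orders of magnitude. Figures are the hypothetical ones above.

def cost_ratio(service_cost, budget):
    return service_cost / budget

print(cost_ratio(1_000, 1_000_000))          # 0.001 -> ok to recreate
print(cost_ratio(20_000, 1_000_000))         # 0.02  -> too expensive
print(cost_ratio(1_000_000, 1_000_000_000))  # 0.001 -> same ratio, still ok
```

A $1 million service in a $1 billion project sits at the same 0.1% ratio as a $1k service in a $1 million project; the bias comes from reading the absolute figure instead of the ratio.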

The bigger the cost of the service, the higher the tendency to misuse the Rule of Three, or not use it at all.

You can apply the Rule of Three in the context of a Distributed System, but it depends on the cost of the service and the budget of the project.

However, the problem is not that you can't apply the Rule of Three in a Distributed System because each service costs $1 million. It’s that each service costs $1 million; therefore you can't apply the Rule of Three.

If the Distributed System is not built using principles like SOLID and Bounded Contexts, the Cost Of Recreation is higher. If you reduce the Cost Of Recreation, you fix the problem.

Here's an idea.

In an Isolated System, if a class costs too much and you need some functionality from within it, you recreate only the parts you need. You don't recreate the whole thing if you don't have to. Once 2 different classes are doing the same thing, and it makes sense for them to be a single component, you use the learnings from those recreations to build one reusable class containing the concern you need.
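A minimal sketch of that extraction, assuming a hypothetical expensive class where the retry loop is the only concern worth reusing:

```python
import time

# Hypothetical: a large, expensive class. Only the retry logic inside
# it is the functionality other code keeps needing.
class PaymentClient:
    def charge(self, amount):
        for attempt in range(3):
            try:
                return self._send(amount)
            except ConnectionError:
                time.sleep(2 ** attempt)
        raise ConnectionError("charge failed after 3 attempts")

    def _send(self, amount):
        ...  # network call elided

# Once two other classes have recreated the same loop, extract only
# that concern into one reusable function, not the whole class:
def with_retries(operation, attempts=3, base_delay=1.0):
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

The expensive class stays where it is; only the part you needed three times becomes the cheap, reusable component.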

[Diagram: a big class with one part of its functionality extracted. Before extracting it, create 2 other classes with code that solves the same problem; later, merge all the parts into one reusable class if it makes sense to do so.]

I have worked in Isolated Systems where a single class had an enormous Cost Of Recreation relative to the budget of the project. Even with that cost, you could understand and recreate parts of the class using the Rule of Three.

If you have monstrously expensive services, recreating them becomes scary. To deal with that, don't recreate them all one by one. If you need some functionality from within a service, recreate just the parts you need so that you can end up with cheaper services that won't have the same problem again.

[Diagram: a big service with one part of its functionality extracted. Before extracting it, create 2 other services with code that solves the same problem; later, merge all the parts into one reusable service if it makes sense to do so.]

If it's too expensive to apply the Rule of Three in a Distributed System, that's a smell you should rethink the design.

The Rule of Three, like any other technique, can hurt progress if used as a Silver Bullet. If you don’t have the discipline to remove the duplication on the third time, there's a risk the system becomes an unmanageable mess given the amount of duplicated cruft.

Also, if you already have strong domain experience, there's no need to recreate a system three times to find out if a type of architecture is a good fit or not. Some architectural models are self-evident for certain types of domains. One example is Event Sourcing, which is a great architecture to store transactions in the domain of accounting.

The software industry is enormous. Most of the time you're working without enough domain knowledge. If you recreate things at least three times, you Test-Drive your assumptions and get early feedback to prove if your code is a match to the problem you are trying to solve. That allows you to build domain expertise quickly and figure out if there was a need for reusability at all.

The code is not essential; your understanding of the problem is. This rule increases your speed toward a meaningful direction, instead of no direction at all.

The Rule of Three is part of a Test-Driven Development approach that helps to find the best possible solution for a problem.

The Rule of Three is not worth it if you recreate a big class. It's also not worth it if you recreate a big service.

However, the existence of expensive components is a smell that you should redesign the system to make them cheaper to change. Isolate and recreate only the parts you need until that problem is not a problem anymore.

It doesn't matter how fast you code or how quickly you evolve the components of a system.

If you exercise your ability to solve the same problem three times, you’ll always come up with code that is better and richer than what you had before.

Then, only then, you'll see if there's a reason to remove the duplication.

Only then, you'll have a challenge that is worth solving.