Addressing concerns about small functions

The author raises several points about small functions, and how she thinks they can be abused. These concerns are, I think, relatively common whenever the size and complexity of functions is debated, so let’s walk through them.

I do want to point out that these concerns are held by real people, who’ve worked in real code bases, and experienced real frustration. I don’t want to minimize that, my hope here is that we can redirect that frustration in a way that results in higher quality code.

“Do one thing” is fuzzy and sometimes taken too far

What, exactly, is the one thing that a given function should do? That’s often not an easy question to answer. How far is too far when trying to reach that one thing? The author gives us an example that she seems to indicate would be best expressed as a single function:

For instance, in a web application, a CRUD operation like “create user” can be “one thing”. Typically, at the very least, creating a user entails creating a record in the database (and handling any concomitant errors). Additionally, creating a user might also require sending them a welcome email. Furthermore, one might also want to trigger a custom event to a message broker like Kafka to feed this event into various other systems.

This is actually a really good example of something that should be broken out into different functions. It’s doing several things, it’s operating at different levels of abstraction, and (perhaps worst of all) it has side-effects the caller is unlikely to anticipate. If you call something like User.create would you expect that it would send an email? That’s a recipe for disaster.

Let’s walk through this example to see why this would be a problematic way to design our code.

The funny thing is this method isn’t actually that long. It’s only six real lines of code. I probably wouldn’t really think of its length as a smell if I found it in a pull request. The smell isn’t just its size, it’s all the various things its doing and how misleading the message name is.

Now, imagine this method were long enough that it was smelly by size alone. Think about how much more it would be doing with ten lines or twenty lines. How about fifty or a hundred? The definition of “big” might vary depending on language, but the truth is nearly unavoidable: a method is highly unlikely to be both big and well-designed.

In only six lines we’re performing at least three seperate functions, we’re operating at different levels of abstraction, and OH the dependencies —three classes and six method signatures! In just six lines we’ve written something that can be broken by a multitude of unrelated code.

This is a problem with big functions: they are inherently fragile. Would you expect that by making a change in the WelcomeEmail class you’d potentially break your User model? I think we’ve all been there, and it’s not a place we typically like to visit.

Finally, this method is very challenging to test. In order to write a simple unit test we either have to mock all of these dependencies, write an integration test instead, or just give up out of frustration. Frustration when writing tests is a huge, glowing red flag: it’s telling you, “Hey! Stop! There’s something wrong with the code you’re trying to test.”

Instead, many developers will just get fed up with testing because it doesn’t accommodate their god methods or because they don’t feel like they have time to refactor code that, in all likelihood, they didn’t write. Then you end up with the worst of all worlds: an app full of huge, untested methods. Things break when you don’t expect them to and you don’t even know it until it’s in production because your test coverage is in the garbage.

You need to make a small change to support some trivial feature, but you open up the code and find it’s buried in a two hundred line behemoth. You now have to read, understand, and load the entire thing into your mental memory before you can make your trivial change. Even then you have no confidence that your change won’t break something unrelated.

DRY and modular? The horror!

DRY-driven functions can lead to the wrong abstraction

This is a fair concern. In the relentless pursuit of not repeating anything we pull code out into the wrong abstraction. That practice is worse than the original duplication.

We don’t really disagree here. You shouldn’t abstract code in the wrong way just to make a method shorter. But a wrong abstraction usually takes the form of a class, not just a method. Classes form the nouns, the ideas or concepts that we’re trying to represent. These concepts are what we’re abstracting. You’re methods should be the verbs, the simple actions that those nouns can perform.

Creating the wrong class can be painful to refactor later on, but there’s very little danger in pulling out some code into a private method in the same class. In fact, refactoring by extracting methods out of large methods can reveal the need for additional refactorings — such as the data clump code smell which may indicate a new object would be useful.

Long names are hard to read and understand

What could’ve been, say, 4–5 lines of code is now stashed away in a function that bears an extremely long name. When I’m reading code, seeing such a verbose word pop up suddenly gives me pause as I try to process all the different syllables in this function’s name, try to fit it into the mental model I’ve been building thus far and then decide whether or not to investigate the function in greater detail by jumping to its definition and reading the implementation.

Naming things is hard. Mostly because coming up with a word or two to represent the concept that you’re looking at requires that we back up from the context we’ve been working in. If you abstract a function and it really has an absurdly long name there’s a good chance some additional refactoring is needed. Perhaps a new object should be abstracted to represent the concept, for instance.

I will say that I’ve heard this argument before, and it’s usually related to the “comments are/aren’t a code smell” debate. Opponents of the comments as code smells view will sometimes point to the edge case where a function name would become very long in absence of a comment. The problem is that it’s just that, an edge case. The main case is that code blocks can be pulled into methods with perfectly reasonable names.

Loss of locality

Is it better to shimmy new functionality into existing methods than write new ones to handle these cases? It’s something of a trick question: I would submit that neither approach is really ideal. You should be able to change the behavior of existing code by extending it (with new objects) rather than modifying it with new functions that change the interface of an existing object (Open/Closed Principle).

The article rightly points out the flaw in adding new methods for every new use case, but (in my view) wrongly concludes that the solution to that problem is to just extend existing functions. The clean solution is likely to involve new classes, with their own interfaces developed in light of new requirements.

On this note, I’ve seen this objection raised before:

Small functions work best when we don’t have to jump across file or package boundaries to find the definition of a function.

All modern editors allow you to quickly jump to the definition of a function with a shortcut. Any reasonably complex code that’s well designed will require you to navigate multiple files. SOLID virtually ensures this. Don’t constrain the design of your system because a better design would mean more files.