Constraining Myself to Neutral and Positive “Code Golf”

One thing I tried doing halfway through was to make sure I didn't engage in any code golf that makes the codebase worse in the name of test coverage. (Depending on how you define code golf, it might tautologically mean making the codebase worse. Here, I just mean any creative rearranging of code for the sake of coverage.)

But what I found was that some code golf actually helped me think more clearly about the code. Even in cases where the codebase itself looked the same, I now had a new vocabulary to talk about error conditions. In some other cases, covering lines of code drove me from making run-time guarantees in the code to making compile-time guarantees in the code (definitely a positive).

Let’s walk through some examples.

Neutral: Clarifying assertions versus errors

I had transformation functions that took RDF triples and converted them to an intermediate representation of classes. These functions had some issues:

Case Study A: Intra-function impossibility

There was a line in my code that my tests never covered. It looked something like this:
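The original snippet is elided here; this is a hypothetical reconstruction, where `ClassType`, `toClassMap`, and the contents of `wellKnownTypes` are all illustrative names of mine:

```typescript
interface ClassType {
  name: string;
}

// Hard-coded, so the map below always starts non-empty:
const wellKnownTypes: ClassType[] = [{ name: 'Text' }, { name: 'Number' }];

function toClassMap(triples: ReadonlyArray<string>): Map<string, ClassType> {
  const classes = new Map<string, ClassType>();
  classes.set('Thing', { name: 'Thing' }); // classes is non-empty from here on...
  for (const wk of wellKnownTypes) {
    classes.set(wk.name, wk); // ...and gains a few more entries.
  }
  // ... build additional classes from the RDF triples ...
  if (classes.size === 0) {
    // The uncovered line: unreachable given the sets above.
    throw new Error('Expected non-empty classes, got none.');
  }
  return classes;
}
```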

As you can see from just reading this function, (classes.size === 0) will never happen. For one, there's a classes.set(K, V) call a few lines above. And we set a few other key-value pairs from the wellKnownTypes array, which is hard-coded to always have a set number of elements.

One can try to understand the point of this error: it could be to show that none of the RDF triples we received were turned into classes (in which case we might want to compare classes.size with wellKnownTypes.length + 1 instead). Alternatively, it could be a poorly-placed verification from a time when we had less confidence that we were building classes properly and had no clear concept of such well-known types.

In my case, creating a map with just the well-known types seemed fine. If the ontology is empty or missing data, we're likely to catch that at an earlier or later step, and this error gives no clear context as to what's going wrong. So the answer is to kill it:
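In the same hypothetical sketch (again with illustrative names), the fix is simply deleting the check:

```typescript
interface ClassType {
  name: string;
}

const wellKnownTypes: ClassType[] = [{ name: 'Text' }, { name: 'Number' }];

function toClassMap(triples: ReadonlyArray<string>): Map<string, ClassType> {
  const classes = new Map<string, ClassType>();
  classes.set('Thing', { name: 'Thing' });
  for (const wk of wellKnownTypes) {
    classes.set(wk.name, wk);
  }
  // ... build additional classes from the RDF triples ...
  return classes; // The impossible `classes.size === 0` check is gone.
}
```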

Case Study B: Inter-function assertion

Another error I saw looked like this:
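A rough reconstruction, where the function names follow the prose but the types and trivial bodies stand in for the real logic:

```typescript
interface TypedTopic {
  subject: string;
  types: string[];
}
interface Cls {
  name: string;
  members: TypedTopic[];
}

function IsClass(topic: TypedTopic): boolean {
  return topic.types.includes('rdfs:Class');
}

function ForwardDeclareClasses(topics: readonly TypedTopic[]): Map<string, Cls> {
  const classes = new Map<string, Cls>();
  for (const topic of topics) {
    if (IsClass(topic)) {
      // Unconditionally adds every class topic to the map:
      classes.set(topic.subject, { name: topic.subject, members: [] });
    }
  }
  return classes;
}

function BuildClasses(topics: readonly TypedTopic[], classes: Map<string, Cls>): void {
  for (const topic of topics) {
    if (!IsClass(topic)) continue;
    const cls = classes.get(topic.subject);
    if (!cls) {
      // Never covered: ForwardDeclareClasses always added this key.
      throw new Error(`Unexpected missing class ${topic.subject}.`);
    }
    cls.members.push(topic);
  }
}
```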

In this case, the throw never happened (the (!cls) condition was always false). This should make sense: ForwardDeclareClasses literally checks whether a TypedTopic satisfies IsClass() and, if so, unconditionally adds it to the map. BuildClasses then asserts that a topic matching IsClass exists in the map.

One way to get test coverage for this line is to export BuildClasses and test it. But that seems like it goes against the spirit of making the codebase better. A better approach is to ask what this line is trying to do.

Interlude: Expectations, errors, and assertions

Sometimes, we assert things because they…

are error conditions that can happen in the wild due to poor data or input,

shouldn’t happen, and if they did it’s a sign of a bug in our code, or

shouldn’t happen, and if they did it’s a sign of cosmic radiation.

I decided to differentiate these. If my test coverage report complains about an uncovered:

1. error condition — I should test it. If I can't, I should refactor my code to make it testable;

2. assertion — that might indicate a bug; some code golf to make these always run might be in order (more on that in a bit);

3. assertion that's totally impossible — maybe I should delete it.

I refer to #1 as an error condition — test these.

For assertions, I found that the line between #2 and #3 is often the function boundary (though this isn't always true). Intra-function assertions (like Case Study A above) seem so useless that we're better off removing them. Inter-function assertions (like this case) seem useful enough to stay.

The fix

I found that this distinction isn't just splitting hairs; it's also very helpful for someone reading the code. Is this error something that can happen in normal operation, or is it a sign of a bug? I decided to make this clear:

Normal error conditions: if + throw, or similar.

Bug assertions: assert and assertXyz variants.

With that, I ended up with this change:

Here, thinking about covering a line of code fundamentally helped me communicate what my code does more effectively. A lot of the “moving things around” that I had to do was very much semantic code golf (that happily happens to give a better test coverage score), but I’d like to think it’s a net positive.

Positive: Restructure code to achieve compile-time guarantees rather than run-time guarantees

I already showed that some lines that are never covered by test runs are assertions that should never happen. Sometimes, we can restructure our code to make compile-time claims about its structure, rather than asserting them at run time.

I’ll be more specific: my code has a parseComment function that uses htmlparser2 to turn HTML comments into JSDoc-tagged comments. In that code, we define a new htmlparser2.Parser that handles known tags and throws on unknown tags. It looks something like this:
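A standalone sketch of the handlers (the actual htmlparser2.Parser wiring is elided, and the tag set and replacement strings are assumptions of mine):

```typescript
// In the real code, these callbacks are passed to
// `new htmlparser2.Parser({ onopentag, ontext, onclosetag })`.
const onOpen = new Map<string, () => string>([
  ['br', () => '\n'],
  ['li', () => '\n * '],
]);
const onClose = new Map<string, () => string>([
  ['br', () => ''],
  ['li', () => ''],
]);

function onopentag(name: string): string {
  const handler = onOpen.get(name);
  if (!handler) {
    // Coverable: a test with an unknown tag reaches this line.
    throw new Error(`Unknown open tag: ${name}.`);
  }
  return handler();
}

function onclosetag(name: string): string {
  const handler = onClose.get(name);
  if (!handler) {
    // Never covered: the parser throws on the unknown *open* tag first.
    throw new Error(`Unknown close tag: ${name}.`);
  }
  return handler();
}
```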

Initially, both unknown-tag checks in the above snippet were uncovered. The one in onopentag is easy: I added a test for an unknown tag and saw it fail. Great — our test coverage now includes that line.

I couldn’t get the check in onclosetag covered, though. If an unknown opening tag exists, the parser throws on open. A self-closing tag counts as an open tag. And closing a never-opened tag is bad HTML that doesn’t actually register as a closed tag; per the HTML spec, an end tag token with no matching start tag is not a valid token.

In other words, the unknown-tag check in onclosetag will never trigger.

The fix

Should we remove it? Well, right now, with the code as written, this check actually has some utility: if a developer adds handling for a new start tag (e.g. <table> or <td>), we'll notice a run-time error if they omit the corresponding end tag handler.

That’s kind of useful. But the nice thing about using TypeScript is that we can structure our code so that run-time guarantees are turned into compile-time guarantees.

In our case, this change made things much better. Here’s a summarized version:
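A hypothetical summary (tags and replacement strings are illustrative):

```typescript
// One handler object per tag, so open and close handling can't drift
// apart; the TagHandler interface is enforced at compile time.
interface TagHandler {
  open(): string;
  close(): string;
}

const handlers = new Map<string, TagHandler>([
  // Omitting `open` or `close` in an entry is now a compile-time error.
  ['br', { open: () => '\n', close: () => '' }],
  ['li', { open: () => '\n * ', close: () => '' }],
]);

function onopentag(name: string): string {
  const handler = handlers.get(name);
  if (!handler) {
    throw new Error(`Unknown tag: ${name}.`);
  }
  return handler.open();
}

function onclosetag(name: string): string {
  // No unknown-tag assertion needed: any tag with an open handler
  // necessarily has a close handler, by construction.
  return handlers.get(name)?.close() ?? '';
}
```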

By unifying the open and close tag handlers, a user can't forget to add a close tag handler (or vice versa), since the compiler enforces a unified interface for what handling a tag looks like.

In general, assertions and guarantees about parallel code structures (e.g. parallel switch cases) and parallel data structures (e.g. parallel lists or maps) are tenuous. If you have two structures (code or data) that need to agree about what elements they contain, you'll often benefit from rearranging them into a single structure. Google's Testing Blog covered the data-structure version of this in its Code Health: Obsessed With Primitives? episode.