Hyperbolic title? Of course! It came from my ‘Scaling my Server’ blog post. It was a bold statement, it stirred up controversy, and I paid for it with the pundits. But I broadly meant it, and it all comes down to mental modeling and testing. Specifically, unit testing. Unit testing is overrated.

(To said pundits: you know I’m going to tell it as I see it, and you’ll be able to wedge in your favourite niche language to ‘disprove’ some small part of my reasoning or other, but ‘unmaintainable’ is a subjective thing and everyone has different pain thresholds, so you’ll just have to agree to differ with me :) )

The fun thing about maintaining dynamic language programs (like the Python server I was commenting on) is that the language is, duh, dynamic. A piece of code cannot be trivially checked for a whole swathe of typo bugs at compile time. If you misspell a variable, or have your property attached to the wrong object, you discover this at run time. Ouch.
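To make that concrete, here is a minimal sketch. The `Connection` class and its attribute names are made up for illustration; they are not from the server I was writing about. The misspelled assignment quietly creates a new attribute, and the misspelled read only blows up when that exact line runs.

```python
class Connection:
    """Hypothetical class, for illustration only."""

    def __init__(self):
        self.timeout = 30


conn = Connection()

# Typo on assignment: Python happily creates a brand-new attribute,
# and the real `timeout` is left untouched.
conn.timeuot = 5

# Typo on read: nothing complains until this exact line executes.
try:
    conn.time_out
    typo_error = None
except AttributeError as exc:
    typo_error = exc
```

Neither mistake is caught before the program runs, and the first one never raises at all.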

So you combat this with two approaches:

First, you might try and use static code analysis tools.

The problem with this approach is that, because it is a dynamic language duh, the static code analysis tools (we’re not talking Smalltalk) are limited. The mainstream ones all balk at a Twisted app. They all basically don’t work, and I say this as someone who has spent an inordinate amount of time trying to think about how to fix them so they can follow callbacks through event loops and deal with overloading and so on; I’ve been quite involved in the static analysis of Python discussion recently, for example. Of course “dynamic languages” is a broad brush and in truth there’s a spectrum, with niche languages that try to be amenable to refactoring to varying extents, but the mainstream “hash languages” like Python are pretty limited in this regard.

So you try and bang your code about to make it pass one analyser and find it won’t now pass another; you find it not finding bugs, or you drown in false positives, or you end up writing verbose, statically typed programs in a dynamic language. You even contemplate resurrecting Hungarian Notation. Ouch!

Second, and more successfully, you write lots of unit tests.

In a statically typed language you are trying to test that your unit is doing what you, the programmer who wrote it, expect it to do for a variety of valid and perhaps invalid input.

In a dynamically typed language you move down a level of detail and try to devise tests that check for typos, which is an interesting challenge. You end up with more unit tests and more coupled unit tests. Your unit tests end up distilling a lot of your innate knowledge of the other bits of code that some code interacts with as you try to fill out the contracts of all the mocked objects you’re passing around.

And then you eventually have version 1.0 of your product. Being agile and a cool scrummer you probably call it something else, but let’s call it version 1.0. That version that first reaches some kind of usefulness and stability at the same time and people start using… and now you’re in maintenance phase.

Now it’s time for your first minor rewrite. Just some sub-part of your program needs to be redesigned and reimplemented to do something slightly differently:

How do you refactor dynamic languages? How can you change the name of an object’s method, or the meaning of a particular parameter? Refactoring dynamic languages is not a solved automated problem and it all comes back to the static analysis problem. So you end up doing much of it manually, and having to try and run your program in your head to work out where you should go make related changes at the other end of chained event handlers and callbacks.

And here is that insidious problem; all your unit tests have frozen the interaction of your components. They have copied the API of your components and distributed it all over your test code base. This is all the more true if you get on the mocking vs stubs bandwagon and check units of code by their interaction with mocked components rather than by the outcomes; if you have been unit testing behaviour rather than state.

And those unit tests that you don’t visit will still succeed, even though the code will fail at run time. Ouch!
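Here is a small sketch of how that happens, using the standard library’s `unittest.mock`. The `Mailer` class, the `notify` function and the method names are all hypothetical, invented just to show the shape of the problem: the collaborator’s method has been renamed, the caller has not been updated, and the mock-based test is none the wiser.

```python
from unittest.mock import Mock


class Mailer:
    """Hypothetical collaborator. Its method used to be called `send`;
    it has since been renamed to `deliver`."""

    def deliver(self, address, body):
        return True


def notify(mailer, address):
    """Code under test; it still calls the old method name."""
    return mailer.send(address, "hello")


def test_notify_with_mock():
    # A bare Mock accepts any attribute, so the stale name still "works".
    mailer = Mock()
    notify(mailer, "a@example.com")
    mailer.send.assert_called_once_with("a@example.com", "hello")


test_notify_with_mock()  # passes, despite the rename

try:
    notify(Mailer(), "a@example.com")
    runtime_error = None
except AttributeError as exc:
    runtime_error = exc  # the real object has no `send`
```

Building the mock with `Mock(spec=Mailer)` narrows the gap, because a spec’d mock raises `AttributeError` for names the real class lacks. But that only helps in the tests where somebody remembered to do it.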

The way out of this mess is integration testing.

Tests that put data in at the top and check what comes out the bottom of your program are far, far more useful! They also, in my experience, find by far the most real bugs and guard best against regressions.

So you want some framework that runs your program (e.g. your whole web-server) for real, feeds it real inputs, and lets your code talk to all the external data stores it needs (canned, or worst case mocked, so be it, but you really want a little local database for integration tests if you can set it up). Using as many real components as possible is a laudable goal and if you can move your integration testing so high as to have automated system testing, go for it! You won’t find all your unicode and blob handling bugs if you mock out MySQL, I can tell you ;)
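As a rough sketch of the “little local database” idea, here is a round-trip test against an in-memory SQLite database from Python’s standard library. SQLite is only a stand-in so the example runs anywhere; it will not reproduce MySQL-specific encoding bugs, so point the same test at a local MySQL if that is what you run in production. The `notes` table and the `save_note` and `load_notes` functions are hypothetical application code.

```python
import sqlite3


def save_note(db, title, attachment):
    """Hypothetical application code: store a title and a binary blob."""
    db.execute(
        "INSERT INTO notes (title, attachment) VALUES (?, ?)",
        (title, attachment),
    )
    db.commit()


def load_notes(db):
    """Hypothetical application code: read everything back."""
    return db.execute("SELECT title, attachment FROM notes").fetchall()


def test_round_trip():
    # A real (if tiny) database engine instead of a mock.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE notes (title TEXT, attachment BLOB)")

    title = "naïve café ☃"
    blob = bytes(range(256))  # every byte value, including NUL
    save_note(db, title, blob)

    # Data in at the top, data out at the bottom, nothing mocked.
    assert load_notes(db) == [(title, blob)]
    db.close()


test_round_trip()
```

The test knows nothing about how the two functions talk to the database, so you can rewrite their insides freely and it keeps telling you the truth.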

What’s so special about all the bugs in your program? They passed all your tests, quipped Rich Hickey. Well, when that happens, do some test-driven debugging. Add a test case to your integration test that recreates the bug, and use that for debugging and to validate the fix.

So as you head forwards in the maintenance phase, integration tests grow in utility whilst unit tests are often inadvertent brakes on productivity. There’s always code that is suited to unit testing, so unit test it. But don’t overuse unit testing; it’s overrated.

So this blog post started dishearteningly and then integration testing came riding to the rescue. The sad truth is, it didn’t. Integration tests still let bugs through. The unit testing was too intrusive and made it difficult to make large changes to your program; and integration tests, even as they grow over time, don’t really cover everything. There is no straightforward level ground between unit testing and integration testing and your program still has bugs you’ll find only in production.

So integration tests don’t make dynamic languages maintainable, they just prolong their usefulness until about midway into the maintenance phase. Because by midway into your maintenance phase, mental modeling overload kicks in.

Once your program is big enough, and you have had it on a back burner for long enough, you have basically no chance of remembering the flow of your program through all those callbacks and event loops and you wrote it. Now pity the maintenance programmer.

Being able to right-click on some identifier in an IDE and see its declaration and all references is your saviour. In statically typed languages, of course. In dynamic languages, well… I said at the top of this blog post that static analysis of dynamic languages is an unsolved problem :(

Python is still my go-to hobby and scripting language. I still write buckets of raw JavaScript. I still read avidly all I can about TypeScript and Dart and various attempts to marry static and dynamic typing. I still think Haxe is underused. But the bigger my dynamically typed project is, the more I find myself trying to be disciplined and write it in a typed style and avoid callbacks and indirection. It’s like once burnt, twice shy. I would strongly recommend a statically typed language (Go?) for your next big, production-destined project.

Scrum is overrated too, but that’s another blog post ;)
