There’s an article in the new issue of the French magazine Programmez which got my attention. It chronicles the efforts of some programmers for a B2C site to investigate converting their old for loops into more readable and maintainable Java-8 streams. The article is kind of a mess, actually, but to a software craftsman or craftswoman who works on Java legacy code, the message is a bit frightening. The programmers did performance tests of the methods they changed, but because some of the more complex methods they modified took twice as long to execute as they did before, their employer was not convinced and does not want them to proceed with a refactor.

If you thought that it was difficult to sell your product owner on the idea of adding days to the schedule to clean up legacy code, imagine having to reveal that the clean-up also involves the risk of performance issues!

Not all of their changes slowed down execution – some even speeded it up a bit. What troubled me about the article, though, is that I could not predict, looking at the examples they gave of their modifications, which ones would run faster and which ones would run slower than before.

I turned to StackOverflow to see if somebody knows how to do that in general. Is there any way, in a code review, for example, to see if a stream will run significantly slower (a constant-order slowdown, but enough to annoy a user or kill a refactoring initiative) based on just looking at the code? Of course, it’s possible to code stupidly in a way that will obviously cause a slowdown. But, apart from some useful guidance about when to use parallel streams and when not to, the message I got from some smart, informed developers is that you can’t, or else it’s not worth the trouble for what amounts to at worst a constant-order slowdown. Of course you should review code changes to make sure they don’t affect the correctness or the complexity of the algorithm, and maybe you can tell whether going parallel would help or hurt performance, but that’s all you can usefully detect.

I think the biggest mistake that the developers in the Programmez article made was to evaluate the performance based on isolated unit tests (microtests). Performance of a method is not interesting to the end-user. A constant-order decrease in performance of a method will frighten a product owner, but maybe the end-user won’t even notice it. For example, it may be that doubling the execution time of a loop results in a 0.001% decrease in performance of a web request because it’s a loop to process results of a heavy database query.

If the code you work on is for a financial trading application where every nanosecond counts, then you’re not going to be using Java streams at all – you likely have customized collections that run super-fast with no syntactic frills. For the rest of us, we need to do automated end-to-end performance tests for the scenarios where performance might be an issue.

If you convert for loops to streams, don’t convert everything on the same day. Space out these kinds of changes enough that end-to-end tests will catch only one performance issue at a time. Then you can reassure the product owner that there will be no “noticeable” decrease in performance from the refactor, but that the code will be more readable and understandable, and thus easier to change and less bug-prone. For certain iterations which can be parallelized efficiently, there may even be noticeable performance gains.