If there is one thing about testing in Perl which bugs me, it's that most testing in Perl is what cgi-lib.pl is to Plack. The following is mostly a rant and I'm also guilty of many of these sins.

Yes, we have a beautiful testing infrastructure around our tests. For example, if I want to know which operating systems and versions of Perl my Code::CutNPaste module fails on, I just go take a look. Or I can quickly see almost 3,000 passing test reports for aliased (woo hoo!). The Perl community is proud of tools like this and we should be. Just reading about Perl's test-infected culture is fantastic, and other languages which claim to be test-infected often come in a poor second.

Er, sort of.

Where are the tests?

So, open up any random test suite on the CPAN for a suitably large distribution, find some code and then play the "find the tests for this code" game. Go ahead. I'll wait.

In fact, when working for many large companies, one of the biggest obstacles I find to integrating a new developer is teaching them where to find the damned tests! And no, I don't mean "they're in the t/ directory".

When I'm working on some code, I can often reason about where the associated code is. I can't do that with tests. Part of the problem is that we often group tests conceptually and I'm as guilty of this as the next person and I'll cry mea culpa first. For example, I wrote Test::Harness and in that distribution you will find:

$ find lib -type f | wc -l
49

All things considered, that's not really that big of a codebase, but it's gotten to the point where not everyone can keep the entire codebase in their head. I certainly can't and I can't make a commit without some VMS user pointing out that I broke VMS again. So I should have better organization of that test suite, but I don't. Case in point, I have a t/errors.t test file. What the hell is that? Are you honestly going to tell me that out of ~50 packages, I shove every possible error into one test file, regardless of the source of the error?

If, on the other hand, you can make your tests more or less resemble a one-to-one mapping with the classes/packages they're testing, you can reason about where your tests should go. Or you can use a stopgap measure such as Devel::CoverX::Covered to help you learn your test suite, but that assumes you can run code coverage on your test suite in the first place (those of you who have hour-long test suites know what I'm talking about).

Woo hoo! My dead code is 100% covered!

Yes, you generally want strong code coverage because you want to know that you can change your code with impunity, right? More than once I've found that a test suite's code coverage report is covering dead code. Why on earth would you do that?

Because we tend to not make a distinction between unit and integration tests in the Perl community and Devel::Cover doesn't know or care which is which. If you can make a strong distinction between unit and integration tests, you can have your code coverage run over integration tests (or skip unit tests). Not only will your code coverage be faster, it will be a more accurate representation of what your tests actually cover.
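As a sketch of what that can look like in practice (this assumes your tests are already split into something like t/unit/ and t/integration/, which is an assumption about your layout, not a convention Devel::Cover enforces), you can point the coverage run at just the integration tests via the harness:

```shell
# Gather coverage from integration tests only, skipping unit tests.
# Assumes a t/unit/ vs t/integration/ split -- adjust for your layout.
cover -delete                                       # clear the old cover_db
HARNESS_PERL_SWITCHES=-MDevel::Cover prove -lr t/integration
cover -report html                                  # writes cover_db/coverage.html
```

The coverage run is also faster this way, since the unit tests never pay the Devel::Cover instrumentation penalty.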

In fact, try to target your code coverage at end points in your code: that is to say, code run from cron, or from the command line, or a browser, or a GUI, and so on. If you find that you can't reach some code from any of its consumers, that's a strong candidate for deletion.

Yeah, whatever. I still have 100% code coverage

So what? I've written about this before. It's trivial to get 100% code coverage and still have buggy code. Some of that is impossible to avoid, but it doesn't mean we can't improve on the situation. Basis path testing would help, but I know precious few Perl programmers who have even heard of that technique, much less use it. That being said, it would probably be more common if Perl were easier to parse. Borrowing an example from Java:

sub return_input {
    my ( $x, $condition1, $condition2, $condition3 ) = @_;
    if ($condition1) { $x++ }
    if ($condition2) { $x-- }
    if ($condition3) { $x = $x }
    return $x;
}

For basis path testing, you basically (er, not really) take the number of conditions in the code and add 1, and that tells you how many paths you need to test. Then you pick an arbitrary path and walk through the code, once for each path needed, altering one of the conditions. For the above, you get (T = true, F = false):

TTT

FTT

TFT

TTF

Then you write test cases assuming the above.
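For the return_input() example above, those four basis paths translate directly into four test cases (a minimal, self-contained sketch; the starting value of 0 is arbitrary):

```perl
use strict;
use warnings;
use Test::More tests => 4;

# The example function from above.
sub return_input {
    my ( $x, $condition1, $condition2, $condition3 ) = @_;
    if ($condition1) { $x++ }
    if ($condition2) { $x-- }
    if ($condition3) { $x = $x }
    return $x;
}

# One test per basis path, flipping one condition at a time.
is return_input( 0, 1, 1, 1 ), 0,  'TTT: increment and decrement cancel';
is return_input( 0, 0, 1, 1 ), -1, 'FTT: only the decrement fires';
is return_input( 0, 1, 0, 1 ), 1,  'TFT: only the increment fires';
is return_input( 0, 1, 1, 0 ), 0,  'TTF: identity branch skipped';
```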

While you still won't have 100% path coverage, you'll have covered enough of the possible paths through your code to make it much easier to discover little-known bugs. Sadly, Perl devs are still not pulling in useful techniques like this from testers in other language communities.

Duplicated Tests

You know that I've tried to go to war against duplicated code in tests. Test::Most, for example, means that instead of this in all of your test programs:

use strict;
use warnings;
use Test::Exception;
use Test::Differences;
use Test::Deep;
use Test::Warn;
use Test::More tests => 42;

You write this:

use Test::Most tests => 42;

What? You don't like the test modules I chose? Use my Test::Kit module and build your own.
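Building a kit looks roughly like this (the MyProject::Test name and the module choices are mine, not a prescription; check the Test::Kit docs for the exact interface):

```perl
package MyProject::Test;

use Test::Kit;

# Roll the modules you always want into one reusable kit.
include 'Test::More';
include 'Test::Exception';
include 'Test::Differences';

1;
```

After which a test program just says use MyProject::Test; and gets all of those modules' functions (Test::Kit also enables strict and warnings in the caller).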

But that's not enough. Learning how to avoid duplicating fixtures or test behavior is also important. While there's often a t/lib directory for shared test behavior, a common anti-pattern is cutting and pasting test code. Hey, if we change behavior, at least we'll see tests fail, so it's not so bad, right? Yes, it's bad. Too much cutting and pasting in your test code is a sign that you're not taking care of your tests the way you should. When I see this in test suites, I look for, and discover, all manner of problems. Here's one in my own code I've just discovered.

I have a parser that reads a bunch of CSV data, validates it, and passes it to a data importer. Unfortunately, this CSV data was often generated by Microsoft Excel and when you export from Excel to CSV you can easily get corrupted Unicode, credit card and phone numbers getting converted to scientific notation, problematic date formats or just the hassle of stitching multiple worksheets into a single CSV file.

Thus, I wrote a quick parser subclass to read from Excel instead of CSV and all of the above problems miraculously went away.

And then I committed the cardinal sin of copying my CSV parser tests to the Excel parser tests. After a couple of updates to the parsers requiring cutting-n-drooling more test code between them, I started doing the right thing: I switched from *.t tests to Test::Class and took advantage of test inheritance. My tests for the Excel parser look like this:

package Test::Company::Utility::Importer::Parser::Excel;

use parent 'Test::Company::Utility::Importer::Parser';

sub _get_file_extension {'xls'}

1;

I've written no tests for the Excel parser because I inherit all of the tests from my regular parser. In my parent code I have something like this:

sub _get_file_extension {'csv'}

sub _get_fixture {
    my ( $self, $fixture ) = @_;
    my $extension = $self->_get_file_extension;
    my $filename  = "t/fixture/parser/$fixture.$extension";
    unless ( -f $filename ) {
        croak("Unknown file: $filename");
    }
    return $filename;
}

And each test gets its fixture with $test->_get_fixture('concatenate_group_columns') and everything magically works.
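A shared test method in the parent class might look something like this (a sketch only: the test_parse method, the class_under_test helper, and the parser's constructor arguments are all invented for illustration; only _get_fixture and _get_file_extension appear in my real code):

```perl
# In Test::Company::Utility::Importer::Parser (the parent test class).
# Because the Excel subclass only overrides _get_file_extension, this
# method runs once against the CSV fixture and once against the Excel
# fixture, with no duplicated test code.
sub test_parse : Tests(2) {
    my $test = shift;

    my $file = $test->_get_fixture('concatenate_group_columns');
    ok -f $file, 'we should find the fixture for ' . ref $test;

    # class_under_test is a hypothetical helper returning the parser
    # class this test class exercises.
    my $parser = $test->class_under_test->new( file => $file );
    ok $parser->parse, '... and parsing it should succeed';
}
```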

And then I stumbled on a test which shows that my CSV parser can also read tab-separated data. How does inheritance work in that case? Well, it doesn't. It's clear that I should have a parser base class (or better still, a role) and CSV and Excel should be separate classes inheriting from the abstract base class or implementing the parser role. Soon I'll be adding an XML version of this parser and it's now clear that having the CSV parser be the parent is very problematic. De-duplicating my code and "doing the right thing" exposed a design flaw that I should have seen earlier and will have to deal with soon.

Test Metadata

It's long been said that bugs in code tend to recur. While I can't say if this is true, I do find that certain types of bugs tend to return in various code bases I work on. I still remember one terrible tangle of nested evals in one project that usually, mysteriously, did the right thing, but would sometimes break; we'd cobble together a solution, but we could never quite figure out how to unravel that rat king. We only knew that this was a common failure in that mess of code because of folklore within the team. A strong QA department will often track the types of bugs found and be able to say "well, we tend to get UI failures much more than data failures", but they're a level of abstraction away from the code and won't always know how failures really group together. If we tracked this sort of data ourselves, we could say "you know, our CouchDB tests fail three times more often than other tests, so maybe there's something to look at there." But in reality, we don't do this.

I made a stab at solving this once but abandoned it, not convinced that I had taken the right approach. However, I believe the tagging feature in Test::Class::Moose can get us closer to this. Of course, its reporting capabilities already mean we can get useful timing information out of test suites in a way we hadn't previously. Oh, and you could use it to solve the unit-test code coverage problem I mentioned earlier. Oh, and if used correctly, it can not only reduce duplicated code, but also make your tests more discoverable. Hmm ...
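Tagging in Test::Class::Moose looks roughly like this (the test class name and the couchdb tag are invented for the sake of the example):

```perl
package TestsFor::MyApp::Storage;

use Test::Class::Moose;

# Tag the method so failures and timings can be grouped by subsystem.
sub test_save_document : Tags(couchdb) {
    my $test = shift;
    ok 1, 'exercise the CouchDB-backed storage here';
}

1;
```

The runner can then include or exclude methods by tag (for example, Test::Class::Moose::Runner->new( exclude_tags => ['couchdb'] )->runtests), which is also how you'd skip unit-tagged tests during a coverage run.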

Conclusion

Most projects that I work on have Perl tests in a hodge-podge of *.t tests. If I'm lucky enough to work on a Test::Class based test suite, I've found that every single one I've ever worked on is broken in some fundamental way. People are still duplicating code all over the place, we're not paying attention to the difference between different types of tests and we're tossing the tests in a big pile 'o code in a way that starts to become an obstacle after a while. And then we obsess over code coverage in ways that are less than helpful.

For small test suites, big deal. When you're working on mission critical code on a large code base, having a well-designed, easy-to-use test suite can be the difference between delivering next week and delivering next month. I would love to see the Perl community start doing more interesting things with tests, including developing strong tools for capturing metadata about tests that we tend to currently ignore.

Even if other languages are as bad as what I've mentioned above (and many are), that still is no excuse for dropping the ball here. We can do better. We should do better.