This is a write-up of some ideas that Tux and I bounced around following the CPAN River discussions at the QAH. The core idea: when doing dev releases of dists that are "up the river", look for changes in the CPAN Testers results of downstream dists, to see whether you've had a knock-on effect. This could be automated in a 'river smoke tester'.

Let's say you're working on a release for dist A, which has a number of downstream dists:

With that complexity of downstream graph, it's a good idea to do a developer release first. If you've got the time, you should wait 6 or 7 days to check that you have good CPAN Testers coverage.

But that just tells you how well your dist checks out under various operating systems and versions of Perl.

As a next step, you could test whether your immediate downstream dependents are ok, for example using Test::DependentModules. But there's always a chance that you could change the behaviour of your dist in such a way that your dist's tests all pass, and the tests for P, V, and K all pass too, but something further downstream breaks. This could be down to a brittle test in one of the dists. You might change the wording of an error message, which doesn't break P or Q, but R has a test which matches on the wording and breaks when you do your release.

A good example of where you can break dists further downstream is where your dist is a trait / role / mixin. It's used downstream in a class, which is itself used by a further downstream dist. That further dist may test for your behaviour via the aggregating class and fail, even though the dist between you didn't.

We could create a smoke tester that tests all downstream dists for every module under test. One way this might work:

Let's say a new version of dist A has been released. The tester would start with a config listing the latest non-developer release of every dist on CPAN, including the previous version of A. Then:

1. Run tests on all downstream dists, and record the results.
2. Install the new version of A, then run its tests.
3. Run tests on all downstream dists again, and compare the results with the previous run.
4. Where there are differences, let the relevant people know.
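The loop described above can be sketched in a few lines of Python. This is only a toy: `run_dist_tests` and the hard-coded downstream list stand in for a real test runner and real CPAN metadata, and the simulated breakage in R mirrors the running example.

```python
# Toy sketch of the river smoker loop: record results for the downstream
# dists, "install" the new release, re-run, and diff the two runs.
# run_dist_tests() is a stand-in for a real test runner.

DOWNSTREAM_OF_A = ["P", "Q", "R", "S"]  # assumed downstream dists of A

def run_dist_tests(dist, a_version):
    """Stand-in test runner: pretend R has a fragile test that
    breaks when A 1.01 is installed."""
    return "FAIL" if dist == "R" and a_version == "1.01" else "PASS"

def smoke(new_version):
    # Baseline run against the previous release of A.
    before = {d: run_dist_tests(d, "1.00") for d in DOWNSTREAM_OF_A}
    # Install the new A here, then re-test everything downstream.
    after = {d: run_dist_tests(d, new_version) for d in DOWNSTREAM_OF_A}
    # Report only the dists whose result changed between the runs.
    return {d: (before[d], after[d]) for d in DOWNSTREAM_OF_A
            if before[d] != after[d]}

print(smoke("1.01"))  # → {'R': ('PASS', 'FAIL')}
```

The interesting output is the diff: dists whose result changed between the two runs are exactly the ones whose authors should be notified.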

A real-life smoker probably wouldn't be written quite like this, but you get the idea. Instead of checking OS+perl-version+dist-version (or whatever it is that's getting checked!), have a signature that includes the versions of all upstream dists:

darwin-14.3.0|perl5.20.2|strict-1.07|JSON-2.90|...

If you haven't tested that signature before, upload the result.
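That de-duplication check might look something like this Python sketch. The `result_signature` helper and the in-memory `seen` store are assumptions for illustration, not a real CPAN Testers API:

```python
# Sketch of the signature idea: identify a test result by OS, perl
# version, and the versions of *all* upstream dists, not just the
# dist under test.

def result_signature(os_name, perl_version, upstream_versions):
    """Build a signature like 'darwin-14.3.0|perl5.20.2|...'
    from the OS, perl version, and upstream dist versions."""
    # Sort the dists so the same set of versions always yields
    # the same signature, regardless of discovery order.
    dists = sorted(f"{name}-{ver}" for name, ver in upstream_versions.items())
    return "|".join([os_name, f"perl{perl_version}"] + dists)

seen = set()  # signatures we've already tested and uploaded

sig = result_signature("darwin-14.3.0", "5.20.2",
                       {"strict": "1.07", "JSON": "2.90"})
print(sig)  # → darwin-14.3.0|perl5.20.2|JSON-2.90|strict-1.07

# Only run and upload if this exact combination is new to us.
if sig not in seen:
    seen.add(sig)
```

Two runs that differ only in the version of some upstream dist would then produce different signatures, so both results get recorded.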

For example, referring to the picture above, let's say that version 1.01 of A is released (i.e. not a developer release), and tests for A, P, and Q pass cleanly, but the new version of A breaks R. This may or may not break everything downstream of R. Depending on the nature of the breakage, some or all of the following may be called for:

If the breakage was due to fragile tests in R, then the author of R would need to fix the tests and release a new version.

The author of S should consider a release, setting a minimum required version of R. That might ripple downstream.

The author of A might have to release 1.02, if the breakage was A's fault rather than R's.

P might then need to require version 1.02 or later of A, and the same goes for Q.

A dedicated smoker would run tests on all downstream dists. I think some smokers test the immediate downstream, but I don't know if any do the full downstream graph.
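Testing the full downstream graph rather than just the immediate dependents is a transitive closure over the reverse-dependency edges. A minimal sketch, with a made-up dependents map that matches the running example (real data would come from CPAN metadata):

```python
from collections import deque

# Reverse-dependency edges: each dist maps to its immediate dependents.
# These relationships are invented to match the example in the text.
DEPENDENTS = {"A": ["P", "Q"], "Q": ["R"], "R": ["S"]}

def full_downstream(dist, dependents):
    """Breadth-first walk collecting every dist downstream of `dist`,
    not just the immediate dependents."""
    seen = set()
    queue = deque(dependents.get(dist, []))
    while queue:
        d = queue.popleft()
        if d not in seen:
            seen.add(d)
            queue.extend(dependents.get(d, []))
    return seen

print(sorted(full_downstream("A", DEPENDENTS)))  # → ['P', 'Q', 'R', 'S']
```

For a dist far up the river this set could be a large chunk of CPAN, which is why the de-duplication of signatures matters.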

If you look at the report from an individual CPAN Testers test, it contains the prereqs and the local versions:

So we might be able to build something on top of CPAN Testers, especially if we had several river smokers running. Then the reporting could come from CPAN Testers, rather than the smokers themselves. This would essentially be an analysis job running on top of the CPAN Testers data, but I wonder whether there would be enough coverage.

Furthermore, the parts of the toolchain I'm familiar with don't upload what they think are duplicate results. So if you run a test where the only difference is the version of an upstream dist, and the end result is the same, the report isn't uploaded.

So what?

For distributions up at the headwaters of the CPAN River, this kind of testing might help us really stabilise the heavyweight distributions that most of us depend on.

If you could do a dev release, and know you were going to get this sort of coverage testing, I think you'd wait for the results.

Smoke testing on the CPAN River? It could be called the paddle steamer, or maybe the Paddle Smoker? The Tux Express!
