The unstoppable Perl release train?

There are a number of release management styles to be observed in the free software community, but many of them fall within the two broad categories of "time-based" or "feature-based". A time-based project fixes a date for its next release, then adjusts the set of features included in that release to make the intended date at least somewhat plausible. Feature-based projects, instead, pick a set of desired features and hold the actual release "until it's ready." Arguably, the trend over the last decade has been in the direction of time-based releases. Perl 5 is a relatively recent convert to the time-based model; the project is now faced with a decision between making a release with known problems - including possible security issues - or delaying its 5.16 release.

Contemporary Perl's release schedule is approximately one major release per year. The 5.14 release came out on May 14, 2011, so it is not too soon to be thinking about 5.16. That release is indeed stabilizing with a number of new features. A key aspect of this release is a familiar story: as with most other languages, Perl developers are trying to improve their support for Unicode in all situations. Many developers have participated in this work, but Tom Christiansen has arguably been the most visible. His recent work includes the preparation of an extensive Perl Unicode cookbook demonstrating Perl's Unicode-related features which, he has suggested, are now second to none.

As this cookbook was being discussed, Karl Williamson pointed out that some of the examples where Perl is put into full-time UTF8 mode (a very useful mode for contemporary programs) are unsafe because the language does not properly handle malformed strings. At best, such problems could lead to the corrupted strings seen in so many settings where character encodings are not properly handled. But, as Christian Hansen suggested, things can be worse than that:

I would love for this to happen, I have advocated this on #p5p several times, but there is always the battle of "backwards compatibility disease". About 10 months ago I reported a security issue regarding the relaxed UTF-8 implementation (still undisclosed and still exploitable) on the perl security mailing list.
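To make the difference between strict and relaxed UTF-8 handling concrete, here is a minimal sketch in Python. It is a generic illustration of why lenient decoding of malformed input can matter for security; it is not Perl's implementation, and it says nothing about the undisclosed issue mentioned above. The `relaxed_decode` helper and the sample payload are hypothetical, written only for this example.

```python
def relaxed_decode(data: bytes) -> str:
    """Hypothetical lenient decoder, for illustration only.

    Handles 1- and 2-byte sequences and, crucially, does NOT reject
    "overlong" forms (a character encoded in more bytes than needed).
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data):
            # 2-byte sequence, accepted without any validity checks.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


def strict_decode(data: bytes) -> str:
    """Python's built-in decoder rejects malformed UTF-8, including overlong forms."""
    return data.decode("utf-8", errors="strict")


# 0xC0 0xAF is an overlong (invalid) encoding of "/".
payload = b"..\xc0\xaf..\xc0\xafetc"

# A byte-level filter looking for "/" sees nothing suspicious...
filter_passed = b"/" not in payload

# ...but the lenient decoder turns the payload into a path with slashes.
relaxed_result = relaxed_decode(payload)

# The strict decoder refuses the input outright.
try:
    strict_decode(payload)
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
```

In this sketch the input passes a byte-level check for "/" yet decodes to a string containing that character when handled leniently, while the strict decoder raises an error. That gap between what a check sees and what the decoder produces is the general class of problem that makes malformed-input handling a security matter rather than only a data-corruption one.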

What followed was a classic missive from Tom trying to get to the bottom of the problem. He listed nine different ways to tell Perl to operate with UTF8 and asked how many of them were truly vulnerable to undetected encoding problems. If Perl's UTF8 handling is unsafe by default, he said, it needs to be fixed:

If there's something so important that it must be done everytime to ensure correct behavior, then that is too important to be left up to the programmer to forget to do. It needs to be done for him.

And, he said, this situation really needs to be treated as a blocker for the 5.16 release; to do otherwise would be to delay the fix excessively and cause the world to be filled with bad workaround code.

In many projects, the prospect of open security-related problems would at least cause people to think about delaying a release. On the Perl list, though, Tom found little support. Aristotle Pagaltzis responded:

5.16 *must* be released whatever the state of this issue. To not do so is to fall into the thinking and behavioural pattern that stifled the release of 5.10 by several years. Perl 5 switched to a timeboxed release cycle because “this one more thing has to be polished before we can ship it” meant it never shipped at all.

Ricardo Signes added:

It's definitely true, though, that 5.even.0 releases *are no longer milestones.* Or, rather, they are milestones in a much more literal sense than is often meant. 5.16.0 means that we've come about one year since 5.15.0. It does *not* mean that we have met a series of goals named at 5.15.0, for example.

And, with those words, the public email discussion faded away. At this point, there is little clarity on which Unicode features are safe to use, what might be required to fix the rest, how many of those fixes might be ready in time for the 5.16 release, or whether programs using UTF8 in Perl 5.16 (and earlier releases) suffer from known-exploitable security problems. Tom, who probably understands these issues better than just about anybody else, said:

Right now I'm very hazy on the real status of all this stuff, and I am very uncomfortable with the idea of relentlessly charging ahead toward a release like a freight train with no brakes.

The response he got was that another train would be coming along in a year, and the fixes could catch a ride on that one.

Releasing software with known bugs is a common practice; even a project like Debian cannot make the claim that there are no known problems with its releases. To do otherwise would make software releases into rare events indeed. It is also true that time-based releases have value; users know that they will get useful new code in a bounded period of time. Anybody who has watched release dates slip indefinitely knows how frustrating that can be for both users and developers; the 2.4.0 and 2.6.0 kernel releases are classic examples, as is Perl 5.10. So the Perl developers who say that the show must go on are doing so in accordance with many years of free software development experience.

That said, releasing software with known, security-related issues in something as fundamental as UTF8 support risks tarnishing the image of the project for a long time. There is not enough information publicly available to say whether Perl's UTF8 problems are severe enough to incur this risk. But it is probably safe to assume that, as a result of this conversation, crackers are looking at Perl's UTF8 handling rather more closely than they were before. Perl may have had some of these problems for years without massive ill effect, but they may not remain undiscovered and undisclosed for much longer. If the problem is real and exploitable, people are going to figure it out and take advantage of it.

One would assume that, behind the public positioning, the relevant developers understand what's at stake and are taking the time to understand the scope of the problem. They may not want to stop the release train, but seeing it derailed by a known security problem after release would not be much fun either. A release delay now may prove less painful than a security-update fire drill later.

