Perl 5.16 and beyond

Perl 5's yearly release rhythm is well established. Because major releases come out every single year, a major release no longer introduces a slew of new features. Instead, it consists of a smaller set of features and bug fixes. Upgrading to a new major version is easier than it's ever been. So why upgrade at all? What's changed in Perl 5 recently, and where is it going?

Syntax extensions outside the core

One of the most important changes has been a push toward making the Perl 5 core more extensible. This work started with the Perl 5.12.0 release, and has continued since then. Many of the internal APIs have been cleaned up and documented. It has also become much easier to write extensions that work like Perl builtins, and even to extend the syntax in entirely new ways. Extensions can modify the "optree" while Perl is compiling your code. The optree is the Perl interpreter's internal representation of a piece of code. It's what the interpreter executes when it runs your program. When extensions are compiled down to ops, they can be as fast as an implementation in the core would be.
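To make the optree a little more concrete, here is a minimal sketch that lists the ops of a compiled subroutine using the core B module. The add_one and op_names subroutines are hypothetical names chosen for illustration, and the exact list of ops varies between Perl versions.

```perl
use strict;
use warnings;
use B ();    # core module that exposes the interpreter's internals

# Hypothetical example subroutine whose compiled ops we want to inspect.
sub add_one {
    my ($x) = @_;
    return $x + 1;
}

# Return the names of a subroutine's ops in execution order.
sub op_names {
    my ($code) = @_;
    my $cv = B::svref_2object($code);    # B::CV object wrapping the sub
    my @names;

    # Follow the "next" pointers until we reach the null op.
    for ( my $op = $cv->START; $$op; $op = $op->next ) {
        push @names, $op->name;
    }
    return @names;
}

my @ops = op_names( \&add_one );
print join( ' ', @ops ), "\n";
```

The same information is available from the command line with the core B::Concise backend, for example perl -MO=Concise -e 'print 1 + 2'.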

The smart match feature that was added in 5.10.0 provides a good example. Smart match includes the "given" and "when" builtins, as well as the "~~" operator. In code, smart match examples can look like:

    $value ~~ @array
    @array ~~ qr/foo/

The behavior of ~~ varies based on the type of both operands. The first example checks to see if $value is a member of @array. The second example checks to see if any member of @array matches the regular expression. The Perl 5 Porters eventually realized that this feature's behavior is much too complicated. Unfortunately, this realization happened after the feature shipped, so we can't simply remove or change it out from under existing users.
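For readers who have not used smart match, the two checks described above can be approximated with grep. This is only a rough sketch: the sample data is made up, and the use of string comparison is an assumption, since the real operator picks its comparison based on the operand types.

```perl
use strict;
use warnings;

# Made-up sample data for illustration.
my $value = 'foo';
my @array = qw(bar foo baz);

# Roughly what "$value ~~ @array" checks: is $value an element of @array?
# String comparison is an assumption here; ~~ chooses the comparison
# based on the types of its operands.
my $is_member = grep { $_ eq $value } @array;

# Roughly what "@array ~~ qr/foo/" checks: does any element match?
my $any_match = grep { $_ =~ qr/foo/ } @array;

print "member\n" if $is_member;
print "match\n"  if $any_match;
```

Having to spell out the comparison explicitly is exactly what smart match tries to avoid, and the type-dependent choice it makes on our behalf is where the complexity comes from.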

What we need is a way to change the behavior in future releases while still providing access to the old behavior. We could, of course, just do this in the Perl core's C code. If the "enable old behavior" flag is on, we use the old code path, otherwise we use the new path. While this is feasible, it's not desirable. It clutters the core, and the more features that go down this path, the messier the core gets.

Jesse Luehrs's smartmatch module lets us alter the behavior of the smart match feature in a lexical scope by loading a module. The smartmatch::engine::core module implements the original behavior as introduced in 5.10.0. If we include "use smartmatch 'core'" in our own code, the smart match operator and builtins use the old behavior. Meanwhile, another module can include "use smartmatch 'sane'" and get a different behavior.

Because this module takes advantage of hooks in the Perl interpreter for syntax extensions, it can actually insert itself into the compiled Perl optree. This means that the extension can be as fast as the core implementation without being part of the core C code. When the smartmatch behavior in the core is changed, this module can be shipped with that release. We can even arrange for code that includes "use v5.14" to use the old smart match behavior, while code that includes "use v5.18" gets the new behavior.

This extensibility also makes it easy to prototype new features as CPAN modules. Florian Ragwitz's List::Gather module on CPAN adds gather and take "builtins" which compile down to operations in the parsed program. This syntax provides a nice way to create an array from an iterating operator:

    use List::Gather;

    my @list = gather {
        while (<$fh>) {
            next if /^\s*$/;
            next if /^\s*#/;
            last if /^(?:__END__|__DATA__)$/;
            take $_ if some_predicate($_);
        }
        take @defaults unless gathered;
    };

This syntax could easily be included in the core simply by bundling the List::Gather module in the core distribution. All we need to do is make "use v5.18" load List::Gather behind the scenes. There's no need to alter the core at all.

What's new in Perl 5.16.0

Perl 5.16.0 doesn't have any huge features, but it does have a collection of small features and bug fixes that will make Perl 5 better. In 5.16.0, we now have a __SUB__ token. This token returns a reference to the current subroutine, letting us write cleaner recursive closures:

    use feature 'current_sub';

    my $factorial = sub {
        my $val = shift;
        if ( $val > 1 ) {
            return $val * __SUB__->( $val - 1 );
        }
        else {
            return $val;
        }
    };

    print $factorial->(5);

The 5.16.0 release also adds support for the Unicode 6.1 standard, in addition to several other Unicode improvements. We now have much better support for Unicode characters in symbol names (packages, methods, etc.). The new fc operator and \F escape implement the Unicode foldcase operation, which does proper case-folding for all languages.
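A short sketch shows why case folding matters for caseless comparison. The sample strings are chosen for illustration; the German "ß" is the classic case where lowercasing both sides is not enough.

```perl
use strict;
use warnings;
use utf8;                                # the source contains a literal "ß"
use feature qw(fc unicode_strings);      # fc is available from Perl 5.16

my $german = 'straße';
my $upper  = 'STRASSE';

# lc leaves "ß" alone, so the lowercased forms differ.
my $lc_equal = lc($german) eq lc($upper) ? 1 : 0;

# fc folds "ß" to "ss", so the folded forms compare equal.
my $fc_equal = fc($german) eq fc($upper) ? 1 : 0;

# \F is the interpolation escape for the same operation.
my $escaped_equal = "\F$german" eq "\F$upper" ? 1 : 0;

print "lc: $lc_equal, fc: $fc_equal, \\F: $escaped_equal\n";
```

Comparing fc($a) eq fc($b) is therefore the safer idiom for case-insensitive string equality.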

As part of the previously mentioned push to make the interpreter internals more extensible, 5.16.0 documents a number of functions for manipulating "pads". A pad (or scratchpad) is the data structure that stores lexical variables for each subroutine. This API was already in use by some modules, like List::Gather, but documenting it means that module authors can rely on the API's stability.

There has also been a lot of work on the core documentation. Perl 5.16.0 ships with a new object-oriented programming tutorial, and the object-oriented reference documentation has been rewritten from scratch with expanded coverage.

In addition to these changes, Perl 5.16.0 has many other bug fixes, performance improvements, documentation improvements, and core module updates.

A detour through the Perl 5 ecosystem

Just talking about the core when talking about the state of Perl 5 misses much of what makes Perl Perl. Perl's greatest strength has always been the Comprehensive Perl Archive Network (CPAN) and the larger Perl community. This is a huge topic, so I'll only hit a few highlights.

If you haven't looked at Perl recently, you may not have heard of Moose. Moose borrows from the Common Lisp Object System (CLOS), Perl 6, and many other languages. Moose provides a clean declarative API for declaring classes and roles. This eliminates huge amounts of boilerplate that Perl's native object system requires, allowing developers to focus on what their class does, rather than how it does it. Moose also provides a self-hosted metamodel, which means that it can be extended by writing classes and roles using Moose itself. Moose has been adopted by many CPAN authors for their own modules, and there are dozens of Moose extensions available.
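To give a sense of the boilerplate in question, here is a sketch of a small class written with only Perl's native object system. The Person class and its attributes are hypothetical, and the Moose declarations in the trailing comment are an approximation of the equivalent, not a drop-in replacement.

```perl
use strict;
use warnings;

# Hypothetical class written with Perl's native object system.
package Person;

sub new {
    my ( $class, %args ) = @_;

    # Validation that a declarative API would otherwise generate.
    for my $attr (qw(name age)) {
        die "Attribute '$attr' is required\n" unless defined $args{$attr};
    }
    return bless { name => $args{name}, age => $args{age} }, $class;
}

# Hand-written read-only accessors.
sub name { return $_[0]{name} }
sub age  { return $_[0]{age} }

package main;

my $person = Person->new( name => 'Alice', age => 30 );
my $summary = $person->name() . ' is ' . $person->age();
print "$summary\n";

# With Moose, the class body would shrink to roughly:
#   use Moose;
#   has 'name' => ( is => 'ro', required => 1 );
#   has 'age'  => ( is => 'ro', required => 1 );
```

The constructor, the required-attribute checks, and the accessors all have to be repeated by hand for every class, which is the repetition a declarative API removes.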

Perl 5 has a number of excellent web frameworks available. Catalyst, Dancer, and Mojolicious are all mature, well-supported, and widely used. All of these frameworks build on top of the PSGI spec and Plack tools. These are inspired by Python's WSGI and Ruby's Rack respectively. Any application or framework that implements the PSGI spec can easily be deployed using FastCGI, standalone servers like Starman, mod_perl, or even plain old CGI. This makes writing and deploying Perl web applications easier than it's ever been.
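The PSGI interface itself is small enough to sketch without any framework. In the example below the greeting text and the hand-built environment hash are made up for illustration; a real server would pass a much fuller environment.

```perl
use strict;
use warnings;

# A PSGI application is a code reference that receives the request
# environment as a hash reference and returns a three-element array
# reference: status code, header list, and body.
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} || '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello from $path\n" ],
    ];
};

# Call the app directly with a hand-built environment, standing in
# for what a PSGI server would pass on each request.
my $response = $app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/demo' } );
print $response->[2][0];
```

To serve it for real, the usual approach is to save the code reference as the last expression of a .psgi file and hand that file to a PSGI server, for example with the plackup tool from Plack.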

Perl also has a number of object-relational mapping (ORM) modules, but the most popular is DBIx::Class. It has been under development since 2005, and has seen wide adoption throughout the community, attracting a number of core contributors, as well as inspiring dozens of extensions.

The services built around the CPAN archive are also exciting. The MetaCPAN API provides a web API to a database of every distribution ever uploaded to the CPAN archive. This API is open and freely usable, so anyone can build tools on top of it. There is already a new CPAN search site, also called MetaCPAN, that uses this API.

The CPAN Testers service collects test reports on every distribution uploaded to CPAN. Clients test the distributions on many different platforms and Perl versions. The service received its twenty millionth test report on March 7, 2012, and is currently receiving nearly 1 million reports a month.

The community is also busy organizing events. There are YAPCs (Yet Another Perl Conferences) this summer in the US, Germany, Brazil, and Japan, as well as many smaller workshops scheduled or in the works.

Future plans for Perl 5

What happens after Perl 5.16.0? What will be in 5.18.0 or 5.20.0? Perl 5 development is volunteer driven, and I cannot commit anyone else's time. That said, here are some ideas that have been floated for future releases.

The project I'm most excited about is work on a MOP for the Perl 5 core. A MOP is a Meta Object Protocol. This is what Moose provides, but Moose does this as an extension. The goal is to create an API that can be implemented by the Perl 5 core and extended by modules like Moose. Putting this in the core has the potential to make modules like Moose much faster. However, the MOP is not just for Moose. It will be flexible enough to support multiple object systems, and will be usable as a minimal object-oriented system all on its own, without extension.

As a bonus, this work will also include a few new bits of syntax. In particular, classes and methods will finally be distinct from packages and subroutines, and there will also be core support for roles, named method parameters, and attribute declaration.

I already mentioned the work on making the core more extensible. That work is an ongoing effort that opens up the possibility of breaking backward compatibility in a saner way, as with the smartmatch module example. This in turn frees up core developers to fix old design mistakes and introduce new features that might otherwise break old code.

Unicode work is also progressing. The 5.18.0 release will (hopefully) include support for set operations on Unicode character classes in regular expressions, as well as Unicode-related performance improvements.

While it's hard to predict the specifics of the future, I'm excited to see the activity and effort going into Perl 5 core development these days. The new release schedule, along with a move to Git, seems to have attracted some new contributors. Looking at the activity in the core and the community, Perl 5 is on a healthy path toward the future.