First: This is neither a complaint nor a criticism. I understand the intent of the CPAN and its goals. I believe it meets those goals effectively.

If you talk to Jarkko about the CPAN, he'll likely tell you that it's primarily a distribution service. It's a series of regularly updated mirrors containing some metadata and an archive of redistributable code. Many proposals for enhancements and replacements and reinventions in other languages have come and gone. Most of them have tried to add complexity to this simple base. That's one reason they haven't succeeded.

This half of the CPAN makes code available to users.

Another half of the CPAN is PAUSE, the service which allows CPAN developers to upload their code to the metadata analysis and code distribution service.

The third half of the CPAN consists of the tools used to find and install CPAN distributions with partial or full automation. It's an optional part of the CPAN experience, but it demonstrates that the CPAN ecosystem also includes tools which rely on the metadata and mirroring services which the CPAN makes available. Without this metadata, the CPAN would be much less useful.

It's also the metadata which makes possible services such as search.cpan.org (which many people consider the face of the CPAN), RT for CPAN, CPAN Testers, CPANTS, CPAN Ratings, CPAN Forums, and plenty of other services now and in the future.

That's what the CPAN is: a loose federation of sites and services built around a code and metadata mirroring system, with an upload service for registered developers.

Who's It For?

I believe the primary beneficiaries of the CPAN are active CPAN developers.

By uploading your code to the CPAN, you get worldwide mirroring and distribution. You get test results from a wide variety of platforms and versions. You get bug tracking, documentation hosting, reviews, and feedback on the quality and efficacy of your distribution.

You get to push your installation and dependency management to CPAN installers. Because CPAN tools are effective at gathering dependency information and publishing it in a form that other CPAN tools can understand, the easiest way to install distributions from the CPAN is with a CPAN shell such as CPAN.pm or CPANPLUS. Utilities exist for free software distributions such as Debian and Gentoo to wrap CPAN distributions into OS packages where the packaging system can manage them, but they're necessarily specific to individual platforms, whereas the CPAN shells can run on any operating system where Perl 5 runs.

One strong benefit of the existing CPAN shells is that they run distribution test suites before installation by default, refusing to install when test failures occur. This provides strong pressure to review, report, and fix test failures; the focus is on quality by default.

Active CPAN developers know when and how to report bugs, how to read CPAN Testers reports, and how to force installations. They may know how to use the BackPAN or to use an earlier version of a dependency.

This brings up a subtler feature of the CPAN which optimizes the experience for active CPAN developers: you always get the newest version of a distribution. While a PAUSE/CPAN shell hack allows developers to upload a development version which people cannot install accidentally, there's little ability to specify in dependencies that you want users to install a specific version of a dependency. One accidental upload in any of a dozen distributions could render half of the distributions on the CPAN uninstallable.

In some ways, this feature creates and exacerbates a problem. It can be difficult to bundle a distribution and all of its dependencies as the dependency graph can change during the bundling process.

A CPAN for Normal Users

What would a CPAN look like for normal users? ActiveState's PPM isn't a bad model in some ways, though it hews too closely to the CPAN itself in others. Binary repositories for Linux distributions have other advantages. I can think of several attributes of a CPAN enhancement for non-developers:

Binary distributions, or at least not requiring the presence of a C or C++ compiler and make utilities. This could be optional.

Run the tests on installation for verification and reporting purposes. This could also be optional, but I like the quality-by-default approach.

Bundling a distribution and all of its dependencies into a single, installable package.

Automatic relocation (perhaps through the use of local::lib or something similar) to allow multiple versions of a single distribution to be installed and usable.

Regular, tested updates to bundles and the contained dependency graphs.

Working with upstream.

Integration with OS packages.

I have no good ideas for how to accomplish the latter two. Working with upstream can be difficult in the normal case; not everyone looks at CPAN Testers reports or the CPAN's RT or other CPAN extensions. Building OS packages seems like a lot of trouble and a lot of duplicate work.

Even so, the Perl 5 ecosystem already has most of the tools necessary to build such a thing. We can build a dependency graph for most CPAN distributions, and we can identify those without accurate graphs. We can calculate the likelihood of tests passing on various Perl 5 versions and platforms given that graph. It only takes a little bit of code to bundle most graphs into a dependency-first installable bundle, and a small loader module could set @INC paths appropriately.
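To give a rough idea of what "a little bit of code" might mean, here is a minimal sketch of dependency-first ordering using a depth-first walk over a dependency graph. Everything named here is an assumption made for illustration: the distribution names, the %requires hash, the install_order and bundle_lib_paths functions, and the one-directory-per-distribution bundle layout. Real data would come from CPAN metadata, and a real bundler would have far more to worry about.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical graph: each distribution maps to the distributions it requires.
my %requires = (
    'App-Example'  => [ 'Example-Web', 'Example-Log' ],
    'Example-Web'  => [ 'Example-HTTP', 'Example-Log' ],
    'Example-Log'  => [],
    'Example-HTTP' => [],
);

# Return the distributions needed by $root, dependencies first.
sub install_order {
    my ($graph, $root) = @_;
    my (%state, @order);
    _visit($graph, $root, \%state, \@order);
    return @order;
}

# Depth-first, post-order walk; dies if it finds a dependency cycle.
sub _visit {
    my ($graph, $dist, $state, $order) = @_;
    my $seen = $state->{$dist} // '';
    return if $seen eq 'done';
    die "Circular dependency involving $dist\n" if $seen eq 'visiting';

    $state->{$dist} = 'visiting';
    _visit($graph, $_, $state, $order) for sort @{ $graph->{$dist} // [] };
    $state->{$dist} = 'done';
    push @$order, $dist;
    return;
}

# Assumed bundle layout: <bundle_dir>/<distribution>/lib
sub bundle_lib_paths {
    my ($bundle_dir, @dists) = @_;
    return map { "$bundle_dir/$_/lib" } @dists;
}

my @order = install_order(\%requires, 'App-Example');
say for @order;
# Example-Log
# Example-HTTP
# Example-Web
# App-Example
```

A loader module along these lines could then do something like unshift @INC, bundle_lib_paths($dir, @order) in a BEGIN block, though local::lib or a similar tool is probably a better foundation than hand-rolled path handling.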

Given a list of dependencies, it's possible to analyze the potential graphs for solutions and identify potential points of conflict or failure. If solutions exist, the software could create an installable bundle. Source code is the easiest, but a binary is possible.
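One simple way to spot a point of conflict is to intersect the version ranges that different distributions demand of the same dependency and flag any range that ends up empty. The sketch below does only that. The requirement records, the distribution names, and the find_conflicts function are hypothetical, and a real solver would also have to check the surviving ranges against the releases that actually exist. Version comparison uses the core version module.

```perl
use 5.010;
use strict;
use warnings;
use version;

# Hypothetical requirements: who needs which version range of what.
# A missing 'max' means any newer release is acceptable.
my @requirements = (
    { by => 'Example-Web',  on => 'Example-Log',  min => '1.00' },
    { by => 'App-Example',  on => 'Example-Log',  min => '1.20', max => '1.99' },
    { by => 'Example-Web',  on => 'Example-HTTP', min => '2.00' },
    { by => 'Example-Cron', on => 'Example-HTTP', min => '0.50', max => '1.50' },
);

# Intersect every range per dependency and return the ones left empty.
sub find_conflicts {
    my @reqs = @_;
    my %range;

    for my $req (@reqs) {
        my $r = $range{ $req->{on} } //= { min => undef, max => undef, by => [] };
        push @{ $r->{by} }, $req->{by};

        if (defined $req->{min}) {
            my $min = version->parse($req->{min});
            # Keep the highest minimum seen so far.
            $r->{min} = $min if !defined $r->{min} || $min > $r->{min};
        }
        if (defined $req->{max}) {
            my $max = version->parse($req->{max});
            # Keep the lowest maximum seen so far.
            $r->{max} = $max if !defined $r->{max} || $max < $r->{max};
        }
    }

    my @conflicts;
    for my $dep (sort keys %range) {
        my $r = $range{$dep};
        next unless defined $r->{min} && defined $r->{max};
        push @conflicts, { on => $dep, by => $r->{by} } if $r->{min} > $r->{max};
    }
    return @conflicts;
}

for my $c (find_conflicts(@requirements)) {
    say "No version of $c->{on} satisfies: @{ $c->{by} }";
}
# No version of Example-HTTP satisfies: Example-Web Example-Cron
```

If no conflicts come back, the remaining ranges describe the candidate versions from which a bundle could be assembled.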

It's also possible to keep these graphs and bundles up to date, with a lag of a few hours to a couple of days. Though calculating the possible solutions from a graph may be expensive, most of the information is cacheable.

Would people use such a system? I don't know. Should it replace elements of the current CPAN system? Never; it addresses a different purpose. Is it worth building? The idea continues to tickle my mind.