Once in a while, someone asks "How can I compile my Perl program to a binary?" Once in a while, someone answers "Use B::CC," at which point many someones shudder and reply "No, please never suggest such a thing, you horrible person."

Set aside that thought for a second.

You may have heard of Devel::Declare, which allows you to bend, fold, spindle, and mangle Perl syntax in a way that's safer than source filters but which allows nicer code such as signatures to work without making some poor fool like me patch the Perl parser. Unfortunately, D::D works by hijacking parts of the parsing phase to inject bits and pieces of alternate Perl code in place of non-Perl code.

The good news is that it's fairly well encapsulated and respects lexical scope. The bad news is that you're using Perl to generate Perl, which has many of the same drawbacks as when you use eval. (The good news is that you don't have to parse all of Perl. Make that great news.)

What you can't do easily is manipulate code that's already been parsed or compiled. Sure, you can manipulate the symbol table and examine things, if you know the relationships between and representations of Perl's internal data structures, but you're at the mercy of binary representations written in C, which can vary between major releases.
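As a minimal sketch of the symbol table manipulation mentioned above, this core-Perl snippet lists the subs in a package and wraps one of them after it has been compiled. The Greeter package and its hello sub are made up for illustration. Note what the wrapper can and cannot do: it can swap the whole sub for another, but it cannot look inside the compiled body.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical package, for illustration only
sub hello { return "hello, $_[0]" }

package main;

my @subs;
{
    no strict 'refs';
    no warnings 'redefine';

    # Examine: which names in the Greeter:: stash have code attached?
    @subs = sort grep { defined &{"Greeter::$_"} } keys %Greeter::;

    # Manipulate: replace the glob's code slot with a wrapper
    # around the original, already-compiled sub.
    my $original = \&{'Greeter::hello'};
    *{'Greeter::hello'} = sub { return uc $original->(@_) };
}

print "subs: @subs\n";
print Greeter::hello('world'), "\n";
```

Everything here treats the compiled sub as an opaque box. Changing what happens inside it means dropping down to the optree.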

The B:: family of modules are not the answer because they exist at the wrong level of representation. It's not their fault—they do the best they can with what they can access—but they're doomed to hacks and workarounds and incompletenesses because of other incorrect decisions.
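For a sense of the level at which the B:: modules work, here is a small sketch using only the core B and B::Deparse modules. It walks the ops of a compiled sub in execution order, then asks B::Deparse to reconstruct source text from them. The $add sub is just an example, and the exact op names and deparsed text vary between perl versions, which is part of the point.

```perl
use strict;
use warnings;
use B ();
use B::Deparse;

my $add = sub { my ($x, $y) = @_; return $x + $y };

# Walk the optree in execution order, collecting op names.
# $$op is false once we step past the last op (a B::NULL object).
my @names;
for (my $op = B::svref_2object($add)->START; $$op; $op = $op->next) {
    push @names, $op->name;
}
print "ops: @names\n";

# Ask B::Deparse to turn the optree back into something like source code.
my $source = B::Deparse->new->coderef2text($add);
print $source, "\n";
```

What comes back is perl's view of the code after compilation and optimization, not the program as written. Any tool built on this has to guess its way back to the original intent.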

I've released Pod::PseudoPod::DOM on GitHub (it needs documentation and more work on XHTML output before it's ready for the CPAN) as part of my work on two Onyx Neon books, Liftoff and the upcoming second edition of Modern Perl: the book. I've written about the reasons why I revised the internals of the PseudoPod parser so heavily (everything is a compiler).

The same reasoning applies to the Perl parsing and compilation process.

If Perl had an intermediate layer between lexing/parsing and producing the optrees which the runtime uses to execute code, and if that intermediate form were a sufficient representation of a program, and if that intermediate form were accessible from C as well as Perl itself, we could solve a lot of problems.

(I've used B::Generate productively. It's difficult to do so. You get to dodge segfaults. You have to become an expert on the internals of the versions of perl you want to use. Note the plurals. Whee.)

In particular, a good macro system (one which is not "Run these substitutions over that code") would be possible. It might also be possible to translate certain classes of Perl code to other languages with substantially more ease, or to identify error patterns, or to perform better syntax highlighting, or to canonicalize the formatting and idioms of code in one fell swoop.

(You still have to deal with XS modules and the BEGIN problem, but you can embrace some ambiguity in the grammar and the abstract representation and still produce a valid and parsed representation even if you have to coalesce two alternatives into a single representation with out-of-band knowledge. It's not impossible to get 90% of all programs represented perfectly, and another 5% shouldn't be too much more work.)
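To make the macro idea concrete, here is a purely hypothetical sketch. Perl exposes no such intermediate form today, so the nested-hash tree, the node types, and the rewrite and unless_to_if functions are all invented for illustration. The point is only that a macro over a tree rewrites structure, where a source filter rewrites text.

```perl
use strict;
use warnings;

# Hypothetical: apply $rule to every node of a tree, bottom-up,
# building a new tree and leaving the original untouched.
sub rewrite {
    my ($node, $rule) = @_;
    return $node unless ref($node) eq 'HASH';

    my %copy = %$node;
    for my $key (keys %copy) {
        my $value = $copy{$key};
        if (ref($value) eq 'HASH') {
            $copy{$key} = rewrite($value, $rule);
        }
        elsif (ref($value) eq 'ARRAY') {
            $copy{$key} = [ map { rewrite($_, $rule) } @$value ];
        }
    }
    return $rule->(\%copy);
}

# Hypothetical macro: "unless (COND) BLOCK" becomes "if (!COND) BLOCK".
sub unless_to_if {
    my ($node) = @_;
    return $node unless $node->{type} eq 'unless';
    return {
        type  => 'if',
        cond  => { type => 'not', operand => $node->{cond} },
        block => $node->{block},
    };
}

# An invented tree for: unless ($ready) { wait_a_bit() }
my $tree = {
    type  => 'unless',
    cond  => { type => 'variable', name => '$ready' },
    block => [ { type => 'call', name => 'wait_a_bit', args => [] } ],
};

my $rewritten = rewrite($tree, \&unless_to_if);
print "$rewritten->{type} / $rewritten->{cond}{type}\n";
```

Because the rule matches on node type and not on text, it cannot be fooled by the word "unless" inside a string or a comment, which is the usual way substitution-based macros go wrong.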

Unfortunately, a proof of concept would likely take a good hacker a month of work. A solid demo is likely six months of work. The entire project probably represents two years of work.

It's still a pleasant daydream though.