When working with legacy code, it's useful to have a variety of tools that let you better understand your code base. For example, I recently wrote about finding unused subroutines. That heuristic approach was fine because I was still going to inspect the code manually rather than automatically remove the code.

So now I've hacked out a rough "duplicate code finder" for Perl. It focuses on cut-n-drool code and has found more than I would have thought (even in my code!). If a developer changes the variable names in a copied chunk, the finder won't spot it, but if I hacked around with B::Deparse, I could fix that, too.

Here's the hack:

On a recent run, it took about 15 minutes to find duplicate code in a small project with 111 .pm files, though on DBIx::Class, it took less than five minutes on 375 files.
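For anyone curious about the general idea behind this kind of tool, here is a minimal sketch. It is not the hack itself. It assumes one simple approach: strip comments and whitespace, then look for any run of consecutive lines that shows up more than once. The names `normalize_lines` and `find_duplicates` and the window size are hypothetical, picked only for illustration. Like the real thing, it will miss copies where the variable names were changed.

```perl
use strict;
use warnings;

# Hypothetical helper: drop full-line comments and all whitespace so that
# layout differences do not hide a copy. This is a crude regex pass, not a
# real Perl parser, so it will also mangle whitespace inside strings.
sub normalize_lines {
    my ($text) = @_;
    my @kept;
    my $line_no = 0;
    for my $line ( split /\n/, $text ) {
        $line_no++;
        $line =~ s/^\s*#.*//;    # remove full-line comments
        $line =~ s/\s+//g;       # ignore whitespace entirely
        next if $line eq '';
        push @kept, [ $line_no, $line ];    # remember the original line number
    }
    return \@kept;
}

# Hypothetical helper: index every window of $window consecutive normalized
# lines by its text, then return only the windows seen more than once.
# $sources is a hash ref of { name => source text }.
sub find_duplicates {
    my ( $sources, $window ) = @_;
    my %seen;
    for my $name ( sort keys %$sources ) {
        my $lines = normalize_lines( $sources->{$name} );
        for my $i ( 0 .. $#$lines - $window + 1 ) {
            my $key = join "\n",
                map { $_->[1] } @{$lines}[ $i .. $i + $window - 1 ];
            push @{ $seen{$key} }, join ':', $name, $lines->[$i][0];
        }
    }
    return { map { $_ => $seen{$_} } grep { @{ $seen{$_} } > 1 } keys %seen };
}

# Usage: two made-up modules that share three lines of code.
my %sources = (
    'A.pm' => "my \$total = 0;\n\$total += \$_ for \@items;\nreturn \$total;\n",
    'B.pm' => "# summing again\nmy \$total = 0;\n  \$total += \$_ for \@items;\nreturn \$total;\n",
);
my $report = find_duplicates( \%sources, 3 );
for my $locations ( values %$report ) {
    print "Duplicate block at: @$locations\n";
}
```

Indexing every window in a hash keeps this to a single pass over each file, at the cost of memory. A larger window reports fewer, longer duplicates, so it is worth tuning for the code base in question.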

Patches welcome!

In other news, my "Beginning Perl" book has its first review on Amazon and it's five stars. I feel like I've dodged a bullet :)

Update: I just ran this against Catalyst, Moose and DBIx::Class. The first two ran quickly and were remarkably clean. DBIx::Class is still running and here's some sample output (not all of it):