A Perl 5 Overview and Quick-reference

John Gabriele

Intro

This is a rough Perl 5 quick-reference that contains some brief overview material, recommendations, and reminders regarding how to use Perl 5. It assumes you’ve used Perl 5 before, but maybe have gotten a little rusty. :)

A good place to look for more info on using Perl 5 is the Modern Perl book.

Other good documentation, articles, and books:

Installation

You may wish to install your own Perl 5 (ex. ~/opt/perl/bin/perl ) instead of using the one that comes with your system ( /usr/bin/perl ). The system Perl is often used heavily by various system processes, and it’s probably best not to tinker with it too much.

Tip: Use only your OS package installer (ex. apt-get or yum ) to install Perl modules into the system Perl. Use only cpanm (or similar) for installing modules into your own Perl. Of particular importance: don’t use cpanm with your system Perl. For example, on Debian, apt knows about what it installs, but wouldn’t know about packages installed with cpanm .

Installing your very own Perl 5, OTOH, lets you experiment, tweak, and even break things — you can always just wipe it out and start fresh again, if that becomes necessary.

To install your very own Perl 5, grab the Perl 5 source code and manually install into ~/opt like so:

    cd ~/opt
    mkdir perl-5.n.m
    mkdir perl-5.n.m/bin
    cd src    # Make this dir if you need to.
    cp path/to/perl-5.n.m.tar.gz .
    tar xzf perl-5.n.m.tar.gz
    cd perl-5.n.m
    # rm -f config.sh Policy.sh   ## Not necessary for a newly-unpacked kit.
    ./Configure -Dprefix=/home/<you>/opt/perl-5.n.m
    # Answer a lot of questions, accepting the defaults.
    # If it complains that it can't find "/usr/bin/less -R",
    # tell it to use "/usr/bin/less" instead.
    make
    make test
    make install
    # Final step (to be put on your $PATH).
    cd ~/opt
    ln -s perl-5.n.m perl

(where n and m are the version and sub-version numbers, and <you> is your home directory name).

Now add the following to your ~/.profile :

    export PATH=$HOME/opt/perl/bin:$PATH

and log out/in again for the change to take effect. Check that you’re getting the perl you think you’re getting:

    which perl    # should be ~/opt/perl/bin/perl
    perl -v       # should be your newly-installed v5.n.m
    # Alternatively:
    perl -e 'print "$]\n"'
    # or
    perl -e 'print "$^V\n"'

As another option, you can instead use perlbrew to automatically install one or more Perls into your home directory.

* * *

Before doing anything else (but after you’ve logged out and back in again so the new perl is first on your PATH), install the cpanminus tool. I suggest using the clever “bootstrap method”, whereby cpanm installs itself:

curl -L http://cpanmin.us | perl - App::cpanminus

You should now have ~/opt/perl/bin/cpanm present.

As a quick test, have cpanminus install Modern::Perl:

cpanm Modern::Perl

This will install Modern::Perl into your own Perl’s site_perl dir.

Perl doesn’t come with a REPL out of the box. Install this one:

cpanm Devel::REPL

After that completes, you may also need to install a handful of supporting modules for the repl to work smoothly, such as File::Next, B::Keywords, and Term::ReadLine::Gnu. After that’s done, try it out:

$ re.pl

Here are some commonly-used modules you also might install right off the bat:

DateTime Moose Config::Tiny Template Try::Tiny IPC::System::Simple Capture::Tiny File::Slurp

and for database use, probably also:

DBI DBD::SQLite

For creating and distributing your own Perl 5 modules:

Module::Starter Dist::Zilla

Finally, you may want to set up a local Perl module directory in your home directory for installing modules which you’d rather not put into your Perl’s site_perl dir (for example, more specialized, experimental, or project-specific modules). For this, use local::lib.

Local Perl docs

You can get at your perl docs using the perldoc command. They are also available online as html at http://perldoc.perl.org/.

To see the master ToC, see perldoc perltoc .

To see all the built-in functions grouped by category, see perldoc perlfunc and page down a couple of times.

You can always jump straight to the docs for a given function using the -f option, for example: perldoc -f sort . You can read the perldocs of a local file like so: perldoc ./myfile.pl (you can use the -F option here to speed things up).

Go to perldoc perlfunc and go down a page or two to find a handy arrangement of Perl functions grouped by category.

See perldoc perl to get the big table of contents of all the available perldoc pages.

At the top of every script/module

Always (unless you have a good reason not to) start your scripts and modules with use strict and use warnings , or else use Modern::Perl (which does those for you, plus a bit more).

Language fundamentals

Expressions are bits of code that perl evaluates to some value. They are made up of terms and operators. Statements tell the interpreter to do something, and are made up of expressions. Declarations are like statements, but only tell the interpreter to learn something. Blocks are one or more statements separated by semicolons and delimited as a whole by braces.

The $ , @ , and % sigils are for scalar, array, and hash expressions, respectively. Variables look like $foo , @bar , and %baz .

Strings can be ‘single-quotish’ (hard strings) or “double-quotish” (soft strings). There’s also alternate syntax: q{single-quotish} and qq{double-quotish} . Double-quotish strings allow escapes and variable interpolation. You can interpolate with curlies in there too if necessary, "like ${this}tastic." . You can interpolate an element of an array ( @ar = qw/this am and or but/ ) into a string "like $ar[0] for ex$ar[1]ple" .
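A small sketch of the quoting rules just described. The variable names and values here are only illustrative; the point is which strings interpolate and which do not.

```perl
use strict;
use warnings;

my $this = 'fan';
my @ar   = qw/this am and or but/;

my $hard  = 'no $this here\n';             # single-quotish: nothing is interpolated or escaped
my $soft  = "like ${this}tastic.";          # curlies mark where the variable name ends
my $items = "like $ar[0] for ex$ar[1]ple";  # array elements interpolate too
my $alt   = qq{He said "$this"};            # qq{} avoids escaping the inner quotes

print "$_\n" for $hard, $soft, $items, $alt;
```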

Variables represent the value itself — they are not “references” to the values unless you explicitly make a reference (ex. my $foo = \@a ).

When you do my @b = @a; , you’re making a shallow copy of @a . To make a deep copy, use Clone or Storable (see its dclone function).
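To make the shallow/deep distinction concrete, here is a minimal sketch using Storable's dclone (Storable ships with Perl). The nested data is made up for illustration.

```perl
use strict;
use warnings;
use Storable qw/dclone/;

my @a = ( [1, 2], [3, 4] );        # an array holding two array refs

my @shallow = @a;                   # copies the two refs, not what they point at
my @deep    = @{ dclone(\@a) };     # dclone takes a reference and returns a new one

$a[0][0] = 99;                      # change nested data via the original

print "shallow sees: $shallow[0][0]\n";   # shares the inner array with @a
print "deep sees:    $deep[0][0]\n";      # has its own independent copy
```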

Strings can be modified in-place (for example, using s/// , chomp , and substr ). Note, you can’t index into a string as if it were an array — it’s a $calar, not an @rray.

Use length $a_string to get the length of a string. For the length of an array, just evaluate it in scalar context. To get the length of a hash, do scalar keys %h .

$_ is the default arg in a number of places, for example:

default item in for and “ while (<>) ” loops

default arg to chomp , print , say , and others

m// matches against it unless you use the binding operator ( =~ ); likewise with s///

You use a different set of operators for working with numbers than with strings:

    numerical op    string op
    ==              eq
    !=              ne
    <               lt
    <=              le
    >               gt
    >=              ge
    <=>             cmp
    *               x
    +               .

Use the dot “.” to do string concatenation. There’s no op for concatenating arrays; you just write them together: (@a1, @a2) .
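Here is a short sketch of how the numeric and string operators differ on the same data. The values are arbitrary; note in particular how the two sorts disagree.

```perl
use strict;
use warnings;

my $num_equal = ( 10 == 10.0 )     ? 1 : 0;   # numerically equal
my $str_equal = ( '10' eq '10.0' ) ? 1 : 0;   # but different as strings

my @by_number = sort { $a <=> $b } (10, 9, 100);
my @by_string = sort { $a cmp $b } (10, 9, 100);

my $sum = '3' + '4';    # numeric addition
my $cat = '3' . '4';    # string concatenation
my $rep = '3' x 4;      # string repetition

my @joined = (@by_number, @by_string);   # "concatenating" arrays

print "by number: @by_number\n";
print "by string: @by_string\n";
print "$sum $cat $rep\n";
```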

=> is the fat comma. It’s like a regular comma, except that it autoquotes what’s to the left of it if it’s just a simple identifier.

.. is the range operator. Works for numbers, and characters too (ex. ‘a’ .. ‘z’).

<> is the line input operator, a.k.a the angle operator, a.k.a the readline function.

You can put underscores in number literals, as in 1_000_000 , 0x0000_1111 , 0b11_00_11_00 , etc.

Perl 5 has no boolean literals. If necessary, just use 1 and 0 to mean true and false.

undef , 0 , 0e0 (that is, 0×10⁰) (well, any 0×10ⁿ (0e1, 0e2, …)), q{} (the empty string) and '0' (the string) are all falsey values. They evaluate to false in a boolean context. All other scalars are truthy.
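A quick way to check these rules yourself is to run a handful of values through a boolean test. This is only a sketch; the candidate list is my own choice. Note that the strings '0.0' and '00' are true even though they are numerically zero.

```perl
use strict;
use warnings;

my @candidates = ( undef, 0, 0e0, q{}, '0', '0.0', '00', ' ', 'a', 1, -1 );
my @results    = map { $_ ? 'true' : 'false' } @candidates;

for my $i ( 0 .. $#candidates ) {
    # Avoid an "uninitialized" warning when displaying undef.
    my $shown = defined $candidates[$i] ? "'$candidates[$i]'" : 'undef';
    print "$shown is $results[$i]\n";
}
```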

Note that there’s a difference between expressions as they appear in your source code (at compile-time), and the values that the interpreter evaluates them to (at runtime). Context is determined at compile-time.

A list is something that exists at runtime, in the Perl interpreter. In english, when we see a number of things separated by commas, we tend to call that a list. However, when discussing Perl compile-time expressions, it’s more accurate to call a bunch of things separated by commas a “comma expression”.

An array is one of @these in your code. What it gets evaluated to at runtime depends on the context (which was determined at compile-time).

Empty arrays ( my @a = (); ) and empty hashes ( my %h = (); ) evaluate to false in a boolean context. If either has anything in it though, it’s true.

print only prints what you tell it to. It won’t put spaces between the args you pass it, and you need to include a "\n" if you want one. say tacks on a newline for you.

By default, arrays interpolate into strings with their elements separated by spaces (which makes them easy to print: say "the items are: @ar"; ). Hashes don’t interpolate; if you want a string representation of a hash, maybe use Data::Dumper, Data::Printer, or Dumpvalue.

Parentheses are used for all grouping, lists, and hashes. You also need to put parens around an if , while , for , etc. condition expression.

hashes: creating one: my %h = (foo => 2, bar => 4); setting a key/value pair: $h{baz} = 3; accessing a value: my $bar = $h{foo}; . In list context, a hash unwinds into one long flat list of key/value pairs: my @a = %h; . (So, if you know all your values are unique, you can invert a hash like so: my %inverted_h = reverse %h; .)

For hashes, exists tells whether or not the key is even there. defined tells if the value (for an existing key) is defined or not:

    if ( exists $h{$key} ) ...     # Is $key in this hash?
    if ( defined $h{$key} ) ...    # Is its value defined?
    if ( $h{$key} ) ...            # Is its value true?
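The three tests are easiest to tell apart on a hash that has a false value, an undefined value, and a missing key. The hash contents and the describe helper below are hypothetical, just for illustration.

```perl
use strict;
use warnings;

my %h = ( zero => 0, nothing => undef, two => 2 );

# Report the "strongest" thing we can say about a key.
sub describe {
    my ($key) = @_;
    return 'missing'   if !exists $h{$key};
    return 'undefined' if !defined $h{$key};
    return 'false'     if !$h{$key};
    return 'true';
}

print "$_: ", describe($_), "\n" for qw/zero nothing two nope/;
```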

splice is for arrays; substr is for strings. Also note: delete is for hashes and hash slices — to remove items from arrays, use shift , pop , and splice . You may delete the current item from a hash while iterating over the hash (see perldoc -f each ), but don’t try that with an array.

Perl 5 doesn’t have a “set” data type. For that, use either Set::Scalar or else use a hash (and ignore its values).

Define a function like so: sub foo { ... } . Some quick notes:

Within a sub, you can just refer to (and change) globals normally.

The last expression evaluated is what gets returned; however, it’s probably better style to always use an explicit return (and use a bare return to indicate failure (thanks PBP)).

Args get passed in by reference via @_ . If you want local copies of them, do: my ($foo, $bar) = @_; . Although @_ is a local, its contents ( $_[0] , $_[1] , etc.) refer to the variables in the caller’s scope: they are aliases to them.

Subroutines are package globals.
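The aliasing behavior of @_ is worth seeing once. In this sketch (function names are made up), one sub works on a copy and the other modifies the caller's variable through $_[0].

```perl
use strict;
use warnings;

sub double_copy {
    my ($n) = @_;    # local copy; the caller's variable is untouched
    $n *= 2;
    return $n;
}

sub double_in_place {
    $_[0] *= 2;      # $_[0] is an alias to the caller's variable
    return;
}

my $x       = 5;
my $doubled = double_copy($x);
my $x_after_copy = $x;          # still the original value

double_in_place($x);            # now $x itself has changed

print "copy: $doubled, x after copy: $x_after_copy, x after in-place: $x\n";
```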

You can call built-ins as functions or as operators. If you call them as functions (with explicit parentheses), they have very high precedence. If you call them as operators (no parens) they have very low precedence.

Take a reference by adding a backslash in front of the variable: my $foo_ref = \@my_array; .

The special syntax for a literal array ref is [...] , and for a hash ref it’s {...} .

Dereferencing:

    my $foo_ref = \@my_array;       # Take a ref of @my_array.
    my @a2      = @{ $foo_ref };    # Dereference $foo_ref.

That is, you put a reference inside ${} , @{} , or %{} to dereference it. The braces are sometimes optional, but I like to include them, for clarity.

There are shortcuts for dereferencing. Observe:

    my %foods = (
        'good' => ['beets', 'spinach', 'carrots'],
        'bad'  => ['twinkies', 'devil dogs'],
        'ugly' => ['gruel', 'slop'],
    );

    my $f = ${ $foods{good} }[1];    # (spinach) Not using any shortcuts.
    my $g = $foods{bad}->[0];        # (twinkies) Using the arrow shortcut to dereference.
    my $h = $foods{ugly}[1];         # (slop) Perl lets you omit the arrow here.

    # Also:
    my @ar = (['a', 'b', 'c'], [1, 2, 3], ['foo', 'bar']);

    my $s1 = ${ $ar[0] }[1];    # Not using any shortcuts.
    my $s2 = $ar[1]->[2];       # Using the arrow shortcut.
    my $s3 = $ar[2][1];         # Perl lets you omit the arrow here.

Use map and grep to easily build lists from other lists (block forms are preferable).
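For instance, here is a sketch of the block forms of grep and map, including the common trick of building a hash with map. The data is arbitrary.

```perl
use strict;
use warnings;

my @nums    = (1 .. 10);
my @evens   = grep { $_ % 2 == 0 } @nums;    # keep items for which the block is true
my @squares = map  { $_ * $_ }     @evens;   # transform each item

# map can return more than one item per input, which makes hashes easy to build.
my %length_of = map { $_ => length $_ } qw/foo quux a/;

print "evens:   @evens\n";
print "squares: @squares\n";
print "quux has $length_of{quux} chars\n";
```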

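You can stash data at the end of your file after a line that has __END__ (or __DATA__ ) on it. Access that data via the DATA filehandle. Note that __END__ only works this way in the top-level script; in a module, use __DATA__ . If it’s binary data, base64 encode it first (see MIME::Base64 ).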

Use die to write to stderr and exit. Use warn to write to stderr but not exit. If writing a module, use the Carp equivalents to give more info to the users of your module.

Context

When perl evaluates an expression, what type of value it expects the expression to produce (scalar or list) depends upon context.

Context is determined at compile-time when perl parses your source code.

If perl is expecting a given expression to be a scalar, it tries to evaluate it such that it provides a scalar.

If perl is expecting a given expression to be a list, it tries to evaluate it such that it provides a list.

If a scalar is put into a list context, it usually produces a one-element list.

If a list-producing expression is put into a scalar context, it will hopefully evaluate to something useful. For example, an array will yield the number of elements it contains.

You can force a scalar context by using the scalar operator.

In the docs, when you see something like “ sort LIST ”, it means that the sort operator provides a list context to its arguments. Furthermore, if the operator provides a list context to an argument, it also provides a list context to the elements of that list argument.
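The context rules above can be sketched in a few lines. The context sub is a hypothetical helper that uses the built-in wantarray to report how it was called.

```perl
use strict;
use warnings;

my @items = qw/a b c/;

my $count   = @items;            # scalar context: number of elements
my ($first) = @items;            # list context (note the parens): first element
my $forced  = scalar(@items);    # explicitly forced scalar context

my $now_scalar = localtime;      # scalar context: a human-readable string
my @now_list   = localtime;      # list context: sec, min, hour, mday, ...

# wantarray is true when the sub was called in list context.
sub context { return wantarray ? 'list' : 'scalar' }

my $s   = context();
my ($l) = context();

print "count=$count first=$first forced=$forced s=$s l=$l\n";
```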

Lexicals, globals, and scoping

Perl provides 2 kinds of namespaces for variables to live in: “package” (i.e. “symbol tables”), and lexical. Package variables are globals (aka “package globals”), are dynamically scoped, and live in named symbol tables. Lexicals are locals and live in unnamed lexical scopes. File scope is the largest possible lexical scope.

When one subroutine calls another, the one being called is in the dynamic scope of the caller. When one block is inside another, the inner block is in the lexical scope of the outer one. Note, however, that when you get to the end of a block where you started, you leave the current lexical and dynamic scopes.

If you like, you can create your very own secluded lexical scope with an empty block:

    say "Noisy outer scope!";
    {
        # Nice quiet local scope in here.
        my $i = 3;
        say $i;
    }
    # say $i;    # ERROR

Symbol tables are actually global hashes, and contain the names of the variables in them. All the built-in globals (like @ARGV , %ENV , $$ , etc.) are located in the main symbol table. Incidentally, within each namespace there’s a sub-namespace for each sigil (that’s why $foo and @foo are 2 different variables). If you want to refer to every “foo” in a symbol table — regardless of sigil — you use a “typeglob”. Perl uses typeglobs to implement the importing of modules.

A fully-qualified package variable name like $Foods::Veg::Tomato::variety shows you the structure of the nested symbol tables — the most deeply-nested of which contains the global $variety scalar. The fully-qualified name also indicates that the file path leading to Tomato.pm is Foods/Veg/Tomato.pm .

Note: you can’t see local/lexical/my variables in a module from outside that module.

When you call use , it often imports package symbols (or else gives the compiler some hints (as “pragmas”)) for use in the current lexical scope.

A package declaration (usually at the top of a module) is lexically scoped, and declares the name of the current default package until the end of the current lexical scope (usually the file).

Recall that use happens at compile-time. require happens at run-time.

Aside from package and use (and require ), the three operators dealing with lexicals and package globals are my , our , and local :

my declares a lexically-scoped variable. Its name and value are both stored locally, only.

our declares a lexically scoped name that refers to a package global. Using the above example, the Tomato.pm file would contain our $variety; in it. our does not create values; it just gives you access to the global. That said, you usually declare and assign at the same time, ex.:

    our $foo = 7;

Note: “ our ” replaces “ use vars ”.

local sets up a temporary value for a package global — but only for the current dynamic scope. That is, if you use local in a sub, and then call another sub, in that 2nd sub you’re still in the same dynamic scope, and so will still see that localized value. Once the 1st sub returns, you’re back to the pre-localized value. Of course, same thing happens if you come out of a lexical scope — you’re back to the value that the package global had before the scope you were just in.

Only use local if you have a good reason to.
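A minimal sketch of the dynamic scoping that local gives you. The variable and sub names are invented; the point is that show sees the localized value only while with_local is still running.

```perl
use strict;
use warnings;

our $level = 'outer';           # a package global

sub show { return $level }

sub with_local {
    local $level = 'inner';     # temporary value for this dynamic scope
    return show();              # show() is called from inside, so it sees 'inner'
}

my $during = with_local();
my $after  = show();            # with_local has returned; old value is restored

print "during: $during, after: $after\n";
```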

More on operators

A little more on operators:

They come in three flavors: unary, binary, and ternary.

The things operators work on are called “terms”.

Autoincrement and autodecrement have a little extra magic when dealing with alphanumeric strings.
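The "extra magic" applies when you increment a string made up of letters optionally followed by digits: it carries within each character class instead of converting to a number. A small sketch with a few sample strings of my own choosing (the magic applies to increment only, not decrement):

```perl
use strict;
use warnings;

my @before = qw/aa Az zz a9/;

# Increment a copy of each string, so @before is left alone.
my @after = map { my $s = $_; $s++; $s } @before;

print "$before[$_] -> $after[$_]\n" for 0 .. $#before;
```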

The -> is a binary infix dereference operator when used like so:

    $a_ref->[0]
    $h_ref->{foo}
    $s_ref->('bar')

Here’s an example of using it with a reference to a function:

    sub foobar { ... };
    my $fn_ref = \&foobar;
    ...
    &{ $fn_ref }();    # calling the function
    $fn_ref->();       # same

Otherwise, the arrow is used for method calls, like:

    my $f = Foo->new();
    $f->bar();

Use ** for raising a number to a power.

=~ is the regex binding operator. =~ is for “match”, and its cousin !~ is for “doesn’t match”. These binding operators have a pretty high precedence.

You can get a list of n things like so: “ my @a = ('whatever') x $n; ”. To get “=====”, do: “ my $s = q{=} x 5 ”.

Among the assignment operators, note the presence of ||= , //= , .= , and x= .

In list context, the comma is just a separator. In scalar context, it’s an operator, but not one you’d normally use.

You can make a reference to a list of words like so: [ qw/foo bar baz/ ]

For more, read perldoc perlop .

Keywords

The details are in perldoc perlsyn . Among others, you’ll find info on:

if , unless , elsif , else

for , while , until

next , last , redo

continue (see perldoc -f continue )

Note, in Perl you can label loops, if you like.

    LINE: for my $line (@lines) {
        #...
        next LINE if $thus_and_so;
    }

BTW, $line is lexically scoped to that for-loop block. Same thing with while loops like this:

    while ( my $r = get_thingy() ) {
        # do something
    }
    say "it was $r!";    # ERROR: $r is out of scope here.

Incidentally, a bare block:

    { print "hi\n"; }

is the same as a loop that only loops once. Its contents have their own lexical scope, and you can exit it using last (or even start it over using redo ). if and unless blocks are, of course, not loops.

When looping over a list, the variable used for each item ( $_ being the default) actually aliases the item. So, you can change the list items on the fly. However, don’t delete items on the fly — for that, append to a separate array instead.
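A short sketch of loop aliasing: changes made through the loop variable land in the array itself. For removing items, the sketch builds a new list with grep rather than deleting during the loop. The numbers are arbitrary.

```perl
use strict;
use warnings;

my @prices = (1, 2, 3);

$_ *= 10 for @prices;            # $_ aliases each element in turn

for my $p (@prices) {            # a named loop variable is an alias too
    $p += 1;
}

# To drop items, build a new list rather than deleting while looping.
my @kept = grep { $_ != 21 } @prices;

print "prices: @prices\n";
print "kept:   @kept\n";
```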

Built-ins

Perl comes with a good number of built-in functions. They’re also sometimes referred to as operators when you don’t put parentheses around their arguments. When you use a built-in that only takes one arg, and you don’t use parentheses, it’s called a named unary operator.

See a nicely-organized list of them at perldoc perlfunc .

Standard library

Perl comes out-of-the-box with many modules in its standard library. To see the list of them, run perldoc perlmodlib .

Commonly-used variables

Besides $_ , a few of them are:

@ARGV : args passed into this script. (Recall, the program’s name is stored in $0 , not in $ARGV[0] .)

%ENV : holds environment variables.

%INC

@INC

%SIG : to set up signal handlers.

$? : see perldoc perlvar under $CHILD_ERROR

$!

See chap. 28 of the Camel for more, or read perldoc perlvar .

Quoting

qq{} ( "" ), q{} ( '' )

qx{} ( `` )

qw{}

m// , s/// , tr///

qr{}

See the perldoc perlop section “Quote and Quote-like Operators” for more info.

If you need a multi-line string, use the heredoc syntax:

    my $long_string = <<"END_OF_STRING";    # Note explicit quoting.
    la dee da
    va va va $voom
    ok, done
    END_OF_STRING

    process_long_string($long_string);

    process_long_string(<<'EOS');
    line one
    line two
    line three
    EOS

For dealing with regexes

m// , s///

Captured groups go in $1 , $2 , …

There’s also $` , $& , and $' for pre-match, match, and post-match.

Remember that, inside the current regex (i.e. during the match), you use \1 . Outside the match, you use $1 . For s/foo/bar/ , you’d use $1 (not \1 ) where “bar” is.

The /g regex modifier is for globally finding matches. What the pattern match yields is described in the following sections.

It’s probably best practice to always use “/xms” at the end of your regexen.

Mostly, what you’ll be doing with regexes is:

checking if a string matches a regex: if ( $s =~ m/.../xms ) {...}

doing a search/replace: $s =~ s/foo/bar/xms; (add “ g ” to the xms to replace all)

iterate over a bunch of strings you find in some long string:

    my $long_str = <<'EOS';
    line one foo11bar baz
    line two foo283bar moo
    line three foo321bar yo
    EOS

    while ( $long_str =~ m/ foo (\d+) bar /gxms ) {
        say $1;
    }

Resulting value after an attempted match (m//)

In scalar context

Without /g : If a match, returns true (1). If no match, returns the empty string (false).

With /g (“progressive match”): If a match or no match, same as without /g . However, each subsequent request for a match moves the position pointer to just after the previous match.



In list context

Without /g : If a match, returns the list of matches captured by the grouping parentheses (if there’s no grouping parens, then returns (1) ). If no match, returns the null list.

With /g : If a match, and no grouping parentheses, returns a list of all matches found. If there’s parens, returns the strings captured. If no match, same as without /g .
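The scalar/list and with/without /g cases above can be tried side by side. This is a sketch against a made-up sample string.

```perl
use strict;
use warnings;

my $str = 'foo11bar foo283bar';

# Scalar context, no /g: just true or false.
my $matched = ( $str =~ m/foo(\d+)bar/ ) ? 1 : 0;

# List context, no /g: the captures from the first match.
my ($first) = $str =~ m/foo(\d+)bar/;

# List context with /g and parens: every captured string.
my @all = $str =~ m/foo(\d+)bar/g;

# List context with /g and no parens: every whole match.
my @whole = $str =~ m/foo\d+bar/g;

# Scalar context with /g: progressive matching, one match per iteration.
my @progressive;
while ( $str =~ m/foo(\d+)bar/g ) {
    push @progressive, $1;
}

print "first=$first all=@all whole=@whole progressive=@progressive\n";
```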



Regarding s///

You can use /g with s/// as well, and it does what you’d expect. Regardless, s/// returns a number telling how many times it succeeded in doing the replacement. But note, s///g in scalar context is not progressive like m//g is — you need to manually loop for something like that.
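A sketch of the s/// return value, using throwaway strings. With no /g at most one replacement is made; when nothing matches, the result is false.

```perl
use strict;
use warnings;

my $text = 'foo foo foo';
my $copy = $text;

my $first_only = ( $copy =~ s/foo/bar/ );     # replaces the first occurrence only
my $all        = ( $text =~ s/foo/bar/g );    # replaces every occurrence
my $none       = ( $text =~ s/qux/bar/g );    # nothing to replace: false

print "first_only=$first_only ($copy), all=$all ($text)\n";
print "no replacements made\n" if !$none;
```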

There’s a lot more to regexes, of course. For details, see chapter 5 of the Camel, and/or perldoc perlre .

Object Oriented Programming

Use Moose. Or, for something much smaller, faster to start up, and less featureful, see Moo.

Basic Moose usage: To create a WoodStove class, put the following into WoodStove.pm:

    package WoodStove;

    use OtherModules;
    use YouMightNeed;

    use Moose;
    use namespace::autoclean;

    # extends, roles, attributes, etc.

    # methods

    no Moose;
    __PACKAGE__->meta->make_immutable;

    1;

Files

Some functions: open , close , chdir , glob , unlink , rename , mkdir , rmdir , …

Regarding open :

    use autodie qw/:all/;

    open(my $in_file,  '<', 'input.txt');
    open(my $out_file, '>', 'output.txt');

    my $one_line  = <$in_file>;
    my @all_lines = <$in_file>;      # All the remaining lines.

    print {$out_file} @all_lines;    # Note extra braces and no comma.
    close $out_file;

    # Easiest way to read in all lines of a file (slurp takes a filename):
    use File::Slurp qw/slurp/;
    my $one_big_string = slurp 'input.txt';
    my @lines          = slurp 'input.txt';

Note: you don’t need to check for success when opening files (those first 2 lines) if you’re using autodie . And you should indeed be using autodie .

You can do tests on files using their filename and the various “ -x ” tests such as -r (is readable), -w (is writable), -e (exists), and so on. For the full list of tests, see perldoc perlfunc — they’re listed near the top.

Note that glob has some special magic: if it’s in the condition of a while , for , or until loop, each time through it’ll give you the next filename.

    while ( my $fn = glob '*.txt' ) {
        say "Found $fn";
    }

This is analogous to the magic of the line input operator.

Processes

Aka, “shelling out”.

Use system when you don’t need to capture the output of the program you’re running, and when you may need to interact with it (via stdin/stdout). The return value is the exit status of the program you shelled-out to, as returned by the wait call; shift it right by 8 ( $? >> 8 ) to get the actual exit code.

Use backticks for running a shell program and capturing its output (ex. my $foo = `date`; ). What’s between the backticks is double-quotish.

See also the docs for IPC::System::Simple and Capture::Tiny.

Exception handling

See perldoc -f eval and perldoc -f die .
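As a rough sketch of the built-in mechanism: die throws, a block eval catches, and the error ends up in $@ . The risky sub is hypothetical. For real code, Try::Tiny (mentioned elsewhere in this document) smooths over some rough edges of this pattern.

```perl
use strict;
use warnings;

sub risky {
    my ($n) = @_;
    die "negative input\n" if $n < 0;    # trailing newline suppresses "at ... line N"
    return sqrt $n;
}

my $ok    = eval { risky(16) };    # normal return value passes through
my $bad   = eval { risky(-1) };    # die makes eval return undef...
my $error = $@;                    # ...and leaves the message in $@

# A common idiom: have the block return true, and handle failure with "or do".
my $status = 'ok';
eval { risky(-1); 1 } or do { $status = "failed: $@" };

print "ok=$ok error=$error status=$status";
```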

POD

Perl 5 POD is pretty no-frills. Blank lines separate paragraphs. You indent what you want rendered verbatim. You can get I<italic> , B<bold> , and C<monospace> . Headings are made with =headn where n is 1, 2, 3, or 4. Lists are made with =over , =item * (or =item foo ), and then with a =back at the end of the list. End POD with =cut . It works ok for manpage-style docs.

Perl 6 Pod (see Synopsis 26) is the newest standard for Pod.

That said, this quick reference is written in Pandoc-Markdown.

Packages, Modules, and Distributions

A distribution is the tar.gz file you can download from the CPAN (which cpanm downloads for you). These are typically named “Like-This-1.02.tar.gz”.

A module is a .pm file that you can use in your code. It may contain zero or more packages (described below).

A package is a namespace. At the top of your module you can specify the current package name which sets the name of the default package for whatever follows. Packages usually have a version number as well.

    package MyPackage;
    our $VERSION = 0.01;

(Think of version number i.NNNN like version i.j.k where j and k can be up to 2 digits and are padded with a zero if < 10.)

You generally use CamelCase for package and module names. You can specify more than one package per file, but it’s simpler to have just one per file. That is, one .pm file == one module == one package. Keep it simple.

You name your module the same as the tail end of the package name, but with “.pm” at the end. Further, if the package name contains :: , you place the file in the corresponding nested directory. So, for package Foo::Bar::Baz; , you’d have ~/perl5lib/Foo/Bar/Baz.pm . For your own simple modules, there’ll usually be no colons in the package name, and you’ll just drop the files directly into your ~/perl5lib (no subdirs needed or required).

You’ll need to have use lib '/home/you/perl5lib' in your source file for it to find your own modules.

Checking if you have a module installed

    # If you can `use` it, it's installed.
    perl -MSome::Module -e 1

    # If you can read its docs, it's installed.
    perldoc Some::Module

List all core modules installed

To see the full list of core modules for a given version of Perl:

corelist -v 5.n.m

(See corelist -v for the full list of versions that corelist knows about.)

List all extra modules you’ve installed

perldoc perllocal

Checking if a given module comes standard with Perl

Use the corelist script that comes with Perl, ex.:

corelist List::Util

Find out where a given module is installed

perldoc -l Some::Module

See pmtools.

Uninstalling modules

Tricky.

You might opt to always install modules into your own ~/perl5lib (say, via local::lib). This way, you keep your Perl’s site_perl dir relatively clean, and you can tinker around with your ~/perl5lib dir as you wish. Then, if you really botch up your ~/perl5lib, you can always wipe it and start over without harming your Perl installation.

Using modules

To use modules, whether they come with the standard library or from elsewhere, you just use them near the top of your file like so:

    use Foo::Bar;                    # Import whatever Foo::Bar exports.
    use Foo::Baz qw/func1 func2/;    # Only import func1 and func2.
    use Foo::Moo ();                 # Do not import anything.

You’ll sometimes see something like:

use Foo::Bar -moo;

The two things going on there are:

Putting a dash in front of a bareword does a little magic and makes it into a string which starts with a dash.

Putting a lone string where a list is expected evaluates to a one-item list.

Where does Perl search for modules?

Paths are stored in @INC . See them yourself: perl -MModern::Perl -e 'for (@INC) { say; }'

Installing modules in a local directory

You can use cpanm to install modules into your own local ~/perl5lib directory (just like how it normally installs into the site_perl dir) by first installing and setting up local::lib. Follow the “bootstrap method” to install local::lib into your own ~/perl5lib dir.

TODO: more instructions here.

Writing your own procedural modules

For simple modules, just put this into your /home/<me>/perl5lib/Foo.pm file:

    package Foo;

    our $VERSION = 1.01;

    my $whatever;    # Only visible inside this module.
    our $bar;        # Can be accessed outside via `$Foo::bar`.

    # Access outside via `Foo::baz()`.
    sub baz {
        #...
    }

    1;

No need to use Exporter if it’s just a simple module that you don’t wish to export anything from.

From other scripts, to access Foo.pm’s globals and functions:

    use lib '/home/me/perl5lib';
    use Foo;

    say $Foo::bar;
    Foo::baz();

For anything more complex, or for modules you wish to distribute, see Module::Starter and Dist::Zilla.

Some Perl idioms

Easily generate an arrayref from a list of words: [qw/foo bar baz/]

Assigning to multiple variables at once:

    my ($foo, $bar, $baz, $moo) = ('XX', 38, [qw/a b c/], 'yo');

When you need to do a number of search-replace operations on a given string:

    for ($st) {
        s/foo/bar/;
        s/baz/moo/;
        s/qux/quux/;
    }

Randomly picking one item from a list:

    my @words = qw/foo bar baz/;
    my $word  = $words[ rand(@words) ];

Looping over a hash:

    while ( my ($k, $v) = each %h ) {
        # ...
    }

Some common tasks

Files, line-by-line

To do something with a file, line-by-line:

    use autodie qw/:all/;

    open(my $fh, '<', 'foo.txt');
    while ( my $line = <$fh> ) {
        chomp $line;
        # ...
    }
    close $fh;

or

    use File::Slurp qw/slurp/;
    my $one_big_line = slurp 'foo.txt';
    my @lines        = slurp 'bar.txt';
    chomp @lines;

Shell output line-by-line

    for my $line (`ls -l *.txt`) {
        chomp $line;
        say ">>>$line<<<";
    }

    my $sse = time;    # Seconds since epoch.

    my ($hours, $minutes, $seconds) = (localtime)[2, 1, 0];
    my ($year, $month, $day)        = (localtime)[5, 4, 3];
    $month++;         # Necessary.
    $year += 1900;    # Necessary.
    printf "%u-%02u-%02u\n", $year, $month, $day;
    # See `perldoc -f sprintf` for details on that format string.

    my $nice_string = localtime;
    # You can also pass `localtime()` an sse value.
    my $earlier = localtime($sse - 120);    # 2 minutes ago.

For anything more complicated than that, use DateTime.

Random

    rand()           # 0.0 to < 1.0
    rand(10)         # 0.0 to < 10.0
    int(rand(10))    # 0 to 9 (`int` truncates)

Database access

SQLite

Make sure you have SQLite installed:

sudo apt-get install sqlite3 libsqlite3-dev

Then:

    cpanm DBI
    cpanm DBD::SQLite

XML

Use XML::Twig. See also XML::LibXML.

Other tips and best practices

One good place to look for a list of Perl best practices is the book Perl Best Practices, by Damian Conway. If you haven’t already read it, you probably want to. Here are some tips (some of which are from PBP):

Always use strict and use warnings .

Use my for variables you want to be local, our for ones you want to be global.

Always use parentheses when calling non-built-in functions.

Only use the unless statement modifier when the part that comes first is the usual case, and the modifier is for an “of course, in the unlikely event” situation. For example:

    go_to_store() unless $hurricane_outside;

You can also use modifiers with the various loop operators:

    while (<>) {
        next if m/$bad_coffee_cake/;
        last if m/$tomato_too_soft/;
        #...
    }

Regex tips: Possibly except for very simple matches, always use /xms . Then \A and \z are beginning and end of string. Prefer m//xms to //xms . Use non-capturing parentheses “ (?:...) ” when you want grouping but no capturing. Consider using canned regexen via Regexp::Common sometimes.

Prefer Readonly ( use Readonly; ) over the constant pragma ( use constant; ). You’ll need to grab Readonly from CPAN.

Always quote your heredoc marker after the << .

Use the fat comma for pairing.

If you need to change the value of a punctuation variable, always localize it first.

use English for the less-familiar punctuation variables.

When you really do need to know indexes of values in a list (note that each on an array needs Perl 5.12 or later):

    my @ar = qw/foo bar baz moo poo qux/;
    while ( my ($idx, $item) = each @ar ) {
        say "$idx: $item";
    }

or, the more old-fashioned way:

    my @ar = qw/foo bar baz moo poo qux/;
    for my $idx ( 0 .. $#ar ) {
        say "$idx: $ar[$idx]";
    }

A named lexical iterator variable in a while loop looks like:

    while ( my $line = <> ) {
        next if $line =~ m/^#/;
    }

Label your loops if you’re using next , last , or redo in them.

Always use a block with map and grep .

Call your own functions with parentheses and without & . In fact, you probably shouldn’t ever call a function using the & except for when doing it by reference (as in &{$foo}($arg1, $arg2) ), and in that case, consider using the arrow instead.

When writing a function that takes more than 3 args, use a hashref to pass them in with names. As in:

    sub foo {
        my ($arg_ref) = @_;
        # $arg_ref->{bar}
        # $arg_ref->{baz}
        # $arg_ref->{moo}
        # $arg_ref->{qux}
        #...
    }

    foo({
        bar => 'the bar',
        baz => 'shirley temple',
        moo => 'love boat',
        qux => 'Isaac',
    });

Check for hash key presence like so:

    my $ans = exists $q_for{ans} ? $q_for{ans} : 42;

To read in the contents of a whole file as one long string, use File::Slurp qw/slurp/ . The old-fashioned way was to do my $file_contents = do { local $/; <$infile> }; .
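For reference, here is the old-fashioned slurp idiom as a runnable sketch using only core modules. The temp file and its contents are made up for the demo; File::Temp is used just so the example has something to read.

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Create a small throwaway file to read back.
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
print {$tmp_fh} "line one\nline two\n";
close $tmp_fh or die "Can't close $tmp_name: $!";

# With $/ localized to undef, <$in> returns the whole file at once.
open my $in, '<', $tmp_name or die "Can't open $tmp_name: $!";
my $contents = do { local $/; <$in> };
close $in;

# In list context (with the normal $/), you get one element per line.
open $in, '<', $tmp_name or die "Can't open $tmp_name: $!";
my @lines = <$in>;
close $in;
chomp @lines;

print $contents;
print scalar(@lines), " lines\n";
```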

Modules to make use of

Aside from many useful built-in modules, including:

Cwd

autodie (see note below re. IPC::System::Simple)

List::Util

Data::Dumper

File::Basename

File::Copy

File::Path

File::Temp

File::Find

File::stat

Getopt::Long

Test::More

here’s a few (in no particular order):

those listed in Task::Kensho

Scalar::Util, List::Util, and List::MoreUtils (but not Hash::Util) (PBP chp. 8, p. 170)

Moose for OO development. Also consider MooseX::Declare.

You can use IO::Interactive’s is_interactive() function, or just have IO::Prompt take care of everything for you.

Carp and Carp::Always (when writing your own modules)

Getopt::Long for command-line option processing

Module::Starter, though check out Module::Starter::PBP. Wait: see also Dist::Zilla

Module::Build (used in your Build.PL file).

Config::Std or Config::Tiny.

Test::Simple or Test::More (or Test::Most).

cpanm for installing modules (possibly use with local::lib)

Modern::Perl

DBI, DBD::SQLite, DBD::mysql, DBIx::Class (aka “DBIC”)

DateTime

File::Slurp

IPC::System::Simple (for use with autodie)

Capture::Tiny

Try::Tiny (or maybe TryCatch)

Path::Class (instead of File::Spec)

Config::Tiny

Term::ANSIColor — because who doesn’t like colors in their terminal? :)

Regexp::Common

Perl::Tidy ( perltidy )

Perl::Critic ( perlcritic )

Devel::NYTProf

Devel::Cover

Template

Text::Autoformat

Email::Sender::Simple

GD, GD::Graph

Image::Magick

Gtk2

Plack, PSGI, Starman

Mojolicious, Dancer, Catalyst

Net::OpenSSH

WWW::Mechanize

App::Ack

For more ideas, check out