



















Impatient Perl





version: 19 January 2010





Copyright 2004-2010 Greg London













Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".









Cover Art (Front and Back) on the paperback version of Impatient Perl is excluded from this license. Cover Art is Copyright Greg London 2004, All Rights Reserved.





For latest version of this work go to:

http://www.greglondon.com













Table of Contents

1 The Impatient Introduction to Perl
1.1 The history of perl in 100 words or less
1.2 Basic Formatting for this Document
1.3 Do You Have Perl Installed
1.4 Your First Perl Script, EVER
1.5 Default Script Header
1.6 Free Reference Material
1.7 Cheap Reference Material
1.8 Acronyms and Terms
2 Storage
2.1 Scalars
2.1.1 Scalar Strings
2.1.1.1 String Literals
2.1.1.2 Single quotes versus Double quotes
2.1.1.3 chomp
2.1.1.4 concatenation
2.1.1.5 repetition
2.1.1.6 length
2.1.1.7 substr
2.1.1.8 split
2.1.1.9 join
2.1.1.10 qw
2.1.1.11 Multi-Line Strings, HERE Documents
2.1.2 Scalar Numbers
2.1.2.1 Numeric Literals
2.1.2.2 Numeric Functions
2.1.2.3 abs
2.1.2.4 int
2.1.2.5 trigonometry (sin,cos,tan)
2.1.2.6 exponentiation
2.1.2.7 sqrt
2.1.2.8 natural logarithms(exp,log)
2.1.2.9 random numbers (rand, srand)
2.1.3 Converting Between Strings and Numbers
2.1.3.1 Stringify
2.1.3.1.1 sprintf
2.1.3.2 Numify
2.1.3.2.1 oct
2.1.3.2.2 hex
2.1.3.2.3 Base Conversion Overview
2.1.4 Undefined and Uninitialized Scalars
2.1.5 Booleans
2.1.5.1 FALSE
2.1.5.2 TRUE
2.1.5.3 Comparators
2.1.5.4 Logical Operators
2.1.5.4.1 Default Values
2.1.5.4.2 Flow Control
2.1.5.4.3 Precedence
2.1.5.4.4 Assignment Precedence
2.1.5.4.5 Flow Control Precedence
2.1.5.4.6 Conditional Operator
2.1.6 References
2.1.7 Filehandles
2.1.8 Scalar Review
2.2 Arrays
2.2.1 scalar (@array)
2.2.2 push(@array, LIST)
2.2.3 pop(@array)
2.2.4 shift(@array)
2.2.5 unshift( @array, LIST)
2.2.6 foreach (@array)
2.2.7 sort(@array)
2.2.8 reverse(@array)
2.2.9 splice(@array)
2.2.10 Undefined and Uninitialized Arrays
2.3 Hashes
2.3.1 exists ( $hash{$key} )
2.3.2 delete ( $hash{key} )
2.3.3 keys( %hash )
2.3.4 values( %hash )
2.3.5 each( %hash )
2.4 List Context
2.5 References
2.5.1 Named Referents
2.5.2 References to Named Referents
2.5.3 Dereferencing
2.5.4 Anonymous Referents
2.5.5 Complex Data Structures
2.5.5.1 Autovivification
2.5.5.2 Multidimensional Arrays
2.5.5.3 Deep Cloning, Deep Copy
2.5.5.4 Data Persistence
2.5.6 Stringification of References
2.5.7 The ref() function
3 Control Flow
3.1 Labels
3.2 last LABEL;
3.3 next LABEL;
3.4 redo LABEL;
4 Packages and Namespaces and Lexical Scoping
4.1 Package Declaration
4.2 Declaring Package Variables With our
4.3 Package Variables inside a Lexical Scope
4.4 Lexical Scope
4.5 Lexical Variables
4.6 Garbage Collection
4.6.1 Reference Count Garbage Collection
4.6.2 Garbage Collection and Subroutines
4.7 Package Variables Revisited
4.8 Calling local() on Package Variables
5 Subroutines
5.1 Subroutine Sigil
5.2 Named Subroutines
5.3 Anonymous Subroutines
5.4 Data::Dumper and subroutines
5.5 Passing Arguments to/from a Subroutine
5.6 Accessing Arguments inside Subroutines via @_
5.7 Dereferencing Code References
5.8 Implied Arguments
5.9 Subroutine Return Value
5.10 Returning False
5.11 Using the caller() Function in Subroutines
5.12 The caller() function and $wantarray
5.13 Context Sensitive Subroutines with wantarray()
6 Compiling and Interpreting
7 Code Reuse, Perl Modules
8 The use Statement
9 The use Statement, Formally
9.1 The @INC Array
9.2 The use lib Statement
9.3 The PERL5LIB and PERLLIB Environment Variables
9.4 The require Statement
9.5 MODULENAME -> import (LISTOFARGS)
9.6 The use Execution Timeline
10 bless()
11 Method Calls
11.1 Inheritance
11.2 use base
11.3 INVOCANT->isa(BASEPACKAGE)
11.4 INVOCANT->can(METHODNAME)
11.5 Interesting Invocants
12 Procedural Perl
13 Object Oriented Perl
13.1 Class
13.2 Polymorphism
13.3 SUPER
13.4 Object Destruction
14 Object Oriented Review
14.1 Modules
14.2 use Module
14.3 bless / constructors
14.4 Methods
14.5 Inheritance
14.6 Overriding Methods and SUPER
15 CPAN
15.1 CPAN, The Web Site
15.2 CPAN, The Perl Module
15.3 Plain Old Documentation (POD) and perldoc
15.4 Creating Modules for CPAN with h2xs
16 The Next Level
17 Command Line Arguments
17.1 @ARGV
17.2 Getopt::Declare
17.2.1 Getopt::Declare Sophisticated Example
18 File Input and Output
18.1 open
18.2 close
18.3 read
18.4 write
18.5 File Tests
18.6 File Globbing
18.7 File Tree Searching
19 Operating System Commands
19.1 The system() function
19.2 The Backtick Operator
19.3 Operating System Commands in a GUI
20 Regular Expressions
20.1 Variable Interpolation
20.2 Wildcard Example
20.3 Defining a Pattern
20.4 Metacharacters
20.5 Capturing and Clustering Parenthesis
20.5.1 $1, $2, $3, etc Capturing parentheses
20.5.2 Capturing parentheses not capturing
20.6 Character Classes
20.6.1 Metacharacters Within Character Classes
20.7 Shortcut Character Classes
20.8 Greedy (Maximal) Quantifiers
20.9 Thrifty (Minimal) Quantifiers
20.10 Position Assertions / Position Anchors
20.10.1 The \b Anchor
20.10.2 The \G Anchor
20.11 Modifiers
20.11.1 Global Modifiers
20.11.2 The m And s Modifiers
20.11.3 The x Modifier
20.12 Modifiers For m{} Operator
20.13 Modifiers for s{}{} Operator
20.14 Modifiers for tr{}{} Operator
20.15 The qr{} function
20.16 Common Patterns
20.17 Regexp::Common
21 Parsing with Parse::RecDescent
22 Perl, GUI, and Tk
23 GNU Free Documentation License





A quick explanation of the revision history.





I lost the original Open Office files for "Impatient Perl". To recover, I had to take the PDF, copy and paste the text, and then manually reformat the document to resemble its original layout. It was painfully tedious, but the revision on 17 February 2009 is that reconstructed version.





One of the problems with that version is that the text had hard-returns coded into it from the cut/paste. It might look like there's a four-line paragraph on some page, but it's really four lines with a hard-coded "\n" at the end of each line.





The January 2010 version attempts to remove all the hard-coded returns in the paragraph text and allow the word processor to determine where to wrap the text.





While I was working on that rev, I fixed a couple of typos and tweaked a sentence or two.





Enjoy! Greg London

1 The Impatient Introduction to Perl

This document is for people who either want to learn perl or are already programming in perl and just do not have the patience to scrounge for information to learn and use perl. This document should also find use as a handy desk reference for some of the more common perl related questions.

1.1 The history of perl in 100 words or less

In the mid 1980s, Larry Wall was working as a sys-admin and found that he needed to do a number of common, yet oddball functions over and over again. And he did not like any of the scripting languages that were around at the time, so he invented Perl. Version 1 was released circa 1987. A few changes have occurred between then and now. The current version of Perl has exceeded 5.8.3 and is a highly recommended upgrade.





Perl 6 is on the drawing board as a fundamental rewrite of the language. It is not available yet, and probably will not be available for some time.

1.2 Basic Formatting for this Document

This document is formatted into text sections, code sections, and shell sections. This sentence is part of a text section. Text sections will extend to the far left margin and will use a non-monospaced font. Text sections contain descriptive text.





Code sections are indented.

They also use a monospaced font.

This is a code section, which represents

code to type into a script.

You will need to use a TEXT EDITOR,

not a WORD PROCESSOR to create these files.

Generally, the code is contained in one file,

and is executed via a shell command.





If the code section covers multiple files,

each file will be labeled.





###filename:MyFile.pm

This code will be placed in a

file called MyFile.pm





#!/usr/bin/env perl

###filename:myscript.pl

This code will be placed in a file

called myscript.pl

The first line of myscript.pl will be the

line with #!/usr/bin/env perl









> shell sections are indented like code sections

> shell sections also use monospaced fonts.

> shell sections differ from code sections in

> that shell sections start with a '>' character

> which represents a shell prompt.

> shell sections show commands to type on

> the command line.

> shell sections also show the output of a script,

> if any exists.

> In simple examples, the code is shown in a

> code section, immediately followed by the output

> from running the script. The command to run

> the script is dropped to save space.





As an example, the code for a simple "Hello World" script is shown here. It can be typed into a file of any name. The name of the file is not important. The command to execute the script is not important either. In this example, the code is important, and the output is important, so they are the only things shown.





print "Hello World\n";

> Hello World





THIS DOCUMENT REFERS TO (LI/U)NIX PERL ONLY. Much of this will translate to Mac Perl and Windows Perl, but the exact translation will be left as an exercise to the reader.

1.3 Do You Have Perl Installed

To find out if you have perl installed and its version:





> perl -v





You should have at least version 5.8.3. If you have an older version or if you have no perl installed at all, you can download it for free from





http://www.cpan.org





CPAN is an acronym for Comprehensive Perl Archive Network. The CPAN site contains the latest perl for download and installation, as well as a TON of perl modules for your use.





If you are a beginner, get your sys-admin to install perl for you. Even if you are not a beginner, get your sys-admin to install perl for you.









1.4 Your First Perl Script, EVER

Create a file called hello.pl using your favorite text editor. Type in the following:





#!/usr/bin/env perl

use warnings;

use strict; # comment

print "Hello World\n";





(The #! on the first line is sometimes pronounced "shebang")

(The .pl extension is simply a standard accepted extension for perl scripts.)

Run the script:





> perl hello.pl

Hello World





This calls perl and passes it the name of the script to execute. You can save yourself a little typing if you make the file executable:





> chmod +x hello.pl





And then run the script directly.





> hello.pl

Hello World





If "." is not in your PATH variable, you will have to run the script by typing:





> ./hello.pl





HOORAY! Now go update your resume.





Anything from a # character to the end of the line is a comment.

1.5 Default Script Header

All the code examples in this document are assumed to have the following script header, unless otherwise stated. It uses your PATH environment variable to determine which perl executable to run. If you need to have different versions of perl installed on your system, you can control which version of perl your scripts will run by changing your PATH variable without having to change the scripts themselves.





#!/usr/bin/env perl

use warnings;

use strict;

use Data::Dumper;





Note that Data::Dumper takes some time to load and you wouldn't want to use Data::Dumper on some timing-critical project. But for learning perl with simple scripts, the execution speed isn't that high of a priority. If you're writing a "real" script (i.e. one where time-to-run and memory-usage are issues to be considered), then don't use Data::Dumper by default; only use it if you really need it.
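As a rough illustration of why the default header loads Data::Dumper, the sketch below dumps a small nested structure. The variable names and the data are made up for this example, and Sortkeys is set only so that the printed order is repeatable.

```perl
#!/usr/bin/env perl
use warnings;
use strict;
use Data::Dumper;

# Illustrative data only; not taken from the text above.
my %person = ( name => 'John Doe', langs => [ 'perl', 'c' ] );

$Data::Dumper::Sortkeys = 1;      # sort hash keys so the dump is repeatable
my $dump = Dumper( \%person );    # the structure rendered as a string of perl code
print $dump;
```

The dump starts with $VAR1 and shows every key and value, which makes it a quick way to check what a variable really holds.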

1.6 Free Reference Material

You can get quick help from the standard perl installation.





> perl -h

> perldoc

> perldoc -h

> perldoc perldoc





FAQs on CPAN: http://www.cpan.org/cpan-faq.html

Mailing Lists on CPAN: http://list.cpan.org

More free documentation on the web: http://www.perldoc.com

Still more free documentation on the web: http://learn.perl.org

1.7 Cheap Reference Material

"Programming Perl" by Larry Wall, Tom Christiansen, and Jon Orwant. Highly recommended book to have handy at all times. It is sometimes referred to as the "Camel Book" by way of the camel drawing on its cover. The publisher, O'Reilly, has printed enough computer books to choke a, well, camel, and each one has a different animal on its cover. Therefore if you hear reference to some animal book, it is probably an O'Reilly book. Well, unless its the "Dragon Book", because that refers to a book called "Compilers" by Aho, Sethi, and Ullman.

1.8 Acronyms and Terms

Perl: Originally, "Pearl" shortened to "Perl" to gain status as a 4-letter word. Now considered an acronym for Practical Extraction and Report Language, as well as Petty Eclectic Rubbish Lister. The name was invented first. The acronyms followed. Note that this is "Perl" with a capital "P". The "perl" with a lower case "p" refers to the executable found somewhere near /usr/local/bin/perl





CPAN: Comprehensive Perl Archive Network. See http://www.cpan.org for more.





DWIM: Do What I Mean. Once upon a time, the standard mantra for computer inflexibility was this: "I really hate this darn machine, I wish that they would sell it. It never does what I want, but only what I tell it." DWIM-iness is an attempt to embed perl with telepathic powers such that it can understand what you wanted to write in your code even though you forgot to actually type it. Well, alright, DWIM is just a way of saying the language was designed by some really lazy programmers so that you could be even lazier than they were. (They had to write perl in C, so they could not be TOO lazy.)





AUTOVIVIFY: "auto" meaning "self". "vivify" meaning "alive". To bring oneself to life. Generally applies to perl variables that can grant themselves into being without an explicit declaration from the programmer. Part of perl's DWIM-ness. "Autovivify" is a verb. The noun form is "autovivification". Sometimes, autovivification is not what you meant your code to do, and for some reason, when "do what I mean" meets autovivification in perl, autovivification wins.





And now, a Haiku:

Do What I Mean and

Autovivification

sometimes unwanted





TMTOWTDI: There is More Than One Way To Do It. An acknowledgment that any programming problem has more than one solution. Rather than have perl decide which solution is best, it gives you all the tools and lets you choose. This allows a programmer to select the tool that will let him get his job done. Sometimes, it gives a perl newbie just enough rope to hang himself.





Foo Fighters: A phrase used around the time of WWII by radar operators to describe a signal that could not be explained. Later became known as a UFO. This has nothing to do with perl, except that "foo" is a common variable name used in perl.





Fubar: Another WWII phrase used to indicate that a mission had gone seriously awry or that a piece of equipment was inoperative. An acronym for Fouled Up Beyond All Recognition and similar interpretations. This has nothing to do with perl either, except that fubar somehow got mangled into foobar, and perl is often awash in variables named "foo" and "bar", especially if the programmer wishes to hide the fact that he did not understand his code well enough to come up with better names. If you use a $foo variable in your code, you deserve to maintain it.

2 Storage

Perl has three basic storage types: Scalars, Arrays, and Hashes.

The most basic storage type is a Scalar.

Arrays and Hashes use Scalars to build more complex data types.

2.1 Scalars

Scalars are preceded with a dollar sign sigil. A "$" is a stylized "S".





sigil : A symbol. In Perl a sigil refers to the symbol in front of a variable.





Scalars can store Strings, Numbers (integers and floats), References, and Filehandles. Perl is smart enough to know which type you are putting into a scalar and handle it.





my $diameter = 42; # The "my" keyword declares a lexical

my $pi = 3.1415; # variable. If you don't know what

my $initial = 'g'; # that means, don't worry about it,

my $name = 'John Doe'; # it will be explained later.

my $ref_to_name = \$name; # Specifically, in section 4









Without "use strict;" and without declaring a variable with a "my", using a variable causes perl to create one and initialize it to undef. This undef value will stringify to "" or numify to 0, depending how the undefined variable is used. This is called autovivification. (Stringification and Numification are covered later.)





Autovivify: to bring oneself to life.









In some situations, autovivification is handy. However, in certain situations, autovivification can be an unholy monster.





my $circumference = $pie * $diameter;

# oops, $pie doesn't exist. Autovivified to undef,

# numified to 0, therefore $circumference is zero.





Without use warnings; use strict; perl will autovivify a new variable called "pie", initialize it to undef (which numifies to zero), and assume that is what you meant to do. There is no reason that warnings and strictness should not be turned on in your scripts.
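To see strictness catch the typo, here is a minimal sketch. It compiles the misspelled code inside a string eval purely so that the script can print the error instead of dying from it; the variable names $code, $ok, and $error are assumptions made for this illustration.

```perl
use strict;
use warnings;

# The misspelled $pie lives in a string so it is compiled at run time.
my $code = <<'END_CODE';
use strict;
my $pi       = 3.1415;
my $diameter = 42;
my $circumference = $pie * $diameter;
1;
END_CODE

my $ok    = eval $code;    # undef, because the code fails to compile
my $error = $@;            # perl's explanation of what went wrong
print "strict rejected the code: $error" unless $ok;
```

In a normal script you would not wrap anything in eval; perl would simply refuse to run the file and name the undeclared variable.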

2.1.1 Scalar Strings

Scalars can store strings. You do not have to declare the length of the string, perl just handles it for you automatically.

2.1.1.1 String Literals

String literals must be in single or double quotes or you will get an error.





print hello;

Error: Unquoted string "hello" may clash with

reserved word





You can use single quotes or double quotes to set off a string literal:





my $name = 'mud';

my $greeting = "hello, $name\n";

print $greeting;

> hello, mud





You can also create a list of string literals using the qw() function.





my ($first,$last)=qw( John Doe );

print "first is '$first'\n";

print "last is '$last'\n";

> first is 'John'

> last is 'Doe'

2.1.1.2 Single quotes versus Double quotes

Single quoted strings are a "what you see is what you get" kind of thing.





my $name = 'mud';

print 'hello $name';

> hello $name





Double quotes mean that you get variable interpolation during string evaluation. Scalars, array elements, and simple hash lookups are interpolated; function calls, method calls, and arbitrary expressions are not.





my $name = 'mud';

print "hello $name\n";

> hello mud
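The difference between the two quoting styles can be sketched side by side. The variable names below are chosen for this illustration only.

```perl
use strict;
use warnings;

my $name = 'mud';

my $single = 'hello $name\n';        # nothing interpolated; backslash-n stays two characters
my $double = "hello $name\n";        # $name interpolated; \n becomes a newline
my $braces = "${name}dy water\n";    # braces mark where the variable name ends
my $escape = "\$name is $name\n";    # a backslash keeps the first $name literal

print $single, "\n", $double, $braces, $escape;
```

Braces are worth remembering: without them, perl would look for a variable called $namedy.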





Note: a double-quoted "\n" is a new-line character.

2.1.1.3 chomp

You may get rid of a newline character at the end of a string by chomp-ing the string. The chomp function removes one new line from the end of the string even if there are multiple newlines at the end. If there are no newlines, chomp leaves the string alone. The return value of chomp is the number of characters removed (seldom used).





my $string = "hello world\n";

chomp($string);

warn "string is '$string'";

> string is 'hello world' ...
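A short sketch of the two edge cases described above, a string with two trailing newlines and a string with none. The variable names are assumptions for this example.

```perl
use strict;
use warnings;

my $two_newlines = "hello world\n\n";
chomp($two_newlines);                     # removes only the last newline
my $after_one = length($two_newlines);    # 12: eleven characters plus one newline

my $no_newline = 'hello world';
chomp($no_newline);                       # nothing to remove, string unchanged

print "after one chomp: $after_one characters\n";
print "unchanged: '$no_newline'\n";
```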

2.1.1.4 concatenation

String concatenation uses the period character "."





my $fullname = 'mud' . "bath";

2.1.1.5 repetition

Repeat a string with the "x" operator.





my $line = '-' x 80; # $line is eighty hyphens

2.1.1.6 length

Find out how many characters are in a string with length().





my $len = length($line); # $len is 80

2.1.1.7 substr

substr ( STRING_EXPRESSION, OFFSET, LENGTH);





Spin, fold, and mutilate strings using substr(). The substr function gives you fast access to get and modify chunks of a string. You can quickly get a chunk of LENGTH characters starting at OFFSET from the beginning or end of the string (negative offsets go from the end). The substr function then returns the chunk.





my $chunk = substr('the rain in spain', 9, 2);

warn "chunk is '$chunk'";

> chunk is 'in' ...





The substr function can also be assigned to, replacing the chunk as well. You need a string contained in a variable that can be modified, rather than using a constant literal in the example above.





my $string = 'the rain in spain';

substr($string, 9, 2) = 'beyond';

warn "string is '$string'";

> string is 'the rain beyond spain' ...
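The sketch below exercises the negative offset mentioned above, plus two other forms of substr that perl provides: leaving out LENGTH, and passing a replacement as a fourth argument. Variable names are chosen for illustration.

```perl
use strict;
use warnings;

my $string = 'the rain in spain';

my $tail = substr($string, -5);    # negative offset counts from the end: 'spain'
my $rest = substr($string, 4);     # no LENGTH means "through the end of the string"

# Four-argument form: replace the chunk and return what was there before.
my $old = substr($string, 9, 2, 'beyond');

print "tail='$tail' rest='$rest' old='$old' string='$string'\n";
```

The four-argument form does the same job as assigning to substr, and some people find it easier to read.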

2.1.1.8 split

split(/PATTERN/, STRING_EXPRESSION,LIMIT);





Use the split function to break a string expression into components when the components are separated by a common substring pattern. For example, tab separated data in a single string can be split into separate strings.





my $tab_sep_data = "John\tDoe\tmale\t42";

my ($first,$last,$gender,$age)

= split(/\t/, $tab_sep_data);





You can break a string into individual characters by calling split with an empty string pattern "". The /PATTERN/ in split() is a Regular Expression, which is complicated enough to get its own chapter. However, some common regular expression PATTERNS for split are:





\t tab-separated data

\s+ whitespace-separated data

\s*,\s* comma-separated data
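As a hedged sketch of the cases described above: splitting into characters with an empty pattern, splitting comma-separated data, and using LIMIT to stop after a given number of pieces. The sample strings are invented for this example.

```perl
use strict;
use warnings;

my @chars  = split(//, 'perl');                                # one element per character
my @fields = split(/\s*,\s*/, 'apples , bananas,peaches');     # commas with optional spaces
my ($first, $rest) = split(/\t/, "John\tDoe\tmale\t42", 2);    # LIMIT of 2 pieces

print scalar(@chars), " characters\n";
print "fields: @fields\n";
print "first='$first'\n";
```

With a LIMIT of 2, everything after the first tab stays together in $rest, tabs included.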

2.1.1.9 join

join('SEPARATOR STRING', STRING1, STRING2, ...);





Use join to stitch a list of strings into a single string.





my $string = join(" and ",

'apples', 'bananas', 'peaches');

warn "string is '$string'";

> string is 'apples and bananas and peaches'...

2.1.1.10 qw

The qw() function takes a list of barewords and quotes them for you.





my $string =

join(" and ", qw(apples bananas peaches));

warn "string is '$string'";

> string is 'apples and bananas and peaches'...





2.1.1.11 Multi-Line Strings, HERE Documents

Perl allows you to place a multi-line string in your code by using what it calls a "here document".





my $string = <<"ENDOFDOCUMENT";

Do What I Mean and

Autovivification

sometimes unwanted

ENDOFDOCUMENT





warn "string is '$string'";

> string is 'Do What I Mean and

> Autovivification

> sometimes unwanted' at ...





The '<<' indicates a HERE document, followed by the name of the label indicating the end of the here document. Enclosing the label in double quotes means that perl variables in the document will get interpolated as strings. Enclosing the label in single quotes means that no string interpolation occurs.

Perl then reads the lines after the '<<' as string literal content until it sees the end of string label positioned at the beginning of a line.
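The effect of the quotes around the label can be sketched with two otherwise identical here documents. The label END_OF_TEXT and the variable names are arbitrary choices for this example.

```perl
use strict;
use warnings;

my $name = 'mud';

# Double-quoted label: variables are interpolated.
my $interpolated = <<"END_OF_TEXT";
hello $name
END_OF_TEXT

# Single-quoted label: the text is taken literally.
my $literal = <<'END_OF_TEXT';
hello $name
END_OF_TEXT

print $interpolated;
print $literal;
```

Note that each string ends with the newline that precedes the closing label.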

2.1.2 Scalar Numbers

Perl generally uses floats internally to store numbers. If you specify something that is obviously an integer, it will use an integer. Either way, you simply use it as a scalar.





my $days_in_week = 7; # scalar => integer

my $temperature = 98.6; # scalar => float

2.1.2.1 Numeric Literals

Perl allows several different formats for numeric literals, including integer, floating point, and scientific notation, as well as decimal, octal, and hexadecimal.

Binary numbers begin with "0b"

Hexadecimal numbers begin with "0x"

Octal numbers begin with "0"

All other numeric literals are assumed to be decimal.





my $solar_temp_c = 1.5e7; # centigrade

my $solar_temp_f = 27_000_000.0; # Fahrenheit

my $base_address = 01234567; # octal

my $high_address = 0xfa94; # hexadecimal

my $low_address = 0b100101; # binary
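A quick way to convince yourself that these are only different spellings of the same numbers is to compare them. This is a small sketch with invented variable names.

```perl
use strict;
use warnings;

# The same value, 255, written in four different bases.
my @same = ( 255, 0xff, 0377, 0b11111111 );
my $mismatches = grep { $_ != 255 } @same;    # counts the literals that differ

# Underscores are ignored inside a numeric literal.
my $underscores_ok = ( 27_000_000 == 27000000 ) ? 'yes' : 'no';

# Scientific notation: 1.5e7 is 1.5 times ten to the seventh.
my $sci_ok = ( 1.5e7 == 15_000_000 ) ? 'yes' : 'no';

print "mismatches: $mismatches\n";
print "underscores ignored: $underscores_ok\n";
print "1.5e7 equals 15_000_000: $sci_ok\n";
```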





2.1.2.2 Numeric Functions

2.1.2.3 abs

Use abs to get the absolute value of a number.





my $var1 = abs(-3.4); # var1 is 3.4

my $var2 = abs(5.9); # var2 is 5.9

2.1.2.4 int

Use "int" to convert a floating point number to an integer. Note that this truncates everything after the decimal point, which means you do NOT get rounding. Truncation always moves the value toward zero: positive numbers get smaller and negative numbers get bigger.





my $price = 9.95;

my $dollars = int ($price);

# dollars is 9, not 10! false advertising!

my $y_pos = -5.9;

my $y_int = int($y_pos);

# y_int is -5 (-5 is "bigger" than -5.9)









If you want to round a float to the nearest integer, you will need to write a bit of code. One way to accomplish it is to use sprintf:





my $price = 9.95;

my $dollars = sprintf("%.0f", $price);

# dollars is 10
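Putting truncation and rounding next to each other makes the difference easier to see, especially for negative numbers. This sketch reuses the values from the examples above; the result variable names are assumptions.

```perl
use strict;
use warnings;

my $price = 9.95;
my $y_pos = -5.9;

my $price_truncated = int($price);                # 9
my $price_rounded   = sprintf("%.0f", $price);    # 10
my $y_truncated     = int($y_pos);                # -5
my $y_rounded       = sprintf("%.0f", $y_pos);    # -6

print "price: int gives $price_truncated, sprintf gives $price_rounded\n";
print "y_pos: int gives $y_truncated, sprintf gives $y_rounded\n";
```

Values that sit exactly halfway between two integers are a separate subject, because the result depends on how the number is stored in binary.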

2.1.2.5 trigonometry (sin,cos,tan)

The sin and cos functions return the sine and cosine of a value given in RADIANS. Perl has no built-in tan function; divide sin by cos, or use tan from the Math::Trig module. If you have a value in DEGREES, multiply it by (pi/180) first.





my $angle = 45; # 45 deg

my $radians = $angle * ( 3.14 / 180 ); # .785 rad

my $sine_deg = sin($angle); # 0.851, wrong: 45 was treated as radians

my $sine_rad = sin($radians); # 0.707





If you need inverse sine, cosine, or tangent, then use the Math::Trig module on CPAN.
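A slightly more careful version of the conversion is sketched below, using only functions built into perl. It gets pi from atan2 instead of typing 3.14, and computes the tangent as sine divided by cosine. The helper name degrees_to_radians is made up for this example.

```perl
use strict;
use warnings;

my $pi = 4 * atan2(1, 1);    # atan2(1,1) is pi/4

# Hypothetical helper: convert degrees to radians.
sub degrees_to_radians { return $_[0] * $pi / 180; }

my $radians = degrees_to_radians(45);
my $sine    = sin($radians);      # about 0.7071
my $cosine  = cos($radians);      # about 0.7071
my $tangent = $sine / $cosine;    # about 1

printf "sin=%.4f cos=%.4f tan=%.4f\n", $sine, $cosine, $tangent;
```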





2.1.2.6 exponentiation

Use the "**" operator to raise a number to some power.





my $seven_squared = 7 ** 2; # 49

my $five_cubed = 5 ** 3; # 125

my $three_to_the_fourth = 3 ** 4; # 81

Use fractional powers to take a root of a number:

my $square_root_of_49 = 49 ** (1/2); # 7

my $cube_root_of_125 = 125 ** (1/3); # 5

my $fourth_root_of_81 = 81 ** (1/4); # 3





Standard perl cannot handle imaginary numbers. Use the Math::Complex module on CPAN.
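Two precedence details of "**" tend to surprise people, so here is a small sketch of them. The variable names are invented for this illustration.

```perl
use strict;
use warnings;

my $tower    = 2 ** 3 ** 2;    # evaluated right to left: 2 ** 9 = 512
my $negative = -2 ** 2;        # ** binds tighter than unary minus: -(2 ** 2) = -4
my $squared  = (-2) ** 2;      # parentheses make it the square of -2: 4

print "tower=$tower negative=$negative squared=$squared\n";
```

When in doubt, add parentheses; they cost nothing and remove the guesswork.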

2.1.2.7 sqrt

Use sqrt to take the square root of a positive number.





my $square_root_of_123 = sqrt(123); # 11.0905

2.1.2.8 natural logarithms(exp,log)

The exp function returns e to the power of the value given. To get e, call exp(1);





my $value_of_e = exp(1); # 2.7183

my $big_num= exp(42); # 2.7183 ** 42 = 1.7e18





The log function returns the inverse exp() function, which is to say, log returns the number to which you would have to raise e to get the value passed in.





my $inv_exp = log($big_num); # inv_exp = 42





If you want another base, then use this subroutine:





sub log_x_base_b {return log($_[0])/log($_[1]);}

# want the log base 10 of 12345

# i.e. to what power do we need to raise the

# number 10 to get the value of 12345?

my $answer = log_x_base_b(12345,10); # answer = 4.1
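The subroutine can be checked against a few values whose logarithms are easy to work out by hand. The sketch below repeats the subroutine so that it runs on its own; the result variable names are assumptions.

```perl
use strict;
use warnings;

sub log_x_base_b { return log($_[0]) / log($_[1]); }

my $log2       = log_x_base_b(1024, 2);     # 2 ** 10 is 1024
my $log10      = log_x_base_b(1000, 10);    # 10 ** 3 is 1000
my $round_trip = log(exp(5));               # log undoes exp

printf "log base 2 of 1024:  %.4f\n", $log2;
printf "log base 10 of 1000: %.4f\n", $log10;
printf "log(exp(5)):         %.4f\n", $round_trip;
```

The results are floating point numbers, so format or round them before comparing against an exact integer.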





Note that inverse natural logs can be done with exponentiation, you just need to know the value of the magic number e (~ 2.718281828). The exp function is straightforward exponentiation:





# big_num = 2.7183 ** 42 = 1.7e18

my $big_num = $value_of_e ** 42;





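Natural logarithms cannot be computed the same way. Raising e to the inverse of the value (i.e. 1/value) gives a root of e, not a logarithm, so use the log function to undo exp.





# root = 2.7183 ** (1/1.7e18), which is very nearly 1, not 42

my $root = $value_of_e ** (1/$big_num);

# inv_exp = 42

my $inv_exp = log($big_num);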

2.1.2.9 random numbers (rand, srand)

The rand function is a pseudorandom number generator (PRNG).





If a value is passed in, rand returns a number that satisfies ( 0 <= return < input )





If no value is passed in, rand returns a number in the range ( 0 <= return < 1 )





The srand function will seed the PRNG with the value passed in. If no value is passed in, srand will seed the PRNG with something from the system that will give it decent randomness. You can pass in a fixed value to guarantee the values returned by rand will always follow the same sequence (and therefore are predictable). You should only need to seed the PRNG once. If you have a version of perl greater than or equal to 5.004, you should not need to call it at all, because perl will call srand at startup.
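As a sketch of both ideas, the code below turns rand into a die roll and shows that seeding with a fixed value repeats the sequence. The subroutine name roll_dice and the seed 42 are arbitrary choices for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: roll a six-sided die $count times.
# int(rand(6)) gives 0 through 5, so adding 1 gives 1 through 6.
sub roll_dice {
    my ($count) = @_;
    return map { 1 + int(rand(6)) } 1 .. $count;
}

srand(42);
my @first = roll_dice(5);

srand(42);                   # same seed, so the same sequence again
my @second = roll_dice(5);

print "first:  @first\n";
print "second: @second\n";
```

A fixed seed is handy for tests and debugging. Leave srand alone when you want the rolls to differ from run to run.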

2.1.3 Converting Between Strings and Numbers

Many languages require the programmer to explicitly convert numbers to strings before printing them out and to convert strings to numbers before performing arithmetic on them. Perl is not one of these languages.





Perl will attempt to apply Do What I Mean to your code and just Do The Right Thing. There are two basic conversions that can occur: stringification and numification.

2.1.3.1 Stringify

Stringify: Converting something other than a string to a string form.

Perl will automatically convert a number (integer or floating point) to a string format before printing it out.





my $mass = 7.3;

my $volume = 4;

warn "mass is '$mass'";

warn "volume is '$volume'";

> mass is '7.3' ...

> volume is '4' ...





Even though $mass is stored internally as a floating point number and $volume is stored internally as an integer, the code did not have to explicitly convert these numbers to string format before printing them out. Perl will attempt to convert the numbers into the appropriate string representation. If you do not want the default format, use sprintf. If you want to force stringification, simply concatenate a null string onto the end of the value.





my $mass = 7.3; # 7.3

my $string_mass = $mass . ''; # '7.3'





2.1.3.1.1 sprintf

Use sprintf to control exactly how perl will convert a number into string format.





sprintf ( FORMAT_STRING, LIST_OF_VALUES );





For example:





my $pi = 3.1415;

my $str = sprintf("%06.2f",$pi);

warn "str is '$str'";

> str is '003.14' ...









Decoding the above format string:





% => format

0 => fill leading spaces with zero

6 => total length, including decimal point

.2 => put two places after the decimal point

f => floating point notation





To convert a number to a hexadecimal, octal, binary, or decimal formatted string, use the following FORMAT_STRINGS:





hexadecimal      "%lx"
octal            "%lo"
binary           "%lb"
decimal integer  "%ld"
decimal float    "%f"
scientific       "%e"

The letter 'l' (L) indicates the input is an integer, possibly a Long integer.
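The formats can be tried out on a couple of sample values. This is a minimal sketch with invented variable names; it uses the plain "%x", "%o", "%b", and "%d" forms, which perl also accepts without the 'l'.

```perl
use strict;
use warnings;

my $n = 255;

my $hex = sprintf("%x",   $n);           # 'ff'
my $oct = sprintf("%o",   $n);           # '377'
my $bin = sprintf("%b",   $n);           # '11111111'
my $dec = sprintf("%5d",  $n);           # '  255', padded with spaces to width 5
my $flt = sprintf("%.2f", 3.14159);      # '3.14'
my $sci = sprintf("%.2e", 12345.678);    # mantissa 1.23 and a power-of-ten exponent

print "hex=$hex oct=$oct bin=$bin dec='$dec' flt=$flt sci=$sci\n";
```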

2.1.3.2 Numify

Numify: Converting something other than a number to a numeric form.

Sometimes you have string information that actually represents a number. For example, a user might enter the string "19.95" which must be converted to a float before perl can perform any arithmetic on it.





You can force numification of a value by adding integer zero to it.





my $user_input = '19.95'; # '19.95'

my $price = $user_input+0; # 19.95
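User input is not always as tidy as '19.95'. The sketch below shows what numification does with a string that only starts with a number, and uses looks_like_number from the Scalar::Util module to test a string before converting it. The sample strings and variable names are assumptions for this example.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my $clean = '19.95';
my $messy = '3 apples';

# Test before converting, so bad input can be rejected.
my $clean_ok = looks_like_number($clean) ? 'yes' : 'no';
my $messy_ok = looks_like_number($messy) ? 'yes' : 'no';

my ($total, $from_messy);
{
    no warnings 'numeric';       # silence the "isn't numeric" warning in this block only
    $total      = $clean + 1;    # 20.95
    $from_messy = $messy + 2;    # the leading 3 is used and the rest ignored: 5
}

print "clean numeric? $clean_ok, messy numeric? $messy_ok\n";
print "total=$total from_messy=$from_messy\n";
```

With warnings left on, the second addition would still produce 5, but perl would complain that the argument isn't numeric.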





If the string is NOT in base ten format, then use oct() or hex()





2.1.3.2.1 oct

The oct function can take a string that fits the octal, hexadecimal, or binary format and convert it to an integer.





binary formatted strings must start with "0b"





hexadecimal formatted strings must start with "0x"





All other numbers are assumed to be octal strings.





Note: even though the string might not start with a zero (as required by octal literals), oct will assume the string is octal. This means calling oct() on a decimal number could be a bad thing.





To handle a string that could contain octal, hexadecimal, binary, OR decimal strings, you could assume that octal strings must start with "0". Then, if the string starts with zero, call oct on it, else assume it's decimal. This example uses regular expressions and the conditional operator.





my $num = ($str=~m{^0}) ? oct($str) : $str + 0;

2.1.3.2.2 hex

The hex() function takes a string in hex format and converts it to integer. The hex() function is like oct() except that hex() only handles hex base strings, and it does not require a "0x" prefix.
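A few sample calls show how oct() and hex() treat prefixes, including the decimal-looking string that oct() quietly reads as octal. The variable names are chosen for this illustration.

```perl
use strict;
use warnings;

my $from_hex_prefix = oct('0x1f');     # 31
my $from_bin_prefix = oct('0b101');    # 5
my $from_octal      = oct('0755');     # 493
my $no_leading_zero = oct('755');      # still read as octal: 493
my $pitfall         = oct('12');       # 10, not 12

my $hex_plain    = hex('ff');          # 255
my $hex_prefixed = hex('0xff');        # 255, the prefix is optional

print "$from_hex_prefix $from_bin_prefix $from_octal $no_leading_zero $pitfall\n";
print "$hex_plain $hex_prefixed\n";
```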

2.1.3.2.3 Base Conversion Overview

Given a decimal number:





my $decimal=12;





Convert from decimal to another base using sprintf:





my $hex = sprintf("%lx", $decimal);

my $oct = sprintf("%lo", $decimal);

my $bin = sprintf("%lb", $decimal);





If you want to pad the most significant bits with zeroes and you know the width, use this:





# 08 assumes width is 8 characters

my $p_hex = sprintf("%08lx", $decimal);

my $p_oct = sprintf("%08lo", $decimal);

my $p_bin = sprintf("%08lb", $decimal);









If you have a string and you want to convert it to decimal, use the conditional operator and oct().





sub convert_to_decimal {

($_[0]=~m{^0}) ? oct($_[0]) : $_[0] + 0;

}





warn convert_to_decimal('42'); # dec

warn convert_to_decimal('032'); # oct

warn convert_to_decimal('0xff'); # hex

warn convert_to_decimal('0b1001011'); # bin





If you want to know how many bits it would take to store a number, convert it to binary using sprintf (don't pad with zeros) and then call length() on it.





warn length(sprintf("%lb", 255)); # 8

2.1.4 Undefined and Uninitialized Scalars

All the examples above initialized the scalars to some known value before using them. You can declare a variable but not initialize it, in which case, the variable is undefined. If you use a scalar that is undefined, perl will stringify or numify it based on how you are using the variable.





An undefined scalar stringifies to an empty string: ""





An undefined scalar numifies to zero: 0





Without warnings turned on, this conversion is silent. With warnings on, the conversion still takes place, but a warning is emitted.





Since perl automatically performs this conversion no matter what, there is no string or numeric operation that will tell you if the scalar is undefined or not.





Use the defined() function to test whether a scalar is defined or not.





If the scalar is defined, the function returns a boolean "true" (1)





If the scalar is NOT defined, the function returns a boolean "false" ("").









If you have a scalar with a defined value in it, and you want to return it to its uninitialized state, assign undef to it. This will be exactly as if you declared the variable with no initial value.





my $var; # undef

print "test 1 :";

if(defined($var)) {print "defined\n";}

else {print "undefined\n";}

$var = 42; # defined

print "test 2 :";

if(defined($var)) {print "defined\n";}

else {print "undefined\n";}

$var = undef; # undef as if never initialized

print "test 3 :";

if(defined($var)) {print "defined\n";}

else {print "undefined\n";}

> test 1 :undefined

> test 2 :defined

> test 3 :undefined

2.1.5 Booleans

Perl does not have a boolean "type" per se. Instead, perl interprets scalar strings and numbers as "true" or "false" based on some rules:





1) Strings "" and "0" are FALSE, any other string or stringification is TRUE

2) Number 0 is FALSE, any other number is TRUE

3) All references are TRUE

4) undef is FALSE





Note that these are SCALARS. Any variable that is not a SCALAR is first evaluated in scalar context, and then treated as a string or number by the above rules. The scalar context of an ARRAY is its size. An array with one undef value has a scalar() value of 1 and is therefore evaluated as TRUE.
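Here is a small sketch of an array being boolean tested. The variable names are only for illustration.

```perl
use strict;
use warnings;

my @empty     = ();
my @one_undef = (undef);

# An array in boolean context is evaluated as its size.
my $empty_is     = @empty     ? 'true' : 'false';   # 'false', size is 0
my $one_undef_is = @one_undef ? 'true' : 'false';   # 'true',  size is 1
```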





A subroutine returns a scalar or a list depending on the context in which it is called. To explicitly return FALSE in a subroutine, use this:





return wantarray() ? () : 0; # FALSE





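This is sufficiently troublesome to type for such a common thing that an empty return statement within a subroutine will do nearly the same thing. It returns an empty list in list context and undef (rather than 0) in scalar context, both of which are FALSE: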





return; #FALSE
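To see why context matters here, compare a bare return with "return 0" when the caller assigns the result to an array. This is only a sketch; the sub names are made up.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.
sub fail_bare { return; }      # empty list or undef, by context
sub fail_zero { return 0; }    # always the single scalar 0

my @bare = fail_bare();        # ()
my @zero = fail_zero();        # (0)

my $bare_list_true = @bare ? 1 : 0;   # 0: an empty array is FALSE
my $zero_list_true = @zero ? 1 : 0;   # 1: one element, so TRUE
my $bare_scalar    = fail_bare();     # undef, which is FALSE
```

The array that received "return 0" holds one element, so a boolean test on it comes out TRUE even though the subroutine meant to say FALSE.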













2.1.5.1 FALSE

The following scalars are interpreted as FALSE:





integer 0 # false

float 0.0 # false

string '0' # false

string '' # false

undef # false

2.1.5.2 TRUE

ALL other values are interpreted as TRUE, which means the following scalars are considered TRUE, even though you might have expected some of them to be FALSE.





string '0.0' # true

string '00' # true

string 'false' # true

float 3.1415 # true

integer 11 # true

string 'yowser' # true





If you are doing a lot of work with numbers on a variable, you may wish to force numification on that variable ($var+0) before it gets boolean tested, just in case you end up with a string "0.0" instead of a float 0.0 and get some seriously hard to find bugs.





Note that the string '0.0' is TRUE, but ('0.0'+0) will get numified to 0, which is FALSE. If you are processing a number as a string and want to evaluate it as a BOOLEAN, make sure you explicitly NUMIFY it before testing its BOOLEANNESS.
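The same point as a runnable sketch. The variable $reading is hypothetical and stands in for any number that arrived as text.

```perl
use strict;
use warnings;

my $reading = '0.0';   # a number that arrived as a string

my $as_string = $reading       ? 'true' : 'false';   # 'true'
my $as_number = ($reading + 0) ? 'true' : 'false';   # 'false'
```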





Built in Perl functions that return a boolean will return an integer one (1) for TRUE and an empty string ("") for FALSE.









2.1.5.3 Comparators

Comparison operators return booleans, specifically an integer 1 for true and a null string "" for false. The exceptions are the "comparison" operators proper ("<=>" and "cmp"), which return -1, 0, or +1 to indicate that the left value is less than, equal to, or greater than the right value. Distinct comparison operators exist for comparing strings and for comparing numbers.

Function                           String   Numeric
equal to                           eq       ==
not equal to                       ne       !=
less than                          lt       <
greater than                       gt       >
less than or equal to              le       <=
greater than or equal to           ge       >=
comparison (lt=-1, eq=0, gt=+1)    cmp      <=>

Note that if you use a string operator to compare two numbers, you will get their alphabetical string comparison. Perl will stringify the numbers and then perform the compare. This will occur silently; perl will emit no warning. And if you wanted the numbers compared numerically but used string comparison, then you will get the wrong result when you compare the strings ("9" lt "100").





String "9" is greater than (gt) string "100".





Number 9 is less than (<) number 100.
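A small sketch of the difference, using only the operators listed in the table above. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $as_strings = ('9' lt '100') ? 'less' : 'not less';   # 'not less'
my $as_numbers = (9 < 100)      ? 'less' : 'not less';   # 'less'

my $cmp_strings = '9' cmp '100';   # +1, the character "9" sorts after "1"
my $cmp_numbers = 9 <=> 100;       # -1
```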





If you use a numeric operator to compare two strings, perl will attempt to numify the strings and then compare them numerically. Comparing "John" <= "Jacob" will cause perl to try to convert "John" into a number and fail miserably. If warnings are not turned on, it will fail miserably and SILENTLY, numifying "John" to integer zero.





The numeric comparison operator '<=>' is sometimes called the "spaceship operator".









2.1.5.4 Logical Operators

Perl has two sets of operators to perform logical AND, OR, NOT functions. The difference between the two is that one set has a higher precedence than the other set.





The higher precedence logical operators are the '&&', '||', and '!' operators.





function  operator  usage         return value
AND       &&        $one && $two  if ($one is false) $one else $two
OR        ||        $one || $two  if ($one is true) $one else $two
NOT       !         ! $one        if ($one is false) true else false









The lower precedence logical operators are the 'and', 'or', 'not', and 'xor' operators.





function  operator  usage           return value
AND       and       $one and $two   if ($one is false) $one else $two
OR        or        $one or $two    if ($one is true) $one else $two
NOT       not       not $one        if ($one is false) true else false
XOR       xor       $one xor $two   if ( ($one true and $two false) or ($one false and $two true) ) then return true else false
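Note from the "return value" column that AND and OR hand back one of their operands, not just a 1 or an empty string. A short sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $or_first   = 'left' || 'right';   # 'left',  the first true operand
my $or_second  = 0      || 'right';   # 'right'
my $and_first  = 0      && 'right';   # 0,       the first false operand
my $and_second = 'left' && 'right';   # 'right'
my $xor_result = (1 xor 0) ? 'true' : 'false';   # 'true'
```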







Both sets of operators are very common in perl code, so it is useful to learn how precedence affects their behavior. But first, some examples of how to use them.

2.1.5.4.1 Default Values

This subroutine has two input parameters ($left and $right) with default values (1.0 and 2.0). If the user calls the subroutine with missing arguments, the undefined parameters will instead receive their default values.





sub mysub {

my( $left, $right )=@_;

$left ||= 1.0;

$right ||= 2.0;

# deal with $left and $right here.

}





The '||=' operator is a fancy shorthand. This:

$left ||= 1.0;





is exactly the same as this:

$left = $left || 1.0;
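One caveat: '||=' tests for truth, not for definedness, so a caller who deliberately passes 0 or "" gets the default instead. Perl 5.10 and later also provide a defined-or assignment, '//=', which replaces only undef. The sketch below assumes perl 5.10 or newer, and the sub names are made up for the comparison.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.
sub with_or {
    my ($left) = @_;
    $left ||= 1.0;    # replaces ANY false value, including 0
    return $left;
}

sub with_defined_or {
    my ($left) = @_;
    $left //= 1.0;    # replaces only undef (perl 5.10 and later)
    return $left;
}

my $or_zero     = with_or(0);           # 1, the caller's 0 was lost
my $dor_zero    = with_defined_or(0);   # 0, the caller's 0 was kept
my $dor_missing = with_defined_or();    # 1, the default was applied
```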





2.1.5.4.2 Flow Control

The open() function here will attempt to open $filename for reading and attach $filehandle to it. If open() fails in any way, it returns FALSE, and FALSE OR'ed with die() means that perl must evaluate the die() function to finish the logical evaluation. The evaluation never completes because execution dies, but the end result is code that is actually quite readable.





open (my $filehandle, $filename)

or die "cant open";

2.1.5.4.3 Precedence

The reason we used '||' in the first example and 'or' in the second example is because the operators have different precedence, and we used the one with the precedence we needed.

2.1.5.4.4 Assignment Precedence

When working with an assignment, use '||' and '&&', because they have a higher precedence than (and are evaluated before) the assignment '='. The 'or' and 'and' operators have a precedence that is LOWER than an assignment, meaning the assignment would occur first, followed by any remaining 'and' and 'or' operators.





Right:





my $default = 0 || 1;

# default is 1









Wrong:





my $default = 0 or 1;

# default is 0





The second (wrong) example is equivalent to this:





(my $default = 0) or 1;





which will ALWAYS assign $default to the first value and discard the second value.





2.1.5.4.5 Flow Control Precedence

When using logical operators to perform flow control, use 'or' and 'and' operators, because they have lower precedence than functions and other statements that form the boolean inputs to the 'or' or 'and' operator. The '||' and '&&' have higher precedence than functions and may execute before the first function call.





Right:





close $fh or die "Error:could not close";





Wrong:

close $fh || die "Error: could not close";





The second (wrong) example is equivalent to this:





close ($fh || die "Error");





which will ALWAYS evaluate $fh as true, NEVER die, and close $fh. If close() fails, the return value is discarded, and the program continues on its merry way.





It is always possible to override precedence with parentheses, but it is probably better to get in the habit of using the right operator for the right job.

2.1.5.4.6 Conditional Operator

The conditional operator mimics the conditional testing of an if-else block. The conditional operator uses three operands, and is also called a trinary operator.





As it happens, the conditional operator is perl's ONLY trinary operator, so people sometimes call it the trinary or ternary operator when they mean conditional operator. As long as perl doesn't add another trinary operator, it's not a problem. More rarely, it is called the ?: operator.





The conditional operator has this form:





my $RESULT = $BOOLEAN1 ? $VALUE1 : $VALUE2;





This can be rewritten as an if-else block like this:





my $RESULT;

if($BOOLEAN1) {

$RESULT = $VALUE1

} else {

$RESULT = $VALUE2

}





The conditional operator allows you to declare the variable and perform the assignment all in one short line of code.





Note that $BOOLEAN1, $VALUE1 and $VALUE2 can be replaced by any normal perl expression, rather than being limited to a simple scalar value. One interesting expression that you could replace $VALUE2 with is another conditional operator, effectively allowing you to create a chain of if-elsif-elsif-else statements. For example:





my $RESULT =

$BOOLEAN1 ? $VALUE1

: $BOOLEAN2 ? $VALUE2

: $BOOLEAN3 ? $VALUE3

: $BOOLEAN4 ? $VALUE4

: $VALUE5;





The above example is equivalent to this mouthful:





my $RESULT;

if($BOOLEAN1) { $RESULT = $VALUE1 }

elsif($BOOLEAN2) { $RESULT = $VALUE2 }

elsif($BOOLEAN3) { $RESULT = $VALUE3 }

elsif($BOOLEAN4) { $RESULT = $VALUE4 }

else { $RESULT = $VALUE5 }
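Here is the same chaining pattern with concrete values, as a sketch. The helper size_of and its thresholds are made up for this example.

```perl
use strict;
use warnings;

# size_of is a hypothetical helper for this illustration.
# The first test that is true picks the value.
sub size_of {
    my ($count) = @_;
    return
          $count == 0  ? 'none'
        : $count <  10 ? 'few'
        : $count < 100 ? 'many'
        :                'lots';
}

my $size_a = size_of(0);      # 'none'
my $size_b = size_of(3);      # 'few'
my $size_c = size_of(42);     # 'many'
my $size_d = size_of(1000);   # 'lots'
```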





2.1.6 References

A reference points to the variable to which it refers. It is kind of like a pointer in C, which says "the data I want is at this address". Unlike C, you cannot manually alter the address of a perl reference. You can only create a reference to a variable that is visible from your current scope.





Create a reference by placing a "\" in front of the variable:





my $name = 'John';

my $age = 42;

my $name_ref = \$name;

my $age_ref = \$age;





Perl will stringify a reference so that you can print it and see what it is.





warn "age_ref is '$age_ref'";

> age_ref is 'SCALAR(0x812e6ec)' ...





This tells you that $age_ref is a reference to a SCALAR (which we know is called $age). It also tells you the address of the variable to which we are referring is 0x812e6ec.





You cannot referencify a string, i.e. you cannot give perl a string, such as "SCALAR(0x83938949)", and have perl give you a reference to whatever is at that address. Perl is pretty loosey-goosey about what it will let you do, but not even perl is so crazy as to give people complete access to the system memory.









You can dereference a reference by putting an extra sigil (of the appropriate type) in front of the reference variable.





my $name = 'John';

my $ref_to_name = \$name;

my $deref_name = $$ref_to_name;

warn $deref_name;

> John ...
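Dereferencing works for assignment as well as for reading. In this sketch, assigning through the reference changes the original variable, because both names lead to the same data. The extra variable names are only for illustration.

```perl
use strict;
use warnings;

my $name = 'John';
my $ref_to_name = \$name;

$$ref_to_name = 'Jane';          # assign through the reference

my $name_now = $name;            # 'Jane', the original changed
my $via_ref  = $$ref_to_name;    # 'Jane'
```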





References are interesting enough that they get their own section. But I introduce them here so that I can introduce a really cool module that uses references: Data::Dumper. Data::Dumper will take a reference to ANYTHING and print out the thing to which it refers in a human readable form.





This does not seem very impressive with a reference to a scalar:





my $name = 'John';

my $ref_to_name = \$name;

warn Dumper $ref_to_name;

> $VAR1 = \'John';





But this will be absolutely essential when working with Arrays and Hashes.

2.1.7 Filehandles

Scalars can store a filehandle. File IO gets its own section, but I introduce it here to give a complete picture of what scalars can hold.





Given a scalar that is undefined (uninitialized), calling open() on that scalar and a string filename will tell perl to open the file specified by the string, and store the handle to that file in the scalar.





open(my $fh, '>out.txt');

print $fh "hello world\n";

print $fh "this is simple file writing\n";

close($fh);





The scalar $fh in the example above holds the filehandle to "out.txt". Printing to the filehandle actually outputs the string to the file.
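A slightly more defensive sketch of the same idea follows. It uses the three-argument form of open(), checks every call with 'or die', and reads the line back to show that the write really happened. The scratch directory comes from the core File::Temp module; the file name and variable names are assumptions made for this example.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Work in a scratch directory so no real file is overwritten.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/out.txt";

# Open for writing; $! holds the system error message on failure.
open(my $out, '>', $file) or die "cannot open $file: $!";
print $out "hello world\n";
close($out) or die "cannot close $file: $!";

# Open the same file for reading and fetch the first line.
open(my $in, '<', $file) or die "cannot open $file: $!";
my $line = <$in>;
close($in);
chomp($line);                # $line is now 'hello world'
```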





There is some magic going on there that I have not explained, but that is a quick intro to scalar filehandles.









2.1.8 Scalar Review

Scalars can store STRINGS, NUMBERS (floats and ints), REFERENCES, and FILEHANDLES.





Stringify: to convert something to a string format





Numify: to convert something to a numeric format





The following scalars are interpreted as boolean FALSE:





integer 0, float 0.0, string "0", string "", undef





All other scalar values are interpreted as boolean TRUE.

2.2 Arrays

Arrays are preceded with an "at" sigil. The "@" is a stylized "a".





An array stores a bunch of scalars that are accessed via an integer index.





Perl arrays are ONE-DIMENSIONAL ONLY. (Do Not Panic.)





The first element of an array always starts at ZERO (0).





When you refer to an entire array, use the "@" sigil.





my @numbers = qw ( zero one two three );





When you index into the array, the "@" character changes to a "$" and the numeric index is placed in square brackets.





my @numbers = qw ( zero one two three );

my $string = $numbers[2];

warn $string;

> two ...













The length of an array is not pre-declared. Perl autovivifies whatever space it needs.





my @months;

$months[1]='January';

$months[5]='May';

# $months[0] and $months[2..4] are autovivified

# and initialized to undef

print Dumper \@months;

> $VAR1 = [

> undef, # index 0 is undefined

> 'January', # $months[1]

> ${\$VAR1->[0]}, # this is same as undef

> ${\$VAR1->[0]}, # undef

> ${\$VAR1->[0]}, # undef

> 'May' # $months[5]

> ];





If you want to see if you can blow your memory, try running this piece of code:





my @mem_hog;

$mem_hog[10000000000000000000000]=1;





# the array is filled with undefs

# except the last entry, which is initialized to 1





Arrays can store ANYTHING that can be stored in a scalar





my @junk_drawer = ( 'pliers', 1,1,1, '*', '//',

3.14, 9*11, 'yaba', 'daba' );





Negative indexes start from the end of the array and work backwards.





my @colors = qw ( red green blue );

my $last=$colors[-1];

warn "last is '$last'";





> last is 'blue' ...

2.2.1 scalar (@array)

To get how many elements are in the array, use "scalar"





my @phonetic = qw ( alpha bravo charlie delta );

my $quantity = scalar(@phonetic);

warn $quantity;





> 4 ...





When you assign an entire array into a scalar variable, you will get the same thing, but calling scalar() is much more clear.





my @phonetic = qw ( alpha bravo charlie );

my $quant = @phonetic;

warn $quant;





> 3 ...





This is explained later in the "list context" section.

2.2.2 push(@array, LIST)

Use push() to add elements onto the end of the array (the highest index). This will increase the length of the array by the number of items added.





my @groceries = qw ( milk bread );

push(@groceries, qw ( eggs bacon cheese ));

print Dumper \@groceries;





> $VAR1 = [

> 'milk',

> 'bread',

> 'eggs',

> 'bacon',

> 'cheese'

> ];

2.2.3 pop(@array)

Use pop() to get the last element off of the end of the array (the highest index). This will shorten the array by one. The return value of pop() is the value popped off of the array.





my @names = qw ( alice bob charlie );

my $last_name = pop(@names);

warn "popped = $last_name";

print Dumper \@names;





> popped = charlie ...

> $VAR1 = [

> 'alice',

> 'bob'

> ];









2.2.4 shift(@array)

Use shift() to remove one element from the beginning/bottom of an array (i.e. at index zero). All elements will be shifted DOWN one index. The array will be shortened by one.





The return value is the value removed from the array.





my @curses = qw ( fee fie foe fum );

my $start = shift(@curses);

warn $start;

warn Dumper \@curses;





> fee

> $VAR1 = [

> 'fie',

> 'foe',

> 'fum'

> ];

2.2.5 unshift( @array, LIST)

Use unshift() to add elements to the BEGINNING/BOTTOM of an array (i.e. at index ZERO). All the other elements in the array will be shifted up to make room. This will lengthen the array by the number of elements in LIST.





my @trees = qw ( pine maple oak );

unshift(@trees, 'birch');

warn Dumper \@trees;





> $VAR1 = [

> 'birch', # index 0

> 'pine', # old index 0, now 1

> 'maple', # 2

> 'oak' # 3

> ];
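Taken together, push/pop work on the high end of an array and shift/unshift work on the low end. The sketch below uses them as a stack and as a queue. The array names are only for illustration.

```perl
use strict;
use warnings;

my @stack;
push(@stack, 'a', 'b', 'c');
my $top = pop(@stack);         # 'c': last in, first out

my @queue;
push(@queue, 'a', 'b', 'c');
my $front = shift(@queue);     # 'a': first in, first out

unshift(@queue, $front);       # put it back at index 0
# @queue is ( 'a', 'b', 'c' ) again, @stack is ( 'a', 'b' )
```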









2.2.6 foreach (@array)

Use foreach to iterate through all the elements of a list. Its formal definition is:





LABEL foreach VAR (LIST) BLOCK





This is a control flow structure that is covered in more detail in the "control flow" section. The foreach structure supports last, next, and redo statements.





Use a simple foreach loop to do something to each element in an array:





my @fruits = qw ( apples oranges lemons pears );

foreach my $fruit (@fruits) {

print "fruit is '$fruit'\n";

}





> fruit is 'apples'

> fruit is 'oranges'

> fruit is 'lemons'

> fruit is 'pears'





DO NOT ADD OR DELETE ELEMENTS TO AN ARRAY BEING PROCESSED IN A FOREACH LOOP.





my @numbers = qw (zero one two three);

foreach my $num (@numbers) {

shift(@numbers) if($num eq 'one');

print "num is '$num'\n";

}





> num is 'zero'

> num is 'one'

> num is 'three'

# note: I deleted 'zero', but I failed to

# print out 'two', which is still part of array.

# BAD!!
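One way around the problem, sketched here, is to build the list you want BEFORE looping, so the loop never changes the array it is walking. This uses the built-in grep() function, which returns the elements for which its block is true. The @kept and @seen arrays are made up for the example.

```perl
use strict;
use warnings;

my @numbers = qw( zero one two three );

# Build a new list instead of shifting inside the loop.
my @kept = grep { $_ ne 'zero' } @numbers;

my @seen;
foreach my $num (@kept) {
    push(@seen, $num);     # visits every remaining element
}
# @seen is ( 'one', 'two', 'three' )
```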









VAR acts as an alias to the element of the array itself. Changes to VAR propagate to changing the array.





my @integers = ( 23, 142, 9384, 83948 );

foreach my $num (@integers) {

$num+=100;

}

print Dumper \@integers;





> $VAR1 = [

> 123,

> 242,

> 9484,

> 84048

> ];

2.2.7 sort(@array)

Use sort() to sort an array alphabetically. The return value is the sorted version of the array. The array passed in is left untouched.





my @fruit = qw ( pears apples bananas oranges );

my @sorted_array = sort(@fruit);

print Dumper \@sorted_array ;





> $VAR1 = [

> 'apples',

> 'bananas',

> 'oranges',

> 'pears'

> ];





Sorting a list of numbers will sort them alphabetically as well, which probably is not what you want.





my @scores = ( 1000, 13, 27, 200, 76, 150 );

my @sorted_array = sort(@scores);

print Dumper \@sorted_array ;





> $VAR1 = [

> 1000, # 1's

> 13, # 1's

> 150, # 1's

> 200,

> 27,

> 76

> ];









The sort() function can also take a code block ( any piece of code between curly braces ) which defines how to perform the sort if given any two elements from the array. The code block uses two global variables, $a and $b, and defines how to compare the two entries.





This is how you would sort an array numerically.





my @scores = ( 1000, 13, 27, 200, 76, 150 );

my @sorted_array = sort {$a<=>$b} (@scores);

print Dumper \@sorted_array ;





> $VAR1 = [

> 13,

> 27,

> 76,

> 150,

> 200,

> 1000

> ];
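The code block can hold any comparison you like. Two common variations are sketched here: swapping $a and $b to sort in descending order, and lowercasing both sides so that capital letters do not sort ahead of lowercase ones. The sample data is made up.

```perl
use strict;
use warnings;

my @scores = ( 1000, 13, 27, 200, 76, 150 );

# Swap $a and $b to reverse the direction of the sort.
my @descending = sort { $b <=> $a } @scores;
# ( 1000, 200, 150, 76, 27, 13 )

my @words = qw( pear Apple banana );

# Compare lowercased copies for a case-insensitive string sort.
my @by_name = sort { lc($a) cmp lc($b) } @words;
# ( 'Apple', 'banana', 'pear' )
```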

2.2.8 reverse(@array)

The reverse() function takes a list and returns a list in reverse order. The last element becomes the first element. The first element becomes the last element.





my @numbers = reverse (1000,13,27,200,76,150);

print Dumper \@numbers ;

> $VAR1 = [

> 150,

> 76,

> 200,

> 27,

> 13,

> 1000

> ];





2.2.9 splice(@array)

Use splice() to add or remove elements into or out of any index range of an array.





splice ( ARRAY , OFFSET , LENGTH , LIST );





The elements in ARRAY starting at OFFSET and going for LENGTH indexes will be removed from ARRAY. Any elements from LIST will be inserted at OFFSET into ARRAY.





my @words = qw ( hello there );

splice(@words, 1, 0, 'out');

warn join(" ", @words);





> hello out there ...
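The example above only inserts. The sketch below shows the other two uses: removing elements (splice returns what it removed) and replacing elements in place. The sample words are made up.

```perl
use strict;
use warnings;

my @words = qw( hello out there world );

# Remove 2 elements starting at index 1; insert nothing.
my @removed = splice(@words, 1, 2);
# @removed is ( 'out', 'there' ); @words is ( 'hello', 'world' )

# Replace 1 element at index 1 with two new ones.
splice(@words, 1, 1, 'big', 'planet');
# @words is ( 'hello', 'big', 'planet' )
```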

2.2.10 Undefined and Uninitialized Arrays

An array is initialized as having no entries. Therefore you can test to see if an array is initialized by calling scalar() on it. This is equivalent to calling defined() on a scalar variable. If scalar() returns false (i.e. integer 0), then the array is uninitialized.





If you want to uninitialize an array that contains data, then you do NOT want to assign it undef like you would a scalar. This would fill the array with one element at index zero with a value of undefined.





my @array = undef; # WRONG





To clear an array to its original uninitialized state, assign an empty list to it. This will clear out any entries, and leave you with a completely empty array.





my @array = (); # RIGHT
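You can check the difference between the two assignments by looking at the size of the array afterwards. A small sketch, with array names chosen for illustration:

```perl
use strict;
use warnings;

my @wrong = (1, 2, 3);
@wrong = undef;                      # one element, whose value is undef
my $wrong_size = scalar(@wrong);     # 1

my @right = (1, 2, 3);
@right = ();                         # no elements at all
my $right_size = scalar(@right);     # 0
```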

2.3 Hashes

Hashes are preceded with a percent sign sigil.





The "%" is a stylized "key/value" pair.





A hash stores a bunch of scalars that are accessed via a string index called a "key"





Perl hashes are ONE-DIMENSIONAL ONLY. (Do Not Panic.)





There is no order to the elements in a hash. (Well, there is, but you should not use a hash with an assumption about what order the data will come out.)





You can assign any even number of scalars to a hash. Perl will extract them in pairs. The first item will be treated as the key, and the second item will be treated as the value.





When you refer to an entire hash, use the "%" sigil.





my %info = qw ( name John age 42 );





When you look up a key in the hash, the "%" character changes to a "$" and the key is placed in curly braces.





my %info = qw ( name John age 42 );

my $data = $info{name};

warn $data;





> John ...





The keys of a hash are not pre-declared. If the key does not exist during an ASSIGNMENT, the key is created and given the assigned value.





my %inventory;

$inventory{apples}=42;

$inventory{pears}=17;

$inventory{bananas}=5;

print Dumper \%inventory;





> $VAR1 = {

> 'bananas' => 5,

> 'apples' => 42,

> 'pears' => 17

> };





If the key does not exist during a FETCH, the key is NOT created, and undef is returned.





my %inventory;

$inventory{apples}=42;

my $peaches = $inventory{peaches};

warn "peaches is '$peaches'";

print Dumper \%inventory;





> Use of uninitialized value in concatenation

> peaches is '' at ./test.pl line 13.

> $VAR1 = {

> 'apples' => 42

> };





2.3.1 exists ( $hash{$key} )

Use exists() to see if a key exists in a hash. You cannot simply test the value of a key, since a key might exist but store a value of FALSE.





my %pets = ( cats=>2, dogs=>1 );

unless(exists($pets{fish})) {

print "No fish here\n";

}





Warning: during multi-key lookup, all the lower level keys are autovivified, and only the last key has exists() tested on it. This only happens if you have a hash of hash references. References are covered later, but this is a "feature" specific to exists() that can lead to very subtle bugs. Note in the following example, we explicitly create the key "Florida", but we only test for the existence of {Maine}->{StateBird}, which has the side effect of creating the key {Maine} in the hash.





my %stateinfo;

$stateinfo{Florida}->{Abbreviation}='FL';

if (exists($stateinfo{Maine}->{StateBird})) {

warn "it exists";

}

print Dumper \%stateinfo;





> $VAR1 = {

> 'Florida' => {

> 'Abbreviation' => 'FL'

> },

> 'Maine' => {}

> };





You must test each level of key individually, and build your way up to the final key lookup if you do not want to autovivify the lower level keys.





my %stateinfo;

$stateinfo{Florida}->{Abbreviation}='FL';

if (exists($stateinfo{Maine})) {

if (exists($stateinfo{Maine}->{StateBird}))

{ warn "it exists"; }

}

print Dumper \%stateinfo;





> $VAR1 = {

> 'Florida' => {

> 'Abbreviation' => 'FL'

> }

> };
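The nested if statements can also be written as a single expression with '&&'. Because '&&' stops at the first false operand, the second exists() is never evaluated when the first one fails, so nothing is autovivified. This is a sketch; the result variables are made up.

```perl
use strict;
use warnings;

my %stateinfo;
$stateinfo{Florida}->{Abbreviation} = 'FL';

# Test both levels in one expression. The && stops at the first
# false test, so {Maine} is never looked into and never created.
my $has_bird = ( exists($stateinfo{Maine})
              && exists($stateinfo{Maine}->{StateBird}) ) ? 1 : 0;

my $maine_created = exists($stateinfo{Maine}) ? 1 : 0;    # 0

my $has_abbrev = ( exists($stateinfo{Florida})
                && exists($stateinfo{Florida}->{Abbreviation}) ) ? 1 : 0;   # 1
```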





2.3.2 delete ( $hash{key} )

Use delete to delete a key/value pair from a hash. Once a key is created in a hash, assigning undef to it will keep the key in the hash and will only assign the value to undef.





The only way to remove a key/value pair from a hash is with delete().





my %pets = (

fish=>3,

cats=>2,

dogs=>1,

);

$pets{cats}=undef;

delete($pets{fish});

print Dumper \%pets;





> $VAR1 = {

> 'cats' => undef,

> 'dogs' => 1

> };

2.3.3 keys( %hash )

Use keys() to return a list of all the keys in a hash. The order of the keys will be based on the internal hashing algorithm used, and should not be something your program depends upon. Note in the example below that the order of assignment is different from the order printed out.





my %pets = (

fish=>3,

cats=>2,

dogs=>1,

);

foreach my $pet (keys(%pets)) {

print "pet is '$pet'\n";

}





> pet is 'cats'

> pet is 'dogs'

> pet is 'fish'





If the hash is very large, then you may wish to use the each() function described below.
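If your program does need the keys in a predictable order, sort them. This sketch sorts once by key name and once by the value stored under each key. The array names are only for illustration.

```perl
use strict;
use warnings;

my %pets = ( fish => 3, cats => 2, dogs => 1 );

# Alphabetical by key.
my @by_name = sort( keys(%pets) );
# ( 'cats', 'dogs', 'fish' )

# Numeric by the value each key holds.
my @by_qty = sort { $pets{$a} <=> $pets{$b} } keys(%pets);
# ( 'dogs', 'cats', 'fish' )
```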









2.3.4 values( %hash )

Use values() to return a list of all the values in a hash. The order of the values will match the order of the keys returned by keys().





my %pets = (

fish=>3,

cats=>2,

dogs=>1,

);

my @pet_keys = keys(%pets);

my @pet_vals = values(%pets);

print Dumper \@pet_keys;

print Dumper \@pet_vals;





> $VAR1 = [

> 'cats',

> 'dogs',

> 'fish'

> ];

> $VAR1 = [

> 2,

> 1,

> 3

> ];





If the hash is very large, then you may wish to use the each() function described below.

2.3.5 each( %hash )

Use each() to iterate through each key/value pair in a hash, one at a time.





my %pets = (

fish=>3,

cats=>2,

dogs=>1,

);

while(my($pet,$qty)=each(%pets)) {

print "pet='$pet', qty='$qty'\n";

}





> pet='cats', qty='2'

> pet='dogs', qty='1'

> pet='fish', qty='3'





Every call to each() returns the next key/value pair in the hash. After the last key/value pair is returned, the next call to each() will return an empty list, which is boolean false. This is how the while loop is able to loop through each key/value and then exit when done.





Every hash has one "each iterator" attached to it. This iterator is used by perl to remember where it is in the hash for the next call to each().





Calling keys() on the hash will reset the iterator. The list returned by keys() can be discarded.





keys(%hash);





Do not add keys while iterating a hash with each().





You can safely delete the key most recently returned by each() while iterating a hash.





The each() function does not have to be used inside a while loop. This example uses a subroutine to call each() once and print out the result. The subroutine is called multiple times without using a while() loop.





my %pets = (

fish=>3,

cats=>2,

dogs=>1,

);





sub one_time {

my($pet,$qty)=each(%pets);

# if key is not defined,

# then each() must have hit end of hash

if(defined($pet)) {

print "pet='$pet', qty='$qty'\n";

} else {

print "end of hash\n";

}

}





one_time; # cats

one_time; # dogs

keys(%pets); # reset the hash iterator

one_time; # cats

one_time; # dogs

one_time; # fish

one_time; # end of hash

one_time; # cats

one_time; # dogs









> pet='cats', qty='2'

> pet='dogs', qty='1'

> pet='cats', qty='2'

> pet='dogs', qty='1'

> pet='fish', qty='3'

> end of hash

> pet='cats', qty='2'

> pet='dogs', qty='1'





There is only one iterator variable connected with each hash, which means calling each() on a hash in a loop that then calls each() on the same hash in another loop will cause problems. The example below goes through the %pets hash and attempts to compare the quantity of different pets and print out their comparison.





my %pets = (

fish=>3,

cats=>2,

dogs=>1,

);

while(my($orig_pet,$orig_qty)=each(%pets)) {

while(my($cmp_pet,$cmp_qty)=each(%pets)) {

if($orig_qty>$cmp_qty) {

print "there are more $orig_pet "

."than $cmp_pet\n";

} else {

print "there are less $orig_pet "

."than $cmp_pet\n";

}

}

}





> there are more cats than dogs

> there are less cats than fish

> there are more cats than dogs

> there are less cats than fish

> there are more cats than dogs

> there are less cats than fish

> there are more cats than dogs

> there are less cats than fish

> ...





The outside loop calls each() and gets "cats". The inside loop calls each() and gets "dogs". The inside loop continues, calls each() again, and gets "fish". The inside loop calls each() one more time and gets an empty list. The inside loop exits. The outside loop calls each() which continues where the inside loop left off, namely at the end of the list, and returns "cats". The code then enters the inside loop, and the process repeats itself indefinitely.





One solution for this each() limitation is shown below. The inner loop continues to call each() until it gets the key that matches the outer loop key. The inner loop must skip the end of the hash (an undefined key) and continue the inner loop. This also fixes a problem in the above example in that we probably do not want to compare a key to itself.





my %pets = (

fish=>3,

cats=>2,

dogs=>1,

);

while(my($orig_pet,$orig_qty)=each(%pets)) {

while(1) {

my($cmp_pet,$cmp_qty)=each(%pets);

next unless(defined($cmp_pet));

last if($cmp_pet eq $orig_pet);

if($orig_qty>$cmp_qty) {

print "there are more $orig_pet "

."than $cmp_pet\n";

} else {

print "there are less $orig_pet "

."than $cmp_pet\n";

}

}

}





> there are more cats than dogs

> there are less cats than fish

> there are less dogs than fish

> there are less dogs than cats

> there are more fish than cats

> there are more fish than dogs





If you do not know the outer loop key, either because it's in someone else's code and they do not pass it to you, or some similar problem, then the only other solution is to call keys on the hash for all inner loops, store the keys in an array, and loop through the array of keys using foreach. The inner loop will then not rely on the internal hash iterator value.
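A sketch of the keys-based approach applied to the %pets comparison is shown below. Both loops walk their own list of keys, so neither one touches the hash's each() iterator. The sort calls and the @report array are additions for this example; sorting is there only to make the order repeatable.

```perl
use strict;
use warnings;

my %pets = ( fish => 3, cats => 2, dogs => 1 );
my @report;

# Each foreach gets its own list of keys, so the loops
# cannot interfere with one another.
foreach my $orig_pet (sort keys(%pets)) {
    foreach my $cmp_pet (sort keys(%pets)) {
        next if ($cmp_pet eq $orig_pet);    # skip comparing a key to itself
        my $word = ($pets{$orig_pet} > $pets{$cmp_pet}) ? 'more' : 'less';
        push(@report, "there are $word $orig_pet than $cmp_pet");
    }
}
# @report now holds six lines, starting with
# 'there are more cats than dogs'
```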





2.4 List Context

List context is a concept built into the grammar of perl. You cannot declare a "list context" in perl the way you might declare an @array or %hash. List context affects how perl executes your source code. Here is an example.





my @cart1=qw( milk bread butter);

my @cart2=qw( eggs bacon juice );

my @checkout_counter = ( @cart1, @cart2 );

print Dumper \@checkout_counter;





> $VAR1 = [

> 'milk',

> 'bread',

> 'butter',

> 'eggs',

> 'bacon',

> 'juice'

> ];





Basically, two people with grocery carts, @cart1 and @cart2, pulled up to the @checkout_counter and unloaded their carts without putting one of those separator bars in between them. The person behind the @checkout_counter has no idea whose groceries are whose.





Everything in list context gets reduced to an ordered series of scalars. The original container that held the scalars is forgotten.





In the above example the order of scalars is retained: milk, bread, butter is the order of scalars in @cart1 and the order of the scalars at the beginning of @checkout_counter. However, looking at just @checkout_counter, there is no way to know where the contents of @cart1 end and the contents of @cart2 begin. In fact, @cart1 might have been empty, and all the contents of @checkout_counter could belong to @cart2, but there is no way to know.





Sometimes, list context can be extremely handy. We have used list context repeatedly to initialize arrays and hashes and it worked as we would intuitively expect:





my %pets = ( fish=>3, cats=>2, dogs=>1 );

my @cart1 = qw( milk bread eggs);





The initial values for the hash get converted into an ordered list of scalars





( 'fish', 3, 'cats', 2, 'dogs', 1 )





These scalars are then used in list context to initialize the hash, using the first scalar as a key and the following scalar as its value, and so on throughout the list.





List context applies anytime data is passed around in perl. Scalars, arrays, and hashes are all affected by list context. In the example below, @house is intended to contain a list of all the items in the house. However, because the %pets hash was reduced to scalars in list context, the values 3,2,1 are disassociated from their keys. The @house variable is not very useful.





my %pets = ( fish=>3, cats=>2, dogs=>1 );

my @refrigerator=qw(milk bread eggs);

my @house=('couch',%pets,@refrigerator,'chair');

print Dumper \@house;

> $VAR1 = [

> 'couch',

> 'cats',

> 2,

> 'dogs',

> 1,

> 'fish',

> 3,

> 'milk',

> 'bread',

> 'eggs',

> 'chair'

> ];





There are times when list context on a hash does make sense.





my %encrypt=(tank=>'turtle',bomber=>'eagle');

my %decrypt=reverse(%encrypt) ;

print Dumper \%decrypt;

> $VAR1 = {

> 'eagle' => 'bomber',

> 'turtle' => 'tank'

> };





The %encrypt hash is a lookup table that encrypts plaintext into ciphertext. Anytime you want to use the word "bomber", you actually send the word "eagle". The decryption is the opposite. Anytime you receive the word "eagle" you need to translate that to the word "bomber".





Using the %encrypt hash to perform decryption would require a loop that called each() on the %encrypt hash, looping until it found the value that matched the word received over the radio. This could take too long.





Instead, because there is no overlap between keys and values, (two different words don't encrypt to the same word), we can simply treat the %encrypt hash as a list, call the array reverse() function on it, which flips the list around from end to end, and then store that reversed list into a %decrypt hash.
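The reversal is only safe when every value is unique. A minimal sketch of one way to check this (the $is_safe variable is an illustrative name): if two keys shared a value, one pair would be lost, so the reversed hash would have fewer keys than the original.

```perl
use strict;
use warnings;

my %encrypt = ( tank => 'turtle', bomber => 'eagle' );
my %decrypt = reverse %encrypt;

# keys() in scalar context returns the number of keys.
# A mismatch would mean two plaintext words shared a code word.
my $is_safe = ( scalar( keys %decrypt ) == scalar( keys %encrypt ) ) ? 1 : 0;

my $round_trip = $decrypt{ $encrypt{bomber} };
print "safe to reverse: $is_safe\n";
print "round trip gives '$round_trip'\n";
```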





2.5 References

A reference is a thing that refers (points) to something else.





The "something else" is called the "referent", the thing being pointed to.





Taking a reference and using it to access the referent is called "dereferencing".





A good real-world example is a driver's license. Your license "points" to where you live because it lists your home address. Your license is a "reference". The "referent" is your home. And if you have forgotten where you live, you can take your license and "dereference" it to get yourself home.





It is possible that you have roommates, which would mean multiple references exist to point to the same home. But there can only be one home per address.





In perl, references are stored in scalars. You can create a reference by creating some data (scalar, array, hash) and putting a "\" in front of it.





my %home= (

fish=>3,cats=>2,dogs=>1,

milk=>1,bread=>2,eggs=>12,

);

my $license_for_alice = \%home;

my $license_for_bob = \%home;





Alice and Bob are roommates and their licenses are references to the same %home. This means that Alice could bring in a bunch of new pets and Bob could eat the bread out of the refrigerator even though Alice might have been the one to put it there. To do this, Alice and Bob need to dereference their licenses and get into the original %home hash.





${$license_for_alice}{dogs} += 5;

delete(${$license_for_bob}{milk});

print Dumper \%home;

> $VAR1 = {

> 'eggs' => 12,

> 'cats' => 2,

> 'bread' => 2,

> 'dogs' => 6,

> 'fish' => 3

> };





2.5.1 Named Referents

A referent is any original data structure: a scalar, array, or hash. Below, we declare some named referents: age, colors, and pets.





my $age = 42;

my @colors = qw( red green blue );

my %pets=(fish=>3,cats=>2,dogs=>1);

2.5.2 References to Named Referents

A reference points to the referent. To take a reference to a named referent, put a "\" in front of the named referent.





my $ref_to_age = \$age;

my $r_2_colors = \@colors;

my $r_pets = \%pets;

2.5.3 Dereferencing

To dereference, place the reference in curly braces and prefix it with the sigil of the appropriate type. This will give access to the entire original referent.





${$ref_to_age}++; # happy birthday

pop(@{$r_2_colors});

my %copy_of_pets = %{$r_pets};

print "age is '$age'\n";

> age is '43'





If there is no ambiguity in dereferencing, the curly braces are not needed.





$$ref_to_age ++; # another birthday

print "age is '$age'\n";

> age is '44'









It is also possible to dereference into an array or hash with a specific index or key.





my @colors = qw( red green blue );

my %pets=(fish=>3,cats=>2,dogs=>1);

my $r_colors = \@colors; my $r_pets = \%pets;

${$r_pets}{dogs} += 5;

${$r_colors}[1] = 'yellow';

print Dumper \@colors; print Dumper \%pets;

> $VAR1 = [

> 'red',

> 'yellow', # green turned to yellow

> 'blue'

> ];

> $VAR1 = {

> 'cats' => 2,

> 'dogs' => 6, # 5 new dogs

> 'fish' => 3

> };





Because array and hash referents are so common, perl has a shorthand notation for indexing into an array or looking up a key in a hash using a reference. Take the reference, follow it by "->", and then follow that by either "[index]" or "{key}". This:





${$r_pets}{dogs} += 5;

${$r_colors}[1] = 'yellow';





is exactly the same as this:





$r_pets->{dogs} += 5;

$r_colors->[1] = 'yellow';

2.5.4 Anonymous Referents

Here are some referents named age, colors, and pets. Each named referent has a reference to it as well.





my $age = 42;

my @colors = qw( red green blue );

my %pets=(fish=>3,cats=>2,dogs=>1);

my $r_age = \$age;

my $r_colors = \@colors;

my $r_pets = \%pets;





It is also possible in perl to create an ANONYMOUS REFERENT. An anonymous referent has no name for the underlying data structure and can only be accessed through the reference.





To create an anonymous array referent, put the contents of the array in square brackets.









The square brackets will create the underlying array with no name, and return a reference to that unnamed array.





my $colors_ref = [ 'red', 'green', 'blue' ];

print Dumper $colors_ref;

> $VAR1 = [

> 'red',

> 'green',

> 'blue'

> ];





To create an anonymous hash referent, put the contents of the hash in curly braces. The curly braces will create the underlying hash with no name, and return a reference to that unnamed hash.





my $pets_ref = { fish=>3,cats=>2,dogs=>1 };

print Dumper $pets_ref;

> $VAR1 = {

> 'cats' => 2,

> 'dogs' => 1,

> 'fish' => 3

> };





Note that $colors_ref is a reference to an array, but that array has no name to directly access its data. You must use $colors_ref to access the data in the array. Likewise, $pets_ref is a reference to a hash, but that hash has no name to directly access its data. You must use $pets_ref to access the data in the hash.
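A short sketch of reaching anonymous data through its reference, using the same $colors_ref and $pets_ref as above. The $first_color, $color_count, and @pet_names variables are illustrative names added for this sketch.

```perl
use strict;
use warnings;

my $colors_ref = [ 'red', 'green', 'blue' ];
my $pets_ref   = { fish => 3, cats => 2, dogs => 1 };

my $first_color = $colors_ref->[0];           # a single element
my $color_count = scalar @{$colors_ref};      # the whole array, counted
my @pet_names   = sort keys %{$pets_ref};     # the whole hash

print "first color is '$first_color'\n";
print "there are $color_count colors\n";
print "pets: @pet_names\n";
```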

2.5.5 Complex Data Structures

Arrays and hashes can only store scalar values. But because scalars can hold references, complex data structures are now possible. Using references is one way to avoid the problems associated with list context. Here is another look at the house example, but now using references.





my %pets = ( fish=>3, cats=>2, dogs=>1 );

my @refrigerator=qw(milk bread eggs);

my $house={

pets=>\%pets,

refrigerator=>\@refrigerator

};

print Dumper $house;

> $VAR1 = {

> 'pets' => {

> 'cats' => 2,

> 'dogs' => 1,

> 'fish' => 3

> },

> 'refrigerator' => [

> 'milk',

> 'bread',

> 'eggs'

> ]

> };

The $house variable is a reference to an anonymous hash, which contains two keys, "pets" and "refrigerator". These keys are associated with values that are references as well, one a hash reference and the other an array reference.





Dereferencing a complex data structure can be done with the arrow notation or by enclosing the reference in curly braces and prefixing it with the appropriate sigil.





# Alice added more canines

$house->{pets}->{dogs}+=5;

# Bob drank all the milk

shift(@{$house->{refrigerator}});
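As a further sketch, the same two notations can be used to walk a whole nested structure rather than a single element. The $pet_total and $food_count variables are illustrative names; the $house structure here is built from anonymous referents so the block stands alone.

```perl
use strict;
use warnings;

my $house = {
    pets         => { fish => 3, cats => 2, dogs => 1 },
    refrigerator => [ 'milk', 'bread', 'eggs' ],
};

# %{ ... } dereferences the inner hash so keys() can walk it.
my $pet_total = 0;
foreach my $kind ( keys %{ $house->{pets} } ) {
    $pet_total += $house->{pets}->{$kind};
}

# @{ ... } dereferences the inner array; scalar() counts it.
my $food_count = scalar @{ $house->{refrigerator} };

print "pets: $pet_total, food items: $food_count\n";
```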





2.5.5.1 Autovivification

Perl autovivifies any structure needed when assigning or fetching from a reference. The autovivified referents are anonymous. Perl will assume you know what you are doing with your structures. In the example below, we start out with an undefined scalar called $scal. We then fetch from this undefined scalar, as if it were a reference to an array of a hash of an array of a hash of an array. Perl autovivifies everything under the assumption that that is what you wanted to do.





my $scal;

my $val =

$scal->[2]->{somekey}->[1]->{otherkey}->[1];

print Dumper $scal;

> $VAR1 = [

> undef,

> ${\$VAR1->[0]},

> {

> 'somekey' => [

> ${\$VAR1->[0]},

> {

> 'otherkey' => []

> }

> ]

> }

> ];





If this is NOT what you want to do, check for the existence of each hash key and check that the array contains at least enough array entries to handle the given index.
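One way to do those checks is sketched below. The safe_fetch subroutine is a hypothetical helper written for this illustration, not a perl built-in: it walks a list of indexes and keys one step at a time and returns undef as soon as a step is missing, so nothing is autovivified.

```perl
use strict;
use warnings;

# Hypothetical helper: follow @path through nested arrays and hashes
# without creating anything along the way.
sub safe_fetch {
    my ( $node, @path ) = @_;
    foreach my $step (@path) {
        if ( ref($node) eq 'ARRAY' ) {
            # the index must be a non-negative integer inside the array
            return unless $step =~ /^\d+$/ && $step < @{$node};
            $node = $node->[$step];
        }
        elsif ( ref($node) eq 'HASH' ) {
            return unless exists $node->{$step};
            $node = $node->{$step};
        }
        else {
            return;    # not a reference we can walk into
        }
    }
    return $node;
}

my $scal;
my $val = safe_fetch( $scal, 2, 'somekey', 1, 'otherkey', 1 );

my $data  = [ 0, 1, { somekey => [ 0, { otherkey => [ 'a', 'b' ] } ] } ];
my $found = safe_fetch( $data, 2, 'somekey', 1, 'otherkey', 1 );

print "scal is still undefined\n" unless defined $scal;
print "found '$found'\n";
```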

2.5.5.2 Multidimensional Arrays

Perl implements multidimensional arrays using one-dimensional arrays and references.





my $mda;

for(my $i=0;$i<2;$i++){

for(my $j=0;$j<2;$j++) {

for(my $k=0;$k<2;$k++){

$mda->[$i]->[$j]->[$k] =

"row=$i, col=$j, depth=$k";

}

}

}

print Dumper $mda;

> $VAR1 = [

> [

> [

> 'row=0, col=0, depth=0',

> 'row=0, col=0, depth=1'

> ],

> [

> 'row=0, col=1, depth=0',

> 'row=0, col=1, depth=1'

> ]

> ],

> [

> [

> 'row=1, col=0, depth=0',

> 'row=1, col=0, depth=1'

> ],

> [

> 'row=1, col=1, depth=0',

> 'row=1, col=1, depth=1'

> ]

> ]

> ];





2.5.5.3 Deep Cloning, Deep Copy

If you need to create an entirely separate but identical clone of a complex data structure, use the Storable.pm perl module. Storable comes standard with perl 5.8. If you don't have 5.8 installed, consider an upgrade. Otherwise, read the section about CPAN later in this document, download Storable from CPAN, and install.





Then use Storable in your perl code, indicating you want to import the 'nstore', 'dclone', and 'retrieve' subroutines. The 'use' statement is explained later in this document as well, for now, it isn't that important.





The 'dclone' subroutine takes a reference to any kind of data structure and returns a reference to a deep cloned version of that data structure.





use Storable qw(nstore dclone retrieve);

my $scal;

my $val = $scal->[2]->{somekey}->[1]->{otherkey}->[1];

# $twin is an identical clone of $scal

my $twin = dclone $scal;
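A small sketch showing that the clone really is separate: changes made through $twin do not show up in the original. The $house structure is an illustrative stand-in for any nested data.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my $house = {
    pets         => { dogs => 1 },
    refrigerator => ['milk'],
};

my $twin = dclone($house);

# Modify only the clone.
$twin->{pets}->{dogs} += 5;
push @{ $twin->{refrigerator} }, 'eggs';

print "original dogs: $house->{pets}->{dogs}\n";
print "twin dogs: $twin->{pets}->{dogs}\n";
```

A plain assignment such as "my $copy = $house;" would only copy the reference, and both names would then point at the same underlying hash.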

2.5.5.4 Data Persistence

The Storable.pm module also contains two subroutines for storing the contents of any perl data structure to a file and retrieving it later.





use Storable qw(nstore dclone retrieve);

my $scal;

my $val = $scal->[2]->{somekey}->[1]->{otherkey}->[1];

nstore ($scal, 'filename');

# exit, reboot computer, and restart script

my $revived = retrieve('filename');
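A self-contained sketch of a store and retrieve round trip. It assumes the standard File::Temp module is available and uses it only so that the example does not overwrite a real file; the $house data is illustrative.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

my $house = {
    pets         => { dogs => 1 },
    refrigerator => [ 'milk', 'bread' ],
};

# Create a scratch file that is removed when the program exits.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
close($fh);

nstore( $house, $filename );          # write the structure to disk
my $revived = retrieve($filename);    # read it back into a new structure

print "revived dogs: $revived->{pets}->{dogs}\n";
```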

2.5.6 Stringification of References

Perl will stringify a reference if you try to do anything string-like with it, such as print it.





my $referent = 42;

my $reference = \$referent;

warn "reference is '$reference'";

> reference is 'SCALAR(0x812e6ec)' ...





But perl will not allow you to create a string and attempt to turn it into a reference.





my $reference = 'SCALAR(0x812e6ec)';

my $value = $$reference;

> Can't use string ("SCALAR(0x812e6ec)") as

> a SCALAR ref while "strict refs" in use





Turning strict off only gives you undef.





no strict;

my $reference = 'SCALAR(0x812e6ec)';

my $value = $$reference;

warn "value not defined" unless(defined($value));

warn "value is '$value'\n";

> value not defined

> Use of uninitialized value in concatenation





Because a reference always stringifies to something like "SCALAR(0x812e6ec)", and is never undef, zero, or an empty string, it will evaluate true when treated as a boolean, even if the value to which it points is false.
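A minimal sketch of the difference between testing a reference and testing its referent. The variable names are illustrative.

```perl
use strict;
use warnings;

my $zero        = 0;
my $ref_to_zero = \$zero;

# The reference itself is true; the value it points to is false.
my $ref_is   = $ref_to_zero  ? 'true' : 'false';
my $value_is = $$ref_to_zero ? 'true' : 'false';

print "reference is $ref_is\n";
print "referent is $value_is\n";
```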

2.5.7 The ref() function

The ref() function takes a scalar and returns a string indicating what kind of referent the scalar is referencing. If the scalar is not a reference, ref() returns false (an empty string).





my $temp = \42;

my $string = ref($temp);

warn "string is '$string'";

> string is 'SCALAR'





Here we call ref() on several types of variable:





sub what_is_it {

my ($scalar)=@_;

my $string = ref($scalar);

print "string is '$string'\n";

}

what_is_it( \'hello' );

what_is_it( [1,2,3] );

what_is_it( {cats=>2} );

what_is_it( 42 );

> string is 'SCALAR'

> string is 'ARRAY'

> string is 'HASH'

> string is ''





Note that this is like stringification of a reference except without the address being part of the string. Instead of SCALAR(0x812e6ec), it's just SCALAR. Also note that if you stringify a non-reference, you get the scalar value. But if you call ref() on a non-reference, you get an empty string, which is always false.

3 Control Flow

Standard statements get executed in sequential order in perl.





my $name = 'John Smith';

my $greeting = "Hello, $name\n";

print $greeting;





Control flow statements allow you to alter the order of execution while the program is running.





if( $price == 0 ) {

print "Free Beer!\n";

}









Perl supports the following control flow structures:





# LABEL is an optional name that identifies the

# control flow structure.

# It is a bareword identifier followed by a colon.

# example==> MY_NAME:

# SINGLE_STATEMENT ==> a single perl statement

# NOT including the semicolon.

# example==> print "hello\n"

# BLOCK ==> zero or more statements contained

# in curly braces { print "hi"; }

LABEL BLOCK

LABEL BLOCK continue BLOCK

# BOOL ==> boolean (see boolean section above)

SINGLE_STATEMENT if (BOOL);

if (BOOL) BLOCK

if (BOOL) BLOCK else BLOCK

if (BOOL) BLOCK elsif (BOOL) BLOCK elsif ()...

if (BOOL) BLOCK elsif (BOOL) BLOCK ... else BLOCK

unless (BOOL) BLOCK

unless (BOOL) BLOCK else BLOCK

unless (BOOL) BLOCK elsif (BOOL) BLOCK elsif ()...

unless (BOOL) BLOCK elsif (BOOL) BLOCK ... else BLOCK

LABEL while (BOOL) BLOCK

LABEL while (BOOL) BLOCK continue BLOCK

LABEL until (BOOL) BLOCK

LABEL until (BOOL) BLOCK continue BLOCK

# INIT, TEST, CONT are all expressions

# INIT is an initialization expression

# INIT is evaluated once prior to loop entry

# TEST is a BOOLEAN expression that controls loop exit

# TEST is evaluated each time before

# BLOCK is executed

# CONT is a continuation expression

# CONT is evaluated each time after BLOCK is executed

LABEL for ( INIT; TEST; CONT ) BLOCK

# LIST is a list of scalars, see arrays and

# list context sections earlier in text

LABEL foreach (LIST) BLOCK

LABEL foreach VAR (LIST) BLOCK

LABEL foreach VAR (LIST) BLOCK continue BLOCK





3.1 Labels

Labels are always optional. A label is an identifier followed by a colon.





A label is used to give its associated control flow structure a name.





Inside a BLOCK of a control flow structure, you can call





next;

last;

redo;





If the structure has a LABEL, you can call





next LABEL;

last LABEL;

redo LABEL;





If no label is given to next, last, or redo, then the command will operate on the inner-most control structure. If a label is given, then the command will operate on the control structure given.
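A short sketch of a label in use. The OUTER label and the @found array are illustrative names. Without the label, next and last would act on the inner loop only.

```perl
use strict;
use warnings;

my @found;

OUTER: foreach my $i ( 1 .. 3 ) {
    foreach my $j ( 1 .. 3 ) {
        next OUTER if $j > $i;     # abandon this $i, move to the next one
        last OUTER if $i == 3;     # leave both loops entirely
        push @found, "$i$j";
    }
}

print "found: @found\n";
```

> found: 11 21 22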

3.2 last LABEL;

The last command goes to the end of the entire control structure. It does not execute any continue block if one exists.

3.3 next LABEL;

The next command skips the remaining BLOCK. If there is a continue block, execution resumes there. After the continue block finishes, or if no continue block exists, execution starts the next iteration of the control construct if it is a loop construct.

3.4 redo LABEL;

The redo command skips the remaining BLOCK. It does not execute any continue block (even if it exists). Execution then resumes at the start of the control structure without evaluating the conditional again.
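The interaction between next, last, and a continue block can be sketched as follows. The $i counter and @log array are illustrative names. The next command still runs the continue block, while last skips it.

```perl
use strict;
use warnings;

my $i = 0;
my @log;

while ( $i < 5 ) {
    next if $i % 2;      # odd: skip the rest, continue block still runs
    last if $i == 4;     # leave the loop, continue block is skipped
    push @log, "body $i";
}
continue {
    $i++;
}

print "log: @log\n";
print "i ended at $i\n";
```

> log: body 0 body 2

> i ended at 4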

4 Packages and Namespaces and Lexical Scoping

4.1 Package Declaration

Perl has a package declaration statement that looks like this:





package NAMESPACE;





This package declaration indicates that the rest of the enclosing block, subroutine, eval, or file belongs to the namespace given by NAMESPACE.





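The standard warnings and strictness are lexically scoped: once turned on with "use warnings;" and "use strict;", they stay in effect to the end of the enclosing block or file, even across package declarations. Data::Dumper is different: it exports its Dumper subroutine into the namespace that was current when it was "use"d. Anytime you declare a new package namespace, you will want to "use Data::Dumper" again, and repeating all three is harmless and a reasonable habit.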





package SomeOtherPackage;

use warnings; use strict; use Data::Dumper;





All perl scripts start with an implied declaration of:





package main;





You can access package variables with the appropriate sigil, followed by the package name, followed by a double colon, followed by the variable name. This is called a package QUALIFIED variable meaning the package name is explicitly stated.





$package_this::age;

@other_package::refrigerator;

%package_that::pets;





If you use an UNQUALIFIED variable in your code, perl assumes it is in the most recently declared package namespace.





When you have strict-ness turned on, there are two ways to create and use package variables:





1) Use the fully package qualified name everywhere in your code:





# can use variable without declaring it with 'my'

$some_package::answer=42;

warn "The value is '$some_package::answer'\n";

4.2 Declaring Package Variables With our

2) Use "our" to declare the variable.





package this_package;

our $name='John';

warn "name is '$name'";





Using "our" is the preferred method. You must have perl 5.6.0 or later for "our" declarations.





The difference between the two methods is that always using package qualified variable names means you do NOT have to declare the package you are in. You can create variables in ANY namespace you want, without ever having to declare the namespace explicitly. You can even declare variables in someone else's package namespace. There are no restrictions in perl that prevent you from doing this.





To encourage programmers to play nice with each other's namespaces, the "our" function was created. Declaring a variable with "our" will create the variable in the current namespace. If the namespace is other than "main", then you will need to declare the package namespace explicitly. However, once a package variable is declared with "our", the fully package qualified name is NOT required, and you can refer to the variable just on its variable name, as example (2) above refers to the $name package variable.





We do not HAVE to use the "our" shortcut even if we used it to declare it. The "our" declaration is a shorthand for declaring a package variable. Once the package variable exists, we can access it any way we wish.





package Hogs;

our $speak = 'oink';

warn "Hogs::speak is '$Hogs::speak'";

> Hogs::speak is 'oink' ...

4.3 Package Variables inside a Lexical Scope

When you declare a package inside a code block, that package namespace declaration remains in effect until the end of the block, at which time, the package namespace reverts to the previous namespace.





package Hogs;

our $speak = 'oink';

{ # START OF CODE BLOCK

package Heifers;

our $speak = 'moo';

} # END OF CODE BLOCK

warn "speak is '$speak'";

> speak is 'oink' ...













The Heifers namespace still exists, as do all the variables that were declared in that namespace. It's just that outside the code block, the "package Heifers;" declaration has worn off, and we now have to use a fully package qualified name to get to the variables in the Heifers package. This "wearing off" happens because the code block is a "lexical scope", and a package declaration only lasts to the end of the current lexical scope. The package variables declared inside the code block "survive" after the code block ends.





{

package Heifers;

our $speak = 'moo';

}

print "Heifers::speak is '$Heifers::speak'\n";

> Heifers::speak is 'moo'

4.4 Lexical Scope

Lexical refers to words or text. A lexical scope exists while execution takes place inside of a particular chunk of source code. In the above examples, the "package Heifers;" only exists inside the curly braces of the source code. Outside those curly braces, the package declaration has gone out of scope, which is a technical way of saying it's "worn off". Scope refers to vision, as in telescope. Within a lexical scope, things that have lexical limitations (such as a package declaration) are only "visible" inside that lexical space.





So "lexical scope" refers to anything that is visible or has an effect only within a certain boundary of the source text or source code. The easiest way to demonstrate lexical scoping is with lexical variables, and by showing how lexical variables differ from "our" variables.

4.5 Lexical Variables

Lexical variables are declared with the "my" keyword. Lexical variables declared inside a lexical scope do not survive outside the lexical scope.





no warnings;

no strict;

{

my $speak = 'moo';

}

warn "speak is '$speak'\n";

> speak is ''





The lexical variable "$speak" goes out of scope at the end of the code block (at the "}" character), so it does not exist when we try to print it out after the block. We had to turn warnings and strict off just to get it to compile because with warnings and strict on, perl will know $speak does not exist when you attempt to print it, so it will throw an exception and quit.









Lexically scoped variables have three main features:





1) Lexical variables do not belong to any package namespace, so you cannot prefix them with a package name. The example below shows that "my $cnt" is not the same as the "main::cnt":





no warnings;

package main;

my $cnt='I am just a lexical';

warn "main::cnt is '$main::cnt'";

> main::cnt is ''





2) Lexical variables are only directly accessible from the point where they are declared to the end of the nearest enclosing block, subroutine, eval, or file.





no strict;

{

my $some_lex = 'I am lex';

}

warn "some_lex is '$some_lex'";

> some_lex is ''





3) Lexical variables are subject to "garbage collection" at the end of scope. If nothing is using a lexical variable at the end of scope, perl will remove it from its memory. Every time a variable is declared with "my", it is created dynamically, during execution. The location of the variable will change each time. Note in the example below, we create a new $lex_var each time through the loop, and $lex_var is at a different address each time.





my @cupboard;

for (1 .. 5) {

my $lex_var ='canned goods';

my $lex_ref = \$lex_var;

push(@cupboard, $lex_ref);

print "$lex_ref\n";

}

> SCALAR(0x812e770)

> SCALAR(0x812e6c8)

> SCALAR(0x812e6e0)

> SCALAR(0x81624c8)

> SCALAR(0x814cf64)





Lexical variables are just plain good. They generally keep you from stepping on someone else's toes. They also keep your data more private than a package variable. Package variables are permanent, never go out of scope, never get garbage collected, and are accessible from anyone's script.

4.6 Garbage Collection

When a lexical variable goes out of scope, perl will check to see if anyone is using that variable, and if no one is using it, perl will delete that variable and free up memory.





The freed up memory is not returned to the system, rather the freed up memory is used for possible declarations of new lexically scoped variables that could be declared later in the program.





This means that your program will never get smaller because of lexical variables going out of scope. Once the memory is allocated for perl, it remains under perl's jurisdiction. But perl can use garbage collected space for other lexical variables.





If a lexical variable is a referent of another variable, then the lexical will not be garbage collected when it goes out of scope.





no strict;

my $referring_var;

{

my $some_lex = 'I am lex';

$referring_var=\$some_lex;

}

warn "some_lex is '$some_lex'";

warn "referring var refers to '$$referring_var'";

> some_lex is ''

> referring var refers to 'I am lex'





When the lexical $some_lex went out of scope, we could no longer access it directly. But since $referring_var is a reference to $some_lex, $some_lex was never garbage collected, and it retained its value of "I am lex". The data in $some_lex was still accessible through $referring_var.





Note that the named variable $some_lex went out of scope at the end of the code block and could not be accessed by name.

4.6.1 Reference Count Garbage Collection

Perl uses reference count based garbage collection. It is rudimentary reference counting, so circular references will not get collected even if nothing points to the circle. The example below shows two variables that refer to each other but nothing refers to the two variables. Perl will not garbage collect these variables even though they are completely inaccessible by the end of the code block.





{

my ($first,$last);

($first,$last)=(\$last,\$first);

}
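One common way around this, sketched here as an aside and not taken from the example above, is to weaken one of the references in the circle with the weaken() subroutine from the Scalar::Util module. A weak reference does not count toward its referent's reference count, so the circle can be collected at the end of the block. The $weak_flag variable is an illustrative name.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $weak_flag;

{
    my ( $first, $last );
    ( $first, $last ) = ( \$last, \$first );

    # The reference held in $first no longer keeps $last alive,
    # so the circle no longer props itself up.
    weaken($first);

    $weak_flag = isweak($first) ? 1 : 0;
}

print "reference was weak: $weak_flag\n";
```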

4.6.2 Garbage Collection and Subroutines

Garbage collection does not rely strictly on references to a variable to determine if it should be garbage collected. If a subroutine uses a lexical variable, then that variable will not be garbage collected as long as the subroutine exists.





Subroutines that use a lexical variable declared outside of the subroutine declaration are called "CLOSURES".





In the example below, the lexical variable, $cnt, is declared inside a code block and would normally get garbage collected at the end of the block. However, two subroutines are declared in that same code block that use $cnt, so $cnt is not garbage collected. Since $cnt goes out of scope, the only things that can access it after the code block are the subroutines. Note that a reference to $cnt is never taken, however perl knows that $cnt is needed by the subroutines and therefore keeps it around. The inc and dec subroutines are subroutine closures.





{

my $cnt=0;

sub inc{$cnt++; print "cnt is '$cnt'\n";}

sub dec{$cnt--; print "cnt is '$cnt'\n";}

}

inc;

inc;

inc;

dec;

dec;

inc;

> cnt is '1'

> cnt is '2'

> cnt is '3'

> cnt is '2'

> cnt is '1'

> cnt is '2'





Subroutine names are like names of package variables. The subroutine gets placed in the current declared package namespace. Therefore, named subroutines are like package variables in that, once declared, they never go out of scope or get garbage collected.
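Closures can also be built from anonymous subroutines (the "sub BLOCK" form, called through a code reference with "->()"). The sketch below uses a hypothetical make_counter subroutine: each call creates a fresh lexical $cnt, and the returned subroutine keeps that particular $cnt alive for as long as the code reference exists.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a new counter closure on every call.
sub make_counter {
    my $cnt = 0;
    return sub { $cnt++; return $cnt; };
}

my $counter_a = make_counter();
my $counter_b = make_counter();

$counter_a->();
$counter_a->();
my $a_total = $counter_a->();    # third call on the first counter
my $b_total = $counter_b->();    # first call on the second counter

print "a is '$a_total', b is '$b_total'\n";
```

> a is '3', b is '1'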

4.7 Package Variables Revisited

Package variables are not evil, they are just global variables, and they inherit all the possible problems associated with using global variables in your code. In the event you DO end up using a package variable in your code, they do have some advantages. They are global, which means they can be a convenient way for several different blocks of perl code to talk amongst themselves using an agreed upon global variable as their channel.





Imagine several subroutines across several files that all want to check a global variable: $Development::Verbose. If this variable is true, these subroutines print detailed information. If it is false, these subroutines print little or no information.





package Development;

our $Verbose=1;

sub Compile {

if ($Development::Verbose) {

print "compiling\n"; }

}

sub Link {

if ($Development::Verbose){

print "linking\n";

}

}

sub Run {

if ($Development::Verbose){

print "running\n";

}

}

Compile;

Link;

Run;

> compiling

> linking

> running





The three subroutines could be in different files, in different package namespaces, and they could all access the $Development::Verbose variable and act accordingly.

4.8 Calling local() on Package Variables

When working with global variables, there are times when you want to save the current value of the global variable, set it to a new and temporary value, execute some foreign code that will access this global, and then set the global back to what it was.





Continuing the previous example, say we wish to create a RunSilent subroutine that stores $Development::Verbose in a temp variable, calls the original Run routine, and then sets $Development::Verbose back to its original value.





package Development;

our $Verbose=1;

sub Compile {

if ($Development::Verbose) {

print "compiling\n";

}

}

sub Link {

if ($Development::Verbose){

print "linking\n";

}

}

sub Run {

if ($Development::Verbose){

print "running\n";

}

}

sub RunSilent {

my $temp = $Development::Verbose;

$Development::Verbose=0;

Run;

$Development::Verbose=$temp;

}

Compile;

Link;

RunSilent;

> compiling

> linking





This can also be accomplished with the "local()" function. The local function takes a package variable, saves off the original value, and allows you to assign a temporary value to it. That new value is seen by anyone accessing the variable. At the end of the lexical scope in which local() was called, the original value of the variable is restored. The RunSilent subroutine could be written like this:





sub RunSilent {

local($Development::Verbose)=0;

Run;

}

Perl originally started with nothing but package variables. The "my" lexical variables were not introduced until perl version 5. So to deal with all the package variables, perl was given the local() function. Local is also a good way to create a temporary variable and make sure you don't step on someone else's variable of the same name.
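The save and restore behavior of local() can be observed with a small sketch. Here Run is reduced to recording the value of $Development::Verbose that it sees on each call; the @seen array is an illustrative name added for this sketch.

```perl
use strict;
use warnings;

package Development;

our $Verbose = 1;
my @seen;    # records what Run observed on each call

sub Run { push @seen, $Development::Verbose; }

sub RunSilent {
    local $Development::Verbose = 0;    # temporary value
    Run();
}                                       # original value restored here

Run();
RunSilent();
Run();

print "Run saw: @seen\n";

package main;
```

> Run saw: 1 0 1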

5 Subroutines

Perl allows you to declare named subroutines and anonymous subroutines, similar to the way you can declare named variables and anonymous variables.

5.1 Subroutine Sigil

Subroutines use the ampersand ( & ) as their sigil. But while the sigils for scalars, arrays, and hashes are mandatory, the sigil for subroutines is optional.

5.2 Named Subroutines

Below is the named subroutine declaration syntax:





sub NAME BLOCK





NAME can be any valid perl identifier.





BLOCK is a code block enclosed in curly braces.





The NAME of the subroutine is placed in the current package namespace, in the same way "our" variables go into the current package namespace. So once a named subroutine is declared, you may access it with just NAME if you are in the correct package, or with a fully package qualified name if you are outside the package. And you can use the optional ampersand sigil in either case.





package MyArea;

sub Ping {print "ping\n";}

Ping;

&Ping;

MyArea::Ping;

&MyArea::Ping;

> ping

> ping

> ping

> ping









Once the current package declaration changes, you MUST use a fully package qualified subroutine name to call the subroutine.





package MyArea;

sub Ping {print "ping\n";}

package YourArea;

MyArea::Ping;

&MyArea::Ping;

&Ping; # error, looking in current package YourArea

> ping

> ping

> Undefined subroutine &YourArea::Ping

5.3 Anonymous Subroutines

Below is the anonymous subroutine declaration syntax:





sub BLOCK





This will return a code reference, similar to how [] returns an array reference, and similar to how {} returns a hash reference.





sub what_is_it {

my ($scalar)=@_;

my $string = ref($scalar);

print "ref returned '$string'\n";

}

my $temp = sub {print "Hello\n";};

what_is_it($temp);

> ref returned 'CODE'

5.4 Data::Dumper and subroutines

The contents of the code block are invisible to anything outside the code block. For this reason, things like Data::Dumper cannot look inside the code block and show you the actual code. Instead Data::Dumper does not even try and just gives you a place holder that returns a dummy string.





my $temp = sub {print "Hello\n";};

print Dumper $temp;

> $VAR1 = sub { "DUMMY" };

5.5 Passing Arguments to/from a Subroutine

Any values you want to pass to a subroutine get put in the parentheses at the subroutine call. For normal subroutines, all arguments go through the list context crushing machine and get reduced to a list of scalars. The original containers are not known inside the subroutine. The subroutine will not know if the list of scalars it receives came from scalars, arrays, or hashes. To avoid some of the list context crushing, a subroutine can be declared with a prototype, which is discussed later.

5.6 Accessing Arguments inside Subroutines via @_

Inside the subroutine, the arguments are accessed via a special array called @_. Since all the arguments passed in were flattened in list context, they fit nicely into an array. The @_ array can be processed just like any other regular array. If the arguments are fixed and known, the preferred way to extract them is to assign @_ to a list of scalars with meaningful names.





sub compare {

my ($left,$right)=@_;

return $left<=>$right;

}
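When the number of arguments is not fixed, @_ can be walked like any other array. The sketch below assumes a hypothetical subroutine named total that adds up however many numbers it is given.

```perl
use strict;
use warnings;

# total is a hypothetical example name; it accepts any number of arguments.
sub total {
    my $sum = 0;
    foreach my $number (@_) {
        $sum += $number;
    }
    return $sum;
}

my @values = (1, 2, 3, 4);

my $first  = total(@values);       # 1+2+3+4
my $second = total(5, @values);    # 5+1+2+3+4

print "first is $first\n";
print "second is $second\n";
```

> first is 10

> second is 15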





The @_ array is "magical" in that it is really a list of aliases for the original arguments passed in. Therefore, assigning a value to an element in @_ will change the value in the original variable that was passed into the subroutine call. Subroutine parameters are effectively IN/OUT.





sub swap { @_[0,1] = @_[1,0]; } # assign to the elements of @_, not the whole array

my $one = "I am one";

my $two = "I am two";

swap($one,$two);

warn "one is '$one'";

warn "two is '$two'";

> one is 'I am two'

> two is 'I am one'





Assigning to the entire @_ array does not work; you have to assign to the individual elements. If swap were defined like this, the variables $one and $two would remain unchanged.





sub swap {

my ($left,$right)=@_;

@_ = ($right,$left); #WRONG

}
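The difference between writing through an alias and working on a copy can be sketched with two hypothetical subroutines, double_in_place and double_copy (both names are invented for this example). Only the one that assigns to $_[0] changes the caller's variable.

```perl
use strict;
use warnings;

# double_in_place writes to $_[0], which is an alias for the caller's
# variable, so the caller sees the change.
sub double_in_place {
    $_[0] = $_[0] * 2;
    return;
}

# double_copy works on a private copy made by the list assignment,
# so the caller's variable is left untouched.
sub double_copy {
    my ($value) = @_;
    $value = $value * 2;
    return $value;
}

my $number = 21;

double_copy($number);
my $after_copy = $number;
print "after double_copy: $after_copy\n";

double_in_place($number);
my $after_alias = $number;
print "after double_in_place: $after_alias\n";
```

> after double_copy: 21

> after double_in_place: 42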

5.7 Dereferencing Code References

Dereferencing a code reference causes the subroutine to be called. A code reference can be dereferenced by preceding it with an ampersand sigil or by using the arrow operator and parentheses "->()". The preferred way is to use the arrow operator with parens.





my $temp = sub {print "Hello\n";};

&{$temp};

&$temp;

$temp->(); # preferred

> Hello

> Hello

> Hello

5.8 Implied Arguments

When calling a subroutine with the "&" sigil prefix and no parentheses, the current @_ array gets implicitly passed to the subroutine being called. This can cause subtly odd behavior if you are not expecting it.





sub second_level {

print Dumper \@_;

}





sub first_level {

# using '&' sigil and no parens.

# doesn't look like I'm passing any params

# but perl will pass @_ implicitly.

&second_level;

}





first_level(1,2,3);

> $VAR1 = [

> 1,

> 2,

> 3

> ];

This generally is not a problem with named subroutines because you probably will not use the "&" sigil. However, when using code references, dereferencing using the "&" may cause implied arguments to be passed to the new subroutine. For this reason, the arrow operator is the preferred way to dereference a code reference.





$code_ref->(); # pass nothing, no implicit @_

$code_ref->(@_); # explicitly pass @_

$code_ref->( 'one', 'two' ); # pass new parameters
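The following sketch puts the two dereferencing styles side by side. The names $count_args, caller_with_ampersand, and caller_with_arrow are hypothetical and exist only for this illustration. The code reference reports how many arguments it received.

```perl
use strict;
use warnings;

# $count_args is a hypothetical code reference that reports the size of @_.
my $count_args = sub { return scalar(@_); };

sub caller_with_ampersand {
    # '&' sigil and no parens: the current @_ is passed along implicitly
    return &$count_args;
}

sub caller_with_arrow {
    # arrow operator with empty parens: nothing is passed
    return $count_args->();
}

my $implicit = caller_with_ampersand(1, 2, 3);
my $explicit = caller_with_arrow(1, 2, 3);

print "ampersand form saw $implicit arguments\n";
print "arrow form saw $explicit arguments\n";
```

> ampersand form saw 3 arguments

> arrow form saw 0 arguments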





5.9 Subroutine Return Value

Subroutines can return a single value or a list of values. The return value can be explicit, or it can be implied, in which case it is the value of the last expression evaluated in the subroutine. An explicit return statement is the preferred approach if any return value is desired.





# return a single scalar

sub ret_scal {

return "boo";

}

my $scal_var = ret_scal;

print Dumper \$scal_var;

# return a list of values

sub ret_arr {

return (1,2,3);

}

my @arr_var =ret_arr;

print Dumper \@arr_var;

> $VAR1 = \'boo';

> $VAR1 = [

> 1,

> 2,

> 3

> ];
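To compare an implied return value with an explicit one, here is a short sketch using hypothetical subroutines add_implicit, add_explicit, and min_max (none of these names come from the text above). The first two return the same value; the third returns a list that the caller unpacks with a list assignment.

```perl
use strict;
use warnings;

# Implied return: the value of the last expression evaluated is returned.
sub add_implicit {
    my ($x, $y) = @_;
    $x + $y;
}

# Explicit return: same result, but the intent is easier to see.
sub add_explicit {
    my ($x, $y) = @_;
    return $x + $y;
}

# Returns a two-element list: the smallest and largest argument.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my $implicit = add_implicit(1, 2);
my $explicit = add_explicit(1, 2);
my ($min, $max) = min_max(7, 2, 9, 4);

print "implicit is $implicit\n";
print "explicit is $explicit\n";
print "min is $min, max is $max\n";
```

> implicit is 3

> explicit is 3

> min is 2, max is 9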

5.10 Returning False

The return value of a subroutine is often used within a boolean test. The problem is that the subroutine needs to know if it is called in scalar context 