#!/usr/bin/perl -w
use seminar qw(
regexes
subroutines
);
Eric Wastl
Regular Expressions
$var =~ /regex/;
• returns true/false
• captures in $1, $2, $3, ...
$var =~ s/regex/replacement/;
• modifies $var
@array = split(/regex/, $var);
• returns new array
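The three operations above can be sketched together as follows. This is only an illustration: the sample string and the variable names are made up and do not come from the slides.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the slides.
my $var = "2024-01-15: backup finished";

# Match: true/false result; captured groups land in $1, $2, $3.
my ($year, $month, $day);
if ($var =~ /^(\d+)-(\d+)-(\d+)/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# Substitute: modifies $var in place.
$var =~ s/finished/done/;

# Split: returns a new list and leaves $var untouched.
my @parts = split(/:\s*/, $var);

print "$year/$month/$day\n";   # prints 2024/01/15
print "$parts[1]\n";           # prints backup done
```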
Regexes: matching
• Basic syntax: m/regex/mods
• Modifiers are:
- i (case-insensitive matching)
- m (^ and $ match the start/end of any line)
- s (. also matches newlines)
- x (ignore whitespace and comments)
- g (“global” match)
• The 'm' prefix is optional if // delimiters are used
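A small sketch of what each modifier changes, assuming a made-up two-line string ($text and the result variables are illustrative names only):

```perl
use strict;
use warnings;

my $text = "first line\nSecond LINE\n";   # hypothetical sample

# /i: case-insensitive matching
my $has_line = $text =~ /second line/i ? 1 : 0;

# /m: ^ and $ match at every line boundary, not only at the string ends
my $starts = $text =~ /^Second/m ? 1 : 0;

# /s: . also matches the newline, so one match can span both lines
my $spans = $text =~ /first.*Second/s ? 1 : 0;

# /x: whitespace and comments inside the pattern are ignored;
# /g in list context returns every match
my @words = $text =~ /
    \b \w+ \b    # one whole word
/xg;

print "$has_line $starts $spans ", scalar(@words), "\n";   # prints 1 1 1 4
```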
Regexes: bind operator
$var =~ m/regex/mods;
my $var = "BaNaNaS";
if ($var =~ /anana/i) {...}
if ($var =~ m#anana#i) {...}
if ($var =~ m{anana}i) {...}
if ($var !~ /apple/i) {...}
unless ($var =~ /apple/i) {...}
Regexes: usage
$var =~ m/regex/mods;
• Without a bind operator, $_ is used.
while (my $line = <STDIN>) {
if ($line =~ /^\s*$/) {
last;
}
print $line;
}
Regexes: usage
$var =~ m/regex/mods;
• Without a bind operator, $_ is used.
while ($_ = <STDIN>) {
if ($_ =~ /^\s*$/) {
last;
}
print $_;
}
Regexes: usage
$var =~ m/regex/mods;
• Without a bind operator, $_ is used.
while (<STDIN>) {
if ($_ =~ /^\s*$/) {
last;
}
print $_;
}
Regexes: usage
$var =~ m/regex/mods;
• Without a bind operator, $_ is used.
while (<STDIN>) {
if (/^\s*$/) {
last;
}
print $_;
}
Regexes: usage
$var =~ m/regex/mods;
• Without a bind operator, $_ is used.
while (<STDIN>) {
if (/^\s*$/) {
last;
}
print;
}
Regexes: usage
$var =~ m/regex/mods;
• Without a bind operator, $_ is used.
while (<STDIN>) {
last if /^\s*$/;
print;
}
Regexes: usage
$var =~ m/regex/mods;
• In a loop, m/.../g keeps track of state
my $file = get_xml_doc();
while ($file =~ /(<.*?>)/g) {
print "tag found: $1\n";
}
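The state that m/.../g keeps between iterations can be inspected with pos(). The sketch below assumes a tiny inline string in place of get_xml_doc() and a simple tag pattern; both are illustrative choices.

```perl
use strict;
use warnings;

my $file = "<a><b>text</b></a>";   # stand-in for get_xml_doc()

my @found;
while ($file =~ /(<.*?>)/g) {
    # pos() is the offset at which the next /g match on $file will resume
    push @found, "$1 ends at " . pos($file);
}
print "$_\n" for @found;
```

Each pass resumes where the previous match ended, which is why the loop terminates instead of matching the first tag forever.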
Regexes: substitution
• Basic syntax:
s/regex/replacement/mods
• Modifiers are the same:
- i (case-insensitive matching)
- m (^ and $ match the start/end of any line)
- s (. also matches newlines)
- x (ignore whitespace and comments)
- g (“global” match)
• The 's' prefix is not optional
Regexes: bind operator
$var =~ s/regex/repl/mods;
• Modifies $var in-place
my $var = "Banana smoothie";
$var =~ s/banana/orange/i;
$var =~ s/\bs.*?e\b/planet/;
$var =~ s#orange#green#i;
$var =~ s{e[et]}{oo}gi;
# $var is "groon planoo"
Regexes: usage
$var =~ s/regex/repl/mods;
• Without a bind operator, $_ is used.
while (<STDIN>) {
s/[aeiou]//g;
print;
}
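The same vowel-stripping substitution can be tried without reading input. In this sketch the @lines array is a made-up stand-in for the input lines, and foreach aliases $_ to each element just as the while loop sets it per line.

```perl
use strict;
use warnings;

my @lines = ("banana smoothie\n", "rhythm\n");   # hypothetical input
my @counts;
for (@lines) {
    # s///g works on $_ and returns the number of replacements made
    # (an empty string when nothing matched)
    my $n = s/[aeiou]//g;
    push @counts, $n || 0;
}
print @lines;            # prints "bnn smth" and "rhythm" on separate lines
print "@counts\n";       # prints 7 0
```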
Regexes
perldoc perlre
Subroutines
• Basic syntax: sub name { ... }
• Calling: name($x, 2, "z");
• Arguments appear in @_
• Returns the value of the last expression evaluated, or the
value given to an explicit return statement
• Has own scope like control structures
• perldoc perlsub
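The points above can be sketched with two small subs. The names min_max and square are hypothetical helpers invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the smallest and the largest argument.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;   # arguments arrive in @_
    return ($sorted[0], $sorted[-1]);     # explicit return of a list
}

# Without a return statement, the last expression evaluated is the result.
sub square { $_[0] * $_[0] }

my ($min, $max) = min_max(4, 9, 2);
print "$min $max ", square(5), "\n";      # prints 2 9 25
```

@sorted is declared with my inside the sub, so it is not visible outside min_max.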
Subroutines
Perl:
sub add {
my ($x, $y) = @_;
return $x + $y;
}
sub add {
return $_[0] + $_[1];
}
sub add {$_[0] + $_[1]}
PHP:
function add($x,$y) {
return $x + $y;
}
Usage example (Perl):
my $result = add(1, 2);
print $result;
# prints 3
Variadic functions: Summation
print sum(1, 1) . "\n";
my $sum = sum(1, 3, 6, 5);
Summation: PHP
function sum() {
$t = 0;
foreach (func_get_args() as $arg) {
$t += $arg;
}
return $t;
}
Summation: Perl
sub sum {
my $t = 0;
foreach my $arg (@_) {
$t += $arg;
}
return $t;
}
Summation: Perl
sub sum {
my $t = 0;
$t += $_ foreach @_;
return $t;
}
Recursion
sub sum {
if (!@_) {
return 0;
}
my $v = shift(@_);
return $v + sum(@_);
}
Recursion
sub sum {
return 0 if !@_;
my $v = shift(@_);
return $v + sum(@_);
}
Recursion
sub sum {
return 0 unless @_;
my $v = shift(@_);
return $v + sum(@_);
}
Recursion
sub sum {
return 0 unless @_;
my $v = shift;
return $v + sum(@_);
}
Recursion
sub sum {
return 0 unless @_;
return shift() + sum(@_);
}
Recursion
sub sum {
return @_ ? shift() + sum(@_) : 0;
}
Recursion
sub sum {
@_ ? shift() + sum(@_) : 0;
}
Recursion
sub sum {@_ ? shift() + sum(@_) : 0}
(TMTOWTDI!)
Anonymous subroutines
• Basic syntax: sub { ... }
• Returns the function as a value
• Also exists in JavaScript
• Very expensive in PHP 5.3
Anonymous subroutines
my $add = sub {
my ($x, $y) = @_;
return $x + $y;
};
my $two = $add->(1, 1);
my $six = &$add(2, 4);
Why are anonymous
functions useful?
Anonymous subroutines:
“Strategy” pattern
my $logger = sub { print @_ };
sub mult {
my ($logger, $x, $y) = @_;
$logger->("mult: $x * $y\n");
return $x * $y;
}
print mult($logger, 6, 9), "\n";
Anon. subroutines: Closures
sub makeCounter {
my $value = $_[0] || 0;
return sub { $value++ };
}
my $counter = makeCounter(7);
print $counter->(), "\n";
print $counter->(), "\n";
print $counter->(), "\n";
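Each call to makeCounter creates a fresh $value, so counters built separately do not share state. A short sketch, where the names $from_seven and $from_zero are made up for the example:

```perl
use strict;
use warnings;

sub makeCounter {
    my $value = $_[0] || 0;
    return sub { $value++ };   # the anonymous sub keeps its own $value alive
}

my $from_seven = makeCounter(7);
my $from_zero  = makeCounter();

# Interleave calls to show that the two closures count independently.
my @seen = ($from_seven->(), $from_seven->(), $from_zero->(), $from_seven->());
print "@seen\n";   # prints 7 8 0 9
```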
In-class activity: Factorial
Create a program which calculates the
factorial of a number using recursion.
The factorial is the number multiplied by all
integers between itself and 1:
factorial(5) = 5 * 4 * 3 * 2 * 1 = 120
factorial(1) = 1
factorial(0) = 1
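One possible recursive shape for this activity, offered as a sketch and not as the reference solution. It assumes the argument is a non-negative integer and does no input validation.

```perl
use strict;
use warnings;

sub factorial {
    my $n = shift;
    return 1 if $n <= 1;              # base case covers factorial(0) and factorial(1)
    return $n * factorial($n - 1);    # recursive step
}

print factorial(5), "\n";   # prints 120
```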
In-class activity: Advanced calc
Create a program which prompts the user
for an operation from a list and any
number of arguments, and then returns
the result of the operation. Do this by
creating a hash of subs:
my %ops = (
add => sub { ... },
subtract => sub { ... },
);
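A hash of subs like this is often called a dispatch table. The sketch below fills in two operations and a hypothetical calc() helper to look them up; prompting the user is left out, and the bodies are only one way the "..." might be filled in.

```perl
use strict;
use warnings;

my %ops = (
    # Each sub takes any number of arguments; the first seeds the total.
    add      => sub { my $t = shift; $t += $_ for @_; $t },
    subtract => sub { my $t = shift; $t -= $_ for @_; $t },
);

# Hypothetical helper: look up the operation by name and call it.
sub calc {
    my ($name, @args) = @_;
    my $op = $ops{$name} or die "unknown operation: $name\n";
    return $op->(@args);
}

print calc('add', 1, 2, 3), "\n";         # prints 6
print calc('subtract', 10, 4, 1), "\n";   # prints 5
```

Adding an operation means adding one key to %ops; calc() itself does not change.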
In-class activity: Advanced calc
# code example here
(come back later
for more activities)
In-class activity: Reduce
Create a function which “reduces” an array.
The function should take an operator (a
two-arg sub), a starting value and an
array. It should return the result of calling
the operator on the previous result of the
operation (starting with the starting value)
and each item in the list. For example:
4 = reduce( $add, 0, [1, 2, 1]);
"Hello" = reduce($concat, "", ["He", "ll", "o"]);
7 = reduce( $max, 0, [6, 3, 7, 4, 0, 2]);
In-class activity: Reduce
sub reduce {
my ($op, $default, $array) = @_;
return $default unless @$array;
# copy the list so the caller's array is not emptied
my ($next, @rest) = @$array;
return reduce(
$op,
$op->($default, $next),
\@rest
);
}
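To try reduce against the three examples from the activity, the operators need definitions. The slides do not give them, so $add, $concat and $max below are assumed. The reduce here keeps the recursive shape but works on a copy of the list.

```perl
use strict;
use warnings;

# Recursive reduce that leaves the caller's array untouched.
sub reduce {
    my ($op, $acc, $array) = @_;
    return $acc unless @$array;
    my ($next, @rest) = @$array;
    return reduce($op, $op->($acc, $next), \@rest);
}

# Assumed operator definitions (not given in the slides).
my $add    = sub { $_[0] + $_[1] };
my $concat = sub { $_[0] . $_[1] };
my $max    = sub { $_[0] > $_[1] ? $_[0] : $_[1] };

my $total   = reduce($add,    0,  [1, 2, 1]);
my $word    = reduce($concat, "", ["He", "ll", "o"]);
my $largest = reduce($max,    0,  [6, 3, 7, 4, 0, 2]);
print "$total $word $largest\n";   # prints 4 Hello 7
```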
In-class activity: Currying
Create a function “curry()” which takes
another function “$fn” and a list of
arguments. Return a new function which,
when called, calls $fn, passing it both the
original list of arguments and the
arguments to the returned function:
my $add_six = curry($add, [2, 4]);
16 = $add_six->(3, 7);
In-class activity: Currying
sub curry {
my ($fn, $args) = @_;
return sub {
return $fn->(@$args, @_);
};
}
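A usage sketch for curry. The result of 16 in the activity implies that $add sums all of its arguments, so the $add below is written that way; that definition is an assumption, not something the slides show.

```perl
use strict;
use warnings;

sub curry {
    my ($fn, $args) = @_;
    return sub { return $fn->(@$args, @_) };
}

# Assumed: a variadic $add that sums every argument it receives.
my $add = sub { my $t = 0; $t += $_ for @_; $t };

my $add_six = curry($add, [2, 4]);
my $result  = $add_six->(3, 7);    # calls $add->(2, 4, 3, 7)
print "$result\n";                 # prints 16
```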
Homework!
Create a function which serializes a data
structure to the format of your choice (that
is, turns a data structure into a string
format, like JSON or XML). Create
another function which unserializes that
format. You may invent your own format.
You may assume all leaves are integers.
For example:
"{'a':1, 'b':[2, 3]}" =
serialize({a=>1, b=>[2, 3]});