) {
    chomp;
    $seen{$_} = 1;   # the value 1 is arbitrary; only the keys of %seen matter
}
close MEM;
A Sample CGI Script
$address = $FORM{email};
if ($seen{$address}) {
    print "Content-type: text/html\n\n";
    print "You're already a member!";
} else {
    print "Content-type: text/html\n\n";
    foreach $key (sort keys %FORM) {
        print "$key is $FORM{$key}\n";
    }
    # Append the new address, one per line, so it can be read back with chomp
    open(MEM, ">>memberemails.txt") or die "Cannot open memberemails.txt: $!";
    print MEM "$address\n";
    close MEM;
}
Regular Expressions
Regular expressions are patterns to be matched against a string.
Perl regular expressions are a superset of those used by the UNIX utilities grep, sed, vi and awk.
We've already seen:
    print if (/$pattern/);
which is shorthand for:
    print $_ if ($_ =~ m/$pattern/);
Pattern Matching Operators/Functions
$var =~ m/$pattern/;                  # the match operator
$var =~ s/$pattern/$replacement/g;    # the substitution operator
                                      # the "g" modifier replaces all occurrences, not just the first
@list = split /$pattern/, $var;       # splits $var into a list, with $pattern as the delimiter
$var = join $separator, @list;        # joins a list into a single string; the separator is a plain string, not a pattern
/$pattern/i;                          # "i" means ignore case
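A minimal sketch of these operators working together. The sample string and the variable names ($line, @fields, $joined, $tabbed) are made up for illustration and do not come from the slides.

```perl
use strict;
use warnings;

# Hypothetical sample data
my $line = "Smith,  John , Psychology";

# split: the delimiter is a pattern (a comma with optional whitespace around it)
my @fields = split /\s*,\s*/, $line;

# join: the separator is a plain string, not a pattern
my $joined = join "|", @fields;

# s///g: replace every literal "|" (escaped, because it is a metacharacter) with a tab
(my $tabbed = $joined) =~ s/\|/\t/g;

# m//i: case-insensitive match
my $is_psych = ($line =~ m/psychology/i) ? 1 : 0;

print "$joined\n";   # prints Smith|John|Psychology
```

Note that split takes a pattern while join takes an ordinary string, which is an easy pair to mix up.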
Regular Expressions
Metacharacters: \|()[{^$*+?.
Backslash means "escape", or literal interpretation of a metacharacter:
    $var =~ s/\|\$/pipe-dollar/;   # replaces '|$' with 'pipe-dollar'
Escaping some ordinary alphanumeric characters turns them into metacharacters:
    \s means whitespace (space, tab, or newline)
    \n means newline
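A short sketch of escaping in practice, assuming made-up sample strings. The single-quoted literal keeps Perl from treating `$5` as a variable.

```perl
use strict;
use warnings;

my $var = 'cost|$5';            # single quotes: no variable interpolation
$var =~ s/\|\$/pipe-dollar/;    # \| and \$ match a literal "|" and "$"

my $text = "a \t b";            # contains spaces and a tab, but no newline
my $has_space   = ($text =~ /\s/) ? 1 : 0;   # \s matches any whitespace
my $has_newline = ($text =~ /\n/) ? 1 : 0;   # \n matches only a newline

print "$var\n";   # prints costpipe-dollar5
```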
Regular Expressions
"|" means "or"; parentheses allow grouping:
    print if (/Dept of (Psychology|Biology)/);   # prints lines containing "Dept of Psychology" or "Dept of Biology"
"." means "any single character" (except a newline, by default)
"*" means zero or more of the previous character:
    /Psych.*/   # matches Psychology or Psychiatry
"+" means one or more of the previous character:
    $line =~ s/\s+/\t/g;   # replaces each run of whitespace with a single tab
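The following sketch applies alternation, ".", "*" and "+" to a few made-up strings. grep is used here only as a convenient way to collect the elements that match.

```perl
use strict;
use warnings;

# Alternation with grouping: keep only Psychology and Biology lines
my @lines   = ("Dept of Psychology", "Dept of Biology", "Dept of History");
my @matched = grep { /Dept of (Psychology|Biology)/ } @lines;

# ".*" after "Psych" matches whatever follows, including nothing
my @psych = grep { /Psych.*/ } ("Psychology", "Psychiatry", "Physics");

# "\s+" collapses each run of whitespace into a single tab
my $line = "one   two \t three";
$line =~ s/\s+/\t/g;

print join(", ", @matched), "\n";   # prints Dept of Psychology, Dept of Biology
```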
Regular Expressions
"^" means beginning of the line
"$" means end of the line
    s/^\s+//;   # removes whitespace at the beginning of the line
"[ ]" identifies a "character class":
    s/[A-Ex2]/R/g;   # replaces A, B, C, D, E, x, or 2 with R
"[^… ]" identifies a negated character class
    \w   # any word character: [a-zA-Z0-9_]
    while (<>) { /\@/ && print "$_\n" foreach (split /[^\w\@\.\-]/); }
    # a rough way to extract email addresses from an HTML file
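As a rough illustration, the email-extraction idea can be run against a string instead of standard input. The HTML snippet and addresses below are invented. The approach is deliberately simple: it treats any run of word characters, "@", "." and "-" that contains an "@" as an address, so it is not a full email validator.

```perl
use strict;
use warnings;

# Anchors and a character class on a made-up string
my $s = "  ABEx2z";
$s =~ s/^\s+//;          # strip leading whitespace
$s =~ s/[A-Ex2]/R/g;     # A-E, x and 2 become R; z is untouched

# Hypothetical HTML fragment
my $html = '<a href="mailto:jane@example.com">Jane</a>, or bob.smith@example.org!';

# Split on every character that is NOT in the class, then keep pieces containing "@"
my @found = grep { /\@/ } split /[^\w\@.\-]/, $html;

print "$_\n" for @found;
```

A trailing period after an address would be kept as part of the match, since "." is in the allowed class.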
Command Line Options
perl -w filename.pl
Warnings mode; provides extra detail about potential flaws in code
perl -c filename.pl
Tests whether the file compiles successfully without actually running it
perl -e 'command1; command2; …'
Runs Perl code typed directly on the command line.
perl -e 'sleep(120); while (1) { print "\a" }'   # a cheap alarm clock
Subroutines
Defining a subroutine
sub name { … }
Invoking a subroutine
&name;
print "What's your name? ";
chomp($name = <STDIN>);
&hello;
sub hello { print "Hello, $name!\n"; }
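The example above relies on the global variable $name. As a sketch that goes one small step beyond the slide, a subroutine can instead receive its input as an argument (arguments arrive in the special array @_) and hand back a result with return. The name "Larry" is just a sample value.

```perl
use strict;
use warnings;

# Build a greeting for the name passed in as the first argument
sub hello {
    my ($name) = @_;
    return "Hello, $name!\n";
}

my $greeting = hello("Larry");
print $greeting;   # prints Hello, Larry!
```

Passing the value in makes the subroutine usable for any name, not only the one stored in a particular global.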
System Calls
Backticks execute an expression "from the command line" and return the standard output:
    $files = `ls`;
    @files = split /\n/, $files;
system( … ) executes the expression and returns the command's exit status: 0 if successful, nonzero if not
    system("mailx -s \"test mailing\" smiile\@utk.edu < file");   # "@" is escaped so it is not read as an array
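A small sketch contrasting the two, assuming a UNIX-like system where the `echo`, `true` and `false` commands are available. Those commands are stand-ins chosen for illustration.

```perl
use strict;
use warnings;

# Backticks capture the command's standard output as one string
my $output = `echo one; echo two`;
my @lines  = split /\n/, $output;

# system returns the exit status: 0 on success, nonzero on failure
my $ok     = system("true");
my $failed = system("false");

# The command's own exit code is in the high byte of the returned value
my $exit_code = $failed >> 8;

print "success\n" if $ok == 0;
print "failure, exit code $exit_code\n" if $failed != 0;
```

Because 0 means success, the usual test is `system(...) == 0`, which reads the opposite way from most Perl truth tests.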
Additional Resources
CGI Course, March 28 and April 6. See http://web.utk.edu/~training
Another PERL tutorial: http://www.netcat.co.uk/rob/perl/win32perltut.html
A Directory of PERL tutorials: http://www.astentech.com/tutorials/Perl.html
Schwartz, R., Christiansen, T., & Wall, L. (1997). Learning Perl. Sebastopol, CA: O'Reilly & Associates.
Additional Resources
The PERL Bookshelf (CD-ROM with 6 books). O'Reilly & Associates. Includes Learning Perl.
Christiansen, T., & Torkington, N. (1998). Perl Cookbook. Sebastopol, CA: O'Reilly & Associates.
UNIX for Windows: http://www.research.att.com/~dgk/uwin/