					PERL

TOPICS COVERED
What is PERL
History of PERL
Why PERL
Features of PERL
Operators in PERL
Data Types
Control Structures
Functions
Basic I/O
Regular Expression
File Handles and Tests
PERL Modules
Applications of PERL
Case Study

What is PERL


Practical Extraction and Report Language (or, jokingly, Pathologically Eclectic Rubbish Lister)
Intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).
Interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information.
Good language for many system management tasks.
Expression syntax corresponds quite closely to C expression syntax.
PERL is a C-like scripting language with powerful string processing features. It borrows ideas from C, sed, awk and the Unix shell.
PERL is commonly used in web development and is one of the most commonly used languages for writing Common Gateway Interface (CGI) scripts.
PERL, created by Larry Wall, is a language for processing text.





History of PERL
1987 - Perl 1.0 is released.
1988 - Perl 2.0 is released.
1989 - Perl 3.0 is released under the GNU General Public License.
1991 - Programming perl, by Larry Wall and Randal L. Schwartz, is published by O'Reilly & Associates.
1992 - The first version of Perl for the Macintosh, MacPerl 4.0.2, is released.
1993 - The final Perl 4 version, Perl 4.036, is released.
1995 - Perl 5.001 is released. Andreas König starts the Perl module repository which will later turn into the Comprehensive PERL Archive Network (CPAN). Tim Bunce posts the first module list that lists CPAN as the central module repository. Jarkko Hietaniemi introduces CPAN to the Perl community.
1996 - Larry Wall joins the staff of O'Reilly & Associates as a Senior Software Developer.
1997 - The Perl Institute announces a web-accessible mirror of CPAN.
1998 - Larry Wall is awarded the first Free Software Foundation Award for the Advancement of Free Software.
2000 - Perl 5.6 is released. The Perl version number converts to the standard <major version>.<minor version>.<patch version> format for release numbers. PERL now has support for Unicode.

Why PERL?
   



Allows rapid prototyping of concepts and algorithms.
Allows rapid development of utilities and tools.
Platform independent (Win32, Mac, Linux, BSD, ...)
Vendor neutral, standardized and open source.
Large number of modules (extensions) for just about any conceivable application, available from the Comprehensive PERL Archive Network (CPAN).

Features of PERL




 

 

   

Modularity and reusability using innumerable modules
Subroutines can be overridden, autoloaded, and prototyped
Regular expression enhancements
Enhanced debugger and interactive Perl environment, with integrated editor support
Many usability enhancements
Simplified grammar
Arbitrarily nested data structures
Embeddable and extensible
Multiple simultaneous DBM implementations

Operators in PERL
Fundamental Operators
Increment & Decrement   ++, --
Additive                +, -, .
Multiplicative          **, *, /, %, x
Assignment              =, **=, +=, *=, &=, <<=, &&=, -=, /=, |=, >>=, ||=, .=, %=, ^=, x=
Bitwise                 &, |, ^, ~, <<, >>
Relational              <, >, <=, >=, lt, gt, le, ge
Conditional             ?:
Equality                ==, !=, <=>, eq, ne, cmp
Logical                 &&, ||, !, and, or, not, xor

Operators in PERL
Range      ..
Binding    =~, !~
Comma      , and =>

File [Handle] Test
-r  readable by effective uid/gid
-R  readable by real uid/gid
-w  writable by effective uid/gid
-W  writable by real uid/gid
-x  executable by effective uid/gid
-X  executable by real uid/gid
-o  owned by effective uid
-O  owned by real uid
-e  file exists
-z  has zero size
-f  plain file
-s  has non-zero size
-d  is a directory
-p  named pipe (FIFO)
-l  symbolic link
-b  block special file
-S  a socket
-c  character special file
-u  file has setuid bit set
-g  file has setgid bit set
-k  file has sticky bit set
-t  filehandle is opened to a tty
-T  file is a text file
-B  file is a binary file
-M  age of file (since last modification) in days when script started
-A  same for access time
-C  same for inode change time

Data Types




  



No type declarations
  Types are distinguished lexically (by first character)
Perl does not have integer, float, boolean, etc. types like other languages
  In Perl, these are values of type Scalar
PERL supports the following data types:
Scalars - a number or a string of characters
Arrays - a numerically indexed (starting at zero) list of scalars; nested arrays or hashes are stored as references
Hashes - a hash or associative array is indexed by a key, which may be a number or a string (keys are stored as strings)

Basic PERL Data Types Scalars
Scalars include strings, doubles, integers and references to the other data types.
my $mystring = "Perl is cool.";
my $myreal = 40.96;
my $myint = 42;
print 57+$myint, " bottles of beer on the wall ... \n";
print $mystring;

Basic PERL Data Types Arrays
Arrays are lists of scalars. They are grown dynamically. Multidimensional arrays are constructed by having arrays of array references.
@myarray = ('An', 'array', 'of', 'strings');
@filelist = glob('*.*');
$myarray[0] = 3.1415926535897932384626;
foreach $val (@myarray) { print "item=$val\n"; }
for ($i=0; $i <= $#myarray; $i++) { print "item $i = $myarray[$i]\n"; }
# array ref
$arrayref = \@myarray;
$arrayref->[2] = 'circle'; # will change "of" to "circle"

Basic PERL Data Types Hashes
Hashes are collections of (key, value) pairs of scalars. They are grown dynamically and are one of Perl's most powerful data structures. Most object oriented Perl is done by blessing a hash ref.
%myhash = ( 'color' => 'red', 'texture' => 'rough', 'miles' => 5000 );
print "color = ", $myhash{'color'}, "\n";
for $mykey (keys %myhash) { print "key $mykey has value = $myhash{$mykey}\n"; }
# hash ref
$hashref = \%myhash;
$hashref->{'color'} = 'green';
$hashref2 = { 'food' => 'inedible', 'location' => 'vending machine' };
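The remark that nested structures are built from references can be sketched as follows. This is only an illustration: the %inventory data and the total_miles helper are made-up names, not part of the original slides.

```perl
use strict;
use warnings;

# Hypothetical data: a hash whose values are array references.
my %inventory = (
    red   => [ 5000, 1200 ],
    green => [ 300 ],
);

# Hypothetical helper: sums the numbers stored under one key.
# Returns 0 when the key does not exist.
sub total_miles {
    my ($data, $key) = @_;              # $data is a hash reference
    my $total = 0;
    $total += $_ for @{ $data->{$key} || [] };
    return $total;
}

push @{ $inventory{green} }, 700;       # inner arrays still grow dynamically
print "red: ",   total_miles(\%inventory, 'red'),   "\n";
print "green: ", total_miles(\%inventory, 'green'), "\n";
```

Passing \%inventory hands the subroutine a reference, so the whole structure is shared rather than copied.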

Scalars
 



Number or string of characters
Numbers
  Stored internally as native integers or double-precision floating point
Single-quoted strings
  Any character, with 2 exceptions (single quote and backslash), is legal between quotes
  \' prints as '
  \\ prints as \

Scalars


Double-quoted strings
  Backslash specifies control characters, or, through octal and hex notation, any character
  \n    newline
  \r    carriage return
  \t    tab
  \b    backspace
  \a    bell
  \e    escape
  \007  character given by its octal value
  \x87  character given by its hex value
  \\    backslash
  \"    double quote
  \l    lowercase next letter
  \L    lowercase all following letters until \E
  \u    uppercase next letter
  \U    uppercase all following letters until \E
  \E    terminates \L or \U
  \cC   control character (here, control C)

Scalars


Numeric operators
  Addition (+)
  Subtraction (-)
  Multiplication (*)
  Division (/)
  Exponentiation (**)
  Modulus (%)
Comparison operators
  <, <=, ==, >=, >, !=

Scalars


String operators
  Concatenation (.)
  String repetition (x)
    "bill" x 4 is "billbillbillbill"
    (4+2) x 3 is 6 x 3 or "6" x 3 or "666"
    "a" x 3.7 is "aaa"
    "a" x 0 is ""
Comparison operators
  lt, le, eq, ge, gt, ne
  7 < 30 is true
  7 lt 30 is false (compared as strings, "7" sorts after "30")

Scalars
Conversion between strings and numbers
  String used as operand for numeric operator
    String converted to its equivalent numeric value
    " 45.67bill" converts to 45.67
    "bill" converts to 0
  Numeric value used as operand for string operator
    Value calculated and converted into string
    "z" . (3 + 5) is "z8"
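The conversion rules can be checked with a short script. The as_number helper is a hypothetical name introduced only for this sketch; it turns off the "isn't numeric" warning that Perl would otherwise emit under use warnings.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluates a string in numeric context.
# Leading whitespace is skipped, a leading number is used,
# and trailing non-numeric text is ignored.
sub as_number {
    my ($text) = @_;
    no warnings 'numeric';    # silence the warning for this demonstration
    return $text + 0;
}

print as_number(" 45.67bill"), "\n";   # numeric part of the string
print as_number("bill"), "\n";         # no numeric part at all
print "z" . (3 + 5), "\n";             # number converted to a string
```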



Scalars
Scalar variables
  Begin with $, followed by a letter, possibly followed by more letters, digits, or underscores
  Case sensitive
Variable assignment
  Value of a scalar assignment is the value of the right-hand side
  $z = 3 * ($y = 2); # $y is 2, $z is 6
  $a = $b = 7;





Scalars




Autoincrement and autodecrement
  $z = 8; $y = ++$z; # $z and $y are both 9
  $z = 8; $y = $z++; # $y is 8, but $z is 9
  $z = 8.5; $y = ++$z; # $z and $y are both 9.5
Binary assignment operators
  $a = $a + 8; can be written as $a += 8;
  $string = $string . "cat"; can be written as $string .= "cat";

Scalars


chop( )
  Returns last character of string argument
  Has side effect of removing this last character
Interpolation
  $a=7; print "Value is $a";
    Prints Value is 7
  $b='Bill'; print "Name is $b";
    Prints Name is Bill
  $b='Bill'; print "Name is \$b";
    Prints Name is $b
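Because chop( ) removes the last character whatever it is, it can damage input that has no trailing newline. The sketch below compares it with the built-in chomp( ), which removes only a trailing newline; the variable names are illustrative.

```perl
use strict;
use warnings;

my $with_newline    = "Bill\n";
my $without_newline = "Bill";

# chop always removes the last character, whatever it is.
chop(my $chopped_a = $with_newline);       # "Bill"
chop(my $chopped_b = $without_newline);    # "Bil"

# chomp removes only a trailing newline (more precisely, the value of $/).
chomp(my $chomped_a = $with_newline);      # "Bill"
chomp(my $chomped_b = $without_newline);   # "Bill"

print "chop:  '$chopped_a' '$chopped_b'\n";
print "chomp: '$chomped_a' '$chomped_b'\n";
```

For reading lines from <STDIN>, chomp is usually the safer of the two.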



Scalars
<STDIN> as a scalar value
  $a = <STDIN>; chop($a);
  chop($a=<STDIN>);
Output
  print("Hi there\n");
  print "Hi there\n";
undef
  Value of undefined variable
  Looks like zero when used as a number
  Looks like the empty string when used as a string
  <STDIN> returns undef at end of input (for example, control-D on a Unix terminal)

Arrays






Literals
  (4,5,6)
  ("Bill",7,8,9)
  ($a,8)
  ($z+$h,7,8)
  (1..5,4,7) # (1,2,3,4,5,4,7)
  (1.3..5.3) # (1,2,3,4,5): in Perl 5 the range endpoints are truncated to integers
  (2.3..7.1) # (2,3,4,5,6,7)
  () # the empty list
  print (5, 6, "cat", $a)
Array variables
  Begin with @
Array assignment
  @a=(1,2,3);
  @b=@a;
  @z=5; # @z is (5)
  @z=("b",6); @w=(4,5,@z,9); # @w is (4,5,"b",6,9)

Arrays
Array assignment
($a,$b,$c)=(4,5,6); # $a=4, $b=5, $c=6
($a,$b)=($b,$a); # switches values of $a and $b
($z,@w)=($a,$b,$c); # $z is $a, @w is ($b,$c)
$a=(4,5,6); # $a is 6, from the comma operator
($a)=(4,5,6); # $a is 4
@x=(4,5,6); $a=@x; # $a is 3, the length of @x
($a)=@x; # $a is 4

Arrays
Element access
@x=(4,5,6);
$a=$x[0]; # $a is 4
$x[0]=9; # @x is (9,5,6)
$x[2]++; # @x is (9,5,7)
$x[1] += 5; # @x is (9,10,7)
($x[0],$x[1])=($x[1],$x[0]); # @x is (10,9,7)
@x[0,1]=@x[1,0]; # @x is (9,10,7)
@x[0,1,2]=@x[1,1,1]; # @x is (10,10,10)
$b=$x[8]; # $b is undef
$x[4] = "b"; # @x is (10,10,10,undef,"b")
$#x=3; # $#x is index value of last element of @x, so @x is (10,10,10,undef)

Arrays


push( ) and pop( ) operators
  push(@x,$a); # @x = (@x,$a)
  push(@x,6,7,8); # @x = (@x,6,7,8)
  $last=pop(@x); # removes last element of @x
shift( ) and unshift( ) operators
  unshift(@x,6,7,8); # @x = (6,7,8,@x)
  $first=shift(@x); # ($first,@x) = @x

Arrays


reverse( ) operator
  @x = (6,7,8); @y = reverse(@x); # @y = (8,7,6)
sort( ) operator
  @x = (1,2,4,8,16,32,64); @x = sort(@x); # @x = (1,16,2,32,4,64,8), sorted as strings
chop( ) operator
  @x = ("bill\n", "frank\n"); chop(@x); # @x = ("bill", "frank")
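The default sort( ) compares elements as strings, which is why 16 lands before 2. A minimal sketch of numeric sorting with a comparison block and the <=> operator follows; the variable names are illustrative.

```perl
use strict;
use warnings;

my @x = (1, 2, 4, 8, 16, 32, 64);

my @as_strings = sort @x;                  # default: string comparison
my @as_numbers = sort { $a <=> $b } @x;    # numeric comparison, ascending
my @descending = sort { $b <=> $a } @x;    # numeric comparison, descending

print "@as_strings\n";
print "@as_numbers\n";
print "@descending\n";
```

Inside the block, $a and $b are the two elements being compared; swapping them reverses the order.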

Arrays
Scalar and array contexts






An operand is evaluated in scalar context when the operator expects a scalar in that position
An operand is evaluated in array (list) context when the operator expects an array or list in that position

Arrays


Scalar and array contexts


Concatenating a null string to an expression forces it to be evaluated in scalar context
@x = ("x", "y", "z");
print ("There are ", @x, " elements\n"); # prints "xyz" for @x
print ("There are ", "".@x, " elements\n"); # prints 3 for @x
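Besides concatenating an empty string, Perl offers the built-in scalar( ) function to force scalar context explicitly. A short sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my @x = ("x", "y", "z");

my $count      = scalar(@x);   # scalar context: number of elements
my $last_index = $#x;          # index of the last element
my ($first)    = @x;           # parentheses give list context: first element

print "There are ", scalar(@x), " elements\n";
print "Last index is $last_index, first element is $first\n";
```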

Control Structures
Control structures are classified into the following two categories:
i) Decision Making Statements
  if, unless
ii) Looping Statements
  for, foreach, while, until
Statement block:
  A list of statements surrounded by opening and closing braces.
  Ex: { first_statement; ... last_statement; }

Control Structures
Decision Making Statements : Syntax
unless (expression)
  Statement Blocks;

if (expression)
  Statement Blocks;

if (expression)
  Statement Blocks;
else
  Statement Blocks;

if (expression_1)
  Statement Blocks;
elsif (expression_2)
  Statement Blocks;
elsif (expression_3)
  Statement Blocks;
else
  Statement Blocks;

Control Structures
Decision Making Statements : Examples
Ex:
my $total = 10;
unless ($total < 1) { print "Total is at least 1!\n"; }
if ($total >= 10) { print "Total is at least 10!\n"; }
if ($total >= 10) { print "This is true part.\n"; }
else { print "This is false part.\n"; }
if ($total < 1) { print "Invalid\n"; }
elsif ($total < 34) { print "Too small\n"; }
elsif ($total < 66) { print "Just right\n"; }
elsif ($total < 101) { print "Too large\n"; }
else { print "Invalid\n"; }

Control Structures
Looping Statements:
until (Expression) Statement Blocks
[Looping if Expression is False]
# Prg to find the sum of digits
until ($n <= 0) { $t = $n % 10; $sum += $t; $n = int($n/10); }

while (Expression) Statement Blocks
[Looping if Expression is True]
# Prg to find the sum of digits
while ($n > 0) { $t = $n % 10; $sum += $t; $n = int($n/10); }

Control Structures
Looping Statements:
do Statement Blocks until (Expression)
[Looping if Expression is False]
# Prg to find the sum of digits
do { $t = $n % 10; $sum += $t; $n = int($n/10); } until ($n <= 0);

do Statement Blocks while (Expression)
[Looping if Expression is True]
# Prg to find the sum of digits
do { $t = $n % 10; $sum += $t; $n = int($n/10); } while ($n > 0);
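The digit-sum loop can be wrapped in a subroutine so that it is easy to try with different inputs. This is a sketch; digit_sum is a name chosen for illustration. Note that the loop has to stop when $n reaches 0, since int($n/10) never goes below zero for a non-negative $n.

```perl
use strict;
use warnings;

# Sum of the decimal digits of a non-negative integer.
sub digit_sum {
    my ($n) = @_;
    my $sum = 0;
    while ($n > 0) {
        $sum += $n % 10;      # take the last digit
        $n = int($n / 10);    # drop the last digit
    }
    return $sum;
}

print digit_sum(1234), "\n";
```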

Control Structures
Looping Statements:
1) for ( initializer ; expression ; increment ) Statement Blocks;
2) foreach $var (@an_array) Statement Blocks;
# Prg to find the sum of the first N natural numbers
for ($i=1; $i<=$n; $i++) { $sum += $i; }
# Prg to print the first N natural numbers
foreach $i (1 .. $n) { print $i; }
foreach (1 .. $n) { print $_; }
[Note: Both are the same.]

Miscellaneous Control Structures
False Expressions:
  "", ( ), "0", undef, 0
Bypassing the Loop:
  last  Exits a loop
  next  Begins another loop iteration
  redo  Jumps to the top of the loop body without beginning another iteration

Miscellaneous Control Structures
next & last
@movies = ("Star Wars", "Porky's", "Rocky5", "Terminator");
@ratings = ("PG", "R", "PG-13", "R", "G");
for ($j=0; $j<=3; $j++) {
  next if $ratings[$j] eq 'R';
  print "Would you like to see $movies[$j]?";
  chop($answer = <>);
  last if $answer eq 'yes';
}
if ($answer eq 'yes') { print "That will be \$4.25\n"; }
else { print "Sorry you don't see anything you like.\n"; }

Miscellaneous Control Structures
REDO
print "Type in 4 digits (0 through 9)\n\n";
for ($j=1; $j<=4; $j++) {
  print "Choice $j:";
  chop($digit[$j] = <>);
  redo if $digit[$j] > 9;
  redo if $digit[$j] < 0;
}
print "Your choices: @digit \n";

Miscellaneous Control Structures
Labelled Blocks:
OUTER: for ($i=1; $i<=10; $i++) {
  INNER: for ($j=1; $j<=10; $j++) {
    if ($i*$j == 25) { print "$i times $j is 25!\n"; last OUTER; }
    if ($j > $i) { next OUTER; }
  }
}

Miscellaneous Control Structures




Expression Modifiers
  exp2 if exp1;       # if (exp1) {exp2;}
  exp2 unless exp1;   # unless (exp1) {exp2;}
  exp2 while exp1;    # while (exp1) {exp2;}
  exp2 until exp1;    # until (exp1) {exp2;}
&&, ||, and ?: as control structures
  exp1 && exp2;         # if (exp1) {exp2;}
  exp1 || exp2;         # unless (exp1) {exp2;}
  exp1 ? exp2 : exp3;   # if (exp1) {exp2;} else {exp3;}

Functions










Defining a function
  sub repeat { my ($word) = @_; print "Hello, $word\n"; }
Subroutine definitions can be anywhere in program text
Subroutine definitions are global
  For two subroutines with the same name, the later one overrides the earlier one
By default, any variable reference inside a subroutine is global
Invoking a function
  &repeat("First");
  do repeat("First"); # old Perl 4 form, obsolete; avoid in new code
  $x = 5 + &repeat("First");

Functions
Return Values
Return value of a subroutine is the value of the last expression evaluated
Ex:
1) sub sum_of_a_and_b {
     $a+$b; # same as: return $a+$b;
   }
   $a=7; $b=8; $x=3*&sum_of_a_and_b; # $x is 45
2) sub list_of_a_and_b {
     ($a,$b);
   }
   $a=7; $b=8; @x=&list_of_a_and_b; # @x is (7,8)

Functions
Arguments
1. Invoking a subroutine with arguments in parentheses puts these arguments into a list denoted by @_ for the duration of the subroutine.
2. Each argument can be accessed through the @_ array by index: $_[0], $_[1], ...
3. Arguments can be passed either by value or by reference.
4. The value behind a reference can be accessed as:
   $$Variable - Scalar
   $Variable->[index] or $$Variable[index] - Array
   $Variable->{"Key"} or $$Variable{"Key"} - Hash
5. The number of arguments is not fixed.
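The difference between handing a subroutine a copy of the data and handing it a reference can be sketched as follows. Both helper names, doubled_copy and double_in_place, are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: copies its arguments, so the caller's data is untouched.
sub doubled_copy {
    my @copy = @_;
    return map { $_ * 2 } @copy;
}

# Hypothetical helper: receives an array reference and changes the caller's array.
sub double_in_place {
    my ($numbers) = @_;           # $numbers is an array reference
    $_ *= 2 for @$numbers;        # modifies the original elements
    return scalar @$numbers;      # number of elements processed
}

my @values = (1, 2, 3);
my @new    = doubled_copy(@values);     # @values is unchanged here
my $seen   = double_in_place(\@values); # @values is modified here

print "new: @new\n";
print "values: @values\n";
```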

Functions
a) sub say { print "$_[0], $_[1]!\n"; }
   &say("goodbye", "cruel world");
   &say("hello", "world");

b) @_ is local to the subroutine
   sub add { $sum = 0; foreach $_ (@_) { $sum += $_; } $sum; }
   $a = &add(4,5,6); # adds 4, 5, 6 and assigns 15 to $a
   print &add(6,7,8,9);
   print &add(6..9);

Functions
Local Variables
sub bigger_than {
  local($n,@values);
  ($n,@values) = @_; # or local($n,@values) = @_;
  local(@result);
  foreach $_ (@values) { if ($_ > $n) { push(@result,$_); } }
  @result;
}
@new = &bigger_than(100,@list); # @new gets elements of @list > 100
@x = &bigger_than(5,1,5,15,67); # @x = (15, 67)

System Interactions
Running OS commands
  system("ren $oName $nName"); # renames a file on Windows (use mv on Unix)
  system("ls -lrt"); # output goes straight to the terminal; it is not captured
Backticks
  $retVal = system("pwd"); # this does not give you the dir name; system returns the exit status
  $retVal = `pwd`;
  print "current working directory is $retVal\n";

Basic I/O
For receiving input from the standard input device (keyboard), the built-in filehandle STDIN is used, read with <STDIN>.

Input from <STDIN>
1) while ($_ = <STDIN>) { chop $_; ... }
2) while (<STDIN>) { chop; ... }
3) my $name; print "Enter your Name"; $name = <STDIN>;

Basic I/O
Input from the diamond operator <>




Gets data from files specified on the command line that invoked the given Perl program
  If there are no files on the command line, the diamond operator reads from standard input
  while (<>) { print $_; }
Actually, the diamond operator gets its input by examining @ARGV, the array which is a list of command line arguments
  Can set this array in the program
  @ARGV = ("aa", "bb", "cc");
  while (<>) { # processes files aa, bb, and cc
    print "This line of the file is: $_";
  }

Basic I/O


Output to STDOUT
  print operator
  Argument is a list of strings
  Return value is true or false, depending on success of the output
Ex:
  $x = print("Hello", " world", "\n");
  print STDOUT "Hello", " world", "\n";
Formatted output
  printf "%9s %4d %12.5f\n", $s, $n, $r;

Regular Expression
A regular expression is simply a string that describes a pattern.
It uses a symbolic notation to represent particular parts of a string.
Used to search strings, to extract desired parts of strings, and to perform search-and-replace operations.
Regular expressions have the undeserved reputation of being abstract and difficult to understand.
Regular expressions are constructed using simple concepts like conditionals and loops.

Regular Expression
Symbols used to represent parts of the string:
\s     A whitespace character (space, tab, newline)
\S     Non-whitespace character
\d     A digit (0-9)
\D     A non-digit
\w     A word character (a-z, A-Z, 0-9, _)
\W     A non-word character
\b     Word boundary
\B     Non-word boundary
\num   Back references [\1, \2, \3 ... \9]
$`     The part of the string before the matched part
$&     The matched part
$'     The part of the string after the matched part

Regular Expression
Metacharacters:
.       A single character (any character except newline)
*       Zero or more of the previous thing
+       One or more of the previous thing
?       Zero or one of the previous thing
{n}     Matches exactly n of the previous thing
{m,n}   Matches between m and n of the previous thing
{n,}    Matches n or more of the previous thing
()      Groups and captures the part into $ variables ($1, $2, ...)
(?:re)  Groups without capturing
[]      Matches a single character in the given set
[^ ]    Matches a single character outside the given set
|       Matches any of the alternatives specified [this|that]
^       Start of string
$       End of string
\       Escapes the special meaning of the next character

Regular Expression
Precedence (highest first)
Parentheses              ( )
Multipliers              + * ? {n,m}
Sequence and anchoring   abc ^ $ \b \B
Alternation              |

Regular Expression
Examples: Simple Word Matching
"Hello World" =~ /World/; # matches
if (/foo/) { ... } # true if $_ contains "foo"
if ($a =~ /foo/) { ... } # true if $a contains "foo"
"2+2=4" =~ /2+2/; # doesn't match, + is a metacharacter
"2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary +
"The interval is [0,1)." =~ /[0,1)./ # is a syntax error!
"The interval is [0,1)." =~ /\[0,1\)\./ # matches
"/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
'C:\WIN32' =~ /C:\\WIN/; # matches
"1000\t2000" =~ m(0\t2) # matches
"1000\n2000" =~ /0\n20/ # matches
"1000\t2000" =~ /\000\t2/ # doesn't match, "0" ne "\000"
"cat" =~ /\143\x61\x74/ # matches, but a weird way to spell cat
/[bcr]at/; # matches 'bat', 'cat', or 'rat'
/item[0123456789]/; # matches 'item0' or ... or 'item9'
"abc" =~ /[cab]/; # matches 'a'

Regular Expression
Looking Ahead and Looking Behind
^ looks behind. $ looks ahead. \b looks both ahead and behind
Lookahead and lookbehind are zero-width assertions
  Lookahead assertion is denoted by (?=regexp)
  Lookbehind assertion is denoted by (?<=fixed-regexp)

Examples:
$x = "I catch the housecat 'Tom-cat' with catnip";
$x =~ /cat(?=\s+)/; # matches 'cat' in 'housecat'
@cwd = ($x =~ /(?<=\s)cat\w+/g); # matches, $cwd[0]='catch' $cwd[1]='catnip'
$x =~ /\bcat\b/; # matches 'cat' in 'Tom-cat'
$x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in middle of $x

Regular Expression


Quantifiers are greedy by default
  The longest possible match is tried first
  In "a yyy c yyyyyyyy c yyy d", the pattern "a.*c.*d" induces a match of the first ".*" on " yyy c yyyyyyyy "
  Consider the pattern "a.*cx.*d" matching the string "a yyy cx yyyyyyyy cy yyy d"
  The first ".*" first extends to the substring " yyy cx yyyyyyyy " (up to the last "c"), finds no "x" after it, and then backtracking occurs until it matches only " yyy "
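Greedy behaviour can be compared with the non-greedy quantifier form (a ? placed after the quantifier), which is standard Perl syntax. The sketch below reuses the sample string from the text; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "a yyy c yyyyyyyy c yyy d";

my ($greedy) = $text =~ /a(.*)c/;     # .* takes as much as it can
my ($lazy)   = $text =~ /a(.*?)c/;    # .*? takes as little as it can

print "greedy: '$greedy'\n";
print "lazy:   '$lazy'\n";
```

The greedy capture runs up to the last "c" in the string, while the non-greedy one stops at the first.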

Regular Expression
 









Anchoring Patterns
Normally, the beginning of the pattern is shifted through the string from left to right
Anchoring ensures that parts of the pattern match with particular parts of the string
\b requires a word boundary at the given spot in order for the pattern to match
  A word boundary is the place between \w and \W, or between \w and the start or end of the string
  /car\b/ matches "car" but not "cars"
  /\bbarb/ matches "barb" and "barbara", but not "ebarb"
  /\bport\b/ matches "port", but neither "sport" nor "ports"
\B requires that there not be a word boundary
  /\bFred\B/ matches "Frederick" but not "Fred Flintstone"
The caret (^) matches at the beginning of the string
  /^a/ matches "a" if it is the first character of the string
  /a\^/ matches a literal "a^" anywhere in the string (the caret must be escaped)
The dollar sign ($) matches at the end of the string

Regular Expression
Selecting a different target



To match against a variable other than $_, use the =~ operator
The =~ operator applies the pattern on its right to the value on its left

Ex:
$a = "hello world";
$a =~ /^he/; # true
if (<STDIN> =~ /^[yY]/) { ... }
if (<STDIN> =~ /^y/i) { # i ignores case
  ...
}

Regular Expression
Using a different delimiter
If the regular expression contains "/", precede it with a "\"
  $path = <STDIN>;
  if ($path =~ /^\/usr\/etc/) { ... }

With a different delimiter
  if ($path =~ m#^/usr/etc#) { ... }

Regular Expression
Special read-only variables
$& is the part of the string that matches the regular expression
$` is the part of the string before the part that matched the regular expression
$' is the part of the string after the part that matched the regular expression
Ex:
$_ = "this is a sample string";
/sa.*le/; # matches "sample"
# $` is "this is a "
# $& is "sample"
# $' is " string"

Regular Expression
Using variable interpolation
Interpolation means that a pattern, or part of one, can be supplied through a variable.
Ex:
$what = "is";
$sentence = "This is good.";
if ($sentence =~ /\b$what\b/) { print "This sentence contains the word $what.\n"; }

Regular Expression
Special read-only variables
After a successful match, $1, $2, ... are set to the same values as \1, \2, ...
Ex:
1) $_ = "This is good";
   /(\w+)\W+(\w+)/; # matches first two words
   # $1 is "This" and $2 is "is"
2) $_ = "This is good";
   ($first,$second) = /(\w+)\W+(\w+)/;
   # $first is "This" and $second is "is"

Regular Expression
Substitutions


s/regular_expression/replacement_string/flags
Replaces the first occurrence
  $_ = "bill yyyyyyy fred"; s/y+/tom/; # $_ is now "bill tom fred"
Flag "g" substitutes all occurrences
  $_ = "foot fool buffoon"; s/foo/bar/g; # $_ is now "bart barl bufbarn"
Flag "i" makes the match case insensitive
Flag "m" treats the string as multiple lines (^ and $ match at line boundaries)
Flag "s" treats the string as a single line (. also matches newline)
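One pitfall with substitutions is a pattern that can match the empty string. The sketch below contrasts y* with y+ on the same input; y* succeeds immediately at the start of the string by matching nothing, so the replacement is inserted there. The variable names are illustrative.

```perl
use strict;
use warnings;

my $star = "bill yyyyyyy fred";
my $plus = "bill yyyyyyy fred";

$star =~ s/y*/tom/;    # y* matches the empty string at position 0
$plus =~ s/y+/tom/;    # y+ needs at least one "y"

print "$star\n";
print "$plus\n";
```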





  

Regular Expression
Substitutions
$_ = "hello world"; $new = "goodbye";
s/hello/$new/; # $_ is now "goodbye world"
$_ = "hello world";
s/Hello/$new/i; # $_ is now "goodbye world"
$_ = "this is a test";
s/(\w+)/<$1>/g; # $_ is now "<this> <is> <a> <test>"
$x[$j] =~ s/here/there/;
$d{"abc"} =~ s/there/here/;

Regular Expression
split( ) Operator
Takes as arguments a regular expression and a string, looks for all occurrences of the regular expression in the string, and returns the parts of the string that don't match the regular expression, in sequence, as a list of values
Ex:
$line = "67;bill;;/usr/bin;joe";
@fields = split(/;/,$line); # @fields is ("67","bill","","/usr/bin","joe")
@fields = split(/;+/,$line); # @fields is ("67","bill","/usr/bin","joe")
split(/regex/) is the same as split(/regex/,$_)
split with no arguments is the same as split(' ',$_), which splits $_ on whitespace and skips leading whitespace
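One detail worth knowing when splitting delimited records is that split( ) drops trailing empty fields unless a negative limit is passed as the third argument. A small sketch, using a made-up record with two empty fields at the end:

```perl
use strict;
use warnings;

# Hypothetical record: note the two empty fields at the end.
my $line = "67;bill;;/usr/bin;joe;;";

my @default = split(/;/, $line);        # trailing empty fields are removed
my @all     = split(/;/, $line, -1);    # negative limit keeps them

print scalar(@default), " fields by default\n";
print scalar(@all), " fields with limit -1\n";
```

The empty field in the middle of the record is kept in both cases; only trailing ones are affected.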

Regular Expression
join( ) Operator
Inverse of split( )
Takes a list of values and puts them together with a separator string between each pair of items
Ex:
# @fields is ("67","bill","","/usr/bin","joe")
$line = join(";",@fields); # $line is "67;bill;;/usr/bin;joe"

File Handles & Tests



Built-in filehandles
  1) STDIN [Input]
  2) STDOUT [Output]
  3) STDERR [Error]
Opening and closing a filehandle
  1) open(FILEHANDLE, "file_name") # for reading
  2) open(FILEHANDLE, ">file_name") # for writing
  3) open(FILEHANDLE, ">>file_name") # for appending
  4) close(FILEHANDLE) # closes file
  5) These operations return true or false, depending on whether or not the operation was successful
  6) A filehandle that hasn't been successfully opened can be read (you get the end-of-file at the start) or written to (data disappears)
  7) The die( ) operator prints out its arguments on STDERR and then ends the process
     a) unless (open(FILE, ">/x/y/data")) { die "File couldn't be opened\n"; }
     b) open(FILE, ">/x/data") || die "File couldn't be opened\n";

File Handles & Tests
Using file handles Program to copy from one file to another
@ARGV = ("Src.txt", "Dest.txt");
open(IN, "$ARGV[0]") || die "Cannot open $ARGV[0] for reading";
open(OUT, ">$ARGV[1]") || die "Cannot create $ARGV[1]";
my $i = 1;
while (<IN>) {
  print STDOUT "Copying Line ", $i++, ": ", $_;
  print OUT $_;
}
close(IN);
close(OUT);
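The same copy can be written with lexical filehandles and the three-argument form of open, which is the style generally preferred in current Perl. This is a sketch, not the slide author's program: copy_file is a hypothetical name, and the demonstration writes its throwaway files into a temporary directory created with the core File::Temp module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: copies $src to $dest line by line
# and returns the number of lines copied.
sub copy_file {
    my ($src, $dest) = @_;
    open(my $in,  '<', $src)  or die "Cannot open $src for reading: $!";
    open(my $out, '>', $dest) or die "Cannot create $dest: $!";
    my $count = 0;
    while (my $line = <$in>) {
        print {$out} $line;
        $count++;
    }
    close($in);
    close($out) or die "Cannot close $dest: $!";
    return $count;
}

# Demonstration with throwaway files in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
my ($src, $dest) = ("$dir/Src.txt", "$dir/Dest.txt");
open(my $fh, '>', $src) or die "Cannot create $src: $!";
print {$fh} "first\nsecond\n";
close($fh);

print copy_file($src, $dest), " lines copied\n";
```

Including $! in the die message reports the operating system's reason for the failure.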

File Handles & Tests
print "What file? ";
$filename = <STDIN>; chop($filename);
if (-r $filename && -w $filename) {
  # file exists and I can read and write it
}
Options
  -x  File is executable
  -e  File or directory exists
  -o  File or directory is owned by user
  -z  File exists and has zero size
  -s  File or directory exists and has nonzero size; return value is the size in bytes
  -f  Entry is a file
  -d  Entry is a directory
stat(FH) returns a 13-element list, where (stat(FH))[7] is the size of the file in bytes. FH is a filehandle.

Built in Variables

  

PERL is rich in built-in variables, each having a special meaning.
They can be referred to either by symbol or by word. The default is the symbol.
  use English; lets the programmer refer to built-in variables by their word names
Some of the variables are read only
Some are used as filehandles (STDIN, STDOUT, STDERR, ARGV)
Variables are of type Scalar, Array or Hash

What is PERL Module







 

A Perl module is a self-contained piece of Perl code that can be used by a Perl program (or by other Perl modules)
A Perl module file has the extension .pm
Each Perl module has a unique name.
Perl provides a hierarchical name space for modules, similar to the name space for Java classes.
Components of a module name are separated by double colons "::", e.g. IEPM::PingER
Perl searches for the module file in the directories listed in @INC and records each loaded file in %INC.
It is conceptually similar to:
  a C link library
  a C++/Java class

Using PERL Module
do, use, require are used to include a Perl module.
A module can be included directly or through a lib path
  use [lib] <path/filename>

Ex: Module is located at: /username/modules/Foo/Module.pm
  use lib '/username/modules';
  use Foo::Module;
OR
  use username::modules::Foo::Module; # found only if the directory containing username is listed in @INC

Difference Between Use & Require
Use
  Operates at compile time
  Can be used only with modules
  Internally, use calls both require and a function called import()
  use Cwd; # import names from Cwd::
  $here = getcwd();

Require
  Operates at runtime
  Can be used with modules and with plain Perl files
  No import() call is made
  require Cwd; # make Cwd:: accessible
  $here = Cwd::getcwd();
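Because require runs at runtime, it can be used to load a module only when it is actually installed. The sketch below assumes nothing beyond core Perl; module_available is a hypothetical helper name introduced for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($name) = @_;
    (my $file = "$name.pm") =~ s{::}{/}g;    # Foo::Module becomes Foo/Module.pm
    return eval { require $file; 1 } ? 1 : 0;
}

if (module_available('Cwd')) {
    my $here = Cwd::getcwd();    # fully qualified: nothing was imported
    print "current directory: $here\n";
}
```

The eval block traps the error that require raises when the file cannot be found in @INC.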

Applications of PERL
As the name (Practical Extraction and Report Language) suggests, PERL is mainly used for text manipulation.
Used in Telecom Billing for processing CDR and IPDR files. Ex: Kenan-ARBOR, ADC-Singl.eView
Used to develop web pages along with CGI.
Easy I/O operations, extended regular expressions, IPC, networking, representation of complex data structures, OOP, database programming etc. are the key areas for using PERL.
Used to convert flat files into database tables. Ex: Collection, Rating and Payment System of Singl.eView
Used in most migration projects to convert data from one application/version to another by flat file or XML parsing.


Case Study
Project Highlights: Data coming from the various NMS and EMS is to be reconciled into the Inventory Data Base.
Products Involved:
  NMS: Alcatel 1354 SH, Marconi 38 and Tellabs 8100
  EMS: Alcatel 1353 RM, Marconi 36 and Tellabs 6300
  Inventory Product: NetCracker
Initial Approach: Parsing in Java was planned.
Issues:
  1) Alcatel NMS, EMS and Marconi EMS provide data in ASCII in a complex structure. The others provide data in XML.
  2) Parsing this complex data in Java would be difficult.
Proposed Approach: Parsing is done in PERL by implementing concrete rules.
Results:
  1) Analysis, design and coding effort can be reduced
  2) Accurate and effective parsing can be done

Thank You


				