CS 360

Perl Part 2
                                    Remember
 Assignments are Due Today
    Make a web page for all of your assignments
    You get to design it so it works well for you
    Email the TA with the URL
 Lab is due Monday
    Clean out your directory
    We are asking for more space
                                  Example
 dna1.txt has the following in it.
   hopper acgtacacactgca
   flick acgcacacattgca
   spot acgcacaccttgca
 How would you write a perl script to read
  this file in and print it (assume that the
  command line has the file name in it)
             Whoa!! What about the
                     command line
 Command-line arguments arrive in @ARGV (@_ holds subroutine arguments instead)
     #!/usr/bin/perl
     print "ARGV=",@ARGV, "\n";
     print "ARGV[0]=",$ARGV[0], "\n";
     print "ARGV[1]=",$ARGV[1], "\n";
     open(DNAFILE, $ARGV[0]) or die "can't open $ARGV[0]: $!";
     while(<DNAFILE>) {
           print "line $_\n";
     }
          Now search for the taxa
                        “hopper”
Whoa!! How do I search!!
       print "line $_\n";
      @words = split /\s+/, $_;
      print "$words[0] \n";
      print "$words[1] \n";
      if($words[0] eq "hopper") {
            print "found hopper: dna $words[1]\n";
      } else {
            print "not hopper\n";
      }
                Perl Regular Expressions
 Extremely powerful text processing.
 One of Perl's most useful yet most misunderstood features
 ‘=~’ indicates a regexp match
    if ($var =~ /BLAH/) – Matches BLAH anywhere in the string
    if ($var =~ /^BLAH/) – Matches at the start of the string
    if ($var =~ /BLAH$/) – Matches at the end of the string
    if ($var =~ /\w+/) – One or more word characters
          \w - Word character (letter, digit, or underscore)
          \d - Digit
          \s - Whitespace
          . - Any character except newline
reg.pl
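 A small sketch of the anchors and character classes listed above, applied to one line of dna1.txt. The variable names are assumptions, and the capturing parentheses in the last match are an addition not shown in the list.

```perl
use strict;
use warnings;

my $line = "hopper acgtacacactgca";

my $starts_with_hop = ($line =~ /^hop/) ? 1 : 0;   # anchored at the start
my $ends_with_gca   = ($line =~ /gca$/) ? 1 : 0;   # anchored at the end
my $has_digit       = ($line =~ /\d/)   ? 1 : 0;   # this line has no digits

# Parentheses capture: a word, some whitespace, then another word.
my ($name, $dna) = $line =~ /^(\w+)\s+(\w+)$/;

print "starts with hop: $starts_with_hop\n";
print "ends with gca: $ends_with_gca\n";
print "has a digit: $has_digit\n";
print "name $name, dna $dna\n";
```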
             Now search for the taxa
                           “hopper”
Whoa!! How do I search!!
        print "line $_\n";
      @words = split /\s+/, $_;
      print "$words[0] \n";
      print "$words[1] \n";
      if($words[0] =~ /^hop|^fli/) {
            print "found hopper or flick: dna $words[1]\n";
      } else {
            print "not either\n";
      }
                Perl – Everything else
 Supplied Perl Docs
    man perlfunc
    man perlfaq
    man perlsyn
    man perlre (fun)


 The Perl Bible
    “Perl In a Nutshell” – O’Reilly
          Now for some Magic
   $dna = $words[1];
   print "dna $dna \n";
   $reversed = scalar reverse $dna;
   print "reversed dna $reversed \n";
   $rna = $reversed;
   $rna =~ tr/acgt/UGCA/;
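 The same steps can be wrapped in a subroutine. This is a sketch only; the name reverse_complement_rna is an assumption. reverse must be called in scalar context to reverse a string, and tr replaces each base with its RNA complement.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse the DNA string, then map each base
# to its RNA complement (a->U, c->G, g->C, t->A).
sub reverse_complement_rna {
    my ($dna) = @_;
    my $rna = scalar reverse $dna;   # scalar context reverses the characters
    $rna =~ tr/acgt/UGCA/;
    return $rna;
}

print reverse_complement_rna("aacg"), "\n";   # prints CGUU
```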
                                            Hash Tables
 Look up values by a string key instead of a numeric index
# Hashes
my %organisms = (
  grasshoppers => "gh",
  fleas => "fs",
  lobster => "lb",
);

my @names = keys %organisms;
my @abbrev = values %organisms;

print "names ", $names[1], ", abbrev ", $abbrev[1], "\n";
print "fleas = ", $organisms{"fleas"}, "\n";
print %organisms, "\n";
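 A short sketch of common hash operations, reusing the %organisms data. keys returns the names in no guaranteed order, so the loop sorts them; the "ants" entry is made up for the example.

```perl
use strict;
use warnings;

my %organisms = (
    grasshoppers => "gh",
    fleas        => "fs",
    lobster      => "lb",
);

# Sort the keys so the output order is stable.
for my $name (sort keys %organisms) {
    print "$name => $organisms{$name}\n";
}

# exists tests for a key without creating it.
my $has_fleas = exists($organisms{fleas}) ? 1 : 0;
my $has_ants  = exists($organisms{ants})  ? 1 : 0;

# Assigning to a new key adds an entry.
$organisms{ants} = "an";
my $count = scalar keys %organisms;
print "fleas? $has_fleas, ants before insert? $has_ants, entries now $count\n";
```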
                  Nucleotide Translation
   $codonMap{"gct"}="A"; #Alanine
   $codonMap{"tgt"}="C"; #Cysteine
   $codonMap{"gcc"}="A"; #Alanine
   $codonMap{"tgc"}="C"; #Cysteine

 $mrna="gcttgtgcctgc";

 my $pro="";
 while ($mrna=~s/(...)//) { # Dots match any character
   print "codon $1\n";
   $pro=$pro.$codonMap{$1};
   print "pro $pro\n";
 }
            Now for the fun stuff !!
 cd ~/public_html
 vi hits.cgi
 :r ~clement/public_html/hits.cgi
 :wq
 chmod a+x hits.cgi
 Open in a web browser
    http://students.cs.byu.edu/~you/hits.cgi
                         Why use subroutines?
 They will make your program
    Shorter, since you're reusing the code.
    Easier to test, since you can test the subroutine separately.
    Easier to understand, since it reduces clutter and better organizes
      programs.
     More reliable, since you have less code when you reuse
      subroutines, so there are fewer opportunities for something to go
      wrong.
     Faster to write, since you may, for example, have already written
      some subroutines that handle basic statistics and can just call the
      one that calculates the mean without having to write it again. Or
      better yet, you found a good statistics library someone else
      wrote, and you never had to write it at all.
                                                   Description
 Like many languages, Perl provides for user-defined subroutines.
 The Perl model for subroutine call and return values is simple: all
   subroutines are passed as parameters one single flat list of scalars,
   and all subroutines likewise return to their caller one single flat list of
   scalars.
 Any arguments passed to the subroutine come in as the array @_.
 The return value of the subroutine is the value of the last expression
   evaluated. Alternatively, a return statement may be used to exit the
   subroutine
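 A minimal sketch of both points: two arrays passed together arrive as one flat list in @_, and a subroutine without a return statement returns its last evaluated expression. The subroutine names are assumptions.

```perl
use strict;
use warnings;

# Explicit return: the number of arguments received.
sub count_args {
    return scalar @_;
}

sub mean {
    my $sum = 0;
    $sum += $_ for @_;
    # No return statement: the value of this last expression is returned.
    @_ ? $sum / scalar(@_) : 0;
}

my @a = (1, 2);
my @b = (3, 4, 5);

my $n = count_args(@a, @b);   # the two arrays flatten into five scalars
my $m = mean(@a, @b);
print "received $n arguments, mean $m\n";   # prints: received 5 arguments, mean 3
```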
            To declare subroutines
sub NAME;
# A "forward" declaration.

  sub NAME(PROTO);
# ditto, but with prototypes

  sub NAME BLOCK
# A declaration and a definition.

  sub NAME(PROTO) BLOCK
# ditto, but with prototypes
                             To call subroutines
NAME(LIST);            OR            &NAME(LIST);
# & is optional with parentheses.

  NAME LIST;
# Parentheses optional if predeclared/imported.

  &NAME;
# Makes current @_ visible to called subroutine.
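 A sketch of the difference between &NAME; and NAME(); when called from inside another subroutine. The subroutine names are made up for the example.

```perl
use strict;
use warnings;

sub count_args {
    return scalar @_;
}

# &NAME; with no parentheses hands the caller's current @_ to count_args.
sub forward_all {
    return &count_args;
}

# NAME() with empty parentheses passes an empty argument list.
sub forward_none {
    return count_args();
}

my $shared = forward_all('a', 'b', 'c');
my $fresh  = forward_none('a', 'b', 'c');
print "shared \@_: $shared, fresh \@_: $fresh\n";   # prints: shared @_: 3, fresh @_: 0
```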
                        Example (similarity.pl)
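sub percent_identity {
     my ($seq1, $seq2) = @_;     # copy the arguments into private variables
     my $len1 = length $seq1;
     my $len2 = length $seq2;
     my $num_mismatches = 0;
     # Compare position by position along the first sequence
     for my $i (0..$len1-1) {
           if (substr($seq1, $i,1) ne substr($seq2, $i, 1)) {
                 $num_mismatches++;
           }
     }
     # Extra bases in a longer second sequence count as mismatches
     if($len2 > $len1) {
           $num_mismatches += ($len2-$len1);
     }
     return (($len1-$num_mismatches)*100/$len1);
}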

$seq1="acctgaatg";
$seq2="atcgtgagtg";
print "percent identity = ". percent_identity($seq1, $seq2) . "\n";
                       Scoping and Arguments
 The variables declared with my belong only to the block in which
   they are declared. In our example, $DNA is visible only inside the
   subroutine.
     You don’t have to worry about name conflicts outside the subroutine
     You don’t have to worry about accidentally changing the values of
       variables elsewhere in the program
 All the arguments passed to the subroutine are stored in the array
   @_. You can access parameters with
     my $DNA = $_[0];
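 A small sketch of my scoping: the $DNA declared inside the subroutine is a separate variable from the $DNA in the main program. The subroutine name lowercase_copy is an assumption.

```perl
use strict;
use warnings;

my $DNA = "ACGT";

sub lowercase_copy {
    my $DNA = lc $_[0];   # a new $DNA, visible only inside this block
    return $DNA;
}

my $lower = lowercase_copy($DNA);
print "outer: $DNA, result: $lower\n";   # prints: outer: ACGT, result: acgt
```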
                       Another simple example
#!/usr/bin/perl -w
# Counting the number of G's in some DNA
my($DNA) = "ACGAGCTGCGAGGCGACTAGCGAGCTAGCGATCAGCTA";
# Call the routine that does the real work and collect the result.
my($number_of_Gs) = countG($DNA);
print "\nThe DNA sequence $DNA has $number_of_Gs G\'s in it.\n\n";
exit;
#########################################################
# Subroutines
#########################################################
sub countG {
   my($DNA) = @_;
   my($count) = 0;
   $count = ($DNA =~ tr/Gg//);
   return $count;
}
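                                           Pass by value
 When a subroutine copies @_ into its own variables, as in
  my($DNA) = @_, it works on copies of the arguments.
 Whatever happens to those copies in the subroutine doesn't affect the
  values of the arguments in the main program
 The elements of @_ themselves are aliases for the caller's arguments,
  so assigning to $_[0] directly would change the caller's variable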
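 A sketch contrasting the two cases. Copying @_ into a my variable leaves the caller's value alone, while writing to $_[0] changes it, because the elements of @_ are aliases for the caller's arguments. The subroutine names are assumptions.

```perl
use strict;
use warnings;

# Works on a copy: the caller's variable is not touched.
sub trim_copy {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

# Works on $_[0] directly: the caller's variable is modified.
sub trim_in_place {
    $_[0] =~ s/^\s+|\s+$//g;
    return;
}

my $raw   = "  acgt  ";
my $clean = trim_copy($raw);
my $raw_after_copy = $raw;      # still has the surrounding spaces

trim_in_place($raw);            # now $raw itself is trimmed
print "[$raw_after_copy] [$clean] [$raw]\n";   # prints: [  acgt  ] [acgt] [acgt]
```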
  Pass-by-reference (reference.pl)
#!/usr/bin/perl
# Example of pass-by-reference (a.k.a. call-by-reference)
use strict;
use warnings;
my @i = ('1', '2', '3');
my @j = ('a', 'b', 'c');
reference_sub(\@i, \@j);
print "In main program after calling subroutine: i = " . "@i\n";
print "In main program after calling subroutine: j = " . "@j\n";
exit;
############################################################
# Subroutine
############################################################
sub reference_sub {
   my ($i, $j) = @_;
   print "In subroutine : i = " . "@$i\n";
   print "In subroutine : j = " . "@$j\n";
   # push and shift are built-in functions on arrays
   push(@$i, '4');
   shift(@$j);
}

In subroutine : i = 1 2 3
In subroutine : j = a b c
In main program after calling subroutine: i = 1 2 3 4
In main program after calling subroutine: j = b c
                                   Pass-by-reference
 To pass a parameter by reference, you have to preface the name of
  the parameter with a backslash.
 \@i is a reference to array @i.
 In the subroutine, $i gets the value of \@i. So it is also a reference to
  array @i.
 When argument variables are passed in this fashion, anything you do
  to the values of the argument variables in the subroutine also affects
  the values of the arguments in the main program.
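 A short sketch of the dereferencing syntax involved. The names append_base and @bases are made up; @$ref, $ref->[0] and ${$ref}[1] all reach the original array through the reference.

```perl
use strict;
use warnings;

# Hypothetical helper: append one element through an array reference
# and return the new length.
sub append_base {
    my ($seq_ref, $base) = @_;
    push @$seq_ref, $base;        # modifies the caller's array
    return scalar @$seq_ref;
}

my @bases  = qw(a c g);
my $length = append_base(\@bases, 't');

my $ref    = \@bases;
my $first  = $ref->[0];           # arrow form
my $second = ${$ref}[1];          # block form, same array
print "length $length, first $first, second $second, all @bases\n";
# prints: length 4, first a, second c, all a c g t
```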
                                      Arrays
  Two Dimensional Arrays
#!/usr/bin/perl
$gap = -2;
$st1="acgtactacg";
$st2="acctaccacgt";
$n1=length($st1);
$n2=length($st2);

# Set the last column to 0; Perl creates each row of @M the first time it is assigned
for(my $i = $n1-1; $i >= 0; $i--) {
  $M[$i][$n2-1] = 0;
}
               Accessing the Matrix

sub printmatrix {
print "n1 $n1, n2 $n2\n";
  for(my $i = 0; $i < $n1; $i++) {
    for(my $j = 0; $j < $n2; $j++) {
      print "M[$i][$j]= $M[$i][$j],";
    }
    print "\n";
  }
}
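 A minimal sketch of a fully initialized two-dimensional array, assuming a small 3 by 4 matrix rather than the sequence lengths used above. Each row of @M is a reference to an inner array, which is why a row can be dereferenced with @$row.

```perl
use strict;
use warnings;

my ($rows, $cols) = (3, 4);
my @M;

# Fill every cell; rows are created automatically on first assignment.
for my $i (0 .. $rows - 1) {
    for my $j (0 .. $cols - 1) {
        $M[$i][$j] = 0;
    }
}

$M[1][2] = -2;    # set a single cell

# Each element of @M is a reference to one row.
for my $row (@M) {
    print join(",", @$row), "\n";
}

my $row_count = scalar @M;
my $col_count = scalar @{ $M[0] };
```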
                       Perl Modules
 Allow you to make your code modular
 Create an Object Oriented interface
 Separate your code into separate files
 so changes won't be made to working
 code
                                                Modules
 Similar idea to libraries in C.
    use CGI;


 Useful Modules
    CGI – CGI routines.
    DBI – Database Connectivity.
    strict – Makes you code all proper like.
    Data::Dumper – Debugging large objects.
    XML::Simple – Simple XML Parsing.


 Always ‘use strict’!
                 What is a Module?
 A module is a .pm file that defines a
  library of related functions
 Modules are conceptually similar to
  old-fashioned Perl libraries (.pl files),
  but have a cleaner implementation
  selective namespace cluttering
  simpler function invocation
                   Example (pasture1.pl)
sub Cow::speak {
  print "a Cow goes moooo!\n";
}
sub Horse::speak {
  print "a Horse goes neigh!\n";
}
sub Sheep::speak {
  print "a Sheep goes baaaah!\n"
}

@pasture = qw(Cow Cow Horse Sheep Sheep);
  foreach $animal (@pasture) {
   $animal->speak;
}
                                   Arguments

Class->method(@args)

attempts to invoke subroutine "Class::method" as:

Class::method("Class", @args);
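 A sketch showing that the arrow call and the plain call receive the same argument list. The method name describe and its arguments are made up for the example.

```perl
use strict;
use warnings;

{
    package Sheep;
    # The class name arrives as the first argument.
    sub describe {
        my ($class, @args) = @_;
        return "$class got (@args)";
    }
}

my $via_arrow = Sheep->describe("wool", 3);
my $via_plain = Sheep::describe("Sheep", "wool", 3);
print "$via_arrow\n";   # prints: Sheep got (wool 3)
print "$via_plain\n";   # prints: Sheep got (wool 3)
```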
                                   Simplifying
sub Sheep::speak {
       my $class = shift;
       print "a $class goes baaaah!\n";
     }
                            A second method
                                (pasture2.pl)
  { package Cow;
     sub sound { "moooo" }
     sub speak {
       my $class = shift;
       print "a $class goes ", $class->sound, "!\n"
     }
  }

@pasture = qw(Cow Cow Horse Sheep Sheep);
  foreach $animal (@pasture) {
   $animal->speak;
}
              Inheritance (pasture3.pl)
  { package Animal;
         sub speak {
           my $class = shift;
           print "a $class goes ", $class->sound, "!\n"
         }
   }
   { package Cow;
           @ISA = qw(Animal);
           sub sound { "moooo" }
   }
 On $animal->speak, Perl looks for "Cow::speak". But that’s not
there, so Perl checks for the inheritance array @Cow::ISA. It’s
there, and contains the single name "Animal".
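 A runnable sketch of that lookup, assuming speak returns its text instead of printing it so the result can be inspected. isa and can are methods every class inherits from UNIVERSAL.

```perl
use strict;
use warnings;

{
    package Animal;
    sub speak {
        my $class = shift;
        return "a $class goes " . $class->sound . "!";
    }
}
{
    package Cow;
    our @ISA = ('Animal');   # Cow inherits from Animal
    sub sound { "moooo" }
}

my $said      = Cow->speak;                    # found in Animal via @Cow::ISA
my $is_animal = Cow->isa('Animal') ? 1 : 0;
my $can_speak = Cow->can('speak')  ? 1 : 0;
print "$said\n";   # prints: a Cow goes moooo!
```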
              Overriding (pasture4.pl)
{ package Mouse;
   @ISA = qw(Animal);
   sub sound { "squeak" }
   sub speak {
     my $class = shift;
     print "a $class goes ", $class->sound, "!\n";
     print "[but you can barely hear it!]\n";
   }
}
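 The override above repeats the body of Animal::speak. One way to avoid the duplication, sketched here under the assumption that speak returns its text rather than printing it, is to call the inherited version through SUPER::.

```perl
use strict;
use warnings;

{
    package Animal;
    sub speak {
        my $class = shift;
        return "a $class goes " . $class->sound . "!\n";
    }
}
{
    package Mouse;
    our @ISA = ('Animal');
    sub sound { "squeak" }
    sub speak {
        my $class = shift;
        # SUPER:: starts the method search in Mouse's parent classes.
        return $class->SUPER::speak() . "[but you can barely hear it!]\n";
    }
}

my $text = Mouse->speak;
print $text;
```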
             How to use a Module
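 test.pl
use Foo;
Foo::bar();    # fully qualified call
bar();         # also works, because Foo exports bar
 Foo.pm
package Foo;
use Exporter 'import';    # supplies the import method that honors @EXPORT
our @EXPORT = qw(bar);

sub bar {
print "hello\n";
}

1;    # a module file should end with a true value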
                Package Names and
                         Filenames
 Package name is declared on line 1
 This should be the same as the
  filename, without the .pm extension
 If it is different, your functions will not
  be exported correctly
 Should begin with a capital letter to
  avoid possible conflict with pragmas
                             Summary
 Modules are libraries of functions
 A simple module just exports a set of
  functions
 Perl modules can be expanded in many
  directions for arbitrarily sophisticated
  libraries

				