Programming and Perl for Bioinformatics Part V by bzs12927


         Programming and Perl for Bioinformatics

                        Part V
                References and Objects
                         What Are References?

           A reference is a (starting) address of a memory
           block that stores some data; also called a pointer




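               Memory address     Contents
                    ...             ...
                   010010            G
                   010011            A
                   010100            T
                   010101            C
                    ...             ...

           A reference holding the address 010010 points to the start
           of this block.

  $str_ref  = \$string_1;      # reference to a scalar
  $list_ref = \@list_1;        # reference to an array
  $hash_ref = \%hash_1;        # reference to a hash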
2/1/2010                                                        3
                  What Good Are References?
           An array of arrays (can do the job of a 2-dimensional
           matrix):
           Spot_num   Ch1-BKGD   Ch1     Ch2-BKGD   Ch2
           000        0.124      43.2    0.102      80.4
           001        0.113      60.7    0.091      22.6
           002        0.084      112.2   0.144      35.3

              Code:
           my @spotarray = ( [0.124, 43.2, 0.102, 80.4],
                             [0.113, 60.7, 0.091, 22.6],
                             [0.084, 112.2, 0.144, 35.3]);
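
A minimal sketch of how such an array of arrays is read back, assuming the column order shown in the table (Ch1-BKGD, Ch1, Ch2-BKGD, Ch2). The variable names $ch1_spot1, $rows and $cols are only illustrative.

```perl
use strict;
use warnings;

# Each row is an anonymous array: [Ch1-BKGD, Ch1, Ch2-BKGD, Ch2]
my @spotarray = ( [0.124, 43.2,  0.102, 80.4],
                  [0.113, 60.7,  0.091, 22.6],
                  [0.084, 112.2, 0.144, 35.3] );

# $spotarray[1] is a reference to the second row; the second [1]
# picks the Ch1 value out of that row.
my $ch1_spot1 = $spotarray[1][1];
print "Spot 001 Ch1 = $ch1_spot1\n";      # Spot 001 Ch1 = 60.7

# Number of rows, and number of columns in the first row.
my $rows = scalar @spotarray;
my $cols = scalar @{ $spotarray[0] };
print "$rows rows x $cols columns\n";     # 3 rows x 4 columns
```

Rows and columns are both counted from 0, so $spotarray[1][1] is the second row, second column.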


2/1/2010                    Perl in a Day - Subroutines          4
                   What Good Are References?
          A hash of arrays:
   Accession   Ch1-BKGD   Ch1    Ch2-BKGD   Ch2
   AW10021     0.124      43.2   0.102      80.4
   BE52002     0.113      60.7   0.091      22.6
   W20209      0.0841     12.2   0.144      35.3

          Code:
   my %spothash = ('AW10021' => [0.124, 43.2, 0.102, 80.4],
                   'BE52002' => [0.113, 60.7, 0.091, 22.6],
                   'W20209' => [0.0841, 12.2, 0.144, 35.3]
                  );

     Hashes of hashes and other, more complex data structures
     are built the same way, out of references.
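
As an illustration of a hash of hashes, the same spot data could be keyed by column name instead of position. This is only a sketch: the inner keys (ch1_bkgd, ch1, ch2_bkgd, ch2) are names made up for this example.

```perl
use strict;
use warnings;

# Hash of hashes: accession => { column name => value }.
# The inner key names are assumptions chosen for this sketch.
my %spots = (
    'AW10021' => { ch1_bkgd => 0.124, ch1 => 43.2, ch2_bkgd => 0.102, ch2 => 80.4 },
    'BE52002' => { ch1_bkgd => 0.113, ch1 => 60.7, ch2_bkgd => 0.091, ch2 => 22.6 },
);

# Outer key selects the spot, inner key selects the column.
my $ch2 = $spots{'AW10021'}{ch2};
print "AW10021 Ch2 = $ch2\n";             # AW10021 Ch2 = 80.4

# Walk over all spots and subtract the background from channel 1.
for my $acc (sort keys %spots) {
    my $net = $spots{$acc}{ch1} - $spots{$acc}{ch1_bkgd};
    printf "%s background-corrected Ch1 = %.3f\n", $acc, $net;
}
```

Named columns make the code easier to read than positional indexes, at the cost of a little more typing when the data is entered.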
                     What Is A Reference?
            @y = ( 1, 'a', 2.3 );
            $ref_to_y = \@y;

                                                1         'a'   2.3
                                                         @y

            print @y yields "1a2.3"
            print $ref_to_y yields something like
                       "ARRAY(0x80cd6ac)" (the type and the memory address)

           Getting At The Value: "de-referencing"
              Using a block:
                  @{$array_reference}
                  %{$hash_reference}
                  ${$scalar_reference}

           print @{$ref_to_y} yields "1a2.3".
              Or without the braces:
                  @x = @$ref_to_y;
                  $foo = "two humps";
                  $scalar_ref = \$foo;
                  $camel_model = $$scalar_ref;   # is now "two humps"
                  push(@$array_ref, $filename);
                  $$hash_ref{"KEY"} = "VALUE";
           Getting At The Value: “de-referencing”

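              Using the Arrow Operator (three equivalent forms):
                  $$array_ref[0]   = 1;
                  ${$array_ref}[0] = 1;
                  $array_ref->[0]  = 1;
           Note that $array[3] and $array->[3] are NOT the same:
           the first is an element of the array @array, the second is an
           element of the array that the scalar $array refers to.
           my %hash_copy  = %{$hash_ref};
           my $hash_value = ${$hash_ref}{'some_key'};
           $hash_value    = $hash_ref->{'some_key'};   # same value, arrow form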
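
The following sketch puts the three dereferencing styles side by side and shows that writing through a reference changes the original variable. The data (@bases, %count) is invented for the example.

```perl
use strict;
use warnings;

my @bases    = ('G', 'A', 'T', 'C');
my %count    = (G => 2, A => 1);
my $arr_ref  = \@bases;
my $hash_ref = \%count;

# Three spellings of "element 0 of the array that $arr_ref points to".
my @same = ( ${$arr_ref}[0], $$arr_ref[0], $arr_ref->[0] );
print "@same\n";                       # G G G

# Writing through a reference changes the original variable.
$arr_ref->[0]   = 'U';
$hash_ref->{A} += 1;
print "$bases[0] $count{A}\n";         # U 2

# Dereferencing the whole structure gives a copy of its contents.
my @copy      = @{$arr_ref};
my %hash_copy = %{$hash_ref};
print scalar(@copy), " elements, ", scalar(keys %hash_copy), " keys\n";
```

The arrow form is usually the easiest to read once structures are nested more than one level deep.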


           Getting At The Value: "de-referencing"
      Reference to a subroutine:
               $my_cool_sub = \&subroutine;

          Dereference:
       my $result = &{$my_cool_sub}($arg1, $arg2);
                            # invoke the referenced subroutine with two
                            # arguments, using the "block" form
       $result = &$my_cool_sub($arg1, $arg2);
                            # or without the braces
       $result = $my_cool_sub->($arg1, $arg2);
                            # or using the "arrow"
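
One common use of subroutine references is a table that maps names to actions. The sketch below assumes two small helper subs, gc_count and seq_len, which are hypothetical names written for this example.

```perl
use strict;
use warnings;

# Two small helper subs (hypothetical, written for this sketch).
sub gc_count { my ($dna) = @_; my $n = ($dna =~ tr/GCgc//); return $n; }
sub seq_len  { my ($dna) = @_; return length $dna; }

# A hash whose values are subroutine references.
my %action = (
    gc     => \&gc_count,
    length => \&seq_len,
    upper  => sub { return uc $_[0] },   # anonymous subroutine
);

my $dna = 'gattcgattcc';
for my $name (sort keys %action) {
    my $result = $action{$name}->($dna);  # call through the reference
    print "$name: $result\n";
}
# prints:
#   gc: 5
#   length: 11
#   upper: GATTCGATTCC
```

New actions can be added by inserting another key, without touching the loop that calls them.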
           Getting At The Value: "de-referencing"
            @y = ( 1, 'a', 2.3 );
            $ref_to_y = \@y;

                                  1      'a'   2.3
                                      @y

            print @y yields "1a2.3"
            print $ref_to_y yields something like
                       "ARRAY(0x80cd6ac)" (the type and the memory address)

           Getting At The Value: "de-referencing"

      $y[3] = 'z';
      print @{$ref_to_y};    # yields "1a2.3z"
      @y = (5, 6, 7);
      print @{$ref_to_y};    # yields "567"

      Why?

            $ref_to_y holds the address of @y itself, not a copy of
            its contents, so every change made to @y is visible
            through the reference.
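
To make the behaviour concrete, this sketch compares a reference to @y with an ordinary copy of @y taken at the same moment. The names @copy_y, $via_ref and $via_copy are introduced only for the illustration.

```perl
use strict;
use warnings;

my @y        = (1, 'a', 2.3);
my $ref_to_y = \@y;      # points at @y itself
my @copy_y   = @y;       # independent copy of the current contents

@y = (5, 6, 7);          # new contents, but still the same array in memory

my $via_ref  = join '', @{$ref_to_y};
my $via_copy = join '', @copy_y;
print "via reference: $via_ref\n";     # via reference: 567
print "via copy:      $via_copy\n";    # via copy:      1a2.3

# Comparing two references numerically compares their addresses.
my $same = ($ref_to_y == \@y) ? 'yes' : 'no';
print "same array? $same\n";           # same array? yes
```

If the later changes should not show through, copy the array instead of taking a reference to it.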




              Making References To Arbitrary Values
                         From Scratch
      Anonymous hashes or arrays:
           $y_gene_families =
                      ['DAZ', 'TSPY', 'RBMY', 'CDY1', 'CDY2'];
                            # "[" and "]" instead of "(" and ")"

           $y_gene_family_counts = { 'DAZ'  => 4,
                                     'TSPY' => 20,
                                     'RBMY' => 10,
                                     'CDY2' => 2
                                   };
                            # "{" and "}" instead of "(" and ")"

      $y_gene_families gets a reference to an array, and
      $y_gene_family_counts gets a reference to a hash.
             Making References To Arbitrary Values
                        From Scratch
      for (keys %{$y_gene_family_counts})
           { print "$_\n" }
      my @a = @{$y_gene_families};

      ${$y_gene_families}[0];             # 'DAZ'

      ${$y_gene_family_counts}{'DAZ'};    # 4

      Arrow shorthand:
           $y_gene_families->[0];            # yields 'DAZ'
           $y_gene_family_counts->{'DAZ'};   # yields 4




                        New Function: ref

      ref - What kind of value does this reference point to?
      print ref($y_gene_families), "\n";
       ARRAY
      print ref($y_gene_family_counts), "\n";
       HASH
      $x = 1; print ref($x), "\n";
       (empty string)   # ref returns the empty string if its argument is not a reference

      Common return values: SCALAR, ARRAY, HASH, CODE
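
A typical use of ref is to let one subroutine accept several kinds of argument. The sketch below assumes a hypothetical helper, describe(), that reports what it was given.

```perl
use strict;
use warnings;

# describe() is a hypothetical helper written for this sketch.
sub describe {
    my ($thing) = @_;
    my $type = ref $thing;      # '' for a non-reference
    return "plain scalar: $thing"                         if $type eq '';
    return 'array with ' . scalar(@$thing) . ' items'     if $type eq 'ARRAY';
    return 'hash with ' . scalar(keys %$thing) . ' keys'  if $type eq 'HASH';
    return 'subroutine'                                   if $type eq 'CODE';
    return "other reference type: $type";
}

print describe(42), "\n";                          # plain scalar: 42
print describe(['DAZ', 'TSPY']), "\n";             # array with 2 items
print describe({ DAZ => 4, TSPY => 20 }), "\n";    # hash with 2 keys
print describe(sub { 1 }), "\n";                   # subroutine
print describe(\'gattc'), "\n";                    # other reference type: SCALAR
```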


      Two-Dimensional Arrays: Matrices
   @probes = (       [1,   3,   2,   9],
                      [2,   0,   8,   1],
                      [5,   4,   6,   7],
                      [1,   9,   2,   8] );
   print "The probe at row 1, column 2 has value ", $probes[1][2],"\n";
         # It prints: The probe at row 1, column 2 has value 8

   $probes_ref = [         [1,   3,   2,   9],
                            [2,   0,   8,   1],
                            [5,   4,   6,   7],
                            [1,   9,   2,   8] ];
   print "The probe at row 1, column 2 has value ", $probes_ref->[1][2], "\n";
         # It prints: The probe at row 1, column 2 has value 8
         # (rows and columns are counted from 0)

$probes_ref->[1][2] is shorthand for $probes_ref->[1]->[2];
it can also be written as $$probes_ref[1][2].
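
Beyond single elements, a matrix stored this way is usually processed row by row. A small sketch using the same probe values; the names @row_sums and @col2 are just for illustration.

```perl
use strict;
use warnings;

my $probes_ref = [ [1, 3, 2, 9],
                   [2, 0, 8, 1],
                   [5, 4, 6, 7],
                   [1, 9, 2, 8] ];

# Each element of the outer array is a reference to one row.
my @row_sums;
for my $row (@$probes_ref) {
    my $sum = 0;
    $sum += $_ for @$row;       # dereference the row to loop over it
    push @row_sums, $sum;
}
print "Row sums: @row_sums\n";  # Row sums: 15 11 22 20

# A column is element N of every row.
my @col2 = map { $_->[2] } @$probes_ref;
print "Column 2: @col2\n";      # Column 2: 2 8 6 2
```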
                     Complex Data Structure
   $gene = [
         # hash of basic information about the gene: name, discoverer,
         # discovery date and laboratory.
         {
              name       => 'antiaging',
              reference  => [ 'G. Mendel', '1865' ],
              laboratory => [ 'Dept. of Genetics', 'Cornell University',
                              'USA' ]
         },
         # scalar giving priority
         'high',
         # array of local work history
         [ 'Jim', 'Rose', 'Eamon', 'Joe' ]
   ];

print "Name is ", ${$gene->[0]}{'name'}, "\n";
         # same as $gene->[0]{'name'}
print "Research center is ", ${${$gene->[0]}{'laboratory'}}[1], "\n";
         # same as $gene->[0]{'laboratory'}[1]
    Passing References to Subroutines
   Perl flattens all the arguments to a subroutine into a single list of scalars. This
    makes it impossible to distinguish between two arrays you might
    try to pass to a subroutine, as the following example illustrates:
   @aminoacids1 = ('E', 'V', 'L');
    @aminoacids2 = ('D', 'T', 'Y');
    printacids(@aminoacids1, @aminoacids2);
    sub printacids {
        my(@aa1, @aa2) = @_;
        print "Amino acids 1\n";
        print "@aa1\n";
        print "Amino acids 2\n";
        print "@aa2\n"; }
   This prints out:
          Amino acids 1
          E V L D T Y
          Amino acids 2
          (an empty line: @aa1 absorbed every argument, so @aa2 is empty)
    Passing References to Subroutines
   Here is how to fix the previous example:
   @aminoacids1 = ('E', 'V', 'L');
    @aminoacids2 = ('D', 'T', 'Y');
    printacids(\@aminoacids1, \@aminoacids2);

    sub printacids {
        my($aa1, $aa2) = @_;
        print "Amino acids 1\n";
        print "@$aa1\n";
        print "Amino acids 2\n";
        print "@$aa2\n"; }

   This prints out:
          Amino acids 1
          E V L
          Amino acids 2
          D T Y
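
Passing a reference has two further consequences: the subroutine can change the caller's array, and a subroutine can hand back several lists without them being merged. The sketch below assumes two hypothetical helpers, add_stop() and split_by_length().

```perl
use strict;
use warnings;

# Hypothetical helper: because it receives a reference, it changes
# the caller's array directly.
sub add_stop {
    my ($aa_ref) = @_;
    push @$aa_ref, '*';          # '*' is used here as a stop marker
    return scalar @$aa_ref;      # new length
}

# Hypothetical helper: returning references keeps two result lists apart.
sub split_by_length {
    my ($seqs_ref, $min) = @_;
    my (@short, @long);
    for my $seq (@$seqs_ref) {
        if (length($seq) < $min) { push @short, $seq }
        else                     { push @long,  $seq }
    }
    return (\@short, \@long);
}

my @aminoacids1 = ('E', 'V', 'L');
my $n = add_stop(\@aminoacids1);
print "@aminoacids1 ($n items)\n";       # E V L * (4 items)

my ($short, $long) = split_by_length(['EV', 'DTYK', 'M', 'EVLDTY'], 3);
print "short: @$short\n";                # short: EV M
print "long:  @$long\n";                 # long:  DTYK EVLDTY
```

If the caller's data must stay untouched, copy it inside the subroutine with my @copy = @$aa_ref; before modifying it.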
                Perl Object Syntax
   Perl objects are references that have been attached ("blessed")
    to a class, which bundles them with a set of functions that know
    how to act on the contents of the reference.
   For example, in BioPerl, there is a class of objects called
    Sequence. Internally, the object is a hash reference that
    has keys that point to the DNA string, the name and
    source of the sequence, and other attributes. The object is
    bundled with functions that know how to manipulate the
    sequence, such as revcom( ), translate( ), subseq( ), etc.
   When talking about objects, the bundled functions are
    known as methods.
                       Perl Objects
   For example, if we have a Sequence object stored in the
    scalar variable $sequence1, we can call its methods like this:
   $reverse_complement = $sequence1->revcom();
    $first_10_bases = $sequence1->subseq(1,10);
    $protein = $sequence1->translate();

   You will learn later from the BioPerl lecture that revcom()
    and translate() return new Sequence objects that themselves
    know how to revcom(), translate() and so forth, whereas subseq()
    returns a plain string. So if you wanted to get the protein
    translation from the reverse complement, you could do this:
   $reverse_complement = $sequence1->revcom();
    $protein = $reverse_complement->translate();
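
To see what "a hash reference bundled with methods" means in practice, here is a deliberately tiny class. TinySeq is a hypothetical class written for this sketch; it is not part of BioPerl and only imitates the idea of a revcom() method that returns a new object.

```perl
use strict;
use warnings;

package TinySeq;    # hypothetical class, NOT part of BioPerl

sub new {
    my ($class, $dna) = @_;
    my $self = { seq => $dna };      # the object is a hash reference
    return bless $self, $class;      # bless attaches it to the class
}

sub seq {
    my ($self) = @_;
    return $self->{seq};
}

sub revcom {
    my ($self) = @_;
    my $rc = reverse $self->{seq};   # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    return TinySeq->new($rc);        # hand back a new object
}

package main;

my $s  = TinySeq->new('gattc');
my $rc = $s->revcom;
print ref($s), "\n";                 # TinySeq
print $rc->seq, "\n";                # gaatc
print $rc->revcom->seq, "\n";        # gattc
```

Because revcom() returns an object of the same class, method calls can be chained, as in the last line.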
                   Creating Objects
   Before you can start using objects, you must load their
    definitions from the appropriate module(s). For example, if
    we want to load the BioPerl Sequence definitions, we load
    the appropriate module, which in this case is called
    Bio::PrimarySeq (you learn this from reading the BioPerl
    documentation):
   #!/usr/bin/perl -w
    use strict;
    use Bio::PrimarySeq;
   Now you'll probably want to create a new object. There are
    a variety of ways to do this, and details vary from module
    to module, but most modules, including Bio::PrimarySeq,
    do it using the new() method:
   my $sequence1 =
      Bio::PrimarySeq->new(-seq => 'gattcgattccaaggttccaaa');
                   Creating Objects
   The syntax here is
        ModuleName->new(@args)
    where ModuleName is the name of the module that
    contains the object definitions.
   The new( ) method will return an object that belongs to
    the ModuleName class.
   In the example above, we get a Bio::PrimarySeq
    object, which is the simplest of BioPerl's various Sequence
    object types.
                  Creating Objects
   When you call object methods, you can pass a list of
    parameters, just as you would to a regular function.
   As methods get more complex, parameter lists can get quite
    long and have possibly dozens of optional parameters. To
    make this manageable, many object-oriented modules use a
    named parameter style of parameter passing, that looks like
    this:
    my $result = $object->
          method( -arg1=>$value1, -arg2=>$value2,
                  -arg3=>$value3, ... );
   In this case "-arg1", "-arg2", and so on are the names of
    parameters, and $value1, $value2 are the values of those
    named parameters. The name/value pairs can occur in any
    order.
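
On the receiving side, named parameters are commonly handled by copying the argument list into a hash. The sketch below shows one way this can be done; describe_seq() is a hypothetical function, and its -seq, -id and -alphabet names merely imitate the dash-prefixed style. It is not how any particular module implements it.

```perl
use strict;
use warnings;

# describe_seq() is a hypothetical function written for this sketch.
sub describe_seq {
    # Defaults come first; the caller's name/value pairs override them.
    my %arg = ( -alphabet => 'dna', @_ );
    die "-seq is required\n" unless defined $arg{-seq};
    my $id = defined $arg{-id} ? $arg{-id} : 'unnamed';
    return sprintf "%s (%s, %d residues)",
                   $id, $arg{-alphabet}, length $arg{-seq};
}

# The name/value pairs can be given in any order.
my $d1 = describe_seq( -seq => 'gattcgattcc', -id  => 'oligo23' );
my $d2 = describe_seq( -id  => 'oligo23',     -seq => 'gattcgattcc' );
my $d3 = describe_seq( -seq => 'MKV', -alphabet => 'protein' );

print "$d1\n";    # oligo23 (dna, 11 residues)
print "$d2\n";    # oligo23 (dna, 11 residues)
print "$d3\n";    # unnamed (protein, 3 residues)
```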
                  Creating Objects
   Rather than create a huge argument list which forces
    you to remember the correct position of each argument,
    Bio::PrimarySeq lets you create a new Sequence this
    way:
   #!/usr/bin/perl -w
    use strict;
    use Bio::PrimarySeq;
    my $sequence1 = Bio::PrimarySeq->new(
       -seq      => 'gattcgattccaaggttccaaa',
       -id       => 'oligo23',
       -alphabet => 'dna',
       -is_circular => 0,
       -accession_number => 'X123'
       );

								