Perl Lab 13
On the Menu Today:
Let’s review!
Links:
Sorting: http://www.perlfect.com/articles/sorting.shtml
More Sorting: http://perlhowto.com/sort_ordering_by_multiple_columns
Printing a tab‐delimited array to a CSV file: http://www.perlmonks.org/?node_id=46529
Perl special variables ‐ as used for printing delimited fields:
http://www.kichwa.com/quik_ref/spec_variables.html
Assignments:
1. Read a BLAST results file in tabular format into a two-dimensional array and sort it by two
columns, one containing the hit length, the other the e-value:
a. Use split to split the columns of each row into an array @rowArray, then push a copy of
that array onto the two-dimensional array @twoDimArray:
push @twoDimArray, [@rowArray];
b. To sort by column 0 in descending order and, for ties, by column 1 in ascending order, use
the following code:
my @orderedlist = sort { $b->[0] <=> $a->[0] || $a->[1] <=> $b->[1] }
@twoDimArray;
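The two steps above can be sketched together as follows. This is a minimal illustration, not a full solution: the rows are hypothetical in-memory data standing in for lines read from the results file, and the column layout (hit length in column 0, e-value in column 1, a hit name in column 2) is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited rows (assumed layout: hit length, e-value, hit name).
# In the assignment these lines would come from the BLAST results file instead.
my @lines = (
    "120\t1e-30\thitA",
    "250\t1e-50\thitB",
    "250\t1e-80\thitC",
    "80\t0.001\thitD",
);

my @twoDimArray;
for my $line (@lines) {
    chomp $line;
    my @rowArray = split /\t/, $line;
    push @twoDimArray, [@rowArray];   # store a copy of the row as an array reference
}

# Longest hits first; ties are broken by the smallest e-value.
my @orderedlist = sort { $b->[0] <=> $a->[0] || $a->[1] <=> $b->[1] } @twoDimArray;

# Print each sorted row as tab-delimited text.
print join("\t", @$_), "\n" for @orderedlist;
```

To read from a real file, the loop over @lines would be replaced by a loop over a filehandle opened on the results file.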
2. Given a list of genotype markers, each consisting of an amino acid position in a gene of
interest and the letter of the amino acid associated with a particular disease, parse a list of
sequences and report for each sequence whether it has the marker or not.
a. Pick one file from the patient DNA sequences from lab 7. Write a script to open
that file and convert the DNA sequences to protein sequences.
b. As before but now read a file that contains the following list of markers into an
array:
3L,27R,29N,32R,62I,65N\S,70I,146V\I,177Y\D\F,181E\Q
c. As before, but now loop through all the sequences and search for only the 3L
marker in each protein sequence. If present, write the name of the sequence and
the number 1 to a file.
d. As before, but now look for 27R in each protein sequence (in addition to looking
for 3L). If present, write the name of the sequence and the number 2 to a file. If a
sequence has both markers, the result for that sequence should be:
037.Rc49,1 2
Where 037.Rc49 is the sequence id and 1 refers to amino acid 3L, and 2 to amino
acid 27R.
e. As before, but now write a subroutine hasMarker which accepts as input:
i. A protein sequence
ii. A marker (for example 177Y\D\F)
The output of the subroutine should be:
iii. True or false, depending on whether the marker is present or not
f. As before, but now loop through all the sequences, and search for the markers in the
protein sequence by looping through the 10 markers and calling the subroutine
hasMarker.
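One possible shape for steps a through f is sketched below. Several things here are assumptions rather than part of the assignment: marker positions are treated as 1-based, a backslash is read as "or" between alternative amino acids, translation uses the standard genetic code in reading frame 1, and the helper translate, the hash %dnaFor and the sequence names seqA and seqB are hypothetical stand-ins for the lab 7 patient data.

```perl
use strict;
use warnings;

# Build the standard genetic code table (bases in T, C, A, G order).
my @bases      = qw(T C A G);
my $aminoAcids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codonTable;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codonTable{"$b1$b2$b3"} = substr($aminoAcids, $i++, 1);
        }
    }
}

# Translate DNA to protein in reading frame 1; unrecognized codons become 'X'.
sub translate {
    my ($dna) = @_;
    $dna = uc $dna;
    my $protein = '';
    for (my $pos = 0; $pos + 3 <= length $dna; $pos += 3) {
        my $codon = substr($dna, $pos, 3);
        $protein .= exists $codonTable{$codon} ? $codonTable{$codon} : 'X';
    }
    return $protein;
}

# Return 1 if the protein carries the marker, 0 otherwise.
# A marker is a position (assumed 1-based) followed by one or more
# allowed amino acids separated by backslashes, e.g. 177Y\D\F.
sub hasMarker {
    my ($protein, $marker) = @_;
    my ($position, $letters) = $marker =~ /^(\d+)(.+)$/ or return 0;
    return 0 if $position < 1 || $position > length $protein;
    my $residue = substr($protein, $position - 1, 1);
    my %allowed = map { $_ => 1 } split /\\/, $letters;
    return $allowed{$residue} ? 1 : 0;
}

# The marker list from step b, split into an array.
my $markerLine = '3L,27R,29N,32R,62I,65N\S,70I,146V\I,177Y\D\F,181E\Q';
my @markers    = split /,/, $markerLine;

# Hypothetical sequences; the real ones come from the lab 7 patient files.
my %dnaFor = (
    'seqA' => 'ATGGCTCTGAAA',   # translates to MALK: L at position 3
    'seqB' => 'ATGGCTGGGAAA',   # translates to MAGK: no L at position 3
);

my %hits;
for my $id (sort keys %dnaFor) {
    my $protein = translate($dnaFor{$id});
    my @found;
    for my $n (0 .. $#markers) {
        # Markers are reported by their number in the list, starting at 1.
        push @found, $n + 1 if hasMarker($protein, $markers[$n]);
    }
    $hits{$id} = \@found;
    print "$id,@found\n" if @found;   # same layout as 037.Rc49,1 2
}
```

Keeping the position and letter parsing inside hasMarker means the main loop does not need to know the marker format, so the same loop works for single-letter markers such as 3L and multi-letter ones such as 177Y\D\F.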