Predicting the Function of Single Nucleotide Polymorphisms
Advisor: Eleazar Eskin
A change to a DNA sequence.
Source of genetic variation.
Single Nucleotide Polymorphism (SNP)
Mutation of a single nucleotide.
Single Nucleotide Polymorphisms
We can figure out the locations of
SNPs by looking at genotypes.
How do we figure out the function of each SNP?
Many SNPs do not affect phenotypes at all.
A way to determine which SNPs are
significant and which are not could be useful.
Work only on SNPs in coding regions.
Translate the coding sequence into protein,
both with and without the SNP.
Use BLAST to find similar proteins.
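The translation step above can be sketched in Python as follows. This is a minimal illustration, not the program used in this work: the helper names `translate` and `apply_snp`, the 0-based SNP index, and the short example sequence are all assumptions made for the sketch. It uses the standard genetic code and assumes the sequence starts in frame.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for pos in range(0, usable, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def apply_snp(cds, index, alt):
    """Return the sequence with the base at 0-based `index` replaced by `alt`."""
    return cds[:index] + alt + cds[index + 1:]


# Hypothetical example: a short coding sequence and one A->T substitution.
reference = "ATGGTGCACCTGACTCCTGAGGAGAAG"
variant = apply_snp(reference, 19, "T")

print(translate(reference))  # MVHLTPEEK
print(translate(variant))    # MVHLTPVEK
```

The two resulting protein strings are what would then be submitted to BLAST, one query per allele.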
Basic Local Alignment Search Tool
Searches a database of proteins for various
Proteins in humans are related to those that
other mammals produce.
Finding better results for one SNP would
May indicate that it is linked to a disease.
BLAST bit score
Score S is based on how well the sequences align.
K – Scale factor for search space size
λ – Scale factor for scoring system
Bit score is the normalized score which
indicates how well the protein aligns.
Normalization is needed to compare scores
for multiple alignments.
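As a worked illustration of the normalization, the standard BLAST bit score is S' = (λS − ln K) / ln 2, with S, K and λ as defined above. The sketch below simply evaluates that expression. The function name `bit_score` and the numeric values chosen for λ and K are assumptions for the example; real values depend on the scoring matrix and gap penalties and are reported by BLAST alongside each search.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S into a bit score: (lam * S - ln k) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Illustrative parameter values only (assumed, not taken from this work).
LAMBDA = 0.267
K = 0.041

for raw in (50, 100, 200):
    print(raw, round(bit_score(raw, LAMBDA, K), 1))
```

Because λ and K are folded into the result, bit scores from searches run with different scoring systems can be compared directly, which raw scores cannot.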
Based on SNPs in exon regions of human
chromosome 1 (from HapMap rel23a)
The program reported a number of
SNPs with zero difference in bit
score, including some SNPs known to
cause phenotypic changes.
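The comparison behind this result can be sketched as below. It is a hedged illustration only: the function name `rank_by_score_difference`, the SNP identifiers, the scores, and the choice of the absolute difference as the ranking key are all assumptions, not the program's actual output or method. Note that a synonymous SNP leaves the protein unchanged, so it necessarily yields a difference of zero.

```python
def rank_by_score_difference(scores):
    """Rank SNPs by absolute bit score difference, largest first.

    `scores` maps a SNP identifier to a pair:
    (bit score without the SNP, bit score with the SNP).
    """
    differences = {
        snp: abs(with_snp - without_snp)
        for snp, (without_snp, with_snp) in scores.items()
    }
    return sorted(differences.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for three hypothetical SNPs.
example_scores = {
    "snp_a": (250.0, 250.0),
    "snp_b": (250.0, 231.5),
    "snp_c": (180.0, 176.0),
}

for snp, difference in rank_by_score_difference(example_scores):
    print(snp, difference)
```

SNPs at the bottom of such a ranking, with a difference of zero, are the ones this approach cannot distinguish from neutral changes.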
SNPs that cause changes to protein
Consider SNPs not in coding regions
SNPs in introns could be ranked similarly
based on binding data.