Predicting PDZ domain protein-protein interactions from the genome
Gary Bader
Donnelly Centre for Cellular and Biomolecular Research
University of Toronto
VanBUG, Vancouver, Jan. 8, 2009




             http://baderlab.org
Computational Cell Map
Map the cell
• Predict map from genome
• Multiple perturbation mapping
• Active cell map
• Map visualization and analysis software

Read map to understand
• Cell processes
• Gene function
• Disease effects
• Map evolution

Cary MP et al. Pathway information… FEBS Lett. 2005
Bader GD et al. Functional genomics and proteomics. Trends Cell Biol. 2003
How are biological networks in the cell encoded in the genome?
• Can we accurately predict biologically relevant interactions from a genome?
• How do genome sequence changes underlying disease affect the molecular network in the cell?
• Can we predict how well model pathways or phenotypes will translate to human?
• Can we design new networks de novo?
Predicting Protein Interaction Networks From the Genome
• Ideally:
  – Accurately predict
• Reality:
  – Not currently possible
  – Signaling pathways too divergent to accurately map by orthology
  – Protein interaction prediction is likely as hard as protein folding in general, e.g. induced fit
Predicting Networks
• Map via orthology relationships
  – Metabolic pathways
    • E.g. KEGG, BioCyc, metaSHARK
  – Protein-protein interactions
    • E.g. OPHID, HomoMINT
  – Signaling pathways
    • E.g. Reactome
• Infer using functional associations
  – Phylogenetic profile, Rosetta Stone
• Infer from molecular profiles
  – Gene expression → gene regulatory network
  – E.g. ARACNE, MEDUSA, MatrixREDUCE

Pinney et al. NAR 2005
Bader & Enright
Peptide Recognition Domains
• Simple binding sites
• Well studied
• Numerous
• Biologically important
  – Eukaryotic signaling systems often involve modular protein-protein interaction domains

http://pawsonlab.mshri.on.ca/
http://nashlab.uchicago.edu/domains/
      Protein Domain Interaction
          Network Prediction
Genome

Gene and protein prediction

Domain prediction

Specificity prediction


Protein-protein interaction prediction
PDZ Domains
 • 80-90 aa, 5-6 beta strands, 2 alpha helices
 • Recognize hydrophobic C-termini
 • Membrane localization of signaling components
 • Neuronal development, cell polarity, ion channel regulation

Figure: Par-6 PDZ domain bound to VKESLV-COOH (1RZX, fly)
Dev Sidhu
Tonikian et al. PLoS Biology, Sep. 2008
~250 Human
PDZ Domains
Multiple sequence
    alignment
PDZ Binding Motifs (C-Terminus)
Class 1: X[T/S]X
Class 2: XX
Logo color legend: polar, basic, acidic, hydrophobic
Sequence Logo

Phage-selected peptides (aligned):
  SWWPDSWV
  NAFEETWV
  NPFWDVWV
  NPFWDVWV
  SVDVDTWV
  -AYFDTWV
  STFLETWV
  KGVFESWV
  ESWHDSWV
  -GDQDTWV
  GRWMDTWV
  KFWRDTWL
  …

Profile (amino acid frequency by position):

  Amino Acid    -3     -2     -1      0
  A              0      0      0      0
  C              0      0      0      0
  D            0.7    0.1      0      0
  E            0.3   0.05      0      0
  F              0      0   0.05      0
  G              0      0      0      0
  H              0      0      0      0
  I              0      0      0      0
  K              0      0      0      0
  L              0      0      0    0.1
  M              0      0      0      0
  N              0      0      0      0
  P              0      0      0      0
  Q              0      0      0      0
  R              0      0      0      0
  S              0   0.15      0      0
  T              0    0.7      0      0
  V              0      0      0    0.9
  W              0      0   0.95      0
  Y              0      0      0      0

Logo colors: polar=green, basic=blue, acidic=red, hydrophobic=black

Schneider TD, Stephens RM. 1990. Nucleic Acids Res. 18:6097-6100
http://weblogo.berkeley.edu/
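The step from aligned peptides to a profile can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the talk: the function name `build_profile` and the choice to use plain frequencies without pseudocounts are assumptions. Only the twelve peptides visible on the slide are used, so the resulting frequencies differ from the slide's profile, which was built from the full (elided) peptide set.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(peptides, width=4):
    """Return a list of {residue: frequency} dicts for the last `width`
    positions of each peptide (index 0 = position -(width-1), last = position 0).
    Hypothetical helper: plain frequencies, no pseudocounts, gaps skipped."""
    profile = []
    for offset in range(width, 0, -1):
        # offset counts back from the C-terminus; offset=1 is position 0
        column = [p[-offset] for p in peptides if p[-offset] in AMINO_ACIDS]
        counts = Counter(column)
        total = len(column)
        profile.append({aa: counts.get(aa, 0) / total for aa in AMINO_ACIDS})
    return profile

# The peptides listed on the slide (an excerpt of the full phage data set)
peptides = [
    "SWWPDSWV", "NAFEETWV", "NPFWDVWV", "NPFWDVWV", "SVDVDTWV", "-AYFDTWV",
    "STFLETWV", "KGVFESWV", "ESWHDSWV", "-GDQDTWV", "GRWMDTWV", "KFWRDTWL",
]
profile = build_profile(peptides)

for position, column in zip(range(-3, 1), profile):
    nonzero = {aa: round(f, 2) for aa, f in column.items() if f > 0}
    print(position, nonzero)
```

Each column sums to 1, which is the property a sequence logo relies on when it scales letter heights by frequency.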
82 worm and human PDZ specificities mapped by phage display
~3100 peptides
PDZ Specificity Map
16 Classes
Class 1: X[T/S]X
Class 2: XX
Class 3: X[D/X]X
Class 4: XGX
Specificity at Most Positions
Position   Versatile




                  Many Distinct Specificities
Versatile and Robust
• 91 Erbin mutants phaged, 3400 peptides
• Mutations cause specificity switch, not function loss

Conserved Specificity, Expanded Use
• PDZ domains are versatile, but only ~16 classes used from worm to human
• One billion years of evolution
• Model: specificities arose early, domains expanded under evolutionary constraints

Raffi Tonikian
      Protein Domain Interaction
          Network Prediction
Genome

Gene and protein prediction

Domain prediction

Specificity prediction


Protein-protein interaction prediction
           Predicting PDZ Specificity
  >ERBB2IP-1
  RVRVEKDPELGFSISGGVGGRGNPFRPDDDGIFVTRVQPE
  GPASKLLQPGDKIIQANGYSFINIEHGQAVSLLKTFQNTVELII




Tonikian et al. PDZ specificity map
Sequence Predicts Specificity
• 50 mapped PDZ domains are >70% similar to 69 unmapped PDZ domains
• Doubles coverage to 45% of worm/human
• 33 more PDZ groups
• 110 singletons

Figure legend: mapped vs. unmapped; human vs. worm
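The idea of transferring a mapped specificity to a closely related unmapped domain can be sketched as follows. This is a hedged illustration only: the slide says ">70% similar" without defining the similarity measure, so percent identity over pre-aligned sequences is used here as a stand-in, and the function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity between two pre-aligned sequences of equal length.
    Columns where either sequence has a gap ('-') are ignored."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def transfer_specificity(unmapped_seq, mapped_domains, threshold=70.0):
    """Return the name of the most similar mapped domain if it exceeds the
    threshold, else None. `mapped_domains` maps name -> aligned sequence."""
    best_name, best_pid = None, 0.0
    for name, seq in mapped_domains.items():
        pid = percent_identity(unmapped_seq, seq)
        if pid > best_pid:
            best_name, best_pid = name, pid
    return best_name if best_pid > threshold else None

# Toy aligned fragments, invented for illustration (not real PDZ alignments)
mapped = {"toy_A": "GFSISGGVGG", "toy_B": "KLLQPGDKII"}
print(transfer_specificity("GFSIAGGVGG", mapped))  # 9 of 10 columns match toy_A
print(transfer_specificity("AAAAAAAAAA", mapped))  # nothing above threshold
```

A real analysis would start from a multiple sequence alignment of the domains and might restrict the comparison to binding-site positions.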
Are Residues Correlated?
Figure labels: ~80, ~3000
Boris Reva, Chris Sander
Top 10 1-1 Rules

  Domain Position   Peptide Position   Joint Freq   Domain Freq   Peptide Freq   Mutual Information
  (H@105)           (T@7)                     886          1367            913          0.166384111
  (P@53)            (T@7)                     373           411            913          0.130328629
  (Q@67)            (W@8)                     366           377           1037          0.117349366
  (V@109)           (T@7)                     836          1430            913          0.115598151
  (S@64)            (E@6)                     218           386            414          0.109298916
  (V@9)             (W@4)                     150           202            340          0.109096478
  (A@102)           (E@6)                     228           429            414          0.107661006
  (L@30)            (S@6)                     207           383            384          0.106889284
  (P@53)            (E@6)                     219           411            414          0.103683514
  (L@26)            (E@6)                     391          1138            414           0.10274842

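$$
p_{\text{joint}} \ln \frac{p_{\text{joint}}}{p_{\text{domain}}\, p_{\text{peptide}}}
= \frac{886}{2083} \ln \frac{\dfrac{886}{2083}}{\dfrac{1367}{2083} \cdot \dfrac{913}{2083}}
\approx 0.17
$$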
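The worked value for the top rule can be checked numerically. The sketch below assumes only what the slide shows: a joint count, the two marginal counts, and a total of 2083 domain-peptide observations. The function name `mi_term` is illustrative.

```python
import math

def mi_term(joint, domain, peptide, total):
    """Single-pair mutual information term:
    p_joint * ln(p_joint / (p_domain * p_peptide))."""
    p_joint = joint / total
    p_domain = domain / total
    p_peptide = peptide / total
    return p_joint * math.log(p_joint / (p_domain * p_peptide))

# Top rule from the table: (H@105) with (T@7)
value = mi_term(joint=886, domain=1367, peptide=913, total=2083)
print(round(value, 6))
```

The same function applied to the second row (373, 411, 913) gives a value close to the 0.130 listed in the table, which suggests the same total applies there.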
Correlation Validation
Prediction Can Be Accurate
Figure: experiment vs. prediction

Challenge: But Not Always
Figure: experiment vs. prediction

Shirley Hui
Predicting PDZ Specificity
• Consider sequence and physicochemical properties
• High accuracy at matching known domains to peptides

Diagram:
• Training examples (binding and non-binding PDZ-peptide pairs): positive = YES, negative = NO
• Training examples → machine learning
• Test examples (PDZ-peptide pairs) → machine learning → predictions (YES / NO)

Shirley Hui, Xiaojian Shao
      Protein Domain Interaction
          Network Prediction
Genome

Gene and protein prediction

Domain prediction

Specificity prediction


Protein-protein interaction prediction
Genome Search
Phage results for PDZ ERBIN → profile
  SWWPDSWV
  NAFEETWV
  NPFWDVWV
  NPFWDVWV
  SVDVDTWV
  -AYFDTWV
  STFLETWV
  KGVFESWV
  ESWHDSWV
  -GDQDTWV
  GRWMDTWV
  KFWRDTWL
  …

Logo colors: polar=green, basic=blue, acidic=red, hydrophobic=black
Genome Search: PDZ ERBIN

>Q86W91_HUMAN Plakophilin 4, isoform b
...LKSTTNYVDFYSTKRPSYRAEQYPGSPDSWV

C-terminal match: QYPGSPDSWV   Score: 5.5
Predicted C-terminal motif: DSWV

$$
\text{Score} = \log_{10} \prod_{i=1}^{w} p_i
$$

Assumes: position independence, uniform input, good sampling
Physiological binder is similar to phage sequence
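The scoring rule above (log10 of the product of per-position probabilities) can be sketched for a four-residue window. The probabilities are the non-zero entries of the profile shown on the Sequence Logo slide; the `floor` value substituted for zero-probability residues and the function name are assumptions, since the slides do not say how zeros were handled. This sketch does not attempt to reproduce the 5.5 reported for the ten-residue match, whose exact scale is not given.

```python
import math

# Non-zero profile entries from the Sequence Logo slide, keyed by position
PROFILE = {
    -3: {"D": 0.7, "E": 0.3},
    -2: {"D": 0.1, "E": 0.05, "S": 0.15, "T": 0.7},
    -1: {"F": 0.05, "W": 0.95},
    0: {"L": 0.1, "V": 0.9},
}

def score_c_terminus(protein_seq, profile=PROFILE, floor=1e-3):
    """Score the C-terminal residues of a protein against a profile as
    log10(prod p_i), assuming position independence.
    `floor` (an assumption) replaces zero probabilities to avoid log(0)."""
    width = len(profile)
    tail = protein_seq[-width:]
    log_score = 0.0
    for i, residue in enumerate(tail):
        position = i - (width - 1)  # maps 0..width-1 to -(width-1)..0
        p = profile[position].get(residue, floor)
        log_score += math.log10(p)
    return log_score

# Visible portion of the Plakophilin 4 sequence from the slide
score = score_c_terminus("LKSTTNYVDFYSTKRPSYRAEQYPGSPDSWV")
print(round(score, 3))
```

Summing logarithms rather than multiplying probabilities avoids underflow for wider windows and makes the position-independence assumption explicit: each residue contributes one additive term.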
Prediction Can Be Accurate
ERBIN PDZ Interaction Prediction (ERBB2IP-1)
Figure scale: probability of PDZ binding, from 10E-5 (high) to 10E-7 (low)
Figure legend: known interactor; high score
…but requires further experimental support
...
Network of prioritized human PDZ interactions
Matches known biology, significantly enriched in known interactors
8% overlap, p = 8.6 × 10^-18
336 interactions between 54 PDZ domains, 247 proteins
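The slides do not state which test produced the enrichment p-value. One common choice for this kind of overlap question is a one-sided hypergeometric test, sketched below with invented toy counts; none of the numbers in the example come from the talk, and the function name is hypothetical.

```python
from math import comb

def hypergeom_tail(k, n_pred, n_known, n_total):
    """P(X >= k): probability of seeing at least k known interactions among
    n_pred predictions drawn from n_total candidate pairs, of which n_known
    are known interactions (one-sided hypergeometric test)."""
    upper = min(n_pred, n_known)
    numerator = sum(
        comb(n_known, i) * comb(n_total - n_known, n_pred - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_pred)

# Toy numbers, invented for illustration only
p_value = hypergeom_tail(k=8, n_pred=100, n_known=50, n_total=10000)
print(p_value)
```

With these toy counts the expected overlap by chance is 0.5, so observing 8 gives a very small p-value; the real analysis would need the actual size of the candidate pair universe and the known-interaction set.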
Future: In vivo Protein Interaction Prediction
In vitro (PDZ, phage display, peptides) → in silico predictions (genome) → biologically relevant (in vivo)
Context to integrate:
• Evolutionary context
• Protein expression
• Protein function
• Protein structure
• Network context
• Protein location
Example: DLGs, NMDAR
PDZ Human-Virus Interactions
89 viral proteins matched better than any human protein (vs. 30 domains)
Affinities (ELISA)
Yingnan Zhang

• Crtam: Ig transmembrane protein important in late phase T cell activation
• Crtam peptide inhibitor blocks SCRIB-3 binding and polarization
• Synthetic viral peptide promotes T cell proliferation

Figure panels: T cell; non SCRIB binding vs. SCRIB binding
Jung-Hua Yeh and Andrew Chan
Conclusions
• PDZ domains are highly specific, versatile and robust to mutation
• Many specificities possible, but only a few are used
• Specificity can be predicted from domain sequence
• Prioritize predictions for experimental follow up
• Use by pathogens
• PDZ specificity map useful for:
   – Novel protein interaction discovery
   – Peptidomimetic therapeutic design
   – PDZ design (synthetic biology)
Cell map exploration and analysis
Can we accurately predict protein interactions?
Diagram: databases, literature, expert knowledge and experimental data → pathway information → pathway analysis (Cytoscape)
http://pathguide.org   ~280 Pathway Databases!



Vuk Pavlovic
Pathway Commons: A Public Library
http://pathwaycommons.org
Sander Lab (MSKCC), Bader Lab

• Books: Pathways
• Lingua Franca: BioPAX OWL
• Index: cPath pathway database software
• Translators: translators to BioPAX
• Open access, free software
• No competition: Author attribution
• Aggregate ~20 databases in BioPAX format
http://cytoscape.org
Network visualization and analysis
Pathway comparison
Literature mining
Gene Ontology analysis
Active modules
Complex detection
Network motif search

UCSD, ISB, Agilent, MSKCC, Pasteur, UCSF, Unilever, U of Toronto, U of Michigan
Gene Function Prediction
• Guilt-by-association principle
• Biological networks are combined intelligently to optimize prediction accuracy
• Algorithm is faster and more accurate than its peers

Quaid Morris (CCBR)
Rashad Badrawi, Ovi Comes, Sylva Donaldson, Christian Lopes, Jason Montojo, Khalid Zuberi

http://www.genemania.org
Canadian Bioinformatics Workshops 2009

Interpreting Gene Lists from -omics Studies
Date: July 9-10, 2009, Toronto
Faculty: Gary Bader, Quaid Morris & Wyeth Wasserman

Clinical Genomics and Biomarker Discovery
Date: July 16-17, 2009, Toronto
Faculty: Sohrab Shah

Informatics on High-Throughput Sequencing Data
Date: July 23-24, 2009, Toronto
Faculty: Michael Brudno, Asim Siddiqui & Francis Ouellette

Exploratory Data Analysis and Essential Statistics using R
Date: October 2-3, 2009, Toronto
Faculty: Raphael Gottardo and Boris Steipe

Applications now being accepted at www.bioinformatics.ca
Limited registration
Registration Fee: $500
Acknowledgements

PDZ Work
  Genentech: Dev Sidhu, Yingnan Zhang, Heike Held, Stephen Sazinsky, Yan Wu
  University of Toronto: Charlie Boone, Raffi Tonikian, Xiaofeng Xin
  MSKCC: Chris Sander, Boris Reva

Bader Lab
  G2N: Chris Tan, David Gfeller, Shirley Hui, Xiaojian Shao, Shobhit Jain
  MP: Anastasija Baryshnikova, Iain Wallace, Laetitia Morrison, Ron Ammar
  ACM: Daniele Merico, Ruth Isserlin, Vuk Pavlovic, Oliver Stueker
  Pathway Commons: Chris Sander, Ethan Cerami, Ben Gross, Emek Demir, Robert Hoffmann, Igor Rodchenkov, Rashad Badrawi

Cytoscape
  Trey Ideker (UCSD): Kei Ono, Mike Smoot, Peng Liang Wang (Ryan Kelley, Nerius Landys, Chris Workman, Mark Anderson, Nada Amin, Owen Ozier, Jonathan Wang)
  Lee Hood (ISB): Sarah Killcoyne, John Boyle, Ilya Shmulevich (Iliana Avila-Campillo, Rowan Christmas, Andrew Markiel, Larissa Kamenkovich, Paul Shannon)
  Benno Schwikowski (Pasteur): Mathieu Michaud (Melissa Cline, Tero Aittokallio)
  Chris Sander (MSKCC): Ethan Cerami, Ben Gross (Robert Sheridan)
  Annette Adler (Agilent): Allan Kuchinsky, Mike Creech (Aditya Vailaya)
  Bruce Conklin (UCSF): Alex Pico, Kristina Hanspers

Funding
  CIHR, NSERC, NIH, Genome Canada, Canada Foundation for Innovation/ORF

http://baderlab.org

				