Analysis of protein-protein interaction networks
					CSCI5461: Functional Genomics, Systems Biology and Bioinformatics




                Analysis of protein-protein
                     interaction networks


            Department of Computer Science and Engineering
                        University of Minnesota
Announcements
•   Project proposals due TODAY
•   HW #2 due 4/6 (this Friday, before midnight)
•   Paper discussions next time (Thurs):
    M. Middendorf, E. Ziv, C. H. Wiggins. Inferring network mechanisms: the
    Drosophila melanogaster protein interaction network. Proc Natl Acad Sci U S
    A., 102(9):3192–7.

    Kelley, B. P., Sharan, R., Karp, R., Sittler, E. T., Root, D. E., Stockwell, B.
    R., and Ideker, T. Conserved pathways within bacteria and yeast as
    revealed by global protein network alignment. Proc Natl Acad Sci U S A
    100, 11394-9 (2003).
Outline for today
•   Core principles of network analysis
•   Case studies in PPI networks:
    •   Krogan et al. (AP/MS), Gavin et al. (AP/MS), Yu et al. (Y2H)
•   Date vs. party hubs in PPI networks
    •   Han et al. Evidence for dynamically organized modularity in the yeast
        protein–protein interaction network. Nature 430, 88-93 (1 July 2004).
    •   Batada et al. Stratus Not Altocumulus: A New View of the Yeast Protein
        Interaction Network. PLoS Biol. 2006 October; 4(10): e317.
Characterizing graph topology




     Xiaowei Zhu et al. Genes Dev. 2007; 21: 1010-1024
Assume we have two proteins, A and B:
Clustering coefficient of A = 0.9
Clustering coefficient of B = 0.2

Which is part of a signaling pathway, which is part of a complex?
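
The clustering coefficient of a node is the fraction of pairs of its neighbors that are themselves connected. The sketch below computes it on a made-up toy network (all node names are hypothetical): A sits inside a fully connected group, B sits in the middle of a chain. The usual intuition is that tightly interconnected neighbors look more like a complex, while a chain looks more like a pathway.

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are directly connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for degree < 2; reported as 0 by convention here
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)

# Hypothetical toy network: A is in a clique (complex-like),
# B is in the middle of a chain (pathway-like).
edges = [("A", "A1"), ("A", "A2"), ("A", "A3"),
         ("A1", "A2"), ("A1", "A3"), ("A2", "A3"),
         ("B", "B1"), ("B", "B2"), ("B1", "B0"), ("B2", "B3")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cc_a = clustering_coefficient(adj, "A")
cc_b = clustering_coefficient(adj, "B")
print(cc_a, cc_b)  # 1.0 0.0
```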
 Let’s focus on node degree




What can we learn from the # of interactions (degree) for a protein?
  Degree distributions
     f_k = fraction of nodes with degree k
         = probability that a randomly selected node has degree k
     [Figure: frequency f_k plotted against degree k]

 •   Why measure the degree distribution?
     The degree distribution is a “fingerprint” of the network: it allows us to
     generally characterize its structure
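
A minimal sketch of how f_k could be tabulated from a list of interactions. The protein names are made up, and proteins with no interactions do not appear in an edge list, so they are not counted here.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: f_k}, the fraction of nodes that have degree k."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}

# Hypothetical interaction list: one hub (P1) with four partners, plus one extra pair.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P1", "P5"), ("P5", "P6")]
f = degree_distribution(edges)
print(f)  # degrees 1, 2 and 4 occur with fractions 4/6, 1/6 and 1/6
```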
Degree distribution from a random network
  What if we constructed a network by adding edges between proteins at random?
  [Figure: frequency vs. node degree for a random network, with a log-log plot of the same distribution]
  Properties:
      • highly concentrated around the mean
      • the probability of very high degree nodes is exponentially small
  Barabasi, Oltvai. Network Biology: Understanding the cell’s functional organization. Nature Reviews Genetics 5, 101-113 (2004).
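
The two properties above can be checked with a small simulation. This is a sketch, assuming the simplest random-graph model (every pair of nodes is linked independently with the same probability p); the sizes n and p are illustrative and not taken from the lecture.

```python
import random
from itertools import combinations

def random_network_degrees(n, p, seed=0):
    """Degrees of a random graph in which each pair is linked with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            degree[u] += 1
            degree[v] += 1
    return degree

n, p = 500, 0.02   # illustrative sizes only
degrees = random_network_degrees(n, p)
mean_degree = sum(degrees) / n
print("expected mean degree:", p * (n - 1))
print("observed mean degree:", mean_degree)
print("max degree:", max(degrees))
```

The observed degrees stay close to the expected mean p(n - 1), and no node ends up with a degree many times larger than the mean.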
What about the degree distribution of real networks?
  [Figure: degree distribution of the yeast 2-hybrid interaction network, with the random-network distribution shown for comparison]
  Jeong, Mason, Barabási, Oltvai. Lethality and centrality in protein networks. Nature 411, 41-42 (2001)
What about other types of real networks?
  [Figure: degree distributions of several real networks, with a random network for comparison]
  Conclusion: many real networks have the same fingerprint!
  [Newman, 2003]
“Scale-free” networks
•   Many real networks have a power-law degree distribution:
               p(k) = C k^(-α),  i.e.  log p(k) = -α log k + log C
    [Figure: frequency vs. degree on linear axes; log frequency vs. log degree is a straight line with slope -α]
•   α : power-law exponent (typically 2 ≤ α ≤ 3)
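
Because log p(k) is linear in log k, the exponent can be read off as the negative slope of a straight-line fit on log-log axes. The sketch below does this with an ordinary least-squares fit on synthetic frequencies (the values 2.5 and 0.7 are illustrative). A straight-line fit is a crude estimator on real, noisy degree data; it is used here only because it mirrors the equation above.

```python
import math

def fit_power_law_exponent(degree_freq):
    """Least-squares fit of log p(k) against log k; returns (alpha, C)."""
    points = [(math.log(k), math.log(p)) for k, p in degree_freq.items() if k > 0 and p > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

# Synthetic frequencies that follow p(k) = C * k**(-alpha) exactly.
true_alpha, true_c = 2.5, 0.7
freq = {k: true_c * k ** (-true_alpha) for k in range(1, 51)}
alpha, c = fit_power_law_exponent(freq)
print(round(alpha, 3), round(c, 3))  # 2.5 0.7
```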
Summary: Random vs. scale-free network




  Albert-László Barabási, 2002, Linked: The New Science of Networks
Why do many real networks
have similar structure?
•   Guesses
    •   Maybe this structure has some special
        properties (i.e. properties that could be selected
        for)
         • Efficient adaptation/communication?
         • Robustness?

    •   Maybe this relates to how they grow
“Small-world” property
•   Most nodes are not neighbors of one another, but most
    nodes can be reached from every other by a small
    number of hops

•   Random networks:
    •   mean path length ~ log N (N = number of nodes)
•   Scale-free networks:
    •   “ultra-small-world”
    •   mean path length ~ log (log N)
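
Mean path length can be measured directly with a breadth-first search from every node. The sketch below compares two made-up 5-node networks, a chain and a star built around one hub, to show how a hub shortens typical paths.

```python
from collections import deque

def mean_path_length(adj):
    """Average shortest-path length over all ordered pairs of mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def make_graph(edges):
    """Build an undirected adjacency dict from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

# Two hypothetical 5-node networks: a chain and a star around one hub (node 0).
chain = make_graph([(0, 1), (1, 2), (2, 3), (3, 4)])
star = make_graph([(0, 1), (0, 2), (0, 3), (0, 4)])
chain_len = mean_path_length(chain)
star_len = mean_path_length(star)
print(chain_len, star_len)  # 2.0 1.6
```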
Six Degrees of Kevin Bacon
  [Figure: example actor-movie links from the Oracle of Bacon, showing Superman, Mark McClure, Frost/Nixon, Clint Howard, The Cat in the Hat (2003) and A Few Good Men]
  http://oracleofbacon.org/cgi-bin/movielinks
Why Kevin Bacon?
    Kevin Bacon: no. of movies: 46; no. of actors: 1811; average separation: 2.79

Is Kevin Bacon the most connected actor? NO!
    Rank 876: Kevin Bacon (average separation 2.786981; 46 movies; 1811 actors)
    Slide from Lacasa and Nuno
    #1 Rod Steiger
    #2 Donald Pleasence
    #3 Martin Sheen
    #876 Kevin Bacon
    Slide from Lacasa and Nuno
“Small-world” property is practically
useful in man-made networks
    [Figure: US airline network]
    What about protein-protein interaction networks?
Scale-free networks are surprisingly robust

Robustness – The ability of complex systems to maintain their function even
         when the structure of the system changes significantly




        Tolerant to random removal of nodes (mutations)




                                          Haiyuan Yu et al. Trends in Genetics (2004)
Random networks are homogeneous, so there is no
     difference between failure and attack
  [Figure: diameter of the network vs. fraction of nodes removed from the network]
  Modified from Albert et al. Nature 406, 378-382 (2000)
Scale-free networks are robust to failure but
     susceptible to attack
  [Figure: diameter of the network vs. fraction of nodes removed from the network]
  Modified from Albert et al. Nature 406, 378-382 (2000)
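
A small sketch of the failure-versus-attack contrast on a made-up hub-dominated network. Two simplifications are assumptions of this example: it tracks the size of the largest connected component rather than the network diameter, and "failure" is represented by two fixed peripheral nodes, since a randomly chosen node in such a network is most likely peripheral.

```python
def largest_component_size(adj, removed):
    """Size of the largest connected component after deleting the `removed` nodes."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

# Hypothetical network: hubs 0 and 1 are linked, and each has five leaf partners.
adj = {0: {1}, 1: {0}}
for leaf in range(2, 7):
    adj[0].add(leaf)
    adj[leaf] = {0}
for leaf in range(7, 12):
    adj[1].add(leaf)
    adj[leaf] = {1}

hubs = sorted(adj, key=lambda node: len(adj[node]), reverse=True)[:2]  # attack: top-degree nodes
leaves = [2, 7]                                                        # failure: peripheral nodes
after_attack = largest_component_size(adj, hubs)
after_failure = largest_component_size(adj, leaves)
print(after_attack, after_failure)  # 1 10
```

Removing the two hubs shatters the network into isolated nodes, while removing two peripheral nodes leaves the remaining ten nodes connected.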
     Evidence for robustness in the yeast protein-protein interaction
                                 network
  [Figure annotation: only ~20% of the ~6000 yeast genes are essential, and these tend to be the hubs]
  Jeong, Mason, Barabási, Oltvai. Lethality and centrality in protein networks. Nature 411, 41-42 (2001)

  NOTE: this result is still controversial
    Is this universal network structure a
    consequence of how networks grow?
Let’s assume an evolution-motivated model of network growth:
•    Evolution: networks expand continuously by the addition of
     new vertices, and
•    Preferential attachment (rich get richer): new vertices
     attach preferentially to sites that are already well connected.

     Barabasi & Bonabeau Sci. Am. May 2003 60-69
     Barabasi and Albert. Science (1999) 286 509-512
The preferential attachment
model results in a scale-free
network!




                              Barabasi and Albert. Science (1999) 286 509-512
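
A minimal sketch of growth with preferential attachment. The seed network (m + 1 fully connected nodes), the number of links per new node (m = 2) and the network size are assumptions of this example, not parameters given in the lecture.

```python
import random

def preferential_attachment(n, m=2, seed=0):
    """Grow a network to n nodes; each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Seed graph (an assumption of this sketch): m + 1 fully connected nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Each node appears in `endpoints` once per edge it touches, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = preferential_attachment(n=1000, m=2)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
ranked = sorted(degree.values())
print("nodes:", len(degree), "edges:", len(edges))
print("median degree:", ranked[len(ranked) // 2], "max degree:", ranked[-1])
```

Comparing the median with the maximum degree shows whether a few early, well-connected nodes have pulled far ahead of the rest, which is the "rich get richer" effect.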
A more biologically realistic network growth
                   model
  “Duplication and divergence model”
  Genes (network nodes) are often duplicated during cell division. A duplication is:
  • sometimes lethal
  • sometimes neutral (the duplicate may persist)
  • sometimes advantageous

  Middendorf et al. paper: can we use PPI data to distinguish different growth models?

  Proc Natl Acad Sci U S A. 2005 March 1; 102(9): 3173–3174
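
One simple way to simulate duplication and divergence is sketched below: copy a randomly chosen node together with its interactions, then let the copy lose each inherited interaction with some probability. The retention probability, the two-node seed network and the network size are all assumptions of this example; published variants of the model differ in their details.

```python
import random

def duplication_divergence(n, retain=0.4, seed=0):
    """Grow a network by copying a random node, then dropping each inherited
    interaction with probability 1 - retain (the divergence step)."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}            # seed network: one interacting pair (assumption)
    while len(adj) < n:
        parent = rng.choice(sorted(adj))
        child = len(adj)
        # The duplicate inherits each of the parent's interactions with probability `retain`.
        kept = {p for p in sorted(adj[parent]) if rng.random() < retain}
        adj[child] = kept
        for partner in kept:
            adj[partner].add(child)
    return adj

adj = duplication_divergence(n=500)
degrees = sorted((len(partners) for partners in adj.values()), reverse=True)
print("nodes:", len(adj))
print("top degrees:", degrees[:5])
print("nodes with no interactions:", degrees.count(0))
```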
    Summary of topology analysis
•   There seems to be a universal structure to many protein-protein
    interaction networks that is common to other types of
    networks
•   Network structure possibly relates to:
     • Adaptation/robustness
     • How the network evolved
     • Small-world property
•   Utility for answering biological questions with “network
    science” is still under debate
PPI network case studies
•   Krogan et al. Global landscape of protein complexes in
    the yeast Saccharomyces cerevisiae. Nature 440, 637-643
    (30 March 2006).
•   (read on your own) Gavin et al. Proteome survey reveals
    modularity of the yeast cell machinery. Nature 440, 631-636
    (30 March 2006).
•   Yu et al. High-Quality Binary Protein Interaction Map of
    the Yeast Interactome Network. Science 3 October
    2008: Vol. 322 no. 5898 pp. 104-110.
    Krogan et al. TAP/MS study
• 4562 of ~6000 yeast proteins tagged
• 4087 proteins pulled down in total, identified by one of
  two mass spec. methods: MALDI-TOF,
  LC-MS/MS
• A machine learning approach was used to
  form a high-confidence set of interactions
• Markov clustering was run on the confidence-weighted
  interactions to identify groups (i.e.
  complexes)
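
To give a feel for the clustering step, here is a compact sketch of the general Markov clustering (MCL) idea: alternate "expansion" (a two-step random walk) with "inflation" (sharpening strong flows) until the flow matrix stops changing. This is not the pipeline used in the study; the inflation value, the self-loop weight, the tolerance and the toy confidence scores are all assumptions of this example.

```python
def markov_cluster(weights, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster the nodes of a weighted, symmetric adjacency matrix (list of lists)."""
    n = len(weights)

    def normalize(m):
        # Make every column sum to 1 so that columns are transition probabilities.
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
        return m

    # Self-loops (weight 1.0 is an assumption of this sketch) damp oscillation.
    m = normalize([[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                   for i in range(n)])
    for _ in range(max_iter):
        # Expansion: two-step random walk (matrix squared).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: raise entries to a power and renormalize, favoring strong flows.
        inflated = normalize([[x ** inflation for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Rows that retain flow act as attractors; the columns they attract form a cluster.
    clusters = {tuple(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    return sorted(list(c) for c in clusters if c)

# Hypothetical confidence-weighted network: two triangles joined by one weak edge.
w = [[0.0] * 6 for _ in range(6)]
for i, j, conf in [(0, 1, 0.9), (0, 2, 0.9), (1, 2, 0.9),
                   (3, 4, 0.9), (3, 5, 0.9), (4, 5, 0.9), (2, 3, 0.1)]:
    w[i][j] = w[j][i] = conf
clusters = markov_cluster(w)
print(clusters)
```

Raising the inflation parameter generally yields more, smaller clusters; lowering it yields fewer, larger ones.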
                          Krogan et al. TAP/MS study




Krogan et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 440, 637-643(30 March 2006)
Reminder about PPI network evaluation
Yu et al. High-Quality Binary Protein Interaction
Map of the Yeast Interactome Network
•   Motivation: experimental benchmarking of quality of
    interactions derived from TAP/MS, Y2H approaches
•   Basic setup:
    •   Positive reference set (~100 well-documented physical protein-protein
        interaction pairs)
    •   Random reference set (~100 randomly selected protein pairs)
    •   Small-scale experimental validation of binary interactions with
        independent Y2H and PCA (protein complementation assay)
H. Yu et al., Science 322, 104-110 (2008)
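
The benchmarking logic can be sketched as follows: score an assay by the fraction of positive-reference pairs it detects, and compare that with the fraction of random-reference pairs it reports. All protein names and assay calls below are made up for illustration.

```python
def detection_rate(detected_pairs, reference_pairs):
    """Fraction of reference pairs that an assay reports as interacting."""
    detected = {frozenset(p) for p in detected_pairs}   # unordered pairs
    reference = [frozenset(p) for p in reference_pairs]
    return sum(1 for p in reference if p in detected) / len(reference)

# Hypothetical reference sets and assay output.
positive_reference = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]
random_reference = [("A", "H"), ("B", "E"), ("C", "G"), ("D", "F")]
assay_calls = [("B", "A"), ("C", "D"), ("E", "F"), ("D", "F")]

sensitivity = detection_rate(assay_calls, positive_reference)  # well-documented pairs recovered
background = detection_rate(assay_calls, random_reference)     # calls among random pairs
print(sensitivity, background)  # 0.75 0.25
```

An assay is informative when the first number is well above the second.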
Interaction quality– binary vs. pull-down interactions




               H. Yu et al., Science 322, 104 -110 (2008)
Y2H interaction hubs are associated with
multiple phenotypes




          H. Yu et al., Science 322, 104 -110 (2008)
Summary of global PPI studies

•   Both TAP/MS and Y2H methods have been
    applied on a global scale (starting in yeast, now
    extended to many other species)
•   Both technologies have limitations, but they are
    complementary in the types of interactions they
    capture
•   Still some debate about the link between
    topology and functional properties (essentiality,
    etc.)
“Date/Party” hub controversy
• Han et al. Evidence for dynamically
  organized modularity in the yeast protein–protein
  interaction network. Nature 2004
  430, 88-93.
• Batada et al. Stratus Not Altocumulus: A
  New View of the Yeast Protein Interaction
  Network. PLoS Biol. 2006 October; 4(10):
  e317.
      Han et al. Evidence for dynamically organized
      modularity in the yeast protein–protein
      interaction network. Nature 430, 88-93 (1
      July 2004).

•   Basic idea: overlay dynamic information
    on static protein-protein interaction
    network
    • dynamic information: gene expression data
    • claim: reveals two distinct classes of hubs,
      “date” hubs vs. “party” hubs
Hub expression coherence
metric
            • For each protein, compute the average
            Pearson correlation coefficient (PCC) between the
            expression profile of its gene and the expression
            profiles of its interaction partners

            [Figure: distribution of average Pearson correlation]
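
A minimal sketch of the average PCC metric. The expression profiles and interaction lists are made up: the "party" protein is co-expressed with both partners, while the "date" protein is correlated with one partner and anti-correlated with the other.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_pcc(protein, partners, expression):
    """Mean correlation between a protein's expression profile and its partners' profiles."""
    values = [pearson(expression[protein], expression[p]) for p in partners[protein]]
    return sum(values) / len(values)

# Hypothetical expression profiles over four conditions.
expression = {
    "party": [1.0, 2.0, 3.0, 4.0], "p1": [2.0, 4.0, 6.0, 8.0], "p2": [1.5, 2.5, 3.5, 4.5],
    "date":  [1.0, 2.0, 3.0, 4.0], "d1": [2.0, 4.0, 6.0, 8.0], "d2": [4.0, 3.0, 2.0, 1.0],
}
partners = {"party": ["p1", "p2"], "date": ["d1", "d2"]}
party_score = average_pcc("party", partners, expression)
date_score = average_pcc("date", partners, expression)
print(round(party_score, 3), round(date_score, 3))
```

Here the party-like protein scores close to 1 and the date-like protein scores close to 0.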
                         Evidence for “Date” vs. “Party” hubs
    [Figure legend]
    Black: randomized network
    Cyan: non-hubs
    Red: hubs

Han et al. Evidence for dynamically organized modularity in the yeast protein–protein interaction network. Nature 2004 430, 88-93.
                “Date” vs. “Party” hubs node removal effect
    [Figure legend]
    Blue: party hubs
    Red: date hubs
    Brown: all hubs
    Green: random proteins
    (controlling for degree, clust. coeff.)

Han et al. Evidence for dynamically organized modularity in the yeast protein–protein interaction network. Nature 2004 430, 88-93.
                           “Date” hub-induced subnetworks are
                                   functionally coherent




Han et al. Evidence for dynamically organized modularity in the yeast protein–protein interaction network. Nature 2004 430, 88-93.
                         Hypothesis about modular organization

             • the interaction network is organized into modules, in
             which party hubs are essential for holding them
             together

             • date hubs participate in a wide range of modules to
             “bridge” these functionally distinct modules together

Han et al. Evidence for dynamically organized modularity in the yeast protein–protein interaction network. Nature 2004 430, 88-93.
    Batada et al. Stratus Not Altocumulus: A New View of the
    Yeast Protein Interaction Network. PLoS Biol. 2006
    October; 4(10): e317.


• present an argument/evidence against the date/party hub
view of organized modularity

“Unfortunately, it is currently unclear whether these
distinctions are real and helpful”


Context:
• Han et al. study was based on combined/filtered Y2H
studies
• More data became available: TAP/MS studies + a large
compilation of literature curated data (curated/published by
the lab behind this paper)
[Figure: FYI network (Han et al. paper) vs. HCfyi network]
Striking differences:
• HC network is nearly a single connected component
• FYI showed underenrichment of hub-hub interactions, not present in HC network
Batada et al. Stratus Not Altocumulus: A New View of the Yeast Protein Interaction Network. PLoS Biol.
2006 October; 4(10): e317.
                  Fraction of Hub–Hub Interactions Reduce
                          with Scale of Experiments




Batada et al. Stratus Not Altocumulus: A New View of the Yeast Protein Interaction Network. PLoS Biol.
                                     2006 October; 4(10): e317.
                      Evidence against the date/party hub
                                 distinction




Batada et al. Stratus Not Altocumulus: A New View of the Yeast Protein Interaction Network. PLoS Biol.
                                     2006 October; 4(10): e317.
             Other marks against the
             date/party hub distinction


• No evidence for bimodality in expression correlation with partners
(Hartigan’s DIP test of unimodality could not reject a unimodal
distribution)

• Repeated localization entropy analysis
    • earlier conclusion appears to depend on removing two
    cellular compartments and the fact that date hubs had higher
    degree
    • updated analysis suggested party hubs had higher entropy

• Increased genetic interaction frequency does not hold up with
larger datasets available after 2004

• Date/party hubs do not appear to have differences in
evolutionary rates
                         Stratus not altocumulus?




• argue for a more complex model of modular
organization in which modules are more
difficult to cleanly separate, hubs are well-
connected to each other




     Batada et al. Stratus Not Altocumulus: A New View of the Yeast Protein Interaction Network. PLoS Biol.
                                          2006 October; 4(10): e317.
          Summary of date/party hub
               controversy

•   Many different (conflicting) views about global
    organization of protein-protein interaction
    networks and the biological significance of these
    properties
•   Topology studies are highly dependent on the
    source of the protein-protein interaction data:
    literature-curated vs. global unbiased, ...
•   Details of how analysis is done can often affect
    significance of results
Next time: paper discussions
Evaluating network growth models for protein-protein interaction networks

   M. Middendorf, E. Ziv, C. H. Wiggins. Inferring network mechanisms: the
   Drosophila melanogaster protein interaction network. Proc Natl Acad Sci U S
   A., 102(9):3192–7.


 Cross-species comparison of protein-protein interaction networks
    Kelley, B. P., Sharan, R., Karp, R., Sittler, E. T., Root, D. E., Stockwell, B.
    R., and Ideker, T. Conserved pathways within bacteria and yeast as
    revealed by global protein network alignment. Proc Natl Acad Sci U S A
    100, 11394-9 (2003).

				