Untangling the protein web

					NATURE|Vol 460|16 July 2009                                                             TECHNOLOGY FEATURE SYSTEMS BIOLOGY

Researchers have identified thousands of macromolecular interactions within cells. But, as Nathan Blow
finds out, joining them up in networks and figuring out how they work still poses a big challenge.

In the spring of 2006, Andrew Emili and Jack Greenblatt from the University of Toronto in Canada and their colleagues published a survey [1] of the global landscape of protein complexes within the yeast Saccharomyces cerevisiae in Nature. In the same issue, another group of researchers from the drug research company Cellzome in Heidelberg, Germany, also reported [2] on Saccharomyces protein complexes. “Those two data sets overlapped nicely, but by no means perfectly,” says Mike Tyers, a systems biologist at the University of Edinburgh, UK. “And yet it was essentially the same method and same organism.”

Greenblatt thinks that the two studies highlight something important that is emerging from the current crop of large-scale protein–protein interaction studies. “If you combine data sets you have more information than from any one study alone,” he says. This is not to say that one such study is right and the other is wrong: scientists suspect it is more likely that one study often compensates for another’s false negatives, revealing true protein interactions that can be missed during a single screen.

“I think the interaction space is very large. Part of the issue is that there is a large range of interaction affinities, and as you start to get down into the weaker interactions those are tougher to detect,” says Tyers. He adds that identifying such “moving targets” is not like sequencing DNA, which can be argued to be a more stable target for researchers to aim at.

But the jigsaw pieces are starting to pile up as researchers generate more and more genetic, metabolic and protein-interaction data sets using a diverse array of technologies. This work has been aided in recent years by a number of improved methods and techniques. Add to this recent refinements in computational tools for modelling signalling pathways and it’s clear that scientists might be on the cusp of changing the way they look at signalling and information flow in cells.

[Figure: Pathway maps illustrate the complexity of cellular interactions.]

Embracing diversity

“I think genetic information lays out the blueprints, whereas proteomics is much closer to what is going on in the cell, a molecular manifestation of a phenotype,” says Mike Snyder, a biologist at Yale University. When it comes to cataloguing proteins and their interactions, researchers are learning to embrace experimental diversity. “Every approach will usually give an overlapping but distinct set of information,” says Snyder. “They all have their strengths and weaknesses.”

[Figure: Mike Snyder has used protein arrays to explore the yeast interactome.]

Tyers and Greenblatt are in a growing group of investigators who are advancing the use of affinity-purification chromatography followed by mass spectrometry to uncover protein interactions in different cell types. In this approach, a protein of interest is tagged with a label that can be used for affinity purification. Although some scientists suspect weaker-interacting protein pairs or transient interactions could be lost during purification, Greenblatt — whose lab relies on tandem affinity purification tags in their purifications — says this is where the use of mass spectrometry helps out. “Mass spectrometry is very sensitive, so even if you lose 90% of the interactor during the affinity purification you can still detect the 10% that is left,” he says.

As with the technologies behind protein–protein analysis, researchers are finding that no single labelling tag may be enough to isolate all proteins. Tyers’s group recently reinterrogated a section of the yeast proteome using three different tags, each with different properties. “For a number of baits we queried, it made a difference what tag was on it,” he says. “Tags can certainly affect the recovery of interactions, consistent with the well-known genetic effects often caused by different tags.”

Once a specific protein or protein complex is purified, it is analysed with mass spectrometry. Electrospray ionization or matrix-assisted laser desorption ionization (MALDI) volatilizes and ionizes peptides, which are analysed on orthogonal or quadrupole time-of-flight (Q-TOF) instruments to identify ions with high mass-to-charge ratio values. Here researchers have benefited greatly from advances by instrument developers. During the American Society for Mass Spectrometry annual conference in Philadelphia, Pennsylvania, in June, Bruker Daltonics of Billerica, Massachusetts, announced its new ultrafleXtreme MALDI TOF/TOF system, and Thermo Fisher Scientific of Waltham, Massachusetts, introduced the LTQ Velos and LTQ Orbitrap Velos devices. Alongside other hardware, such as the Xevo Q-TOF from Waters in Milford, Massachusetts, and the 6500 series of Q-TOF instruments from Agilent Technologies in Santa Clara, California, these machines have improved both the dynamic range and sensitivity of mass analysis; in many cases they also feature integrated upstream separation technologies and improved databases, all of which is making it easier to define a sample’s protein composition. For additional detail in the analysis, protein complexes can also be analysed with tandem mass spectrometry, in which selected precursor ions can be broken apart to produce still smaller fragments for analysis.

[Figure: Advances in mass spectrometry technology are benefiting protein–protein interaction studies.]

“The big advantage of using mass spectrometry is that it can be performed in a physiological context,” says Tyers. Unlike other methods for surveying protein–protein interactions, mass spectrometry can be done on cell lines or even tissue samples, so indirect interactions that depend on more than two proteins or on post-translational protein modifications can be uncovered. Still, some researchers suggest that although affinity purification followed by mass spectrometry gives important information on how proteins interact in complexes, the approach does not reveal everything about the nature and mechanics of those interactions.

Yeast shows the way

Binary approaches, such as the yeast two-hybrid assay, can provide different protein interaction information, according to Marc Vidal, a geneticist at the Dana–Farber Cancer Institute in Boston, Massachusetts. Vidal uses the analogy of two football teams facing each other with referees in the middle of the field to explain the differences between the techniques. “The pull-down mass-spectrometry approach will show you the players, referees and field, but not who is passing to whom and in what direction the ball is travelling,” he says. “This is where a binary approach comes in.”

[Figure: Caenorhabditis elegans interactome map, showing 5,500 protein interactions among 3,000 proteins.]

The yeast two-hybrid assay is arguably the best-known binary approach. It relies on a split transcription factor in which one portion is placed on each of the two proteins being tested for interaction. If the proteins interact, the transcription factor will be regenerated and a reporter gene transcribed, providing a read-out. The assay allows for more testing of the dynamics of protein–protein interactions, such as dissociation rates. “Physical interactions are not everything: you need both edges and arrows to know the dissociation rates as well as other logical aspects of the relationships. Pull-down mass spectrometry is a little short when it comes to those interactions,” says Vidal.

The other advantage of the yeast two-hybrid approach is that it presents a more high-throughput solution to studying protein interactions. “The two-hybrid approach is reasonably high-throughput,” says Snyder, noting that with robotics a large number of proteins can be tested for potential interactions in a two-by-two format.

Other approaches have also been rising to the surface. “From the probing that we have done, we have picked up interactions that you definitely do not see with other methods,” says Snyder of his experience using protein microarrays to explore protein–protein interactions. Protein arrays, which are sold by a number of companies including Invitrogen in Carlsbad, California, RayBiotech in Norcross, Georgia, and R&D Systems in Minneapolis, Minnesota, have not been used as often for large-scale protein-interaction studies as either mass spectrometry or the binary-interaction approaches. “They have had impact in certain areas. Part of the problem is that they have been somewhat expensive, which might be the reason that they have not caught on as much for large-scale studies,” says Snyder. Given the potential of protein microarrays to identify unique interactions, he hopes that costs will fall, which could increase their use in large-scale interaction studies.

An orthogonal approach to the yeast two-hybrid assay for detecting protein–protein interactions is the protein-fragment complementation assay (PCA), in which two proteins of interest are attached to complementary fragments of a reporter protein. If the proteins interact with one another the reporter is regenerated, providing a direct read-out that is not dependent on transcription of another gene as in the yeast two-hybrid assay. Steven Michnick and his colleagues used the PCA approach last year [3] to explore the yeast-protein interactome, identifying nearly 2,800 interactions among 1,124 proteins, many of which had not previously been identified by other approaches.

Additional work and tools could be needed to define a complete interaction map for even the most well-characterized organisms. Snyder suspects that in yeast each protein ‘sees’ about five other proteins on average. But at the moment all of the interactions identified for yeast, which has around 6,000 proteins, add up to far fewer than the potentially 30,000 predicted. “So, there is still a way to go,” he says.

Clear pathways

Finding which macromolecules interact is only the first step to figuring out signalling pathways. Researchers also need methods to assemble those interactions into cellular networks, which is where bioinformatics enters the picture. “It is like building a bicycle — you have the wheels, a seat and handlebars, but we provide the steps to put the parts together,” says Julie Bryant, vice-president of business development at GeneGo in St Joseph, Michigan, a company specializing in the development of software for cell-signalling and metabolic analysis. GeneGo is not alone here: a growing number of developers are creating tools for the analysis of signalling networks — from those that build model networks based on existing data to systems that use data sets and models to make predictions about the activity of different signalling networks.

“We can take in any kind of experimental data — genomic, proteomic, metabolomic — and overlay them on cell-signalling pathways,”
says Bryant, describing GeneGo’s MetaCore software. Being able to overlay a variety of different experimental data from different sources requires careful database curation, she says. At the moment, GeneGo employs 50 scientists to manually mine and curate published literature for studies on protein interaction, gene expression, metabolism and drugs to expand and update its internal database, which now contains more than 120,000 multi-step interaction pathways, each averaging 11 steps, with information on direction, mechanism and feedback along the pathways, along with direct links to literature evidence.

[Figure: Different approaches for identifying protein–protein interactions often reveal unique information.]

Literature mining is important for building larger interaction databases, but Bryant says it can be especially difficult if the experimental descriptions underlying the results have not been published. Another problem, according to Vidal, is that researchers sometimes have “sociological” biases in terms of which proteins and interactions they will work on and report. “We have learned a lot about the rules of how macromolecules interact, but when you ask how much of the network we have, or what the size of the interactome of a particular species is, if you only used the literature it would be tough to answer those questions,” he says.

Tyers is involved with the publicly funded BioGRID (Biological General Repository for Interaction Datasets) initiative, an internationally curated database of molecular interactions. Three years ago, there was an effort to back-curate all the yeast literature for protein and genetic interactions, but now the database contains protein-interaction data from yeast, worms, flies, plants and even humans along with some genetic-interaction data as well. For Tyers, the goal is to accurately mirror the primary literature and distil it into a format that can be used in network biology. “We make no judgement calls on the method or even, within reason, the quality of the data themselves,” he says, giving researchers the opportunity to extract the maximum amount of information.

A different angle in modelling signalling networks was recently described by Walter Fontana from Harvard University and his colleagues [4]. It uses sets of rules to define relationships between cellular components instead of the more conventional method of defining specific interactions and species using differential equations. Fontana co-founded a company called Plectix BioSystems in Somerville, Massachusetts, which has employed this approach in a web-based system called Cellucidate.

[Figure: Cell-signalling software packages allow researchers to model and test cellular interaction networks.]

“The system is represented at a very granular level where the participants are allowed to do in silico what they would do in real life,” says Paul Edwards, chief executive at Plectix. Imagine the city-building computer game SimCity reworked for complex cellular networks, but here the agents of the cell — proteins and other molecules — are the automata instead of colourful animated people. “In that way the model mirrors the behaviour of the living system it represents: the biology that emerges from our models is the combinatorial expression of all these automata doing their own little thing — just the way it is in the cell,” says Gordon Webster, vice-president of biology at Plectix.

Complexity from simplicity

According to Edwards, the advantage of the Cellucidate approach is that a simple set of rules for each agent can result in complex biological behaviour when agents interact during the course of a simulation, unlike modelling in other formats, where the complexity has to be defined before a simulation can be executed. “The level of granularity also means that rules and agents can be easily recycled from one model to another,” he notes. Like the GeneGo platform and the BioGRID initiative, Plectix relies on literature mining from various sets of experimental data to create the rules for a model system (see ‘Playing by the rules’).

“Mapping all interactions is important, but so is understanding the dynamics behind those interactions,” says Snyder. To understand the dynamics of the information flow in cells, researchers not only need more knowledge of protein–protein interaction networks, but they also need to understand protein–DNA interactions, the effects of microRNAs and epigenetic changes on gene expression, and how other molecules such as metabolites affect the output of signalling networks. “It is the whole system together that determines the final output and activity,” says Snyder.

Vidal thinks that technological improvements — especially in nanotechnology, to generate more data, and microscopy, to explore interaction inside cells, along with increased computer power — are required to push systems biology forward. “Combine all this and you can start to think that maybe some of the information flow can be captured,” he says.

But when it comes to figuring out the best way to explore information flow in cells, Tyers jokes that it is like comparing different degrees of infinity. “The interesting point coming out of all these studies is how complex these systems are — the different feedback loops and how they cross-regulate each other and adapt to perturbations are only just becoming apparent,” he says. “The simple pathway models are a gross oversimplification of what is actually happening.”

Paul Nurse of Rockefeller University in New York wrote about understanding the cell’s information flow last year [5]. He noted that “our past successes have led us to underestimate the complexity of living organisms”, an oversight that is rapidly disappearing within the world of systems biology and will probably never happen again.

Box: Playing by the rules

When researchers at Plectix BioSystems in Somerville, Massachusetts, began to use their new Cellucidate software to model the epidermal growth factor receptor pathway, they calculated that there were 10^33 potential states — including all protein complexes and phosphorylation states — for the system. “This is the kind of complexity that scientists have to grapple with when it comes to cell-signalling networks,” says Gordon Webster, vice-president of biology at Plectix.

Although not all these potential states necessarily occur in that pathway, when it comes to creating more manageable models for understanding cell signalling researchers face a difficult question: what interaction data do they use in their models? Although many commercial and public databases still rely heavily on the small-scale protein–protein-interaction studies that appear in peer-reviewed literature, the emergence of high-throughput experimental approaches that generate very large interaction data sets is creating the need for a new set of rules.

“In practice, what comes out of these high-throughput studies is not a yes/no thing — ‘these interact, and these don’t’ — but in fact they generate a list of interactions and associated probabilities,” says Jack Greenblatt from the University of Toronto in Canada. To generate such probabilities for his mass-spectrometry studies, Greenblatt applied a ‘gold standard’ for protein interactions — a set of protein complexes or interactions in which there is a strong amount of confidence according to the literature — as well as a set of proteins not known to interact with one another as a negative standard. He then tackled the question of whether or not data sets generated by mass spectrometry stacked up against protein-interaction reports seen in peer-reviewed literature.

[Figure: Graphical representation of the current budding-yeast interaction network.]

“What we did in the end was to use the same gold standard to look at the molecular-biology literature,” says Greenblatt. After adjusting the cut-off point so that the average confidence score from a high-throughput study matched the confidence score of interactions reported in the literature, he says the interaction data from such studies are no better or worse than what is in the literature.

Marc Vidal, a geneticist at the Dana–Farber Cancer Institute in Boston, Massachusetts, wants to see a similar approach taken with yeast two-hybrid and other binary screens. “Let’s roll up our sleeves and decide on a positive and negative gold standard,” he says. “But let’s also use orthogonal assays to give confidence scores to the interactions.”

In January, Vidal and his colleagues published a series of papers [6–9] suggesting the use of new binary interaction assays to build confidence in basic networks produced using yeast two-hybrid data sets. “You say ‘OK, this is basic network’ and then push that into a framework where all interactions are going to be tested by two or three orthogonal assays. And not only that, but do that under conditions where you have a positive and negative gold standard,” says Vidal, adding that the high-scoring interactions can then serve as hypotheses for researchers to test.

Whether or not these efforts and standards will lead researchers to rely more on large-scale data sets and mine them more deeply will only be known in time. For some, even with confidence measures, large-scale data sets lack information often found in smaller studies. “This is one of the paradoxes that we find when people talk about systems biology. With technology it is very easy to generate spreadsheets of interaction data, but that alone does not represent any knowledge,” says Webster.

But for Greenblatt and others, large-scale data sets represent a starting point for further research efforts. “To me, high-throughput studies are just like the conventional literature,” he says, “providing a gold mine for people to dig into.” N.B.
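The kind of gold-standard calibration Greenblatt describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not his group's actual procedure: it assumes every candidate interaction already carries a confidence score, it uses precision against the positive and negative gold-standard sets as a stand-in for the "average confidence" he mentions, and all protein names, scores and function names are hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Set

Pair = FrozenSet[str]


def pair(a: str, b: str) -> Pair:
    """Unordered protein pair, so (A, B) and (B, A) compare equal."""
    return frozenset((a, b))


def precision_at(scores: Dict[Pair, float], positives: Set[Pair],
                 negatives: Set[Pair], cutoff: float) -> Optional[float]:
    """Among gold-labelled pairs scoring >= cutoff, the fraction that are positives."""
    kept = [p for p, s in scores.items() if s >= cutoff]
    tp = sum(1 for p in kept if p in positives)
    fp = sum(1 for p in kept if p in negatives)
    return tp / (tp + fp) if (tp + fp) else None


def calibrate_cutoff(scores: Dict[Pair, float], positives: Set[Pair],
                     negatives: Set[Pair], target: float) -> Optional[float]:
    """Lowest cutoff whose gold-standard precision reaches the target (assumed criterion)."""
    for cutoff in sorted(set(scores.values())):
        prec = precision_at(scores, positives, negatives, cutoff)
        if prec is not None and prec >= target:
            return cutoff
    return None


# Hypothetical high-throughput scores and gold-standard sets (illustrative only).
scores = {
    pair("A", "B"): 0.9, pair("A", "C"): 0.8, pair("B", "D"): 0.6,
    pair("C", "D"): 0.4, pair("D", "E"): 0.3, pair("E", "F"): 0.2,
}
positives = {pair("A", "B"), pair("A", "C"), pair("C", "D")}
negatives = {pair("B", "D"), pair("D", "E"), pair("E", "F")}

# Suppose the literature scored 0.75 against the same gold standard.
cutoff = calibrate_cutoff(scores, positives, negatives, target=0.75)
print("calibrated cutoff:", cutoff)
```

Interactions scoring at or above the calibrated cutoff would then be treated as comparable in reliability to literature reports. A real analysis would need far larger gold-standard sets and a properly justified scoring model.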
Nathan Blow is technology editor for Nature and Nature Methods.

1. Krogan, N. J. et al. Nature 440, 637–643 (2006).
2. Gavin, A.-C. et al. Nature 440, 631–636 (2006).
3. Tarassov, K. et al. Science 320, 1465–1470 (2008).
4. Feret, J., Danos, V., Krivine, J., Harmer, R. & Fontana, W. Proc. Natl Acad. Sci. USA 106, 6453–6458 (2009).
5. Nurse, P. Nature 454, 424–426 (2008).
6. Vidal, M. et al. Nature Methods 6, 39–46 (2009).
7. Vidal, M. et al. Nature Methods 6, 47–54 (2009).
8. Vidal, M. et al. Nature Methods 6, 83–90 (2009).
9. Vidal, M. et al. Nature Methods 6, 91–97 (2009).

[Figure: Mike Tyers uses mass spectrometry to identify protein–protein interactions.]

