PLUS: DOCK THIS: IN SILICO DRUG DESIGN FEEDS DRUG DEVELOPMENT
Summer 2007
Contents

Summer 2007
Volume 3, Issue 3
ISSN 1557-3192

FEATURES

Imaging Collections: How They’re Stacking Up
BY MEREDITH ALEXANDER KUNZ

Dock This: In Silico Drug Design Feeds Drug Development
BY KRISTIN COBB, PhD

DEPARTMENTS

FROM THE EDITOR: THE ACTIVE TRANSPORT OF IDEAS
BY DAVID PAIK, PhD

NEWSBYTES
BY KATHARINE MILLER, LOUISA DALTON, AND MATTHEW BUSSE, PhD
• Aquaporin Simulations De-Bunk Gas Exchange Assumptions
• Parkinson’s Culprit Modeled
• Clustering Without Limits
• Computer Vision That Mimics Human Vision
• Nature vs. Nurture In Silico
• Simulating Populations With Complex Diseases

SIMBIOS NEWS: IN THE (PROTEIN) LOOP
BY KATHARINE MILLER

UNDER THE HOOD: MUTUAL INFORMATION
BY CHIH-WEN KAN AND MIA K. MARKEY, PhD

PUTTING HEADS TOGETHER: CONFERENCES/SYMPOSIA

SEEING SCIENCE: REMODELING WITH CURVATURE

COVER ART BY SARA L. MALLOURE OF AFFILIATED DESIGN

Executive Editor
David Paik, PhD

Managing Editor
Katharine Miller

Science Writers
Katharine Miller
Louisa Dalton
Matthew Busse, PhD
Meredith Alexander Kunz
Kristin Cobb, PhD

Community Contributors
David Paik, PhD
Mia Markey, PhD

Layout and Design
Affiliated Design

Printing
Advanced Printing

Editorial Advisory Board
Russ Altman, MD, PhD
Brian Athey, PhD
Andrea Califano, PhD
Valerie Daggett, PhD
Scott Delp, PhD
Eric Jakobsson, PhD
Ron Kikinis, MD
Isaac Kohane, MD, PhD
Paul Mitiguy, PhD
Mark Musen, MD, PhD
Tamar Schlick, PhD
Jeanette Schmidt, PhD
Michael Sherman
Arthur Toga, PhD
Shoshana Wodak, PhD
John C. Wooley, PhD

For general inquiries, subscriptions, or letters to the editor, visit our website at www.biomedicalcomputationreview.org

Office
Biomedical Computation Review
Stanford University
318 Campus Drive
Clark Center Room S231
Stanford, CA 94305-5444

Biomedical Computation Review is published quarterly by Simbios National Center for Biomedical Computing and supported by the National Institutes of Health through the NIH Roadmap for Medical Research Grant U54 GM072970. Information on the National Centers for Biomedical Computing can be obtained from http://nihroadmap.nih.gov/bioinformatics. The NIH program and science officers for Simbios are:
Peter Lyster, PhD (NIGMS)
Jennie Larkin, PhD (NHLBI)
Jennifer Couch, PhD (NCI)
Semahat Demir, PhD (NSF)
Peter Highnam, PhD (NCRR)
Jerry Li, MD, PhD (NIGMS)
Richard Morris, PhD (NIAID)
Grace Peng, PhD (NIBIB)
David Thomassen, PhD (DOE)
Ronald J. White, PhD (NASA/USRA)
Jane Ye, PhD (NLM)
Yuan Liu, PhD (NINDS)




BIOMEDICAL COMPUTATION REVIEW      Summer 2007                                               www.biomedicalcomputationreview.org
From the Editor
BY DAVID PAIK, PhD

The Active Transport of Ideas

How ideas spread gets at the very fabric of scholarly research and has been studied from many different angles.

Many studies examine person-to-person connectivity in social networks. Within a social network, the average path length between any two people is a key concept. By asking participants in Omaha or Wichita to mail chain letters that would get closer to selected recipients in Boston, Milgram’s classic 1967 small world experiment demonstrated the six degrees of separation concept. Movie buffs have created a board game using this concept called the Six Degrees of Kevin Bacon and those interested in mathematical genealogy have adopted Erdös Numbers linking researchers by co-authorship to the prolific mathematician Paul Erdös.

However, a small world is not necessarily a robust world. In addition to path lengths, the connectedness between different parts of the social network is an important measure. A recent Journal of the American Medical Informatics Association paper by Bradley Malin, PhD, and Kathleen Carley, PhD, examines the connection between editorial boards of medical informatics and bioinformatics journals to describe the fragility of links between these two sister fields.

There are also many ways to examine the spread of ideas more broadly. The Rogers theory of diffusion of innovation states that depending on when they adopt new ideas, people form a bell curve as either innovators, early adopters, early majority, late majority or laggards and that the innovation penetration forms an S curve over time. The five stages are awareness of the innovation, persuasion of the value of the innovation, decision to adopt the innovation, implementation of the innovation and confirmation of the value of the innovation. Although broadly meant to describe the cultural spread of ideas and technology, it applies well in the narrower context of academic research. While the last four stages are well covered by traditional research activities, it is the initial stage of becoming aware of new ideas from far afield that is often the rate limiting factor and the least formalized in research.

As a great believer in the power of cross fertilization, I think that diffusion is too passive a metaphor; I prefer instead to think in terms of the active transport of ideas and places where I can search out sources that facilitate long range transport.

I’ve recently found inspiration for orthogonal thinking from several unconventional sources. The TED (Technology, Entertainment, Design) Conference features a diverse set of inspiring speakers and is podcasted on the web. Edge Foundation is a web-based publication that includes the World Question Center annually featuring a grand yet simple question asked of numerous notable scientists. On the more focused topic of biomedical computation, the NIH Biomedical Computing Interest Group hosts webcast seminars, book clubs, tutorials and brainstorming events.

Although things are changing, academia is still hampered by the inertia of traditional boundaries between disciplines that form unintentional energy barriers against the diffusion of ideas. Just as a retreat or a sabbatical can provide a refreshing perspective, a foray into some areas that may seem off topic can also provide a little dose of hybrid vigor to one’s work. ■

DETAILS
Technology, Entertainment, Design (TED) Conferences: http://www.ted.com
Edge Foundation: http://www.edge.org
NIH Biomedical Computing Interest Group: http://www.nih-bcig.org
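
To make the "average path length" idea from the column concrete, here is a minimal Python sketch that measures it with breadth-first search. Everything in it is an assumption made for illustration: the function name `average_path_length`, the six-person ring network, and the single added shortcut are hypothetical and are not taken from Milgram's experiment or any study cited above.

```python
from collections import deque

def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of distinct,
    mutually reachable nodes. `graph` maps each node to a set of neighbors."""
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search gives hop counts from `source` to every reachable node.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0

# Hypothetical network: six people in a ring, each knowing two neighbors.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}

# The same ring with one long-range acquaintance between persons 0 and 3.
shortcut = {person: set(friends) for person, friends in ring.items()}
shortcut[0].add(3)
shortcut[3].add(0)

ring_length = average_path_length(ring)
shortcut_length = average_path_length(shortcut)
print(ring_length, shortcut_length)
```

Comparing the two values shows how a single long-range link shortens the typical chain of acquaintances, which is the intuition behind the small-world result.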



NewsBytes

Aquaporin Simulations De-Bunk Gas Exchange Assumptions

Biologists have long taken gas exchange for granted, assuming that gases simply seep through the cell’s lipid membrane. Since 1998, however, evidence has been building that gases might also be exchanged through pores created by specialized proteins.

Now molecular dynamics simulations of aquaporins have weighed in on the question. The result: “It’s now well established that these proteins can conduct gas molecules,” says Emad Tajkhorshid, PhD, co-author of the work and assistant professor of biochemistry, pharmacology and biophysics at the University of Illinois at Urbana-Champaign. But, he says, some uncertainty remains: “Whether or not it’s important in the human body, that’s the controversial part.” The work was published in the March 2007 issue of the Journal of Structural Biology.

Fifteen to twenty years ago, scientists believed that water permeation through lipid bilayers was enough for water transport into and out of cells. Gradually, though, researchers realized that some cells need to control water permeability, and other cells have lipid bilayers that aren’t very permeable to water. Aquaporins, it turned out, carry water in and out in a controllable fashion. “I think the same might be true for gas permeability,” says Tajkhorshid. “Gas permeability of a lipid bilayer is like an open free highway where everything can go through. With a protein, you can have a gating mechanism and some regulation.”

One of Tajkhorshid’s collaborators, Walter Boron, MD, PhD, professor of cellular and molecular physiology at Yale University, has been working on gas exchange experimentally for about ten years. To him, aquaporins are a likely suspect for gas conduction because they exist in places where oxygen must go in and carbon dioxide must come out. For example, they are plentiful in cells that line the lung, in red blood cells, and in astrocytes (cells at the blood-brain barrier). But it’s very hard to measure small changes in oxygen concentration at the surface of a membrane experimentally.

So Tajkhorshid’s team pitched in with molecular dynamics simulations. Aquaporins occur in groups of four (tetramers), with four pores that conduct water (one through each aquaporin molecule) and one central pore where the molecules meet. The latter, until now, had no known function. When simulated using two complementary methods (explicit sampling with full gas permeation and implicit ligand sampling), the team found both oxygen and carbon dioxide were exchanged through that central pore. Carbon dioxide was also transmitted through the four water pores, while oxygen passed through those pores only rarely. The research also found, however, that a plain lipid bilayer conducts two and a half times as much gas as one embedded with aquaporin tetramers. “The question is whether this pathway is significant and makes any difference in terms of total permeability of the membrane,” says Tajkhorshid.

The researchers hypothesize that, as with water permeability, aquaporins may be physiologically relevant to gas exchange when cells have dense, rigid lipid bilayers or when aquaporins occupy a major fraction of the membrane.

Tajkhorshid plans to introduce point mutations inside the central pore and manipulate the behavior of a gating loop to see how that changes the conducting properties of the central pore. Meanwhile, Boron’s group is looking for a system in which gas conduction through aquaporins is a major pathway. Says Tajkhorshid: “Even if it’s 30 percent of total gas permeability, it becomes physiologically relevant because then you can control it.”

According to Nazih Nakhoul, PhD, research associate professor in biochemistry at Tulane University, “This idea of gas transport through membrane proteins is really gaining support. It’s interesting to see molecular dynamics simulations confirm some of the earliest findings.”
By Katharine Miller

Simulations of the aquaporin tetramer found that carbon dioxide and oxygen are exchanged through the central pore, a site of previously unknown function. Image courtesy of Emad Tajkhorshid, a faculty associate of the NIH Resource for Macromolecular Modeling and Bioinformatics, and his UIUC colleagues Klaus Schulten, Yi Wang, and Jordi Cohen.

Parkinson’s Culprit Modeled

Under a microscope, the curious protein clumps that dot the brains of Parkinson’s patients stick out like the culprits they are. But no one has yet caught the protein, alpha-synuclein, in the act of causing disease. Now, investigators report in an April 2007 issue of FEBS Journal that they’re getting closer: they’ve modeled alpha-synuclein’s early aggregation and offered a detailed mechanism for its participation in neuron death.

“This is not just the first computational model of alpha-synuclein,” says Igor Tsigelny, PhD, an author of the paper and a computational biologist at the San Diego Supercomputer Center. “Up to now, there was no molecular concept of the aggregation going on.”

In the brain cells of Parkinson’s patients, alpha-synuclein first starts to cluster as a proto-fibril. It then forms fibril chains, and finally ends up in the dense clumps of fibrils called Lewy bodies. Some researchers have suggested in the past few years that alpha-synuclein knocks off neurons right at the beginning of aggregation, long before it can be detected as a Lewy body. Biochemical and structural evidence hints that when a few alpha-synuclein molecules first self-assemble into proto-fibrils, they can form pore-like ring structures. These may interact with the cell membrane and allow ions to enter the cell. The entrance of ions such as Ca²⁺ could lead to neuron death.

The computer model created by Tsigelny and his colleagues at the University of California, San Diego, supports this theory, providing detailed dynamics of alpha-synuclein hexamers and pentamers and their interaction with the cell membrane. What’s more, the model shows that another synuclein in the cell, beta-synuclein, blocks alpha-synuclein’s ring-making, suggesting at least one avenue for future inhibitory drug development.

Modeling such a complex aggregation wasn’t simple. Alpha-synuclein is a large protein (140 amino acids), and to model its hexamer interacting with the cell membrane required juggling around a million atoms, Tsigelny says.

Yet more than the size of alpha-synuclein, what made it difficult to model was its lack of structure. Alpha-synuclein is an intrinsically unstructured protein, one without a distinct three-dimensional shape. Most proteins consistently fold into a favored shape to do their jobs, a form that can be crystallized, imaged, and pored over. But unstructured proteins flop this way and that, even while performing their specific tasks, making them very difficult to pin down and study.

“We were not scared by an unstable protein,” Tsigelny states. And he and his coworkers developed an unusual “all-dynamic” approach to modeling the protein. None of the conformations are final: they are all considered intermediate, and each may last only as long as half of a nanosecond. Nevertheless, Tsigelny says, even such fleeting intermediates may aggregate. The pore-like aggregates, they found, are far more stable than single molecules of alpha-synuclein.

Having this model “is one step forward,” says Hilal Lashuel, PhD, professor at the Swiss Federal Institute of Technology in Lausanne, Switzerland. The UCSD model provides a structural basis for testing the hypothesis that alpha-synuclein forms toxic pores, he adds. But Lashuel also cautions that only biochemical and in vivo studies can prove whether alpha-synuclein pokes holes in neurons. “Isolating the toxic species is really the most difficult question we are dealing with. You have to catch it in the act.”
By Louisa Dalton

Alpha-synuclein poses as a pentamer, pore-like, on the surface of a cell membrane. Courtesy of Igor Tsigelny



       Clustering Without Limits                              “Part of the
           Starting in preschool we all learn how
       to get organized. Typically, we start with
       pre-determined categories (dolls, trains,           attraction of the
       blocks); pre-set ideas about what belongs
       in each category (Barbie: doll; Thomas           [affinity propagation]
       the Tank Engine: train) and a fixed num-
       ber of bins to put things in.                       algorithm is that,
           But what if you started with none of
       those initial limitations? Could you still           although it was
       group the toys? It turns out that, in a
       computer, such sorting is not only possi-
       ble, but extremely efficient. Using a
                                                            complicated to
                                                                                                         Frey and Dueck use affinity propagation
       novel algorithm called affinity propaga-
       tion, researchers at the University of
                                                           derive, it’s quite                            to cluster data around “exemplars”—
                                                                                                         data points that best represent their
       Toronto found that they can not only
       cluster lots of different kinds of data          simple to implement                              compatriots. In this graphic, after start-
                                                                                                         ing with an equal chance of serving as an
       appropriately, but do it better and faster
       than other methods. The work was                      and to get an                               exemplar, candidates for that job have
                                                                                                         already emerged (red dots). Each data
       published in the February 16 issue of
       Science.                                          intuitive feel for it,”                         point sends messages to each candidate
                                                                                                         exemplar conveying how well it repre-
           “Almost all existing techniques work
       on a hypothesis refinement basis: they            says Brendan Frey.                              sents the blue point compared to other
                                                                                                         candidate exemplars. And candidate
       start off with a set of assumed groups
                                                                                                         exemplars send messages conveying
       and iteratively refine them,” says
                                                                                                         their availability to serve as an exemplar
       Brendan Frey, PhD, associate professor
                                                                                                         for particular data points.
       of electrical and computer engineering                 The task sounds mind-boggling: There
       at the University of Toronto, co-author            are a huge number of possible groupings.
       of the paper. “To our knowledge, ours is           But affinity propagation handles that         says Dueck. Indeed the algorithm is so
       the first algorithm to consider all possi-         problem by sending messages between           generic that Frey and Dueck used it to
       ble groupings at once.”                            data points—pair-wise—so as to maximize       analyze gene expression data, facial
                                                                           the net similarity in        images, and airline routes, while other
                                                                           each group. “Each mes-       researchers have found applications in
sage encapsulates or summarizes a whole distribution of possible groupings for one of the data points,” says Delbert Dueck, a PhD candidate in Frey’s lab. “No one has done that before.”

    Affinity propagation is based on an algorithm called belief propagation, which has been around in various incarnations for many years. But, say the authors, it’s an approach that has never been applied to clustering. “Certainly not to generic clustering of any type of data,”

basketball statistics, the stock market and computer vision. And many tasks in computational biology require a computer to organize the data before using it to make predictions.

    “Part of the attraction of the algorithm is that, although it was complicated to derive, it’s quite simple to implement and to get an intuitive feel for it,” says Frey. There are basically only two equations to it. “Sometimes we’ll give a talk and get emails from people who’ve implemented it the day after,” he says.

    When the researchers looked at how well the algorithm performed compared to other clustering methods, they found it remarkably efficient. “A problem our algorithm could solve in about five minutes on one computer would take other methods up to one million years to solve on that same computer,” says Frey.

If asked to cluster facial images, a standard clustering method (k-means clustering) would take up to a million years on a single computer to achieve the accuracy achieved by affinity propagation after five minutes.
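Frey's remark that the method comes down to two equations can be made concrete with a small sketch. The Python below is an illustration, not the authors' code: it assumes the two message updates (usually called "responsibility" and "availability") in the form in which affinity propagation is commonly stated. The damping factor, the preference value of -20 and the six one-dimensional points are hypothetical choices made only for this example.

```python
def affinity_propagation(similarity, iterations=200, damping=0.5):
    """Return, for each point, the index of the point chosen as its exemplar.

    similarity[i][k] says how well point k would serve as the exemplar for
    point i; the diagonal holds each point's "preference" for being one.
    """
    n = len(similarity)
    resp = [[0.0] * n for _ in range(n)]   # responsibility messages r(i, k)
    avail = [[0.0] * n for _ in range(n)]  # availability messages a(i, k)

    for _ in range(iterations):
        # Equation 1: r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            for k in range(n):
                rival = max(avail[i][j] + similarity[i][j]
                            for j in range(n) if j != k)
                target = similarity[i][k] - rival
                resp[i][k] = damping * resp[i][k] + (1 - damping) * target

        # Equation 2: a(i,k) = min(0, r(k,k) + sum of positive r(i',k)) with
        # i' not in {i, k}; a(k,k) is the same sum without the min or r(k,k).
        for k in range(n):
            positive = [max(0.0, resp[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]  # leaves out i' == k
            for i in range(n):
                if i == k:
                    target = support
                else:
                    target = min(0.0, resp[k][k] + support - positive[i])
                avail[i][k] = damping * avail[i][k] + (1 - damping) * target

    # Each point picks the candidate with the largest combined evidence.
    return [max(range(n), key=lambda k: avail[i][k] + resp[i][k])
            for i in range(n)]


def negative_squared_distances(points, preference):
    """Similarity matrix for 1-D points, with `preference` on the diagonal."""
    return [
        [preference if i == k else -(a - b) ** 2 for k, b in enumerate(points)]
        for i, a in enumerate(points)
    ]


# Hypothetical toy data: two well-separated groups of three points each.
points = [1.0, 2.0, 3.5, 20.0, 21.0, 22.5]
exemplars = affinity_propagation(
    negative_squared_distances(points, preference=-20.0))
print(exemplars)
```

With groups this far apart, the three points in each group should end up sharing one exemplar. On real data the preference value matters: raising it tends to produce more clusters, lowering it fewer.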


4 BIOMEDICAL COMPUTATION REVIEW        Summer 2007                                                           www.biomedicalcomputationreview.org
  NewsBytes
    Tim Hughes, PhD, of the Center for Cellular and Biomolecular Research at the University of Toronto, is considering using affinity propagation in his research. “It seems like it would do best when things really do form independent groups, and when the data are fairly sparse, so most of the correlation matrix can be dropped in early cycles,” he says. “I think it will work well with exon-profiling data or genome-tiling data, where there is also a constraint that the groups have to correspond to regions near each other on the chromosome.”
By Katharine Miller

  Computer Vision that Mimics Human Vision

    Our brains can recognize most of the things we pass on an evening stroll: Cars, buildings, trees, and people all register even at a great distance or from an odd angle. Now, a new computer vision program can do the same thing. It successfully rivals the human ability to rapidly recognize objects in a complex picture because it mimics how information flows during the initial stages of visual perception.

    “We’ve built a model to be as close as possible to what is known about the human visual system,” explains Thomas Serre, PhD, a postdoctoral associate in the Center for Biological and Computational Learning at MIT and lead author of two papers recently published out of the lab run by Tomaso Poggio, PhD, at MIT’s McGovern Institute for Brain Research.

“We’ve built a model to be as close as possible to what is known about the human visual system,” says Thomas Serre.

    For decades, scientists have struggled to create computer programs that can recognize visual objects as well as humans can. Some computer systems excel at recognizing one particular object, but none are anywhere close to recognizing the wide range of objects observed by the human brain. Visual recognition is complicated by two conflicting goals: a program must be specific enough to discriminate between different objects, such as a person or a car, yet flexible enough to recognize the same type of object in different sizes, poses, and lighting.

    To achieve these goals, Serre and colleagues used data recorded from real neurons in the visual system to program two fundamentally different kinds of virtual neurons called S (simple) and C (complex) units. S units recognize specific features of an image; C units monitor a range of S units in one area and allow for variation in position and size.

    The researchers were surprised to find that a simple system, consisting of four alternating layers of S and C units, was able to classify pictures of a busy street scene as well as other leading mathematics-based computer vision systems, as described in the March 2007 issue of IEEE Transactions on Pattern Analysis and Machine Intelligence.

    Serre’s team then built a more complex system, consisting of many S and C layers designed to closely match the flow of information in a human brain during the first 100-200 milliseconds of perception. This enhanced system performed as well as humans on a rapid object recognition task: distinguishing animals from non-animals when images were flashed in front of humans and computers. The work appeared in the April 2007 issue of the Proceedings of the National Academy of Sciences. The computer system even made errors similar to the errors made by humans, suggesting that the model recapitulates the early processes of the human visual system.

    The model will be used as a tool by neuroscientists to better understand the human visual system, and also has practical applications for surveillance, driving assistance, and autonomous robotics. According to Poggio, the team’s next

When presented with a real-world
street scene (left), Serre’s computer
vision system successfully recog-
nized pedestrians, cars, buildings,
trees, sky, and the street (right).
Although not pictured, the model
also successfully identified bicycles.
Note the error in this example: the
model mistakenly classified a street
sign as a pedestrian. Graphic cour-
tesy of Stanley Bileschi, PhD,
McGovern Institute for Brain
Research at MIT.
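The division of labor between S and C units described in the article can be illustrated with a toy one-dimensional example. This is a loose sketch, not Serre's model: it assumes that an S unit scores how closely a small patch matches a stored template, and that a C unit takes the maximum over a window of neighboring S units (max pooling is an assumption here, chosen as one simple way to tolerate shifts in position). The template, signals and pool width are hypothetical.

```python
def s_layer(signal, template):
    """S units: one per position, scoring how well the local patch matches
    the template (0 is a perfect match, more negative is worse)."""
    width = len(template)
    return [
        -sum((signal[pos + j] - template[j]) ** 2 for j in range(width))
        for pos in range(len(signal) - width + 1)
    ]


def c_layer(s_responses, pool):
    """C units: each reports the best response among `pool` neighboring
    S units, so a feature can move within the window without changing it."""
    return [max(s_responses[start:start + pool])
            for start in range(0, len(s_responses), pool)]


template = [0, 1, 0]                      # the "feature" the S units look for
scene = [0, 1, 0, 0, 0, 0, 0, 0]          # feature at the far left
shifted = [0, 0, 1, 0, 0, 0, 0, 0]        # same feature, moved one step right

s_scene = s_layer(scene, template)        # [0, -2, -1, -1, -1, -1]
s_shifted = s_layer(shifted, template)    # [-2, 0, -2, -1, -1, -1]
print(c_layer(s_scene, pool=3))           # [0, -1]
print(c_layer(s_shifted, pool=3))         # [0, -1]
```

The S responses change when the feature moves, while the C responses stay the same. Stacking such pairs of layers is one way to read the article's "alternating layers of S and C units": specificity comes from the S stage, tolerance from the C stage.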




goal is to extend the model to include the “back projections” from other parts of the brain that allow feedback processing of visual information after 200 milliseconds.

    “This is the first demonstration that a purely bottom-up approach to visual object recognition, inspired by recordings from the neurons in the brain, is effective as a practical computer vision system,” says Terry Sejnowski, PhD, head of the Computational Neurobiology Lab at the Salk Institute. “There is much more work to do, both to improve its performance, and also to use it to better understand how our own visual system works.”
By Matthew Busse, PhD

  Nature Versus Nurture In Silico

    Every generation, a few nonconformists crop up in tissue cultures of genetically identical cells. The question is: are the wayward simply born that way, or did something in the environment affect them? “You have these two possibilities: intrinsic or extrinsic, nature or nurture,” says Andras Paldi, PhD, a biologist at Genethon in France.

Why, in the same warm spot, getting the same rich media, do some cells differentiate and others stay stem cells?

    Now, Paldi and his colleagues have modeled such cultured cells to determine whether extrinsic or intrinsic influences play a key role in the spontaneous emergence of phenotypic variation. It turns out that for spatial patterns beyond randomness to arise, there has to be some effect of sensing neighboring cells; that is, extrinsic factors must play a role. And the extrinsic model resembles results seen in real cells. The work appears in April in PLoS One.

    Paldi’s work was motivated in part by the open question among stem cell biologists of what triggers a stem cell to differentiate. Why, in the same warm spot, getting the same rich media, do some cells differentiate and others stay stem cells? It is commonly assumed that this is because the decision to differentiate is intrinsic, that is, purely random.

    To test that assumption, Paldi’s group started by designing two simple, multi-agent-based models of a tissue culture plate. In each model, all cells act independently and can switch between two cell types: A or B. In the “extrinsic” model, A cells turn into B cells when it gets crowded, and back to A cells when they have more space. In the “intrinsic” model, each cell has fixed probabilities of switching from A to B and back again.

    When the scientists ran the models, they found each produces a stable, heterogeneous population, yet they differ in the cell patterns. The intrinsic model predicts lone A cells distributed evenly throughout a largely B population. The extrinsic model predicts that the A cells will cluster. The result held even though the cells were allowed to migrate.

Agent-based computer models predict the pattern (left) produced when genetically identical cells have an inherent probability of changing (from green to red and vice versa), and the pattern (right) produced when cells are triggered to change by an extrinsic factor, such as cell density. Top images represent exponential growth; bottom are at equilibrium. Courtesy of Andras Paldi.

    This pattern difference allowed the researchers to compare their computational simulation with real cells. Using a muscle cell line that can switch between two distinct phenotypes, a stem-cell-like progenitor state and a differentiated state, they found that the cell pattern mostly resembles that of the extrinsic model. Many of the rare, stem-cell-like cells cluster; a few are solitary.

    What’s important here, Paldi says, is that they find environment playing a role, and a significant one. In the case of stem (progenitor) cells, it means neighbor cells can affect the differentiation process. “The stem cell nature is not an intrinsic property of the cell,” he says. “It is a property of the whole cell population.” Paldi further believes the work supports the effort to find a way of converting adult, differentiated cells into stem cells (and avoid the need for harvesting embryonic stem cells), a possibility that has not just scientific, but social and political implications as well.

    Christa Muller-Sieburg, PhD, however, disputes that scientific conclusion. “The idea that mature cells can turn into stem cells is very attractive to many modelers but has little support through experimental data,” says the professor at the Sidney Kimmel Cancer Center.

    Sui Huang, MD, PhD, at Children’s Hospital Boston, would have liked to see Paldi’s group perturb the cell line or the culture to confirm their model. But both he and Muller-Sieburg believe the study addressed an important question, that of heterogeneity of a genetically identical population of cells. And, says Huang, it certainly “contributes to the discussion in the community.”
By Louisa Dalton
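The two switching rules described above are simple enough to sketch in a few lines. The Python below is a simplified illustration, not the published model: it assumes cells sit on a fixed grid (no growth or migration), that "crowded" means a cell has at least `crowded_at` occupied neighbors, and that empty sites are marked `None`. All function names and parameters are hypothetical.

```python
import random

A, B = "A", "B"


def occupied_neighbours(grid, row, col):
    """Count occupied sites among the (up to) eight neighbors of a site."""
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if (dr or dc) and inside and grid[r][c] is not None:
                count += 1
    return count


def step_intrinsic(grid, p_a_to_b, p_b_to_a, rng):
    """Intrinsic rule: every cell switches type with a fixed probability,
    regardless of what surrounds it."""
    new_grid = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == A and rng.random() < p_a_to_b:
                new_grid[r][c] = B
            elif cell == B and rng.random() < p_b_to_a:
                new_grid[r][c] = A
    return new_grid


def step_extrinsic(grid, crowded_at):
    """Extrinsic rule: a cell is B when its neighborhood is crowded and
    A when it has more space."""
    new_grid = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell is not None:
                crowded = occupied_neighbours(grid, r, c) >= crowded_at
                new_grid[r][c] = B if crowded else A
    return new_grid


# Hypothetical plate: about 60% of sites occupied, all cells starting as A.
rng = random.Random(0)
plate = [[A if rng.random() < 0.6 else None for _ in range(12)]
         for _ in range(12)]
for row in step_extrinsic(plate, crowded_at=5):
    print("".join(cell or "." for cell in row))
```

Under the extrinsic rule, a cell's type depends only on local density, so cells of the same type sit together wherever density is similar. Under the intrinsic rule there is no such link to position, which is the pattern difference the researchers used to compare the models with real cultures.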

6 BIOMEDICAL COMPUTATION REVIEW        Summer 2007                                                          www.biomedicalcomputationreview.org
  Simulating Populations with Complex Diseases

    Diabetes, breast cancer, multiple sclerosis, Alzheimer’s disease. All are associated with several genes’ alleles interacting in complex ways with one another and the environment. Now, using a computationally intensive method known as forward-time simulation of human populations, researchers are hoping to gain a better understanding of how such complex diseases become established.

[Figure labels: Cancer, Multiple Sclerosis, Diabetes]

    “In a real population you just see people with the disease,” says Marek Kimmel, PhD, professor of statistics at Rice University and co-author of the work. “You don’t see who in the population has the disease genes because people carrying these genes do not necessarily become diseased.” But in the model population, he says, “you see both.” And the researchers’ approach allows them to simulate a very complicated scenario, including changes in types of selection pressure.

“In a real population, you just see people with the disease,” says Marek Kimmel. “You don't see who in the population has the disease genes...”

    “This lets us evaluate how well statistical genetics tests determine what genes are responsible for the symptoms of a disease and how frequently those genes appear in the population.” That’s a non-trivial exercise, he says, because it has been impossible, until now, to compare the many existing gene-mapping methods head-to-head. The work was published in PLoS Genetics in March 2007.

    Before now, the most commonly used approach to simulating diseases in human populations, called the “coalescent” method, worked by coalescing backward in time to a most-recent common ancestor. But it’s extremely difficult to take selection into account using the coalescent method, says co-author Bo Peng, PhD, a postdoctoral fellow at the University of Texas MD Anderson Cancer Center. Moreover, that approach gets too complicated if more than one disease gene is involved. So Peng and his colleagues turned to forward-time simulation, an approach that’s been around for about one hundred years.

    But that technique is not without its problems. When a population evolves forward in time, there are simply too many possible outcomes. Most notably, when you introduce a disease allele, it can rapidly be eliminated and replaced with new alleles. So Peng came up with a trick: He pre-sets desired disease allele frequencies in the current generation, extrapolates them backward, and starts the simulation from there. As Kimmel puts it, “We are restricting potential variability in one aspect of the present in order to produce a simulation that resembles something close to the actual variability that exists now.”

    The simulation uses a scripting language called simuPOP, a general-purpose forward-time simulation environment based on Python. The software is freely available at http://simupop.sourceforge.net, under a GPL license.

    When Peng and his colleagues used their method to compare several gene mapping techniques, they found that certain methods worked better for loci that were located distantly from one another, and other methods were more effective when loci were close together. Overall, though, says Kimmel, “We’re mildly pessimistic” about current gene mapping approaches. “When the number of loci involved in complex disease is greater than two, the methods rapidly lose their power.” Until recently, gene mapping for complex diseases has been disappointing, he says. Loci identified in such efforts have later turned out to be statistical artifacts. “Our modeling could figure out if this is inevitable,” he says, and help guide people toward more effective approaches.

    David Balding, PhD, a professor of statistical genetics at Imperial College in London, does similar work using forward-time simulations of large genomic regions. He has become pessimistic about the method’s usefulness for understanding complex diseases because no one really knows what kind of selection is going on. Nevertheless, he says, this work can be useful for studying selection itself. “People tend to look at selection one allele at a time,” he says, “but forward-time simulation lets us do it with complex interactions.”
By Katharine Miller
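The problem the article describes, a newly introduced allele quickly vanishing when a population is evolved forward, can be seen in a minimal forward-time sketch. The Python below does not use simuPOP and does not implement Peng's frequency-presetting trick. It assumes a textbook haploid Wright-Fisher model with one locus, two alleles and a constant population size, and every name and parameter value is a hypothetical choice for illustration.

```python
import random


def simulate_forward(pop_size, start_freq, generations, fitness, rng):
    """Track one allele's frequency forward in time.

    `fitness` is the relative fitness of carriers (1.0 is neutral and the
    value must be positive). Returns one frequency per generation, starting
    with `start_freq`.
    """
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Selection: carriers are over- or under-represented among parents.
        parent_prob = freq * fitness / (freq * fitness + (1.0 - freq))
        # Drift: each offspring draws its parent at random.
        carriers = sum(rng.random() < parent_prob for _ in range(pop_size))
        freq = carriers / pop_size
        history.append(freq)
    return history


def fraction_lost(replicates, **settings):
    """Share of independent runs in which the allele has disappeared."""
    rng = random.Random(2007)  # fixed seed so repeated runs agree
    lost = sum(simulate_forward(rng=rng, **settings)[-1] == 0.0
               for _ in range(replicates))
    return lost / replicates


# A single new copy of a neutral allele in a population of 200.
print(fraction_lost(100, pop_size=200, start_freq=1 / 200,
                    generations=50, fitness=1.0))
```

Running many replicates and counting how often the final frequency is zero gives a feel for why simply seeding a disease allele and letting the population run is wasteful: a large share of runs may end with nothing to study.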


IMAGE COLLECTIONS: How They’re Stacking Up

BY MEREDITH ALEXANDER KUNZ

    In the beginning there was the Visible Human. It broke new ground by gathering some 2,000 serial images from a death row inmate’s cadaver, and was the first time researchers had sectioned a single human being and gotten it right.

    But the project broke new ground in another way as well. As the first large, publicly available image collection, it proved that “If you build it, they will come,” according to project director Michael Ackerman, PhD, of the National Library of Medicine (NLM).

    The Visible Human was initially envisioned as a tool for teaching anatomy. But soon after the database launched in 1994, use agreements started pouring in from scientists who wanted to create 3-D images to test for radiation absorption or design artificial hips and knees, not to mention from artists illustrating anatomical injuries in court cases, to name just a few of the dozens of projects based on the Visible Human data.

    Despite the suggestion that such large image collections could inspire new types of research, the Visible Human Project remained the only public imaging database available for many years. During that time, large public databases in other fields (most notably genomics and proteomics) created whole new realms of research.

    Today, unlike genetic sequence data, which are centralized in GenBank, and protein structures, which reside in the Protein Data Bank (PDB), imaging





                         This section through the Visible Human Male’s thorax shows his heart (with muscular left ventricle), lungs, spinal
                         column, major vessels, and musculature. Image courtesy Michael Ackerman, Visible Human Project, National Library
                         of Medicine.

data still lacks a central repository. But an increasing number of people are hoping to create image collections from thousands of people, and not just one prisoner in Texas.

    The question is whether the shift from examining images one at a time to looking at them in large groups will not only lead to better research of the type already done today, but will create something fundamentally different. Just as the field of genetics transformed into genomics when biologists moved from looking at individual genes and diseases to examining the whole genome, so too imaging could see a shift. A field that has traditionally studied narrowly defined problems using small collections gleaned from physician-collaborators could find itself faced with huge collections and the potential to reveal new correlations between diseases, genes, and anatomy. As in genomics, it will be possible to look at variation both within and between diseases like never before.

Specialists carrying out imaging projects feel they should be the first to reap the benefits of the information the images contain, rather than having to share the data.

    Before this transformation can happen, though, a leap of faith is required: Researchers must share their images now in hopes of greater rewards later. That’s one of the current challenges researchers are tackling. There are others as well: Researchers must find ways to increase computer storage capacity; create a common language for describing images; develop standards for “metadata” that will explain where an image comes from and what it shows; find ways to map images from different individuals onto an agreed-upon “model;” and improve existing ways to analyze and interpret images consistently. They also must make images available remotely, so that physicians in rural areas will have access to large comparative collections.

    As these barriers fall and imaging

cheap (researchers might pay around $3,500 for a terabyte of storage) and the capacity of computer networks to transmit large images is ever improving. Fred Prior, PhD, of Washington University School of Medicine in St. Louis, recently purchased space to store new research images he expects will be generated during the next three years at the Electronic Radiology Laboratory which he directs. His team’s new Network Attached Storage system from BlueArc

“Neuroscientists who do complicated imaging studies are not that happy about having data out there before they can mine it,” says Maryann Martone.

    Indeed, in 2000, a spat erupted in the brain imaging world when Michael Gazzaniga, PhD, director of the National fMRI Data Center, wrote to fMRI specialists who had contributed to the Journal of Cognitive Neuroscience, telling them they would be required to share their experimental data with the center if they wished to publish in journals including Science and the Journal of Neuroscience. Researchers immediately raised objections, sending a letter to the center’s financial backers and 14 journals. Releasing their images, they argued, “impinges on the rights authors should have on the publication of findings stemming from their own work.” The center decided to establish a “data hold” for a period of time, to allow authors to profit from their images first.

    Maryann Martone, PhD, has run up against some of the same issues. As co-director of the National Center for
collections become more readily avail-       can hold 102 terabytes, with an option        Microscopy and Imaging Research
able, suddenly, imaging researchers will     to expand to 500 terabytes or, with an
be able to do what genomics research-        upgrade, to 4,000 terabytes (4
ers do all the time: look at human           petabytes)—a number once unthink-
systems in their entirety rather than in     able. And that does not even include
pieces.                                      clinical imaging, another huge figure.
    But before we get ahead of ourselves,        Even with such imaging, storage,
let’s review the challenges.                 and computing power in hand, a ques-
                                             tion remains: how to motivate other
                                             researchers to share their images?
   BUILDING AND SHARING                      Scientists feel a sense of proprietary
      THE COLLECTION                         ownership over the images they have
   Creating image data is easier than        collected. While patients can perhaps
ever. Imaging capacity has increased by      stake the greatest claim to the images,
leaps and bounds. X-ray technology,          most images are technically “owned” by
developed in the 1890s, was followed by      the institution where they were made,
incrementally stronger imaging meth-         and specialists carrying out imaging
ods, from ultrasound (widely available in    projects feel they should be the first to
1970s), to positron emission tomography      reap the benefits of the information the
or PET (1970s), to computerized axial        images contain, rather than having to
tomography or CT scans (1970s), to mag-      share the data.
netic resonance imaging or MRI (early            “Science is highly competitive.            Researchers have shared abundant
1980s) and functional MRI (early 1990s).     Scientists want to get the first publica-      images in the Cell Centered Database.
New techniques are still appearing.          tion, to gain funding, and get academic        Here, a screenshot shows the types of
   And with major improvements in            promotions,” says Arthur Toga, PhD,            images and movies available. Image
data storage and networking, scientists      head of the Laboratory of Neuro                courtesy Skip Cynar, National Center for
do not worry as much about amassing          Imaging (LONI), at the University of           Microscopy and Imaging Research,
bigger data sets. Big disks are relatively   California, Los Angeles.                       University of California, San Diego.


(NCMIR) at the University of California, San Diego, she has led the creation of the Cell Centered
Database (CCDB), one of the first Internet databases for cell-level structural data. She also
coordinates a project supported by the Biomedical Informatics Research Network (BIRN) that
investigates mouse models of human neurological disease.

“These resources were created with the idea that people were going to populate them from the
community, but neuroscientists who do complicated imaging studies are not that happy about having
data out there before they can mine it,” she says. Because NCMIR is a “technology development
center” funded by the NIH, she says, it has a mission “to serve a large collaborative community.”
So she decided to begin with her own center’s data and hope that others would follow: “We do
imaging that is unique. I figured, if we just took all the data around here and made it
available, that would be helpful.” It was: the project was one of the first web databases devoted
to electron tomography when it launched in 2002. Since then, it has continued to give access to
complex cellular and subcellular data from light and electron microscopy. Meanwhile, Martone and
colleagues are still thinking about the best ways to encourage other research groups to share
their data with the site.

As so often happens in the world of science, it is funders—in particular, big
government-sponsored efforts—who are beginning to change the rules of the game. One project
aiming to put its arms around as many images as possible is caBIG™. Launched in 2004 by the
National Cancer Institute (NCI), it embraces 50 cancer centers and 30 other organizations. caBIG™
is an attempt to bring together the huge amounts of data gathered and tools created in NCI-funded
cancer clinical trials. It aims to take an “open source” approach—creating an environment of
sharing information in the work it funds. According to some, this is the wave of the future.

“Increasingly, the NIH is requiring that people share data,” says Daniel Rubin, MD, MS, a
clinical assistant professor and research scientist at Stanford University Medical Center.
Clinical trial information, for instance, is becoming more readily available, Rubin says. He
points to the American College of Radiology Imaging Network (ACRIN) as an example of this trend.
This NCI-funded group hosts an imaging database that houses a large archive of clinical trial
imaging data in cancer fields.

Toga thinks that it is ultimately in a scientist’s self-interest to share. Lots of data is needed
if scientists want to identify subtle differences between images, he says. “You can’t possibly
collect it on your own.” What helps, he says, is when a couple of folks get together and say,
“I’ll share mine if you share yours,” which is becoming more common.

METADATA: CAPTURING THE CONTEXT

One cooperative project in which Toga has been involved is the NIH-sponsored Alzheimer’s Disease
Neuro-imaging Initiative (ADNI), which encompasses 60 different sites that are sharing image data
on the disease. But if a researcher looks at an ADNI image without knowing whether the patient
has a disease or not, or without access to the person’s age or gender, or the drugs he or she has
been taking, it becomes much less useful.

One of the most important parts of collecting large amounts of imaging data is also to capture
each image’s back story—the context in which it was made and the condition of the patient at the
time. For images, efforts to create a framework for recording such information—known as
metadata—currently lag behind efforts in other realms (e.g., the “MIAME” standards for microarray
data). But work is now underway to improve the situation.

Some metadata—such as a patient’s name, home address, and identifying features—must be removed
before images enter a large database. The process of “de-identification of protected health
information” follows federal privacy regulations.

But other useful information needs to be incorporated into image collections. Before image
metadata can make sense, though, more standardization needs to be introduced into the field, many
say. Radiologists have a long tradition of looking at images with expert eyes and dictating a
free-flowing analysis, which becomes a text report that often uses terms in unique ways. That
makes it difficult for other scientists or doctors to understand the image’s context and content
in a uniform way.

Attempts to collect and codify metadata are already well underway. One of caBIG™’s initiatives in
its In-vivo Imaging Workspace is called “vocabularies and common data elements,” an effort to
standardize terminology in cancer analysis. Rubin, one of the group’s co-leads, reports that they
are trying to structure radiology imaging findings, to establish controlled terminologies for
radiology, and to associate specific metadata about patients with each image gathered.

Indeed, such efforts do not end with cancer research, but could sweep across all aspects of
radiology. Rubin is also involved with a project called RadLex, which is being created to offer a
uniform lexicon for radiologists. RadLex plans to unify radiology term standards and to make the
new terminology freely available on the Internet. Rubin sees these attempts to create a common
vocabulary as the first steps in making metadata meaningful and useful for researchers and
clinicians alike.

COMPARING IMAGES: SNAPSHOTS AND SCALES

The race to create useful imaging collections faces another hurdle: how can multiple images be
compared in a way that makes sense? Each human’s body parts are shaped differently, with
variations that range from slight to immense. On top of that, describing shape is notoriously
difficult. Though shape has been explored by the scientific community since the time of the
Greeks, we still have no quantitative parameters for defining the shapes of “normal” human
organs, let alone those suffering from disease. In addition, images are affected by the exact
place and time they are taken, and the precise method used to take them. All this serves to
undermine any straightforward database of imaging data. “Image data is a snapshot of one instance
of a thing at one time under certain conditions. It’s not a ground truth like a gene sequence,”
says Martone.

If all images can be standardized in the way they are acquired—that is, the types of equipment
used, the kinds of patients included, and the disease(s) being examined—comparison becomes
easier. That is part of the success of ADNI, according to Toga: its research sites are required
to follow strict protocols for their equipment and image acquisition.

Brain imaging studies are expanding into ever-larger populations. This enables digital atlases to
be developed that synthesize brain data across vast numbers of subjects. Mathematical algorithms
can exploit the data in these population-based atlases to detect pathology in an individual or
patient group, to detect group features of anatomy not apparent in an individual, and to uncover
powerful linkages between structure and demographic or genetic parameters. In this image,
researchers from UCLA’s Laboratory of Neuro Imaging (LONI) have used composite tensor mapping to
show how Alzheimer’s patients’ brains exhibit loss of gray matter. Courtesy of Dr. Arthur W.
Toga, Laboratory of Neuro Imaging, UCLA.

Imaging specialists have also come to rely on the best available scientific
means of shape comparison, and they try to incorporate this material into their collections. One
example is in neuroimaging, where pictures of the brain are often linked to coordinate systems.
Like a road map, these identify what parts are found where with reference to a grid or common
starting point. For example, Talairach coordinates measure distances from a specific spot in the
brain, the anterior commissure.

However, researchers find fault with existing coordinate systems because they fail to accommodate
variation in large populations. While they may serve well for a single human or animal, they are
not as helpful when scientists aim to “warp” many individuals onto a common model to illustrate
the workings of a disease, for example. As a result, some recent brain atlases have developed
their own, mathematically-complex methods for mapping variability in big groups onto a single
framework.

In human brain mapping, researchers have found novel ways of dealing with natural variation
between human brains. Toga reports that the 15-year-old International Consortium for Brain
Mapping (ICBM) describes the brain in a probabilistic sense. For example, the atlas might tell
viewers that there is an 80 percent likelihood that the basal ganglia is in a particular location
that has been set out by coordinates.

Another means of handling variation is evident in the Allen Brain Atlas, an extensive mapping of
the mouse brain’s gene expression created by the Allen Institute for Brain Science in Seattle.
The team behind this atlas created its own coordinate system to ensure extra accuracy. The ABA is
a union of neuroscience, genetics, and informatics. To map gene expression onto the 3-D mouse
brain model, a team of neuroanatomists drew all the regions of the brain, and then “we lofted
those regions onto a 3-D model of the brain using informatics algorithms,” says Michael
Hawrylycz, PhD, director of informatics at the Allen Institute for Brain Science. Using
high-level computations, an image of gene expression was then mapped onto the reference atlas’s
coordinates, creating pictures that form the database. ABA scientists chose one mouse to be the
reference model, and the rest of the mouse data was warped to fit into the spatial framework of
that single animal’s brain. “We wanted a mouse that was held under exactly the same conditions
that we were going to run the genes under,” Hawrylycz says.




The Allen Brain Atlas produced this 3-D reconstruction showing normal expression of mannosidase
1a in the adult mouse brain viewed from the front left. The translucent forms represent the left
half of the brain and reflect the underlying standard anatomical reference framework to which the
gene expression data was registered. Each colored sphere reflects expression of the Man1a gene in
a 100 μm³ area. The size of each sphere corresponds to expression density, and the color reflects
expression level. The large red arc indicates that this gene is turned on strongly in the
hippocampus, a part of the brain known to be involved in learning and memory. The image was
generated from the Allen Brain Atlas (www.brain-map.org) using the 3D visualization tool, Brain
Explorer. Courtesy of the Allen Institute for Brain Science.


Another vexing challenge for image comparison is the issue of scale. Martone points to the
problems confronted by brain researchers when they try to see the workings of a disease on
multiple scales in a large set of images taken using different technologies. “We go from MRIs, to
optical microscopy, to electron microscopy, then to X-ray crystallography,” she says. “Every time
you traverse scales, there are gaps. Every time you switch techniques, you lose continuity.” Even
the contrast mechanisms are different, so one scale may contain fluorescence while another is
gray scale, disorienting researchers. It’s like being confronted with a GPS tracking image of a
moving vehicle one minute, and a Polaroid photo of the vehicle’s front wheel the next.

To combat confusion, Martone’s team is trying to create new coordinate and reference systems that
ease the transition among scales when studying neurons in the brain. She cites a new software
project that attempts to correlate microscopy with “feature-based matching systems” that describe
the attributes of such cells in a uniform way.

ANALYZING IMAGES IN THREE DIMENSIONS

Those who set out to compare images are also getting help from advances in image analysis
software, a field that has advanced rapidly in the past few years. Ron Kikinis, MD, professor of
radiology at Harvard Medical School, has helped lead the way. He and colleagues developed the “3D
Slicer” image analysis software, initially a joint, open-source effort between the Surgical
Planning Lab at Brigham and Women’s Hospital, where Kikinis is founding director, and the
Artificial Intelligence Lab at MIT. Created to help visualize medical image data in 3-D, it has
been used with success in fields as far flung as astronomy and geology.

Although Slicer was conceived as an interactive tool for processing single images, it is also
useful for researchers working with large sets of images, Kikinis says. “Now people are beginning
to build informatics frameworks to hold and manage images; and soon people will shift focus to
how to process those images,” Kikinis explains. “With all the progress in image acquisition, you
still need to turn data into medically-relevant information, and that requires image analysis,”
he says. The current version of Slicer is interoperable with BIRN’s informatics frameworks and is
also linked directly to the National Cancer Imaging Archive (NCIA)—a large repository of cancer
trial images—as a recommended viewer for its images. Slicer can be used to review image sets for
prototyping and results for quality assurance. For example, before processing hundreds of images,
it’s wise to test your algorithms and procedures with a handful first. That’s where Slicer’s
interoperability with large databases can be used as a tool that offers essential functionality.

Another fundamental tool available to image users is the Insight Tool Kit (ITK), which Ackerman
of NLM says took some three years to develop. Based on GE’s Visualization Tool Kit (VTK), ITK’s
algorithm allows a user to identify a body part—for instance, the heart—and then ask the tool to
draw a line around everything that looks like heart tissue. “Up until now, you’d have to do that
by hand,” says Ackerman. The tool saves users’ time and is constantly being updated, making it
ever more efficient.

Other complementary efforts are working to ensure that researchers in distant labs can create
their own image analysis applications on a lab workstation. Fred Prior has worked with other
researchers to oversee creation of the Extensible Imaging Platform, or XIP. “The idea is that
there are lots of commercial workstations that are optimized for clinical reading, lots of
research packages like Slicer, and great toolkits like ITK that give you functionality, but
what’s missing is a way to build custom applications for these tools,” says Prior. XIP will give
users a “rapid development environment,” he says, enabling researchers to do image processing
more easily. XIP’s initial targets are cancer researchers already working in the grid, but its
potential is much greater.

“We’re hoping we’ll see a cottage industry building new applications in this XIP framework to do
things like virtual colonoscopy and radiation therapy analysis,” Prior says. The “slick part,” in
Prior’s words, is that such applications could be run through the grid and offered to other
researchers remotely through the platform—creating a whole new level of sharing.

In quite a different application of image analysis, some researchers are homing in on new ways to
help scientists and doctors find the images they need using tools that analyze an image’s content
rather than its metadata. Known as content-based image retrieval, these programs also strive to
overcome errors caused when inaccurate text-based keywords lead to mismatches in retrieving
images, write Paul Miki Willy and Karl-Heinz Küfer, PhD, of the German Fraunhofer Institut
Techno- und Wirtschaftsmathematik in a 2004 paper. Content-based programs attempt to index images
according to visual features such as color, texture, and shape. Ultimately, some hope that these
systems might allow a physician to click on an image of a cancer in a particular patient and ask
a database to show similar images for comparison. So far, this technology has not yet reached a
wide audience; some believe more work is needed to ensure accuracy in such searches.

The Cell-Centered Database, a project of the National Center for Microscopy and Imaging Research,
brings together data from different experiments so that multi-scaled views can be created,
helping scientists to study how higher order structures, such as cellular networks, are assembled
out of finer building blocks, such as dendritic architectures. This montage shows seven orders of
magnitude of scale from centimeters to nanometers. A slice through a centimeter-sized mouse brain
was obtained by making a mosaic from thousands of multiphoton microscopic images. Then
fluorescence microscopy was used to isolate a spiny neuron (first sub-panel). Correlating cell
structures identified under the light microscope for subsequent examination under the electron
microscope permitted biologists to visually reconstruct the three-dimensional structure of
dendritic structures with nanometer resolution. The second and third sub-panels portray electron
tomographic reconstructions of an unbranched spiny dendrite from cerebellum and its
nanometer-sized synaptic complex (from hippocampus). Image courtesy Skip Cynar, National Center
for Microscopy and Imaging Research, University of California, San Diego.
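
Indexing images by a visual feature, as content-based image retrieval is described above, can be sketched in a few lines. The example below is a toy under stated assumptions rather than any real retrieval system: it uses color only (texture and shape are left out), it treats an "image" as a plain list of RGB pixels, and every function and image name in it is hypothetical. Each image is summarized as a normalized color histogram, and a query is ranked against a small collection by histogram intersection.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def color_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Normalized color histogram with `bins` levels per channel."""
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..bins-1.
        rb, gb, bb = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(rb * bins + gb) * bins + bb] += 1.0
    return [count / len(pixels) for count in hist]


def similarity(a: List[float], b: List[float]) -> float:
    """Histogram intersection: higher means more similar color content."""
    return sum(min(x, y) for x, y in zip(a, b))


def most_similar(
    query: Sequence[Pixel], database: Dict[str, Sequence[Pixel]]
) -> List[Tuple[str, float]]:
    """Rank database images by color similarity to the query, best match first."""
    q = color_histogram(query)
    scores = [(name, similarity(q, color_histogram(px))) for name, px in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Tiny made-up "images": lists of pixels rather than real scans.
database = {
    "mostly_red": [(250, 10, 10)] * 3 + [(10, 10, 250)],
    "mostly_blue": [(10, 10, 250)] * 3 + [(250, 10, 10)],
}
query = [(240, 20, 20)] * 4
ranking = most_similar(query, database)
print(ranking[0][0])  # mostly_red
```

A coarse histogram like this ignores where colors appear in the image, which is one reason such searches can return poor matches and why the accuracy concerns noted above arise.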


ACCESSING IMAGE DATABASES: CONNECTING TO THE GRID

All these image collections will do little good if no one can access them remotely. Researchers
at the crossroads of biomedicine and computational science are tackling that problem now.

One promising answer is to create “federated databases”—groups of unique imaging collections that
are linked together by a sort of “grid,” and that are accessible remotely via a seamless user
interface that makes the data sets resemble one single virtual database. Joel Saltz, MD, PhD,
professor and chair of the department of biomedical informatics at Ohio State University, leads a
group that develops technologies that can enable “grid” access for large image collections to
create such federated systems. His group has developed middleware to support complex distributed
applications. It attempts to stitch together different bodies of images, making them available
and searchable.

Slicer3 image analysis software is an integral part of the brain atlas created by the Surgical
Planning Laboratory and the Psychiatry Neuroimaging Laboratory (PNL) at Brigham & Women’s
Hospital in Boston. This three-dimensional digitized atlas of the human brain is used for
surgical planning, model-driven segmentation, research, and teaching. As this screenshot
illustrates, Slicer3 enables users to outline and manipulate specific regions of the brain in
three dimensions based on multi-modal volumetric input data including specialized MRI methods. An
additional goal of this brain atlas is that it can be used as a template for automatically
segmenting regions of interest in large new MR data sets. Image courtesy of Ron Kikinis, Surgical
Planning Laboratory, Brigham & Women’s Hospital.

“The overall goal of the effort is to develop an infrastructure to connect multiple databases, to
allow people to discover what images are out there, and to analyze both remote and local imagery
and to integrate image data with information from molecular studies, clinical studies, and
pathology specimens,” Saltz says. The National Cancer Institute caBIG™ project has incorporated
the Ohio State group’s software in the caGrid software package. This was first distributed in
December and, Saltz says, quite a number of funded efforts have begun to incorporate it. Furthest
along in the process of opening up an image database to many users with




Saltz’s help is the National Cancer Imaging Archive.

These new systems may not be open to just any member of the public—at least some will require registration and credentials. But the incentive to participate is high. Researchers and physicians who gain access will be able to communicate with each other in new ways that could make a big difference to patients. A major benefit for those linking their images to a grid is the possibility of “central review,” says Saltz. In central review, radiologists remotely read an image and provide their feedback via software that allows a user to capture mark-ups, pointers, and comments. For instance, a radiologist in Omaha might send out a CT scan of a patient’s lung via Saltz’s software to radiologists around the world as well as to computer-aided diagnosis algorithms available at supercomputers in research centers. She might hear back from radiologists in Mumbai, Tokyo, and Chicago, and from computers at a handful of universities, possibly discovering lung nodules she had missed.




This screenshot from the Saltz lab’s gridIMAGE application shows how radiologists in remote locations can review and mark up images from multiple collections. A radiologist accesses the interconnected or “federated” imaging databases through a single interface and can submit a review request to other participating physicians who use the same database. The reviewers can add marks and comments and then submit their marked-up results to a central result server, which transmits them to the radiologist who made the request. This application is based on the Saltz lab’s In Vivo Imaging Middleware. Image courtesy of Joel Saltz, Ohio State University.



APPLICATIONS: WILL THEY COME?

If researchers overcome the barriers described above, the question then will be whether it will prove worthwhile. Will innovative applications follow? In other words, if you build it, will they come?

Early indications are that they will. For some physicians, the near-term possibility of central review alone will make federated imaging databases worth the effort.

For neuroscientists, gaining insights into the brain’s workings and connections requires large numbers of fine-grained images. In the past, scientists had done studies of specific parts of the brain, but few had tried to discover the overall structure of the brain. Large neuroimaging projects such as the ABA are attempting to change that. Indeed, some hope to one day map every single neuron in the human brain, creating a data set of upwards of 1 million petabytes. This “connectome” promises to be the image-based Human Genome Project of brain researchers. Its success will rely on computer-assisted image acquisition and analysis to map the structure of the nervous system, says Jeff Lichtman, MD, PhD, professor of molecular and cellular biology at Harvard.

In clinical trials for cancer treatments, image collections help in evaluating a drug’s effectiveness, says Carl Jaffe, MD, diagnostic imaging branch chief for the cancer imaging program in the division of cancer treatment and diagnosis at NCI. The promise of using image collections to speed drug development is already beckoning. “The regulatory authorities are more willing to accept regression of a tumor as a sign of a drug’s effectiveness…and imaging is the pivotal marker for this,” he says. A large database of reference images helps to balance “reader artifacts”—that is, errors in radiologists’ assessments—and to substantiate that a tumor has indeed changed size in an important way, he explains. Researchers could use a central review-style process to verify their reading of an image. “An image database allows you to go back to a larger community of observers and confirm whether or not something seems to be supportable.”

For researchers studying rare diseases, the goal is to find others to compare against and to increase understanding remotely. For example, says Jaffe, in the old days, a researcher hoping to test a drug for a rare disease such as retinoblastoma—a cancer of the retina with an incidence of only 430 cases per year—would have to request MRI films from around the country to try to prove that his trial worked on a range of patients. But some films would come back too dark, some too light, and some without the right metadata. If all the data and images could be collected digitally in an online database, the researcher would more quickly understand the drug’s impact. “What you want is an electronic, common pool of data and metadata,” Jaffe says.

Surgeons and other physicians could also benefit from such systems as Rubin’s efforts to use large groups of images to inform a doctor of how to diagnose and treat a patient. Using Rubin’s decision support software, physicians can select from a series of structured annotations of an image and upload the image data. Then a computer program tells them the likelihood of disease. “We want to give radiologists a tool to help them decide when to biopsy based on what they see,” he says. While it is partly based on the knowledge of expert radiologists, this type of technology will work even better when a large number of images are available to inform the program—hence the need for large databases filled with rich stores of metadata.

Increasing numbers of researchers on the biomolecular scale are also using imaging in their research, including scientists like Martone and the people who utilize the ABA and other such atlases. For example, labs are using the ABA to investigate risk factors for multiple sclerosis and to identify genetic hotspots associated with memory performance. And new databases at the cellular level are popping up, including the Open Microscopy Environment, a large public database focused on microscopy imaging data.

THE NEW NEW THING

Imaging is just one of many bioscience fields moving towards more and better information sharing and collecting. While the field faces its own hurdles—the difficulties of comparing images, for example—it falls within a larger trend of making data available and breaking down the silos of single-organ or disease-focused work that for so long dominated the sciences. It’s the same impulse that inspired the release of the genome and the dawn of genomics, and could cause a similarly radical shift in how people use image data.

The next generation of applications will reveal whether the rise of large imaging collections will create a new science, just as genetics spawned genomics. Ultimately, it might be possible to cross-compare between imaging and genomics. That’s already happening in brain research projects such as the Allen Brain Atlas, but the trend could spread throughout the body. And as in genomics, the shift could generate an entire new field of research in which scientists could build an entire career.

If the Visible Human is any proof, simply building large, accessible collections of images will attract scientific curiosity and will launch a wealth of useful applications we cannot even imagine today. ■



                                      DOCK THIS:
                                      In Silico Drug Design
                                      Feeds Drug Development
                                                                BY KRISTIN COBB, PHD



                                      Once upon a time, not long ago, HIV/AIDS was a scourge, killing any-
                                      one who contracted the deadly virus. Now, many people are living with
                                      the disease, which they control with drugs initially developed in the 1980s
                                      and early 1990s using an approach called computer-aided drug design—
                                      the use of computer models to find, build, or optimize drug leads.
                                         Armed with information about the 3-D structure of HIV protease, an
                                      enzyme essential to the HIV reproductive cycle, computational
                                      researchers designed molecules in silico to precisely fit the shape of
                                      the enzyme’s active site—as though fitting a key to a lock. The
                                      resulting drugs, potent inhibitors of HIV protease and the HIV life
                                      cycle, were brought to market in record time and revolutionized the
                                      treatment of HIV/AIDS.
                                         Around the same time, another anti-viral—Relenza, which treats
                                      influenza and was a forerunner to Tamiflu—was also designed using these
                                      methods. These HIV and flu drugs are among the best known success sto-
                                      ries of computer-aided drug design (see page 23 for both stories).
                                         Since those early successes, computer modeling has become an integral
                                      part of drug discovery. “Almost everything that has recently moved for-
                                      ward from big pharmaceutical companies to market has involved some
                                      sort of collaboration with computational chemistry. It’s like asking, were
                                      there chemists involved? Of course there were. It is part of the process,”
                                      says Tara Mirzadegan, PhD, head of the computer-aided drug design
                                      group at Johnson & Johnson.






Quite often, computers play a role without making the big splash they did with Relenza and the protease inhibitors. That’s probably because no drug is created solely in silico; the computer is just one of many tools in this process. But as algorithms evolve, computing power explodes, and scientists solve a greater number of 3-D protein structures, computer-aided design has the potential to dramatically cut the cost and time of drug discovery. How? By narrowing down the field of compounds that might help treat a particular disease; by assembling novel drug molecules to disrupt specific disease pathways; and by providing new attack routes against traditionally difficult drug targets. Computers are also increasingly playing a role in optimizing drug leads for bioavailability and safety.

Despite the over-hype of computers as the saviors of drug development companies, many still expect this process to bear important fruit. Computer-aided drug design played a critical role in the design of several drugs that are now in late preclinical or early clinical development. Only time will tell which of these, if any, will emerge as drug success stories.

Docked Drug. This 3-dimensional computer graphic shows a candidate drug (a JAK2 inhibitor) docked in the active site of its target protein (JAK2). JAK2 protein is implicated in various myeloproliferative disorders (diseases that produce excess bone marrow cells, such as chronic myelogenous leukemia, or CML) estimated to affect 80,000-100,000 people in the U.S. Courtesy of SGX Pharmaceuticals, Inc.

VIRTUAL SCREENING

How it works: In the ideal situation, the 3-D structure of the target molecule (usually an enzyme or receptor) is known, allowing scientists to directly visualize drug-target interactions in silico. Structure-based methods have evolved in two directions since Relenza and the HIV protease inhibitors—virtual screening and fragment-based design.

In virtual screening, the 3-D structure of a target is screened against libraries of potentially active small molecules. The computer “docks” each compound, or ligand, into the target’s active site and scores its geometric and electrostatic fit.

Considerable progress has been made in docking programs in the last two decades, but scientists agree that the problem is complex and that they have yet to find a perfect solution. To start with, the ligand and protein target are often pictured as a rigid lock and key—but in fact they are dynamic, moving objects that continually change shape in response to each other.

“Imagine taking a fluffy ball and trying to mold it to optimally fit some kind of a binding site. There are just way too many configurations,” says Dimitris K. Agrafiotis, PhD, vice president of

                                                                                                                               Continues on page 24
EARLY EXAMPLES: ANTI-VIRAL DRUGS

Relenza and the HIV protease inhibitors stand out as the two classic examples of computer-aided drug design.

Relenza was developed through a collaboration of Australian scientists, including Jose N. Varghese, PhD, head of structural biology at CSIRO Molecular and Health Technologies. In 1983, Varghese and his colleagues used X-ray crystallography to solve the 3-D structure of the enzyme neuraminidase, one of two potential protein targets on the surface of flu. Neuraminidase plays a critical role in the flu life cycle: after the virus replicates within a host cell, neuraminidase releases the newly formed viral progeny by cleaving a bond between the viral surface protein hemagglutinin and a sugar on the host cell surface, sialic acid.

A series of structural experiments revealed important insights. The active site of the enzyme was highly conserved in all strains of flu—both human and animal; the virus routinely escaped antibody recognition by mutating around the periphery of the active site but never changing the active site itself.

“Because it was so highly conserved, it seemed clear to us that it must have a very important function,” Varghese says. “So, clearly if one made a molecule that went in there and blocked that site, it would be pretty effective.”

A synthetic analog of sialic acid was known to inhibit neuraminidase, but without sufficient potency. Using the crystal structure of neuraminidase bound with this analog, the researchers set out to design a better inhibitor in silico. Computer predictions revealed that a particular guanidinium-for-oxygen substitution would give tight binding. Synthesis of this compound—Relenza—turned out to be tricky, but eventually succeeded.

“It bound in nanomolar binding, so it was very tight, and it certainly blocked the virus replication right down to its tracks,” Varghese says.

Relenza was licensed to GlaxoSmithKline Inc. in 1990 and approved by the FDA in 1999. Following their lead—and capitalizing on a patent oversight, according to Varghese—Gilead Sciences developed the better-known neuraminidase inhibitor, Tamiflu (marketed by Roche). Both drugs may be important in the fight against bird flu, Varghese says.

Development of the HIV protease inhibitors lagged behind that of the neuraminidase inhibitors by several years, but the former won FDA approval sooner (in the mid-1990s) because of the pressing medical need.

Dale Kempf, PhD, who is now a distinguished research fellow in Global Pharmaceutical Research and Development at Abbott, was involved in Abbott’s development of ritonavir (brand name Norvir), which started in late 1987.

“It’s one of the first examples of the application of genomics for drug design,” he says. When the HIV genome was sequenced and published in the mid-1980s, several groups recognized characteristic sequences suggestive of a protease enzyme.

Interestingly, the gene encoded only half a protein, which led Kempf and others to realize that the protease must be a dimer—two identical halves that come together to form one active site. This provided a key structural insight even before X-ray crystal structures of the protease were available: the active site had to have a particular type of symmetry, known as C2 or two-fold symmetry (rotation 180 degrees around a central axis yields the identical structure).

Kempf’s group used that insight to create a computer model of the protease active site and to design possible inhibitors in silico by starting with a known substrate, chopping off half of the substrate, and rotating the remaining half by 180 degrees.

“And when we went into the lab and made those compounds, they turned out to be very potent inhibitors,” Kempf says.

Using a combination of the X-ray crystal structures of HIV protease (which had since become available) and computer graphics, they modified these compounds in silico to visualize how certain substitutions would improve characteristics like bioavailability. The first compound with sufficient oral bioavailability, ritonavir, was synthesized in 1991.

In 1996, the FDA approved ritonavir in record time (72 days). The total development time—about eight years—was roughly half that of a typical drug, due both to the structure-based approach and to the FDA’s accelerated review. Several other HIV protease inhibitors emerged around the same time, including saquinavir (Roche) and nelfinavir (developed by Agouron, now a subsidiary of Pfizer). These drugs helped to revolutionize the treatment of HIV.
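
The two-fold (C2) symmetry argument in this sidebar can be pictured with a small geometric sketch. The Python below is illustrative only and rests on several assumptions: a molecule is reduced to a bare list of 3-D points (no element types or bonds), the symmetry axis is taken to be the z-axis, and the helper names (rotate_about_z, make_c2_partner, is_c2_symmetric) are hypothetical. It is not the modeling procedure Kempf's group used; it only shows that taking one half, rotating it 180 degrees, and combining the two halves gives a shape that maps onto itself under the same rotation.

```python
import math


def rotate_about_z(point, degrees):
    """Rotate an (x, y, z) point about the z-axis by the given angle."""
    x, y, z = point
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c, z)


def make_c2_partner(half):
    """Return the copy of `half` rotated 180 degrees about the z-axis."""
    return [rotate_about_z(p, 180.0) for p in half]


def is_c2_symmetric(points, tol=1e-6):
    """True if rotating every point by 180 degrees lands on an existing point."""
    def close(a, b):
        return all(abs(u - v) <= tol for u, v in zip(a, b))

    rotated = [rotate_about_z(p, 180.0) for p in points]
    return all(any(close(r, p) for p in points) for r in rotated)


# Hypothetical coordinates standing in for "half of a substrate".
half = [(1.0, 0.5, 0.0), (2.0, -0.3, 1.2)]
whole = half + make_c2_partner(half)

print(is_c2_symmetric(half))   # False: one half alone is not symmetric
print(is_c2_symmetric(whole))  # True: half plus its rotated copy is
```

A real design workflow would also have to join the two halves chemically and check the fit against the active site; the sketch covers only the symmetry operation itself.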








       Cancer Interrupted. This three-dimensional computer graphic shows a drug candidate (MET tyrosine kinase inhibitor) bound to its target protein. MET receptor tyrosine kinase controls cell growth, division, and motility and is implicated in a range of cancers, including renal cell carcinoma, gastric cancer, lung cancer, glioblastoma and multiple myeloma. Courtesy of SGX Pharmaceuticals, Inc.


       Continued from page 22
informatics at Johnson & Johnson Pharmaceutical Research & Development. “Small molecules—unless they’re very small—tend to be very flexible. They flop around a lot. They can assume a multitude of conformations in 3-D.” If a molecule has five rotatable bonds, then each bond can rotate at many different angles, creating a lot of freedom to take on unique conformations.

Most docking programs now account for the flexibility of the ligand by sampling its many conformations and docking each one, but adequately accounting for the flexibility of the target protein is a much more challenging problem. Adding protein flexibility exponentially increases computing demands.

“The state of the art today is coming up with sensible simplifications that make the problem computationally tractable but still meaningful,” Agrafiotis says.

Beyond protein flexibility, many docking programs also fail to account adequately for the influence of water—which surrounds all molecules in living systems. “The mathematical models for defining water and how it shapes itself around the receptor and the drug molecule are still pretty unclear,” says Kent Stewart, PhD, a research fellow in structural biology at Abbott.

In addition, the algorithms estimate binding energies using classical Newtonian physics, rather than quantum physics—which also reduces accuracy. “You can calculate the binding energies from some sort of Newtonian point of view, treating atoms as sort of balls attached to springs. Or you can treat it from a quantum mechanical point of view. Now the quantum mechanical calculations, as you can imagine, are horrendous,” says Jose N. Varghese, PhD, head of structural biology at CSIRO Molecular and Health Technologies. “At this stage, it is a computational challenge.”

Methods of scoring how well a small molecule fits a protein’s active site also must trade off between speed and accuracy. “The scoring function that we use has many shortcuts and approximations,” says Mirzadegan. Her group will virtually dock the company’s one million proprietary compounds (which it has purchased or developed over the years) against a given target, and pick the highest-ranked 10,000 for biological testing. “We cannot afford docking one compound per day. That would be one





million days. So we have to do it in a
matter of seconds or sub-seconds.”
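
To make the speed-versus-accuracy trade-off concrete, here is a minimal sketch in Python. It is not the scoring function used by Mirzadegan's group or by any docking program named in this article. It assumes a toy model in which each atom is just a position plus a partial charge, and it scores a pose with a simple 12-6 shape term plus a Coulomb term, the "classical" picture described above. The function names, the well depth, the preferred distance, and the distance cutoff are all hypothetical, chosen only for illustration. The second function reproduces the arithmetic in the quote: a million compounds at one per day is a million days, while one second each is under two weeks on a single processor.

```python
import math


def toy_score(ligand, receptor, cutoff=8.0):
    """Toy pose score: lower is better.

    Each atom is a tuple (x, y, z, charge). Constants are illustrative
    and are not taken from any real force field.
    """
    epsilon, r_min = 0.2, 3.5   # hypothetical well depth and preferred distance
    coulomb_k = 332.0           # approx. kcal*Angstrom/(mol*e^2)
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for rx, ry, rz, rq in receptor:
            r = math.dist((lx, ly, lz), (rx, ry, rz))
            if r > cutoff:      # shortcut: skip distant atom pairs
                continue
            r = max(r, 0.5)     # guard against near-zero distances
            ratio = r_min / r
            total += epsilon * (ratio ** 12 - 2 * ratio ** 6)  # geometric fit
            total += coulomb_k * lq * rq / r                   # electrostatic fit
    return total


def screening_days(n_compounds, seconds_per_compound, n_processors=1):
    """Wall-clock days to score a library, assuming perfect parallel scaling."""
    return n_compounds * seconds_per_compound / n_processors / 86_400


# One compound per day versus one compound per second, for a million compounds.
print(screening_days(1_000_000, 86_400))  # one million days
print(round(screening_days(1_000_000, 1.0), 1))  # about 11.6 days
```

The distance cutoff is one example of the kind of shortcut the quote refers to: it saves time by ignoring atom pairs that contribute little, at some cost in accuracy.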
    But increased computing power can
help boost the speed of virtual screen-
ing without compromising accuracy. In
2000, for instance, Arthur J. Olson,
PhD, professor of molecular biology
and director of the Molecular Graphics
Laboratory at The Scripps Research
Institute, started the FightAIDS@Home
project, which uses internet-based grid
computing—as was popularized by the
SETI@Home project—to do virtual
screening for new anti-HIV drugs.
     “If most people who have comput-
ers use only about five percent of the
CPU cycles—and the rest of the cycles
are just idle—how much wasted or avail-
able computing is there?” Olson asks.
“It turns out to be an amazing number.”
His grid computing project makes use
of that idle computer time and helps
evaluate drugs for dealing with HIV
proteins’ habit of rapidly mutating to
escape drug pressures. Fortunately, the
3-D structures have been solved for
many of the mutant HIV proteins.
With the help of about 500,000 volunteer computers, Olson used AutoDock (a popular docking program that was developed in his lab) to screen 2,000 small molecules against several hundred different HIV protease mutants. The program took six months to run; he estimates that on the Scripps supercomputer, with 300 processors running, it would have taken 50 years.

Anti-Cancer Key. An anti-cancer drug compound—nutlin—bound to the cancer-causing protein MDM2. Courtesy of RMC Biosciences, Inc.

Besides identifying several drug leads, which are now in testing, Olson recognizes an even more important payoff: “When you do such massive dockings, you actually are collecting more than just an answer; you’re collecting a lot of statistics.” Such data could, for example, be used to identify a subset of mutants that represents a spanning set—one that captures all unique interactions with the ligands screened. “Doing docking on only this subset of mutants would free up computer time for screening larger libraries, using more dynamic representations of the protein targets, or using more accurate scoring functions,” he says.

The Folding@Home project at Stanford also uses grid computing for drug design. Led by Vijay S. Pande, PhD, associate professor of chemistry and of structural biology, Folding@Home focuses on simulating protein folding and misfolding, but “as our work matures, we have been looking into the next steps involved in computational drug design,” Pande says. Using distributed computing, his group has devised new, more accurate algorithms for docking and for calculating ligand-protein binding energies. These algorithms are being used in the design of several new drugs, including new inhibitors of the cytokine-cytokine receptor interaction (involved in cancer); novel chaperone inhibitors (also involved in cancer); and novel antibiotics that target the bacterial ribosome.

“Distributed computing is a key






       Fragment-based design. Drug companies, such as SGX Pharmaceuticals, screen hundreds of fragments in their fragment libraries and
       identify hits that serve as the building blocks for novel drug candidates. Knowledge of the binding mode of each fragment to its target is
       combined with advanced computational tools to produce “engineered” drug leads. For example, in this series, a hit is first identified through
       crystallographic screening (yellow); then chemical groups (red and pink) are added to the bound fragment to increase its binding affinity.
       Courtesy of SGX Pharmaceuticals, Inc.

aspect to this, as it allows us to do calculations otherwise impossible,” Pande says.

        Distributed computing is key to developing better, more accurate algorithms for
        computer-aided drug design, says Vijay Pande. “It allows us to do calculations
        otherwise impossible.”

        FRAGMENT-BASED DESIGN
    Fragment-based methods take a “Lego” approach to drug design. In a lab, scientists create
chemical libraries of small compounds, or fragments (perhaps one-third the size of a typical drug),
that are easily linked together. They then screen the libraries for binding activity
experimentally, using high-throughput X-ray crystallography (or NMR or mass spectrometry); when a
fragment binds to the target, the crystallography provides an exact 3-D picture of the bound
fragment in the active site. Next, with the help of computer modeling, fragments are turned into
potent drug leads by adding new chemical groups to the initial core fragment or by stitching
together several fragments that bind to different points in the active site.
    “I think this approach is showing quite good promise,” Varghese says. “In fact, with the advent
of these modern synchrotrons, scientists can do this fairly quickly, and a lot of pharmaceutical
companies are moving in this direction.”
    The approach offers a combinatorial advantage: “Instead of having a database of say four
million compounds that a really large company would have, you take compounds that are say one-third
of the size, and explore them combinatorically. If you explored ten fragments in three different
positions, you’d actually explore 1000 combinations. So with a database of something like 400
compounds, you can explore a chemical space that is in the several millions,” says Sir Tom
Blundell, FRS, FMedSci, professor and chair of biochemistry at the University of Cambridge. In
1999, Blundell co-founded Astex Therapeutics to do fragment-based methods; the company is now
testing a kinase inhibitor (a type of cancer drug) in clinical trials.
    “The experiment is really one of using crystallography to do your screening. So you’ve pushed
the crystallography technology to the point where you can do it so rapidly that it becomes
effective to use as a screening tool,” says Siegfried Reich, PhD, vice president of drug discovery
at SGX Pharmaceuticals, another company that uses fragment-based methods. (Reich previously helped
develop the HIV protease inhibitor nelfinavir at Agouron.) When it was founded in 1999, SGX was
named Structural Genomix and its aim was to use high-throughput X-ray crystallography to solve a
record number of protein structures. But this was not sustainable as a business model. So, in 2000,
the company changed its name to SGX Pharmaceuticals and put its crystallography power to use in
drug discovery.
    One of their lead candidates is a new inhibitor of BCR-ABL, a perpetually active kinase enzyme
involved in chronic myelogenous leukemia, or CML. The BCR-ABL inhibitor Gleevec has had enormous
success in treating CML patients, but 20 percent are resistant to Gleevec. So scientists at SGX
cloned, expressed, purified, and crystallized the Gleevec-resistant protein. Then they screened
their fragment library against the wild-type and mutant versions of BCR-ABL to find compounds
active against both. The fragment hit that eventually led to their lead candidate started with a
low binding affinity of just 10 micromolar (i.e., a fairly high concentration of compound was
required to bind at least half the protein).
    This is where the medicinal chemists and structural biologists sit down with the computational
chemists, Reich says. Computational chemists virtually build new compounds by adding chemical
groups to the starting fragment. For example, they might try linking all the different simple alkyl
amines to one of the fragment’s “chemical handles” (sites on the fragment that easily bind to other
chemical groups), Reich explains. The computer calculates the binding affinity for each iteration,
until it finds one with tight binding. Specialized versions of docking programs are used to
calculate the binding affinities. But because you already know exactly how the fragment binds, you
start with more information than in virtual screening.
    By elaborating their initial lead in this way, SGX got their first hit down to nanomolar
potency (i.e., very little of the compound was required in order to bind the protein) in about
three months. “That gives you a flavor for how fast this can go,” Reich says.

        TRICKY TARGETS
    Docking algorithms and fragment-based methods work well on soluble enzymes that are easily
crystallized and contain well-defined pockets where ligands can bind, but many diseases instead
involve membrane-bound receptors or protein-protein interactions.

        Tricky Target. This computer model of a bacterial cell membrane helped scientists at
        Polymedix design new antibiotics that mimic the action of the defensin proteins (natural
        proteins in the body that kill bacteria by puncturing their membranes). Courtesy of
        Polymedix.

    Membrane-bound receptors transmit signals from outside to inside the cell. Because the proteins
are embedded in the membrane, they cannot easily be crystallized and it is difficult to solve their
structures. For example, 25 percent of the top 100 drugs on the market today target G-protein
coupled receptors (including the dopamine and serotonin receptors in the brain), but the structure
of only one mammalian G-protein coupled receptor is known.
    When structural information is unavailable, computational chemists use ligand-based methods to
hunt for new drug leads. They superimpose a set of ligands with known activity against the target
and compare their structural and chemical features. A common pattern, called a pharmacophore,
emerges: key functional groups (such as hydrogen bond donors, electrostatic charges, and
hydrophobic patches) must be in certain positions. This fingerprint is then used to virtually
screen libraries for novel compounds with similar patterns. Ligand-based methods predate the
structure-based methods and have helped develop many drugs, including drugs to treat high blood
pressure, pain, and depression.
    Protein-protein interactions occur via surfaces that are often featureless and shallow, and
binding affinities can be quite large, so it’s hard for small molecules to disrupt these
interactions, says Arthur Olson of Scripps Research Institute. You have to find or design drugs
that can bind to multiple footholds, or hot spots, on the protein surface, which is challenging, he
says. “I think that this is an area that is really still in its infancy.”
    But some progress is being made. Kent Stewart of Abbott Labs hopes to control BCL-2, a protein
that is overexpressed in certain cancers. It blocks apoptosis (programmed cell death) and thus
keeps cancer cells alive. Compared to HIV, Stewart says, which has an actual cave you can dock a
molecule into, on






       Cancer Interference. The oncogenic protein BCL-2 helps keep cancer cells alive via a protein-protein interaction. This BCL-2 inhibitor,
       developed at Abbott using a fragment-based approach, binds to the BCL-2 protein surface and disrupts the protein-protein interaction.
       The compound is in late preclinical development. Courtesy of Abbott.

BCL-2, “there’s no such thing as a cave; it’s a very flat and open surface, so it’s hard to get
molecules that actually stick.” So, using a fragment-based approach, scientists at Abbott linked
together two fragments that bind to the BCL-2 protein surface, resulting in a potent compound that
can disrupt the protein-protein interaction. The compound is now in late preclinical development.
    Some companies have made these difficult targets their niche area. For example, Polymedix’s
mission is to develop drugs against membrane-bound targets, protein-protein interactions, and
membrane-protein interactions, using a suite of computational tools specifically developed for
these aims (by professors William DeGrado, PhD, and Michael Klein, PhD, of the University of
Pennsylvania).
    Polymedix is working on a new line of antibiotics that mimic the action of defensins, natural
proteins found in the body that kill bacteria.
    “They work similarly to a needle or a corkscrew going into a balloon. They directly attack and
perforate the bacterial cell membrane,” says Nicholas Landekic, MBA, President, CEO, and co-founder
of Polymedix. Because they do not target bacterial proteins (which can easily evolve to escape drug
pressures), defensin-like drugs should not engender bacterial resistance, he says.
    Scientists at Polymedix built a computational model of a defensin protein inserted into a
bacterial cell membrane (a peptide-membrane interaction). Then they virtually transformed the
defensin protein into a drug-sized compound. By swapping amino acid groups for chemically analogous
small molecule groups, they shrank the protein while preserving its chemical interactions
(electrostatics, lipophilicity, etc.) within the membrane.
    The result: drug leads one-tenth the size of the defensins, but about 100-fold more potent and
1000-fold more selective. “So we’ve been able to improve on nature,” Landekic says. The compounds
are now being tested in animal studies.
    “We’ve spent less than 14 million dollars to date since starting Polymedix, so in terms of an
efficiency and efficacy rate, I think that’s pretty good,” he adds.

        MAKING CHEMICALS INTO DRUGS
    Computer-aided methods can identify drug leads with potent activity against a target, but these
compounds are far from being drugs. Drugs must also be bioavailable and safe. Safety problems
derail many drugs late in development, so identifying potential safety snags early on could save
considerable time and money.


HIV Protease Inhibitor. The second-generation HIV protease inhibitor, Kaletra, was developed at Abbott. Here Kaletra is shown bound to the
active site of HIV protease. Courtesy of Abbott.

    “How well can we evaluate bioavailability and toxicity in silico? It’s pretty blunt and not a
very popular answer: we don’t do very well,” Stewart says. “The biological mechanisms underlying
bioavailability and toxicity are complex. So the mathematical models in those areas are still in
their infancy.”
    Olson agrees: We are a long way from being able to simulate a drug’s effect on the entire human
body. “When you’re talking about toxicity, it’s much easier to give a compound to a rat than it is
to dock against all possible proteins that are in the rat, even today,” he says. “But someday, you
might be able to do that. We’re certainly creeping up on that.”
    Computers do play a role today, however. Drugs must meet properties that fall under the ADME
acronym: be Absorbed by the body, Distributed to the target tissues, and not Metabolized or
Excreted too quickly. Software programs check molecules for key features (known as “Lipinski’s Rule
of Five”) that are associated with favorable ADME profiles, such as having five or fewer hydrogen
bond donors and a molecular weight below 500 daltons.
    With enough computing power, scientists can also virtually screen a candidate compound against
a large panel of proteins from the body, to make sure the compound will not cross-react with other
enzymes or receptors to cause side effects.
    To ensure that molecules identified in the computer will have real-world




    For the field to progress, says Anthony Nicholls, the current
 software needs to be more closely scrutinized—using prospective
     studies that directly compare the impact of computer-aided
       methods with more traditional drug design approaches.









value, computational scientists benefit from working closely with medicinal chemists during lead
identification and optimization.
    “Medicinal chemists would tell you that there’s lots of intuition involved, so it’s not all
computational,” says Hans Wolters, PhD, associate director of informatics at XDx, Inc. For example,
he says that as computer scientists became more involved in making drugs, the molecular weight of
candidate compounds began to creep up precipitously, to sizes that would not be easily absorbed by
the human body. Medicinal chemists help recognize this type of problem early in the process.

        DEBATING THE IMPACT
    In the past two decades, although computer-aided drug design has become an integral part of
drug discovery, some remain skeptical as to whether these methods are delivering on their promise.
The productivity of the pharmaceutical industry has actually declined in the past decade. (The FDA
approved 58 drugs from 2002 to 2004 compared with 110 from 1994 to 1996, according to the Tufts
Center for Drug Development.) Though this is likely due to many factors (in particular, tightening
safety standards and the enormous cost and time of clinical trials), the trend has left some
wondering whether large investments in technology, including computer-aided drug design, are paying
significant dividends.
    Many modeling programs are unreliable, and they are not making a big difference in the real
world, cautions Anthony Nicholls, President and CEO of OpenEye Scientific Software, which develops
software for computer-aided drug design. “It’s all done on faith. It’s all done on the idea that
‘oh, we’re using computers, so it must be better,’” he says. “I think a lot of people are fooling
themselves.” He believes that, for the field to progress, the current software needs to be more
closely scrutinized, using prospective studies that directly compare the impact of computer-aided
methods with more traditional drug design approaches.
    Other scientists agree that the algorithms are still being refined, but have a more optimistic
outlook. They say that progress is steady and that computer-aided design is already having an
impact. Klaus Klumpp, PhD, an associate director at Roche (who was involved in the development of
the HIV protease inhibitor saquinavir), points to a suite of emerging drugs for hepatitis C virus
(HCV) as a case in point.
    HCV was discovered in 1989 and the virus was difficult to grow, so structural information for
HCV polymerase and HCV protease became available relatively late, in the mid-to-late 1990s. By this
time, computer-aided drug design was well integrated into big pharmaceutical companies. Several
companies quickly identified binding sites and designed inhibitors, many of which are now in early
clinical trials. “It is expected to completely change the treatment paradigm for HCV-infected
patients,” Klumpp says.
    Richard Casey, PhD, founder and chief scientific officer of RMC Biosciences, Inc., has also
witnessed the dramatic effect that computers can have on drug design. His company provides
computer-aided drug design services for small and mid-size pharmaceutical companies, which often
lack in-house teams.
    Recently, he made 3-D models and performed in silico docking studies for a mid-size
pharmaceutical company that had identified active lead compounds but had no understanding of how
they were binding the target, an RNA synthetase.
    “When they saw this for the first time, it was the ‘aha’ effect: So that’s why this compound
has high activity and this compound does not. It was a real eye-opener for them,” Casey says.
    “I think in the next seven to ten years, with the computational power that’s coming on line
here pretty soon and the steady development in algorithms, computer-aided design is going to make a
huge difference.” ■
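
    As a concrete illustration of the Rule of Five check mentioned under MAKING CHEMICALS INTO
DRUGS, the following is a minimal sketch in Python. It assumes the commonly cited thresholds (no
more than 5 hydrogen bond donors, no more than 10 hydrogen bond acceptors, molecular weight no
greater than 500 daltons, and an octanol-water logP no greater than 5); the article itself names
only the donor and molecular weight criteria. The Descriptors class, the function names, the
one-violation tolerance, and the two example compounds are hypothetical and are not taken from any
particular software package.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one candidate molecule (hypothetical container)."""
    name: str
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float  # daltons
    log_p: float             # octanol-water partition coefficient (log10)


def rule_of_five_violations(d):
    """Return a list describing each Rule of Five criterion the molecule breaks."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    if d.molecular_weight > 500:
        violations.append("molecular weight above 500 Da")
    if d.log_p > 5:
        violations.append("logP above 5")
    return violations


def passes_rule_of_five(d, max_violations=1):
    """Assumption: tolerate at most one violation, a common convention."""
    return len(rule_of_five_violations(d)) <= max_violations


if __name__ == "__main__":
    # Made-up descriptor values, for illustration only.
    candidates = [
        Descriptors("hypothetical fragment", 2, 4, 180.0, 1.2),
        Descriptors("hypothetical oversized lead", 7, 12, 720.0, 6.1),
    ]
    for c in candidates:
        verdict = "pass" if passes_rule_of_five(c) else "fail"
        print(c.name, verdict, rule_of_five_violations(c))
```

    A filter like this only flags compounds whose size and polarity make poor absorption more
likely; as the scientists quoted above stress, it does not replace experimental testing of
bioavailability or toxicity.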


Simbios News
BY KATHARINE MILLER




        In the (Protein) Loop
    In the gaps between the tight coils and flattened sheets that comprise most protein structures,
flexible loops wave and bend. When crystallized, these loops can appear fuzzy in an electron
density map, like moving objects captured in a still photograph. Often, loops may have an important
role in a protein’s function, but because they are so mobile, their structure and dynamics can be
hard to study.
    To better understand how protein loops move, Simbios researchers have created LoopTK, a toolkit
that samples and visualizes many conformations of a loop, and provides various algorithms to
manipulate and analyze loop structures. “We want to find answers that are distributed over all the
motion space,” says Jean-Claude Latombe, PhD, a roboticist and professor of computer science at
Stanford University whose team developed the software. LoopTK is now available for download on the
SimTK.org web site.
    Latombe and his colleagues set out to place protein loops so that they correctly connect up
with the protein’s coils and sheets while avoiding atomic clashes in the loop and between the loop
and the rest of the protein. “Solving both constraints simultaneously is the hard part,” says
Latombe. “That’s what we do with LoopTK. And we can do it very fast. We can sample many
conformations very quickly.”
    LoopTK relies on two techniques: seed sampling and deformation sampling. The seed sampling
algorithm starts with nothing but the amino acid sequence of the protein. It then tries to place
the loop in the full range of possible solutions. When several correct placements are found, the
deformation sampling algorithm is used to deform the loop slightly without breaking the ends and
without creating collisions among the atoms. “The two techniques are very complementary,” says
Latombe. “One gives you a global picture of the entire molecule in space, and the other allows you
to explore specific regions of the motion space in more detail.”
    Latombe’s group is working with others on two applications of LoopTK. With the part of the
Joint Center for Structural Genomics located at the Stanford Linear Accelerator Center, they are
interpreting fuzzy electron density maps created from X-ray crystallography. “One would like to
know the full range of loop conformations that could fit into this fuzziness,” says Latombe. The
resulting loop positions could then be submitted to the Protein Data Bank. “Biologists need to be
aware of the flexibility of the loop and the uncertainty in the conformation,” says Latombe. LoopTK
can provide a sense of which conformations are more likely: a characterization of the distribution
of possible conformations.
    In a second project, LoopTK is being used for functional homology research. Russ Altman, PhD,
chair of Stanford’s bioengineering department, and his group are trying to extract structural
knowledge based on partial knowledge about a protein’s function. For example, if a protein X is
known to bind to protein Y, LoopTK might help to infer possible conformations of the loop that are
consistent with such binding.
    “There might be dozens or more applications for this tool,” says Latombe. “What we hope is that
by putting it on the web site other people will explore those possibilities.” ■

        The Latombe group’s seed sampling algorithm successfully defines the motion space for loops
        surrounded by empty space (as shown here) as well as for loops that are more constrained by
        the surrounding protein structure (not shown). In this picture, the red dots show the
        positions of the middle C atom of the loop in many sampled conformations, but for clarity
        only a small number of these conformations are displayed in their entirety. Courtesy,
        Jean-Claude Latombe and Peggy Yao.

        Simbios is a National Center for Biomedical Computing located at Stanford University.

        DETAILS
    LoopTK, a C++-based object-oriented toolkit, models the kinematics of a protein chain and
provides methods to explore its motion space. In LoopTK, a protein chain is modeled as a robot
manipulator with bonds acting as links and the dihedral degrees of freedom acting as joints.
    LoopTK is now available for download at https://simtk.org/home/looptk. An application
programming interface (API) lets users embed LoopTK in their application software.
    LoopTK will be presented at the 7th Workshop on Algorithms in Bioinformatics in Philadelphia on
September 8-9, 2007. (http://www.wabi07.org/)
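
    The robot-manipulator view described under DETAILS can be made concrete with a small sketch.
The Python code below is not LoopTK and does not use its API; it is a simplified illustration that
assumes a single chain of point atoms with one fixed bond length and one fixed bond angle (the
values 1.5 angstroms and 110 degrees are illustrative only), so that the dihedral angles are the
only joints. Forward kinematics turns a list of dihedrals into atom positions. Uniform random
sampling of the dihedrals stands in for LoopTK’s seed sampling, which is not reproduced here, and
the end_gap function merely measures how far a sampled chain ends from a target anchor point. All
function names are hypothetical.

```python
import math
import random


def sub(u, v): return (u[0] - v[0], u[1] - v[1], u[2] - v[2])
def add(u, v): return (u[0] + v[0], u[1] + v[1], u[2] + v[2])
def scale(u, s): return (u[0] * s, u[1] * s, u[2] * s)
def dot(u, v): return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
def unit(u): return scale(u, 1.0 / math.sqrt(dot(u, u)))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


BOND = 1.5                    # illustrative link length, in angstroms
ANGLE = math.radians(110.0)   # illustrative fixed bond angle


def place_atom(a, b, c, torsion, bond=BOND, angle=ANGLE):
    """Place atom d after a-b-c with |cd| = bond, angle(b, c, d) = angle,
    and dihedral(a, b, c, d) = torsion (the joint value)."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))   # normal of the a-b-c plane
    m = cross(n, bc)                 # in-plane axis perpendicular to bc
    x = -bond * math.cos(angle)
    y = bond * math.sin(angle) * math.cos(torsion)
    z = bond * math.sin(angle) * math.sin(torsion)
    return add(c, add(scale(bc, x), add(scale(m, y), scale(n, z))))


def build_chain(torsions):
    """Forward kinematics: three seed atoms, then one atom per dihedral joint."""
    chain = [(BOND * math.cos(ANGLE), BOND * math.sin(ANGLE), 0.0),
             (0.0, 0.0, 0.0),
             (BOND, 0.0, 0.0)]
    for t in torsions:
        chain.append(place_atom(chain[-3], chain[-2], chain[-1], t))
    return chain


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four consecutive atoms."""
    b1 = unit(sub(p2, p1))
    v = sub(p0, p1)
    v = sub(v, scale(b1, dot(v, b1)))   # part of p0-p1 perpendicular to the axis
    w = sub(p3, p2)
    w = sub(w, scale(b1, dot(w, b1)))   # part of p3-p2 perpendicular to the axis
    return math.atan2(dot(cross(b1, v), w), dot(v, w))


def sample_chains(n_joints, n_samples, seed=0):
    """Stand-in sampler: draw every dihedral uniformly at random."""
    rng = random.Random(seed)
    return [build_chain([rng.uniform(-math.pi, math.pi) for _ in range(n_joints)])
            for _ in range(n_samples)]


def end_gap(chain, anchor):
    """Distance between the last atom and the point the loop should reach."""
    return math.dist(chain[-1], anchor)


if __name__ == "__main__":
    anchor = (4.0, 2.0, 1.0)  # hypothetical position of the loop's far end
    best = min(sample_chains(6, 2000), key=lambda ch: end_gap(ch, anchor))
    print("smallest end gap among samples:", round(end_gap(best, anchor), 3))
```

    A real loop sampler must also reject conformations with atomic clashes and close the loop
exactly rather than approximately; the sketch shows only why treating dihedrals as joints reduces
loop placement to a kinematics problem.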

Under the Hood
  BY CHIH-WEN KAN AND MIA K. MARKEY, PhD



                          Mutual Information
        Mutual information (MI) is defined in information     and post-operatively in
        theory as a measure of the dependencies between
                 two random variables. There are many biomedical
        applications in which it is beneficial to quantify the infor-
                                                                      order to assess the success-
                                                                      fulness of a surgery. To
                                                                      facilitate the interpretation
        mation content using a measure such as MI. In classifica-     of such sets of images, reg-
        tion problems, MI is used as a dependence measure to select   istration—the process of
        features such that they are dissimilar from each other in     aligning multiple images—is neces-
        order to reduce feature redundancy. MI can also be used in    sary. The goal of registration is to identify a transformation
        database retrieval. The MI is calculated between a query      that maps each point in one image to the corresponding
        item and every entry in the database in order to identify the point in the other image.
        entry in the database that is most similar to the query item.     One approach to image registration is based on defin-
            In image processing, it is also used extensively as a sim-ing landmarks or fiducial points in the images. By deter-
        ilarity measure for image registration and for combining      mining how to align those landmarks, one can determine
        multiple images to build 3D models. We will use the appli-    how to transform one image to match the other. However,
        cation domain of medical image registration to illustrate     manual definition of landmarks is time consuming, may
        the utility of MI.                                            be difficult even for an experienced observer, and suffers
            The mutual information of random variables A and B        from intra- and inter-reader variability.
        is defined as                                                     Another approach to image registration is to determine
                                                                      a transformation based on a measure of the similarity of
                       I(A,B) =∑            (
                                    p(a,b)log
                                                 p(a,b)
                                                p(a)p(b))             the images, such as MI. Since larger MI corresponds to
                                a,b                                   more similarity of the two images, MI is maximized in reg-
                                                                      istration algorithms.
       where p(a,b) is the joint probability distribution function        In image registration, the goal is to determine a trans-
       of A and B, and p(a) and p(b) are the marginal probabili- formation of one image such that the MI between the trans-
       ty distribution functions of A and B, respectively.            formed image and the reference image is maximized.
           Thus, in the context of medical image registration, MI Different types of transformations may be considered based
       measures the distance between the joint distributions of the on the application. The simplest class of transformations
       images’       gray                                                                                           only     permits
       values p(a,b) and
       the distribution
                           In image registration, the goal is to determine                                          rotations and
                                                                                                                    translations. In
       when the two                                                                                                 medical imag-
       images are inde-
                            a transformation of one image such that the                                             ing, a wider vari-
       pendent from                                                                                                 ety of scaling
       each other. It is a mutual information between the transformed                                               and        shape
       measure of the                                                                                               changes       are
       dependence be-      image and the reference image is maximized.                                              often needed,
       tween the two                                                                                                including non-
       images. Since the mutual information I(A,B) is the reduc- linear transformations that allow for non-uniform changes
       tion in the uncertainty of A due to the knowledge of B, across the image. An optimization algorithm is applied to
       when p(a) = p(b), the uncertainty is minimal and the reduc- dynamically search among transformations for the one with
       tion of uncertainty is maximized.                              maximal MI.
           In medical imaging, it is often necessary to compare           MI has been shown to be especially valuable for regis-
       images of a patient that are acquired at different times or by tering multi-modality images. For example, computed
       different modalities. For example, images may be taken pre- tomography (CT), positron emission tomography (PET),
                                                                                and magnetic resonance imaging (MRI) images
                                                                                of the same patient provide complementary
                                 DETAILS                                        information. Registration based on MI enables a
  Chih-Wen Kan is a graduate student in The University of Texas                 healthcare provider to directly correlate the data
  Department of Biomedical Engineering. She works on developing                 from such different imaging techniques. MI has
  diagnostic decision support systems in Dr. Mia Markey’s                       also shown promise for registering time series
  Biomedical Informatics Lab (http://bmil.bme.utexas.edu/).                     images. A series of images over time is often
                                                                                used to evaluate tissue function in addition to
                                                                                structure. ■
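
The ideas in this article can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`mutual_information`, `shift_columns`, `register_by_translation`) are hypothetical, the probabilities are estimated by simple counting of discrete gray values, the only transformation considered is a cyclic horizontal shift, and the "optimization algorithm" is an exhaustive search. Practical registration software would typically bin intensities into histograms, interpolate between pixels, and use a proper optimizer over a richer class of transformations.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """MI (natural log) between two equal-length sequences of gray values."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))   # counts that estimate p(a, b)
    count_a = Counter(a)         # counts that estimate p(a)
    count_b = Counter(b)         # counts that estimate p(b)
    mi = 0.0
    for (va, vb), c in joint.items():
        # p(a,b) / (p(a) p(b)) simplifies to c * n / (count_a * count_b)
        mi += (c / n) * math.log(c * n / (count_a[va] * count_b[vb]))
    return mi


def shift_columns(image, dx):
    """Cyclically shift each row of a 2D list by dx pixels (toy translation)."""
    width = len(image[0])
    return [[row[(x - dx) % width] for x in range(width)] for row in image]


def register_by_translation(reference, moving, max_shift):
    """Try every shift in [-max_shift, max_shift]; keep the one with maximal MI."""
    flat_ref = [v for row in reference for v in row]
    best_dx, best_mi = 0, -math.inf
    for dx in range(-max_shift, max_shift + 1):
        flat_mov = [v for row in shift_columns(moving, dx) for v in row]
        mi = mutual_information(flat_ref, flat_mov)
        if mi > best_mi:
            best_dx, best_mi = dx, mi
    return best_dx, best_mi


# Toy "multi-modality" pair: the moving image has inverted contrast
# (gray value v becomes 3 - v) and is displaced by two pixels.
reference = [
    [0, 0, 1, 2, 3, 1, 0, 0],
    [0, 1, 2, 3, 3, 2, 0, 0],
    [0, 0, 1, 3, 2, 1, 1, 0],
    [0, 0, 0, 1, 2, 3, 1, 0],
]
moving = shift_columns([[3 - v for v in row] for row in reference], -2)
best_dx, best_mi = register_by_translation(reference, moving, max_shift=3)
print(best_dx, best_mi)
```

Because MI depends only on how consistently gray values co-occur, not on the values themselves, the inverted contrast in this toy example does not prevent the search from recovering the two-pixel displacement. That property is one reason a measure like MI suits images from different modalities.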


Putting Heads Together

The 6th Annual International Conference on Computational Systems Bioinformatics (CSB2007), coordinated by the Life Sciences Society

WHAT: This conference is designed for any scientist interested in the interaction of biology and computing who wants to gain fast access to current research results, network with other life scientists, and listen to and meet scientific stars. CSB2007 will continue to be a five-day single-track conference featuring 10 half-day tutorials, 30 refereed papers plus keynote speakers, 150 posters and five full-day workshops. Special events for the evenings are being planned.

WHEN: August 13-17, 2007

WHERE: University of California, San Diego

MORE INFO: http://lifesciencessociety.org/CSB2007/index07.html

Stanford's Bio-X Symposium: Life in Motion

WHAT: Bio-X, Stanford's interdisciplinary life sciences initiative, hosts a major symposium each year. This year Bio-X has teamed up with Simbios, Stanford's National NIH Center for Physics-based Simulation of Biological Structures, to hold a symposium entitled "Life in Motion." The goal of this symposium is to educate students and scientists from different disciplines about the exciting uses of simulations driven by the laws of physics and mechanics across a range of scales, from molecules to organisms. The talks will be presented by a series of experts and innovators from around the world. Confirmed speakers are: Sylvia Blemker; Joachim Frank; Robert Full; Jessica Hodgins; John Hutchinson; Roger Kamm; Mimi Koehl; Vijay Pande; Klaus Schulten; Demetri Terzopoulos.

WHEN: October 25, 2007

WHERE: James Clark Center Auditorium, Stanford University

MORE INFO: simtk.org/home/lifeinmotion

The Pacific Symposium on Biocomputing (PSB) 2008

WHAT: The Pacific Symposium on Biocomputing (PSB) 2008 is an international, multidisciplinary conference for the presentation and discussion of current research in the theory and application of computational methods in problems of biological significance. PSB is a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, and other computational methods, as applied to biological problems, with emphasis on applications in data-rich areas of molecular biology. Papers and presentations are rigorously peer reviewed and are published in an archival proceedings volume.

WHEN: January 4-8, 2008

WHERE: The Fairmont Orchid on the Big Island of Hawaii

DEADLINES: Call for Papers: July 16, 2007; Poster abstract submissions: Nov. 9, 2007.

MORE INFO: http://psb.stanford.edu/

OF NOTE: This year, Simbios will be holding a special session at PSB: Multiscale Modeling and Simulation: from Molecules to Cells to Organisms

WHY "PUTTING HEADS TOGETHER"?

This magazine strives to build connections among diverse researchers, all of whose work touches on biomedical computation. Because these highlighted conferences and symposia do the same thing, we are giving them a well-deserved spot in these pages. If you have a favorite conference you'd like to see appear in this magazine, let us know: editor@biomedicalcomputationreview.org.





Biomedical Computation Review
Simbios A NATIONAL CENTER FOR BIOMEDICAL COMPUTING
Stanford University
318 Campus Drive
Clark Center Room S231
Stanford, CA 94305-5444




Seeing Science
   BY KATHARINE MILLER




Remodeling by Curvature

Whenever a cell needs to get rid of waste, transport materials, sort proteins, or build new organelles, membranes remodel themselves. Often that means forming small enclosed compartments called vesicles. Now researchers have gained a better understanding of that process using coarse-grained computer simulations. The work was published in the May 24, 2007 issue of Nature.

Researchers knew that specialized proteins are involved in triggering membranes to remodel themselves, but experimental and theoretical research could not explain how they do it. Because the energy required for major remodeling projects is greater than the energy used to bind the specialized proteins to the membrane (or to each other), some suspected that membrane curves themselves could carry the necessary energy.

Using coarse-grained simulations, Kurt Kremer, PhD, Markus Deserno, PhD, and their colleagues at the Max Planck Institute for Polymer Research in Mainz, Germany, showed that curvature-mediated attraction can indeed explain how membranes refashion themselves. Once a membrane starts to bend, proteins embedded in that membrane begin to cluster and draw the membrane into a curved shape, not unlike a vesicle.




    The coarse-grained membrane simulation starts with a flat membrane containing 46,080 lipids and 36 large hemispherical “caps” (shown in pink)
    representing membrane proteins. Over the course of roughly one millisecond, the proteins begin to aggregate and form a large vesicle. The final
    image shows a cross-section of the vesicle in order to reveal the protein caps within. Courtesy of Kurt Kremer and Markus Deserno.

				