BRIEFING TITLE - ALL CAPS 30 Jan 01

					Integrating -Omics


             Brent D. Foy, Ph.D.
           Associate Professor
         Department of Physics
         Wright State University
                    Dayton, OH
                       Overview

•   Combining Genomic Data with Proteomic Data
    – Which gene makes which protein?
    – If mRNA level goes up, does the protein level go
      up?


•   Biomolecular Network Modeling
    – Issues
    – State of the Field
    – Our work
          Gene to Protein Identification

  Partial table from Affymetrix rat gene tox chip

J02589mRNA#2_at    J02589mRNA#2 Rat UDP glucuronosyltransferase, complete cds /cds=(32,1
J02592_s_at        J02592 Rat glutathione S-transferase Y-b subunit mRNA, 3 end /cds=(0,560
J02612mRNA_s_at    J02612mRNA RATUDPGT Rat UDP-glucuronosyltransferase mRNA, comple
J02657_s_at        J02657 Rat cytochrome P-450(M-1) mRNA, complete cds /cds=(20,1522) /g
J02669_s_at        J02669 Rat cytochrome P-450a (3-methylcholanthrene-inducible; with high tes
J02679_s_at        J02679 Rat NAD(P)H-menadione oxidoreductase mRNA, complete cds /cds=
J02720_at          J02720 Rat liver arginase mRNA, complete cds /cds=(26,997) /gb=J02720 /g
J02722cds_at       J02722cds RATHOXA Rat heme oxygenase gene, complete cds
J02752_at          J02752 Rat acyl-coA oxidase mRNA, complete cds /cds=(73,2058) /gb=J027
J02776_s_at        J02776 RATPOLB1 Rat DNA polymerase beta mRNA, complete cds
J02791_at          J02791 Rat acyl coenzyme A dehydrogenase medium chain mRNA, comple
J02810mRNA_s_at    J02810mRNA RATGSTYBX Rat prostate glutathione S-transferase mRNA, c
J02844_s_at        J02844 RATCOTA Rat carnitine octanoyltransferase mRNA, complete cds


  ‘J02722’ is the GenBank nucleotide ID for the rat heme oxygenase gene
  (probe set J02722cds_at in the table above).
           Gene to Protein Identification

•   A search for ‘J02722’ on GenBank
    (http://www.ncbi.nlm.nih.gov/Genbank/) or EBI
    (http://www.ebi.ac.uk/cgi-bin/emblfetch) brings up the gene
    information page.
•   Scroll down for the protein ID. GenBank gives a link for
    ‘AA41346.1’. EMBL gives links for EPD: ‘EP31003’ and
    Swiss-Prot: ‘P06762’. Clicking a link opens the
    information page for the protein.
•   Match the Affymetrix gene ID with the protein ID provided
    by the proteomics experiment.
•   The reverse also works: given a protein ID, find the gene ID.
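
The lookup steps above can be sketched in code. This is a minimal illustration, assuming the GenBank accession is the leading letter-plus-five-digits part of each Affymetrix probe ID (true for the rows in the partial table) and assuming a small hand-built accession-to-protein table. `ACCESSION_TO_PROTEIN` and both helper functions are hypothetical names; only the J02722 to P06762 pair comes from the slides.

```python
import re

# Hypothetical hand-built table: GenBank nucleotide accession -> Swiss-Prot ID.
# Only this one pair is taken from the slides; a real table would be filled
# in from the GenBank/EMBL gene information pages.
ACCESSION_TO_PROTEIN = {"J02722": "P06762"}


def probe_to_accession(probe_id):
    """Extract the leading GenBank accession (one letter, five digits)."""
    match = re.match(r"[A-Z]\d{5}", probe_id)
    return match.group(0) if match else None


def match_probes_to_proteins(probe_ids, proteomics_ids):
    """Return {probe_id: protein_id} for probes whose protein was also
    identified in the proteomics experiment."""
    found = set(proteomics_ids)
    matches = {}
    for probe in probe_ids:
        protein = ACCESSION_TO_PROTEIN.get(probe_to_accession(probe))
        if protein in found:
            matches[probe] = protein
    return matches


probes = ["J02720_at", "J02722cds_at", "J02752_at"]
print(match_probes_to_proteins(probes, ["P06762"]))
```

Going through the accession rather than the full probe ID keeps the match independent of Affymetrix suffixes such as `_s_at` or `cds_at`.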

          Gene to Protein Identification

• Since we have ~150 identified proteins from proteomics and
  ~1000 genes on the Affymetrix gene chip, we took the reverse
  approach (given a protein, find its mRNA) and found 21 genes
  corresponding to 16 proteins that were present in both.
• Discrepancy? Some proteins match more than one chip entry, for example:
   – AFFY and GenBank # M25157 – Rat Cu, Zn superoxide
      dismutase, from Sprague Dawley, lung cell line, 601 base pairs
   – AFFY and GenBank # Y00404 – Rat mRNA for copper-zinc-containing
      superoxide dismutase, from Sprague Dawley, liver, 650 base pairs
   – Possible causes: errors in public databases, or just incomplete
      knowledge of mRNA or protein varieties
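
One way to see why the gene count (21) exceeds the protein count (16) is to group chip entries by protein. A minimal sketch, assuming a hand-built list of (accession, protein name) pairs; the accessions come from the slides, while the list layout and the function name are illustrative.

```python
from collections import defaultdict

# (GenBank accession, protein name) pairs. The accessions appear on the
# slides; the pairing into a list like this is an illustrative assumption.
CHIP_ENTRIES = [
    ("M25157", "Cu,Zn superoxide dismutase"),
    ("Y00404", "Cu,Zn superoxide dismutase"),
    ("J02722", "heme oxygenase"),
]


def group_by_protein(entries):
    """Collect all chip accessions that point at the same protein."""
    grouped = defaultdict(list)
    for accession, protein in entries:
        grouped[protein].append(accession)
    return dict(grouped)


grouped = group_by_protein(CHIP_ENTRIES)
for protein, accessions in grouped.items():
    note = "multiple chip entries" if len(accessions) > 1 else "single entry"
    print(f"{protein}: {accessions} ({note})")
```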


   Change in mRNA Expression vs
    Change in Protein Expression
Ratio of expression in absence of galactose to expression
in presence of galactose




                             Ideker T, et al., Science, 292: 929-934, 2001.
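
The comparison behind such a plot can be sketched as follows: compute a log ratio per gene for mRNA and for protein, then check how well the two track each other. All numbers below are made up for illustration and are not taken from Ideker et al.; the function names are hypothetical.

```python
import math


def log10_ratio(level_without, level_with):
    """Log10 of expression without galactose over expression with galactose."""
    return math.log10(level_without / level_with)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up (without galactose, with galactose) levels for four genes.
mrna = [(100.0, 1000.0), (500.0, 500.0), (800.0, 200.0), (50.0, 400.0)]
protein = [(20.0, 150.0), (300.0, 330.0), (900.0, 450.0), (60.0, 200.0)]

mrna_ratios = [log10_ratio(a, b) for a, b in mrna]
protein_ratios = [log10_ratio(a, b) for a, b in protein]
r = pearson(mrna_ratios, protein_ratios)
print(f"Pearson r between mRNA and protein log ratios: {r:.2f}")
```

Working in log ratios treats a two-fold increase and a two-fold decrease symmetrically, which is why expression changes are usually plotted on log axes.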
                         mRNA Expression vs. Protein Level

[Figure, left panel: "Control case, no hydrazine exposure". Protein level vs. gene expression level; both axes run from 1 to 100000 in powers of ten.]

[Figure, right panel: "Protein level vs gene expression, ratio 75 mM to 0 mM, different times". Vertical axis: protein level, ratio 75 mM to 0 mM, t = 3. Horizontal axis: gene expression, ratio 75 mM to 0 mM, t = 0. Both axes run from 0 to 2.]
                  Time Course – mRNA and
                       Protein Levels
                       50 mM Hydrazine-exposed Hepatocytes

[Figure: four panels, each plotting Protein and mRNA levels against time (axis from -5 to 30).]
• Heme Oxygenase (HSP32)
• Soluble Cytochrome b5
• Immunoglobulin Heavy Chain Binding Protein
• N-hydroxy-2-acetylaminofluorene sulfotransferase
           Biomolecular Network Modeling

[Diagram: molecular species at four levels of analysis, linked by action pathways and control pathways.]
• Genome Analysis: Genome; Gene_i*
• Transcriptome Analysis: snRNA, tRNA, rRNA; Pre-mRNA_i; mRNA_ia, mRNA_ib, mRNA_ij
• Proteome Analysis: Protein (P_ia); Protein Modifications (P_ia1); Protein-Protein Interactions (P_ia...)
• Metabolome Analysis: Metabolic Pathways (S_k, M_k); Cellular Metabolites
         Metabolic Network Modeling -
               Tracer studies

• Quantify activities of biochemical pathways
• For example, C-13 NMR analysis of TCA cycle and gluconeogenesis in liver

[Diagram: flux network connecting plasma glucose, glucose, plasma lactate, lactate, pyruvate, oxaloacetate, acetyl-CoA, lipid + acetate, α-ketoglutarate, fumarate, glutamate, and plasma glutamate. Fluxes are labeled F1 to F10; further labels are OG, IG, OL, IL, OA, IA, OT, and IT.]
                      Genetic Regulation

• Genes are expressed in distinct domains, precisely delineated by
   time, state of the cell, and level of response.
• This control is exerted by regulatory elements in the promoter
   and enhancer regions of genes.
• The field is still young, but some quantitative results are appearing.

[Diagram: DNA with binding sites for regulatory factors A, B, A, C, and D, followed by the mRNA sequence.]

 • Feedback with other genes

       Biomolecular Network Modeling –
                   Issues

•   Compared to standard modeling of kinetic processes,
    challenges include:
    – Stochastic reaction behavior due to random
      diffusion processes and small numbers of
      molecules
    – Multiple interaction types (protein-protein,
      protein-mRNA, etc.)
    – Computational efficiency: parallelized code for
      operation on multiple CPUs
    – Can you separate out the model for a pathway from
      the whole cell?
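
To illustrate the first issue, stochastic behavior at small molecule numbers, here is a minimal sketch of Gillespie's stochastic simulation algorithm applied to a single species that is produced at a constant rate and removed by a first-order process. The model, the rate values, and the function name are illustrative assumptions, not the simulation used in this work.

```python
import random


def gillespie_birth_death(k_make, k_decay, n0, t_end, seed=0):
    """Simulate production (rate k_make) and first-order decay
    (rate k_decay * n). Returns lists of event times and molecule counts."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_make = k_make            # propensity of the production reaction
        a_decay = k_decay * n      # propensity of the decay reaction
        a_total = a_make + a_decay
        if a_total == 0.0:
            break                  # nothing can happen any more
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_make:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


times, counts = gillespie_birth_death(k_make=2.0, k_decay=0.1, n0=0, t_end=100.0)
print(f"{len(times) - 1} events, final count {counts[-1]}")
print(f"deterministic steady state would be {2.0 / 0.1:.0f} molecules")
```

Changing `seed` gives a different trajectory each time; the deterministic rate equation would instead give one smooth curve, and the relative size of the fluctuations grows as the molecule number shrinks.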
Biomolecular Network Modeling –
             Task


gene A               mRNA A                prot A                rxn A1   A2


gene B               mRNA B                prot B                rxn B1   B2


gene C               mRNA C                prot C


gene D               mRNA D                prot D




• Compounds other than genes are mobile

• Some of these mobile compounds affect many reactions (e.g. ATP, ions)


 Biomolecular Network Modeling –
     Finding the Parameters

Use the simulation itself to narrow down the possible
parameter values
1. Optimize on stability
   [Figure: stable regions in the plane of Parameter 1 and Parameter 2.]
2. Optimize on something else:
    maximum energy efficiency
    rapid cell division
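
Step 1 can be sketched as a grid scan: for each parameter pair, test whether the model's steady state is stable and keep the stable pairs. The toy model below, two species that each decay at unit rate and activate each other with strengths `p1` and `p2`, is a made-up stand-in for a real network model. For a two-variable linear system the steady state is stable when the Jacobian has negative trace and positive determinant.

```python
def is_stable(p1, p2):
    """Stability of the origin for dx/dt = -x + p1*y, dy/dt = p2*x - y.

    The Jacobian is [[-1, p1], [p2, -1]]; a 2x2 linear system is stable
    when its trace is negative and its determinant is positive.
    """
    trace = -1.0 + -1.0
    det = (-1.0) * (-1.0) - p1 * p2
    return trace < 0 and det > 0


def scan(values):
    """Return every (p1, p2) pair on the grid that gives a stable system."""
    return [(p1, p2) for p1 in values for p2 in values if is_stable(p1, p2)]


grid = [0.0, 0.5, 1.0, 1.5, 2.0]
stable = scan(grid)

# Print a crude map: rows are p2 (largest on top), columns are p1.
for p2 in reversed(grid):
    print("".join("S" if (p1, p2) in stable else "." for p1 in grid))
```

For a larger model the same loop applies, with `is_stable` replaced by a numerical test on the simulated or linearized system.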

     Biomolecular Network Modeling -
            State of the Field


•   E-Cell
•   Virtual Cell
•   Bio-Spice/Arkin
•   Specific Laboratories – Institute for Systems
    Biology/Leroy Hood’s group
•   Useful links page:
    http://www.cds.caltech.edu/erato/links.html




                        E-Cell



•   From Laboratory for Bioinformatics, Keio University,
    Japan
•   Attempts to integrate the genes, RNA, proteins, and
    metabolites of an entire cell in one simulation
•   Freely available, http://www.e-cell.org/




                       E-Cell


•   Used to simulate a “minimal cell” based on
    Mycoplasma genitalium
•   127 genes
•   Integrates with online databases
•   Many parameters are estimated
•   Substances modeled include small molecules,
    macromolecules, multi-protein complexes,
    protein-DNA complexes
•   Multiple reaction types


 E-Cell, published results


[Figure: simulated response to removing glucose from the culture medium. Left panel: ATP vs. time. Right panel: some mRNA levels vs. time.]

      Tomita, M., et al.; Bioinformatics, Volume 15, Number 1, 72-84 (1999)
                      Virtual Cell

•   National Resource for Cell Analysis and Modeling
    (NRCAM), located at University of Connecticut Health
    Center
•   Access via internet, http://www.nrcam.uchc.edu/
•   Has a graphical, “biological users” interface
•   Compared to E-Cell
    – Includes 3-D spatial information within the cell
    – Has not been applied to gene->mRNA->protein->metabolites


        Virtual Cell

Define physiology, with reactions
       among substances




  Virtual Cell

Geometric results




                     Bio-Spice


•   Initiated at Lawrence Berkeley National Laboratory,
    http://gobi.lbl.gov/~aparkin/index.html
•   Development of Bio-Spice is currently the subject
    of a DARPA project
•   It will be a Simulation Program for Intra-Cell
    Evaluation, like SPICE for circuit design
•   Intended to be a “user-friendly simulation tool
    that captures the network of molecular
    interactions including gene-gene, gene-protein,
    and protein-protein interactions.”


Institute for Systems Biology -
       Galactose in Yeast




               Ideker T, et al., Science, 292: 929-934, 2001.


ISB - physical interaction network

Circles are genes, yellow means product affects another gene’s
transcription, blue means proteins interact. Grayscale of circles
is mRNA change with galactose in medium.




                                Ideker T, et al., Science, 292: 929-934, 2001.
          Development of Quantitative
             Tools - Transcription




[Diagram: TF_B, TF_A, TFIII, RNA Polymerase, and Activated Nucleotides act on a stretch of DNA that carries regulatory factor sites B and A, a TATA site, and the mRNA sequence.]


Development of Quantitative
Tools - Transcription (cont.)



 State of Promoter             kon for RNA Polymerase,
 TATA    A       B             (M*ms)^-1
 off     any     any           1e-99
 on      off     off           1e-30
 on      on      off           5e-23
 on      off     on            1e-99
 on      on      on            5e-23
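
The table maps naturally onto a lookup keyed by promoter state. A minimal sketch, assuming each site is simply on or off and that the "any" entries in the first row mean a TATA site that is off overrides A and B; the function and constant names are hypothetical, and the rate values are those in the table.

```python
# kon for RNA polymerase when the TATA site is off, in (M*ms)^-1.
K_ON_TATA_OFF = 1e-99

# (A on, B on) -> kon when the TATA site is on, in (M*ms)^-1.
K_ON_TATA_ON = {
    (False, False): 1e-30,
    (True, False): 5e-23,
    (False, True): 1e-99,
    (True, True): 5e-23,
}


def polymerase_k_on(tata_on, a_on, b_on):
    """Look up the polymerase on-rate for a given promoter state."""
    if not tata_on:
        return K_ON_TATA_OFF
    return K_ON_TATA_ON[(a_on, b_on)]


print(polymerase_k_on(tata_on=True, a_on=True, b_on=False))
```

A stochastic simulation would call such a lookup each time a factor binds or unbinds, so that the polymerase binding propensity always reflects the current promoter state.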




         Development of Quantitative
         Tools - Transcription (cont.)



Gene A
                  B       A          TATA    product = TF_A




Gene B
                          A          TATA    product = TF_B



   Plus a first-order process for degradation of TF_A and TF_B
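
A first-order degradation process removes each molecule with a constant probability per unit time, so the expected count decays exponentially. A minimal sketch, with a made-up rate constant and starting count (neither is given on the slide):

```python
import math


def remaining(n0, k_deg, t):
    """Expected molecules left after time t for dN/dt = -k_deg * N."""
    return n0 * math.exp(-k_deg * t)


def half_life(k_deg):
    """Time for the expected count to fall by half."""
    return math.log(2.0) / k_deg


k_deg = 0.001  # per second, illustrative value only
t_half = half_life(k_deg)
print(f"half-life: {t_half:.0f} s")
print(f"expected count after one half-life, starting from 40: {remaining(40, k_deg, t_half):.1f}")
```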


                                Development of Quantitative
                                Tools - Transcription (cont.)

[Figure, left panel: "Time course of number of TF_A". Vertical axis: # A mRNA molecules, 0 to 45. Horizontal axis: Time (s), 0 to 2 x 10^4.]

[Figure, right panel: "Time course of binding to gene A promoter". Summary per factor:]

 Factor        Events    % on
 POLYMERASE    1696      29.53
 TFIII         3967      99.62
 TF_A          5         51.45
 TF_B          1852      45.97
        Biomolecular Network Modeling -
                Future Tasks

• The ultimate goal is to provide physiological insight into integrated
   genomic, proteomic, and metabolic data sets collected in response to
   toxicity interventions
• Establish contact with online databases
   – Gene->protein->metabolite connections (KEGG, others)
   – Protein-protein interactions (published list, Nature Biotech)
   – Protein-DNA interactions (TRANSFAC, SCPD)
• Evaluate the proper scale of the modeling effort for the task, both in
   level of biological detail and in man-hours.
• Choose software and gain expertise with it, or create software as
   needed.
• One early goal: explore a minimal cell and its stability in response
   to perturbation

                   Collaborators


AFRL
  Dr. John Frazier
  Dr. Charles Wang
  Dr. Victor Chan

AFOSR
  Dr. Walt Kozumbo

AFIT
  Dr. Dennis Quinn
  2Lt Matt Campbell

WSU
  Dr. Tatiana Karpinets




Integrating -omics




  Questions?





				