
FGE-OM: Functional Genomics Experiment - Object Model

          Andy Jones

  Department of Computing Science
       University of Glasgow
               Overview

•   Introduction to proteomics
•   Motivation for shared standards
•   FGE-OM
•   Database implementation - RAPAD
•   Current biological projects
Proteomics Workflow

[Diagram: workflow stages. Sample Origin, Protein Solubilisation, 2D-PAGE, Image Analysis, Multiple Gel Analysis, Statistical Analysis; Mass Spectrometry (MALDI, MS/MS), Database Search, Protein identification]

Image Analysis (two example spot lists):

ID   Vol   X     Y       ID   Vol   X     Y
1    454   23    24      1    654   23    24
2    222   28    87      2    25    28    87
3    12    20    12      3    187   21    16
4    662   262   101     4    672   262   111
5    49    222   90      5    54    222   90
6    113   485   10      6    113   487   10
7    119   98    987     7    125   98    987

     Motivation for Shared Standards

• Data from large studies using multiple
  techniques can be compared more easily
• Proteomics standardisation can learn from past
  efforts of MGED
• Shared aspects of microarrays & proteomics:
  – Overview of experiment
  – Sample origin
  – Experimental protocols (similarity between RNA
    extraction & protein solubilisation)
  – Higher level analysis across multiple samples
  – RNA fluorescence signal, similar to protein volume
    on a 2-D gel
Functional Genomics Experiment - Object Model
Namespaces

• FGE-OM: top level of the object model
  – BioOM: components common to all functional genomics experiments
  – ArrayOM: microarray-specific components (MAGE-OM derived)
  – ProteomicsOM: classes modelling proteomics technologies (PEDRo and Gla-PSI derived)

RAPAD

• A database for microarrays and proteomics
• Based on the RAD microarray database at Penn
• Additional tables to store proteomics data
• Interface based on the RAD Study-Annotator

RAD Study-Annotator: Manduchi et al., Bioinformatics 2003 (in press)

            Proteomics Standards

   • PEDRo - Proteomics Experiment Data
     Repository
      – Proposal for standard covering sample
        origin, protein separation and mass spec
      – Accepted by Proteomics Standards
        Initiative as a draft standard
      – Published in Nature Biotech 21:247-254
        (2003)

http://pedro.man.ac.uk    http://psidev.sourceforge.net/
         Proteomics Standards

Gla-PSI
   – Glasgow proposal for PSI
More detailed coverage of:
• Image analysis
• Multiple analysis of 2D gels
• DIGE
• Statistical analysis

Comparative and Functional Genomics 4:492-501 (2003)
Overview of BioOM packages

Packages: Experiment, Protocol, BioMaterial, Measurement, BioSequence, BQS, BioEvent, Description, BioAssay, BioAssayData, HigherLevelAnalysis, AuditAndSecurity
Classes: Extendable, Describable, Identifiable

• BioAssay: Hybridization class moved into ArrayOM
• BioAssayData: BioDataCube and related classes moved into ArrayOM
• Other packages: unchanged from MAGE-OM

Overview of ArrayOM packages

Packages: Array, ArrayDesign, DesignElement, QuantitationType, ArrayBioAssay, ArrayBioAssayData

• Array, ArrayDesign, DesignElement
    • Describe layout of array
• QuantitationType - microarray-specific classes e.g. Signal
• But standard statistical tests could be incorporated into BioOM in the future
• ArrayBioAssay contains only the Hybridization class
• ArrayBioAssayData contains BioDataCube - data dimensions
    • Not directly applicable to proteomics or other experiments

BioOM:BioAssayData vs ArrayOM:ArrayBioAssayData

BioOM classes: BioAssayData, BioAssayDimension, BioDataValues, BioDataTuples, MeasuredBioAssayData
ArrayOM classes: DesignElementMap, QuantitationTypeMap, BioAssayDatum, DesignElementMapping, QuantitationTypeMapping, BioAssayMap, DesignElementDimension, QuantitationTypeDimension, BioAssayMapping, FeatureDimension, DerivedBioAssayData, BioDataCube, ReporterDimension, CompositeSequenceDimension, Transformation

• Only the most generic classes kept in BioOM
• Data model from MAGE does not fit proteomics
• Matching spots across gels is more complex

Relationships between classes are the same as MAGE-OM

Overview of ProteomicsOM packages

Packages: ProteomeBioAssay, ProteinData, ProteinSeparation, ProteinRecord, MassSpecProtocol, MassSpecData

• Packages derived from PEDRo and Gla-PSI
• Linked to classes in BioOM for adding generic descriptions and protocols
• Different design principles from MAGE-OM
• Classes have attributes that specify many of the datatypes to be captured

ProteomicsOM:ProteinSeparation package

[Diagram: BioAssayTreatment with separation techniques Gel2D and Column; BioMaterial with separation products PhysicalGelSpot and Fraction; source biomaterial linked through BioMaterialMeasurement]

• Separation techniques: subclass of BioAssayTreatment
• Separation products: subclass of BioMaterial
• Product of one separation technique can lead into another using BioMaterialMeasurement
• A generic protocol can be attached to BioAssayTreatment
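
The subclassing described above can be sketched in a few lines of Python. This is only an illustration of the relationships named on the slide, not the FGE-OM definition: the class names come from the slide, while the attribute names (`name`, `source`, `protocol`, `products`) and the example values are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BioMaterial:
    """Generic material; base class for separation products."""
    name: str  # attribute name is an assumption


@dataclass
class BioMaterialMeasurement:
    """Points at the material that goes into a treatment."""
    bio_material: BioMaterial


@dataclass
class BioAssayTreatment:
    """Generic treatment; base class for separation techniques."""
    source: BioMaterialMeasurement
    protocol: Optional[str] = None  # generic protocol, free text in this sketch
    products: List[BioMaterial] = field(default_factory=list)  # assumed field


class Gel2D(BioAssayTreatment):
    """Separation technique: 2-D gel."""


class Column(BioAssayTreatment):
    """Separation technique: chromatography column."""


class PhysicalGelSpot(BioMaterial):
    """Separation product of a gel."""


class Fraction(BioMaterial):
    """Separation product of a column."""


# A column separation whose fraction is then run on a 2-D gel.
extract = BioMaterial("protein extract")
column = Column(source=BioMaterialMeasurement(extract),
                protocol="hypothetical LC protocol")
fraction = Fraction("fraction 3")
column.products.append(fraction)

gel = Gel2D(source=BioMaterialMeasurement(fraction))
spot = PhysicalGelSpot("spot 17")
gel.products.append(spot)
```

Because the products are themselves BioMaterial, the output of one separation can be wrapped in a BioMaterialMeasurement and fed to the next, which is the chaining the slide describes.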
ProteinSeparation Package
ProteomicsOM:ProteomeBioAssay package

• GelImageAnalysis - analysis of 2-DE by specialist software
• Re-uses Image and ImageAcquisition from BioOM
• Linked by PhysicalBioAssay

[Diagram classes: Channel, Image, ImageAcquisition, BioAssay, PhysicalBioAssay, MeasuredBioAssay, BioAssayTreatment, FeatureExtraction, GelImageAnalysis, MeasuredBioAssayData, BioAssayData; association labels: treatment, target; legend: BioOM, ProteomicsOM]

Gel2D - 1st and 2nd dimension, stain protocols, operator, MW & pI range
ImageAcquisition
Image, Channel
GelImageAnalysis
ProteomicsOM:ProteinData package

• IdentifiedSpot stores spot data e.g. volume
• Subclass of PhysicalGelSpot and BioMaterial for capturing further treatments
• DIGESingleSpot captures single channel
• BioAssayDimension captures spots matched across gels

[Diagram classes: BioAssayData, BioAssayDimension, BioDataValues, BioDataTuples, BioMaterial, PhysicalGelSpot, IdentifiedSpot, DIGESingleSpot, SpotRatio, PhysicalBioAssay, FeatureExtraction, GelImageAnalysis, MultipleAnalysis, MatchedSpots]
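
As a rough illustration of what matched spots and a spot ratio could look like in practice, the sketch below uses the two example spot lists from the Proteomics Workflow slide. It assumes that spots sharing an ID have already been matched across the two gels and that a ratio is simply the second volume divided by the first; neither assumption is stated in the slides, and the names `Spot` and `volume_ratios` are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Spot:
    spot_id: int
    volume: float
    x: int
    y: int


# Values copied from the two spot lists on the workflow slide.
FIRST_GEL = [
    Spot(1, 454, 23, 24), Spot(2, 222, 28, 87), Spot(3, 12, 20, 12),
    Spot(4, 662, 262, 101), Spot(5, 49, 222, 90), Spot(6, 113, 485, 10),
    Spot(7, 119, 98, 987),
]
SECOND_GEL = [
    Spot(1, 654, 23, 24), Spot(2, 25, 28, 87), Spot(3, 187, 21, 16),
    Spot(4, 672, 262, 111), Spot(5, 54, 222, 90), Spot(6, 113, 487, 10),
    Spot(7, 125, 98, 987),
]


def volume_ratios(reference: List[Spot], other: List[Spot]) -> Dict[int, float]:
    """Return other/reference volume ratios for spots present in both lists."""
    other_by_id = {spot.spot_id: spot for spot in other}
    return {
        spot.spot_id: other_by_id[spot.spot_id].volume / spot.volume
        for spot in reference
        if spot.spot_id in other_by_id
    }


ratios = volume_ratios(FIRST_GEL, SECOND_GEL)
```

Matching by ID hides the harder step of deciding which spots correspond when coordinates drift between gels (for example spot 3 and spot 6 above), which is the part specialist image analysis software handles.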
ProteinData Package

Clicking a spot loads protein data pages
Classes shown: IdentifiedSpot, Protein
Search capabilities over protein name, range of pI, mass or spot volume
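
A search of this kind can be sketched as a parameterised SQL query. The example below is not the RAPAD schema or interface: the table `identified_spot`, its columns, the function `search_spots` and the three placeholder rows are all hypothetical, and SQLite is used only because it runs in memory without setup.

```python
import sqlite3


def search_spots(conn, name=None, pi_range=None, mass_range=None, volume_range=None):
    """Return spot ids matching every criterion that was supplied."""
    clauses, params = [], []
    if name is not None:
        clauses.append("protein_name LIKE ?")
        params.append(f"%{name}%")
    # Column names are fixed here, never taken from user input.
    for column, bounds in (("pi", pi_range), ("mass", mass_range),
                           ("volume", volume_range)):
        if bounds is not None:
            clauses.append(f"{column} BETWEEN ? AND ?")
            params.extend(bounds)
    where = " AND ".join(clauses) or "1"
    sql = f"SELECT spot_id FROM identified_spot WHERE {where} ORDER BY spot_id"
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE identified_spot ("
    "spot_id INTEGER PRIMARY KEY, protein_name TEXT, "
    "pi REAL, mass REAL, volume REAL)"
)
# Placeholder rows, invented purely to exercise the query.
conn.executemany(
    "INSERT INTO identified_spot VALUES (?, ?, ?, ?, ?)",
    [
        (1, "placeholder kinase", 5.2, 42000.0, 454.0),
        (2, "placeholder tubulin", 4.8, 50000.0, 222.0),
        (3, "placeholder kinase B", 7.9, 61000.0, 12.0),
    ],
)
```

For instance, `search_spots(conn, name="kinase", volume_range=(100, 1000))` combines a name match with a volume range, and omitting every argument returns all spots.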
MassSpecProtocol and MassSpecData

• MassSpecExperiment at top level
• BioAssayTreatment links to source of material and protocol (via BioEvent)
• Also links to specific classes for MS details e.g. ion source
• Data stored as a list of peaks
• Classes for capturing database searches from PEDRo

[Diagram, MassSpecProtocol Package: BioMaterialMeasurement, BioAssayTreatment, MassSpecExperiment, PEDRo derived classes modelling MS protocol]
[Diagram, MassSpecData Package: Peak, PeakList, PEDRo derived classes modelling database searches]
[Legend: BioOM, ProteomicsOM]
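
To make "data stored as a list of peaks" concrete, here is a minimal sketch of a Peak and PeakList pair. The class names follow the diagram; the attributes (`mz`, `abundance`), the helper methods and the numeric values are assumptions chosen for illustration and are not taken from the model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Peak:
    mz: float         # mass-to-charge ratio (assumed attribute name)
    abundance: float  # signal intensity (assumed attribute name)


@dataclass
class PeakList:
    peaks: List[Peak] = field(default_factory=list)

    def add(self, mz: float, abundance: float) -> None:
        """Append one peak to the list."""
        self.peaks.append(Peak(mz, abundance))

    def base_peak(self) -> Peak:
        """Return the most abundant peak."""
        return max(self.peaks, key=lambda peak: peak.abundance)

    def within(self, low: float, high: float) -> List[Peak]:
        """Return peaks whose m/z lies in the closed range [low, high]."""
        return [peak for peak in self.peaks if low <= peak.mz <= high]


# Placeholder values, not measured data.
peak_list = PeakList()
for mz, abundance in [(500.0, 10.0), (750.5, 80.0), (1200.25, 35.0)]:
    peak_list.add(mz, abundance)
```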
ProteinRecord package

• Proteins identified by MS and database searches
• Class Protein stores a single protein record
• Protein modifications stored using OntologyEntry
• Link to external records stored in DatabaseEntry

[Diagram, MassSpecData package: ProteinHit, DBSearch]
[Diagram, ProteinRecord package: Protein, ProteinModification, Location]
[Diagram, BioOM: DatabaseEntry, OntologyEntry; association labels: species, modificationType]
[Legend: BioOM, ProteomicsOM]
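
The way a protein record hangs together can be sketched as follows. Class names and the `species` and `modificationType` associations come from the diagram; every other attribute name, the method `accessions`, and the example accession and URI are placeholders assumed for this sketch rather than part of the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class OntologyEntry:
    category: str  # assumed attribute names
    value: str


@dataclass(frozen=True)
class DatabaseEntry:
    accession: str
    database_name: str
    uri: str


@dataclass
class ProteinModification:
    modification_type: OntologyEntry  # "modificationType" in the diagram


@dataclass
class Protein:
    name: str
    species: OntologyEntry
    pi: Optional[float] = None                # assumed attribute
    molecular_weight: Optional[float] = None  # assumed attribute
    modifications: List[ProteinModification] = field(default_factory=list)
    database_entries: List[DatabaseEntry] = field(default_factory=list)

    def accessions(self, database_name: str) -> List[str]:
        """Return the accessions this record holds for one external database."""
        return [entry.accession for entry in self.database_entries
                if entry.database_name == database_name]


# Placeholder record: the accession and URI are not real identifiers.
protein = Protein(
    name="hypothetical protein",
    species=OntologyEntry("species", "Trypanosoma brucei"),
)
protein.modifications.append(
    ProteinModification(OntologyEntry("modificationType", "phosphorylation"))
)
protein.database_entries.append(
    DatabaseEntry("ACC0001", "GeneDB", "https://example.org/")
)
```

Keeping modification types and species as OntologyEntry objects, rather than free text, is what lets the same controlled terms be reused across records.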
Protein: display protein name, species, pI and MW
ProteinModification: data about protein modifications observed
ProteinHit, DBSearch: measures of quality of match by MS. Link to MASCOT results
Protein, DatabaseEntry: link to GeneDB record - parasite genome database
- Accession and database URL are stored in the DatabaseEntry table
Protein, DatabaseEntry: link to GenBank record (or other database)
Proteomics Workflow

• Top level stores experiment description
• Extraction of protein mixture: BioMaterial and Treatment
• 2-DE and liquid chromatography: subclasses of BioAssayTreatment
• BioMaterialMeasurement used to link multiple separations together
• Image scan and image analysis - link to PhysicalBioAssay
• MeasuredBioAssay links to spot data and MS data

[Diagram classes: Experiment; Treatment, BioMaterial, BioMaterialMeasurement; Material Type (DNA, RNA, Protein, Cell, ...); BioAssayTreatment (Gel2D, LCColumn, MassSpecExperiment); PhysicalBioAssay; ImageAcquisition (AcquisitionProtocol, Image); FeatureExtraction (GelImageAnalysis); MeasuredBioAssay (MeasuredBioAssayData)]
Current Project: Trypanosoma brucei

• Trypanosomes cause sleeping sickness and other
  diseases in Africa and Latin America
• Model organism for parasitology
Aims:
• Genome sequencing, microarrays and proteomics to find all
  the expressed genes and proteins - GeneDB at Sanger
• Proteomics component in Glasgow
• 2-DE and MS to find approx. 4000 proteins
• Find potential drug targets and improve genome annotation
           Work In Progress

• Develop RAPAD prototype, store and query
  data from a range of experiment types
• Support Trypanosome project - future
  integration with microarray and genomics
• Tools for generating FGE-ML and XML Schema
• Incorporate proteomics component into
  database system at Penn (GUS)
  – Add proteomics support to ToxoDB, PlasmoDB,
    GeneDB
                       Contact
           Email: jonesa@dcs.gla.ac.uk
   http://www.dcs.gla.ac.uk/~jonesa/FGE/fge.html
Bioinformatics Research Centre - www.brc.dcs.gla.ac.uk

             Acknowledgements
This work is in collaboration with the CBIL at Penn, in
particular Chris Stoeckert and Angel Pizarro.
Trypanosome data is from studies by Mike Turner and
Anne Faldas in IBLS at Glasgow.
PhD supervisors: Ela Hunt and Jonathan Wastling
The Functional Genomics Facility at Glasgow is supported by a
Wellcome Trust grant. My research is supported by an MRC
Bioinformatics PhD studentship.

								