Publishing expression data from the SMD

Catherine Ball
Tuesday, May 30, 2006
User Help: Tutorials and Workshops
•   SMD Help & FAQ
•   SMD Tutorials – regularly scheduled (we hope)
    –   Welcome to SMD
    –   Data analysis, Normalization and Clustering
    –   Publishing expression data
    –   Power users and the data repository
    –   Interested? Email
    Publishing expression data :
    a tutorial
•   What we will discuss:
     –   Publishing
          •   Publisher’s requirements
          •   Experimenter’s responsibilities
     –   Hybridization Annotation
          •   Categories, Subcategories
          •   Protocols
          •   Procedures and parameters
          •   Clinical Data
     –   Experiment Set Annotation
          •   Organizing Data
          •   Experiment Design Categories
          •   Experimental Factors
          •   Factor Values
     –   Making your data available
          •   SMD
          •   Web Supplements
          •   Public Data repositories
•   What we won’t discuss:
     –   User Registration
     –   Loader Accounts
     –   Submitting Data
     –   Finding Your Data
     –   Displaying Your Data
     –   Data Retrieval and Analysis
     –   Submitting a Printlist
     –   Data Normalization
     –   Data Quality Assessment
     –   Data Analysis (clustering)
     –   External User Tools (XCluster, TreeView, etc.)

     Please fill out the sign-up sheet and survey form
     Questions? email us at:
Publishing expression data
•   Background
•   Publishing requirements and
•   Pre-publication responsibilities
    – Hybridization Annotation
    – Experiment Set Annotation
•   Post-publication responsibilities
    – Making your data available
  Background : Interpretation and
  •   Extremely difficult to either interpret or
      analyze expression results without
      being aware of all the variables
Biological characteristics, experimental design, protocol
parameters, filtering parameters, etc.
  •   Typically, these annotations, if they
      exist at all, are not attached to the data
Perhaps in a lab notebook, eventual publication (if ever
published), or in the worst scenario, only in the experimenter’s
Background : MGED
•   Microarray Gene Expression Database Society
•   Initially established November, 1999, Cambridge, UK.
•   Realized there were serious problems in
    communicating the results of genomic-scale
    expression results
•   Keen interest in data standards, specifications, and
Background : Emerging
               •   MIAME : Minimal Information
                   About a Microarray Experiment
                    –   the requisite information needed to
                        both verify your analysis and allow
                        others to perform distinct analyses
                    –   Nature Genetics (2001) 29, 365-371

               •   MAGE-ML: MicroArray Gene
                   Expression Markup Language
                    –   data format standard required for
                        transmission and integration into
                        other expression repositories
                    –   Genome Biology (2002),
Background : MIAME checklist

• MGED Guide to authors, editors and
  reviewers of microarray gene expression
• In the interests of full disclosure and open
  research, a checklist of requirements was
  proposed, aimed at allowing manuscript
  readers “to understand the experiment, to
  identify the sequences being assayed, and to
  interpret the resulting data.”
Publication Requirement?

                   … also being
                   adopted by Cell
                   and The Lancet -
                   others to follow…
Publishing responsibilities
•   Pre-publication
    – Provide the data and full annotation to the
      reviewers and editors.
    – This may evolve to sending data to a repository
      prior to publication (reviewer anonymity)
•   Post-publication
    – For the foreseeable future, provide a static
      snapshot of the raw result data and
      filtered/clustered data along with the gene
      annotation at the time of publication
Implications of MIAME for Stanford
Microarray Researchers

•   As of December 1, 2002, anyone submitting a
    paper to a Nature journal must submit his/her
    data to a public microarray data repository
    (such as ArrayExpress).

•   SMD users should start assembling and
    entering experimental data in preparation for
    more widespread acceptance of these
MIAME checklist

    Six parts
    1.   Biological Samples
    2.   Hybridizations
    3.   Data Normalization and Transformation
    4.   Experimental Design and Factors
    5.   Array Design
    6.   Measurements
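
As a rough illustration of how the six-part checklist could serve as a pre-submission check, the sketch below reports which parts of an annotation record are still empty. The record layout and the `missing_miame_parts` helper are hypothetical names chosen for this example; they are not SMD or MGED identifiers.

```python
# Hypothetical sketch: the part names follow the six-part checklist above,
# but the record layout and function name are assumptions for illustration.
MIAME_PARTS = [
    "Biological Samples",
    "Hybridizations",
    "Data Normalization and Transformation",
    "Experimental Design and Factors",
    "Array Design",
    "Measurements",
]


def missing_miame_parts(annotation):
    """Return the checklist parts that are absent or empty in `annotation`."""
    return [part for part in MIAME_PARTS if not annotation.get(part)]


if __name__ == "__main__":
    draft = {
        "Biological Samples": "strain and growth conditions",
        "Hybridizations": "labeling and hybridization protocol text",
        "Array Design": "",  # entered but left blank, so still reported
    }
    for part in missing_miame_parts(draft):
        print("Still needed:", part)
```

A part counts as missing when it is absent or blank, so a field that was created but never filled in is still flagged.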
SMD Stores Procedures
•   Biological Sample (Channels 1 and 2)
•   Growth Conditions (Channels 1 and 2)
•   Treatment (Channels 1 and 2)
•   Extract Preparation (Channels 1 and 2)
•   Chromatin IP
•   Amplification (Channels 1 and 2)
•   Labeling (Channels 1 and 2)
•   Hybridization Conditions
•   Scanning Procedure (Channels 1 and 2)
•   Feature Extraction
•   User-defined Procedures
Recording Procedural Details : Two
•   Full text Protocols
    – Great for providing the full documentation of the
      protocol to a fellow researcher, but…
    – Poor for indicating which experimental parameter
      is the key to the experimental design
•   Procedural parameters
    – Great for supervised analysis and singling out the
      important details of the experiment, but…
    – Poor for synthesizing the entire procedure
      together in a legible manner
    Where are the tools?

Enter New Data

                 View Existing Data
List Existing Protocols

  •   Display within SMD, or View external resource

  •   Edit your protocol from the list
Edit Existing Protocol
Entering a New Protocol
•   Choose the procedure
•   Supply the formatted plain text, or a simple description if
    providing the URL
Flowchart to Add Annotations
Edit your hybridizations

     Use “Edit” to add
     procedural details to
     your experiments
Experiment Types
•   CGH
     – Comparison of genomic copy number between samples
       (Comparative Genomic Hybridization).
•   Chromatin IP
     – Investigation of DNA-protein interactions in which protein-bound
       DNA is immunoprecipitated.
•   Expression (Type I)
     – Investigation of gene expression where the control sample is
       tailored to the particular experiment (not a common reference).
•   Expression (Type II)
     – Investigation of gene expression where the control RNA is made
       from a common reference.
•   GMS
     – Genome Mismatch Scanning. Investigation of the parental origin of
       genomic DNA.
Associating a protocol with a

• Associate a previously entered protocol
• Enter a new one, if need be
Adding Procedural Parameter
Values for a Hybridization
              •   Same interface is used
                  to add experimental
                  parameter values
              •   Parameter values are
                  linked directly to the
              •   Procedural parameters
                  are modeled as
                  experimental factors
Edit your hybridizations

     Use “Edit” to add
     clinical annotation to
     your experiments
Associating Patient
               •   Patient parameters we
                   – Age at diagnosis
                   – Sex
                   – Ethnicity
                   – Family History
                   – Status
                   – Time from Operation to
                   – Date of last follow-up
                   – Patient lost prior to
Associating Clinical Sample
               •   Sample parameters we store
                   –   Tracking Information
                   –   Unique Sample ID
                   –   Linking Database
                   –   Sample Information
                   –   Sample Source
                   –   Time Post-mortem (hrs) of sample removal
                   –   Sample State, Size
                   –   Granularity
                   –   Organ of origin
                   –   Attending Surgeon
                   –   Pre-Operative Information
                   –   Prior Treatment
                   –   Clinical Stage
                   –   Post-Operative Information
                   –   Tumor Grade, Size, Type
                   –   Margins
                   –   Time from Diagnosis To Operation
                   –   Angioinvasion
                   –   Total Lymph Nodes
                   –   Positive Lymph Nodes
                   –   Pathological Stages
                   –   FollowUp Information
                   –   Recurrence
                   –   Post Operative Therapy
                   –   Time from Operation to
  Batch Association of

Batch Entry
MIAME checklist : Data
Normalization and Transformation
MIAME : Experimental Design

•   Experimental Design and Factors
    – type of experiment (set of hybridizations)
    – The number of hybridizations performed
    – experimental factors
    – hybridization design
    – the type of reference used for the
    – quality control steps taken
Organizing Data: Arraylists vs Experiment Sets
•   Arraylists
     – Personal list of experiments
     – Contains no annotation
     – More difficult to share with others
     – Flat file that exists in your loader account
     – Accessed through Advanced Search
•   Experiment Sets
     – Annotated list of
     – Exists in the database, therefore dynamic (edit, delete, or annotate through a web interface)
     – Easily shared with other users/collaborators
     – Extensible
     – Accessed through Basic
     – Required for publication within SMD
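
The contrast between a flat arraylist and an annotated experiment set can be sketched in a few lines. This is only an illustration: it assumes an arraylist is plain text with one experiment identifier per line (the actual SMD file format is not described here), and `parse_arraylist`, `ExperimentSet`, and `to_experiment_set` are hypothetical names, not part of SMD.

```python
# Hypothetical sketch: the file format, field names, and function names are
# assumptions for illustration, not SMD's actual formats or interfaces.
from dataclasses import dataclass, field


def parse_arraylist(text):
    """Read experiment IDs from flat text, skipping blank lines and repeats."""
    ids = []
    for line in text.splitlines():
        exp_id = line.strip()
        if exp_id and exp_id not in ids:
            ids.append(exp_id)
    return ids


@dataclass
class ExperimentSet:
    name: str
    description: str  # e.g. the abstract or a figure legend
    experiment_ids: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)  # per-experiment values

    def annotate(self, exp_id, key, value):
        """Attach one annotation to an experiment that belongs to the set."""
        if exp_id not in self.experiment_ids:
            raise KeyError(f"{exp_id} is not in this set")
        self.annotations.setdefault(exp_id, {})[key] = value


def to_experiment_set(arraylist_text, name, description):
    """Build an (initially unannotated) set from flat arraylist text."""
    return ExperimentSet(name, description, parse_arraylist(arraylist_text))


if __name__ == "__main__":
    demo = to_experiment_set("101\n102\n\n101\n", "Demo set", "Figure 1 legend")
    demo.annotate("101", "Treatment", "heat shock")
    print(demo.experiment_ids, demo.annotations)
```

The flat list carries only identifiers; the set adds a description and a place to hang annotations, which is what makes it editable and shareable.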
Easily convert your arraylist
into an experiment set
Selecting the data for inclusion within the
experiment set

•   Select
    using either the
    basic or
    advanced search
    as a starting point

             Experiment Set Creation
Experiment Set Organization
Base Annotation for the
Experiment Set

   –Set description
       •For publications, this would likely be either the abstract or
       a figure legend
Finding Your Sets in SMD:
Basic Search

                 Experiment Sets allow
                    you to search data
                        on pre-defined
                   experiment groups.
Edit your Experiment Set
Experiment Factors : Step 1
 Procedures   Parameters   Measurements?
Experiment Factors : Step 2

                     These values can be
                     from your
                     parameters values,
                     but only if you have
                     annotated your

                     Note: full text protocols
                     cannot be utilized for
                     this purpose, but fulfill
                     their own purpose.
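
One way to picture how factor values can come from parameter values: a parameter that takes different values across the hybridizations in a set is a candidate experimental factor, and its per-hybridization values are the factor values. The sketch below assumes a simple dictionary layout; the `candidate_factors` function and the example parameter names are hypothetical and do not reflect SMD's internal model.

```python
# Hypothetical sketch: data layout and names are assumptions for illustration.
def candidate_factors(hyb_params):
    """Map each parameter that varies across the set to its values.

    `hyb_params` maps a hybridization name to a dict of parameter values.
    A hybridization that lacks a parameter contributes None for it, which
    makes unannotated hybridizations easy to spot.
    """
    names = sorted({p for params in hyb_params.values() for p in params})
    factors = {}
    for name in names:
        values = {hyb: params.get(name) for hyb, params in hyb_params.items()}
        if len(set(values.values())) > 1:  # constant parameters are not factors
            factors[name] = values
    return factors


if __name__ == "__main__":
    hybs = {
        "hyb1": {"Treatment": "heat shock", "Time (min)": 0},
        "hyb2": {"Treatment": "heat shock", "Time (min)": 15},
        "hyb3": {"Treatment": "heat shock"},  # time point never annotated
    }
    print(candidate_factors(hybs))
```

This only works on structured parameter values, which is consistent with the note that full text protocols cannot be used for this purpose.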
Benefits of Experiment
•   Meet MIAME requirements
•   Meet publishing requirements (see above)
•   Serve as a basis for new analysis tools
Post-publication responsibilities

•   Making your data easily available
    and accessible for the foreseeable
    – SMD
    – web supplement
    – public repositories
Post-publication : SMD

•   Send us the name of your MIAME-
    annotated experiment set
•   We’ll make the arrays world-viewable
    for you, and publicize your paper
•   Gene annotations and normalizations
    may change, so you must also provide
    a distinct, static view (web supplement)
Post-publication : web
•   We encourage you to make a web supplement,
    which represents a snapshot of the data, as
•   Options:
    1. You can make the web-site and host it on your own.
    2. You can make the web-site on your own and you can ask
       us to host it.
    3. You can ask us to construct one for you. Usually, given the
       amount of work that this entails (ask us ahead of time), the
       curator creating the website will expect collaborative
Post-publication : repositories

 – Submit your data to a public repository
   • ArrayExpress at the EBI
   • Gene Expression Omnibus (GEO) at NCBI
 – We produce valid MAGE-ML for
   experiment sets and array designs and can
   communicate these to the repositories for
If you require assistance
with either the creation of a
web supplement or
submission of your dataset
to a repository, contact us at
MIAME Resources

•   MIAME working group

•   MIAME checklist for authors, editors
SMD: Getting Help

                •   Click on the
                    “Help” menu
                    – Tool-specific links
                      will be listed at the
                • Use the SMD help
                  index to look for
                  specific subjects
                • Send e-mail to:
SMD: Office Hours

• Grant building, S201
• Mondays 1-3 pm
• Wednesdays 2-4 pm
SMD Staff
•   Gavin Sherlock, Co-Investigator
•   Catherine Ball, Director
•   Patrick Brown
•   Farrell Wymore, Lead Programmer
•   Michael Nitzberg, Database Administrator
•   Zac Zachariah, Systems Administrator
•   Catherine Beauheim, Scientific Programmer
•   Janos Demeter, Computational Biologist
•   Heng Jin, Scientific Programmer
•   Don Maier, Senior Software Engineer
•   Takashi Kido, Visiting Scholar
