An introduction to J-Express:
analysis and visualisation of
microarray gene expression data

Anne-Kristin Stavrum

Norwegian Microarray Consortium
Biological Question: sick vs. healthy, time series change, drug treatment, ...
Experimental Design: which samples to include? Time points? Tissue types? Sample outtake? Avoid batch effects, ...
Microarray Experiment: Illumina, Agilent, Affymetrix, in-house arrays, ...
Image Analysis and Expression Quantification: Illumina, Agilent, Affy, GenePix, ...
Filtering, Normalization and Expression Data Analysis (clustering, interaction analysis etc.): J-Express, R/Bioconductor, TMeV, GeneSpring, ...
Biological Verification and Interpretation: gene ontology, pathways, qPCR, proteomics, ...
In J-Express...
Low level analysis ( = data preparation): on each array separately
      Quality Control
      Filtering
      Normalisation
      Spot visualisation
      Data distribution and layout
      Interactive preparation
      Batch processing

High level analysis ( = expression data analysis): on the expression matrix
      Clustering/projection
      Differential expression (Feature subset selection, SAM)
      Gene group analysis (Pathways, Gene Ontology)
      Visualisation
Step 1: Low-level data preparation and quality control
      Input: quantitations per spot, output from GenePix
            Image quantitation files (GPR)
            Image files (JPG)

Step 2: Data mining
      Input: expression matrix of genes by samples
GenePix GPR File Example: GenePix results for each feature (spot)
Columns:
      Block, Column, Row, Name, ID, X, Y, Dia.
      F635 Median, F635 Mean, F635 SD, B635 Median, B635 Mean, B635 SD
      % > B635+1SD, % > B635+2SD, F635 % Sat.
      F532 Median, F532 Mean, F532 SD, B532 Median, B532 Mean, B532 SD
      % > B532+1SD, % > B532+2SD, F532 % Sat.
      Ratio of Medians, Ratio of Means, Median of Ratios, Mean of Ratios, Ratios SD
      Rgn Ratio, Rgn R², F Pixels, B Pixels
      Sum of Medians, Sum of Means, Log Ratio
      F635 Median - B635, F532 Median - B532, F635 Mean - B635, F532 Mean - B532
      Flags
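A minimal Python sketch of how the columns listed above could be read outside J-Express. It assumes the Axon Text File (ATF) layout that GPR files normally use: a first line starting with "ATF", a second line giving the number of optional header records and the number of data columns, then the header records, one tab-separated line of column names, and one line per spot. The function name read_gpr and the sample values are made up for illustration.

```python
import csv
import io

def read_gpr(handle):
    """Return (header_records, spots) from an ATF-style GPR file handle."""
    first = handle.readline().split()
    if not first or first[0] != "ATF":
        raise ValueError("not an ATF file")
    # Second line: number of header records, number of data columns.
    n_header, n_columns = (int(x) for x in handle.readline().split()[:2])
    header = [handle.readline().strip().strip('"') for _ in range(n_header)]
    reader = csv.reader(handle, delimiter="\t", quotechar='"')
    columns = next(reader)
    if len(columns) != n_columns:
        raise ValueError("column count does not match the ATF header")
    # One dictionary per spot, keyed by column name; values stay strings.
    spots = [dict(zip(columns, row)) for row in reader if row]
    return header, spots

# Tiny made-up file with only six of the columns.
SAMPLE = (
    'ATF\t1.0\n'
    '2\t6\n'
    '"Type=GenePix Results 3"\n'
    '"Wavelengths=635\t532"\n'
    '"Block"\t"Name"\t"ID"\t"F635 Median"\t"B635 Median"\t"Flags"\n'
    '1\t"geneA"\t"id1"\t1200\t100\t0\n'
    '1\t"empty"\t"id2"\t90\t95\t-50\n'
)

header, spots = read_gpr(io.StringIO(SAMPLE))
print(len(spots), spots[0]["Name"], spots[0]["F635 Median"])
```

For a real file, open it in text mode and pass the handle to read_gpr in the same way.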
J-Express low level analysis and data preparation

Step 1: Create experiment setup
Step 2: Link arrays to GenePix output files
Step 3: Choose data fields (output from GenePix)
Replicates
The use of replicates increases the quality of microarray data.

Two types of replicates:
      In-array replicates are replicate spots on the same array.
      Between-array replicates are replicated microarrays (same bio-material).

The J-Express framework can handle both in-array and between-array replicates.
We do not use replicates in the practical session.
Quality Control

Combine GenePix output data with the J-Express quality control framework.
Chip Image View

Check GenePix flagging and gridding.

Manually select spots and view their progression throughout the complete experiment.
Data Preparation

 Create a process batch to filter and normalize the data.

 A process batch can be copied to other arrays, saved and loaded for later use.

 Processes are:
 •Filters
 •Normalization methods
 •Various control components such as plots and array views

 (Screenshot: clear batch, no processing)
Filtering
Remove unwanted spots, for example:
 •Remove spots with diameter below 100
 •Remove spots with foreground below 2x background
 •Remove spots with value below 50
 •Remove empty spots and controls
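The four example filters can be sketched in Python as a single test applied to each spot. This is an illustration only, not J-Express code. The column names "Dia.", "F635 Median", "B635 Median" and "Name" are GenePix GPR columns; checking only the 635 channel, treating "value" as the foreground median, and recognising empty spots and controls by the names "empty" and "control" are all assumptions.

```python
def keep_spot(spot, min_diameter=100, min_ratio=2.0, min_value=50,
              excluded_names=("empty", "control")):
    """Return True if a spot passes all four example filters."""
    foreground = float(spot["F635 Median"])
    background = float(spot["B635 Median"])
    if float(spot["Dia."]) < min_diameter:
        return False                      # spot diameter below 100
    if foreground < min_ratio * background:
        return False                      # foreground below 2x background
    if foreground < min_value:
        return False                      # value below 50
    if spot["Name"].lower() in excluded_names:
        return False                      # empty spots and controls
    return True

# Made-up spots; values are strings, as they would be when read from a file.
spots = [
    {"Name": "geneA", "Dia.": "120", "F635 Median": "1200", "B635 Median": "100"},
    {"Name": "geneB", "Dia.": "80", "F635 Median": "900", "B635 Median": "100"},
    {"Name": "geneC", "Dia.": "110", "F635 Median": "150", "B635 Median": "100"},
    {"Name": "empty", "Dia.": "130", "F635 Median": "400", "B635 Median": "100"},
]
kept = [s["Name"] for s in spots if keep_spot(s)]
print(kept)
```

Only geneA survives here: geneB is too small, geneC is too close to background, and the last spot is empty.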
Normalization
Remove unwanted dye bias
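The slide does not say which normalization method is used. As one simple illustration of removing a dye bias, the Python sketch below centres the log2 ratios of an array on their median, so that a constant excess of one dye is subtracted from every spot. The method (global median centring), the function name and the intensities are assumptions chosen for the example.

```python
import math
from statistics import median

def median_normalize(red, green):
    """Return log2(red/green) ratios shifted so that their median is zero."""
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    shift = median(log_ratios)            # estimated global dye bias
    return [m - shift for m in log_ratios]

# Made-up background-corrected intensities; red is about 4x too bright.
red = [400.0, 800.0, 1600.0, 3200.0, 200.0]
green = [100.0, 200.0, 400.0, 400.0, 100.0]
normalized = median_normalize(red, green)
print(normalized)
```

Median centring only corrects a bias that is the same for all spots. A bias that depends on intensity needs an intensity-dependent method instead.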
Other process components
 •Plot
 •Flooring
 •Fold change view
 •Spot Image view
 •Replicate view

Process list complete
Gene expression matrix
          T 01   T 05   T 10   T 15   ...
Gene 1
Gene 2
Gene 3
...
Expression matrix
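A small Python sketch of how prepared values from each array could be assembled into a gene expression matrix with one row per gene and one column per sample. The function name, the dictionary layout and the numbers are hypothetical. A spot that was removed during filtering on one array is left as None in that column.

```python
def build_matrix(arrays):
    """arrays: {sample_name: {gene: value}} -> (genes, samples, rows)."""
    samples = list(arrays)
    genes = sorted({g for values in arrays.values() for g in values})
    # One row per gene, one column per sample; None marks a missing value.
    rows = [[arrays[s].get(g) for s in samples] for g in genes]
    return genes, samples, rows

# Made-up log ratios for three time points.
arrays = {
    "T 01": {"Gene 1": 0.1, "Gene 2": -0.4, "Gene 3": 1.2},
    "T 05": {"Gene 1": 0.3, "Gene 2": -0.9},
    "T 10": {"Gene 1": 0.2, "Gene 2": -1.5, "Gene 3": 1.0},
}
genes, samples, rows = build_matrix(arrays)
for gene, row in zip(genes, rows):
    print(gene, row)
```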
J-Express desktop
Overview of expression data - Projection
Overview of expression data - Clustering
Sample groups
 1. Define group of samples; select samples
 2. Enter group name; select group colour and create group
Filtering

Example: remove genes with constant expression levels across samples.
The result is a new dataset without the removed genes.
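The example of removing genes with constant expression could be sketched in Python as a variance filter. The threshold of 0.01 and the data are made up, and the filter in J-Express may use a different criterion.

```python
from statistics import pvariance

def remove_constant_genes(matrix, min_variance=0.01):
    """Split {gene: [values]} into the kept genes and the removed gene names."""
    kept = {g: v for g, v in matrix.items() if pvariance(v) >= min_variance}
    removed = sorted(set(matrix) - set(kept))
    return kept, removed

# Made-up expression values across four samples.
matrix = {
    "Gene 1": [1.0, 1.0, 1.0, 1.0],     # constant
    "Gene 2": [0.2, 1.5, -0.8, 2.1],    # varies
    "Gene 3": [0.5, 0.5, 0.6, 0.5],     # nearly constant
}
new_dataset, removed = remove_constant_genes(matrix)
print(sorted(new_dataset), removed)
```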
Annotation manager
 Link external annotation to the microarray probes (steps 1 to 4 are marked in the screenshot)
Projection
Motivation:
      Large datasets with a lot of similar-looking (low expressed) profiles are common.
      Use projection to see similar and dissimilar groups of profiles.
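One common projection method is principal component analysis. The Python sketch below projects expression profiles onto their first two principal components, using only the standard library. Power iteration with deflation is chosen here to keep the example short; it is an assumption for illustration and not a description of how J-Express computes its projections. The profiles are made up.

```python
import math
import random

def center(rows):
    """Subtract the column mean from every column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Covariance matrix of the columns of already centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_component(cov, iterations=500):
    """Power iteration: unit vector along the direction of largest variance."""
    d = len(cov)
    rng = random.Random(0)                # fixed seed for repeatable output
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(rows, n_components=2):
    """Project each profile onto the first principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(n_components):
        v = leading_component(cov)
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflation: remove the variance explained by this component.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in centered]

# Two rising profiles, two falling profiles and one flat profile.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.1, 2.9, 2.2, 0.9],
    [2.5, 2.5, 2.5, 2.5],
]
coords = project(profiles)
for row in coords:
    print([round(x, 2) for x in row])
```

In the output, the rising and falling profiles end up on opposite sides of the first axis and the flat profile lies near the origin. The sign of each axis is arbitrary.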
Clustering
Datasets can be large, often 15000 genes
and 50-100 samples.

Clustering creates smaller groups of genes
that can be viewed and analysed separately.

These groups consist of genes that behave similarly across samples.

This expression similarity is often caused by
co-regulation or other interesting biological
processes.




Examples:
         Hierarchical clustering
         k-means clustering
         Self-organizing maps
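As an illustration of one of the listed methods, the Python sketch below runs a plain k-means clustering on a few made-up expression profiles. The Euclidean distance and the deterministic choice of starting centroids are assumptions made to keep the example small and repeatable; J-Express offers its own settings for these.

```python
import math

def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(profiles, k, iterations=100):
    """Plain k-means; returns one cluster index per profile."""
    # Deterministic start (an arbitrary choice): the first profile, then
    # repeatedly the profile farthest from the centroids chosen so far.
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(profiles,
                       key=lambda p: min(distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(iterations):
        # Assign every profile to its nearest centroid.
        new_labels = [min(range(k), key=lambda i: distance(p, centroids[i]))
                      for p in profiles]
        if new_labels == labels:
            break                         # assignments are stable
        labels = new_labels
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, l in zip(profiles, labels) if l == i]
            if members:
                centroids[i] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels

# Made-up profiles: three rising genes and two falling genes.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.1, 2.9, 2.2, 0.9],
    [0.9, 2.2, 3.1, 3.8],
]
labels = kmeans(profiles, k=2)
sizes = [labels.count(i) for i in range(2)]
print(labels, sizes)
```

Each cluster can then be viewed separately, with its cluster number and number of members.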
Clustering
Before clustering: MESS! After clustering: less mess...

Each cluster is shown as a line chart (1 line per gene) + a heat diagram (1 row per gene),
labelled with its number of members and its cluster number.
Hierarchical Clustering
(Figure: hierarchical clustering with sample groups)
The different clustering/projection methods often produce similar results:
      Self-organizing maps
      Principal component analysis
      Self-organizing maps AND principal component analysis
Differential expression
(Two plots of signal intensity per sample, with the samples split into Group 1 and Group 2)

      Not interesting: no response to treatment.
      Candidate genes! Expression levels before and after treatment are different.

Methods:
      FSS (~t-test)
      ANOVA
      SAM
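The idea behind a t-test based ranking can be sketched in Python with Welch's t statistic, computed per gene between the two groups. This illustrates the statistic only. FSS is described above merely as approximately a t-test, so the sketch should not be read as the J-Express implementation, and it does not compute p-values. The expression values are made up.

```python
import math
from statistics import mean, variance

def welch_t(group1, group2):
    """Welch's t statistic for two groups of expression values."""
    se = math.sqrt(variance(group1) / len(group1) +
                   variance(group2) / len(group2))
    return (mean(group1) - mean(group2)) / se

# Made-up signal intensities for Group 1 and Group 2.
genes = {
    "flat gene": ([5.0, 5.2, 4.9, 5.1], [5.1, 4.9, 5.0, 5.2]),
    "candidate gene": ([5.0, 5.2, 4.9, 5.1], [8.0, 8.3, 7.9, 8.1]),
}
scores = {name: welch_t(g1, g2) for name, (g1, g2) in genes.items()}
# Rank genes by the size of the statistic, largest difference first.
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
print(ranked)
```

A large absolute value means the group means differ by much more than the variation within the groups.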
Differential expression - Example
 Find genes that differ in expression levels between brain regions
Gene ontology (GO) analysis
GO:
Describes gene and gene product attributes:
        - the molecular function of gene products,
        - their role in multi-step biological processes and
        - their localisation to cellular components




GO analysis:
Have a list of differentially expressed genes (from e.g. SAM)

Which biological processes do these genes take part in?

Is there an over-representation of the number of genes belonging to a
particular biological process, compared to what could be expected?
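The over-representation question can be sketched in Python as a one-sided hypergeometric test: if the differentially expressed genes were drawn at random from the reference list, how likely is an overlap with a GO term at least as large as the one observed? The choice of this test and all the counts below are assumptions for illustration; the slides do not state which test J-Express uses.

```python
from math import comb

def enrichment_p_value(n_reference, n_in_term, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random."""
    total = comb(n_reference, n_selected)
    upper = min(n_in_term, n_selected)
    tail = sum(comb(n_in_term, k) *
               comb(n_reference - n_in_term, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Made-up counts: 15000 genes on the array, 300 annotated to the GO term,
# 200 differentially expressed genes, 15 of which carry the term.
expected = 200 * 300 / 15000              # overlap expected by chance
p = enrichment_p_value(15000, 300, 200, 15)
print(expected, p)
```

In practice many GO terms are tested at once, so the p-values also need a correction for multiple testing.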
GO - Example
 Find functional gene groups over-represented in a certain brain region



                                Reference list           Enriched genes
Gene set enrichment analysis (GSEA)
 Typical GO analysis:
 Compare two different conditions
 Produce lists with up and down regulated genes
 Use GO annotation to look for over-represented functional groups within the lists

 Problem:
 Which cut-off should we use? Fold change? P-value?
 If only small changes are observed, few single genes may come out as significantly changed using SAM or t-test.

 Solution:
 No cut-off; look at the distribution of gene groups in ranked gene lists
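The idea of looking at the distribution of a gene group in a ranked list can be sketched in Python as a running sum. Walking down the list, the sum goes up at each member of the group and down at each non-member, and the score is the largest deviation from zero. This is a simplified, unweighted variant written for illustration; the permutation step needed to judge significance is left out, and the gene names are made up.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score; positive means enriched near the top."""
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(hits) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for hit in hits:
        # Step up at set members, step down at all other genes.
        running += 1.0 / n_hits if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Genes ranked from most up-regulated to most down-regulated (made up).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
top_set = {"g1", "g2", "g4"}          # concentrated at the top of the list
spread_set = {"g2", "g6", "g9"}       # spread over the whole list
print(enrichment_score(ranked, top_set), enrichment_score(ranked, spread_set))
```

The group concentrated at the top gets a score close to 1, while the spread-out group stays close to 0, without any cut-off on individual genes.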
GSEA - Example
Gene groups enriched in different brain regions
KEGG Pathways
Map region-enriched genes to known pathways
Focus on selections
Cell cycle analysis
Scripting; Using Python with J-Express




                      Adding functionality to the software:
                               - Low level processing
                               - High level analysis
                               - Many example scripts
                                 (annotation, CGH plot,
                                 PCA, histogram etc.)
                  MAGMA
Mini Analysis Guide to Micro Arrays

Guides you through preprocessing and simple
analysis using J-Express Pro

   http://www.microarray.no/magma
Forum
You can also use the J-Express forum, which
can be found here:

http://www.molmine.com/forum


Please use English when writing to this forum.
Acknowledgments
J-Express Developer:
       Bjarte Dysvik

The microarray data analysis group at UIB:
       Inge Jonassen
       Kjell Petersen
       Christine Stansberg
       Anne-Kristin Stavrum

Norwegian Microarray Consortium


Download J-Express from http://www.molmine.com
Send a mail to annes@ii.uib.no to get a license.


Free licenses for Norwegian academic scientists through an agreement
with the NMC

								