
									The Genome Access Course



                           Gene Set Enrichment and
                             Pathways Analysis




            A means to quickly understand the biological
            context of one or more genes
                                                           November 2007




        Applications for Pathways Analysis

        • Microarray data
          – e.g., GenMAPP




        • One or more QTL

          [Figure: QTL on Chr. 1 and Chr. 5]




                     Pathway Types

                     • Biochemical / Metabolic

                     • Signaling and Cellular Processes
                        – Protein Interaction
                              • Receptor + ligand
                              • Phosphorylation
                           – Gene Regulation
                              • Transcription factor


                     • Gene Networks
                        – Combination of the above


                           Boehringer Mannheim “Wall Charts”








                     Boehringer Mannheim Wall Charts

                     • Metabolic Pathways
                     • Cellular and Molecular Processes


    Searchable at http://www.expasy.org
    (click on “Roche Applied Science’s Biochemical Pathways”)






           Major Public Pathway Resources
           • KEGG
           • The Reactome Project (CSHL/EBI)
           • NetPath (HPRD)
           • Pathway Interaction Database (NCI/Nature)
           • Protein Interaction Databases
                – BioGRID
                – MIPS MPPI
                – BIND
                – DIP
           • BioCarta
           • Complete listing at www.pathguide.org

           Related tools:
           • DAVID
           • GenMAPP
           • KEGG Array




         Kyoto Encyclopedia of Genes and Genomes (KEGG)

         1. Metabolism
             • Carbohydrate, Energy, Lipid, Nucleotide, Amino acid,
               Other amino acid, Glycan, PK/NRP (polyketides and
               nonribosomal peptides), Cofactor/vitamin,
               Secondary metabolite, Xenobiotics
         2. Genetic Information Processing
         3. Environmental Information Processing
         4. Cellular Processes
         5. Human Diseases
         6. Drug Development

         http://www.genome.jp/kegg




            KEGG




            Scroll down to view pathways








                 KEGG




                 Scroll down to view disease pathways







                 KEGG: Type II diabetes mellitus








           Coloring KEGG Pathways
                   STEP 1: Enter data (fold change, position)
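
A minimal sketch of preparing data for Step 1, assuming each gene comes with a fold-change ratio and that the coloring tool accepts one gene identifier and one color per line. That line layout is an assumption to check against the tool's own input instructions, and the function name, gene identifiers, and values below are hypothetical placeholders.

```python
import math


def fold_change_to_hex(fold_change, cap=4.0):
    """Map a fold-change ratio (> 0) to a hex color.

    Red means up-regulated, green means down-regulated, white means no change.
    Changes at or beyond `cap`-fold (either direction) get the full color.
    """
    log2fc = math.log2(fold_change)
    # Scale so that a `cap`-fold change maps to +/-1, then clamp.
    scaled = max(-1.0, min(1.0, log2fc / math.log2(cap)))
    fade = int(round(255 * (1 - abs(scaled))))
    if scaled >= 0:
        return "#ff{0:02x}{0:02x}".format(fade)
    return "#{0:02x}ff{0:02x}".format(fade)


# Hypothetical gene identifiers and fold changes, for illustration only.
fold_changes = {"geneA": 4.0, "geneB": 1.0, "geneC": 0.25}

for gene_id, fc in fold_changes.items():
    # Assumed layout: identifier, a space, then the color.
    print(gene_id, fold_change_to_hex(fc))
```

Working on the log2 scale keeps a 4-fold increase and a 4-fold decrease equally intense.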








          Coloring KEGG Pathways
                  STEP 2: Pick pathway (Type II diabetes …)







               Coloring KEGG Pathways







                     The Reactome Project








                     NetPath (HPRD)

                     Part of Human Protein Reference Database





                     NetPath (HPRD)

                     Scroll down to get protein list



    Pathway Interaction Database






                           Wnt Signaling Pathway








                     NCBI Gene (Entrez Gene)







    Analysis of Gene Lists

    Functional Annotation
    • Are particular pathways over-represented?
    • Are any genes associated with disease (OMIM)?
    • DAVID Bioinformatics Tool at NIH
       – Advantage: Ease of use
       – Disadvantage: Does not consider rank of genes

    Gene Set Enrichment Analysis
    • Given a ranked list of genes:
       – Do the genes in a predefined set (e.g., disease-associated
         genes) cluster in a particular portion of the list?
    • GSEA at Broad Institute
       – Advantages:
               • Considers rank of genes in list
               • Several unique curated data sets (MSigDB)
       – Disadvantage: Difficult to use
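
The ranked-list question can be illustrated with a simplified, unweighted running-sum statistic. This is a sketch only: it is not the Broad Institute GSEA implementation, which among other things weights genes and assesses significance by permutation. The function name and gene labels are hypothetical.

```python
def running_sum_enrichment(ranked_genes, gene_set):
    """Return the largest deviation from zero of a running sum over the list.

    Walking down the ranked list, the sum steps up for each gene in the set
    and down for each gene outside it. A value near +1 means the set is
    concentrated at the top of the list; near -1, at the bottom.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("need at least one gene inside and one outside the set")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list (most to least differentially expressed).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(running_sum_enrichment(ranked, {"g1", "g2"}))  # set at the top
print(running_sum_enrichment(ranked, {"g5", "g6"}))  # set at the bottom
```

Unlike a simple over-representation test, the score changes when the same genes move up or down the list, which is the property the slide credits to GSEA.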




                     DAVID Bioinformatics (NIAID)
                     • Functional Annotation
                           – Maps a gene list to:
                               • KEGG
                               • Gene Ontology
                               • … and over 40 other resources …
                           – Enrichment analysis
                               • Tests whether the number of genes with a particular
                                 annotation is significantly higher than expected
                                 from the background
                           – Functional annotation clustering
                               • Sets of genes with common annotations
                     • Functional Classification
                     • Gene ID Conversion
                     • Gene Name Batch Viewer





                     DAVID Functional Annotation








                 DAVID Functional Annotation




                 Click on symbols to expand







                     DAVID Functional Annotation




                                  Click on Button








                     Enriched Pathway








                     Probability Calculation
                     • Enrichment analysis
                            – For example, our input list of 300 genes has 3
                              members of the p53 signaling pathway. Is that
                              significant, given that 40 of the 30,000 genes in
                              the genome are part of the pathway?

                                             Input List     Genome
                           In Pathway             3             40
                           Not In Pathway       297         29,960

                     • Fisher’s Exact Test
                            – p = 0.008
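
A minimal sketch of this calculation, assuming the question is posed as a one-sided hypergeometric test (the distribution underlying Fisher's exact test) with a genome of 30,000 genes, the sum of the Genome column. The function name is hypothetical, and DAVID's own scoring may differ, so the result need not match the slide's value exactly.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the genome, K: genes in the pathway,
    n: genes in the input list, k: pathway genes observed in the list.
    """
    total = comb(N, n)
    upper = min(n, K)
    # Exact integer arithmetic; converted to a float only at the end.
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Numbers from the example: 3 of 300 input genes, 40 of 30,000 genome genes.
p_value = hypergeom_upper_tail(k=3, n=300, K=40, N=30_000)
print(f"p = {p_value:.4f}")
```

By chance alone the list would be expected to contain 300 × 40 / 30,000 = 0.4 pathway genes, so observing 3 is unusual; the computed tail probability is of the same order as the value on the slide.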







           KEGG Pathway With Genes Flagged








                    Mappings to Pathways




                                           Click on Bar








        Which Genes Are In Which Pathways







                           Gene Ontology Chart








              Functional Annotation Clustering








                      Annotation Clusters








                     Summary
                     • Pathways analysis
                           – Quickly understand the biological context for
                             a list of genes (microarray, QTL, etc.)
                     • Public resources
                           – Growing rapidly
                           – Ability to “paint” diagrams
                     • DAVID Bioinformatics
                           – Powerful tool for mapping genes to many
                             annotation resources

