Project 3: Cluster analysis of gene expression data

Statistical methods for the analysis of gene expression data are widely used in cDNA microarray experiments. In Dudoit et al. (2002), a cDNA experiment was performed to identify genes with altered expression in two mouse models with very low HDL cholesterol levels (apolipoprotein AI knockout mice and scavenger receptor BI transgenic mice) compared to inbred control mice. In addition to identifying significantly differentially expressed genes with the R package SMA, participants in this project are expected to apply the clustering methods described in Eisen et al. (1998) (hierarchical clustering, self-organizing maps (SOMs), k-means clustering, principal component analysis, and suitable model-based clustering methods) to group those genes.

The project will also consider cluster analysis of time-course gene expression data (Cho et al. (1998), Spellman et al. (1998)), which identifies subsets of genes that behave similarly over time under a given set of experimental conditions. Participants are encouraged to use public microarray and gene expression databases, such as NCBI's Gene Expression Omnibus (GEO) and the Stanford Microarray Database (SMD), and to develop their own model-based clustering algorithms.

References:

Spellman, P.T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D., and Futcher, B. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 9(12): 3273-3297.

Cho, R., Campbell, M.J., Winzeler, E.A., Steinmetz, L., Conway, A., Wodicka, L., Wolfsberg, T.G., Gabrielian, A.E., Landsman, D., Lockhart, D.J., and Davis, R.W. (1998). A genome-wide transcriptional analysis of the mitotic cell cycle. Molecular Cell 2: 65-73.

Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95(25): 14863-14868.

Dudoit, S., Yang, Y.H., Callow, M.J., and Speed, T.P. (2002). Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica 12(1): 111-139.
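
As a possible starting point for the hierarchical clustering step named in the project description, the following is a minimal sketch of average-linkage agglomerative clustering, using one minus the Pearson correlation as the distance between expression profiles. The gene names and log-ratio values are hypothetical and chosen only for illustration, and the function names (pearson, average_linkage) are not taken from any package mentioned above. For real data sets, an established library implementation would likely be preferable.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    profiles maps a gene name to its list of expression values.
    Distance between genes is 1 - Pearson correlation; distance between
    clusters is the mean of all pairwise gene distances (average linkage).
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # combinations yields i < j, so deleting index j leaves i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical log-ratio profiles over five time points.
profiles = {
    "gene_a": [0.1, 0.8, 1.5, 0.9, 0.2],
    "gene_b": [0.0, 0.9, 1.4, 1.0, 0.1],
    "gene_c": [1.2, 0.5, -0.3, 0.4, 1.1],
    "gene_d": [1.0, 0.4, -0.2, 0.5, 1.3],
}

clusters = average_linkage(profiles, n_clusters=2)
for members in clusters:
    print(members)
```

In this toy example, gene_a and gene_b peak mid-course while gene_c and gene_d dip mid-course, so a correlation-based distance separates the two patterns regardless of their absolute expression levels.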