Drosophila Microarray Analysis

RNA isolation, Hybridization, Normalization, DE analysis

Eye imaginal disc total RNA was isolated with TRIzol reagent (Invitrogen) and further purified using an RNeasy kit (Qiagen). All samples were processed in accordance with the Affymetrix protocol (Affymetrix expression manual), and a total of 15 µg of fragmented and labeled cRNA was hybridized to the Affymetrix GeneChip arrays (Drosophila genome 385K 2.0). The chips were then washed and stained using an Affymetrix Fluidics Station 450, and fluorescence was detected using the Affymetrix GS3000.

There were three biological replicates for each genotype. A total of 12 Affymetrix raw files (*.CEL files) were background corrected and normalized using the R-Bioconductor (Gentleman et al. 2004) package “affy” (Irizarry et al. 2003a; Gautier et al. 2004). Following normalization, the statistical technique RMA (Robust Multi-chip Average Method) (Irizarry et al. 2003a; Irizarry et al. 2003b) was used to estimate gene expression. The Bioconductor package “arrayQualityMetrics” (Kauffmann et al. 2009) was used to assess the quality and variability of the microarray experiment.

For differential expression (DE) analysis of the preprocessed data set, the Bioconductor package “limma” (Smyth 2004), which fits a linear model with an empirical Bayes method, was used to calculate the log2 fold change (logFC). The fold change (FC) is defined as the ratio of intensities between the two experimental conditions under comparison. Because each condition corresponded to a population of RNA comprising three biological replicates, the ratio of the average intensities across replicates was calculated. The package computes a t-test p-value for each gene and then a p-value corrected for multiple testing (FDR) (Benjamini and Hochberg 1995) for each gene under the two experimental conditions.
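As a rough illustration of the fold-change and FDR calculations described above, the following Python sketch computes a log2 fold change from replicate averages and applies the Benjamini-Hochberg adjustment to a list of p-values. It is not the limma implementation: limma fits a linear model and uses moderated t-statistics, neither of which is reproduced here. The function names and example values are hypothetical, and the sketch assumes the expression values are already on the log2 scale (as RMA output is), so the logFC is taken as the difference of the replicate means.

```python
from statistics import mean


def log2_fold_change(condition_log2, reference_log2):
    """Difference of mean log2 expression between two groups of replicates.

    Assumption: inputs are already log2-transformed (e.g. RMA values).
    """
    return mean(condition_log2) - mean(reference_log2)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: one gene, three replicates per condition.
log_fc = log2_fold_change([8.0, 8.5, 9.0], [7.5, 8.0, 8.5])
fold = 2 ** log_fc  # a logFC of 0.5 corresponds to roughly a 1.41-fold change

# Hypothetical raw p-values for four genes.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice the adjusted values would come directly from limma; the sketch only shows how the two quantities are defined.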
For all comparisons, a gene was considered differentially expressed if it had an FDR p-value < 0.05 (irrespective of the logFC). A more stringent filter of |logFC| > 0.5 (approximately a 1.4-fold or greater increase or decrease) was then applied to all comparisons to generate a more refined set of DE genes. To annotate probes with the corresponding Ensembl gene ID, gene symbol and description, we used the annotation files from the Affymetrix website (http://www.affymetrix.com/support/technical/annotationfilesmain.affx), Biomart, Ensembl v.55 (Drosophila melanogaster genes; BDGP 5.4) (Hubbard et al. 2007) and the FlyBase database (Drysdale 2008).

Enrichment Analysis (EA) of Pathways

Functional annotation of genes is based on Gene Ontology (GO Consortium, 2006). EA was performed using Gitools (http://www.gitools.org) to identify processes that might be enriched among up- or down-regulated genes. Statistical significance was assessed using the binomial distribution, with the p-value calculated as:

$$p\text{-value} = \sum_{k=x}^{n} \binom{n}{k} p^{k} (1-p)^{n-k}$$

where:
n = total number of genes in the category
x = number of differentially expressed genes in the category
p = frequency of upregulated or downregulated genes

The resulting p-values were adjusted for multiple testing using the Benjamini and Hochberg False Discovery Rate (FDR) method (Benjamini and Hochberg 1995).

Identification of putative Sd and dE2F1 binding sites for DE genes

We determined whether the DE genes found only in rbf/wt, warts/wt, or rbf wts (DM)/wt contain putative binding sites for Sd, dE2F1, or both Sd and dE2F1. To search for Sd binding sites in the promoter regions of these genes (600 bp upstream and 100 bp downstream of the transcription start site (TSS)), the TRANSFAC database (Release 2009.1) (Matys et al. 2003) position frequency matrices (PFMs) for Drosophila (insect matrices) were used. The STORM algorithm (Schones et al. 2007) was used to scan the sequences for Sd binding sites represented in the TRANSFAC PFMs.
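To make the scanning step more concrete, here is a minimal Python sketch of scoring a sequence against a PFM on both strands. It is not the STORM implementation, which derives its score cut-offs from statistical significance; this sketch simply applies a fixed log-odds threshold. The matrix, threshold, pseudocount, uniform background and function names are all hypothetical, and the toy matrix is not a TRANSFAC entry.

```python
import math

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pfm_to_log_odds(pfm, background=0.25, pseudocount=0.5):
    """Turn per-position base counts into log2 odds scores.

    Assumptions: uniform background and a fixed pseudocount per base.
    """
    pwm = []
    for column in pfm:
        total = sum(column[base] for base in BASES) + pseudocount * len(BASES)
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in BASES
        })
    return pwm


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan(seq, pwm, threshold):
    """Return (offset, strand, score) for each window scoring >= threshold."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = sum(pwm[i][base] for i, base in enumerate(site))
            if score >= threshold:
                hits.append((start, strand, score))
    return hits


# Toy 4-column matrix with hypothetical counts (consensus GTAT).
TOY_PFM = [
    {"A": 0, "C": 0, "G": 20, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 20},
    {"A": 20, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 20},
]

toy_pwm = pfm_to_log_odds(TOY_PFM)
forward_hits = scan("CCGTATCC", toy_pwm, threshold=5.0)
reverse_hits = scan("GGATACGG", toy_pwm, threshold=5.0)
```

Offsets are 0-based positions within the scanned sequence; for a promoter window that starts 600 bp upstream of the TSS they would need to be shifted to obtain TSS-relative coordinates.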
Since the TRANSFAC database does not offer a PFM for a Drosophila (or insect) specific E2F transcription factor, we used the ChIP-on-chip data of Xu et al. (2007) for E2F1, E2F4 and E2F6 in five different human cell lines, including normal cells. We then used the Drosophila orthologs of these human E2F target genes (Biomart, Ensembl v.55; Homo sapiens genes GRCh37; Drosophila melanogaster genes BDGP 5.4; Hubbard et al. 2007) to identify which DE genes from our analysis were among the ortholog targets. All Drosophila ortholog targets of human E2F1, E2F4 and E2F6 were considered putative Drosophila dE2F1 targets. Following this, an examination of common targets determined which DE genes contained putative binding sites for both Sd and dE2F1.

Hybridization was performed by the Functional Genomics Facility at the University of Chicago, and raw data analysis was performed by AI and NLB at the University of Pompeu Fabra.

Cloning of endogenous luciferase reporters

A minimal promoter taken from the heat shock protein-70 (hsp70) gene was cloned between the HindIII and BglII sites of the pGL3-basic luciferase vector (Promega). For each reporter, the following fragments were PCR amplified from genomic DNA isolated from Canton S flies and then subcloned (in the sense orientation) upstream of the hsp70 minimal promoter in the pGL3-basic luciferase vector [all position references are with respect to the transcriptional start site of each gene as annotated by FlyBase version 2010_04].

Gene                      Fragment            Sites subcloned in between
Cdc2c                     -620bp to +140bp    MluI-XhoI
dDP                       -701bp to +119bp    MluI-XhoI
Ex2kb                     -2kb to -1bp        MluI-XhoI
Ex1kb                     -1kb to +1kb        MluI-XhoI
CycB3                     -640bp to +120bp    KpnI-NheI
Dachs                     -560bp to +220bp    KpnI-NheI
DNA polymerase ε          -680bp to +280bp    KpnI-NheI
Mcm2                      -620bp to +101bp    KpnI-NheI
Mcm3                      -680bp to +149bp    KpnI-NheI
Mcm10                     -600bp to +159bp    KpnI-NheI
DNA polymerase ε 553      -680bp to -127bp    KpnI-NheI
DNA polymerase ε SdE2F    -427bp to -127bp    KpnI-NheI
dDP 709                   -701bp to +8bp      KpnI-NheI
dDP 273                   -428bp to +119bp    KpnI-NheI
dDP SdE2F                 -428bp to +8bp      KpnI-NheI

Following sequencing analysis, each plasmid was then used as described in the paper. Primer sequences used in the cloning and plasmid maps are available upon request.

References:

Benjamini, Y. and Hochberg, Y. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57: 289-300.
Drysdale, R. 2008. FlyBase: a database for the Drosophila research community. Methods Mol Biol 420: 45-59.
Gautier, L., Cope, L., Bolstad, B.M., and Irizarry, R.A. 2004. affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20(3): 307-315.
Gentleman, R.C., Carey, V.J., Bates, D.M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., et al. 2004. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5(10): R80.
GO Consortium. 2006. The Gene Ontology (GO) project in 2006. Nucleic Acids Res 34(Database issue): D322-D326.
Hubbard, T.J., Aken, B.L., Beal, K., Ballester, B., Caccamo, M., Chen, Y., Clarke, L., Coates, G., Cunningham, F., Cutts, T., et al. 2007. Ensembl 2007. Nucleic Acids Res 35(Database issue): D610-617.
Irizarry, R.A., Bolstad, B.M., Collin, F., Cope, L.M., Hobbs, B., and Speed, T.P. 2003a. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31(4): e15.
Irizarry, R.A., Hobbs, B., Collin, F., Beazer-Barclay, Y.D., Antonellis, K.J., Scherf, U., and Speed, T.P. 2003b. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4(2): 249-264.
Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., Katayama, T., Kawashima, S., Okuda, S., Tokimatsu, T., and Yamanishi, Y. 2008. KEGG for linking genomes to life and the environment. Nucleic Acids Res 36(Database issue): D480-484.
Kauffmann, A., Gentleman, R., and Huber, W. 2009. arrayQualityMetrics--a bioconductor package for quality assessment of microarray data. Bioinformatics 25(3): 415-416.
Matys, V., Fricke, E., Geffers, R., Gossling, E., Haubrock, M., Hehl, R., Hornischer, K., Karas, D., Kel, A.E., Kel-Margoulis, O.V., et al. 2003. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res 31(1): 374-378.
Schones, D.E., Smith, A.D., and Zhang, M.Q. 2007. Statistical significance of cis-regulatory modules. BMC Bioinformatics 8: 19.
Smyth, G.K. 2004. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 3: Article3.
Xu, X., Bieda, M., Jin, V.X., Rabinovich, A., Oberley, M.J., Green, R., and Farnham, P.J. 2007. A comprehensive ChIP-chip analysis of E2F1, E2F4, and E2F6 in normal and tumor cells reveals interchangeable roles of E2F family members. Genome Res 17(11): 1550-1561.