Data analysis of gene expression data

HELSINKI UNIVERSITY OF TECHNOLOGY
LABORATORY OF COMPUTER AND INFORMATION SCIENCE
FROM DATA TO KNOWLEDGE

Jaakko Hollmén

Personnel
• Jaakko Hollmén, Heikki Mannila
• Graduate students (3): Jouni Seppänen, Salla Ruosaari, Anne Patrikainen
• Undergraduate students (2): Mikko Katajamaa, Antti Rasinen

Gene expression data
• State of protein production
• Tissue to RNA to hybridized arrays
• High-dimensional, noisy measurement data matrices
• 500-10000 simultaneous measurements from an organism

Research scope
• Goal: advances in data analysis, with a specific focus on analyzing gene expression data
• High-dimensional, noisy measurement data matrices
• Signal decomposition and projection methods (PCA, ICA, NMF, ...), MCMC, and pattern discovery methods

Understanding measurements
[Diagram labels: Source levels, Simulation model, Image analysis, Normalization, Data analysis, Verify results]
• Simulation model for gene expression data
• To understand measurements and their analysis

Closer look at the real world
[Figure]

Expression data as numbers
[Figure: a dense matrix of four-decimal values between 0 and 1, beginning 0.8214 0.4447 0.6154 0.7919 ...]

Quality control at spot level
• Choose good quality spots for subsequent analysis
• Image analysis, detection and cost-sensitive classification

Collaboration with biologists
• Department of Medical Genetics, Lab. of Cytomolecular Genetics, U. of Helsinki
• Institute of Occupational Health
• Turku Centre for Biotechnology
• Karolinska Institutet
• Journal articles during 2002:
  Wikman et al., Identification of differentially expressed genes in pulmonary adenocarcinoma by using a cDNA array. Oncogene 21(37), 2002, Nature Publishing Group.
  Niini et al., Expression of myeloid-specific genes in childhood acute lymphoblastic leukemia: cDNA array study. Leukemia 16(11), 2002, Nature Publishing Group.
  Mannila et al., Long-range control of gene expression in yeast. Bioinformatics 18(3), 2002.

Current topics and further work
• Correlation between gene expression and gene location in the genome
• Combinations with sequence information
• Time-series analysis, decompositions
• Sparse decompositions of data matrices
• MCMC techniques
• Pattern discovery methods
• Etc.
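
Sketch: simulating and normalizing measurements
The "Understanding measurements" slide names a chain from source levels through a simulation model and normalization to verified results. The following is a minimal sketch of that idea, not the group's actual model. It assumes multiplicative log-normal noise, a single array-wide gain, and median centering of log2 values as the normalization; all function names and parameter values (simulate_array, median_normalize, the gains 1.0 and 3.0) are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def simulate_array(source_levels, gain, noise_sd, rng):
    """Simulate one array: measured = gain * level * log-normal noise (assumed model)."""
    return [gain * level * math.exp(rng.gauss(0.0, noise_sd))
            for level in source_levels]


def median_normalize(measured):
    """Log2-transform the values and subtract the array median to remove the gain."""
    logs = [math.log2(v) for v in measured]
    med = statistics.median(logs)
    return [v - med for v in logs]


rng = random.Random(0)
n_genes = 1000

# Hypothetical "true" source levels, known only because this is a simulation.
source = [math.exp(rng.gauss(5.0, 1.0)) for _ in range(n_genes)]

# Two arrays measuring the same sources with different array-wide gains.
array_a = simulate_array(source, gain=1.0, noise_sd=0.1, rng=rng)
array_b = simulate_array(source, gain=3.0, noise_sd=0.1, rng=rng)

norm_a = median_normalize(array_a)
norm_b = median_normalize(array_b)

# Verify results: the mean log2 gap between arrays before and after normalization.
raw_gap = statistics.mean(
    [math.log2(b) - math.log2(a) for a, b in zip(array_a, array_b)])
norm_gap = statistics.mean([b - a for a, b in zip(norm_a, norm_b)])

print("mean log2 gap before normalization:", round(raw_gap, 3))
print("mean log2 gap after normalization: ", round(norm_gap, 3))
```

Because the source levels are known in a simulation, the effect of each processing step can be checked directly: here the gap between the two arrays before normalization should sit near log2(3), the simulated gain ratio, and shrink toward zero afterwards.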
