Analysis of Telomere Length Distributions

Promoter: Olivier Thas (Dept. Applied Mathematics, Biometrics and Process Control, Coupure Links 653, Building A, Room 69; tel. 09 2645933, email: firstname.lastname@example.org)

Introduction

The nucleoprotein complexes at the termini of linear chromosomes are called telomeres. They consist of long TTAGGG tandem repeats and are essential for the maintenance of chromosomal integrity. In normal somatic cells, telomeres shorten with each cell division, but other factors have a negative influence on their length as well. Critically shortened telomeres induce cell proliferation arrest. Tumor cells and other immortal cells compensate for this erosion by activating the telomerase enzyme or by alternative mechanisms of telomere lengthening. Telomere length is therefore considered a marker for ‘biological age’.

Human telomere lengths are measured by first isolating DNA from a sample of cells. Since telomere lengths vary between the DNA of different cells, and since data can only be obtained from a large sample of cells, the telomere length data arise as a distribution of lengths. Many studies have shown that the mean telomere length decreases with age, but researchers at the Department of Molecular Biotechnology also believe that the skewness of the distribution changes with age. The goal of this thesis project is to build a model that describes the skewness of the telomere length distribution as a function of age.

Many studies have suggested that telomere lengths are lognormally distributed. However, this strong parametric assumption implies that the skewness cannot be modelled independently from the mean. Moreover, recent data have led researchers to believe that this distributional assumption does not hold, particularly with respect to the skewness. Clearly, a more flexible or nonparametric model is needed here. Inference in these models will be based on empirical likelihood theory.
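
To illustrate why the lognormal assumption is restrictive, the following minimal sketch uses the standard moment formulas of the lognormal distribution. It shows that, under this assumption, the skewness is fully determined once the mean and the standard deviation are fixed, so it cannot vary freely with age. The function names and the parameter values are hypothetical and serve only as an illustration; they are not part of the project description.

```python
import math


def lognormal_moments(mu, sigma):
    """Mean, variance and skewness of a lognormal distribution.

    mu and sigma are the mean and standard deviation of log(length).
    """
    w = math.exp(sigma ** 2)
    mean = math.exp(mu + sigma ** 2 / 2)
    variance = (w - 1) * math.exp(2 * mu + sigma ** 2)
    skewness = (w + 2) * math.sqrt(w - 1)
    return mean, variance, skewness


def skewness_from_mean_sd(mean, sd):
    """Skewness implied by the lognormal assumption, given mean and sd.

    With cv = sd / mean, the lognormal skewness equals cv**3 + 3 * cv,
    so no freedom is left once the first two moments are known.
    """
    cv = sd / mean
    return cv ** 3 + 3 * cv


# Illustrative (made-up) parameter values on the log scale.
mean, variance, skewness = lognormal_moments(mu=2.0, sigma=0.5)
implied = skewness_from_mean_sd(mean, math.sqrt(variance))
print(skewness, implied)  # the two values agree up to rounding error
```

In particular, two age groups with the same coefficient of variation would be forced to have the same skewness under the lognormal assumption, whatever their means. A more flexible model removes this constraint.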
The data analysis and the construction of the regression model become even more complex because of the nature of the raw data. Telomere length distributions are derived from measurements of optical density (measurements of radioactively labelled DNA fragments on a Southern blot), and these densities are disturbed by a small background signal. Unfortunately, most methods to eliminate the background noise are based on the parametric assumption of lognormality. Thus, part of the work consists in applying alternative statistical methods to filter the data. No prior biological knowledge is necessary.