
									                      Genome

                                                        Lesk,
                               Introduction to Bioinformatics,
                                                     Chapter 2




Michael Schroeder
BioTechnological Center
TU Dresden
             Organisms and cells
 All organisms consist of small cells
    Human body has approx. 6 x 10^13 cells of about 320
     different types
 Cell size can vary greatly
    Human red blood cell: 5 microns (0.005 mm)
    Neuron from spinal cord: 1 m long
 Types of organisms
    Prokaryotes - bacteria, for example
    Eukaryotes - most other organisms
    Archaea - few organisms, living in hostile
     environments
   By Michael Schroeder, Biotec, 2004              2
                      Genomes and Genes:
                              Not all DNA codes for genes

Organism                    Number of bp     Genes    Notes
ΦX-174                             5,386        10    Virus infecting E. coli
Human mitochondrion               16,569        37    Subcellular organelle
Mycoplasma pneumoniae            816,394       680    Pneumonia
Mycoplasma laboratorium                        382    Minimal genome project
Haemophilus influenzae         1,830,138     1,738    Middle ear infection
E. coli                        4,639,221     4,406
Saccharomyces cerevisiae     12.1 x 10^6     5,885    Yeast
C. elegans                   95.5 x 10^6    19,099    Worm
Drosophila melanogaster       1.8 x 10^8    13,601    Fruit fly
H. sapiens                    3.2 x 10^9    22,333    Human
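
The slide title's point, that not all DNA codes for genes, can be made concrete by dividing gene count by genome size. The sketch below uses a subset of the figures from the table; the function name `genes_per_mbp` is only illustrative.

```python
# Genome size (bp) and gene count, copied from the table above.
GENOMES = {
    "Human mitochondrion": (16_569, 37),
    "Mycoplasma pneumoniae": (816_394, 680),
    "E. coli": (4_639_221, 4_406),
    "Saccharomyces cerevisiae": (12.1e6, 5_885),
    "C. elegans": (95.5e6, 19_099),
    "Drosophila melanogaster": (1.8e8, 13_601),
    "H. sapiens": (3.2e9, 22_333),
}


def genes_per_mbp(bp, genes):
    """Gene density: number of genes per million base pairs."""
    return genes / (bp / 1e6)


for name, (bp, genes) in GENOMES.items():
    print(f"{name:26s} {genes_per_mbp(bp, genes):10.1f} genes/Mbp")
```

With these numbers the density drops by roughly two orders of magnitude from E. coli to human, even though the human gene count is only about five times larger.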




                Genetic information
 Genes, as discovered by Mendel, were entirely abstract
  entities
 Chromosomes are physical entities; their banding
  patterns serve as landmarks
    Chromosomes are numbered by size (1 = largest)
    Human chromosome arms: p (petite = short) and q (queue),
     e.g. 15q11.1

 DNA sequences = hereditary information in physical
  form
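
A band location such as 15q11.1 reads as chromosome 15, long arm (q), region 1, band 1, sub-band 1. The following is a minimal sketch of splitting such a string into its parts; the pattern and the function name `parse_band` are illustrative and cover only simple locations of this form, not ranges or the full cytogenetic nomenclature.

```python
import re

# chromosome (1-22, X, Y), arm (p or q), one region digit, one band digit,
# and an optional sub-band after the dot.
BAND_PATTERN = re.compile(
    r"^(?P<chromosome>1[0-9]|2[0-2]|[1-9]|X|Y)"
    r"(?P<arm>[pq])"
    r"(?P<region>\d)(?P<band>\d)"
    r"(?:\.(?P<sub_band>\d+))?$"
)

ARM_NAMES = {"p": "short", "q": "long"}


def parse_band(location):
    """Split a location like '15q11.1' into its named components."""
    match = BAND_PATTERN.match(location)
    if match is None:
        raise ValueError(f"not a simple band location: {location!r}")
    parts = match.groupdict()
    parts["arm_name"] = ARM_NAMES[parts["arm"]]
    return parts


print(parse_band("15q11.1"))
```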


                        Locating genes
 The disease cystic fibrosis has been known since the
  Middle Ages; the relevant protein was not
 Folklore: "Children with excessive salt in their sweat -
  noticeable when kissing them on the forehead - were
  short-lived"
 Implication: a chloride channel in epithelial tissues
 Searches in family pedigrees identified various genetic
  markers (Variable Number Tandem Repeats), which
  narrowed the genomic region from 1-2 million bp to
  300 kb
 Finally, the deletion of Phe508 in the CFTR gene was
  identified as the cause

                      Chromosome




Chromosome banding pattern map




          2 Types of Maps: Physical Map
      Genome sequencing projects supply the DNA
       sequence of each chromosome
      The physical distance is the number of base
       pairs that separate two genes
[Figure: chromosome with Gene A and Gene B marked on a base-pair scale running from 0 to 180 Mbp; sequence excerpt …ACTGTATGACTGGCATGGCACTGGGGCAAATGTGCACTC…]
C. Voigt, S. Ibrahim, S. Möller, P. Serrano Fernández. Non-linear map conversions. German Conference on Bioinformatics, 2003
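
The physical distance defined on this slide is a simple coordinate subtraction once both genes are placed on the chromosome sequence. Below is a minimal sketch, assuming 1-based inclusive coordinates; the `Gene` record and the example positions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chromosome: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def physical_distance_bp(a, b):
    """Number of base pairs lying between two genes on one chromosome."""
    if a.chromosome != b.chromosome:
        raise ValueError("physical distance is only defined within one chromosome")
    first, second = sorted((a, b), key=lambda g: g.start)
    # Overlapping or adjacent genes are separated by zero base pairs.
    return max(0, second.start - first.end - 1)


# Hypothetical positions, for illustration only.
gene_a = Gene("Gene A", "12", 5_000_000, 5_020_000)
gene_b = Gene("Gene B", "12", 100_000_000, 100_050_000)
print(physical_distance_bp(gene_a, gene_b))
```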
           2 Types of Maps: Genetic Map
 • Chromosomes are carriers of genetic information
 • Genetic information is linked and linearly arranged inside the
   chromosome
 • This linkage is sometimes broken: recombination (crossing-over)




         2 Types of Maps: Genetic Maps
       Genes located far from each other are more likely to be uncoupled
        during a crossing-over

        A Morgan is the genetic distance over which 1
        crossing-over is expected to occur per meiosis;
        1 centimorgan (cM) = 0.01 Morgan
[Figure: genetic map of a chromosome on a scale running from 0 to 110 cM]
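
Map distance counts expected crossovers, whereas experiments observe the fraction of recombinant offspring, which cannot exceed 0.5. The slides do not name a mapping function; the sketch below assumes Haldane's, which treats crossovers as independent events (no interference). Function names are illustrative.

```python
import math


def haldane_recombination_fraction(distance_morgans):
    """Recombination fraction expected at a given map distance (Haldane)."""
    if distance_morgans < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


def haldane_map_distance(recombination_fraction):
    """Map distance in Morgans for an observed recombination fraction (Haldane)."""
    if not 0.0 <= recombination_fraction < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * recombination_fraction)


for centimorgans in (1, 10, 50, 100):
    r = haldane_recombination_fraction(centimorgans / 100.0)
    print(f"{centimorgans:4d} cM -> recombination fraction {r:.4f}")
```

For short distances the two quantities nearly coincide (1 cM is close to 1% recombinants); for genes far apart the fraction levels off towards 0.5, which is why distant genes behave as if unlinked.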
                           Why 2 Types of Maps?


              Historical background

            Genetic markers may be mapped in only
             one system (conversions needed)
            Genetic markers may be ambiguous
            Different systems provide us with
             complementary information (not completely
             redundant)
                    Expected Map Conversion



[Figure: expected map conversion, physical position (bps) against genetic position (cM): a linear relationship]
                   Observed Map Conversion
           Non-linear relationship (Yu A, et al. 2001.
            Nature, 409:951-3)
           Outliers
           Marker ambiguity
           Local marker density
           Inversions
[Figure: human chromosome 12, physical position (bps) against genetic position (cM), with the linear relationship shown for comparison; map units bps / cM / cR]
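
One simple way to convert between the two maps when the relationship is not linear is to interpolate between markers whose position is known in both systems. The sketch below compares that with a single chromosome-wide scale factor. It is not the method of the cited paper, and the marker positions are hypothetical.

```python
from bisect import bisect_right

# Hypothetical markers mapped in both systems: (position in bp, position in cM),
# sorted by strictly increasing bp.
MARKERS = [(0, 0.0), (10_000_000, 5.0), (20_000_000, 30.0), (40_000_000, 40.0)]


def linear_bp_to_cm(bp, markers):
    """Expected conversion: one constant cM-per-bp factor for the chromosome."""
    (x0, y0), (x1, y1) = markers[0], markers[-1]
    return y0 + (y1 - y0) * (bp - x0) / (x1 - x0)


def piecewise_bp_to_cm(bp, markers):
    """Conversion by linear interpolation between the two flanking markers."""
    positions = [x for x, _ in markers]
    if not positions[0] <= bp <= positions[-1]:
        raise ValueError("position lies outside the mapped region")
    i = min(bisect_right(positions, bp) - 1, len(markers) - 2)
    (x0, y0), (x1, y1) = markers[i], markers[i + 1]
    return y0 + (y1 - y0) * (bp - x0) / (x1 - x0)


for bp in (5_000_000, 15_000_000, 30_000_000):
    print(bp, linear_bp_to_cm(bp, MARKERS), piecewise_bp_to_cm(bp, MARKERS))
```

Outliers, ambiguous markers, and inversions would violate the sorted-marker assumption and need to be filtered before interpolating.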
                                    General Properties
                      Gene density and recombination

            Recombination tends to be higher in regions with a high gene
             density.



[Figure: human chromosome 12, bps against cM; regions of high gene density coincide with high recombination. Yao, et al. (2002) Proc Natl Acad Sci 99(9):6157]
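
The claim can be checked interval by interval: the slope of the genetic map against the physical map gives a local recombination rate in cM per Mb, which can be set beside the gene density of the same interval. This is a minimal sketch with hypothetical marker and gene positions, not data from the cited study.

```python
# Hypothetical markers with both map positions: (bp, cM), sorted by bp.
MARKERS = [(0, 0.0), (10_000_000, 5.0), (20_000_000, 30.0), (40_000_000, 40.0)]

# Hypothetical gene start positions (bp) on the same chromosome.
GENE_STARTS = [1_000_000, 11_000_000, 12_500_000, 14_000_000,
               18_000_000, 25_000_000, 39_000_000]


def interval_stats(markers, gene_starts):
    """Recombination rate and gene density for each interval between markers."""
    stats = []
    for (bp0, cm0), (bp1, cm1) in zip(markers, markers[1:]):
        length_mb = (bp1 - bp0) / 1e6
        n_genes = sum(bp0 <= g < bp1 for g in gene_starts)
        stats.append({
            "interval_bp": (bp0, bp1),
            "cm_per_mb": (cm1 - cm0) / length_mb,
            "genes_per_mb": n_genes / length_mb,
        })
    return stats


for row in interval_stats(MARKERS, GENE_STARTS):
    print(row)
```

In this made-up example the middle interval has both the steepest cM-per-Mb slope and the most genes per Mb, which is the pattern the slide describes.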
                How to Detect Genes?
 Detection of regions similar to known coding regions from
  other organisms
    Gene expressed (in another organism) -> mRNA -> cDNA = EST
     (Expressed Sequence Tag)
    Search for the start of the EST
 Ab initio: derive genes from the sequence itself
    Bacteria: easy, as genes are contiguous
    Eukaryotes: harder, because of introns and alternative splicing
       Initial exon:
              search for TATA box ~30 bp upstream,
              no in-frame stop codon,
              ends before GT splice signal
       Internal exon:
              starts after AG splice signal,
              no in-frame stop codons,
              ends before GT splice signal
       Final exon: followed by polyadenylation
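
For the contiguous (bacterial) case, the ab initio idea reduces to scanning each reading frame for a start codon followed by an uninterrupted run of codons up to the first in-frame stop. Below is a minimal sketch, assuming the forward strand only, ATG as the only start codon, and a toy sequence; real gene finders also use codon statistics and signal models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) spans of ATG...stop ORFs on the forward strand.

    Spans are 0-based and half-open, so seq[start:end] includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


toy = "CCATGAAATTTTAGGG"
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

The eukaryotic rules on the slide follow the same pattern, with the start codon and stop codon replaced by the AG and GT splice signals at exon boundaries.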


                                     Brent, Nat Biotech, 2007
              How to detect genes:
               De novo prediction
 GenScan (late 1990s)
    Predicts 10% of ORFs in the human genome
    Overprediction: 45,000 genes (~22,000 current
     estimate)
 TwinScan (early 2000s)
    Uses alignment between the target and a related genome;
     detects one third of ORFs in the human genome
 N-Scan
    Includes pseudogene detection
    Predicts 20,138 genes


                              Applications
 Genetic diversity and anthropology
   Cheetahs are very closely related to each other, pointing to
    a population bottleneck 10,000 years ago
   Humans: mitochondrial DNA is passed on through the
    maternal line, the Y chromosome from father to son
      Variation in mitochondrial DNA in humans suggests a
       single maternal ancestor 140,000-200,000 years ago
      Population of Iceland (first inhabited 1100 years ago)
       descended from Scandinavian males and from females from
       Scandinavia and the British Isles
      Basques are linguistically and genetically isolated



            Evolution of Genomes
 Phylogenetic profiles
    What genes do different phyla share?
    What homologous proteins do different phyla share?
    What functions do different phyla share?
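
A phylogenetic profile records, for each gene family, in which genomes it is present; the questions above then become set operations. The sketch below is minimal, and the genome names and family identifiers are hypothetical placeholders, not real annotation data.

```python
# Hypothetical gene-family content of three genomes.
GENOMES = {
    "bacterium": {"fam1", "fam2", "fam3", "fam5"},
    "archaeon": {"fam1", "fam2", "fam4"},
    "eukaryote": {"fam1", "fam3", "fam4", "fam6"},
}


def phylogenetic_profiles(genomes):
    """Map each family to a presence/absence tuple, genomes in sorted order."""
    names = sorted(genomes)
    families = sorted(set().union(*genomes.values()))
    return {
        family: tuple(int(family in genomes[name]) for name in names)
        for family in families
    }


def shared_by_all(genomes):
    """Families present in every genome."""
    return set.intersection(*genomes.values())


print(sorted(GENOMES))
for family, profile in phylogenetic_profiles(GENOMES).items():
    print(family, profile)
print("shared by all:", sorted(shared_by_all(GENOMES)))
```

Families with identical profiles across many genomes are often candidates for being functionally related, which is one common use of such profiles.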




          Shared functions of
     bacteria, archaea, and eucarya
 Functions shared by Haemophilus influenzae (bacteria), Methanococcus jannaschii
  (archaea), Saccharomyces cerevisiae (eucarya)
     Energy:
           Biosynthesis of cofactors, amino acids
           Central and intermediary metabolism
           Energy metabolism
           Fatty acids and phospholipids
           Nucleotide biosynthesis
           Transport
     Information:
         Replication
         Transcription
         Translation
     Communication and regulation
         Regulatory functions
         Cell envelope/cell wall
         Cellular processes

 Can we construct a minimal organism?



                                  Summary
 Relation of DNA, genes, and chromosomes
 Relationship between distance in Morgans and in base pairs
 How to find genes in DNA
    By similarity
    Ab initio, with introns, exons, and alternative splicing



 Read Lesk, chapter 2



