Genome Structural Variation in Human and Primate Evolution

James M. Sikela, Ph.D.
Professor, University of Colorado Denver School of Medicine
Genomics Course Lecture, February 22, 2011
                    Key Points
• Not all regions of the human genome are created equal
• Gene duplication (copy number variation) is a (the?)
  major mechanism underlying genome evolution
• ArrayCGH can reconstruct the evolutionary history of
  gene duplication & loss in the human/primate genome
• Lineage-specific gene duplications are candidates to
  underlie lineage-specific traits
• Selection of evolutionarily adaptive sequences is also a
  key driver of human disease
• Human lineage-specific amplification of DUF1220
  protein domains is a candidate to underlie human brain
  evolution
Primate Family Tree
[Figure: primate family tree. Source: Smithsonian Human Origins Program]
          Human Characteristics
• Body shape and thorax
• Cranial properties (brain case and face)
• Small canine teeth
• Skull balanced upright on vertebral column
• Reduced hair cover
• Enhanced sweating
• Dimensions of the pelvis
• Elongated thumb and shortened fingers
• Relative limb length
• Neocortex expansion
• Enhanced language & cognition
• Advanced tool making

                             modified from S. Carroll, Nature, 2005
  Reports of “human-specific” genes
• FOXP2
  – Mutated in family with language disability
  – Two human-specific amino acid changes
• ASPM/MCPH
  – Mutated in individuals with microcephaly
  – Under positive selection?
• HAR1F
  – Encodes RNA (not protein) product
  – Gene sequence highly changed in humans
• DUF1220 protein domains
  – Highly increased in copy number in humans
  – Copy number correlation with microcephaly/macrocephaly
  – Expressed in important brain regions
Molecular Mechanisms Underlying
       Genome Evolution
 • Single nucleotide substitutions
      - change gene expression & structure
 • Genome rearrangements
 • Gene duplication
      - copy number change: gene dosage
      - redundancy as a facilitator of
        innovation
   Strategies to identify human
lineage-specific genomic changes
 • Comparative genomic sequencing
      - Chimp genome sequence, 2005
      - HAR1F, 2006
 • Cross-species brain gene expression profiling
      - Human, chimp, macaque
 • Comparative genomic copy number studies: gene duplication & loss
      - Fortna, et al, 2004: Human and great ape lineages
[Diagram: five approaches arranged around a central goal]
Central goal: Genes and genomic changes underlying human-specific cognitive capabilities
 • Comparative analysis of primate genome WGS sequences
 • Genome-wide comparison of inter-species gene copy number and structural variation
 • Evolutionary studies of disease genes associated with cognitive dysfunction, e.g. MCPH
 • Functional testing of candidate genes
 • Comparative brain gene expression studies

                                                   Sikela, J.M., PLoS Genet. 2, e80, 2006
Gene Duplication & Evolutionary Change
    •“There is now ample evidence that gene
        duplication is the most important
    mechanism for generating new genes and
      new biochemical processes that have
      facilitated the evolution of complex
         organisms from primitive ones.”
                        - W. H. Li in Molecular
                              Evolution, 1997
    •“Exceptional duplicated regions underlie
              exceptional biology”
                        - Evan Eichler, Genome
                       Research 11:653-656, 2001
   Interhominoid cDNA Array-Based
Comparative Genomic Hybridization (aCGH)

                           Fig 1. Measuring genomic DNA
                           copy number alteration using
                           cDNA microarrays (array CGH).
                           Fluorescence ratios are
                           depicted in a pseudocolor scale,
                           such that red indicates
                           increased, and green
                           decreased, gene copy number
                           in the test (right) compared to
                           reference sample (left).
          Experimental Design
• Carry out pairwise aCGH comparisons between
  human and other primate species
• Use a microarray containing >41,000 human cDNAs
  representing >24,000 human genes
• Hybridize human genomic DNA (reference
  sequence: green) and other primate genomic DNAs
  (test sequence: red) simultaneously to the
  microarray
• Visualize aCGH signals “gene-by-gene” along each
  chromosome across five species: human (n=5),
  bonobo (n=3), chimpanzee (n=4), gorilla (n=3) and
  orangutan (n=3)
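
The two-color comparison described above is commonly summarized as a per-spot log2 ratio of test to reference signal. The following is a minimal sketch of that idea, not the analysis pipeline used in the study: the intensity values, the spot names and the 0.5 cutoff are all hypothetical.

```python
import math


def log2_ratio(test_intensity, reference_intensity):
    """Return log2(test / reference) for one array spot."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


def call_copy_change(ratio_log2, threshold=0.5):
    """Label a spot relative to the reference.

    The threshold is an assumed cutoff chosen for illustration only.
    """
    if ratio_log2 >= threshold:
        return "gain in test"
    if ratio_log2 <= -threshold:
        return "loss in test"
    return "no change"


# Hypothetical (test, reference) intensities:
# test = other primate DNA (red), reference = human DNA (green).
spots = {
    "cDNA_A": (2000.0, 1000.0),
    "cDNA_B": (500.0, 1000.0),
    "cDNA_C": (1000.0, 1000.0),
}

calls = {
    name: call_copy_change(log2_ratio(test, ref))
    for name, (test, ref) in spots.items()
}
```

Because human DNA is the reference here, a "loss in test" call is equally consistent with a gain in human; a single pairwise comparison cannot tell the two apart.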
Whole Genome Caryoscope Image of Interhominoid aCGH Data
Human & Great Ape Genes Showing Lineage-Specific Copy Number Gain/Loss




                                               Fortna, et al, PLoS Biol. 2004
     Clustering of hominoid lineage-specific genes

Region   Cytogenetic Position   Nucleotide Position cDNAs Cytogenetic Features
1        1p36.33                10205-370863          14   p subtelomeric
2        1p36.13                16040148-16248006     12
3        1p13.2-1q21.2          119385828-145366889   66   pericentromeric region; C band
4        2p11.2                 87371301-88563579     20
5        2p11.1-2q11.2          89358358-93970939     20   pericentromeric region
6        2q14.1                 112101086-112411341   31   chromosome 2 fusion region
7        2q21.2-2q21.3          130634597-131402172   17
8        5p13.3-5p14.3          20943443-22425809     12   inversion region
9        5q13.3                 70353511-70903396     15   inversion region (SMA region)
10       6p22.1                 26692149-26992489     9
11       7q34                   141632015-142216972   11
12       9p24.3                 17070-17490           12   p subtelomeric
13       9p13.3-9q21.12         38562165-62840292     77   pericentromeric region
14       14p11.1                13063292-13805918     10   pericentromeric (acrocentric)
15       15p11.1-15p11.2        13039694-15384734     18   pericentromeric (acrocentric)
16       16p11.1-16p11.2        32314412-35474685     15   pericentromeric region
17       18p11.1-18q11.21       14311227-18260062     9    pericentromeric region
18       19p13.3                16401-198604          8    p subtelomeric
19       20p11.1-20q11.21       25698233-29620848     11   pericentromeric region
20       21p11.2                7669179-11968553      9    pericentromeric (acrocentric)
21       22q11.1                13034022-14321656     12   pericentromeric (acrocentric)
22       22q13.33               47696896-47744592     10   q subtelomeric
23       Yq11.223               20925957-27898184     15   near heterochromatin
[Figure: tree with tips H, C, G, O (human, chimpanzee, gorilla, orangutan)]

                Value of Outgroup Comparisons:
                Chimp vs Human CNVs are not
                necessarily lineage-specific
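
The outgroup argument can be sketched as a simple parsimony rule: a copy number change is assigned to one lineage only when that species differs from all the others and the others agree with each other. The function name and the "high"/"low" state labels below are hypothetical, and the rule is an illustration rather than the method used in the study.

```python
def lineage_specific(states):
    """Return the one species whose state differs from all others.

    `states` maps species name -> copy number state (any hashable label).
    Returns None when no single species stands apart. Needs at least three
    species, since two species alone cannot polarize a change.
    """
    if len(states) < 3:
        raise ValueError("need at least three species to use an outgroup")
    for species, state in states.items():
        others = {s for name, s in states.items() if name != species}
        if len(others) == 1 and state not in others:
            return species
    return None


# Human differs from chimp, and the outgroups agree with chimp:
human_gain = lineage_specific(
    {"human": "high", "chimp": "low", "gorilla": "low", "orangutan": "low"}
)

# Human also differs from chimp here, but the outgroups agree with human,
# so the change is placed on the chimp lineage instead:
chimp_loss = lineage_specific(
    {"human": "high", "chimp": "low", "gorilla": "high", "orangutan": "high"}
)

# A state shared by human and chimp is not specific to either lineage:
shared = lineage_specific(
    {"human": "high", "chimp": "high", "gorilla": "low", "orangutan": "low"}
)
```

The first two cases look identical in a human versus chimp comparison; only the gorilla and orangutan values separate them.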
               aCGH Caveats
• Functional status of extra copies unknown
• Small copy number changes in large, highly
  similar gene families difficult to detect
• Genes “lost” in human lineage will be missed
• “Lineage-specific” term needs better
  validation:
  – More individuals needed within each species
  – More species need to be assayed to identify
    “lineage-specific” changes
   Conclusions from Fortna, et al
• First genome-wide & first gene-based
  survey of gene duplication & loss in human
  and great ape evolution
• Identified most of the major lineage-
  specific gene copy number changes that
  have occurred over the past 15 million
  years of human and great ape evolution
• Identified genes that potentially underlie
  many of the phenotypic characteristics
  that distinguish these species from one
  another
“This (Fortna, et al, 2004) is the first time
that copy number changes among apes have
   been assayed for the vast majority of
 human genes, and we can expect that the
 biological consequences of the 140 human-
specific copy number changes identified in
this study will be heavily investigated over
              the coming years.”

             ---M. Hurles, PLoS Biol. 2004
Human & Great Ape Genes Showing Lineage-Specific Copy Number Gain/Loss
[Figure: BLAT-Predicted Intronless vs. Intron-Containing HLS Gene Copies in Human, Chimp, and Macaque Genomes. Bar chart; one axis: Number of BLAT Hits (0 to 50); other axis: IMAGE Clone; series: Intronless, Human intron-containing, Chimp intron-containing, Macaque intron-containing.]
IMAGE clones shown: 321470, 470930, 781385, 594438, 843276, 1212231, 296679, 383823, 119768, 126229, 135010, 234376, 279874, 50904, 297084, 298685, 298862, 323796, 451080, 470261, 488945, 626842, 704320, 730398, 741841, 767345, 811138, 823588, 969906, 1030854, 1031047, 1467026, 1468074, 1474402, 1557341, 1638749, 1641894, 1641988, 1683035, 1699118, 1759573, 1856246, 1874052, 1946251
DUF1220 Repeat Unit

              Popesco, et al, Science 2006
InterPro-predicted DUF1220-containing proteins
BLAT Estimation of the Number of DUF1220 Domains Found in Different Species
[Figure, panels A, B and C. Bar charts by species (axis label HSA visible), one per query sequence: Q9H094_HUMAN/236-298, Q9C0H0_HUMAN/138-201, Q8IX77_HUMAN/116-178, Q8ND86_HUMAN/334-400, O95877_HUMAN/28-94, Q8ND86_HUMAN/184-250, Q8IX62_HUMAN/186-252, Q8IX62_HUMAN/111-177, Q8IX62_HUMAN/17-83, Q8IX71_HUMAN/95-158, O75042_HUMAN/1586-1638.]
[Figure: Copy Number of DUF1220 (Q8IX62/17-33) Sequences in Primate Species. Bar chart; y-axis: Q-PCR Predicted Copy Number (0 to 70); x-axis: Human, Bonobo, Chimp, Gorilla, Orangutan, Gibbon, Macaque, Baboon.]
BLAT-based DUF1220 copy number in sequenced primates using IMAGE:843276
• Full insert cDNA query (491 bp) encodes 3
  DUF1220 domains
• BLAT hits (score >200) in each species:

  Species     Assembly   Copies (x3)
  Human       5/04       51
  Human       3/06       50
  Chimp       11/03       6
  Chimp       3/06       25
  Orangutan   7/07       10
  Macaque     1/06        3
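
Counting BLAT hits above a score cutoff, as in the table above, can be sketched from BLAT's tab-separated PSL output. The scoring formula below (matches + repMatches/2 - misMatches - qNumInsert - tNumInsert) is an assumption about how the score for nucleotide alignments is approximated, and the two example rows are invented for illustration; neither is taken from the study.

```python
def psl_score(fields):
    """Approximate the score of one nucleotide PSL row (assumed formula)."""
    matches = int(fields[0])
    mismatches = int(fields[1])
    rep_matches = int(fields[2])
    q_num_insert = int(fields[4])
    t_num_insert = int(fields[6])
    return matches + rep_matches // 2 - mismatches - q_num_insert - t_num_insert


def count_hits(psl_lines, min_score=200):
    """Count PSL alignment rows whose approximate score exceeds min_score."""
    count = 0
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip header lines and anything that is not a 21-column data row.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        if psl_score(fields) > min_score:
            count += 1
    return count


# Two hypothetical PSL rows for a 491 bp query (values are made up).
strong_hit = "\t".join([
    "480", "5", "0", "0", "1", "3", "0", "0", "+",
    "query", "491", "0", "488", "chr1", "1000000", "100", "585",
    "2", "200,285,", "0,203,", "100,300,",
])
weak_hit = "\t".join([
    "150", "10", "0", "0", "0", "0", "0", "0", "+",
    "query", "491", "0", "160", "chr2", "1000000", "500", "660",
    "1", "160,", "0,", "500,",
])

example_lines = ["psLayout version 3", strong_hit, weak_hit]
hits_above_200 = count_hits(example_lines)
```

Running the same count against each species' assembly would give one number per row of a table like the one above. How those counts relate to the "(x3)" column is not specified in the slide and is not modeled here.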
 Summary of aCGH, Q-PCR and
       BLAT results:
• DUF1220 domains are highly amplified in
  human, reduced in African great apes,
  further reduced in orangutan and Old
  World monkeys, single copy in non-
  primate mammals and absent in non-
  mammals
Sequences encoding DUF1220 domains:
• are virtually all primate specific
• are increasingly amplified generally as a
  function of a species' evolutionary proximity
  to humans, where the greatest number of
  copies (218) is found
• show signs of positive selection
• are highly expressed in brain regions
  associated with higher cognitive function
• in brain show neuron-specific expression
  preferentially in cell bodies and dendrites

                             Popesco, et al, Science 2006
          Recent Relevant Publications
•   Fortna, A., Kim, Y., MacLaren, E., Marshall, K., Hahn, G., Meltesen, L., Brenton, M.,
    Hink, R., Burgers, S., Hernandez-Boussard, T., Karimpour-Fard, A., Glueck, D.,
    McGavran, L., Berry, R., Pollack, J.R. and Sikela, J.M.: Lineage-specific gene
    duplication and loss in human and great ape evolution. PLoS Biology,
    Jul;2(7):E207, 2004.

•   Sikela, J.M.: The Jewels of Our Genome: The Search for the Genomic Changes
    Underlying the Evolutionarily Unique Capacities of the Human Brain. PLoS Genet,
    May;2(5):e80, 2006.

•   Popesco, M., MacLaren, E., Hopkins, J., Dumas, L., Cox, M., Meltesen, L.,
    McGavran, L., Wyckoff, G., and Sikela, J.M.: Human lineage-specific amplification,
    selection and neuronal expression of DUF1220 domains. Science, 313:1304-1307,
    2006.

•   Dumas, L., Kim, Y., Karimpour-Fard, A., Cox, M., Hopkins, J., Pollack, J., and
    Sikela, J.M.: Gene copy number variation spanning 60 million years of human and
    primate evolution. Genome Research 17:1266-1277, 2007.

•   Dumas L. and Sikela, J.M.: DUF1220 Domains, Cognitive Disease and Human Brain
    Evolution. Cold Spring Harb. Symp. Quant. Biol. E-published, October 22, 2009.

								