Mutation by MikeJenny


                  Types and Sources
•   Mutation is a decay force whose ultimate roots are in the second law of
    thermodynamics (entropy). Living things survive inevitable mutations by a
    combination of being tolerant of a certain level of mutation, repairing mutational
    damage, killing cells that are mutated beyond repair, and relying on natural selection
    to remove individuals with unfavorable mutations.

•   Simple mutations: base substitutions and small indels. “Indel” stands for insertion-
    deletion, which is based on the idea that when you see a difference in DNA sequence
    between two species it is usually difficult to tell whether there was an insertion in one
    species or a deletion in the other.
•   More complex mutations are larger events involving the insertion, rearrangement, or
    deletion of large pieces of DNA. Typical events include fusion of two different genes
    and insertion of transposable elements.

•   Internal sources: DNA polymerase can insert the wrong nucleotide or slip at a certain
    rate. Transposable elements can move, cause other sections of DNA to move, or
    produce reverse transcriptase that acts on other messenger RNAs.
•   External sources: damage to the DNA caused by chemicals in the environment,
    including oxygen, or by radiation.
     DNA Polymerase
•   DNA polymerase, the enzyme that
    replicates DNA, is not perfectly
    accurate. One problem is that bases
    spontaneously undergo a “keto-enol
    shift”, where a hydrogen moves its
    position in ketones. Guanine and
    thymine bases are subject to this at a
    low rate, and it causes mispairing.
•   DNA polymerase has a proofreading
    function, a 3’ to 5’ exonuclease
    activity, which backs up and removes
    newly inserted nucleotides if they are
    mispaired. This function lowers the
    DNA polymerase error rate from about
    1 error in 10^6 nucleotides to about 1 in
    10^9. Still, that is about 6 errors every
    time the genome is replicated.
•   DNA polymerase also can slip,
    especially when replicating short
    repeats (microsatellites). This
    generates small indels.
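
The arithmetic behind the "about 6 errors" figure can be sketched as follows. The genome size used here (roughly 6 x 10^9 bp for a diploid human cell) is an assumption that the notes do not state; it is chosen because it reproduces the quoted figure.

```python
# Rough estimate of replication errors per round of genome replication.
# ASSUMPTION: a diploid human genome of about 6e9 base pairs (not stated in the notes).
DIPLOID_GENOME_BP = 6e9


def expected_errors(error_rate_per_bp, genome_bp=DIPLOID_GENOME_BP):
    """Expected number of misincorporated bases in one round of replication."""
    return error_rate_per_bp * genome_bp


without_proofreading = expected_errors(1e-6)  # about 1 error in 10^6 nucleotides
with_proofreading = expected_errors(1e-9)     # about 1 error in 10^9 nucleotides

print(f"without proofreading: ~{without_proofreading:.0f} errors per replication")
print(f"with proofreading:    ~{with_proofreading:.0f} errors per replication")
```

Under that assumption, proofreading reduces the expected error count by a factor of 1000.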
                                CpG Islands
•   Another chemical instability is that cytosine
    occasionally gets deaminated: it loses an amino
    group. This converts it into uracil, which is not a DNA
    base and is removed by repair enzymes.
•   However, in many places, a C followed by a G (CpG:
    the “p” is the connecting phosphate) gets methylated:
    a CH3 group is attached to the 5 position on the ring.
•   When 5-methyl cytosine is spontaneously
    deaminated, it is converted to thymine, a standard
    DNA base. Replication leads to a base change: one
    daughter stays a C-G base pair while the other is
    converted to T-A.
•   Over evolutionary time, this has led to a loss of CpG
    dinucleotides in human DNA.
•   However, methylation of cytosine is associated with
    gene inactivation, and genes that are expressed in
    most cells (housekeeping genes) usually do not have
    methylated cytosines at their 5’ ends. In these areas,
    the frequency of CpG stays high.
•   These areas of high CpG are called “CpG islands”.
    There are about 30,000 of them in the human genome,
    and most of them are associated with genes.
•   However, the presence of a CpG island does not
    necessarily imply the existence of a gene, and vice
    versa.
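
A minimal sketch of how CpG depletion might be measured. The observed/expected ratio used here is a commonly used summary statistic, not something taken from these notes, and the toy sequence and the function names are made up for illustration.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def deaminate_methyl_c(seq, pos):
    """Model deamination of a 5-methylcytosine at index pos: the C becomes a T."""
    if seq[pos] != "C":
        raise ValueError("position does not hold a C")
    return seq[:pos] + "T" + seq[pos + 1:]


# Hypothetical toy sequence with a single CpG at index 3-4.
before = "CCACGTCCAG"
after = deaminate_methyl_c(before, 3)

print(before, cpg_obs_exp(before))
print(after, cpg_obs_exp(after))
```

In this toy case the single C-to-T change removes the only CpG, so the ratio falls from 1.0 to 0.0. Real CpG-island detection also looks at GC content and window length.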
                     Base Substitutions
•   Two basic types:
     –   transition: converting one purine to the other purine, or
         one pyrimidine into the other pyrimidine.
     –   transversion: converting a purine to a pyrimidine or the
         reverse.
•   Logically, transversions should be twice as frequent
    since there are twice as many of them as transitions.
•   However, in practice, transitions are about twice as
    common as transversions, due to a combination of
    natural selection and ease of occurrence.
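
The "twice as many transversions" count can be checked by enumerating all 12 possible base changes. This is a small sketch; the function name is illustrative.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base change as a transition or a transversion."""
    if ref == alt:
        raise ValueError("not a substitution")
    pair = {ref, alt}
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


counts = {"transition": 0, "transversion": 0}
for ref, alt in permutations("ACGT", 2):  # all 12 ordered pairs of different bases
    counts[substitution_type(ref, alt)] += 1

print(counts)
```

The enumeration gives 4 transitions and 8 transversions, which is why the observed excess of transitions is notable.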

•   Neutral substitution rate: how often nucleotides
    change in the absence of selection pressure. In a
    comparison of the human and mouse genomes, 165
    Mbp of DNA associated with non-functional transposon
    sequences were identified in both species. These had
    about 67% identical bases, and models implied a rate
    of 0.46 substitutions per position over the 75 million
    years since the human and mouse lineages diverged.
    This works out to 2 x 10^-9 substitutions per year for
    each site, in the absence of selection pressure. This
    estimate agrees with other estimates based on
    different methods.
         Substitutions Within Genes
•   We mostly care about the functional parts of
    the genome, the genes and their control
    regions. Since most of the genes are
    presumably necessary for life, some
    mutations will be deleterious and others not.
•   In the human-mouse genome comparison,
    variation in the rate of substitutions across
    the various portions of genes was clear:
    fewest in the exons, most in the introns, and
    an intermediate amount in the UTRs and
    flanking regions.
•   For coding regions, the degeneracy of the
    genetic code has a large effect.
     –   some sites are non-degenerate: any change
         results in a different amino acid. 65% of
         human codons.
     –   other sites are two-fold degenerate:
         transitions give the same amino acid while
         transversions give a different amino acid. 19%
         of codon sites.
     –   other sites are four-fold degenerate: any
         mutation gives the same amino acid. These
         sites are all third positions of codons. 16% of
         codon sites.
•   Mutations that give the same amino acid are
    called silent or synonymous mutations.
    They are presumed to be selectively neutral.
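
The degeneracy classes above can be worked out for any codon site from the standard genetic code. The sketch below builds the standard table (bases in T, C, A, G order) and counts how many of the three possible changes at a site keep the amino acid. The function names are illustrative, and the "three-fold" label (which arises for isoleucine) is an addition not mentioned in the notes.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_changes(codon, pos):
    """Count the base changes at pos (0, 1, or 2) that keep the same amino acid."""
    original = CODON_TABLE[codon]
    count = 0
    for base in BASES:
        if base == codon[pos]:
            continue
        mutant = codon[:pos] + base + codon[pos + 1:]
        if CODON_TABLE[mutant] == original:
            count += 1
    return count


def site_class(codon, pos):
    """Name the degeneracy class of one codon site."""
    labels = {0: "non-degenerate", 1: "two-fold", 2: "three-fold", 3: "four-fold"}
    return labels[synonymous_changes(codon, pos)]


for codon, pos in [("GGT", 0), ("GGT", 2), ("AAA", 2), ("ATT", 2)]:
    print(codon, "position", pos + 1, "->", site_class(codon, pos))
```

Weighting these classes by how often each codon is used in human genes would be needed to reproduce percentages like those quoted above. This sketch only classifies individual sites.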
                     More on Substitution
•   In addition to synonymous
    mutations, some amino acid
    changes are “conservative” in
    that they have little or no effect
    on the protein’s function.
     –   for example, isoleucine and
         valine are both hydrophobic
         and readily substitute for each
         other.
     –   other amino acid substitutions
         are very unlikely: leucine
         (hydrophobic) for aspartic acid
         (hydrophilic and charged). This
         would be a non-conservative
         substitution.
     –   Some amino acids play unique
         roles: cysteines form disulfide
         bridges, prolines induce kinks
         in the chain, etc.
     –   However, some amino acids
         are critical for active sites and
         cannot be substituted.
•   Tables of substitution
    frequencies for all pairs of
    amino acids have been
    compiled.

BLOSUM62 Table. Numbers on the diagonal
indicate the likelihood of the amino acid
staying the same. The off-diagonal numbers
are relative substitution frequencies.
         Detecting Natural Selection
•   Patterns of base substitution within a gene can be used as evidence for natural
    selection, by comparing the ratio of synonymous to non-synonymous substitutions.
•   Compare orthologs: genes in two different species that can be traced to a common
    ancestor.
•   Can also compare paralogs within a species: genes resulting from duplication.
     –   a confounding problem: can you accurately identify orthologs between species, or are you
         comparing paralogs between the species?
•   Measured by comparing KS, the number of synonymous substitutions per site, to KA,
    the number of non-synonymous substitutions per site. Note that these numbers are
    corrected for the different levels of degeneracy for each site. The summary statistic
    is the KA / KS ratio.
•   Possible results.
     –   no selection (neutral evolution): the gene is apparently not being selected. Often seen when a pseudogene
         is compared to a functional gene. Synonymous and non-synonymous substitutions occur at
         the same frequency. KA / KS = 1.
     –   negative (purifying) selection: the gene is being selected for similar functions in both species.
         Synonymous substitutions are more frequent than non-synonymous. KA / KS < 1.
     –   positive (diversifying) selection: the gene is being selected for different functions in the two
         species. An unexpectedly high number of non-synonymous substitutions. KA / KS > 1.
•   The median KA / KS value for humans vs. mice was 0.115. The lowest values
    (greatest purifying selection) were for calmodulin, histones, ribosomal proteins,
    ubiquitin, and actin: genes involved with critical cellular functions common to all
    organisms. The highest ratios were seen for defense and immune response proteins.
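
A minimal sketch of how a KA / KS ratio might be read, following the three outcomes listed above. The tolerance around 1 is an arbitrary assumption for illustration; real analyses use statistical tests rather than a fixed cutoff, and the KA and KS inputs are assumed to be already corrected per site.

```python
def interpret_ka_ks(ka, ks, tolerance=0.05):
    """Return (ratio, label) for per-site substitution rates KA and KS.

    tolerance is a hypothetical cutoff for treating the ratio as equal to 1.
    """
    if ks <= 0:
        raise ValueError("KS must be positive to form a ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        label = "no selection (neutral)"
    elif ratio < 1.0:
        label = "negative (purifying) selection"
    else:
        label = "positive selection"
    return ratio, label


# The human-mouse median quoted above, expressed as KA = 0.115 and KS = 1.0.
print(interpret_ka_ks(0.115, 1.0))
```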
           Trinucleotide Repeats
• Trinucleotide repeats (TNRs) are a type of microsatellite, an array of
  3 bp repeats.
• DNA polymerase often slips at TNRs, increasing or decreasing the
  copy number.
• Because a codon is 3 bp long, TNRs within a coding region don’t
  change the reading frame.
• However, some TNRs cause diseases even though they are in the
• There are only 10 possible TNRs (not counting runs of a single base),
  considering the two DNA strands and the different orders you could write
  the bases. For example, the TNR that causes Fragile X syndrome could be
  written as CGG, GGC, GCG, CCG, CGC, or GCC.
• Below a certain number, the repeats are relatively stable. But,
  above that, the copy number can change drastically in both mitosis
  and meiosis. These alleles are called “pre-mutation alleles”. Above
  an even higher point, the mutant phenotype appears.
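
The count of 10 distinct TNRs can be checked by grouping all 3-base units that are the same repeat read in a different frame or on the other strand. This sketch leaves out runs of a single base (AAA, CCC, GGG, TTT); the function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rotations(unit):
    """All ways of writing a repeat unit by starting at a different base."""
    return {unit[i:] + unit[:i] for i in range(len(unit))}


def equivalent_units(unit):
    """Every unit that describes the same repeat on either strand."""
    return frozenset(rotations(unit) | rotations(reverse_complement(unit)))


all_units = ("".join(p) for p in product("ACGT", repeat=3))
# Skip units made of one repeated base, which are not trinucleotide repeats.
classes = {equivalent_units(u) for u in all_units if len(set(u)) > 1}

print(len(classes))
print(sorted(equivalent_units("CGG")))
```

The 60 mixed-base units fall into 10 groups of 6, and the Fragile X repeat group contains CGG, GGC, GCG, CCG, CGC, and GCC.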
                    Huntington Disease
•   Huntington Disease. An autosomal dominant
    disease; most affected people are heterozygotes.
•   Onset usually in middle age.
•   Neurological: starts with irritability and
    depression, includes fidgety behavior and
    involuntary movement (chorea), followed by
    psychosis and death.
•   Caused by CAG repeats within the coding
    region, giving a tract of glutamines. Below
    28 copies is normal, between 28 and 34
    copies is the premutation allele: normal
    phenotype but unstable copy number that
    puts the next generation at risk. Above 34
    copies gives the disease.
•   HD shows “anticipation”: the age of onset
    tends to get earlier with each generation. This
    is because copy number tends to increase
    between generations, and higher copy
    numbers give earlier onset.
•   There is a genetic test for the disease, but in
    the absence of effective treatment few
    actually take the test.
•   The function of the protein remains unknown;
    the excess glutamines may cause it to
    aggregate and lose function.
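
A sketch of how a CAG tract might be measured and classified using the copy-number ranges given on this slide (below 28, 28 to 34, above 34). The example sequence is invented, and actual genetic testing uses validated laboratory assays and clinical thresholds rather than a string search.

```python
import re


def longest_cag_run(seq):
    """Number of copies in the longest uninterrupted run of CAG."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_hd_allele(copies):
    """Classify a CAG copy number using the ranges quoted in these notes."""
    if copies < 28:
        return "normal"
    if copies <= 34:
        return "premutation"
    return "disease"


# Hypothetical allele: 30 CAG copies with short flanking sequence.
allele = "ATG" + "CAG" * 30 + "CCGCCA"
copies = longest_cag_run(allele)
print(copies, classify_hd_allele(copies))
```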
                   Fragile X Syndrome
•   Fragile X syndrome. The most common form
    of human mental retardation.
•   The phenotype includes moderate to severe
    mental retardation, macroorchidism, large
    ears, prominent jaw, and high-pitched,
    jocular speech. Expression is variable, with
    mental retardation the most common feature.
•   Males, having only one X, are affected more
    frequently and severely than females.
•   Appears as a secondary constriction on the
    X, which appears in cells starved for folate.
    The X can actually break at that point, but
    this isn’t a common feature.
•   Caused by CGG repeats in the 5’ UTR of
    the FMR1 gene.
•   Normal copy number is about 30. Between
    55 and 200 copies, the copy number is
    unstable, but the person is normal. Above
    200 copies, the mutant phenotype appears.
•   The gene gets heavily methylated and is not
    expressed.
•   The function of the protein is unclear, but it
    is an RNA-binding protein that seems to be
    involved with translational regulation,
    possibly through RNA interference as part of
    the RISC complex.
     Mutations Affecting RNA
• Altered promoters, splice sites, poly-A
  addition sites.
                      Gene Conversion
•   If a cell contains two different copies of
    a gene, either on homologous
    chromosomes or as paralogs,
    sometimes one copy will “convert” the
    other copy to its sequence.
•   Gene conversion (at least between
    homologues) is a normal outcome of
    recombination. We need to look at the
    Holliday molecular model of
    recombination to understand this. This
    model is a bit simple compared to
    current theory, but is still basically
    correct.

•   The homologues are paired in
    prophase of meiosis 1.
•   Single stranded breaks in both
    homologues are catalyzed by
•   The free ends invade the homologous
    DNA, forming heteroduplexes.
•   “Branch migration” occurs and the
    heteroduplexes are extended.
          More Gene Conversion
• Recombinase cuts the DNA
• Two possibilities at this
  point, occurring with equal
  frequency:
• 1. A “north-south” cut occurs
  after the 2 DNA molecules
  twist relative to each other.
  The result is a crossover: the
  two homologues are broken
  and rejoined at this point,
  giving recombinant
  chromosomes. Note that
  there is a heteroduplex
  region at the breakpoint.
        More Gene Conversion
• 2. The other possibility is
  that an “east-west” cut
  occurs. This gives a
  short heteroduplex
  region, but the 2
  chromosomes are still
  intact: no crossover has
  occurred.
• However, if the
  heteroduplex occurs
  within a gene that is
  being monitored, it will
  result in an offspring with
  an altered gene: gene
  conversion.
    Steroid 21-Hydroxylase Deficiency
•   The medical condition is “congenital adrenal hyperplasia”, an autosomal
    recessive condition. 21-hydroxylase is an enzyme necessary for converting
    cholesterol into aldosterone and cortisol. Aldosterone affects kidney
    function: it causes salt to be retained. Cortisol is the main stress response
    hormone.
•   The biggest problem is that hormone precursors build up in the adrenals
    and get converted to testosterone, the major male hormone. This causes
    the external genitalia to develop into the male pattern, or develop
    “ambiguous genitalia” regardless of the individual’s gender (“virilization”). In
    milder cases, and in males, puberty occurs early in childhood. Female
    embryos develop a normal uterus and ovaries.
•   In some cases, salt is not retained in the body well, which is life-threatening
    but treatable with hormones.

•   The functional gene, CYP21A2, is located about 30 kb from a pseudogene,
    CYP21A1P, on chromosome 6p. The pseudogene contains 9 mutations that
    inactivate it. Almost all cases result from one of two causes:
     – An unequal crossing over between these loci, resulting in a normal 5’ end of the
       gene and a mutant 3’ end (from the pseudogene), plus deletion of all the
       intervening sequences.
     – Gene conversion converts part of the normal allele to the pseudogene sequence.
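
The gene conversion outcome can be pictured as a tract of the functional gene being overwritten by the corresponding pseudogene sequence. The sketch below is a toy model: the two sequences are invented stand-ins, not real gene or pseudogene sequence, and the function name is illustrative.

```python
def gene_convert(recipient, donor, start, end):
    """Copy donor[start:end] over the same positions of the recipient sequence."""
    if len(recipient) != len(donor):
        raise ValueError("this toy model assumes aligned sequences of equal length")
    return recipient[:start] + donor[start:end] + recipient[end:]


# Hypothetical aligned sequences; the donor differs at positions 4 and 10.
functional = "ATGGCTAAGCTT"
pseudogene = "ATGGTTAAGCAT"

converted = gene_convert(functional, pseudogene, 3, 6)
print(functional)
print(converted)
```

Only the difference inside the converted tract (position 4) is transferred; the difference at position 10 stays as in the functional copy, which matches the idea that conversion changes part of the allele.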
    Hemophilia A: Inversion Problems
•   The clotting factor VIII gene, F8, is on the X
    chromosome, and mutations in it are the major cause of
    hemophilia.
•   F8 is a large gene, and completely contained
    within intron 22 are two small genes
    transcribed from the opposite strand.
•   One of these genes, F8A, has another copy
    several hundred kb away, on the opposite
    strand. Thus, these two very similar genes
    are in opposite orientation.
•   Sometimes crossing over during meiosis will
    pair these regions and recombination will
    occur. This results in an inversion.
•   The inversion completely disrupts the main
    F8 gene, because its 5’ half is now inverted
    and far away from its 3’ half.
•   This accounts for about 45% of hemophilia A
    cases.
•   Almost all new cases arise during male
    meiosis: in females, the two homologous X
    chromosomes are paired, which seems to
    inhibit this inversion.
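
A toy model of the inversion, treating the region as an ordered list of labelled segments. The layout and strand labels are a simplified assumption made for illustration (the distal F8A copy is placed beyond the 5' end of F8); only the general idea, that recombination between two inverted copies flips everything between them, comes from the notes.

```python
def flip(strand):
    """Switch a strand label between '+' and '-'."""
    return "-" if strand == "+" else "+"


def invert(segments, start, end):
    """Invert segments[start:end]: reverse their order and flip each strand."""
    inverted = [(name, flip(strand)) for name, strand in reversed(segments[start:end])]
    return segments[:start] + inverted + segments[end:]


# Hypothetical, highly simplified layout of the region.
x_region = [
    ("distal F8A copy", "+"),
    ("intervening DNA", "+"),
    ("F8 exons 1-22", "+"),
    ("F8A in intron 22", "-"),
    ("F8 exons 23-26", "+"),
]

# Recombination between the two F8A copies inverts everything between them.
after_inversion = invert(x_region, 1, 3)
for name, strand in after_inversion:
    print(strand, name)
```

After the inversion, exons 1-22 lie on the opposite strand and are separated from exons 23-26 by the intervening DNA, so no intact F8 transcript can be made.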
  Transposable Element Insertions
• Functional copies of LINE-1 elements, Alu sequences,
  and some endogenous retroviral sequences (LTR
  retrotransposons) exist in the human genome. They
  occasionally transpose into genes, giving a detectable
  phenotype.
• The first examples found were two independent
  insertions of the 3’ end of LINE-1 into exons of the
  clotting factor 8 gene. Additional examples have been
  found since.
• Transposable element movement has also been
  implicated in cancer and the chromosome
  rearrangements that accompany it.
• Recombination between Alu sequences in different parts
  of the genome can generate deletions.
                  DNA Damage
• A list of agents that damage DNA:
   – ionizing radiation: induces breaks in DNA
   – Ultraviolet light: crosslinks adjacent thymidines (thymidine
     dimers)
   – alkylating agents: attach hydrocarbon groups to bases, either
     blocking DNA polymerase or crosslinking the bases
   – intercalating agents: slip between the DNA bases and cause
     DNA polymerase to insert extra bases or misread the sequence.
   – depurination: the link between purine bases and the deoxyribose
     spontaneously breaks
   – deamination: loss of the amino group from cytosine converts it to
     uracil
   – reactive oxygen: peroxide and superoxide attack the purine and
     pyrimidine rings
                                DNA Repair
•   There are at least 5 separate DNA repair
    mechanisms in human cells
•   Direct repair, simply reversing the damage,
    is possible in some cases, notably removing
    methyl groups from guanine.
•   Base excision repair. A damaged base is
    removed from its sugar by a DNA
    glycosylase (several types). After this, the
    DNA strand is cut by AP endonuclease and
    the sugar-phosphate without its base is
    removed from the DNA chain. A new
    nucleotide is added by DNA polymerase and
    the chain is re-ligated.
•   Nucleotide excision repair. Abnormal bases,
    including thymidine dimers, are removed
    along with a number of surrounding bases.
    The missing section is then re-synthesized
    and ligated. Xeroderma pigmentosum, a
    genetic disease that causes extreme
    sensitivity to sunlight, is due to defects in
    this repair system.
                      DNA Repair
•   Post-replication repair. Double stranded
    breaks are repaired by randomly joining
    DNA ends, or by a gene-conversion-like
    mechanism that involves the homologous
    chromosome. The breast cancer
    susceptibility genes BRCA1 and BRCA2 are
    involved in this pathway.
•   Mismatch repair. Mispaired bases (those
    not caught by the DNA polymerase’s editing
    function) are repaired by an enzyme
    complex that moves along the DNA. When
    it finds a mismatched base pair, it removes a
    number of bases on one of the DNA strands
    and re-synthesizes them. The gene for
    hereditary non-polyposis colon cancer is
    involved in this system.

•   In addition, cells with DNA damage are often
    induced to kill themselves through the
    process of apoptosis, or they stop dividing
    by not entering the S phase of the cell cycle.
    More on this when we talk about cancer.
