Mutation Types and Sources
• Mutation is a decay force whose ultimate roots are in the second law of thermodynamics (entropy). Living things survive inevitable mutations by a combination of tolerating a certain level of mutation, repairing mutational damage, killing cells that are mutated beyond repair, and relying on natural selection to remove individuals with unfavorable mutations.
• Simple mutations: base substitutions and small indels. “Indel” stands for insertion-deletion. The term reflects the fact that when you see a difference in DNA sequence between two species, it is usually difficult to tell whether there was an insertion in one species or a deletion in the other.
• More complex mutations are larger events involving the insertion, rearrangement, or deletion of large pieces of DNA. Typical events include fusion of two different genes and insertion of transposable elements.
• Internal sources: DNA polymerase can insert the wrong nucleotide or slip, each at a low rate. Transposable elements can move, cause other sections of DNA to move, or produce reverse transcriptase that acts on other messenger RNAs.
• External sources: damage to the DNA caused by chemicals in the environment, including oxygen, or by radiation.
DNA Polymerase
• DNA polymerase, the enzyme that replicates DNA, is not perfectly accurate. One problem is that bases spontaneously undergo a “keto-enol shift”, in which a hydrogen atom changes position and the keto (C=O) form of the base becomes the enol (C–OH) form. Guanine and thymine bases are subject to this at a low rate, and it causes mispairing.
• DNA polymerase has a proofreading function, a 3’ to 5’ exonuclease activity, which backs up and removes newly inserted nucleotides if they are mispaired. This function lowers the DNA polymerase error rate from about 1 error in 10^6 nucleotides to about 1 in 10^9. Still, that is about 6 errors every time the genome is replicated.
• DNA polymerase also can slip, especially when replicating short repeats (microsatellites). This generates small indels.
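The figure of about 6 errors per replication follows from simple arithmetic. Here is a minimal sketch in Python, assuming a diploid human genome of roughly 6 × 10^9 bp. That genome size is an assumption added for illustration, and the names are hypothetical.

```python
# Error rates quoted in the notes, written as "1 error per N nucleotides".
NT_PER_ERROR_NO_PROOFREADING = 10**6
NT_PER_ERROR_WITH_PROOFREADING = 10**9

# Assumption (not from the notes): diploid human genome of about 6e9 bp.
DIPLOID_GENOME_BP = 6 * 10**9


def errors_per_replication(nt_per_error: int, genome_bp: int = DIPLOID_GENOME_BP) -> float:
    """Expected number of misincorporated bases in one round of replication."""
    return genome_bp / nt_per_error


errors_without = errors_per_replication(NT_PER_ERROR_NO_PROOFREADING)
errors_with = errors_per_replication(NT_PER_ERROR_WITH_PROOFREADING)

print(f"without proofreading: {errors_without:.0f} errors per replication")
print(f"with proofreading:    {errors_with:.0f} errors per replication")
print(f"proofreading improves fidelity {errors_without / errors_with:.0f}-fold")
```

Under these assumptions, proofreading reduces the expected load from thousands of errors per replication to a handful, which the repair systems covered later must still handle.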
CpG Islands
• Another chemical instability is that cytosine occasionally gets deaminated: it loses an amino group. This converts it into uracil, which is not a DNA base and is removed by repair enzymes.
• However, in many places, the C in a CpG dinucleotide (a C followed by a G; the “p” is the connecting phosphate) gets methylated: a CH3 group is attached to the 5 position on the cytosine ring.
• When 5-methylcytosine is spontaneously deaminated, it is converted to thymine, a standard DNA base. Replication leads to a base change: one daughter stays a C-G base pair while the other is converted to T-A.
• Over evolutionary time, this has led to a loss of CpG dinucleotides in human DNA.
• However, methylation of cytosine is associated with gene inactivation, and genes that are expressed in most cells (housekeeping genes) usually do not have methylated cytosines at their 5’ ends. In these areas, the frequency of CpG stays high.
• These areas of high CpG are called “CpG islands”. There are about 30,000 of them in the human genome, and most of them are associated with genes.
• However, the presence of a CpG island does not necessarily imply the existence of a gene, and vice versa.
Base Substitutions
• Two basic types:
– transition: converting one purine to the other purine, or one pyrimidine to the other pyrimidine.
– transversion: converting a purine to a pyrimidine or the reverse.
• Logically, transversions should be twice as frequent, since each base can undergo two different transversions but only one transition.
• However, in practice, transitions are about twice as common as transversions, due to a combination of natural selection and ease of occurrence.
• Neutral substitution rate: how often do nucleotides change in the absence of selection pressure? In a comparison of the human and mouse genomes, 165 Mbp of DNA associated with non-functional transposon sequences were identified in both species.
These had about 67% identical bases, and models implied a rate of 0.46 substitutions per position over the 75 million years since the human and mouse lineages diverged. This works out to 2 x 10^-9 substitutions per year for each site, in the absence of selection pressure. This estimate agrees with other estimates based on different methods.
Substitutions Within Genes
• We mostly care about the functional parts of the genome, the genes and their control regions. Since most of the genes are presumably necessary for life, some mutations will be deleterious and others not.
• In the human-mouse genome comparison, variation in the rate of substitutions across the various portions of genes was clear: fewest in the exons, most in the introns, and an intermediate amount in the UTRs and flanking regions.
• For coding regions, the degeneracy of the genetic code has a large effect.
– some sites are non-degenerate: any change results in a different amino acid. 65% of codon sites in human genes.
– other sites are two-fold degenerate: transitions give the same amino acid while transversions give a different amino acid. 19% of codon sites.
– other sites are four-fold degenerate: any mutation gives the same amino acid. These sites are all third positions of codons. 16% of codon sites.
• Mutations that give the same amino acid are called silent or synonymous mutations. They are presumed to be selectively neutral.
More on Substitution
• In addition to synonymous mutations, some amino acid changes are “conservative” in that they have little or no effect on the protein’s function.
– for example, isoleucine and valine are both hydrophobic and readily substitute for each other.
– other amino acid substitutions are very unlikely: leucine (hydrophobic) for aspartic acid (hydrophilic and charged). This would be a non-conservative substitution.
– Some amino acids play unique roles: cysteines form disulfide bridges, prolines induce kinks in the chain, etc.
– However, some amino acids are critical for active sites and cannot be substituted.
• Tables of substitution frequencies for all pairs of amino acids have been generated.
BLOSUM62 Table: numbers on the diagonal indicate the likelihood of the amino acid staying the same. The off-diagonal numbers are relative substitution frequencies.
Detecting Natural Selection
• Patterns of base substitution within a gene can be used as evidence for natural selection, by comparing the rates of synonymous and non-synonymous substitutions.
• Compare orthologs: genes in two different species that can be traced to a common ancestor.
• Can also compare paralogs within a species: genes resulting from duplication.
– a confounding problem: can you accurately identify orthologs between species, or are you comparing paralogs between the species?
• Selection is measured by comparing KS, the number of synonymous substitutions per site, to KA, the number of non-synonymous substitutions per site. Note that these numbers are corrected for the different levels of degeneracy for each site. The summary statistic is the KA / KS ratio.
• Possible results.
– neutral (no selection): the gene is apparently not being selected. Often seen when a pseudogene is compared to a functional gene. Synonymous and non-synonymous substitutions occur at the same frequency. KA / KS = 1.
– negative (purifying) selection: the gene is being selected for similar functions in both species. Synonymous substitutions are more frequent than non-synonymous. KA / KS < 1.
– positive (diversifying) selection: the gene is being selected for different functions in the two species. An unexpectedly high number of non-synonymous substitutions. KA / KS > 1.
• The median KA / KS value for humans vs. mice was 0.115. The lowest values (greatest purifying selection) were for calmodulin, histones, ribosomal proteins, ubiquitin, and actin: genes involved with critical cellular functions common to all organisms.
The highest ratios were seen for defense and immune response proteins.
Trinucleotide Repeats
• Trinucleotide repeats (TNRs) are a type of microsatellite, an array of 3 bp repeats.
• DNA polymerase often slips at TNRs, increasing or decreasing the copy number.
• Because a codon is 3 bp long, a change in the copy number of a TNR within a coding region doesn’t change the reading frame.
• However, some TNRs cause diseases even though they are in the UTRs.
• There are only 10 possible TNRs, considering the two DNA strands and the different orders you could write the bases. For example, the TNR that causes Fragile X syndrome could be written as CGG, GGC, GCG, CCG, CGC, or GCC.
• Below a certain number, the repeats are relatively stable. But above that, the copy number can change drastically in both mitosis and meiosis. These alleles are called “pre-mutation alleles”. Above an even higher point, the mutant phenotype appears.
Huntington Disease
• Huntington Disease. A dominant autosomal disease, with most affected people heterozygotes.
• Onset usually in middle age.
• Neurological: starts with irritability and depression, includes fidgety behavior and involuntary movement (chorea), followed by psychosis and death.
• Caused by CAG repeats within the coding region, giving a tract of glutamines. Below 28 copies is normal. Between 28 and 34 copies is the premutation allele: normal phenotype but unstable copy number that puts the next generation at risk. Above 34 copies gives the disease.
• HD shows “anticipation”: the age of onset gets earlier with every generation. This is because copy number tends to increase between generations, and a higher copy number gives an earlier age of onset.
• There is a genetic test for the disease, but in the absence of effective treatment few actually take the test.
• Function of the protein remains unknown; the excess glutamines may cause it to aggregate and lose function.
Fragile X Syndrome
• Fragile X syndrome. The most common inherited form of human mental retardation.
• The phenotype includes moderate to severe mental retardation, macroorchidism, large ears, prominent jaw, and high-pitched, jocular speech. Expression is variable, with mental retardation the most common feature.
• Males, having only one X, are affected more frequently and more severely than females.
• The fragile site appears as a secondary constriction on the X chromosome in cells starved for folate. The X can actually break at that point, but this isn’t a common feature.
• Caused by CGG repeats in the 5’ UTR of the FMR1 gene.
• Normal copy number is about 30. Between 55 and 200 copies, the copy number is unstable, but the person is normal. Above 200 copies, the mutant phenotype appears.
• At that point the gene gets heavily methylated and is not expressed.
• The function of the protein is unclear, but it is an RNA-binding protein that seems to be involved with translational regulation, possibly through RNA interference as part of the RISC complex.
Mutations Affecting RNA
• Altered promoters, splice sites, poly-A addition sites.
Gene Conversion
• If a cell contains two different copies of a gene, either on homologous chromosomes or as paralogs, sometimes one copy will “convert” the other copy to its sequence.
• Gene conversion (at least between homologues) is a normal outcome of recombination. We need to look at the Holliday molecular model of recombination to understand this. This model is a bit simple compared to current theory, but is still basically correct.
• The homologues are paired in prophase of meiosis I.
• Single-stranded breaks in both homologues are catalyzed by recombinase.
• The free ends invade the homologous DNA, forming heteroduplexes.
• “Branch migration” occurs and the heteroduplexes are extended.
More Gene Conversion
• Recombinase cuts the DNA molecules.
• Two possibilities at this point, occurring with equal frequency.
• 1. A “north-south” cut occurs after the 2 DNA molecules twist relative to each other.
The result is a crossover: the two homologues are broken and rejoined at this point, giving recombinant chromosomes. Note that there is a heteroduplex region at the breakpoint.
More Gene Conversion
• 2. The other possibility is that an “east-west” cut occurs. This gives a short heteroduplex region, but the 2 chromosomes are still intact: no crossover has occurred.
• However, if the heteroduplex occurs within a gene that is being monitored, it will result in an offspring with an altered gene: gene conversion.
Steroid 21-Hydroxylase Deficiency
• The medical condition is “congenital adrenal hyperplasia”, an autosomal recessive condition. 21-hydroxylase is an enzyme necessary for converting cholesterol into aldosterone and cortisol. Aldosterone affects kidney function: it causes salt to be retained. Cortisol is the main stress response hormone.
• The biggest problem is that hormone precursors build up in the adrenals and get converted to testosterone, the major male hormone. This causes the external genitalia to develop in the male pattern, or as “ambiguous genitalia”, regardless of the individual’s sex (“virilization”). In milder cases, and in males, puberty occurs early in childhood. Female embryos develop a normal uterus and ovaries.
• In some cases, salt is not retained in the body well, which is life-threatening but treatable with hormones.
• The functional gene, CYP21A2, is located about 30 kb from a pseudogene, CYP21A1P, on chromosome 6p. The pseudogene contains 9 mutations that inactivate it. Almost all cases result from one of two causes:
– An unequal crossing over between these loci, resulting in a normal 5’ end of the gene and a mutant 3’ end (from the pseudogene), plus deletion of all the intervening sequences.
– Gene conversion converts part of the normal allele to the pseudogene sequence.
Hemophilia A: Inversion Problems
• The clotting factor VIII gene, F8, is on the X chromosome, and mutations in it are the major cause of hemophilia.
• F8 is a large gene, and completely contained within intron 22 are two small genes transcribed from the opposite strand.
• One of these genes, F8A, has another copy several hundred kb away, on the opposite strand. Thus, these two very similar genes are in opposite orientation.
• Sometimes these two regions pair during meiosis and recombination occurs between them. This results in an inversion.
• The inversion completely disrupts the main F8 gene, because its 5’ half is now inverted and far away from its 3’ half.
• This accounts for about 45% of hemophilia A cases.
• Almost all new cases arise during male meiosis: in females, the two homologous X chromosomes are paired, which seems to inhibit this inversion.
Transposable Element Insertions
• Functional copies of LINE-1 elements, Alu sequences, and some endogenous retroviral sequences (LTR retrotransposons) exist in the human genome. They occasionally transpose into genes, giving a detectable phenotype.
• The first examples found were two independent insertions of the 3’ end of LINE-1 into exons of the clotting factor VIII gene. Additional examples have been found since.
• Transposable element movement has also been implicated in cancer and the chromosome rearrangements that accompany it.
• Recombination between Alu sequences in different parts of the genome can generate deletions.
DNA Damage
• A list of agents that damage DNA:
– ionizing radiation: induces breaks in DNA
– ultraviolet light: crosslinks adjacent thymidines (thymidine dimers)
– alkylating agents: attach hydrocarbon groups to bases, either blocking DNA polymerase or crosslinking the bases
– intercalating agents: slip between the DNA bases and cause DNA polymerase to insert extra bases or misread the sequence
– depurination: the link between purine bases and the deoxyribose spontaneously breaks
– deamination: loss of the amino group from cytosine converts it to uracil
– reactive oxygen: peroxide and superoxide attack the purine and pyrimidine rings
DNA Repair
• There are at least 5 separate DNA repair mechanisms in human cells.
• Direct repair, simply reversing the damage, is possible in some cases, notably removing methyl groups from guanine.
• Base excision repair. A damaged base is removed from its sugar by a DNA glycosylase (several types). After this, the DNA strand is cut by AP endonuclease and the sugar-phosphate without its base is removed from the DNA chain. A new nucleotide is added by DNA polymerase and the chain is re-ligated.
• Nucleotide excision repair. Abnormal bases, including thymidine dimers, are removed along with a number of surrounding bases. The missing section is then re-synthesized and ligated. Xeroderma pigmentosum, a genetic disease that causes extreme sensitivity to sunlight, is due to defects in this repair system.
DNA Repair
• Post-replication repair. Double-stranded breaks are repaired by randomly joining DNA ends, or by a gene-conversion-like mechanism that involves the homologous chromosome. The breast cancer susceptibility genes BRCA1 and BRCA2 are involved in this pathway.
• Mismatch repair. Mispaired bases (those not caught by the DNA polymerase’s editing function) are repaired by an enzyme complex that moves along the DNA. When it finds a mismatched base pair, it removes a number of bases on one of the DNA strands and re-synthesizes them. The genes mutated in hereditary non-polyposis colon cancer are part of this system.
• In addition, cells with DNA damage are often induced to kill themselves through the process of apoptosis, or they stop dividing by not entering the S phase of the cell cycle. More on this when we talk about cancer.
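The base excision repair steps described above (glycosylase removes the damaged base, the strand is cut, polymerase fills in, ligase seals) can be sketched as a toy model in Python. This is only an illustration under stated assumptions: strands are plain strings, the only damage recognized is uracil from deaminated cytosine, and the two strands are written so that position i of one pairs with position i of the other. The function and variable names are hypothetical.

```python
# Watson-Crick partners for the four standard DNA bases.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_excision_repair(damaged: str, opposite: str) -> str:
    """Toy model: replace each uracil using the opposite strand as template.

    Assumption: damaged[i] pairs with opposite[i], so the strands are
    aligned position by position rather than written antiparallel.
    """
    if len(damaged) != len(opposite):
        raise ValueError("strands must have the same length")
    repaired = []
    for base, paired_base in zip(damaged, opposite):
        if base == "U":
            # Uracil is not a DNA base: remove it and insert the partner
            # of the base on the undamaged strand.
            repaired.append(PARTNER[paired_base])
        else:
            repaired.append(base)
    return "".join(repaired)


damaged_strand = "ATGUGA"   # the U was originally a C
opposite_strand = "TACGCT"  # undamaged partner strand
print(base_excision_repair(damaged_strand, opposite_strand))
```

In this sketch a thymine produced by deamination of 5-methylcytosine passes through unchanged, because the model flags only uracil. This mirrors the point in the CpG Islands section that thymine is a standard DNA base.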