					Population Genetics
  Evolution by Natural Selection
• Unlike Mendel, Charles Darwin made a big splash when
  his defining work, "On the Origin of Species by Means of
  Natural Selection, or the Preservation of Favoured
  Races in the Struggle for Life" (which we refer to as “The
  Origin of Species”), was published in 1859.
• Darwin set forth a scientific theory that described how
  one species could give rise to another species, given
  sufficient time. It was heavily attacked at the time (and
  continuing to this day) by people who thought that it
  contradicted their religious beliefs. Nevertheless, the
  basic theory has survived and flourished, and today it is
  one of the main pillars of biological theory.
• A fundamental concept in evolutionary theory is “fitness”,
  which can be defined as the ability to survive and
  reproduce. Reproduction is key: to be evolutionarily fit,
  an organism must pass its genes on to future generations.
• Basic idea behind evolution by natural selection: the
  more fit individuals contribute more to future generations
  than less fit individuals. Thus, the genes found in more
  fit individuals ultimately take over the population.
• Natural selection requires 3 basic conditions:
   – 1. there must be inherited traits.
   – 2. there must be variation in these traits among members of the population.
   – 3. some inherited traits must affect fitness
         Genetics of Populations
• Darwin didn’t understand how inheritance worked--Mendel’s work
  was still in the future. It wasn’t until the 1930’s that Mendelian
  genetics was incorporated into evolutionary theory, in what is called
  the “Neo-Darwinian synthesis”.
• Translated into Mendelian terms, the basis for natural selection is
  that alleles that increase fitness will increase in frequency in a
  population.
• Thus, the main object of study in evolutionary genetics is the
  frequency of alleles within a population.
• A “population” is a group of organisms of the same species that
  reproduce with each other. There is only one human population: we
  all interbreed.
• The “gene pool” is the collection of all the alleles present within a
  population.
• We are mostly going to look at frequencies of a single gene, but
  population geneticists generally examine many different genes
  simultaneously.
    Allele and Genotype Frequencies
•   Each diploid individual in the population has 2 copies of each gene. The allele
    frequency is the proportion of all the gene copies in the population that are a
    particular allele.
•   The genotype frequency is the proportion of a population that is a particular
    genotype.

•   For example: consider the MN blood group. In a certain population there are 60 MM
    individuals, 120 MN individuals, and 20 NN individuals, a total of 200 people.
•   The genotype frequency of MM is 60/200 = 0.3.
•   The genotype frequency of MN is 120/200 = 0.6
•   The genotype frequency of NN is 20/200 = 0.1

•   The allele frequencies can be determined by adding the frequency of the homozygote
    to 1/2 the frequency of the heterozygote.
•   The allele frequency of M is 0.3 (freq of MM) + 1/2 * 0.6 (freq of MN) = 0.6
•   The allele frequency of N is 0.1 + 1/2 * 0.6 = 0.4

•   Note that since there are only 2 alleles here, the frequency of N is 1 - freq(M).
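•   The MN calculation above can be sketched in a few lines of Python. This is only an
    illustration: the function name and argument names are made up for this example.

```python
def genotype_and_allele_freqs(n_mm, n_mn, n_nn):
    """Return (genotype frequencies, allele frequencies) from genotype counts."""
    total = n_mm + n_mn + n_nn
    f_mm, f_mn, f_nn = n_mm / total, n_mn / total, n_nn / total
    # Each heterozygote carries one copy of each allele,
    # so it contributes half of its frequency to M and half to N.
    freq_m = f_mm + 0.5 * f_mn
    freq_n = f_nn + 0.5 * f_mn
    return (f_mm, f_mn, f_nn), (freq_m, freq_n)


# Counts from the MN blood group example: 60 MM, 120 MN, 20 NN.
genotypes, alleles = genotype_and_allele_freqs(60, 120, 20)
print("genotype frequencies (MM, MN, NN):", genotypes)
print("allele frequencies (M, N):", alleles)
```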
 Heterozygosity and Polymorphism
• A gene is called “polymorphic” if there is more than 1
  allele present in at least 1% of the population. Genes
  with only 1 allele in the population are called
  “monomorphic”. Some genes have 2 alleles: they are
  “dimorphic”.
• In a study of white people from New England, 122
  human genes that produced enzymes were examined.
  Of these, 51 were monomorphic and 71 were
  polymorphic. On the DNA level, a higher percentage of
  genes are polymorphic.

• Heterozygosity is the proportion of heterozygotes in a
  population. Averaged over the 71 polymorphic genes
  mentioned above, the heterozygosity of this population
  of humans was 0.067.
    Hardy-Weinberg Equilibrium
• Early in the 20th century G.H. Hardy and Wilhelm Weinberg
  independently pointed out that under ideal conditions you could
  easily predict genotype frequencies from allele frequencies, at least
  for a diploid sexually reproducing species such as humans.

• For a dimorphic gene (two alleles, which we will call A and a), the
  Hardy-Weinberg equation is the binomial expansion of (p + q)^2:
    p^2 + 2pq + q^2 = 1
  where p = frequency of A and q = frequency of a, with p + q = 1.
• p^2 is the frequency of AA homozygotes
• 2pq is the frequency of Aa heterozygotes
• q^2 is the frequency of aa homozygotes

• H-W can be viewed as an extension of the Punnett square, using
  frequencies other than 0.5 for the gamete (allele) frequencies.
     Hardy-Weinberg Example
• Take our previous example population, where
  the frequency of M was 0.6 and the frequency of
  N was 0.4.
• p^2 = freq of MM = (0.6)^2 = 0.36
• 2pq = freq of MN = 2 * 0.6 * 0.4 = 0.48
• q^2 = freq of NN = (0.4)^2 = 0.16

• These H-W expected frequencies don’t match
  the observed frequencies. We will examine the
  reasons for this soon.
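• As a rough sketch, the comparison between H-W expected and observed
  genotype frequencies can be written in Python. The function name is
  hypothetical; the numbers are the MN blood group values used earlier.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) for allele frequency p."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


expected = hardy_weinberg(0.6)   # frequency of M in the example population
observed = (0.3, 0.6, 0.1)       # observed MM, MN, NN frequencies
for name, exp_freq, obs_freq in zip(("MM", "MN", "NN"), expected, observed):
    print(f"{name}: expected {exp_freq:.2f}, observed {obs_freq:.2f}")
```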
         Rare Alleles and Eugenics
•   A popular idea early in the 20th century was
    “eugenics”, improving the human population through
    selective breeding. The idea has been widely
    discredited, largely due to the evils of “forced
    eugenics” practiced in certain countries before and
    during World War 2. We no longer force “genetically
    defective” people to be sterilized.
•   However, note that “positive eugenics”, encouraging
    people to breed with superior partners, is still
    practiced in places.

•   The problem with sterilizing “defectives” is that most
    genes that produce notable genetic diseases are
    recessive: only expressed in homozygotes. If you
    only sterilize the homozygotes, you are missing the
    vast majority of people who carry the allele.
•   For example, assume that the frequency of an allele
    for a recessive genetic disease is 0.001, a very
    typical figure. Thus p = 0.999 and q = 0.001, so
    p^2 = 0.998, 2pq = 0.002, and q^2 = 0.000001. The
    ratio of heterozygotes (undetected carriers) to
    homozygotes (people with the disease) is about 2000 to 1:
    you are sterilizing only about 1/2000 of the people who
    carry the defective allele. This is simply not a
    workable strategy for improving the gene pool.
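•   The carrier-to-affected ratio is 2pq / q^2, which simplifies to 2p / q, so it
    grows quickly as the allele gets rarer. A small Python sketch (the function
    name is an assumption for illustration, and H-W proportions are assumed):

```python
def carrier_to_affected_ratio(q):
    """Ratio of heterozygous carriers (2pq) to affected homozygotes (q^2)."""
    p = 1.0 - q
    return (2 * p * q) / (q * q)   # algebraically equal to 2p / q


# The rarer the recessive allele, the larger the share hidden in carriers.
for q in (0.1, 0.01, 0.001):
    print(f"q = {q}: about {carrier_to_affected_ratio(q):.0f} carriers per affected person")
```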
Nazi Eugenics

          "The Threat of the Underman. It
          looks like this: Male criminals
          had an average of 4.9 children,
          criminal marriage, 4.4 children,
          parents of slow learners, 3.5
          children, a German family 2.2
          children, and a marriage from
          the educated circles, 1.9
Estimating Allele Frequencies from
Recessive Homozygote Frequency
• If Hardy-Weinberg equilibrium is assumed (an assumption we will
  examine shortly), it is possible to estimate the allele frequencies for
  a gene that shows complete dominance even though heterozygotes
  can’t be distinguished from the dominant homozygotes.
• The frequency of recessive homozygotes is q^2. Thus, the frequency
  of the recessive allele is the square root of this. Very simple.
• For example, the recessive genetic disease PKU has a frequency in
  the population of about 1 in 10,000. q^2 thus equals 0.0001 (10^-4).
  The square root of this is 0.01 (10^-2), which implies that the
  frequency of the PKU allele is 0.01 and the frequency of the normal
  allele is 0.99. Thus the frequency of the heterozygous genotype is 2
  * 0.99 * 0.01 = 0.0198. About 2% of the population is a carrier of the
  PKU allele.

• Note again: this ASSUMES H-W equilibrium, and this assumption is
  not always true.
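• A minimal Python sketch of this estimate, assuming H-W equilibrium as
  the text does. The function name is made up for illustration.

```python
import math


def carrier_frequency(affected_freq):
    """Estimate q and the carrier frequency 2pq from q^2, assuming H-W."""
    q = math.sqrt(affected_freq)   # recessive allele frequency
    p = 1.0 - q                    # dominant allele frequency
    return q, 2 * p * q


# PKU: about 1 affected person in 10,000.
q, carriers = carrier_frequency(1 / 10_000)
print(f"q = {q:.2f}, carrier frequency = {carriers:.4f}")
```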
   Necessary Conditions for Hardy-
       Weinberg Equilibrium
• The relationship between allele frequencies and genotype
  frequencies expressed by the H-W equation only holds if these 5
  conditions are met. None of them is completely realistic, but all are
  met approximately in many populations.
• If a population is not in equilibrium, it takes only 1 generation of
  meeting these conditions to bring it into equilibrium. Once in
  equilibrium, a population will stay there as long as these conditions
  continue to be met.

    –   1. no new mutations
    –   2. no migration in or out of the population
    –   3. no selection (all genotypes have equal fitness)
    –   4. random mating
    –   5. very large population
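• The claim that one generation of random mating restores equilibrium
  can be checked with a short Python sketch. It assumes all 5
  conditions hold, so gametes simply combine at random; the function
  name is hypothetical.

```python
def next_generation(f_AA, f_Aa, f_aa):
    """Genotype frequencies after one round of random mating."""
    p = f_AA + 0.5 * f_Aa          # frequency of A in the gamete pool
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


start = (0.3, 0.6, 0.1)            # not in H-W proportions (p = 0.6)
gen1 = next_generation(*start)     # one generation of random mating
gen2 = next_generation(*gen1)      # a second generation changes nothing
print("generation 1:", gen1)
print("generation 2:", gen2)
```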
    Testing for H-W Equilibrium
• If we have a population where we can distinguish all
  three genotypes, we can use the chi-square test once
  again to see if the population is in H-W equilibrium. The
  basic steps:
   – 1. Count the numbers of each genotype to get the observed
     genotype numbers, then calculate the observed genotype
     frequencies.
   – 2. Calculate the allele frequencies from the observed genotype
     frequencies.
   – 3. Calculate the expected genotype frequencies based on the
     H-W equation, then multiply by the total number of individuals
     to get expected genotype numbers.
   – 4. Calculate the chi-square value using the observed and
     expected genotype numbers.
   – 5. Use 1 degree of freedom (because there are only 2 alleles).
•   Data: 26 MM, 68 MN, 106 NN, with a total population of 200 individuals.
•   1. Observed genotype frequencies:
     –   MM: 26/200 = 0.13
     –   MN: 68/200 = 0.34
     –   NN:106/200 = 0.53
•   2. Allele frequencies:
     –   M: 0.13 + 1/2 * 0.34 = 0.30
     –   N: 0.53 + 1/2 * 0.34 = 0.70
•   3. Expected genotype frequencies and numbers:
     –   MM: p^2 = (0.30)^2 = 0.09 (freq) * 200 = 18
     –   MN: 2pq = 2 * 0.3 * 0.7 = 0.42 (freq) * 200 = 84
     –   NN: q^2 = (0.70)^2 = 0.49 (freq) * 200 = 98
•   4. Chi-square value:
     –   (26 - 18)^2 / 18 + (68 - 84)^2 / 84 + (106 - 98)^2 / 98
     –    = 3.56 + 3.05 + 0.65
     –    = 7.26
•   5. Conclusion: The critical chi-square value for 1 degree of freedom is 3.841. Since
    7.26 is greater than this, we reject the null hypothesis that the population is in Hardy-
    Weinberg equilibrium.
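•   The five steps can be sketched in Python as follows. The function name is an
    assumption for illustration; 3.841 is the critical value quoted above (the
    5% level for 1 degree of freedom).

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic for observed genotype counts vs. H-W expectations."""
    total = n_AA + n_Aa + n_aa
    # Step 2: allele frequencies from the observed genotype counts.
    p = (2 * n_AA + n_Aa) / (2 * total)
    q = 1.0 - p
    # Step 3: expected genotype numbers under H-W.
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_AA, n_Aa, n_aa)
    # Step 4: sum of (observed - expected)^2 / expected.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


CRITICAL_VALUE_1DF = 3.841   # from the text

chi2, expected = hw_chi_square(26, 68, 106)   # MM, MN, NN counts
print(f"expected numbers: {[round(e, 1) for e in expected]}")
print(f"chi-square = {chi2:.2f}, reject H-W: {chi2 > CRITICAL_VALUE_1DF}")
```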
       Relaxing the H-W Conditions:
             Random Mating
•   The fullest meaning of “random mating” implies that any gamete has an
    equal probability of fertilizing any other gamete, including itself. In a sexual
    population, this is impossible because male gametes can only fertilize
    female gametes.
•   More or less random mating in a sexual population is achieved in some
    species of sea urchin, which gather in one place and squirt all of their
    gametes, male and female, out into the open sea. The gametes then find
    each other and fuse together to become zygotes.
•   In animal species, mate selection is far more common than random
    fertilization. A very general rule is “assortative mating”, that like tends to
    mate with like: tall people with tall people, short people with short people,
    etc. This rule is true for externally detectable phenotypes such as
    appearance, but invisible traits like blood groups are usually close to H-W
    equilibrium in the population.
•   Assortative mating is most easily analyzed as a tendency for inbreeding.
    You are more like your relatives than you are like random strangers. Thus
    you are somewhat more likely to mate with a distant relative than would be
    expected by chance alone.
                                        Japanese Blood Type Personality Chart
•   Type A
     – Best traits: Conservative, reserved, patient, punctual, perfectionist and good with plants.
     – Worst traits: Introverted, obsessive, stubborn, self conscious, and uptight
•   Type B
     – Best traits: Creative and passionate. Animal loving. Optimistic and flexible
     – Worst traits: Forgetful, irresponsible, individualist
•   Type AB
     – Best traits: Cool, controlled, rational. Sociable and popular. Empathic
     – Worst traits: Aloof, critical, indecisive and unforgiving
•   Type O
     – Best traits: Ambitious, athletic, robust and self-confident. Natural leaders
     – Worst traits: Arrogant, vain and insensitive. Ruthless
•   Film: “My Boyfriend is Type B”, in Korean, written and directed by Choi Seok-Won
         Measuring Inbreeding
• Recall that inbreeding decreases the number of
  heterozygotes in the population: each generation of
  selfing decreases the number of heterozygotes by 1/2.
• By comparing the number of heterozygotes observed to
  the number expected for a population in H-W
  equilibrium, we can estimate the degree of inbreeding.
• A measure of inbreeding is the “inbreeding coefficient”,
     F = 1 - (obs hets) / (exp hets).
• If F = 0, the observed number of heterozygotes is equal to the
  expected number, meaning that the population is in H-W
  equilibrium.
• If F = 1, there are no heterozygotes, implying a
  completely inbred population.
• Thus, the higher F is, the more inbred the population is.
• Wild oats is a common plant in California, the cause of the golden-
  brown hillsides all summer out there.
• Wild oats can pollinate itself, but the pollen also blows in the wind so
  it can cross fertilize. The task is to estimate the relative proportions
  of these two types of mating.

• Data for the phosphoglucomutase (Pgm) gene:
    – 104 AA, 9 AB, 42 BB = 155 total individuals
• H-W calculations:
    – freq of A = (104 + 1/2 * 9) / 155 = 108.5 / 155 = 0.7
    – freq of B = 1 - freq(A) = 0.3

    –   exp heterozygotes = 2pq = 2 * 0.7 * 0.3 = 0.42 (freq) * 155 = 65.1
    –   F = 1 - (obs hets) / (exp hets) = 1 - 9 / 65.1 = 1 - 0.14
    –   F = 0.86
    –   This is a very inbred population: most matings are self-pollination.
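    –   A short Python sketch of the same calculation, using the formula
        F = 1 - (obs hets) / (exp hets). The function name is hypothetical.

```python
def inbreeding_coefficient(n_AA, n_AB, n_BB):
    """Estimate F from genotype counts by comparing observed and H-W expected heterozygotes."""
    total = n_AA + n_AB + n_BB
    p = (n_AA + 0.5 * n_AB) / total        # frequency of allele A
    q = 1.0 - p                            # frequency of allele B
    expected_hets = 2 * p * q * total      # H-W expected number of heterozygotes
    return 1.0 - n_AB / expected_hets


# Wild oats Pgm data: 104 AA, 9 AB, 42 BB.
F = inbreeding_coefficient(104, 9, 42)
print(f"F = {F:.2f}")
```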
Inbreeding Depression and Genetic Load
•   For most species, including humans, too much inbreeding leads to weak
    and sickly individuals, as seen in this example of mice inbred by
    brother-sister matings.

        gen    litter size    % dead by 4
          0       7.50            3.9
          6       7.14            4.4
         12       7.71            5.0
         18       6.58            8.7
         24       4.58           36.4
         30       3.20           45.5

•   Inbreeding depression is caused by homozygosity of genes that have slight
    deleterious effects. It has been estimated that on average, each
    human carries 3 recessive lethal alleles. These are not expressed
    because they are covered up by dominant wild type alleles. This
    concept is called the “genetic load”.
•   However, it has been argued that some amount of inbreeding is good,
    because it allows the expression of recessive genes with positive effects.
    The level of inbreeding in the US has been estimated (from Roman Catholic
    parish records) at about F = 0.0001, which is approximately equivalent to
    each person mating with a fifth cousin.

• Mutation is unavoidable. It happens as a result of
  radiation in the environment: cosmic rays, radioactive
  elements in rocks and soil, etc., as well as mutagenic
  chemical compounds, both natural and artificially made,
  and just as a chance event inherent in the process of
  DNA replication.
• However, the rate of mutation is quite low: for any given
  gene, about 1 copy in 10^4 to 10^6 is a new mutation.
• Mutations provide the necessary raw material for
  evolutionary change, but by themselves new mutations
  do not have a measurable effect on allele or genotype
  frequencies.
• Migration is the movement of individuals in or
  out of a population. Migration is necessary to
  keep a species from fragmenting into several
  different species. Even as low a level as one
  individual per generation moving between
  populations is enough to keep a species unified.
• Migration can be thought of as combining two
  populations with different allele frequencies and
  different numbers together into a single
  population. After one generation of random
  mating, the combined population will once again
  be in H-W equilibrium.
                 Migration Examples
•   Population X has 20 individuals with frequency of the A allele = 0.8.
    Population Y has 10 individuals with frequency of the A allele = 0.2. The
    two populations mix. What is the frequency of A in the final population?
•   There are 20 + 10 = 30 individuals in the final population, for a total of 60
    copies of the gene.
     – For population X, 40 * 0.8 = 32 copies are A, and 8 are a.
     – For population Y, 20 * 0.2 = 4 copies are A, and 16 are a.
     – Adding these together, the final population has 32 + 4 = 36 A alleles and 8 + 16 =
       24 a alleles. Out of 60 alleles, the frequency of A is 36/60 = 0.6
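•   The same bookkeeping works for any number of merging populations: weight each
    allele frequency by the number of gene copies it represents. A minimal Python
    sketch, with a made-up function name:

```python
def mixed_allele_frequency(populations):
    """Frequency of A after merging populations.

    populations: list of (number of diploid individuals, frequency of A).
    """
    a_copies = sum(2 * n * freq for n, freq in populations)   # copies of A
    total_copies = sum(2 * n for n, _ in populations)         # all gene copies
    return a_copies / total_copies


# Population X: 20 individuals, freq(A) = 0.8; population Y: 10 individuals, freq(A) = 0.2.
print(mixed_allele_frequency([(20, 0.8), (10, 0.2)]))
```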

•   A real example: African Americans have a large proportion of African
    ancestry, but also some European ancestry. The Duffy blood group has an
    allele with a frequency of 0 among West African populations, and an
    average frequency of 0.43 among European populations. Other blood
    groups can also be used in this technique: very little assortative mating
    occurs on the basis of blood group.
     – In Oakland CA, African-Americans are reported to have about 22% European
       ancestry.
     – In Charleston South Carolina, the proportion is about 3.7%
• Selection is the primary factor driving evolution. Genes that confer
  increased fitness tend to take over a population. Note that random
  events also play a big role: sometimes a “good” gene is lost due to
  chance events. Also, a gene that confers increased fitness in one
  environment may confer decreased fitness in another environment.
• Selection can occur at many places in the life cycle: the embryo
  might be defective, the fetus might not survive to birth, the immature
  offspring might be killed, the individual might not be able to find a
  mate or might be sterile.
• We will simplify all of this by assuming that the gametes are
  produced at random and combine at random, to produce a
  population of zygotes in H-W equilibrium. Then, we will apply
  selection to the zygotes, killing off different proportions of the
  different genotypes.
• Fitness is a function of the genotype. We will define the “relative
  fitness” of the best genotype as equal to 1.0, and the fitnesses of the
  two other genotypes as equal to or less than 1.
        Selection Against Recessive
•   This situation is what happens with a recessive genetic disease.
    Heterozygotes and dominant homozygotes are indistinguishable and have
    the same relative fitness: 1.0. The recessive homozygote has the genetic
    disease and a fitness less than 1. The exact fitness depends on the nature
    of the disease.
•   Start with a population where p = 0.6 and q = 0.4, and assume that the aa
    homozygote has a relative fitness of 0.1 (i.e. 90% of the aa offspring die
    without reproducing).
•   The zygotes produced (in H-W equilibrium) are 0.36 AA, 0.48 Aa, and 0.16 aa.
•   Selection on the zygotes reduces the aa’s by 90%, to 0.016.
•   However, proportions must add to 1.0, so we divide each proportion by a
    correction factor. The correction factor is the sum of the remaining
    proportions: 0.36 + 0.48 + 0.016 = 0.856.
•   So, after selection, the frequency of AA is 0.36 / 0.856 = 0.42. The
    frequency of Aa is 0.48 / 0.856 = 0.56. The frequency of aa is 0.016 / 0.856
    = 0.019.
•   Final allele frequencies: A = 0.42 + 1/2 * 0.56 = 0.70. a = 1 - freq(A) = 0.3.
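•   Repeating this calculation generation after generation shows how the recessive
    allele declines. The sketch below is only an illustration of the procedure
    described above (H-W zygotes, then selection, then renormalizing); the function
    name and the choice of 5 generations are assumptions.

```python
def select_once(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection on zygotes."""
    q = 1.0 - p
    # Zygotes in H-W proportions, each weighted by its relative fitness.
    AA = p * p * w_AA
    Aa = 2 * p * q * w_Aa
    aa = q * q * w_aa
    correction = AA + Aa + aa            # the "correction factor" in the text
    return (AA + 0.5 * Aa) / correction


p = 0.6                                  # starting frequency of A
for generation in range(1, 6):
    p = select_once(p, 1.0, 1.0, 0.1)    # aa has relative fitness 0.1
    print(f"generation {generation}: freq(A) = {p:.3f}, freq(a) = {1 - p:.3f}")
```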
           Selection Favoring the Heterozygote
• Some genes maintain 2 alleles in the population by
  having the heterozygote more fit than either homozygote.
• An example is HbS, the sickle cell hemoglobin allele. In
  rural West Africa, where malaria is endemic and medical
  support is rudimentary, the relative fitness of the HbA
  homozygote is estimated at 0.85, due to susceptibility to
  malaria. The relative fitness of the HbS homozygote is
  estimated at approximately 0, with almost none reaching
  reproductive age due to sickle cell disease. The
  heterozygote is the most fit, so it is given a relative fitness
  of 1.0. Under these conditions, it is possible to predict
  an equilibrium frequency of the HbS allele of about 0.13.
  This is approximately what is seen in various West
  African countries.
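• The predicted equilibrium can be reproduced with the standard result
  for heterozygote advantage, which is not spelled out in the text
  above: if the two homozygotes have fitnesses 1 - s and 1 - t and the
  heterozygote has fitness 1, the allele selected against with
  strength t settles at a frequency of s / (s + t). A Python sketch
  with a hypothetical function name:

```python
def hbs_equilibrium(w_AA, w_SS):
    """Equilibrium frequency of HbS when the heterozygote has relative fitness 1."""
    s = 1.0 - w_AA     # selection coefficient against HbA homozygotes
    t = 1.0 - w_SS     # selection coefficient against HbS homozygotes
    return s / (s + t)


# Fitness estimates from the text: HbA homozygote 0.85, HbS homozygote about 0.
q_hat = hbs_equilibrium(0.85, 0.0)
print(f"equilibrium frequency of HbS = {q_hat:.2f}")
```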
                        Genetic Drift
• Genetic drift is random change in allele frequencies. Genetic
  drift occurs in all populations, but it has a major effect on small
  populations.
• For Darwin and the neo-Darwinians, selection was the only force
  that had a significant effect on evolution. More recently it has been
  recognized that random changes, genetic drift, can also significantly
  influence evolutionary change. It is thought that most major events
  occur in small isolated populations.

• Simple example: A population of 1 female and 2 males, where the
  female chooses only 1 male to mate with. Assume that the female
  has the Aa genotype, male #1 is AA, and male #2 is aa.
    – initially the allele frequencies are 0.5 A and 0.5 a
    – if male #1 gets to mate, the offspring will be 0.75 A and 0.25 a
    – if male #2 mates, the offspring will be 0.25 A and 0.75 a.
                  Fixation of Alleles
•   Genetic drift causes allele
    frequencies to fluctuate randomly
    each generation. However, if the
    frequency of an allele ever
    reaches zero, it is permanently
    eliminated from the population.
    The other allele, whose frequency
    is now 1.0, is “fixed”, which means
    that all individuals in the
    population will be homozygous for
    that allele. This continues for all
    future generations (in the absence
    of mutation).
•   The average rate at which alleles
    become fixed is a function of the
    population size. The larger the
    population, the longer it takes for
    fixation to occur.
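•   Drift and fixation can be explored with a small simulation. The sketch below
    assumes a simple model that is not described in the text: each generation, every
    gene copy is drawn at random from the previous generation's pool, with no
    mutation, migration, or selection. All names are made up for illustration.

```python
import random


def simulate_drift(p0, n_individuals, generations, rng):
    """Return the frequency of allele A in each generation, starting from p0."""
    n_copies = 2 * n_individuals             # diploid: 2 gene copies per individual
    freqs = [p0]
    p = p0
    for _generation in range(generations):
        # Draw each gene copy of the next generation at random from the current pool.
        count_a = sum(1 for _copy in range(n_copies) if rng.random() < p)
        p = count_a / n_copies
        freqs.append(p)                      # once p hits 0 or 1 it stays there
    return freqs


rng = random.Random(1)                       # fixed seed so runs are repeatable
for size in (10, 1000):
    final = simulate_drift(0.5, size, 50, rng)[-1]
    print(f"{size} individuals: freq(A) after 50 generations = {final:.3f}")
```

    Running it with different seeds and population sizes gives a feel for how much
    faster small populations wander toward 0 or 1.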
    Population Bottlenecks and
         Founder Effect
• Bottlenecks and the founder effect are closely
  related phenomena.
• Founder effect: If a small group of individuals
  leaves a larger population and develops into a
  separate, isolated population, the allele
  frequencies in the new population are
  determined by the allele frequencies in the
  founders. Since these frequencies are probably
  different from those found in the general
  population, the new population will have a
  different set of frequencies.
• This is especially true for rare alleles, which can
  suddenly become prominent if one of the
  founders has the rare allele.
           Founder Effect Example
•   Founder effect example: the Amish are
    a group descended from 30 Swiss
    founders who renounced technological
    progress. Most Amish mate within the
    group. One of the founders had Ellis-
    van Creveld syndrome, which causes
    short stature, extra fingers and toes,
    and heart defects. Today about 1 in
    200 Amish are homozygous for this
    syndrome, which is very rare in the
    larger US population.
•   Note the effect inbreeding has here:
    the problem comes from this recessive
    condition becoming homozygous due
    to the mating of closely related people.
• A population bottleneck is essentially the same phenomenon as the
  founder effect, except that in a bottleneck, the entire species is
  wiped out except for a small group of survivors. The allele
  frequencies in the survivors determine the allele frequencies in the
  population after it grows large once again.
• Example: Pingelap atoll is an island in the South Pacific. A typhoon
  in 1780 killed all but 30 people. One of the survivors was a man who
  was heterozygous for the recessive genetic disease achromatopsia.
  This condition causes complete color blindness. Today the island
  has about 2000 people on it, nearly all descended from these 30
  survivors. About 10% of the population is homozygous for
  achromatopsia. Assuming H-W equilibrium, this implies an allele
  frequency of about 0.32 (the square root of 0.10).
                Human Bottleneck
• The human population is
  thought to have gone through
  a population bottleneck about
  100,000 years ago. There is
  more genetic variation among
  chimpanzees living within 30
  miles of each other in central
  Africa than there is in the entire
  human species.
• The tree represents mutational
  differences in mitochondrial
  DNA for various members of
  the Great Apes (including