
Speciation: hybrid zones and outliers


Detecting selection using genome scans
Roger Butlin
University of Sheffield
Nielsen R (2005) Molecular signatures of natural selection. Annu. Rev.
Genet. 39, 197–218.

What signatures does selection leave in the genome?

1.   Population differentiation – today’s focus!
2.   Frequency spectrum, e.g. Tajima’s D
3.   Selective sweeps
4.   Haplotype structure (linkage disequilibrium)
5.   McDonald-Kreitman tests (or PAML over long time-scales)
Frequency distribution:

   From Nielsen (2005): frequency of derived allele in a sample of 20 alleles.

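   Tajima’s D = (π - S/a1)/sd, where S is the number of segregating sites and
   a1 = 1 + 1/2 + ... + 1/(n-1) for a sample of n alleles; negative D indicates an
   excess of rare variants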
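As a rough illustration of the statistic, the sketch below computes Tajima's D from the derived-allele counts at each segregating site, using the standard variance constants from Tajima (1989). The function names and the toy inputs are assumptions made for this example, not part of the original slides.

```python
import math


def nucleotide_diversity(derived_counts, n):
    """Mean number of pairwise differences (pi), summed over segregating sites.

    derived_counts: derived-allele count (1..n-1) at each segregating site.
    n: number of sampled alleles.
    """
    return sum(2.0 * k * (n - k) / (n * (n - 1)) for k in derived_counts)


def tajimas_d(derived_counts, n):
    """Tajima's D = (pi - S/a1) / sqrt(e1*S + e2*S*(S-1))."""
    s = len(derived_counts)  # number of segregating sites
    if s == 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = nucleotide_diversity(derived_counts, n)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): 10 segregating sites in a sample of 20 alleles.
print(tajimas_d([1] * 10, 20))   # all singletons: rare variants dominate
print(tajimas_d([10] * 10, 20))  # all at intermediate frequency
```

With only singletons, π falls below S/a1 and D is negative; with only intermediate-frequency variants, D is positive.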
Selective sweep:
Extended haplotype homozygosity (Sabeti et al. 2002)
McDonald-Kreitman and related tests

          dN = replacement changes per replacement site
          dS = silent changes per silent site

          dN/dS = 1 - neutral

          dN/dS < 1 - conserved (purifying selection)

          dN/dS > 1 - adaptive evolution (positive selection)
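The ratio defined above is simple arithmetic once changes and sites have been counted. The sketch below assumes those counts are already available (counting them properly requires a codon model, which is not shown); the function name and the numbers are hypothetical.

```python
def dn_ds(replacement_changes, replacement_sites, silent_changes, silent_sites):
    """Return dN/dS from raw counts of changes and sites.

    dN = replacement changes per replacement site
    dS = silent changes per silent site
    """
    dn = replacement_changes / replacement_sites
    ds = silent_changes / silent_sites
    if ds == 0:
        raise ValueError("dN/dS is undefined when dS is zero")
    return dn / ds


# Hypothetical gene: 5 replacement changes over 300 replacement sites,
# 20 silent changes over 100 silent sites.
ratio = dn_ds(5, 300.0, 20, 100.0)
print(ratio)  # well below 1, the pattern expected under purifying selection
```

A ratio on its own is only descriptive; deciding whether it differs from 1 needs a statistical test such as those implemented in PAML.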
Selection on phenotypic traits:
Association analysis
Candidate genes
Genome scans
(aka ‘Outlier analysis’)
Littorina saxatilis – locally adapted morphs

What signatures of selection might we look for?

Thornwick Bay
Signatures of selection:

    Departure from HWE
    Low diversity (selective sweep)
    Frequency spectrum tests
    High divergence
    Elevated proportion of non-synonymous substitutions
Neutral loci
Stabilizing selection
Local adaptation
Charlesworth et al. 1997 (from Nosil et al. 2009)
A concrete example: adaptation to altitude in Rana temporaria (Bonin et al. 2006)

    High – 2000m
    Intermediate – 1000m
    Low – 400m

    190 individuals
    392 AFLP bands
Generating the expected distribution

DetSel – Vitalis et al. 2001
    [Model diagram; labels: Ne, N0, N2, t0, t]
    F1,2 – measure of divergence of population 1,2 from population 2,1

Dfdist – Beaumont & Nichols 1996
    [Model diagram; labels: populations of size N, migration m]
    FST – symmetrical population differentiation, as a function of heterozygosity
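To make the Dfdist quantities concrete, here is a minimal sketch of the two numbers plotted for each locus: total heterozygosity and FST. It assumes a codominant biallelic locus with known allele frequencies and equally weighted populations. AFLP bands are dominant, so frequencies would first have to be estimated, and the neutral envelope itself comes from simulation, which is not attempted here. The function name is hypothetical.

```python
def locus_fst(freqs):
    """Return (H_T, F_ST) for one biallelic locus.

    freqs: frequency of one allele in each population (equal weights assumed).
    F_ST = (H_T - H_S) / H_T, or None when the locus is monomorphic overall.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_t = 2.0 * p_bar * (1.0 - p_bar)                   # total heterozygosity
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / k   # mean within-population
    if h_t == 0:
        return h_t, None
    return h_t, (h_t - h_s) / h_t


# Hypothetical allele frequencies in a low- and a high-altitude population.
for freqs in ([0.5, 0.5], [0.2, 0.8], [0.0, 1.0]):
    print(freqs, locus_fst(freqs))
```

An outlier analysis would then ask whether a locus has an FST beyond the simulated neutral quantile for its heterozygosity.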

Does the structure/history matter?

[Figure: DetSel and Dfdist outlier plots with 95% CI for the ‘Low 1’ vs ‘High 1’ comparison]
                                  DetSel   Dfdist         Both   Interpretation

   Monomorphic in one             35       N/A                   Unreliable outliers
   Significant in one             14       29                    False positives
   Significant in comparisons     3        11                    Local effects
     involving one population
   Significant in at least 2      2        3              1      Adaptation to altitude
     comparisons
   Significant in global                   6 (2 at 99%)          Adaptation to altitude
     comparison across altitudes

392 AFLPs; 12 pairwise comparisons across altitude, or 3 altitude categories; 95% cut-off
343 loci   8 loci
Outliers and selected traits

Rogers and Bernatchez (2007), Coregonus clupeaformis (lake whitefish):
     Dwarf x Normal cross, then both backcrosses
     Measure ‘adaptive’ traits (9)
     QTL map (>400 AFLP plus microsatellites)
     Homologous AFLP in 4 natural sympatric population pairs
     Outlier analysis (forward simulation based on Winkle)

                       Homologous   Outlier AFLP in    Outlier within QTL
                       AFLP         homologous set*    (based on 1.5 LOD support)

     Hybrid x Dwarf    180          19                 9 (3.6 expected, P=0.0015)
     Hybrid x Normal   131          8                  4 (0.5 expected, P=0.0002)

     *Only 3 outliers shared between lakes
Nosil et al. 2009 review of 14 studies:

     1.   0.5 – 26% outliers, most studies 5-10%
     2.   1 - 5% outliers replicated in pair-wise comparisons
     3.   25 - 100% of outliers specific to habitat comparisons
     4.   No consistent pattern for EST-associated loci
     5.   LD among outliers typically low

     But many methodological differences between studies
         Population sampling
         Marker type
         Analysis type and options
         Statistical cut-offs
Environmental correlations

    SAM – Joost et al. 2007

    IBA – Nosil et al. 2007

         FST for each locus correlated with ‘adaptive distance’, controlling for geographic
         distance (partial Mantel test)
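The partial Mantel test mentioned above can be sketched in plain Python. This is a simplified illustration, not the procedure of any particular package: it takes the partial correlation between the FST and adaptive-distance matrices given geographic distance, then permutes the population labels of the FST matrix to obtain a one-sided p-value. All names and the toy matrices are assumptions.

```python
import math
import random


def upper(matrix):
    """Flatten the upper triangle (i < j) of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(a, b, c):
    """Correlation of a and b after controlling for c."""
    rab, rac, rbc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (rab - rac * rbc) / math.sqrt((1 - rac ** 2) * (1 - rbc ** 2))


def partial_mantel(fst, adaptive, geographic, n_perm=999, seed=1):
    """Return (observed partial r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(fst)
    b, c = upper(adaptive), upper(geographic)
    observed = partial_r(upper(fst), b, c)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)  # relabel populations: rows and columns together
        permuted = [[fst[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if partial_r(upper(permuted), b, c) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Toy example (hypothetical): six populations along a transect, alternating habitat.
position = [0, 1, 2, 3, 4, 5]
habitat = [0, 1, 0, 1, 0, 1]
geo = [[abs(p - q) for q in position] for p in position]
adapt = [[abs(h - g) for g in habitat] for h in habitat]
fst = [[0.01 * geo[i][j] + 0.05 * adapt[i][j] for j in range(6)] for i in range(6)]
print(partial_mantel(fst, adapt, geo))
```

In an IBA-style analysis this would be repeated for each locus, with the usual caution about multiple testing.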
Methodological improvements – Bayesian approaches

    BayesFst – Beaumont & Balding 2004
    Bayescan – Foll & Gaggiotti 2008

    For each locus i and population j we have an FST measure,
    relative to the ‘ancestral’ population, Fij

    Then decompose into locus and population components:

    log(Fij/(1-Fij)) = αi + βj
    αi is the locus effect
      – 0 neutral, +ve divergent selection, -ve balancing selection
    βj is the population effect

    Assuming a Dirichlet distribution of allele frequencies among
    subpopulations, can estimate αi and βj by MCMC

    In Bayescan, also explicitly test αi = 0
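The decomposition is a logistic model, so the effect of a locus term on FST can be seen by inverting it. The sketch below only illustrates that arithmetic; it does not perform the MCMC estimation, and the function names and parameter values are hypothetical.

```python
import math


def logit(f):
    """log(F / (1 - F)) for 0 < F < 1."""
    return math.log(f / (1.0 - f))


def fst_from_effects(alpha_i, beta_j):
    """Invert log(Fij / (1 - Fij)) = alpha_i + beta_j to recover Fij."""
    return 1.0 / (1.0 + math.exp(-(alpha_i + beta_j)))


beta = -2.0  # hypothetical population effect shared by all loci
for alpha in (-1.5, 0.0, 1.5):  # balancing, neutral, divergent (hypothetical)
    print(alpha, round(fst_from_effects(alpha, beta), 3))
```

For a fixed population effect, a positive locus effect raises the expected FST above the neutral value and a negative one lowers it, which is why the sign of αi is read as divergent versus balancing selection.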
Apparently much greater power to detect balancing selection than FDIST
Lower false positive rate
Wider applicability
Methodological improvements – hierarchical structure

    Arlequin – Excoffier et al. 2009
Circles – simulated STR data, grey – null distribution
 Bayenv – Coop et al. 2010

     Estimates variance-covariance matrix of allele frequencies then tests for
     correlations with environmental variables (or categories).

     Software available at:

Multiple analyses? Candidate vs control? E.g. Shimada et al. 2010
Hohenlohe et al. 2010
Mäkinen et al. 2008

7 populations
3 marine, 4 freshwater

103 STR loci
Analysed by BayesFst
(and LnRH)

5 under directional selection
(3 in Eda locus)

15 under balancing selection

Used as a test case by Excoffier et al. (2009):
     2 directional
     3 balancing
Can we replicate these results?


    Stickleback_allele.txt – input file
    Output_fst.txt – view with R routine plot_Bayescan


    Stickleback_data_standard.arp – IAM (infinite alleles model)
    Stickleback_data_repeat.arp – SMM (stepwise mutation model)

    Run using Arlequin3.5

    Try hierarchical and island models, maybe different hierarchies
Sympatric speciation?
   FST distribution as evidence of speciation with gene flow

   Savolainen et al. (2006) – Howea palms

   Cf. Gavrilets and Vose (2007)
         • few loci underlying key traits
         • intermediate selection
         • initial environmental effect on phenology
