					 Bootstrapping (non-parametric)

• Bootstrapping is a modern statistical technique that uses
  computer-intensive random resampling of data to estimate
  the sampling error or confidence intervals of some
  estimated parameter
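
As a rough illustration of the general idea (outside phylogenetics), the sketch below computes a percentile bootstrap confidence interval for a mean. The function name `bootstrap_ci`, the sample values and the choice of the percentile method are assumptions made for this example, not part of the original slides.

```python
import random
import statistics

def bootstrap_ci(data, estimator=statistics.mean, n_replicates=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Re-estimate the parameter on many samples drawn with replacement,
    # each the same size as the original data.
    estimates = sorted(
        estimator(rng.choices(data, k=n)) for _ in range(n_replicates)
    )
    lower = estimates[int((alpha / 2) * n_replicates)]
    upper = estimates[int((1 - alpha / 2) * n_replicates) - 1]
    return lower, upper

# Hypothetical measurements, used only to exercise the function.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 5.6]
low, high = bootstrap_ci(sample)
print(f"mean = {statistics.mean(sample):.2f}, 95% CI ~ ({low:.2f}, {high:.2f})")
```

The spread of the re-estimated means stands in for the sampling error that would otherwise require collecting many independent samples.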
 Bootstrapping (non-parametric)
• Characters are resampled with replacement
  to create many bootstrap replicate data sets
• Each bootstrap replicate data set is analysed
  (e.g. with parsimony, distance, ML)
• Agreement among the resulting trees is
  summarized with a majority-rule consensus
  tree
• Frequency of occurrence of groups,
  bootstrap proportions (BPs), is a
  measure of support for those groups
• Additional information is given in partition
  tables
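
The counting step behind bootstrap proportions can be sketched as follows. This assumes each replicate tree has already been inferred and is represented simply as the set of groups (clades) it contains; the representation, the function name `bootstrap_proportions` and the four toy replicates are hypothetical.

```python
from collections import Counter

def bootstrap_proportions(replicate_trees):
    """Return {group: percentage of replicate trees that contain the group}."""
    counts = Counter()
    for groups in replicate_trees:
        counts.update(set(groups))  # count each group at most once per tree
    n = len(replicate_trees)
    return {group: 100.0 * count / n for group, count in counts.items()}

# Hypothetical replicate trees, each given as the set of groups it contains.
AB, CD, ABC = frozenset("AB"), frozenset("CD"), frozenset("ABC")
replicates = [{AB, ABC}, {AB, CD}, {AB, ABC}, {AB, ABC}]

bps = bootstrap_proportions(replicates)
for group, bp in sorted(bps.items(), key=lambda item: -item[1]):
    print("".join(sorted(group)), f"{bp:.0f}%")
```

A table of groups and their frequencies produced this way is what the slides call a partition table.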
                             Bootstrapping

Original data matrix            Resampled data matrix
      Characters                      Characters
Taxa  1 2 3 4 5 6 7 8           Taxa  1 2 2 5 5 6 6 8
A     R R Y Y Y Y Y Y           A     R R R Y Y Y Y Y
B     R R Y Y Y Y Y Y           B     R R R Y Y Y Y Y
C     Y Y Y Y Y R R R           C     Y Y Y Y Y R R R
D     Y Y R R R R R R           D     Y Y Y R R R R R
Outgp R R R R R R R R           Outgp R R R R R R R R

Randomly resample characters from the original data with
replacement to build many bootstrap replicate data sets of the
same size as the original - analyse each replicate data set

Summarise the results of multiple analyses with a
majority-rule consensus tree

Bootstrap proportions (BPs) are the frequencies with which
groups are encountered in analyses of replicate data sets

[Figure: trees for the original and resampled data matrices, with character numbers marked on the branches, and the majority-rule consensus tree of A, B, C, D and the outgroup, labelled with bootstrap proportions of 96% and 66%]
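
The resampling step on this slide can be sketched in a few lines. The data matrix is the one shown on the slide; the function names `take_columns` and `bootstrap_replicate` and the dictionary layout are assumptions made for this example. Column indices are 0-based in the code, whereas the slide numbers characters from 1.

```python
import random

# Data matrix from the slide: one string of character states per taxon.
matrix = {
    "A": "RRYYYYYY",
    "B": "RRYYYYYY",
    "C": "YYYYYRRR",
    "D": "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}

def take_columns(matrix, columns):
    """Build a data matrix from the given 0-based character indices."""
    return {taxon: "".join(states[i] for i in columns)
            for taxon, states in matrix.items()}

def bootstrap_replicate(matrix, rng):
    """Sample characters with replacement; the replicate has the original size."""
    n_chars = len(next(iter(matrix.values())))
    # Sorting is only for readability; character order does not matter here.
    columns = sorted(rng.randrange(n_chars) for _ in range(n_chars))
    return columns, take_columns(matrix, columns)

# The particular draw shown on the slide: characters 1 2 2 5 5 6 6 8.
slide_draw = [0, 1, 1, 4, 4, 5, 5, 7]
for taxon, states in take_columns(matrix, slide_draw).items():
    print(f"{taxon:<6}{states}")

# A fresh random replicate.
columns, replicate = bootstrap_replicate(matrix, random.Random(1))
```

Each replicate would then be passed to whatever tree-building method is in use (parsimony, distance, ML), which is outside the scope of this sketch.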
           Bootstrapping - an example
Ciliate SSUrDNA - parsimony bootstrap

Taxa: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Loxodes (4),
Tracheloraphis (5), Spirostomum (6), Gruberia (7), Euplotes (8),
Tetrahymena (9)

Partition Table
123456789    Freq
-----------------
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*    1.00

[Figure: majority-rule consensus tree. Groups and BPs: (2,3) 100; (4,5) 100; (6,7) 100; (4,5,6,7) 100; (8,9) 84; (4-9) 96]
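
To show how the partition table relates to the consensus tree, the sketch below reads the table from this slide and keeps the groups found in more than 50% of replicates, which are the groups a majority-rule consensus tree contains. The function name `majority_rule_groups` and the list-of-tuples layout are assumptions made for this example.

```python
# Taxon numbering and partition table as given on the slide.
# Column i of each pattern refers to taxon i; '*' marks membership of the group.
taxa = ["Ochromonas", "Symbiodinium", "Prorocentrum", "Loxodes", "Tracheloraphis",
        "Spirostomum", "Gruberia", "Euplotes", "Tetrahymena"]

partition_table = [
    (".**......", 100.00),
    ("...**....", 100.00),
    (".....**..", 100.00),
    ("...****..", 100.00),
    ("...******", 95.50),
    (".......**", 84.33),
    ("...****.*", 11.83),
    ("...*****.", 3.83),
    (".*******.", 2.50),
    (".**....*.", 1.00),
    (".**.....*", 1.00),
]

def majority_rule_groups(partition_table, taxa, threshold=50.0):
    """Return (members, frequency) for every group above the threshold."""
    groups = []
    for pattern, freq in partition_table:
        if freq > threshold:
            members = [taxa[i] for i, mark in enumerate(pattern) if mark == "*"]
            groups.append((members, freq))
    return groups

groups = majority_rule_groups(partition_table, taxa)
for members, freq in groups:
    print(f"{freq:6.2f}  {' + '.join(members)}")
```

Groups that each occur in more than half of the replicates cannot conflict with one another, so they can always be combined into a single tree.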
  Bootstrapping - random data
Randomly permuted data - parsimony bootstrap

Partition Table
123456789    Freq
-----------------
.*****.**   71.17
..**.....   58.87
....*..*.   26.43
.*......*   25.67
.***.*.**   23.83
...*...*.   21.00
.*..**.**   18.50
.....*..*   16.00
.*...*..*   15.67
.***....*   13.17
....**.**   12.67
....**.*.   12.00
..*...*..   12.00
.**..*..*   11.00
.*...*...   10.80
.....*.**   10.50
.***.....   10.00

[Figure: two trees captioned "Majority-rule consensus (with minority components)". Support values on the first tree: 71, 59; on the second: 71, 59, 26, 21, 16, 16. Taxa: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Euplotes, Tetrahymena, Gruberia]
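
The slide does not say how the data were permuted. One common way to produce such a data set, assumed here purely for illustration, is to shuffle the states of each character among the taxa independently, which keeps each character's state frequencies but removes any shared hierarchical signal. The function name `permute_within_characters` is hypothetical, and the toy input is the small matrix from the earlier slide, not the ciliate data.

```python
import random

def permute_within_characters(matrix, rng):
    """Shuffle each character's states among taxa, independently per character."""
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    rows = {taxon: [] for taxon in taxa}
    for i in range(n_chars):
        column = [matrix[taxon][i] for taxon in taxa]
        rng.shuffle(column)  # reassign this character's states to taxa at random
        for taxon, state in zip(taxa, column):
            rows[taxon].append(state)
    return {taxon: "".join(states) for taxon, states in rows.items()}

# Toy input (assumption: reuses the small example matrix, not the SSUrDNA data).
toy = {
    "A": "RRYYYYYY",
    "B": "RRYYYYYY",
    "C": "YYYYYRRR",
    "D": "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}
permuted = permute_within_characters(toy, random.Random(0))
for taxon, states in permuted.items():
    print(f"{taxon:<6}{states}")
```

Bootstrapping a matrix permuted in this way would be expected to give mostly low BPs spread over many conflicting groups, as in the partition table above.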
        Bootstrap - interpretation
• Bootstrapping was introduced as a way of establishing
  confidence intervals for phylogenies
• This interpretation of bootstrap proportions (BPs) depends on
  the assumption that the original data are a random sample from
  a much larger set of independent and identically distributed
  data
• However, several things complicate this interpretation
   - Perhaps the assumptions are unreasonable - making any
     statistical interpretation of BPs invalid
   - Some theoretical work indicates that BPs are very
     conservative, and may underestimate confidence - the
     problem increases with the number of taxa
   - BPs can be high for incongruent relationships in separate
     analyses - and can therefore be misleading (misleading data
     -> misleading BPs)
   - With parsimony, BPs may be strongly affected by the inclusion
     or exclusion of only a few characters
      Bootstrap - interpretation
• Bootstrapping is a very valuable and widely
  used technique - it (or some suitable
  alternative) is demanded by some journals, but it
  may require a pragmatic interpretation:
• BPs depend on two aspects of the support for a
  group - the number of characters supporting the
  group and the level of support for incongruent
  groups
• BPs thus provide an index of the relative
  support for groups provided by a set of data
  under whatever interpretation of the data
  (method of analysis) is used
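
The first of these two aspects can be illustrated with a small calculation. It rests on a strong simplifying assumption that is not in the slides: a group is taken to be recovered whenever at least one of its k supporting characters is drawn into the replicate, and no character conflicts with it. Under that assumption the chance of recovering the group is the chance that n draws with replacement from n characters include at least one of the k. The function name `prob_at_least_one` is hypothetical.

```python
def prob_at_least_one(k, n):
    """Probability that n draws with replacement from n characters
    include at least one of k particular characters."""
    return 1.0 - (1.0 - k / n) ** n

n_characters = 100  # hypothetical data set size
for k in (1, 2, 3):
    p = prob_at_least_one(k, n_characters)
    print(f"{k} supporting character(s): about {100 * p:.0f}% of replicates")
```

With 100 characters this gives about 63%, 87% and 95% for one, two and three supporting characters. Conflicting characters, the second aspect, would lower these figures, and this sketch does not model them.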

				