You will go through the process of
  science, learning how statistics is
  applied to a study of saguaros.
OBSERVATION
     Saguaros seem to occur in different
      numbers on the north versus south slope
      of Gates Pass.



[Photo: Gates Pass. Source: www.gamineral.org/t04-gates_pass.html]
RESEARCH QUESTION
 Based on the observation that saguaros
  seem to occur in different numbers on the
  north and south slopes of Gates Pass:
  What descriptive question (versus causal
  question) could you ask?
 Descriptive: Is saguaro density affected by
  whether the slope faces north versus south?
    Only requires a count to answer.
   Causal: Why is saguaro density affected by
    whether the slope faces north versus south?
    Requires controlled studies of many factors to answer.
LITERATURE REVIEW
   Not needed to come up with the multiple
    hypotheses for this question because it is
    a descriptive question – there either is not
    an effect or there is an effect of slope
    direction on saguaro density.
MULTIPLE HYPOTHESES
First, what is the null hypothesis (H0) – the
one that states that there is not a cause
and effect relationship?
 H0: Saguaro density is not affected by
   whether the slope faces north versus
   south.
 Second, what are the alternative hypotheses
 (H1 and H2 in this case)?
  H1: Saguaro density is greater on the
    north-facing slope.
  H2: Saguaro density is greater on the
    south-facing slope.
DEDUCTIONS
 What evidence would you need to be convinced
  each hypothesis is correct or incorrect?
 In other words, by how much would the
  densities have to differ for you to be convinced
  that the direction of the slope affects saguaro
  density?
 We will come back to this later…but in “real life”
  you are supposed to come up with deductions
  before collecting any data.
TESTS: Three Data Sets
              North Slope   South Slope
 Data set A       99            101
 Data set B       90            110
 Data set C       80            120
Imagine that these are three possible sets of data that
you could have collected by counting saguaros on a
north and south slope. Note that I have kept the total
number of saguaros the same (200) for each data set.
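TESTS: THREE EXAMPLES
[Bar graph: the same three data sets, showing the north slope and south slope counts side by side.]
Here are the same data displayed in a bar graph.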
TENTATIVE CONCLUSION
   For each data set (A-C) did the evidence convince
    you that the differences in densities were
    significant enough to warrant ruling out the null
    hypothesis (that the distribution of saguaros on the
    two slopes was just random) and tentatively
    concluding that slope did affect saguaro density?
TENTATIVE CONCLUSION
 Perhaps it would help to know what the chance is
  that the distribution is random?
 p = probability this would happen randomly

              North Slope   South Slope      p
 Data set A       99            101          ?%
 Data set B       90            110          ?%
 Data set C       80            120          ?%
TENTATIVE CONCLUSION
    But how do we determine
     the probability that the
     distribution is random?



       STATISTICS
STATISTICS
   If you are comparing counts, then use the
    chi (pronounced kie) square test.
     Example: count of saguaros on two slopes.
   If you are comparing averages, then use
    the t-test.
     Example: comparing average height of
     saguaros on two slopes.
Chi-Square Test
  You compare the actual counts to what
   the expected counts would be if the
   distribution were random.
  In our case, with a total of 200 saguaros
   counted on both slopes, what would the
   expected distribution be if they were
   distributed perfectly randomly (that is,
   split evenly between the two slopes)?

                North Slope South Slope
     Expected       100         100
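
The expected counts can be worked out in a few lines. This is a minimal sketch, not the Excel file's method: the function name `expected_counts` is made up for illustration, and it assumes the same area was surveyed on each slope, so that "random" means an even split.

```python
# Expected counts under the null hypothesis: saguaros are split evenly
# between the slopes (assumes equal area was surveyed on each slope).
def expected_counts(observed):
    total = sum(observed)
    return [total / len(observed)] * len(observed)

# The three data sets from the slides, as [north, south] counts.
data_sets = {"A": [99, 101], "B": [90, 110], "C": [80, 120]}
for name, observed in data_sets.items():
    print(name, observed, expected_counts(observed))
```

Because every data set totals 200 saguaros, the expected counts are 100 and 100 in each case.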
Chi-Square Test
    The chi-square test tells you the probability
     of getting a difference between observed and
     expected at least this large by chance alone.

                      North Slope     South Slope
      Observed             99             101
      Expected            100             100
      Difference           -1              1
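
The slides rely on an Excel file and do not show the arithmetic behind the test. As a worked example, assuming the standard Pearson chi-square statistic (each squared difference divided by its expected count, then summed), data set A gives:

```latex
\[
\begin{aligned}
\chi^2 &= \sum_i \frac{(O_i - E_i)^2}{E_i} \\
       &= \frac{(99 - 100)^2}{100} + \frac{(101 - 100)^2}{100} \\
       &= 0.02
\end{aligned}
\]
```

Here $O_i$ is the observed count and $E_i$ the expected count on slope $i$. A statistic this close to zero means the observed counts sit very near the expected ones.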
Chi-Square Test
     Use my Excel file online.
     Type in the numbers in the gray boxes and then hit enter.
     The P value is significant if it is <0.05.

 2 categories   Category 1   Category 2   P value
 Your data          99          101         0.89
 Your data          90          110         0.16
TENTATIVE CONCLUSION
 Using my Excel file online, you would come up with
  these probabilities.
 p = probability this would happen randomly

              North Slope   South Slope        p
 Data set A       99            101        0.89 = 89%
 Data set B       90            110        0.16 = 16%
 Data set C       80            120        0.005 = <1%
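
These probabilities can be checked without the Excel file. The sketch below is one way to do it, not necessarily what the spreadsheet does internally: `chi_square_p` is a made-up name, it assumes an even expected split, and it only handles two categories, where the p-value has a simple closed form.

```python
import math

def chi_square_p(observed):
    """Chi-square statistic and p-value for two counts vs. an even split."""
    if len(observed) != 2:
        raise ValueError("this shortcut only works for two categories")
    expected = sum(observed) / 2
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    # With two categories there is one degree of freedom, and the
    # upper-tail probability equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

for name, observed in {"A": [99, 101], "B": [90, 110], "C": [80, 120]}.items():
    chi2, p = chi_square_p(observed)
    print(f"Data set {name}: chi-square = {chi2:.2f}, p = {p:.3f}")
```

Rounded, the results match the table: 0.89 for A, 0.16 for B, and 0.005 for C.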
DEDUCTIONS
 Which brings us back to deductions.
 What probability of being wrong are we willing
  to risk?
     The worst mistake you can make in science (a Type I
     error) is to conclude that the difference is due to
     cause and effect when it was really random.
   Are we willing to be wrong 89% of the time?
    16% of the time? Less than 1% of the time?
   Scientists most often use 5% = p<0.05
DEDUCTIONS
 If slope direction does not affect saguaro
  density, then p will be greater than or equal to 0.05 (5%).
 If saguaro density is greater on the north slope,
  then p will be less than 0.05 and the north slope
  will have the higher count.
 If saguaro density is greater on the south slope,
  then p will be less than 0.05 and the south slope
  will have the higher count.
TENTATIVE CONCLUSION
 For data sets A and B, because p>0.05 there is no
  significant difference in saguaro density, so slope
  direction is unlikely to affect saguaro density.
 For data set C, because p<0.05 saguaro density is
  significantly greater on the south slope, so slope
  direction likely affects saguaro density.
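
The deductions amount to a simple decision rule, sketched below. The function name `tentative_conclusion` and its wording are illustrative only; the counts and p-values are the ones given in the slides.

```python
ALPHA = 0.05  # significance threshold used in these slides

def tentative_conclusion(north, south, p, alpha=ALPHA):
    """Pick the hypothesis that the counts and p-value support."""
    if p >= alpha:
        return "H0: no significant difference between slopes"
    if north > south:
        return "H1: density is significantly greater on the north slope"
    return "H2: density is significantly greater on the south slope"

# (north count, south count, p-value) for each data set.
data_sets = {"A": (99, 101, 0.89), "B": (90, 110, 0.16), "C": (80, 120, 0.005)}
for name, (north, south, p) in data_sets.items():
    print(f"Data set {name}: {tentative_conclusion(north, south, p)}")
```

The threshold is fixed before looking at the data, which is why deductions come before tests.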
    This could be your table
Table 1. Number of saguaros per hectare on
the north and south slope of Gates Pass near
Tucson, AZ as counted September 1, 2009.

              North     South
              slope     slope    Significant?
Data set A     99        101     No; p=0.89
Data set B     90        110     No; p=0.16
Data set C     80        120     Yes; p=0.005
    This could be your graph




Figure 1. Number of saguaros per hectare on
the north and south slope of Gates Pass near
Tucson, AZ as counted September 1, 2009.
T-Test Example
    The t-test tells you the probability that the
     difference between two averages is random,
     and considers variability within the data.
    For example
      H0: Slope direction does not affect saguaro
       height.
      H1: Slope direction does affect saguaro
       height.

Sample Data
Table 1. Saguaro height (in meters) on the
north and south slope of Gates Pass near
Tucson, AZ as measured September 1, 2009.
 North-facing Slope    South-facing slope
          3.5                   1.0
          0.2                   3.2
          0.5                   3.5
          1.5                   4.2
          3.1                   0.8
          0.8                   0.7
          1.2                   3.4
Sample Data
Table 2. Average saguaro height (in meters) for
7 saguaros measured on the north and south
slope of Gates Pass near Tucson, AZ on
September 1, 2009.

North-facing Slope    South-facing slope
         1.5                   2.4
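
The averages can be reproduced from the seven heights measured on each slope. This sketch uses Python's standard `statistics` module; the variable names are illustrative. It also prints the standard deviation, one common measure of how much the heights vary within each slope.

```python
import statistics

# Saguaro heights in meters, copied from the sample data table.
north = [3.5, 0.2, 0.5, 1.5, 3.1, 0.8, 1.2]
south = [1.0, 3.2, 3.5, 4.2, 0.8, 0.7, 3.4]

for label, heights in (("North-facing", north), ("South-facing", south)):
    mean = statistics.mean(heights)
    spread = statistics.stdev(heights)  # sample standard deviation
    print(f"{label} slope: mean = {mean:.1f} m, std dev = {spread:.2f} m")
```

The means round to 1.5 m and 2.4 m, matching the table.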


Are these significantly different?
It depends on sample size and
amount of variability in the data.
[Figure: two dot plots of number of saguaros against height (m), on an axis from 1.0 to 4.0, with the north slope average and south slope average marked on each.]
 Lots of variability (heights spread along the whole axis):
  probably not significantly different.
 Little variability (heights clustered tightly around each average):
  probably significantly different.
T-Test
  Use my Excel file online
  Click on the t-test tab at the bottom.


  Group 1     Group 2
    3.5         1.0
    0.2         3.2
    Etc.        Etc.

  P value: 0.272
 TENTATIVE CONCLUSION?
Remember, the magic number is 0.05.
If P<0.05
then we tentatively conclude that there is a significant
difference because there is less than a 5% chance that it
could have happened randomly.
If P is 0.05 or greater
then we tentatively conclude that there is NOT a significant
difference because there is a 5% or greater chance that it
could have happened randomly.
T-Test
     Group 1     Group 2
       3.5         1.0
       0.2         3.2
       Etc.        Etc.

     P value: 0.272
    TENTATIVE CONCLUSION?

   Average saguaro height on the north slope (1.5 m)
    is not significantly different (p=0.27) from average
    saguaro height on the south slope (2.4 m).
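
The p-value can be reproduced from the sample heights. The slides do not say which kind of t-test the Excel file runs, so this sketch assumes a two-tailed, equal-variance, two-sample t-test. The function names are illustrative, and the tail area is found by numerically integrating the t density because the Python standard library has no t-distribution function.

```python
import math
import statistics

north = [3.5, 0.2, 0.5, 1.5, 3.1, 0.8, 1.2]
south = [1.0, 3.2, 3.5, 4.2, 0.8, 0.7, 3.4]

def pooled_t(a, b):
    """t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, df

def t_density(x, df):
    """Probability density of Student's t distribution."""
    scale = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return scale * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    area = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df)
    area *= h / 3
    return 1 - 2 * area

t, df = pooled_t(north, south)
print(f"t = {t:.2f}, df = {df}, p = {two_sided_p(t, df):.3f}")
```

Under these assumptions the result agrees with the 0.272 reported by the Excel file.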
REVIEW
 Before you collect data (i.e., in your
  proposal) you have to decide how much
  difference is enough to convince you that
  there is cause and effect going on versus
  just random chance.
 Statistics (e.g., chi-square and t-test) can
  be used to calculate the probability that
  the difference is random.
 Use p<0.05 to rule out the null hypothesis
  and tentatively conclude there are
  significant differences that suggest cause
  and effect.

						