PowerPoint Presentation by OmGrbA


									Midterm Review Session
          Things to Review
• Concepts
• Basic formulae
• Statistical tests
Populations <-> Parameters;
  Samples <-> Estimates
     Nomenclature
                       Population    Sample
                       Parameter     Statistics
 Mean                  μ             x̄
 Variance              σ²            s²
 Standard deviation    σ             s
 In a random sample, each
member of a population has
an equal and independent
 chance of being selected.
  Review - types of variables
• Categorical variables
  – Nominal
  – Ordinal
• Numerical variables
  – Discrete
  – Continuous
                                Reality
 Result                 Ho true          Ho false
 Reject Ho              Type I error     correct
 Do not reject Ho       correct          Type II error
[Figure: Sampling distribution of the mean, n = 10]
[Figure: Sampling distribution of the mean, n = 100]
[Figure: Sampling distribution of the mean, n = 1000]
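
The three sampling distributions named above can be approximated by simulation. The following is a minimal Python sketch, not taken from the slides. It assumes a hypothetical population that is normal with mean 0 and standard deviation 1, draws 500 random samples at each sample size, and reports how spread out the sample means are. The function name, the seed, and the number of replicates are illustrative choices.

```python
import random
import statistics

def sd_of_sample_means(n, replicates=500, seed=0):
    """Standard deviation of the means of `replicates` random samples of size n.

    Assumption for illustration: the population is normal with mean 0 and SD 1.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(replicates)
    ]
    return statistics.stdev(means)

spread = {n: sd_of_sample_means(n) for n in (10, 100, 1000)}
for n, sd in spread.items():
    # Theory: the standard error of the mean is sigma / sqrt(n).
    print(f"n = {n:4d}: SD of sample means = {sd:.3f}")
```

The spread of the sample means should shrink roughly in proportion to 1 / sqrt(n), which is why the sampling distribution narrows as n grows.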
  Sample          ->  Test statistic
  Null hypothesis ->  Null distribution

  Compare the test statistic with the null distribution:
       How unusual is this test statistic?

       P < 0.05  ->  Reject Ho
       P > 0.05  ->  Fail to reject Ho
           Statistical tests
• Binomial test
• Chi-squared goodness-of-fit
  – Proportional, binomial, poisson
• Chi-squared contingency test
• t-tests
  – One-sample t-test
  – Paired t-test
  – Two-sample t-test
    Quick reference summary:
           Binomial test
• What is it for? Compares the proportion of successes
  in a sample to a hypothesized value, po
• What does it assume? Individual trials are randomly
  sampled and independent
• Test statistic: X, the number of successes
• Distribution under Ho: binomial with parameters n and
  po .
• Formula:
          $P(x) = \binom{n}{x} p^x (1-p)^{n-x}$
          $P = 2 \cdot \Pr[x \ge X]$
          P(x) = probability of a total of x successes
          p = probability of success in each trial
          n = total number of trials
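
The binomial formula and the doubled-tail P-value can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not code from the slides: the function names are made up, the data (14 successes in 18 trials, po = 0.5) are hypothetical, and the tail is taken in whichever direction is at least as extreme as the observed X.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_test_two_sided(X, n, p0):
    """Two-sided P-value: twice the tail at least as extreme as X."""
    if X >= n * p0:
        tail = sum(binomial_pmf(x, n, p0) for x in range(X, n + 1))  # Pr[x >= X]
    else:
        tail = sum(binomial_pmf(x, n, p0) for x in range(0, X + 1))  # Pr[x <= X]
    return min(1.0, 2 * tail)

# Hypothetical data: 14 successes in 18 trials, tested against p0 = 0.5.
p_value = binomial_test_two_sided(14, 18, 0.5)
print(f"P = {p_value:.4f}")  # P = 0.0309
```

Here P < 0.05, so the null hypothesis that the proportion of successes is 0.5 would be rejected for these made-up data.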
                       Binomial test

  Sample          ->  Test statistic: x = number of successes
  Null hypothesis:    Pr[success] = po
                  ->  Null distribution: binomial with n, po

  Compare the test statistic with the null distribution:
       How unusual is this test statistic?

       P < 0.05  ->  Reject Ho
       P > 0.05  ->  Fail to reject Ho
                   Binomial test

H0: The relative frequency of successes in the population is p0


HA: The relative frequency of successes in the population is not p0
    Quick reference summary:
     χ² Goodness-of-Fit test
• What is it for? Compares observed frequencies in
  categories of a single variable to the expected
  frequencies under a random model
• What does it assume? Random samples; no expected
  values < 1; no more than 20% of expected values < 5
• Test statistic: χ²
• Distribution under Ho: χ² with
              df = # categories - # parameters - 1
• Formula:
          $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
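
As a rough illustration of the goodness-of-fit summary above, the sketch below computes the test statistic, the degrees of freedom, and a P-value using only the Python standard library. Everything named here is an assumption made for the example: the helper `chi2_sf` (upper-tail probability, built from a standard incomplete-gamma recurrence), the function `chi2_goodness_of_fit`, and the counts, which are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability Pr[chi-squared with df d.f. >= x]; df is a positive integer."""
    y = x / 2.0
    # Closed-form starting values: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y**a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi2_goodness_of_fit(observed, expected, n_params=0):
    """Return (chi2, df, P) for observed vs. expected counts.

    n_params is the number of parameters estimated from the data.
    """
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - n_params - 1
    return chi2, df, chi2_sf(chi2, df)

# Hypothetical counts in four categories assumed equally likely under Ho.
observed = [18, 22, 30, 30]
expected = [sum(observed) / len(observed)] * len(observed)  # 25 in each class
chi2, df, p_value = chi2_goodness_of_fit(observed, expected)
print(f"chi2 = {chi2:.2f}, df = {df}, P = {p_value:.3f}")
```

With these made-up counts the statistic is 4.32 on 3 degrees of freedom, and P is well above 0.05, so Ho would not be rejected.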
                       χ² goodness of fit test

  Sample          ->  Calculate expected values
                  ->  Test statistic:
                      $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
  Null hypothesis:    Data fit a particular discrete distribution
                  ->  Null distribution: χ² with N - 1 - param. d.f.

  Compare the test statistic with the null distribution:
       How unusual is this test statistic?

       P < 0.05  ->  Reject Ho
       P > 0.05  ->  Fail to reject Ho
   χ² Goodness-of-Fit test

H0: The data come from a certain distribution

HA: The data do not come from that distribution
     Possible distributions
Binomial:      $\Pr[x] = \binom{n}{x} p^x (1-p)^{n-x}$
Poisson:       $\Pr[X] = \frac{e^{-\mu} \mu^X}{X!}$
Proportional:  Pr[x] = n * frequency of occurrence
Proportional   Given a number of categories
               Probability proportional to number of opportunities
               Days of the week, months of the year

Binomial       Number of successes in n trials
               Have to know n, p under the null hypothesis
               Punnett square, many p=0.5 examples

Poisson        Number of events in interval of space or time
               n not fixed, not given p
               Car wrecks, flowers in a field
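
The expected values for a goodness-of-fit test come from whichever of these distributions the null hypothesis specifies. The sketch below shows one way to compute them in Python; it is an illustration, not the slides' own method. The binomial setup (1000 units, 8 trials each, p = 0.5) and the Poisson counts are hypothetical, and the last Poisson class is lumped as "3 or more" so that the expected values sum to the total.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Probability of x successes in n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, mu):
    """Probability of x events when the mean number of events is mu."""
    return exp(-mu) * mu**x / factorial(x)

# Binomial model: n and p are fixed by the null hypothesis.
n_units, n_trials, p = 1000, 8, 0.5
binomial_expected = [n_units * binomial_pmf(x, n_trials, p)
                     for x in range(n_trials + 1)]

# Poisson model: the mean is estimated from the data, which costs one
# extra degree of freedom. Hypothetical counts of intervals with 0-3 events.
counts = {0: 30, 1: 40, 2: 20, 3: 10}
total = sum(counts.values())
mu_hat = sum(x * k for x, k in counts.items()) / total
poisson_expected = {x: total * poisson_pmf(x, mu_hat) for x in (0, 1, 2)}
poisson_expected["3+"] = total - sum(poisson_expected.values())

print([round(e, 1) for e in binomial_expected])
print({k: round(v, 1) for k, v in poisson_expected.items()})
```

For the binomial case df = 9 - 0 - 1 = 8, since no parameter is estimated; for the Poisson case df = 4 - 1 - 1 = 2, since the mean is estimated from the data.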
     Quick reference summary:
       χ² Contingency Test
• What is it for? Tests the null hypothesis of no association
  between two categorical variables
• What does it assume? Random samples; no expected
  values < 1; no more than 20% of expected values < 5
• Test statistic: χ²
• Distribution under Ho: χ² with
              df=(r-1)(c-1) where r = # rows, c = # columns
• Formulae:
          $\text{Expected} = \frac{\text{RowTotal} \times \text{ColTotal}}{\text{GrandTotal}}$
          $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
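
The two contingency formulae above translate directly into code. The following Python sketch is illustrative only: the function name `chi2_contingency` and the 2 x 2 table are assumptions made for this example, and the P-value line relies on an identity that holds only when df = 1.

```python
import math

def chi2_contingency(table):
    """Return (chi2, df, expected) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected = RowTotal * ColTotal / GrandTotal for every cell.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Hypothetical 2 x 2 table: rows are two groups, columns are outcome yes / no.
table = [[20, 30],
         [10, 40]]
chi2, df, expected = chi2_contingency(table)

# For df = 1 only, Pr[chi-squared >= x] equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2)) if df == 1 else None
print(f"chi2 = {chi2:.3f}, df = {df}, expected = {expected}")
```

For larger tables the statistic and df are computed the same way, but the P-value has to come from a χ² table or a statistics package.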
                       χ² Contingency Test

  Sample          ->  Calculate expected values
                  ->  Test statistic:
                      $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
  Null hypothesis:    No association between variables
                  ->  Null distribution: χ² with (r-1)(c-1) d.f.

  Compare the test statistic with the null distribution:
       How unusual is this test statistic?

       P < 0.05  ->  Reject Ho
       P > 0.05  ->  Fail to reject Ho
       χ² Contingency test

H0: There is no association between these two variables


HA: There is an association between these two variables
    Quick reference summary:
        One sample t-test
• What is it for? Compares the mean of a numerical
  variable to a hypothesized value, μo
• What does it assume? Individuals are randomly
  sampled from a population that is normally distributed.
• Test statistic: t
• Distribution under Ho: t-distribution with n-1 degrees of
  freedom.
• Formula:
          $t = \frac{\bar{Y} - \mu_0}{SE_{\bar{Y}}}$
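
A small worked sketch of the one-sample t-test follows, using only the Python standard library. The names `t_two_sided_p` and `one_sample_t` and the eight measurements are hypothetical. Because the standard library has no t-distribution, the P-value here is approximated by numerically integrating the t density with Simpson's rule; a statistics package would normally be used instead.

```python
import math
import statistics

def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value for Student's t, by Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                      # integral of the density from 0 to |t|
    return max(0.0, 1.0 - 2.0 * area)

def one_sample_t(sample, mu0):
    """Return (t, df, P) for Ho: population mean equals mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # SE of the mean, s / sqrt(n)
    t = (statistics.fmean(sample) - mu0) / se
    return t, n - 1, t_two_sided_p(t, n - 1)

# Hypothetical measurements tested against mu0 = 10.
sample = [10.2, 9.8, 11.1, 10.9, 10.4, 11.3, 10.0, 10.7]
t, df, p_value = one_sample_t(sample, 10.0)
print(f"t = {t:.3f}, df = {df}, P = {p_value:.3f}")
```

For these made-up data the sample mean is 10.55 and t is about 2.90 on 7 degrees of freedom, which gives P < 0.05.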
                      One-sample t-test

  Sample          ->  Test statistic: $t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}}$
  Null hypothesis:    The population mean is equal to μo
                  ->  Null distribution: t with n-1 df

  Compare the test statistic with the null distribution:
       How unusual is this test statistic?

       P < 0.05  ->  Reject Ho
       P > 0.05  ->  Fail to reject Ho
      One-sample t-test

Ho: The population mean is equal to μo

Ha: The population mean is not equal to μo
Paired vs. 2 sample
   comparisons
    Quick reference summary:
           Paired t-test
• What is it for? To test whether the mean difference in a
  population equals a null hypothesized value, μdo
• What does it assume? Pairs are randomly sampled
  from a population. The differences are normally
  distributed
• Test statistic: t
• Distribution under Ho: t-distribution with n-1 degrees of
  freedom, where n is the number of pairs
• Formula:
          $t = \frac{\bar{d} - \mu_{d0}}{SE_{\bar{d}}}$
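
The paired t-test is a one-sample t-test applied to the within-pair differences. The sketch below illustrates that idea in Python; the function name `paired_t` and the five before/after pairs are hypothetical, and only the test statistic and degrees of freedom are computed.

```python
import math
import statistics

def paired_t(before, after, mu_d0=0.0):
    """Return (t, df) for paired measurements; df = number of pairs - 1."""
    d = [a - b for b, a in zip(before, after)]   # one difference per pair
    n = len(d)
    se_d = statistics.stdev(d) / math.sqrt(n)    # SE of the mean difference
    t = (statistics.fmean(d) - mu_d0) / se_d
    return t, n - 1

# Hypothetical paired data: the same five individuals measured twice.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [14.0, 15.0, 14.0, 15.0, 17.0]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")  # t = 2.828, df = 4
```

The two-sided 5% critical value of t with 4 degrees of freedom is 2.776, so for these made-up data |t| is just large enough to reject Ho.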
                      Paired t-test

  Sample          ->  Test statistic: $t = \frac{\bar{d} - \mu_{d0}}{SE_{\bar{d}}}$
  Null hypothesis:    The mean difference is equal to μdo
                  ->  Null distribution: t with n-1 df
                      *n is the number of pairs

  Compare the test statistic with the null distribution:
       How unusual is this test statistic?

       P < 0.05  ->  Reject Ho
       P > 0.05  ->  Fail to reject Ho
           Paired t-test

Ho: The mean difference is equal to 0

Ha: The mean difference is not equal to 0
     Quick reference summary:
         Two-sample t-test
• What is it for? Tests whether two groups have the
  same mean
• What does it assume? Both samples are random
  samples. The numerical variable is normally
  distributed within both populations. The variance of
  the distribution is the same in the two populations
• Test statistic: t
• Distribution under Ho: t-distribution with n1+n2-2
  degrees of freedom.
• Formulae:
          $t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}}$
          $SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
          $s_p^2 = \frac{df_1 s_1^2 + df_2 s_2^2}{df_1 + df_2}$
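
The pooled-variance calculation can be sketched in Python as follows. This is an illustrative sketch under the equal-variance assumption stated above; the function name `two_sample_t` and both groups of measurements are hypothetical.

```python
import math
import statistics

def two_sample_t(y1, y2):
    """Return (t, df) using the pooled sample variance."""
    n1, n2 = len(y1), len(y2)
    df1, df2 = n1 - 1, n2 - 1
    # Pooled variance: average of the two sample variances, weighted by df.
    sp2 = (df1 * statistics.variance(y1) + df2 * statistics.variance(y2)) / (df1 + df2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))       # SE of the difference in means
    t = (statistics.fmean(y1) - statistics.fmean(y2)) / se
    return t, df1 + df2                           # df = n1 + n2 - 2

# Hypothetical measurements from two independent groups.
group1 = [5.0, 7.0, 6.0, 8.0]
group2 = [3.0, 4.0, 5.0, 4.0]
t, df = two_sample_t(group1, group2)
print(f"t = {t:.3f}, df = {df}")  # t = 3.273, df = 6
```

The resulting t is then compared with the t-distribution with n1+n2-2 degrees of freedom to obtain a P-value.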
                                         
                     Two-sample t-test

  Sample          ->  Test statistic: $t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}}$
  Null hypothesis:    The two populations have the same mean, μ1 = μ2
                  ->  Null distribution: t with n1+n2-2 df

  Compare the test statistic with the null distribution:
       How unusual is this test statistic?

       P < 0.05  ->  Reject Ho
       P > 0.05  ->  Fail to reject Ho
      Two-sample t-test

Ho: The means of the two populations are
  equal

Ha: The means of the two populations are
  not equal
Which test do I use?

How many variables am I comparing?
  1  ->  Methods for a single variable
  2  ->  Methods for comparing two variables
       Methods for one variable

Is the variable categorical or numerical?
  Categorical  ->  Comparing to a single proportion po or to a distribution?
                     po            ->  Binomial test
                     distribution  ->  χ² Goodness-of-fit test
  Numerical    ->  One-sample t-test
      Methods for two variables

                                  X: Explanatory variable
 Y: Response variable    Categorical                            Numerical
 Categorical             Contingency table
                         Grouped bar graph
                         Mosaic plot
 Numerical               Multiple histograms                    Scatter plot
                         Cumulative frequency distributions
      Methods for two variables

                                  X: Explanatory variable
 Y: Response variable    Categorical                            Numerical
 Categorical             Contingency analysis                   Logistic regression
 Numerical               t-test                                 Regression
Methods for two variables

Is the response variable categorical or numerical?
  Categorical  ->  Contingency analysis
  Numerical    ->  t-test
How many variables am I comparing?
  1  ->  Is the variable categorical or numerical?
           Categorical  ->  Comparing to a single proportion po or to a distribution?
                              po            ->  Binomial test
                              distribution  ->  χ² Goodness-of-fit test
           Numerical    ->  One-sample t-test
  2  ->  Is the response variable categorical or numerical?
           Categorical  ->  Contingency analysis
           Numerical    ->  t-test
                  Sample Problems
An experiment compared the testes sizes of four
experimental populations of monogamous flies to four
populations of polygamous flies:




  a. What is the difference in mean testes size for males from monogamous populations
  compared to males from polyandrous populations? What is the 95% confidence interval for
  this estimate?
  b. Carry out a hypothesis test to compare the means of these two groups. What conclusions
  can you draw?
             Sample Problems

In Vancouver, the probability of rain during a winter day
is 0.58, for a spring day 0.38, for a summer day 0.25,
and for a fall day 0.53. Each of these seasons lasts one
quarter of the year.

What is the probability of rain on a randomly-chosen
day in Vancouver?
                Sample problems
A study by Doll et al. (1994) examined the relationship
between moderate intake of alcohol and the risk of heart
disease. 410 men (209 "abstainers" and 201 "moderate
drinkers") were observed over a period of 10 years, and the
number experiencing cardiac arrest over this period was
recorded and compared with drinking habits. All men were
40 years of age at the start of the experiment. By the end of
the experiment, 12 abstainers had experienced cardiac
arrest whereas 9 moderate drinkers had experienced
cardiac arrest.

Test whether or not relative frequency of cardiac arrest was
different in the two groups of men.
            Sample Problems
An RSPCA survey of 200 randomly-chosen Australian
pet owners found that 10 said that they
had met their partner through owning the pet.

A. Find the 95% confidence interval for the proportion
of Australian pet owners who find love through their
pets.

B. What test would you use to test if the true proportion
is significantly different from 0.01? Write the formula
that you would use to calculate a P-value.
               Sample Problems
One thousand coins were each flipped 8 times, and
the number of heads was recorded for each coin.
Here are the results:




     Does the distribution of coin flips match the
     distribution expected with fair coins? ("Fair coin"
     means that the probability of heads per flip is 0.5.)
     Carry out a hypothesis test.
                    Sample problems
Vertebrates are thought to be unidirectional in growth, with size either increasing or holding
steady throughout life. Marine iguanas from the Galápagos are unusual in a number of ways, and a
team of researchers has suggested that these iguanas might actually shrink during the low food
periods caused by El Niño events (Wikelski and Thom 2000). During these events, up to 90% of the
iguana population can die from starvation. Here is a plot of the changes in body length of 64
surviving iguanas during the 1992-1993 El Niño event.




       The average change in length was −5.81 mm, with standard deviation 19.50 mm.
       Test the hypothesis that length did not change on average during the El Niño event.

								