					    T-TESTS AND
ANALYSIS OF VARIANCE
     Jennifer Kensler
Laboratory for Interdisciplinary Statistical Analysis
Virginia Tech’s source for expert statistical analysis since 1948
                                                                           www.lisa.stat.vt.edu

                              Collaboration:
                              Personalized statistical advice
                              Great advice right now:
                              Meet with LISA before
                              collecting your data


                                                           Short Courses:
                                                           Designed to help
                                                           graduate students apply
                                                           statistics in their research

                              Walk-In Consulting:
                              Monday—Friday* 12-2PM
                              for questions <30 minutes
                              * Mon—Thurs in summer
                              * We help with research—not
                                class projects or homework
    ONE SAMPLE T-TEST
ONE SAMPLE T-TEST
   Used to test whether the population mean is
    different from a specified value.

   Example: Is the mean height of 12 year old girls
    greater than 60 inches?




STEP 1: FORMULATE THE HYPOTHESES
   The population mean is not equal to a specified
    value.
    H0: μ = μ0
    Ha: μ ≠ μ0
   The population mean is greater than a specified
    value.
    H0: μ = μ0
    Ha: μ > μ0
   The population mean is less than a specified value.
    H0: μ = μ0
    Ha: μ < μ0
STEP 2: CHECK THE ASSUMPTIONS
   The sample is random.

   The population from which the sample is drawn
    is either normal or the sample size is large.




STEPS 3-5
   Step 3: Calculate the test statistic:

    $$t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$$

    where

    $$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$$

   Step 4: Calculate the p-value based on the
    appropriate alternative hypothesis.

   Step 5: Write a conclusion.
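
Steps 3 and 4 can also be sketched outside JMP. The Python snippet below is a minimal illustration, assuming NumPy and SciPy are installed. The measurements in `y` are hypothetical values made up for this sketch (they are not the course data), and the two-sided alternative is one of the three choices from Step 1.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements (illustrative only, not the course data).
y = np.array([3.1, 3.6, 3.4, 3.9, 3.2, 3.5, 3.8, 3.0, 3.3, 3.7])
mu0 = 3.5  # hypothesized population mean

n = y.size
s = y.std(ddof=1)  # sample standard deviation (n - 1 in the denominator)
t_manual = (y.mean() - mu0) / (s / np.sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Library equivalent; its default alternative is two-sided.
t_scipy, p_scipy = stats.ttest_1samp(y, popmean=mu0)

print(f"t = {t_manual:.3f}, two-sided p-value = {p_manual:.3f}")
```

The hand calculation and `scipy.stats.ttest_1samp` should agree. For a one-sided alternative, the p-value would come from a single tail of the same t distribution.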
IRIS EXAMPLE
   A researcher would like to know whether the
    mean sepal width of a variety of irises is different
    from 3.5 cm.

   The researcher measures the sepal width of
    50 randomly selected irises.

   Step 1: Hypotheses
    H0: μ = 3.5 cm
    Ha: μ ≠ 3.5 cm
JMP
   Steps 2-4:
    JMP Demonstration
    Analyze → Distribution
    Y, Columns: Sepal Width

    Test Mean
    Specify Hypothesized Mean: 3.5



JMP OUTPUT




 Step 5 Conclusion: The mean sepal width is not
significantly different from 3.5 cm.

     TWO SAMPLE T-TEST
TWO SAMPLE T-TEST
   Two sample t-tests are used to determine
    whether the population mean of one group is
    equal to, larger than or smaller than the
    population mean of another group.

   Example: Is the mean cholesterol of people taking
    drug A lower than the mean cholesterol of people
    taking drug B?



STEP 1: FORMULATE THE HYPOTHESES
   The population means of the two groups are not
    equal.
    H0: μ1 = μ2
    Ha: μ1 ≠ μ2
   The population mean of group 1 is greater than the
    population mean of group 2.
    H0: μ1 = μ2
    Ha: μ1 > μ2
   The population mean of group 1 is less than the
    population mean of group 2.
    H0: μ1 = μ2
    Ha: μ1 < μ2
STEP 2: CHECK THE ASSUMPTIONS
   The two samples are random and independent.

   The populations from which the samples are
    drawn are either normal or the sample sizes are
    large.

   The populations have the same standard
    deviation.



STEPS 3-5
   Step 3: Calculate the test statistic:

    $$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

    where

    $$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$

   Step 4: Calculate the appropriate p-value.

   Step 5: Write a conclusion.
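
As a rough check of the pooled formula, the following Python sketch computes the test statistic by hand and compares it with SciPy's equal-variance t-test. NumPy and SciPy are assumed to be available, and the two samples `g1` and `g2` are hypothetical numbers chosen only for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical samples from two independent groups (illustrative only).
g1 = np.array([3.5, 3.0, 3.2, 3.6, 3.9, 3.4, 3.1, 3.7])
g2 = np.array([2.8, 3.0, 2.5, 2.9, 3.1, 2.6, 2.7])
n1, n2 = g1.size, g2.size

# Pooled standard deviation, which relies on the equal-variance assumption.
sp = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1))
             / (n1 + n2 - 2))
t_manual = (g1.mean() - g2.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2))

# Two-sided p-value with n1 + n2 - 2 degrees of freedom.
p_manual = 2 * stats.t.sf(abs(t_manual), df=n1 + n2 - 2)

# Library equivalent: equal_var=True requests the pooled-variance test.
t_scipy, p_scipy = stats.ttest_ind(g1, g2, equal_var=True)

print(f"t = {t_manual:.3f}, two-sided p-value = {p_manual:.4f}")
```

If the equal standard deviation assumption from Step 2 is doubtful, `equal_var=False` switches SciPy to Welch's t-test, which does not pool the variances.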
TWO SAMPLE EXAMPLE
   A researcher would like to know whether the
    mean sepal width of setosa irises is different from
    the mean sepal width of versicolor irises.

   Step 1 Hypotheses:
    H0: μsetosa = μversicolor
    Ha: μsetosa ≠ μversicolor




JMP
   Steps 2-4:
    JMP Demonstration:
    Analyze → Fit Y By X
    Y, Response: Sepal Width
    X, Factor: Species




JMP OUTPUT




 Step 5 Conclusion: There is strong evidence
(p-value < 0.0001) that the mean sepal widths for
the two varieties are different.

     PAIRED T-TEST
PAIRED T-TEST
   The paired t-test is used to compare the means of
    two dependent samples.

   Example:
    A researcher would like to determine if
    background noise causes people to take longer to
    complete math problems. The researcher gives 20
    subjects two math tests, one in complete silence
    and one with background noise, and records the
    time each subject takes to complete each test.
STEP 1: FORMULATE THE HYPOTHESES
   The population mean difference is not equal to zero.
    H0: μdifference = 0
    Ha: μdifference ≠ 0
   The population mean difference is greater than
    zero.
    H0: μdifference = 0
    Ha: μdifference > 0
   The population mean difference is less than zero.
    H0: μdifference = 0
    Ha: μdifference < 0
STEP 2: CHECK THE ASSUMPTIONS
   The sample is random.

   The data are matched pairs.

   The differences have a normal distribution or the
    sample size is large.




STEPS 3-5
    Step 3: Calculate the test statistic:

    $$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$$

    where $\bar{d}$ is the mean of the differences and $s_d$
    is the standard deviation of the differences.

    Step 4: Calculate the p-value.

    Step 5: Write a conclusion.
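
The paired t-test is a one-sample t-test applied to the differences. The Python sketch below illustrates this, assuming NumPy and SciPy are installed. The `before` and `after` values are hypothetical and made up for this sketch, and the one-sided alternative (mean difference greater than zero) is just one of the options from Step 1.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements on the same subjects (illustrative only).
before = np.array([18.5, 21.5, 16.5, 21.0, 20.0, 15.0, 19.75, 15.75])
after = np.array([19.25, 21.75, 16.5, 20.25, 22.25, 16.0, 19.5, 17.0])

d = after - before  # one difference per subject
n = d.size
t_manual = (d.mean() - 0) / (d.std(ddof=1) / np.sqrt(n))

# One-sided p-value for Ha: mean difference > 0 (upper tail, n - 1 df).
p_greater = stats.t.sf(t_manual, df=n - 1)

# Both library routes give the same statistic (two-sided p-values by default).
t_rel, p_rel = stats.ttest_rel(after, before)
t_one, p_one = stats.ttest_1samp(d, popmean=0.0)

print(f"t = {t_manual:.3f}, one-sided p-value = {p_greater:.3f}")
```

Working with the column of differences mirrors the JMP approach of creating an After – Before column and testing its mean against 0.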
PAIRED T-TEST EXAMPLE
   A researcher would like to determine whether a
    fitness program increases flexibility. The
    researcher measures the flexibility (in inches) of
    12 randomly selected participants before and
    after the fitness program.

   Step 1: Formulate a Hypothesis
    H0: μAfter - Before = 0
    Ha: μ After - Before > 0

PAIRED T-TEST EXAMPLE
   Steps 2-4:
    JMP Analysis:
    Create a new column of After – Before
    Analyze → Distribution
    Y, Columns: After – Before

    Test Mean
    Specify Hypothesized Mean: 0


JMP OUTPUT




Step 5 Conclusion: There is not sufficient evidence that
the fitness program increases flexibility.

     ONE-WAY ANALYSIS OF
     VARIANCE
ONE-WAY ANOVA
   ANOVA is used to determine whether three or
    more populations have different means.




 [Figure: response by medical treatment (A, B, C)]
ANOVA STRATEGY

   The first step is to use the ANOVA F test to
    determine if there are any significant differences
    among the means.

   If the ANOVA F test shows that the means are
    not all the same, then follow-up tests can be
    performed to see which pairs of means differ.


ONE-WAY ANOVA MODEL
     $$y_{ij} = \mu_i + \varepsilon_{ij}$$

     where
     $y_{ij}$ is the response of the jth trial on the ith factor level,
     $\mu_i$ is the mean of the ith group,
     $\varepsilon_{ij} \sim N(0, \sigma^2)$,
     $i = 1, \ldots, r$,
     $j = 1, \ldots, n_i$.

 In other words, for each group the observed
 value is the group mean plus some random
 variation.
ONE-WAY ANOVA HYPOTHESIS
   Step 1: We test whether there is a difference in
    the means.


    H0: μ1 = μ2 = … = μr
    Ha: The μi are not all equal.
STEP 2: CHECK ANOVA ASSUMPTIONS
 The samples are random and independent of each
  other.
 The populations are normally distributed.

 The populations all have the same variance.




   The ANOVA F test is robust to moderate departures
    from the assumptions of normality and equal variances.
STEP 3: ANOVA F TEST




 [Figure: two panels showing the response for medical treatments A, B and C]

 Compare the variation within the samples to the
 variation between the samples.
ANOVA TEST STATISTIC

    $$F = \frac{\text{Variation between groups}}{\text{Variation within groups}} = \frac{MSG}{MSE}$$

 Variation within groups small compared with
 variation between groups → Large F

 Variation within groups large compared with
 variation between groups → Small F
MSG
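 The mean square for groups, MSG, measures the
 variability of the sample averages.
 SSG stands for the sum of squares for groups.

    $$MSG = \frac{SSG}{r - 1} = \frac{n_1 (\bar{y}_1 - \bar{y})^2 + n_2 (\bar{y}_2 - \bar{y})^2 + \cdots + n_r (\bar{y}_r - \bar{y})^2}{r - 1}$$

 where $\bar{y}_i$ is the sample mean of group i and $\bar{y}$ is the
 mean of all observations.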
MSE
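 Mean square error, MSE, measures the variability
 within the groups.
 SSE stands for the sum of squares for error.

    $$MSE = \frac{SSE}{n - r} = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 + \cdots + (n_r - 1) s_r^2}{n - r}$$

 where n is the total number of observations and $s_i$ is the
 sample standard deviation of group i (with group mean $\bar{y}_i$):

    $$s_i = \sqrt{\frac{\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}{n_i - 1}}$$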
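
To make the MSG and MSE definitions concrete, here is a minimal Python sketch that builds the F statistic from them and compares it with `scipy.stats.f_oneway`. It assumes NumPy and SciPy are installed. The three groups are hypothetical values invented for this sketch, not the pain data from the course.

```python
import numpy as np
from scipy import stats

# Hypothetical responses for three groups (illustrative only).
groups = [
    np.array([4.1, 5.0, 4.6, 5.3, 4.8]),
    np.array([3.9, 4.2, 3.5, 4.4, 4.0]),
    np.array([3.0, 3.4, 2.8, 3.6, 3.2, 3.1]),
]
r = len(groups)                           # number of groups
n = sum(g.size for g in groups)           # total number of observations
grand_mean = np.concatenate(groups).mean()

# Between-group and within-group sums of squares.
ssg = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
sse = sum((g.size - 1) * g.var(ddof=1) for g in groups)

msg = ssg / (r - 1)
mse = sse / (n - r)
F = msg / mse

# p-value: upper tail of the F distribution with (r - 1, n - r) df.
p = stats.f.sf(F, r - 1, n - r)

F_scipy, p_scipy = stats.f_oneway(*groups)
print(f"F = {F:.3f}, p-value = {p:.5f}")
```

The groups do not need equal sizes for this calculation; each group mean is weighted by its own sample size in SSG.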
STEPS 4-5
   Step 4: Calculate the p-value.

   Step 5: Write a conclusion.




ANOVA EXAMPLE
 A researcher would like to determine if three
  drugs provide the same relief from pain.
 60 patients are randomly assigned to a treatment
  (20 people in each treatment).

   Step 1: Formulate the Hypotheses
    H0: μDrug A = μDrug B = μDrug C
    Ha : The μi are not all equal.


STEPS 2-4
   JMP demonstration
    Analyze → Fit Y By X
    Y, Response: Pain
    X, Factor: Drug




JMP OUTPUT AND CONCLUSION




  Step 5 Conclusion: There is strong evidence
 that the mean pain levels for the drugs are not all the same.




FOLLOW-UP TEST
 The p-value of the overall F test indicates that
  the mean level of pain is not the same for patients
  taking drugs A, B and C.
 We would like to know which pairs of treatments
  are different.
 One method is to use Tukey’s HSD (honestly
  significant difference) test.




TUKEY TESTS
   Tukey’s test simultaneously tests
    H0: μi = μi'
    Ha: μi ≠ μi'
    for all pairs of factor levels. Tukey’s HSD
    controls the overall type I error rate.


JMP demonstration
Oneway Analysis of Pain By Drug → Compare Means → All Pairs, Tukey HSD
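
For readers working outside JMP, a comparable all-pairs comparison can be sketched in Python. This assumes SciPy 1.8 or later, where `scipy.stats.tukey_hsd` is available. The three samples are hypothetical values made up for this sketch, and the 0.05 threshold is an illustrative choice.

```python
import numpy as np
from scipy import stats

# Hypothetical pain scores for three drugs (illustrative only).
drug_a = np.array([4.1, 5.0, 4.6, 5.3, 4.8])
drug_b = np.array([3.9, 4.2, 3.5, 4.4, 4.0])
drug_c = np.array([3.0, 3.4, 2.8, 3.6, 3.2, 3.1])
labels = ["A", "B", "C"]

# Tukey's HSD for all pairs; res.pvalue[i, j] compares group i with group j.
res = stats.tukey_hsd(drug_a, drug_b, drug_c)

alpha = 0.05  # illustrative family-wise significance level
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        p_ij = res.pvalue[i, j]
        verdict = "different" if p_ij < alpha else "not shown to differ"
        print(f"{labels[i]} vs {labels[j]}: p = {p_ij:.4f} ({verdict})")
```

The p-values are already adjusted for the fact that several pairs are tested at once, which is how the procedure controls the overall type I error rate.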
JMP OUTPUT




 The JMP output shows that drugs A and C are
significantly different.
     TWO-WAY ANALYSIS OF
     VARIANCE
TWO-WAY ANOVA
 We are interested in the effect of two categorical
  factors on the response.
 We are interested in whether either of the two
  factors has an effect on the response and
  whether there is an interaction effect.
       An interaction effect means that the effect on the
        response of one factor depends on the level of the
        other factor.




           INTERACTION

 [Figure: two panels, "No Interaction" and "Interaction". Each plots
 Response against Factor A (Low, High), with a legend for
 Factor B Low and Factor B High.]
  TWO-WAY ANOVA MODEL

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

where
$y_{ijk}$ is the response of the kth trial on the ith factor A level and the jth factor B level,
$\mu$ is the overall mean,
$\alpha_i$ is the main effect of the ith level of factor A,
$\beta_j$ is the main effect of the jth level of factor B,
$(\alpha\beta)_{ij}$ is the interaction effect of the ith level of factor A and the jth level of factor B,
$\varepsilon_{ijk} \sim N(0, \sigma^2)$,
$i = 1, \ldots, a$,
$j = 1, \ldots, b$,
$k = 1, \ldots, n_{ij}$.
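
The two-way model can be illustrated with a small Python sketch that splits the total variation into factor A, factor B, interaction and error pieces. It assumes NumPy and SciPy are installed and a balanced design (the same number of replicates in every cell), which is a simplification of the general model. The data are hypothetical values invented for this sketch.

```python
import numpy as np
from scipy import stats

# Hypothetical balanced data: y[i, j, k] is replicate k at level i of
# factor A and level j of factor B (illustrative only).
y = np.array([
    [[10.1, 9.8, 10.4], [11.9, 12.3, 12.0], [13.8, 14.1, 14.3]],
    [[12.2, 11.7, 12.0], [12.1, 12.6, 11.8], [11.0, 10.6, 10.9]],
])
a, b, n = y.shape

grand = y.mean()
mean_a = y.mean(axis=(1, 2))   # factor A level means
mean_b = y.mean(axis=(0, 2))   # factor B level means
mean_ab = y.mean(axis=2)       # cell means

# Sums of squares for a balanced two-way layout.
ss_a = b * n * np.sum((mean_a - grand) ** 2)
ss_b = a * n * np.sum((mean_b - grand) ** 2)
ss_ab = n * np.sum((mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2)
ss_e = np.sum((y - mean_ab[:, :, None]) ** 2)
ss_total = np.sum((y - grand) ** 2)

df_e = a * b * (n - 1)
mse = ss_e / df_e
for name, ss, df in [("A", ss_a, a - 1), ("B", ss_b, b - 1),
                     ("A x B", ss_ab, (a - 1) * (b - 1))]:
    F = (ss / df) / mse
    p = stats.f.sf(F, df, df_e)
    print(f"{name}: F = {F:.2f}, p-value = {p:.4f}")
```

With unequal cell sizes the sums of squares no longer add up this neatly, and a model-fitting routine would be the more appropriate tool.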
TWO-WAY ANOVA EXAMPLE
   We would like to determine the effect of two
    alloys (low, high) and three cooling temperatures
    (low, medium, high) on the strength of a wire.

   JMP demonstration
    Analyze → Fit Model
    Y: Strength
    Highlight Alloy and Temp and click Macros →
    Factorial to Degree
JMP OUTPUT




 Conclusion: There is strong evidence of an
 interaction between alloy and temperature.
     ANALYSIS OF COVARIANCE
ANALYSIS OF COVARIANCE (ANCOVA)
 Covariates are variables that may affect the
  response but cannot be controlled.
 Covariates are not of primary interest to the
  researcher.
 We will look at an example with two covariates.
  The model is

    $$y_{ij} = \mu_i + \text{covariates} + \varepsilon_{ij}$$
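
One way to see how covariates enter the model is to fit it by least squares and compare it with a model that drops the group effect. The Python sketch below does this under several assumptions: NumPy and SciPy are installed, the data are simulated (not the course data), the covariates enter linearly, and the variable names `drug`, `age`, `female` and `pain` are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated data (illustrative only): 3 drugs, 20 patients each.
drug = np.repeat([0, 1, 2], 20)
age = rng.uniform(20, 70, size=drug.size)
female = rng.integers(0, 2, size=drug.size)  # 0/1 indicator
pain = (np.array([5.0, 4.0, 3.0])[drug] + 0.03 * age + 0.5 * female
        + rng.normal(0, 1, size=drug.size))

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

# Full model: one mean per drug plus the two covariates.
X_full = np.column_stack([np.eye(3)[drug], age, female])
# Reduced model: a common mean plus the covariates (no drug effect).
X_red = np.column_stack([np.ones_like(age), age, female])

rss_full, rss_red = rss(X_full, pain), rss(X_red, pain)
df_num = X_full.shape[1] - X_red.shape[1]
df_den = pain.size - X_full.shape[1]

# F test for the drug effect after adjusting for the covariates.
F = ((rss_red - rss_full) / df_num) / (rss_full / df_den)
p = stats.f.sf(F, df_num, df_den)
print(f"Drug effect adjusted for covariates: F = {F:.2f}, p-value = {p:.4f}")
```

The comparison asks whether the drug means still differ once the variation explained by the covariates has been accounted for.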
ANCOVA EXAMPLE
   Consider the one-way ANOVA example where we
    tested whether the patients receiving different
    drugs reported different levels of pain. Age and
    gender may also influence pain, so we can
    use age and gender as covariates.

   JMP demonstration
    Analyze → Fit Model
    Y: Pain
    Add: Drug
         Age
         Gender
JMP OUTPUT




CONCLUSION
   The one sample t-test allows us to test whether
    the population mean of a group is equal to a
    specified value.

   The two-sample t-test and paired t-test allow us
    to determine if the population means of two
    groups are different.

   ANOVA and ANCOVA methods allow us to
    determine whether the population means of
    several groups are statistically different.
SAS AND SPSS
   For information about using SAS and SPSS to do
    ANOVA:

http://www.ats.ucla.edu/stat/sas/topics/anova.htm
http://www.ats.ucla.edu/stat/spss/topics/anova.htm




REFERENCES
   Fisher’s Irises Data (used in one sample and two
    sample t-test examples).

   Flexibility data (paired t-test example):
    Michael Sullivan III. Statistics: Informed
    Decisions Using Data. Upper Saddle River, New
    Jersey: Pearson Education, 2004: 602.



