Hypothesis Testing


Always about a population parameter
Attempt to prove (or disprove) some assumption

alternate hypothesis: What you wish to prove
  Example: Person is guilty of crime
null hypothesis: Assume the opposite of what is
  to be proven. The null is always stated as an
  equality.
  Example: Person is innocent
     The test

1.    Take a sample, compute statistic of interest.
        The evidence gathered against the defendant
2.    How likely is it that if the null were true, you
      would get a statistic at least this extreme? (the p-value)
        How likely is it that an innocent person would be
        found at the scene of crime, with gun in hand?
3.    If very unlikely, then reject the null; the
      alternate is taken as proven beyond reasonable doubt.
4.    If quite likely, then null may be true, so not
      enough evidence to discard it in favor of the
      alternate.
 Types of Errors

                         Null is really    Null is really
                             True             False
reject null,             Type I Error     Good Decision
assume alternate is      (convict the
proven                   innocent)
do not reject null,      Good Decision    Type II Error
evidence for alternate                    (let guilty go free)
not strong enough
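
A Type I error can be illustrated by simulation: draw many samples from a population where the null really is true and count how often the test rejects it anyway. The sketch below is only an illustration. The significance level of 0.05, the sample size of 25, the seed and the function names are all assumptions, not values from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value of a one-sample z-test with known sigma."""
    z = (sum(sample) / len(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_i_error_rate(alpha=0.05, n=25, trials=10000, seed=1):
    """Fraction of samples drawn under a TRUE null that are wrongly rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The null is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1  # "convict the innocent"
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen alpha, which is the sense in which alpha is the Type I error rate the analyst agrees to tolerate.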
Hypothesis Testing Roadmap

                            Hypothesis Testing

                 Continuous                            Attribute

         Normal,                  Non-Normal,         χ² Contingency
     Interval Scaled             Ordinal Scaled           Tables

 Means        Variance     Medians         Variance    Correlation

  Z-tests         χ²       Correlation     Levene’s   Same tests as
  t-tests       F-test     Sign Test                    Medians

 ANOVA        Bartlett’s   Wilcoxon
Regression                  Mood’s

       Parametric Tests

Use parametric tests when:

1.   The data are normally distributed
2.   The variances of populations (if more than one is sampled
     from) are equal
3.   The data are at least interval scaled
    One sample z - test

Used when testing whether a sample comes from a population whose
standard deviation is known. A sample of 25 measurements shows a
mean of 17. Test whether this is significantly different from the
hypothesized mean of 15, assuming the population standard deviation
is known to be 4.

  One-Sample Z

  Test of mu = 15 vs not = 15
  The assumed standard deviation = 4

   N     Mean  SE Mean              95% CI     Z      P
  25  17.0000   0.8000  (15.4320, 18.5680)  2.50  0.012
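
The numbers in this output can be reproduced by hand. The following is a minimal sketch using only the Python standard library; the function name and its parameters are hypothetical and chosen for illustration, while the inputs (mean 17, hypothesized mean 15, sigma 4, N 25) are those of the example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, mu0, sigma, n, confidence=0.95):
    """Two-sided one-sample z-test with a known population sigma."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / se              # test statistic
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return se, z, p_value, ci

se, z, p_value, ci = one_sample_z(sample_mean=17, mu0=15, sigma=4, n=25)
print(f"SE Mean = {se:.4f}, Z = {z:.2f}, P = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Since the p-value is below 0.05, the sample mean of 17 differs significantly from 15 at the 5% level.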
      Z-test for proportions

70% of 200 customers surveyed say they prefer the taste of Brand X
  over competitors. Test the hypothesis that more than 66% of
  people in the population prefer Brand X.

 Test and CI for One Proportion

 Test of p = 0.66 vs p > 0.66

 Sample    X    N  Sample p  95% Lower Bound  Z-Value  P-Value
 1       140  200  0.700000         0.646701     1.19    0.116
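
As a rough check on this output, the one-sided test can be sketched with the normal approximation. The function name is hypothetical. The choice to use the hypothesized proportion for the test's standard error and the sample proportion for the lower bound is an assumption, made because it reproduces the figures shown.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_upper(successes, n, p0, confidence=0.95):
    """Upper-tailed z-test of H0: p = p0 against H1: p > p0."""
    std_normal = NormalDist()
    p_hat = successes / n
    se_null = sqrt(p0 * (1 - p0) / n)         # standard error under the null
    z = (p_hat - p0) / se_null
    p_value = 1 - std_normal.cdf(z)           # upper tail only
    se_hat = sqrt(p_hat * (1 - p_hat) / n)    # standard error at the sample p
    lower_bound = p_hat - std_normal.inv_cdf(confidence) * se_hat
    return p_hat, z, p_value, lower_bound

p_hat, z, p_value, lower_bound = one_proportion_z_upper(140, 200, 0.66)
print(f"Sample p = {p_hat:.6f}, Bound = {lower_bound:.6f}")
print(f"Z = {z:.2f}, P = {p_value:.3f}")
```

With a p-value above 0.05, the survey does not show that more than 66% of the population prefers Brand X.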
  One sample t-test

   The data show reductions in Blood Pressure in a
   sample of 17 people after a certain treatment. We
   wish to test whether the average reduction in BP
   was at least 13%, a benchmark set by some other
   treatment that we wish to match or better.

   BP reduction (%): 10, 12, 9, 8, 7, 12, 14, 13, 15, 16, 18, 12, 18, 19, 20, 17, 15

   [Figure: Probability Plot of BP Reduction, Normal - 95% CI. x-axis: BP Reduction. P-Value 0.850]
     One Sample t-test – Minitab results

One-Sample T: BP Reduction

Test of mu = 13 vs > 13

Variable       N     Mean   StDev  SE Mean  95% Lower Bound     T      P
BP Reduction  17  13.8235  3.9248   0.9519          12.1616  0.87  0.200

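The p-value of 0.20 indicates that the reduction in BP could not be
proven to be greater than 13%. If the true mean reduction were exactly
13%, a sample mean of 13.82 or higher would still occur about 20% of
the time, so the evidence against the null is weak.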
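
The test statistic and p-value can be checked with a short sketch. The 17 values are taken from the M and F columns listed under the Two Sample t-test. The Python standard library has no t-distribution, so the helper names below are hypothetical and the tail probability comes from Simpson's rule integration of the t density, which is one possible approach and not what Minitab does internally.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                               # steps must be even
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3
    return tail if t >= 0 else 1 - tail

bp = [10, 12, 9, 8, 7, 12, 14, 13, 15, 16, 18, 12, 18, 19, 20, 17, 15]
n = len(bp)
se = stdev(bp) / sqrt(n)                        # SE Mean
t_stat = (mean(bp) - 13) / se                   # H0: mu = 13
p_value = t_upper_tail(t_stat, df=n - 1)        # H1: mu > 13
print(f"Mean = {mean(bp):.4f}, StDev = {stdev(bp):.4f}, SE Mean = {se:.4f}")
print(f"T = {t_stat:.2f}, P = {p_value:.3f}")
```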
       Two Sample t-test

You realize that though the overall reduction is not proven to be
more than 13%, there seems to be a difference between how men
and women react to the treatment. You separate the 17
observations by gender, and wish to test whether there is in fact a
significant difference between genders.

   M    F
  10   15
  12   16
   9   18
   8   12
   7   18
  12   19
  14   20
  13   17
       15

[Figure: Test for Equal Variances for BP Reduction, by gender (F, M). Panels: 95% Bonferroni Confidence Intervals for StDevs; BP Reduction. Levene's Test: Test Statistic 0.14, P-Value 0.716]
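
The Levene's test statistic of 0.14 reported in the figure can be reproduced from the M and F columns. The sketch below assumes the median-centered form of the test (absolute deviations from each group's median), since that variant matches the reported value. The function name is hypothetical, and the p-value is not computed because the standard library has no F-distribution.

```python
from statistics import mean, median

def levene_statistic(*groups):
    """Levene's W based on absolute deviations from each group's median."""
    deviations = []
    for group in groups:
        center = median(group)
        deviations.append([abs(x - center) for x in group])
    n_total = sum(len(d) for d in deviations)
    k = len(deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    # One-way ANOVA on the absolute deviations
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((v - mean(d)) ** 2 for d in deviations for v in d)
    return (n_total - k) / (k - 1) * between / within

men = [10, 12, 9, 8, 7, 12, 14, 13]
women = [15, 16, 18, 12, 18, 19, 20, 17, 15]
w = levene_statistic(men, women)
print(f"Levene's test statistic = {w:.2f}")
```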
       Two Sample t-test

The test for equal variances (Levene's test, p = 0.716) shows no
significant difference between the variances of the 2 samples. Thus a
2-sample t test with a pooled standard deviation may be conducted. The
results are shown below. The p-value (reported as 0.000, i.e. below
0.0005) indicates there is a significant difference between the genders
in their reaction to the treatment.

Two-sample T for BP Reduction M vs BP Reduction F

          N   Mean  StDev  SE Mean
BP Red M  8  10.63   2.50     0.89
BP Red F  9  16.67   2.45     0.82

Difference = mu (BP Red M) - mu (BP Red F)
Estimate for difference: -6.04167
95% CI for difference: (-8.60489, -3.47844)
T-Test of difference = 0 (vs not =): T-Value = -5.02  P-Value = 0.000  DF = 15
Both use Pooled StDev = 2.4749
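
The pooled calculation behind this output can be sketched as follows. The function name is hypothetical. The critical value 2.131 (two-sided 5% level, 15 degrees of freedom) is read from a standard t-table instead of being computed, so the confidence interval agrees with the Minitab output only to about two decimals.

```python
from math import sqrt
from statistics import mean, variance

def pooled_two_sample_t(a, b):
    """Two-sample t statistic assuming equal population variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))   # SE of the difference
    diff = mean(a) - mean(b)
    return diff, sqrt(pooled_var), se, diff / se, df

men = [10, 12, 9, 8, 7, 12, 14, 13]
women = [15, 16, 18, 12, 18, 19, 20, 17, 15]
diff, pooled_sd, se, t_stat, df = pooled_two_sample_t(men, women)

T_CRIT = 2.131  # assumption: t-table value for 15 df, two-sided alpha = 0.05
ci = (diff - T_CRIT * se, diff + T_CRIT * se)
print(f"Difference = {diff:.5f}, Pooled StDev = {pooled_sd:.4f}")
print(f"T = {t_stat:.2f}, DF = {df}")
print(f"Approx. 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print("Significant at 5%:", abs(t_stat) > T_CRIT)
```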
       Basics of ANOVA

Analysis of Variance, or ANOVA, is a technique used to
test the hypothesis that there is a difference between
the means of two or more populations. It is used in
Regression, as well as to analyze a factorial
experiment design, and in Gauge R&R studies.
The basic premise of ANOVA is that differences in the
means of 2 or more groups can be seen by
partitioning the Sum of Squares. Sum of Squares
(SS) is simply the sum of the squared deviations of the
observations from their means. Consider the following
example with two groups. The measurements show
thumb lengths in centimeters for two types (Type A and Type B).

Obs.     Type A   Type B
1          2        6
2          3        7
3          4        8
Mean       3        7
SS         2        2

Overall:   Mean = 5,  SS = 28

Total variation (SS) is 28, of which only 4 (2+2) is
within the two groups. Thus 24 of the 28 is due to the
differences between the groups. This partitioning of
SS into ‘between’ and ‘within’ is used to test the
hypothesis that the groups are in fact different from
each other.

See www.statsoft.com for more details.
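
The partition of the Sum of Squares for the Type A and Type B data can be verified directly. This is a small sketch with a hypothetical helper name; it only restates the arithmetic of the example.

```python
from statistics import mean

def sum_of_squares(values, center):
    """Sum of squared deviations of the values from a given center."""
    return sum((v - center) ** 2 for v in values)

type_a = [2, 3, 4]
type_b = [6, 7, 8]
everything = type_a + type_b
grand_mean = mean(everything)

ss_total = sum_of_squares(everything, grand_mean)
ss_within = (sum_of_squares(type_a, mean(type_a))
             + sum_of_squares(type_b, mean(type_b)))
# Between-group SS: each group mean's squared distance from the grand
# mean, weighted by the number of observations in the group.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                 for g in (type_a, type_b))

print(f"Total = {ss_total}, Within = {ss_within}, Between = {ss_between}")
```

The between and within parts add up to the total, which is what makes the partition usable for a test.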
         Results of ANOVA

The results of running an ANOVA on the sample data from
the previous slide are shown here.

One-way ANOVA: Type A, Type B

Source  DF     SS     MS      F      P
Factor   1  24.00  24.00  24.00  0.008
Error    4   4.00   1.00
Total    5  28.00

S = 1   R-Sq = 85.71%   R-Sq(adj) = 82.14%

The hypothesis test computes the F-value as the ratio of
MS ‘Between’ to MS ‘Within’. The greater the value of F,
the stronger the evidence that there is in fact a
difference between the groups. Looking up F = 24.00 with
1 and 4 degrees of freedom in an F-distribution table
shows a p-value of 0.008: if the population means were
equal, an F this large would arise in fewer than 1% of
samples. The difference is therefore taken to be real
(it exists in the population, not just in the sample).

Minitab: Stat/ANOVA/One-Way (unstacked)
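
The F-value and the R-Sq figures in the ANOVA table follow from the sums of squares and degrees of freedom. A minimal sketch, with a hypothetical function name, is given below; it does not compute the p-value, which needs an F-distribution table or a statistics package.

```python
from math import sqrt

def one_way_anova_summary(ss_between, ss_within, n_groups, n_obs):
    """Mean squares, F, S and R-squared from a one-way SS partition."""
    df_between = n_groups - 1
    df_within = n_obs - n_groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    ss_total = ss_between + ss_within
    return {
        "MS Between": ms_between,
        "MS Within": ms_within,
        "F": ms_between / ms_within,
        "S": sqrt(ms_within),
        "R-Sq": ss_between / ss_total,
        "R-Sq(adj)": 1 - ms_within / (ss_total / (n_obs - 1)),
    }

summary = one_way_anova_summary(ss_between=24, ss_within=4,
                                n_groups=2, n_obs=6)
for name, value in summary.items():
    print(f"{name} = {value:.4f}")
```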
           Two-Way ANOVA

Is the strength of steel produced different for different
temperatures to which it is heated and the speed with which
it is cooled? Here 2 factors (speed and temp) are varied at
2 levels each, and strengths of 3 parts produced at each
combination are measured as the response variable.

Strength   Temp   Speed
20.0       Low    Slow
22.0       Low    Slow
21.5       Low    Slow
23.0       Low    Fast
24.0       Low    Fast
22.0       Low    Fast
25.0       High   Slow
24.0       High   Slow
24.5       High   Slow
17.0       High   Fast
18.0       High   Fast
17.5       High   Fast

Two-way ANOVA: Strength versus Temp, Speed

Source       DF       SS       MS      F      P
Temp          1   3.5208   3.5208   5.45  0.048
Speed         1  20.0208  20.0208  31.00  0.001
Interaction   1  58.5208  58.5208  90.61  0.000
Error         8   5.1667   0.6458
Total        11  87.2292

S = 0.8036   R-Sq = 94.08%   R-Sq(adj) = 91.86%

The results show significant main effects as well as an
interaction effect.
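
The sums of squares and F-values in the two-way table can be reproduced from the 12 strength measurements. The sketch below is valid only for a balanced design like this one (equal replicates in every cell). The function and variable names are hypothetical, and p-values are left out because the standard library has no F-distribution.

```python
from statistics import mean

# (Temp, Speed) -> strengths of the 3 parts made at that combination
cells = {
    ("Low", "Slow"): [20.0, 22.0, 21.5],
    ("Low", "Fast"): [23.0, 24.0, 22.0],
    ("High", "Slow"): [25.0, 24.0, 24.5],
    ("High", "Fast"): [17.0, 18.0, 17.5],
}

def balanced_two_way_ss(cells):
    """SS for factor 1, factor 2, interaction and error (balanced data)."""
    grand = mean(v for values in cells.values() for v in values)

    def factor_ss(index):
        # Pool observations by the level of one factor, ignoring the other.
        levels = {}
        for key, values in cells.items():
            levels.setdefault(key[index], []).extend(values)
        return sum(len(v) * (mean(v) - grand) ** 2 for v in levels.values())

    ss_first, ss_second = factor_ss(0), factor_ss(1)
    ss_cells = sum(len(v) * (mean(v) - grand) ** 2 for v in cells.values())
    ss_error = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)
    return ss_first, ss_second, ss_cells - ss_first - ss_second, ss_error

ss_temp, ss_speed, ss_inter, ss_error = balanced_two_way_ss(cells)
n_obs = sum(len(v) for v in cells.values())
ms_error = ss_error / (n_obs - len(cells))      # error DF = 12 - 4 = 8
for name, ss in [("Temp", ss_temp), ("Speed", ss_speed),
                 ("Interaction", ss_inter)]:
    # Each effect has 1 degree of freedom here, so MS equals SS.
    print(f"{name:12s} SS = {ss:8.4f}  F = {ss / ms_error:6.2f}")
print(f"{'Error':12s} SS = {ss_error:8.4f}  MS = {ms_error:.4f}")
```

The interaction SS is by far the largest component, which is consistent with the interaction effect dominating the results.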
    Two-Way ANOVA

The box plots give an indication of the interaction effect. The
effect of speed on the response is different for different levels of
temperature. Thus, there is an interaction effect between
temperature and speed.

[Figure: Boxplot of Strength by Temp, Speed. Groups: Temp High (Speed Fast, Slow); Temp Low (Speed Fast, Slow)]
