# Hypothesis Testing


```
Hypothesis Testing

Introduction

Hypothesis testing attempts to prove (or disprove) some assumption.

Setup:
  Alternate hypothesis: what you wish to prove.
    Example: the person is guilty of the crime.
  Null hypothesis: assume the opposite of what is to be proven.
    The null is always stated as an equality.
    Example: the person is innocent.

The test

1. Take a sample and compute the statistic of interest.
   (The evidence gathered against the defendant.)
2. How likely is it that, if the null were true, you would get a
   statistic at least this extreme? This probability is the p-value.
   (How likely is it that an innocent person would be found at the
   scene of the crime, with gun in hand, etc.?)
3. If very unlikely, reject the null; the alternate is taken as
   proven beyond reasonable doubt.
4. If quite likely, the null may be true, so there is not enough
   evidence to discard it in favor of the alternate.

Types of Errors

Decision                       Null is really True      Null is really False
Reject null, assume            Type I Error             Good Decision
alternate is proven            (convict the innocent)
Do not reject null, evidence   Good Decision            Type II Error
for alternate not strong                                (let guilty go free)
enough

Hypothesis Testing

Continuous data
  Normal, interval scaled
    Means:    Z-tests, t-tests, ANOVA, Correlation, Regression
    Variance: chi-square, F-test, Bartlett’s
  Non-normal, ordinal scaled
    Medians:  Correlation, Sign Test, Wilcoxon, Kruskal-Wallis,
              Mood’s, Friedman’s
    Variance: Levene’s

Attribute data
  Chi-square contingency tables
  Correlation: same tests as Non-Normal Medians

Parametric Tests

Use parametric tests when:

1. The data are normally distributed
2. The variances of populations (if more than one is sampled from)
   are equal
3. The data are at least interval scaled

One sample z-test

Used when testing whether a sample comes from a known population.
A sample of 25 measurements shows a mean of 17. Test whether this is
significantly different from the hypothesized mean of 15, assuming
the population standard deviation is known to be 4.

One-Sample Z

Test of mu = 15 vs not = 15
The assumed standard deviation = 4

 N     Mean  SE Mean              95% CI     Z      P
25  17.0000   0.8000  (15.4320, 18.5680)  2.50  0.012

Z-test for proportions

70% of 200 customers surveyed say they prefer the taste of Brand X
over competitors. Test the hypothesis that more than 66% of
people in the population prefer Brand X.

Test and CI for One Proportion

Test of p = 0.66 vs p > 0.66

                           95% Lower
Sample    X    N  Sample p     Bound  Z-Value  P-Value
1       140  200  0.700000  0.646701     1.19    0.116

One sample t-test

The data show reductions in Blood Pressure (BP) in a sample of 17
people after a certain treatment. We wish to test whether the average
reduction in BP was at least 13%, a benchmark set by some other
treatment that we wish to match or better.

BP Reduction: 10, 12, 9, 8, 7, 12, 14, 13, 15, 16, 18, 12, 18, 19, 20, 17, 15

[Figure: Probability Plot of BP Reduction, Normal - 95% CI.
 x-axis: BP Reduction; y-axis: Percent.
 Mean 13.82, StDev 3.925, N 17, AD 0.204, P-Value 0.850]

One Sample t-test – Minitab results

One-Sample T: BP Reduction

Test of mu = 13 vs > 13

                                            95% Lower
Variable       N     Mean   StDev  SE Mean      Bound     T      P
BP Reduction  17  13.8235  3.9248   0.9519    12.1616  0.87  0.200

The p-value of 0.20 means the null hypothesis cannot be rejected: the
reduction in BP could not be proven to be greater than 13%. If the
true mean reduction were exactly 13%, a sample mean at least this
large would occur about 20% of the time.

Two Sample t-test

You realize that though the overall reduction is not proven to be
more than 13%, there seems to be a difference between how men
and women react to the treatment. You separate the 17
observations by gender, and wish to test whether there is in fact a
significant difference between genders.

BP Reduction by gender:
M: 10, 12, 9, 8, 7, 12, 14, 13
F: 15, 16, 18, 12, 18, 19, 20, 17, 15

Test for Equal Variances for BP Reduction

Test            Test Statistic  P-Value
F-Test                    0.96    0.941
Levene's Test             0.14    0.716

[Figure: 95% Bonferroni Confidence Intervals for StDevs, and boxplots
 of BP Reduction, each by Gender (F, M)]

Two Sample t-test

The test for equal variances shows no significant difference in
variance between the 2 samples (both p-values are well above 0.05).
Thus a pooled 2-sample t test may be conducted. The results are shown
below. The p-value indicates there is a significant difference
between the genders in their reaction to the treatment.

Two-sample T for BP Reduction M vs BP Reduction F

          N   Mean  StDev  SE Mean
BP Red M  8  10.63   2.50     0.89
BP Red F  9  16.67   2.45     0.82

Difference = mu (BP Red M) - mu (BP Red F)
Estimate for difference: -6.04167
95% CI for difference: (-8.60489, -3.47844)
T-Test of difference = 0 (vs not =): T-Value = -5.02  P-Value = 0.000  DF = 15
Both use Pooled StDev = 2.4749

Basics of ANOVA

Analysis of Variance, or ANOVA, is a technique used to test the
hypothesis that there is a difference between the means of two or
more populations. It is used in Regression, as well as to analyze a
factorial experiment design, and in Gauge R&R studies.

The basic premise of ANOVA is that differences in the means of 2 or
more groups can be seen by partitioning the Sum of Squares. Sum of
Squares (SS) is simply the sum of the squared deviations of the
observations from their means. Consider the following example with
two groups. The measurements show the thumb lengths in centimeters of
two types of primates.

Obs.   Type A   Type B
1         2        6
2         3        7
3         4        8
Mean      3        7
SS        2        2

Overall: Mean = 5, SS = 28

Total variation (SS) is 28, of which only 4 (2+2) is within the two
groups. Thus 24 of the 28 is due to the differences between the
groups. This partitioning of SS into ‘between’ and ‘within’ is used
to test the hypothesis that the groups are in fact different from
each other.

See www.statsoft.com for more details.

Results of ANOVA

The results of running an ANOVA on the sample data from the previous
slide are shown here. The hypothesis test computes the F-value as the
ratio of MS ‘Between’ to MS ‘Within’. The greater the value of F, the
stronger the evidence that there is in fact a difference between the
groups. Looking it up in an F-distribution table shows a p-value of
0.008: if the population means were equal, an F this large would
arise in fewer than 1% of samples, so the difference is taken to be
real (it exists in the population, not just in the sample).

One-way ANOVA: Type A, Type B

Source  DF     SS     MS      F      P
Factor   1  24.00  24.00  24.00  0.008
Error    4   4.00   1.00
Total    5  28.00

S = 1   R-Sq = 85.71%   R-Sq(adj) = 82.14%

Minitab: Stat/ANOVA/One-Way (unstacked)

Two-Way ANOVA

Is the strength of steel produced different for different
temperatures to which it is heated and the speed with which it is
cooled? Here 2 factors (speed and temp) are varied at 2 levels each,
and strengths of 3 parts produced at each combination are measured as
the response variable.

Strength   Temp   Speed
20.0       Low    Slow
22.0       Low    Slow
21.5       Low    Slow
23.0       Low    Fast
24.0       Low    Fast
22.0       Low    Fast
25.0       High   Slow
24.0       High   Slow
24.5       High   Slow
17.0       High   Fast
18.0       High   Fast
17.5       High   Fast

Two-way ANOVA: Strength versus Temp, Speed

Source       DF       SS       MS      F      P
Temp          1   3.5208   3.5208   5.45  0.048
Speed         1  20.0208  20.0208  31.00  0.001
Interaction   1  58.5208  58.5208  90.61  0.000
Error         8   5.1667   0.6458
Total        11  87.2292

S = 0.8036   R-Sq = 94.08%   R-Sq(adj) = 91.86%

The results show significant main effects as well as an interaction
effect.

Two-Way ANOVA

The box plots give an indication of the interaction effect. The
effect of speed on the response is different for different levels of
temperature. Thus, there is an interaction effect between
temperature and speed.

[Figure: Boxplot of Strength by Temp, Speed.
 y-axis: Strength (16 to 25);
 x-axis: Speed (Fast, Slow) within Temp (High, Low)]

```
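
The z-test and t-test statistics reported in the Minitab output above can be re-derived by hand. The following is a minimal sketch using only the Python standard library; the function names are illustrative assumptions, not part of the slides or of Minitab. It computes p-values only for the two z-tests, since the standard library has no t-distribution, and it assumes the M and F data columns read as listed in the two-sample slide.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev, variance

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def one_sample_z(sample_mean, mu0, sigma, n):
    """Two-sided z-test of a mean when the population sd is known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return z, 2 * (1 - STD_NORMAL.cdf(abs(z)))


def one_proportion_z(successes, n, p0):
    """Upper-tailed z-test of a proportion (normal approximation)."""
    z = (successes / n - p0) / sqrt(p0 * (1 - p0) / n)
    return z, 1 - STD_NORMAL.cdf(z)


def one_sample_t(data, mu0):
    """t statistic for a mean with unknown sd (df = n - 1)."""
    return (mean(data) - mu0) / (stdev(data) / sqrt(len(data)))


def pooled_two_sample_t(a, b):
    """Equal-variance t statistic and pooled sd (df = n_a + n_b - 2)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    pooled_sd = sqrt(pooled_var)
    t = (mean(a) - mean(b)) / (pooled_sd * sqrt(1 / na + 1 / nb))
    return t, pooled_sd


# BP reduction data as read from the slides (assumed column order).
men = [10, 12, 9, 8, 7, 12, 14, 13]
women = [15, 16, 18, 12, 18, 19, 20, 17, 15]

z_mean, p_mean = one_sample_z(17, 15, 4, 25)
z_prop, p_prop = one_proportion_z(140, 200, 0.66)
t_one = one_sample_t(men + women, 13)
t_two, sd_pooled = pooled_two_sample_t(men, women)

print(f"one-sample z:   z = {z_mean:.2f}, p = {p_mean:.3f}")
print(f"proportion z:   z = {z_prop:.2f}, p = {p_prop:.3f}")
print(f"one-sample t:   t = {t_one:.2f}")
print(f"two-sample t:   t = {t_two:.2f}, pooled sd = {sd_pooled:.4f}")
```

The printed statistics should agree with the Z, T, and pooled standard deviation values in the Minitab tables to the precision shown there.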
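
The sum-of-squares partition described in the Basics of ANOVA slide can be checked directly on the primate thumb-length example. This is a small sketch in plain Python, not the Minitab procedure; `one_way_anova` and its dictionary keys are hypothetical names chosen for illustration. It stops at the F-value, because converting F to a p-value needs an F-distribution that the standard library does not provide.

```python
from statistics import mean


def sum_of_squares(values, center):
    """Sum of squared deviations of the values from a given center."""
    return sum((v - center) ** 2 for v in values)


def one_way_anova(groups):
    """Partition total SS into between-group and within-group parts."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)

    ss_total = sum_of_squares(all_values, grand_mean)
    ss_within = sum(sum_of_squares(g, mean(g)) for g in groups)
    # Between-group SS: each group mean's squared distance from the
    # grand mean, weighted by the group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f_value": f_value,
    }


type_a = [2, 3, 4]
type_b = [6, 7, 8]
result = one_way_anova([type_a, type_b])
print(result)
```

For this data the between and within parts add up to the total SS, which is the identity the F-test relies on.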