The Elements of a Statistical Test
1. Null hypothesis, H0.
2. Alternative hypothesis, Ha.
3. Test statistic.
4. Rejection region.
Definition
A type I error is made if H0 is rejected when H0 is true.
The probability of type I error is denoted by α.
The value of α is called the level of the test.
A type II error is made if H0 is accepted when Ha
is true.
The probability of a type II error is denoted by β.
Example 10.5
Claim: salespeople are averaging 15
We know that for a large enough n, the sample mean $\bar{Y}$ is a point estimator of μ that is approximately
normally distributed with $\mu_{\bar{Y}} = \mu$ and $\sigma_{\bar{Y}} = \sigma/\sqrt{n}$.
Hence our test statistic is
$$Z = \frac{\bar{Y} - \mu_0}{\sigma_{\bar{Y}}} = \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}}.$$
The rejection region, with α = .05, is given by
$$\{ z > z_{.05} = 1.645 \}.$$
The population variance can be approximated by the
sample variance.
$$z = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{17 - 15}{3/\sqrt{36}} = 4 > 1.645$$
The observed z falls in the rejection region, so we reject H0 at the α = .05 level:
the evidence indicates that the mean number of calls exceeds 15.
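The arithmetic of this example can be checked with a short sketch. The function name `upper_tail_z_test` is a hypothetical helper, not something from the slides; it assumes, as the example does, that s may stand in for σ when n is large.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(ybar, mu0, s, n, alpha=0.05):
    """Large-sample upper-tail Z test; s is used in place of sigma."""
    z = (ybar - mu0) / (s / sqrt(n))           # observed test statistic
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value z_alpha
    return z, z_alpha, z > z_alpha


# Numbers from Example 10.5: ybar = 17, mu0 = 15, s = 3, n = 36.
z, z_crit, reject = upper_tail_z_test(ybar=17, mu0=15, s=3, n=36)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

The critical value is computed from the standard normal quantile function rather than read from a table, so it agrees with 1.645 only to table precision.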
Large Sample α-Level Hypothesis Test
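$$H_0: \theta = \theta_0$$
$$H_a: \begin{cases} \theta > \theta_0 & \text{(upper-tail alternative)} \\ \theta < \theta_0 & \text{(lower-tail alternative)} \\ \theta \neq \theta_0 & \text{(two-tail alternative)} \end{cases}$$
Test statistic:
$$Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$$
Rejection region:
$$\text{RR} = \begin{cases} \{z > z_\alpha\} & \text{(upper-tail RR)} \\ \{z < -z_\alpha\} & \text{(lower-tail RR)} \\ \{|z| > z_{\alpha/2}\} & \text{(two-tail RR)} \end{cases}$$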
Calculating Type II Error Probabilities and
Finding the Sample Size for the Z Test
Calculating β can be difficult, but it is easy for the
tests we will use.
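For an upper-tail test with rejection region $\{\hat{\theta} > k\}$:
$$\beta = P(\hat{\theta} \text{ is not in RR when } H_a \text{ is true}) = P(\hat{\theta} \le k \text{ when } \theta = \theta_a)$$
$$= P\left( \frac{\hat{\theta} - \theta_a}{\sigma_{\hat{\theta}}} \le \frac{k - \theta_a}{\sigma_{\hat{\theta}}} \text{ when } \theta = \theta_a \right)$$
If $\theta_a$ is the true value of θ, then $(\hat{\theta} - \theta_a)/\sigma_{\hat{\theta}}$ has
approximately a standard normal distribution.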
Example 10.8
Same claim as before, but now you want to detect a
difference equal to one call in the mean number of
customer calls per week:
H0 : μ = 15
Ha : μ = 16
Find β for this test with α = .05.
Solution
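From the previous example we know that the
rejection region for a .05 level test was given by
$$z = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} > 1.645,$$
which is equivalent to
$$\bar{y} - \mu_0 > 1.645\, s/\sqrt{n}.$$
From this we calculate
$$\bar{y} > 15 + 1.645 \cdot 3/\sqrt{36} = 15.8225 = k.$$
Draw a figure showing the rejection region.
$$\beta = P(\bar{Y} \le 15.8225 \text{ when } \mu = 16) = P\left( \frac{\bar{Y} - \mu_a}{\sigma/\sqrt{n}} \le \frac{15.8225 - 16}{3/\sqrt{36}} \right) = P(Z \le -.36) = .3594$$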
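The calculation of β for this example can be reproduced with a minimal sketch. The helper name `beta_upper_tail` is hypothetical, and σ is taken to be 3 (the sample standard deviation from Example 10.5). The result differs slightly from .3594 because the slides round the z value to −.36 before using the normal table.

```python
from math import sqrt
from statistics import NormalDist


def beta_upper_tail(mu0, mu_a, sigma, n, alpha=0.05):
    """Type II error probability of an upper-tail Z test at mu = mu_a."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                             # standard error of the mean
    k = mu0 + std_normal.inv_cdf(1 - alpha) * se     # boundary of the RR {ybar > k}
    beta = std_normal.cdf((k - mu_a) / se)           # P(Ybar <= k when mu = mu_a)
    return k, beta


k, beta = beta_upper_tail(mu0=15, mu_a=16, sigma=3, n=36)
print(f"k = {k:.4f}, beta = {beta:.4f}")
```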
Sample Size for an Upper-Tail α-Level Test
$$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu_a - \mu_0)^2}$$
Example 10.9
Consider the same problem as in the previous
example (10.8), except that now:
H0 : μ=15
Ha : μ=16
with α= β=.05
Find the sample size that will ensure this accuracy, taking σ² ≈ 9 as in Example 10.5.
Solution
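Since α = β = .05, it follows that $z_\alpha = z_\beta = z_{.05} = 1.645$.
Then
$$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu_a - \mu_0)^2} = \frac{(1.645 + 1.645)^2 (9)}{(16 - 15)^2} = 97.4.$$
Rounding up, n = 98 observations are needed.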
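The sample-size formula for the upper-tail test can be sketched as a small function. The name `sample_size_upper_tail` is hypothetical; the inputs are those of Example 10.9 with σ = 3 assumed from the earlier sample.

```python
from math import ceil
from statistics import NormalDist


def sample_size_upper_tail(mu0, mu_a, sigma, alpha, beta):
    """n = (z_alpha + z_beta)^2 * sigma^2 / (mu_a - mu0)^2, plus its ceiling."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    z_beta = std_normal.inv_cdf(1 - beta)
    n_exact = (z_alpha + z_beta) ** 2 * sigma ** 2 / (mu_a - mu0) ** 2
    return n_exact, ceil(n_exact)  # round up to a whole number of observations


n_exact, n_required = sample_size_upper_tail(mu0=15, mu_a=16, sigma=3,
                                             alpha=0.05, beta=0.05)
print(f"n = {n_exact:.1f}, use n = {n_required}")
```

Rounding up rather than to the nearest integer keeps both error probabilities at or below their targets.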
Relationship between hypothesis-testing
procedures and confidence intervals.
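In Section 8.6 we obtained the result
$$P\left( -z_{\alpha/2} \le \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} \le z_{\alpha/2} \right) = 1 - \alpha.$$
The event inside the parentheses, with θ replaced by $\theta_0$, is the
acceptance region of the two-tailed α-level test.
Thus when we test $H_0: \theta = \theta_0$ against a two-sided
alternative, we fail to reject H0 exactly when $\theta_0$ lies inside the
100(1 − α)% confidence interval $\hat{\theta} \pm z_{\alpha/2}\sigma_{\hat{\theta}}$. In this sense $\theta_0$ is one of many
parameter values that are consistent with the observed estimate.
The one-sided tests correspond in the same way to lower
confidence bounds and upper confidence bounds.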
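The correspondence between a two-sided confidence interval and a two-tailed test can be illustrated with a brief sketch. The helper names `two_sided_ci` and `two_tailed_rejects` are hypothetical; the numbers reuse the estimate 17 and standard error 3/√36 from Example 10.5, and the trial values of θ0 are arbitrary.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_ci(theta_hat, se, alpha=0.05):
    """Large-sample 100(1 - alpha)% confidence interval for theta."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z_half * se, theta_hat + z_half * se


def two_tailed_rejects(theta_hat, theta0, se, alpha=0.05):
    """True if the two-tailed alpha-level Z test rejects H0: theta = theta0."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    return abs((theta_hat - theta0) / se) > z_half


se = 3 / sqrt(36)
lower, upper = two_sided_ci(17, se)
for theta0 in (15, 16.5, 17.5):  # arbitrary trial values of theta0
    inside = lower <= theta0 <= upper
    rejected = two_tailed_rejects(17, theta0, se)
    print(f"theta0 = {theta0}: inside CI = {inside}, H0 rejected = {rejected}")
```

For each trial value, H0 is rejected precisely when the value falls outside the interval.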
Attained significance levels or p-values
If W is a test statistic, the p-value, or attained
significance level, is the smallest level of significance
α for which the observed data indicate that the null
hypothesis should be rejected.
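For an upper-tail Z test the p-value is the upper-tail area beyond the observed statistic. A minimal sketch, with the hypothetical helper `upper_tail_p_value` applied to the value z = 4 observed in Example 10.5:

```python
from statistics import NormalDist


def upper_tail_p_value(z_observed):
    """Smallest alpha at which an upper-tail Z test rejects H0."""
    return 1 - NormalDist().cdf(z_observed)


p_value = upper_tail_p_value(4.0)
print(f"p-value = {p_value:.2e}")
```

H0 is rejected at any level α that is at least as large as this p-value, which is why the α = .05 test above rejects.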
Small Sample Hypothesis Testing for μ and μ1- μ2
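Assume that $Y_1, Y_2, \ldots, Y_n$ denote a random sample of
size n from a normal distribution with unknown mean
μ and unknown variance σ². If $\bar{Y}$ and S denote the
sample mean and sample standard deviation,
respectively, and if H0: μ = μ0 is true, then
$$T = \frac{\bar{Y} - \mu_0}{S/\sqrt{n}}$$
has a t distribution with n − 1 degrees of freedom.
Then, depending on the alternative hypothesis
$$H_a: \begin{cases} \mu > \mu_0 & \text{(upper-tail alternative)} \\ \mu < \mu_0 & \text{(lower-tail alternative)} \\ \mu \neq \mu_0 & \text{(two-tailed alternative)} \end{cases}$$
the rejection region is
$$\text{RR} = \begin{cases} \{t > t_\alpha\} & \text{(upper-tail RR)} \\ \{t < -t_\alpha\} & \text{(lower-tail RR)} \\ \{|t| > t_{\alpha/2}\} & \text{(two-tailed RR)} \end{cases}$$
where the critical values come from the t distribution with n − 1 degrees of freedom.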
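A sketch of the small-sample T statistic follows. The data are hypothetical (they are not from the slides), and the critical value 1.833 is the standard t-table entry for α = .05 with 9 degrees of freedom; an upper-tail test rejects H0 when the observed t exceeds it.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """T = (ybar - mu0) / (s / sqrt(n)) with n - 1 degrees of freedom."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))  # stdev divides by n - 1
    return t, n - 1


# Hypothetical weekly call counts for n = 10 salespeople.
calls = [16, 14, 17, 18, 15, 16, 19, 17, 15, 18]
t_stat, df = one_sample_t(calls, mu0=15)

T_05_9DF = 1.833  # t table: upper .05 point, 9 degrees of freedom
reject = t_stat > T_05_9DF
print(f"t = {t_stat:.3f} with {df} df, reject H0: {reject}")
```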
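In a similar fashion we proceed with small-sample
tests for comparing two population means.
Assumption: independent samples from normal
distributions with $\sigma_1^2 = \sigma_2^2$.
$$H_0: \mu_1 - \mu_2 = D_0$$
$$H_a: \begin{cases} \mu_1 - \mu_2 > D_0 & \text{(upper-tail alternative)} \\ \mu_1 - \mu_2 < D_0 & \text{(lower-tail alternative)} \\ \mu_1 - \mu_2 \neq D_0 & \text{(two-tailed alternative)} \end{cases}$$
Test statistic:
$$T = \frac{\bar{Y}_1 - \bar{Y}_2 - D_0}{S_p \sqrt{1/n_1 + 1/n_2}},$$
where
$$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}.$$
Rejection region:
$$\text{RR} = \begin{cases} \{t > t_\alpha\} & \text{(upper-tail RR)} \\ \{t < -t_\alpha\} & \text{(lower-tail RR)} \\ \{|t| > t_{\alpha/2}\} & \text{(two-tailed RR)} \end{cases}$$
where the critical values come from the t distribution with $n_1 + n_2 - 2$ degrees of freedom.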
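The pooled two-sample statistic can be sketched from summary statistics. The function name `pooled_t` and all input numbers are hypothetical; the critical value 1.734 is the standard t-table entry for α = .05 with 18 degrees of freedom.

```python
from math import sqrt


def pooled_t(ybar1, s1_sq, n1, ybar2, s2_sq, n2, d0=0.0):
    """Pooled-variance T statistic for H0: mu1 - mu2 = d0."""
    df = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df)  # pooled std deviation
    t = (ybar1 - ybar2 - d0) / (sp * sqrt(1 / n1 + 1 / n2))
    return t, df


# Hypothetical summaries: sample means, sample variances, sample sizes.
t_stat, df = pooled_t(ybar1=20.0, s1_sq=4.0, n1=10,
                      ybar2=18.0, s2_sq=6.0, n2=10)

T_05_18DF = 1.734  # t table: upper .05 point, 18 degrees of freedom
print(f"t = {t_stat:.3f} with {df} df, reject H0 (upper tail): {t_stat > T_05_18DF}")
```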
Test of Hypotheses Concerning a Population
Variance
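Assumption: $Y_1, Y_2, \ldots, Y_n$ constitute a random sample
from a normal distribution with
$E(Y_i) = \mu$ and $V(Y_i) = \sigma^2$.
$$H_0: \sigma^2 = \sigma_0^2$$
$$H_a: \begin{cases} \sigma^2 > \sigma_0^2 & \text{(upper-tail alternative)} \\ \sigma^2 < \sigma_0^2 & \text{(lower-tail alternative)} \\ \sigma^2 \neq \sigma_0^2 & \text{(two-tailed alternative)} \end{cases}$$
Test statistic:
$$\chi^2 = \frac{(n - 1) S^2}{\sigma_0^2}$$
Rejection Region:
$$\text{RR} = \begin{cases} \{\chi^2 > \chi^2_\alpha\} & \text{(upper-tail RR)} \\ \{\chi^2 < \chi^2_{1-\alpha}\} & \text{(lower-tail RR)} \\ \{\chi^2 > \chi^2_{\alpha/2} \text{ or } \chi^2 < \chi^2_{1-\alpha/2}\} & \text{(two-tailed RR)} \end{cases}$$
where the critical values come from the χ² distribution with n − 1 degrees of freedom.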
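The variance test statistic is a one-line computation. In the sketch below the function name `variance_chi_square` and the inputs are hypothetical; 16.919 is the standard χ² table entry for the upper .05 point with 9 degrees of freedom.

```python
def variance_chi_square(s_sq, sigma0_sq, n):
    """Chi-square statistic (n - 1) * S^2 / sigma0^2 with n - 1 df."""
    return (n - 1) * s_sq / sigma0_sq, n - 1


# Hypothetical data summary: sample variance 8 from n = 10, H0: sigma^2 = 4.
chi_sq, df = variance_chi_square(s_sq=8.0, sigma0_sq=4.0, n=10)

CHI2_05_9DF = 16.919  # chi-square table: upper .05 point, 9 degrees of freedom
print(f"chi-square = {chi_sq:.2f} with {df} df, "
      f"reject H0 (upper tail): {chi_sq > CHI2_05_9DF}")
```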
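Test of the Hypothesis $\sigma_1^2 = \sigma_2^2$
Assumptions: independent samples from normal
populations
$$H_0: \sigma_1^2 = \sigma_2^2$$
$$H_a: \sigma_1^2 > \sigma_2^2$$
Test Statistic:
$$F = \frac{S_1^2}{S_2^2}$$
Rejection Region: $F > F_\alpha$, where $F_\alpha$ is chosen so that
$P(F > F_\alpha) = \alpha$ when F has $\nu_1 = n_1 - 1$ numerator
degrees of freedom and $\nu_2 = n_2 - 1$ denominator
degrees of freedom.
If $H_a: \sigma_1^2 \neq \sigma_2^2$,
RR: $F > F^{\,n_1 - 1}_{\,n_2 - 1,\ \alpha/2}$ or $F < \left( F^{\,n_2 - 1}_{\,n_1 - 1,\ \alpha/2} \right)^{-1}$,
where the superscript gives the numerator and the subscript the denominator degrees of freedom.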
Power of Tests and the Neyman-Pearson Lemma
The goodness of a test is measured by α and β, the
probabilities of type I and type II errors, where α is
chosen in advance and determines the location of the
rejection region.
A related but more useful concept for evaluating the
performance of a test is called the power of the test.
Definition
Suppose W is the test statistic and RR is the rejection
region for a test of a hypothesis involving the value
of a parameter θ. Then the power of the test, denoted
by power(θ), is the probability that the test will lead
to rejection of H0 when the actual parameter value is
θ. That is,
power(θ) = P(W in RR when the parameter value is θ).
Relationship between Power and β
If $\theta_a$ is a value of θ in the alternative hypothesis Ha,
then
$$\text{power}(\theta_a) = 1 - \beta(\theta_a).$$
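The power function of the upper-tail Z test from Examples 10.5 and 10.8 can be sketched as follows. The function name `power_upper_tail` and its default arguments (μ0 = 15, σ = 3, n = 36, α = .05) are assumptions carried over from those examples. At μ = μ0 the power equals α, and at μ = 16 it equals 1 − β.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(mu, mu0=15, sigma=3, n=36, alpha=0.05):
    """P(Ybar > k when the true mean is mu) for the upper-tail Z test."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    k = mu0 + std_normal.inv_cdf(1 - alpha) * se  # rejection boundary
    return 1 - std_normal.cdf((k - mu) / se)


for mu in (15, 15.5, 16, 16.5):
    print(f"power({mu}) = {power_upper_tail(mu):.4f}")
```

The printed values increase with μ, reflecting that larger departures from H0 are easier to detect.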
Definition 10.4
If a random sample is taken from a distribution with
parameter θ, a hypothesis is said to be a simple
hypothesis if that hypothesis uniquely specifies the
distribution of the population from which the sample
is taken. Any hypothesis that is not simple is called a
composite hypothesis.
The Neyman-Pearson Lemma
Suppose that we wish to test the simple null
hypothesis $H_0: \theta = \theta_0$ versus the simple alternative
hypothesis $H_a: \theta = \theta_a$, based on a random sample
$Y_1, Y_2, \ldots, Y_n$ from a distribution with parameter θ. Let
L(θ) denote the likelihood of the sample when the
value of the parameter is θ. Then for a given α, the
test that maximizes the power at $\theta_a$ has a rejection
region, RR, determined by
$$\frac{L(\theta_0)}{L(\theta_a)} < k.$$
The value of k is chosen so that the test has the
desired value for α. Such a test is a most powerful
α-level test for H0 versus Ha.
Likelihood Ratio Test
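Define λ by
$$\lambda = \frac{L(\hat{\Omega}_0)}{L(\hat{\Omega})} = \frac{\max_{\Theta \in \Omega_0} L(\Theta)}{\max_{\Theta \in \Omega} L(\Theta)},$$
where $\Omega = \Omega_0 \cup \Omega_a$.
A likelihood ratio test of $H_0: \Theta \in \Omega_0$ versus $H_a: \Theta \in \Omega_a$
employs λ as a test statistic, and the rejection region is
determined by $\lambda \le k$.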
Theorem 10.2
Let $Y_1, Y_2, \ldots, Y_n$ have joint likelihood function $L(\Theta)$.
Let $r_0$ denote the number of free parameters that are
specified by $H_0: \Theta \in \Omega_0$, and let r denote the
number of free parameters specified by the statement
$\Theta \in \Omega$. Then for large n, $-2\ln\lambda$ has approximately a χ²
distribution with $r_0 - r$ degrees of freedom.
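As an illustration of the statistic −2 ln λ, the sketch below treats a normal sample with known σ and tests H0: μ = μ0 against an unrestricted mean. This setting, the data, and the function names are all hypothetical choices made for the example. H0 fixes one free parameter and the full model fixes none, so the reference distribution is χ² with 1 degree of freedom, whose upper .05 point is 3.841.

```python
from math import log, pi
from statistics import mean


def normal_loglik(sample, mu, sigma):
    """Log-likelihood of an i.i.d. normal sample with mean mu and known sigma."""
    n = len(sample)
    return (-0.5 * n * log(2 * pi * sigma ** 2)
            - sum((y - mu) ** 2 for y in sample) / (2 * sigma ** 2))


def neg2_log_lambda(sample, mu0, sigma):
    """-2 ln(lambda) for H0: mu = mu0 versus an unrestricted mean."""
    mu_hat = mean(sample)  # maximum likelihood estimate under the full model
    return -2 * (normal_loglik(sample, mu0, sigma)
                 - normal_loglik(sample, mu_hat, sigma))


sample = [16, 14, 17, 18, 15, 16, 19, 17, 15, 18]  # hypothetical data
stat = neg2_log_lambda(sample, mu0=15, sigma=2)     # sigma = 2 is assumed known

CHI2_05_1DF = 3.841  # chi-square table: upper .05 point, 1 degree of freedom
print(f"-2 ln(lambda) = {stat:.3f}, reject H0: {stat > CHI2_05_1DF}")
```

In this particular case the statistic reduces to n(ȳ − μ0)²/σ², the square of the usual Z statistic.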
Some Comments on the Theory of Hypothesis
Testing
1. How do we choose between a one-tailed and a two-tailed
test?
It depends on the practical interest. If we need a
precise measurement of something and any deviation
from the target, in either direction, would be harmful,
we want a two-tailed test. On the other hand, if we are
hedging against inflation risk and will suffer only from
high levels of inflation, a one-tailed test might be
sufficient.
2. Calculating the probability of a type I error depends
on the value of the parameter specified in the null
hypothesis, while calculating the probability of a type II
error requires a clearly defined value of the parameter
under the alternative.
3. When a truly meaningful and believable value of
type II error can be calculated, we should feel
justified in accepting the null hypothesis.
4. When it is impossible to obtain a meaningful value
of the type II error, we modify our procedure as
follows. When the value of the test statistic is not in
the rejection region, we will “fail to reject” rather
than “accept” the null hypothesis.
5. If the null hypothesis is rejected for a “small” type I
error probability, it does not mean that the null is wrong
by a “large” amount. It means that the null was rejected
by a procedure that has only a small probability of
rejecting a true null hypothesis.
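6. Formulating
(i) $H_0: p = .5$ versus $H_a: p < .5$
will lead to exactly the same conclusions as
(ii) $H_0: p \ge .5$ versus $H_a: p < .5$.
That is, the rejection region is the same in both cases, and
the level α computed at p = .5 is the largest type I error
probability over the null values in (ii). Thus we can
simplify our life by using (i) instead of (ii).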
Summary
You can pose two types of questions:
1. What is the true value of θ?
2. Is $\theta_0$ the true value of θ?
In this chapter we were answering the second
question.
A two tailed test can be viewed as finding the region
of acceptable null hypothesis values.
Type I and type II errors measure the goodness of
statistical inference.
Degrees of Freedom
Degrees of freedom (df) measure the
number of independent pieces of information on
which the precision of a parameter estimate is based.
The degrees of freedom for an estimate equal the
number of observations (values) minus the number of
additional parameters estimated for that calculation.
As we have to estimate more parameters, the degrees
of freedom available decrease. Degrees of freedom can
also be thought of as the number of observations (values)
that are free to vary given the additional
parameters estimated.