# web.as.uky.edustatisticsuserskvielesta570s10p by dffhrtcv3

```
Everything about Single Proportions

STA 570 401-402
Spring 2010
Things we’ve learned about
hypothesis testing

   1) Planning a hypothesis test
–   For fixed α and power at a fixed alternative value, we
can determine the required sample size n.
   2) Planning or Analysis phase of a hypothesis
test.
–   For fixed n and α, we can compute the cutoff for the
hypothesis test (where to reject).
–   For fixed n and α, we can compute the power for a
fixed alternative value.
   3) Analysis phase of a hypothesis test (e.g. we
have data).
–   Determine whether to reject or fail to reject H0.
–   Compute the p-value of the test.
Things we’ve learned about
confidence intervals

   1) Prior to collecting data
–   We can determine the required sample size to
achieve a particular confidence interval length.
   2) After collecting data
–   We can construct a confidence interval from the
data.
Information common to all
hypothesis tests we have studied
or will study

   The null hypothesis states a parameter is equal
to a specific value, while the alternative can
state that the parameter is any of “<”, “>”, or “≠”
that specific value.
   The cutoff of the hypothesis test is based on the
null distribution, which is the sampling
distribution when H0 is true. The cutoff is in the
direction/s of the alternative hypothesis.
   Cutoff/s
–   α percentile for a “<” alternative (reject below)
–   1-α percentile for a “>” alternative (reject above)
–   both the α/2 and 1-(α/2) percentiles for a “≠”
alternative (reject to the extremes)
More common information

   Finding the power of the test requires the
alternative distribution, the sampling
distribution at a fixed point in the alternative
hypothesis.
   The power of the test is the probability of
rejecting the null hypothesis when the
alternative hypothesis is true (the right
answer, thus we want a high power).
More common information

   To compute a sample size in advance of the
experiment, we want to find a sample size
which simultaneously achieves a given α and
power.
   We need both the null and alternative
distributions in terms of an unknown n.
   We then solve an equation equating
percentiles of the null and alternative
distributions, based on the direction of the
alternative.
Which percentiles to equate?

   To find the required sample size in
advance...
–   “<” alternative – equate α percentile of null to
POW percentile of alternative.
–   “>” alternative – equate 1-α percentile of null to
1-POW percentile of alternative.
–   “≠” alternative with alternative value less than null
value – equate the α/2 percentile of null to POW
percentile of alternative.
–   “≠” alternative with alternative value greater than
null value – equate the 1-(α/2) percentile of null to
1-POW percentile of alternative.
Common information about p-values

   p-values are computed from the data.
   For all α > p-value, reject the null hypothesis
H0. For all α < p-value, do not reject H0.
   Way to remember – small p-values typically
result in rejecting H0.
   To compute a p-value, you need to compute
a probability involving the null distribution, in
the direction of the alternative hypothesis.
Computing p-values

   To compute a p-value, compute the following
probabilities under the null distribution.
–   “<” alternative – compute the probability below the
data.
–   “>” alternative – compute the probability above
the data.
–   “≠” alternative, data below the null value –
compute the probability below the data AND
double the result.
–   “≠” alternative, data above the null value –
compute the probability above the data AND
double the result.
Common information about
confidence intervals

   For STA570 (there are more general situations
we do not consider) a confidence interval is
centered at the best point estimate available.
   The width of the confidence interval is
determined by computing the width needed to
contain the middle 1-α of the sampling
distribution.
   If that width itself depends on the parameter
(e.g. p(1-p) involves the parameter p), then
estimate the parameter in determining the width.
   To determine a sample size in advance, set up
an equation relating the length of the interval to
n, then solve for n.
Inference for Proportions

   All inference for proportions is based on
drawing a random sample of size n from a
large population (remember n>30 and we
assume we are sampling less than 5% of the
population).
   We estimate the population proportion p with
the sample proportion phat. While phat is
usually not exactly equal to p, it varies in a
close range around p defined by the
sampling distribution
phat ~ N(p, sqrt(p(1-p)/n))
Conducting a hypothesis test

   We have a null hypothesis that specifies a
single value for p, H0 : p=p0.
   We are testing against an alternative H1 :
p<p0, H1 : p>p0, or H1 : p≠p0.
   The null distribution is N(p0, sqrt(p0(1-p0)/n))
   Where useful, the alternative distribution is
N(p1, sqrt(p1(1-p1)/n)) for a fixed p1.
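   Illustration (numbers chosen here, not from the
course): H0 : p=0.5 vs H1 : p<0.5, n=100, α=0.05.
–   Null distribution is N(0.5, sqrt(0.5(0.5)/100)) =
N(0.5, 0.05).
–   Cutoff is the α percentile: 0.5 - 1.645(0.05) ≈
0.418, so reject H0 when phat < 0.418.
–   If phat=0.4, then Z = (0.4-0.5)/0.05 = -2 and the
p-value is P(Z < -2) ≈ 0.0228, which is below α, so
reject H0.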
Determining sample sizes for
hypothesis tests

   Let s0=sqrt(p0(1-p0)) and s1=sqrt(p1(1-p1))
   Let z0 and z1 be the Z values corresponding
to the appropriate percentiles from the null
and alternative distribution.
   The minimum required sample size, for all
cases, is
n ≥ ((z1 s1 - z0 s0) / (p0 - p1))^2
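   Illustration (numbers chosen here, not from the
course): H0 : p=0.5 vs H1 : p<0.5, α=0.05, power 0.80
at p1=0.4.
–   z0 = -1.645 (α percentile of Z) and z1 = 0.842
(POW percentile of Z).
–   s0 = sqrt(0.5(0.5)) = 0.5 and s1 = sqrt(0.4(0.6)) ≈
0.490.
–   n ≥ ((0.842(0.490) - (-1.645)(0.5)) / (0.5 - 0.4))^2
≈ (1.235 / 0.1)^2 ≈ 152.5, so take n = 153.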
Confidence Intervals

   A confidence interval takes the observed value
of phat and computes a range of possible values
for the population proportion p.
   The “confidence level” of this interval refers to
the probability the procedure will produce an
interval containing the population proportion.
Here α is 1 minus the confidence level (e.g. 99%
confidence results in α=0.01)
   Let z* be the Z-score corresponding to the
1-(α/2) percentile. Note the interval between Z=-z*
and Z=+z* contains probability equal to the
confidence level.
Confidence Intervals

   A confidence interval for p is
phat ± z* sqrt(phat(1-phat)/n)
   Note we have estimated the standard
deviation using phat. This does not cause
problems for the sample sizes we are
considering (n>30).
   Note the length of this confidence interval is
2 z* sqrt(phat(1-phat)/n)
Computing sample sizes for
confidence intervals

   Recall the length of the confidence interval is
2 z* sqrt(phat(1-phat)/n)
   To make sure this is less than a prespecified
length L, you need n to be at least

n ≥ 4 phat(1-phat) (z*)^2 / L^2
   If you have a guess of phat in advance, use
it. Otherwise, guess phat=0.5 to protect
yourself against all phats simultaneously.
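   Illustration (numbers chosen here, not from the
course): 95% confidence (z* = 1.96), desired length
L = 0.1, no advance guess so phat = 0.5.
–   n ≥ 4(0.5)(0.5)(1.96)^2 / (0.1)^2 = 3.8416 / 0.01 =
384.16, so take n = 385.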
Common threads

   We make repeated use of the sampling
distribution (through the null distribution, the
alternative distribution, and the form of the
confidence interval).
   Different situations have different sampling
distributions, but the way we use them
remains the same. We will still want particular
percentiles of the null and alternative
distributions, for example.

```
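The hypothesis-test recipes in the slides (cutoff, power, p-value, and sample size for a single proportion) can be sketched in Python as follows. This is a minimal sketch, not course material: the function names, the `"!="` spelling of the two-sided alternative, and the use of the standard library's `statistics.NormalDist` are all assumptions made here. The upper-tail sample-size cases use the 1-POW percentile of the alternative, mirroring the “>” case, and the two-sided sample size ignores the far tail.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal, used to look up Z values for percentiles


def sd(p, n):
    """Standard deviation of phat when the true proportion is p."""
    return sqrt(p * (1 - p) / n)


def cutoffs(p0, n, alpha, alternative):
    """Cutoff(s) for phat, taken from the null distribution N(p0, sd(p0, n))."""
    null = NormalDist(p0, sd(p0, n))
    if alternative == "<":
        return (null.inv_cdf(alpha),)          # reject below
    if alternative == ">":
        return (null.inv_cdf(1 - alpha),)      # reject above
    if alternative == "!=":
        return (null.inv_cdf(alpha / 2), null.inv_cdf(1 - alpha / 2))
    raise ValueError("alternative must be '<', '>' or '!='")


def power(p0, p1, n, alpha, alternative):
    """Probability of rejecting H0 when the true proportion is p1."""
    cuts = cutoffs(p0, n, alpha, alternative)
    alt = NormalDist(p1, sd(p1, n))            # alternative distribution
    if alternative == "<":
        return alt.cdf(cuts[0])
    if alternative == ">":
        return 1 - alt.cdf(cuts[0])
    return alt.cdf(cuts[0]) + (1 - alt.cdf(cuts[1]))


def p_value(phat, p0, n, alternative):
    """Probability under the null, in the direction of the alternative."""
    below = NormalDist(p0, sd(p0, n)).cdf(phat)
    if alternative == "<":
        return below
    if alternative == ">":
        return 1 - below
    if alternative == "!=":
        # Double the tail on the side of the null value where the data fell.
        return 2 * below if phat < p0 else 2 * (1 - below)
    raise ValueError("alternative must be '<', '>' or '!='")


def sample_size(p0, p1, alpha, pow_, alternative):
    """Smallest n with n >= ((z1*s1 - z0*s0) / (p0 - p1))**2."""
    if alternative not in ("<", ">", "!="):
        raise ValueError("alternative must be '<', '>' or '!='")
    if p1 == p0 or (alternative == "<" and p1 > p0) or (alternative == ">" and p1 < p0):
        raise ValueError("p1 must lie in the direction of the alternative")
    tail = alpha / 2 if alternative == "!=" else alpha
    if p1 < p0:   # rejecting below: tail percentile of null, POW of alternative
        z0, z1 = Z.inv_cdf(tail), Z.inv_cdf(pow_)
    else:         # rejecting above: 1-tail percentile of null, 1-POW of alternative
        z0, z1 = Z.inv_cdf(1 - tail), Z.inv_cdf(1 - pow_)
    s0, s1 = sqrt(p0 * (1 - p0)), sqrt(p1 * (1 - p1))
    return ceil(((z1 * s1 - z0 * s0) / (p0 - p1)) ** 2)
```

For example, `sample_size(0.5, 0.4, 0.05, 0.80, "<")` plans a one-sided test of H0 : p=0.5 with power 0.80 at p1=0.4, and `power(0.5, 0.4, n, 0.05, "<")` can then be used to check the result for a given n.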
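The confidence-interval formulas from the slides, phat ± z* sqrt(phat(1-phat)/n) and n ≥ 4 phat(1-phat) (z*)^2 / L^2, can be sketched the same way. Again this is only an illustration under assumptions made here: the function names and default arguments are hypothetical, and `statistics.NormalDist` from the Python standard library supplies the normal percentile.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_star(conf_level):
    """Z-score at the 1-(alpha/2) percentile, where alpha = 1 - conf_level."""
    alpha = 1 - conf_level
    return NormalDist().inv_cdf(1 - alpha / 2)


def proportion_ci(phat, n, conf_level=0.95):
    """Interval phat +/- z* sqrt(phat(1-phat)/n), estimating the sd with phat."""
    half_width = z_star(conf_level) * sqrt(phat * (1 - phat) / n)
    return phat - half_width, phat + half_width


def ci_sample_size(length, conf_level=0.95, phat_guess=0.5):
    """Smallest n whose interval length is at most `length`.

    phat_guess=0.5 is the conservative choice when no advance guess exists,
    because phat(1-phat) is largest at 0.5.
    """
    z = z_star(conf_level)
    return ceil(4 * phat_guess * (1 - phat_guess) * z ** 2 / length ** 2)
```

Calling `ci_sample_size(0.1)` gives the planning answer for a 95% interval of length 0.1 with no advance guess, and `proportion_ci(phat, n)` then produces the interval once data are in hand.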