# web.as.uky.edustatisticsusersvielesta570s08re by dffhrtcv3

```
Everything about Single Proportions

STA 570 001-002
Spring 2008
Inference for Proportions

   All inference for proportions is based on
drawing a random sample of size n from a
large population (remember n>30 and we
assume we are sampling less than 5% of the
population).
   We estimate the population proportion p with
the sample proportion phat. While phat is
usually not exactly equal to p, it varies in a
close range around p defined by the
sampling distribution
phat ~ N(p, sqrt(p(1-p)/n))
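
   Worked example (illustrative numbers, not from the
course): with p=0.5 and n=100, the standard deviation
of phat is sqrt(0.5(0.5)/100) = 0.05, so
phat ~ N(0.5, 0.05)
and about 95% of samples give a phat within
1.96(0.05) = 0.098 of p, that is, between 0.402
and 0.598.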
Inference for Proportions

   We now know how to
   1) Conduct a hypothesis test
   2) Compute power for a hypothesis test
   3) Compute the p-value for a hypothesis test
   4) Determine the sample size for a
hypothesis test
   5) Construct a confidence interval
   6) Determine a sample size in advance for a
confidence interval
Conducting a hypothesis test

   We have a null hypothesis that specifies a single
value for p, H0 : p=p0.
   We are testing against an alternative H1 : p<p0,
H1 : p>p0, or H1 : p≠p0.
   The null distribution is N(p0, sqrt(p0(1-p0)/n))
   The cutoff depends on the alternative. For “<”,
reject for phat below the α percentile of the null
distribution. For “>”, reject for phat above the 1-α
percentile of the null distribution. For “≠”, reject
for phat outside the α/2 and 1-(α/2) percentiles
of the null distribution.
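
   Worked example (illustrative numbers, not from the
course): test H0 : p=0.5 against H1 : p>0.5 with
n=100 and α=0.05.
   The null distribution is
N(0.5, sqrt(0.5(0.5)/100)) = N(0.5, 0.05)
   The cutoff is the 95th percentile of the null
distribution, 0.5 + 1.645(0.05), about 0.582.
   Reject H0 if phat > 0.582. An observed phat=0.60
rejects H0; an observed phat=0.55 does not.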
Computing power for a hypothesis
test

   Power is the probability of rejecting the null
hypothesis when H1 is true (the right decision).
   For composite alternatives, we talk about computing
power at a point in the alternative (points near the
null have power near α, points far from the null have
power near 1).
   So you are given a point p1 in the alternative.
   Compute the alternative distribution
N(p1, sqrt(p1(1-p1)/n))
   Find the region where you reject H0 (using the null
distribution as before).
   Compute the probability the alternative distribution
places in the rejection region. That is the power.
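
   Worked example (illustrative numbers, not from the
course): test H0 : p=0.5 against H1 : p>0.5 with
n=100 and α=0.05, and compute the power at p1=0.6.
   Rejection region, from the null distribution
N(0.5, 0.05): phat > 0.5 + 1.645(0.05) = 0.58225
   Alternative distribution:
N(0.6, sqrt(0.6(0.4)/100)) = N(0.6, 0.0490)
   Power = P(phat > 0.58225) under the alternative
= P(Z > (0.58225-0.6)/0.0490) = P(Z > -0.36),
which is about 0.64.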
P-values

   The p-value is defined as the transition point
between α values where you would reject H0
and α values where you would not reject H0.
   If you have a p-value, remember you reject
H0 whenever the p-value is smaller than α.
You do not reject when the p-value is bigger
than α.
   Computing the p-value depends on the
alternative AND requires the data.
Computing p-values

   For a “<” alternative, the p-value is the
probability the null distribution places below
phat.
   For a “>” alternative, the p-value is the
probability the null distribution places above phat.
   For a “≠” alternative, it depends on the value
of phat. For phat below p0, the p-value is
twice the probability the null distribution
places below phat. For phat above p0, the
p-value is twice the probability the null
distribution places above phat.
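
   Worked example (illustrative numbers, not from the
course): H0 : p=0.5, n=100, observed phat=0.60.
   The null distribution is N(0.5, 0.05), so phat=0.60
corresponds to Z = (0.60-0.5)/0.05 = 2.00
   For “>”: p-value = P(Z > 2.00) = 0.0228
   For “<”: p-value = P(Z < 2.00) = 0.9772
   For “≠”: phat is above p0, so
p-value = 2(0.0228) = 0.0456
   At α=0.05 you reject H0 for “>” and “≠” but not
for “<”.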
Determining sample sizes for
hypothesis tests

   A sample size calculation is premised on
finding a minimum sample size for achieving
a desired α while simultaneously achieving a
desired power (POW) at a particular point p1.
   Remember that power changes as p varies in the
alternative hypothesis, so you have to pick a
value of p1 that you consider meaningful.
   How to compute this sample size depends
on the alternative.
Determining sample sizes for
hypothesis tests

   For a “<” alternative, you need to equate the α
percentile of the null distribution with the POW
percentile of the alternative distribution.
   For a “>” alternative, you need to equate the 1-α
percentile of the null distribution with the 1-POW
percentile of the alternative distribution.
   For a “≠” alternative with p1<p0, you need to equate
the α/2 percentile of the null distribution to the POW
percentile of the alternative distribution.
   For a “≠” alternative with p1>p0, you need to equate
the 1-(α/2) percentile of the null distribution to the
1-POW percentile of the alternative distribution.
Determining sample sizes for
hypothesis tests

   Let s0=sqrt(p0(1-p0)) and s1=sqrt(p1(1-p1))
   Let z0 and z1 be the Z values corresponding
to the appropriate percentiles of the null
and alternative distributions, respectively.
   The minimum required sample size, for all
cases, is
n = ( (z1 s1 - z0 s0) / (p0 - p1) )^2
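
   Where the formula comes from, for a “>” alternative:
the 1-α percentile of the null distribution is
p0 + z0 s0/sqrt(n) and the 1-POW percentile of the
alternative distribution is p1 + z1 s1/sqrt(n).
Setting them equal gives
sqrt(n) (p0 - p1) = z1 s1 - z0 s0
Dividing by (p0 - p1) and squaring gives n.
   Worked example (illustrative numbers, not from the
course): H0 : p=0.5 against H1 : p>0.5, α=0.05,
POW=0.80 at p1=0.6.
   z0 = 1.645 (95th percentile)
   z1 = -0.842 (20th percentile)
   s0 = sqrt(0.5(0.5)) = 0.5
   s1 = sqrt(0.6(0.4)) = 0.490
   n = ( (-0.842(0.490) - 1.645(0.5)) / (0.5 - 0.6) )^2
     = ( -1.235 / -0.1 )^2 = (12.35)^2, about 152.5
   Round up to n = 153.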
Confidence Intervals

   A confidence interval takes the observed value
of phat and computes a range of possible values
for the population proportion p.
   The “confidence level” of this interval refers to
the probability the procedure will produce an
interval containing the population proportion.
Here α is 1 minus the confidence level (e.g. 99%
confidence results in α=0.01).
   Let z* be the Z-score corresponding to the
1-(α/2) percentile. Note the range from Z=-z* to
Z=+z* contains probability equal to the
confidence level.
Confidence Intervals

   A confidence interval for p is
phat ± z* sqrt(phat(1-phat)/n)
   Note we have estimated the standard
deviation using phat. This does not cause
problems for the sample sizes we are
considering (n>30).
   Note the length of this confidence interval is
2 z* sqrt(phat(1-phat)/n)
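
   Worked example (illustrative numbers, not from the
course): n=100, phat=0.60, 95% confidence, so α=0.05
and z*=1.96.
   sqrt(0.60(0.40)/100) = 0.049
   0.60 ± 1.96(0.049) = 0.60 ± 0.096
   The interval is (0.504, 0.696), with length 0.192.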
Computing sample sizes for
confidence intervals

   Recall the length of the confidence interval is
2 z* sqrt(phat(1-phat)/n)
   To make sure this is less than a prespecified
length L, you need
n ≥ 4 phat(1-phat) (z*)^2 / L^2
   If you have a guess of phat in advance, use
it. Otherwise, use phat=0.5, which makes
phat(1-phat) as large as possible and so gives an n
that is big enough whatever phat turns out to be.
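
   Worked example (illustrative numbers, not from the
course): 95% confidence (z*=1.96) and a desired
length of L=0.10.
   With no advance guess, use phat=0.5:
4(0.5)(0.5)(1.96)^2 / (0.10)^2 = 3.8416/0.01 = 384.16
so take n = 385.
   With an advance guess of phat=0.6:
4(0.6)(0.4)(1.96)^2 / (0.10)^2 = 3.688/0.01 = 368.8
so take n = 369.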