# Inference for Proportions by cJ74v9

```									 Inference for Proportions

Inference for a Population Proportion
Comparing Two Proportions
Inference
•       Remember, in most situations much of the ‘population’
is not known – i.e., we sometimes don’t know:
–     the population standard deviation (σ) or
–     the population parameter ( p )
•       But, we can continue to work with samples and use the
values observed there to make estimates (inferences)
about the overall population
–     Sample proportion (p̂), which estimates p
–     The standard deviation of p̂ is
σ_p̂ = sqrt( p(1 - p) / n )
as long as the population is at least 10x the sample size,
and p̂ is approximately normal when np ≥ 10 and n(1 - p) ≥ 10
Inference
•   If we standardize p̂ (convert it to a z-score), we get:

z = (p̂ - p) / sqrt( p(1 - p) / n )

•   In very large samples, p̂ is approximately p…
so, the Standard Error becomes:

SE = sqrt( p̂(1 - p̂) / n )

for a confidence interval for p of:
estimate ± z* SE_estimate

p̂ ± z* sqrt( p̂(1 - p̂) / n )
Sampling Distribution
Conditions for Inference about a Proportion
•       Again, for Inferences about a Proportion, the following
conditions must be met (assumed):
–     The data are a simple random sample (SRS)
–     The population is at least 10x as large as the sample
–     For a test of Ho: p = po, the sample size n is large enough
that npo ≥ 10 and n(1 - po) ≥ 10
Inference Examples
•       In our activity yesterday, we conducted 4 separate but
similar simulations…
–     2 sets of random head/tail coin simulations
http://www.random.org/coins
http://shazam.econ.ubc.ca/flip/

–     Red/black card simulation
http://www.random.org/playing-cards
–     Penny-fall simulation
•       What were your sample proportion values,
compared to the expected values?
•       What did you notice about the Standard Errors
and the confidence intervals?
Inference – Calculating Sample Size
•       Like before, you can also calculate the (needed)
sample size from proportional information…
Rearranging from before, the margin of error can be
rewritten as:

m = z* sqrt( p(1 - p) / n )

•       For example…
–       Using 1 of our head/tail simulations:

p̂ = 0.48, z* for 95% CI (1.960), and a desired margin of
error (m) of 0.05, the needed sample size (n) is:

m = z* sqrt( p̂(1 - p̂) / n )
0.05 = 1.960 sqrt( 0.48*0.52 / n )
0.0255 = sqrt( 0.2496 / n )
Rearranging, n = 0.2496 (1.960 / 0.05)^2 = 383.5

Rounding up, a sample size of 384 is needed for a margin of error of 0.05
Inference for Proportions

Inference for a Population Proportion
Comparing Two Proportions
Comparing Two Proportions
•       Remembering our example of 2 means…
•       When you have a two-sample problem, you need to
take into account the two sets of sample proportions
and sample sizes, to give you a resulting sampling
distribution…
•       But first, we must assume that the two samples are:
–     Both randomly selected
–     Both large enough that the sample proportions are
approximately normally distributed
–     Independent from each other        (of course, right?!) 
Comparing Two Proportions
•   For a 2-sample problem, we have:
Population   Population Proportion   Sample Size   Sample Proportion
1            p1                      n1            p̂1
2            p2                      n2            p̂2


•   Recalling how to work with means and variances
with 2 independent variables…
μ_(X-Y) = μ_X - μ_Y
σ²_(X-Y) = σ²_X + σ²_Y

•   When we look at the sampling distribution of p̂1 - p̂2,
the mean is
μ_(p̂1-p̂2) = μ_p̂1 - μ_p̂2 = p1 - p2
and the variance is
σ²_(p̂1-p̂2) = σ²_p̂1 + σ²_p̂2 = p1(1 - p1)/n1 + p2(1 - p2)/n2
Confidence Intervals for Two Proportions
•    The standard deviation of p̂1 - p̂2 is the square root of
the variance…
σ_(p̂1-p̂2) = sqrt( p1(1 - p1)/n1 + p2(1 - p2)/n2 )
(note: you may add the variances, NOT the standard
deviations!)
•    To obtain a confidence interval, the sample proportions
can replace the population proportions, resulting in a
Standard Error of:
SE = sqrt( p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2 )

the confidence interval has the form: estimate ± z* SE_estimate,
i.e., (p̂1 - p̂2) ± z* SE
Sampling Distribution
12.1 Quiz on Friday…
•       Possible Topics:
–        Conditions needed for an inference problem
–        Inference for a population proportion
•      Confidence intervals (z-scores)
•      Standard errors
–        Calculating sample sizes
–        Inference for a two-sample proportion
Significance Tests
•       An observed difference between 2 sample proportions
can reflect a difference in the populations OR
it may just be due to chance variation in random
sampling
•       Significance tests can help determine whether an observed
difference is too large to be explained by chance variation alone
–     Significance tests for p1 – p2
–     Ho: p1 = p2…
Null hypothesis says there is no difference between the 2
populations
Inference – Remembering our Formulas
•   If we standardize p̂ (convert it to a z-score), we get:

z = (p̂ - p) / sqrt( p(1 - p) / n )

•   In very large samples, p̂ is approximately p…
so, the Standard Error becomes:

SE = sqrt( p̂(1 - p̂) / n )

for a confidence interval for p of:
estimate ± z* SE_estimate

p̂ ± z* sqrt( p̂(1 - p̂) / n )
Significance Test example (from book, p.707)
•       High levels of cholesterol in the blood are associated
with higher risk of heart attacks.
–        Will using a drug to lower blood cholesterol reduce heart
attacks?
–        Middle-aged men were assigned at random to 1 of 2
treatments
•      2051 took the drug Gemfibrozil
•      A control group of 2030 took a placebo
•      During the next 5 years, 56 men in the Gemfibrozil group and 84 in
the placebo group had heart attacks.
Significance Test example (cont’d)
•       Significance test for cholesterol vs. heart attacks…
•      56 of 2051 who took the drug Gemfibrozil had heart attacks
•      84 of a control group (placebo) of 2030 had heart attacks.
–        Define variables:
•      p1 – proportion for Gemfibrozil who suffer heart attacks
p̂1 = 56/2051 = 0.0273
•      p2 – proportion for placebo group who suffer heart attacks
p̂2 = 84/2030 = 0.0414
•      Ho: p1 = p2 Ha: p1 < p2
–        Pool the sample proportions
p̂ = (56 + 84)/(2051 + 2030) = 140/4081 = 0.0343

–        Check the conditions…
n1 p̂ = 2051*(0.0343) = 70.3
n2 (1 - p̂) = 2030*(0.9657) = 1960.4
both are > 5, therefore we can use the 2-sample z-procedure
Significance Test example (cont’d)
•   To test the hypothesis…        Ho: p1 = p2,
the z-statistic formula is:

z = (p̂1 - p̂2) / sqrt( p̂(1 - p̂)(1/n1 + 1/n2) )

where p̂ is the pooled sample proportion

•   In terms of a variable Z having the
standard normal distribution,
the p-value for a test of Ho against…
Ha: p1 > p2 is P(Z > z)
Ha: p1 < p2 is P(Z < z)
Ha: p1 ≠ p2 is 2P(Z ≥ |z|)
Significance Test example (cont’d)
•   Continuing… for 2-sample distribution, z-formula is:

z = (p̂1 - p̂2) / sqrt( p̂(1 - p̂)(1/n1 + 1/n2) )
  = (0.0273 - 0.0414) / sqrt( 0.0343*0.9657*(1/2051 + 1/2030) )
  = -0.0141 / 0.005698
  = -2.47
•   The p-value for a test of
Ho against Ha: p1 < p2 is P(Z < -2.47)
Significance Test example (cont’d)
•   From the z-table, z = -2.47
corresponds to a p-value of 0.0068

•   Interpreting… With a very low p-value, below α
(i.e., below an α of 0.05 or 0.01), we have enough
evidence to reject the Ho (that p1 = p2)
•   In other words, we have evidence to believe the drug
Gemfibrozil reduced the rate of heart attacks.
Chapter Review problems…
•       As part of the chapter 12 review,
the homework is from p. 719:
–     Problems 12.35 through 12.42
–     Be prepared to discuss these Tuesday…

•       Reminder: Chapter 11 and 12 test on Wednesday!
Inference Procedures (Chapters 10-12)
•       At this point, we have worked with a number of
distributions and types of problems…
how do we sift through it all?!
•       There are a few ways to look at these…
–     Means (μ) vs. proportions (p)
–     Known vs. unknown standard deviations
–     1-sample, 2-samples, matched pairs
–     z-distributions, t-distributions
•       See page 719 in your text for a good summary chart…
Inference Procedures (the ‘big picture’)

```
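
The formulas in these slides can be checked numerically. The following is a minimal Python sketch, not part of the original slides, that assumes Python 3.8+ (for `statistics.NormalDist`). The function names `z_star`, `one_prop_ci`, `sample_size`, `two_prop_ci` and `two_prop_ztest` are hypothetical helpers chosen for illustration; the test returns the lower-tail p-value only, matching Ha: p1 < p2 in the Gemfibrozil example.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution (mean 0, sd 1)


def z_star(confidence):
    """Critical value z* for a two-sided confidence level (0.95 gives about 1.960)."""
    return Z.inv_cdf(0.5 + confidence / 2)


def one_prop_ci(successes, n, confidence=0.95):
    """Confidence interval p_hat +/- z* sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    margin = z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size(p_guess, margin, confidence=0.95):
    """Smallest n with z* sqrt(p(1 - p)/n) <= margin, rounded up."""
    n = p_guess * (1 - p_guess) * (z_star(confidence) / margin) ** 2
    return ceil(n)


def two_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star(confidence) * se
    return (p1 - p2) - margin, (p1 - p2) + margin


def two_prop_ztest(x1, n1, x2, n2):
    """Pooled two-sample z statistic and lower-tail p-value (Ha: p1 < p2)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, Z.cdf(z)


if __name__ == "__main__":
    # Sample-size example from the slides: p_hat = 0.48, m = 0.05, 95% CI
    print(sample_size(0.48, 0.05))
    # Gemfibrozil example: 56 of 2051 (drug) vs. 84 of 2030 (placebo)
    print(two_prop_ztest(56, 2051, 84, 2030))
```

With the counts from the slides, `sample_size(0.48, 0.05)` returns 384, and `two_prop_ztest(56, 2051, 84, 2030)` gives a z of roughly -2.47 with a one-sided p-value near 0.007, in line with the worked example.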