Section 8.1 - Estimating a Proportion with Confidence


Objectives:

 1.   To find a confidence interval graphically

 2.   To understand a confidence interval as consisting of those population
      proportions for which the result from the sample is reasonably likely

 3.   To always check the three conditions before constructing a
      confidence interval

 4.   To construct a confidence interval using the formula

 5.   To interpret a confidence interval and the meaning of “confidence”

 6.   To compute the required sample size for a given margin of error

General idea:

Consider the population of the U.S. Suppose you are
interested in the proportion of redheads in the population.

Since the proportion of redheads is probably unknown, you will
have to estimate it. What should you do?
 – Take a sample. (The size will depend on how much time
     and money you have.)
 – Compute the sample proportion. (The sample proportion is an
     unbiased estimator of the population proportion, and the Central
     Limit Theorem tells you that its sampling distribution is
     approximately normal for large samples.) This is your best guess.
 – Are you “sure”? What do you mean by sure? How “sure”
     do you need to be?

Introduction

A Pew Research Center survey found that 55% of singles
ages 18-29 say they aren’t in a committed relationship and
are not actively looking. This percentage is based on
interviews with 1068 singles. The survey reported a margin
of error of 3%.

The researchers also say that they are 95% confident that
the error in the percentage (55%) is less than 3% either way.
That is, they are 95% confident that if they were to ask all
young singles in the U.S., between 52% and 58% would
report that they aren’t in a committed relationship and are not
actively looking.

What do they mean by this?

Reasonably Likely Events

About 95% of all sample proportions $\hat{p}$ will fall
within about two standard errors of the population
proportion p, that is, within the interval
$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$
The sample proportions in this interval are called
reasonably likely.
This rule works well only under the condition that both
$np \ge 10$ and $n(1-p) \ge 10$.

Reasonably Likely Events and Rare Events

Reasonably likely events are those in the middle 95% of
the distribution of all possible outcomes. The outcomes in
the upper 2.5% and lower 2.5% of the distribution are rare
events - they happen, but rarely.

[Figure: distribution of all possible outcomes. The lower 2.5% and upper 2.5% tails are labeled Rare; the middle 95% is labeled Reasonably Likely.]

Example: Reasonably Likely Results from Coin Flips

Suppose you flip a fair coin 100 times.
What are the reasonably likely values of the
sample proportion $\hat{p}$?

What numbers of heads are reasonably likely?

Example: Reasonably Likely Results from Coin Flips
Suppose you flip a fair coin 100 times. What are the reasonably
likely values of the sample proportion $\hat{p}$?

Check conditions:
$np = (100)(0.50) = 50 \ge 10$; $n(1-p) = (100)(0.50) = 50 \ge 10$

95% of all sample proportions $\hat{p}$ should fall in the interval
$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}} = 0.50 \pm 1.96\sqrt{\frac{(0.5)(0.5)}{100}}$$
$$= 0.50 \pm 1.96(0.05)$$
$$\approx 0.50 \pm 0.10$$
$$= (0.4, 0.6)$$

Example: Reasonably Likely Results from Coin Flips

Suppose you flip a fair coin 100 times. What numbers of
heads are reasonably likely?

In about 95% of the samples, the number of successes x
will be in the interval
$$np \pm 1.96\sqrt{np(1-p)} = 50 \pm 1.96\sqrt{(100)(0.5)(0.5)}$$
$$= 50 \pm 1.96(5)$$
$$\approx 50 \pm 10$$
$$= (40, 60)$$

Introduction, continued.


 The Pew Research Center doesn’t know the value of p
 (the percentage of young singles not in a relationship).

 For each possible value of p, Pew can compute how close
 to p most sample proportions will be.

 By knowing the variability expected in random samples, Pew
 can estimate how close $\hat{p}$ should be to p.

The Meaning of a Confidence Interval

Suppose you take repeated random samples of size 40 from a population
with 60% successes. What proportion of successes would be reasonably
likely in your sample?
$np = (40)(0.60) = 24 \ge 10$; $n(1-p) = (40)(0.40) = 16 \ge 10$
$$\mu_{\hat{p}} = p = 0.60$$
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(0.60)(0.40)}{40}} \approx 0.077$$
Reasonably likely = Middle 95%
$$\mu_{\hat{p}} \pm 1.96\,\sigma_{\hat{p}} = 0.60 \pm 1.96(0.077)$$
$$\approx 0.60 \pm 0.151$$
$$= (0.449, 0.751)$$

The Meaning of a Confidence Interval

           Reasonably likely sample proportions for n = 40
           (SE is the standard error $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$)
      p     1 - p      SE      ME = 1.96·SE    Interval = p ± ME
     0.9      0.1      0.047       0.092         [0.808, 0.992]
     0.8      0.2      0.063       0.123         [0.677, 0.923]
     0.7      0.3      0.072       0.141         [0.559, 0.841]
     0.6      0.4      0.077       0.151         [0.449, 0.751]
     0.5      0.5      0.079       0.155         [0.345, 0.655]
     0.4      0.6      0.077       0.151         [0.249, 0.551]
     0.3      0.7      0.072       0.141         [0.159, 0.441]
     0.2      0.8      0.063       0.123         [0.077, 0.323]
     0.1      0.9      0.047       0.092         [0.008, 0.192]
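
The rows of this table can be regenerated with a short script. This is a sketch rather than the method used to build the slides: the helper name `likely_row` is hypothetical, and it assumes the standard error was rounded to three decimals before the margin of error was computed, which is what reproduces the printed values.

```python
from math import sqrt

def likely_row(p, n=40, z=1.96):
    """Return (SE, ME, lower, upper) for one value of p, rounded as in the table."""
    # Assumption: SE is rounded to three decimals first, then ME is computed from it.
    se = round(sqrt(p * (1 - p) / n), 3)
    me = round(z * se, 3)
    return se, me, round(p - me, 3), round(p + me, 3)

for i in range(9, 0, -1):
    p = i / 10
    se, me, low, high = likely_row(p)
    print(f"{p:.1f}  {1 - p:.1f}  {se:.3f}  {me:.3f}  [{low:.3f}, {high:.3f}]")
```

Computing the margin of error from the unrounded standard error instead would shift some entries by 0.001.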

The Meaning of a Confidence Interval

[Figure: reasonably likely sample proportions for samples of size n = 40. Horizontal axis: proportion of successes in the sample. Vertical axis: proportion of successes in the population.]

The Meaning of a Confidence Interval

Suppose that in an experiment, 75%, or 30 out of the 40 trials, resulted in
success.

Is it plausible that the true proportion is 50%?

Is it plausible that the true proportion is 80%?

What values are plausible for the population proportion?

The Meaning of a Confidence Interval

Plausible population percentages are p = 0.6, p = 0.7, p = 0.8.

[Figure: proportion of successes in the population (vertical axis) plotted against proportion of successes in the sample (horizontal axis).]

The Meaning of a Confidence Interval

Plausible population percentages are p = 0.6, p = 0.7, p = 0.8.

The sample proportion 0.75 (represented by the red vertical line) intersects
the reasonably likely range of values for p = 0.80 (from 0.677 to 0.923,
represented by the orange line segment).

If the population proportion is 0.80, you are reasonably likely to get 30
successes in 40 trials, or 75%.

The sample proportion 0.75 (represented by the red vertical line) does not
intersect the reasonably likely range of values for p = 0.50 (from 0.345 to
0.655, represented by the orange line segment).

If the population proportion is 0.50, you are not likely to get 30
successes in 40 trials, or 75%.

The Meaning of a Confidence Interval

Plausible population percentages are p = 0.6, p = 0.7, p = 0.8.

[Figure: proportion of successes in the population (vertical axis) plotted against proportion of successes in the sample (horizontal axis).]

The Meaning of a Confidence Interval

Plausible population percentages are from about p = 0.6 to about p = 0.85.
These plausible percentages for the population proportion are called the
95% confidence interval for p.

[Figure: proportion of successes in the population (vertical axis) plotted against proportion of successes in the sample (horizontal axis).]


A 95% confidence interval consists of those population
proportions p for which the sample proportion $\hat{p}$ is
reasonably likely.

Note that the population proportion p is the unknown
parameter.
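
This definition can be applied directly by scanning candidate values of p and keeping those for which the observed sample proportion is reasonably likely. The sketch below is illustrative only; the function name `plausible_proportions` and the grid step of 0.001 are assumptions, not part of the slides. It uses the earlier example of 30 successes in 40 trials.

```python
from math import sqrt

def plausible_proportions(p_hat, n, z=1.96, step=0.001):
    """Population proportions p for which p_hat lies within p +/- z*sqrt(p(1-p)/n)."""
    plausible = []
    steps = round(1 / step)
    for k in range(1, steps):          # candidate p values strictly between 0 and 1
        p = k * step
        se = sqrt(p * (1 - p) / n)
        if abs(p_hat - p) <= z * se:   # p_hat is reasonably likely under this p
            plausible.append(p)
    return plausible

values = plausible_proportions(0.75, 40)
print(round(min(values), 3), round(max(values), 3))  # roughly 0.60 and 0.86
```

The endpoints agree with the range of about 0.6 to about 0.85 read from the graph.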

A Confidence Interval for a Population Proportion

Recall our formula for the "reasonably likely" interval, which
represents the middle 95% of the sampling distribution:
$$\mu_{\hat{p}} \pm 1.96\,\sigma_{\hat{p}} = p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$

Where did the "1.96" come from?

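It is the z-score with cumulative probability 0.9750: the middle 95% of the
standard normal distribution leaves 2.5% in each tail, so 97.5% of the
distribution lies below z = 1.96.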

A Confidence Interval for a Population Proportion

The population proportion p is an unknown parameter.

In fact, estimating p is the whole point of what we are doing.

The idea is to estimate p by a range (interval) of values instead
of by a single value (point) $\hat{p}$.
Let's see if we can modify the formula
$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$

A Confidence Interval for a Population Proportion

 Since we don't know p, let's use the next best thing, $\hat{p}$:
 $$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

 Instead of using 1.96, which is the z-score that corresponds
 to the middle 95%, let's just put in a variable, $z^*$, which will
 depend on how confident we want to be:
 $$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

The formula for a confidence interval for the proportion of successes
p in the population is based on three components: the sample
proportion $\hat{p}$, the standard error $\sigma_{\hat{p}}$, and the critical value $z^*$
set by the confidence level:
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Here n is the sample size and $\hat{p}$ is the proportion of successes in the sample.
The value of $z^*$ depends on how confident you want to be that the confidence
interval will contain p.
        90% CI: $z^* = 1.645$
        95% CI: $z^* = 1.96$
        99% CI: $z^* = 2.576$

Where do these values come from?
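
One way to see where these values come from is to compute them from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `critical_value` is hypothetical.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the middle `confidence` of the standard normal lies within +/- z*."""
    tail = (1 - confidence) / 2            # area left in each tail
    return NormalDist().inv_cdf(1 - tail)  # z-score with that much area above it

for level in (0.90, 0.95, 0.99):
    print(level, round(critical_value(level), 3))
```

For a 95% interval, each tail holds 2.5%, so $z^*$ is the z-score with cumulative probability 0.975.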
A Confidence Interval for a Population Proportion

Once again, what is it that we are trying to do?

We wish to find out the value of an unknown population parameter - the
proportion of successes.

The best estimate of the value of the population proportion, based on the
Central Limit Theorem, is to take a random sample and compute the
sample proportion. (Bigger samples are better, etc.)

In some applications, it is useful to consider a range or interval of values,
instead of just one. Depending on how “confident” we want or need to be,
we can construct a confidence interval - a range of likely values for the
population proportion.

A Confidence Interval for a Population Proportion

A confidence interval for the proportion of successes p
in the population is given by the formula
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

This confidence interval is reasonably accurate for
(1) Simple random samples from binomial populations.
(2) $np \ge 10$ and $n(1-p) \ge 10$
(3) Populations that are at least 10 times the size of the sample:
         $N \ge 10 \cdot n$
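
The two numerical conditions can be checked mechanically before building an interval. The sketch below makes two assumptions that are not in the slides: it substitutes the sample proportion for the unknown p, and the population size of 5000 in the usage line is a made-up value for illustration. Whether the sample is a simple random sample cannot be checked from the numbers and has to come from how the data were collected.

```python
def check_conditions(n, p_hat, population_size):
    """Report which of the numerically checkable conditions hold.

    Assumption: p_hat stands in for the unknown population proportion p.
    """
    return {
        "np >= 10": n * p_hat >= 10,
        "n(1 - p) >= 10": n * (1 - p_hat) >= 10,
        "N >= 10n": population_size >= 10 * n,
    }

# Hypothetical usage: 40 sampled items, 60% successes, population of 5000.
result = check_conditions(n=40, p_hat=0.60, population_size=5000)
print(result)
```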

A Confidence Interval for a Population Proportion

For the confidence interval
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$
the expression
$$ME = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
is called the margin of error.

The margin of error is one-half the width of the confidence interval.

The point estimate $\hat{p}$ is located in the center of the confidence interval.

Example: Safety Violations
Suppose you have a random sample of 40 buses from a large
city and find that 24 buses have a safety violation. Find the
90% confidence interval for the proportion of all buses that
have a safety violation.
          90% confidence interval:
$$\hat{p} = \frac{24}{40} = 0.60$$
$$z^*_{90\%} = 1.645$$
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.60 \pm 1.645\sqrt{\frac{(0.60)(0.40)}{40}}$$
$$\approx 0.60 \pm 0.127$$
$$= (0.473, 0.727)$$

Example: Safety Violations
Suppose you have a random sample of 40 buses from a large city and find
that 24 buses have a safety violation. Find the 90% confidence interval for
the proportion of all buses that have a safety violation.


Using the TI-83/84:

STAT TESTS 1-PropZInt ENTER

Input screen:
1-PropZInt
x: 24
n: 40
C-Level: .90
Calculate [ENTER]

Output screen:
1-PropZInt
(.47259, .72741)
$\hat{p}$ = .6
n = 40
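
The same interval can be reproduced without a calculator. The following Python sketch mirrors the inputs of the calculator screen (x, n, and confidence level); the function name `one_prop_z_interval` is a hypothetical helper, not a library routine.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, confidence=0.95):
    """Confidence interval p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)              # margin of error
    return p_hat - me, p_hat + me

low, high = one_prop_z_interval(24, 40, confidence=0.90)
print(f"({low:.5f}, {high:.5f})")
```

Because this uses the unrounded critical value rather than 1.645, the result matches the calculator output to more decimals than the hand computation does.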

The Capture Rate

Sometimes a confidence interval “captures” the true population
proportion and sometimes it doesn’t.

The capture rate of a method of constructing confidence
intervals is the proportion of confidence intervals that contain
the population parameter (proportion) in repeated usage of the
method.

If a polling company uses 95% confidence intervals in a large
number of different surveys, the population proportion p should
be in about 95% of them.
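
The capture rate can be estimated by simulation: draw many samples from a population with a known p, build a 95% interval from each, and count how often the interval contains p. This is a sketch under stated assumptions; the values p = 0.6, n = 400, 2000 trials, and the seed are arbitrary illustration choices, and `capture_rate` is a hypothetical helper.

```python
import random
from math import sqrt

def capture_rate(p, n, trials, z=1.96, seed=1):
    """Fraction of simulated confidence intervals that contain the true p."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    captured = 0
    for _ in range(trials):
        # Draw one sample of size n and count successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        me = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - me <= p <= p_hat + me:
            captured += 1
    return captured / trials

rate = capture_rate(p=0.6, n=400, trials=2000)
print(rate)  # expected to land near 0.95, not exactly on it
```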




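Correct statement:
"I am 95% confident that the interval $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$
contains the true value of the population proportion p."

Incorrect statement:
"I am 95% confident that the true value of the population
proportion p will fall in the interval $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$."

The second wording suggests that p varies from sample to sample. In fact
p is fixed, and it is the interval that changes from sample to sample.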

Margin of Error and Sample Size

95% confidence intervals for large sample sizes are
narrower than those for small sample sizes:
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
n increases → SE decreases → ME decreases → the confidence interval gets narrower

Margin of Error and Sample Size

Example: The Effect of Sample Size on the Margin of Error
Suppose you take a random sample and get $\hat{p} = 0.7$.
(a) If n = 100, find the 95% confidence interval for p
and state the margin of error.
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.70 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{100}}$$
$$\approx 0.70 \pm 0.0898$$
$$= (0.6102, 0.7898)$$
The margin of error is 0.0898.

Margin of Error and Sample Size

Example: The Effect of Sample Size on the Margin of Error

Suppose you take a random sample and get $\hat{p} = 0.7$.
(b) What happens to the confidence interval and margin of
error if you quadruple the sample size, to n = 400?
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.70 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{400}}$$
$$\approx 0.70 \pm 0.0449$$
$$= (0.6551, 0.7449)$$
The margin of error is 0.0449.
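
Parts (a) and (b) can be compared side by side in code. In this short sketch, `margin_of_error` is a hypothetical helper that uses the rounded critical value 1.96 from the slides.

```python
from math import sqrt

def margin_of_error(p_hat, n, z=1.96):
    """Margin of error z * sqrt(p_hat(1 - p_hat)/n)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

me_100 = margin_of_error(0.7, 100)
me_400 = margin_of_error(0.7, 400)
print(round(me_100, 4), round(me_400, 4))   # 0.0898 0.0449
print(round(me_100 / me_400, 4))            # 2.0
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.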

What Sample Size Should You Use?

To find a formula for the sample size, take the formula
for the margin of error and solve for the sample size n:
$$ME = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
$$ME^2 = (z^*)^2\left(\frac{\hat{p}(1-\hat{p})}{n}\right)$$
$$n = (z^*)^2\left(\frac{\hat{p}(1-\hat{p})}{ME^2}\right)$$

What Sample Size Should You Use?



To use the formula for the sample size, you need to know
(1) what margin of error is acceptable
(2) the confidence level (use 95% unless otherwise specified)
(3) the value of $\hat{p}$ (use 0.5 if no other information is available)
$$n = (z^*)^2\left(\frac{\hat{p}(1-\hat{p})}{ME^2}\right)$$

What Sample Size Should You Use?

Example: What sample size should you use for a survey if you
want the margin of error to be at most 3% with 95% confidence
but you have no estimate of p?

$$n = (z^*)^2\left(\frac{\hat{p}(1-\hat{p})}{ME^2}\right)$$
$$= 1.96^2\left(\frac{(0.5)(0.5)}{0.03^2}\right)$$
$$\approx 1067.111$$
Rounding up, use a sample of size 1068.

				