VIEWS: 0 PAGES: 35 POSTED ON: 4/15/2013 Public Domain
Section 8.1 - Estimating a Proportion with Confidence

Objectives:
1. To find a confidence interval graphically
2. To understand a confidence interval as consisting of those population proportions for which the result from the sample is reasonably likely
3. To always check the three conditions before constructing a confidence interval
4. To construct a confidence interval using the formula
5. To interpret a confidence interval and the meaning of “confidence”
6. To compute the required sample size for a given margin of error

General idea:
Consider the population of the U.S. Suppose you are interested in the proportion of redheads in the population. Since this proportion is probably unknown, you will have to estimate it. What should you do?
- Take a sample. (The size will depend on how much time and money you have.)
- Compute the sample proportion. (The sample proportion is an unbiased estimator of the population proportion, and the Central Limit Theorem tells you that its sampling distribution is approximately normal for large samples.) This is your best guess.
- Are you “sure”? What do you mean by sure? How “sure” do you need to be?

Introduction
A Pew Research Center survey found that 55% of singles ages 18-29 say they aren’t in a committed relationship and are not actively looking. This percentage is based on interviews with 1068 singles. The survey reported a margin of error of 3%. The researchers also say that they are 95% confident that the error in the percentage (55%) is less than 3% either way. That is, they are 95% confident that if they were to ask all young singles in the U.S., between 52% and 58% would report that they aren’t in a committed relationship and are not actively looking. What do they mean by this?
Reasonably Likely Events
About 95% of all sample proportions $\hat{p}$ will fall within about two standard errors of the population proportion $p$, that is, within the interval
\[ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} \]
The sample proportions in this interval are called reasonably likely. This rule works well only under the condition that both $np \ge 10$ and $n(1-p) \ge 10$.

Reasonably Likely Events and Rare Events
Reasonably likely events are those in the middle 95% of the distribution of all possible outcomes. The outcomes in the upper 2.5% and lower 2.5% of the distribution are rare events: they happen, but rarely.
[Figure: distribution of all possible outcomes, with the middle 95% labeled "Reasonably Likely" and the lower 2.5% and upper 2.5% tails each labeled "Rare".]

Example: Reasonably Likely Results from Coin Flips
Suppose you flip a fair coin 100 times. What are the reasonably likely values of the sample proportion $\hat{p}$? What numbers of heads are reasonably likely?

Check conditions: $np = (100)(0.50) = 50 \ge 10$ and $n(1-p) = (100)(0.50) = 50 \ge 10$.

About 95% of all sample proportions $\hat{p}$ should fall in the interval
\[ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} = 0.50 \pm 1.96\sqrt{\frac{(0.5)(0.5)}{100}} = 0.50 \pm 1.96(0.05) \approx 0.50 \pm 0.10, \]
that is, from 0.4 to 0.6.

In about 95% of the samples, the number of successes $x$ will be in the interval
\[ np \pm 1.96\sqrt{np(1-p)} = 50 \pm 1.96\sqrt{(100)(0.5)(0.5)} = 50 \pm 1.96(5) \approx 50 \pm 10, \]
that is, from 40 to 60 heads.

Introduction, continued
The Pew Research Center doesn’t know the value of $p$ (the percentage of young singles not in a relationship).
For each possible value of $p$, Pew can compute how close to $p$ most sample proportions will be. By knowing the variability expected in random samples, Pew can estimate how close $\hat{p}$ should be to $p$.

The Meaning of a Confidence Interval
Suppose you take repeated random samples of size 40 from a population with 60% successes. What proportion of successes would be reasonably likely in your sample?

Check conditions: $np = (40)(0.60) = 24 \ge 10$ and $n(1-p) = (40)(0.40) = 16 \ge 10$.
\[ \mu_{\hat{p}} = p = 0.60, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(0.60)(0.40)}{40}} \approx 0.077 \]
Reasonably likely = middle 95%:
\[ \mu_{\hat{p}} \pm 1.96\,\sigma_{\hat{p}} = 0.60 \pm 1.96(0.077) = 0.60 \pm 0.151, \]
that is, from 0.449 to 0.751.

Reasonably likely sample proportions for n = 40
(SE is $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$; ME is computed from the rounded SE.)

  p     1 - p   SE      ME = 1.96 SE   CI = p ± ME
  0.9   0.1     0.047   0.092          [0.808, 0.992]
  0.8   0.2     0.063   0.123          [0.677, 0.923]
  0.7   0.3     0.072   0.141          [0.559, 0.841]
  0.6   0.4     0.077   0.151          [0.449, 0.751]
  0.5   0.5     0.079   0.155          [0.345, 0.655]
  0.4   0.6     0.077   0.151          [0.249, 0.551]
  0.3   0.7     0.072   0.141          [0.159, 0.441]
  0.2   0.8     0.063   0.123          [0.077, 0.323]
  0.1   0.9     0.047   0.092          [0.008, 0.192]

[Figure: reasonably likely sample proportions for samples of size n = 40. Horizontal axis: proportion of successes in the sample. Vertical axis: proportion of successes in the population.]

Suppose that in an experiment, 75%, or 30 out of the 40 trials, resulted in success. Is it plausible that the true proportion is 50%? Is it plausible that the true proportion is 80%? What values are plausible for the population proportion?
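One way to explore these questions is to recompute the reasonably likely intervals for n = 40 and mark the rows that contain the observed 0.75. The sketch below is illustrative only: the function name `reasonably_likely` is hypothetical, and because it does not round the standard error before multiplying by 1.96, its margins can differ from the slide's table by about 0.001.

```python
from math import sqrt

def reasonably_likely(p, n, z=1.96):
    """Return (SE, ME, low, high) for the middle 95% of sample proportions."""
    se = sqrt(p * (1 - p) / n)   # standard error of the sample proportion
    me = z * se                  # half-width of the reasonably likely range
    return se, me, p - me, p + me

n = 40
p_hat = 0.75  # 30 successes in 40 trials

# Walk through p = 0.9, 0.8, ..., 0.1 and flag the rows whose range contains p_hat.
for tenths in range(9, 0, -1):
    p = tenths / 10
    se, me, low, high = reasonably_likely(p, n)
    flag = "plausible" if low <= p_hat <= high else ""
    print(f"p={p:.1f}  SE={se:.3f}  ME={me:.3f}  [{low:.3f}, {high:.3f}]  {flag}")
```

A population proportion counts as plausible here when the observed sample proportion falls inside its reasonably likely range.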
Plausible population percentages are p = 0.6, p = 0.7, and p = 0.8.

[Figure: reasonably likely sample proportions for samples of size n = 40, with a red vertical line at the sample proportion 0.75 and orange line segments marking the reasonably likely ranges for individual population proportions. Horizontal axis: proportion of successes in the sample. Vertical axis: proportion of successes in the population.]

The sample proportion 0.75 (represented by the red vertical line) intersects the reasonably likely range of values for p = 0.80 (from 0.677 to 0.923, represented by the orange line segment). If the population proportion is 0.80, you are reasonably likely to get 30 successes in 40 trials, or 75%.

The sample proportion 0.75 (represented by the red vertical line) does not intersect the reasonably likely range of values for p = 0.50 (from 0.345 to 0.655, represented by the orange line segment). If the population proportion is 0.50, you are not likely to get 30 successes in 40 trials, or 75%.

Plausible population percentages are from about p = 0.6 to about p = 0.85. These plausible percentages for the population proportion are called the 95% confidence interval for p.

A 95% confidence interval consists of those population proportions $p$ for which the sample proportion $\hat{p}$ is reasonably likely. Note that the population proportion $p$ is the unknown parameter.
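The definition above can be turned into a direct search: try many candidate values of p and keep those for which the observed sample proportion is reasonably likely. This is a minimal sketch of that idea, not the slides' graphical method; the function name `plausible_proportions` and the grid step of 0.001 are assumptions made for illustration.

```python
from math import sqrt

def plausible_proportions(p_hat, n, z=1.96, step=0.001):
    """Population proportions p whose reasonably likely range contains p_hat."""
    keep = []
    for k in range(1, int(round(1 / step))):
        p = k * step
        me = z * sqrt(p * (1 - p) / n)   # half-width of the range for this p
        if p - me <= p_hat <= p + me:
            keep.append(p)
    return keep

# 30 successes in 40 trials
ps = plausible_proportions(0.75, 40)
print(f"plausible p: from about {min(ps):.2f} to about {max(ps):.2f}")
```

For 30 successes in 40 trials the search should return values from roughly 0.60 up to roughly 0.86, in line with the range read off the chart.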
A Confidence Interval for a Population Proportion
Recall our formula for the "reasonably likely" interval, which represents the middle 95% of the sampling distribution:
\[ \mu_{\hat{p}} \pm 1.96\,\sigma_{\hat{p}} = p \pm 1.96\sqrt{\frac{p(1-p)}{n}} \]
Where did the 1.96 come from? It is the z-score corresponding to a cumulative probability of 0.9750, which leaves 2.5% in the upper tail.

The population proportion $p$ is an unknown parameter. In fact, estimating $p$ is the whole point of what we are doing. The idea is to estimate $p$ by a range (interval) of values instead of by a single value (point) $\hat{p}$. Let's see if we can modify the formula
\[ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} \]

Since we don't know $p$, let's use the next best thing, $\hat{p}$:
\[ \hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
Instead of using 1.96, which is the z-score that corresponds to the middle 95%, let's just put in a variable, $z^*$, which will depend on how confident we want to be:
\[ \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

The formula for a confidence interval for the proportion of successes $p$ in the population is based on three components: the sample proportion $\hat{p}$, the standard error of $\hat{p}$, and the critical value $z^*$ set by the confidence level:
\[ \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
Here $n$ is the sample size and $\hat{p}$ is the proportion of successes in the sample. The value of $z^*$ depends on how confident you want to be that the confidence interval will contain $p$.

  90% CI: z* = 1.645
  95% CI: z* = 1.96
  99% CI: z* = 2.576

Where do these values come from?

Once again, what is it that we are trying to do? We wish to find out the value of an unknown population parameter: the proportion of successes.
The best estimate of the value of the population proportion is the sample proportion computed from a random sample; the Central Limit Theorem describes how sample proportions vary around $p$. (Bigger samples are better, etc.) In some applications, it is useful to consider a range or interval of values, instead of just one. Depending on how “confident” we want or need to be, we can construct a confidence interval: a range of likely values for the population proportion.

A confidence interval for the proportion of successes $p$ in the population is given by the formula
\[ \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
This confidence interval is reasonably accurate when three conditions hold:
(1) The sample is a simple random sample from a binomial population.
(2) Both $np \ge 10$ and $n(1-p) \ge 10$ (checked in practice with $\hat{p}$ in place of the unknown $p$).
(3) The population is at least 10 times the size of the sample: $N \ge 10n$.

For the confidence interval $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$, the expression
\[ ME = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
is called the margin of error. The margin of error is one-half the width of the confidence interval. The point estimate $\hat{p}$ is located in the center of the confidence interval.

Example: Safety Violations
Suppose you have a random sample of 40 buses from a large city and find that 24 buses have a safety violation. Find the 90% confidence interval for the proportion of all buses that have a safety violation.

90% confidence interval: $\hat{p} = 24/40 = 0.60$ and, for 90% confidence, $z^* = 1.645$.
\[ \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.60 \pm 1.645\sqrt{\frac{(0.60)(0.40)}{40}} = 0.60 \pm 0.127, \]
which gives the interval (0.473, 0.727).
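The bus calculation can also be sketched in Python. The function names `z_star` and `proportion_ci` are hypothetical; the critical value comes from the standard normal distribution in Python's `statistics` module rather than from a table, so it carries more decimal places than the 1.645 used by hand.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence):
    """Critical value leaving (1 - confidence)/2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, ME, low, high) for p_hat +/- z* sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    me = z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, me, p_hat - me, p_hat + me

# Bus example: 24 of 40 sampled buses have a safety violation, 90% confidence.
p_hat, me, low, high = proportion_ci(24, 40, confidence=0.90)
print(f"p_hat={p_hat:.2f}  ME={me:.3f}  CI=({low:.5f}, {high:.5f})")
```

Calling `z_star(0.90)`, `z_star(0.95)`, and `z_star(0.99)` gives values that round to 1.645, 1.96, and 2.576: each is the z-score that leaves half of the leftover probability in each tail. The sketch does not check the three conditions; that step still has to be done separately.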
Using the TI-83/84 for the bus example: press STAT, select TESTS, choose 1-PropZInt, and press ENTER.

Input screen (1-PropZInt):
  x: 24
  n: 40
  C-Level: .90
  Calculate [ENTER]

Output screen (1-PropZInt):
  (.47259, .72741)
  $\hat{p}$ = .6
  n = 40

The Capture Rate
Sometimes a confidence interval “captures” the true population proportion and sometimes it doesn’t. The capture rate of a method of constructing confidence intervals is the proportion of confidence intervals that contain the population parameter (proportion) in repeated usage of the method. If a polling company uses 95% confidence intervals in a large number of different surveys, the population proportion $p$ should be in about 95% of them.

Correct statement: "I am 95% confident that the interval $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$ contains the true value of the population proportion $p$."

Incorrect statement: "I am 95% confident that the true value of the population proportion $p$ will fall in the interval $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$."

The population proportion is a fixed value; it is the interval that varies from sample to sample.

Margin of Error and Sample Size
95% confidence intervals for large sample sizes are narrower than those for small sample sizes:
\[ \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
As $n$ increases, the SE decreases, so the ME decreases and the CI becomes narrower.

Example: The Effect of Sample Size on the Margin of Error
Suppose you take a random sample and get $\hat{p} = 0.7$.

(a) If n = 100, find the 95% confidence interval for $p$ and state the margin of error.
\[ \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.70 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{100}} = 0.70 \pm 0.0898, \]
which gives the interval (0.6102, 0.7898). The margin of error is 0.0898.

(b) What happens to the confidence interval and margin of error if you quadruple the sample size, to n = 400?
\[ \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.70 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{400}} = 0.70 \pm 0.0449, \]
which gives the interval (0.6551, 0.7449). The margin of error is 0.0449: quadrupling the sample size cuts the margin of error in half.

What Sample Size Should You Use?
To find a formula for the sample size, take the formula for the margin of error and solve for the sample size $n$:
\[ ME = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
\[ ME^2 = (z^*)^2\,\frac{\hat{p}(1-\hat{p})}{n} \]
\[ n = (z^*)^2\,\frac{\hat{p}(1-\hat{p})}{ME^2} \]

To use the formula for the sample size, you need to know
(1) what margin of error is acceptable
(2) the confidence level (use 95% unless otherwise specified)
(3) the value of $\hat{p}$ (use 0.5 if no other information is available)

Example: What sample size should you use for a survey if you want the margin of error to be at most 3% with 95% confidence but you have no estimate of $p$?
\[ n = (z^*)^2\,\frac{\hat{p}(1-\hat{p})}{ME^2} = (1.96)^2\,\frac{(0.5)(0.5)}{(0.03)^2} \approx 1067.111 \]
Round up to n = 1068.
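The sample-size formula is easy to wrap in a small helper. This is a sketch under the slide's own defaults (z* = 1.96 and 0.5 when no estimate of the proportion is available); the function name `required_sample_size` is hypothetical.

```python
from math import ceil

def required_sample_size(margin_of_error, p_guess=0.5, z=1.96):
    """Smallest whole n for which z * sqrt(p(1 - p)/n) is at most margin_of_error."""
    n_exact = z**2 * p_guess * (1 - p_guess) / margin_of_error**2
    return ceil(n_exact)  # always round up so the margin is not exceeded

# Margin of error at most 3% with 95% confidence and no prior estimate of p.
print(required_sample_size(0.03))  # 1068
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target. Using 0.5 as the guess gives the largest possible value of p(1 - p), so the resulting n is large enough whatever the true proportion turns out to be.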