Inference for Proportions
– Inference for a Population Proportion
– Comparing Two Proportions

Inference
• Remember, in most situations much of the ‘population’ is not known, i.e., we often don’t know:
– the population standard deviation ($\sigma$) or
– the population proportion ($p$)
• But we can continue to work with samples and use the values observed there to make estimates (inferences) about the overall population
– The sample proportion ($\hat{p}$) estimates the population proportion $p$
– The standard deviation of $\hat{p}$ is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, as long as the population is at least 10x the sample size; the sampling distribution of $\hat{p}$ is approximately normal when $np \ge 10$ and $n(1-p) \ge 10$

Inference
• If we standardize $\hat{p}$, we get:
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
• In very large samples, $\hat{p}$ is approximately $p$, so the Standard Error becomes:
$SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
for a confidence interval for $p$ of:
estimate $\pm\ z^* \cdot SE_{estimate}$, that is, $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Sampling Distribution
(figure not reproduced)

Conditions for Inference about a Proportion
• Again, for inferences about a proportion, the following conditions must be met (assumed):
– The data are a simple random sample (SRS)
– The population is at least 10x as large as the sample
– For a test of $H_0: p = p_0$, the sample size $n$ is large enough that $np_0 \ge 10$ and $n(1-p_0) \ge 10$

Inference Examples
• In our activity yesterday, we conducted 4 separate but similar simulations…
– 2 sets of random head/tail coin simulations
http://www.random.org/coins
http://shazam.econ.ubc.ca/flip/
– Red/black card simulation
http://www.random.org/playing-cards
– Penny-fall simulation
• What were your sample proportion values? How did they compare to the expected values?
• What did you notice about the Standard Errors and the confidence intervals?
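To make the one-proportion formulas concrete, here is a minimal Python sketch of the large-sample confidence interval $\hat{p} \pm z^* \cdot SE$. The function name `one_proportion_ci` and the data (48 heads in 100 flips) are hypothetical illustrations, not results from the class activity; the large-sample check uses the observed counts of successes and failures in place of $np$ and $n(1-p)$.

```python
import math
from statistics import NormalDist


def one_proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, standard_error, lower, upper) for a large-sample z interval."""
    # Large-sample condition, checked with observed counts (n*p_hat and n*(1 - p_hat))
    if successes < 10 or n - successes < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    # Standard error: p_hat stands in for the unknown p
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # Critical value z* for the requested confidence level (about 1.960 for 95%)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    return p_hat, se, p_hat - margin, p_hat + margin


# Hypothetical coin-flip sample: 48 heads in 100 flips
p_hat, se, lower, upper = one_proportion_ci(48, 100)
print(f"p_hat = {p_hat:.2f}, SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With these made-up numbers the interval contains 0.5, the expected proportion for a fair coin. Raising `n` while keeping the same proportion shrinks the standard error and narrows the interval.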
Inference – Calculating Sample Size
• Like before, you can also calculate the (needed) sample size from proportional information. Rearranging from before, the margin of error can be written as:
$m = z^* \sqrt{\frac{p(1-p)}{n}}$
• For example…
– Using 1 of our head/tail simulations: $\hat{p} = 0.48$, $z^*$ for a 95% CI (1.960), and a desired margin of error ($m$) of 0.05, the needed sample size ($n$) is found from:
$0.05 = 1.960 \sqrt{\frac{0.48 \times 0.52}{n}} = 1.960 \sqrt{\frac{0.2496}{n}}$
so $\sqrt{\frac{0.2496}{n}} = \frac{0.05}{1.960} \approx 0.0255$. Rearranging, $n = \frac{0.2496}{0.0255^2} \approx 383.85$ (about 383.5 without the intermediate rounding; either way, round up).
A sample size of 384 is needed for a margin of error of 0.05

Inference for Proportions
– Inference for a Population Proportion
– Comparing Two Proportions

Comparing Two Proportions
• Remembering our example of 2 means…
• When you have a two-sample problem, you need to take into account the two sets of sample proportions and sample sizes, to give you a resulting sampling distribution…
• But first, we must assume that the two samples are:
– Both randomly selected
– Both large enough for $\hat{p}_1$ and $\hat{p}_2$ to be approximately normally distributed
– Independent from each other (of course, right?!)

Comparing Two Proportions
• For a 2-sample problem, we have:

Population    Population Proportion    Sample Size    Sample Proportion
1             $p_1$                    $n_1$          $\hat{p}_1$
2             $p_2$                    $n_2$          $\hat{p}_2$

• Recalling how to work with means and variances with 2 independent variables…
$\mu_{X-Y} = \mu_X - \mu_Y$
$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$
• For the sampling distribution of $\hat{p}_1 - \hat{p}_2$, the mean is
$\mu_{\hat{p}_1 - \hat{p}_2} = \mu_{\hat{p}_1} - \mu_{\hat{p}_2} = p_1 - p_2$
and the variance is
$\sigma^2_{\hat{p}_1 - \hat{p}_2} = \sigma^2_{\hat{p}_1} + \sigma^2_{\hat{p}_2} = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$

Confidence Intervals for Two Proportions
• The standard deviation of $\hat{p}_1 - \hat{p}_2$ is the square root of the variance…
$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$
(note: you may add the variances, NOT the standard deviations!)
• To obtain a confidence interval, the sample proportions can replace the population proportions, resulting in a Standard Error of:
$SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
The confidence interval has the form:
estimate $\pm\ z^* \cdot SE_{estimate}$, that is, $(\hat{p}_1 - \hat{p}_2) \pm z^* \cdot SE$

Sampling Distribution
(figure not reproduced)

12.1 Quiz on Friday…
• Possible Topics:
– Conditions needed for an inference problem
– Inference for a population proportion
  • Confidence intervals (z-scores)
  • Standard errors
– Calculating sample sizes
– Inference for a two-sample proportion

Significance Tests
• An observed difference between 2 sample proportions can reflect a difference in the populations OR it may just be due to chance variation in random sampling
• Significance tests help determine whether the observed difference is larger than chance variation alone would plausibly produce
– Significance tests for $p_1 - p_2$
– $H_0: p_1 = p_2$… The null hypothesis says there is no difference between the 2 populations

Inference – Remembering our Formulas
• If we standardize $\hat{p}$, we get:
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
• In very large samples, $\hat{p}$ is approximately $p$, so the Standard Error becomes:
$SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
for a confidence interval for $p$ of:
estimate $\pm\ z^* \cdot SE_{estimate}$, that is, $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Significance Test example (from book, p.707)
• High levels of cholesterol in the blood are associated with higher risk of heart attacks.
– Will using a drug to lower blood cholesterol reduce heart attacks?
– Middle-aged men were assigned at random to 1 of 2 treatments
  • 2051 took the drug Gemfibrozil
  • A control group of 2030 took a placebo
• During the next 5 years, 56 men in the Gemfibrozil group and 84 in the placebo group had heart attacks.

Significance Test example (cont’d)
• Significance test for cholesterol vs. heart attacks…
• 56 of the 2051 who took the drug Gemfibrozil had heart attacks
• 84 of the control group (placebo) of 2030 had heart attacks.
– Define variables:
  • $p_1$: proportion of the Gemfibrozil group who suffer heart attacks; $\hat{p}_1 = \frac{56}{2051} = 0.0273$
  • $p_2$: proportion of the placebo group who suffer heart attacks; $\hat{p}_2 = \frac{84}{2030} = 0.0414$
  • $H_0: p_1 = p_2$, $H_a: p_1 < p_2$
– Pool the sample proportions:
$\hat{p} = \frac{56 + 84}{2051 + 2030} = \frac{140}{4081} = 0.0343$
– Check the conditions…
$n_1\hat{p} = 2051 \times 0.0343 = 70.3$
$n_2(1-\hat{p}) = 2030 \times 0.9657 = 1960.4$
Both are > 5 (as are the remaining two counts, $n_2\hat{p} \approx 69.6$ and $n_1(1-\hat{p}) \approx 1980.7$), therefore we can use the 2-sample z-procedure

Significance Test example (cont’d)
• To test the hypothesis $H_0: p_1 = p_2$, the z-statistic formula is:
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
• In terms of a variable $Z$ having the standard normal distribution, the p-value for a test of $H_0$ against…
$H_a: p_1 > p_2$ is $P(Z \ge z)$
$H_a: p_1 < p_2$ is $P(Z \le z)$
$H_a: p_1 \ne p_2$ is $2P(Z \ge |z|)$

Significance Test example (cont’d)
• Continuing… for the 2-sample distribution, the z-formula is:
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.02730 - 0.04138}{\sqrt{0.0343 \times 0.9657 \left(\frac{1}{2051} + \frac{1}{2030}\right)}} = \frac{-0.01408}{0.005698} \approx -2.47$
• The p-value for a test of $H_0$ against $H_a: p_1 < p_2$ is $P(Z \le -2.47)$

Significance Test example (cont’d)
• From the z-table, $z = -2.47$ corresponds to a p-value of 0.0068
• Interpreting… With a very low p-value, and p-value < $\alpha$ (i.e., below an $\alpha$ of 0.05 and of 0.01), we have enough evidence to reject $H_0$ (that $p_1 = p_2$)
• In other words, we have evidence to believe the drug Gemfibrozil reduced the rate of heart attacks.

Chapter Review problems…
• As part of the chapter 12 review, the homework is from p. 719:
– Problems 12.35 through 12.42
– Be prepared to discuss these Tuesday…
• Reminder: Chapter 11 and 12 test on Wednesday!

Inference Procedures (Chapters 10-12)
• At this point, we have worked with a number of distributions and types of problems… how do we sift through it all?!
• There are a few ways to look at these…
– Means ($\mu$) vs. proportions ($p$)
– Known vs. unknown standard deviations ($\sigma$)
– 1-sample, 2-samples, matched pairs
– z-distributions, t-distributions
• See page 719 in your text for a good summary chart…

Inference Procedures (the ‘big picture’)
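As a recap of the chapter's proportion procedures, the following Python sketch collects the sample-size calculation, the two-proportion confidence interval, and the pooled two-proportion z-test. The function names are hypothetical and chosen only for illustration; the inputs are the coin-simulation values ($\hat{p} = 0.48$, $m = 0.05$) and the Gemfibrozil counts from the worked examples. It assumes the conditions for inference (random samples, independence, large enough counts) have already been checked.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def critical_value(confidence):
    """z* leaving (1 - confidence)/2 in each tail, e.g. about 1.960 for 95%."""
    return STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)


def needed_sample_size(p_guess, margin, confidence=0.95):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin, rounded up."""
    z_star = critical_value(confidence)
    return math.ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # Add the variances, not the standard deviations
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = critical_value(confidence) * se
    return (p1 - p2) - margin, (p1 - p2) + margin


def two_proportion_z_test(x1, n1, x2, n2, alternative="less"):
    """Pooled z-test of H0: p1 = p2; returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    if alternative == "less":          # Ha: p1 < p2
        p_value = STD_NORMAL.cdf(z)
    elif alternative == "greater":     # Ha: p1 > p2
        p_value = 1 - STD_NORMAL.cdf(z)
    elif alternative == "two-sided":   # Ha: p1 != p2
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value


n = needed_sample_size(0.48, 0.05)
z, p_value = two_proportion_z_test(56, 2051, 84, 2030, alternative="less")
ci_low, ci_high = two_proportion_ci(56, 2051, 84, 2030)
print(f"needed n = {n}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({ci_low:.4f}, {ci_high:.4f})")
```

Run on the examples above, this gives a needed sample size of 384, and a z-statistic of about -2.47 with a one-sided p-value of about 0.0068, matching the hand calculations. The test uses the pooled proportion because $H_0$ assumes $p_1 = p_2$, while the confidence interval uses the unpooled standard error because it makes no such assumption.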