Name:

Statistics
Davidson College
Economics 105, Jan-May 2004
Mark C. Foley

Review # 2
Suggested Solutions

Directions: This review is closed-book, closed-notes (except for your formula sheet), to be taken in one sitting not to exceed 4 hours. You may use a calculator and/or Excel. Perform your calculations to 3 decimal places, where necessary. There are 100 points on the exam. Each problem is worth 25 points. You must show all your work to receive full credit. Any assumptions you make and intermediate steps should be clearly indicated. Do not simply write down a final answer to the problems without an explanation. Please turn in your formula sheet with your exam. Think clearly and work efficiently.

Honor Pledge
Start time
End time

Problem 1

The government of Japan claims that the average life expectancy of all its citizens is 83 years. You travel to Japan, visit cemeteries, randomly find the records for eight deceased individuals, and collect the following data on age at death:

82 77 85 76 81 91 70 82

The government is worried about advertising a life expectancy that is too high, and so wonders if the data provide evidence that the population life expectancy is less than 83.

(a) Write down the null hypothesis and appropriate alternative hypothesis.

X = life expectancy
$\mu_X$ = population mean life expectancy

$H_0: \mu_X \ge 83$
$H_1: \mu_X < 83$

(b) Stating any assumptions, at the 5% significance level, perform the hypothesis test on the sample statistic scale.

[Diagram: t-distribution with 7 degrees of freedom shown on two scales. Sample statistic scale: $\bar{x}_a$, $\bar{x} = 80.5$, 83. Test statistic scale: -1.895, $t_7 = -1.122$, 0. Labeled areas: .05 and the p-value.]

$\bar{x} = 644/8 = 80.5$

$s_X^2 = \dfrac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1} = \dfrac{52120 - 8\left(\frac{644}{8}\right)^2}{7} = 39.714$, so $s_X = 6.302$

We know that $-1.895 = \dfrac{\bar{x}_a - \mu_X}{s_X/\sqrt{n}}$. So $-1.895 = \dfrac{\bar{x}_a - 83}{6.302/\sqrt{8}} \Rightarrow \bar{x}_a = 78.777$

So using the sample statistic scale, we fail to reject the null hypothesis because 80.5 is greater than 78.777 (i.e., the sample mean is not in the rejection region, $\bar{x} < 78.777$). Finally, to use the t-distribution, we must assume that X is distributed normally in the population.
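The arithmetic in part (b) can be checked with a short script. This is a minimal sketch and not part of the original solutions: it uses only Python's standard library, the variable names are illustrative, and the critical value -1.895 is hard-coded from the t-table (as in the solution) rather than computed.

```python
import math
import statistics

# Ages at death for the eight sampled individuals (from the problem statement).
ages = [82, 77, 85, 76, 81, 91, 70, 82]

mu_0 = 83.0      # hypothesized population mean
t_crit = -1.895  # assumption: lower-tail 5% critical value for t with 7 d.f., read from the table

n = len(ages)
x_bar = statistics.mean(ages)   # sample mean
s = statistics.stdev(ages)      # sample standard deviation (n - 1 in the denominator)
se = s / math.sqrt(n)           # estimated standard error of the sample mean

# Sample statistic scale: critical sample mean below which we would reject H0.
x_crit = mu_0 + t_crit * se

# Test statistic scale: the same comparison expressed as a t-statistic.
t_stat = (x_bar - mu_0) / se

reject = x_bar < x_crit

print(f"x_bar = {x_bar:.3f}, s = {s:.3f}, se = {se:.3f}")
print(f"critical sample mean = {x_crit:.3f}, t statistic = {t_stat:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

Without rounding intermediate values, the critical sample mean comes out near 78.778; the hand calculation's 78.777 reflects rounding to three decimals along the way. The conclusion is the same either way.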
(c) At the 5% significance level, perform the hypothesis test on the test statistic scale. Maintain your assumptions from (b).

On the test statistic scale, we compare the t-statistic to the critical value of -1.895.

$t_7 = \dfrac{80.5 - 83}{6.302/\sqrt{8}} = -1.122$

Since -1.122 > -1.895, we again fail to reject the null at the 5% level.

(d) Place a range on the p-value using the statistical tables and, using the 5% significance level, perform the hypothesis test using the p-value. Maintain your assumptions from (b).

The p-value is greater than .05 since we failed to reject the null at the 5% level. We also know that it is greater than .10 since $t_{7,.10} = 1.415$ and thus 10% of the area under a $t_7$ distribution is to the left of -1.415. So because our test statistic of -1.122 > -1.415, the p-value is greater than .10. Excel tells me that the exact p-value is .1494. We again fail to reject the null hypothesis since the p-value exceeds the significance level (.05).

Problem 2

(a) Let W be a linear combination of random variables X and Y where $W = gX + hY$ and g and h are constants. Prove that $\sigma_W^2 = g^2\sigma_X^2 + h^2\sigma_Y^2 + 2gh\rho_{XY}\sigma_X\sigma_Y$, where $\rho_{XY}$ is the correlation between X and Y.

$\sigma_W^2 = E[(W - \mu_W)^2] = E[(gX + hY - (g\mu_X + h\mu_Y))^2] = E[(g(X - \mu_X) + h(Y - \mu_Y))^2]$

$= E[g^2(X - \mu_X)^2 + h^2(Y - \mu_Y)^2 + 2gh(X - \mu_X)(Y - \mu_Y)]$

Now bring the expectation operator through, since E[sum of stuff] = sum of the expected value of each term individually,

$= E[g^2(X - \mu_X)^2] + E[h^2(Y - \mu_Y)^2] + E[2gh(X - \mu_X)(Y - \mu_Y)]$

Bring out the constants,

$= g^2E[(X - \mu_X)^2] + h^2E[(Y - \mu_Y)^2] + 2ghE[(X - \mu_X)(Y - \mu_Y)]$

Rewriting the definitions,

$= g^2\sigma_X^2 + h^2\sigma_Y^2 + 2gh\,\mathrm{Cov}(X, Y)$

And finally, rewriting the definition of Cov(X,Y), namely $\mathrm{Cov}(X, Y) = \rho_{XY}\sigma_X\sigma_Y$,

$= g^2\sigma_X^2 + h^2\sigma_Y^2 + 2gh\rho_{XY}\sigma_X\sigma_Y$

(b) A researcher suspected that the number of between-meal snacks eaten by students in a day during final exams might depend on the number of tests a student had to take on that day. The table below shows the joint probabilities, estimated from a survey. Calculate Var[2X - 3Y].
                        Number of Tests (X)
Number of Snacks (Y)    0      1      2      3      Marginal prob (Y)
0                       .07    .06    0      0      .13
1                       .07    0      0      .03    .10
2                       .06    .20    .14    .07    .47
3                       .02    0      .16    .12    .30
Marginal prob (X)       .22    .26    .30    .22    1

E[X] = 0(.22) + 1(.26) + 2(.30) + 3(.22) = 1.52

E[Y] = 0(.13) + 1(.10) + 2(.47) + 3(.30) = 1.94

Cov(X,Y) = E[XY] - E[X]*E[Y], where each term of E[XY] below is y*x*P(x,y) for a cell with nonzero x, y, and probability:
= 1*3*(.03) + 2*1*(.20) + 2*2*(.14) + 2*3*(.07) + 3*2*(.16) + 3*3*(.12) - (1.52*1.94)
= 3.51 - 2.9488 = .5612

$\mathrm{Var}(Y) = E[Y^2] - (E[Y])^2 = 1^2(.10) + 2^2(.47) + 3^2(.30) - 1.94^2 = .9164$

$\mathrm{Var}(X) = E[X^2] - (E[X])^2 = 1^2(.26) + 2^2(.30) + 3^2(.22) - 1.52^2 = 1.1296$

$\mathrm{Var}[2X - 3Y] = 2^2\,\mathrm{Var}[X] + 3^2\,\mathrm{Var}[Y] - 2 \cdot 2 \cdot 3\,\mathrm{Cov}(X,Y) = 4(1.1296) + 9(.9164) - 12(.5612) = 6.0316$

Problem 3

(a) You are told that a 95% confidence interval for the population mean is 17.3 to 24.5. If the population standard deviation is 18.2, how large was the sample?

The interval is $\bar{X} \pm z_{1-\alpha/2}\dfrac{\sigma_X}{\sqrt{n}}$, so its half-width is

$z_{1-\alpha/2}\dfrac{\sigma_X}{\sqrt{n}} = (24.5 - 17.3)/2 = 3.6$

$\dfrac{1.96(18.2)}{\sqrt{n}} = 3.6 \Rightarrow \sqrt{n} = \dfrac{1.96(18.2)}{3.6} = 9.909 \Rightarrow n \approx 98$

(b) Interpret the confidence interval in part (a). Be specific.

In repeated samples, we expect 95% of the intervals constructed this way to contain the true population mean.

(c) Using a properly labeled diagram (and without doing more calculations), explain how and why the confidence interval in part (a) would change if the significance level changed to 1%.

[Diagram: sampling distribution $f(\bar{x})$ centered at $\mu_X$, with a z scale marked at -2.57, -1.96, 0, 1.96, 2.57 and tail areas labeled .025 and .005.]

If the significance level changed to 1%, then it would be a 99% confidence interval, and it would be wider than the one in part (a) because the z-value would be bigger (2.57 vs. 1.96). That is, more of the sampling distribution would be within the 99% CI.

(d) Using a properly labeled diagram (and without doing more calculations), explain how and why the confidence interval in part (a) would change if the sample size were now 36.

[Diagram: two sampling distributions $f(\bar{x})$ on the same $\bar{x}$ axis, the darker one more spread out; each has its own z scale marked at -1.96, 0, 1.96 and tail areas labeled .025.]

The standard error, $\sigma_X/\sqrt{n}$, increases as n decreases, so the sampling distribution has greater variance, as in the darker p.d.f. above.
With a 95% confidence interval still, the value above which 2.5% of the probability lies is still 1.96 (and on the left side, it's -1.96), but with a larger standard error, a 95% confidence interval is now wider.

Problem 4

Let $W \sim N(2,1)$ and $Z \sim N(0,1)$.

(a) Calculate the probability that W is greater than -1 and also the probability that Z is greater than -1. Do two separate calculations.

$P(W > -1) = 1 - F_W(-1) = 1 - F_Z\left(\dfrac{-1 - 2}{1}\right) = 1 - F_Z(-3) = 1 - (1 - F_Z(3)) = .9987$

$P(Z > -1) = 1 - F_Z(-1) = 1 - (1 - F_Z(1)) = .8413$

(b) Using a properly-labeled diagram and the appropriate z-values, clearly explain why the answer to part (a) is different for Z than for W.

[Diagram: the densities f(z) and f(w) on a common w, z axis marked at -1, 0, 2, with z scales beneath: -1 and 0 for Z; $z_W$ = -3 and 0 for W.]

-1 is farther from the mean of W than from the mean of Z, both absolutely (3 units compared to 1 unit) and relatively (3 standard deviations compared to 1 standard deviation). Relative distances determine the probabilities.

(c) The number of households ordering the pay-per-view movie Finding Nemo is normally distributed. Twenty percent of the time fewer than 20,000 households order the movie. Only ten percent of the time more than 28,000 households order. What are the mean and standard deviation of the number of households ordering Finding Nemo?

Let m be the mean and k the standard deviation.

P(X < 20000) = .2 and P(X > 28000) = .10
P(z < (20000 - m)/k) = .2 and P(z > (28000 - m)/k) = .10
(20000 - m)/k = -.84 and (28000 - m)/k = 1.28

Two equations, two unknowns. Subtracting the first from the second gives 2.12k = 8000, or k = 3774; then m = 20000 + .84(3774) = 23,170.

If you linearly interpolated (I didn't make it clear to do so this time):

$F_Z(.84) = .7995$ and $F_Z(.85) = .8023$: 28 "steps" and we want to take 5 of them, so .84 + 5/28(.85 - .84) = .84179 = .842

$F_Z(1.28) = .8997$ and $F_Z(1.29) = .9015$: 18 "steps" and we want to take 3 of them, so 1.28 + 3/18(1.29 - 1.28) = 1.28167 = 1.282

Problem 5

(a) State, precisely, the Central Limit Theorem.

Roughly speaking, it says that sample means are eventually distributed normally. That is, the sampling distribution of the sample mean follows a normal distribution if the sample size is big enough (> 30).
More precisely, if $X_1, X_2, \ldots, X_n$ are i.i.d. observations from a population with ANY distribution having mean $\mu_X$ and variance $\sigma_X^2$, then, approximately, $\dfrac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} \sim N(0,1)$ if n is large (> 30), or equivalently, $\bar{X} \sim N\left(\mu_X, \dfrac{\sigma_X^2}{n}\right)$ if n is large.

(b) Using a graph for the population distribution and at least one graph for the sampling distribution, illustrate and explain the central limit theorem. Label all axes and curves.
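One way to see the theorem at work, short of drawing the graphs, is a small simulation. The sketch below is not part of the original solutions. It assumes a skewed population (exponential with mean 1 and standard deviation 1, an arbitrary choice), a sample size of 40, and 5,000 repeated samples; the function name and seed are likewise illustrative.

```python
import math
import random
import statistics


def simulate_sample_means(n, reps, rate=1.0, seed=105):
    """Return `reps` sample means, each from n i.i.d. Exponential(rate) draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(rate) for _ in range(n)])
        for _ in range(reps)
    ]


n, reps = 40, 5000
means = simulate_sample_means(n, reps)

# Population parameters of Exponential(1): mean 1, standard deviation 1.
mu, sigma = 1.0, 1.0
theory_se = sigma / math.sqrt(n)

center = statistics.fmean(means)  # should be close to mu
spread = statistics.stdev(means)  # should be close to sigma / sqrt(n)

# If the sample means are roughly normal, about 95% of the standardized
# values should fall between -1.96 and 1.96.
z_values = [(m - mu) / theory_se for m in means]
share_within = sum(abs(z) <= 1.96 for z in z_values) / reps

print(f"mean of sample means: {center:.3f} (population mean {mu})")
print(f"s.d. of sample means: {spread:.3f} (sigma/sqrt(n) = {theory_se:.3f})")
print(f"share of standardized means within +/-1.96: {share_within:.3f}")
```

Although the population itself is strongly right-skewed, the simulated sample means cluster around $\mu_X$ with spread close to $\sigma_X/\sqrt{n}$. A histogram of `means` next to a histogram of raw exponential draws would give the two graphs the question asks for.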