Economics 422
Introduction to Econometrics
Maureen Cropper
Fall 2007

FINAL EXAM
(Answers in Italics)

1. Suppose that you have a sample of n (xi, yi) pairs that have been generated by: yi = β0 + β1 xi + ui.

a. (10) State 5 assumptions that must be true about {ui} and about {xi} if these observations were generated by the classical normal regression model, in addition to the fact that yi = β0 + β1 xi + ui. [You must write the assumptions out.]

Since you already have SLR.1, you would need to write SLR.2-SLR.5, plus MLR.6 (the normality assumption).

b. (15) Under these assumptions, least squares estimators are (i) unbiased, (ii) linear and (iii) best linear unbiased. Define each term.

Unbiased means E(β̂1) = β1. Linear means that β̂1 is a linear combination of the {yi}. Best linear unbiased means having the smallest variance of all linear, unbiased estimators.

c. (05) Under these assumptions, what is the variance of the least squares estimator of β1?

V(β̂1) = σ²/SSTx

2. In the model of question 1, explain the consequences of assuming that V(ui) = σ²xi.

a. (05) Which of the three properties in 1.b. continue to hold?

β̂1 is still linear in the {yi} and unbiased, but no longer BLUE.

b. (05) Is the formula you presented in 1.c. still correct? What does this imply?

No. The correct formula is given by equation (8.2) in the book. This means that the t-test based on 1.c. is no longer correct. Nor are the F-tests in chapter 4.

c. (05) Transform the original equation so that the least squares estimators as applied to the transformed equation have desirable properties.

Divide both sides of the equation through by √xi (the square root of xi).

3.
Consider the consumption function

Ci = β0 + β1Yi + β2Xi + β3Zi + ui

where
Ci is the consumption of the i-th household in $
Yi is its income in $
Xi = 1 if the family is Black, = 0 otherwise
Zi = 1 if the family is Asian, = 0 otherwise

A cross-section survey is available in which Blacks, Whites and Asians (as defined in some consistent manner) are represented. The following results are obtained:

(1) Ci = 549.66 + 0.83 Yi + 56.37 Xi - 22.66 Zi
         (6.50)   (5.40)    (2.90)     (-2.50)
    R² = 0.88; SSR = 560,000; n = 124
    (Numbers in parentheses are t-ratios.)

(2) Ci = 560.60 + 0.86 Yi
         (8.65)   (6.85)
    R² = 0.82; SSR = 840,000; n = 124

(06)(a) What is the intercept of the Black consumption function?

549.66 + 56.37

What is the intercept of the White consumption function?

549.66

What is the slope (marginal propensity to consume) of the Asian consumption function?

Slope = 0.83

(10)(b) At the .05 level, can you reject the hypothesis that race has no effect on consumption, given income? [7 pts. for correct test statistic; 3 for critical value and conclusion.]

Do an F-test using F = [(0.88 – 0.82)/2] / [(1 – 0.88)/120]. Compare to critical F = 3.07. Can reject H0 at the .05 level (30 > 3.07).

(06)(c) Set up an appropriate model for the case in which each racial group may have a different marginal propensity to consume.

Ci = β0 + β1Yi + β2Xi + β3Zi + β4XiYi + β5ZiYi + ui

(06)(d) In equation (1) above how would you test the hypothesis that Asians and Whites consume the same, given income? If you have been given sufficient information to perform the test, do so at the .01 level.

Do a t-test. The t-ratio on the Asian dummy = -2.50. With 120 d.f. the critical t is -2.617, so DO NOT REJECT at the 0.01 level.

(10)(e) In equation (1) above can you reject the null hypothesis that β1 = β2 = β3 = 0 at the .05 level? [7 pts. for correct test statistic; 3 for critical value and conclusion.]
Do an F-test using F = [0.88/3] / [(1 – 0.88)/120]. Compare to critical F = 2.68. Can reject H0 at the .05 level (293.3 > 2.68).

(02)(f) What assumption of the multiple regression model would be violated if a dummy variable for Whites were added to equation (1)?

MLR.3 No perfect collinearity.

4. The following equation describes the median housing price in a community in terms of the amount of pollution (NOx) and the average number of rooms in houses in the community (rooms).

log(price) = β0 + β1log(NOx) + β2rooms + u     (1)

Suppose that estimation of this equation yields the following results:

log(price) = 9.23 – 0.718 log(NOx) + 0.306 rooms     (2)

(03) a. Suppose that NOx concentrations increase by 10%. By how much will median housing price decrease?

By 7.18%

(03) b. Suppose that the number of rooms in a house increases by 1. By how much will median housing price increase?

By 30.6%

(06) c. Suppose that rooms is dropped from equation (1). Write the formula for the least squares estimator of the coefficient of log(NOx) in this equation (call it β1′) as a function of the least squares estimators of β1 and β2 in equation (1) and any additional terms you need. Define all terms that you use.

β1′ = β1 + β2δ1

This equation is valid in terms of population parameters and also in terms of OLS estimators (with hats on all terms). δ1 is the coefficient of log(NOx) in a regression of rooms (the dropped variable) on log(NOx), the included one.

(03) d. When rooms is dropped from the equation the result is:

log(price) = 11.71 – 1.043 log(NOx)     (3)

Explain in words why this occurred.

When rooms is dropped, log(NOx) picks up its effect. Since δ1 < 0 and β2 > 0, dropping rooms adds a negative number to the original coefficient.

5. Using the data in GPA2.RAW the following equation was estimated (std.
errors in parentheses):

sat = 1,028.10 + 19.30 hsize – 2.19 hsize² – 45.09 female – 169.81 black + 62.312 female*black
        (6.29)   (3.83)        (0.53)        (4.29)         (12.71)        (18.15)
n = 4,137; R² = .0858

The variable sat is the combined SAT score, hsize is the size of the student's high school graduating class, in hundreds, female is a gender dummy variable and black is a dummy variable equal to 1 for blacks and zero otherwise.

(a) (03) What is the hsize at which sat reaches its peak?

At 19.30/(2*2.19).

(b) (03) Holding hsize fixed, what is the estimated difference in SAT score between nonblack females and nonblack males? (Answer should be a number.)

-45.09

(c) (03) What is the estimated difference in SAT scores between nonblack males and black males? (Answer should be a number.)

-169.81

(d) (06) What is the estimated difference in SAT scores between black females and nonblack females? (Answer should be a number.) What would you need to do to test if this difference is statistically significant? (Explain in words.)

Difference = -169.81 + 62.312. To test if this is significant, you could test whether the sum of the coefficients on black and female*black equals 0.

6. Indicate whether each of the following statements is true or false. No justification is required.

(a) (04) In the simple regression model if xt = c for all t, the least squares estimators are not defined.

True

(b) (04) When regressors exhibit multicollinearity, least squares estimators are unbiased, but are no longer Best Linear Unbiased estimators.

False

(c) (04) When errors are heteroskedastic, the usual F statistic no longer has the F distribution.

True

(d) (04) The Gauss Markov Theorem states that the least squares estimators have the smallest variance of all linear, unbiased estimators.

True

(e) (04) When data are heteroskedastic, weighted least squares estimators have smaller variances than ordinary least squares estimators.

True

7.
This question deals with the interpretation of coefficients and standard errors in the multiple regression model.

This was question 4 on the midterm. Please see "Solutions to Midterm" on the course website.

(10) a. Write the expression for the variance of the least squares estimator of β1 (the coefficient of x1) in the multiple linear regression model. [Assume there are k slope coefficients in the model.] Be sure to define all terms in the equation.

(05) b. How in practice would you estimate the variance of the least squares estimator of β1?

(05) c. Suppose that a regression of x1 on x2, x3, …, xk yields an R² of 0.95. What does this imply about the precision with which β1 can be estimated, compared with the case in which the R² = 0.30?

8. The following model allows the return to education to depend upon the amount of both parents' education, called pareduc:

log(wage) = β0 + β1educ + β2educ*pareduc + β3exper + β4tenure + u

(05) a. In terms of the above variables, what is the proportionate change in the wage corresponding to a one year change in education?

β1 + β2pareduc

(05) b. A researcher estimates the above equation and obtains the following results (standard errors in parentheses):

log(wage) = 5.65 + .047 educ + .00078 educ*pareduc + .019 exper + .010 tenure
           (.13)   (.01)       (.00021)              (.004)       (.003)
R² = 0.169; n = 722

What do the results imply about the impact of parents' education = 36 years on the returns to education?

It increases the return to each year of education by .00078*36

(05) c. Do you believe that the researcher has obtained unbiased estimates of this effect? Explain.

No, because the level of pareduc has been omitted from the equation.

9. The attached sheet lists the results of regressing average annual per capita GDP growth over the period 1965-90 (yt) on the set of variables listed in part A. of Computer Project #2.

(03)(a) By how many percentage points will a change in malfal66 (the malaria index) from 0 to 1 reduce yt?

By 1.16 percentage points.
(03)(b) Is the coefficient of POP100KM significantly different from zero at the .01 level using a two-tailed test? Explain.

No, it is NOT significantly different from 0 at the .01 level. The p-value is > .01 (= .013).

(03)(c) What are the endpoints of a 95% confidence interval for the coefficient of TROPICAR?

-1.301707 and .1130238

(03)(d) Can you reject the null hypothesis that the coefficient on TROPICAR equals -1.0 at the .05 level? Explain.

No, you can't. -1 lies within the endpoints of the 95 percent C.I. for TROPICAR.

(05)(e) What is the F-statistic for the null hypothesis that all coefficients except the constant term in the equation are equal to zero? Can you reject the null hypothesis at the .01 level?

The F-stat = 27.17. Yes, can reject at the .01 level since the p-value < .01.

(03)(f) What is the sum of squared residuals (SSR) for the equation? Use this to estimate V(ui).

SSR = 62.52. Estimator of V(ui) = SSR/(n-k-1) = 62.52/(75-8-1) = 62.52/66, about 0.947.

. reg gdpg6590 lgdp65 syr15651 llifex65 open6590 icrg82 pop100km tropicar malfal66

      Source |       SS       df       MS              Number of obs =      75
-------------+------------------------------           F(  8,    66) =   27.17
       Model |  205.866212     8  25.7332765           Prob > F      =  0.0000
    Residual |  62.5200453    66  .947273413           R-squared     =  0.7671
-------------+------------------------------           Adj R-squared =  0.7388
       Total |  268.386257    74  3.62684131           Root MSE      =  .97328

------------------------------------------------------------------------------
    gdpg6590 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lgdp65 |  -2.528158   .2555312    -9.89   0.000    -3.038343   -2.017974
    syr15651 |   .1727434   .1614383     1.07   0.289    -.1495785    .4950654
    llifex65 |   4.322535   1.227665     3.52   0.001     1.871422    6.773648
    open6590 |   1.735505   .3902791     4.45   0.000     .9562873    2.514722
      icrg82 |   .3472981   .1001661     3.47   0.001     .1473099    .5472862
    pop100km |   .9149605   .3592766     2.55   0.013     .1976416    1.632279
    tropicar |  -.5943416   .3542913    -1.68   0.098    -1.301707    .1130238
    malfal66 |   -1.16357   .5518607    -2.11   0.039    -2.265396   -.0617448
       _cons |    1.27923   4.649643     0.28   0.784    -8.004081    10.56254
------------------------------------------------------------------------------

10. (10) Define R². Suppose that R² = 0.842 and SST = 6658.9. What is SSR?

R² = 1 – [SSR/SST]; 0.842 = 1 – [SSR/6658.9] (solve for SSR)
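
The arithmetic behind several of the answers can be checked with a short script. What follows is a minimal Python sketch and not part of the original exam: the helper names r2_form_f and ssr_from_r2 are made up for illustration, and the inputs are the figures quoted in questions 3, 5(a), 9(f) and 10.

```python
def r2_form_f(r2_unrestricted, r2_restricted, q, df_resid):
    """R-squared form of the F statistic for q linear restrictions.

    df_resid is n - k - 1 from the unrestricted regression.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / df_resid
    return numerator / denominator


def ssr_from_r2(r2, sst):
    """Solve R2 = 1 - SSR/SST for SSR."""
    return (1.0 - r2) * sst


# Question 3(b): do the two race dummies matter? n - k - 1 = 124 - 3 - 1 = 120.
f_race = r2_form_f(0.88, 0.82, q=2, df_resid=120)

# Question 3(e): overall significance, so the restricted R-squared is 0.
f_overall = r2_form_f(0.88, 0.0, q=3, df_resid=120)

# Question 5(a): turning point of the quadratic in hsize (in hundreds).
hsize_peak = 19.30 / (2 * 2.19)

# Question 9(f): residual sum of squares and degrees of freedom as
# printed in the Stata output (Residual row: SS = 62.5200453, df = 66).
sigma2_hat = 62.5200453 / (75 - 8 - 1)

# Question 10: SSR implied by R2 = 0.842 and SST = 6658.9.
ssr_q10 = ssr_from_r2(0.842, 6658.9)

print(f"3(b) F = {f_race:.2f}")
print(f"3(e) F = {f_overall:.2f}")
print(f"5(a) hsize at peak = {hsize_peak:.2f}")
print(f"9(f) estimate of V(u) = {sigma2_hat:.4f}")
print(f"10   SSR = {ssr_q10:.2f}")
```

Run as written, the two F statistics come out near 30 and 293.3, as in the answers to 3(b) and 3(e), the variance estimate agrees with the Residual MS entry in the Stata output, and the SSR for question 10 is about 1052.1.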