Solutions to End-of-Section and Chapter Review Problems

CHAPTER 13: Simple Linear Regression

13.1 (a) When X = 0, the estimated expected value of Y is 2.
(b) For each increase in the value of X by 1 unit, we can expect an increase of an estimated 5 units in the value of Y.
(c) \(\hat{Y} = 2 + 5X = 2 + 5(3) = 17\)
(d) yes, (e) no, (f) no, (g) yes, (h) no

13.2 (a) When X = 0, the estimated expected value of Y is 16.
(b) For each increase in the value of X by 1 unit, we can expect a decrease of an estimated 0.5 units in the value of Y.
(c) \(\hat{Y} = 16 - 0.5X = 16 - 0.5(6) = 13\)

13.3 (a) [Scatter diagram: weekly sales (Y) versus shelf space (X)]
(b),(c) For each increase in shelf space of an additional foot, there is an expected increase in weekly sales of an estimated 0.074 hundreds of dollars, or $7.40.
(d) \(\hat{Y} = 1.45 + 0.074X = 1.45 + 0.074(8) = 2.042\), or $204.20
(e) \(b_0 = 1.5333\), \(b_1 = 0.064\). For each increase in shelf space of an additional foot, there is an expected increase in weekly sales of an estimated 0.064 hundreds of dollars, or $6.40. \(\hat{Y} = 1.5333 + 0.064X = 1.5333 + 0.064(8) = 2.0453\), or $204.53
(f) The best allocation to pet food depends on the profit made per foot of shelf space. The expected weekly sales (and profits) per foot of shelf space actually decline as the amount of allocated shelf space increases from 5 to 20 feet. However, if the profitability is still high enough, it will be worthwhile to assign a larger amount of shelf space to pet food.

13.4 (a) [Scatter diagram: weekly sales (Y) versus customers (X)]
(b),(c) For each increase of one additional customer, there is an expected increase in weekly sales of an estimated 0.00873 thousands of dollars, or $8.73.
(d) \(\hat{Y} = 2.423 + 0.00873X = 2.423 + 0.00873(600) = 7.661\), or $7661
(e) \(b_0 = 1.578\), \(b_1 = 0.01009\). For each increase of one additional customer, there is an expected increase in weekly sales of an estimated 0.01009 thousands of dollars, or $10.09.
\(\hat{Y} = 1.578 + 0.01009X = 1.578 + 0.01009(600) = 7.632\), or $7632

13.5 (a) [Scatter diagram: number of orders versus weight (lbs.)]
(b) \(\hat{Y} = 0.1912 + 0.0297X\)
(c) For each increase of one additional pound, the estimated average number of orders will increase by 0.0297.
(d) \(\hat{Y} = 0.1912 + 0.0297(500) = 15.043\)

13.6 (a) [Scatter plot; axis labels: gross ($millions), video units sold]
(b),(c) \(\hat{Y} = 76.54 + 4.3331X\)
(d) For each increase of 1 million dollars in box office gross, expected home video units sold are estimated to increase by 4.3331 thousand, or 4333.1 units. 76.54 represents the portion of thousands of home video units that are not affected by box office gross.
(e) \(\hat{Y} = 76.54 + 4.3331X = 76.54 + 4.3331(20) = 163.202\), or 163,202 units.
(f) Some other factors that might be useful in predicting video unit sales are (i) the number of days the movie was screened, (ii) the rating of the movie by critics, and (iii) the amount spent on advertising the video release.

13.7 (a) [Scatter plot: monthly rent ($) versus size (square feet)]
(b),(c) \(\hat{Y} = 177.1 + 1.065X\)
(d) For each increase of 1 square foot in space, the expected monthly rental is estimated to increase by $1.065. 177.1 represents the portion of apartment monthly rental that is not affected by square footage.
(e) \(\hat{Y} = 177.1 + 1.065X = 177.1 + 1.065(1000) = 1242.10\), or $1242.10
(f) An apartment with 500 square feet is outside the relevant range for the independent variable.
(g) The apartment with 1200 square feet has the more favorable rent relative to size. Based on the regression equation, a 1200 square foot apartment would have an expected monthly rent of $1455.10, while a 1000 square foot apartment would have an expected monthly rent of $1242.10.

13.8 (a) [Scatter plot: tensile strength (lbs/square in.) versus hardness (Rockwell E units)]
(b) \(\hat{Y} = 6.0483 + 2.0191X\)
(c) For each increase of one additional Rockwell E unit in hardness, the estimated average tensile strength will increase by 2.0191 pounds per square inch.
(d) \(\hat{Y} = 6.0483 + 2.0191(70) = 147.382\) pounds per square inch.

13.9 80% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.10 SST = 40 and \(r^2 = 0.90\). So, 90% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.11 \(r^2 = 0.75\). So, 75% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.12 \(r^2 = 0.667\). So, 66.7% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.13 Since SST = SSR + SSE and since SSE cannot be a negative number, SST must be at least as large as SSR.

13.14 (a) \(r^2 = 0.684\). So, 68.4% of the variation in the dependent variable can be explained by the variation in the independent variable.
(b) \(S_{YX} = 0.308\)
(c) Based on (a) and (b), the model should be very useful for predicting sales.

13.15 (a) \(r^2 = 0.912\). So, 91.2% of the variation in the dependent variable can be explained by the variation in the independent variable.
(b) \(S_{YX} = 0.5015\)
(c) Based on (a) and (b), the model should be very useful for predicting sales.

13.16 (a) \(r^2 = 0.9731\). So, 97.31% of the variation in the dependent variable can be explained by the variation in the independent variable.
(b) \(S_{YX} = 0.7258\)
(c) Based on (a) and (b), the model should be very useful for predicting the number of orders.

13.17 (a) \(r^2 = 0.728\). So, 72.8% of the variation in the dependent variable can be explained by the variation in the independent variable.
(b) \(S_{YX} = 47.87\)
(c) Based on (a) and (b), the model should be very useful for predicting sales.

13.18 (a) \(r^2 = 0.723\).
So, 72.3% of the variation in the dependent variable can be explained by the variation in the independent variable.
(b) \(S_{YX} = 194.6\)
(c) Based on (a) and (b), the model should be very useful for predicting monthly rent.

13.19 (a) \(r^2 = 0.4613\). So, 46.13% of the variation in the dependent variable can be explained by the variation in the independent variable.
(b) \(S_{YX} = 9.0616\)
(c) Based on (a) and (b), the model is only marginally useful for predicting tensile strength.

13.20 A residual analysis of the data indicates no apparent pattern. The assumptions of regression appear to be met.

13.21 A residual analysis of the data indicates a pattern, with sizeable clusters of consecutive residuals that are either all positive or all negative. This appears to violate the assumption of independence of errors.

13.22 (a)-(b) Based on a residual analysis, the model appears to be adequate.

13.23 (a)-(b) Based on a residual analysis of the studentized residuals versus customers, the model appears to be adequate.

13.24 (a) [Weight residual plot: residuals versus weight]
[Normal probability plot: residuals versus Z value]
The residual plot does not reveal any obvious pattern, so a linear fit appears to be adequate.
(b) The residual plot does not reveal any possible violation of the homoscedasticity assumption. These are not time-series data, so we do not need to evaluate the independence assumption. The normal probability plot shows that the distribution has a thicker left tail than a normal distribution, but there is no sign of severe skewness.

13.25 (a)-(b) Based on a residual analysis of the studentized residuals versus box office gross, the model appears to be adequate.

13.26 (a)-(b) Based on a residual analysis of the studentized residuals versus size, the model appears to be adequate.
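The coefficient of determination and standard error of the estimate reported in 13.14 to 13.19, and the residuals examined in 13.20 to 13.26, all come from the same least-squares fit. The sketch below shows one way these quantities could be computed. The function name `fit_simple_linear_regression` and the five data points are hypothetical and are not taken from any problem in this chapter.

```python
import math


def fit_simple_linear_regression(x, y):
    """Least-squares fit of y on x for a single predictor.

    Returns the intercept b0, slope b1, coefficient of determination r2,
    standard error of the estimate s_yx, and the list of residuals.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross products around the means.
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / ssx
    b0 = y_bar - b1 * x_bar

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

    sst = sum((yi - y_bar) ** 2 for yi in y)   # total variation
    sse = sum(e ** 2 for e in residuals)       # unexplained variation
    ssr = sst - sse                            # explained variation

    return {
        "b0": b0,
        "b1": b1,
        "r2": ssr / sst,
        "s_yx": math.sqrt(sse / (n - 2)),
        "residuals": residuals,
    }


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = fit_simple_linear_regression(x, y)
print(result["b0"], result["b1"], result["r2"], result["s_yx"])
```

Plotting `result["residuals"]` against `x` would give the kind of residual plot that the solutions above describe; a plot with no visible pattern is what supports the conclusion that a linear fit is adequate.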
13.27 (a) [Hardness residual plot: residuals versus hardness]
The residual plot does not reveal any obvious pattern, so a linear fit appears to be adequate.
(b) [Normal probability plot: residuals versus Z value]
The residual plot does not reveal any possible violation of the homoscedasticity assumption. These are not time-series data, so we do not need to evaluate the independence assumption. The normal probability plot shows that the distribution has a slightly thinner right tail than a normal distribution, but there is no sign of severe skewness.

13.28 (a) An increasing linear relationship exists.
(b) \(D = 0.109\)
(c) There is strong positive autocorrelation among the residuals.

13.29 (a) There is no apparent pattern in the residuals over time.
(b) \(D = 1.661 > 1.36\). There is no evidence of positive autocorrelation among the residuals.
(c) The data are not positively autocorrelated.

13.30 (a) No, since the data have been collected for a single period for a set of stores.
(b) If a single store was studied over a period of time and the amount of shelf space varied over time, computation of the Durbin-Watson statistic would be necessary.

13.31 (a) [Scatter plot: kilowatt usage versus atmospheric temperature (degrees F)]
(b) \(b_0 = 169.455\), \(b_1 = -1.8579\)
(c) For each increase of one degree in Fahrenheit temperature, the expected average kilowatt usage is estimated to decrease by 1.8579.
(d) \(\hat{Y} = 169.455 - 1.8579X = 169.455 - 1.8579(50) = 76.56\)
(e) \(r^2 = 0.894\). So, 89.4% of the variation in average kilowatt usage can be explained by the variation in the average temperature.
(f) \(S_{YX} = 11.63\)
(g) [Temperature residual plot: residuals versus temperature]
(h)
[Plot: residuals versus time period]
(i) \(D = 1.18 < 1.27\). There is evidence of positive autocorrelation among the residuals.
(j) The plot of the residuals versus temperature indicates that positive residuals tend to occur for the lowest and highest temperatures in the data set. A nonlinear model might be more appropriate. The evidence of positive autocorrelation is another reason to question the validity of the model.

13.32 (a) [Scatter diagram: cost ($000) versus number of orders]
(b) \(b_0 = 0.458\), \(b_1 = 0.0161\)
(c) For each increase of one order, the expected distribution cost is estimated to increase by 0.0161 thousand dollars, or $16.10.
(d) \(\hat{Y} = 0.458 + 0.0161X = 0.458 + 0.0161(4500) = 72.908\), or $72,908
(e) \(r^2 = 0.844\). So, 84.4% of the variation in distribution cost can be explained by the variation in the number of orders.
(f) \(S_{YX} = 5.218\)
(g) [Orders residual plot: residuals versus orders]
(h) [Plot: residuals versus time period]
(i) \(D = 2.08 > 1.45\). There is no evidence of positive autocorrelation among the residuals.
(j) Based on a residual analysis, the model appears to be adequate.

13.33 (a) [Scatter diagram: gasoline price (cents/gal.) versus crude oil ($/bbl.)]
(b) \(\hat{Y} = 42.8798 + 2.6573X\)
(c) For each increase of $1/bbl. in crude oil price, the estimated average gasoline price will increase by 2.6573 cents/gallon.
(d) \(\hat{Y} = 42.8798 + 2.6573(20) = 96.03\) cents/gallon.
(e) \(r^2 = 0.7117\). So, 71.17% of the variation in gasoline price can be explained by the variation in crude oil price.
(f) \(S_{YX} = 12.32\).
(g) [Crude oil price residual plot: residuals versus crude oil price]
(h) [Plot: residuals versus time period]
(i) \(D = 0.5915 < 1.27\). There is evidence of positive autocorrelation.
(j) [Normal probability plot: residuals versus Z value]
According to the residual plot against crude oil price, a nonlinear model is more appropriate. The plot of residuals versus time period, along with the Durbin-Watson statistic, suggests that there is strong evidence of positive autocorrelation. The normal probability plot indicates that the distribution has thinner tails than a normal distribution, but there is no sign of severe skewness.

13.34 (a) [Scatter diagram: sales per store ($000) versus temperature (degrees F)]
(b) \(b_0 = -2.535\), \(b_1 = 0.060728\)
(c) For each increase of one degree Fahrenheit in the high temperature, expected sales are estimated to increase by 0.060728 thousand dollars, or $60.73.
(d) \(\hat{Y} = -2.535 + 0.060728X = -2.535 + 0.060728(83) = 2.5054\), or $2505.40
(e) \(r^2 = 0.94\). So, 94% of the variation in sales per store can be explained by the variation in the daily high temperature.
(f) \(S_{YX} = 0.1461\)
(g) [Temperature residual plot: residuals versus temperature]
(h) [Plot: residuals versus time period]
(i) \(D = 1.64 > 1.42\). There is no evidence of positive autocorrelation among the residuals.
(j) The plot of the residuals versus time period shows some clustering of positive and negative residuals for intervals in the domain, suggesting a nonlinear model might be better. Otherwise, the model appears to be adequate.
(k) \(b_0 = -2.6281\), \(b_1 = 0.061713\). For each increase of one degree Fahrenheit in the high temperature, expected sales are estimated to increase by 0.061713 thousand dollars, or $61.71.
\(\hat{Y} = -2.6281 + 0.061713X = -2.6281 + 0.061713(83) = 2.4941\), or $2494.10
\(r^2 = 0.929\). 92.9% of the variation in sales per store can be explained by the variation in the daily high temperature.
\(S_{YX} = 0.1623\)
\(D = 1.24\). The test of the Durbin-Watson statistic is inconclusive as to whether there is positive autocorrelation among the residuals.
The plot of the residuals versus time period shows some clustering of positive and negative residuals for intervals in the domain, suggesting a nonlinear model might be better. Otherwise, the model appears to be adequate. The results are similar to those in (a)-(j).

13.35 (a) \(t = b_1 / S_{b_1} = 4.5 / 1.5 = 3.00\)
(b) With n = 18, df = 18 - 2 = 16. \(t_{16} = 2.1199\)
(c) Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(d) \(b_1 - t_{16} S_{b_1} \le \beta_1 \le b_1 + t_{16} S_{b_1}\), so \(4.5 - 2.1199(1.5) \le \beta_1 \le 4.5 + 2.1199(1.5)\), which gives \(1.32 \le \beta_1 \le 7.68\)

13.36 (a) \(MSR = SSR / p = 60 / 1 = 60\)
\(MSE = SSE / (n - p - 1) = 40 / 18 = 2.222\)
\(F = MSR / MSE = 60 / 2.222 = 27\)
(b) \(F_{1,18} = 4.414\)
(c) Reject \(H_0\). There is evidence that the fitted linear regression model is useful.

13.37 (a) \(t = 4.65 > t_{10} = 2.2281\) with 10 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(b) \(0.0386 \le \beta_1 \le 0.1094\)

13.38 (a) \(t = 13.65 > t_{18} = 2.1009\) with 18 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(b) \(0.0074 \le \beta_1 \le 0.0101\)

13.39 (a) The p-value is virtually 0 < 0.05. Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(b) \(0.0276 \le \beta_1 \le 0.0318\)

13.40 (a) \(t = 8.65 > t_{28} = 2.0484\) with 28 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(b) \(3.3073 \le \beta_1 \le 5.3589\)

13.41 (a) \(t = 7.74 > t_{23} = 2.0687\) with 23 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\).
There is evidence that the fitted linear regression model is useful.
(b) \(0.7805 \le \beta_1 \le 1.3497\)

13.42 (a) p-value = 7.26497E-06 < 0.05. Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(b) \(1.2463 \le \beta_1 \le 2.7918\)

13.43 (a) For the Ford Motor Company, the estimated value of its stock will increase by 0.92% on average when the S&P 500 index increases by 1%. For Houston Industries, the estimated value of its stock will increase by 0.43% on average when the S&P 500 index increases by 1%. For IBM, the estimated value of its stock will increase by 1.09% on average when the S&P 500 index increases by 1%. For LSI Logic, the estimated value of its stock will increase by 1.80% on average when the S&P 500 index increases by 1%.
(b) A stock is riskier than the market if the absolute value of its estimated beta exceeds one. Beta can therefore be used to gauge the volatility of a stock relative to how the market behaves in general.

13.44 (a) (% daily change in ULPIX) = \(b_0\) + 2.00 (% daily change in S&P 500 Index)
(b) If the S&P gains 30% in a year, the ULPIX is expected to gain an estimated 60%.
(c) If the S&P loses 35% in a year, the ULPIX is expected to lose an estimated 70%.
(d) Since the leveraged funds have higher volatility and, hence, higher risk than the market, risk-averse investors should stay away from these funds. Risk takers, on the other hand, will benefit from the higher potential gain from these funds.

13.45 (a) \(r = -0.1641\).
(b) \(t = -0.4706\), p-value = 0.6505 > 0.05. Do not reject \(H_0\). There is not enough evidence to conclude that there is a significant linear relationship between the retail price and the energy cost per year of medium-size top-freezer refrigerators.

13.46 (a) \(r = 0.9656\)
(b) The p-value of the t test is essentially zero. At the 0.05 level of significance, there is a significant linear relationship between calories and fat content.
(c) Yes, one would expect the ice creams with higher fat content to have more calories.

13.47 (a) When X = 2, \(\hat{Y} = 5 + 3X = 5 + 3(2) = 11\)
\(h = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{1}{20} + \frac{(2 - 2)^2}{20} = 0.05\)
95% confidence interval: \(\hat{Y} \pm t_{18} S_{YX} \sqrt{h} = 11 \pm 2.1009 \cdot 1 \cdot \sqrt{0.05}\), so \(10.53 \le \mu_{Y|X} \le 11.47\)
(b) 95% prediction interval: \(\hat{Y} \pm t_{18} S_{YX} \sqrt{1 + h} = 11 \pm 2.1009 \cdot 1 \cdot \sqrt{1.05}\), so \(8.847 \le Y_I \le 13.153\)

13.48 (a) When X = 4, \(\hat{Y} = 5 + 3X = 5 + 3(4) = 17\)
\(h = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{1}{20} + \frac{(4 - 2)^2}{20} = 0.25\)
95% confidence interval: \(\hat{Y} \pm t_{18} S_{YX} \sqrt{h} = 17 \pm 2.1009 \cdot 1 \cdot \sqrt{0.25}\), so \(15.95 \le \mu_{Y|X} \le 18.05\)
(b) 95% prediction interval: \(\hat{Y} \pm t_{18} S_{YX} \sqrt{1 + h} = 17 \pm 2.1009 \cdot 1 \cdot \sqrt{1.25}\), so \(14.651 \le Y_I \le 19.349\)
(c) The intervals in this problem are wider because the value of X is farther from \(\bar{X}\).

13.49 (a) \(1.7867 \le \mu_{Y|X} \le 2.2964\)
(b) \(1.3100 \le Y_I \le 2.7740\)
(c) Part (b) provides an estimate for an individual response and Part (a) provides an estimate for an average predicted value.

13.50 (a) \(7.3664 \le \mu_{Y|X} \le 7.9549\)
(b) \(6.5667 \le Y_I \le 8.7546\)
(c) Part (b) provides an estimate for an individual response and Part (a) provides an estimate for an average predicted value.

13.51 (a) \(14.7150 \le \mu_{Y|X} \le 15.3701\)
(b) \(13.5059 \le Y_I \le 16.5793\)
(c) Part (b) provides an estimate for an individual response and Part (a) provides an estimate for an average predicted value.

13.52 (a) \(100.96 \le \mu_{Y|X} \le 138.77\)
(b) \(20.1 \le Y_I \le 219.72\)
(c) Part (b) provides an estimate for an individual response and Part (a) provides an estimate for an average predicted value.

13.53 (a) \(1153.0 \le \mu_{Y|X} \le 1331.5\)
(b) \(829.9 \le Y_I \le 1654.6\)
(c) Part (b) provides an estimate for an individual response and Part (a) provides an estimate for an average predicted value.

13.54 (a) \(116.7082 \le \mu_{Y|X} \le 178.0564\)
(b) \(111.5942 \le Y_I \le 183.1704\)
(c) Part (b) provides an estimate for an individual response and Part (a) provides an estimate for an average predicted value.

13.55 The slope of the line, \(b_1\), represents the estimated expected change in Y per unit change in X.
It represents the estimated average amount that Y changes (either positively or negatively) for a particular unit change in X. The Y intercept, \(b_0\), represents the estimated average value of Y when X equals 0.

13.56 The coefficient of determination measures the proportion of variation in Y that is explained by the independent variable X in the regression model.

13.57 The unexplained variation or error sum of squares (SSE) will be equal to zero only when the regression line fits the data perfectly and the coefficient of determination equals 1.

13.58 The explained variation or regression sum of squares (SSR) will be equal to zero only when there is no relationship between the Y and X variables, and the coefficient of determination equals 0.

13.59 Unless a residual analysis is undertaken, you will not know whether the model fit is appropriate for the data. In addition, residual analysis can be used to check whether the assumptions of regression have been seriously violated.

13.60 The assumptions of regression are normality of error, homoscedasticity, and independence of errors. The normality of error assumption can be evaluated by obtaining a histogram, box-and-whisker plot, and/or normal probability plot of the residuals. The homoscedasticity assumption can be evaluated by plotting the residuals on the vertical axis and the X variable on the horizontal axis. The independence of errors assumption can be evaluated by plotting the residuals on the vertical axis and the time order variable on the horizontal axis. This assumption can also be evaluated by computing the Durbin-Watson statistic.

13.61 The Durbin-Watson statistic is a measure of the autocorrelation among the residuals. It measures the correlation among consecutive residuals.

13.62 If the data in a regression analysis have been collected over time, then the assumption of independence of errors needs to be evaluated using the Durbin-Watson statistic.
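As an illustration of 13.61 and 13.62, the Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals, \(D = \sum_{i=2}^{n}(e_i - e_{i-1})^2 / \sum_{i=1}^{n} e_i^2\). The sketch below assumes the residuals are already listed in time order. The function name and both residual series are hypothetical, and the critical values \(d_L\) and \(d_U\) used in the solutions must still be looked up in a table.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    # Numerator: squared changes between consecutive residuals.
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2
        for i in range(1, len(residuals))
    )
    # Denominator: sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that drift slowly from negative to positive,
# the pattern associated with positive autocorrelation (D well below 2).
trending = [-2.0, -1.5, -1.0, 0.5, 1.0, 1.5, 1.5]

# Hypothetical residuals that flip sign every period,
# the pattern associated with negative autocorrelation (D above 2).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(durbin_watson(trending))
print(durbin_watson(alternating))
```

A value of D near 2 is consistent with independent errors, while a value below the lower critical value, as in 13.28 (b) or 13.33 (i), points to positive autocorrelation.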
13.63 The confidence interval for the mean response estimates the average response for a given X value. The prediction interval estimates the value for a single item or individual.

13.64 (a) [Scatter diagram: delivery time (minutes) versus number of cases]
(b) \(b_0 = 24.84\), \(b_1 = 0.14\)
(c) \(\hat{Y} = 24.84 + 0.14X\), where X is the number of cases and \(\hat{Y}\) is the estimated delivery time.
(d) For each additional case, the estimated delivery time increases by 0.14 minutes. 24.84 is the portion of the estimated delivery time that is not affected by the number of cases.
(e) \(\hat{Y} = 24.84 + 0.14X = 24.84 + 0.14(150) = 45.84\)
(f) No, 500 cases is outside the relevant range of the data used to fit the regression equation.
(g) \(r^2 = 0.972\). So, 97.2% of the variation in delivery time can be explained by the variation in the number of cases.
(h) Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.972} = 0.986\)
(i) \(S_{YX} = 1.987\)
(j) Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the number of cases, there is no pattern. The model appears to be adequate.
(k) \(t = 24.88 > t_{18} = 2.1009\) with 18 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(l) \(44.88 \le \mu_{Y|X} \le 46.80\)
(m) \(41.56 \le Y_I \le 50.12\)
(n) \(0.1282 \le \beta_1 \le 0.1518\)

13.65 (a) [Scatter diagram: number of trade executions versus number of incoming calls]
(b) \(b_0 = -63.02\), \(b_1 = 0.189\)
(c) \(\hat{Y} = -63.02 + 0.189X\), where X is the number of incoming calls and \(\hat{Y}\) is the estimated number of trade executions.
(d) For each additional incoming call, the estimated number of trade executions increases by 0.189. -63.02 is the portion of the estimated number of trade executions that is not affected by the number of incoming calls.
(e) \(\hat{Y} = -63.02 + 0.189X = -63.02 + 0.189(2000) = 314.99\)
(f) No, 5000 incoming calls is outside the relevant range of the data used to fit the regression equation.
(g) \(r^2 = 0.630\). So, 63.0% of the variation in trade executions can be explained by the variation in the number of incoming calls.
(h) Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.63} = 0.794\)
(i) \(S_{YX} = 29.42\)
(j) Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the number of incoming calls, there is no pattern. The model appears to be adequate.
(k) \(D = 1.96\)
(l) \(D = 1.96 > 1.52\). There is no evidence of positive autocorrelation. The model appears to be adequate.
(m) \(t = 7.50 > t_{33} = 2.0345\) with 33 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(n) \(302.07 \le \mu_{Y|X} \le 327.91\)
(o) \(253.76 \le Y_I \le 376.22\)
(p) \(0.1377 \le \beta_1 \le 0.2403\)

13.66 (a) [Scatter diagram: selling price ($000) versus assessed value ($000)]
\(b_0 = -44.172\), \(b_1 = 1.78171\)
(b) For each additional dollar in assessed value, the estimated selling price increases by $1.78. -44.172 is the portion of the estimated selling price that is not affected by the assessed value.
(c) \(\hat{Y} = -44.172 + 1.78171X = -44.172 + 1.78171(70) = 80.548\), or $80,548
(d) \(S_{YX} = 3.475\)
(e) \(r^2 = 0.926\). 92.6% of the variation in selling price can be explained by the variation in the assessed value.
(f) Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.926} = 0.962\)
(g) Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the assessed value, there is no pattern. The model appears to be adequate.
(h) \(t = 18.66 > t_{28} = 2.0484\) with 28 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(i) \(78.707 \le \mu_{Y|X} \le 82.388\)
(j) \(73.195 \le Y_I \le 87.900\)
(k) \(1.5862 \le \beta_1 \le 1.9773\)

13.67 (a) [Scatter diagram: assessed value ($000) versus heating area (thousands of square feet)]
\(b_0 = 51.915\), \(b_1 = 16.633\)
(b) For each additional 1000 square feet in heating area, the estimated assessed value increases by $16,633. $51,915 is the portion of the estimated assessed value that is not affected by heating area.
(c) \(\hat{Y} = 51.915 + 16.633X = 51.915 + 16.633(1.75) = 81.024\), or $81,024
(d) \(S_{YX} = 2.919\)
(e) \(r^2 = 0.659\). 65.9% of the variation in assessed value can be explained by the variation in heating area.
(f) Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.659} = 0.812\)
(g) Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the heating area, there is no pattern. The model appears to be adequate.
(h) \(t = 5.02 > t_{13} = 2.1604\) with 13 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(i) \(79.279 \le \mu_{Y|X} \le 82.769\)
(j) \(74.479 \le Y_I \le 87.569\)
(k) \(9.469 \le \beta_1 \le 23.797\)
(l) \(b_0 = 52.805\), \(b_1 = 15.849\). For each additional 1000 square feet in heating area, the estimated assessed value increases by $15,849. $52,805 is the portion of the estimated assessed value that is not affected by heating area.
\(\hat{Y} = 52.805 + 15.849X = 52.805 + 15.849(1.75) = 80.541\), or $80,541
\(S_{YX} = 2.598\)
\(r^2 = 0.689\). 68.9% of the variation in assessed value can be explained by the variation in heating area.
Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.689} = 0.83\)
Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the heating area, there is no pattern. The model appears to be adequate.
\(t = 5.37 > t_{13} = 2.1604\) with 13 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
\(78.987 \le \mu_{Y|X} \le 82.096\)
\(74.716 \le Y_I \le 86.367\)
\(9.471 \le \beta_1 \le 22.227\)

13.68 (a) [Scatter diagram: GPI versus GMAT score]
\(b_0 = 0.30\), \(b_1 = 0.00487\)
(b) For each additional point on the GMAT score, the estimated GPI increases by 0.00487. 0.30 is the portion of the GPI that is not affected by the GMAT score.
(c) \(\hat{Y} = 0.30 + 0.00487X = 0.30 + 0.00487(600) = 3.2225\)
(d) \(S_{YX} = 0.158\)
(e) \(r^2 = 0.793\). 79.3% of the variation in the GPI can be explained by the variation in the GMAT score.
(f) Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.793} = 0.891\)
(g) Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the GMAT score, there is no pattern. The model appears to be adequate.
(h) \(t = 8.31 > t_{18} = 2.1009\) with 18 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(i) \(3.144 \le \mu_{Y|X} \le 3.301\)
(j) \(2.886 \le Y_I \le 3.559\)
(k) \(0.00366 \le \beta_1 \le 0.00608\)
(l) \(b_0 = 0.258\), \(b_1 = 0.00494\). For each additional point on the GMAT score, the estimated GPI increases by 0.00494. 0.258 is the portion of the GPI that is not affected by the GMAT score.
\(\hat{Y} = 0.258 + 0.00494X = 0.258 + 0.00494(600) = 3.221\)
\(S_{YX} = 0.147\)
\(r^2 = 0.820\). 82.0% of the variation in the GPI can be explained by the variation in the GMAT score.
Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.82} = 0.906\)
Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the GMAT score, there is no pattern. The model appears to be adequate.
\(t = 9.06 > t_{18} = 2.1009\) with 18 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
\(3.147 \le \mu_{Y|X} \le 3.295\)
\(2.903 \le Y_I \le 3.539\)
\(0.00380 \le \beta_1 \le 0.00609\)

13.69 (a) [Scatter diagram: completion time (hours) versus invoices processed]
(b) \(b_0 = 0.4024\), \(b_1 = 0.012608\)
(c) For each additional invoice processed, the estimated completion time increases by 0.012608 hours. 0.4024 is the portion of the estimated completion time that is not affected by the number of invoices processed.
(d) \(\hat{Y} = 0.4024 + 0.012608X = 0.4024 + 0.012608(150) = 2.2934\)
(e) \(S_{YX} = 0.3342\)
(f) \(r^2 = 0.892\). 89.2% of the variation in completion time can be explained by the variation in the number of invoices processed.
(g) Since \(b_1\) is positive, \(r = +\sqrt{r^2} = \sqrt{0.892} = 0.945\)
(i) Based on a visual inspection of the graphs of the distribution of studentized residuals and the residuals versus the number of invoices, there is no pattern. The model appears to be adequate.
(j) \(D = 1.78\)
(k) \(D = 1.78 > 1.49\). There is no evidence of positive autocorrelation. The model appears to be adequate.
(l) \(t = 15.24 > t_{28} = 2.0484\) with 28 degrees of freedom for \(\alpha = 0.05\). Reject \(H_0\). There is evidence that the fitted linear regression model is useful.
(m) \(2.1638 \le \mu_{Y|X} \le 2.4230\)
(n) \(1.5966 \le Y_I \le 2.9902\)

13.70 (a) [Scatter plot: O-ring damage index versus temperature (degrees F)]
There is no clear relationship between atmospheric temperature and O-ring damage in the scatter plot.
(b),(f) [Chart: O-ring damage index versus temperature (degrees F)]
(c) In (b), there are 16 observations with an O-ring damage index of 0 across a variety of temperatures. If one concentrates on these observations with no O-ring damage, there is obviously no relationship between O-ring damage index and temperature. If all observations are used, the observations with no O-ring damage will bias the estimated relationship.
If the intention is to investigate the relationship between the degree of O-ring damage and atmospheric temperature, it makes sense to focus only on the flights in which there was O-ring damage.
(d) A prediction should not be made for an atmospheric temperature of 31 °F because it is outside the range of the temperature variable in the data. Such a prediction would involve extrapolation, which assumes that any relationship between the two variables continues to hold outside the domain of the temperature variable.
(e) \(\hat{Y} = 18.036 - 0.234X\)
(g) A nonlinear model is more appropriate for these data.
(h) [Temperature residual plot: residuals versus temperature]
The string of negative and positive residuals that lie on a straight line with a positive slope in the lower-right corner of the plot is a strong indication that a nonlinear model should be used if all 23 observations are to be used in the fit.

13.71 (a) [Scatter diagram: 1999 gross profits (in millions) versus page views (monthly visitors in thousands)]
If the outlier (Amazon.com) in the upper right corner of the scatter diagram is removed, there is no obvious linear relationship between page views and gross profits.
(b) \(\hat{Y} = b_0 + 0.154X\), where \(b_0\) has magnitude 1.354 (the sign of the intercept is not legible in the source).
(c) Since all the companies in the data are internet companies that rely on online customers for their business, it is not meaningful to interpret the estimated intercept, which corresponds to having no online visitors at all. The estimated slope coefficient \(b_1 = 0.154\) means that for each increase of one thousand additional monthly visitors, the average gross profit of a company is estimated to increase by 0.154 million dollars.
(d) \(0.0553 \le \beta_1 \le 0.2524\). In the long run, 95% of all the confidence intervals that are
constructed for the slope parameter will contain the true value of the slope parameter. Since the interval does not contain 0, we are 95% confident that there is a significant linear relationship between page views and gross profits.
(e) r² = 0.6183. 61.83% of the total variation in gross profits can be explained by the variation in the number of monthly visitors.
(f) S_YX = 13.4245. The standard error of the estimate measures the variability of the observed values of the dependent variable around the least squares regression line, in the units of the dependent variable (it is the square root of the sum of squared residuals divided by n - 2).
(g) The outliers are Amazon.com, Cheap Tickets, and About.com.
(h) (a) [Scatter diagram: 1999 Gross Profits (in millions) vs. Page Views (monthly visitors in thousands)]
With Amazon.com removed from the data, there is no obvious relationship between page views and gross profits.
(b) Ŷ = 12.598 - 0.0637X
(c) Since all the companies in the data are internet companies that rely on online customers for their business, it is not meaningful to interpret the estimated intercept, which corresponds to no online visitors at all. The estimated slope coefficient b1 = -0.0637 means that for each additional one thousand monthly visitors, the average gross profit of a company is estimated to decrease by 0.0637 million dollars.
(d) -0.1691 ≤ β1 ≤ 0.0417. In the long run, 95% of all the confidence intervals that are constructed for the slope parameter will contain the true value of the slope parameter. Since the interval contains 0, we cannot conclude at the 95% level of confidence that there is a significant linear relationship between page views and gross profits.
(e) r² = 0.2259. 22.59% of the total variation in gross profits can be explained by the variation in the number of monthly visitors.
(f) S_YX = 6.262. The standard error of the estimate measures the variability of the observed values of the dependent variable around the least squares regression line, in the units of the dependent variable.
(g) The outliers are Cheap Tickets and eToys.com.
The exclusion of Amazon.com changes the estimated slope coefficient from positive to negative. Amazon.com is therefore extremely influential on the least squares regression estimates; such an observation is called an influential point in regression analysis.

13.72 (a) [Scatter diagram: Weight (grams) vs. Circumference (cms.)]
(b) Ŷ = -2629.222 + 82.4717X
(c) For each additional cm in circumference, the estimated average weight of a pumpkin increases by 82.4717 grams.
(d) Ŷ = -2629.222 + 82.4717(60) = 2319.080 grams.
(e) There appears to be a positive relationship between the weight and the circumference of a pumpkin. It is a good idea for the farmer to sell pumpkins by circumference instead of weight, because circumference is a good predictor of weight and it is much easier to measure the circumference of a pumpkin than its weight.
(f) r² = 0.9373. 93.73% of the variation in pumpkin weight can be explained by the variation in circumference.
(g) S_YX = 277.7495.
(h) [Residual plot: Residuals vs. Circumference]
There appears to be a nonlinear relationship between circumference and weight.
(i) p-value is virtually 0. Reject H0. There is sufficient evidence to conclude that there is a linear relationship between the circumference and the weight of a pumpkin.
(j) 72.7875 ≤ β1 ≤ 92.1559
(k) 2186.9589 ≤ μ_Y|X ≤ 2451.2020
(l) 1726.5508 ≤ Y_I ≤ 2911.6101

13.73 (a) [Scatter diagram: Wins vs. E.R.A.]
(b) Ŷ = 152.8097 - 15.0927X
(c) b0 = 152.8097. For a team that has an E.R.A. of 0, the estimated average number of wins is 152.81. For each additional unit increase in team E.R.A., the estimated average number of wins decreases by 15.09.
(d) Ŷ = 152.8097 - 15.0927(4.5) = 84.89
(e) S_YX = 7.6363.
(f) r² = 0.4354.
So, 43.54% of the variation in the number of wins can be explained by the variation in team E.R.A.
(g) Since b1 is negative, r = -√r² = -√0.4354 = -0.6599
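
The point predictions in 13.72 (d) and 13.73 (d), and the sign rule for r in 13.73 (g), can be checked with a few lines of Python. This is only a sketch: the function names `predict` and `signed_r` are illustrative and not part of the text. The coefficient signs are assumptions read off the solutions: the intercept in 13.72 is taken as negative, which is what makes part (d) come out to 2319.080, and the slope in 13.73 is taken as negative, as part (c) states.

```python
import math


def predict(b0, b1, x):
    """Point prediction from a fitted simple linear regression line: b0 + b1 * x."""
    return b0 + b1 * x


def signed_r(r_squared, b1):
    """Coefficient of correlation: the square root of r^2, carrying the sign of the slope b1."""
    return math.copysign(math.sqrt(r_squared), b1)


# 13.72 (d): pumpkin weight at a circumference of 60 cm
# (assumed coefficients: b0 = -2629.222, b1 = 82.4717)
pumpkin_weight = predict(-2629.222, 82.4717, 60)

# 13.73 (d): wins at a team E.R.A. of 4.5
# (assumed coefficients: b0 = 152.8097, b1 = -15.0927)
wins = predict(152.8097, -15.0927, 4.5)

# 13.73 (g): r takes the sign of the slope, so it is negative here
r_era = signed_r(0.4354, -15.0927)

print(round(pumpkin_weight, 3), round(wins, 2), round(r_era, 3))
```

Because r² = 0.4354 is itself a rounded value, the computed r agrees with the reported -0.6599 only to about three decimal places.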