# chap13_part1

```									                                              Solutions to End-of-Section and Chapter Review Problems   297

CHAPTER 13

13.1   (a)     When X = 0, the estimated expected value of Y is 2.
(b)     For each increase in the value of X by 1 unit, we can expect an increase of an estimated 5
units in the value of Y.
(c)     Ŷ = 2 + 5X = 2 + 5(3) = 17
(d) yes, (e) no, (f) no, (g) yes, (h) no

13.2   (a)     When X = 0, the estimated expected value of Y is 16.
(b)     For each increase in the value of X by 1 unit, we can expect a decrease of an estimated 0.5
units in the value of Y.
(c)     Ŷ = 16 - 0.5X = 16 - 0.5(6) = 13

13.3   (a)     [Scatter diagram: Weekly Sales (Y) versus Shelf Space (X)]

(b),(c) For each increase in shelf space of an additional foot, there is an expected increase in
weekly sales of an estimated 0.074 hundreds of dollars, or \$7.40.
(d)     Ŷ = 1.45 + 0.074X = 1.45 + 0.074(8) = 2.042, or \$204.20
(e)     b0 = 1.5333, b1 = 0.064
For each increase in shelf space of an additional foot, there is an expected increase in
weekly sales of an estimated 0.064 hundreds of dollars, or \$6.40.
Ŷ = 1.5333 + 0.064X = 1.5333 + 0.064(8) = 2.0453, or \$204.53
(f)     The best allocation to pet food depends on the profit made per foot of shelf space.
The expected weekly sales (and profits) per foot of shelf space actually decline as
the amount of allocated shelf space increases from 5 to 20 feet. However, if the
profitability is still high enough, it will be worthwhile assigning a larger amount of
shelf space to pet food.
298    Chapter 13: Simple Linear Regression

13.4    (a)     [Scatter diagram: Weekly Sales (Y) versus Customers (X)]

(b),(c) For each increase of one additional customer, there is an expected increase in weekly
sales of an estimated 0.00873 thousands of dollars, or \$8.73.
(d)     Ŷ = 2.423 + 0.00873X = 2.423 + 0.00873(600) = 7.661, or \$7661
(e)     b0 = 1.578, b1 = 0.01009
For each increase of one additional customer, there is an expected increase in
weekly sales of an estimated 0.01009 thousands of dollars, or \$10.09.
Ŷ = 1.578 + 0.01009X = 1.578 + 0.01009(600) = 7.632, or \$7632

13.5    (a)     [Scatter diagram: Number of Orders versus Weight (lbs.)]

(b)     Ŷ = 0.1912 + 0.0297X
(c)     For each increase of one additional pound, the estimated average number of orders
will increase by 0.0297.
(d)     Ŷ = 0.1912 + 0.0297(500) = 15.043

13.6   (a)     [Scatter plot: Gross (\$millions) versus Video Units Sold]

(b),(c) Ŷ = 76.54 + 4.3331X
(d)     For each increase of 1 million dollars in box office gross, expected home video units
sold are estimated to increase by 4.3331 thousand, or 4333.1 units. 76.54 represents
the portion of thousands of home video units that are not affected by box office gross.
(e)     Ŷ = 76.54 + 4.3331X = 76.54 + 4.3331(20) = 163.202, or 163,202 units.
(f)     Some other factors that might be useful in predicting video unit sales are (i) the
number of days the movie was screened, (ii) the rating of the movie by critics, (iii)

13.7   (a)     [Scatter plot: Monthly Rent (\$) versus Size (square feet)]

(b),(c) Ŷ = 177.1 + 1.065X
(d)     For each increase of 1 square foot in space, the expected monthly rental is estimated
to increase by \$1.065. 177.1 represents the portion of apartment monthly rental that is
not affected by square footage.
(e)     Ŷ = 177.1 + 1.065X = 177.1 + 1.065(1000) = \$1242.10
(f)    An apartment with 500 square feet is outside the relevant range for the
independent variable.
(g)    The apartment with 1200 square feet has the more favorable rent relative to size.
Based on the regression equation, a 1200 square foot apartment would have an
expected monthly rent of \$1455.10, while a 1000 square foot apartment would have
an expected monthly rent of \$1242.10.
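As a check on (g), the expected rent per square foot implied by the fitted equation can be
compared directly (a worked illustration using only the figures above, not part of the
original solution):
\$1242.10 / 1000 = \$1.2421 per square foot
\$1455.10 / 1200 = \$1.2126 per square foot
The fixed component of 177.1 is spread over more square feet in the larger apartment, so
its expected rent per square foot is lower.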

13.8     (a)    [Scatter plot: Tensile Strength (lbs/square in.) versus Hardness (Rockwell E units)]

(b)     Ŷ = 6.0483 + 2.0191X
(c)     For each increase of one additional Rockwell E unit in hardness, the estimated
average tensile strength will increase by 2.0191 pounds per square inch.
(d)     Ŷ = 6.0483 + 2.0191(70) = 147.382 pounds per square inch.

13.9            80% of the variation in the dependent variable can be explained by the variation in
the independent variable.

13.10           SST = 40 and r2 = 0.90. So, 90% of the variation in the dependent variable can be
explained by the variation in the independent variable.

13.11           r2 = 0.75. So, 75% of the variation in the dependent variable can be explained by the
variation in the independent variable.

13.12           r2 = 0.667. So, 66.7% of the variation in the dependent variable can be explained by
the variation in the independent variable.

13.13           Since SST = SSR + SSE and since SSE cannot be a negative number, SST must be at
least as large as SSR.
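Stated as a worked relation (standard definitions, added here for illustration and
assuming SST > 0):
r2 = SSR / SST = 1 - SSE / SST
Because 0 ≤ SSE ≤ SST, it follows that 0 ≤ r2 ≤ 1, with r2 = 1 only when SSE = 0.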

13.14   (a)     r2 = 0.684. So, 68.4% of the variation in the dependent variable can be explained by
the variation in the independent variable.
(b)     sYX = 0.308
(c)     Based on (a) and (b), the model should be very useful for predicting sales.

13.15   (a)     r2 = 0.912. So, 91.2% of the variation in the dependent variable can be explained by
the variation in the independent variable.
(b)     sYX = 0.5015
(c)     Based on (a) and (b), the model should be very useful for predicting sales.

13.16   (a)     r2 = 0.9731. So, 97.31% of the variation in the dependent variable can be explained
by the variation in the independent variable.
(b)     sYX = 0.7258
(c)     Based on (a) and (b), the model should be very useful for predicting the number of
orders.

13.17   (a)     r2 = 0.728. So, 72.8% of the variation in the dependent variable can be explained by
the variation in the independent variable.
(b)     sYX = 47.87
(c)     Based on (a) and (b), the model should be very useful for predicting sales.

13.18   (a)     r2 = 0.723. So, 72.3% of the variation in the dependent variable can be explained by
the variation in the independent variable.
(b)     sYX = 194.6
(c)     Based on (a) and (b), the model should be very useful for predicting monthly rent.

13.19   (a)     r2 = 0.4613. So, 46.13% of the variation in the dependent variable can be explained
by the variation in the independent variable.
(b)     sYX = 9.0616
(c)     Based on (a) and (b), the model is only marginally useful for predicting tensile
strength.

13.20           A residual analysis of the data indicates no apparent pattern. The assumptions of
regression appear to be met.

13.21           A residual analysis of the data indicates a pattern, with sizeable clusters of
consecutive residuals that are either all positive or all negative. This appears to
violate the assumption of independence of errors.

13.22   (a)-(b) Based on a residual analysis, the model appears to be adequate.

13.23   (a)-(b) Based on a residual analysis of the studentized residuals versus customers, the model

13.24    (a)    [Residual plot: Residuals versus Weight]
[Normal probability plot: Residuals versus Z Value]

The residual plot does not reveal any obvious pattern. So a linear fit appears to be adequate.
(b)     The residual plot does not reveal any possible violation of the homoscedasticity
assumption. These are not time-series data, so we do not need to evaluate the
independence assumption. The normal probability plot shows that the distribution
has a thicker left tail than a normal distribution, but there is no sign of severe
skewness.

13.25    (a)-(b) Based on a residual analysis of the studentized residuals versus box office gross, the

13.26    (a)-(b) Based on a residual analysis of the studentized residuals versus size, the model

13.27   (a)   [Residual plot: Residuals versus Hardness]

The residual plot does not reveal any obvious pattern. So a linear fit appears to be adequate.
(b)   [Normal probability plot: Residuals versus Z Value]

The residual plot does not reveal any possible violation of the homoscedasticity
assumption. These are not time-series data, so we do not need to evaluate the
independence assumption. The normal probability plot shows that the distribution
has a slightly thinner right tail than a normal distribution, but there is no sign of
severe skewness.

13.28   (a)   An increasing linear relationship exists.
(b)   D = 0.109
(c)   There is strong positive autocorrelation among the residuals.

13.29   (a)   There is no apparent pattern in the residuals over time.
(b)   D = 1.661>1.36. There is no evidence of positive autocorrelation among the
residuals.
(c)   The data are not positively autocorrelated.

13.30   (a)   No, since the data have been collected for a single period for a set of stores.
(b)   If a single store was studied over a period of time and the amount of shelf space
varied over time, computation of the Durbin-Watson statistic would be necessary.

13.31    (a)    [Scatter plot: Kilowatt Usage versus Atmospheric Temperature (degrees F)]

(b)     b0 = 169.455, b1 = –1.8579
(c)     For each increase of one degree in Fahrenheit temperature, the expected average
kilowatt usage is estimated to decrease by 1.8579.
(d)     Ŷ = 169.455 - 1.8579X = 169.455 - 1.8579(50) = 76.56
(e)     r2 = 0.894. So, 89.4% of the variation in average kilowatt usage can be explained by
the variation in the average temperature.
(f)     sYX = 11.63
(g)     [Residual plot: Residuals versus Temperature]
(h)     [Residual plot: Residuals versus Time Period]

(i)   D = 1.18<1.27. There is evidence of positive autocorrelation among the residuals.
(j)   The plot of the residuals versus temperature indicates that positive residuals tend to
occur for the lowest and highest temperatures in the data set. A nonlinear model
might be more appropriate. The evidence of positive autocorrelation is another reason
to question the validity of the model.

13.32   (a)   [Scatter diagram: Cost (\$000) versus Number of Orders]

(b)   b0 = 0.458, b1 = 0.0161
(c)   For each increase of one order, the expected distribution cost is estimated to increase
by 0.0161 thousand dollars, or \$16.10.
(d)   Ŷ = 0.458 + 0.0161X = 0.458 + 0.0161(4500) = 72.908, or \$72,908
(e)   r2 = 0.844. So, 84.4% of the variation in distribution cost can be explained by the
variation in the number of orders.
(f)   sYX = 5.218
(g)   [Residual plot: Residuals versus Orders]
(h)   [Residual plot: Residuals versus Time Period]

(i)    D = 2.08>1.45. There is no evidence of positive autocorrelation among the residuals.
(j)    Based on a residual analysis, the model appears to be adequate.

13.33   (a)   [Scatter diagram: Gasoline Price (cents/gal.) versus Crude Oil (\$/bbl.)]

(b)   Ŷ = 42.8798 + 2.6573X
(c)   For each increase of 1 \$/bbl. of crude oil price, the estimated average gasoline price
will increase by 2.6573 cents/gallon.
(d)   Ŷ = 42.8798 + 2.6573(20) = 96.03 cents/gallon.
(e)   r2 = 0.7117. So, 71.17% of the variation in gasoline price can be explained by the
variation in crude oil price.
(f)   sYX = 12.32
(g)   [Residual plot: Residuals versus Crude Oil Price]
(h)   [Residual plot: Residuals versus Time Period]

(i)    D = 0.5915 < 1.27. There is evidence of positive autocorrelation among the residuals.
(j)    [Normal probability plot: Residuals versus Z Value]

According to the plot of the residuals versus crude oil price, a nonlinear model is more
appropriate. The plot of the residuals versus time period, along with the Durbin-Watson
statistic, suggests that there is strong evidence of positive autocorrelation. The normal
probability plot indicates that the distribution has thinner tails than a normal
distribution, but there is no sign of severe skewness.

13.34   (a)   [Scatter diagram: Sales Per Store (\$000) versus Temperature (degrees F)]

(b)   b0 = –2.535, b1 = 0.060728
(c)   For each increase of one degree Fahrenheit in the high temperature, expected sales
are estimated to increase by 0.060728 thousand dollars, or \$60.73.
(d)   Ŷ = –2.535 + 0.060728X = –2.535 + 0.060728(83) = 2.5054, or \$2505.40
(e)   r2 = 0.94. So, 94% of the variation in sales per store can be explained by the variation
in the daily high temperature.
(f)   sYX = 0.1461
(g)   [Residual plot: Residuals versus Temperature]
(h)   [Residual plot: Residuals versus Time Period]

(i)     D = 1.64>1.42. There is no evidence of positive autocorrelation among the residuals.
(j)     The plot of the residuals versus time period shows some clustering of positive and
negative residuals for intervals in the domain, suggesting a nonlinear model might be
better. Otherwise, the model appears to be adequate.
(k)     b0 = –2.6281, b1 = 0.061713
For each increase of one degree Fahrenheit in the high temperature, expected sales
are estimated to increase by 0.061713 thousand dollars, or \$61.71.
Ŷ = –2.6281 + 0.061713X = –2.6281 + 0.061713(83) = 2.4941, or \$2494.10
r2 = 0.929. So, 92.9% of the variation in sales per store can be explained by the variation
in the daily high temperature.
sYX = 0.1623
D = 1.24. The test of the Durbin-Watson statistic is inconclusive as to whether there
is positive autocorrelation among the residuals.
The plot of the residuals versus time period shows some clustering of positive and
negative residuals for intervals in the domain, suggesting a nonlinear model might be
better. Otherwise, the model appears to be adequate.
The results are similar to those in (a)-(j).

13.35    (a)    t = b1 / sb1 = 4.5 / 1.5 = 3.00
(b)     With n = 18, df = 18 - 2 = 16. t16 = 2.1199
(c)     Reject H0. There is evidence that the fitted linear regression model is useful.
(d)     b1 - t16 sb1 ≤ β1 ≤ b1 + t16 sb1, 4.5 - 2.1199(1.5) ≤ β1 ≤ 4.5 + 2.1199(1.5),
1.32 ≤ β1 ≤ 7.68

13.36    (a)    MSR = SSR / p = 60 / 1 = 60
MSE = SSE / (n - p - 1) = 40 / 18 = 2.222
F = MSR / MSE = 60 / 2.222 = 27
(b)     F1,18 = 4.414
(c)     Reject H0. There is evidence that the fitted linear regression model is useful.

13.37   (a)    t = 4.65 > t10 = 2.2281 with 10 degrees of freedom for α = 0.05. Reject H0. There
is evidence that the fitted linear regression model is useful.
(b)    0.0386 ≤ β1 ≤ 0.1094

13.38   (a)    t = 13.65 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0.
There is evidence that the fitted linear regression model is useful.
(b)    0.0074 ≤ β1 ≤ 0.0101

13.39   (a)    p-value is virtually 0 < 0.05. Reject H0. There is evidence that the fitted linear
regression model is useful.
(b)    0.0276 ≤ β1 ≤ 0.0318

13.40   (a)    t = 8.65 > t28 = 2.0484 with 28 degrees of freedom for α = 0.05. Reject H0. There
is evidence that the fitted linear regression model is useful.
(b)    3.3073 ≤ β1 ≤ 5.3589

13.41   (a)    t = 7.74 > t23 = 2.0687 with 23 degrees of freedom for α = 0.05. Reject H0. There
is evidence that the fitted linear regression model is useful.
(b)    0.7805 ≤ β1 ≤ 1.3497

13.42   (a)    p-value = 7.26497E-06 < 0.05. Reject H0. There is evidence that the fitted linear
regression model is useful.
(b)    1.2463 ≤ β1 ≤ 2.7918

13.43          (a) For the Ford Motor Company, the estimated value of its stock will increase by 0.92%
on average when the S & P 500 index increases by 1%.
For the Houston Industries, the estimated value of its stock will increase by 0.43% on
average when the S & P 500 index increases by 1%.
For IBM, the estimated value of its stock will increase by 1.09% on average when the S
& P 500 index increases by 1%.
For LSI Logic, the estimated value of its stock will increase by 1.80% on average when
the S & P 500 index increases by 1%.
(b)    A stock is riskier than the market if the estimated absolute value of the beta exceeds one.
This can be used to gauge the volatility of a stock relative to how the market behaves
in general.

13.44   (a)    (% daily change in ULPIX) = b0 + 2.00 (% daily change in S&P 500 Index)
(b)    If the S&P gains 30% in a year, the ULPIX is expected to gain an estimated 60%.
(c)    If the S&P loses 35% in a year, the ULPIX is expected to lose an estimated 70%.
(d)    Since leveraged funds have higher volatility and, hence, higher risk than the market,
risk-averse investors should stay away from these funds. Risk takers, on the other hand,
will benefit from the higher potential gain from these funds.

13.45    (a)    r = -0.1641.
(b)    t = -0.4706, p-value = 0.6505 > 0.05. Do not reject H0. There is not enough evidence
to conclude that there is a significant linear relationship between the retail price and
the energy cost per year of medium-size top-freezer refrigerators.

13.46    (a)     r = 0.9656
(b)    The p-value of the t test is essentially zero. At the 0.05 level of significance, there
is a significant linear relationship between calories and fat content.
(c)    Yes, one would expect the ice creams with higher fat content to have more
calories.

13.47    (a)    When X = 2, Ŷ = 5 + 3X = 5 + 3(2) = 11
h = 1/n + (Xi - X̄)² / Σ(Xi - X̄)² = 1/20 + (2 - 2)²/20 = 0.05
95% confidence interval: Ŷ ± t18 sYX √h = 11 ± 2.1009(1)√0.05
10.53 ≤ μY|X ≤ 11.47
(b)    95% prediction interval: Ŷ ± t18 sYX √(1 + h) = 11 ± 2.1009(1)√1.05
8.847 ≤ YI ≤ 13.153

13.48    (a)    When X = 4, Ŷ = 5 + 3X = 5 + 3(4) = 17
h = 1/n + (Xi - X̄)² / Σ(Xi - X̄)² = 1/20 + (4 - 2)²/20 = 0.25
95% confidence interval: Ŷ ± t18 sYX √h = 17 ± 2.1009(1)√0.25
15.95 ≤ μY|X ≤ 18.05
(b)    95% prediction interval: Ŷ ± t18 sYX √(1 + h) = 17 ± 2.1009(1)√1.25
14.651 ≤ YI ≤ 19.349
(c)    The intervals in this problem are wider because the value of X is farther from X̄.

13.49    (a)    1.7867 ≤ μY|X ≤ 2.2964
(b)    1.3100 ≤ YI ≤ 2.7740
(c)    Part (b) provides an estimate for an individual response and Part (a) provides an
estimate for an average predicted value.

13.50    (a)    7.3664 ≤ μY|X ≤ 7.9549
(b)    6.5667 ≤ YI ≤ 8.7546
(c)    Part (b) provides an estimate for an individual response and Part (a) provides an
estimate for an average predicted value.

13.51   (a)     14.7150 ≤ μY|X ≤ 15.3701
(b)     13.5059 ≤ YI ≤ 16.5793
(c)     Part (b) provides an estimate for an individual response and Part (a) provides an
estimate for an average predicted value.

13.52   (a)     100.96 ≤ μY|X ≤ 138.77
(b)     20.1 ≤ YI ≤ 219.72
(c)     Part (b) provides an estimate for an individual response and Part (a) provides an
estimate for an average predicted value.

13.53   (a)     1153.0 ≤ μY|X ≤ 1331.5
(b)     829.9 ≤ YI ≤ 1654.6
(c)     Part (b) provides an estimate for an individual response and Part (a) provides an
estimate for an average predicted value.

13.54   (a)     116.7082 ≤ μY|X ≤ 178.0564
(b)     111.5942 ≤ YI ≤ 183.1704
(c)     Part (b) provides an estimate for an individual response and Part (a) provides an
estimate for an average predicted value.

13.55   The slope of the line b1 represents the estimated expected change in Y per unit change in X. It
represents the estimated average amount that Y changes (either positively or negatively) for a
particular unit change in X. The Y intercept b0 represents the estimated average value of Y
when X equals 0.

13.56   The coefficient of determination measures the proportion of variation in Y that is explained
by the independent variable X in the regression model.

13.57   The unexplained variation or error sum of squares (SSE) will be equal to zero only when the
regression line fits the data perfectly and the coefficient of determination equals 1.

13.58   The explained variation or regression sum of squares (SSR) will be equal to zero only when
there is no relationship between the Y and X variables, and the coefficient of determination
equals 0.

13.59   Unless a residual analysis is undertaken, you will not know whether the model fit is
appropriate for the data. In addition, residual analysis can be used to check whether the
assumptions of regression have been seriously violated.

13.60   The assumptions of regression are normality of error, homoscedasticity, and independence of
errors. The normality of error assumption can be evaluated by obtaining a histogram, box-
and-whisker plot, and/or normal probability plot of the residuals. The homoscedasticity
assumption can be evaluated by plotting the residuals on the vertical axis and the X variable
on the horizontal axis. The independence of errors assumption can be evaluated by plotting
the residuals on the vertical axis and the time order variable on the horizontal axis. This
assumption can also be evaluated by computing the Durbin-Watson statistic.

13.61    The Durbin-Watson statistic is a measure of the autocorrelation among the residuals. It
measures the correlation among consecutive residuals.
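For reference, the statistic is usually computed as follows (standard definition, added
for illustration; the notation e_i for the residual in time period i is assumed here):
D = Σ(e_i - e_(i-1))² / Σ(e_i)²
where the numerator is summed over i = 2, ..., n and the denominator over i = 1, ..., n.
D is close to 2 when consecutive residuals are uncorrelated and approaches 0 as positive
autocorrelation becomes stronger, which is why a small value such as D = 0.109 in 13.28
indicates strong positive autocorrelation.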

13.62    If the data in a regression analysis has been collected over time, then the assumption of
independence of errors needs to be evaluated using the Durbin-Watson statistic.

13.63    The confidence interval for the mean response estimates the average response for a given X
value. The prediction interval estimates the value for a single item or individual.

13.64    (a)    [Scatter diagram: Delivery Time (minutes) versus Number of Cases]

(b)     b0 = 24.84, b1 = 0.14
(c)     Ŷ = 24.84 + 0.14X, where X is the number of cases and Ŷ is the estimated
delivery time.
(d)     For each additional case, the estimated delivery time increases by 0.14 minutes.
24.84 is the portion of the estimated delivery time that is not affected by the number
of cases.
(e)     Ŷ = 24.84 + 0.14X = 24.84 + 0.14(150) = 45.84
(f)     No, 500 cases is outside the relevant range of the data used to fit the regression
equation.
(g)     r2 = 0.972. So, 97.2% of the variation in delivery time can be explained by the
variation in the number of cases.
(h)     Since b1 is positive, r = +√(r2) = +√0.972 = +0.986
(i)     sYX = 1.987
(j)     Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the number of cases, there is no pattern. The model appears
to be adequate.
(k)     t = 24.88 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0.
There is evidence that the fitted linear regression model is useful.
(l)     44.88 ≤ μY|X ≤ 46.80
(m)     41.56 ≤ YI ≤ 50.12
(n)     0.1282 ≤ β1 ≤ 0.1518

13.65   (a)   [Scatter diagram: horizontal axis Number of Incoming Calls]

(b)   b0 = –63.02, b1 = 0.189
(c)   Ŷ = –63.02 + 0.189X, where X is the number of incoming calls and Ŷ is the
estimated number of trade executions.
(d)   For each additional incoming call, the estimated number of trade executions increases by
0.189. –63.02 is the portion of the estimated number of trade executions that is not
affected by the number of incoming calls.
(e)   Ŷ = –63.02 + 0.189X = –63.02 + 0.189(2000) = 314.99
(f)   No, 5000 incoming calls is outside the relevant range of the data used to fit the
regression equation.
(g)   r2 = 0.630. So, 63.0% of the variation in trade executions can be explained by the
variation in the number of incoming calls.
(h)   Since b1 is positive, r = +√(r2) = +√0.63 = +0.794
(i)   sYX = 29.42
(j)   Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the number of incoming calls, there is no pattern. The model
appears to be adequate.
(k)   D = 1.96
(l)   D = 1.96 > 1.52. There is no evidence of positive autocorrelation. The model appears
to be adequate.
(m)   t = 7.50 > t33 = 2.0345 with 33 degrees of freedom for α = 0.05. Reject H0. There
is evidence that the fitted linear regression model is useful.
(n)   302.07 ≤ μY|X ≤ 327.91
(o)   253.76 ≤ YI ≤ 376.22
(p)   0.1377 ≤ β1 ≤ 0.2403

13.66    (a)    [Scatter diagram: Selling Price (\$000) versus Assessed Value (\$000)]

b0 = –44.172, b1 = 1.78171
(b)    For each additional dollar in assessed value, the estimated selling price increases by
\$1.78. –44.172 is the portion of the estimated selling price that is not affected by the
assessed value.
(c)    Ŷ = –44.172 + 1.78171X = –44.172 + 1.78171(70) = 80.548, or \$80,548
(d)    sYX = 3.475
(e)    r2 = 0.926. 92.6% of the variation in selling price can be explained by the variation in
the assessed value.
(f)    Since b1 is positive, r = +√(r2) = +√0.926 = +0.962
(g)    Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the assessed value, there is no pattern. The model appears to
be adequate.
(h)    t = 18.66 > t28 = 2.0484 with 28 degrees of freedom for α = 0.05. Reject H0.
There is evidence that the fitted linear regression model is useful.
(i)    78.707 ≤ μY|X ≤ 82.388
(j)    73.195 ≤ YI ≤ 87.900
(k)    1.5862 ≤ β1 ≤ 1.9773

13.67   (a)   [Scatter diagram: Assessed Value (\$000) versus Heating Area (thousands of square feet)]

b0 = 51.915, b1 = 16.633
(b)   For each additional 1000 square feet in heating area, the estimated assessed value
increases by \$16,633. \$51,915 is the portion of the estimated assessed value that is
not affected by heating area.
(c)   Ŷ = 51.915 + 16.633X = 51.915 + 16.633(1.75) = 81.024, or \$81,024
(d)   sYX = 2.919
(e)   r2 = 0.659. 65.9% of the variation in assessed value can be explained by the variation
in heating area.
(f)   Since b1 is positive, r = +√(r2) = +√0.659 = +0.812
(g)   Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the heating area, there is no pattern. The model appears to be
adequate.
(h)   t = 5.02 > t13 = 2.1604 with 13 degrees of freedom for α = 0.05. Reject H0. There
is evidence that the fitted linear regression model is useful.
(i)   79.279 ≤ μY|X ≤ 82.769
(j)   74.479 ≤ YI ≤ 87.569
(k)   9.469 ≤ β1 ≤ 23.797
(l)   b0 = 52.805, b1 = 15.849
For each additional 1000 square feet in heating area, the estimated assessed value
increases by \$15,849. \$52,805 is the portion of the estimated assessed value that is
not affected by heating area.
Ŷ = 52.805 + 15.849X = 52.805 + 15.849(1.75) = 80.541, or \$80,541
sYX = 2.598
r2 = 0.689. 68.9% of the variation in assessed value can be explained by the variation
in heating area.
Since b1 is positive, r = +√(r2) = +√0.689 = +0.83
Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the heating area, there is no pattern. The model appears to be
adequate.
t = 5.37 > t13 = 2.1604 with 13 degrees of freedom for α = 0.05. Reject H0. There
is evidence that the fitted linear regression model is useful.
78.987 ≤ μY|X ≤ 82.096
74.716 ≤ YI ≤ 86.367
9.471 ≤ β1 ≤ 22.227

13.68    (a)    [Scatter diagram: GPI versus GMAT Score]

b0 = 0.30, b1 = 0.00487
(b)           For each additional point on the GMAT score, the estimated GPI increases by
0.00487. 0.30 is the portion of the GPI that is not affected by the GMAT score.
(c)            ˆ
Y  0.30  0.00487 X  0.30  0.00487(600)  3.222 5
(d)           sYX  0.158
(e)           r2 = 0.793. 79.3% of the variation in the GPI can be explained by the
variation in the GMAT score.
(f)           Since b1 is positive, r   r 2   0.793  0.891
(g)           Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the GMAT score, there is no pattern. The model appears to
(h)           t  8.31  t18  2.1009 with 18 degrees of freedom for   0.05 . Reject H0. There
is evidence that the fitted linear regression model is useful.
(i)           3.144  YX  3.301
(j)           2.886  YI  3.559
(k)           0.00366  1  0.00608
(l)           b0 = 0.258, b1 = 0.00494
For each additional point on the GMAT score, the estimated GPI increases by
0.00494. 0.258 is the portion of the GPI that is not affected by the GMAT score.
Ŷ = 0.258 + 0.00494X = 0.258 + 0.00494(600) = 3.221
SYX = 0.147
r² = 0.820. 82.0% of the variation in the GPI can be explained by the variation in the
GMAT score.
Since b1 is positive, r = +√(r²) = +√0.82 = +0.906
Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the GMAT score, there is no pattern. The model appears to be adequate.
t = 9.06 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0. There
is evidence that the fitted linear regression model is useful.
3.147 ≤ μY|X ≤ 3.295
2.903 ≤ YI ≤ 3.539
0.00380 ≤ β1 ≤ 0.00609

13.69   (a)
[Scatter diagram: Completion Time (hours) versus Invoices Processed]

(b)   b0 = 0.4024, b1 = 0.012608
(c)   For each additional invoice processed, the estimated completion time increases by
0.012608 hours. 0.4024 is the portion of the estimated completion time that is not
affected by the number of invoices processed.
(d)   Ŷ = 0.4024 + 0.012608X = 0.4024 + 0.012608(150) = 2.2934
(e)   SYX = 0.3342
(f)   r² = 0.892. 89.2% of the variation in completion time can be explained by the
variation in the number of invoices processed.
(g)   Since b1 is positive, r = +√(r²) = +√0.892 = +0.945
(i)   Based on a visual inspection of the graphs of the distribution of studentized residuals
and the residuals versus the number of invoices, there is no pattern. The model
appears to be adequate.
(j)   D = 1.78
(k)   D = 1.78>1.49. There is no evidence of positive autocorrelation. The
(l)   t = 15.24 > t28 = 2.0484 with 28 degrees of freedom for α = 0.05. Reject H0.
There is evidence that the fitted linear regression model is useful.
(m)   2.1638 ≤ μY|X ≤ 2.4230
(n)   1.5966 ≤ YI ≤ 2.9902
13.70    (a)
[Scatter diagram: O-ring Damage Index versus Temperature (degrees F)]

The scatter plot does not show any clear relationship between atmospheric temperature
and O-ring damage.
(b),(f)

[Chart for (b) and (f): O-ring Damage Index versus Temperature (degrees F)]

(c)       In (b), there are 16 observations with an O-ring damage index of 0 across a variety of
temperatures. If one concentrates on these observations with no O-ring damage, there
is obviously no relationship between O-ring damage index and temperature. If all
observations are used, the observations with no O-ring damage will bias the
estimated relationship. If the intention is to investigate the relationship between the
degree of O-ring damage and atmospheric temperature, it makes sense to focus only
on the flights in which there was O-ring damage.
(d)       A prediction should not be made for an atmospheric temperature of 31°F because it is
outside the range of the temperature variable in the data. Such a prediction would
involve extrapolation, which assumes that the relationship between the two variables
continues to hold outside the observed range of the temperature variable.
(e)       Ŷ = 18.036 - 0.234X
(g)       A nonlinear model is more appropriate for these data.
(h)       [Residual plot: Residuals versus Temperature]

The string of negative residuals and positive residuals that lie on a straight line
with a positive slope in the lower-right corner of the plot is a strong indication
that a nonlinear model should be used if all 23 observations are to be used in
the fit.

13.71   (a)
[Scatter diagram: 1999 Gross Profits (in millions) versus Page Views (monthly visitors in thousands)]

If the outlier (Amazon.com) in the upper right corner of the scatter diagram is
removed, there is not an obvious linear relationship between page views and gross
profits.
(b)   Ŷ = 1.354 + 0.154X
(c)   Since all the companies in the data are internet companies that rely on online
customers for their business, it is not meaningful to interpret the estimated intercept,
which corresponds to having no online visitors at all. The estimated slope coefficient
b1 = 0.154 means that for each additional one thousand monthly visitors, the average
gross profit of a company is estimated to increase by 0.154 million dollars.
(d)    0.0553 ≤ β1 ≤ 0.2524. In the long run, 95% of all the confidence intervals that are
constructed for the slope parameter will contain the true value of the slope parameter.
Since the interval does not contain 0, we are 95% confident that there is a
significant linear relationship between page views and gross profits.
(e)    r² = 0.6183. 61.83% of the total variation in gross profits can be explained by the
variation in the number of monthly visitors.
(f)    SYX = 13.4245. The standard error of the estimate measures the variability of the
observed values of the dependent variable around the least-squares regression line;
it is the square root of the sum of squared residuals divided by n - 2.
(g)    The outliers are Amazon.com, Cheap Tickets, and About.com.
(h)    (a)
[Scatter diagram: 1999 Gross Profits (in millions) versus Page Views (monthly visitors in thousands)]

With Amazon.com removed from the data, there is no obvious relationship
between page views and gross profits.
(b)    Ŷ = 12.598 - 0.0637X
(c)    Since all the companies in the data are internet companies that rely on online
customers for their business, it is not meaningful to interpret the estimated intercept,
which corresponds to having no online visitors at all. The estimated slope
coefficient b1 = -0.0637 means that for each additional one thousand
monthly visitors, the average gross profit of a company is
estimated to decrease by 0.0637 million dollars.
(d)    -0.1691 ≤ β1 ≤ 0.0417. In the long run, 95% of all the confidence
intervals that are constructed for the slope parameter will contain the true
value of the slope parameter. Since the interval contains 0, we cannot
conclude that there is a significant linear relationship between page views and
gross profits at the 95% level of confidence.
(e)    r² = 0.2259. 22.59% of the total variation in gross profits can be explained
by the variation in the number of monthly visitors.
(f)    SYX = 6.262. The standard error of the estimate measures the variability of the
observed values of the dependent variable around the least-squares regression line;
it is the square root of the sum of squared residuals divided by n - 2.
(g)    The outliers are Cheap Tickets and eToys.com.
The exclusion of Amazon.com changes the estimated slope coefficient from
positive to negative. Amazon.com is therefore extremely influential on the least-squares
regression estimates; such an observation is called an influential point in regression analysis.

13.72   (a)
[Scatter diagram: Weight (grams) versus Circumference (cms.)]

(b)   Ŷ = -2629.222 + 82.4717X
(c)   For each additional cm in circumference, the estimated average
weight of a pumpkin will increase by 82.4717 grams.
(d)   Ŷ = -2629.222 + 82.4717(60) = 2319.080 grams.
(e)   There appears to be a positive relationship between the weight and circumference of a
pumpkin. It is a good idea for the farmer to sell pumpkins by circumference instead of
by weight, because circumference is a good predictor of weight and is much easier to
measure than weight.
(f)   r² = 0.9373. 93.73% of the variation in pumpkin weight can be explained by the
variation in circumference.
(g)   SYX = 277.7495.
(h)
[Residual plot: Residuals versus Circumference]

There appears to be a nonlinear relationship between circumference and weight.
(i)            The p-value is virtually 0. Reject H0. There is sufficient evidence to conclude that there
is a linear relationship between the circumference and the weight of a pumpkin.
(j)            72.7875 ≤ β1 ≤ 92.1559
(k)            2186.9589 ≤ μY|X ≤ 2451.2020
(l)            1726.5508 ≤ YI ≤ 2911.6101

13.73    (a)
[Scatter diagram: Wins versus E.R.A.]

(b)            Ŷ = 152.8097 - 15.0927X
(c)            b0 = 152.8097. For a team that has an E.R.A. of 0, the estimated average number of
wins is 152.81. For each additional unit increase in team E.R.A., the estimated
average number of wins decreases by 15.09.
(d)            Ŷ = 152.8097 - 15.0927(4.5) = 84.89
(e)            SYX = 7.6363.
(f)            r² = 0.4354. So, 43.54% of the variation in number of wins can be explained by the
variation in the team E.R.A.
(g)            Since b1 is negative, r = -√(r²) = -√0.4354 = -0.6599

```
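The recurring calculations in these solutions (a point prediction from the fitted line, the sign of r taken from the slope, and the check of whether a confidence interval for the slope contains 0) can be sketched in Python as below. This is only an illustrative sketch: the function names `predict`, `correlation_from_r2`, and `slope_is_significant` are hypothetical helpers and are not part of the textbook or of any library, and the inputs are the rounded coefficients printed in the solutions, so results may differ from the printed answers in the last digit.

```python
import math


def predict(b0, b1, x):
    """Point prediction from a fitted simple linear regression: Y-hat = b0 + b1 * x."""
    return b0 + b1 * x


def correlation_from_r2(r2, b1):
    """Correlation coefficient: square root of r-squared, carrying the sign of the slope b1."""
    return math.copysign(math.sqrt(r2), b1)


def slope_is_significant(lower, upper):
    """True when the confidence interval for the slope excludes 0."""
    return not (lower <= 0.0 <= upper)


# 13.72 (d): estimated pumpkin weight at a circumference of 60 cm
print(round(predict(-2629.222, 82.4717, 60), 2))

# 13.73 (d): estimated wins at an E.R.A. of 4.5
print(round(predict(152.8097, -15.0927, 4.5), 2))

# 13.73 (g): r is negative because the slope is negative
print(correlation_from_r2(0.4354, -15.0927) < 0)

# 13.71 (d): with Amazon.com the interval excludes 0; without it the interval contains 0
print(slope_is_significant(0.0553, 0.2524))
print(slope_is_significant(-0.1691, 0.0417))
```

Using the rounded coefficients keeps the sketch tied to the printed answers; reproducing the confidence and prediction intervals would require the raw data, which are not given here.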