
									                                              Solutions to End-of-Section and Chapter Review Problems   297




                                                          CHAPTER 13


13.1   (a)     When X = 0, the estimated expected value of Y is 2.
       (b)     For each increase of 1 unit in the value of X, we can expect an estimated increase
               of 5 units in the value of Y.
       (c)     \( \hat{Y} = 2 + 5X = 2 + 5(3) = 17 \)
       (d) yes, (e) no, (f) no, (g) yes, (h) no

13.2   (a)     When X = 0, the estimated expected value of Y is 16.
       (b)     For each increase of 1 unit in the value of X, we can expect an estimated decrease
               of 0.5 units in the value of Y.
       (c)     \( \hat{Y} = 16 - 0.5X = 16 - 0.5(6) = 13 \)
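
The point predictions in 13.1(c) and 13.2(c) can be checked with a minimal sketch. The function name `predict` is hypothetical and chosen for illustration; the coefficients are the ones stated in the two problems.

```python
def predict(b0, b1, x):
    """Return the predicted value of Y on the fitted line b0 + b1 * x."""
    return b0 + b1 * x


# Problem 13.1(c): intercept 2, slope 5, X = 3
y_13_1 = predict(2, 5, 3)
# Problem 13.2(c): intercept 16, slope -0.5, X = 6
y_13_2 = predict(16, -0.5, 6)

print(y_13_1, y_13_2)
```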

13.3   (a)     [Scatter diagram: Weekly Sales, Y (vertical axis) versus Shelf Space, X (horizontal axis)]

       (b),(c) For each increase in shelf space of an additional foot, there is an expected increase in
               weekly sales of an estimated 0.074 hundreds of dollars, or $7.40.
       (d)     \( \hat{Y} = 1.45 + 0.074X = 1.45 + 0.074(8) = 2.042 \), or $204.20
       (e)     \( b_0 = 1.5333 \), \( b_1 = 0.064 \)
               For each increase in shelf space of an additional foot, there is an expected increase in
               weekly sales of an estimated 0.064 hundreds of dollars, or $6.40.
               \( \hat{Y} = 1.5333 + 0.064X = 1.5333 + 0.064(8) = 2.0453 \), or $204.53
       (f)     The best allocation to pet food depends on the profit made per foot of shelf space.
               The expected weekly sales (and profits) per foot of shelf space actually decline as
               the amount of allocated shelf space increases from 5 to 20 feet. However, if the
               profitability is still high enough, it will be worthwhile to assign more shelf space
               to pet food.
298    Chapter 13: Simple Linear Regression


13.4    (a)     [Scatter diagram: Weekly Sales, Y (vertical axis) versus Customers, X (horizontal axis)]

        (b),(c) For each increase of one additional customer, there is an expected increase in weekly
                sales of an estimated 0.00873 thousands of dollars, or $8.73.
        (d)     \( \hat{Y} = 2.423 + 0.00873X = 2.423 + 0.00873(600) = 7.661 \), or $7661
        (e)     \( b_0 = 1.578 \), \( b_1 = 0.01009 \)
                For each increase of one additional customer, there is an expected increase in
                weekly sales of an estimated 0.01009 thousands of dollars, or $10.09.
                \( \hat{Y} = 1.578 + 0.01009X = 1.578 + 0.01009(600) = 7.632 \), or $7632
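
The coefficients b0 and b1 reported in 13.3 and 13.4 come from the least-squares method. The sketch below shows the usual computation, b1 = SSXY / SSX and b0 = (mean of Y) minus b1 times (mean of X). The data set is hypothetical and chosen so that the points lie exactly on Y = 1 + 2X; it is not the textbook data, and the function name `least_squares` is an illustrative choice.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the simple linear regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ssx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data lying exactly on Y = 1 + 2X (not from the textbook)
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = least_squares(xs, ys)
print(b0, b1)
```

Dropping or adding observations, as in part (e) of both problems, changes the means and the sums of squares, which is why b0 and b1 change.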

13.5    (a)     [Scatter diagram: # of Orders (vertical axis) versus Weight (lbs.) (horizontal axis)]


        (b)     \( \hat{Y} = 0.1912 + 0.0297X \)
        (c)     For each increase of one additional pound, the estimated average number of orders
                will increase by 0.0297.
        (d)     \( \hat{Y} = 0.1912 + 0.0297(500) = 15.043 \)


13.6   (a)     [Scatter plot: Gross ($millions) (vertical axis) versus Video Units Sold (horizontal axis)]
       (b),(c) \( \hat{Y} = 76.54 + 4.3331X \)
       (d)     For each increase of 1 million dollars in box office gross, expected home video units
               sold are estimated to increase by 4.3331 thousand, or 4333.1 units. 76.54 represents
               the portion of thousands of home video units that are not affected by box office gross.
       (e)     \( \hat{Y} = 76.54 + 4.3331X = 76.54 + 4.3331(20) = 163.202 \), or 163,202 units.
       (f)     Some other factors that might be useful in predicting video unit sales are (i) the
               number of days the movie was screened, (ii) the rating of the movie by critics, and
               (iii) the amount spent on advertising the video release.

13.7   (a)     [Scatter plot: Monthly Rent ($) (vertical axis) versus Size (square feet) (horizontal axis)]
       (b),(c) \( \hat{Y} = 177.1 + 1.065X \)


         (d)    For each increase of 1 square foot in space, the expected monthly rental is estimated
                to increase by $1.065. 177.1 represents the portion of apartment monthly rental that is
                not affected by square footage.
         (e)    \( \hat{Y} = 177.1 + 1.065X = 177.1 + 1.065(1000) = 1242.10 \), or $1242.10
         (f)    An apartment with 500 square feet is outside the relevant range for the
                independent variable.
         (g)    The apartment with 1200 square feet has the more favorable rent relative to size.
                Based on the regression equation, a 1200 square foot apartment would have an
                expected monthly rent of $1455.10, while a 1000 square foot apartment would have
                an expected monthly rent of $1242.10.
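
The comparison in 13.7(g) can be reproduced with a short sketch. Reading "more favorable rent relative to size" as a lower predicted rent per square foot is an assumption made here for illustration; the coefficients are those of the fitted equation in (b),(c), and the names `predicted_rent`, `per_sqft_1000` and `per_sqft_1200` are hypothetical.

```python
def predicted_rent(size_sqft, b0=177.1, b1=1.065):
    """Predicted monthly rent ($) from the fitted line in 13.7(b),(c)."""
    return b0 + b1 * size_sqft


rent_1000 = predicted_rent(1000)
rent_1200 = predicted_rent(1200)

# Assumed measure of "rent relative to size": predicted dollars per square foot
per_sqft_1000 = rent_1000 / 1000
per_sqft_1200 = rent_1200 / 1200

print(round(rent_1000, 2), round(rent_1200, 2))
print(per_sqft_1000 > per_sqft_1200)
```

Because the intercept is positive, the predicted rent per square foot falls as size increases, which is consistent with the answer given in (g).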

13.8     (a)    [Scatter plot: Tensile Strength (lbs/square in.) (vertical axis) versus Hardness (Rockwell E units) (horizontal axis)]


         (b)    \( \hat{Y} = 6.0483 + 2.0191X \)
         (c)    For each increase of one additional Rockwell E unit in hardness, the estimated
                average tensile strength will increase by 2.0191 pounds per square inch.
         (d)    \( \hat{Y} = 6.0483 + 2.0191(70) = 147.382 \) pounds per square inch.

13.9            80% of the variation in the dependent variable can be explained by the variation in
                the independent variable.

13.10           SST = 40 and \( r^2 = 0.90 \). So, 90% of the variation in the dependent variable can be
                explained by the variation in the independent variable.

13.11           \( r^2 = 0.75 \). So, 75% of the variation in the dependent variable can be explained by the
                variation in the independent variable.

13.12           \( r^2 = 0.667 \). So, 66.7% of the variation in the dependent variable can be explained by
                the variation in the independent variable.

13.13           Since SST = SSR + SSE and since SSE cannot be a negative number, SST must be at
                least as large as SSR.
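
The identity SST = SSR + SSE used in 13.13, and the coefficient of determination r^2 = SSR / SST used in 13.9 through 13.12, can be illustrated with a minimal sketch. The five data points are hypothetical (not from the textbook), and the function name `sums_of_squares` is an illustrative choice.

```python
def sums_of_squares(xs, ys):
    """Return (SST, SSR, SSE) for the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # explained variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # unexplained variation
    return sst, ssr, sse


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(xs, ys)
r_squared = ssr / sst
print(sst, ssr, sse, r_squared)
```

Since SSE is a sum of squares, it cannot be negative, so SSR can never exceed SST and r^2 always lies between 0 and 1.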


13.14   (a)     \( r^2 = 0.684 \). So, 68.4% of the variation in the dependent variable can be explained by
                the variation in the independent variable.
        (b)     \( s_{YX} = 0.308 \)
        (c)     Based on (a) and (b), the model should be very useful for predicting sales.

13.15   (a)     \( r^2 = 0.912 \). So, 91.2% of the variation in the dependent variable can be explained by
                the variation in the independent variable.
        (b)     \( s_{YX} = 0.5015 \)
        (c)     Based on (a) and (b), the model should be very useful for predicting sales.

13.16   (a)     \( r^2 = 0.9731 \). So, 97.31% of the variation in the dependent variable can be explained
                by the variation in the independent variable.
        (b)     \( s_{YX} = 0.7258 \)
        (c)     Based on (a) and (b), the model should be very useful for predicting the number of
                orders.

13.17   (a)     \( r^2 = 0.728 \). So, 72.8% of the variation in the dependent variable can be explained by
                the variation in the independent variable.
        (b)     \( s_{YX} = 47.87 \)
        (c)     Based on (a) and (b), the model should be very useful for predicting sales.

13.18   (a)     \( r^2 = 0.723 \). So, 72.3% of the variation in the dependent variable can be explained by
                the variation in the independent variable.
        (b)     \( s_{YX} = 194.6 \)
        (c)     Based on (a) and (b), the model should be very useful for predicting monthly rent.

13.19   (a)     \( r^2 = 0.4613 \). So, 46.13% of the variation in the dependent variable can be explained
                by the variation in the independent variable.
        (b)     \( s_{YX} = 9.0616 \)
        (c)     Based on (a) and (b), the model is only marginally useful for predicting tensile
                strength.
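
The standard error of the estimate reported in part (b) of 13.14 through 13.19 is the square root of SSE / (n - 2) in simple linear regression. A minimal sketch follows; the SSE and n values are hypothetical, since the underlying sums of squares are not given in these solutions, and the function name is an illustrative choice.

```python
import math


def standard_error_of_estimate(sse, n):
    """Return s_YX = sqrt(SSE / (n - 2)) for a simple linear regression."""
    if n <= 2:
        raise ValueError("need more than two observations")
    return math.sqrt(sse / (n - 2))


# Hypothetical values, for illustration only
s_yx = standard_error_of_estimate(sse=2.4, n=5)
print(round(s_yx, 4))
```

The statistic is measured in the units of Y, so whether a given value is "small" depends on the scale of the dependent variable in each problem.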

13.20           A residual analysis of the data indicates no apparent pattern. The assumptions of
                regression appear to be met.

13.21           A residual analysis of the data indicates a pattern, with sizeable clusters of
                consecutive residuals that are either all positive or all negative. This appears to
                violate the assumption of independence of errors.

13.22   (a)-(b) Based on a residual analysis, the model appears to be adequate.

13.23   (a)-(b) Based on a residual analysis of the studentized residuals versus customers, the model
                appears to be adequate.


13.24    (a)     [Residual plot: Residuals versus Weight]
                 [Normal probability plot: Residuals versus Z Value]

                 The residual plot does not reveal any obvious pattern. So a linear fit appears to be
                 adequate.
         (b)     The residual plot does not reveal any possible violation of the homoscedasticity
                 assumption. These are not time-series data, so we do not need to evaluate the
                 independence assumption. The normal probability plot shows that the distribution
                 has a thicker left tail than a normal distribution, but there is no sign of severe
                 skewness.

13.25    (a)-(b) Based on a residual analysis of the studentized residuals versus box office gross, the
                 model appears to be adequate.

13.26    (a)-(b) Based on a residual analysis of the studentized residuals versus size, the model
                 appears to be adequate.


13.27   (a)   [Residual plot: Residuals versus Hardness]

              The residual plot does not reveal any obvious pattern. So a linear fit appears to be
              adequate.
        (b)   [Normal probability plot: Residuals versus Z Value]

              The residual plot does not reveal any possible violation of the homoscedasticity
              assumption. These are not time-series data, so we do not need to evaluate the
              independence assumption. The normal probability plot shows that the distribution
              has a slightly thinner right tail than a normal distribution, but there is no sign of
              severe skewness.

13.28   (a)   An increasing linear relationship exists.
        (b)   D = 0.109
        (c)   There is strong positive autocorrelation among the residuals.

13.29   (a)   There is no apparent pattern in the residuals over time.
        (b)   D = 1.661 > 1.36. There is no evidence of positive autocorrelation among the
              residuals.
        (c)   The data are not positively autocorrelated.
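
The Durbin-Watson statistic D in 13.28 and 13.29 is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it for two hypothetical residual series (not the textbook data): one with long runs of the same sign, one that alternates in sign. The function name `durbin_watson` is an illustrative choice.

```python
def durbin_watson(residuals):
    """Return D = sum((e_i - e_{i-1})^2) / sum(e_i^2) for time-ordered residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, for illustration only
clustered = [1, 1, 1, -1, -1, -1]      # runs of same-signed residuals
alternating = [1, -1, 1, -1, 1, -1]    # sign changes every period

d_clustered = durbin_watson(clustered)
d_alternating = durbin_watson(alternating)
print(d_clustered, d_alternating)
```

Clustered residuals give a value well below 2, the pattern associated with positive autocorrelation, while alternating residuals push D above 2.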

13.30   (a)   No, since the data have been collected for a single period for a set of stores.
        (b)   If a single store was studied over a period of time and the amount of shelf space
              varied over time, computation of the Durbin-Watson statistic would be necessary.


13.31    (a)     [Scatter plot: Kilowatt Usage (vertical axis) versus Atmospheric Temperature (degree F) (horizontal axis)]

         (b)     \( b_0 = 169.455 \), \( b_1 = -1.8579 \)
         (c)     For each increase of one degree Fahrenheit in temperature, the expected average
                 kilowatt usage is estimated to decrease by 1.8579.
         (d)     \( \hat{Y} = 169.455 - 1.8579X = 169.455 - 1.8579(50) = 76.56 \)
         (e)     \( r^2 = 0.894 \). So, 89.4% of the variation in average kilowatt usage can be explained by
                 the variation in the average temperature.
         (f)     \( s_{YX} = 11.63 \)
         (g)     [Residual plot: Residuals versus Temperature]


        (h)   [Plot: Residuals versus Time Period]
        (i)   D = 1.18 < 1.27. There is evidence of positive autocorrelation among the residuals.
        (j)   The plot of the residuals versus temperature indicates that positive residuals tend to
              occur for the lowest and highest temperatures in the data set. A nonlinear model
              might be more appropriate. The evidence of positive autocorrelation is another reason
              to question the validity of the model.

13.32   (a)   [Scatter diagram: Cost ($000) (vertical axis) versus # of orders (horizontal axis)]



        (b)   \( b_0 = 0.458 \), \( b_1 = 0.0161 \)
        (c)   For each increase of one order, the expected distribution cost is estimated to increase
              by 0.0161 thousand dollars, or $16.10.
        (d)   \( \hat{Y} = 0.458 + 0.0161X = 0.458 + 0.0161(4500) = 72.908 \), or $72,908
        (e)   \( r^2 = 0.844 \). So, 84.4% of the variation in distribution cost can be explained by the
              variation in the number of orders.
        (f)   \( s_{YX} = 5.218 \)


         (g)    [Residual plot: Residuals versus Orders]
         (h)    [Plot: Residuals versus Time Period]
         (i)    D = 2.08 > 1.45. There is no evidence of positive autocorrelation among the residuals.
         (j)    Based on a residual analysis, the model appears to be adequate.
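
The comparisons in part (i) of 13.31 and 13.32 follow the usual Durbin-Watson decision rule for positive autocorrelation: D below the lower critical value is evidence of it, D above the upper critical value is no evidence of it, and anything in between is inconclusive. The sketch below encodes that rule. Pairing the lower value 1.27 (quoted in 13.31) with the upper value 1.45 (quoted in 13.32) is an assumption made for illustration; in practice both critical values come from a Durbin-Watson table for the given sample size. The function name and the value D = 1.35 are hypothetical.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic for a test of positive autocorrelation."""
    if d < d_lower:
        return "evidence of positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"


# Critical values assumed to belong together (see the note above)
D_LOWER, D_UPPER = 1.27, 1.45

result_13_31 = dw_decision(1.18, D_LOWER, D_UPPER)   # D from 13.31(i)
result_13_32 = dw_decision(2.08, D_LOWER, D_UPPER)   # D from 13.32(i)
result_between = dw_decision(1.35, D_LOWER, D_UPPER) # hypothetical D
print(result_13_31, result_13_32, result_between, sep="\n")
```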


13.33   (a)   [Scatter diagram: Gasoline Price (cents/gal.) (vertical axis) versus Crude Oil ($/bbl.) (horizontal axis)]

        (b)   \( \hat{Y} = 42.8798 + 2.6573X \)
        (c)   For each increase of $1/bbl. in the crude oil price, the estimated average gasoline price
              will increase by 2.6573 cents/gallon.
        (d)   \( \hat{Y} = 42.8798 + 2.6573(20) = 96.03 \) cents/gallon.
        (e)   \( r^2 = 0.7117 \). So, 71.17% of the variation in gasoline price can be explained by the
              variation in crude oil price.
        (f)   \( s_{YX} = 12.32 \)
        (g)   [Residual plot: Residuals versus Crude Oil Price]


         (h)    [Plot: Residuals versus Time Period]
         (i)    D = 0.5915 < 1.27. There is evidence of positive autocorrelation.
         (j)    [Normal probability plot: Residuals versus Z Value]

                According to the residual plot against crude oil price, a nonlinear model is more
                appropriate. The plot of residuals versus time period, along with the Durbin-Watson
                statistic, suggests that there is strong evidence of positive autocorrelation. The normal
                probability plot indicates that the distribution has thinner tails than a normal
                distribution, but there is no sign of severe skewness.


13.34   (a)   [Scatter diagram: Sales Per Store ($000) (vertical axis) versus Temperature (Degree F) (horizontal axis)]

        (b)   \( b_0 = -2.535 \), \( b_1 = 0.060728 \)
        (c)   For each increase of one degree Fahrenheit in the high temperature, expected sales
              are estimated to increase by 0.060728 thousand dollars, or $60.73.
        (d)   \( \hat{Y} = -2.535 + 0.060728X = -2.535 + 0.060728(83) = 2.5054 \), or $2505.40
        (e)   \( r^2 = 0.94 \). So, 94% of the variation in sales per store can be explained by the variation
              in the daily high temperature.
        (f)   \( s_{YX} = 0.1461 \)
        (g)   [Residual plot: Residuals versus Temperature]


         (h)     [Plot: Residuals versus Time Period]
         (i)     D = 1.64 > 1.42. There is no evidence of positive autocorrelation among the residuals.
         (j)     The plot of the residuals versus time period shows some clustering of positive and
                 negative residuals for intervals in the domain, suggesting a nonlinear model might be
                 better. Otherwise, the model appears to be adequate.
         (k)     \( b_0 = -2.6281 \), \( b_1 = 0.061713 \)
                 For each increase of one degree Fahrenheit in the high temperature, expected sales
                 are estimated to increase by 0.061713 thousand dollars, or $61.71.
                 \( \hat{Y} = -2.6281 + 0.061713X = -2.6281 + 0.061713(83) = 2.4941 \), or $2494.10
                 \( r^2 = 0.929 \). So, 92.9% of the variation in sales per store can be explained by the
                 variation in the daily high temperature.
                 \( s_{YX} = 0.1623 \)
                 D = 1.24. The test of the Durbin-Watson statistic is inconclusive as to whether there
                 is positive autocorrelation among the residuals.
                 The plot of the residuals versus time period shows some clustering of positive and
                 negative residuals for intervals in the domain, suggesting a nonlinear model might be
                 better. Otherwise, the model appears to be adequate.
                 The results are similar to those in (a)-(j).

13.35    (a)     \( t = b_1 / s_{b_1} = 4.5 / 1.5 = 3.00 \)
         (b)     With n = 18, df = 18 – 2 = 16. \( t_{16} = 2.1199 \)
         (c)     Reject H0. There is evidence that the fitted linear regression model is useful.
         (d)     \( b_1 - t_{16} s_{b_1} \le \beta_1 \le b_1 + t_{16} s_{b_1} \),
                 \( 4.5 - 2.1199(1.5) \le \beta_1 \le 4.5 + 2.1199(1.5) \),
                 \( 1.32 \le \beta_1 \le 7.68 \)
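
The arithmetic in 13.35 can be checked with a short sketch. The slope 4.5, its standard error 1.5 and the critical value 2.1199 are taken from the solution above; the critical value is supplied as a constant rather than looked up, and the function name is a hypothetical choice.

```python
def slope_t_and_interval(b1, s_b1, t_crit):
    """Return the t statistic for the slope and its confidence interval."""
    t_stat = b1 / s_b1
    half_width = t_crit * s_b1
    return t_stat, (b1 - half_width, b1 + half_width)


# Values from 13.35: b1 = 4.5, standard error 1.5, critical t with 16 df
t_stat, (lower, upper) = slope_t_and_interval(4.5, 1.5, 2.1199)
reject_h0 = abs(t_stat) > 2.1199

print(t_stat, round(lower, 2), round(upper, 2), reject_h0)
```

The interval excludes 0, which agrees with the decision in (c) to reject H0.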

13.36    (a)     \( MSR = SSR / p = 60 / 1 = 60 \)
                 \( MSE = SSE / (n - p - 1) = 40 / 18 = 2.222 \)
                 \( F = MSR / MSE = 60 / 2.222 = 27 \)
         (b)     \( F_{1,18} = 4.414 \)
         (c)     Reject H0. There is evidence that the fitted linear regression model is useful.
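
The F test in 13.36 can be reproduced as a minimal sketch. SSR = 60, SSE = 40 and the critical value 4.414 come from the solution; n = 20 is inferred from the 18 error degrees of freedom with p = 1, and the function name is an illustrative choice.

```python
def f_statistic(ssr, sse, n, p=1):
    """Return (MSR, MSE, F) for a regression with p independent variables."""
    msr = ssr / p
    mse = sse / (n - p - 1)
    return msr, mse, msr / mse


# n = 20 is inferred from n - p - 1 = 18 with p = 1
msr, mse, f_stat = f_statistic(ssr=60, sse=40, n=20)
reject_h0 = f_stat > 4.414  # critical value quoted in (b)

print(msr, round(mse, 3), round(f_stat, 2), reject_h0)
```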


13.37   (a)    t  4.65  t10  2.2281 with 10 degrees of freedom for   0.05 . Reject H0. There
               is evidence that the fitted linear regression model is useful.
        (b)    0.0386  1  0.1094

13.38   (a)    t = 13.65 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0.
               There is evidence that the fitted linear regression model is useful.
        (b)    0.0074 ≤ β1 ≤ 0.0101

13.39   (a)    p-value is virtually 0 < 0.05. Reject H0. There is evidence that the fitted linear
               regression model is useful.
        (b)    0.0276 ≤ β1 ≤ 0.0318

13.40   (a)    t = 8.65 > t28 = 2.0484 with 28 degrees of freedom for α = 0.05. Reject H0. There
               is evidence that the fitted linear regression model is useful.
        (b)    3.3073 ≤ β1 ≤ 5.3589

13.41   (a)    t = 7.74 > t23 = 2.0687 with 23 degrees of freedom for α = 0.05. Reject H0. There
               is evidence that the fitted linear regression model is useful.
        (b)    0.7805 ≤ β1 ≤ 1.3497

13.42   (a)    p-value = 7.26497E-06 < 0.05. Reject H0. There is evidence that the fitted linear
               regression model is useful.
        (b)    1.2463 ≤ β1 ≤ 2.7918

13.43   (a)    For Ford Motor Company, the estimated value of its stock will increase by 0.92%
               on average when the S&P 500 index increases by 1%.
               For Houston Industries, the estimated value of its stock will increase by 0.43% on
               average when the S&P 500 index increases by 1%.
               For IBM, the estimated value of its stock will increase by 1.09% on average when the
               S&P 500 index increases by 1%.
               For LSI Logic, the estimated value of its stock will increase by 1.80% on average when
               the S&P 500 index increases by 1%.
        (b)    A stock is riskier than the market if the absolute value of its estimated beta exceeds one.
               Beta can be used to gauge the volatility of a stock relative to how the market behaves
               in general.
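
The rule in part (b) can be applied to the four betas from part (a). A minimal sketch follows; the function name riskier_than_market is illustrative.

```python
# Estimated betas reported in 13.43(a)
betas = {
    "Ford Motor Company": 0.92,
    "Houston Industries": 0.43,
    "IBM": 1.09,
    "LSI Logic": 1.80,
}


def riskier_than_market(betas):
    """Return the stocks whose |beta| exceeds 1, the rule stated in 13.43(b)."""
    return [name for name, beta in betas.items() if abs(beta) > 1]


print(riskier_than_market(betas))
```

Under this rule only IBM and LSI Logic are classified as riskier than the market.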

13.44   (a)    (% daily change in ULPIX) = b0 + 2.00 (% daily change in S&P 500 Index)
        (b)    If the S&P gains 30% in a year, the ULPIX is expected to gain an estimated 60%.
        (c)    If the S&P loses 35% in a year, the ULPIX is expected to lose an estimated 70%.
        (d)    Since leveraged funds have higher volatility and, hence, higher risk than the market,
               risk-averse investors should stay away from these funds. Risk takers, on the other hand,
               can benefit from the higher potential gain from these funds.

13.45    (a)    r = -0.1641.
         (b)    t = -0.4706, p-value = 0.6505 > 0.05. Do not reject H0. There is not enough evidence
                to conclude that there is a significant linear relationship between the retail price and
                the energy cost per year of medium-size top-freezer refrigerators.
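
The t statistic in part (b) is tied to r through t = r / √((1 – r²)/(n – 2)). The sketch below illustrates that link. The sample size n = 10 is an assumption, back-solved from the reported r and t; it is not stated in the solution.

```python
import math


def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


# r is taken from 13.45(a).
# n = 10 is an ASSUMPTION (back-solved from the reported t); the source does not give n.
t_stat = t_from_r(r=-0.1641, n=10)
print(f"t = {t_stat:.4f}")
```

With these inputs the result agrees with the reported t = -0.4706 to about three decimal places; the small difference comes from the rounding of r.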

13.46    (a)     r = 0.9656
         (b)    The p-value of the t test is essentially zero. At the 0.05 level of significance, there
                is a significant linear relationship between calories and fat content.
         (c)    Yes, one would expect the ice creams with higher fat content to have more
                calories.

13.47    (a)    When X = 2, Ŷ = 5 + 3X = 5 + 3(2) = 11
                h = 1/n + (Xi – X̄)² / Σ(Xi – X̄)² = 1/20 + (2 – 2)²/20 = 0.05
                95% confidence interval: Ŷ ± t18 SYX √h = 11 ± 2.1009(1)√0.05
                10.53 ≤ μY|X ≤ 11.47
         (b)    95% prediction interval: Ŷ ± t18 SYX √(1 + h) = 11 ± 2.1009(1)√1.05
                8.847 ≤ YI ≤ 13.153

13.48    (a)    When X = 4, Ŷ = 5 + 3X = 5 + 3(4) = 17
                h = 1/n + (Xi – X̄)² / Σ(Xi – X̄)² = 1/20 + (4 – 2)²/20 = 0.25
                95% confidence interval: Ŷ ± t18 SYX √h = 17 ± 2.1009(1)√0.25
                15.95 ≤ μY|X ≤ 18.05
         (b)    95% prediction interval: Ŷ ± t18 SYX √(1 + h) = 17 ± 2.1009(1)√1.25
                14.651 ≤ YI ≤ 19.349
         (c)    The intervals in this problem are wider because the value of X is farther from X̄.
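
The interval calculations in 13.47 and 13.48 differ only in the value of X, so both can be reproduced with one function. This is a sketch under the values used above (n = 20, X̄ = 2, Σ(Xi – X̄)² = 20, SYX = 1, t18 = 2.1009); the function and parameter names are illustrative.

```python
import math


def interval_estimates(x, b0, b1, n, x_bar, ssx, s_yx, t_crit):
    """Return (Y-hat, h, confidence interval for the mean, prediction interval)."""
    y_hat = b0 + b1 * x
    h = 1 / n + (x - x_bar) ** 2 / ssx
    ci_margin = t_crit * s_yx * math.sqrt(h)       # mean response
    pi_margin = t_crit * s_yx * math.sqrt(1 + h)   # individual response
    ci = (y_hat - ci_margin, y_hat + ci_margin)
    pi = (y_hat - pi_margin, y_hat + pi_margin)
    return y_hat, h, ci, pi


# Values shared by 13.47 and 13.48
common = dict(b0=5, b1=3, n=20, x_bar=2, ssx=20, s_yx=1, t_crit=2.1009)

results = {x: interval_estimates(x, **common) for x in (2, 4)}
for x, (y_hat, h, ci, pi) in results.items():
    print(f"X = {x}: Y-hat = {y_hat}, h = {h:.2f}, "
          f"CI = ({ci[0]:.2f}, {ci[1]:.2f}), PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

Because h grows with (X – X̄)², both margins are larger at X = 4 than at X = 2, which is the point made in 13.48(c).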

13.49    (a)    1.7867 ≤ μY|X ≤ 2.2964
         (b)    1.3100 ≤ YI ≤ 2.7740
         (c)    Part (b) provides an estimate for an individual response and Part (a) provides an
                estimate for an average predicted value.

13.50    (a)    7.3664 ≤ μY|X ≤ 7.9549
         (b)    6.5667 ≤ YI ≤ 8.7546
         (c)    Part (b) provides an estimate for an individual response and Part (a) provides an
                estimate for an average predicted value.

13.51   (a)     14.7150 ≤ μY|X ≤ 15.3701
        (b)     13.5059 ≤ YI ≤ 16.5793
        (c)     Part (b) provides an estimate for an individual response and Part (a) provides an
                estimate for an average predicted value.

13.52   (a)     100.96 ≤ μY|X ≤ 138.77
        (b)     20.1 ≤ YI ≤ 219.72
        (c)     Part (b) provides an estimate for an individual response and Part (a) provides an
                estimate for an average predicted value.

13.53   (a)     1153.0 ≤ μY|X ≤ 1331.5
        (b)     829.9 ≤ YI ≤ 1654.6
        (c)     Part (b) provides an estimate for an individual response and Part (a) provides an
                estimate for an average predicted value.

13.54   (a)     116.7082 ≤ μY|X ≤ 178.0564
        (b)     111.5942 ≤ YI ≤ 183.1704
        (c)     Part (b) provides an estimate for an individual response and Part (a) provides an
                estimate for an average predicted value.

13.55   The slope of the line b1 represents the estimated expected change in Y per unit change in X. It
        represents the estimated average amount that Y changes (either positively or negatively) for a
        particular unit change in X. The Y intercept b0 represents the estimated average value of Y
        when X equals 0.

13.56   The coefficient of determination measures the proportion of variation in Y that is explained
        by the independent variable X in the regression model.

13.57   The unexplained variation or error sum of squares (SSE) will be equal to zero only when the
        regression line fits the data perfectly and the coefficient of determination equals 1.

13.58   The explained variation or regression sum of squares (SSR) will be equal to zero only when
        there is no relationship between the Y and X variables, and the coefficient of determination
        equals 0.
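
The statements in 13.56 through 13.58 follow from r² = SSR / SST with SST = SSR + SSE. The sketch below uses the sums of squares from 13.36 (SSR = 60, SSE = 40) as the worked case; the function name is illustrative.

```python
def coefficient_of_determination(ssr, sse):
    """r^2 = SSR / SST, where SST = SSR + SSE."""
    return ssr / (ssr + sse)


r2_example = coefficient_of_determination(ssr=60, sse=40)   # values from 13.36
r2_perfect = coefficient_of_determination(ssr=100, sse=0)   # SSE = 0: perfect fit (13.57)
r2_none = coefficient_of_determination(ssr=0, sse=100)      # SSR = 0: no relationship (13.58)

print(r2_example, r2_perfect, r2_none)
```

The values 100 in the two limiting cases are arbitrary placeholders; any positive total gives the same r² of 1 and 0.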

13.59   Unless a residual analysis is undertaken, you will not know whether the model fit is
        appropriate for the data. In addition, residual analysis can be used to check whether the
        assumptions of regression have been seriously violated.

13.60   The assumptions of regression are normality of error, homoscedasticity, and independence of
        errors. The normality of error assumption can be evaluated by obtaining a histogram, box-
        and-whisker plot, and/or normal probability plot of the residuals. The homoscedasticity
        assumption can be evaluated by plotting the residuals on the vertical axis and the X variable
        on the horizontal axis. The independence of errors assumption can be evaluated by plotting
        the residuals on the vertical axis and the time order variable on the horizontal axis. This
        assumption can also be evaluated by computing the Durbin-Watson statistic.

13.61    The Durbin-Watson statistic is a measure of the autocorrelation among the residuals. It
         measures the correlation among consecutive residuals.
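
The Durbin-Watson statistic is D = Σ(e_i – e_(i–1))² / Σ e_i², with the numerator summed over i = 2, ..., n. A minimal sketch follows; the two residual series are hypothetical and chosen only to show the two extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, in time order (not data from the text)
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # runs of same-signed residuals
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # sign flips every period

d_clustered = durbin_watson(clustered)
d_alternating = durbin_watson(alternating)
print(f"clustered: D = {d_clustered:.3f}, alternating: D = {d_alternating:.3f}")
```

Values of D well below 2 point toward positive autocorrelation, values near 2 toward none, and values well above 2 toward negative autocorrelation; the formal decision still uses the tabled lower and upper critical values.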

13.62    If the data in a regression analysis has been collected over time, then the assumption of
         independence of errors needs to be evaluated using the Durbin-Watson statistic.

13.63    The confidence interval for the mean response estimates the average response for a given X
         value. The prediction interval estimates the value for a single item or individual.

13.64    (a)     [Scatter diagram: Delivery Time (minutes), Y, versus Number of Cases, X]

         (b)     b0 = 24.84, b1 = 0.14
         (c)     Ŷ = 24.84 + 0.14X, where X is the number of cases and Ŷ is the estimated
                 delivery time.
         (d)     For each additional case, the estimated delivery time increases by 0.14 minutes.
                 24.84 is the portion of the estimated delivery time that is not affected by the number
                 of cases.
         (e)     Ŷ = 24.84 + 0.14X = 24.84 + 0.14(150) = 45.84
         (f)     No, 500 cases is outside the relevant range of the data used to fit the regression
                 equation.
         (g)     r² = 0.972. So, 97.2% of the variation in delivery time can be explained by the
                 variation in the number of cases.
         (h)     Since b1 is positive, r = +√r² = +√0.972 = 0.986
         (i)     SYX = 1.987
         (j)     Based on a visual inspection of the graphs of the distribution of studentized residuals
                 and the residuals versus the number of cases, there is no pattern. The model appears
                 to be adequate.
         (k)     t = 24.88 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0.
                 There is evidence that the fitted linear regression model is useful.
         (l)     44.88 ≤ μY|X ≤ 46.80
         (m)     41.56 ≤ YI ≤ 50.12
         (n)     0.1282 ≤ β1 ≤ 0.1518
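
Several of the 13.64 answers can be cross-checked from the rounded values reported above. This is a sketch, not the original computation: Sb1 is not reported in the solution, so it is back-solved here from t = b1 / Sb1, and results may differ in the last digit from answers computed with unrounded coefficients.

```python
import math

# Rounded values reported in 13.64
b0, b1 = 24.84, 0.14
r_squared = 0.972
t_stat, t_crit = 24.88, 2.1009

# (e) predicted delivery time for 150 cases
y_hat = b0 + b1 * 150

# (h) r takes the sign of the slope
r = math.copysign(math.sqrt(r_squared), b1)

# (n) slope interval; s_b1 is back-solved (an assumption, not a reported value)
s_b1 = b1 / t_stat
lower, upper = b1 - t_crit * s_b1, b1 + t_crit * s_b1

print(f"Y-hat = {y_hat:.2f}, r = {r:.3f}")
print(f"{lower:.4f} <= beta1 <= {upper:.4f}")
```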

13.65   (a)   [Scatter diagram: Number of Trade Executions, Y, versus Number of Incoming Calls, X]

        (b)   b0 = –63.02, b1 = 0.189
        (c)   Ŷ = –63.02 + 0.189X, where X is the number of incoming calls and Ŷ is the
              estimated number of trade executions.
        (d)   For each additional incoming call, the estimated number of trade executions increases by
              0.189. –63.02 is the portion of the estimated number of trade executions that is not
              affected by the number of incoming calls.
        (e)   Ŷ = –63.02 + 0.189X = –63.02 + 0.189(2000) = 314.99
        (f)   No, 5000 incoming calls is outside the relevant range of the data used to fit the
              regression equation.
        (g)   r² = 0.630. So, 63.0% of the variation in trade executions can be explained by the
              variation in the number of incoming calls.
        (h)   Since b1 is positive, r = +√r² = +√0.63 = 0.794
        (i)   SYX = 29.42
        (j)   Based on a visual inspection of the graphs of the distribution of studentized residuals
              and the residuals versus the number of incoming calls, there is no pattern. The model
              appears to be adequate.
        (k)   D = 1.96
        (l)   D = 1.96 > 1.52. There is no evidence of positive autocorrelation. The model appears
              to be adequate.
        (m)   t = 7.50 > t33 = 2.0345 with 33 degrees of freedom for α = 0.05. Reject H0. There
              is evidence that the fitted linear regression model is useful.
        (n)   302.07 ≤ μY|X ≤ 327.91
        (o)   253.76 ≤ YI ≤ 376.22
        (p)   0.1377 ≤ β1 ≤ 0.2403

13.66    (a)    [Scatter diagram: Selling Price ($000), Y, versus Assessed Value ($000), X]

                b0 = –44.172, b1 = 1.78171
         (b)    For each additional dollar in assessed value, the estimated selling price increases by
                $1.78. –44.172 is the portion of the estimated selling price that is not affected by the
                assessed value.
         (c)    Ŷ = –44.172 + 1.78171X = –44.172 + 1.78171(70) = 80.548 or $80,548
         (d)    SYX = 3.475
         (e)    r² = 0.926. 92.6% of the variation in selling price can be explained by the variation in
                the assessed value.
         (f)    Since b1 is positive, r = +√r² = +√0.926 = 0.962
         (g)    Based on a visual inspection of the graphs of the distribution of studentized residuals
                and the residuals versus the assessed value, there is no pattern. The model appears to
                be adequate.
         (h)    t = 18.66 > t28 = 2.0484 with 28 degrees of freedom for α = 0.05. Reject H0.
                There is evidence that the fitted linear regression model is useful.
         (i)    78.707 ≤ μY|X ≤ 82.388
         (j)    73.195 ≤ YI ≤ 87.900
         (k)    1.5862 ≤ β1 ≤ 1.9773

13.67   (a)   [Scatter diagram: Assessed Value ($000), Y, versus Heating Area (thousands of square feet), X]

              b0 = 51.915, b1 = 16.633
        (b)   For each additional 1000 square feet in heating area, the estimated assessed value
              increases by $16,633. $51,915 is the portion of the estimated assessed value that is
              not affected by heating area.
        (c)   Ŷ = 51.915 + 16.633X = 51.915 + 16.633(1.75) = 81.024 or $81,024
        (d)   SYX = 2.919
        (e)   r² = 0.659. 65.9% of the variation in assessed value can be explained by the variation
              in heating area.
        (f)   Since b1 is positive, r = +√r² = +√0.659 = 0.812
        (g)   Based on a visual inspection of the graphs of the distribution of studentized residuals
              and the residuals versus the heating area, there is no pattern. The model appears to be
              adequate.
        (h)   t = 5.02 > t13 = 2.1604 with 13 degrees of freedom for α = 0.05. Reject H0. There
              is evidence that the fitted linear regression model is useful.
        (i)   79.279 ≤ μY|X ≤ 82.769
        (j)   74.479 ≤ YI ≤ 87.569
        (k)   9.469 ≤ β1 ≤ 23.797
        (l)   b0 = 52.805, b1 = 15.849
              For each additional 1000 square feet in heating area, the estimated assessed value
              increases by $15,849. $52,805 is the portion of the estimated assessed value that is
              not affected by heating area.
              Ŷ = 52.805 + 15.849X = 52.805 + 15.849(1.75) = 80.541 or $80,541
              SYX = 2.598
              r² = 0.689. 68.9% of the variation in assessed value can be explained by the variation
              in heating area.
              Since b1 is positive, r = +√r² = +√0.689 = 0.83
              Based on a visual inspection of the graphs of the distribution of studentized residuals
              and the residuals versus the heating area, there is no pattern. The model appears to be
              adequate.
              t = 5.37 > t13 = 2.1604 with 13 degrees of freedom for α = 0.05. Reject H0. There
              is evidence that the fitted linear regression model is useful.
              78.987 ≤ μY|X ≤ 82.096
              74.716 ≤ YI ≤ 86.367
              9.471 ≤ β1 ≤ 22.227

13.68    (a)           [Scatter diagram: GPI, Y, versus GMAT Score, X]

                       b0 = 0.30, b1 = 0.00487
         (b)           For each additional point on the GMAT score, the estimated GPI increases by
                       0.00487. 0.30 is the portion of the GPI that is not affected by the GMAT score.
         (c)           Ŷ = 0.30 + 0.00487X = 0.30 + 0.00487(600) = 3.2225
         (d)           SYX = 0.158
         (e)           r² = 0.793. 79.3% of the variation in the GPI can be explained by the
                       variation in the GMAT score.
         (f)           Since b1 is positive, r = +√r² = +√0.793 = 0.891
         (g)           Based on a visual inspection of the graphs of the distribution of studentized residuals
                       and the residuals versus the GMAT score, there is no pattern. The model appears to
                       be adequate.
         (h)           t = 8.31 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0. There
                       is evidence that the fitted linear regression model is useful.
         (i)           3.144 ≤ μY|X ≤ 3.301
         (j)           2.886 ≤ YI ≤ 3.559
         (k)           0.00366 ≤ β1 ≤ 0.00608
         (l)           b0 = 0.258, b1 = 0.00494
                       For each additional point on the GMAT score, the estimated GPI increases by
                       0.00494. 0.258 is the portion of the GPI that is not affected by the GMAT score.
                       Ŷ = 0.258 + 0.00494X = 0.258 + 0.00494(600) = 3.221
                       SYX = 0.147
                       r² = 0.820. 82.0% of the variation in the GPI can be explained by the variation in the
                       GMAT score.
                       Since b1 is positive, r = +√r² = +√0.82 = 0.906
                       Based on a visual inspection of the graphs of the distribution of studentized residuals
                       and the residuals versus the GMAT score, there is no pattern. The model appears to
                       be adequate.
                       t = 9.06 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0. There
                       is evidence that the fitted linear regression model is useful.
                       3.147 ≤ μY|X ≤ 3.295
                       2.903 ≤ YI ≤ 3.539
                       0.00380 ≤ β1 ≤ 0.00609

13.69   (a)   [Scatter diagram: Completion Time (hours), Y, versus Invoices Processed, X]

        (b)   b0 = 0.4024, b1 = 0.012608
        (c)   For each additional invoice processed, the estimated completion time increases by
              0.012608 hours. 0.4024 is the portion of the estimated completion time that is not
              affected by the number of invoices processed.
        (d)   Ŷ = 0.4024 + 0.012608X = 0.4024 + 0.012608(150) = 2.2934
        (e)   SYX = 0.3342
        (f)   r² = 0.892. 89.2% of the variation in completion time can be explained by the
              variation in the number of invoices processed.
        (g)   Since b1 is positive, r = +√r² = +√0.892 = 0.945
        (i)   Based on a visual inspection of the graphs of the distribution of studentized residuals
              and the residuals versus the number of invoices, there is no pattern. The model
              appears to be adequate.
        (j)   D = 1.78
        (k)   D = 1.78 > 1.49. There is no evidence of positive autocorrelation. The
              model appears to be adequate.
        (l)   t = 15.24 > t28 = 2.0484 with 28 degrees of freedom for α = 0.05. Reject H0.
              There is evidence that the fitted linear regression model is useful.
        (m)   2.1638 ≤ μY|X ≤ 2.4230
        (n)   1.5966 ≤ YI ≤ 2.9902

13.70    (a)       [Scatter diagram: O-ring Damage Index, Y, versus Temperature (degrees F), X]


                   The scatter plot shows no clear relationship between atmospheric temperature and
                   O-ring damage.
         (b),(f)   [Scatter diagram: O-ring Damage Index, Y, versus Temperature (degrees F), X]


         (c)       In (b), there are 16 observations with an O-ring damage index of 0 across a variety of
                   temperatures. If one concentrates on these observations with no O-ring damage, there
                   is obviously no relationship between O-ring damage index and temperature. If all
                   observations are used, the observations with no O-ring damage will bias the
                   estimated relationship. If the intention is to investigate the relationship between the
                   degree of O-ring damage and atmospheric temperature, it makes sense to focus only
                   on the flights in which there was O-ring damage.
         (d)       A prediction should not be made for an atmospheric temperature of 31°F because it is
                   outside the range of the temperature variable in the data. Such a prediction would
                   involve extrapolation, which assumes that any relationship between the two variables
                   will continue to hold outside the domain of the temperature variable.
         (e)       Ŷ = 18.036 – 0.234X
         (g)       A nonlinear model is more appropriate for these data.
         (h)       [Residual plot: Residuals versus Temperature]

              The string of negative residuals and positive residuals that lie on a straight line
              with a positive slope in the lower-right corner of the plot is a strong indication
              that a nonlinear model should be used if all 23 observations are to be used in
              the fit.

13.71   (a)   [Scatter diagram: 1999 Gross Profits (in millions), Y, versus Page Views (monthly visitors in thousands), X]

              If the outlier (Amazon.com) in the upper right corner of the scatter diagram is
              removed, there is not an obvious linear relationship between page views and gross
              profits.
        (b)   Ŷ = 1.354 + 0.154X
        (c)   Since all the companies in the data are internet companies that rely on online
              customers for their business, it is not meaningful to interpret the estimated intercept
              when there are no online visitors at all. The estimated slope coefficient b1 = 0.154
              means that for each additional one thousand monthly visitors, the average gross profit
              of a company is estimated to increase by 0.154 million dollars.
        (d)   0.0553 ≤ β1 ≤ 0.2524. In the long run, 95% of all the confidence intervals that are
              constructed for the slope parameter will contain the true value of the slope parameter.
              Since the interval does not contain 0, we are 95% confident that there is a
              significant linear relationship between page views and gross profits.
        (e)   r² = 0.6183. 61.83% of the total variation in gross profits can be explained by the
              variation in the number of monthly visitors.
        (f)   SYX = 13.4245. The standard error of the estimate measures the variability of the
              observed values of the dependent variable around the least-squares regression line.
         (g)    The outliers are Amazon.com, Cheap Tickets, and About.com.
         (h)    (a)    [Scatter diagram: 1999 Gross Profits (in millions), Y, versus Page Views (monthly visitors in thousands), X]

                                                          With Amazon.com removed from the data, there is no obvious relationship
                                                          between page views and gross profits.
                (b)                                       ˆ
                                                          Y  12.598  0.0637X
                (c)                                       Since all the companies in the data are internet companies that rely on
                                                          online customers for their business, it is not meaningful to interpret the
                                                          estimated intercept, which corresponds to no online visitors at all. The
                                                          estimated slope coefficient b1 = -0.0637 means that for each additional one
                                                          thousand monthly visitors, the average gross profit of a company is
                                                          estimated to decrease by 0.0637 million dollars.
                (d)                                       -0.1691 ≤ β1 ≤ 0.0417. In the long run, 95% of all the confidence
                                                          intervals that are constructed for the slope parameter will contain the true
                                                          value of the slope parameter. Since the interval contains 0, we cannot
                                                          conclude at the 95% level of confidence that there is a significant linear
                                                          relationship between page views and gross profits.
                (e)                                       r2 = 0.2259. 22.59% of the total variation in gross profits can be explained
                                                          by the variation in the number of monthly visitors.
                (f)                                       SYX = 6.262. The standard error of the estimate measures the variability
                                                          of the observed values of the dependent variable around the least squares
                                                          regression line, in the units of the dependent variable.


13.71   (h)   (g)                       The outliers are Cheap Tickets and eToys.com.
                                        The exclusion of Amazon.com changes the estimated slope coefficient from
                                        positive to negative, so this observation is extremely influential on the
                                        least squares regression estimates. Such an observation is called an
                                        influential point in regression analysis.
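
The slope interval reported in part (d) of the reanalysis can be sanity-checked with a few lines of Python. This is only a minimal sketch that works from the rounded endpoints printed in this solution, not from the raw data; the function name interval_summary is made up for illustration.

```python
def interval_summary(lower, upper):
    """Return the midpoint, half-width, and whether the interval covers 0."""
    midpoint = (lower + upper) / 2
    half_width = (upper - lower) / 2
    contains_zero = lower <= 0 <= upper
    return midpoint, half_width, contains_zero


# Rounded endpoints of the 95% interval for the slope with Amazon.com removed.
midpoint, half_width, contains_zero = interval_summary(-0.1691, 0.0417)

# The midpoint of a confidence interval for the slope is the point estimate b1.
print("midpoint:", round(midpoint, 4))
print("half-width:", round(half_width, 4))
print("contains 0:", contains_zero)
```

The midpoint reproduces the estimated slope b1 = -0.0637, and because the interval covers 0, the data without Amazon.com do not support a significant linear relationship at the 95% level.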

13.72   (a)   [Scatter diagram: Weight (grams) versus Circumference (cms.)]

        (b)   Ŷ = -2629.222 + 82.4717X
        (c)   For each additional cm in circumference, the estimated average weight of a
              pumpkin increases by 82.4717 grams.
        (d)   Ŷ = -2629.222 + 82.4717(60) = 2319.080 grams.
        (e)   There appears to be a positive relationship between the weight and the circumference
              of a pumpkin. It is a good idea for the farmer to sell pumpkins by circumference
              instead of weight, because circumference is a good predictor of weight and it is
              much easier to measure the circumference of a pumpkin than its weight.
        (f)   r2 = 0.9373. 93.73% of the variation in pumpkin weight can be explained by the
              variation in circumference.
        (g)   SYX = 277.7495.
        (h)   [Residual plot: Residuals versus Circumference]

              There appears to be a nonlinear relationship between circumference and weight.


13.72    (i)            The p-value is virtually 0. Reject H0. There is sufficient evidence to conclude that
cont.                   there is a linear relationship between the circumference and the weight of a pumpkin.
         (j)            72.7875 ≤ β1 ≤ 92.1559
         (k)            2186.9589 ≤ μY|X ≤ 2451.2020
         (l)            1726.5508 ≤ YI ≤ 2911.6101
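
The prediction in (d) and the intervals in (k) and (l) can be cross-checked against each other. The sketch below assumes only the rounded coefficients from (b) and the interval endpoints printed above; the helper names predict_weight and center_and_half_width are hypothetical, chosen for illustration.

```python
B0, B1 = -2629.222, 82.4717  # rounded coefficients from part (b)


def predict_weight(circumference_cm):
    """Predicted pumpkin weight in grams from the fitted line."""
    return B0 + B1 * circumference_cm


def center_and_half_width(lower, upper):
    """Midpoint and half-width of an interval given its endpoints."""
    return (lower + upper) / 2, (upper - lower) / 2


y_hat = predict_weight(60)
mean_center, mean_hw = center_and_half_width(2186.9589, 2451.2020)  # part (k)
pred_center, pred_hw = center_and_half_width(1726.5508, 2911.6101)  # part (l)

print("predicted weight at X = 60:", round(y_hat, 3))
print("interval (k) center and half-width:", round(mean_center, 2), round(mean_hw, 2))
print("interval (l) center and half-width:", round(pred_center, 2), round(pred_hw, 2))
```

Both intervals are centered on the predicted value from (d), and the prediction interval for an individual pumpkin in (l) is wider than the confidence interval for the mean weight in (k), as expected.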


13.73    (a)            [Scatter diagram: Wins versus E.R.A.]



         (b)            Ŷ = 152.8097 - 15.0927X
         (c)            b0 = 152.8097. For a team that has an E.R.A. of 0, the estimated average number of
                        wins is 152.81. For each additional unit increase in team E.R.A., the estimated
                        average number of wins decreases by 15.09.
         (d)            Ŷ = 152.8097 - 15.0927(4.5) = 84.89
         (e)            SYX = 7.6363.
         (f)            r2 = 0.4354. So, 43.54% of the variation in number of wins can be explained by the
                        variation in the team E.R.A.
         (g)            Since b1 is negative, r = -√r2 = -√0.4354 = -0.6599
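
The arithmetic in (d) and (g) can be reproduced as follows. This is a minimal sketch using the rounded values printed in this solution; the function names predict_wins and signed_r are illustrative only. Because r2 is rounded to four decimals here, the computed r may differ from the stated value in the last digit.

```python
import math

B0, B1 = 152.8097, -15.0927  # rounded coefficients from part (b)
R_SQUARED = 0.4354           # rounded coefficient of determination from part (f)


def predict_wins(era):
    """Predicted number of wins for a given team E.R.A."""
    return B0 + B1 * era


def signed_r(r_squared, slope):
    """Correlation coefficient: square root of r2 carrying the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


print("predicted wins at E.R.A. 4.5:", round(predict_wins(4.5), 2))
print("r:", signed_r(R_SQUARED, B1))
```

In simple linear regression r always takes the sign of b1, which is why the negative root is chosen in (g).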

								