Chapter 12: Multiple Regression and Model Building
Where We’ve Been

   Introduced the straight-line model
    relating a dependent variable y to an
    independent variable x
   Estimated the parameters of the
    straight-line model using least squares
   Assessed the model estimates
   Used the model to estimate a value of
    y given x
             McClave: Statistics, 11th ed. Chapter 12: Multiple   2
                    Regression and Model Building
Where We’re Going
   Introduce a multiple-regression model to relate a
    variable y to two or more x variables
   Present multiple regression models with both
    quantitative and qualitative independent variables
   Assess how well the multiple regression model fits
    the sample data
   Show how analyzing the model residuals can help
    detect problems with the model and suggest the
    necessary modifications


12.1: Multiple Regression Models

The General Multiple Regression Model
  y = β0 + β1x1 + β2x2 + … + βkxk + ε
where
  y is the dependent variable,
  x1, x2, … , xk are the independent variables,
  E(y) = β0 + β1x1 + β2x2 + … + βkxk is the deterministic portion of the model, and
  βi determines the contribution of the independent variable xi, which may be a quantitative variable of order one or higher or a qualitative variable.
12.1: Multiple Regression Models

Analyzing a Multiple-Regression Model
Step 1: Hypothesize the deterministic portion of the model by choosing the independent variables x1, x2, … , xk.
Step 2: Estimate the unknown parameters β0, β1, β2, … , βk.
Step 3: Specify the probability distribution of ε and estimate the standard deviation of this distribution.
12.1: Multiple Regression Models

Analyzing a Multiple-Regression Model
Step 4: Check that the assumptions about ε are satisfied; if not, make the required modifications to the model.
Step 5: Statistically evaluate the usefulness of the model.
Step 6: If the model is useful, use it for prediction, estimation, and other purposes.
12.1: Multiple Regression Models

Assumptions about the Random Error ε
1. The mean is equal to 0.
2. The variance is equal to σ².
3. The probability distribution is a normal distribution.
4. Random errors are independent of one another.
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

A First-Order Model in Five Quantitative Independent Variables
  E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5
where x1, x2, … , x5 are all quantitative variables that are not functions of other independent variables.
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

A First-Order Model in Five Quantitative Independent Variables
  E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5
The parameters are estimated by finding the values for the β's that minimize
  SSE = Σ(y − ŷ)².
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

A First-Order Model in Five Quantitative Independent Variables
  E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5
The parameters are estimated by finding the values for the β's that minimize
  SSE = Σ(y − ŷ)².
Only a truly talented mathematician (or geek) would choose to solve the necessary system of simultaneous linear equations by hand. In practice, computers are left to do the complicated calculations required by multiple regression models.
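The minimization above is exactly what any regression routine does under the hood. A minimal sketch with NumPy on made-up data (all numbers here are hypothetical, for illustration only):

```python
import numpy as np

# Hypothetical data: y modeled by two predictors x1 and x2
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1])

# Design matrix: a column of ones for beta0, then the predictors
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares: choose b to minimize SSE = sum((y - X b)^2)
b, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ b
sse = float(np.sum((y - y_hat) ** 2))
```

The fitted SSE can never exceed the total variation Σ(y − ȳ)², since the mean-only model is one of the candidates least squares considers.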
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

A collector of antique clocks hypothesizes that the auction price can be modeled as
  y = β0 + β1x1 + β2x2 + ε
where
  y = auction price in dollars
  x1 = age of clock in years
  x2 = number of bidders.
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

Based on the data in Table 12.1, the least squares prediction equation, the equation that minimizes SSE, is
  ŷ = −1,339 + 12.74x1 + 85.95x2
  SSE = 516,727
  s² = SSE/(n − (k + 1)) = 516,727/29 = 17,818
  s = 133.5 (the estimate for σ)
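The variance estimate on this slide is simple arithmetic. With k = 2 predictors and n − (k + 1) = 29 degrees of freedom, n = 32 observations; a quick check:

```python
# Quantities from the slide: SSE for the clock model, n = 32, k = 2
sse = 516_727
n, k = 32, 2

s2 = sse / (n - (k + 1))   # estimated error variance
s = s2 ** 0.5              # estimate for sigma
```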
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

Based on the data in Table 12.1, the least squares prediction equation is
  ŷ = −1,339 + 12.74x1 + 85.95x2
The estimate for β1 is interpreted as the expected change in y given a one-unit change in x1, holding x2 constant. The estimate for β2 is interpreted as the expected change in y given a one-unit change in x2, holding x1 constant.
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

Based on the data in Table 12.1, the least squares prediction equation is
  ŷ = −1,339 + 12.74x1 + 85.95x2
Since it makes no sense to sell a clock of age 0 at an auction with no bidders, the intercept term has no meaningful interpretation in this example.
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

Test of an Individual Parameter Coefficient in the Multiple Regression Model
One-Tailed Test:
  H0: βi = 0
  Ha: βi < 0 (or βi > 0)
  Rejection region: t < −tα (or t > tα)
Two-Tailed Test:
  H0: βi = 0
  Ha: βi ≠ 0
  Rejection region: |t| > tα/2
Test statistic: t = β̂i / s_β̂i
where tα and tα/2 are based on n − (k + 1) degrees of freedom and
  n = number of observations
  k + 1 = number of β parameters in the model
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

Test of the Parameter Coefficient on the Number of Bidders
  H0: β2 = 0
  Ha: β2 > 0
  Rejection region: t > tα = t.05 = 1.699
  Test statistic: t* = β̂2 / s_β̂2 = 85.953/8.729 = 9.85
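The test statistic is just the coefficient estimate over its standard error; plugging in the slide's numbers:

```python
# From the slide: beta2-hat = 85.953, standard error 8.729, t_.05 = 1.699 (29 df)
beta2_hat = 85.953
se_beta2 = 8.729
t_crit = 1.699

t_stat = beta2_hat / se_beta2
reject_h0 = t_stat > t_crit  # one-tailed (upper-tail) test
```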
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

Test of the Parameter Coefficient on the Number of Bidders
  H0: β2 = 0
  Ha: β2 > 0
  Rejection region: t > tα = t.05 = 1.699
  Test statistic: t* = β̂2 / s_β̂2 = 85.953/8.729 = 9.85
Since t* > tα, reject the null hypothesis.
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

A 100(1 − α)% Confidence Interval for a β Parameter
  β̂i ± (tα/2) s_β̂i
where tα/2 is based on n − (k + 1) degrees of freedom and
  n = number of observations
  k + 1 = number of β parameters in the model
Valid inferences about βi also require that the four assumptions about ε are satisfied.
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

A 100(1 − α)% Confidence Interval for β1
  β̂1 ± tα/2 s_β̂1 = β̂1 ± t.05 s_β̂1
  = 12.74 ± 1.699(.905)
  = 12.74 ± 1.54
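The interval arithmetic on this slide can be checked directly:

```python
# From the slide: beta1-hat = 12.74, standard error .905, t_.05 = 1.699 (29 df)
b1_hat, se_b1, t05 = 12.74, 0.905, 1.699

half_width = t05 * se_b1
ci = (b1_hat - half_width, b1_hat + half_width)  # 90% CI for beta1
```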
12.2: The First-Order Model: Estimating and Making Inferences about the Parameters

A 100(1 − α)% Confidence Interval for β1
  β̂1 ± t.05 s_β̂1 = 12.74 ± 1.699(.905) = 12.74 ± 1.54
Holding the number of bidders constant, the result above tells us that we can be 90% sure that the auction price will rise between $11.20 and $14.28 for each 1-year increase in age.
12.3: Evaluating Overall Model Utility

Reject H0 for βi:
  Evidence of a linear relationship between y and xi
Do Not Reject H0 for βi (several possibilities):
  There may be no relationship between y and xi
  A Type II error may have occurred
  The relationship between y and xi may be more complex than a straight-line relationship
12.3: Evaluating Overall Model Utility

The multiple coefficient of determination, R², measures how much of the overall variation in y is explained by the least squares prediction equation.
  R² = 1 − SSE/SSyy = (SSyy − SSE)/SSyy = Explained variability / Total variability
12.3: Evaluating Overall Model Utility

High values of R² suggest a good model, but the usefulness of R² falls as the number of observations becomes close to the number of parameters estimated.
12.3: Evaluating Overall Model Utility

The Adjusted Multiple Coefficient of Determination
  Ra² = 1 − [(n − 1)/(n − (k + 1))](SSE/SSyy) = 1 − [(n − 1)/(n − (k + 1))](1 − R²)
Ra² adjusts for the number of observations and the number of parameter estimates. It will always have a value no greater than R².
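For the clock model, SSyy can be reconstructed from quantities on nearby slides (SS(Model) = k · MS(Model), and SSyy = SS(Model) + SSE), which lets us check both formulas:

```python
# Reconstructed from slide quantities for the clock model (k = 2, n = 32)
sse = 516_727
ms_model = 2_141_531
ss_model = 2 * ms_model       # k * MS(Model)
ss_yy = ss_model + sse        # total variability
n, k = 32, 2

r2 = 1 - sse / ss_yy
r2_adj = 1 - (n - 1) / (n - (k + 1)) * (1 - r2)
```

As the slide states, the adjusted value comes out strictly below R².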
12.3: Evaluating Overall Model Utility

The Analysis-of-Variance F-Test
  H0: β1 = β2 = … = βk = 0
  Ha: At least one βi ≠ 0
  Test statistic: F = [(SSyy − SSE)/k] / [SSE/(n − (k + 1))]
                    = (R²/k) / [(1 − R²)/(n − (k + 1))]
                    = Mean square (Model) / Mean square (Error)
where n is the sample size and k is the number of terms in the model.
Rejection region: F > Fα, with k numerator and n − (k + 1) denominator degrees of freedom.
12.3: Evaluating Overall Model Utility

The Analysis-of-Variance F-Test
  H0: β1 = β2 = … = βk = 0
  Ha: At least one βi ≠ 0
Rejecting the null hypothesis means that something in your model helps explain variations in y, but it may be that another model provides more reliable estimates and predictions.
12.3: Evaluating Overall Model Utility

A collector of antique clocks hypothesizes that the auction price can be modeled as
  y = β0 + β1x1 + β2x2 + ε
where
  y = auction price in dollars
  x1 = age of clock in years
  x2 = number of bidders
Test:
  H0: β1 = β2 = 0
  Ha: At least one of the two coefficients is nonzero
  Test statistic: F = MS(Model)/MSE = 2,141,531/17,818 = 120.19
  p-value: less than .00001
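The F statistic here is just the ratio of the two mean squares from the slide:

```python
# From the slide: MS(Model) and MSE for the clock model
ms_model = 2_141_531
mse = 17_818

f_stat = ms_model / mse
```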
12.3: Evaluating Overall Model Utility

  H0: β1 = β2 = 0
  Ha: At least one of the two coefficients is nonzero
  Test statistic: F = MS(Model)/MSE = 2,141,531/17,818 = 120.19
  p-value: less than .00001
Something in the model is useful, but the F-test can't tell us which x-variables are individually useful.
12.3: Evaluating Overall Model Utility

Checking the Utility of a Multiple-Regression Model
1. Use the F-test to conduct a test of the adequacy of the overall model.
2. Conduct t-tests on the "most important" β parameters.
3. Examine Ra² and 2s to evaluate how well the model fits the data.
12.4: Using the Model for
Estimation and Prediction

   The model of antique clock prices can be
    used to predict sale prices for clocks of a
    certain age with a particular number of
    bidders.
   What is the mean sale price for all 150-year-
    old clocks with 10 bidders?




12.4: Using the Model for Estimation and Prediction

What is the mean auction sale price for a single 150-year-old clock with 10 bidders?
The average value of all clocks with these characteristics can be found by using the statistical software to generate a confidence interval. (See Figure 12.7.)
In this case, the confidence interval indicates that we can be 95% sure that the average price of a single 150-year-old clock sold at auction with 10 bidders will be between $1,154.10 and $1,709.30.
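A point estimate at x1 = 150, x2 = 10, using the fitted equation from the earlier slides, falls inside the reported 95% interval:

```python
# Fitted clock-price equation from earlier slides: y-hat = -1,339 + 12.74 x1 + 85.95 x2
b0, b1, b2 = -1_339, 12.74, 85.95

y_hat = b0 + b1 * 150 + b2 * 10  # 150-year-old clock, 10 bidders
```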
12.4: Using the Model for Estimation and Prediction

[Figure 12.7 (software output) omitted]
12.4: Using the Model for
Estimation and Prediction

   What is the mean sale price for a single 50-
    year-old clock with 2 bidders?




12.4: Using the Model for Estimation and Prediction

What is the mean sale price for a single 50-year-old clock with 2 bidders?
Since 50 years of age and 2 bidders are both outside of the range of values in our data set, any prediction using these values would be unreliable.
12.5: Model Building:
Interaction Models

   In some cases, the impact of an
    independent variable xi on y will
    depend on the value of some other
    independent variable xk.
   Interaction models include the cross-
    products of independent variables as
    well as the first-order values.

12.5: Model Building: Interaction Models

An Interaction Model Relating E(y) to Two Quantitative Independent Variables
  E(y) = β0 + β1x1 + β2x2 + β3x1x2
where β1 + β3x2 represents the change in E(y) for every one-unit change in x1, holding x2 fixed, and β2 + β3x1 represents the change in E(y) for every one-unit change in x2, holding x1 fixed.
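The variable slope is easy to see in code; with hypothetical coefficients (for illustration only), the effect of x1 shifts as x2 changes:

```python
def x1_slope(b1, b3, x2):
    """Change in E(y) per one-unit change in x1, holding x2 fixed: b1 + b3*x2."""
    return b1 + b3 * x2

# Hypothetical coefficients for illustration
slope_low = x1_slope(2.0, 0.5, x2=1)   # slope of x1 when x2 = 1
slope_high = x1_slope(2.0, 0.5, x2=5)  # slope of x1 when x2 = 5
```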
12.5: Model Building: Interaction Models

In the antique clock auction example, assume the collector has reason to believe that the impact of age (x1) on price (y) varies with the number of bidders (x2).
The model is now
  y = β0 + β1x1 + β2x2 + β3x1x2 + ε.
12.5: Model Building: Interaction Models

[Figure omitted]
12.5: Model Building: Interaction Models

In the antique clock auction example, assume the collector has reason to believe that the impact of age (x1) on price (y) varies with the number of bidders (x2). The model is now
  y = β0 + β1x1 + β2x2 + β3x1x2 + ε.
The Global F-Test
  H0: β1 = β2 = β3 = 0
  The test statistic is F = 193.04, p-value ≈ 0
  Reject the null hypothesis.
12.5: Model Building: Interaction Models

The MINITAB results are reported in Figure 12.11 in the text.
The t-Test on the Interaction Parameter
  H0: β3 = 0
  The test statistic is t = 6.11, with a two-tailed p-value ≈ 0 (so the one-tailed p-value ≈ 0 as well).
  Reject the null hypothesis.
12.5: Model Building: Interaction Models

In the antique clock auction example, the model is
  y = β0 + β1x1 + β2x2 + β3x1x2 + ε.
The Estimated Model is
  ŷ = 320.5 + 0.878x1 − 93.26x2 + 1.2978x1x2
To estimate the change in the price of a 150-year-old clock given a one-unit change in x2, we must include the interaction term.
  Estimated x2 slope = β̂2 + β̂3x1 = −93.26 + 1.30(150) = 101.74
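The slope computation on this slide, using the rounded β̂3 = 1.30 as the slide does:

```python
# From the slide: beta2-hat = -93.26, beta3-hat rounded to 1.30, evaluated at x1 = 150
b2_hat, b3_hat = -93.26, 1.30

# Change in estimated price per extra bidder for a 150-year-old clock
x2_slope = b2_hat + b3_hat * 150
```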
12.5: Model Building: Interaction Models

Once the interaction term has passed the t-test, it is unnecessary to test the individual independent variables.
12.6: Model Building: Quadratic and Other Higher Order Models

A quadratic (second-order) model includes the square of an independent variable:
  y = β0 + β1x + β2x² + ε.
This allows more complex relationships to be modeled.
12.6: Model Building: Quadratic and Other Higher Order Models

A quadratic (second-order) model includes the square of an independent variable:
  y = β0 + β1x + β2x² + ε.
β1 is the shift parameter and β2 is the rate of curvature.
12.6: Model Building: Quadratic
and Other Higher Order Models

   Example 12.7 considers whether home
    size (x) impacts electrical usage (y) in
    a positive but decreasing way.
   The MINITAB results are shown in
    Figure 12.13.



12.6: Model Building: Quadratic and Other Higher Order Models

[Figure 12.13 (MINITAB output) omitted]
12.6: Model Building: Quadratic and Other Higher Order Models

According to the results, the equation that minimizes SSE for the 10 observations is
  ŷ = −1,216.14 + 2.3989x − .00045x²
  Ra² = .9767
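Evaluating the fitted curve shows the "positive but decreasing" pattern the example was checking for (coefficient signs as interpreted on the slide discussing β1 and β2):

```python
# Fitted quadratic from the slide: y-hat = -1,216.14 + 2.3989 x - .00045 x^2
def y_hat(x):
    return -1216.14 + 2.3989 * x - 0.00045 * x ** 2

# Extra estimated usage from 100 more square feet, for a smaller and a larger home
gain_small = y_hat(1500) - y_hat(1400)
gain_large = y_hat(2500) - y_hat(2400)
```

Both gains are positive, but the gain is smaller for the larger home, matching the "positive but decreasing" hypothesis.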
12.6: Model Building: Quadratic and Other Higher Order Models

[Figure omitted]
12.6: Model Building: Quadratic and Other Higher Order Models

Since 0 is not in the range of the independent variable (a house of 0 ft²?), the estimated intercept is not meaningful.
The positive estimate on β1 indicates a positive relationship, although the slope is not constant (we've estimated a curve, not a straight line).
The negative value on β2 indicates the rate of increase in power usage declines for larger homes.
12.6: Model Building: Quadratic and Other Higher Order Models

The Global F-Test
  H0: β1 = β2 = 0
  Ha: At least one of the coefficients ≠ 0
  The test statistic is F = 189.71, p-value near 0.
  Reject H0.
12.6: Model Building: Quadratic and Other Higher Order Models

t-Test of β2
  H0: β2 = 0
  Ha: β2 < 0
  The test statistic is t = −7.62, p-value = .0001 (two-tailed).
  The one-tailed p-value is .0001/2 = .00005.
  Reject the null hypothesis.
12.6: Model Building: Quadratic and Other Higher Order Models

Complete Second-Order Model with Two Quantitative Independent Variables
  E(y) = β0 + β1x1 + β2x2 + β3x1x2 + β4x1² + β5x2²
  β0 is the y-intercept.
  Changing β1 and β2 causes the surface to shift along the x1 and x2 axes.
  β3 controls the rotation of the surface.
  The signs and values of β4 and β5 control the type of surface and the rates of curvature.
12.6: Model Building: Quadratic and Other Higher Order Models

[Figure omitted]
12.7: Model Building: Qualitative (Dummy) Variable Models

Qualitative variables can be included in regression models through the use of dummy variables.
Choose one category as the base level; each remaining category is represented by its own 0–1 dummy variable.
12.7: Model Building: Qualitative (Dummy) Variable Models

A Qualitative Independent Variable with k Levels
  y = β0 + β1x1 + β2x2 + … + βk−1xk−1 + ε
where xi is the dummy variable for level i + 1 and
  xi = 1 if y is observed at level i + 1, 0 otherwise.
  μA = β0              β1 = μB − μA
  μB = β0 + β1         β2 = μC − μA
  μC = β0 + β2         β3 = μD − μA
  μj = β0 + βj−1       βj−1 = μj − μA
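Constructing the 0/1 dummies is mechanical; a sketch for a four-level variable with level A as the base (the sample values are hypothetical):

```python
# Levels observed for each response (hypothetical sample)
brands = ["A", "B", "C", "B", "D", "A"]

# One dummy column per non-base level; the base level A gets no column of its own
levels = ["B", "C", "D"]
dummies = [[1 if brand == level else 0 for level in levels] for brand in brands]
# A base-level observation is the all-zeros row
```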
12.7: Model Building: Qualitative (Dummy) Variable Models

For the golf ball example from Chapter 10, there were four levels (the brands). Testing differences in brands can be done with the model
  E(y) = β0 + β1x1 + β2x2 + β3x3
where
  x1 = 1 if Brand B, 0 otherwise
  x2 = 1 if Brand C, 0 otherwise
  x3 = 1 if Brand D, 0 otherwise
12.7: Model Building: Qualitative
(Dummy) Variable Models

•  Brand A is the base level, so β0
   represents the mean distance (μA) for
   Brand A, and
                β1 = μB – μA
                β2 = μC – μA
                β3 = μD – μA
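These β interpretations can be checked numerically. Below is a minimal NumPy sketch; the distances are made-up illustration values, not the textbook's data. It shows that the fitted intercept equals Brand A's sample mean and each fitted dummy coefficient equals that brand's mean minus Brand A's.

```python
import numpy as np

# Hypothetical driving distances (yards), three balls per brand
distances = {"A": [251, 249, 250], "B": [263, 262, 264],
             "C": [241, 240, 242], "D": [270, 271, 269]}

y, rows = [], []
for brand, vals in distances.items():
    for v in vals:
        y.append(v)
        # x1, x2, x3 are 0/1 dummies for Brands B, C, D; A is the base level
        rows.append([1.0,
                     1.0 if brand == "B" else 0.0,
                     1.0 if brand == "C" else 0.0,
                     1.0 if brand == "D" else 0.0])
X, y = np.array(rows), np.array(y)

b, *_ = np.linalg.lstsq(X, y, rcond=None)   # b = [b0, b1, b2, b3]

mu = {k: float(np.mean(v)) for k, v in distances.items()}
# b0 estimates mu_A; b1 estimates mu_B - mu_A; and so on
print(round(b[0], 4), round(b[1], 4), round(b[2], 4), round(b[3], 4))
# 250.0 13.0 -9.0 20.0
```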



12.7: Model Building: Qualitative
(Dummy) Variable Models

•  Testing that the four means are equal
   is equivalent to testing the significance
   of the βs:
       H0: β1 = β2 = β3 = 0
       Ha: At least one of the βs ≠ 0



    12.7: Model Building: Qualitative
    (Dummy) Variable Models
•  Testing that the four means are equal
   is equivalent to testing the significance
   of the βs:
       H0: β1 = β2 = β3 = 0
       Ha: At least one of the βs ≠ 0
•  The test statistic is the F-statistic.
   Here F = 43.99, p-value ≈ .000.
   Hence we reject the null hypothesis
   that the golf balls all have the same
   mean driving distance.
•  Remember that the maximum
   number of dummy variables is
   one less than the number of
   levels for the qualitative variable.




12.8: Model Building: Models with
Both Quantitative and Qualitative
Variables

   Suppose a first-order model is used to
    evaluate the impact on mean monthly
    sales of expenditures in three
    advertising media: television, radio and
    newspaper.
    •  Expenditure, x1, is a quantitative variable
    •  Type of medium is a qualitative variable with
       k = 3 levels, represented by k – 1 = 2
       dummy variables, x2 and x3
12.8: Model Building: Models with
Both Quantitative and Qualitative
Variables
E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3
where
x1 = advertising expenditure
x2 = 1 if radio, 0 otherwise
x3 = 1 if television, 0 otherwise
Newspaper is the base level.
12.8: Model Building: Models with
Both Quantitative and Qualitative
Variables

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3
       β1x1: main effect, advertising expenditure
       β2x2, β3x3: main effects, type of medium
       β4x1x2, β5x1x3: interaction terms

Newspaper medium line: E(y) = β0 + β1x1

Radio medium line: E(y) = (β0 + β2) + (β1 + β4)x1
                           intercept      slope

Television medium line: E(y) = (β0 + β3) + (β1 + β5)x1
                                intercept      slope
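The three medium lines fall out of the fitted coefficients by simple arithmetic. A sketch with made-up coefficient values (b0 through b5 are hypothetical, not estimates from real data):

```python
# Hypothetical fitted coefficients for
# E(y) = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x1*x2 + b5*x1*x3
b0, b1, b2, b3, b4, b5 = 12.0, 0.8, 3.5, -1.2, 0.4, 0.15

lines = {
    "newspaper":  (b0,      b1),        # base level: x2 = x3 = 0
    "radio":      (b0 + b2, b1 + b4),   # set x2 = 1, x3 = 0
    "television": (b0 + b3, b1 + b5),   # set x2 = 0, x3 = 1
}
for medium, (intercept, slope) in lines.items():
    print(f"{medium}: E(y) = {intercept:.2f} + {slope:.2f}*x1")
```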

12.8: Model Building: Models with
Both Quantitative and Qualitative
Variables

   Suppose now a second-order model is
    used to evaluate the impact of
    expenditures in the three advertising
    media on sales.
   The relationship between
    expenditures, x1, and sales, y, is
    assumed to be curvilinear.

12.8: Model Building: Models with
Both Quantitative and Qualitative
Variables

   E(y) = β0 + β1x1 + β2x1²

   where x1 = advertising expenditure

•  In this model, each medium is assumed
   to have the same impact on sales.




12.8: Model Building: Models with
Both Quantitative and Qualitative
Variables
E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x3
where
x1 = advertising expenditure
x2 = 1 if radio, 0 otherwise
x3 = 1 if television, 0 otherwise
Newspaper is the base level.

In this model, the intercepts differ
but the shapes of the curves
are the same.
12.8: Model Building: Models with
Both Quantitative and Qualitative
Variables
   E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x3
          + β5x1x2 + β6x1x3 + β7x1²x2 + β8x1²x3

In this model, the response curve for each
media type is different – that is, advertising
expenditure and media type interact.



12.9: Model Building:
Comparing Nested Models

   Two models are nested if one model
    contains all the terms of the second
    model and at least one additional term.
    The more complex of the two models
    is called the complete model and the
    simpler of the two is called the reduced
    model.

12.9: Model Building:
Comparing Nested Models

   Recall the interaction model relating
    the auction price (y) of antique clocks
     to age (x1) and number of bidders (x2):
       E(y) = β0 + β1x1 + β2x2 + β3x1x2.




  12.9: Model Building:
  Comparing Nested Models

•  If the relationship between y and the
   x's may be curvilinear, a second-order
   model should be considered:
   E(y) = β0 + β1x1 + β2x2 + β3x1x2 + β4x1² + β5x2²
   (reduced model: first four terms;
    complete model: all six terms)




12.9: Model Building:
Comparing Nested Models

•  If the complete model produces a
   better fit, then the βs on the
   quadratic terms should be significant.
   E(y) = β0 + β1x1 + β2x2 + β3x1x2 + β4x1² + β5x2²
   (reduced model: first four terms;
    complete model: all six terms)

H0: β4 = β5 = 0
Ha: At least one of β4 and β5 is nonzero
         12.9: Model Building:
         Comparing Nested Models
                   F-Test for Comparing Nested Models

Reduced model: E(y) = β0 + β1x1 + … + βgxg
Complete model: E(y) = β0 + β1x1 + … + βgxg + βg+1xg+1 + … + βkxk
H0: βg+1 = βg+2 = … = βk = 0
Ha: At least one of the β parameters in H0 is nonzero
Test statistic:
    F = [(SSER – SSEC) / (k – g)] / {SSEC / [n – (k + 1)]}
      = [(SSER – SSEC) / (# βs in H0)] / MSEC

     12.9: Model Building:
     Comparing Nested Models
         F-Test for Comparing Nested Models
where
SSER = sum of squared errors for the reduced model
SSEC = sum of squared errors for the complete model
MSEC = mean square error (s²) for the complete model
k – g = number of β parameters specified in H0
k + 1 = number of β parameters in the complete model
     n = sample size
Rejection region: F > Fα, with k – g numerator and n – (k + 1)
   denominator degrees of freedom.
    12.9: Model Building:
    Comparing Nested Models
   The growth of
    carnations (y) is
    assumed to be a
    function of the
    temperature (x1)
    and the amount of
    fertilizer (x2).
   The data are shown
    in Table 12.6 in the
    text.
    12.9: Model Building:
    Comparing Nested Models
   The growth of carnations (y) is assumed to be a function
   of the temperature (x1) and the amount of fertilizer (x2).


The complete second-order model is
E(y) = β0 + β1x1 + β2x2 + β3x1x2 + β4x1² + β5x2²

The least squares prediction equation from Table 12.6 is
rounded to
 ŷ = –5,127.90 + 31.10x1 + 139.75x2 – .146x1x2 – .133x1² – 1.14x2²



    12.9: Model Building:
    Comparing Nested Models
   The growth of carnations (y) is assumed to be a function
   of the temperature (x1) and the amount of fertilizer (x2).


To test the significance of the contribution of the interaction
   and second-order terms, use
   H0: β3 = β4 = β5 = 0
   Ha: At least one of β3, β4 or β5 ≠ 0
This requires fitting the reduced model, obtained by
dropping the terms in the null hypothesis.
Results are given in Figure 12.31.
12.9: Model Building:
Comparing Nested Models
H0: β3 = β4 = β5 = 0
Ha: At least one of the β parameters in H0 is nonzero
Test statistic:
    F = [(SSER – SSEC) / (k – g)] / MSEC
      = [(6,671.50852 – 59.17832) / 3] / 2.81802 = 782.15
Rejection region: F > F.05 = 3.07

Reject the null hypothesis:
the complete model seems
to provide better predictions
than the reduced model.
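This computation is easy to reproduce as a small helper function. A sketch; the sample size n = 27 is inferred from the slide's MSEC = 59.17832/21 = 2.81802, so treat it as an assumption:

```python
def nested_f(sse_r, sse_c, k, g, n):
    """F-statistic for comparing nested regression models.

    sse_r, sse_c: SSEs of the reduced and complete models;
    k, g: number of predictors in the complete/reduced models;
    n: sample size. Numerator df = k - g, denominator df = n - (k + 1).
    """
    mse_c = sse_c / (n - (k + 1))           # MSE of the complete model
    return ((sse_r - sse_c) / (k - g)) / mse_c

# Carnation example: complete model has k = 5 terms, reduced has g = 2
F = nested_f(6671.50852, 59.17832, k=5, g=2, n=27)
print(round(F, 2))   # 782.15, matching the slide
```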

12.9: Model Building:
Comparing Nested Models

•  A parsimonious model is a general
   linear model with a small number of
   β parameters. In situations where
   two competing models have
   essentially the same predictive power
   (as determined by an F-test), choose
   the more parsimonious of the two.
•  If the models are not nested, the
   choice is more subjective, based on
   Ra², s, and an understanding of the
   theory behind the model.

    12.10: Model Building:
    Stepwise Regression

   It is often unclear which independent
    variables have a significant impact on y.
   Screening variables in an attempt to
    identify the most important ones is
    known as stepwise regression.



       12.10: Model Building:
       Stepwise Regression

Step 1: For each xi, fit E(y) = β0 + β1xi.
For each xi, test H0: β1 = 0. The xi with the largest absolute t-score
(x*) is the best one-variable predictor of y.

Step 2: Fit E(y) = β0 + β1x* + β2xj with each of
the remaining k – 1 x-variables. The x-variable with the
highest absolute value of t is retained (x'). (Some
software packages may drop x* upon re-testing.)

Step 3: Fit E(y) = β0 + β1x* + β2x' + β3xj
with the remaining k – 2 x-variables as in Step 2.
Continue until no remaining x-variables yield
significant t-scores when included in the model.
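The three steps can be sketched as a greedy forward-selection loop. This is a minimal NumPy illustration of the procedure, not production variable-selection code; the cutoff t_crit = 2.0 is a crude stand-in for a proper critical value, and the data are simulated:

```python
import numpy as np

def t_scores(X, y):
    """t-statistics for every coefficient in an ordinary least-squares fit."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    mse = (resid @ resid) / (n - p)
    cov = mse * np.linalg.inv(X.T @ X)     # covariance of the coefficient estimates
    return b / np.sqrt(np.diag(cov))

def forward_stepwise(x_cols, y, t_crit=2.0):
    """Greedy forward selection following Steps 1-3: at each step add the
    candidate with the largest absolute t-score, stopping when no
    candidate exceeds t_crit."""
    n = len(y)
    chosen, remaining = [], list(range(x_cols.shape[1]))
    while remaining:
        best, best_t = None, 0.0
        for j in remaining:
            X = np.column_stack([np.ones(n)] + [x_cols[:, c] for c in chosen + [j]])
            t = abs(t_scores(X, y)[-1])    # t-score of the newly added variable
            if t > best_t:
                best, best_t = j, t
        if best is None or best_t < t_crit:
            break
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(7)
x = rng.normal(size=(60, 4))
y = 3.0 + 2.0 * x[:, 0] - 1.5 * x[:, 2] + rng.normal(scale=0.3, size=60)
print(forward_stepwise(x, y))   # x0 and x2 should be selected
```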

12.10: Model Building:
Stepwise Regression

   Stepwise regression must be used
    with caution
       Many t-tests are conducted, leading to
        high probabilities of Type I or Type II
        errors.
       Usually, no interaction or higher-order
        terms are considered – and reality may
        not be that simple.

     12.11: Residual Analysis: Checking
     the Regression Assumptions
•  Regression analysis is based on the four
   assumptions about the random error ε
   considered earlier.
1. The mean is equal to 0.
2. The variance is equal to σ².
3. The probability distribution is a normal
   distribution.
4. Random errors are independent of one another.

12.11: Residual Analysis: Checking
the Regression Assumptions

   If these assumptions are not valid, the
    results of the regression estimation are
    called into question.
   Checking the validity of the
    assumptions involves analyzing the
    residuals of the regression.


12.11: Residual Analysis: Checking
the Regression Assumptions

•  A regression residual ε̂ is defined as
   the difference between an observed y-
   value and its corresponding predicted
   value:
     ε̂ = (y – ŷ) = y – (β̂0 + β̂1x1 + β̂2x2 + … + β̂kxk)



     12.11: Residual Analysis: Checking
     the Regression Assumptions
          Properties of the Regression Residuals
1.  The mean of the residuals is equal to 0.
        Σ(residuals) = Σ(y – ŷ) = 0
2.  The standard deviation of the residuals is equal to
    the standard deviation of the fitted regression
    model.
        Σ(residuals)² = Σ(y – ŷ)² = SSE

        s = √[Σ(residuals)² / (n – (k + 1))]
          = √[SSE / (n – (k + 1))] = √MSE
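Both properties are easy to confirm on simulated data. A minimal NumPy sketch (the data are artificial):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([2.0, 1.5, -0.7]) + rng.normal(scale=0.5, size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b                        # residuals y - yhat

sse = float(resid @ resid)
s = float(np.sqrt(sse / (n - (k + 1))))  # sqrt(MSE) of the fitted model

print(round(float(resid.sum()), 8))      # property 1: residuals sum (and mean) to 0
print(round(s, 3))                       # close to the true error sd of 0.5
```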

12.11: Residual Analysis: Checking
the Regression Assumptions

•  If the model is misspecified, the mean
   of ε will not equal 0.
       Residual analysis may reveal this
        problem.
       The home-size electricity usage example
        illustrates this.



    12.11: Residual Analysis: Checking
    the Regression Assumptions

   The plot of the first-                         while the quadratic
    order model shows                               model shows a
    a curvilinear                                   more random
    residual pattern …                              pattern.




12.11: Residual Analysis: Checking
the Regression Assumptions

A pattern in the
residual plot
may indicate a
problem with
the model.




12.11: Residual Analysis: Checking
the Regression Assumptions

   A residual larger than 3s (in absolute
    value) is considered an outlier.
    •  Outliers will have an undue influence on
       the estimates. Possible causes include:
       1. Mistakenly recorded data
       2. An observation that is for some reason truly
          different from the others
       3. Random chance

12.11: Residual Analysis: Checking
the Regression Assumptions

   A residual larger than 3s (in absolute
    value) is considered an outlier.
       Leaving an outlier that should be
        removed in the data set will produce
        misleading estimates and predictions (#1
        & #2 above).
       So will removing an outlier that actually
        belongs in the data set (#3 above).
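The 3s rule is straightforward to apply. A sketch using hypothetical residuals; k is the number of predictors in the fitted model:

```python
import numpy as np

def flag_outliers(residuals, k):
    """Flag residuals larger than 3s in absolute value,
    where s = sqrt(SSE / (n - (k + 1))) from the fitted model."""
    r = np.asarray(residuals, dtype=float)
    n = r.size
    s = np.sqrt(r @ r / (n - (k + 1)))
    return np.abs(r) > 3 * s

# 49 small hypothetical residuals plus one suspiciously large one
res = np.r_[np.tile([0.2, -0.2], 24), 0.2, 4.0]
flags = flag_outliers(res, k=1)
print(int(flags.sum()))   # 1 -- only the last residual is flagged
```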

12.11: Residual Analysis: Checking
the Regression Assumptions

   Residual plots should be
    centered on 0 and within
    ±3s of 0.
   Residual histograms
    should be relatively bell-
    shaped.
   Residual normal
    probability plots should
    display straight lines.


12.11: Residual Analysis: Checking
the Regression Assumptions

              REGRESSION ANALYSIS
                 IS ROBUST WITH
               RESPECT TO (SMALL)
              NONNORMAL ERRORS.




   Slight departures from normality will
    not seriously harm the validity of the
    estimates, but as the departure from
    normality grows, the validity falls.
12.11: Residual Analysis: Checking
the Regression Assumptions

•  If the variance of ε changes as y changes, the
   constant variance assumption is violated.




12.11: Residual Analysis: Checking
the Regression Assumptions
                                                                A first-order
                                                                 model is used
                                                                 to relate the
                                                                 salaries (y) of
                                                                 social workers
                                                                 to years of
                                                                 experience (x).




12.11: Residual Analysis: Checking
the Regression Assumptions
          E(y) = β0 + β1x
          ŷ = 11,368.72 + 2,141.38x
          R² = .787

          t (for β1) = 13.31; p-value ≈ 0
    12.11: Residual Analysis: Checking
    the Regression Assumptions
   The model seems to
    provide good
    predictions, but the
    residual plot reveals a
    non-random pattern:
•  The spread of the
   residuals increases as the
   estimated mean
   salary increases,
   violating the constant
   variance assumption.

12.11: Residual Analysis: Checking
the Regression Assumptions

•  Transforming the dependent variable
   often stabilizes the residual variance.
   Possible transformations of y:
   •  Natural logarithm
   •  Square root
   •  sin⁻¹(√y)
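A log transform is the most common fix when the spread grows with the mean. A simulated sketch; the salary-like data below are artificial, with multiplicative error by construction, which is exactly the case where logging helps:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(1, 20, size=200)                  # e.g. years of experience
# Multiplicative error => the spread of y grows with its mean
y = 15000 * np.exp(0.08 * x) * rng.lognormal(sigma=0.15, size=200)

# On the log scale the model is linear with additive, constant-variance error
X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print(round(float(b[1]), 3))   # slope is close to the true 0.08
```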



  12.11: Residual Analysis: Checking
  the Regression Assumptions


Steps in a Residual Analysis
1. Plot the residuals against each quantitative
   independent variable and look for non-random
   patterns.
2. Examine the residual plots for outliers.
3. Plot the residuals with a stem-and-leaf display,
   histogram or normal probability plot and check
   for nonnormal errors.
4. Plot the residuals against predicted y-values to
   check for nonconstant variances.

12.12: Some Pitfalls: Estimability,
Multicollinearity and Extrapolation

   Problem 1: Parameter Estimability


   Problem 2: Multicollinearity


   Problem 3: Extrapolation


   Problem 4: Correlated Errors

12.12: Some Pitfalls: Estimability,
Multicollinearity and Extrapolation

     Problem 1: Parameter Estimability



                                                                   If x does not take
                                                                   on a sufficient
                                                                   number of
                                                                   different values,
                                                                   no single unique
                                                                   line can be
                                                                   estimated.



12.12: Some Pitfalls: Estimability,
Multicollinearity and Extrapolation

           Problem 2: Multicollinearity

Multicollinearity exists when two or more of the
independent variables in a regression are correlated.

    If xi and xj move together in some way,
    finding the impact on y of a one-unit
    change in either of them holding the other
    constant will be difficult or impossible.


    12.12: Some Pitfalls: Estimability,
    Multicollinearity and Extrapolation

                Problem 2: Multicollinearity

Multicollinearity can be detected in various ways.
A simple check is to calculate the correlation coefficients (rij)
for each pair of independent variables in the model.
Any significant rij may indicate a multicollinearity problem.

    If severe multicollinearity exists, the result may be
    1. Significant F-values but insignificant t-values
    2. Signs on βs opposite to those expected
    3. Errors in β estimates, standard errors, etc.
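The simple pairwise-correlation check is a one-liner. A sketch on simulated predictors, where x2 is constructed to be nearly collinear with x1:

```python
import numpy as np

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
x2 = 0.98 * x1 + rng.normal(scale=0.1, size=100)   # nearly collinear with x1
x3 = rng.normal(size=100)                          # unrelated predictor

# Pairwise correlation matrix of the independent variables
R = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)
print(np.round(R, 2))   # the (x1, x2) entry will be near 1
```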

12.12: Some Pitfalls: Estimability,
Multicollinearity and Extrapolation

•  The Federal Trade Commission (FTC)
   ranks cigarettes according to their tar
   (x1), nicotine (x2), weight in grams (x3)
   and carbon monoxide (y) content.
•  25 data points (see Table 12.11) are
   used to estimate the model
       E(y) = β0 + β1x1 + β2x2 + β3x3.
 12.12: Some Pitfalls: Estimability,
 Multicollinearity and Extrapolation

E(y) = β0 + β1x1 + β2x2 + β3x3
ŷ = 3.202 + .963x1 – 2.63x2 – .13x3
(See Figure 12.49)
    F = 78.98, p-value < .0001
        t1 = 3.97, p-value = .0007
        t2 = –0.67, p-value = .5072
        t3 = –0.03, p-value = .9735
The negative signs on two
variables and the insignificant t-
values are suggestive of
multicollinearity.
12.12: Some Pitfalls: Estimability,
Multicollinearity and Extrapolation

   The coefficients of correlation, rij,
    provide further evidence:
       rtar, nicotine = .9766
       rtar, weight      = .4908
       rweight, nicotine = .5002
•  Each rij is significantly different from 0
   at the α = .05 level.

    12.12: Some Pitfalls: Estimability,
    Multicollinearity and Extrapolation
   Possible Responses to Problems Created
    by Multicollinearity in Regression
       Drop one or more correlated independent
        variables from the model.
       If all the xs are retained,
            Avoid making inferences about the individual β
             parameters from the t-tests.
           Restrict inferences about E(y) and future y values
            to values of the xs that fall within the range of the
            sample data.
12.12: Some Pitfalls: Estimability,
Multicollinearity and Extrapolation

        Problem 3: Extrapolation


    The data used to estimate the model
    provide information only on the range
    of values in the data set. There is no
    reason to assume that the dependent
    variable’s response will be the same
    over a different range of values.



12.12: Some Pitfalls: Estimability,
Multicollinearity and Extrapolation

        Problem 4: Correlated Errors


    If the error terms are not independent
    (a frequent problem in time series), the
    model tests and prediction intervals
    are invalid. Special techniques are
    used to deal with time series models.





				