# Regression Analysis


```					Regression Analysis

OLS Linear Regression Continued
Last Lecture:
We covered the basic layout for a Linear
Regression: Y=α+βX+ε
We introduced the concepts behind
regression via least squares
We looked at the history behind the method
We derived the equations for α and β
We then went to the lab and spent some time
calculating the equations in Excel before we
finished with residual plots
Today’s Lecture

As promised, we will cover the concepts
behind the coefficient of determination
and other measures of model fit
We will also introduce
regression with multiple independent
variables
Coefficient of
Determination or R2
Recall this scatterplot with regression
equation and R2. Also recall that in this
one-predictor case R2 is really r^2:
r(X,Y) = -0.84 and (-0.84)^2 = 0.71

[Figure: "Ice Cream Demand" scatterplot with fitted line
 Y axis: Ice Cream Cones Sold (80 to 130)
 X axis: Ice Cream Cone Cost ($0.75 to $3.50)
 Fitted line: y = -16.094x + 137.81
 R2 = 0.7078]
Conceptualizing R2

Let’s look at the Dry Erase Board for a
conceptual look at the various
components of variation in a data set

SST (total) = SSR (regression) + SSE (error)
In ANOVA notation: SST = SSB (between) + SSW (within)
Regression and Variation
Total (SST)      = \sum_{i=1}^{n} (Y_i - \bar{Y})^2
Error (SSE)      = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2
Regression (SSR) = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2

r^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}
Since R2 is the variation explained by the
regression divided by the total variation
(SSR / SST, or equivalently 1 - SSE / SST), the measure is
scaled between 0 and 1, with 1 denoting that all of
the variation is explained by the model
Other Measures of
Model Fit
Standard Error – This is a measure of the
standard deviation of the residuals about the
regression line
It is also called the Root
Mean Squared Error (RMSE)

S_{Y \cdot X} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 2}}
Coefficient of Variation (CV) – Since the RMSE
is expressed in the units of the data set,
it is difficult to compare regressions across
scales. The CV is a standardized measure of
fit that is derived from the classical CV

CV = \frac{S_{Y \cdot X}}{\bar{Y}} \times 100
Residual Analysis
and Multiple
Regression
[Figure: residual plot; X axis from 0.00 to 3.50,
 residuals on the Y axis from -15 to 15]
Recall the above residual plot from Tuesday’s lab
It is clear that much of the variation in the data set
remains unexplained by the regression model
Often it is impossible to improve upon a model because
of a lack of data or the general messiness of real-world
phenomena
However, this is not always the case; sometimes a better
fit can be obtained by adding more
information to the model
Labwork Today
Today we are going to continue our look at
the ice cream cone sales data set
There are no math calculations today; all you
have to do is look from sheet to sheet as I
explain the process of deconstructing the
variability within the data set
Next Week
I will be away on field research all of next
week
During that time, I’d like you to read
Chapter 14 of the text (pages 467-500)
Also, I have contacted Dr. Qiu and set up an
account on the Geography Server so that you
can access the computers and software that
we will be using during the semester.
Unfortunately, we will not be able to access it
until next week
From SPSS - Model 1

Variables Entered/Removed (b)
Model   Variables Entered   Variables Removed   Method
1       VAR00002 (a)        .                   Enter
a. All requested variables entered.
b. Dependent Variable: VAR00001

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .838 (a)   .703       .666                8.41148
a. Predictors: (Constant), VAR00002

ANOVA (b)
Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   1340.076         1    1340.076      18.940   .002 (a)
    Residual      566.024         8      70.753
    Total        1906.100         9
a. Predictors: (Constant), VAR00002
b. Dependent Variable: VAR00001

Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t        Sig.
1   (Constant)   137.958    8.309                                          16.604   .000
    VAR00002     -16.121    3.704              -.838                       -4.352   .002
a. Dependent Variable: VAR00001
More SPSS – Using
Variable X2

Variables Entered/Removed (b)
Model   Variables Entered   Variables Removed   Method
1       VAR00003 (a)        .                   Enter
a. All requested variables entered.
b. Dependent Variable: VAR00001

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .210 (a)   .044       -.075               15.09139
a. Predictors: (Constant), VAR00003

ANOVA (b)
Model            Sum of Squares   df   Mean Square   F      Sig.
1   Regression     84.100         1     84.100       .369   .560 (a)
    Residual     1822.000         8    227.750
    Total        1906.100         9
a. Predictors: (Constant), VAR00003
b. Dependent Variable: VAR00001

Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B         Std. Error          Beta                        t       Sig.
1   (Constant)   85.558    30.234                                          2.830   .022
    VAR00003       .674     1.110              .210                         .608   .560
a. Dependent Variable: VAR00001
More SPSS – Using
Variables X1 and X2

Variables Entered/Removed (b)
Model   Variables Entered        Variables Removed   Method
1       VAR00003, VAR00002 (a)   .                   Enter
a. All requested variables entered.
b. Dependent Variable: VAR00001

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .913 (a)   .834       .786                6.72827
a. Predictors: (Constant), VAR00003, VAR00002

ANOVA (b)
Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   1589.213         2    794.606       17.553   .002 (a)
    Residual      316.888         7     45.270
    Total        1906.100         9
a. Predictors: (Constant), VAR00003, VAR00002
b. Dependent Variable: VAR00001

Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t        Sig.
1   (Constant)   108.860    14.072                                          7.736   .000
    VAR00002     -17.350     3.009             -.902                       -5.766   .001
    VAR00003       1.179      .502              .367                        2.346   .051
a. Dependent Variable: VAR00001
More SPSS – Model 2

Variables Entered/Removed (b)
Model   Variables Entered                  Variables Removed   Method
1       VAR00004, VAR00002, VAR00003 (a)   .                   Enter
a. All requested variables entered.
b. Dependent Variable: VAR00001

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .999 (a)   .998       .998                .71543
a. Predictors: (Constant), VAR00004, VAR00002, VAR00003

ANOVA (b)
Model            Sum of Squares   df   Mean Square   F          Sig.
1   Regression   1903.029         3    634.343       1239.341   .000 (a)
    Residual        3.071         6       .512
    Total        1906.100         9
a. Predictors: (Constant), VAR00004, VAR00002, VAR00003
b. Dependent Variable: VAR00001

Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t         Sig.
1   (Constant)   117.174    1.534                                           76.409   .000
    VAR00002     -17.119     .320              -.890                       -53.482   .000
    VAR00003        .957     .054               .298                        17.665   .000
    VAR00004     -14.200     .573              -.411                       -24.761   .000
a. Dependent Variable: VAR00001

```
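
The variation components and fit measures from the slides (SST, SSR, SSE, R2, the standard error and the CV) can be sketched in a few lines of Python. This is only an illustration, not the lab's Excel workbook: the `prices` and `cones_sold` values are made-up numbers (the lecture's ice cream data set is not reproduced in the slides), and the function names are hypothetical.

```python
import math

# Hypothetical illustration data, NOT the lecture's ice cream data set:
# cone price in dollars (X) and cones sold (Y).
prices = [1.00, 1.50, 2.00, 2.50, 3.00]
cones_sold = [125.0, 112.0, 108.0, 95.0, 90.0]


def fit_simple_ols(x, y):
    """Return (alpha, beta) for Y = alpha + beta * X by least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    return alpha, beta


def variation_summary(x, y):
    """Split the total variation of Y into regression and error parts."""
    n = len(y)
    alpha, beta = fit_simple_ols(x, y)
    y_bar = sum(y) / n
    y_hat = [alpha + beta * xi for xi in x]

    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # error
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression

    std_error = math.sqrt(sse / (n - 2))  # S_{Y.X}, one predictor
    return {
        "alpha": alpha,
        "beta": beta,
        "sst": sst,
        "ssr": ssr,
        "sse": sse,
        "r2": 1.0 - sse / sst,
        "std_error": std_error,
        "cv": std_error / y_bar * 100.0,
    }


summary = variation_summary(prices, cones_sold)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For a least-squares fit with an intercept, `sst` equals `ssr + sse` up to floating-point rounding, which is why `1 - sse / sst` and `ssr / sst` give the same R2.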
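
The Model Summary figures in the SPSS output can be re-derived from the sums of squares in the ANOVA tables, which is a useful check on how the pieces fit together. The sketch below assumes n = 10 observations (implied by the total df of 9 in every table) and uses the standard adjusted R2 formula; the function name `fit_measures` and the row labels are hypothetical.

```python
import math


def fit_measures(ss_regression, ss_residual, n, k):
    """Fit statistics from ANOVA sums of squares; k = number of predictors."""
    ss_total = ss_regression + ss_residual
    df_residual = n - k - 1
    ms_residual = ss_residual / df_residual
    return {
        "r2": 1.0 - ss_residual / ss_total,
        # Adjusted R2 compares mean squares, so extra predictors are penalized.
        "adj_r2": 1.0 - ms_residual / (ss_total / (n - 1)),
        "std_error": math.sqrt(ms_residual),
        "f": (ss_regression / k) / ms_residual,
    }


N_OBSERVATIONS = 10  # assumption: total df = 9 in each ANOVA table

# (label, regression SS, residual SS, number of predictors) copied from
# the four SPSS ANOVA tables in the slides.
anova_rows = [
    ("VAR00002 only", 1340.076, 566.024, 1),
    ("VAR00003 only", 84.100, 1822.000, 1),
    ("VAR00002 + VAR00003", 1589.213, 316.888, 2),
    ("VAR00002 + VAR00003 + VAR00004", 1903.029, 3.071, 3),
]

results = {
    label: fit_measures(ssr, sse, N_OBSERVATIONS, k)
    for label, ssr, sse, k in anova_rows
}

for label, m in results.items():
    print(
        f"{label}: R2={m['r2']:.3f}  adj R2={m['adj_r2']:.3f}  "
        f"SE={m['std_error']:.5f}  F={m['f']:.3f}"
    )
```

The printed values should agree with the SPSS tables up to rounding of the published sums of squares. The VAR00003-only model shows why adjusted R2 is reported: its R2 is slightly positive, but the adjusted value is negative.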