# Conclusion to Bivariate Linear Regression


```
Conclusion to Bivariate Linear Regression

Economics 224 – Notes for November 19, 2008
Reporting regression results
• Equation format OR table format. For each of these:
– Make sure you define x and y, with the units for each also
provided. Make these definitions easy to find in your report.
– Report the sample size and units of observation.
– Report the standard errors or t-statistics associated with
each of the regression coefficients.
– Report the coefficient of determination, along with its
statistical significance. ANOVA table could be provided for
a fuller report.
– Each step involves reorganizing the results from Excel or
other statistical programs to one of these conventions.
– Don’t report too many or too few decimals!
Equation format
• Income and alcohol example. x is mean family income per
capita in dollars, 1986 and y is alcohol consumption in litres of
alcohol per capita for those aged 15 or over, 1985-86. n = 10
observations from the ten provinces of Canada.
• The regression equation, with standard errors reported in
brackets, is as follows. R² for this equation is 0.625 with a
P-value of 0.0065.
    ŷ = 0.835 + 0.276 x
       (2.332)  (0.076)
• Alternatively, the t statistic could be reported in the brackets –
make sure you indicate whether it is the standard errors or t
statistics that are reported in the brackets.
Table format
Dependent variable is wages and salaries

Variable         Estimated Coefficient   Standard Error   Probability Value
Constant                -13,493              23,211             0.568
Yrs schooling             4,181               1,606             0.017

R² = 0.253, P = 0.017
(+ Other equation test-statistics)

Note: the standard error column could instead report t-statistics.
Presenting Multiple Results
Dependent variable is wages and salaries

Variable                Equation I          Equation II         Equation III

Constant                 -13,493                 ….                   ….
                         (23,211)
Yrs schooling              4,181                 ….                   ….
                          (1,606) †
R²                         0.253
Significance               0.017                 …                    …
(+ Other equation test-statistics)
Note: Standard errors in brackets. * – significant at the 1% level,
† – significant at the 5% level, ‡ – significant at the 10% level
Residual analysis
• The t-test and F-test theoretically only work if the
assumptions about the error term are met.
– E(ε) = 0;
– Variance(ε) = σ² is constant for each x;
– Values of ε are independent of each other;
– ε is normally distributed.
• If these assumptions are not met:
– Must correct how our model is constructed.
– Or, must come up with a new estimator other than OLS
and work on correcting the problems –> Econ 324 and up.
• Can’t see true ε’s –> must look at our estimates:
– Estimated residuals are ei = yi – ŷi.
– Best way: plot them versus xi or ŷi using Excel or another
program.
Residuals (e) for years of schooling (x) and wages and salaries regression
ŷ = -13,493 + 4,181x,  e = y – ŷ

[Figure: scatter plot of residuals e (vertical axis, -50,000 to 40,000)
against years of schooling x (horizontal axis, 0 to 25)]

   x          e
  17       4914.407
  12     -21180.1
  12      30819.88
  11     -22999
  15     -11223.4
  15     -13223.4
  19       4052.216
  15      -2223.4
  20       9871.121
  16     -25404.5
  18       3233.312
  11      15500.98
  14      27457.69
  12      -3680.12
  14.5   -41132.9
  13.5    19548.24
  15      28276.6
  13       1138.789
  10       7682.075
  12.5   -17770.7
  15      -8223.4
  12.3    14565.56
Last slide
• Example of regression of wages and salaries
on years of schooling. Appears to satisfy
assumptions, although it may have
heteroskedasticity. That is, variance of
residuals may not be equal for all values of x.
Regression of alcohol consumption on income: residuals
ei = actual – predicted alcohol consumption

[Figure: residuals (vertical axis, -1.5 to 1.5) plotted against
x (income) (horizontal axis, 25 to 39)]

Appears to have a reasonable scatter of residuals,
with no obvious violation of assumptions.
Consumption function. Example of serial correlation (autocorrelation).

[Figure: Plot of Residuals of Consumption with GDP on horizontal axis.
Vertical axis: residuals (y - predicted y), -15,000 to 15,000;
horizontal axis: GDP, 800,000 to 1,150,000]
Solutions
• If the assumptions are violated, tests and results are suspect.
• Violations of assumptions may affect some
estimates more than others.
• Solutions
– May mean we have missing explanatory variables
or wrong equation format.
– May mean that ε does not meet assumptions.
– Use different estimators.
• Examine in detail in courses in Econometrics.
Transformations
• Relationship may not be linear. There are
many different possibilities here. Two
examples are provided in other documents:
– Population growth – exponential growth.
– Earnings and age – parabolic relationship.
Some Cautions
• If we conclude β1 ≠ 0
–> doesn’t imply x causes y.
• Could still be random relationships.
• Need some theoretical argument too.

• If we conclude β1 is statistically different from 0
–> doesn’t mean a linear relationship exists for
sure.
– Must watch out for non-linear relationships.
Next day
• Begin multiple regression (ASW, Ch. 13).
• Assignment 6 has now been posted.
Remember that this assignment is optional.

```
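The reporting and residual steps in the notes can be sketched in code. The following is a minimal illustration, not the course's own method: it uses the standard bivariate OLS formulas to produce the quantities the notes say to report (coefficients, standard errors, t-statistics, R², sample size) plus the residuals ei = yi – ŷi for plotting. The function names `fit_bivariate` and `equation_format` are hypothetical, and the data in the demo are invented for illustration only; they are not the wages or alcohol data from the notes. P-values are not computed, since they require the t distribution from a statistics package.

```python
import math


def fit_bivariate(x, y):
    """OLS fit of y = b0 + b1*x, returning the quantities usually reported."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept

    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]  # e_i = y_i - y_hat_i

    sse = sum(e ** 2 for e in residuals)                # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)            # total variation
    s2 = sse / (n - 2)                                  # estimate of sigma^2

    se_b1 = math.sqrt(s2 / sxx)
    se_b0 = math.sqrt(s2 * (1 / n + x_bar ** 2 / sxx))

    return {
        "n": n,
        "b0": b0,
        "b1": b1,
        "se_b0": se_b0,
        "se_b1": se_b1,
        "t_b0": b0 / se_b0,        # assumes a non-perfect fit (se > 0)
        "t_b1": b1 / se_b1,
        "r2": 1 - sse / sst,
        "residuals": residuals,
    }


def equation_format(fit, digits=3):
    """One-line 'equation format' summary with standard errors labelled."""
    sign = "+" if fit["b1"] >= 0 else "-"
    return (
        f"y_hat = {fit['b0']:.{digits}f} {sign} {abs(fit['b1']):.{digits}f} x   "
        f"(SE: {fit['se_b0']:.{digits}f}, {fit['se_b1']:.{digits}f}); "
        f"R^2 = {fit['r2']:.{digits}f}, n = {fit['n']}"
    )


if __name__ == "__main__":
    # Invented illustrative data: x = years of schooling, y = earnings ($000s).
    x = [10, 12, 12, 14, 15, 16, 18, 20]
    y = [22.0, 30.5, 27.0, 41.0, 38.5, 47.0, 52.5, 61.0]
    fit = fit_bivariate(x, y)
    print(equation_format(fit))
    # Residuals listed against x, ready to be plotted as in the notes.
    for xi, ei in zip(x, fit["residuals"]):
        print(f"{xi:>5}  {ei:>8.3f}")
```

With an intercept in the model, the residuals always sum to zero, so a plot against x or ŷ is what reveals patterns such as unequal variance or serial correlation.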