# Econometrics Lecture 1 by ild18893

```
Econometrics
Lecture 1

A Review of OLS, Model Specification and Misspecification Testing

OUR OBJECTIVES

• To establish whether the hypothesised
economic relationship is supported by
empirical evidence

• To obtain 'good' estimates of the
unknown parameters $\beta_1$ and $\beta_2$

• To be able to test hypotheses about the
unknown parameters or to construct
confidence intervals surrounding our
estimates.

• To use our estimated regression model
for other purposes, in particular
– forecasting
– policy analysis/simulation modelling.

THE REQUIREMENTS OF A GOOD
ECONOMETRIC MODEL

•    THE MODEL MUST BE DATA ADMISSIBLE.

–   It must be logically possible for the data to have been generated by the
assumed model. This implies that the model should impose any constraints
that the data are required to satisfy (e.g. fitted values must lie within
the 0-1 interval in a model explaining proportions).

•    THE MODEL MUST BE CONSISTENT WITH SOME ECONOMIC
THEORY.

•    REGRESSORS SHOULD BE (AT LEAST) WEAKLY EXOGENOUS
WITH RESPECT TO THE PARAMETERS OF INTEREST

–   This is a requirement for valid conditioning in a single equation modelling
context, to permit efficient inference on a set of parameters of interest.

•    THE MODEL SHOULD EXHIBIT PARAMETER CONSTANCY.
–   This is essential if parameter estimates are to be meaningful, and if
forecasting or policy analysis is to be valid.

•    THE MODEL SHOULD BE DATA COHERENT.

–   The residuals should be unpredictable from their past history. This implies the
need for exhaustive misspecification testing.

•    THE MODEL SHOULD BE ABLE TO ENCOMPASS A RANGE OF
ALTERNATIVE MODELS.

Model specification
•   Step 1: Specify a statistical model that is consistent with the relevant
prior theory:

–   (i) The choice of the set of variables to include in the model.
–   (ii) The choice of functional form of the relationship (is it linear in the
variables, linear in the logarithms of the variables, etc.?)

•   Step 2: Select an estimator (OLS, GLS, GMM, IV, etc.)

•   Step 3: Estimate the regression model using the chosen estimator

•   Step 4: Test whether the assumptions made are valid (in which case the
regression model is statistically well-specified and the estimator will
have the desired properties).

•   Step 5a: If these tests show no evidence of misspecification in any relevant
form, go on to conduct statistical inference about the parameters.

•   Step 5b: If these tests show evidence of misspecification in one or more
relevant forms, then two possible courses of action seem to be implied:

–   If you are able to establish the precise form in which the model is
misspecified, then it may be possible to find an alternative
estimator which is optimal, or has other desirable properties,
when the regression model is statistically misspecified in that
particular way.
–   Regard statistical misspecification as a symptom of a flawed
model. In this case, one should search for an alternative,
well-specified model.
CLRM Assumptions

• The assumptions of the CLRM are:

1. The dependent variable is a linear function of the
set of possibly stochastic, covariance stationary
regressor variables and a random disturbance
term. The model specification is correct.
2. The set of regressors is not perfectly collinear. This
means that no regressor variable can be obtained
as an exact linear combination of any subset of the
other regressor variables.
3. The error process has zero mean. That is, $E(u_t) = 0$
for all t.
4. The error terms, $u_t$, t = 1,...,T, are serially
uncorrelated. That is, $Cov(u_t, u_s) = 0$ for all s not
equal to t.
5. The errors have a constant variance. That is,
$Var(u_t) = \sigma^2$ for all t.

6. Each regressor is asymptotically uncorrelated with
the equation disturbance, $u_t$.

•   We sometimes wish to make the following
assumption:

1. The equation disturbances are normally
distributed, for all t.

SPECIFICATION OF THE REGRESSION
MODEL

                                          TRUE MODEL
ESTIMATED REGRESSION                      $Y_t = \beta X_t + u_t$     $Y_t = \beta X_t + \gamma Z_t + u_t$
$Y_t = \beta X_t + u_t$                   A                           B
$Y_t = \beta X_t + \gamma Z_t + u_t$      C                           D

NOTES TO TABLE Z:

Consequences of estimation:
Case A: (TRUE MODEL ESTIMATED)
$\beta$ and $\sigma^2$ estimated without bias and efficiently
$SE(\hat{\beta})$ is the correct standard error, and so use of t and F tests is valid
Case D: (TRUE MODEL ESTIMATED)
$\beta$, $\gamma$ and $\sigma^2$ estimated without bias and efficiently
Standard errors are correct, so use of t and F tests is valid

Case B: (WRONG MODEL ESTIMATED DUE TO VARIABLE OMISSION)
Model misspecification due to variable omission. The false restriction that $\gamma = 0$ is being imposed.
$\hat{\beta}$ is biased. [In the special case where X and Z are uncorrelated in the sample, $\hat{\beta}$ is unbiased].
$SE(\hat{\beta})$ is biased, as is the OLS estimator of $\sigma^2$
Use of t and F tests not valid.

Case C: (WRONG MODEL ESTIMATED DUE TO INCORRECT INCLUSION OF A VARIABLE)
Model misspecification due to incorrectly included variable. The true restriction that $\gamma = 0$ is not being
imposed.
$\hat{\beta}$ is unbiased but inefficient (relative to the OLS estimator that arises when the true restriction is
imposed, as in case A). $SE(\hat{\beta})$ and the OLS estimator of $\sigma^2$ remain unbiased.
Use of t and F tests is still valid, although the tests are less powerful than in case A.
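
A sketch of where the Case B bias comes from (an illustration only, assuming the true model
$Y_t = \beta X_t + \gamma Z_t + u_t$, the estimated regression $Y_t = \beta X_t + u_t$ with no
intercept, and X and Z treated as fixed):

$\hat{\beta} = \frac{\sum X_t Y_t}{\sum X_t^2} = \beta + \gamma \frac{\sum X_t Z_t}{\sum X_t^2} + \frac{\sum X_t u_t}{\sum X_t^2}$

$E(\hat{\beta}) = \beta + \gamma \frac{\sum X_t Z_t}{\sum X_t^2}$

The bias term vanishes only if $\gamma = 0$ or $\sum X_t Z_t = 0$. The second condition is the
special case in which X and Z are uncorrelated in the sample (for variables measured as
deviations from their means).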

Testing Restrictions

• F Test

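$F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(T - k)} = \frac{(R^2_U - R^2_R)/q}{(1 - R^2_U)/(T - k)}$

where the subscripts R and U denote the restricted and unrestricted regressions,
q is the number of restrictions and k is the number of parameters in the
unrestricted regression.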

NOTE – the sample must be the same for the two
regressions

Functional Form
We know $Y = f(X)$ but not the correct
functional form

$Y = \beta_1 + \beta_2 X$                          LINEAR

$\ln(Y) = \beta_1 + \beta_2 \ln(X)$                LOGARITHMIC (LOG-LINEAR)

$Y = \beta_1 + \beta_2 \ln(X)$                     LIN-LOG (SEMI-LOG)

$\ln(Y) = \beta_1 + \beta_2 X$                     LOG-LIN (SEMI-LOG)

$1/Y = \beta_1 + \beta_2 X$                        RECIPROCAL/RATIO FORM
$Y = \beta_1 + \beta_2 (1/X)$
$Y = 1/(\beta_1 + \beta_2 X)$

$Y = \beta_1 + \beta_2 X + \beta_3 X^2 + \dots$    POLYNOMIAL

Ramsey’s RESET test

•   First step: run the regression by
OLS and save the residuals and
fitted values
•   Second step: run the auxiliary
regression

$\hat{u}_t = \alpha_1 + \alpha_2 X_{2t} + \dots + \alpha_k X_{kt} + \delta \hat{y}_t^2 + \varepsilon_t$

$H_0: \delta = 0$
$H_a: \delta \neq 0$

Disturbance Term

• ABSENCE OF DISTURBANCE TERM SERIAL
CORRELATION

• CONSTANCY OF DISTURBANCE TERM
VARIANCE (HOMOSCEDASTICITY)

• NORMALITY OF DISTURBANCE TERM

ABSENCE OF DISTURBANCE
TERM SERIAL CORRELATION

• THE DURBIN-WATSON (DW) TEST
– Tests only for first-order autocorrelation
– Valid only for models without lagged dependent variables

$DW = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2}$

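• DURBIN'S h STATISTIC

$h = (1 - DW/2) \sqrt{T / [1 - T \cdot V(\hat{\lambda})]}$

where $V(\hat{\lambda})$ is the estimated variance of the coefficient on the
lagged dependent variable.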

• GODFREY'S LAGRANGE MULTIPLIER (LM)
TESTS
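
A sketch of why the DW statistic is centred on 2 (an approximation, assuming T is large
enough that $\sum \hat{u}_t^2 \approx \sum \hat{u}_{t-1}^2$):

$\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2 = \sum \hat{u}_t^2 + \sum \hat{u}_{t-1}^2 - 2 \sum \hat{u}_t \hat{u}_{t-1}$

$DW \approx 2(1 - \hat{\rho})$, where $\hat{\rho} = \frac{\sum_{t=2}^{T} \hat{u}_t \hat{u}_{t-1}}{\sum_{t=1}^{T} \hat{u}_t^2}$

So DW is close to 2 when there is no first-order autocorrelation, approaches 0 under
strong positive autocorrelation and approaches 4 under strong negative autocorrelation.
The term $(1 - DW/2)$ in Durbin's h is this same estimate of $\rho$.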

Responses to presence
of Autocorrelation
•   A Type 1 error has occurred (you
have incorrectly rejected a true
null).
•   Your chosen regression model is
misspecified in some way,
perhaps because a variable has
been incorrectly omitted, there
are insufficient lags in the model,
or the functional form is incorrect.
•   The disturbance term is actually
serially correlated, in which case
GLS estimation is appropriate.

CONSTANCY OF DISTURBANCE
TERM VARIANCE
(HOMOSCEDASTICITY)

• Types of heteroscedasticity

$Var(u_t) = \sigma_t^2 = \alpha Z_t$

$Var(u_t) = \sigma_t^2 = \alpha Z_t^2$

$Var(u_t) = \sigma_t^2 = \alpha_1 + \alpha_2 Z_t$

$Var(u_t) = \sigma_t^2 = \alpha_1 + \alpha_2 Z_t^2$

Multiplicative heteroscedasticity
$Var(u_t) = \sigma_t^2 = \exp(\alpha_1 + \alpha_2 Z_t) = \exp(\alpha_1) \exp(\alpha_2 Z_t)$

Discrete switching heteroscedasticity

$Var(u_t) = \sigma_1^2$,  $t = 1, \dots, T_1$
$Var(u_t) = \sigma_2^2$,  $t = T_1 + 1, \dots, T$

Parameters of the Model
• PARAMETER CONSTANCY OVER THE
WHOLE SAMPLE PERIOD

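Chow 1: used if there are enough observations in each of the two subsamples

$F = \frac{[RSS_T - (RSS_1 + RSS_2)]/k}{(RSS_1 + RSS_2)/(T - 2k)} \sim F(k, T - 2k)$ if $H_0$ is true

where $RSS_T$ is the residual sum of squares from the full-sample regression,
$RSS_1$ and $RSS_2$ are those from the two subsample regressions, and k is the
number of parameters.

Chow 2 (predictive failure test): used if there are
not enough observations in one of the subsamples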