Econometrics Lecture 1 by ild18893


									  Econometrics
    Lecture 1


A Review of OLS, Model Specification
 and Misspecification Testing


    OUR OBJECTIVES

• To establish whether the hypothesised
  economic relationship is supported by
  empirical evidence

• To obtain 'good' estimates of the
  unknown parameters $\beta_1$ and $\beta_2$

• To be able to test hypotheses about the
  unknown parameters or to construct
  confidence intervals surrounding our
  estimates.

• To use our estimated regression model
  for other purposes, including particularly
   – forecasting
   – policy analysis/simulation modelling.


    THE REQUIREMENTS OF A GOOD
             ECONOMETRIC MODEL

•    THE MODEL MUST BE DATA ADMISSIBLE.

      –   It must be logically possible for the data to have been generated by the model
          that is being assumed to generate the data. This implies that the model
          should impose any constraints that the data are required to satisfy (e.g. they
          must lie within the 0-1 interval in a model explaining proportions).

•    THE MODEL MUST BE CONSISTENT WITH SOME ECONOMIC
     THEORY.

•    REGRESSORS SHOULD BE (AT LEAST) WEAKLY EXOGENOUS
     WITH RESPECT TO THE PARAMETERS OF INTEREST

      –   This is a requirement for valid conditioning in a single equation modelling
          context, to permit efficient inference on a set of parameters of interest.

•    THE MODEL SHOULD EXHIBIT PARAMETER CONSTANCY.
      –   This is essential if parameter estimates are to be meaningful, and if
          forecasting or policy analysis is to be valid.

•    THE MODEL SHOULD BE DATA COHERENT.

      –   The residuals should be unpredictable from their past history. This implies the
          need for exhaustive misspecification testing.

•    THE MODEL SHOULD BE ABLE TO ENCOMPASS A RANGE OF
     ALTERNATIVE MODELS.




              Model specification
•   Step 1: Specify a statistical model that is consistent with the relevant
    prior theory:

     –   (i) The choice of the set of variables to include in the model.
     –   (ii) The choice of functional form of the relationship (is it linear in the
         variables, linear in the logarithms of the variables, etc.?)

•   Step 2: Select an estimator (OLS, GLS, GMM, IV, etc.)

•   Step 3: Estimate the regression model using the chosen estimator

•   Step 4: Test whether the assumptions made are valid (in which case the
    regression model is statistically well specified and the estimator will
    have the desired properties).

•   Step 5a: If these tests show no evidence of misspecification in any relevant
    form, go on to conduct statistical inference about the parameters.

•   Step 5b: If these tests show evidence of misspecification in one or more
    relevant forms, then two possible courses of action seem to be implied:

     –   If you are able to establish the precise form in which the model is
         misspecified, then it may be possible to find an alternative
         estimator which is optimal, or has other desirable properties,
         when the regression model is statistically misspecified in that
         particular way.
     –   Regard statistical misspecification as a symptom of a flawed
         model. In this case, one should search for an alternative, well-
         specified regression model, and so return to Step 1.




          CLRM Assumptions

• The assumptions of the CLRM are:

1. The dependent variable is a linear function of the
   set of possibly stochastic, covariance stationary
   regressor variables and a random disturbance
   term. The model specification is correct.
2. The set of regressors is not perfectly collinear. This
   means that no regressor variable can be obtained
   as an exact linear combination of any subset of the
   other regressor variables.
3. The error process has zero mean. That is, $E(u_t) = 0$
   for all t.
4. The error terms, $u_t$, $t = 1, \dots, T$, are serially
   uncorrelated. That is, $\mathrm{Cov}(u_t, u_s) = 0$ for all s not
   equal to t.
5. The errors have a constant variance. That is,
   $\mathrm{Var}(u_t) = \sigma^2$ for all t.

6. Each regressor is asymptotically uncorrelated with
   the equation disturbance, $u_t$.

•   We sometimes wish to make the following
    assumption:

1. The equation disturbances are normally
   distributed, for all t.

    ASSUMPTIONS ABOUT THE
SPECIFICATION OF THE REGRESSION
             MODEL


                                          TRUE MODEL
ESTIMATED REGRESSION                      $Y_t = \beta X_t + u_t$    $Y_t = \beta X_t + \gamma Z_t + u_t$
$Y_t = \beta X_t + u_t$                   A                          B
$Y_t = \beta X_t + \gamma Z_t + u_t$      C                          D

NOTES TO TABLE Z:

Consequences of estimation:
Case A: (TRUE MODEL ESTIMATED)
$\beta$ and $\sigma^2$ are estimated without bias and efficiently.
$SE(\hat{\beta})$ is the correct standard error, so use of t and F tests is valid.
Case D: (TRUE MODEL ESTIMATED)
$\beta$, $\gamma$ and $\sigma^2$ are estimated without bias and efficiently.
Standard errors are correct, so use of t and F tests is valid.

Case B: (WRONG MODEL ESTIMATED DUE TO VARIABLE OMISSION)
Model misspecification due to variable omission. The false restriction that $\gamma = 0$ is being imposed.
$\hat{\beta}$ is biased. [In the special case where X and Z are uncorrelated in the sample, $\hat{\beta}$ is unbiased.]
$SE(\hat{\beta})$ is biased, as is the OLS estimator of $\sigma^2$.
Use of t and F tests is not valid.

Case C: (WRONG MODEL ESTIMATED DUE TO INCORRECT INCLUSION OF A VARIABLE)
Model misspecification due to an incorrectly included variable. The true restriction that $\gamma = 0$ is not being
imposed.
$\hat{\beta}$ is unbiased but inefficient (relative to the OLS estimator that arises when the true restriction is
imposed, as in Case A). $SE(\hat{\beta})$ and the OLS estimator of $\sigma^2$ remain unbiased,
so use of t and F tests is still valid, although the tests have less power than in Case A.
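
The omitted-variable result in Case B can be checked with a small worked example. The sketch below is illustrative only: the coefficient values, the data and the function names are hypothetical, and the disturbance is set to zero so that the arithmetic is exact. Because the models in the table have no intercept, the sample condition that removes the bias in this sketch is sum(X*Z) = 0.

```python
def slope_through_origin(x, y):
    """OLS slope for a regression with no intercept: sum(x*y) / sum(x^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

BETA, GAMMA = 2.0, 3.0                 # hypothetical true coefficients
X = [1.0, 2.0, 3.0, 4.0]
Z_correlated = [1.0, 1.0, 2.0, 2.0]    # sum(X*Z) = 17, so Z moves with X
Z_orthogonal = [2.0, -1.0, 0.0, 0.0]   # sum(X*Z) = 0 in this sample

def short_regression_slope(z):
    # Generate Y from the true model (no disturbance), then omit Z when estimating.
    y = [BETA * xi + GAMMA * zi for xi, zi in zip(X, z)]
    return slope_through_origin(X, y)

b_correlated = short_regression_slope(Z_correlated)   # BETA + GAMMA * 17/30
b_orthogonal = short_regression_slope(Z_orthogonal)   # BETA + GAMMA * 0/30
print(b_correlated, b_orthogonal)
```

In this sketch the estimate from the short regression equals the true coefficient plus GAMMA times sum(X*Z)/sum(X^2), so it recovers 2 only when the omitted variable is orthogonal to X in the sample.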




  Testing Restrictions

• F Test

  \[
  F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(T - k)}
    = \frac{RSS_R - RSS_U}{RSS_U} \cdot \frac{T - k}{q}
  \]

  NOTE: the sample must be the same for the two
  regressions.
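
As a rough illustration, the F statistic can be computed directly from the two residual sums of squares. In this sketch the function name and the numbers are hypothetical; `rss_r` and `rss_u` are the restricted and unrestricted residual sums of squares, `q` is the number of restrictions, `T` the number of observations and `k` the number of parameters in the unrestricted regression.

```python
def f_statistic(rss_r: float, rss_u: float, q: int, T: int, k: int) -> float:
    """F statistic for q restrictions, built from restricted and unrestricted RSS."""
    if q <= 0 or T - k <= 0:
        raise ValueError("need q > 0 and T > k")
    return ((rss_r - rss_u) / q) / (rss_u / (T - k))

# Hypothetical values: (120 - 100)/2 = 10 and 100/(54 - 4) = 2
F = f_statistic(rss_r=120.0, rss_u=100.0, q=2, T=54, k=4)
print(F)
```

The result would be compared with the critical value of the F(q, T - k) distribution.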




           Functional Form
We know that Y = f(X), but not the correct
functional form.

$Y = \beta_1 + \beta_2 X$                          LINEAR

$\ln(Y) = \beta_1 + \beta_2 \ln(X)$                LOGARITHMIC (LOG-LINEAR)

$Y = \beta_1 + \beta_2 \ln(X)$                     LIN-LOG (SEMI-LOG)

$\ln(Y) = \beta_1 + \beta_2 X$                     LOG-LIN (SEMI-LOG)

$\frac{1}{Y} = \beta_1 + \beta_2 X$, that is $Y = \frac{1}{\beta_1 + \beta_2 X}$, or
$Y = \beta_1 + \beta_2 \frac{1}{X}$                RECIPROCAL/RATIO FORM

$Y = \beta_1 + \beta_2 X + \beta_3 X^2 + \dots$    POLYNOMIAL
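
Each of these forms is linear in the parameters once the variables are transformed, so OLS can still be applied. The sketch below illustrates this for the logarithmic form, using hypothetical, noise-free data generated from Y = 3 X^0.5; the helper name `simple_ols` is an assumption made for this example.

```python
import math

def simple_ols(x, y):
    """Intercept and slope of a simple regression of y on a constant and x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sxy / sxx
    return ybar - b2 * xbar, b2

X = [1.0, 2.0, 4.0, 8.0, 16.0]
Y = [3.0 * x ** 0.5 for x in X]          # hypothetical constant-elasticity data

# Regress ln(Y) on ln(X): the slope is the elasticity of Y with respect to X.
b1, b2 = simple_ols([math.log(x) for x in X], [math.log(y) for y in Y])
print(math.exp(b1), b2)
```

With data of this kind the slope recovers the exponent 0.5 and exp(b1) recovers the scale factor 3, up to floating-point error.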




         Ramsey’s RESET test


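•   First step: run the regression by
    OLS and save the residuals and
    fitted values.
•   Second step: run the auxiliary
    regression

    \[
    \hat{u}_t = \beta_1 + \beta_2 X_{2t} + \dots + \beta_k X_{kt} + \delta \hat{y}_t^2 + \varepsilon_t
    \]

    and test

    \[
    H_0: \delta = 0 \qquad H_a: \delta \neq 0
    \]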
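
The two steps can be sketched in plain Python for a model with a single regressor. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names, the data and the fixed "noise" pattern are all hypothetical, and the null hypothesis is tested with an F statistic for the single restriction that the coefficient on the squared fitted values is zero.

```python
def ols(X, y):
    """OLS coefficients from the normal equations, solved by Gaussian elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k + 1):
                A[r][j] -= f * A[c][j]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (A[i][k] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

def fitted(X, b):
    return [sum(xi * bi for xi, bi in zip(row, b)) for row in X]

def reset_f(x, y):
    """RESET F statistic for y regressed on a constant and x."""
    T = len(y)
    X1 = [[1.0, xi] for xi in x]
    yhat = fitted(X1, ols(X1, y))                        # step 1: fitted values
    u = [yi - fi for yi, fi in zip(y, yhat)]             # step 1: residuals
    X2 = [[1.0, xi, fi ** 2] for xi, fi in zip(x, yhat)]
    e = [ui - fi for ui, fi in zip(u, fitted(X2, ols(X2, u)))]  # step 2
    rss_r = sum(ui ** 2 for ui in u)   # restricted: coefficient on yhat^2 set to 0
    rss_u = sum(ei ** 2 for ei in e)   # unrestricted auxiliary regression
    return (rss_r - rss_u) / (rss_u / (T - 3))

x = [float(t) for t in range(1, 13)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.2, 0.3, -0.1]
y_lin = [1 + 2 * xi + n for xi, n in zip(x, noise)]
y_quad = [1 + 2 * xi + 0.5 * xi ** 2 + n for xi, n in zip(x, noise)]

F_lin, F_quad = reset_f(x, y_lin), reset_f(x, y_quad)
print(F_lin, F_quad)
```

When the true relationship is quadratic but a linear model is fitted, the squared fitted values pick up the neglected curvature and the statistic becomes very large.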




    Assumptions about Equation
         Disturbance Term


• ABSENCE OF DISTURBANCE TERM SERIAL
  CORRELATION

• CONSTANCY OF DISTURBANCE TERM
  VARIANCE (HOMOSCEDASTICITY)


• NORMALITY OF DISTURBANCE TERM




   ABSENCE OF DISTURBANCE
   TERM SERIAL CORRELATION


• THE DURBIN-WATSON (DW) TEST
   – Tests only for first order autocorrelation
   – Valid only for models without lagged dependent variables

     \[
     DW = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2}
     \]




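• DURBIN'S h STATISTIC

     \[
     h = \left(1 - \frac{DW}{2}\right) \sqrt{\frac{T}{1 - T\,V(\hat{\beta})}}
     \]

  where $V(\hat{\beta})$ is the estimated variance of the coefficient
  on the lagged dependent variable.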


• GODFREY'S LAGRANGE MULTIPLIER (LM)
  TESTS
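
The DW and h statistics are simple functions of the OLS residuals. The sketch below is illustrative only: the function names and the residual series are hypothetical, and `var_coef` stands for the estimated variance of the coefficient on the lagged dependent variable.

```python
import math

def durbin_watson(residuals):
    """Sum of squared first differences of the residuals over their sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(u ** 2 for u in residuals)
    return num / den

def durbin_h(dw, T, var_coef):
    """Durbin's h; undefined when T * var_coef >= 1."""
    denom = 1 - T * var_coef
    if denom <= 0:
        raise ValueError("h is not defined when T * var_coef >= 1")
    return (1 - dw / 2) * math.sqrt(T / denom)

dw_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])             # sign flips every period
dw_positive = durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])  # long runs of one sign
h = durbin_h(dw=1.6, T=100, var_coef=0.0036)
print(dw_negative, dw_positive, h)
```

Values of DW well below 2 point towards positive first order autocorrelation and values well above 2 towards negative autocorrelation; a value near 2 gives no evidence of either.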




Responses to presence
  of Autocorrelation
•   A Type 1 error has occurred (you
    have incorrectly rejected a true
    null).
•   Your chosen regression model is
    misspecified in some way,
    perhaps because a variable has
    been incorrectly omitted, there
    are insufficient lags in the model,
    or the functional form is incorrect.
•   The disturbance term is actually
    serially correlated, in which case
    GLS is an appropriate estimator.



 CONSTANCY OF DISTURBANCE
      TERM VARIANCE
    (HOMOSCEDASTICITY)


• Types of heteroscedasticity

   Additive heteroscedasticity

   \[ \mathrm{Var}(u_t) = \sigma_t^2 = \alpha Z_t \]

   \[ \mathrm{Var}(u_t) = \sigma_t^2 = \alpha Z_t^2 \]

   \[ \mathrm{Var}(u_t) = \sigma_t^2 = \alpha_1 + \alpha_2 Z_t \]

   \[ \mathrm{Var}(u_t) = \sigma_t^2 = \alpha_1 + \alpha_2 Z_t^2 \]

   Multiplicative heteroscedasticity

   \[ \mathrm{Var}(u_t) = \sigma_t^2 = \exp(\alpha_1 + \alpha_2 Z_t) = \exp(\alpha_1) \exp(\alpha_2 Z_t) \]

   Discrete switching heteroscedasticity

   \[ \mathrm{Var}(u_t) = \sigma_1^2, \quad t = 1, \dots, T_1 \]
   \[ \mathrm{Var}(u_t) = \sigma_2^2, \quad t = T_1 + 1, \dots, T \]
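
To make the three families concrete, the sketch below evaluates the disturbance variance implied by each form. The function names, parameter names and values are hypothetical and chosen only for illustration.

```python
import math

def additive_variance(z, a1, a2, power=1):
    """Variance linear in Z (power=1) or in Z squared (power=2)."""
    return a1 + a2 * z ** power

def multiplicative_variance(z, a1, a2):
    """Variance that is an exponential function of Z, so it is always positive."""
    return math.exp(a1 + a2 * z)

def switching_variance(t, T1, var_1, var_2):
    """One variance for observations 1..T1 and another for T1+1..T."""
    return var_1 if t <= T1 else var_2

v_add = additive_variance(2.0, a1=1.0, a2=0.5, power=2)      # 1 + 0.5 * 2^2
v_mult = multiplicative_variance(2.0, a1=0.1, a2=0.3)
v_early = switching_variance(t=10, T1=20, var_1=1.0, var_2=4.0)
v_late = switching_variance(t=21, T1=20, var_1=1.0, var_2=4.0)
print(v_add, v_mult, v_early, v_late)
```

One practical difference between the forms is that the additive variance can become negative for some parameter values, while the multiplicative form cannot.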




      Assumptions about the
     Parameters of the Model
• PARAMETER CONSTANCY OVER THE
  WHOLE SAMPLE PERIOD


Chow 1: used when there are enough observations in both subsamples

\[
\frac{(RSS_0 - (RSS_1 + RSS_2))/k}{(RSS_1 + RSS_2)/(T - 2k)} \sim F(k, T - 2k) \quad \text{if } H_0 \text{ is true}
\]

Chow 2: predictive failure test, used when there are
not enough observations in one of the subsamples

\[
\frac{(RSS_0 - RSS_1)/m}{RSS_1/(T - m - k)} \sim F(m, T - m - k) \quad \text{if } H_0 \text{ is true}
\]
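
Both Chow statistics can be computed from residual sums of squares alone. In the sketch below the function names and numbers are hypothetical. `rss0` comes from the regression over the whole sample of `T` observations, `rss1` and `rss2` from the subsample regressions, `k` is the number of parameters, and `m` is taken to be the number of observations in the short subsample, so that `rss1` in the second test comes from the first `T - m` observations.

```python
def chow_breakpoint_f(rss0, rss1, rss2, T, k):
    """Chow 1: compares the pooled fit with separate fits on the two subsamples."""
    return ((rss0 - (rss1 + rss2)) / k) / ((rss1 + rss2) / (T - 2 * k))

def chow_predictive_f(rss0, rss1, T, m, k):
    """Chow 2: predictive failure test for the last m observations."""
    return ((rss0 - rss1) / m) / (rss1 / (T - m - k))

# Hypothetical values: (150 - 100)/2 = 25 and 100/(44 - 4) = 2.5
F1 = chow_breakpoint_f(rss0=150.0, rss1=60.0, rss2=40.0, T=44, k=2)
# Hypothetical values: (150 - 120)/4 = 7.5 and 120/(44 - 4 - 2) = 120/38
F2 = chow_predictive_f(rss0=150.0, rss1=120.0, T=44, m=4, k=2)
print(F1, F2)
```

Each statistic would be compared with the critical value of the F distribution with the degrees of freedom given in the corresponding formula.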




								