Econometrics Lecture 1
A Review of OLS, Model Specification and Misspecification Testing

OUR OBJECTIVES
• To establish whether the hypothesised economic relationship is supported by empirical evidence.
• To obtain 'good' estimates of the unknown parameters $\beta_1$ and $\beta_2$.
• To be able to test hypotheses about the unknown parameters, or to construct confidence intervals around our estimates.
• To use our estimated regression model for other purposes, in particular:
  – forecasting
  – policy analysis/simulation modelling.

THE REQUIREMENTS OF A GOOD ECONOMETRIC MODEL
• THE MODEL MUST BE DATA ADMISSIBLE.
  – It must be logically possible for the data to have been generated by the assumed model. This implies that the model should impose any constraints that the data are required to satisfy (e.g. they must lie within the 0-1 interval in a model explaining proportions).
• THE MODEL MUST BE CONSISTENT WITH SOME ECONOMIC THEORY.
• REGRESSORS SHOULD BE (AT LEAST) WEAKLY EXOGENOUS WITH RESPECT TO THE PARAMETERS OF INTEREST.
  – This is a requirement for valid conditioning in a single-equation modelling context, to permit efficient inference on a set of parameters of interest.
• THE MODEL SHOULD EXHIBIT PARAMETER CONSTANCY.
  – This is essential if parameter estimates are to be meaningful, and if forecasting or policy analysis is to be valid.
• THE MODEL SHOULD BE DATA COHERENT.
  – The residuals should be unpredictable from their past history. This implies the need for exhaustive misspecification testing.
• THE MODEL SHOULD BE ABLE TO ENCOMPASS A RANGE OF ALTERNATIVE MODELS.

Model specification
• Step 1: Specify a statistical model that is consistent with the relevant prior theory:
  – (i) The choice of the set of variables to include in the model.
  – (ii) The choice of functional form of the relationship (is it linear in the variables, linear in the logarithms of the variables, etc.?)
• Step 2: Select an estimator (OLS, GLS, GMM, IV, etc.).
• Step 3: Estimate the regression model using the chosen estimator.
• Step 4: Test whether the assumptions made are valid, in which case the regression model is statistically well specified and the estimator will have the desired properties.
• Step 5a: If these tests show no evidence of misspecification in any relevant form, go on to conduct statistical inference about the parameters.
• Step 5b: If these tests show evidence of misspecification in one or more relevant forms, then two possible courses of action seem to be implied:
  – If you are able to establish the precise form in which the model is misspecified, then it may be possible to find an alternative estimator which is optimal, or has other desirable qualities, when the regression model is statistically misspecified in that particular way.
  – Regard statistical misspecification as a symptom of a flawed model. In this case, one should search for an alternative, well-specified regression model, and so return to Step 1.

CLRM Assumptions
• The assumptions of the CLRM are:
  1. The dependent variable is a linear function of the set of possibly stochastic, covariance stationary regressor variables and a random disturbance term. The model specification is correct.
  2. The set of regressors is not perfectly collinear. This means that no regressor variable can be obtained as an exact linear combination of any subset of the other regressor variables.
  3. The error process has zero mean. That is, $E(u_t) = 0$ for all $t$.
  4. The error terms $u_t$, $t = 1, \dots, T$, are serially uncorrelated. That is, $Cov(u_t, u_s) = 0$ for all $s \neq t$.
  5. The errors have a constant variance. That is, $Var(u_t) = \sigma^2$ for all $t$.
  6. Each regressor is asymptotically uncorrelated with the equation disturbance, $u_t$.
• We sometimes wish to make the following assumption:
  1. The equation disturbances are normally distributed, for all $t$.
ASSUMPTIONS ABOUT THE SPECIFICATION OF THE REGRESSION MODEL

                                          TRUE MODEL
ESTIMATED REGRESSION                      $Y_t = \beta X_t + u_t$     $Y_t = \beta X_t + \gamma Z_t + u_t$
$Y_t = \beta X_t + u_t$                   A                           B
$Y_t = \beta X_t + \gamma Z_t + u_t$      C                           D

Notes to table (consequences of estimation):
• Case A (TRUE MODEL ESTIMATED): $\beta$ and $\sigma^2$ are estimated without bias and efficiently. The standard error is correct, so use of t and F tests is valid.
• Case D (TRUE MODEL ESTIMATED): $\beta$, $\gamma$ and $\sigma^2$ are estimated without bias and efficiently. Standard errors are correct, so use of t and F tests is valid.
• Case B (WRONG MODEL ESTIMATED DUE TO VARIABLE OMISSION): Model misspecification due to variable omission. The false restriction that $\gamma = 0$ is being imposed. $\hat\beta$ is biased. [In the special case where X and Z are uncorrelated in the sample, $\hat\beta$ is unbiased.] The standard error is biased, as is the OLS estimator of $\sigma^2$. Use of t and F tests is not valid.
• Case C (WRONG MODEL ESTIMATED DUE TO INCORRECT INCLUSION OF A VARIABLE): Model misspecification due to an incorrectly included variable. The true restriction that $\gamma = 0$ is not being imposed. $\hat\beta$ is unbiased but inefficient (relative to the OLS estimator that arises when the true restriction is imposed, as in Case A). The OLS estimator of $\sigma^2$ and the standard errors remain unbiased, so t and F tests are still valid, although less powerful than in Case A.

Testing Restrictions
• F test

$$F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(T - k)} = \frac{RSS_R - RSS_U}{RSS_U} \cdot \frac{T - k}{q}$$

NOTE: the sample must be the same for the two regressions.

Functional Form
We know that $Y = f(X)$, but not the correct functional form.

$Y = \beta_1 + \beta_2 X$                              LINEAR
$\ln(Y) = \beta_1 + \beta_2 \ln(X)$                    LOGARITHMIC (LOG-LINEAR)
$Y = \beta_1 + \beta_2 \ln(X)$                         LIN-LOG (SEMI-LOG)
$\ln(Y) = \beta_1 + \beta_2 X$                         LOG-LIN (SEMI-LOG)
$\frac{1}{Y} = \beta_1 + \beta_2 X$                    RECIPROCAL/RATIO FORM
$\frac{1}{Y} = \beta_1 + \beta_2 \frac{1}{X}$
Y 1 2 X
$Y = \beta_1 + \beta_2 X + \beta_3 X^2 + \dots$        POLYNOMIAL

Ramsey's RESET test
• First step: run the regression by OLS and save the residuals and fitted values.
• Second step: run the auxiliary regression

$$\hat{u}_t = \alpha_1 + \alpha_2 X_{2t} + \dots + \alpha_k X_{kt} + \delta \hat{y}_t^2 + \varepsilon_t$$

and test $H_0: \delta = 0$ against $H_a: \delta \neq 0$.

Assumptions about Equation Disturbance Term
• ABSENCE OF DISTURBANCE TERM SERIAL CORRELATION
• CONSTANCY OF DISTURBANCE TERM VARIANCE (HOMOSCEDASTICITY)
• NORMALITY OF DISTURBANCE TERM

ABSENCE OF DISTURBANCE TERM SERIAL CORRELATION
• THE DURBIN-WATSON (DW) TEST
  – Detects first-order autocorrelation only.
  – Valid only for models without lagged dependent variables.

$$DW = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2}$$

• DURBIN'S h STATISTIC

$$h = \left(1 - \frac{DW}{2}\right) \sqrt{\frac{T}{1 - T\,V(\hat\phi)}}$$

where $V(\hat\phi)$ is the estimated variance of the coefficient on the lagged dependent variable.
• GODFREY'S LAGRANGE MULTIPLIER (LM) TESTS

Responses to presence of Autocorrelation
A significant test result may mean that:
• a Type 1 error has occurred (you have incorrectly rejected a true null);
• your chosen regression model is misspecified in some way, perhaps because a variable has been incorrectly omitted, there are insufficient lags in the model, or the functional form is incorrect;
• the disturbance term is actually serially correlated.
  – In this case, estimate by GLS.

CONSTANCY OF DISTURBANCE TERM VARIANCE (HOMOSCEDASTICITY)
• Types of heteroscedasticity

Additive heteroscedasticity
$$Var(u_t) = \sigma_t^2 = \alpha Z_t$$
$$Var(u_t) = \sigma_t^2 = \alpha Z_t^2$$
$$Var(u_t) = \sigma_t^2 = \alpha_1 + \alpha_2 Z_t$$
$$Var(u_t) = \sigma_t^2 = \alpha_1 + \alpha_2 Z_t^2$$

Multiplicative heteroscedasticity
$$Var(u_t) = \sigma_t^2 = \exp(\alpha_1 + \alpha_2 Z_t) = \exp(\alpha_1)\exp(\alpha_2 Z_t)$$

Discrete switching heteroscedasticity
$$Var(u_t) = \sigma_1^2, \quad t = 1, \dots, T_1$$
$$Var(u_t) = \sigma_2^2, \quad t = T_1 + 1, \dots, T$$

Assumptions about the Parameters of the Model
• PARAMETER CONSTANCY OVER THE WHOLE SAMPLE PERIOD

Chow 1: used when both subsamples contain enough observations to estimate the model.

$$\frac{(RSS_0 - (RSS_1 + RSS_2))/k}{(RSS_1 + RSS_2)/(T - 2k)} \sim F(k, T - 2k) \quad \text{if } H_0 \text{ is true}$$

Here $RSS_0$ is the residual sum of squares from the full sample, and $RSS_1$ and $RSS_2$ are those from the two subsamples.

Chow 2: predictive failure test, used when one subsample has too few observations to estimate the model.

$$\frac{(RSS_0 - RSS_1)/m}{RSS_1/(T - m - k)} \sim F(m, T - m - k) \quad \text{if } H_0 \text{ is true}$$

Here $RSS_1$ comes from the first $T - m$ observations and $m$ is the number of observations in the short subsample.
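The Chow 1 and Chow 2 statistics above can be computed directly from residual sums of squares. The sketch below is illustrative only: it assumes a simple regression with an intercept and one regressor (so k = 2), uses simulated data with a hypothetical slope change from 0.5 to 2.0 halfway through the sample, and the helper names `rss`, `chow1` and `chow2` are not from the lecture. It returns the test statistics only. To reach a decision they would still need to be compared with critical values from the F(k, T - 2k) and F(m, T - m - k) distributions, which this sketch does not compute.

```python
import random


def rss(x, y):
    """Residual sum of squares from OLS of y on an intercept and x (k = 2)."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))


def chow1(x, y, split, k=2):
    """Chow 1 statistic for a break after the first `split` observations."""
    T = len(x)
    rss0 = rss(x, y)                    # full sample (restricted)
    rss1 = rss(x[:split], y[:split])    # first subsample
    rss2 = rss(x[split:], y[split:])    # second subsample
    return ((rss0 - (rss1 + rss2)) / k) / ((rss1 + rss2) / (T - 2 * k))


def chow2(x, y, m, k=2):
    """Chow 2 (predictive failure) statistic; the last m observations are held out."""
    T = len(x)
    rss0 = rss(x, y)                    # full sample
    rss1 = rss(x[:T - m], y[:T - m])    # first T - m observations only
    return ((rss0 - rss1) / m) / (rss1 / (T - m - k))


# Simulated data (hypothetical): 100 observations, possible break after t = 50.
random.seed(1)
x = [random.uniform(0, 10) for _ in range(100)]
u = [random.gauss(0, 1) for _ in range(100)]
y_stable = [1.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]
y_break = [1.0 + (0.5 if t < 50 else 2.0) * xi + ui
           for t, (xi, ui) in enumerate(zip(x, u))]

f_stable = chow1(x, y_stable, 50)   # parameters constant in the simulation
f_break = chow1(x, y_break, 50)     # slope shifts at the split point
f_pred = chow2(x, y_stable, 5)      # predictive failure test, m = 5

print("Chow 1, stable parameters:", f_stable)
print("Chow 1, slope break:", f_break)
print("Chow 2, stable parameters:", f_pred)
```

Both statistics are non-negative by construction, because the full-sample residual sum of squares can never be smaller than the sum from the less restricted fits.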