notes1 by blue123

									Econometrics II. Lecture Notes 1

ESTIMATING SYSTEMS OF
EQUATIONS BY OLS, GLS and
GMM


 1. Introduction: SUR and Linear Panel Data models.

 2. System OLS estimation of Multivariate Linear Systems.

 3. GLS and FGLS estimation of Multivariate Linear Systems.

 4. Examples

    (a) The SUR model.
    (b) Panel Data Model

 5. A General Linear System of Equations with Endogenous Regressors

 6. Generalized Method of Moments Estimation

    (a) The System 2SLS Estimator
    (b) Optimal Estimates
    (c) The 3SLS (FIV) Estimator

 7. Testing using GMM

 8. Optimal instruments


Econometrics II-1. Systems of Equations. 2009/10                   UC3M. Master in Economic Analysis


1.1      Introduction

We consider different methods of estimation of systems of linear equations: system OLS,
GLS and FGLS. GLS is more efficient at the cost of more stringent assumptions, while
system OLS may have interpretations in terms of single equation OLS.

EX. 1 (SUR: Seemingly Unrelated Regressions) Population model of G linear equations
$$
\begin{aligned}
y_1 &= x_1'\beta_1 + u_1 \\
y_2 &= x_2'\beta_2 + u_2 \\
&\ \ \vdots \\
y_G &= x_G'\beta_G + u_G
\end{aligned}
\tag{1.1}
$$
where $x_g$ and $\beta_g$ are $K_g \times 1$, $g = 1, \dots, G$. The $x_g$ might be the same for each equation, but could have different dimensions. The regressions are seemingly unrelated because the parameter vectors $\beta_g$ are different. However, there could be correlation across the errors $u_g$.


  Random draws:
$$ y_{ig} = x_{ig}'\beta_g + u_{ig}, \qquad i = 1, \dots, n. $$
Inference is carried out as $n$ tends to infinity.

For the study of the properties of different estimates of $\beta_g$ we need assumptions on the relationship between the explanatory variables $(x_1, x_2, \dots, x_G)$ and the unobservables $u_g$. If the system is structural (without omitted variables, errors-in-variables or simultaneity), then we can assume that
$$ E[u_g \mid x_1, x_2, \dots, x_G] = 0, \qquad g = 1, \dots, G. \tag{1.2} $$

(1.2) implies that $u_g$ is uncorrelated with the explanatory variables in all equations.
• If the regressors are the same for all equations, then the assumption reduces to
$$ E[u_g \mid x] = E[u_g \mid x_g] = 0. $$
• If the $x_g$ are not the same, then the variables excluded from equation g have no effect on $y_g$ once $x_g$ has been taken into account:
$$ E[y_g \mid x_1, x_2, \dots, x_G] = E[y_g \mid x_g] = x_g'\beta_g, \qquad g = 1, \dots, G. $$






EX. 2 (Panel Data Models) For each cross-section unit we observe data on the full set of variables for T periods:

$$ y_t = x_t'\beta + u_t, \qquad t = 1, \dots, T, \tag{1.3} $$

where $x_t$ and $\beta$ are $K \times 1$.

The model is static if all the variables in $x_t$ are contemporaneous (no lagged variables).

• In (1.1) each equation explains a different dependent variable for the same cross-section unit.
• In (1.3) there is only a single dependent variable, but it is observed, together with the explanatory variables, in several periods. However, the statistical properties of the estimates can be studied under the same set-up.


  Different types of exogeneity:


   • Contemporaneous exogeneity:

$$ E[u_t \mid x_t] = 0, \qquad t = 1, \dots, T. \tag{1.4} $$

   • Strict exogeneity:
$$ E[u_t \mid x_1, x_2, \dots, x_T] = 0, \qquad t = 1, \dots, T, $$
     which is stronger than contemporaneous exogeneity and, together with the model (1.3), implies $E[y_t \mid x_1, x_2, \dots, x_T] = E[y_t \mid x_t] = x_t'\beta$.


Strict exogeneity may fail if the regressors contain lagged endogenous variables, $x_t = (1, y_{t-1})'$, or in the presence of a finite distributed lag model.

Which condition is assumed determines the consistency of the estimation procedures for $\beta$ and the validity of the inference rules.






1.2      System OLS Estimation of Multivariate Linear
         Systems

We have IID cross-section observations

$$ \{(X_i, y_i) : i = 1, 2, \dots, n\}, \qquad X_i \ (K \times G), \quad y_i \ (G \times 1), $$

of the model

$$ y = X'\beta + u, \qquad \beta \ (K \times 1). \tag{1.5} $$

The idea is to use the form of the covariance matrix of u to obtain more efficient estimates than single equation methods.


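EX. 1 (SUR) $y = (y_1, y_2, \dots, y_G)'$, $u = (u_1, u_2, \dots, u_G)'$,
$$
\underset{K \times G}{X} =
\begin{pmatrix}
x_1 & 0 & \cdots & 0 \\
0 & x_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & x_G
\end{pmatrix},
\qquad
\underset{K \times 1}{\beta} =
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_G \end{pmatrix},
$$
where $K = K_1 + K_2 + \cdots + K_G$.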


EX. 2 (Panel Data Models) Here

$$ \underset{K \times T}{X} = (x_1, x_2, \dots, x_T), $$

so all equations are restricted to share the same parameter vector.






ASS. 1 (Orthogonality)
                                          E [Xu] = 0.


This Assumption is similar to the orthogonality condition for OLS estimation of single
equations, though it has different meanings for each application in terms of the compo-
sition of X. In most applications some elements of X are equal to 1, so ASS. 1 implies
that E [u] = 0.


EX. 1 (SUR) Here $Xu = (x_1'u_1, x_2'u_2, \dots, x_G'u_G)'$, so ASS. 1 holds iff

$$ E[x_g u_g] = 0, \qquad g = 1, 2, \dots, G, $$

but does not require $x_g$ and $u_h$ to be orthogonal for $h \neq g$.


EX. 2 (Panel Data Models) Here
$$ Xu = \sum_{t=1}^{T} x_t u_t, $$

so a natural sufficient condition for ASS. 1 to hold is

$$ E[x_t u_t] = 0, \qquad t = 1, 2, \dots, T. $$

Like (1.4), this allows $x_t$ and $u_s$ to be correlated when $s \neq t$, and is therefore weaker than strict exogeneity.


ASS. 1 is the weakest assumption under which we can obtain consistent estimates of $\beta$ in a regression framework. Much stronger is the assumption that

$$ E[u \mid X] = 0 \tag{1.6} $$

which implies that every element of u and every element of X are uncorrelated.






Under ASS. 1 the vector $\beta$ satisfies

$$ E[X(y - X'\beta)] = 0, \tag{1.7} $$

or $E[XX']\beta = E[Xy]$. Since $XX'$ is a (random) symmetric, positive semidefinite (psd) matrix, $E[XX']$ is also a $K \times K$ symmetric psd matrix. To be able to estimate $\beta$ we need it to be the only vector that satisfies (1.7).


ASS. 2 (Rank Condition)

$$ E[XX'] \ \text{is nonsingular (has rank } K\text{)}. $$


Then, under ASS. 1 and ASS. 2 we can write
$$ \beta = E[XX']^{-1} E[Xy], $$

so these assumptions identify the vector $\beta$. By the analogy principle we can estimate $\beta$ by the sample analogue
$$ \hat\beta_n = E_n[XX']^{-1} E_n[Xy], $$
where $E_n[\cdot]$ denotes the sample average over $i = 1, \dots, n$. This is the System Ordinary Least Squares (SOLS) Estimator, which is well defined provided $E_n[XX']$ is positive definite. Consistency of the SOLSE follows by taking probability limits and applying the WLLN:


THM. 1 (Consistency of SOLS) Under ASS. 1 and 2

$$ \hat\beta_n \to_p \beta $$

as $n \to \infty$.


Depending on the structure of β and X the solved OLS problem will be different.






EX. 1 (SUR) Here
$$
E_n[XX'] =
\begin{pmatrix}
E_n[x_1 x_1'] & 0 & \cdots & 0 \\
0 & E_n[x_2 x_2'] & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & E_n[x_G x_G']
\end{pmatrix},
\qquad
E_n[Xy] =
\begin{pmatrix} E_n[x_1 y_1] \\ E_n[x_2 y_2] \\ \vdots \\ E_n[x_G y_G] \end{pmatrix}.
$$

Therefore SOLS can be written as $\hat\beta_n = (\hat\beta_{n1}', \hat\beta_{n2}', \dots, \hat\beta_{nG}')'$, where each $\hat\beta_{ng}$ is the single-equation OLS estimator of the g-th equation: System OLS of a SUR model is equivalent to OLS equation by equation (without extra restrictions on the parameter vector).
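
The equivalence can be checked numerically. The following is a minimal sketch on simulated data, assuming G = 2 equations with 2 and 3 regressors; the data-generating values, the array layout `Xt` (which stores $X_i'$ for each i) and all variable names are illustrative assumptions, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, G = 500, 2
K_g = [2, 3]                      # regressors per equation (hypothetical sizes)
K = sum(K_g)

# Simulated regressors x_{ig} (first column is a constant) and correlated errors.
x = [np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))]) for k in K_g]
beta = [np.array([1.0, 2.0]), np.array([-1.0, 0.5, 3.0])]
u = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 2.0]], size=n)
y = np.column_stack([x[g] @ beta[g] + u[:, g] for g in range(G)])   # n x G

# Xt[i] = X_i' is the G x K block-diagonal design of unit i.
Xt = np.zeros((n, G, K))
start = 0
for g in range(G):
    Xt[:, g, start:start + K_g[g]] = x[g]
    start += K_g[g]

# System OLS: beta_hat = E_n[X X']^{-1} E_n[X y].
A_n = np.einsum('igk,igl->kl', Xt, Xt) / n
b_n = np.einsum('igk,ig->k', Xt, y) / n
beta_sols = np.linalg.solve(A_n, b_n)

# Single-equation OLS, stacked in the same order as beta.
beta_eq = np.concatenate(
    [np.linalg.lstsq(x[g], y[:, g], rcond=None)[0] for g in range(G)]
)

print(np.allclose(beta_sols, beta_eq))
```

Because $E_n[XX']$ is block diagonal, solving the system splits into G separate normal equations, which is why the two vectors agree up to rounding even though the errors are correlated across equations.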


EX. 2 (Panel Data Models) Here
$$ E_n[XX'] = E_n\Big[\sum_{t=1}^{T} x_t x_t'\Big], \qquad E_n[Xy] = E_n\Big[\sum_{t=1}^{T} x_t y_t\Big], $$

so
$$ \hat\beta_n = \Big(\sum_{i=1}^{n}\sum_{t=1}^{T} x_{it} x_{it}'\Big)^{-1} \sum_{i=1}^{n}\sum_{t=1}^{T} x_{it} y_{it}, $$

which is called the Pooled Ordinary Least Squares (POLS) Estimator because it is equivalent to running OLS on all observations over both indexes, i and t (pooling or stacking all observations of, e.g., $y_{it}$ in a single vector of dimension $Tn \times 1$).

The POLSE is consistent under the orthogonality condition (1.4) and the rank condition
$$ \operatorname{rank} E\Big[\sum_{t=1}^{T} x_t x_t'\Big] = K. $$
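
As an illustration of the pooling interpretation, the sketch below computes the double-sum formula and an ordinary least-squares fit on the stacked nT observations and compares them. The simulated panel (n = 200, T = 4, K = 3) and all names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, K = 200, 4, 3

# x[i, t] is the K-vector x_it; the first regressor is a constant.
x = rng.normal(size=(n, T, K))
x[:, :, 0] = 1.0
beta = np.array([1.0, -2.0, 0.5])
y = x @ beta + rng.normal(size=(n, T))          # y[i, t] = x_it' beta + u_it

# Double-sum formula for POLS.
Sxx = np.einsum('itk,itl->kl', x, x)            # sum_i sum_t x_it x_it'
Sxy = np.einsum('itk,it->k', x, y)              # sum_i sum_t x_it y_it
beta_pols = np.linalg.solve(Sxx, Sxy)

# OLS on the stacked (nT x K) design and (nT,) outcome.
beta_stacked = np.linalg.lstsq(x.reshape(n * T, K), y.reshape(n * T), rcond=None)[0]

print(np.allclose(beta_pols, beta_stacked))
```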



  In the general system (1.5), System OLS may not have an interpretation in terms of equation-by-equation or pooled OLS, e.g. when we impose cross-equation restrictions in the SUR model.

  Unbiasedness (conditional on X) follows under the additional assumption that $\operatorname{rank}(E_n[XX']) = K$, together with $E[u \mid X] = 0$ (which implies ASS. 1).






THM. 2 (CAN of SOLS) Under ASS. 1 and 2
$$ \sqrt{n}\,(\hat\beta_n - \beta) \to_d N(0, A^{-1} B A^{-1}), $$

where

$$ A := E[XX'], \qquad B := E[Xuu'X'], $$

if the elements of $Xuu'X'$ have finite expected absolute value.


  AVar Estimation. Consistent estimation of A is simple by means of

$$ \hat A_n := E_n[XX'], $$

while a consistent estimate of B can be developed by the analogy principle, since $E_n[Xuu'X'] \to_p B := E[Xuu'X']$. Because the u are not observed, we use instead the SOLS residuals:
$$ \hat u_n = y - X'\hat\beta_n = u - X'(\hat\beta_n - \beta), $$
and given that the expectation of $Xuu'X'$ is finite, then

$$ \hat B_n := E_n[X \hat u_n \hat u_n' X'] \to_p B. $$

Therefore $\mathrm{AVar}\big(\sqrt{n}\,(\hat\beta_n - \beta)\big)$ is consistently estimated by

$$ \hat V_n := \hat A_n^{-1} \hat B_n \hat A_n^{-1}, \tag{1.8} $$

which is a robust variance matrix estimator, because it does not require particular assumptions on the second moments of u:


   • The unconditional variance matrix $\Omega := V[u] = E[uu']$ is unrestricted, allowing:
        - in a SUR system: for cross-equation correlation and different variances in each equation.
        - in Panel data models: for arbitrary serial correlation and time-varying variances in the disturbances.

   • The conditional variance matrix $V[u \mid X]$ can depend on X in any way.
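
A minimal sketch of the estimator (1.8) follows. The function name `sols_robust`, the array layout (`Xt[i]` holds $X_i'$, shape G x K) and the simulated heteroskedastic, within-unit correlated errors are assumptions made for illustration only.

```python
import numpy as np

def sols_robust(Xt, y):
    """System OLS with the robust AVar estimate V_n = A_n^{-1} B_n A_n^{-1}.

    Xt : (n, G, K) array, Xt[i] = X_i'.   y : (n, G) array.
    Returns beta_hat (K,) and V_n (K, K), the estimated AVar of sqrt(n)(beta_hat - beta).
    """
    n = Xt.shape[0]
    A = np.einsum('igk,igl->kl', Xt, Xt) / n               # E_n[X X']
    beta_hat = np.linalg.solve(A, np.einsum('igk,ig->k', Xt, y) / n)
    resid = y - Xt @ beta_hat                              # u_hat_i, one row per unit
    s = np.einsum('igk,ig->ik', Xt, resid)                 # rows are X_i u_hat_i
    B = s.T @ s / n                                        # E_n[X u_hat u_hat' X']
    A_inv = np.linalg.inv(A)
    return beta_hat, A_inv @ B @ A_inv

# Illustration: G = 3 equations sharing K = 2 parameters, errors that are
# correlated within unit and whose variance depends on the regressor.
rng = np.random.default_rng(2)
n, G, K = 400, 3, 2
Xt = rng.normal(size=(n, G, K))
Xt[:, :, 0] = 1.0
c = rng.normal(size=(n, 1))                                # common shock within unit i
u = (c + rng.normal(size=(n, G))) * (1.0 + np.abs(Xt[:, :, 1]))
y = Xt @ np.array([1.0, 2.0]) + u

beta_hat, V_hat = sols_robust(Xt, y)
se = np.sqrt(np.diag(V_hat) / n)                           # standard errors of beta_hat
print(beta_hat, se)
```

Dividing `V_hat` by n converts the estimated asymptotic variance of $\sqrt{n}(\hat\beta_n - \beta)$ into an approximate variance for $\hat\beta_n$ itself.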






   In some cases it can be desirable to impose more structure on the conditional and unconditional covariance matrix of u to simplify its estimation, such as $\Omega := E[uu'] = V[u \mid X]$.


  For testing
$$ H_0 : R\beta = r, \qquad R \ (q \times K), $$

we can use the Wald statistic
$$ W_n = n\,(R\hat\beta_n - r)'\,(R \hat V_n R')^{-1}\,(R\hat\beta_n - r), $$

which, under $H_0$, converges in distribution to a $\chi^2_q$.


In the SUR model this is the most general form of testing cross equation restrictions
among the parameters in different equations.
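
The Wald statistic is a direct matrix computation once $\hat\beta_n$ and $\hat V_n$ are available. The sketch below uses a hypothetical helper `wald_statistic` and made-up inputs (two coefficients, the single restriction $\beta_1 = \beta_2$) purely to show the arithmetic.

```python
import numpy as np

def wald_statistic(beta_hat, V_hat, R, r, n):
    """W_n = n (R b - r)' (R V R')^{-1} (R b - r).

    V_hat is an estimate of the AVar of sqrt(n)(beta_hat - beta), as in (1.8).
    """
    R = np.atleast_2d(R)
    diff = R @ beta_hat - r
    return float(n * diff @ np.linalg.solve(R @ V_hat @ R.T, diff))

# Made-up inputs: test H0: beta_1 - beta_2 = 0 (q = 1 restriction).
beta_hat = np.array([1.0, 2.0])
V_hat = np.eye(2)
R = np.array([[1.0, -1.0]])
r = np.array([0.0])

W = wald_statistic(beta_hat, V_hat, R, r, n=100)
print(W)   # 100 * (-1)^2 / 2 = 50.0
```

With q = 1, the value would be compared with a $\chi^2_1$ critical value (about 3.84 at the 5% level).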






1.2.1     GLS and FGLS Estimation of Linear Systems

If we strengthen ASS. 1 and add assumptions on the conditional variance matrix of u, $V[u \mid X]$, we can do better than OLS by means of GLS. As in single equation GLS, the idea is to transform the model into a system of equations where the error has a scalar variance-covariance matrix, multiplying (1.5) by $\Omega^{-1/2}$:

$$ \Omega^{-1/2} y = \Omega^{-1/2} X'\beta + \Omega^{-1/2} u, \quad \text{or} \quad y^* = X^{*\prime}\beta + u^*, $$

where $E[u^* u^{*\prime}] = I_G$.

Then the Generalized Least Squares (GLS) Estimator of $\beta$ is $E_n[X^* X^{*\prime}]^{-1} E_n[X^* y^*]$, i.e.,
$$ \hat\beta_n^{GLS} := E_n[X \Omega^{-1} X']^{-1} E_n[X \Omega^{-1} y]. $$

For consistent GLS estimation we need that each element of u is uncorrelated with each
element of X :


ASS. 3 (GLS Orthogonality)

                                     E [X ⊗ u] = 0.


In practice, one element of X is 1, so ASS. 3 implies that E [u] = 0. This is stronger
than ASS. 1 and a sufficient condition is the zero mean conditional expectation (1.6).

For GLS the key element is the second-moment matrix of u, $\Omega := E[uu']$, so in place of ASS. 2 we impose a not too restrictive weighted version:


ASS. 4 (GLS Rank Condition) $\Omega$ is positive definite and $E[X \Omega^{-1} X']$ is nonsingular (has rank K).






  Consistency of GLS under ASS. 3 and ASS. 4. We can write that

$$ \hat\beta_n^{GLS} - \beta = E_n[X \Omega^{-1} X']^{-1} E_n[X \Omega^{-1} u]. $$

By the WLLN, $E_n[X \Omega^{-1} X'] \to_p E[X \Omega^{-1} X']$. By ASS. 4 and Slutsky's Theorem
$$ E_n[X \Omega^{-1} X']^{-1} \to_p A^{-1}, $$

where
$$ A := E[X \Omega^{-1} X']. $$
To show that $E_n[X \Omega^{-1} u] \to_p 0$, we can use the WLLN and the fact that $E[X \Omega^{-1} u] = 0$: since each element of u is uncorrelated with each element of X, it is also uncorrelated with any linear combination of X, such as $X \Omega^{-1}$:

$$ \operatorname{vec}\big(E[X \Omega^{-1} u]\big) = E[u' \otimes X]\,\operatorname{vec}(\Omega^{-1}) = 0. $$

However, consistency may fail under ASS. 1, because $E[Xu] = 0$ does not imply $E[X \Omega^{-1} u] = 0$, except for particular $\Omega$: the transformation by $\Omega^{-1/2}$ induces some correlation between $X^*$ and $u^*$.


  Asymptotic Normality of GLS: we need ASS. 3 and 4 and some extra moment conditions:
$$ \sqrt{n}\,(\hat\beta_n^{GLS} - \beta) = E_n[X \Omega^{-1} X']^{-1} \sqrt{n}\, E_n[X \Omega^{-1} u]. $$
By the CLT,
$$ \sqrt{n}\, E_n[X \Omega^{-1} u] \to_d N(0, B), $$
where
$$ B := E[X \Omega^{-1} uu' \Omega^{-1} X'] $$
(provided the expectation exists), so it is immediate to obtain that
$$ \sqrt{n}\,(\hat\beta_n^{GLS} - \beta) = A^{-1} \sqrt{n}\, E_n[X \Omega^{-1} u] + o_p(1) \to_d N(0, A^{-1} B A^{-1}). $$

In general $A \neq B$, so we do not obtain the usual GLS AVar, $A^{-1}$.






   Feasible GLS. The GLS estimate requires knowing $\Omega$ up to scale, $\Omega = \sigma^2 C$, where C is a known $G \times G$ pd matrix and $\sigma^2$ can be an unknown constant. Since C is generally unknown we need some feasible procedure, replacing $\Omega$ by some consistent estimate $\hat\Omega_n$. The first-order asymptotic properties of FGLS are equivalent to those of GLS under ASS. 3 and 4.

Given that
$$ \hat\Omega_n \to_p \Omega $$
as $n \to \infty$, the Feasible GLS (FGLS) Estimate is
$$ \hat\beta_n^{FGLS} := E_n[X \hat\Omega_n^{-1} X']^{-1} E_n[X \hat\Omega_n^{-1} y]. $$

This estimate is generally known as the SUR Estimate, because it exploits the possible correlation among the components of u.

  For $\hat\Omega_n$ we can use the residuals of a first estimation by SOLS,
$$ \hat\Omega_n := E_n[\hat u_n \hat u_n'] \tag{1.9} $$
where $\hat u_n := y - X'\hat\beta_n$ and $\hat\beta_n$ is the OLSE, consistent under ASS. 1 and 2 (and therefore under ASS. 3 and 4).

Sometimes the elements of $\Omega$ are restricted, so $\hat\Omega_n$ can exploit these restrictions, but if the restrictions are false the estimates would be inconsistent in general.
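
The two-step procedure (SOLS residuals, then the weighted estimator) can be sketched as follows. The function `fgls`, the array layout (`Xt[i]` holds $X_i'$) and the simulated two-equation SUR system are illustrative assumptions; the unrestricted estimate (1.9) is used for $\hat\Omega_n$.

```python
import numpy as np

def fgls(Xt, y):
    """Two-step FGLS. Xt: (n, G, K) with Xt[i] = X_i'; y: (n, G)."""
    n = Xt.shape[0]
    # Step 1: system OLS and its residuals.
    A = np.einsum('igk,igl->kl', Xt, Xt)
    beta_sols = np.linalg.solve(A, np.einsum('igk,ig->k', Xt, y))
    resid = y - Xt @ beta_sols
    # Step 2: Omega_hat = E_n[u_hat u_hat'], as in (1.9).
    omega_hat = resid.T @ resid / n
    omega_inv = np.linalg.inv(omega_hat)
    # Step 3: beta = (sum_i X_i W X_i')^{-1} sum_i X_i W y_i with W = Omega_hat^{-1}.
    A_w = np.einsum('igk,gh,ihl->kl', Xt, omega_inv, Xt)
    b_w = np.einsum('igk,gh,ih->k', Xt, omega_inv, y)
    return np.linalg.solve(A_w, b_w), omega_hat

# Simulated SUR system: two equations, different regressors, correlated errors.
rng = np.random.default_rng(3)
n = 2000
x1 = np.column_stack([np.ones(n), rng.normal(size=n)])
x2 = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = np.column_stack([x1 @ beta_true[:2] + u[:, 0], x2 @ beta_true[2:] + u[:, 1]])

Xt = np.zeros((n, 2, 4))       # block-diagonal X_i'
Xt[:, 0, :2] = x1
Xt[:, 1, 2:] = x2

beta_fgls, omega_hat = fgls(Xt, y)
print(beta_fgls)
print(omega_hat)
```

The factor 1/n in the sample averages cancels between the two terms of the estimator, so plain sums are used in steps 1 and 3.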


THM. 3 (CAN of FGLS) Under ASS. 3 and 4
$$ \sqrt{n}\,(\hat\beta_n^{FGLS} - \beta) \to_d N(0, A^{-1} B A^{-1}), $$

where
$$ A := E[X \Omega^{-1} X'], \qquad B := E[X \Omega^{-1} uu' \Omega^{-1} X'], $$
if the elements of $Xuu'X'$ have finite expected absolute value.

  Therefore $\hat\beta_n^{FGLS}$ and $\hat\beta_n^{GLS}$ are equivalent for asymptotic inference: it does not matter that $\Omega$ has to be estimated, though, undoubtedly, it will affect finite sample performance. The estimate of the AVar of $\hat\beta_n^{FGLS}$ is $n^{-1} \hat A_n^{-1} \hat B_n \hat A_n^{-1}$, which is valid under ASS. 3 and 4, where, with $\tilde u_n := y - X'\hat\beta_n^{FGLS}$,
$$ \hat A_n := E_n[X \hat\Omega_n^{-1} X'], \qquad \hat B_n := E_n[X \hat\Omega_n^{-1} \tilde u_n \tilde u_n' \hat\Omega_n^{-1} X']. $$




   Asymptotic Variance of FGLS. We have not yet shown that FGLS is better in any sense than SOLS, and it is less robust, since it needs more assumptions for consistency and asymptotic normality. However, under an additional system homoskedasticity assumption it is more efficient than SOLS:

ASS. 5 (System Homoskedasticity)
$$ E[X \Omega^{-1} uu' \Omega^{-1} X'] = E[X \Omega^{-1} X'], $$
where $\Omega := E[uu']$.

When G = 1, ASS. 5 is equivalent to the usual conditional homoskedasticity assumption for single equation OLS. If $\Omega$ is diagonal and X has the structure of SUR or panel data, ASS. 5 implies a kind of conditional homoskedasticity in each equation. A sufficient, but not necessary, condition is that $E[uu' \mid X] = E[uu'] = \Omega$.

THM. 4 (Efficiency of FGLS) Under ASS. 3-5 the asymptotic variance of the FGLS estimator is $A^{-1}$.

• This is the usual formula for the asymptotic variance of FGLS, and in this case we can use the previous estimate $\hat A_n$ (but this is not robust to conditional heteroskedasticity in u).

• Also, under the previous assumptions, the FGLS estimator is more efficient than the SOLS estimator; in fact, FGLS is more efficient than any other estimator that uses the orthogonality conditions $E[X \otimes u] = 0$.
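
The step from the sandwich form to $A^{-1}$ can be written out explicitly. The lines below are only a restatement of ASS. 5 and of the sufficient condition $E[uu' \mid X] = \Omega$ given above, using the matrices A and B of THM. 3; no new assumption is introduced.

```latex
\begin{align*}
\text{Under ASS. 5:}\quad
  B &= E[X\Omega^{-1}uu'\Omega^{-1}X'] = E[X\Omega^{-1}X'] = A, \\
  A^{-1}BA^{-1} &= A^{-1}AA^{-1} = A^{-1}. \\[6pt]
\text{If } E[uu' \mid X] = \Omega:\quad
  E[X\Omega^{-1}uu'\Omega^{-1}X']
    &= E\big[X\Omega^{-1}\,E[uu' \mid X]\,\Omega^{-1}X'\big]
       && \text{(iterated expectations)} \\
    &= E[X\Omega^{-1}\,\Omega\,\Omega^{-1}X'] = E[X\Omega^{-1}X'].
\end{align*}
```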

   Testing. We can use robust or nonrobust versions of the asymptotic variance to construct t statistics, confidence intervals or more general Wald statistics, with chi-square limit distributions, to test
$$ H_0 : R\beta = r, \qquad R \ (q \times K). $$


• Pseudo-LR: If ASS. 5 holds under $H_0$, then we can define a test statistic based on the weighted sum of squared residuals, estimating the model with and without the restrictions imposed on $\beta$, where the same estimate of $\Omega$ is used in both cases (one that is consistent under both $H_0$ and $H_1$, e.g. that based on the unrestricted SOLS residuals $\hat u_n$). Thus under ASS. 3-5, if $\tilde u_n^r$ are the residuals from constrained FGLS (with q restrictions on $\tilde\beta_n$) using variance $\hat\Omega_n$, and $\tilde u_n$ are the unrestricted FGLS residuals,
$$ LR_n = n\Big( E_n[\tilde u_n^{r\prime}\, \hat\Omega_n^{-1}\, \tilde u_n^{r}] - E_n[\tilde u_n'\, \hat\Omega_n^{-1}\, \tilde u_n] \Big) \to_d \chi^2_q. $$






1.3      The SUR model

OLS equation by equation is simple and leads to standard inference under the OLS homoskedasticity assumption $E[u_g^2 \mid x_g] = \sigma_g^2$. By contrast, a sufficient condition for consistency of FGLS for $\beta_g$ is that
$$ E[x_g u_h] = 0, \qquad g, h = 1, 2, \dots, G, $$
which is ASS. 3 for the SUR model.


  However we may be interested in running FGLS because:

   • Efficiency. If we can maintain that $E[uu' \mid X] = E[uu']$, in addition to ASS. 3-4, then FGLS is asymptotically at least as efficient as SOLS.

   • Testing. SOLS does not provide an easy way to test cross-equation hypotheses (unless we use AVar estimates such as (1.8)).

  OLS vs FGLS. There are two cases where FGLS is equivalent to OLS:

   • $\hat\Omega_n$ is diagonal. In applications $\hat\Omega_n$ will generally not be diagonal unless we impose such a restriction. If $\Omega$ is diagonal, then consistency of $\hat\Omega_n$ leads to the $\sqrt{n}$-asymptotic equivalence of FGLS and OLS, though they are not algebraically equivalent.

   • $x_1 = x_2 = \cdots = x_G$, i.e. the same regressors in all equations. This means that FGLS improves efficiency by using exclusion restrictions in some equations (when e.g. $x_1 \neq x_2$). Without these restrictions there are no efficiency gains.

Even in this case, it is useful to estimate such models with SUR routines, in order to obtain estimates of the joint covariance matrix of $\hat\beta_n$, not only that of each $\hat\beta_{ng}$ via equation-by-equation OLS.
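
The second case can be verified numerically: with identical regressors, $X_i' = I_G \otimes x_i'$, and the weighting matrix drops out of the estimator. The sketch below, on simulated data with G = 3 equations and hypothetical variable names, compares system OLS with FGLS based on a non-diagonal $\hat\Omega_n$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, G, k = 300, 3, 2

# Same regressors x_i (constant and one covariate) in every equation.
x = np.column_stack([np.ones(n), rng.normal(size=n)])             # n x k
cov_u = np.array([[1.0, 0.5, 0.2], [0.5, 2.0, 0.7], [0.2, 0.7, 1.5]])
u = rng.multivariate_normal(np.zeros(G), cov_u, size=n)
coefs = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]])           # one row per equation
y = x @ coefs.T + u                                               # n x G

# Xt[i] = X_i' = I_G kron x_i', shape (G, G*k).
Xt = np.einsum('gh,ik->ighk', np.eye(G), x).reshape(n, G, G * k)

# System OLS (equal to equation-by-equation OLS).
beta_ols = np.linalg.solve(np.einsum('igk,igl->kl', Xt, Xt),
                           np.einsum('igk,ig->k', Xt, y))

# FGLS with Omega_hat built from the OLS residuals.
resid = y - Xt @ beta_ols
omega_inv = np.linalg.inv(resid.T @ resid / n)
beta_fgls = np.linalg.solve(np.einsum('igk,gh,ihl->kl', Xt, omega_inv, Xt),
                            np.einsum('igk,gh,ih->k', Xt, omega_inv, y))

print(np.allclose(beta_ols, beta_fgls))
```

The equality is algebraic rather than asymptotic here: it holds for any positive definite weighting matrix, not only for the estimated one.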

   Cross equation restrictions in SUR: In some models there are cross-equation restrictions among the $\beta_g$. These models can still be written in the general form and are therefore amenable to OLS and FGLS.

These models are often termed "SUR", though now the equations are effectively related. The methods rely on defining $\beta$ and x appropriately. Then it is quite simple to estimate the model and to test the restrictions, using e.g. sum of squared residuals test statistics.




1.4     Panel Data Model

We study the dynamic relationships in the model, for which we need the data ordered over time for each cross-section unit in

$$ y_t = x_t'\beta + u_t, \qquad t = 1, 2, \dots, T. $$


• $\beta$ is the same for all periods, but with particular choices of $x_t$ we can allow for parameters changing over time (e.g. period dummies).

• It may be that $x_t$ contains variables that do not change over time, describing time-constant characteristics (e.g. gender dummies). It can also be of interest to allow for different intercepts in each time period if T is small and n is large.

  Sufficient assumptions for Pooled OLS to estimate $\beta$ consistently:


ASS. 6 (Orthogonality POLS)

$$ E[x_t u_t] = 0, \qquad t = 1, 2, \dots, T. $$


ASS. 7 (Rank POLS)
$$ \operatorname{rank}\Big(\sum_{t=1}^{T} E[x_t x_t']\Big) = K. $$



• ASS. 6 says nothing about the relationship between $x_t$ and $u_s$ for $t \neq s$.

• ASS. 7 rules out perfect linear dependencies among explanatory variables (for all periods).






  Use of OLS statistics from the POLS regression across i and t requires additional homoskedasticity and no serial correlation assumptions:


ASS. 8 (Cond. Homosked. No autocorrelation. POLS)

            a) $E[u_t^2 x_t x_t'] = \sigma^2 E[x_t x_t']$, $t = 1, 2, \dots, T$, where $\sigma^2 = E[u_t^2]$ for all t.

            b) $E[u_t u_s x_t x_s'] = 0$, $t \neq s$, $t, s = 1, 2, \dots, T$.


ASS. 8.a) holds if $E[u_t^2 \mid x_t] = \sigma^2$ for all t, so that the conditional variance does not depend on $x_t$ and is constant across periods.

ASS. 8.b) sets the conditional covariances equal to 0. A sufficient condition is that $E[u_t u_s \mid x_t, x_s] = 0$, $t \neq s$, $t, s = 1, 2, \dots, T$, and a necessary condition is $E[u_t u_s] = 0$, $t \neq s$, if $x_t$ includes a constant.

Therefore, ASS. 8 imposes a particular unconditional covariance matrix for u, $E[uu'] = \sigma^2 I_T$, but also restricts the conditional covariances.


Theorem 1 (Asymptotic Normality of POLS) Under ASS. 6-7 the POLS estimate is CAN. If ASS. 8 also holds then
$$ \mathrm{AVar}(\hat\beta_n) = n^{-1}\sigma^2 \big(E[XX']\big)^{-1} = n^{-1}\sigma^2 \Big(\sum_{t=1}^{T} E[x_t x_t']\Big)^{-1}, $$

which is equal to $(nT)^{-1}\sigma^2 (E[x_t x_t'])^{-1}$ if $E[x_t x_t']$ is constant over t, and an estimate of $\mathrm{AVar}(\hat\beta_n)$ is
$$ \frac{1}{n}\hat\sigma_n^2 \big(E_n[XX']\big)^{-1} = \frac{1}{n}\hat\sigma_n^2 \Big(\frac{1}{n}\sum_{i=1}^{n}\sum_{t=1}^{T} x_{it} x_{it}'\Big)^{-1}, $$
where $\hat\sigma_n^2$ is the usual OLS residual-variance estimator of $\sigma^2$ from the pooled regression with nT observations.


Therefore the usual t and F statistics are valid asymptotically. Note that under ASS. 8,

$$ B = \sigma^2 A, $$

where now
$$ A = \sum_{t=1}^{T} E[x_t x_t'], \qquad B = \sum_{t=1}^{T}\sum_{s=1}^{T} E[u_t u_s x_t x_s'], $$
are the matrices appearing in the AVar of THM. 2 for the Panel data case.
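
For completeness, the relation $B = \sigma^2 A$ follows in two steps from ASS. 8; the derivation below only uses the two parts of that assumption and the definitions of A and B just given.

```latex
\begin{align*}
B &= \sum_{t=1}^{T}\sum_{s=1}^{T} E[u_t u_s x_t x_s'] \\
  &= \sum_{t=1}^{T} E[u_t^2 x_t x_t']
     && \text{(ASS. 8.b: the terms with } t \neq s \text{ vanish)} \\
  &= \sigma^2 \sum_{t=1}^{T} E[x_t x_t'] = \sigma^2 A
     && \text{(ASS. 8.a)}, \\
A^{-1} B A^{-1} &= \sigma^2 A^{-1}.
\end{align*}
```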



   Dynamic Completeness. ASS. 8 cannot always be maintained, but it must hold if we wish to use results from standard OLS asymptotics. A sufficient condition is

$$ E[y_t \mid x_t, y_{t-1}, x_{t-1}, \dots, y_1, x_1] = E[y_t \mid x_t]. \tag{1.10} $$

This establishes the dynamic completeness of the conditional mean: $x_t$ contains sufficient lags of all variables such that additional lagged variables have no partial effect on $y_t$. If, additionally, the homoskedasticity assumption $V[y_t \mid x_t] = \sigma^2$ holds, then ASSs. 6 and 8 both hold and standard OLS inference is valid.

If (1.10) does not hold, then care must be taken when estimating the asymptotic variance of the POLS Estimator so that it is robust to serial correlation (and also to heteroskedasticity). However, POLS is consistent in any case.


  Robust Asymptotic Variance Matrix.

We need a consistent estimate of $\mathrm{AVar}(\hat\beta_n)$ which is valid in the absence of the restrictive
ASS. 8. The general form of the estimator is
\[
\hat V_n := \frac{1}{n}\,\hat A_n^{-1}\hat B_n\hat A_n^{-1},
\]
where $\hat A_n := E_n[XX']$ and $\hat B_n := E_n[X\hat u_n\hat u_n'X']$, using the $T \times 1$ POLS residuals
$\hat u_{n,i} = y_i - X_i'\hat\beta_n$ for cross section observation $i$. In any case the data have to be stored
in such a way that $(y_i, X_i)$ are stacked on top of one another for $i = 1, \ldots, n$.
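
The robust variance formula above can be sketched in NumPy as follows. This is only an illustration under assumptions: the array layout (`y` of shape `(n, T)`, `X` of shape `(n, T, K)`), the function name `pols_robust` and the simulated design are hypothetical choices, not part of the notes.

```python
import numpy as np

def pols_robust(y, X):
    """Pooled OLS with the robust variance estimate A^{-1} B A^{-1} / n.

    y has shape (n, T); X has shape (n, T, K), with X[i, t] holding x_it.
    """
    n = X.shape[0]
    A = np.einsum("itk,itl->kl", X, X) / n     # A_n = (1/n) sum_i sum_t x_it x_it'
    Xy = np.einsum("itk,it->k", X, y) / n
    beta = np.linalg.solve(A, Xy)              # POLS estimate
    u = y - X @ beta                           # (n, T) residuals, one row per unit i
    s = np.einsum("itk,it->ik", X, u)          # row i holds X_i u_i = sum_t x_it u_it
    B = s.T @ s / n                            # B_n = E_n[X u u' X']
    A_inv = np.linalg.inv(A)
    return beta, A_inv @ B @ A_inv / n

# Simulated panel (assumed design): a unit effect makes u_it serially correlated.
rng = np.random.default_rng(0)
n, T = 500, 4
X = np.concatenate([np.ones((n, T, 1)), rng.normal(size=(n, T, 1))], axis=2)
u = rng.normal(size=(n, 1)) + rng.normal(size=(n, T))
beta0 = np.array([1.0, 0.5])
y = X @ beta0 + u
beta_hat, V_hat = pols_robust(y, X)
robust_se = np.sqrt(np.diag(V_hat))
```

Summing $x_{it}\hat u_{it}$ within each unit before forming the outer product is what lets $\hat B_n$ pick up the cross terms $t \neq s$, which the usual OLS formula ignores.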


  Testing for Heteroskedasticity.

The basic issue is the checking of ASS. 8 (apart from the serial correlation problem).
Suppose that $E[u_t \mid x_t] = 0$, $t = 1, 2, \ldots, T$, which is slightly stronger than ASS. 6 but is
weaker than strict exogeneity. Then the null of homoskedasticity can be stated as
\[
H_0 : E[u_t^2 \mid x_t] = \sigma^2, \qquad t = 1, 2, \ldots, T.
\]
Under $H_0$, $u_t^2$ is uncorrelated with any function of $x_t$: let $h_t$ be a $Q \times 1$ vector of
nonconstant functions of $x_t$. It can include dummy variables for different time periods.

The usual procedure is to use an LM test, regressing the squares of the POLS residuals,
$\hat u_{n,i,t}^2$, on $h_{i,t}$,
\[
\hat u_{n,i,t}^2 \ \text{on} \ 1,\ h_{i,t}, \qquad t = 1, \ldots, T; \ i = 1, 2, \ldots, n,
\]
so the test statistic $nTR^2$ is asymptotically $\chi^2_Q$ under $H_0$.
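
A minimal sketch of this LM procedure, assuming the pooled residuals are already stacked in a vector of length $nT$. The function names, the choice $h = (x, x^2)'$ and the simulated data are illustrative assumptions only.

```python
import numpy as np

def lm_heteroskedasticity(u_hat, H):
    """Return (N * R^2, Q) from regressing squared residuals on (1, h).

    u_hat: (N,) pooled residuals with N = nT; H: (N, Q) nonconstant functions of x.
    """
    N, Q = H.shape
    W = np.column_stack([np.ones(N), H])
    u2 = u_hat ** 2
    coef, *_ = np.linalg.lstsq(W, u2, rcond=None)
    ssr = np.sum((u2 - W @ coef) ** 2)
    sst = np.sum((u2 - u2.mean()) ** 2)
    return N * (1.0 - ssr / sst), Q

def pooled_residuals(y, x):
    """Residuals from pooled OLS of y on an intercept and x."""
    R = np.column_stack([np.ones(x.shape[0]), x])
    b, *_ = np.linalg.lstsq(R, y, rcond=None)
    return y - R @ b

rng = np.random.default_rng(1)
N = 2000                                    # stands for the nT pooled observations
x = rng.normal(size=N)
e = rng.normal(size=N)
H = np.column_stack([x, x ** 2])            # h = (x, x^2)', so Q = 2
y_hom = 1.0 + 0.5 * x + e                   # homoskedastic errors
y_het = 1.0 + 0.5 * x + e * np.sqrt(1.0 + x ** 2)   # V[u|x] = 1 + x^2
stat_hom, Q = lm_heteroskedasticity(pooled_residuals(y_hom, x), H)
stat_het, _ = lm_heteroskedasticity(pooled_residuals(y_het, x), H)
```

Each statistic would be compared with a $\chi^2_Q$ critical value (5.99 at the 5% level when $Q = 2$).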




1.5      A General Linear System of Equations with Endogenous Regressors

We consider systems of equations where the explanatory variables may not satisfy the
exogeneity assumptions necessary for the consistency of the SOLS and GLS procedures, so
Instrumental Variables methods are needed. The current approach to System IV (SIV) is by
means of the Generalized Method of Moments (GMM). The asymptotic properties of
such methods can be deduced in a similar way to the single equation framework.

The best-known application of SIV estimation is to Simultaneous Equation Models
(SEM), but the methods go beyond this and include the analysis of panel data models.

EX. 3 (SEM: Labor Supply and Wage Offer Functions) Consider a labor supply
function for the hours of labor supplied, $h^s$, at any wage, $\omega$, for a given individual:
\[
h^s(\omega) = \gamma_1\omega + z_1'\delta_1 + u_1,
\]
where $z_1$ is a vector of observed labor supply shifters (education, age, experience, children,
etc.). Though this equation describes the utility-maximizing behaviour of an individual,
we can only observe equilibrium values. A wage offer function gives the hourly wage that
the market offers as a function of hours worked,
\[
\omega^o(h) = \gamma_2 h + z_2'\delta_2 + u_2,
\]
where $z_2$ are productivity measures (education, experience, etc.). If we assume that the
observed hours and wage are such that both equations are satisfied, then the equilibrium
values $(h, \omega)$ satisfy
\[
\begin{aligned}
h &= \gamma_1\omega + z_1'\delta_1 + u_1 \\
\omega &= \gamma_2 h + z_2'\delta_2 + u_2.
\end{aligned}
\]
Under restrictions on the parameters, the equations can be solved uniquely for $(h, \omega)$ as
functions of $z_1, z_2, u_1, u_2$ and the parameters. If, further, $z_1, z_2$ are exogenous, such that
\[
E[u_1 \mid z_1, z_2] = E[u_2 \mid z_1, z_2] = 0,
\]
then we can estimate the parameters consistently (with the usual identification assumptions).

Note that in general $\omega$ will be correlated with $u_1$ in the first equation, while $h$ will be
correlated with $u_2$: $\omega$ is endogenous in the first equation, and $h$ is endogenous in the
second.




EX. 4 (Omitted Variables: student performance) Consider a model to test the
effect of Head Start participation (measured as the binary variable HeadStart) on
subsequent student performance,
\[
\text{score} = \gamma_1\,\text{HeadStart} + z_1'\delta_1 + u_1.
\]
$z_1$ contains other observed factors (income, education, etc.). $u_1$ contains unobserved
factors that affect score, such as the child's ability, that might be correlated with HeadStart.
To capture the possible endogeneity of HeadStart we may set
\[
\text{HeadStart} = z'\delta_2 + u_2,
\]
where $z$ should contain at least one factor affecting HeadStart participation, but which
does not have a direct effect on score (e.g. distance): we only want to assume that
$E[zu_2] = 0$. Correlation between $u_1$ and $u_2$ indicates that HeadStart is endogenous in
the first equation.



The previous examples can be written as
\[
\begin{aligned}
y_1 &= x_1'\beta_1 + u_1 \\
y_2 &= x_2'\beta_2 + u_2,
\end{aligned}
\]
which is like a SUR system, but where $x_1$ and $x_2$ can also contain endogenous regressors.
Since $x_1$ and $x_2$ are generally correlated with $u_1$ and $u_2$, System OLS and GLS
estimation of these equations will be inconsistent.

We could apply single equation methods such as 2SLS to each of them, but we can instead exploit
the joint information of all the system variables to improve efficiency.






We study the following model
\[
y = X'\beta + u, \qquad \text{with} \quad \underset{G \times 1}{y}, \ \underset{K \times G}{X}, \ \underset{K \times 1}{\beta}.
\]
The rows can represent different time periods for the same unit or different variables, so
we also cover panel data models.


ASS. 9 (Orthogonality)
\[
\underset{L \times 1}{E[Zu]} = 0,
\]
where $Z$ is an $L \times G$ matrix of observable instrumental variables.


We may assume that $E[u] = 0$, which would be true in most cases. ASS. 9 is not
enough to identify $\beta$. A sufficient condition is a rank condition that generalizes the
rank condition for single equations:


ASS. 10 (Rank Condition)
\[
\operatorname{rank} \underset{L \times K}{E[ZX']} = K.
\]

A necessary condition for ASS. 10 to hold is the order condition $L \geq K$.


EX. 5 Consider a G equation system
\[
\begin{aligned}
y_1 &= x_1'\beta_1 + u_1 \\
&\ \ \vdots \\
y_G &= x_G'\beta_G + u_G
\end{aligned}
\tag{1.11}
\]
where for each equation $g$, $x_g$ is a $K_g \times 1$ vector that may contain both exogenous and
endogenous variables. This looks like the SUR system, except for the different properties
of some elements of $x_g$, which might be endogenous. Then, for each equation we
have a set of $L_g \times 1$ instrumental variables $z_g$ which are exogenous,
\[
E[z_g u_g] = 0, \qquad g = 1, \ldots, G.
\]
In most cases one element of $z_g$ is the intercept, so $E[u_g] = 0$.





   If $x_g$ contains some elements correlated with $u_g$, then $z_g$ must contain more than just
the exogenous variables appearing in each equation. In many cases the set of instruments
consisting of all exogenous variables in the system is valid for each equation, $z_g = z$,
$g = 1, \ldots, G$. In some cases this is not possible.

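Then we have $y = (y_1, y_2, \ldots, y_G)'$, $u = (u_1, u_2, \ldots, u_G)'$,
\[
\underset{K \times G}{X} =
\begin{pmatrix}
x_1 & 0 & 0 & \cdots & 0 \\
0 & x_2 & 0 & \cdots & 0 \\
0 & 0 & \ddots & & \vdots \\
\vdots & & & \ddots & 0 \\
0 & 0 & 0 & \cdots & x_G
\end{pmatrix},
\qquad
\underset{K \times 1}{\beta} =
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_G \end{pmatrix},
\]
where $X$ is $K \times G$, with $K = K_1 + K_2 + \cdots + K_G$, and
\[
\underset{L \times G}{Z} =
\begin{pmatrix}
z_1 & 0 & 0 & \cdots & 0 \\
0 & z_2 & 0 & \cdots & 0 \\
0 & 0 & \ddots & & \vdots \\
\vdots & & & \ddots & 0 \\
0 & 0 & 0 & \cdots & z_G
\end{pmatrix},
\]
where $Z$ is $L \times G$, with $L = L_1 + L_2 + \cdots + L_G$. Then $Zu = (z_1'u_1, z_2'u_2, \ldots, z_G'u_G)'$ and
\[
E[ZX'] =
\begin{pmatrix}
E[z_1x_1'] & 0 & 0 & \cdots & 0 \\
0 & E[z_2x_2'] & 0 & \cdots & 0 \\
0 & 0 & \ddots & & \vdots \\
\vdots & & & \ddots & 0 \\
0 & 0 & 0 & \cdots & E[z_Gx_G']
\end{pmatrix},
\]
where $E[z_gx_g']$ is $L_g \times K_g$. ASS. 10 requires that $E[ZX']$ has full column rank, where the
number of columns is $K$.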

   Since a block diagonal matrix has full column rank iff each block of the matrix has full
column rank, ASS. 10 holds iff
\[
\operatorname{rank} E[z_gx_g'] = K_g, \qquad g = 1, \ldots, G,
\]
which is exactly the rank condition needed to estimate each equation by 2SLS. Therefore
identification of the SUR system is equivalent to identification equation by
equation.
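
The block-by-block rank argument can be checked numerically on sample analogues. The sketch below assumes a hypothetical two-equation design with $x_g = (1, w_g)'$ and $z_g = (1, q_{g1}, q_{g2})'$; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
blocks = []
for g in range(2):
    q = rng.normal(size=(n, 2))                   # two excluded instruments for equation g
    w = q @ np.array([1.0, 1.0]) + rng.normal(size=n)
    x_g = np.column_stack([np.ones(n), w])        # rows are x_g', n x K_g
    z_g = np.column_stack([np.ones(n), q])        # rows are z_g', n x L_g
    blocks.append(z_g.T @ x_g / n)                # E_n[z_g x_g'], L_g x K_g

# Assemble the block-diagonal sample analogue of E[ZX'].
L = sum(b.shape[0] for b in blocks)
K = sum(b.shape[1] for b in blocks)
ZX = np.zeros((L, K))
r = c = 0
for b in blocks:
    ZX[r:r + b.shape[0], c:c + b.shape[1]] = b
    r += b.shape[0]
    c += b.shape[1]

rank_system = np.linalg.matrix_rank(ZX)
rank_blocks = [np.linalg.matrix_rank(b) for b in blocks]
```

Here the system rank equals the sum of the block ranks, so checking ASS. 10 amounts to checking each equation separately.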



1.6     Generalized Method of Moments Estimation

The orthogonality conditions in ASS. 9 suggest the estimation method: under ASS. 9
and ASS. 10 the vector $\beta$ is the unique $K \times 1$ vector that solves
\[
E[Z(y - X'\beta)] = 0,
\]
in other words, $\beta$ is identified. Therefore, by the analogy principle we can estimate $\beta$ by
the sample analogue, choosing an estimate that satisfies
\[
E_n[Z(y - X'\hat\beta_n)] = 0, \tag{1.12}
\]
which is a set of $L$ linear equations in the $K$ unknowns in $\hat\beta_n$.

   When $K = L$ we have exactly as many IV as explanatory variables in the system: then,
if $E_n[ZX']$ is nonsingular,
\[
\hat\beta_n^{IV} = E_n[ZX']^{-1}E_n[Zy], \tag{1.13}
\]
which is the System Instrumental Variables (SIV) Estimator.

Consistency of SIV follows by the WLLN under ASS. 9 and 10.

   When $L > K$, so that there are more rows in the IV matrix $Z$ than we need for
identification, choosing $\hat\beta_n$ is not straightforward. Except in special cases (1.12) will not
have a solution. Instead we can choose $\hat\beta_n$ to make the vector in equation
(1.12) as small as possible: one solution is to minimize its squared norm,
\[
E_n[Z(y - X'\hat\beta_n)]'\,E_n[Z(y - X'\hat\beta_n)],
\]
or in general we can use a weighting matrix to produce the best estimator in some sense.

  If $\hat W_n$ is an $L \times L$ positive semidefinite matrix, possibly depending on data, a
Generalized Method of Moments (GMM) Estimate of $\beta$ is a vector $\hat\beta_n$ which solves
\[
\min_b \; E_n[Z(y - X'b)]'\,\hat W_n\,E_n[Z(y - X'b)].
\]
Since this is a quadratic form in $b$, the solution has a closed form:
\[
\hat\beta_n = \hat\beta_n(\hat W_n) = \left(E_n[XZ']\,\hat W_n\,E_n[ZX']\right)^{-1}E_n[XZ']\,\hat W_n\,E_n[Zy], \tag{1.14}
\]
assuming $E_n[XZ']\,\hat W_n\,E_n[ZX']$ is nonsingular.
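
The closed form (1.14) translates directly into a few lines of NumPy. The sketch below is an illustration under assumptions: each $X_i$ ($K \times G$) and $Z_i$ ($L \times G$) is stored as a slice of a 3-d array, and the simulated two-equation design, with one endogenous regressor and two excluded instruments per equation, is hypothetical.

```python
import numpy as np

def simulate(n, rng):
    """Two equations, each with one endogenous regressor and two excluded instruments."""
    q = rng.normal(size=(n, 2, 2))                        # q[i, g]: instruments of equation g
    u = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])   # correlated errors
    w = q.sum(axis=2) + 0.5 * u + rng.normal(size=(n, 2)) # endogenous regressors
    beta = np.array([1.0, 2.0, -1.0, 0.5])
    X = np.zeros((n, 4, 2))                               # X_i is K x G, block diagonal
    Z = np.zeros((n, 6, 2))                               # Z_i is L x G, block diagonal
    for g in range(2):
        X[:, 2 * g, g] = 1.0
        X[:, 2 * g + 1, g] = w[:, g]
        Z[:, 3 * g, g] = 1.0
        Z[:, 3 * g + 1:3 * g + 3, g] = q[:, g, :]
    y = np.einsum("ikg,k->ig", X, beta) + u               # y_i = X_i' beta + u_i
    return y, X, Z, beta

def gmm(y, X, Z, W):
    """Closed-form GMM estimate (1.14) for a given L x L weighting matrix W."""
    n = y.shape[0]
    ZX = np.einsum("ilg,ikg->lk", Z, X) / n               # E_n[ZX'], L x K
    Zy = np.einsum("ilg,ig->l", Z, y) / n                 # E_n[Zy], length L
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

rng = np.random.default_rng(2)
y, X, Z, beta0 = simulate(2000, rng)
beta_hat = gmm(y, X, Z, np.eye(Z.shape[1]))               # identity weighting: minimizes the norm
```

With the identity as weighting matrix this reproduces the norm-minimizing solution described above; other choices of `W` only change the last argument.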


To show that the GMM Estimate is consistent we need to assume that $\hat W_n$ has a
nonsingular probability limit:


ASS. 11 (GMM weighting matrix)
\[
\hat W_n \to_p W \quad \text{as } n \to \infty,
\]
where $W$ is a nonrandom, symmetric, $L \times L$ positive definite matrix.


In applications, the convergence in ASS. 11 follows by the law of large numbers, because
$\hat W_n$ will be a function of sample averages, which will be positive definite with probability
approaching one.


THM. 5 (Consistency of GMM) Under ASS. 9-11
\[
\hat\beta_n(\hat W_n) \to_p \beta
\]
as $n \to \infty$.


PROOF. We can write
\[
\hat\beta_n - \beta = \left(E_n[XZ']\,\hat W_n\,E_n[ZX']\right)^{-1}E_n[XZ']\,\hat W_n\,E_n[Zu].
\]
Now under ASS. 10, $C := E[ZX']$ has rank $K$, and with ASS. 11, $C'WC$ has rank $K$
and therefore is nonsingular. Therefore from the LLN
\[
\begin{aligned}
\hat\beta_n - \beta &= \left((C'WC)^{-1} + o_p(1)\right)\left(C'W + o_p(1)\right)E_n[Zu] \\
&= O_p(1)\,o_p(1) = o_p(1),
\end{aligned}
\]
because $E_n[Zu] \to_p 0$ by ASS. 9.


• When $K = L$, the GMM estimate in (1.14) is equal to the IV estimate in (1.13), no
matter the choice of $\hat W_n$, because $E_n[XZ']$ is a $K \times K$ nonsingular matrix.
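
The cancellation behind this remark can be written out in one line. The shorthand $C_n := E_n[ZX']$ is introduced here only for the display, and $\hat W_n$ is assumed nonsingular:

```latex
\hat\beta_n(\hat W_n)
  = (C_n'\hat W_n C_n)^{-1} C_n'\hat W_n E_n[Zy]
  = C_n^{-1}\hat W_n^{-1}(C_n')^{-1} C_n'\hat W_n E_n[Zy]
  = C_n^{-1} E_n[Zy]
  = \hat\beta_n^{IV}.
```

The second equality uses that $C_n$ is square and nonsingular when $K = L$, so the inverse of the product factors into the product of the inverses.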






The GMM estimate is asymptotically normally distributed under the same assumptions:


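THM. 6 (CAN of GMM) Under ASS. 9-11
\[
\sqrt{n}\,(\hat\beta_n - \beta) \to_d N\!\left(0,\ (C'WC)^{-1}C'W\Lambda WC\,(C'WC)^{-1}\right),
\]
where
\[
\Lambda := E[Zuu'Z'] = V[Zu],
\]
if the elements of $Zuu'Z'$ have finite expected absolute value.


PROOF. We only need to show that
\[
\sqrt{n}\,(\hat\beta_n - \beta) = \left((C'WC)^{-1} + o_p(1)\right)\left(C'W + o_p(1)\right)\sqrt{n}\,E_n[Zu]
\]
and that, by ASS. 9 and the central limit theorem,
\[
\sqrt{n}\,E_n[Zu] \to_d N(0, \Lambda).
\]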


 AVar Estimation. Consistent estimation of the above asymptotic sandwich variance
matrix is simple if a consistent estimate $\hat\Lambda_n$ is available, by means of
\[
\frac{1}{n}\left(E_n[XZ']\,\hat W_n\,E_n[ZX']\right)^{-1}E_n[XZ']\,\hat W_n\hat\Lambda_n\hat W_n\,E_n[ZX']\left(E_n[XZ']\,\hat W_n\,E_n[ZX']\right)^{-1}.
\]
This formula simplifies when $\hat W_n$ is chosen optimally.

A consistent estimate of $\Lambda$ can be
\[
\hat\Lambda_n = E_n[Z\hat u_n\hat u_n'Z'],
\]
where $\hat u_n = y - X'\hat\beta_n$ are residuals computed using a consistent estimate $\hat\beta_n$.
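
As an illustration, the sandwich estimate and $\hat\Lambda_n$ can be computed as below. The array layout (`X` of shape `(n, K, G)`, `Z` of shape `(n, L, G)`), the function names and the simulated design are assumptions of this sketch. The last lines check numerically that plugging $W = \hat\Lambda_n^{-1}$ into the sandwich collapses it to a single inverse.

```python
import numpy as np

def simulate(n, rng):
    """Assumed design: two equations, one endogenous regressor and two instruments each."""
    q = rng.normal(size=(n, 2, 2))
    u = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])
    w = q.sum(axis=2) + 0.5 * u + rng.normal(size=(n, 2))
    beta = np.array([1.0, 2.0, -1.0, 0.5])
    X = np.zeros((n, 4, 2))
    Z = np.zeros((n, 6, 2))
    for g in range(2):
        X[:, 2 * g, g] = 1.0
        X[:, 2 * g + 1, g] = w[:, g]
        Z[:, 3 * g, g] = 1.0
        Z[:, 3 * g + 1:3 * g + 3, g] = q[:, g, :]
    y = np.einsum("ikg,k->ig", X, beta) + u
    return y, X, Z

def gmm_avar(y, X, Z, W, beta):
    """Sandwich variance estimate of a GMM estimate beta obtained with weighting W."""
    n = y.shape[0]
    ZX = np.einsum("ilg,ikg->lk", Z, X) / n            # E_n[ZX']
    u = y - np.einsum("ikg,k->ig", X, beta)            # residuals u_i = y_i - X_i' beta
    Zu = np.einsum("ilg,ig->il", Z, u)                 # row i holds Z_i u_i
    Lam = Zu.T @ Zu / n                                # Lambda_n = E_n[Z u u' Z']
    bread = np.linalg.inv(ZX.T @ W @ ZX)
    return bread @ ZX.T @ W @ Lam @ W @ ZX @ bread / n, Lam, ZX

rng = np.random.default_rng(4)
y, X, Z = simulate(2000, rng)
n = y.shape[0]
W_id = np.eye(Z.shape[1])
ZX0 = np.einsum("ilg,ikg->lk", Z, X) / n
Zy0 = np.einsum("ilg,ig->l", Z, y) / n
beta_hat = np.linalg.solve(ZX0.T @ W_id @ ZX0, ZX0.T @ W_id @ Zy0)
V_id, Lam_hat, ZX = gmm_avar(y, X, Z, W_id, beta_hat)

# Algebraic check only: with W = Lambda_n^{-1} the sandwich reduces to one inverse.
W_opt = np.linalg.inv(Lam_hat)
V_sandwich, _, _ = gmm_avar(y, X, Z, W_opt, beta_hat)
V_short = np.linalg.inv(ZX.T @ W_opt @ ZX) / n
```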






1.6.1    The System 2SLS Estimator

The choice of $\hat W_n$
\[
\hat W_n = E_n[ZZ']^{-1}
\]
leads to a familiar estimator. ASS. 11 simply requires that $E[ZZ']$ exists and is nonsingular.
When we plug this choice of $\hat W_n$ into (1.14), we obtain
\[
\hat\beta_n = \left(E_n[XZ']\,E_n[ZZ']^{-1}E_n[ZX']\right)^{-1}E_n[XZ']\,E_n[ZZ']^{-1}E_n[Zy],
\]
which looks like single-equation 2SLS, so it can be termed the System 2SLS (S2SLS) Estimate.

• It can be shown that for system (1.11) in EX. 5, S2SLS produces 2SLS equation
by equation.

• For a particular choice of $Z$ in a panel data set up, S2SLS produces a Pooled 2SLS
estimate.

However, S2SLS is not necessarily the asymptotically efficient estimator.
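
The first remark can be verified numerically: with block-diagonal $X$ and $Z$ as in EX. 5, the system formula returns the stacked single-equation 2SLS estimates. The simulated design and all names below are hypothetical.

```python
import numpy as np

def tsls(yg, xg, zg):
    """Single-equation 2SLS; xg is n x K_g (rows x_g'), zg is n x L_g (rows z_g')."""
    zx, zz, zy = zg.T @ xg, zg.T @ zg, zg.T @ yg
    A = zx.T @ np.linalg.solve(zz, zx)
    return np.linalg.solve(A, zx.T @ np.linalg.solve(zz, zy))

rng = np.random.default_rng(3)
n = 1000
q = rng.normal(size=(n, 2, 2))
u = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])
w = q.sum(axis=2) + 0.5 * u + rng.normal(size=(n, 2))
xs = [np.column_stack([np.ones(n), w[:, g]]) for g in range(2)]
zs = [np.column_stack([np.ones(n), q[:, g, :]]) for g in range(2)]
ys = [xs[0] @ np.array([1.0, 2.0]) + u[:, 0],
      xs[1] @ np.array([-1.0, 0.5]) + u[:, 1]]

# Block-diagonal system matrices: X_i is K x G (4 x 2), Z_i is L x G (6 x 2).
X = np.zeros((n, 4, 2))
Z = np.zeros((n, 6, 2))
for g in range(2):
    X[:, 2 * g:2 * g + 2, g] = xs[g]
    Z[:, 3 * g:3 * g + 3, g] = zs[g]
y = np.column_stack(ys)

ZX = np.einsum("ilg,ikg->lk", Z, X) / n
ZZ = np.einsum("ilg,img->lm", Z, Z) / n
Zy = np.einsum("ilg,ig->l", Z, y) / n
W = np.linalg.inv(ZZ)                                   # S2SLS weighting matrix
beta_s2sls = np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)
beta_by_eq = np.concatenate([tsls(ys[g], xs[g], zs[g]) for g in range(2)])
```

The two vectors agree because $E_n[ZZ']$ and $E_n[ZX']$ are both block diagonal here, so the system problem separates by equation.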



1.6.2    Optimal Estimates

There is a choice of $W$ that produces the GMM estimator of minimum variance. As in
a single equation set up, if we set $W = \Lambda^{-1}$, the AVar of the GMM estimate simplifies
to $(C'\Lambda^{-1}C)^{-1}$, and it can be shown that
\[
(C'WC)^{-1}C'W\Lambda WC\,(C'WC)^{-1} - (C'\Lambda^{-1}C)^{-1}
\]
is positive semidefinite for any $L \times L$ positive definite matrix $W$.


ASS. 12 (Optimal Weighting)
\[
W = \Lambda^{-1} := E[Zuu'Z']^{-1}.
\]


THM. 7 (Efficient GMM) Under ASS. 9-12, the resulting GMM estimate is efficient
among all GMM estimators of the form (1.14).






   If we can estimate $\Lambda$ consistently, we can obtain an estimate which has the same
first order asymptotic properties as the efficient GMM estimate, so it is asymptotically
efficient:


  1. Let $\hat\beta_n$ be an initial consistent estimate of $\beta$ (e.g. S2SLS).

  2. Obtain the $G \times 1$ residual vectors
     \[
     \hat u_{n,i} = y_i - X_i'\hat\beta_n = u_i - X_i'(\hat\beta_n - \beta), \qquad i = 1, \ldots, n.
     \]

  3. Compute a consistent estimate of $\Lambda$ such as
     \[
     \hat\Lambda_n = E_n[Z\hat u_n\hat u_n'Z'].
     \]

  4. Choose $\hat W_n = \hat\Lambda_n^{-1}$ to obtain the asymptotically optimal GMM estimate.
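
The four steps can be sketched as follows, with S2SLS as the initial estimate. The array layout (`X` of shape `(n, K, G)`, `Z` of shape `(n, L, G)`), helper names and simulated design are assumptions made for the illustration. The last line computes the single-inverse variance estimate to which the sandwich reduces under optimal weighting.

```python
import numpy as np

def simulate(n, rng):
    """Assumed design: two equations, one endogenous regressor and two instruments each."""
    q = rng.normal(size=(n, 2, 2))
    u = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])
    w = q.sum(axis=2) + 0.5 * u + rng.normal(size=(n, 2))
    beta = np.array([1.0, 2.0, -1.0, 0.5])
    X = np.zeros((n, 4, 2))
    Z = np.zeros((n, 6, 2))
    for g in range(2):
        X[:, 2 * g, g] = 1.0
        X[:, 2 * g + 1, g] = w[:, g]
        Z[:, 3 * g, g] = 1.0
        Z[:, 3 * g + 1:3 * g + 3, g] = q[:, g, :]
    y = np.einsum("ikg,k->ig", X, beta) + u
    return y, X, Z, beta

def gmm(ZX, Zy, W):
    """Closed form (1.14) from the sample moments E_n[ZX'] and E_n[Zy]."""
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

rng = np.random.default_rng(5)
y, X, Z, beta0 = simulate(2000, rng)
n = y.shape[0]
ZX = np.einsum("ilg,ikg->lk", Z, X) / n
Zy = np.einsum("ilg,ig->l", Z, y) / n
ZZ = np.einsum("ilg,img->lm", Z, Z) / n

beta_1 = gmm(ZX, Zy, np.linalg.inv(ZZ))                 # step 1: initial S2SLS estimate
u_hat = y - np.einsum("ikg,k->ig", X, beta_1)           # step 2: G x 1 residuals per unit
Zu = np.einsum("ilg,ig->il", Z, u_hat)
Lam_hat = Zu.T @ Zu / n                                 # step 3: E_n[Z u u' Z']
W_opt = np.linalg.inv(Lam_hat)                          # step 4: optimal weighting
beta_2 = gmm(ZX, Zy, W_opt)
V_hat = np.linalg.inv(ZX.T @ W_opt @ ZX) / n            # variance estimate under optimal weighting
```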


The estimate $\hat\Lambda_n$ is consistent for $E[Zuu'Z']$, even in the presence of conditional
heteroskedasticity or serial correlation (because $n \to \infty$, with $T$ fixed).

  The asymptotic variance of the efficient GMM estimate $\hat\beta_n(\hat\Lambda_n^{-1})$ is estimated as
\[
\hat V_n := \frac{1}{n}\left(E_n[XZ']\,\hat\Lambda_n^{-1}E_n[ZX']\right)^{-1}, \tag{1.15}
\]
where $\hat\Lambda_n$ can be obtained using the first step residuals, $\hat u_n = y - X'\hat\beta_n$, or the second
step residuals $\hat u_n^{EFF} = y - X'\hat\beta_n(\hat\Lambda_n^{-1})$.

This estimate is labelled a Minimum Chi-Square Estimator.

• When $Z = X$, and the $\hat u_n$ are the System OLS residuals, the estimate (1.15)
becomes the robust variance estimate of SOLS.

• The estimate reduces to the robust variance estimate for FGLS when $Z = X\Omega^{-1}$ and
the $\hat u_n$ are the FGLS residuals.

• If it is known that $E[Zuu'Z'] = E[Z\Omega Z']$, where $\Omega := E[uu']$, this can be used to
estimate $\Lambda$ by $E_n[Z\hat\Omega_nZ']$, so that $\hat W_n = E_n[Z\hat\Omega_nZ']^{-1}$ (3SLS or FIV estimate).






1.7     Testing using the GMM

1.7.1    Testing Classical Hypothesis

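\[
H_0 : \underset{q \times K}{R}\,\beta = r.
\]

   Wald tests for linear restrictions: we can use an optimal GMM Estimate, or
a 3SLS Estimate if ASS. ?? is assumed; then
\[
W_n = n\,(R\hat\beta_n - r)'\left(R\hat V_nR'\right)^{-1}(R\hat\beta_n - r) \to_d \chi^2_q
\]
under $H_0$, where $\hat V_n$ here denotes a consistent estimate of the asymptotic variance of
$\sqrt{n}\,(\hat\beta_n - \beta)$, that is, $n$ times the estimate in (1.15).

In general 2SLS should not be used to test system hypotheses because its AVar is much
more complicated than in those cases.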


  Pseudo-LR Tests. Another method consists of using the GMM objective function
with and without the restrictions imposed. It is necessary that we use an optimal
GMM estimate, so that $\hat W_n$ consistently estimates $V[Zu]^{-1} = \Lambda^{-1}$. Then
\[
n\,E_n[Zu]'\,\hat W_n\,E_n[Zu] \to_d \chi^2_L,
\]
since $Zu$ is an $L \times 1$ vector with zero mean and variance $\Lambda$.

Denote by $\hat u_n := y - X'\hat\beta_n$ the unrestricted residuals and by $\tilde u_n^r := y - X'\tilde\beta_n$ the residuals
obtained from the restricted model (imposing the $q$ restrictions in $H_0$). Then the GMM
distance statistic has a limiting chi-square distribution:
\[
LR_n = n\left\{E_n[Z\tilde u_n^r]'\,\hat W_n\,E_n[Z\tilde u_n^r] - E_n[Z\hat u_n]'\,\hat W_n\,E_n[Z\hat u_n]\right\} \to_d \chi^2_q
\]
when $H_0$ is true. This is just the difference of the GMM criterion functions multiplied by $n$ (why
is it nonnegative?).
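
A hedged numerical sketch of the distance statistic follows. It assumes a hypothetical simulated system and a single restriction ($q = 1$) that fixes the slope of the first equation at its true value. The restricted estimate is obtained with the standard closed form for minimizing a quadratic criterion under linear constraints, which is not given in the notes. The same $\hat W_n$ is used in both criteria, as in the formula above.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
# Assumed design: two equations, one endogenous regressor and two instruments each.
q = rng.normal(size=(n, 2, 2))
u = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])
w = q.sum(axis=2) + 0.5 * u + rng.normal(size=(n, 2))
beta0 = np.array([1.0, 2.0, -1.0, 0.5])
X = np.zeros((n, 4, 2))
Z = np.zeros((n, 6, 2))
for g in range(2):
    X[:, 2 * g, g] = 1.0
    X[:, 2 * g + 1, g] = w[:, g]
    Z[:, 3 * g, g] = 1.0
    Z[:, 3 * g + 1:3 * g + 3, g] = q[:, g, :]
y = np.einsum("ikg,k->ig", X, beta0) + u

ZX = np.einsum("ilg,ikg->lk", Z, X) / n
Zy = np.einsum("ilg,ig->l", Z, y) / n
ZZ = np.einsum("ilg,img->lm", Z, Z) / n

def gmm(W):
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

# Optimal weighting matrix from first-step (S2SLS) residuals.
b1 = gmm(np.linalg.inv(ZZ))
Zu = np.einsum("ilg,ig->il", Z, y - np.einsum("ikg,k->ig", X, b1))
W = np.linalg.inv(Zu.T @ Zu / n)
b_u = gmm(W)                                            # unrestricted estimate

R = np.array([[0.0, 1.0, 0.0, 0.0]])                    # hypothetical H0: slope of eq. 1 equals 2
r = np.array([2.0])
A = ZX.T @ W @ ZX
A_inv_Rt = np.linalg.solve(A, R.T)
d = R @ b_u - r
b_r = b_u - A_inv_Rt @ np.linalg.solve(R @ A_inv_Rt, d) # restricted estimate

def criterion(b):
    m = Zy - ZX @ b                                     # E_n[Z(y - X'b)]
    return n * m @ W @ m

LR = criterion(b_r) - criterion(b_u)
wald = n * d @ np.linalg.solve(R @ A_inv_Rt, d)
```

In this linear setting, and with a common weighting matrix, `LR` and `wald` coincide numerically; either would be compared with a $\chi^2_1$ critical value.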






1.7.2     Testing Overidentifying Restrictions

There are overidentifying restrictions when $L > K$. In this case, under the null hypothesis
\[
H_0 : \underset{L \times 1}{E[Zu]} = 0
\]
that all the restrictions are true,
\[
n\,E_n[Z\hat u_n]'\,\hat W_n\,E_n[Z\hat u_n] \to_d \chi^2_{L-K}
\]
if $\hat W_n = \hat\Lambda_n^{-1}$ is an asymptotically optimal weighting matrix. Replacing $u$ by $\hat u_n$ reduces
the degrees of freedom from $L$ to $L - K$ (note that when $L = K$, the lhs is 0), because
we have estimated $K$ parameters.

If $\hat W_n$ is not optimal then the result does not hold.
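
The overidentification statistic can be sketched as below. The simulated design is hypothetical: the `invalid` argument makes one instrument of the first equation enter its error term, so that the null fails. The array layout (`X` of shape `(n, K, G)`, `Z` of shape `(n, L, G)`) and the function names are assumptions of the sketch.

```python
import numpy as np

def simulate(n, rng, invalid=0.0):
    """Two equations; invalid != 0 correlates the first instrument of eq. 1 with u_1."""
    q = rng.normal(size=(n, 2, 2))
    u = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])
    w = q.sum(axis=2) + 0.5 * u + rng.normal(size=(n, 2))
    u[:, 0] += invalid * q[:, 0, 0]
    beta = np.array([1.0, 2.0, -1.0, 0.5])
    X = np.zeros((n, 4, 2))
    Z = np.zeros((n, 6, 2))
    for g in range(2):
        X[:, 2 * g, g] = 1.0
        X[:, 2 * g + 1, g] = w[:, g]
        Z[:, 3 * g, g] = 1.0
        Z[:, 3 * g + 1:3 * g + 3, g] = q[:, g, :]
    y = np.einsum("ikg,k->ig", X, beta) + u
    return y, X, Z

def j_statistic(y, X, Z):
    """Return (n * m' W m, L - K) with W the inverse of Lambda_n from S2SLS residuals."""
    n, L, K = y.shape[0], Z.shape[1], X.shape[1]
    ZX = np.einsum("ilg,ikg->lk", Z, X) / n
    Zy = np.einsum("ilg,ig->l", Z, y) / n
    ZZ = np.einsum("ilg,img->lm", Z, Z) / n

    def solve_gmm(W):
        return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

    b1 = solve_gmm(np.linalg.inv(ZZ))                   # first step: S2SLS
    Zu = np.einsum("ilg,ig->il", Z, y - np.einsum("ikg,k->ig", X, b1))
    W = np.linalg.inv(Zu.T @ Zu / n)                    # optimal weighting matrix
    b2 = solve_gmm(W)                                   # efficient GMM estimate
    m = Zy - ZX @ b2                                    # E_n[Z u_hat]
    return n * m @ W @ m, L - K

rng = np.random.default_rng(7)
J_valid, df = j_statistic(*simulate(2000, rng))
J_invalid, _ = j_statistic(*simulate(2000, rng, invalid=0.5))
```

With $L - K = 2$ here, each statistic would be compared with a $\chi^2_2$ critical value (5.99 at the 5% level).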


   If the null hypothesis is rejected for the system, but not in the single equation analysis, 2SLS
should be preferred.


  Hausman's test directly compares the 2SLS and 3SLS estimates, assuming
that under the null 3SLS is more efficient.






1.8     Optimal Instruments

   How many instruments in Z? In principle we should use all instruments available,
given that the initial set satisfies the identification assumptions and that we always use
optimal weighting matrices. If we have $Z := (Z_1', Z_2')'$, then
\[
\mathrm{AVar}\left(\sqrt{n}\,\hat\beta_n(Z_1)\right) - \mathrm{AVar}\left(\sqrt{n}\,\hat\beta_n(Z)\right) = (C_1'\Lambda_1^{-1}C_1)^{-1} - (C'\Lambda^{-1}C)^{-1},
\]
where $C_1 := E[Z_1X']$ and $\Lambda_1 := E[Z_1uu'Z_1']$. Then it is easy to check that
$C'\Lambda^{-1}C - C_1'\Lambda_1^{-1}C_1$ is psd (White 1984, Prop 4.49).

Then we cannot do worse asymptotically by adding instruments to $Z_1$.


  However, we might not improve when
\[
C_2 = E[Z_2uu'Z_1']\,\Lambda_1^{-1}C_1, \tag{1.16}
\]
where $C_2 := E[Z_2X']$ (White, 1984).
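
Both the psd claim and condition (1.16) can be read off the partitioned inverse of $\Lambda$. The sketch below introduces the shorthand $\Lambda_{21} := E[Z_2uu'Z_1']$, $\Lambda_2 := E[Z_2uu'Z_2']$, $S$ and $D$ (none of which appear in the notes), takes $\Lambda_1 = E[Z_1uu'Z_1']$ and assumes $\Lambda$ is positive definite:

```latex
\Lambda = \begin{pmatrix} \Lambda_1 & \Lambda_{21}' \\ \Lambda_{21} & \Lambda_2 \end{pmatrix},
\qquad
S := \Lambda_2 - \Lambda_{21}\Lambda_1^{-1}\Lambda_{21}',
\qquad
D := C_2 - \Lambda_{21}\Lambda_1^{-1}C_1,
\\[1ex]
C'\Lambda^{-1}C = C_1'\Lambda_1^{-1}C_1 + D'S^{-1}D .
```

Since the Schur complement $S$ is positive definite, $D'S^{-1}D$ is psd, and it vanishes exactly when $D = 0$, which is condition (1.16).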

• Under conditional homoskedasticity, so that we assume $E[Zuu'Z'] = \sigma^2E[ZZ']$ (so
2SLS is optimal), this condition is
\[
\begin{aligned}
& E[Z_2X'] = E[Z_2Z_1']\left(E[Z_1Z_1']\right)^{-1}E[Z_1X'] \\
\Leftrightarrow\ & 0 = E\left[\left(Z_2 - E[Z_2Z_1']\left(E[Z_1Z_1']\right)^{-1}Z_1\right)X'\right] \\
\Leftrightarrow\ & 0 = E\left[\left(Z_2 - L[Z_2 \mid Z_1]\right)X'\right].
\end{aligned}
\]
This means that $X$ is orthogonal to the part of the (linear) information in $Z_2$ that was
not already in $Z_1$: in this case $Z_2$ gives no additional information (on $X$) and estimation
cannot be improved.

• In the general case, where the errors $u$ can be correlated or there is conditional
heteroskedasticity, it is quite unlikely that the original condition (1.16) is satisfied.






  If the errors satisfy a zero conditional expectation assumption,
\[
E[u \mid Z] = 0,
\]
then there are unlimited IV available.

• In the general regression case
\[
E[y - x'\beta \mid x] = 0
\]
and the OLSE is the IVE with IV $z = x$.


⇒ If $V[u] \neq V[u \mid x]$ (conditional heteroskedasticity), there are infinitely many IVE that can
improve on OLS, because any $h(x)$ is a valid instrument since
\[
E[u\,h(x)] = E[h(x)\,E[u \mid x]] = 0.
\]
Then the minimum chi-square estimate with IV $z = (x', h(x)')'$ is generally more efficient
than OLS (Chamberlain, 1982).

⇒ If $V[u \mid x]$ is constant (conditional homoskedasticity), adding functions to the IV list
results in no asymptotic improvement, because the linear projection of $x$ onto $x$ and $h(x)$
does not depend on $h(x)$.

Therefore under homoskedasticity, adding moment conditions neither improves nor
reduces the asymptotic efficiency of the OLSE. However, finite sample performance with
many overidentifying restrictions can be quite poor.






  Is it possible instead to obtain a small set of optimal IV?

If we replace ASS. 9 by
\[
E[u_g \mid z] = 0, \qquad g = 1, \ldots, G,
\]
for some vector $z$ (which is a valid set of IV for any equation), then it can be shown
that the optimal choice of instruments is
\[
Z^* = E[X \mid z]\,\Omega(z)^{-1}
\]
if $\operatorname{rank}\{E[Z^*X']\} = K$, where $\Omega(z) := E[uu' \mid z]$, and we can forget about any other
function of $z$.


• If $E[u_g \mid z] = 0$, $E[uu' \mid z] = E[uu']$ (conditional homoskedasticity) and
$E[X_1 \mid z] = L[X_1 \mid z] = \Pi Z$ (linearity of the conditional expectation), then the 3SLS estimator is
efficient among the SIV estimators based on the orthogonality condition $E[u_g \mid z] = 0$.


• If $E[u \mid X] = 0$ and $E[uu' \mid X] = \Omega$ (exogenous regressors), then the optimal IV are
$X\Omega^{-1}$, which gives the GLS estimator.


• Without further assumptions, $E[X \mid z]$ and $\Omega(z)$ can be arbitrary functions of $z$, in
which case the optimal IV estimate is not easy to obtain. However, these functions could
be estimated (non)parametrically (Robinson, 1991; Newey, 1990).


RECOMMENDED READINGS: Wooldridge (2002, Ch. 7-8). Hayashi (2000, Ch. 4).
Ruud (2000, Ch 26.2). Mittelhammer et al. (2000, Ch. 15.2.1)








                                       Problem Set 1

  1. Consider the SUR model under (1.2) and ASS. 1, 2 and 5 with $\Omega = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_G^2)$.


      (a) Show that ASS. 3 and 4 hold.
      (b) Show that GLS and OLS estimation equation by equation are the same.
      (c) Show that single-equation OLS estimators for any two equations, say, $\hat\beta_{ng}$
          and $\hat\beta_{nh}$, are asymptotically uncorrelated, i.e., the asymptotic variance of $\hat\beta_n$
          is block diagonal.
      (d) Under the same assumptions, explain how you would test $H_0 : \beta_g = \beta_h$
          against $H_1 : \beta_g \neq \beta_h$ if they have the same dimension.
      (e) Now drop ASS. 5, but maintain everything else. Suppose that $\Omega$ is estimated
          in an unrestricted way. Are FGLS and OLS algebraically equivalent? Show
          that $\sqrt{n}\,(\hat\beta_n - \hat\beta_n^{FGLS}) = o_p(1)$.
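  2. Using the $\sqrt{n}$-consistency of the SOLS estimator $\hat\beta_{n,OLS}$, for $\hat\Omega_n$ in (1.9) show that
     \[
     \mathrm{vec}\left(\sqrt{n}\,(\hat\Omega_n - \Omega)\right) = \mathrm{vec}\left(n^{-1/2}\sum_{i=1}^{n}(u_iu_i' - \Omega)\right) + o_p(1)
     \]
     under ASS 1 and 4. State the moment conditions you need.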

  3. Show the equivalence OLS=FGLS for SUR models when:

      (a) $\hat\Omega_n$ is diagonal.
      (b) All equations have the same regressors, $X = I_G \otimes x$, so
          \[
          E_n[X\Omega^{-1}X'] = \Omega^{-1} \otimes E_n[xx'] = (I_G \otimes E_n[xx'])(\Omega^{-1} \otimes I_K).
          \]

  4. Consider the panel data model
     \[
     \begin{aligned}
     y_t &= x_t'\beta + u_t, \qquad t = 1, \ldots, T \\
     E[u_t \mid x_t, u_{t-1}, x_{t-1}, \ldots] &= 0 \\
     E[u_t^2 \mid x_t] &= E[u_t^2] = \sigma_t^2, \qquad t = 1, \ldots, T.
     \end{aligned}
     \]
     Note that $\sigma_t^2$ may vary in each time period.




      (a) Show that $\Omega := E[uu']$ is diagonal.
      (b) Write down the GLS estimator when Ω is known.
      (c) Show that ASS 3 does not necessarily hold under the assumptions made,
          taking xt = yt−1 .
      (d) Show that the GLS estimator of (b) is consistent for $\beta$ by showing that
          $E[X \Omega^{-1} u] = 0$. Is ASS 1 necessary or sufficient for consistency of GLS?
      (e) Explain how to estimate each $\sigma_t^2$ consistently ($n \to \infty$).
      (f) Justify that, under the previous assumptions, valid inference can be obtained
          by weighting each observation $(y_t, x_t)$ by $1/\sigma_t$ and then running Pooled OLS.
      (g) What happens if we assume that $\sigma_t^2 = \sigma^2$ for all $t = 1, \ldots, T$?
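
The weighting scheme in (e) and (f) can be sketched numerically as follows. This is an illustration on simulated data, not a solution: the dimensions, the values of `sigma` and `beta`, and the helper `pooled_ols` are all hypothetical, and the regressors are drawn independently of the errors, which is stronger than the sequential condition in the problem.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, K = 500, 4, 2                        # hypothetical dimensions
beta = np.array([1.0, -0.5])               # made-up true coefficients
sigma = np.array([0.5, 1.0, 2.0, 3.0])     # hypothetical sigma_t, one per period

x = rng.normal(size=(n, T, K))             # x[i, t] is x_it'
u = rng.normal(size=(n, T)) * sigma        # Var(u_it) = sigma_t^2, uncorrelated over t
y = x @ beta + u                           # n x T

def pooled_ols(xa, ya):
    """Pooled OLS on arrays of shape (n, T, K) and (n, T)."""
    return np.linalg.lstsq(xa.reshape(-1, xa.shape[-1]), ya.reshape(-1), rcond=None)[0]

# Step 1: pooled OLS residuals give one variance estimate per period (average over i).
b_pols = pooled_ols(x, y)
resid = y - x @ b_pols
sigma2_hat = (resid ** 2).mean(axis=0)     # length T

# Step 2: divide every period-t observation by sigma_hat_t, then run pooled OLS.
w = 1.0 / np.sqrt(sigma2_hat)
b_weighted = pooled_ols(x * w[None, :, None], y * w)

# Direct FGLS formula with Omega_hat = diag(sigma2_hat), for comparison.
omega_inv = np.diag(1.0 / sigma2_hat)
lhs = np.einsum("itk,ts,isl->kl", x, omega_inv, x)
rhs = np.einsum("itk,ts,is->k", x, omega_inv, y)
b_fgls = np.linalg.solve(lhs, rhs)

print(np.allclose(b_weighted, b_fgls))     # weighted pooled OLS versus the FGLS formula
```

Because $\hat\Omega$ is diagonal here, the FGLS sums separate period by period, which is why the weighted pooled regression reproduces them.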


  5. Consider the FRINGE.RAW dataset from Wooldridge (2002, p. 165) on wages and
     fringe benefits for 616 workers. Estimate a two-equation system for hourly wage
     (hrearn) and hourly benefits (hrbens). Include among the regressors (the same for each
     equation) variables such as education, experience, tenure and dummies for belonging
     to a union, south, nrtheast, nrthcen, married, white and male. [EVIEWS:
     objects/new objects/system/spec and write down the equations.]

      (a) Estimate the system by OLS. Check that the results are the same as when
          estimating each equation separately by OLS. Why is that?
      (b) Check the correlation between the residuals of the two equations. [view/
          residuals/ correlation]
      (c) Estimate both equations by FGLS using the SUR option in Eviews. What
          would have been the result of FGLS (Eviews SUR) if the previous residual
          correlation were zero?
      (d) Investigate the sign of the regression coefficients.
      (e) Test the joint significance in both equations of married and white (four
          restrictions) by means of a Wald test. [view/Wald coefficient tests].
       (f) Disaggregate the benefits categories into value of vacation days, value of sick
           leave, value of employer-provided insurance and value of pension. Use hourly
           measures of these variables along with hrearn and estimate a 5-equation SUR
           model.
          Does marital status appear to affect any form of compensation?
          Test whether another year of education increases expected pension value and
          expected insurance by the same amount.
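
For readers working outside Eviews, the steps in (a), (b) and (e) might be sketched along the following lines. The data here are simulated, not FRINGE.RAW; the regressor list is a hypothetical subset, the coefficient values are made up, and the variance formula $\hat\Omega \otimes (x'x)^{-1}$ assumes system homoskedasticity, $E[uu' \mid x] = \Omega$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, G = 616, 2                                   # 616 workers, two equations
names = ["const", "educ", "married", "white"]   # hypothetical subset of the regressors
K = len(names)

# Simulated stand-in for the data set (NOT the real FRINGE.RAW values).
x = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([[1.0, 0.5, 1.0, 1.0],          # row g is beta_g' (made-up values)
                 [0.5, 0.2, 1.0, 1.0]])
u = rng.normal(size=(n, G)) @ np.array([[1.0, 0.0], [0.6, 0.8]]).T
y = x @ beta.T + u                              # columns play the role of hrearn, hrbens

# (a) System OLS with identical regressors: one OLS fit per column of y.
xtx_inv = np.linalg.inv(x.T @ x)
b = xtx_inv @ x.T @ y                           # K x G
resid = y - x @ b

# (b) Correlation between the residuals of the two equations.
omega_hat = resid.T @ resid / n
resid_corr = np.corrcoef(resid, rowvar=False)[0, 1]

# (e) Wald test that married and white have zero coefficients in both equations.
b_stacked = b.reshape(-1, order="F")            # (beta_1', beta_2')'
V = np.kron(omega_hat, xtx_inv)                 # variance estimate under homoskedasticity
idx = [g * K + names.index(v) for g in range(G) for v in ("married", "white")]
R = np.eye(G * K)[idx]                          # 4 x GK selection matrix
r = R @ b_stacked
wald = r @ np.linalg.solve(R @ V @ R.T, r)

print(resid_corr)
print(wald, wald > 9.488)                       # 9.488 is the 5% critical value of chi2(4)
```

A heteroskedasticity-robust version would replace `V` by a sandwich estimator; the Kronecker form is used only because it is the simplest one consistent with the homoskedasticity assumption stated above.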



  6. Consider the panel data model to explain annual family saving over a five-year
     span:
          \[ \mathit{sav}_t = \beta_0 + \beta_1 \mathit{inc}_t + \beta_2 \mathit{age}_t + \beta_3 \mathit{educ}_t + u_t, \qquad t = 1, \ldots, 5, \]
     where $\mathit{inc}_t$ is annual income, $\mathit{educ}_t$ is years of education of the household head,
     and $\mathit{age}_t$ is age of the household head.
     If we add "wealth at the beginning of year t" to the saving equation, is the strict
     exogeneity assumption likely to hold?

  7. Use the GPA3.RAW data set from Wooldridge (2002, p. 173) to investigate the
     effect of being in season on grade point average. The data are on 366 student-athletes
     at a large university. There are two semesters of data for each student
     (T = 2). Of primary interest is the in-season effect on athletes' GPA:

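          \[
          \begin{aligned}
          \mathit{trmgpa}_{it} = {} & \beta_0 + \beta_1 \mathit{spring}_t + \beta_2 \mathit{cumgpa}_{it} + \beta_3 \mathit{crsgpa}_{it} + \beta_4 \mathit{frstsem}_{it} \\
          & + \beta_5 \mathit{season}_{it} + \beta_6 \mathit{SAT}_i + \beta_7 \mathit{verbmath}_i + \beta_8 \mathit{hspersc}_i \\
          & + \beta_9 \mathit{hssize}_i + \beta_{10} \mathit{black}_i + \beta_{11} \mathit{female}_i + u_{it}.
          \end{aligned}
          \]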

     The variable $\mathit{cumgpa}_{it}$ is cumulative GPA at the beginning of the term, and this
     clearly depends on past-term GPA, which introduces something similar to a lagged
     dependent variable. Some variables change over time (season) and
     others do not (SAT). Assume that $u_{it}$ is uncorrelated with all variables on
     the right-hand side. [Note that the data are stacked by cross section, so the two
     observations for a given student, one per period, are consecutive.]

      (a) Estimate the equation by (pooled) OLS on the stacked data file. Is the
          in-season effect statistically significant?
      (b) Under which conditions are the reported standard errors valid? Will the
          absence of these conditions affect the consistency of the OLS estimates?
      (c) Describe how you would test such assumptions given the structure of the data
          set (and the residuals).

  8. Show that S2SLS produces 2SLS equation by equation in EX. 5.

  9. Consider the standard panel data model

          \[ y_t = \underbrace{x_t'}_{1 \times K} \beta + u_t \tag{1.17} \]

     and $x_t$ might have some elements correlated with $u_t$. Let $z_t$ be an $L \times 1$ vector of
     instruments, $L \geq K$, such that $E[z_t u_t] = 0$, $t = 1, 2, \ldots, T$.



      (a) Give an expression for the S2SLS estimator if the instrument matrix is

          \[ \underbrace{Z}_{L \times T} = (z_1, z_2, \ldots, z_T). \]

          Show that this is the Pooled 2SLS estimator obtained by 2SLS estimation of
          (1.17) using instruments zt , pooled across all t.
      (b) What is the rank condition for the pooled 2SLS estimator?
      (c) Propose a consistent estimate of the asymptotic variance of the Pooled 2SLS
          estimate without using further assumptions.
      (d) Show that under

          \[ E[u_t \mid z_t, u_{t-1}, z_{t-1}, \ldots, u_1, z_1] = 0, \qquad t = 1, 2, \ldots, T \tag{1.18} \]
          \[ E[u_t^2 \mid z_t] = \sigma^2, \qquad t = 1, 2, \ldots, T \tag{1.19} \]

          the usual standard errors and test statistics from the Pooled 2SLS estimation
          are valid.
      (e) What estimator would you use under condition (1.18) and relaxing condition
          (1.19) to
          \[ E[u_t^2 \mid z_t] = E[u_t^2] = \sigma_t^2, \qquad t = 1, 2, \ldots, T. \]

          (You would probably need a first step using an initial Pooled 2SLS estima-
          tion).
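
The pooled 2SLS estimator of part (a) and a variance estimate of the kind asked for in (c) can be sketched on simulated data as follows. Everything here is an assumption made for illustration: the dimensions, the first-stage matrix `Pi`, the error design, and the choice of a sandwich estimator that sums $z_t \hat u_t$ within each cross-section unit.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, K, L = 2000, 3, 2, 3                      # hypothetical dimensions, L >= K
beta = np.array([1.0, 2.0])                     # made-up true coefficients
Pi = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])   # made-up first-stage coefficients

z = rng.normal(size=(n, T, L))                  # z[i, t] is z_it'
v = rng.normal(size=(n, T, K))
x = z @ Pi + v                                  # regressors driven by the instruments
u = 0.8 * v[..., 0] + rng.normal(size=(n, T))   # correlated with x, uncorrelated with z
y = x @ beta + u

# Stack all (i, t) observations and apply the usual 2SLS formula.
Z, X, Y = z.reshape(-1, L), x.reshape(-1, K), y.reshape(-1)
ZtZ, ZtX, ZtY = Z.T @ Z, Z.T @ X, Z.T @ Y
A = ZtX.T @ np.linalg.solve(ZtZ, ZtX)           # X'Z (Z'Z)^{-1} Z'X
b_p2sls = np.linalg.solve(A, ZtX.T @ np.linalg.solve(ZtZ, ZtY))

# Same estimator computed in two explicit stages.
X_hat = Z @ np.linalg.solve(ZtZ, ZtX)           # first-stage fitted values
b_two_stage = np.linalg.lstsq(X_hat, Y, rcond=None)[0]

# Variance estimate robust to heteroskedasticity and serial correlation within i:
# sum the moment contributions z_it * u_hat_it over t before forming outer products.
u_hat = y - x @ b_p2sls
s = np.einsum("itl,it->il", z, u_hat)           # n x L
C = np.linalg.solve(A, ZtX.T @ np.linalg.inv(ZtZ))    # K x L
V_robust = C @ (s.T @ s) @ C.T
se = np.sqrt(np.diag(V_robust))

print(b_p2sls, se)
print(np.allclose(b_p2sls, b_two_stage))
```

Under (1.18) and (1.19) the middle term `s.T @ s` could be replaced by $\hat\sigma^2 Z'Z$, which gives the usual 2SLS standard errors referred to in (d).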

 10. Let $x$ be a $K \times 1$ random vector and let $z$ be an $M \times 1$ random vector. Suppose
     that the following linear conditional expectation condition holds,

          \[ E[x \mid z] = L[x \mid z]. \]

     Let $h(z)$ be any $q \times 1$ vector of nonlinear functions of $z$, and define an expanded list
     of IV $w := [z', h(z)']'$.

      (a) Show that
          \[ \mathrm{rank}\, E[zx'] = \mathrm{rank}\, E[wx']. \]

      (b) Consider the system of equations

          \[
          \begin{aligned}
          y_1 &= x_1'\beta_1 + u_1 \\
          &\;\;\vdots \\
          y_G &= x_G'\beta_G + u_G
          \end{aligned}
          \]




          and let z be a vector of exogenous variables in every equation so that

                                   E [ug |z] = 0,   g = 1, . . . , G,

          allowing for any nonlinear function of z to be a valid instrument in every
          equation. Suppose that E [xg |z] is linear in z for all g.
          Show that adding nonlinear functions of z to the instrument list cannot help
          in satisfying the rank condition.
      (c) What happens when E [xg |z] is a nonlinear function of z for some g?

 11. Describe situations where System GMM procedures are equivalent to GMM equa-
     tion by equation procedures in overidentified systems.

 12. Obtain the asymptotic distribution of the 3SLS estimate when neither ASS. ?? nor
     ASS. 12 holds.

 13. Consider the system of equations

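          \[
          \begin{aligned}
          y_1 &= x_1'\beta_1 + u_1 \\
          y_2 &= x_2'\beta_2 + u_2
          \end{aligned}
          \]

     with the following instrument matrix

          \[ Z = \begin{pmatrix} z_1 & 0 \\ 0 & z_2 \end{pmatrix} \]

     and where the covariance matrix $\Omega$ of $u := (u_1, u_2)'$ has inverse

          \[ \Omega^{-1} = \begin{pmatrix} \sigma^{11} & \sigma^{12} \\ \sigma^{21} & \sigma^{22} \end{pmatrix}. \]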

      (a) Find $E[Z \Omega^{-1} u]$ and show that it is not necessarily 0 under the orthogonality
          conditions $E[z_1 u_1] = 0$ and $E[z_2 u_2] = 0$.
      (b) What happens when Ω is diagonal?
      (c) What if z1 = z2 for a general Ω?
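
As a starting point for part (a), the product can be written out blockwise. The expansion below is a sketch that treats $\Omega$ as a nonrandom matrix and writes $\sigma^{gh}$ for the $(g,h)$ element of $\Omega^{-1}$ (a notational assumption made here):

\[
Z \Omega^{-1} u
= \begin{pmatrix} z_1 & 0 \\ 0 & z_2 \end{pmatrix}
  \begin{pmatrix} \sigma^{11} & \sigma^{12} \\ \sigma^{21} & \sigma^{22} \end{pmatrix}
  \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}
= \begin{pmatrix} z_1 (\sigma^{11} u_1 + \sigma^{12} u_2) \\ z_2 (\sigma^{21} u_1 + \sigma^{22} u_2) \end{pmatrix},
\]
\[
E[Z \Omega^{-1} u]
= \begin{pmatrix} \sigma^{11} E[z_1 u_1] + \sigma^{12} E[z_1 u_2] \\ \sigma^{21} E[z_2 u_1] + \sigma^{22} E[z_2 u_2] \end{pmatrix}.
\]

The two stated orthogonality conditions remove the first term in the upper block and the second term in the lower block; the cross moments $E[z_1 u_2]$ and $E[z_2 u_1]$ are left unrestricted, which is what parts (b) and (c) ask you to examine.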



