Multiple Regression Analysis
Earlier we covered finite sample (or exact) properties of
the OLS estimators. We had the following results:
Assumption MLR.1: the model of interest is linear in
parameters: $y = X\beta + u$
Assumption MLR.2: our data are a random sample of size n
Assumption MLR.3: the matrix of regressors $X = [\iota \;\; x_1 \;\; \cdots \;\; x_k]$
has full column rank
Assumption MLR.4: Zero conditional mean
$E(u \mid x_1, \dots, x_k) = E(u \mid X) = 0$
Given Assumptions MLR.1-4, the OLS estimator is unbiased:
$E(b \mid X) = \beta$. This is an exact property because it holds for
any sample size (including small ones).
Assumption MLR.5: Spherical disturbances
$V(u_i \mid X) = E(u_i^2 \mid X) = \sigma^2$ => homoskedasticity
$\mathrm{Cov}(u_i, u_j \mid X) = 0, \; i \neq j$ => non-autocorrelation
With this additional assumption, OLS can be shown
to be the best linear unbiased estimator (Gauss-Markov theorem).
This is also an exact property of the OLS.
Assumption MLR.6: conditional on X, the errors are normally
distributed with mean 0 and constant variance $\sigma^2$: $u \mid X \sim N(0, \sigma^2 I)$.
This assumption allows making a stronger claim about the
OLS estimator: it is the best among all (not just linear)
unbiased estimators.
As above, this is an exact property of the OLS.
In addition to making OLS the best unbiased estimator,
this last assumption also allows us to construct the exact
sampling distribution of b, which leads to the t and F
distributions for the t and F statistics.
What happens to our results if Assumption MLR.6
does not hold?
Fortunately, given all the other assumptions we made, OLS
has good large sample (or asymptotic) properties.
In particular, the t and F statistics have approximately t and F
distributions, at least in large samples.
Consistency of the OLS
Sometimes an estimator fails to be unbiased. In those
cases we require it to at least be consistent:
$P(|b_n - \beta| > \varepsilon) \to 0$ as $n \to \infty$, for any $\varepsilon > 0$
As the sample size grows, the probability that the estimator is far from
the true parameter approaches zero.
Consistency is a minimal requirement for an estimator.
Theorem: Under Assumptions MLR.1-4, the OLS estimator is
unbiased (as we know) and also consistent.
For the case of the simple regression model:

$b_1 = \frac{\sum_{i=1}^n (x_i - \bar{x}) y_i}{\sum_{i=1}^n (x_i - \bar{x})^2} = \beta_1 + \frac{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x}) u_i}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2} \xrightarrow{p} \beta_1 + \frac{\mathrm{Cov}(x, u)}{\mathrm{Var}(x)} = \beta_1$

In this derivation we used the fact that $\mathrm{Cov}(x, u) = 0$,
which is a weaker assumption than SLR.4 (zero conditional mean of u given x).
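The convergence above can be illustrated with a small simulation (all numbers here are illustrative assumptions, not from the notes): the sample slope, computed exactly as the ratio in the formula, settles down near the true $\beta_1$ as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 1.0, 2.0  # assumed true parameters for the demo

def ols_slope(n, rng):
    x = rng.normal(size=n)
    u = rng.normal(size=n)  # E(u|x) = 0, so Cov(x, u) = 0
    y = beta0 + beta1 * x + u
    # b1 = sum((x - xbar) * y) / sum((x - xbar)^2), the ratio that
    # converges in probability to beta1 + Cov(x, u)/Var(x) = beta1
    return ((x - x.mean()) * y).sum() / ((x - x.mean()) ** 2).sum()

for n in (50, 500, 50_000):
    print(n, ols_slope(n, rng))
```

Each printed estimate is random, but the spread around $\beta_1 = 2$ shrinks as n increases, which is exactly what consistency promises.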
A little bit of matrix algebra to deal with the multiple linear
regression case:

$b = (X'X)^{-1} X'y = (X'X)^{-1} X'(X\beta + u) = \beta + (X'X)^{-1} X'u = \beta + \left(\frac{X'X}{n}\right)^{-1} \frac{X'u}{n}$

$\mathrm{plim}(b) = \mathrm{plim}(\beta) + \mathrm{plim}\left[\left(\frac{X'X}{n}\right)^{-1} \frac{X'u}{n}\right] = \beta + \mathrm{plim}\left(\frac{X'X}{n}\right)^{-1} \mathrm{plim}\left(\frac{X'u}{n}\right)$

Let's assume that $\lim_{n \to \infty} \frac{X'X}{n} = Q$, so that

$\mathrm{plim}(b) = \beta + Q^{-1}\, \mathrm{plim}\left(\frac{X'u}{n}\right)$

If $\mathrm{plim}\left(\frac{X'u}{n}\right) = 0$, OLS is consistent.

1. $E\left[\frac{X'u}{n} \,\middle|\, X\right] = \frac{1}{n} X' E(u \mid X) = 0 \;\Rightarrow\; \lim E\left(\frac{X'u}{n}\right) = 0$

2. $V\left[\frac{X'u}{n} \,\middle|\, X\right] = \frac{1}{n^2} X' V(u \mid X) X = \frac{\sigma^2}{n} \cdot \frac{X'X}{n} \;\Rightarrow\; \lim V\left(\frac{X'u}{n}\right) = 0 \cdot Q = 0$

Therefore $\mathrm{plim}\left(\frac{X'u}{n}\right) = 0$, and

$\mathrm{plim}(b) = \beta + Q^{-1} \cdot 0 = \beta$

OLS is consistent.
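A quick numerical sketch of the matrix version (the sample size and coefficient values are assumed for illustration): with exogenous regressors, $X'u/n$ is close to zero and $b = (X'X)^{-1}X'y$ is close to $\beta$ when n is large.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta = np.array([1.0, 2.0, -0.5])                            # assumed true beta
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # [iota x1 x2]
u = rng.normal(size=n)                                       # E(u|X) = 0
y = X @ beta + u

b = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y without explicit inverse
print(b)                                # close to beta for large n
print(X.T @ u / n)                      # X'u/n is close to the zero vector
```

Using `np.linalg.solve` rather than forming `(X'X)^{-1}` explicitly is the standard numerically stable way to compute the OLS coefficients.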
Previously we have seen that if $E(u \mid X) \neq 0$, the OLS
estimator is biased.
Now we can add that if any of the independent variables
are correlated with the error term, OLS will also be
inconsistent.
This is very bad: there is bias in small samples, and it does
not go away as we increase the sample size.
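This can be seen in a small simulation (the design, with $\mathrm{Cov}(x,u) = 0.5$ built in, is an assumed example): the slope estimate settles on $\beta_1 + \mathrm{Cov}(x,u)/\mathrm{Var}(x) = 2.5$ rather than the true value 2, no matter how large n gets.

```python
import numpy as np

rng = np.random.default_rng(0)

def slope_with_endogeneity(n, rho=0.5):
    x = rng.normal(size=n)
    u = rho * x + rng.normal(size=n)   # Cov(x, u) = rho != 0: x is endogenous
    y = 1.0 + 2.0 * x + u
    return ((x - x.mean()) * y).sum() / ((x - x.mean()) ** 2).sum()

# plim(b1) = beta1 + Cov(x, u)/Var(x) = 2 + 0.5 = 2.5, not 2
for n in (100, 10_000, 1_000_000):
    print(n, slope_with_endogeneity(n))
```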
Hypothesis testing without MLR 6
Assumption MLR.6 plays a crucial role in hypothesis testing:

$u \sim N(0, \sigma^2 I) \;\Rightarrow\; b \sim N(\beta, \sigma^2 (X'X)^{-1})$

$t = \frac{b_j - \beta_j}{se(b_j)} = \frac{SN}{\sqrt{\chi^2_{(n-k-1)} / (n-k-1)}} \sim t_{(n-k-1)}$

$F = \frac{\chi^2_J / J}{\chi^2_{(n-k-1)} / (n-k-1)} \sim F_{(J,\, n-k-1)}$
What happens to the distribution of t and F statistics when
the errors are not normal?
If the errors are not normally distributed, b-OLS will not be
normally distributed, $(b_j - \beta_j)/se(b_j)$ will not have an exact t distribution,
and the F statistic will not have an exact F distribution.
Theorem: Even though b-OLS is not normal when Assumption
MLR.6 does not hold, it is asymptotically normal =>
approximately normal in large enough samples.
More matrix algebra:

$b - \beta = (X'X)^{-1} X'u \;\Rightarrow\; \sqrt{n}\,(b - \beta) = \left(\frac{X'X}{n}\right)^{-1} \frac{X'u}{\sqrt{n}}$

We use the same assumption that $\frac{X'X}{n} \to Q$.
Using the central limit theorem it can be shown that

$\frac{X'u}{\sqrt{n}} \xrightarrow{d} N(0, \sigma^2 Q)$

$\sqrt{n}\,(b - \beta) = \left(\frac{X'X}{n}\right)^{-1} \frac{X'u}{\sqrt{n}} \xrightarrow{d} Q^{-1}\, N(0, \sigma^2 Q) = N(0, \sigma^2 Q^{-1})$

$b - \beta \approx N(0, \sigma^2 Q^{-1}/n) = N(0, \sigma^2 [nQ]^{-1}) \;\Rightarrow\; b \approx N(\beta, \sigma^2 [nQ]^{-1})$
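The CLT result can be checked by simulation (the design is an assumed example): even with strongly skewed, non-normal errors, repeated draws of $\sqrt{n}(b_1 - \beta_1)$ behave approximately like $N(0, \sigma^2/\mathrm{Var}(x))$, which equals N(0, 1) in the setup below.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, beta1 = 500, 2_000, 2.0
draws = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.exponential(1.0, size=n) - 1.0   # mean 0, variance 1, heavily skewed
    y = 1.0 + beta1 * x + u
    b1 = ((x - x.mean()) * y).sum() / ((x - x.mean()) ** 2).sum()
    draws[r] = np.sqrt(n) * (b1 - beta1)

# asymptotic distribution is N(0, sigma^2 / Var(x)) = N(0, 1) here
print(draws.mean(), draws.std())
```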
How do we estimate $\sigma^2 [nQ]^{-1}$? Can we use $s^2 (X'X)^{-1}$?
It can be shown that $s^2$ is a consistent estimator of $\sigma^2$:
$\mathrm{plim}(s^2) = \sigma^2$
(in addition to being unbiased; HW problem).
Also, since $X'X \to nQ$, we can use $s^2 (X'X)^{-1}$
as an appropriate estimator of the covariance matrix.
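A minimal sketch of this estimator on simulated data (the sample size, coefficients, and error scale are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 1_000, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])                      # assumed true beta
y = X @ beta + rng.normal(scale=sigma, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b
k = X.shape[1] - 1                       # number of slope regressors
s2 = resid @ resid / (n - k - 1)         # s^2: unbiased and consistent for sigma^2
V = s2 * np.linalg.inv(X.T @ X)          # estimated covariance matrix of b
se = np.sqrt(np.diag(V))                 # standard errors of the b_j
print(s2, se)
```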
From these results it follows that

$b_j \approx N(\beta_j, \mathrm{Var}(b_j)) \;\Rightarrow\; (b_j - \beta_j)/se(b_j) \approx N(0, 1)$

where, as before,

$\mathrm{Var}(b_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}, \qquad SST_j = \sum_{i=1}^n (x_{ij} - \bar{x}_j)^2$
And the R-squared ($R_j^2$) is from regressing $x_j$ on all the other
independent variables. Now:
• Take the square root of the asymptotic variance of $b_j$
• Estimate σ with s
And we have the estimated asymptotic standard error of $b_j$,
which can be used to do hypothesis testing. All our steps
will be as before; the difference is that now the results are
asymptotic (valid in large samples).
From these results it follows that

$(b_j - \beta_j)/se(b_j) \approx N(0, 1)$

(use the t table for testing, since it shows critical values for the
standard normal when n is large).
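Putting the pieces together, an asymptotic t test of $H_0\!: \beta_j = 0$ can be sketched as follows (all data and coefficient values below are assumed for illustration; the errors are deliberately non-normal):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 0.3, 0.0])                  # x2 has a true zero coefficient
y = X @ beta + rng.standard_t(df=5, size=n)       # non-normal errors

b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b
s2 = resid @ resid / (n - X.shape[1])             # s^2 with n - k - 1 df
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
t_stats = b / se                                  # tests H0: beta_j = 0
print(t_stats)  # reject H0 at the 5% level when |t| > 1.96, asymptotically
```

With n this large, the 5% critical value from the t table is essentially the standard normal value 1.96, so the test should reject for the slope on x1 and not for the slope on x2.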
How about the F statistic? It has an approximate F
distribution in large samples, so we use the same steps as
before to perform hypothesis testing.
To show the asymptotic normality of b-OLS we do need to
assume homoskedasticity (same error variance for all
observations). The heteroskedasticity case will be
discussed in our next topic.
Finally, it can be shown that under Assumptions MLR.1-5
(the Gauss-Markov assumptions), b-OLS is asymptotically
efficient among a certain class of estimators B:

$\mathrm{Asymptotic\ } V(b_{OLS}) \le \mathrm{Asymptotic\ } V(B)$
To claim the asymptotic normality and efficiency of b-
OLS we do need Assumptions MLR.1-5. In particular, we
NEED to assume homoskedasticity (same error variance
for all observations).
The case of heteroskedasticity is what we will discuss next.
MLR 1,2,3,4 => b-OLS is unbiased AND consistent
MLR 1,2,3,4 + MLR 5 => b-OLS is BLUE
MLR 1,2,3,4 + MLR 5 + MLR 6 => b-OLS is BUE
Estimator   Properties   Best under MLR 1-5   Best under MLR 1-6
b-OLS       Unb, Lin
B1          Unb, Lin     b-OLS                b-OLS
B2          Unb, Lin     b-OLS                b-OLS
B3          Bsd, Nlin
B4          Unb, Nlin                         b-OLS
B5          Unb, Lin     b-OLS                b-OLS
B6          Bsd, Lin
B7          Unb, Nlin                         b-OLS
(Unb = unbiased, Bsd = biased, Lin = linear, Nlin = nonlinear; an entry of b-OLS
means b-OLS has the smaller variance in the comparison with that estimator.)