# regression

Regression Analysis
• Linear Regression Model
–   Method of OLS
–   Properties of OLS Estimators
–   Goodness-of-Fit
–   Inference
• Multiple Regression Model
– Estimation
– Goodness-of-Fit
– Inference

Sisir Sarma                18.318: Introduction to Econometrics
The Simple Regression Model
• Economic Model: y = b0 + b1x
• Examples: Consumption Function, Savings
Function, Demand Function, Supply Function, etc.
• The parameters of interest in this model are
b0 and b1, which we wish to estimate.
• A simple regression model can be written as
y = b0 + b1x + ε

Some Terminology
In the simple linear regression model, where y =
b0 + b1x + ε, we typically refer to y as the

•   Dependent Variable, or
•   Left-Hand Side Variable, or
•   Explained Variable, or
•   Regressand

Some Terminology (cont.)
In the simple linear regression of y on x, we
typically refer to x as the
• Independent Variable, or
• Right-Hand Side Variable, or
• Explanatory Variable, or
• Regressor, or
• Covariate, or
• Control Variable
A Simple Assumption
The average value of ε, the error term, in the
population is 0. That is,

E(ε) = 0

This is not a restrictive assumption, since we can
always use b0 to normalize E(ε) to 0.

Zero Conditional Mean
• We need to make a crucial assumption about
how ε and x are related
• We want it to be the case that knowing
something about x does not give us any
information about ε, so that they are
completely unrelated. That is,
• E(ε|x) = E(ε) = 0, which implies
• E(y|x) = b0 + b1x

Ordinary Least Squares
• Basic idea of regression is to estimate the
population parameters from a sample.
• Let {(xi,yi): i = 1, …, n} denote a random
sample of size n from the population.
• For each observation in this sample, it will be
the case that
yi = b0 + b1xi + εi
This is the econometric model.
Population regression line, sample data points
and the associated error terms

[Figure: scatter of sample points (x1, y1), …, (x4, y4) around the population regression line E(y|x) = b0 + b1x, with vertical distances ε1, …, ε4 marking the error terms]
Deriving OLS Estimates
• To derive the OLS estimates we need to realize
that our main assumption of E(ε|x) = E(ε) = 0
also implies that

• Cov(x, ε) = E(xε) = 0

• Why? Remember from basic probability that
Cov(X,Y) = E(XY) – E(X)E(Y), so Cov(x, ε) =
E(xε) – E(x)E(ε) = E(xε), since E(ε) = 0.
Deriving OLS (cont.)
• We can write our 2 restrictions just in terms of
x, y, b0 and b1, since ε = y – b0 – b1x

• E(y – b0 – b1x) = 0
• E[x(y – b0 – b1x)] = 0

• These are called moment restrictions
Deriving OLS using M.O.M.
• The method of moments approach to estimation
implies imposing the population moment
restrictions on the sample moments.

• What does this mean? Recall that for E(X), the
mean of a population distribution, a sample
estimator of E(X) is simply the arithmetic mean
of the sample.

More Derivation of OLS
• We want to choose values of the parameters
that will ensure that the sample versions of our
moment restrictions are true.
• The sample versions are as follows:

(1/n) Σ (yi – b̂0 – b̂1xi) = 0

(1/n) Σ xi(yi – b̂0 – b̂1xi) = 0
More Derivation of OLS
• Given the definition of a sample mean, and
properties of summation, we can rewrite the
first condition as follows

ȳ = b̂0 + b̂1x̄,
or
b̂0 = ȳ – b̂1x̄
The OLS estimated slope is

b̂1 = Σ (xi – x̄)(yi – ȳ) / Σ (xi – x̄)²

provided that Σ (xi – x̄)² > 0
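As a sketch of the two formulas above, here is a tiny numerical example; the data are hypothetical, purely for illustration:

```python
# Hypothetical sample, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# slope: sum of cross-deviations over sum of squared deviations of x
b1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
         sum((xi - x_bar) ** 2 for xi in x)
# intercept, from the first moment condition
b0_hat = y_bar - b1_hat * x_bar
# for this sample, b1_hat ≈ 0.6 and b0_hat ≈ 2.2
```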
Summary of OLS slope estimate
• The slope estimate is the sample covariance between x
and y divided by the sample variance of x. [Note: if
you divide both the numerator and the denominator by
(n-1), we get the sample covariance and the sample
variance formulas, respectively].
• If x and y are positively correlated, the slope will be
positive.
• If x and y are negatively correlated, the slope will be
negative.
• We only need x to vary in our sample.
More OLS
• Intuitively, OLS is fitting a line through the
sample points such that the sum of squared
residuals is as small as possible, hence the term
least squares.

• The residual, ε̂, is an estimate of the error
term, ε, and is the difference between the fitted
line (sample regression function) and the
sample point.
Sample regression line, sample data points
and the associated estimated error terms

[Figure: scatter of sample points around the sample regression line ŷ = b̂0 + b̂1x, with vertical distances ε̂1, …, ε̂4 marking the residuals]
Alternate approach to Derivation
(The Textbook)
• Given the intuitive idea of fitting a line, we can
set up a formal minimization problem.
• That is, we want to choose our parameters such
that we minimize the SSR:

Σ ε̂i² = Σ (yi – b̂0 – b̂1xi)²
Alternate approach (cont.)
• If one uses calculus to solve the minimization
problem for the two parameters you obtain the
following first order conditions, which are the
same as we obtained before, multiplied by n

Σ (yi – b̂0 – b̂1xi) = 0

Σ xi(yi – b̂0 – b̂1xi) = 0
Algebraic Properties of OLS
• The sum of the OLS residuals is zero.
• Thus, the sample average of the OLS residuals
is zero as well.
• The sample covariance between the regressors
and the OLS residuals is zero.
• The OLS regression line always goes through
the mean of the sample.

Algebraic Properties (precise)

Σ ε̂i = 0, and thus (1/n) Σ ε̂i = 0

Σ xiε̂i = 0

ȳ = b̂0 + b̂1x̄
More terminology
We can think of each observation as being made
up of an explained part, and an unexplained part,
yi = ŷi + ε̂i. We then define the following:

Σ (yi – ȳ)² is the total sum of squares (TSS)

Σ (ŷi – ȳ)² is the explained sum of squares (ESS)

Σ ε̂i² is the sum of squared residuals (SSR)

Then TSS = ESS + SSR, or ESS = TSS – SSR
Proof that TSS = ESS + SSR

Σ (yi – ȳ)² = Σ [(yi – ŷi) + (ŷi – ȳ)]²
= Σ [ε̂i + (ŷi – ȳ)]²
= Σ ε̂i² + 2 Σ ε̂i(ŷi – ȳ) + Σ (ŷi – ȳ)²
= SSR + 2 Σ ε̂i(ŷi – ȳ) + ESS

and we know that Σ ε̂i(ŷi – ȳ) = 0
Goodness-of-Fit
• How do we think about how well our sample
regression line fits our sample data?
• Can compute the fraction of the total sum of
squares (TSS) that is explained by the model,
call this the R-squared of regression
• R2 = ESS/TSS = 1 – SSR/TSS
• Since SSR lies between 0 and TSS, R2 will
always lie between 0 and 1.

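The decomposition TSS = ESS + SSR and the resulting R² can be checked numerically; the sample below is hypothetical, purely illustrative:

```python
# Hypothetical sample, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# OLS slope and intercept from the earlier formulas
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]              # fitted values
resid = [yi - yh for yi, yh in zip(y, y_hat)]   # residuals

tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((yh - y_bar) ** 2 for yh in y_hat)
ssr = sum(e ** 2 for e in resid)

r_squared = ess / tss  # equivalently 1 - ssr / tss
```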
Unbiasedness of OLS
• Assume the population model is linear in
parameters as y = b0 + b1x + ε
• Assume we can use a random sample of size n,
{(xi, yi): i = 1, 2, …, n}, from the population
model. Thus we can write the sample model
yi = b0 + b1xi + εi
• Assume E(ε|x) = 0 and thus E(εi|xi) = 0
• Assume there is variation in the xi
Unbiasedness of OLS (cont.)
• In order to think about unbiasedness, we need
to rewrite our estimator in terms of the
population parameter.

b̂1 = Σ (xi – x̄)yi / Σ (xi – x̄)²
Unbiasedness of OLS (cont.)

b̂1 = b1 + Σ (xi – x̄)εi / Σ (xi – x̄)²

So, E(b̂1) = b1
Unbiasedness Summary
• The OLS estimates of b1 and b0 are unbiased
• Proof of unbiasedness depends on our 4
assumptions – if any assumption fails, then
OLS is not necessarily unbiased
• Remember unbiasedness is a description of the
estimator – in a given sample we may be “near”
or “far” from the true parameter

Variance of the OLS Estimators
• Now we know that the sampling distribution of
our estimate is centered around the true
parameter.
• Next we would like to know how spread out that
distribution is.
• Assume Var(|x) = s2 (Homoskedasticity)

Variance of OLS (cont.)
• Var(ε|x) = E(ε²|x) – [E(ε|x)]²
• E(ε|x) = 0, so s² = E(ε²|x) = E(ε²) = Var(ε)
• Thus s² is also the unconditional variance,
called the error variance.
• s, the square root of the error variance, is called
the standard deviation of the error.
• Can say: E(y|x) = b0 + b1x and Var(y|x) = s².
Homoskedastic Case

[Figure: conditional densities f(y|x) at x1 and x2 with equal spread around the line E(y|x) = b0 + b1x]

Heteroskedastic Case

[Figure: conditional densities f(y|x) at x1, x2, x3 with increasing spread around the line E(y|x) = b0 + b1x]
Variance of OLS

Var(b̂1) = s² / Σ (xi – x̄)²

Var(b̂0) = s² Σ xi² / [n Σ (xi – x̄)²]
Variance of OLS Summary
• The larger the error variance, s², the larger the
variance of the slope estimate, for a given
Σ (xi – x̄)²
• The larger the variability in the xi, the smaller
the variance of the slope estimate.
• As a result, a larger sample size should
decrease the variance of the slope estimate.
• Problem that the error variance is unknown.
Estimating the Error Variance
• We don’t know what the error variance, s², is,
because we don’t observe the errors, εi.

• What we observe are the residuals, ε̂i.
• We can use the residuals to form an estimate of
the error variance.
Error Variance Estimate (cont.)

An unbiased estimator of s² is

ŝ² = [1/(n – 2)] Σ ε̂i² = SSR/(n – 2)
Error Variance Estimate (cont.)

ŝ = √ŝ² = standard error of the regression

Recall that sd(b̂1) = s / [Σ (xi – x̄)²]^½.

If we substitute ŝ for s then we have
the standard error of b̂1:

se(b̂1) = ŝ / [Σ (xi – x̄)²]^½
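Putting the last two slides together, a short sketch; the values for ŝ² and Σ (xi – x̄)² are hypothetical:

```python
import math

# Hypothetical inputs (illustration only)
sigma2_hat = 0.8  # estimated error variance, SSR / (n - 2)
sxx = 10.0        # sum((xi - x_bar)**2)

sigma_hat = math.sqrt(sigma2_hat)   # standard error of the regression
se_b1 = sigma_hat / math.sqrt(sxx)  # standard error of the slope estimate
```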
Multiple Regression Analysis

y = b0 + b1x1 + b2x2 + . . . + bkxk + ε

Estimation

Parallels with Simple Regression
•    b0 is still the intercept
•    b1 to bk all called slope parameters
• ε is still the error term
• Still need to make a zero conditional mean
assumption, so now assume that
• E(ε|x1, x2, …, xk) = 0
• Still minimizing the sum of squared
residuals, so have k+1 first order conditions
Interpreting Multiple Regression

ŷ = b̂0 + b̂1x1 + b̂2x2 + … + b̂kxk, so

Δŷ = b̂1Δx1 + b̂2Δx2 + … + b̂kΔxk,

so holding x2, …, xk fixed implies that
Δŷ = b̂1Δx1, that is, each b̂ has
a ceteris paribus interpretation
Simple vs Multiple Reg Estimate

Compare the simple regression ỹ = b̃0 + b̃1x1
with the multiple regression ŷ = b̂0 + b̂1x1 + b̂2x2

Generally, b̃1 ≠ b̂1 unless:
b̂2 = 0 (i.e. no partial effect of x2) OR
x1 and x2 are uncorrelated in the sample
Goodness-of-Fit
We can think of each observation as being made
up of an explained part, and an unexplained part,
yi = ŷi + ε̂i. We then define the following:

Σ (yi – ȳ)² is the total sum of squares (TSS)

Σ (ŷi – ȳ)² is the explained sum of squares (ESS)

Σ ε̂i² is the sum of squared residuals (SSR)

Then TSS = ESS + SSR
Goodness-of-Fit (cont.)
• How do we think about how well our
sample regression line fits our sample data?

• Can compute the fraction of the total sum
of squares (TSS) that is explained by the
model, call this the R-squared of regression

• R2 = ESS/TSS = 1 – SSR/TSS
• R2 can never decrease when another
independent variable is added to a
regression, and usually will increase

• Because R2 will usually increase with the
number of independent variables, it is not a
good way to compare models

• Recall that the R2 will always increase as more
variables are added to the model
• The adjusted R2 takes into account the number of
variables in a model, and may decrease

R̄² = 1 – [SSR/(n – k – 1)] / [TSS/(n – 1)]
   = 1 – ŝ² / [TSS/(n – 1)]
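A quick numerical sketch of the adjusted R² formula, with made-up summary numbers:

```python
# Hypothetical regression summary numbers (illustration only)
tss = 6.0    # total sum of squares
ssr = 2.4    # sum of squared residuals
n, k = 5, 1  # sample size and number of slope parameters

r2 = 1 - ssr / tss
# adjusted R^2 penalizes for the number of regressors
adj_r2 = 1 - (ssr / (n - k - 1)) / (tss / (n - 1))
```

Note that adj_r2 is always below r2 whenever k ≥ 1, which is the point of the adjustment.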
• Most packages will give you both R2 and adj-R2
• You can compare the fit of 2 models (with
the same y) by comparing the adj-R2
• You cannot use the adj-R2 to compare
models with different y’s (e.g. y vs. ln(y))

Goodness of Fit
• Important not to fixate too much on adj-R2 and
lose sight of theory and common sense
• If economic theory clearly predicts a variable
belongs, generally leave it in
• Don’t want to include a variable that prohibits a
sensible interpretation of the variable of interest
• Remember ceteris paribus interpretation of
multiple regression

Classical Linear Model: Inference
• The 4 assumptions for unbiasedness, plus
homoskedasticity assumption are known as the
Gauss-Markov assumptions.
• If the Gauss-Markov assumptions hold, OLS is
BLUE.
• In order to do classical hypothesis testing, we
need to add another assumption (beyond the
Gauss-Markov assumptions).
• Assume that ε is independent of x1, x2, …, xk and ε
is normally distributed with zero mean and
variance s²: ε ~ iid N(0, s²)
CLM Assumptions (cont.)
• Under CLM, OLS is not only BLUE, but is
the minimum variance unbiased estimator.
• We can summarize the population
assumptions of CLM as follows
• y|x ~ Normal(b0 + b1x1 +…+ bkxk, s2)
• While for now we just assume normality,
clear that sometimes not the case.
• Large samples will let us drop normality.
The homoskedastic normal distribution with
a single explanatory variable

[Figure: normal conditional densities f(y|x) at x1 and x2, each centered on the line E(y|x) = b0 + b1x and with the same variance]
Normal Sampling Distributions

Under the CLM assumptions, conditional on
the sample values of the independent variables,

b̂j ~ Normal(bj, Var(b̂j)), so that

(b̂j – bj) / sd(b̂j) ~ Normal(0, 1)

b̂j is distributed normally because it
is a linear combination of the errors
The t Test

Under the CLM assumptions,

(b̂j – bj) / se(b̂j) ~ t(n – k – 1)

Note this is a t distribution (vs normal)
because we have to estimate s² by ŝ²

Note the degrees of freedom: n – k – 1
The t Test (cont.)
• Knowing the sampling distribution for the
standardized estimator allows us to carry out
hypothesis tests
• For example, H0: bj = 0
• If we fail to reject the null, we conclude that xj
has no effect on y, controlling for the other x’s.
The t Test (cont.)

To perform our test we first need to form
“the” t statistic for b̂j:

t(b̂j) = b̂j / se(b̂j)

We will then use our t statistic along with
a rejection rule to determine whether to
reject the null hypothesis, H0
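A minimal sketch of forming this t statistic for H0: bj = 0, using hypothetical estimates:

```python
# Hypothetical estimates (illustration only)
b1_hat = 0.6           # estimated slope
se_b1 = 0.08 ** 0.5    # its standard error, about 0.283

# t statistic for H0: b1 = 0
t_stat = b1_hat / se_b1
```

The resulting t_stat (about 2.12) would then be compared against a critical value from the t(n – k – 1) distribution.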
t Test: One-Sided Alternatives
• Besides our null, H0, we need an alternative
hypothesis, HA, and a significance level
• HA may be one-sided, or two-sided
• HA: bj > 0 and HA: bj < 0 are one-sided
• HA: bj ≠ 0 is a two-sided alternative.
• If we want to have only a 5% probability of
rejecting H0 if it is really true, then we say
our significance level is 5%.
One-Sided Alternatives (cont.)
• Having picked a significance level, a, we
look up the (1 – a)th percentile in a t
distribution with n – k – 1 df and call this c,
the critical value.
• We can reject the null hypothesis if the t
statistic is greater than the critical value.
• If the t statistic is less than the critical value
then we fail to reject the null.

One-Sided Alternatives (cont.)
yi = b0 + b1xi1 + … + bkxik + εi

H0: bj = 0          HA: bj > 0

[Figure: t distribution with the fail-to-reject region (area 1 – a) to the left of the critical value c, and the rejection region (area a) to its right]
One-sided vs Two-sided
• Because the t distribution is symmetric, testing
HA: bj < 0 is straightforward. The critical value is
just the negative of before
• We can reject the null if the t statistic < –c, and if
the t statistic > –c then we fail to reject the
null
• For a two-sided test, we set the critical value
based on a/2 and reject H0: bj = 0 in favor of
HA: bj ≠ 0 if the absolute value of the t statistic > c
Two-Sided Alternatives

yi = b0 + b1xi1 + … + bkxik + εi

H0: bj = 0          HA: bj ≠ 0

[Figure: t distribution with rejection regions of area a/2 in each tail beyond –c and c, and the fail-to-reject region (area 1 – a) between them]
Summary for H0: bj = 0
• Unless otherwise stated, the alternative is
assumed to be two-sided
• If we reject the null, we typically say “xj is
statistically significant at the a % level”
• If we fail to reject the null, we typically say
“xj is statistically insignificant at the a %
level”

Testing other hypotheses
• A more general form of the t statistic
recognizes that we may want to test
something like H0: bj = aj
• In this case, the appropriate t statistic is

t = (b̂j – aj) / se(b̂j), where

aj = 0 for the standard test
Confidence Intervals
•  Another way to use classical statistical testing is
to construct a confidence interval using the same
critical value as was used for a two-sided test
• A (1 – a)% confidence interval is defined as

b̂j ± c · se(b̂j), where c is the (1 – a/2) percentile
in a t(n – k – 1) distribution
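A short sketch of such an interval; the estimates are hypothetical, and the critical value 3.182 is the 97.5th percentile of a t distribution with 3 degrees of freedom, taken from a standard t table:

```python
# Hypothetical 95% confidence interval for a slope estimate
b1_hat = 0.6
se_b1 = 0.08 ** 0.5

# 97.5th percentile of t with n - k - 1 = 3 df (from a standard t table)
c = 3.182

lower = b1_hat - c * se_b1
upper = b1_hat + c * se_b1
```

Here the interval straddles zero, so at the 5% level we would fail to reject H0: b1 = 0 for these hypothetical numbers.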
Computing p-values for t tests
• An alternative to the classical approach is
to ask, “what is the smallest significance
level at which the null would be rejected?”
• So, compute the t statistic, and then look up
what percentile it is in the appropriate t
distribution – this is the p-value
• p-value is the probability we would observe
the t statistic we did, if the null were true

Testing a Linear Combination
• Suppose instead of testing whether b1 is equal to a
constant, you want to test if it is equal to another
parameter, that is H0 : b1 = b2
• Use same basic procedure for forming a t statistic

t = (b̂1 – b̂2) / se(b̂1 – b̂2)
Testing Linear Comb. (cont.)
Since

se(b̂1 – b̂2) = √Var(b̂1 – b̂2), then

Var(b̂1 – b̂2) = Var(b̂1) + Var(b̂2) – 2Cov(b̂1, b̂2)

se(b̂1 – b̂2) = {[se(b̂1)]² + [se(b̂2)]² – 2s12}^½

where s12 is an estimate of Cov(b̂1, b̂2)
Testing a Linear Comb. (cont.)
• So, to use formula, need s12, which standard
output does not have
• Many packages will have an option to get it, or
will just perform the test for you
• In EViews, after ls y c x1 x2 … xk, in the window
with the regression results, select View, then
Coefficient Tests, then Wald tests, to do a Wald
test of the hypothesis that b1 = b2 (type c(2) = c(3))
Example:
• Suppose you are interested in the effect of
campaign expenditures on outcomes
• Model is voteA = b0 + b1log(expendA) +
b2log(expendB) + b3prtystrA + ε
• H0: b1 = –b2, or H0: q1 = b1 + b2 = 0
• b1 = q1 – b2, so substitute in and rearrange:
voteA = b0 + q1log(expendA) +
b2[log(expendB) – log(expendA)] + b3prtystrA + ε
Example (cont.)
• This is the same model as originally, but
now you get a standard error for b1 – b2 = q1
directly from the basic regression
• Any linear combination of parameters
could be tested in a similar manner
• Other examples of hypotheses about a
single linear combination of parameters: b1
= 1 + b2 ; b1 = 5b2 ; b1 = -1/2b2 ; etc

Multiple Linear Restrictions
• Everything we’ve done so far has involved
testing a single linear restriction, (e.g. b1 = 0
or b1 = b2 )
• However, we may want to jointly test multiple restrictions
• A typical example is testing “exclusion
restrictions” – we want to know if a group
of parameters are all equal to zero

Testing Exclusion Restrictions
• Now the null hypothesis might be
something like H0: bk-r+1 = 0, ... , bk = 0
• The alternative is just HA: H0 is not true
• Can’t just check each t statistic separately,
because we want to know if the r
parameters are jointly significant at a given
level – it is possible for none to be
individually significant at that level

Exclusion Restrictions (cont.)
• To do the test we need to estimate the “restricted
model” without xk-r+1,, …, xk included, as well as
the “unrestricted model” with all x’s included
• Intuitively, we want to know if the change in SSR
is big enough to warrant inclusion of xk-r+1, …, xk

F = [(SSR_R – SSR_UR)/r] / [SSR_UR/(n – k – 1)], where

the subscript R refers to restricted and
the subscript UR refers to unrestricted
The F statistic
• The F statistic is always positive, since the
SSR from the restricted model can’t be less
than the SSR from the unrestricted
• Essentially the F statistic is measuring the
relative increase in SSR when moving from
the unrestricted to restricted model
• r = number of restrictions, or dfR – dfUR
• n – k – 1 = dfUR
The F statistic (cont.)
• To decide if the increase in SSR when we
move to a restricted model is “big enough”
to reject the exclusions, we need to know
about the sampling distribution of our F stat
• Not surprisingly, F ~ Fr,n-k-1, where r is
referred to as the numerator degrees of
freedom and n – k – 1 as the denominator
degrees of freedom
The F statistic (cont.)

[Figure: F distribution with the fail-to-reject region (area 1 – a) to the left of the critical value c and the rejection region (area a) to its right; reject H0 at the a significance level if F > c]
The R2 form of the F statistic
• Because the SSR’s may be large and unwieldy, an
alternative form of the formula is useful
• We use the fact that SSR = TSS(1 – R2) for any
regression, so can substitute in for SSR_R and
SSR_UR

F = [(R²_UR – R²_R)/r] / [(1 – R²_UR)/(n – k – 1)], where again

R is restricted and UR is unrestricted
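A numerical sketch of the R² form of the F statistic, with made-up R² values:

```python
# Hypothetical R^2 values from restricted and unrestricted models
r2_ur = 0.40        # unrestricted model
r2_r = 0.30         # restricted model
n, k, r = 30, 3, 2  # obs, regressors in unrestricted model, restrictions

F = ((r2_ur - r2_r) / r) / ((1 - r2_ur) / (n - k - 1))
```

This F (about 2.17) would then be compared against the critical value of an F(r, n – k – 1) distribution.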
Overall Significance
• A special case of exclusion restrictions is to test
H0: b1 = b2 = … = bk = 0
• Since the R2 from a model with only an intercept
will be zero, the F statistic is simply

F = (R²/k) / [(1 – R²)/(n – k – 1)]
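And a sketch of this overall-significance F statistic with hypothetical numbers:

```python
# Hypothetical overall-significance test (illustration only)
r2 = 0.40     # R^2 of the full model
n, k = 30, 3  # observations and slope parameters

# restricted model is intercept-only, whose R^2 is zero
F_overall = (r2 / k) / ((1 - r2) / (n - k - 1))
```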
General Linear Restrictions
• The basic form of the F statistic will work
for any set of linear restrictions
• First estimate the unrestricted model and
then estimate the restricted model
• In each case, make note of the SSR
• Imposing the restrictions can be tricky –
will likely have to redefine variables again

F Statistic Summary
• Just as with t statistics, p-values can be
calculated by looking up the percentile in
the appropriate F distribution
• If only one exclusion is being tested, then F
= t2, and the p-values will be the same
• You can use the Wald test to test multivariate
hypotheses as well
