Shared by: dfhdhdhdhjr
posted:
1/29/2012
pages:
24
3. Multiple Regression Analysis: Estimation

-Although bivariate linear regressions are sometimes useful, they are often unrealistic
-SLR.4, the assumption that all other factors affecting y are uncorrelated with x, is often violated
-MULTIPLE REGRESSION ANALYSIS allows us to explicitly control for other factors to obtain a Ceteris Paribus situation
-this allows us to infer causality better than a bivariate regression

3. Multiple Regression Analysis: Estimation

-multiple regression analysis includes more variables, therefore explaining more of the variation in y
-multiple regression analysis can also “incorporate fairly general functional form relationships”
-it’s more flexible

3. Multiple Regression Analysis: Estimation

3.1 Motivation for Multiple Regression
3.2 Mechanics and Interpretation of Ordinary Least Squares
3.3 The Expected Value of the OLS Estimators
3.4 The Variance of the OLS Estimators
3.5 Efficiency of OLS: The Gauss-Markov Theorem

3.1 Motivation for Multiple Regression

Take the bivariate regression:

$$\mathit{Moviequality} = \beta_0 + \beta_1 \mathit{Plot} + u \tag{ie}$$

-where u takes into account other factors affecting movie quality, such as the characters
-for this regression to be valid, we have to assume that characters are uncorrelated with the plot, a poor assumption
-since u is correlated with Plot, the estimate of B1 is biased and we can’t isolate the Ceteris Paribus effect of plot on movie quality

3.1 Motivation for Multiple Regression

Take the multiple variable regression:

$$\mathit{Moviequality} = \beta_0 + \beta_1 \mathit{Plot} + \beta_2 \mathit{Character} + u \tag{ie}$$

-we still need to be concerned about u’s correlation with Character and Plot BUT…
-by including Character in the regression we ensure we can examine Plot’s effect with Character held constant (B1)
-We can also analyze Character’s effect on movie quality with Plot held constant (B2)
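
The movie example can be illustrated with a small simulation. This is only a sketch under assumed numbers: the data-generating process, the coefficient values (2 for Plot, 3 for Character), the 0.8 link between Character and Plot, and the helper name `ols` are all hypothetical and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process (all numbers are made up).
plot = rng.normal(size=n)
character = 0.8 * plot + rng.normal(size=n)  # Character is correlated with Plot
u = rng.normal(size=n)                       # remaining unobserved factors
quality = 1.0 + 2.0 * plot + 3.0 * character + u


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) by least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_short = ols(quality, plot)             # Character is left inside u
b_long = ols(quality, plot, character)   # Character is controlled for

print("bivariate slope on Plot:          ", b_short[1])
print("multiple-regression slope on Plot:", b_long[1])
```

Under these assumptions the bivariate slope should settle near 2 + 3(0.8) = 4.4 rather than 2, because Plot picks up part of Character's effect, while the multiple-regression slope should be close to the assumed value of 2.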

3.1 Motivation for Multiple Regression

-“Multiple regression analysis is also useful for generalizing functional relationships between variables”:

$$\mathit{Exammark} = \beta_0 + \beta_1 \mathit{Study} + \beta_2 \mathit{Study}^2 + u \tag{ie}$$

-here study time can impact exam mark in a direct and/or quadratic fashion
-this quadratic form affects how the parameters are interpreted
-you cannot examine Study’s effect on Exammark by holding Study^2 constant, since Study^2 changes whenever Study does

3.1 Motivation for Multiple Regression

-the change in exammark due to an extra hour of studying therefore becomes:

$$\frac{\Delta \mathit{Exammark}}{\Delta \mathit{Study}} \approx \beta_1 + 2\beta_2 \mathit{Study} \tag{ie}$$

-the impact is no longer a constant (B1); it depends on the current level of Study
-while including one variable twice in multiple regression analysis allows it to have a more dynamic impact, it requires a more in-depth analysis of the estimated coefficients
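
To see how the marginal effect varies with study time, the expression B1 + 2(B2)(Study) can be evaluated at a few points. The coefficient values below (6.0 and -0.25) and the function name `marginal_effect` are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def marginal_effect(b1, b2, study):
    """Approximate change in exam mark from one more hour of study:
    b1 + 2 * b2 * study (the derivative of the quadratic in Study)."""
    return b1 + 2.0 * b2 * study


# Hypothetical coefficients: a positive direct effect with diminishing returns.
b1, b2 = 6.0, -0.25

for hours in (0, 4, 8, 12):
    print(hours, marginal_effect(b1, b2, hours))
```

With these assumed values the effect of an extra hour shrinks as study time grows and reaches zero at 12 hours, which is why B1 alone cannot be read as "the" effect of studying.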

3.1 Motivation for Multiple Regression

-A simple model with two independent variables (x1 and x2) can be written as:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \tag{3.3}$$

-where B1 measures x1’s impact on y and B2 measures x2’s impact on y
-a key assumption on how u is related to x1 and x2 is:

$$E(u \mid x_1, x_2) = 0 \tag{3.5}$$

-that is, the unobserved factors affecting y have an expected value of zero given any values of x1 and x2
-as in the bivariate case, B0 can be redefined so that E(u) = 0 holds; the substantive part of the assumption is that this mean does not depend on x1 or x2

3.1 Motivation for Multiple Regression

-in our movie example, this becomes:

$$E(u \mid \mathit{plot}, \mathit{character}) = 0 \tag{ie}$$

-in other words, other factors affecting movie quality (such as filming skill) are not related to plot or character
-in the quadratic case, this assumption is simplified, since study^2 is known once study is known:

$$E(u \mid \mathit{study}, \mathit{study}^2) = 0 \;\Rightarrow\; E(u \mid \mathit{study}) = 0 \tag{ie}$$

3.1 Model with k Independent Variables

-in a regression with k independent variables, the MULTIPLE LINEAR REGRESSION MODEL or MULTIPLE REGRESSION MODEL of the population is:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + u \tag{3.6}$$

-B0 is the intercept, B1 relates to x1, B2 relates to x2, and so on
-k variables and an intercept give k+1 unknown parameters
-parameters other than the intercept are sometimes called SLOPE PARAMETERS

3.1 Model with k Independent Variables

-in the multiple regression model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + u \tag{3.6}$$

-u is the error term or disturbance that captures all effects on y not included in the x’s
-some effects can’t be measured
-some effects aren’t expected
-y is the DEPENDENT, EXPLAINED, or PREDICTED variable
-the x’s are the INDEPENDENT, EXPLANATORY or PREDICTOR variables

3.1 Model with k Independent Variables

-parameter interpretation is key in multiple regressions:

$$\log(\mathit{mark}) = \beta_0 + \beta_1 \log(\mathit{ability}) + \beta_2 \mathit{study} + \beta_3 \mathit{study}^2 + u \tag{ie}$$

-here B1 is the ceteris paribus elasticity of mark with respect to ability
-if B3=0, then 100B2 is approximately the ceteris paribus percentage increase in mark when you study an extra hour
-if B3≠0, this is more complicated
-note that this equation is linear in the parameters even though mark and study have a non-linear relationship
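
For the B3 = 0 case, the "100 times B2" reading is an approximation to the exact percentage change implied by a log dependent variable, which is 100(exp(B2) - 1). The sketch below compares the two; the value 0.03 for B2 and both function names are hypothetical.

```python
import math


def approx_pct_change(b2):
    """Approximate % change in mark per extra hour of study: 100 * b2."""
    return 100.0 * b2


def exact_pct_change(b2):
    """Exact % change implied by the log model: 100 * (exp(b2) - 1)."""
    return 100.0 * (math.exp(b2) - 1.0)


b2 = 0.03  # hypothetical coefficient on study, with B3 assumed to be 0
print("approximate:", approx_pct_change(b2))
print("exact:      ", exact_pct_change(b2))
```

The two numbers stay close when B2 is small; the gap widens as B2 grows, so the approximation is best kept for small coefficients.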

3.1 Model with k Independent Variables

-the key assumption with k independent variables becomes:

$$E(u \mid x_1, x_2, \dots, x_k) = 0 \tag{3.8}$$

-that is, ALL unobserved factors are uncorrelated with ALL explanatory variables
-anything that causes correlation between u and any explanatory variable causes (3.8) to fail

3.2 Mechanics and Interpretation of Ordinary Least Squares

-in a simple model with two independent variables, the OLS estimation is written as:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 \tag{3.9}$$

-where B0hat estimates B0, B1hat estimates B1 and B2hat estimates B2
-we obtain these estimates through the method of ORDINARY LEAST SQUARES, which minimizes the sum of squared residuals:

$$\min_{\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2})^2 \tag{3.10}$$

3.2 Indexing Note

-when independent variables have two subscripts, the i refers to the observation number
-likewise the number (1 or 2, etc.) distinguishes between different variables
-for example, x54 indicates the 5th observation’s value of variable 4 (i = 5, j = 4)
-in this course, variables will be generalized as xij, where i refers to observation number and j refers to variable number
-this is not universal; other texts will use different conventions
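
When the data are stored as an array, the xij convention maps onto rows (observations) and columns (variables). The sketch below assumes a NumPy array with 0-based indexing; the array contents and the helper `x` are hypothetical, with each entry set to 10i + j so that the value itself shows its position.

```python
import numpy as np

# 5 observations (rows) by 4 variables (columns); entry (i, j) holds 10*i + j.
X = np.array([[10 * i + j for j in range(1, 5)] for i in range(1, 6)])


def x(i, j):
    """Return x_ij using the 1-based convention of the slides
    (i = observation number, j = variable number)."""
    return X[i - 1, j - 1]


print(X.shape)   # (observations, variables)
print(x(5, 4))   # 5th observation of variable 4
```

The only point of the helper is the shift by one: the slides count observations and variables from 1, while NumPy counts from 0.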

3.2 K Independent Variables

-in a model with k independent variables, the OLS estimation is written as:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k \tag{3.11}$$

-where B0hat estimates B0, B1hat estimates B1 and B2hat estimates B2, etc.
-this is called the OLS REGRESSION LINE or SAMPLE REGRESSION FUNCTION (SRF)
-we still obtain k+1 OLS estimates by minimizing the sum of squared residuals:

$$\min_{\hat{\beta}_j} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik})^2 \tag{3.12}$$

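3.2 K Independent Variables

-using multivariable calculus (partial derivatives), this leads to k+1 equations in k+1 unknowns:

$$\begin{aligned}
\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0 \\
\sum_{i=1}^{n} x_{i1} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0 \\
\sum_{i=1}^{n} x_{i2} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0 \\
&\;\;\vdots \\
\sum_{i=1}^{n} x_{ik} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0
\end{aligned} \tag{3.13}$$

-these are also OLS’s FIRST ORDER CONDITIONS (FOC’s)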
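
The first order conditions in (3.13) can be checked numerically. In matrix form they say that X'(y - Xb) = 0, where X has a column of ones followed by the k regressors. The sketch below solves that system on simulated data and confirms that each of the k+1 sums is zero up to rounding; the data, the true coefficients, and the seed are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 3

# Hypothetical data: k regressors and an outcome with made-up coefficients.
X_vars = rng.normal(size=(n, k))
y = 2.0 + X_vars @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

# Design matrix: a column of ones (for the intercept) plus the k regressors.
X = np.column_stack([np.ones(n), X_vars])

# Solve the k+1 first order conditions, written as (X'X) b = X'y.
b_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Each entry of X'(residuals) is one of the sums in (3.13).
resid = y - X @ b_hat
foc = X.T @ resid

print("estimates:", b_hat)
print("FOC sums: ", foc)
```

The first entry of `foc` is the plain sum of residuals, and the remaining k entries are the sums of each regressor times the residuals.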

3.2 K Independent Variables

-these equations are sample counterparts of population moments from a method of moments estimation (we’ve omitted dividing by n) using the following assumptions:

$$E(u) = 0 \tag{3.8}$$

$$E(x_j u) = 0$$

-(3.13) is tedious to solve by hand, so we use statistical and econometric software
-the one requirement is that (3.13) can be solved uniquely for the Bjhat (this is an easy assumption to satisfy)
-B0hat is called the OLS INTERCEPT ESTIMATE and B1hat to Bkhat the OLS SLOPE ESTIMATES
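
The slides do not name a software package. As one possibility, NumPy's `np.linalg.lstsq` returns the least squares estimates together with the rank of the design matrix, which gives a simple check of the uniqueness requirement (the rank must equal k+1). The simulated data, the true coefficients, and the helper `ssr` below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical data with two regressors and made-up coefficients.
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 2.0 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])

# lstsq returns (solution, residual sums, rank, singular values).
b_hat, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# A unique solution requires full column rank: k + 1 = 3 here.
unique_solution = rank == X.shape[1]


def ssr(b):
    """Sum of squared residuals for a candidate coefficient vector b."""
    resid = y - X @ b
    return float(resid @ resid)


print("intercept estimate:", b_hat[0])
print("slope estimates:   ", b_hat[1:])
print("unique solution:   ", unique_solution)
print("SSR at estimates:  ", ssr(b_hat))
```

Moving any coefficient away from `b_hat` and calling `ssr` again should give a larger value, which is what "minimizes the sum of squared residuals" means in practice.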

3.2 Interpreting the OLS Equation

-given a model with 2 independent variables (x1 and x2):

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 \tag{3.14}$$

-B0hat is the predicted value of y when x1=0 and x2=0
-this is sometimes an interesting situation and other times impossible
-the intercept is still essential to the estimation, even if it is theoretically meaningless

3.2 Interpreting the OLS Equation

-B1hat and B2hat have PARTIAL EFFECT or CETERIS PARIBUS interpretations:

$$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2$$

-therefore given a change in x1 and x2, we can predict a change in y
-in addition, when the other x variable is held constant, we have:

$$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1 \quad \text{(when } x_2 \text{ is held fixed)}$$

and

$$\Delta\hat{y} = \hat{\beta}_2 \Delta x_2 \quad \text{(when } x_1 \text{ is held fixed)}$$

3.2 Interpreting Example

-consider the theoretical model:

$$\widehat{\mathit{intelligence}} = 80 + 5\,\mathit{HomeParent} + 0.5\,\mathit{Held} \tag{ie}$$

-where a person’s innate intelligence is a function of how many years a parent was home during their childhood and the average number of hours they were held as a child
-the intercept (80) estimates that a child with no stay-at-home parent who is never held will have an innate intelligence of 80

3.2 Interpreting Example

-consider the theoretical model:

$$\widehat{\mathit{intelligence}} = 80 + 5\,\mathit{HomeParent} + 0.5\,\mathit{Held} \tag{ie}$$

-B1hat estimates that a parent staying home for an extra year increases child intellect by 5
-B2hat estimates that a parent holding a child for on average an extra hour increases child intellect by 0.5
-if a parent stays home for an extra year, and as a result holds a child an extra hour on average, we would estimate their intellect to rise by 5.5 (5 + 0.5, that is, 1(B1hat) + 1(B2hat))
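
The arithmetic in this example can be written out as a small function of the two explanatory variables. The coefficients 80, 5 and 0.5 come from the theoretical model above; the function name `predict` and the starting values (2 years at home, 1 hour held) are hypothetical.

```python
def predict(home_parent, held):
    """Predicted intelligence from the example model:
    80 + 5 * HomeParent + 0.5 * Held."""
    return 80.0 + 5.0 * home_parent + 0.5 * held


# Intercept: no stay-at-home parent and never held.
print(predict(0, 0))

# Partial effects: change one variable, hold the other fixed.
print(predict(3, 1) - predict(2, 1))   # one more year at home
print(predict(2, 2) - predict(2, 1))   # one more hour held on average

# Both change together: one more year at home and one more hour held.
print(predict(3, 2) - predict(2, 1))
```

Because the model is linear, the differences do not depend on the assumed starting values; any other baseline gives the same changes.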

3.2 Interpreting the OLS Equation

-A model with k independent variables is written similarly to the 2 independent variable case:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k \tag{3.16}$$

-Written in terms of changes:

$$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2 + \dots + \hat{\beta}_k \Delta x_k \tag{3.17}$$

-If we hold all other variables (xj, j = 1, 2, …, k, j ≠ f) fixed, or CONTROL FOR ALL other variables,

$$\Delta\hat{y} = \hat{\beta}_f \Delta x_f \tag{3.18'}$$

3.2 Holding Other Factors Fixed

-we’ve already seen that Bjhat examines the effect of increasing xj by one, holding all other x’s constant
-in simple regression analysis, this would require two identical observations where only xj differed
-multiple regression analysis estimates this effect without having an explicit example
-multiple regression analysis mimics a controlled experiment using nonexperimental data


