Limited Dependent Variable Models

Binary Response Models
Multivalued response models
Truncated Models
Sample Selection

Gabriel V. Montes-Rojas
City University London

Gabriel Montes-Rojas    Limited Dependent Variable Models

Binary Response Models

Assume that your dependent variable is an indicator (dummy)
variable that takes the values 0 and 1. We adopt the convention that a
value of 1 is called a “success” and 0 a “failure”.
Labour force participation: Consider a model where you want to
estimate the effect of human capital on labour force participation,
i.e. whether the individual actually works or not. Say you have the
variable inlf, which takes the value 1 if the individual is working and
0 otherwise.
Bankruptcy: Consider a model where you want to estimate the effect
of some firm characteristics on the probability that a firm declares
bankruptcy.

Linear Probability Model

Let y = 0, 1 be the dependent variable. One option is to use a
linear probability model of the form:

y = β0 + β1 X + u
How do we interpret β1 ?

$E[y|X] = \beta_0 + \beta_1 X = P[y = 1|X]$
Then $\beta_1 = \partial P[y = 1|X] / \partial X$. In other words, $\beta_1$ gives you the marginal
effect of X on the probability of obtaining a success (i.e. y = 1).

Linear Probability Model

There are some drawbacks of using a linear probability model:
1. Predicted values: the model does not guarantee that the fitted values satisfy
$0 \le \hat{y} \le 1$.
2. Heteroskedasticity: the error variance depends on X, because
$Var(y|X) = P[y = 1|X] \, (1 - P[y = 1|X])$
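
Both drawbacks can be seen in a small numerical sketch. The data below are made up purely for illustration (they are not from the slides or from CRIME1), and the names `fitted` and `implied_variance` are hypothetical helpers. The code fits the linear probability model by OLS and evaluates the fitted "probability" and the implied variance p(1 − p) at a few values of X.

```python
# Hypothetical data: x is a regressor, y is a 0/1 outcome (invented for illustration).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for y = b0 + b1 * x + u
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar


def fitted(x_value):
    """Fitted 'probability' from the linear probability model."""
    return b0 + b1 * x_value


def implied_variance(x_value):
    """Var(y|X) = p(1 - p) implied by the fitted model; it changes with X."""
    p = fitted(x_value)
    return p * (1.0 - p)


# Fitted values fall below 0 at x = 0 and above 1 at x = 7 and x = 10,
# and the implied variance is not constant across x.
for x_value in (0.0, 3.5, 7.0, 10.0):
    print(x_value, round(fitted(x_value), 3), round(implied_variance(x_value), 3))
```

With these invented numbers the fitted line leaves the unit interval at both ends of the sample, which is exactly the first drawback; the varying p(1 − p) illustrates the second.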

Linear Probability Model

Consider Example 7.12: A Linear Probability Model of Arrests
http://fmwww.bc.edu/gstat/examples/wooldridge/wooldridge7.html

Database: http://fmwww.bc.edu/ec-p/data/wooldridge/CRIME1.des

Logit and Probit Models
An alternative specification uses the concept of a cumulative
distribution function (cdf). Let u be a random variable; then its
cdf is

$F(t) = P[u \le t], \quad 0 \le F(t) \le 1$

Then consider the following latent variable model:

$y^* = \beta_0 + \beta_1 X + e$

But you don’t observe $y^*$; rather, you observe

$y = 1[y^* > 0] = 1[e > -(\beta_0 + \beta_1 X)]$

Here 1[.] is an indicator function that takes the value 1 if the
argument in brackets is true, and 0 otherwise.

Logit and Probit Models

If we assume that e follows a normal distribution, i.e.
$e \sim N(0, \sigma_e^2)$ with $\sigma_e^2$ normalised to 1, then we have the probit model. In this case:
$F(z) = P[e \le z] = \int_{-\infty}^{z} \phi(v)\,dv = \Phi(z)$, where $\phi$ is the standard normal
(or Gaussian) density function and $\Phi$ is the standard normal
cumulative distribution function.
Then, $P[y = 1|X] = P[e > -(\beta_0 + \beta_1 X)] = 1 - F(-(\beta_0 + \beta_1 X)) = F(\beta_0 + \beta_1 X) = \Phi(\beta_0 + \beta_1 X)$
Then, $P[y = 0|X] = P[e \le -(\beta_0 + \beta_1 X)] = F(-(\beta_0 + \beta_1 X)) = 1 - F(\beta_0 + \beta_1 X) = 1 - \Phi(\beta_0 + \beta_1 X)$

Logit and Probit Models

If we assume that e follows a logistic distribution, then we have
the logit model. In this case:
$F(z) = P[e \le z] = \frac{\exp(z)}{1 + \exp(z)} = \Lambda(z)$, where $\Lambda$ is the
cumulative distribution function of the standard logistic distribution.
Then, $P[y = 1|X] = P[e > -(\beta_0 + \beta_1 X)] = 1 - F(-(\beta_0 + \beta_1 X)) = F(\beta_0 + \beta_1 X) = \Lambda(\beta_0 + \beta_1 X)$
Then, $P[y = 0|X] = P[e \le -(\beta_0 + \beta_1 X)] = F(-(\beta_0 + \beta_1 X)) = 1 - F(\beta_0 + \beta_1 X) = 1 - \Lambda(\beta_0 + \beta_1 X)$
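
The two response probabilities can be sketched in a few lines of Python. This is only an illustration: the function names `probit_cdf` and `logit_cdf` and the parameter values are assumptions, not part of the slides. The loop also checks the symmetry step 1 − F(−t) = F(t) that both derivations rely on.

```python
import math


def probit_cdf(z):
    """Standard normal cdf, Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_cdf(z):
    """Standard logistic cdf, Lambda(z) = exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameter values, chosen only for illustration.
beta0, beta1 = -1.0, 0.5

for x in (-2.0, 0.0, 2.0, 4.0):
    index = beta0 + beta1 * x
    for cdf in (probit_cdf, logit_cdf):
        p1 = cdf(index)    # P[y = 1 | X] = F(beta0 + beta1 X)
        p0 = cdf(-index)   # P[y = 0 | X] = F(-(beta0 + beta1 X))
        # Symmetry of the density around zero gives 1 - F(-t) = F(t).
        assert abs((1.0 - p0) - p1) < 1e-12
        print(cdf.__name__, x, round(p1, 4))
```

Unlike the linear probability model, both functions return values strictly between 0 and 1 for any value of the index.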

Logit and Probit Models

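How to interpret the coefficients? Note that

$\frac{\partial P[y = 1|X]}{\partial X} = \frac{\partial F(\beta_0 + \beta_1 X)}{\partial X} = f(\beta_0 + \beta_1 X)\,\beta_1$

As a result, $\beta_1 \ne \partial P[y = 1|X] / \partial X$ in general, so you cannot interpret the
coefficients of a probit or logit model directly as marginal effects. For that you
need f(.), i.e. the density function of your assumed e.
You can, though, interpret the direction of the effect through
the sign, because f(.) > 0:

$sign(\beta_1) = sign\left(\partial P[y = 1|X] / \partial X\right)$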

Logit and Probit Models

For a probit model, $f(z) = \phi(z) = (2\pi)^{-1/2} \exp(-z^2/2)$.
For a logit model, $f(z) = \frac{\exp(z)}{(1 + \exp(z))^2}$.
But at what value of X do we evaluate $f(\beta_0 + \beta_1 X)$? In
general, at the sample mean, $X = \bar{X}$.
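
A minimal sketch of the marginal effect at the mean, $f(\beta_0 + \beta_1 \bar{X})\,\beta_1$, follows. The coefficient values and the sample mean are hypothetical, and the same coefficients are fed to both densities only to keep the example short (in practice probit and logit estimates differ in scale). A central finite difference of the cdf is used as a sanity check on the formula.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_pdf(z):
    return math.exp(z) / (1.0 + math.exp(z)) ** 2


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def marginal_effect(pdf, beta0, beta1, x):
    """dP[y = 1|X]/dX = f(beta0 + beta1 * x) * beta1."""
    return pdf(beta0 + beta1 * x) * beta1


def finite_difference(cdf, beta0, beta1, x, h=1e-6):
    """Numerical derivative of F(beta0 + beta1 * x) with respect to x."""
    upper = cdf(beta0 + beta1 * (x + h))
    lower = cdf(beta0 + beta1 * (x - h))
    return (upper - lower) / (2.0 * h)


# Hypothetical estimates and sample mean (not taken from any dataset).
beta0, beta1, x_bar = 0.2, 0.8, 1.5

me_probit = marginal_effect(normal_pdf, beta0, beta1, x_bar)
me_logit = marginal_effect(logistic_pdf, beta0, beta1, x_bar)
fd_probit = finite_difference(normal_cdf, beta0, beta1, x_bar)
fd_logit = finite_difference(logistic_cdf, beta0, beta1, x_bar)

print("probit:", round(me_probit, 5), "numerical:", round(fd_probit, 5))
print("logit: ", round(me_logit, 5), "numerical:", round(fd_logit, 5))
```

Both marginal effects are smaller in absolute value than beta1 itself, because the densities never exceed about 0.4 (probit) and 0.25 (logit).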

An introduction to Maximum Likelihood estimation
The dependent variable data consist of $\{y_i\}_{i=1}^{n}$, a 0 or a 1
for each observation.
If you observe a 1, say $y_i = 1$, what is the associated
probability that you would have got THIS PARTICULAR
VALUE?
... this implies that $y_i^* = \beta_0 + \beta_1 X_i + e_i > 0$, and since e was assumed to be
normal (probit) or logistic (logit), $P[y_i = 1|X_i] = F(\beta_0 + \beta_1 X_i)$
If you observe a 0, say $y_i = 0$, what is the associated
probability that you would have got THIS PARTICULAR
VALUE?
... this implies that $y_i^* = \beta_0 + \beta_1 X_i + e_i \le 0$, and since e was
assumed to be normal (probit) or logistic (logit), $P[y_i = 0|X_i] = 1 - F(\beta_0 + \beta_1 X_i)$

An introduction to Maximum Likelihood estimation

... more generally:
$P[y|X] = [F(\beta_0 + \beta_1 X)]^{y} \, [1 - F(\beta_0 + \beta_1 X)]^{1-y}$

An introduction to Maximum Likelihood estimation
What about the whole sample altogether, i.e. $\{y_i\}_{i=1}^{n}$ instead of a
particular observation?
REMEMBER THE STATISTICAL PROPERTY OF
INDEPENDENCE. IF TWO EVENTS A AND B ARE
INDEPENDENT, THEN P[A & B] = P[A] × P[B]

$P[y_1, y_2, ..., y_n|X] = \prod_{i=1}^{n} P[y_i|X_i] = \prod_{i=1}^{n} [F(\beta_0 + \beta_1 X_i)]^{y_i} \, [1 - F(\beta_0 + \beta_1 X_i)]^{1-y_i}$

This is the likelihood function.

An introduction to Maximum Likelihood estimation
In general, it is easier to work with the log-likelihood function

$L(\beta) = \sum_{i=1}^{n} \ell_i(\beta)$

where

$\ell_i(\beta) = \log P[y_i|X_i] = y_i \log F(\beta_0 + \beta_1 X_i) + (1 - y_i) \log[1 - F(\beta_0 + \beta_1 X_i)]$

Then, the maximum likelihood estimator (MLE) is the $\hat{\beta}$ that
maximises $L(\beta)$. In other words, for every possible $\beta$, $L(\hat{\beta}) \ge L(\beta)$.
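
To make the maximisation concrete, here is a minimal sketch for the logit case with a single regressor. It is not how Stata computes its estimates; it simply applies Newton-Raphson steps to the log-likelihood written above. The data are invented, and `fit_logit` and `log_likelihood` are hypothetical helper names.

```python
import math


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1, x, y):
    """L(beta) = sum_i [ y_i log F_i + (1 - y_i) log(1 - F_i) ] with F logistic."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic_cdf(b0 + b1 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logit(x, y, steps=25):
    """Maximise the logit log-likelihood by Newton-Raphson, starting from (0, 0)."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic_cdf(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p             # score with respect to beta0
            g1 += (yi - p) * xi      # score with respect to beta1
            h00 += w                 # entries of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (negative Hessian)^(-1) * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data (the outcomes overlap in x, so the MLE is finite).
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

b0_hat, b1_hat = fit_logit(x, y)
print("estimates:", round(b0_hat, 4), round(b1_hat, 4))
print("log-likelihood at the MLE:", round(log_likelihood(b0_hat, b1_hat, x, y), 4))
print("log-likelihood at (0, 0): ", round(log_likelihood(0.0, 0.0, x, y), 4))
```

The log-likelihood at the estimates is at least as large as at any other parameter value tried, which is the defining property of the MLE stated above.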

Probit vs. Logit

[Figure: density functions of the probit and logit models]

[Figure: cumulative distribution functions of the probit and logit models]

Logit and Probit Models

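probit y x1 x2 (probit model)
logit y x1 x2 (logit model)
Remember that the coefficients of these models cannot be
interpreted directly, except for their sign. If you want the marginal effect on
the probability of success:
dprobit y x1 x2 (probit model, reports marginal effects instead of coefficients)
logit y x1 x2 (logit model), followed by
mfx (this gives you the marginal effects, evaluated at the sample means)
In recent versions of Stata, margins, dydx(*) atmeans after probit or logit serves the same purpose.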

Multinomial logit model
What can be done if the dependent variable can take several values,
y = 0, 1, 2, ..., J, but the values of y do not represent any particular
ordering? This is a multinomial model.
Example: y could be marital status
y = 0 single
y = 1 married
y = 2 divorced
y = 3 widowed
Example: discrete choice models. y could be the holiday destination
y = 0 Europe
y = 1 Asia
y = 2 America
y = 3 Africa
y = 4 Oceania

Multinomial logit model

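Select a base group. By convention this corresponds to
j = 0.
Each of the other outcomes has its own set of parameters
$\beta_j$, j = 1, 2, ..., J
In a multinomial logit model each probability is of the form

$P[y = j|X] = \frac{\exp(X\beta_j)}{1 + \sum_{h=1}^{J} \exp(X\beta_h)}, \quad j = 1, ..., J$

and for the base group $P[y = 0|X] = 1 / \left(1 + \sum_{h=1}^{J} \exp(X\beta_h)\right)$, so that the probabilities add up to one.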
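
The probability formula can be sketched as a short function. The name `mnl_probabilities` and the coefficient values are assumptions made for illustration; the base group j = 0 is the one whose coefficients are implicitly set to zero.

```python
import math


def mnl_probabilities(x, betas):
    """Return [P(y=0|x), P(y=1|x), ..., P(y=J|x)] for a multinomial logit.

    x     : list of regressors (include a leading 1.0 for the constant)
    betas : list of J coefficient vectors, one per non-base outcome j = 1..J
    """
    scores = [math.exp(sum(xk * bk for xk, bk in zip(x, beta))) for beta in betas]
    denom = 1.0 + sum(scores)   # the 1 is exp(0), the base group's score
    return [1.0 / denom] + [s / denom for s in scores]


# Hypothetical coefficients for J = 3 non-base outcomes (illustration only).
betas = [[0.5, -1.0], [-0.2, 0.3], [0.0, 0.8]]
x = [1.0, 0.4]

probs = mnl_probabilities(x, betas)
print([round(p, 4) for p in probs])
print("sum of probabilities:", round(sum(probs), 12))
```

Note that the ratio P[y = j|X] / P[y = 0|X] equals exp(Xβ_j), which is why each β_j is read relative to the base group.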

Multinomial logit model

mlogit y x1 x2 x3 (multinomial logit model)
Remember that the coefficients of these models cannot be
interpreted directly as marginal effects (similar to probit and logit models).
Here the sign of a coefficient only tells you the direction of the effect on
the probability of outcome j relative to the base group.
mfx, predict(p outcome(1)) (computes the marginal effects for y = 1)
mfx, predict(p outcome(2)) (computes the marginal effects for y = 2)

Ordered probit model

What can be done if the dependent variable can take several values,
y = 0, 1, 2, ..., J, and the values of y represent a particular ordering?
Example: y could be monthly income range
y = 0 no income
y = 1 £1 to £500
y = 2 £501 to £1000
y = 3 £1001 to £2000
y = 4 £2001 to £5000
y = 5 greater than £5000
Here it does not make much sense to run an OLS model with y as
the dependent variable...

Ordered probit model

oprobit y x1 x2 x3 (ordered probit model)
Remember that the coefficients of these models cannot be
interpreted directly as marginal effects (similar to probit and logit models).
The sign of a coefficient pins down the direction of the effect unambiguously
only for the lowest and the highest categories.
mfx, predict(p outcome(1)) (computes the marginal effects for y = 1)
mfx, predict(p outcome(2)) (computes the marginal effects for y = 2)

Tobit Models

Consider the following latent variable model:

$y^* = \beta_0 + \beta_1 X + u, \quad u|X \sim N(0, \sigma_u^2)$

But you don’t observe $y^*$; rather, you observe

$y = \max\{0, y^*\}$

Here the observed variable y is censored at 0, i.e. it cannot take negative
values: whenever $y^* \le 0$ we record y = 0.

Tobit Models

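The log-likelihood of each observation for this model is

$\ell_i(\beta, \sigma) = 1[y_i = 0] \log[1 - \Phi(x_i\beta/\sigma)] + 1[y_i > 0] \log[(1/\sigma)\,\phi((y_i - x_i\beta)/\sigma)]$

Note that it has two components: the first is the probability of observing a zero,
and the second is the density of the positive observations.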

Tobit Models

Then,

E(y|X) = P [y > 0|X] ∗ E[y|y > 0, X] + P [y = 0|X] ∗ 0
= P [y > 0|X] ∗ E[y|y > 0, X]

Tobit Models

Example: Hours worked. The number of hours you work cannot be
negative, so h ≥ 0. However, if you consider the model

h = βX + u
nothing in it imposes the restriction that h cannot be negative. Then,

E(h|X) = P[h > 0|X] × E[h|h > 0, X] + P[h = 0|X] × 0
= P[h > 0|X] × E[h|h > 0, X]

Example: Annual amount spent on electronic goods (e.g. TVs, DVD
players). In some years you may report having spent £0.

Tobit Models

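Here we need some mathematical statistics tools...
If $z \sim N(0, 1)$, then $E(z|z > c) = \phi(c)/[1 - \Phi(c)]$.
Then, writing $u = \sigma z$,
$E(y|y > 0, X) = X\beta + E(u|u > -X\beta) = X\beta + \sigma E(z|z > -X\beta/\sigma) = X\beta + \sigma\phi(X\beta/\sigma)/\Phi(X\beta/\sigma)$.
Here we have used $\phi(-c) = \phi(c)$ and $1 - \Phi(-c) = \Phi(c)$.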

Tobit Models

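Then,
$E(y|y > 0, X) = X\beta + \sigma\lambda(X\beta/\sigma)$
where $\lambda(c) = \phi(c)/\Phi(c)$ is the inverse Mills ratio, the ratio of the standard normal
pdf to the standard normal cdf.
Moreover, since $P[y > 0|X] = \Phi(X\beta/\sigma)$,

$E(y|X) = \Phi(X\beta/\sigma)\,E(y|y > 0, X) = \Phi(X\beta/\sigma)\,[X\beta + \sigma\lambda(X\beta/\sigma)]$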
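
As a rough check on the two expectations, the sketch below evaluates the closed-form expressions and compares E(y|X) with a direct numerical integration of max{0, Xβ + σz} against the standard normal density. The values of Xβ and σ are hypothetical, and the helper names are assumptions made for this example.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return normal_pdf(c) / normal_cdf(c)


def expected_y_positive(xb, sigma):
    """E(y | y > 0, X) = xb + sigma * lambda(xb / sigma)."""
    return xb + sigma * inverse_mills(xb / sigma)


def expected_y(xb, sigma):
    """E(y | X) = Phi(xb / sigma) * E(y | y > 0, X)."""
    return normal_cdf(xb / sigma) * expected_y_positive(xb, sigma)


def expected_y_numeric(xb, sigma, grid=20000, width=10.0):
    """Midpoint-rule approximation of E[max(0, xb + sigma * z)] with z ~ N(0, 1)."""
    step = 2.0 * width / grid
    total = 0.0
    for k in range(grid):
        z = -width + (k + 0.5) * step
        total += max(0.0, xb + sigma * z) * normal_pdf(z) * step
    return total


xb, sigma = 0.5, 2.0   # hypothetical values of X * beta and sigma

print("E(y | y > 0, X):", round(expected_y_positive(xb, sigma), 4))
print("E(y | X), closed form:", round(expected_y(xb, sigma), 4))
print("E(y | X), numerical:  ", round(expected_y_numeric(xb, sigma), 4))
```

The conditional mean always exceeds Xβ, because the inverse Mills ratio term is positive; the unconditional mean is pulled back towards zero by the probability of censoring.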

Marginal eﬀects in Tobit models

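$\partial E(y|y > 0, X)/\partial x_j = \beta_j + \beta_j \frac{d\lambda}{dc}(X\beta/\sigma) = \beta_j \{1 - \lambda(X\beta/\sigma)\,[X\beta/\sigma + \lambda(X\beta/\sigma)]\}$

where we have used $d\lambda/dc = -\lambda(c)[c + \lambda(c)]$.

$\partial E(y|X)/\partial x_j = \beta_j \Phi(X\beta/\sigma)$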
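
Both marginal effects can be checked numerically against finite differences of the corresponding conditional means. This is a sketch with one regressor and hypothetical parameter values; the function names are not from the slides.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(c):
    return normal_pdf(c) / normal_cdf(c)


def cond_mean(x, beta0, beta1, sigma):
    """E(y | y > 0, x) = xb + sigma * lambda(xb / sigma)."""
    xb = beta0 + beta1 * x
    return xb + sigma * inverse_mills(xb / sigma)


def uncond_mean(x, beta0, beta1, sigma):
    """E(y | x) = Phi(xb / sigma) * E(y | y > 0, x)."""
    xb = beta0 + beta1 * x
    return normal_cdf(xb / sigma) * cond_mean(x, beta0, beta1, sigma)


def me_conditional(x, beta0, beta1, sigma):
    """beta1 * {1 - lambda(c) * [c + lambda(c)]} with c = xb / sigma."""
    c = (beta0 + beta1 * x) / sigma
    lam = inverse_mills(c)
    return beta1 * (1.0 - lam * (c + lam))


def me_unconditional(x, beta0, beta1, sigma):
    """beta1 * Phi(xb / sigma)."""
    return beta1 * normal_cdf((beta0 + beta1 * x) / sigma)


def central_difference(func, x, h=1e-5, **params):
    return (func(x + h, **params) - func(x - h, **params)) / (2.0 * h)


# Hypothetical parameter values and evaluation point.
params = {"beta0": 0.3, "beta1": 1.2, "sigma": 1.5}
x_bar = 0.4

print("conditional:  ", round(me_conditional(x_bar, **params), 5),
      round(central_difference(cond_mean, x_bar, **params), 5))
print("unconditional:", round(me_unconditional(x_bar, **params), 5),
      round(central_difference(uncond_mean, x_bar, **params), 5))
```

Both effects are smaller than beta1 in absolute value, so reading the tobit coefficient as the effect on observed y would overstate it.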

Tobit Models

tobit y x1 x2, ll(0) (tobit estimation with left-censoring at 0)
mfx compute, predict(e(0,.)) ($\partial E(y|y > 0, X = \bar{X})/\partial x_j$)
mfx compute, predict(ystar(0,.)) ($\partial E(y|X = \bar{X})/\partial x_j$)

Tobit Models

Consider Example 17.2: Married Women’s Annual Labor Supply
http://fmwww.bc.edu/gstat/examples/wooldridge/wooldridge17.html

Sample selection models
Consider the following model.
The outcome equation is

$y^* = X\beta + u, \quad E(u|X) = 0$

However, we only observe the dependent variable if a selection condition holds
(for example, a wage is observed only for individuals who work).
The selection equation is

$Z\gamma + v > 0$

then

$y = y^* \times 1[Z\gamma + v > 0]$

Sample selection models

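OLS on the selected sample is biased when u and v are correlated. Assume that
(u, v) are jointly normal with Var(v) = 1 and corr(u, v) = ρ. Then, for the
observations in which y is observed,

$E(y|X, Z\gamma + v > 0) = X\beta + E(u|Z\gamma + v > 0) = X\beta + \rho\sigma_u\lambda(Z\gamma)$

Now,

$\partial E(y|X, Z\gamma + v > 0)/\partial x_j = \beta_j + \rho\sigma_u\,\partial\lambda(Z\gamma)/\partial x_j$

where the second term is non-zero only if $x_j$ also appears in Z.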
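
The selection term can be illustrated with a small simulation. The sketch assumes bivariate normal errors with Var(v) = 1, fixes Zγ at a single hypothetical value, and compares the average of u among selected observations with ρσ_u λ(Zγ). All parameter values and the seed are arbitrary choices made for the example.

```python
import math
import random


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(c):
    return normal_pdf(c) / normal_cdf(c)


# Hypothetical values: correlation, sd of u, and the selection index Z * gamma.
rho, sigma_u, z_gamma = 0.5, 2.0, 0.3
draws = 200000

rng = random.Random(12345)
selected_u = []
for _ in range(draws):
    v = rng.gauss(0.0, 1.0)
    w = rng.gauss(0.0, 1.0)
    # u has standard deviation sigma_u and correlation rho with v
    u = sigma_u * (rho * v + math.sqrt(1.0 - rho ** 2) * w)
    if z_gamma + v > 0:          # selection equation
        selected_u.append(u)

simulated = sum(selected_u) / len(selected_u)
theoretical = rho * sigma_u * inverse_mills(z_gamma)
share_selected = len(selected_u) / draws

print("share selected:", round(share_selected, 3))
print("mean of u among selected (simulated):", round(simulated, 3))
print("rho * sigma_u * lambda(Z gamma):     ", round(theoretical, 3))
```

Because the mean of u in the selected sample is not zero, a regression of y on X alone omits a term that varies with Z, which is the source of the bias.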

Sample selection models
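There are two ways of estimating these models:
1. MLE
heckman y x1 x2, select(c = z1 z2)
(here c is an indicator that equals 1 if y is observed)

2. Heckman’s two-step estimator (James Heckman won the Nobel
Prize for this...)
heckman y x1 x2, select(c = z1 z2) twostep

The two-step estimator works as follows:
Step 1. You estimate a probit model first for the selection indicator
$1[Z\gamma + v > 0]$, that is, to estimate $\hat{\gamma}$.
Step 2. Then you construct the estimated inverse Mills ratio $\hat{\lambda} = \lambda(Z\hat{\gamma})$.
Step 3. Then, using the selected sample, you run a regression of

$y = X\beta + \alpha\lambda(Z\hat{\gamma}) + e$

where $\alpha$ estimates $\rho\sigma_u$.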