Posted on: 3/10/2010
Limited Dependent Variable Models
Gabriel V. Montes-Rojas
City University London

Outline: Binary Response Models, Multivalued response models, Truncated Models, Sample Selection

Binary Response Models

Assume that your dependent variable is an indicator (dummy) variable that takes the values 0 and 1. We adopt the convention that a value of 1 is called a "success" and a value of 0 a "failure".

Labour force participation: suppose you want to estimate the effect of human capital on labour force participation, i.e. whether the individual actually works or not. Say you have the variable inlf, which takes the value 1 if the individual is working and 0 otherwise.

Bankruptcy: suppose you want to estimate the effect of some firm characteristics on the probability that a firm declares bankruptcy.

Linear Probability Model

Let $y \in \{0, 1\}$ be the dependent variable. One option is to use a linear probability model of the form
$$y = \beta_0 + \beta_1 X + u$$
How do we interpret $\beta_1$? With $E[u|X] = 0$ and a binary y,
$$E[y|X] = \beta_0 + \beta_1 X = P[y = 1|X]$$
Then $\beta_1 = \partial P[y = 1|X] / \partial X$. In other words, $\beta_1$ gives you the marginal effect of X on the probability of obtaining a success (i.e. y = 1).

There are some drawbacks of using a linear probability model:
1. Predicted values: the model does not guarantee that $0 \le \hat{y} \le 1$.
2. Heteroskedasticity: $Var(y|X) = P[y = 1|X] \, (1 - P[y = 1|X])$, so the error variance depends on X.

Consider Example 7.12: A Linear Probability Model of Arrests
http://fmwww.bc.edu/gstat/examples/wooldridge/wooldridge7.html
Database: http://fmwww.bc.edu/ec-p/data/wooldridge/CRIME1.des

Logit and Probit Models

An alternative specification uses the concept of a cumulative distribution function (cdf). Let u be a random variable; its cdf is
$$P[u \le t] = F(t), \quad 0 \le F(\cdot) \le 1$$
Now consider the following latent variable model:
$$y^* = \beta_0 + \beta_1 X + e$$
You do not observe $y^*$, only
$$y = 1[y^* > 0] = 1[e > -(\beta_0 + \beta_1 X)]$$
Here 1[.] is an indicator function that takes the value 1 if the argument in brackets is true, and 0 otherwise.

If we assume that e follows a normal distribution, i.e. $e \sim N(0, \sigma_e^2)$, then we have the probit model. Because only the sign of $y^*$ is observed, the scale is not identified and $\sigma_e$ is normalised to 1. In this case
$$F(z) = P[e \le z] = \int_{-\infty}^{z} \phi(v) \, dv = \Phi(z)$$
where $\phi$ is the standard normal (Gaussian) density function and $\Phi$ is the standard normal cumulative distribution function. Then, using the symmetry of the distribution around zero,
$$P[y = 1|X] = P[e > -(\beta_0 + \beta_1 X)] = 1 - F(-(\beta_0 + \beta_1 X)) = F(\beta_0 + \beta_1 X) = \Phi(\beta_0 + \beta_1 X)$$
and
$$P[y = 0|X] = P[e \le -(\beta_0 + \beta_1 X)] = F(-(\beta_0 + \beta_1 X)) = 1 - F(\beta_0 + \beta_1 X) = 1 - \Phi(\beta_0 + \beta_1 X)$$

If we assume that e follows a logistic distribution, then we have the logit model.
In this case
$$F(z) = P[e \le z] = \frac{\exp(z)}{1 + \exp(z)} = \Lambda(z)$$
where $\Lambda$ is the cumulative distribution function of the logistic distribution. Then,
$$P[y = 1|X] = P[e > -(\beta_0 + \beta_1 X)] = 1 - F(-(\beta_0 + \beta_1 X)) = F(\beta_0 + \beta_1 X) = \Lambda(\beta_0 + \beta_1 X)$$
and
$$P[y = 0|X] = P[e \le -(\beta_0 + \beta_1 X)] = F(-(\beta_0 + \beta_1 X)) = 1 - F(\beta_0 + \beta_1 X) = 1 - \Lambda(\beta_0 + \beta_1 X)$$

How to interpret coefficients? Note that
$$\frac{\partial P[y = 1|X]}{\partial X} = \frac{\partial F(\beta_0 + \beta_1 X)}{\partial X} = f(\beta_0 + \beta_1 X) \, \beta_1$$
As a result, $\beta_1 \ne \partial P[y = 1|X] / \partial X$, so you cannot interpret the coefficients of a probit or logit model directly as marginal effects. For that you need f(.), i.e. the density function of your assumed e. You can, though, interpret the direction of the effect through the sign, because f(.) is always positive:
$$sign(\beta_1) = sign\left(\frac{\partial P[y = 1|X]}{\partial X}\right)$$

For a probit model, $f(z) = \phi(z) = (2\pi)^{-1/2} \exp(-z^2/2)$.
For a logit model, $f(z) = \exp(z) / (1 + \exp(z))^2$.

But at what value of X should we evaluate $f(\beta_0 + \beta_1 X)$? In general, at the sample mean, $X = \bar{X}$.
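The marginal-effect formula $f(\beta_0 + \beta_1 X)\beta_1$ can be evaluated by hand. The following is a minimal Python sketch (the slides themselves use Stata), using only the standard library. The coefficient values and the mean of X are hypothetical numbers chosen for illustration, not estimates from any dataset.

```python
import math

def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_pdf(z):
    """Logistic density, exp(z) / (1 + exp(z))^2."""
    ez = math.exp(z)
    return ez / (1.0 + ez) ** 2

def marginal_effect(density, b0, b1, x):
    """dP[y=1|X]/dX = f(b0 + b1*x) * b1, evaluated at X = x."""
    return density(b0 + b1 * x) * b1

# Hypothetical coefficients and sample mean (assumptions for illustration).
b0, b1, x_bar = -0.5, 0.8, 1.0

me_probit = marginal_effect(norm_pdf, b0, b1, x_bar)
me_logit = marginal_effect(logistic_pdf, b0, b1, x_bar)

# Cross-check the probit formula with a central finite difference of Phi.
h = 1e-6
fd_probit = (norm_cdf(b0 + b1 * (x_bar + h)) - norm_cdf(b0 + b1 * (x_bar - h))) / (2 * h)

print("probit marginal effect at the mean:", me_probit)
print("finite-difference check:           ", fd_probit)
print("logit marginal effect at the mean: ", me_logit)
```

Both marginal effects have the sign of b1 but are smaller than b1 in magnitude, since both densities are below 1 everywhere. Note that the same coefficient values are used for both models only to compare the two densities; fitted probit and logit coefficients are on different scales.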
An introduction to Maximum Likelihood estimation

The dependent variable data consist of $\{y_i\}_{i=1}^{n}$, a 0 or a 1 for each observation.

If you observe a 1, say $y_i = 1$, what is the probability of having obtained this particular value? It implies that $y_i^* = \beta_0 + \beta_1 X_i + e_i > 0$, and since e was assumed to be normal (probit) or logistic (logit),
$$P[y_i = 1|X_i] = F(\beta_0 + \beta_1 X_i)$$

If you observe a 0, say $y_i = 0$, what is the probability of having obtained this particular value? It implies that $y_i^* = \beta_0 + \beta_1 X_i + e_i \le 0$, and since e was assumed to be normal (probit) or logistic (logit),
$$P[y_i = 0|X_i] = 1 - F(\beta_0 + \beta_1 X_i)$$

More generally,
$$P[y|X] = [F(\beta_0 + \beta_1 X)]^{y} \, [1 - F(\beta_0 + \beta_1 X)]^{1-y}$$

What about the whole sample altogether, i.e. $\{y_i\}_{i=1}^{n}$ instead of a particular observation? Remember the statistical property of independence: if two events A and B are independent, then P[A & B] = P[A] × P[B]. Hence, for independent observations,
$$P[y_1, y_2, ..., y_n|X] = \prod_{i=1}^{n} P[y_i|X_i] = \prod_{i=1}^{n} [F(\beta_0 + \beta_1 X_i)]^{y_i} \, [1 - F(\beta_0 + \beta_1 X_i)]^{1-y_i}$$
This is the likelihood function.

In general, it is easier to work with the log-likelihood function instead of the likelihood function:
$$L(\beta) = \sum_{i=1}^{n} \ell_i(\beta)$$
where
$$\ell_i(\beta) = \log P[y_i|X_i] = y_i \log F(\beta_0 + \beta_1 X_i) + (1 - y_i) \log[1 - F(\beta_0 + \beta_1 X_i)]$$
Then, the maximum likelihood estimator (MLE) is the $\hat{\beta}$ that maximises $L(\beta)$.
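The log-likelihood above can be maximised numerically. As an illustration only, here is a minimal Python sketch for the logit case with one regressor, using Newton-Raphson steps. The ten data points are made up, the function names are hypothetical, and statistical packages may use different optimisers and convergence rules.

```python
import math

def logistic_cdf(z):
    """Lambda(z) = exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """L(beta) = sum_i [ y_i log F_i + (1 - y_i) log(1 - F_i) ]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic_cdf(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logit(xs, ys, iterations=25):
    """Newton-Raphson for a logit model with an intercept and one slope."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of L(beta)
        h00 = h01 = h11 = 0.0  # entries of the negative Hessian
        for x, y in zip(xs, ys):
            p = logistic_cdf(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (negative Hessian)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up data (not perfectly separated, so the MLE is finite).
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

b0_hat, b1_hat = fit_logit(xs, ys)
print("beta0_hat, beta1_hat:", b0_hat, b1_hat)
print("log-likelihood at the MLE:", log_likelihood(b0_hat, b1_hat, xs, ys))
print("log-likelihood at (0, 0): ", log_likelihood(0.0, 0.0, xs, ys))
```

For the logit model the log-likelihood is concave in the parameters, which is why a few Newton steps from a zero start are usually enough in well-behaved samples such as this one.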
By definition of the MLE, $L(\hat{\beta}) \ge L(\beta)$ for every possible $\beta$.

Probit vs. Logit
[Figure: density functions]
[Figure: cumulative distribution functions]

Logit and Probit Models

    probit y x1 x2     (probit model)
    logit y x1 x2      (logit model)

Remember that the coefficients of these models cannot be interpreted except for the sign. If you want the marginal effect on the probability of success:

    dprobit y x1 x2    (probit model)
    logit y x1 x2      (logit model)
    mfx                (this gives you the marginal effects)

Multinomial logit model

What can be done if the dependent variable can take several values, y = 0, 1, 2, ..., J, but the values of y do not represent a particular ordering? This is a multinomial model.

Example: y could be marital status.
    y = 0 single
    y = 1 married
    y = 2 divorced
    y = 3 widowed

Example: discrete choice models. y could be the place of holiday.
    y = 0 Europe
    y = 1 Asia
    y = 2 America
    y = 3 Africa
    y = 4 Oceania

Select a base group. By convention this corresponds to j = 0.
Each outcome has its own set of parameters $\beta_j$, j = 1, 2, ..., J. In a multinomial logit model each probability is of the form
$$P[y = j|X] = \frac{\exp(X\beta_j)}{1 + \sum_{h=1}^{J} \exp(X\beta_h)}, \quad j = 1, ..., J$$
and the base group takes the remaining probability, $P[y = 0|X] = 1 / (1 + \sum_{h=1}^{J} \exp(X\beta_h))$.

    mlogit y x1 x2 x3                 (multinomial logit model)

Remember that the coefficients of these models cannot be interpreted directly as marginal effects (as in probit and logit models). Note that here the sign of $\beta_j$ gives the direction of the effect on the probability of outcome j relative to the base group, which need not be the direction of the effect on $P[y = j|X]$ itself.

    mfx, predict(p outcome(1))        (computes the marginal effects for y = 1)
    mfx, predict(p outcome(2))        (computes the marginal effects for y = 2)

Ordered probit model

What can be done if the dependent variable can take several values, y = 0, 1, 2, ..., J, and the values of y represent a particular ordering?

Example: y could be a monthly income range.
    y = 0 no income
    y = 1 £1 to £500
    y = 2 £501 to £1000
    y = 3 £1001 to £2000
    y = 4 £2001 to £5000
    y = 5 greater than £5000

Here it does not make much sense to run an OLS model with y as the dependent variable...

    oprobit y x1 x2 x3                (ordered probit model)

Remember that the coefficients of these models cannot be interpreted directly as marginal effects (as in probit and logit models). The sign of a coefficient gives the direction of the effect unambiguously only for the probabilities of the lowest and highest outcomes.

    mfx, predict(p outcome(1))        (computes the marginal effects for y = 1)
    mfx, predict(p outcome(2))        (computes the marginal effects for y = 2)

Tobit Models

Consider the following latent variable model:
$$y^* = \beta_0 + \beta_1 X + u, \quad u|X \sim N(0, \sigma_u^2)$$
But you do not observe $y^*$, rather
$$y = \max\{0, y^*\}$$
Here the latent variable $y^*$ is censored at 0: the observed y cannot take negative values.

The log-likelihood of each observation for this model (writing $\sigma$ for $\sigma_u$) is
$$\ell_i(\beta, \sigma) = 1[y_i = 0] \log[1 - \Phi(x_i\beta/\sigma)] + 1[y_i > 0] \log[(1/\sigma) \, \phi((y_i - x_i\beta)/\sigma)]$$
Note that it has two components: one for the observations at the limit $y_i = 0$ and one for the observations with $y_i > 0$.

Then,
$$E(y|X) = P[y > 0|X] \cdot E[y|y > 0, X] + P[y = 0|X] \cdot 0 = P[y > 0|X] \cdot E[y|y > 0, X]$$

Example: Hours worked. The number of hours you work cannot be negative, so $h \ge 0$. If you consider the model $h = \beta X + u$, you must take into account the restriction that h cannot be negative. Then,
$$E(h|X) = P[h > 0|X] \cdot E[h|h > 0, X] + P[h = 0|X] \cdot 0 = P[h > 0|X] \cdot E[h|h > 0, X]$$

Example: Annual amount spent on electronic goods (e.g. TVs, DVD players). In some years you may report having spent £0.

Here we need some mathematical statistics tools. If $z \sim N(0, 1)$, then $E(z|z > c) = \phi(c) / [1 - \Phi(c)]$. Then, writing $u = \sigma z$,
$$E(y|y > 0, X) = X\beta + E(u|u > -X\beta) = X\beta + \sigma \, \frac{\phi(X\beta/\sigma)}{\Phi(X\beta/\sigma)}$$
Here we have used $\phi(-c) = \phi(c)$ and $1 - \Phi(-c) = \Phi(c)$.

Then,
$$E(y|y > 0, X) = X\beta + \sigma\lambda(X\beta/\sigma)$$
where $\lambda(c) = \phi(c)/\Phi(c)$ is the inverse Mills ratio, the ratio of the standard normal pdf to the standard normal cdf.
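The conditional-mean formula $E(y|y > 0, X) = X\beta + \sigma\lambda(X\beta/\sigma)$ can be checked by simulation. The Python sketch below is an illustration under assumed values: the index $X\beta = 1$ and $\sigma = 2$ are hypothetical, and the function names are not part of the slides.

```python
import math
import random

def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return norm_pdf(c) / norm_cdf(c)

def tobit_conditional_mean(xb, sigma):
    """E(y | y > 0, X) = X*beta + sigma * lambda(X*beta / sigma)."""
    return xb + sigma * inverse_mills(xb / sigma)

# Hypothetical index value and error standard deviation.
xb, sigma = 1.0, 2.0

# Simulate the latent variable y* = X*beta + u and keep the positive draws.
random.seed(0)
draws = [xb + sigma * random.gauss(0.0, 1.0) for _ in range(200000)]
positive = [d for d in draws if d > 0]

simulated = sum(positive) / len(positive)
analytic = tobit_conditional_mean(xb, sigma)

print("simulated E(y | y > 0):", simulated)
print("analytic  E(y | y > 0):", analytic)
```

The analytic value is always above $X\beta$, because conditioning on $y > 0$ discards the draws with the most negative errors.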
Combining the two pieces, with $P[y > 0|X] = \Phi(X\beta/\sigma)$,
$$E(y|X) = \Phi(X\beta/\sigma) \, E(y|y > 0, X) = \Phi(X\beta/\sigma) \, [X\beta + \sigma\lambda(X\beta/\sigma)]$$

Marginal effects in Tobit models

Using $d\lambda/dc = -\lambda(c)[c + \lambda(c)]$,
$$\partial E(y|y > 0, X)/\partial x_j = \beta_j + \beta_j \frac{d\lambda}{dc}(X\beta/\sigma) = \beta_j \{1 - \lambda(X\beta/\sigma) \, [X\beta/\sigma + \lambda(X\beta/\sigma)]\}$$
$$\partial E(y|X)/\partial x_j = \beta_j \Phi(X\beta/\sigma)$$

Tobit Models

    tobit y x1 x2, ll(0)                  (tobit estimation, censoring from below at 0)
    mfx compute, predict(e(0,.))          ($\partial E(y|y > 0, X = \bar{X})/\partial x_j$)
    mfx compute, predict(ystar(0,.))      ($\partial E(y|X = \bar{X})/\partial x_j$)

Consider Example 17.2: Married Women's Annual Labor Supply
http://fmwww.bc.edu/gstat/examples/wooldridge/wooldridge17.html

Sample selection models

Consider the following model. The outcome equation is
$$y^* = X\beta + u, \quad E(u|X) = 0$$
However, we only observe the dependent variable if something happens. The selection equation is $Z\gamma + v > 0$, so that
$$y = y^* \times 1[Z\gamma + v > 0]$$

Under certain conditions OLS is biased. Assume that u and v are correlated, i.e. corr(u, v) = $\rho$. Then, with (u, v) jointly normal and Var(v) = 1 (the probit normalisation), the mean of y in the selected sample is
$$E(y|X, Z\gamma + v > 0) = X\beta + E(u|Z\gamma + v > 0) = X\beta + \rho\sigma_u\lambda(Z\gamma)$$
Now,
$$\partial E(y|X, Z\gamma + v > 0)/\partial x_j = \beta_j + \rho\sigma_u \, \partial\lambda(Z\gamma)/\partial x_j$$
so a regression on the selected sample that ignores the second term does not recover $\beta_j$ when $\rho \ne 0$ and $x_j$ also enters Z.

There are two ways of estimating these models:
1. MLE
    heckman y x1 x2, select(c = z1 z2)
2. Heckman's two-step estimator (James Heckman won the Nobel Prize for this...)
    heckman y x1 x2, select(c = z1 z2) twostep

The idea of the two-step estimator is:
1. Estimate a probit model first for the selection indicator $1[Z\gamma + v > 0]$, that is, estimate $\hat{\gamma}$.
2. Construct the estimated inverse Mills ratio $\hat{\lambda} = \lambda(Z\hat{\gamma})$.
3. Run, on the selected sample, the regression $y = X\beta + \alpha\lambda(Z\hat{\gamma}) + e$, where $\alpha$ estimates $\rho\sigma_u$.
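The correction term that step 2 constructs can be checked by simulation. The Python sketch below does not implement the full two-step estimator: it takes the selection index $Z\gamma$ as known (an assumption, since in practice it comes from the first-stage probit) and compares the simulated mean of u in the selected sample with $\rho\sigma_u\lambda(Z\gamma)$. All parameter values are hypothetical.

```python
import math
import random

def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return norm_pdf(c) / norm_cdf(c)

# Hypothetical parameters: corr(u, v), sd of u, and a fixed selection index.
rho, sigma_u, z_gamma = 0.6, 2.0, 0.3

random.seed(1)
selected_u = []
for _ in range(200000):
    v = random.gauss(0.0, 1.0)    # selection error, Var(v) = 1
    eps = random.gauss(0.0, 1.0)  # independent noise
    # Build u with sd sigma_u and correlation rho with v.
    u = sigma_u * (rho * v + math.sqrt(1.0 - rho ** 2) * eps)
    if z_gamma + v > 0:           # observation is selected
        selected_u.append(u)

simulated_bias = sum(selected_u) / len(selected_u)
analytic_bias = rho * sigma_u * inverse_mills(z_gamma)

print("simulated E(u | selected):     ", simulated_bias)
print("rho * sigma_u * lambda(Z*gamma):", analytic_bias)
```

With $\rho = 0$ the term vanishes and the selected-sample mean of u is zero, which is the case in which OLS on the selected sample is not affected by selection.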