
Limited Dependent Variable Models and Sample Selection Correlations

1. Introduction

Limited dependent variable models have been developed to analyze the behavior of individuals, families, or firms. Suppose we want to study the labor force participation of adult males as a function of the unemployment rate, average wage rate, family income, education, and so on. A person is either in the labor force or not. Hence the dependent variable, labor force participation, can take only two values: 1 if the person is in the labor force and 0 if he or she is not. This type of 1/0 or yes/no response is called a binary response and is one kind of discrete choice. A discrete choice may also involve more than two alternatives. Discrete choice variables are a class of limited dependent variable (LDV).

Consider the following simple model:

$$y_i = \beta_1 + \beta_2 X_i + u_i,$$

where
$X_i$ = set of explanatory variables,
$y = 1$ if he or she is in the labor market,
$y = 0$ if he or she is not in the labor market.

When the discrete choice involves more than two alternatives, for example three, the coding could be:
$y = 1$ if he or she goes to America,
$y = 2$ if he or she goes to Australia,
$y = 3$ if he or she goes to England,
$X_i$ = the price of air tickets.

The four most commonly used approaches to estimating such models are:
1. The linear probability model (LPM)
2. The logit model
3. The probit model
4. The Tobit model

Models 1 to 3 are typically used when the dependent variable is binary. Models 2 and 3 can be extended to more general cases of discrete choice: more than two choices (multinomial logit model) or an ordered response such as a small count (ordered probit model). Model 4 is used when the value of the dependent variable is bounded.

The first model, the linear probability model (LPM), is simple to estimate. The object of interest is the response probability

$$P(y = 1 \mid \mathbf{x}) = P(y = 1 \mid x_1, x_2, \ldots, x_k), \qquad (1)$$

which the LPM takes to be linear in the parameters, $P(y = 1 \mid \mathbf{x}) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$. Despite its simplicity, it has some drawbacks.
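The two most important disadvantages are that the fitted probabilities can be less than zero or greater than one, and that the partial effect of any explanatory variable (appearing in level form) is constant.

2. Logit and probit models for binary response

2.1 Specifying logit and probit models

To avoid the LPM limitations, consider a class of binary response models of the form

$$P(y = 1 \mid \mathbf{x}) = G(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k) = G(\beta_0 + \mathbf{x}\boldsymbol{\beta}). \qquad (2)$$

In the logit model, G is the logistic function:

$$G(z) = \frac{\exp(z)}{1 + \exp(z)} = \Lambda(z), \qquad (3)$$

which is between zero and one for all real numbers z. This is the cumulative distribution function of a standard logistic random variable. In the probit model, G is the standard normal cumulative distribution function (cdf), which is expressed as an integral:

$$G(z) = \Phi(z) \equiv \int_{-\infty}^{z} \phi(v)\,dv,$$

where $\phi(z)$ is the standard normal density

$$\phi(z) = (2\pi)^{-1/2} \exp(-z^2/2). \qquad (4)$$

[Figure: graph of the logistic function $G(z) = \exp(z)/[1 + \exp(z)]$ for z from -3 to 3; the vertical axis runs from 0 to 1, with $G(0) = 0.5$.]

Logit and probit models can be derived from an underlying latent variable model. Let $y^*$ be an unobserved, or latent, variable determined by

$$y^* = \beta_0 + \mathbf{x}\boldsymbol{\beta} + e, \qquad y = 1[y^* > 0], \qquad (5)$$

where $1[\cdot]$ is the indicator function, equal to one if the event in brackets is true and zero otherwise, and e is an error term independent of $\mathbf{x}$ whose cdf G is symmetric about zero, so that $1 - G(-z) = G(z)$. We can then derive the response probability for y:

$$P(y = 1 \mid \mathbf{x}) = P(y^* > 0 \mid \mathbf{x}) = P[e > -(\beta_0 + \mathbf{x}\boldsymbol{\beta}) \mid \mathbf{x}] = 1 - G[-(\beta_0 + \mathbf{x}\boldsymbol{\beta})] = G(\beta_0 + \mathbf{x}\boldsymbol{\beta}).$$

According to (3), the logit response probability under this specification is

$$G(\beta_0 + \mathbf{x}\boldsymbol{\beta}) = \frac{\exp(\beta_0 + \mathbf{x}\boldsymbol{\beta})}{1 + \exp(\beta_0 + \mathbf{x}\boldsymbol{\beta})}.$$

For the probit model,

$$G(\beta_0 + \mathbf{x}\boldsymbol{\beta}) = \int_{-\infty}^{\beta_0 + \mathbf{x}\boldsymbol{\beta}} \phi(v)\,dv = \Phi(\beta_0 + \mathbf{x}\boldsymbol{\beta}).$$

To find the partial effect of roughly continuous variables on the response probability, we must rely on calculus. If $x_j$ is a roughly continuous variable, its partial effect on $p(\mathbf{x}) = P(y = 1 \mid \mathbf{x})$ is obtained from the partial derivative:

$$\frac{\partial p(\mathbf{x})}{\partial x_j} = g(\beta_0 + \mathbf{x}\boldsymbol{\beta})\,\beta_j, \quad \text{where } g(z) \equiv \frac{dG(z)}{dz}. \qquad (6)$$

Because G is the cdf (cumulative distribution function) of a continuous random variable, g is the corresponding pdf (probability density function). In the logit and probit cases G is strictly increasing, so $g(z) > 0$ for all z. This implies that

$$\operatorname{sign}\!\left(\frac{\partial p(\mathbf{x})}{\partial x_j}\right) = \operatorname{sign}(\beta_j). \qquad (7)$$

The sign of the marginal effect therefore does not depend on $\mathbf{x}$, while its magnitude is proportional to $g(\beta_0 + \mathbf{x}\boldsymbol{\beta})$ and so varies with $\mathbf{x}$.

2.2 Maximum likelihood estimation of logit and probit models

Because $E(y \mid \mathbf{x})$ is nonlinear in the parameters, OLS and WLS (weighted least squares) are not applicable. We can instead use maximum likelihood estimation (MLE). Assume that we have a random sample of size n. To obtain the maximum likelihood estimator, conditional on the explanatory variables, we need the density of $y_i$ given $\mathbf{x}_i$. With the intercept absorbed into $\mathbf{x}_i$ for simplicity, we can write this as

$$f(y \mid \mathbf{x}_i; \boldsymbol{\beta}) = [G(\mathbf{x}_i\boldsymbol{\beta})]^{y}\,[1 - G(\mathbf{x}_i\boldsymbol{\beta})]^{1-y}, \quad y = 0, 1. \qquad (9)$$

The log-likelihood function for observation i is a function of the parameters and the data $(\mathbf{x}_i, y_i)$ and is obtained by taking the log of (9):

$$\ell_i(\boldsymbol{\beta}) = y_i \log[G(\mathbf{x}_i\boldsymbol{\beta})] + (1 - y_i) \log[1 - G(\mathbf{x}_i\boldsymbol{\beta})]. \qquad (10)$$

The log-likelihood for a sample of size n is obtained by summing (10) across all observations:

$$\mathcal{L}(\boldsymbol{\beta}) = \sum_{i=1}^{n} \ell_i(\boldsymbol{\beta}) = \sum_{i=1}^{m} y_i \log[G(\mathbf{x}_i\boldsymbol{\beta})] + \sum_{i=m+1}^{n} (1 - y_i) \log[1 - G(\mathbf{x}_i\boldsymbol{\beta})],$$

where the observations are ordered so that $y_i = 1$ for $i = 1, \ldots, m$ and $y_i = 0$ for $i = m+1, \ldots, n$. The MLE of $\boldsymbol{\beta}$, denoted by $\hat{\boldsymbol{\beta}}$, maximizes this log-likelihood. If $G(\cdot)$ is the standard logistic cdf, then $\hat{\boldsymbol{\beta}}$ is the logit estimator; if $G(\cdot)$ is the standard normal cdf, then $\hat{\boldsymbol{\beta}}$ is the probit estimator.

2.3 Testing multiple hypotheses

The likelihood ratio test is the one most commonly used to test multiple restrictions, such as $\beta_1 = \beta_2 = \cdots = \beta_k = 0$, in the logit and probit models. It plays the same role as the F test of overall significance in least squares estimation.

The likelihood ratio (LR) test

The likelihood ratio statistic is twice the difference in the log-likelihoods:

$$LR = 2(\mathcal{L}_{ur} - \mathcal{L}_{r}), \qquad (11)$$

where $\mathcal{L}_{ur}$ is the log-likelihood value for the unrestricted model (the one with no constraints) and $\mathcal{L}_{r}$ is the log-likelihood value for the restricted model ($\beta_1 = \beta_2 = \cdots = \beta_k = 0$ in the above example). Because $\mathcal{L}_{ur} \geq \mathcal{L}_{r}$, LR is nonnegative and usually strictly positive. Under the null hypothesis, LR is asymptotically chi-square distributed with degrees of freedom equal to the number of restrictions being tested (k in the above example).

Stata commands

1. Logit model
   logit y x1 x2
   dlogit y x1 x2 (the marginal effects are returned)
   mlogit y x1 x2 (multinomial logit model)
2. Probit model
   probit y x1 x2
   dprobit y x1 x2 (the marginal effects are returned)
   mprobit y x1 x2 (multinomial probit model)
   oprobit y x1 x2 (ordered probit model, for ordered outcomes, e.g. count data)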
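
For readers who want to see the mechanics behind Sections 2.2 and 2.3, the following is a minimal Python sketch, not the procedure Stata uses. It evaluates the log-likelihood built from (10), fits a logit model, and computes the LR statistic (11) and the partial effect (6). The data, the function names, and the use of fixed-step gradient ascent (chosen for brevity over the Newton-type iterations that statistical packages generally rely on) are all assumptions made for illustration.

```python
import math

def logistic_cdf(z):
    """Logistic cdf G(z) = exp(z) / [1 + exp(z)], as in (3), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def normal_cdf(z):
    """Standard normal cdf Phi(z), the probit choice of G."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def index(beta, xi):
    """Linear index x_i * beta; xi[0] is 1, so beta[0] is the intercept."""
    return sum(b * v for b, v in zip(beta, xi))

def log_likelihood(beta, X, y, G=logistic_cdf):
    """Sample log-likelihood: the sum of (10) over all observations."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = G(index(beta, xi))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def logit_score(beta, X, y):
    """Gradient of the logit log-likelihood: sum of (y_i - p_i) * x_i."""
    grad = [0.0] * len(beta)
    for xi, yi in zip(X, y):
        resid = yi - logistic_cdf(index(beta, xi))
        for j, v in enumerate(xi):
            grad[j] += resid * v
    return grad

def fit_logit(X, y, step=0.05, iterations=5000):
    """Maximize the logit log-likelihood by fixed-step gradient ascent (illustrative only)."""
    beta = [0.0] * len(X[0])
    for _ in range(iterations):
        grad = logit_score(beta, X, y)
        beta = [b + step * g for b, g in zip(beta, grad)]
    return beta

# Hypothetical data: one regressor x and a binary outcome y.
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

X_ur = [[1.0, v] for v in x]   # unrestricted model: intercept and slope
X_r = [[1.0] for _ in x]       # restricted model: slope constrained to zero

beta_ur = fit_logit(X_ur, y)
beta_r = fit_logit(X_r, y)

L_ur = log_likelihood(beta_ur, X_ur, y)
L_r = log_likelihood(beta_r, X_r, y)
LR = 2.0 * (L_ur - L_r)        # likelihood ratio statistic (11)

# Partial effect (6) at x = 0: g(z) * beta_j, where g(z) = G(z)[1 - G(z)] for the logit.
p0 = logistic_cdf(index(beta_ur, [1.0, 0.0]))
partial_effect_at_zero = p0 * (1.0 - p0) * beta_ur[1]

print("beta_hat:", beta_ur)
print("L_ur:", L_ur, "L_r:", L_r, "LR:", LR)
print("partial effect at x = 0:", partial_effect_at_zero)
```

Passing `G=normal_cdf` to `log_likelihood` evaluates the probit log-likelihood at a given parameter vector; `fit_logit` itself uses the logit score only, so fitting a probit model would need its own gradient. With one restriction, the LR value would be compared against a chi-square distribution with one degree of freedom.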

