Limited Dependent Variable Models

           EMET 8002
            Lecture 9
         August 27, 2009



                             1
Limited Dependent Variables
 A limited dependent variable is a dependent
 variable whose range is restricted

 For example:
   Any indicator variable such as whether or not a
   household is poor (i.e., 0 or 1)
   Test scores (generally bound by 0 and 100)
   The number of children born to a woman is a non-
   negative integer



                                                      2
Outline
 Logit and probit models for binary dependent
 variables

 Tobit model for corner solutions




                                                3
Why do we care?
 Let’s start with a review of the linear probability
 model to examine some of its shortcomings

 The model is given by:
          y = β 0 + β1 x1 + ... + β k xk + u


 where
    P ( y = 1| x ) = E ( y | x ) = β 0 + β1 x1 + ... + β k xk



                                                                4
Linear Probability Model
 This model has three undesirable features:
 1. The error term will not be homoskedastic. This violates
    assumption LMR.4. Our OLS estimates will still be unbiased,
    but the standard errors are incorrect. Nonetheless, it is
    easy to adjust for heteroskedasticity of unknown form.

 2.   We can get predictions that are either greater than 1 or
      less than 0!

 3.   The probability cannot be linearly related to the independent
      variables for all possible values of the x's.


                                                                 5
Linear Probability Model
Example
 Let’s look at how being in the labour force is
 influenced by various determinants:
    Husband’s earnings
    Years of education
    Previous labour market experience
    Age
    Number of children less than 6 years old
    Number of children between 6 and 18 years of age




                                                       6
Linear Probability Model
Example
                        Coefficient   Usual standard   Robust standard
                        estimate      errors           errors
Husband's income        -0.0034       0.0014           0.0015
Years of education       0.038        0.007            0.007
Experience               0.039        0.006            0.006
Experience²             -0.00060      0.00018          0.00019
Age                     -0.016        0.002            0.002
# kids <= 6 years old   -0.262        0.034            0.032
# kids > 6 years old     0.013        0.013            0.014
                                                                     7
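A hedged Stata sketch of how estimates like these could be produced (variable names follow the MROZ.RAW data used in Example 17.1 later in the lecture; vce(robust) requests heteroskedasticity-robust standard errors):

  regress inlf nwifeinc educ exper expersq age kidslt6 kidsge6               // usual standard errors
  regress inlf nwifeinc educ exper expersq age kidslt6 kidsge6, vce(robust)  // robust standard errors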
Linear Probability Model
Example
 Using standard errors that are robust to unknown
 heteroskedasticity is simple and does not
 substantially change the reported standard errors

 Interpreting the coefficients:
    All else equal, an extra year of education increases the
    probability of participating in the labour force by 0.038
    (3.8 percentage points)
    All else equal, an additional child 6 years of age or less
    decreases the probability of working by 0.262

                                                             8
Linear Probability Model
Example
 Predicted probabilities:
    Sometimes we obtain predicted probabilities that are outside
    of the range [0,1]. In this sample, 33 of the 753
    observations produce predicted probabilities outside of [0,1].
    For example, consider the following observation:
        Husband’s earnings = 17.8
        Years of education = 17
        Previous labour market experience = 15
        Age = 32
        Number of children less than 6 years old = 0
        Number of children between 6 and 18 years of age = 1
        The predicted probability is 1.13!!

                                                                 9
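A hedged sketch of how to check this in Stata after the LPM regression above (phat is simply an assumed name for the fitted values):

  predict phat                    // fitted probabilities from the linear probability model
  count if phat < 0 | phat > 1    // number of out-of-range predictions (33 of the 753 observations here)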
Linear Probability Model
Example
 An additional problem is that probabilities cannot be
 linearly related to the independent variables for all
 possible values
    For example, consider the estimate of the marginal
    effect of increasing the number of children 6 years of
    age or younger. It is estimated to be -0.262. This
    means that if this independent variable increased from
    0 to 4, the probability of being in the labour market
    would fall by 1.048, which is impossible!




                                                         10
Linear Probability Model
 It is still a useful model to estimate, especially since
 the estimated coefficients are much easier to interpret
 than those from the nonlinear models that we are going to
 introduce shortly
 Plus, it usually works well for values of the
 independent variables that are close to the respective
 means (i.e., outlying values of x cause problems)




                                                       11
Limited Dependent Variables
Models
 In this lecture we’re going to cover estimation
 techniques that will better address the nature of the
 dependent variable
    Logit & Probit
    Tobit




                                                         12
Logit and Probit Models for
Binary Response
 We’re going to prevent predicted values from ever
 falling outside the range [0,1] by estimating a
 nonlinear regression:
            P ( y = 1| x ) = G ( β 0 + xβ )
 where 0<G(z)<1 for all real numbers z

 The two most commonly used functions for G(.) are
 the logit model and the probit model:
             Logit:  G(z) = \frac{\exp(z)}{1 + \exp(z)} = \Lambda(z)

             Probit: G(z) = \Phi(z)
                                                      13
Logit and Probit Models for
Binary Response
 Logit and probit models can be derived from an
 underlying latent variable model
   i.e., an unobserved variable
             y^* = \beta_0 + x\beta + e, \qquad y = 1[y^* > 0]
 We assume that e is independent of x and that e
 either has the standard logistic distribution or the
 standard normal distribution
 Under either assumption e is symmetrically
 distributed about 0, which implies that 1-G(-z)=G(z)
 for all real numbers z

                                                        14
Logit and Probit Models for
Binary Response
 We can now derive the response probability for y:
         P(y = 1 | x) = P(y^* > 0 | x)
                      = P(\beta_0 + x\beta + e > 0 | x)
                      = P(e > -(\beta_0 + x\beta) | x)
                      = 1 - G(-(\beta_0 + x\beta))
                      = G(\beta_0 + x\beta)




                                                     15
Logit and Probit Models for
Binary Response
 In most applications of binary response models our main
 interest is to explain the effects of the x’s on the response
 probability P(y=1|x)
 The latent variable interpretation tends to give the impression
 that we are interested in the effects of the x’s on y*
 For probit and logit models, the direction of the effect of the x's
 on E(y*|x) and on E(y|x) = P(y=1|x) is the same
 In most applications, however, the latent variable does not have
 a well-defined unit of measurement, which limits its
 interpretation. Nonetheless, in some examples it is a very
 useful tool for thinking about the problem.



                                                                   16
Logit and Probit Models for
Binary Response
 The sign of the coefficients will tell us the direction of
 the partial effect of xj on P(y=1|x)

 However, unlike the linear probability model, the
 magnitudes of the coefficients are not especially
 useful

 If xj is a roughly continuous variable, its partial effect
  is given by:
                 \frac{\partial p(x)}{\partial x_j} = \frac{dG(z)}{dz}\,\beta_j
                                                          17
Logit and Probit Models for
Binary Response
 In the linear probability model the derivative of G was simply 1,
 since G(z)=z in the linear probability model.
     In other words, we can move from this nonlinear function
     back to the linear model by simply assuming G(z)=z.

 For both the logit and the probit models g(z)=dG(z)/dz is
 always positive (since G is the cumulative distribution function,
 g is the probability density function). Thus, the sign of βj is the
 same as the sign of the partial effect.

 The magnitude of the partial effect is influenced by the entire
 vector of x’s

                                                                   18
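For reference, a short math note (standard results for these two distributions, not shown on the original slide): the logit density is g(z) = \Lambda(z)[1 - \Lambda(z)] = \exp(z)/[1 + \exp(z)]^2, and the probit density is g(z) = \phi(z) = (2\pi)^{-1/2}\exp(-z^2/2).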
Logit and Probit Models for
Binary Response
 Nonetheless, the relative effect of any two
 continuous explanatory variables does not depend on x

 The ratio of the partial effects for xj and xh is βj/βh,
 which does not depend on x




                                                            19
Logit and Probit Models for
Binary Response
  Suppose x1 is a discrete variable. Its partial effect of going from
  c to c+1 is given by:
           G(\beta_0 + \beta_1(c+1) + \beta_2 x_2 + ... + \beta_k x_k) -
           G(\beta_0 + \beta_1 c + \beta_2 x_2 + ... + \beta_k x_k)
 Again, this effect depends on x

 Note, however, that the sign of β1 is enough to know whether
 the discrete variable has a positive or negative effect
    This is because G() is strictly increasing



                                                                   20
Logit and Probit Models for
Binary Response
 We use Maximum Likelihood Estimation, which
 already takes into consideration the
 heteroskedasticity inherent in the model

 Assume that we have a random sample of size n

 To obtain the maximum likelihood estimator,
 conditional on the explanatory variables, we need the
 density of yi given xi
       f(y | x_i; \beta) = [G(x_i\beta)]^{y}\,[1 - G(x_i\beta)]^{1-y}, \qquad y = 0, 1
                                                                          21
Logit and Probit Models for
Binary Response
 When y=1: f(y|x_i; \beta) = G(x_i\beta)
 When y=0: f(y|x_i; \beta) = 1 - G(x_i\beta)

 The log-likelihood function for observation i is
 given by:
      \ell_i(\beta) = y_i \log[G(x_i\beta)] + (1 - y_i)\log[1 - G(x_i\beta)]
 The log-likelihood for a sample of size n is obtained
 by summing this expression over all observations
                        L(\beta) = \sum_{i=1}^{n} \ell_i(\beta)

                                                                         22
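A hedged Stata sketch (variable names follow the MROZ data used in Example 17.1 below): Stata maximizes this log-likelihood numerically and stores its maximized value in e(ll).

  probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6
  display e(ll)    // maximized log-likelihood for the probit model
  logit  inlf nwifeinc educ exper expersq age kidslt6 kidsge6
  display e(ll)    // maximized log-likelihood for the logit model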
Logit and Probit Models for
Binary Response
 The MLE of β maximizes this log-likelihood
 If G is the standard logistic cdf, then we get the logit
 estimator
 If G is the standard normal cdf, then we get the
 probit estimator

 Under general conditions, the MLE is:
    Consistent
    Asymptotically normal
    Asymptotically efficient


                                                         23
Inference in Probit and Logit
Models
 Standard regression software, such as Stata, will
 automatically report asymptotic standard errors for
 the coefficients

 This means we can construct (asymptotic) t-tests for
 statistical significance in the usual way:
                t_j = \hat{\beta}_j / se(\hat{\beta}_j)




                                                       24
Logit and Probit Models for Binary
Response: Testing Multiple Hypotheses

 We can also test for multiple exclusion restrictions
 (i.e., two or more regression parameters are equal to
 0)

 There are two options commonly used:
    A Wald test
    A likelihood ratio test




                                                    25
Logit and Probit Models for Binary
Response: Testing Multiple Hypotheses

 Wald test:
    In the linear model, the Wald statistic can be
    transformed to be essentially the same as the F
    statistic
    The formula can be found in Wooldridge (2002,
    Chapter 15)
    It has an asymptotic chi-squared distribution, with
    degrees of freedom equal to the number of restrictions
    being tested
    In Stata we can use the “test” command following
    probit or logit estimation

                                                        26
Logit and Probit Models for Binary
Response: Testing Multiple Hypotheses
 Likelihood ratio (LR) test
     If both the restricted and unrestricted models are easy to
     compute (as is the case when testing exclusion restrictions),
     then the LR test is very attractive
     It is based on the difference in the log-likelihood functions
     for the restricted and unrestricted models
        Because the MLE maximizes the log-likelihood function,
        dropping variables generally leads to a smaller log-likelihood
        (much in the same way as dropping variables in a linear model
        leads to a smaller R²)
    The likelihood ratio statistic is given by:
                        LR = 2 ( Lur − Lr )
    It is asymptotically chi-squared with degrees of freedom
    equal to the number of restrictions
     We can use the lrtest command in Stata

                                                                     27
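A hedged Stata sketch of both tests, taking the joint significance of the two children variables from the MROZ example as an illustration (both models should be fit on the same sample for the LR test to be valid):

  * Wald test of the exclusion restrictions after probit
  probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6
  test kidslt6 kidsge6

  * Likelihood ratio test: compare unrestricted and restricted models
  estimates store unrestricted
  probit inlf nwifeinc educ exper expersq age
  lrtest unrestricted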
Logit and Probit Models for Binary
Response: Interpreting Probit and Logit
Estimates

  Recall that unlike the linear probability model, the
  estimated coefficients from Probit or Logit estimation
  do not tell us the magnitude of the partial effect of a
  change in an independent variable on the predicted
  probability

  The partial effect depends not just on the coefficient estimate
  for that variable, but also on the values of all the independent
  variables and the other coefficient estimates


                                                        28
Logit and Probit Models for Binary
Response: Interpreting Probit and Logit
Estimates

  For roughly continuous variables the marginal effect is
  approximately:

         \Delta\hat{P}(y = 1 | x) \approx g(\hat{\beta}_0 + x\hat{\beta})\,\hat{\beta}_j\,\Delta x_j

  For discrete variables the estimated change in the
  predicted probability is given by:

         G(\hat{\beta}_0 + \hat{\beta}_1(c+1) + \hat{\beta}_2 x_2 + ... + \hat{\beta}_k x_k) -
         G(\hat{\beta}_0 + \hat{\beta}_1 c + \hat{\beta}_2 x_2 + ... + \hat{\beta}_k x_k)
                                                         29
Logit and Probit Models for Binary
Response: Interpreting Probit and Logit
Estimates
  Thus, we need to pick “interesting” values of x at
  which to evaluate the partial effects
     Often the sample averages are used. Thus, we obtain
     the partial effect at the average (PEA)

     We could also use lower or upper quartiles, for
     example, to see how the partial effects change as
     some elements of x get large or small

     If xk is a binary variable, then it often makes sense to
     use a value of 0 or 1 in the partial effect equation,
     rather than the average value of xk

                                                                30
Logit and Probit Models for Binary
Response: Interpreting Probit and Logit
Estimates

  An alternative approach is to calculate the average
  partial effect (APE)

  For a continuous explanatory variable, xj, the APE is:
      n^{-1}\sum_{i=1}^{n} g(\hat{\beta}_0 + x_i\hat{\beta})\,\hat{\beta}_j = \left[ n^{-1}\sum_{i=1}^{n} g(\hat{\beta}_0 + x_i\hat{\beta}) \right]\hat{\beta}_j

  The two scale factors (at the mean for PEA and
  averaged over the sample for the APE) differ since
  the first uses a nonlinear function of the average and
  the second uses the average of a nonlinear function
                                                            31
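A hedged Stata sketch of both calculations using the built-in margins command (a newer alternative to the user-written margeff command used later in these slides), again with the MROZ variable names:

  probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6
  margins, dydx(*) atmeans    // partial effects at the average (PEA)
  margins, dydx(*)            // average partial effects (APE)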
Example 17.1: Married Women’s
Labour Force Participation
 We are going to use the data in MROZ.RAW to
 estimate a labour force participation equation for women
 using logit and probit estimation.
   The explanatory variables are nwifeinc, educ, exper,
   expersq, age, kidslt6, and kidsge6
   probit inlf nwifeinc educ exper expersq age kidslt6
   kidsge6




                                                          32
Example 17.1
Independent variable      Coefficient estimates (standard errors in parentheses)
                          OLS (robust s.e.)     Probit             Logit
Husband's income          -0.0034 (0.0015)      -0.012 (0.005)     -0.021 (0.008)
Years of education         0.038 (0.007)         0.131 (0.025)      0.221 (0.043)
Age                       -0.016 (0.002)        -0.053 (0.008)     -0.088 (0.014)
# kids <= 6 years old     -0.262 (0.032)        -0.868 (0.119)     -1.44 (0.20)
# kids > 6 years old       0.013 (0.014)         0.036 (0.043)      0.060 (0.075)
                                                                        33
Example 17.1
 True or false:
    The Probit and Logit model estimates suggest that the
    linear probability model was underestimating the
    negative impact of having young children on the
    probability of women participating in the labour force.




                                                          34
Example 17.1
 How does the predicted probability change as the
 number of young children increases from 0 to 1?
 What about from 1 to 2?
   We’ll evaluate the effects at:
      Husband’s income=20.13
      Education=12.3
      Experience=10.6
      Age=42.5
      # older children=1
   These are all close to the sample averages


                                                    35
Example 17.1
 From the probit estimates:

     Going from 0 to 1 young child decreases the probability of
     labour force participation by 0.334

     Going from 1 to 2 young children decreases the probability of
     labour force participation by 0.256

  Notice that the impact of one extra child is now nonlinear (there
  is a diminishing impact). This differs from the linear probability
  model, in which each additional young child has the same
  impact.

                                                                 36
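A hedged Stata sketch of this calculation (the at() values follow the slide above; expersq is set to 10.6² ≈ 112.36, and predict(pr) requests predicted probabilities):

  probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6
  margins, at(nwifeinc=20.13 educ=12.3 exper=10.6 expersq=112.36 age=42.5 ///
      kidsge6=1 kidslt6=(0 1 2)) predict(pr)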
Logit and Probit Models for Binary
Response
 Similar to linear models, we have to be concerned with
 endogenous explanatory variables. We don’t have time to cover
 this so see Wooldridge (2002, Chapter 15) for a discussion

 We need to be concerned with heteroskedasticity in probit and
 logit models. If var(e|x) depends on x then the response
 probability no longer has the form G(β0+βx) implying that more
 general estimation techniques are required

  The linear probability model can be applied to panel data, typically
  estimated using fixed effects
     Logit and probit models with unobserved effects are difficult
     to estimate and interpret (see Wooldridge (2002, Chapter
     15))


                                                                 37
The Tobit Model for Corner
Solution Responses
 Often in economics we observe variables for which 0
 (or some other fixed number) is an optimal
 outcome for some units of observation, but a range
 of positive outcomes prevails for other observations
   For example:
      Number of hours worked annually
      Trade flows
      Hours spent on the internet
      Grade on a test (may be grouped at both 0 and 100)




                                                           38
The Tobit Model for Corner
Solution Responses
 Let y be a variable that is roughly continuous over
 strictly positive values but that takes on zero with a
 positive probability

 Similar to the binary dependent variable context we
 can use a linear model and this might not be so bad
 for observations that are close to the mean, but we
 may obtain negative fitted values and therefore
 negative predictions for y


                                                          39
The Tobit Model for Corner
Solution Responses
 We often express the observed outcome, y, in terms
 of an unobserved latent variable, say y*
           y^* = x\beta + u, \qquad u|x \sim N(0, \sigma^2)
           y = \max(0, y^*)

 We now need to think about how to estimate this
 model. There are two cases to consider:
   When y=0
   When y>0


                                                   40
       The Tobit Model for Corner
       Solution Responses
          Let’s start with how we’d incorporate y=0. What is
          the probability that y=0 conditional on the
          explanatory variables?
 P(y = 0 | x) = P(y^* < 0 | x)                   (definition of y)
              = P(x\beta + u < 0 | x)            (definition of y^*)
              = P(u < -x\beta | x)
              = P(u/\sigma < -x\beta/\sigma | x) (standardizing to a standard normal variable)
              = \Phi(-x\beta/\sigma)             (the normal CDF)
              = 1 - \Phi(x\beta/\sigma)
                                                                           41
The Tobit Model for Corner
Solution Responses
 What is the probability that y>0 conditional on the
 explanatory variables?

 Since y is continuous over strictly positive values, for
 y > 0 we use the density of y given x, which is the normal
 density (1/σ)φ[(y − xβ)/σ]

 We can now put together these two pieces to form
 the log-likelihood function for the Tobit model (see
 equation 17.22 in Wooldridge)

                                                        42
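For reference, a sketch of that log-likelihood for observation i (the standard censored-at-zero form; see equation 17.22 in Wooldridge for the exact statement):

  \ell_i(\beta, \sigma) = 1[y_i = 0]\,\log\{1 - \Phi(x_i\beta/\sigma)\} + 1[y_i > 0]\,\log\{(1/\sigma)\,\phi[(y_i - x_i\beta)/\sigma]\}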
Interpreting Tobit estimates
  Given standard regression packages, it is straightforward to
  estimate a Tobit model using maximum likelihood (the details of
  the formulation are available in Wooldridge (2002, Chapter 16))

 The underlying model tells us that βj measures the partial effect
 of xj on y*, the latent variable. However, we’re usually
 interested in the observed outcome y, not y*

 In the Tobit model two conditional expectations are generally of
 interest:
     E(y|y>0,x)
     E(y|x)

                                                                43
Interpreting Tobit estimates
  E(y | y > 0, x) = x\beta + \sigma\,\lambda(x\beta/\sigma)
  E(y | x) = \Phi(x\beta/\sigma)\,x\beta + \sigma\,\phi(x\beta/\sigma)
  where \lambda(z) = \phi(z)/\Phi(z) is the inverse Mills ratio

 Take home message: Conditional expectations in the
 Tobit are much more complicated than in the linear
 model

 E(y|x) is a nonlinear function of both x and β.
 Moreover, this conditional expectation can be shown
 to be positive for any values of x and β.
                                                   44
Interpreting Tobit estimates
 To examine partial effects, we should consider two cases:
    When xj is continuous
    When xj is discrete

 When xj is continuous we can use calculus to solve for the
 partial effects:

  \frac{\partial E(y | y > 0, x)}{\partial x_j} = \beta_j\left\{ 1 - \lambda(x\beta/\sigma)\left[ x\beta/\sigma + \lambda(x\beta/\sigma) \right] \right\}

  \frac{\partial E(y | x)}{\partial x_j} = \beta_j\,\Phi(x\beta/\sigma)
 Like in probit or logit models, the partial effect will depend on
 all explanatory variables and parameters
                                                                     45
Interpreting Tobit estimates
 When xj is discrete we estimate the partial effect as
 the difference:
  E ( y | y > 0, x − j , x j = c + 1) − E ( y | y > 0, x − j , x j = c )
  E ( y | x − j , x j = c + 1) − E ( y | x − j , x j = c )




                                                                           46
Interpreting Tobit estimates
 Just like the probit and logit models, there are two
 common approaches for evaluating the partial
 effects:
     Partial Effect at the Average (PEA)
        Evaluate the expressions at the sample averages of the explanatory variables
     Average Partial Effect (APE)
        Average the partial effects computed at each observation in the sample




                                                             47
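A hedged Stata sketch of both approaches after a Tobit fit (predict(ystar(0,.)) targets E(y|x) for the model censored at zero; predict(e(0,.)) would target E(y|y>0,x)):

  tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)
  margins, dydx(*) predict(ystar(0,.))            // average partial effects on E(y|x)
  margins, dydx(*) predict(ystar(0,.)) atmeans    // partial effects at the average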
Example 17.2: Women’s
annual labour supply
 We can use the same dataset, MROZ.RAW, that we
 used to estimate the probability of women
 participating in the labour force to estimate the
 impact of various explanatory variables on the total
 number of hours worked

 Of the 753 women in the sample:
    428 worked for a wage during the year
    325 worked zero hours in the labour market


                                                        48
Tobit example: Women’s
annual labour supply
 reg hours nwifeinc educ exper expersq age kidslt6
 kidsge6

 tobit hours nwifeinc educ exper expersq age kidslt6
 kidsge6, ll(0)




                                                       49
Tobit example: Women’s
annual labour supply
                          Coefficient estimates (standard errors in parentheses)

                          OLS                  Tobit
Husband's income          -3.45 (2.54)         -8.81 (4.46)
Years of education        28.76 (12.95)        80.65 (21.58)
Age                       -30.51 (4.36)        -54.41 (7.42)
# kids <= 6 years old     -442.09 (58.85)      -894.02 (111.88)
# kids > 6 years old      -32.78 (23.18)       -16.22 (38.64)
Sigma                                          1122.022 (41.58)
                                                                50
Tobit example: Women’s
annual labour supply
 The Tobit coefficient estimates all have the same sign
 as the OLS coefficients

 The pattern of statistical significance is also very
 similar

 Remember though, we cannot directly compare the
 OLS and Tobit coefficients in terms of their effect on
 hours worked


                                                        51
Tobit example: Women’s
annual labour supply
 Let’s construct some marginal effects for some of the
 discrete variables

 First, the means of the explanatory variables:
    Husband’s income: 20.12896
    Education: 12.28685
    Experience: 10.63081
    Age: 42.53785
    # young children: 0.2377158
    # older children: 1.353254

                                                     52
Tobit example: Women’s
annual labour supply
 Recall the formula:
       E(y | x) = \Phi(x\beta/\sigma)\,x\beta + \sigma\,\phi(x\beta/\sigma)

 We can use this to answer the following question: What is the
 impact of moving from 0 to 1 young children on the total
 number of hours worked?
    We’ll evaluate for a hypothetical person close to the mean
    values:
        Husband’s income: 20.12896
        Education: 12
        Experience: 11
        Age: 43
        # older children: 1


                                                                 53
Tobit example: Women’s
annual labour supply
 xβ(#young=0,means)=624.64
 xβ(#young=1,means)=-269.38

 xβ(#young=0,means) / σ=0.5567
 xβ(#young=1,means) / σ=-0.2401

 φ(#young=0,means)=0.3417
 φ(#young=1,means)=0.3876

 Φ(#young=0,means)=0.7111
 Φ(#young=1,means)=0.4051

                                  54
Tobit example: Women’s
annual labour supply
 E(y|#young=0,means)=827.6
 E(y|#young=1,means)=325.8

 E(y|#young=0,means)-E(y|#young=1,means)=502

 Thus, for a hypothetical “average” woman, going from 0 young
 children to 1 young child would decrease hours worked by 502
 hours. This is larger than the OLS estimate of a 442 hour
 decrease.

 We could do the same thing to look at the impact of adding a
 second young child.


                                                                55
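A hedged Stata sketch that reproduces these numbers from the quantities on the previous slide (the scalars are the rounded values reported there; normal() and normalden() are Stata's standard normal CDF and density):

  scalar sig = 1122.02    // estimated sigma from the Tobit output
  scalar xb0 = 624.64     // x*beta evaluated with no young children
  scalar xb1 = -269.38    // x*beta evaluated with one young child
  display normal(xb0/sig)*xb0 + sig*normalden(xb0/sig)    // E(y|x) with 0 young children: about 827.6
  display normal(xb1/sig)*xb1 + sig*normalden(xb1/sig)    // E(y|x) with 1 young child: about 325.8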
Specification Issues
 The Tobit model relies on the assumptions of normality and
 homoskedasticity in the latent variable model

 Recall, using OLS we did not need to assume a distributional
 form for the error term in order to have unbiased (or consistent)
 estimates of the parameters.

 Thus, although using Tobit may provide us with a more realistic
 description of the data (for example, no negative predicted
 values) we have to make stronger assumptions than when using
 OLS.

 In a Tobit model, if any of the assumptions fail, it is hard to
 know what the estimated coefficients mean.


                                                                   56
Specification Issues
 One important limitation of Tobit models is that the expectation of y,
 conditional on a positive value, is closely linked to the probability that
 y>0

 The effect of xj on P(y>0|x) is proportional to βj, as is the effect on
 E(y|y>0,x). Moreover, for both expressions the factor multiplying βj is
 positive.

 Thus, if you want a model where an explanatory variable has opposite
 effects on P(y>0|x) and E(y|y>0,x), then Tobit is inappropriate.

 One way to informally evaluate a Tobit model is to estimate a probit
 model where:
    w=1 if y>0
    w=0 if y=0


                                                                              57
Specification Issues
 The coefficient on xj in the above probit model, say
 γj, is directly related to the coefficient on xj in the
 Tobit model, βj:
                       \gamma_j = \beta_j / \sigma

 Thus, we can look to see if the estimated values
 differ.
     For example, if the estimates differ in sign, this may
     suggest that the Tobit model is inappropriate


                                                             58
Specification Issues: Annual
hours worked example
 From our previous examples, we estimated the probit coefficient on the
 variable # of young children to be -0.868

 In the Tobit model, we estimated βj/σ=-0.797 for the variable # of
 young children

 This is not a very large difference, but it suggests that having a young
 child impacts the initial labour force participation decision more than
 how many hours a woman works, once she is in the labour force

  The Tobit model effectively averages these two effects:
    The impact on the probability of working
    The impact on the number of hours worked, conditional on working



                                                                        59
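As a quick check (a sketch relying on the rounded Tobit estimates from slide 50):

  display -894.02/1122.02    // Tobit coefficient on kidslt6 divided by sigma: about -0.797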
Specification Issues
 If we find evidence that the Tobit model is
 inappropriate, we can use hurdle or two-part models

 These models have the feature that P(y>0|x) and
 E(y|y>0,x) depend on different parameters and thus
 xj can have dissimilar effects on the two functions
 (see Wooldridge (2002, Chapter 16))




                                                   60
Practice questions
 17.2, 17.3
 C17.1, C17.2, C17.3




                       61
Computer Exercise C17.2
 Use the data in LOANAPP.RAW for this exercise.

 Estimate a probit model of approve on white. Find
 the estimated probability of loan approval for both
 whites and nonwhites. How do these compare to the
 linear probability model estimates?

 probit approve white
 regress approve white


                                                   62
Computer Exercise C17.2
                       Probit             LPM
White                  0.784             0.201
                      (0.087)           (0.020)
Constant               0.547             0.708
                      (0.075)           (0.018)
•As there is only one explanatory variable and it takes only two
values, there are only two different predicted probabilities: the
estimated loan approval probabilities for white and nonwhite
applicants
•Hence, the predicted probabilities, whether we use a probit, logit, or
LPM model, are simply the cell frequencies:
    •0.708 for nonwhite applicants
    •0.908 for white applicants
                                                                          63
Computer Exercise C17.2
 We can do this in Stata using the following
 commands following the probit estimation:

 predict phat
 summarize phat if white==1
 summarize phat if white==0




                                               64
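Equivalently (a hedged sketch), the two probabilities can be recovered directly from the probit coefficients on the earlier slide, since Φ(0.547) ≈ 0.708 and Φ(0.547 + 0.784) ≈ 0.908:

  display normal(_b[_cons])                // predicted approval probability for nonwhites
  display normal(_b[_cons] + _b[white])    // predicted approval probability for whites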
Computer Exercise C17.2
 Now add the variables hrat, obrat, loanprc, unem,
 male, married, dep, sch, cosign, chist, pubrec,
 mortlat1, mortlat2, and vr to the probit model. Is
 there statistically significant evidence of
 discrimination against nonwhites?




                                                      65
Computer Exercise C17.2
approve    Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]

white      .5202525    .0969588    5.37    0.000   .3302168    .7102883
hrat       .0078763    .0069616    1.13    0.258   -.0057682   .0215209
obrat      -.0276924   .0060493    -4.58   0.000   -.0395488   -.015836
loanprc    -1.011969   .2372396    -4.27   0.000   -1.47695    -.5469881
unem       -.0366849   .0174807    -2.10   0.036   -.0709464   -.0024234
male       -.0370014   .1099273    -0.34   0.736   -.2524549   .1784521
married    .2657469    .0942523    2.82    0.005   .0810159    .4504779
dep        -.0495756   .0390573    -1.27   0.204   -.1261266   .0269753
sch        .0146496    .0958421    0.15    0.879   -.1731974   .2024967
cosign     .0860713    .2457509    0.35    0.726   -.3955917   .5677343
chist      .5852812    .0959715    6.10    0.000   .3971805    .7733818
pubrec     -.7787405   .12632      -6.16   0.000   -1.026323   -.5311578
mortlat1   -.1876237   .2531127    -0.74   0.459   -.6837153   .308468
mortlat2   -.4943562   .3265563    -1.51   0.130   -1.134395   .1456823
vr         -.2010621   .0814934    -2.47   0.014   -.3607862   -.041338
_cons      2.062327    .3131763    6.59    0.000   1.448512    2.676141


                                                                           66
Computer Exercise C17.2
 Estimate the previous model by logit. Compare the
 coefficient on white to the probit estimate.




                                                     67
Computer Exercise C17.2
approve    Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]

white      .9377643    .1729041    5.42    0.000   .5988784    1.27665
hrat       .0132631    .0128802    1.03    0.303   -.0119816   .0385078
obrat      -.0530338   .0112803    -4.70   0.000   -.0751427   -.0309249
loanprc    -1.904951   .4604412    -4.14   0.000   -2.807399   -1.002503
unem       -.0665789   .0328086    -2.03   0.042   -.1308825   -.0022753
male       -.0663852   .2064288    -0.32   0.748   -.4709781   .3382078
married    .5032817    .177998     2.83    0.005   .1544121    .8521513
dep        -.0907336   .0733341    -1.24   0.216   -.2344657   .0529986
sch        .0412287    .1784035    0.23    0.817   -.3084356   .3908931
cosign     .132059     .4460933    0.30    0.767   -.7422677   1.006386
chist      1.066577    .1712117    6.23    0.000   .731008     1.402146
pubrec     -1.340665   .2173657    -6.17   0.000   -1.766694   -.9146363
mortlat1   -.3098821   .4635193    -0.67   0.504   -1.218363   .598599
mortlat2   -.8946755   .5685807    -1.57   0.116   -2.009073   .2197222
vr         -.3498279   .1537248    -2.28   0.023   -.6511231   -.0485328
_cons      3.80171     .5947054    6.39    0.000   2.636109    4.967311



                                                                           68
Computer Exercise C17.2
 Use the average partial effect (APE) to calculate the
 size of discrimination for the probit and logit
 estimates.




                                                         69
Computer Exercise C17.2
 This can be done in Stata using the user-written
 command margeff
     For dummy variables the APE is calculated as the
     discrete change in the predicted probability as the
     dummy variable changes from 0 to 1 (see Cameron
     and Trivedi, 2009, Chapter 14)

 probit ...
 margeff
 logit ...
 margeff

                                                       70
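In more recent versions of Stata, a hedged alternative to the user-written margeff command is the built-in margins command:

  probit approve white hrat obrat loanprc unem male married dep sch cosign ///
      chist pubrec mortlat1 mortlat2 vr
  margins, dydx(white)            // average partial effect (APE) of white
  margins, dydx(white) atmeans    // partial effect at the average (PEA)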
Computer Exercise C17.2
  Average Partial Effect of being White on Loan
  Approval
                 Probit       Logit      OLS

  White          0.104       0.101      0.129
                (0.023)     (0.022)    (0.020)
  Partial Effect at the Average

  White          0.106       0.097      0.129
                (0.024)     (0.022)    (0.020)



                                                  71

				