Duration and Survival Analysis Introduction by hwh10252


									                             CARLETON UNIVERSITY
                               OTTAWA CANADA
                              ECONOMETRICS 5701F
                                  LECTURE 6

   Instructor: Marcel Voia

 Duration and Survival Analysis
   Questions that can be answered using duration models: What are the patterns of
unemployment duration? What factors influence the length of an unemployment
spell? Are you less likely to exit the state of unemployment the longer you have been
unemployed?
   Duration analysis has in fact been developed in other disciplines (e.g. health, engineering).

 Some basic concepts
    Let $T$ be the duration of a process, or the time to exit from a state. The associated
probability density function is $f(t)$, written loosely as $f(t) = \Pr(T = t)$. The duration
distribution function $F(t)$ represents the probability of exit from the state by time $t$, where
$$F(t) = \Pr(T \le t) = \int_{0}^{t} f(s)\,ds,$$
which implies that $f(t) = dF(t)/dt$.
We are more commonly interested in the probability of survival $S(t)$ in a state to at least time $t$,
$$S(t) = \Pr(T \ge t) = 1 - F(t).$$
The basic building block in duration modelling is the exit rate or hazard function at some time
$t$, commonly denoted $\lambda(t)$, which represents the instantaneous exit rate from the state at time $t$.
In discrete terms, the probability that an individual who has occupied the state until time $t$
leaves that state in a short interval of length $dt$ after $t$ is
$$\Pr(t \le T \le t + dt \mid T \ge t).$$
An average probability of exit per unit of time within the short interval $dt$ is
$$\frac{\Pr(t \le T \le t + dt \mid T \ge t)}{dt}.$$
As we shorten the length of the interval over which this average probability is defined, we
converge to the hazard rate $\lambda(t)$. That is,
$$\lambda(t) = \lim_{dt \to 0} \frac{\Pr(t \le T \le t + dt \mid T \ge t)}{dt}.$$
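Using the normal rules for conditional probabilities, we have that
$$\begin{aligned}
\lambda(t) &= \lim_{dt \to 0} \frac{\Pr(t \le T \le t + dt \mid T \ge t)}{dt}
            = \lim_{dt \to 0} \frac{1}{dt}\,\frac{\Pr(t \le T \le t + dt,\ T \ge t)}{\Pr(T \ge t)} \\
           &= \lim_{dt \to 0} \frac{1}{dt}\,\frac{\Pr(t \le T \le t + dt)}{\Pr(T \ge t)}
            = \lim_{dt \to 0} \frac{1}{dt}\,\frac{\Pr(T \le t + dt) - \Pr(T \le t)}{\Pr(T \ge t)} \\
           &= \lim_{dt \to 0} \frac{1}{S(t)}\,\frac{\Pr(T \le t + dt) - \Pr(T \le t)}{dt}
            = \frac{1}{S(t)}\,\frac{dF(t)}{dt} \\
           &= \frac{f(t)}{S(t)}.
\end{aligned}$$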
So, the hazard rate is the ratio of the duration density to the complement of the duration
distribution function at time $t$. Indeed, given the importance of the hazard rate in duration
modelling, it may be profitable to model the hazard directly rather than as a function of the
duration distribution.
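
The limit definition and the ratio of density to survival function can be compared numerically. The sketch below is only an illustration: the distribution function and the parameter values `LAM0` and `ALPHA` are assumptions chosen for the example and are not part of the lecture.

```python
import math

# Illustrative parameter values only; they are not taken from the lecture.
LAM0, ALPHA = 0.5, 1.5

def cdf(t):
    """F(t) = 1 - exp(-(LAM0*t)**ALPHA), an assumed example distribution."""
    return 1.0 - math.exp(-((LAM0 * t) ** ALPHA))

def hazard_by_limit(t, dt=1e-6):
    """Pr(t <= T <= t+dt | T >= t) / dt for a small but finite dt."""
    return (cdf(t + dt) - cdf(t)) / (dt * (1.0 - cdf(t)))

def hazard_by_ratio(t):
    """f(t) / S(t), using the analytic density of the example distribution."""
    density = (ALPHA * LAM0 * (LAM0 * t) ** (ALPHA - 1.0)
               * math.exp(-((LAM0 * t) ** ALPHA)))
    return density / (1.0 - cdf(t))

limit_value = hazard_by_limit(2.0)
ratio_value = hazard_by_ratio(2.0)
print(limit_value, ratio_value)
```

The two numbers agree up to the discretisation error introduced by the finite `dt`.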

 The concept of duration
     The characteristics of the hazard function have important implications for the pattern of the
probability of exit from some state over time. Suppose for example that T represents the
duration of unemployment. In the study of labour market transitions, one commonly used view
is that it becomes increasingly difficult to secure employment the longer one remains
unemployed. In terms of instantaneous exit rate from the state of unemployment, this
hypothesis would be supported by an empirical finding that the hazard rate $\lambda(t)$ decreases with
$t$. This characteristic is called negative duration dependence, and represents a situation in
which, at some $t = t^{*}$,
$$\left.\frac{d\lambda(t)}{dt}\right|_{t = t^{*}} < 0.$$
Conversely, positive duration dependence would correspond to a circumstance in which
$$\left.\frac{d\lambda(t)}{dt}\right|_{t = t^{*}} > 0.$$
Clearly, the potential patterns of duration dependence depend on the form of $\lambda(t)$. Perhaps the
simplest hazard rate is one in which the instantaneous exit rate is constant over time, such that
$\lambda(t) = \lambda_0$. However, $\lambda(t)$ need be neither constant nor indeed monotonic.

 Some relationships between functions
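   We have that
$$\begin{aligned}
\lambda(t) &= \frac{f(t)}{S(t)} = \frac{1}{S(t)}\,\frac{dF(t)}{dt} \\
           &= -\frac{1}{S(t)}\,\frac{dS(t)}{dt} = -\frac{d \ln S(t)}{dt} \\
           &= -\frac{d \ln(1 - F(t))}{dt}.
\end{aligned}$$
So, integrating the hazard function $\lambda(s)$ from $0$ to $t$ gives
$$\begin{aligned}
\Lambda(t) &= \int_{0}^{t} \lambda(s)\,ds = \int_{0}^{t} -\frac{d \ln(1 - F(s))}{ds}\,ds \\
           &= -\ln(1 - F(s))\Big|_{0}^{t} = -\ln(1 - F(t)) + \ln(1 - F(0)) \\
           &= -\ln(1 - F(t)) \quad \text{since } F(0) = 0 \\
           &= -\ln S(t).
\end{aligned}$$
That is to say, the integrated hazard $\Lambda(t)$ is precisely the negative of the log survival function.
By rearranging the above equation we have that
$$S(t) = \exp\left(-\int_{0}^{t} \lambda(s)\,ds\right),$$
which leads to an expression for the density of $T$,
$$f(t) = \exp\left(-\int_{0}^{t} \lambda(s)\,ds\right)\lambda(t).$$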

So, we can completely describe either the survival function or the duration density in terms of
the hazard function $\lambda(t)$. To proceed further with empirical analysis, some functional
specification for the hazard rate (or the survival function directly) is required.
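
As a rough numerical illustration of these relationships, the sketch below recovers the survival function and the density from a hazard function by integrating it. The linear hazard and its coefficients are hypothetical choices made only so that the result can be checked against a closed form.

```python
import math

def integrated_hazard(hazard, t, steps=10_000):
    """Trapezoidal approximation of Lambda(t), the integral of the hazard on [0, t]."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for k in range(1, steps):
        total += hazard(k * h)
    return total * h

def survival_from_hazard(hazard, t):
    """S(t) = exp(-Lambda(t))."""
    return math.exp(-integrated_hazard(hazard, t))

def density_from_hazard(hazard, t):
    """f(t) = lambda(t) * S(t)."""
    return hazard(t) * survival_from_hazard(hazard, t)

def linear_hazard(t, a=0.2, b=0.1):
    """Hypothetical hazard a + b*t, used only for this illustration."""
    return a + b * t

t = 3.0
s_numeric = survival_from_hazard(linear_hazard, t)
s_exact = math.exp(-(0.2 * t + 0.05 * t * t))  # exp(-(a*t + b*t^2/2))
f_numeric = density_from_hazard(linear_hazard, t)
print(s_numeric, s_exact, f_numeric)
```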

 Some specific parametric forms
 The exponential distribution
    In the discussion of duration dependence, we noted that the simplest case (a constant
instantaneous exit rate) corresponds to a hazard rate specification of the form
$$\lambda(t) = \lambda_0$$
for some parameter $\lambda_0 > 0$. The duration density $f(t)$ and distribution function $F(t)$, and the
survival function $S(t)$, may easily be derived from the constant hazard specification. Note first that
$$\frac{d \ln S(t)}{dt} = -\lambda_0,$$
which implies that
$$\ln S(t) = k - \lambda_0 t$$
for some $k$. Hence,
$$S(t) = \exp(k - \lambda_0 t) = K \exp(-\lambda_0 t) = S(0)\exp(-\lambda_0 t) = \exp(-\lambda_0 t)$$
under the assumption that $S(0) = 1$. It follows that
$$\Lambda(t) = \lambda_0 t, \qquad f(t) = \lambda_0 \exp(-\lambda_0 t),$$
$$F(t) = \int_{0}^{t} f(s)\,ds = \left[-\frac{\lambda_0 \exp(-\lambda_0 s)}{\lambda_0}\right]_{0}^{t} = 1 - \exp(-\lambda_0 t).$$
The exponential distribution (so-called because of the exponential form for the duration
density and distribution functions) is clearly a restrictive specification, and appropriate only for
certain economic or statistical applications. The major drawback inherent in this specification
is that the conditional probability of exit is constant, which implies no positive or negative
duration dependence. This characteristic can also be expressed in terms of a process with no
memory.
     It can be argued that a model of labour market transition from unemployment founded on
the reservation wage model may be modelled using the exponential distribution. That is to say,
if wage offers from some (constant) wage offer distribution arrive at a constant rate over time,
then the probability of labour market entry is also constant.
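
The constant conditional probability of exit can be checked directly from the exponential survival function. In the minimal sketch below, the exit rate `lam0` and the spell lengths are made-up values; the point is only that the chance of surviving a further four periods is the same for a new spell as for one that has already lasted twenty periods.

```python
import math

def exp_survival(t, lam0):
    """S(t) = exp(-lam0 * t) for the constant-hazard model."""
    return math.exp(-lam0 * t)

def conditional_survival(extra, elapsed, lam0):
    """Pr(T >= elapsed + extra | T >= elapsed)."""
    return exp_survival(elapsed + extra, lam0) / exp_survival(elapsed, lam0)

lam0 = 0.25  # assumed exit rate per period
fresh = exp_survival(4.0, lam0)
after_long_spell = conditional_survival(4.0, 20.0, lam0)
print(fresh, after_long_spell)
```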
 The Weibull distribution
   To introduce duration dependence (positive or negative) in a model of state transitions
requires a more general functional specification for the hazard rate. A popular one-parameter
generalisation of the exponential distribution is the Weibull distribution, for which
$$\lambda(t) = \lambda_0 \alpha (\lambda_0 t)^{\alpha - 1} \quad \text{for } \lambda_0, \alpha > 0, \qquad (*)$$
which corresponds to a survival function of the form
$$S(t) = \exp\left(-(\lambda_0 t)^{\alpha}\right),$$
following the same methodology as applied in the exponential distribution. In the presentation
of empirical results, it is common to report the median duration $M$ of the process, which is
given by the duration $t = M$ at which $S(M) = 0.5$. For the Weibull,
$$S(M) = \exp\left(-(\lambda_0 M)^{\alpha}\right) = 0.5 \quad \Longrightarrow \quad M = \frac{(\log 2)^{1/\alpha}}{\lambda_0}.$$
The advantage that the Weibull has over the exponential can be demonstrated by
differentiating (*). Specifically,
$$\frac{d\lambda(t)}{dt} = \lambda_0^{2}\,\alpha(\alpha - 1)(\lambda_0 t)^{\alpha - 2},$$
which is positive (negative) for $\alpha > 1$ ($\alpha < 1$). Thus, the parameter $\alpha$ defines the sign and degree of
duration dependence for the Weibull model: positive duration dependence requires $\alpha > 1$.
This is a useful generalisation which makes the Weibull model more widely applicable than the
simpler exponential distribution. Two examples:
    1. Strike duration. There may be good reason to suppose that the elapsed duration of a
strike at some time t might influence the probability of the strike ending in the next period.
One possible scenario might be that the longer the strike lasts, the more likely it is to end in the
next period, a situation of positive duration dependence.
    2. Unemployment duration. Depreciation of skills and human capital might mean that the
longer is the term of unemployment, the less likely is the unemployed person to secure a job
offer in the next period. This corresponds to a situation of negative duration dependence.
    The Weibull is still restrictive, to the extent that the hazard rate is monotonic. That is to
say, duration dependence is either entirely positive or entirely negative, and can never change
sign over the range of state duration.
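
A short sketch of the Weibull hazard, survival function and median follows. The scale `lam0` and the three shape values are illustrative assumptions; the loop simply shows the hazard falling, staying flat, or rising with duration as the shape parameter moves from below one to above one.

```python
import math

def weibull_hazard(t, lam0, alpha):
    """lambda(t) = lam0 * alpha * (lam0*t)**(alpha-1)."""
    return lam0 * alpha * (lam0 * t) ** (alpha - 1.0)

def weibull_survival(t, lam0, alpha):
    """S(t) = exp(-(lam0*t)**alpha)."""
    return math.exp(-((lam0 * t) ** alpha))

def weibull_median(lam0, alpha):
    """M = (log 2)**(1/alpha) / lam0, the duration at which S(M) = 0.5."""
    return math.log(2.0) ** (1.0 / alpha) / lam0

lam0 = 0.1  # assumed scale parameter
for alpha in (0.7, 1.0, 1.3):
    median = weibull_median(lam0, alpha)
    early = weibull_hazard(1.0, lam0, alpha)
    late = weibull_hazard(20.0, lam0, alpha)
    print(alpha, median, weibull_survival(median, lam0, alpha), early, late)
```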
 The log logistic distribution
   The log-logistic distribution is an alternative one-parameter generalisation of the
exponential distribution, and corresponds to a hazard rate of the form
$$\lambda(t) = \frac{\lambda_0 \alpha (\lambda_0 t)^{\alpha - 1}}{1 + (\lambda_0 t)^{\alpha}} \quad \text{for } \lambda_0, \alpha > 0. \qquad (**)$$
The survival function corresponding to (**), although harder to derive, is
$$S(t) = \frac{1}{1 + (\lambda_0 t)^{\alpha}}.$$
The log-logistic exhibits a different pattern of duration dependence to the Weibull; for $\alpha > 1$ the hazard
rate first increases with $t$ before decreasing as $t$ increases (for $\alpha \le 1$ it decreases throughout).
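
The rise and fall of the log-logistic hazard can be seen in a small sketch. The turning point used here, $(\alpha - 1)^{1/\alpha}/\lambda_0$, is not stated in the notes; it follows from setting the derivative of the hazard to zero and only exists when the shape parameter exceeds one. The parameter values are illustrative assumptions.

```python
import math

def loglogistic_hazard(t, lam0, alpha):
    """lambda(t) = lam0*alpha*(lam0*t)**(alpha-1) / (1 + (lam0*t)**alpha)."""
    return lam0 * alpha * (lam0 * t) ** (alpha - 1.0) / (1.0 + (lam0 * t) ** alpha)

def loglogistic_survival(t, lam0, alpha):
    """S(t) = 1 / (1 + (lam0*t)**alpha)."""
    return 1.0 / (1.0 + (lam0 * t) ** alpha)

def hazard_peak(lam0, alpha):
    """Turning point (alpha-1)**(1/alpha) / lam0; only defined for alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("the hazard decreases throughout when alpha <= 1")
    return (alpha - 1.0) ** (1.0 / alpha) / lam0

lam0, alpha = 0.2, 2.0  # assumed values
t_star = hazard_peak(lam0, alpha)
before = loglogistic_hazard(t_star - 1.0, lam0, alpha)
at_peak = loglogistic_hazard(t_star, lam0, alpha)
after = loglogistic_hazard(t_star + 1.0, lam0, alpha)
print(t_star, before, at_peak, after)
```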
 Introducing covariates
    We introduce the fact that duration may depend on observed characteristics. It seems
perfectly reasonable, for example, that the duration of unemployment will depend on
socio-demographic factors and human capital and skill characteristics. A model which does not
admit this as a possibility is therefore likely to be misspecified to some degree. The first class
of covariates are termed time-invariant covariates, and are so-called because the values do not
depend on the period of duration in a state. So, for time-invariant characteristics $x_i$, the value
of $t$ does not influence the value of the covariate; one could class these covariates as
exogenous to the duration process. In a study of unemployment, one might include the gender
of the individual, or the level of school qualifications. Each is clearly fixed in relation to the
period of unemployment. On the other hand, time-varying covariates are much harder to deal
with. A time-varying covariate $x_{it}$ is one for which the level of the covariate depends on the
duration in the state in question. Examples might include socio-demographic characteristics,
or the human capital/skills base of the subject, all of which may vary over the period of
unemployment.
 The proportional hazard specification
     A common mechanism by which to introduce covariates into duration and survival analysis
is through the so-called proportional hazards specification, which adjusts the conventional
hazard specification according to the following rule:
$$\lambda(t) = g(x)\,\lambda_0(t) > 0 \qquad (***)$$
for some parametric baseline hazard $\lambda_0(t)$ and covariate function $g(x)$, where $x$ represents a
vector of covariates thought to influence the duration in a state and the instantaneous exit rate.
Typically, $g(x) = \exp(x'\beta)$.
    Based on (***), we can define the survival function in the normal fashion, as
$$\begin{aligned}
S(t) &= \exp\left(-\int_{0}^{t} \exp(x'\beta)\,\lambda_0(s)\,ds\right) \\
     &= \exp\left(-\exp(x'\beta)\int_{0}^{t} \lambda_0(s)\,ds\right) \\
     &= \exp\left(-\exp(x'\beta)\,\Lambda_0(t)\right),
\end{aligned}$$
where $\Lambda_0(t) = \int_{0}^{t} \lambda_0(s)\,ds$ is the integrated baseline hazard,
   with the standard formulation for the likelihood.
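
The name "proportional hazards" reflects the fact that the ratio of the hazards of two individuals does not depend on $t$. The sketch below illustrates this under assumed ingredients: the coefficient vector `BETA`, the Weibull-type baseline and the two covariate vectors are all hypothetical.

```python
import math

BETA = (0.4, -0.3)  # hypothetical coefficient vector

def g(x):
    """Covariate function g(x) = exp(x'beta)."""
    return math.exp(sum(b * xi for b, xi in zip(BETA, x)))

def baseline_hazard(t):
    """Assumed baseline: Weibull-type hazard with scale 0.1 and shape 1.5."""
    return 0.1 * 1.5 * (0.1 * t) ** 0.5

def baseline_integrated_hazard(t):
    """Integral of the assumed baseline hazard from 0 to t."""
    return (0.1 * t) ** 1.5

def hazard(t, x):
    return g(x) * baseline_hazard(t)

def survival(t, x):
    return math.exp(-g(x) * baseline_integrated_hazard(t))

x_a, x_b = (1.0, 0.0), (0.0, 1.0)
ratios = [hazard(t, x_a) / hazard(t, x_b) for t in (1.0, 5.0, 25.0)]
print(ratios, survival(5.0, x_a), survival(5.0, x_b))
```

Every entry of `ratios` equals `g(x_a) / g(x_b)`, whatever the duration at which the hazards are compared.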

 Accelerated hazard function
   The general form for the accelerated hazard function introduces covariates through the $\lambda_0$
multiplier in the parametric hazard functions defined earlier. Typically,
$$\lambda_0 = \exp(x'\beta).$$
Some special cases:
   No duration dependence
   When the proportional hazard rate does not depend on duration, we can write the hazard
function as
$$\lambda(t) = \lambda_0 = \exp(x'\beta).$$
It can be shown that the expected duration for such a model is
$$E(t) = \frac{1}{\exp(x'\beta)},$$
which leads to a simple regression representation of the no duration dependence proportional
hazard (for completed durations) of
$$-\log t = x'\beta + v,$$
where $v$ is a stochastic term.
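
The regression representation can be tried out on simulated data. The sketch below is an illustration under stated assumptions: a single covariate, hypothetical coefficients `b0` and `b1`, and completed exponential durations. In this exponential case the stochastic term has a non-zero mean (Euler's constant, about 0.5772), so it is the slope rather than the intercept that should be compared with the true coefficient.

```python
import math
import random

def simulate(n, b0, b1, seed=0):
    """Draw completed durations with constant hazard exp(b0 + b1*x)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        rate = math.exp(b0 + b1 * x)   # lambda_0 = exp(x'beta)
        t = rng.expovariate(rate)      # completed duration
        xs.append(x)
        ys.append(-math.log(t))        # dependent variable -log(t)
    return xs, ys

def ols(xs, ys):
    """Least-squares intercept and slope for a single regressor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

xs, ys = simulate(50_000, b0=-1.0, b1=0.8)
intercept, slope = ols(xs, ys)
print(intercept, slope)
```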
   Accelerated Weibull hazard
   The general form for the accelerated Weibull would be as above,
$$\lambda(t) = \lambda_0 \alpha (\lambda_0 t)^{\alpha - 1} = \exp(x'\beta)\,\alpha\,\left(\exp(x'\beta)\,t\right)^{\alpha - 1}.$$
Estimation of the accelerated Weibull follows as before, substituting $\lambda_0 = \exp(x'\beta)$ into the
standard likelihood expression. For monotone hazard functions such as the Weibull, the signs
of the coefficients $\beta$ will therefore show the effects of covariates on the exit rate. For
non-monotonic functions, however, interpretation is less straightforward.
 Introducing unobserved heterogeneity (the mixed proportional hazard model)
    The following question might arise in your research: What would be the effect on the
employment/unemployment duration of a randomly chosen individual who was actually given a
treatment? Since the use of the treatment is a matter of choice, selection issues will arise in
answering this question even though randomization was used in allocating individuals to the
possibility of using the employment services (treatment). For example, those individuals who
believe they would get the most benefit from the services may be disproportionately the ones
who choose to avail themselves of the services. Because of these selection problems this is a
difficult question to answer, and the answer will be sensitive to the assumptions made to permit
estimation of this effect in the presence of selection. We will utilize a mixed
proportional hazard (MPH) model that allows for unobserved heterogeneity. Consider the
following MPH model:
$$\lambda_i(t_i \mid v_i, x_i) = v_i\, g(x_i)\, \lambda_0(t_i),$$
where $t_i$ is the spell of, e.g., unemployment for individual $i$, $g(x_i) = e^{x_i \beta}$ and $\lambda_0(t_i)$ is the
baseline hazard for individual $i$. Here, an individual has a given value of $v_i$, and his spell
durations are independent drawings from the univariate duration distribution $F(t_i \mid x_i; v_i)$, where
$v_i$ is unobserved, so that the durations given just $x_i$ are not independent. Conditional on $x_i$ and
$v_i$, the durations $t_i$ are independent. Given independence, the integrated hazard for the
unemployment spell of a given individual is defined as
$$\Lambda_i = v_i\, g(x_i)\, \Lambda_0(t_i), \quad \text{with} \quad \Lambda_0(t_i) = \int_{0}^{t_i} \lambda_0(u)\,du.$$
Given that
$$\lambda(t_i \mid v_i, x_i) = \frac{f(t_i \mid v_i, x_i)}{1 - F(t_i \mid v_i, x_i)},$$
we can find
$$f(t_i \mid v_i, x_i) = \lambda(t_i \mid v_i, x_i)\,\bigl(1 - F(t_i \mid v_i, x_i)\bigr), \quad \text{where} \quad 1 - F(t_i \mid v_i, x_i) = e^{-v_i g(x_i) \Lambda_0(t_i)},$$
and
$$f(t_i \mid x_i) = \int_{0}^{\infty} f(t_i \mid v_i, x_i)\,dG(v_i),$$
in which we already implicitly assume that the $v_i$ are independent of $x_i$.
For convenience, we assume that $\lambda_0(t_i)$, $g(x_i)$, $v_i$, and the distribution $G$ of $v_i$ in the population
satisfy some regularity assumptions ($G$ is usually taken to be a Gamma distribution or a discrete
distribution with mass points).
 Estimating hazard rate and survival functions
   To estimate the parameters of the various models of hazard requires a sample of observed
durations. So, for example, to model unemployment duration requires that we collect a sample
of observations of time spent in unemployment from the beginning of a period of
unemployment to its end.
    It is worth stating here that a number of complications can arise in the collection of data
which can complicate duration analysis.
    - One may not observe a completed duration for a subset of the sample, leading to a
problem of "right-censoring".
    - One may not observe the start of events. So, individuals may enter the period of the
sample in a state of unemployment, before exiting to employment at some point within the
period of observation (so-called "left-censoring").
    - The reported duration of unemployment may not be a continuous event. One may, for
example, observe multiple spells of unemployment for a given individual over the period of the
sample.
    Some of these problems are particularly difficult to accommodate in a statistical model of
duration. We shall concentrate on the right-censoring of duration data.
 The case of no censoring
    Consider a sample of $n$ observed durations $t_1, t_2, \ldots, t_n$ within a given sample period.
Furthermore, suppose that all observations $t_i$ represent a completed duration. Given a
parametric hazard function $\lambda(t_i; \theta)$ based on a set of parameters $\theta$, the general density function
for the completed duration $t_i$ is
$$f(t_i; \theta) = \lambda(t_i; \theta)\, S(t_i; \theta)$$
and the likelihood function for completed durations is
$$L(\theta) = \prod_{i=1}^{n} f(t_i; \theta) = \prod_{i=1}^{n} \lambda(t_i; \theta)\, S(t_i; \theta)$$
and a corresponding log-likelihood
$$l(\theta) = \sum_{i=1}^{n} \ln f(t_i; \theta) = \sum_{i=1}^{n} \ln \lambda(t_i; \theta) + \sum_{i=1}^{n} \ln S(t_i; \theta).$$
Maximum likelihood estimates $\hat{\theta}$ for a particular parametric hazard rate specification then
follow straightforwardly, via the optimisation of the log-likelihood function.
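
For the constant-hazard (exponential) model the log-likelihood reduces to $n \ln \lambda_0 - \lambda_0 \sum_i t_i$, which is maximised at $\hat{\lambda}_0 = n / \sum_i t_i$. The sketch below evaluates this for a handful of made-up completed spells; the data are invented for illustration only.

```python
import math

def exp_loglik(lam0, durations):
    """l(lam0) = sum of ln(lambda) + ln S(t_i) = n*ln(lam0) - lam0*sum(t_i)."""
    return sum(math.log(lam0) - lam0 * t for t in durations)

def exp_mle(durations):
    """Closed-form maximiser n / sum(t_i) for completed durations."""
    return len(durations) / sum(durations)

durations = [2.0, 5.5, 1.0, 9.0, 3.5, 4.0]  # made-up completed spells
lam_hat = exp_mle(durations)
print(lam_hat, exp_loglik(lam_hat, durations))
```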
 The problem of right-censoring
    Suppose now that some among the sample $t_1, t_2, \ldots, t_n$ of observed durations are
incomplete (that is to say, right-censored). Then, we know that, for the censored observations,
$T > t_i$ with probability $S(t_i)$. In the same way as for the treatment of censored observations in
a Tobit analysis, we can include this piece of information in the likelihood for all observations
for which the duration is incomplete. So, define the following indicators:
$$\delta_i = \begin{cases} 1 & \text{if the observed duration is completed} \\ 0 & \text{if the observed duration is right-censored.} \end{cases}$$
Then, the likelihood for a parametric hazard function in the presence of right-censored data is
$$\begin{aligned}
L_c(\theta) &= \prod_{\delta_i = 1} f(t_i; \theta) \prod_{\delta_i = 0} S(t_i; \theta) \\
            &= \prod_{\delta_i = 1} \lambda(t_i; \theta)\, S(t_i; \theta) \prod_{\delta_i = 0} S(t_i; \theta)
\end{aligned}$$
with corresponding log-likelihood
$$\begin{aligned}
l_c(\theta) &= \sum_{\delta_i = 1} \ln f(t_i; \theta) + \sum_{\delta_i = 0} \ln S(t_i; \theta) \\
            &= \sum_{\delta_i = 1} \ln \lambda(t_i; \theta) + \sum_{i = 1}^{n} \ln S(t_i; \theta).
\end{aligned}$$

 The problem of left-censoring
    The converse form of censoring (and one which is inherently more difficult to deal with)
relates to the concept of "left-censoring", whereby durations $t_i$ are only observed conditional
on $t_i > l_i$ for some $l_i > 0$. If we knew the value of $l_i$ for each left-censored observation, then
the appropriate treatment in estimation would require that the density be conditioned on the
event $t_i > l_i$. The conditional density for left-censored data would be
$$f(t_i \mid t_i > l_i; \theta) = \frac{\lambda(t_i; \theta)\, S(t_i; \theta)}{S(l_i; \theta)},$$
leading to a log-likelihood (in the absence of right-censoring) of the form
$$l_{lc}(\theta) = \sum_{i=1}^{n} \ln \lambda(t_i; \theta) + \sum_{i=1}^{n} \bigl[\ln S(t_i; \theta) - \ln S(l_i; \theta)\bigr]
               = \sum_{i=1}^{n} \ln f(t_i; \theta) - \sum_{i=1}^{n} \ln S(l_i; \theta).$$
Of course, this likelihood can only be evaluated if $l_i$ is known for each observed duration, a
circumstance which is unlikely in most cases.
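
When the $l_i$ are known, the conditional log-likelihood is easy to evaluate. As a hedged illustration, the sketch below uses the constant-hazard model, for which each observation contributes $\ln \lambda_0 - \lambda_0 (t_i - l_i)$ and the maximiser is $n / \sum_i (t_i - l_i)$. The durations and the values in `entry` (standing in for the $l_i$) are invented.

```python
import math

def left_conditioned_exp_loglik(lam0, durations, entry):
    """Sum of ln(lambda) + ln S(t_i) - ln S(l_i) for the constant-hazard model."""
    return sum(math.log(lam0) - lam0 * t + lam0 * l
               for t, l in zip(durations, entry))

def left_conditioned_exp_mle(durations, entry):
    """Closed-form maximiser n / sum(t_i - l_i)."""
    return len(durations) / sum(t - l for t, l in zip(durations, entry))

durations = [6.0, 8.0, 3.0, 12.0]  # made-up durations t_i
entry = [2.0, 5.0, 1.0, 4.0]       # made-up conditioning points l_i
lam_hat = left_conditioned_exp_mle(durations, entry)
unconditioned = len(durations) / sum(durations)  # ignores the conditioning on t_i > l_i
print(lam_hat, unconditioned)
```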
