Using Survival Analysis for Diffusion Studies

Wynne W. Chin
University of Houston

Do Not Mistake This Technique for
Survivor Analysis

Main problems dealing with
longitudinal data such as diffusion
1. Censored data
2. Time-varying explanatory variables

Censoring is the more common
problem
n The predictor or explanatory variables are usually measured at a specific time point.
n But the dependent variable of interest may not yet have occurred during the observation period.

Example of problematic data
n Study of recidivism of, say, 500 inmates released from prison.
– After 12 months we obtain data on whether and when any arrests occurred.
– We also have potential explanatory variables such as age at release, education, race, and prior work experience.

Problem 1 - analyzing censored data
n Nothing is special about a one-year period as an end point – 6 months or 1.5 years may be just as good.
n If we create a dichotomous 1 vs. 0 for arrested vs. not arrested, we cannot meaningfully run a regression and we lose much information on either side of the one-year mark.
n If we measure the length of time before an arrest, we run into the censoring problem for people who have not been arrested after 1 year. This might be acceptable if the number left over is small.

Problem 2 – time-varying predictors
(let's assume no censored cases)
n What if individuals were interviewed monthly during a 1-year follow-up?
n We obtain measures of income, marital status, and employment status each month and can see changes over time.
n Do we use 12 different income measures in a multiple regression?
n That wouldn't work for the person arrested during the first month – income drops to nil. If in prison, income is a consequence and not a cause of recidivism.

What Do We Call This Phenomenon
& the Associated Techniques for
Evaluating It?
n Event history analysis – Allison, P.D. (1984). Event History Analysis: Regression for Longitudinal Event Data. Sage monograph number 46.
n Survival analysis – Cox, D.R. & Oakes, D. (1984). Analysis of Survival Data. London: Chapman & Hall.

Background
n Survival or event history techniques formed in part from demographic and medical research interested in analyzing survival data.
n For example, an animal is exposed to different doses of toxic chemicals and observed to see how long until the "event" of death occurs. Censoring occurs when the experiment ends before all the animals die.
n In a different camp, we have engineers doing "reliability" or "failure time" analysis.

Background
n In the meantime, we social scientists were unaware and inappropriately running regressions, etc., until the mid-70s, when Tuma integrated Markov theory with explanatory variables into a continuous-time model.

n Distributional versus regression methods – early work studied the distribution of time to an event or time between events (i.e., life tables or Markov processes). Recent work links the occurrence of an event to a linear function of explanatory variables.
n Repeated versus non-repeated events – deaths represent single, non-repeatable events; job changes or marriages can occur many times.

n Single vs. multiple kinds of events – it is easy to treat all events as the same, yet job terminations are not all the same. IT usage, likewise, can be voluntary or involuntary. In evaluating cancer treatment effectiveness, we need to separate deaths due to cancer from deaths from other causes.
n Parametric versus non-parametric – biostatisticians favor non-parametric methods, while engineers and social scientists assume that times until an event come from specific distributional families (e.g., Weibull or Gompertz).

n Parametric versus non-parametric (continued) – Cox (1972) provided a bridge between these two approaches via the proportional hazards model, described as semi- or partially parametric. The regression model follows a specific form, but the distributional form of event times does not.
n Discrete versus continuous time – are times of events assumed to be measured exactly, calling for continuous-time methods? If event occurrence is measured in larger time units like months or years, consider it discrete. Continuous-time methods predominate across disciplines (e.g., sociology, engineering, biostatistics).

Discrete Time Example
n Single, unrepeated event.
n 200 newly minted male professors who begin their careers as assistant professors.
n Observe them every year for 5 years.
n Event of interest – switching jobs.
n Although actually a repeatable event, we will treat leaving the first job as different from leaving later jobs.

Discrete Time Example

Year    # changing jobs   # at risk   Estimated hazard rate
1       11                200         .055
2       25                189         .132
3       10                164         .061
4       13                154         .084
5       12                141         .085
>5      129
Total   200               848

(From Allison, 1984)

Discrete Time Example
n Events are in discrete time since we only know the year a job switch occurred.
n We don't know if switches were voluntary or involuntary – thus we treat them as a single kind of event.
n We see 129 professors did not change jobs – hence censored data.
n Objective: estimate a regression model to determine the probability of a job change in a one-year period based on 5 independent variables.

Discrete Time Example
n 2 independent variables assumed constant:
– Prestige of the department
– Federal funds allocated to the institution for research
n 3 variables measured each year:
– Cumulative # of published articles
– # of citations made by other researchers
– Academic rank (0 = assistant, 1 = associate)

Discrete Time Example
n Two key concepts – risk set and hazard rate.
n The risk set is the set of individuals at risk of event occurrence at each point in time (e.g., all 200 professors during the first year).
n The hazard rate (or simply hazard) is the probability that an event will occur at a particular time to a particular individual (assuming that individual is at risk at the time).
n The hazard rate is a latent notion – but it is viewed as controlling both the occurrence and timing of events.

Discrete Time Example
n Assume the hazard rate can vary over time but is the same for all individuals at each time period.
n So at year 2 we have 25 jumpers out of 189. The estimated hazard is 25/189 = 0.132.
n Looking back at the previous table, the hazard shows no clear trend over time. Note that the total number of jumps can decline over time while the hazard increases, since the risk set is also decreasing.
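The year-by-year hazard estimates in the table can be reproduced directly from the counts (taken from Allison, 1984); a minimal sketch in Python:

```python
# Year-by-year counts from the job-change table (Allison, 1984).
changed = [11, 25, 10, 13, 12]        # professors changing jobs in years 1..5
at_risk = [200, 189, 164, 154, 141]   # risk set at the start of each year

# Discrete-time hazard: events in year t divided by the risk set in year t.
hazard = [c / r for c, r in zip(changed, at_risk)]

for year, h in enumerate(hazard, start=1):
    print(f"Year {year}: hazard = {h:.3f}")
```

Note that each year's risk set is the previous year's risk set minus the previous year's job changers, which is why the hazard can rise even as the raw count of changes falls.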


Discrete Time Example
n For simplicity, let's assume 2 explanatory variables – one constant, one time-varying.
n Start with P(t) = the hazard, i.e., the probability an individual has an event at time t.
n P(t) = a + b1X1 + b2X2(t) for t = 1, 2, …, 5
n Now do a logit transformation to eliminate probabilities greater than one or less than zero:
n log [P(t)/(1 - P(t))] = a + b1X1 + b2X2(t)
n Now the left side varies from minus to plus infinity, and b1 and b2 represent how a one-unit change in X1 or X2 affects the log-odds.

Discrete Time Example
n The formula still has X2 as the only time-varying term.
n What about the possibility that the hazard rate changes automatically over time?
n Maybe individuals become more invested in a job over time – the associated costs of moving increase (inertia argument).
n You can allow for this with a time-varying intercept:
n log [P(t)/(1 - P(t))] = a(t) + b1X1 + b2X2(t)

Discrete Time Example
n Estimating requires creating person-time data units.
n Each individual contributes potentially 5 person-years.
n An individual who changed jobs in year 3 contributes 3 person-years' worth of cases.
n For each person-year, the person is coded 1 if he changed jobs, 0 otherwise.
n The explanatory variables are assigned the values they took on in each person-year.
n In our example, we pool 848 person-years into a single sample and run an ML logit analysis.
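A minimal sketch of the person-year expansion described above (the three records and their covariate values are hypothetical, for illustration only); the pooled rows could then be fed to any ML logit routine, e.g. statsmodels' Logit:

```python
# Expand per-person records into person-year rows for a discrete-time logit.
# Each record: years observed, whether the event occurred in the final year,
# a time-constant covariate X1, and a per-year time-varying covariate X2.
people = [
    {"years": 3, "event": True,  "X1": 0.8, "X2": [1, 2, 3]},  # changed jobs in year 3
    {"years": 5, "event": False, "X1": 0.2, "X2": [0, 1, 1, 2, 2]},  # censored
    {"years": 2, "event": True,  "X1": 0.5, "X2": [4, 4]},
]

rows = []
for p in people:
    for t in range(1, p["years"] + 1):
        rows.append({
            "year": t,
            "X1": p["X1"],             # time-constant covariate, repeated each year
            "X2": p["X2"][t - 1],      # value X2 took on in that person-year
            # coded 1 only in the year the event occurred, 0 otherwise
            "y": int(p["event"] and t == p["years"]),
        })

print(len(rows))                   # 3 + 5 + 2 = 10 person-years
print(sum(r["y"] for r in rows))   # 2 events
```

The censored person contributes five rows, all coded 0, which is exactly what is known about him.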

Discrete Time Example
n Two problems are solved via this procedure.
n Individuals whose time to first job change is
censored contribute exactly what is known
about them (i.e., they didn’t change jobs in any
of the five years of observation).
n Time-varying explanatory variables are easily included because each year at risk is a distinct observation.

Discrete Time Example

                         Model 1              Model 2
Explanatory variable     b         t          b         t
Prestige of dept.        0.045     -0.21      0.56      0.26
Funding                  -0.077    -2.45*     -0.078    -2.47*
Pubs                     -0.021    -0.75      -0.023    -0.79
Cites                    0.0072    2.44*      0.0069    2.33*
Rank                     -1.4      -2.86**    -1.6      -3.12**
Yr1 (D)                                       -0.96     -2.11*
Yr2 (D)                                       -0.025    -0.06
Yr3 (D)                                       -0.74     -1.60
Yr4 (D)                                       -0.18     -0.42
Constant                 4.95
Log-likelihood           -230.95              -226.25

Discrete Time Example
n Model 2 allows the hazard rate to change during each of the 5 years.
n Done via 4 dummy variables.
n Interpreted relative to the log odds of the 5th year.
n No clear pattern found.
n Check by examining twice the difference in the log-likelihoods – 9.4 on 4 degrees of freedom. From the chi-square table, this falls just below the 0.05 critical value.
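The likelihood-ratio check can be reproduced from the two log-likelihoods in the table; for 4 degrees of freedom the chi-square tail probability has a closed form, so no statistics library is needed:

```python
from math import exp

# Log-likelihoods of the two fitted models (from the table).
ll_model1 = -230.95
ll_model2 = -226.25

# Likelihood-ratio statistic: twice the difference in log-likelihoods.
lr_stat = 2 * (ll_model2 - ll_model1)   # 9.4

# Chi-square survival function for 4 df: P(X > x) = exp(-x/2) * (1 + x/2).
p_value = exp(-lr_stat / 2) * (1 + lr_stat / 2)
print(f"LR = {lr_stat:.1f}, df = 4, p = {p_value:.3f}")
```

The statistic of 9.4 sits just under the 4-df critical value of 9.49, so the year dummies fall just short of significance at the 0.05 level.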

Problems with Discrete Approach
n A large sample with many discrete units of time can become unwieldy.
n In our example, switching to person-months would yield a sample of almost 10,000 cases.
n Work around this by using log-linear models if all explanatory variables are categorical (computation is based on the # of cells in the contingency table).
n Or use OLS instead of ML logit.

Results from Discrete Approach
n The discrete-time approach described here will virtually always give results similar to continuous-time methods.
n As the time units get smaller, the model and associated equation converge to the proportional hazards model discussed next.
n Choice depends on computational costs and convenience.
n Choose continuous time if there are no time-varying explanatory variables, since it doesn't require the observation period for each person to be divided into distinct units. Otherwise, relative costs and convenience are comparable.

Proportional Hazard Models
n Hazard: h(t) = lim s→0 P(t, t+s)/s
n This is not really a probability since it has no upper bound. It is an instantaneous rate.
n The expected length of time until an event occurs is 1/h(t). So an h(t) of 1.25 implies an event will likely occur in 0.80 time units.
n Think of hazards in terms of two people. If the first person's hazard is 0.5 and the second's is 1.5, the second person is 3 times as likely to experience the event at any moment.
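The reciprocal and ratio relationships above are simple arithmetic, but worth making explicit:

```python
# Expected waiting time is the reciprocal of a constant hazard.
h = 1.25
expected_wait = 1 / h
print(expected_wait)   # 0.8 time units

# Hazards compare as ratios: person 2's event rate is triple person 1's.
h1, h2 = 0.5, 1.5
ratio = h2 / h1
print(ratio)           # 3.0
```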

Proportional Hazard Models
n We almost always view the hazard rate as a function of time (e.g., time since the last event, or age of the person).
n The hazard for arrests decreases after age 25. The hazard for retirement increases with time.
n The hazard can also be U-shaped. The hazard of death is high right after birth, falls during the early years, and begins to go up during late middle age.
n Thus, the hazard rate function chosen is one of the key differences among methods for continuous time data.

Parametric Proportional Hazard Models
n Need to specify h(t) based on time and explanatory variables.
n One approach is a linear function – but use the log to ensure h(t) cannot be less than zero:
n log h(t) = a + b1X1 + b2X2
n This is the exponential model – the hazard is constant over time.
n But we may want the hazard to increase or decrease with time to conform to events like job switching (decreases due to job investment) or death (increases due to aging).

Parametric Proportional Hazard Models
n log h(t) = a + b1X1 + b2X2 + ct
n This is called the Gompertz regression model since it results in a Gompertz distribution for the time until event occurrence. Note that c can be positive or negative.
n If we instead model the log hazard as changing linearly with the log of time, we get the Weibull distribution for the time until the next event:
n log h(t) = a + b1X1 + b2X2 + c log t
n with c constrained to be > -1.

Parametric Proportional Hazard Models
n All three models differ only in how time enters the model.
n Weibull and Gompertz require different estimation procedures, and neither allows for U-shaped or inverted-U-shaped hazards.
n Also, none allow for a random disturbance term.
n There is randomness in terms of the latent h(t) and the observed length of the time interval.
n Choice of model depends on substantive knowledge, theory, mathematical convenience, and empirical evidence.

Proportional Hazard Models
n Up to now, we needed to determine how the hazard rate depends on time.
n This is difficult if the hazard is non-monotonic.
n The previous models also do not allow for explanatory variables that change over time.
n David Cox solved the problem in 1972.
n Referred to simply as the "proportional hazards model," Cox's approach is a simple generalization of the previous parametric models.

Proportional Hazard Models
n Let's start without time-varying explanatory variables.
n Use two time-constant variables:
n log h(t) = a(t) + b1X1 + b2X2
with a(t) being any function of time.
n Because this function need not be specified, the model is considered partially parametric or semi-parametric.
n It is called a proportional hazards model because for any two individuals at any point in time, the ratio of their hazards is a constant.
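Because a(t) appears in both numerator and denominator, it cancels in the ratio of two individuals' hazards; a quick numeric check with an arbitrary baseline and hypothetical coefficients:

```python
from math import exp, sin

# log h(t) = a(t) + b1*X1 + b2*X2, with an arbitrary baseline a(t).
b1, b2 = 0.4, -0.2

def hazard(t, x1, x2):
    a_t = sin(t) + 0.05 * t   # any function of time will do
    return exp(a_t + b1 * x1 + b2 * x2)

# Two individuals with different (time-constant) covariates.
ratios = [hazard(t, 1.0, 2.0) / hazard(t, 3.0, 0.0) for t in (0.5, 2.0, 7.0)]
print(ratios)   # the same value at every t: the baseline a(t) cancels
```

The ratio equals exp(b1·(1.0 − 3.0) + b2·(2.0 − 0.0)) = exp(−1.2) no matter what a(t) is.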

Proportional Hazard Models
n In other words, for persons j and k at any time t: hj(t)/hk(t) = c
n Estimation is done by a method called partial likelihood.
n It separates the model into two parts.
n The first contains info only about b1 and b2.
n The other contains info on b1, b2, and a(t).
n Discard the second part and treat the first as an ordinary likelihood function. The first part depends on the order in which events occur, not on the exact times of occurrence.

Proportional Hazard Models
n Estimates are asymptotically unbiased and normally distributed.
n A model with 2 explanatory variables (1 constant and 1 time-varying) would look as follows:
n log h(t) = a(t) + b1X1 + b2X2(t)
with a(t) being any function of time.
n You can also lag the time-varying variable if you believe there is a delay of, say, 2 months:
n log h(t) = a(t) + b1X1 + b2X2(t - 2)

Proportional Hazard Models
n Estimation uses the same partial likelihood approach, but the algorithms for maximizing the likelihood function are more complex.
n CPU time increases by a factor of 10 when including one time-varying explanatory variable.

Example - Recidivism
n Financial aid (D)
n Age at release
n Black (D)
n Work experience (D)
n Married (D)
n Paroled (D)
n Prior arrests
n Age at earliest arrest
n Education
n Weeks worked – number of weeks employed during the first 3 months
n Worked (D) – employment status in each time period

Proportional Hazard Models

                         Exponential          Proportional         Time-dependent
                                              Hazard               Proportional Hazard
Explanatory variable     b         t          b         t          b         t
Financial aid (D)        -0.325    -1.69      -0.337    -1.76      -0.333    -1.74
Age at release           -0.067    -2.89**    -0.069    -2.94**    -0.064    -2.78**
Black (D)                0.280     0.90       0.286     0.92       0.354     1.13
Work experience (D)      -0.117    -0.53      -0.122    -0.55      -0.012    -0.06
Married (D)              -0.414    -1.08      -0.426    -1.11      -0.334    -0.87
Paroled (D)              -0.037    -0.19      -0.035    -0.18      -0.075    -0.38
Prior arrests            0.095     3.21**     0.101     3.36**     0.100     3.31**
Age at earliest arrest   0.080     2.3*       0.071     2.35*      0.077     2.48*
Education                -0.263    -1.96*     -0.264    -1.96*     -0.293    -2.12*
Weeks worked             -0.039    -1.76      -0.039    -1.78
Worked (D)                                                         -1.392    -5.65**
Constant                 -3.869

Interpretation of coefficients
n -0.067 for age at release means that each additional year of life reduces the log of the hazard by 0.067, controlling for other variables.
n Or raise e (i.e., 2.718) to the b power. Then for each unit increase in the explanatory variable, the hazard is multiplied by that exponentiated number.
n Or compute 100 * [exp(b) - 1], which gives the percentage change in the hazard with each one-unit change in the explanatory variable.
n So 0.095 for prior arrests leads to exp(0.095) = 1.10, or a 10 percent increase in the hazard for each additional prior arrest.
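The percentage-change rule above can be checked directly with the coefficients from the table:

```python
from math import exp

# 100 * (exp(b) - 1) is the percent change in the hazard per one-unit
# change in the explanatory variable, holding the others constant.
def pct_change(b):
    return 100 * (exp(b) - 1)

print(round(pct_change(0.095), 1))   # prior arrests: ~ +10.0% per arrest
print(round(pct_change(-0.067), 1))  # age at release: ~ -6.5% per year
```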

Difference For Time-dependent
Proportional Hazards Model
n Note the big difference when dropping weeks employed during the first three months after release.
n It is likely a surrogate for a time-varying explanatory variable.
n It is replaced with a dummy variable indicating employment status at each time period. While results are the same for the other variables, we now see employment status as the most important variable. Exponentiating -1.392 yields 0.25 – being employed cuts the hazard of arrest to a quarter.

The End?
