# Introduction to Survival Analysis


```

October 13 & 20, 2009

Brian F. Gage, MD, MSc
with thanks to Bing Ho, MD, MPH
Dept. of Medicine
Washington University in St. Louis
Goal: Conceptual and Graphical Understanding of Survival Analyses
   What is survival analysis?
• When to use it?
• How it compares to alternative statistics
   Univariate method: Kaplan-Meier curves
   Multivariate method: Cox proportional hazards model
   Assessment of adequacy of model
Sample Kidney Transplant (Tx) Data
PID   Years        Donor       Tx Fails
2   10.1604   Living-related       1
4    4.2967   Living-related       1
6    2.3644   Living-related       1
8    2.8048   Living-related       1
10    5.4670   Living-related       1
12    3.7554   Living-related       1
Univariate analysis of Tx survival in
[Histogram: x-axis Years (0 to 20), y-axis Percent (0 to 30)]
Mean = 1.9 years; Median = 1.3 years
Univariate analysis of Tx survival in recipients of living-related kidney
Mean = 3.0 years; Median = 2.15 years
How Would You Analyze Those Data?
   All 2000 simulated pts. were followed until
time of rejection or Tx failure.
   <write your data analysis plan here>
Univariate analysis of logarithm (Tx survival) in
recipients of living-related kidney
Univariate analysis of logarithm (Tx survival) in
Comparisons of Log (Tx Survival)
Variable   Method          Variances   DF     Pr > |t|
LnYears    Pooled          Equal       1998   <.0001
LnYears    Satterthwaite   Unequal     1988   <.0001

Variable   Method                                  Two-Sided Pr > |Z|
LnYears    Wilcoxon/Mann-Whitney Two-Sample Test   <.0001
Suppose You Only Have Time/Money to
Follow Participants for 4.5 Years or that
some Patients Enrolled Late
PID    Years         Donor      Tx Fails
2     4.5+      Living-related      0
4     4.2967    Living-related      1
6     2.3644    Living-related      1
8     2.8048    Living-related      1
10     4.5+      Living-related      0
12     3.7554    Living-related      1
Univariate analysis of Tx survival in

Data censored at 4.5 years
Univariate analysis of Tx survival in
recipients of living-related kidney

Data censored at 4.5 years
Now, Survival Times are Censored
   A t-test is no longer appropriate
   We don't know how long patients will survive past the observation window
   We can't compute the mean (or SD) of survival time in the 2 cohorts
• although we may be able to observe medians
To Analyze Censored Data, We Need to
Use Time-to-Event Analysis, Such as S(t)
   Survivor function, S(t), defines the probability of surviving longer than time t
• Known as Kaplan-Meier curves or the product-limit estimate
• Does not account for other covariates
   Model time to failure or time to event
• Survival analysis has a dichotomous (binary) outcome
• Unlike logistic regression, survival analysis analyzes the
time to an event
   Able to account for censoring
• But not covariates
• When is this OK?
   Can compare survival between 2+ groups
Kaplan-Meier Plots of Kidney Tx
[Figure: Kaplan-Meier curves, y-axis S(t); median survival marked; one curve labeled "Living-Related Donor"; P < .0001]

How to Compare Kaplan-Meier Curves?
   Hypothesis test (test of significance)
• H0: the survival curves are the same
• HA: the survival curves are different
   Compares: observed to expected cell counts
• Test statistic is compared to the χ² distribution
   Do you weight each failure equally?
• Yes ==> Log-Rank (Mantel-Haenszel) Test
   Or do you weight early failures more?
• Yes ==> Generalized Wilcoxon (Breslow) Test
Time to Cardiovascular Adverse Event in VIGOR Trial
[Figure: cumulative incidence curves, y-axis 1-S(t); P < .001]
Censoring is Variable
   Subject does not experience event of interest
   Incomplete follow-up
• Lost to follow-up
• Withdraws from study
• Death (if not an endpoint)
[Figure: patient follow-up timelines, several ending in "Death"]
Importance of censored data
   Why are censored data important?
   In a Cox model, what is the key assumption of censoring?
When to use Survival Analysis
   When one suspects that 1+ explanatory
variable(s) explains the differences in time to
an event
   Examples
• Time to death or clinical endpoint
• Time in remission after treatment of disease
• Recidivism rate after addiction treatment
   Especially when follow-up is incomplete or
variable
[Figure: survival curves for Warf, ASA, and No Rx; Age 76 Years and Older (N = 394); x-axis: Days Since Index Hospitalization (0 to 900); y-axis: 0.0 to 1.0; P = .0001]
Gage B et al. Adverse outcomes and predictors of underuse of antithrombotic therapy
in Medicare beneficiaries with chronic atrial fibrillation. Stroke 2000;31:822-7.
Limitation of Kaplan-Meier curves
   What happens when you have several covariates that
you believe contribute to survival?
   Example
• Smoking, hyperlipidemia, diabetes, and hypertension contribute to time to myocardial infarction or stroke.
   Can use stratified K-M curves, but only for 2 or
maybe 3 categorical covariates.
   Need another approach – Cox proportional hazards
model is most common for many covariates, esp.
continuous ones
Multivariate method: Cox proportional hazards

   Can assess the effect of multiple covariates on
survival
   Cox-proportional hazards is the most
commonly used multivariate survival method
• Easy to implement in SPSS, Stata, JMP, SAS, or R
• Parametric approaches are an alternative, but they
– They yield a closed eqn. for S(t) and H(t)
Cox model:
Proportional hazard assumption
   Hazard Ratio (HR) = exp(B) is a multiplicative
risk—this is the proportional hazard
assumption
   Can handle both continuous and categorical
predictor variables
   Can stratify results using a categorical variable
   Cox models distinguish individual
contributions of covariates on survival.
Hazard Rate h(t)
h(t) = (# of pts. dying per unit time in the interval) / (# of pts. alive at t)

h(t) is called the "hazard rate," "hazard function," "conditional failure rate," or "instantaneous failure rate."
The Hazard Rate h(t)

h(t) = lim(Δt→0) [Δ(1 − S(t)) / Δt] / S(t)

[Figure labels: Δ(1 − S(t)) and Δt]
Cox proportional hazard model

   Separates baseline hazard function (ho(t),
which can be any shape) from covariates
• Baseline hazard function over time:
h(t) = ho(t)exp(B1X+Bo)
• Covariates are usually time independent
– But they can be time dependent
• B1 is used to calculate the hazard ratio, which is similar to the relative risk
   Semiparametric
Time to Cardiovascular Adverse Event in VIGOR Trial
Should be Summarized w/ a Single HR, Instead of:
[Figure annotations: RR = 2.6, RR = 2.4, RR = 1.9, RR = 1.9, RR = 1.2]
Use These 2 Eqns. to Show How the Hazard
Ratio Changes when Binary Factor X Is
Present (X=1) Rather than Absent (X=0)
   h(t) = ho(t) exp(B1X+Bo)
   Hazard ratio (HR) = h(t|X=1) / h(t|X=0)
• Hint: exp(a) / exp(b) = exp(a-b)

   Relative risk reduction, RRR, = 1-HR
Cox proportional hazards models
   Hazard Ratio (HR) = exp(B) is a multiplicative risk; this is the proportional hazard assumption
• Violations can sometimes be compensated for by using an interaction term
   Can handle both continuous and categorical predictor variables
   Can stratify results using a categorical variable
• No distribution assumption is required in that case
Output of Cox Proportional Hazard Model
From Simulated Kidney Tx Data

Analysis of Maximum Likelihood Estimates
                 Parameter   Standard
Parameter   DF   Estimate    Error      Chi-Square   Pr > ChiSq
Donor       1    0.474       0.0493     92.3         <.0001

            Hazard   95% Hazard Ratio
Parameter   Ratio    Confidence Limits
Donor       1.61     1.46     1.77

Thus, cadaveric Tx had a 61% higher hazard of failure (HR = 1.61).
Limitations of Cox PH model
   Normally, does not include variables that change over time
• Luckily, many variables (e.g., gender, ethnicity, congenital condition, and birth year) are constant
Example: Tumor Extent
   3000 patients derived from SEER cancer
registry and Medicare billing information
   Explore the relationship between tumor extent
and survival
   Hypothesis is that more extensive tumor
involvement is related to poorer survival
Log-rank χ² = 269, p < .0001
Example: Tumor Extent
   Tumor stage may not be the only covariate that
affects survival
• Medical comorbidities & poor functional status may
be associated with poorer outcome
• Ethnicity and gender may contribute
• Tumor grade and genotype may contribute
• Etc.
   Cox proportional hazards model could quantify
these relationships
Summary of Kaplan-Meier Curves
   Model time to failure or time to event
• Survival analysis has a dichotomous (binary) outcome
• Unlike logistic regression, survival analysis analyzes the time to an event
   Able to account for censoring
   Can compare survival between 2+ groups
Summary of time-to-event analyses
   Quantifies time to a single, dichotomous event
   Handles censored data well
   Cox models distinguish individual contributions of
covariates on survival, provided certain assumptions are
met
   Cox models are used commonly in outcomes research.
• E.g. Math 434 - Survival Analysis or
http://k30.im.wustl.edu/program/interm%20biostats%20syllabus.doc
• BST.520 Survival Data Analysis at SLU

```
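The product-limit (Kaplan-Meier) estimate of S(t) described in the slides can be illustrated on the six censored kidney Tx rows shown above. This is a minimal sketch rather than the author's analysis: the function name `kaplan_meier` and the variable names are hypothetical, and the `4.5+` entries are coded as time 4.5 with `Tx Fails = 0` (censored).

```python
def kaplan_meier(times, events):
    """Product-limit estimate: return [(t, S(t))] at each distinct failure time.

    times  -- follow-up time for each patient
    events -- 1 if the Tx failed at that time, 0 if the patient was censored
    """
    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = 1.0
    curve = []
    for t in failure_times:
        # Patients still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        # Failures observed exactly at time t
        failures = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        # Multiply by the conditional probability of surviving past t
        surv *= 1.0 - failures / at_risk
        curve.append((t, surv))
    return curve


# The six sample rows censored at 4.5 years (PIDs 2, 4, 6, 8, 10, 12)
years = [4.5, 4.2967, 2.3644, 2.8048, 4.5, 3.7554]
tx_fails = [0, 1, 1, 1, 0, 1]

km_curve = kaplan_meier(years, tx_fails)
for t, s in km_curve:
    print(f"t = {t:.4f} years  S(t) = {s:.3f}")
```

In this tiny sample the estimate first reaches 0.50 at 3.7554 years. The two censored patients never trigger a drop in the curve, but they still count in the at-risk denominator until 4.5 years, which is how the method uses incomplete follow-up.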
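The hazard ratio and confidence limits in the Cox output for the simulated kidney Tx data can be reproduced from the parameter estimate and its standard error, using HR = exp(B) from the slides. This sketch assumes the limits are a Wald interval, exp(B ± 1.96 × SE); the slides do not state which interval the software used, and the variable names are hypothetical.

```python
import math

# Values copied from the "Analysis of Maximum Likelihood Estimates" table
beta = 0.474        # parameter estimate for Donor
std_err = 0.0493    # standard error of the estimate

# Assumption: 95% Wald interval on the log-hazard scale (z = 1.96)
z = 1.96

hr = math.exp(beta)
ci_low = math.exp(beta - z * std_err)
ci_high = math.exp(beta + z * std_err)

print(f"HR = {hr:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

Rounded to two decimals, these values agree with the 1.61 (1.46, 1.77) reported in the output table.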