# Survival Analysis

Document Sample

```					                        Survival Analysis Introduction
(AKA Event History Analysis)

Situation

Analysis of data in which the dependent variable consists of two aspects –
whether or not the outcome occurred and the amount of time that elapsed
before occurrence of the outcome.

The typical situations with this type of dependent variable

Medical literature

The survival time and whether or not death occurs in response to treatment
of disease.

Group A given Drug A.
Group B given Drug B.

Are survival rates/times different in the two groups?

Turnover literature

The time to quit and whether or not employees quit in response to
organizational conditions.

High Job Satisfaction
Low Job Satisfaction

Are turnover rates/times different across levels of job satisfaction?

Survival Analysis – 1                 Printed on 10/9/2008
Two aspects of survival: One Dichotomous and one Continuous

Dichotomous: Dying / Leaving the company

Continuous: Length of survival / Time at the company

These two aspects – proportion dying/turning over and “average” time
before death/turnover – are correlated but not identical aspects of the
response to treatment. That is, the analysis of only one aspect (e.g.,
number of deaths) won’t necessarily yield the same result as the analysis of
the other (e.g., time to death).

They’re not identical because our data are gathered only
during a window of observation.

At some time, we begin gathering data. At some later time we stop.

But this necessarily results in incomplete information on some participants
– those who are still alive/working when the window of observation is
closed and those who were already sick/working when the window opened.

If the window of observation were infinitely long, then everyone would
have died/turned over, so the point would be moot. We would compare
survival times and base our conclusions on those comparisons.

Or, if those who didn’t die, lived forever, we could simply compare death
rates and base our conclusions on those comparisons.

However, the exigencies of research require that our windows of
observation be finite. This means that survival time and death rates are
both imperfect measures. This requires that we take the correlation
between outcome and duration into account in our analyses.

Incomplete Analysis 1: Analysis of only the outcomes – deaths or quits.

Use logistic regression.
(Use linear regression in a pinch praying that the God of statistics won’t
strike you down).
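
* A minimal sketch of this outcome-only analysis in SPSS syntax.
* The variable names outcome (1 = event) and drug are assumptions, borrowed from the Tabachnick example later in these notes.
* Durations are ignored entirely, which is the limitation described here.
LOGISTIC REGRESSION VARIABLES outcome
  /METHOD=ENTER drug.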

Problem – it’s possible to create situations in which the distribution of
durations differs even though the proportions of outcomes are identical.

Consider the following . . . Assume we’re dealing with employment.

In the figures, each arrow represents the duration of employment for one person.
The arrowhead (->) represents termination.

Group A

Group B

Clearly, Group A has longer average employment times, but both have the
exact same proportion of turnovers – 100% in this example. So
comparison of death/quit rates gives an incomplete picture.

Incomplete Analysis 2 – Analyze only the durations. Ignore the
deaths/turnovers.

Use Mann-Whitney U tests, since durations will be positively skewed.

No one has done this that I know of. Most people will be interested in
differences in death/quit proportions. Since this analysis ignores those,
most people won’t be interested in it.
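
* A minimal sketch of this duration-only analysis in SPSS syntax.
* The variable names months and drug (coded 0 and 1) are assumptions, borrowed from the Tabachnick example later in these notes.
* Censored durations are treated as if they were complete, which is part of what this analysis gets wrong.
NPAR TESTS
  /M-W=months BY drug(0 1).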

Group A

Group B

In the example above, the two groups have equal durations, but different
turnover rates. In this case, analysis of only survival times will give an
incorrect picture of the differences in survival between the groups.

Each type of incomplete analysis ignores one aspect of the complete
dependent variable. We need a method of analysis that takes both aspects
into account.

Survival analysis is an analytic technique that combines both aspects.

Types of cases in survival analysis
Monitoring of cases                                                           Monitoring of cases
begins, i.e., Window                                                          ends, i.e., Window
opens                                                                        closes

-------|------------------------------------------------------|--------

Ideal Cases - starting time and ending time are known

Cases whose ending time (time of termination/death) is unknown
These are called Right Censored Cases - the most common

[Figure: timelines of cases still under observation when monitoring ends]
The above cases are still employed/surviving at the time monitoring ends.

[Figure: timeline of a case lost to follow-up]
The above case is lost to follow-up (quit answering phone, left state, etc.)

Cases whose starting times (times at which disease develops)
are unknown.
I believe these are called left-censored, although Tabachnick & Fidell are ambiguous on p. 5?? and p. 5??
regarding this. They use "failed before study begin" and "disease process began before study began" - two
different conditions. I believe their discussion on p 5?? is the correct definition.

[Figure: timeline of a case whose starting time is unknown]

Cases whose starting times and ending times are unknown

Survival Analysis (also called Event History Analysis)

An analytic technique that models both average duration to outcome and
proportion of outcomes.

3 separate techniques – Life Table, Kaplan-Meier, Cox Regression

Key concepts common to all

1. Survival function

A plot of proportion surviving from time 0 up to a given time vs. time
A cumulative plot.

[Figure: survival function. Vertical axis: Proportion Surviving. Horizontal axis: Time, from 0 to t.]

Generally decreasing curve, since proportion surviving decreases across
time.

Separate curves for separate groups.

Note that the survival function represents both proportion turning over (through the
height of the curve at a point, which is the proportion not yet turned over) and duration
of stay/life (how far the curve has progressed to the right from t0). It is a two-dimensional
representation of the two aspects of survival – length of life/employment and turnover.

The vertical axis represents proportion of survivals or turnovers.

Within a vertical slice at any point, turnover rates can be compared.
In the following, we see that Group B had lower survival/higher turnover
at the indicated time period.

[Figure: survival curves for Groups A and B plotted against Time, compared within a vertical slice]

The horizontal axis represents duration of life/stay.

Within a horizontal slice at any point, average durations can be compared.
In the following, we see that at the given rate of turnover, Group A had
longer average duration of stay.

[Figure: survival curves for Groups A and B, compared within a horizontal slice]

2. Hazard function

A plot of proportion dying/leaving at time intervals given survival to that
time period.
Not a cumulative plot.

[Figure: hazard function. Vertical axis: Proportion Dying. Horizontal axis: Age.]

Hazard function for human mortality – highest at very young and very old
ages.

3. Cumulative Hazard.

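A plot of proportion dying/turning over up to a particular time.
A cumulative plot – the inverse of the survival plot. (Strictly, the
proportion dying by time t is 1 - S(t), where S(t) is the survival function.
The cumulative hazard is the running sum of the hazards, which stays close
to 1 - S(t) while few events have occurred.)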

Three general types of Survival Analysis

1. Life Tables analysis.

The window of observation is cut up into n equal-length intervals.

Proportions of persons surviving/dying within each interval are computed.

This is the original method.
Useful for analysis of one group or for comparison of a few groups
defined by levels of a single factor.
Can’t incorporate quantitative predictors.
Can’t incorporate more than 2 qualitative predictors in SPSS.
Cannot analyze interactions of 2 predictors.

2. Kaplan-Meier analysis.

Event-based. Subintervals within the total window of observation are
defined by when outcomes (deaths, turnovers) occur.

Can’t incorporate quantitative predictors.
Can’t incorporate more than 2 qualitative predictors in SPSS.
Cannot analyze interactions of 2 predictors.

3. Cox Proportional Hazards Regression (Cox Regression)

A very general procedure.

Estimates hazard probabilities for the whole sample.

Then estimates ratios of hazards to this overall hazard function for
groups/persons with different values of the IVs.

As implemented in SPSS, the output and analyses are quite reminiscent of
logistic regression.

Can incorporate quantitative predictors.
Can incorporate multiple qualitative and quantitative factors.
Can incorporate both types in the same analysis.
Can incorporate interactions.

Requires:

Proportional hazard functions.
Survival plots for different groups must diverge “nicely” and can’t cross
back.
[Figures: an “OK” example and a “Not OK” example of survival plots for two groups]

Based on Tabachnick Table 11.1, p. 511
Analyzed using SPSS Life Tables
Suppose the efficacy of Drug 0 is being compared with that of Drug 1. Each was formulated to prolong life of
patients with a usually terminal form of cancer. Seven patients were given Drug 0 and five were given Drug 1.
Patients were observed for up to 12 months. After 12 months, the window of observation closed and the results
were entered into SPSS.
So this problem is analogous to a turnover problem in organizational research with two groups of employees
treated differently.

The SPSS syntax to invoke the analysis.

SAVE OUTFILE='G:\MdbT\P595\P595AL07-Survival analysis\TAndFDancingData.sav'
/COMPRESSED.
SURVIVAL TABLE=months BY drug(0 1)
/INTERVAL=THRU 12 BY 1
/STATUS=outcome(1)
/PRINT=TABLE
/PLOTS (SURVIVAL)=months BY drug.

Survival Analysis
[DataSet0] G:\MdbT\P595\P595AL07-Survival analysis\TAndFDancingData.sav
Survival Variable: months
Life Table (First-order Controls: drug)

Column key:
Start    = Interval Start Time
Enter    = Number Entering Interval
Withdr   = Number Withdrawing during Interval
Exposed  = Number Exposed to Risk
Events   = Number of Terminal Events
PTerm    = Proportion Terminating
PSurv    = Proportion Surviving
CumSurv  = Cumulative Proportion Surviving at End of Interval
SE(Cum)  = Std. Error of Cumulative Proportion Surviving at End of Interval
Dens     = Probability Density
SE(Dens) = Std. Error of Probability Density
Haz      = Hazard Rate
SE(Haz)  = Std. Error of Hazard Rate

drug  Start  Enter  Withdr  Exposed  Events  PTerm  PSurv  CumSurv  SE(Cum)  Dens   SE(Dens)  Haz   SE(Haz)
0     0      7      0       7.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
0     1      7      0       7.000    1       .14    .86    .86      .13      .143   .132      .15   .15
0     2      6      0       6.000    2       .33    .67    .57      .19      .286   .171      .40   .28
0     3      4      0       4.000    1       .25    .75    .43      .19      .143   .132      .29   .28
0     4      3      0       3.000    1       .33    .67    .29      .17      .143   .132      .40   .39
0     5      2      0       2.000    1       .50    .50    .14      .13      .143   .132      .67   .63
0     6      1      0       1.000    0       .00    1.00   .14      .13      .000   .000      .00   .00
0     7      1      0       1.000    0       .00    1.00   .14      .13      .000   .000      .00   .00
0     8      1      0       1.000    0       .00    1.00   .14      .13      .000   .000      .00   .00
0     9      1      0       1.000    0       .00    1.00   .14      .13      .000   .000      .00   .00
0     10     1      0       1.000    0       .00    1.00   .14      .13      .000   .000      .00   .00
0     11     1      0       1.000    1       1.00   .00    .00      .00      .143   .132      2.00  .00
1     0      5      0       5.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
1     1      5      0       5.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
1     2      5      0       5.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
1     3      5      0       5.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
1     4      5      0       5.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
1     5      5      0       5.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
1     6      5      0       5.000    0       .00    1.00   1.00     .00      .000   .000      .00   .00
1     7      5      0       5.000    1       .20    .80    .80      .18      .200   .179      .22   .22
1     8      4      0       4.000    1       .25    .75    .60      .22      .200   .179      .29   .28
1     9      3      0       3.000    0       .00    1.00   .60      .22      .000   .000      .00   .00
1     10     3      0       3.000    2       .67    .33    .20      .18      .400   .219      1.00  .61
1     11     1      0       1.000    0       .00    1.00   .20      .18      .000   .000      .00   .00
1     12     1      1       .500     0       .00    1.00   .20      .18      .000   .000      .00   .00

The results suggest that survival is longer with Drug 1, although the output shown here includes no significance test.

Tabachnick Table 11.1, p. 511
Analyzed using SPSS Kaplan-Meier

KM months BY drug /STATUS=outcome(1) /PRINT TABLE MEAN    /PLOT SURVIVAL
/TEST LOGRANK BRESLOW TARONE /COMPARE OVERALL POOLED.

Kaplan-Meier
[DataSet2] G:\MdbT\InClassDatasets\Survival(T&Bp511).sav
Case Processing Summary

drug      Total N   N of Events   Censored N   Censored Percent
0         7         7             0            .0%
1         5         4             1            20.0%
Overall   12        11            1            8.3%

Survival Table

Estimate and Std. Error refer to the Cumulative Proportion Surviving at the Time.

drug  Case  Time     Status  Estimate  Std. Error  N of Cumulative Events  N of Remaining Cases
0     1     1.000    1       .857      .132        1                       6
0     2     2.000    1       .         .           2                       5
0     3     2.000    1       .571      .187        3                       4
0     4     3.000    1       .429      .187        4                       3
0     5     4.000    1       .286      .171        5                       2
0     6     5.000    1       .143      .132        6                       1
0     7     11.000   1       .000      .000        7                       0
1     1     7.000    1       .800      .179        1                       4
1     2     8.000    1       .600      .219        2                       3
1     3     10.000   1       .         .           3                       2
1     4     10.000   1       .200      .179        4                       1
1     5     12.000   0       .         .           4                       0

Means and Medians for Survival Time

          Mean(a)                                             Median
drug      Estimate  Std. Error  95% CI Lower  95% CI Upper    Estimate  Std. Error  95% CI Lower  95% CI Upper
0         4.000     1.272       1.506         6.494           3.000     1.309       .434          5.566
1         9.400     .780        7.872         10.928          10.000    .894        8.247         11.753
Overall   6.250     1.081       4.131         8.369           5.000     2.598       .000          10.092
a. Estimation is limited to the largest survival time if it is censored.

Overall Comparisons

Chi-Square              df             Sig.
Log Rank (Mantel-Cox)                                3.747                 1              .053
Breslow (Generalized Wilcoxon)                       4.926                 1              .026
Tarone-Ware                                          4.522                 1              .033
Test of equality of survival distributions for the different levels of drug.

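As was the case with the analysis using the LIFE TABLES procedure, the results support the conclusion that
survival is longer with Drug 1. The Breslow and Tarone-Ware tests are significant at the .05 level; the log
rank test falls just short (p = .053).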

Tabachnick Table 11.1, p. 511
Analyzed using SPSS Cox Regression

Drug was treated as a categorical variable so that survival curves for each value of drug could be obtained.

Since it’s a dichotomy, the analysis could be done without labeling it categorical.

If you want separate predicted survival functions for each value of a categorical variable,
put the name of that categorical variable here.

COXREG months
  /STATUS=outcome(1)
  /PATTERN BY drug
  /CONTRAST (drug)=Indicator(1)
  /METHOD=ENTER drug
  /PLOT SURVIVAL
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

Cox Regression
[DataSet2] G:\MdbT\InClassDatasets\Survival(T&Bp511).sav

Case Processing Summary

                                                                                      N    Percent
Cases available in analysis   Event(a)                                                11   91.7%
                              Censored                                                1    8.3%
                              Total                                                   12   100.0%
Cases dropped                 Cases with missing values                               0    .0%
                              Cases with negative time                                0    .0%
                              Censored cases before the earliest event in a stratum   0    .0%
                              Total                                                   0    .0%
Total                                                                                 12   100.0%
a. Dependent Variable: months

Categorical Variable Codings(b)

               Frequency   (1)
drug(a)   0    7           0
          1    5           1
a. Indicator Parameter Coding
b. Category variable: drug

Block 0: Beginning Block

Omnibus Tests of Model Coefficients
-2 Log Likelihood: 40.740

Block 1: Method = Enter

Omnibus Tests of Model Coefficients(a)

-2 Log Likelihood: 37.394

                             Chi-square   df   Sig.
Overall (score)              3.469        1    .063
Change From Previous Step    3.346        1    .067
Change From Previous Block   3.346        1    .067
a. Beginning Block Number 1. Method = Enter

Variables in the Equation

B               SE                  Wald              df             Sig.           Exp(B)
drug                    -1.176            .658              3.192                1              .074             .309

Covariate Means and Pattern Values

                 Pattern
        Mean     1        2
drug    .417     .000     1.000

Cox regression coefficient signs are relative to death, not survival. So a positive sign means that larger
values of the independent variable have higher death rates. And negative signs mean that larger values of the
independent variable have lower death rates.

[Figure: plot of Death (vertical axis) against Drug, 0 and 1 (horizontal axis)]

I strongly recommend that you create a plot such as the one immediately above by hand to make sure you
understand the Cox Regression results.
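
* A minimal sketch, not part of the original output: reproduce Exp(B) by hand as a check on the sign of B.
* B = -1.176 is taken from the Variables in the Equation table above.
* hazratio is a hypothetical variable name.
COMPUTE hazratio = EXP(-1.176).
EXECUTE.
* hazratio is about .309, so the estimated hazard of death with Drug 1 is about 31% of that with Drug 0.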

COXREG plots are plots of predicted survival, not actual survival. In this sense, they’re like the tables and
plots of estimated marginal means from GLM.

Turnover at a local Manufacturing Plant
1. Effect of Friends and/or family at the plant

In this study, turnover at a local manufacturing plant was studied. On the application blank, applicants were
asked whether they had friends or family at the plant.

Some did not respond to this question. They’re included in the analysis.

Kaplan-Meier output is shown

KM
dos BY wsfr2 /STATUS=status(1)
/PRINT TABLE MEAN
/PLOT SURVIVAL
/TEST LOGRANK BRESLOW TARONE
/COMPARE OVERALL POOLED .

Kaplan-Meier
[DataSet3] G:\MdbR\1TurnoverArticle\TurnoverArticleDataset061005.sav
Huge table not reproduced here.

Case Processing Summary

wsfr2 (Whether F/F at company for whole sample analyses)
Legend: -.50 = No friends; .15 = No info

wsfr2                            Total N   N of Events   Censored N   Censored Percent
-.50                             423       174           249          58.9%
.15 Whole sample missing value   100       40            60           60.0%
.50                              778       220           558          71.7%
Overall                          1301      434           867          66.6%

The casewise table was deleted.

Means and Medians for Survival Time

wsfr2 (Whether F/F at company for whole sample analyses)

                                 Mean(a)                                             Median
wsfr2                            Estimate  Std. Error  95% CI Lower  95% CI Upper    Estimate  Std. Error  95% CI Lower  95% CI Upper
-.50                             610.597   25.559      560.500       660.693         667.000   .           .             .
.15 Whole sample missing value   579.795   49.233      483.299       676.291         528.000   151.013     232.014       823.986
.50                              769.900   18.559      733.524       806.277         .         .           .             .
Overall                          706.965   15.009      677.548       736.383         .         .           .             .
a. Estimation is limited to the largest survival time if it is censored.

Overall Comparisons

                                 Chi-Square   df   Sig.
Log Rank (Mantel-Cox)            25.344       2    .000
Breslow (Generalized Wilcoxon)   25.325       2    .000
Tarone-Ware                      25.004       2    .000

Test of equality of survival distributions for the different levels of wsfr2
Whether F/F at company for whole sample analyses.

Survival Functions

[Figure: Kaplan-Meier survival curves by wsfr2 (Whether F/F at company for whole sample analyses).
Vertical axis: Cum Survival, 0.0 to 1.0.
Horizontal axis: Days of service: termdate-effdate or 3/1/1-effdate or 12/31/4-effdate, 0 to 1200.
Curves, from highest to lowest: Had friends or family (.50), Missing response (Whole sample missing value),
No friends or family (-.50). Censored cases are marked on each curve.]

Mike – Fix this. The color of the “censored” symbols is opposite the color of the lines!!

The data strongly suggest that applicants who had friends or family at the company had higher survival rates at
all times, up to 1100 days (about 3 years).

For example, at the end of 1 year (leftmost arrow in the above figure), the survival rate of those with friends and
family was about 70% while that for those who said they did not have friends or family at the organization was

By two years (rightmost arrow), the rate of retention of those with friends or family was about 68% while the rate of those

The fact that the curve for those for whom no information was available was between the other two curves
suggests that those employees for whom no information was available were a mixture of some who did have
friends and family and those who did not.

Using Survival Analysis to validate selection test questions.
An I/O consulting firm gave a 30-question pre-employment questionnaire to 1000+ employees of a local
company. Each question had from one to five alternatives. In order to identify responses associated with long
tenure, a survival analysis was conducted for each question. A few of the analyses are presented below.

For each survival function, each curve is the survival function of persons who made a particular response to the
item.

Question 1
Overall Comparisons

Chi-Square         df           Sig.

Log Rank (Mantel-Cox)                         5.382             2             .068

Breslow (Generalized Wilcoxon)                4.307             2             .116

Tarone-Ware                                   4.756             2             .093

Test of equality of survival distributions for the different levels of GenQ6 Gen
Q6 L: Complete all assignments at the end of the day / S: Getting the job
you want is mostly a matter of luck.

[Figure: survival curves labeled +1, 0, and -1?]

The numbers represent the 3 possible responses to the question, coded as +1, 0, -1.

For this question, I believe we treated +1 as an indicator of long tenure and both 0 and -1 as indicators
of short tenure.

Question 2
Overall Comparisons

Chi-Square           df           Sig.

Log Rank (Mantel-Cox)                        7.647              4             .105

Breslow (Generalized Wilcoxon)               6.950              4             .139

Tarone-Ware                                  7.298              4             .121

Test of equality of survival distributions for the different levels of GenQ5 Gen
Q5 L: How often you experience conflict with a co-worker? / S: If hired how
long do you plan to work here?.

[Figure: survival curves labeled +1, 0, and -1?]

As in the case of the previous question, the response coded as +1 was treated as an indicator of long
tenure and all other responses were treated as indicators of short tenure.

Question 3
Overall Comparisons

Chi-Square           df           Sig.

Log Rank (Mantel-Cox)                        5.070              3             .167

Breslow (Generalized Wilcoxon)               5.525              3             .137

Tarone-Ware                                  5.493              3             .139

Test of equality of survival distributions for the different levels of GenQ4 Gen
Q4 L:I prefer a job that / S: How often you experience conflict with a co-worker?.

There were very few persons who responded +1 or 0, but those who did were treated as long tenure and
those who responded 0 as short tenure.

[Figure: survival curves labeled +1 and 0]

Overall Comparisons

Chi-Square           df           Sig.

Log Rank (Mantel-Cox)                        7.753              4             .101

Breslow (Generalized Wilcoxon)               6.762              4             .149

Tarone-Ware                                  7.439              4             .114

Test of equality of survival distributions for the different levels of GenQ3 Gen
Q3 L: Recieved safety training? / S: You are asked to do more physically
demanding work than you were hired to do because someone out sick, how
do you react?.

+1: Long tenure
Else: Short tenure

[Figure: survival curves labeled +1, 0, and -1?]

Overall Comparisons

Chi-Square           df           Sig.

Log Rank (Mantel-Cox)                       10.971              4             .027

Breslow (Generalized Wilcoxon)               9.931              4             .042

Tarone-Ware                                 10.597              4             .031

Test of equality of survival distributions for the different levels of GenQ2 Gen
Q2 L: Your team in disagreement over who will clean the floor. What
method is fair?/ S: Recent supervisor rate dependability?.

[Figure: survival curves labeled +1, 0, and -1?]

Overall Comparisons

Chi-Square            df           Sig.

Log Rank (Mantel-Cox)                         8.052              3             .045

Breslow (Generalized Wilcoxon)               12.729              3             .005

Tarone-Ware                                  10.614              3             .014

Test of equality of survival distributions for the different levels of GenQ1
GenQ1 L: Which strategies inspire a team and help be more effective? / S: Your team in disagreement over who
will clean the floor. What method is fair?.

[Figure: survival curves by GenQ1 response; curves labeled +1, 0, and -1?]

Thirty questions were evaluated in the above fashion.

After examination of the individual survival curves for the 30 questions, those questions for which there were
significant differences in survival between responses were identified.

Finally, an overall index of predicted survival was calculated, using syntax like the following . . .

In this particular case, the response associated with long survival added 1 to the index.

The response associated with short survival subtracted 1 from the index.

Tenure Scale Computation

Compute genshort=0.
if ((genq1=3 or genq1=4))                     genshort=genshort+1.
if ((genq1=1 or genq1=2))                     genshort=genshort-1.
if ((genq2=3 or genq2=4))                     genshort=genshort+1.
if ((genq2=1 or genq2=2 or genq2=5))          genshort=genshort-1.
if ((genq6=3))                                genshort=genshort+1.
if ((genq6=1 or genq6=2))                     genshort=genshort-1.
if ((genq12=1))                               genshort=genshort+1.
if ((genq12=3))                               genshort=genshort-1.
if ((genq13=1))                               genshort=genshort+1.
if ((genq13=2 or genq13=3 or genq13=4))       genshort=genshort-1.
if ((genq21=1 or genq21=3))                   genshort=genshort+1.
if ((genq21=2))                               genshort=genshort-1.

Report on Tenure Scale

The following is not based on the scale above but on a similar scale.

The median score on the scale was determined to be -14.

Group 0 was all employees with an index value less than or equal to -14.

Group 1 was all employees with an index value greater than -14.

The graph indicates that those in Group 1, with large values of the index, had a nearly 70% retention rate after
50 months.

Those in Group 0 had a 40% retention rate after the same length of time.

Survival Analysis of a phenomenon with a positive outcome
PEG vs. PEGJ Example
The data for this example compared two methods of feeding trauma patients, one using a percutaneous
endoscopic gastrojejunostomy (PEGJ) and the other using a percutaneous endoscopic gastrostomy (PEG). It
was hoped that the data would show that the PEGJ technique would provide continuous, uninterrupted nutrition
with greater consistency than PEG. Time to reach a nutrition goal was the continuous dependent
variable. Patients were observed for 14 days. Whether or not a patient reached the goal was the status.
Reaching the goal was the +1 state. A patient who had not reached the goal in 14 days was treated as a
censored case. Group=1 is the PEGJ group. Group=2 is the PEG group.
DAYSGOAL is the "length of the arrow" variable in the first handout.
GOALIN14 is a variable which represents whether the goal was reached or not.
GOALIN14=1 means that the goal was reached.
GOALIN14=0 means that the case is right-censored.
GROUP=1: PEGJ
GROUP=2: PEG
ISS: Injury Severity Score, a measure of amount of trauma (taken at admission)

NUTRSD   NUTRGOAL DAYSGOAL GOALIN14     GROUP       ISS      AGE
02/15/98   02/16/98        1        1         1        29       43
01/10/98   01/12/98        2        1         1         5       88
02/14/98   02/18/98        4        1         1        29       37
02/02/98   02/06/98        4        1         1        27       36
01/10/98   01/13/98        3        1         1        13       92
01/09/98          .       15        0         2        19       73
01/02/98   01/04/98        2        1         2        26       42
01/20/98   01/22/98        2        1         2        36       55
03/18/98          .        5        1         1        27       23
02/04/98   02/06/98        2        1         2        13       72
01/23/98          .       15        0         2        10       45
02/01/98   02/02/98        1        1         1        22       59
02/20/98   02/21/98        1        1         1        17       54
02/03/98   02/04/98        1        1         2        14       78
03/31/98   04/02/98        2        1         2        18       30
04/13/98   04/15/98        2        1         2        27       49
05/08/98   05/09/98        1        1         2         9       22
04/14/98   04/20/98        6        1         2         9       60
05/27/98   05/28/98        1        1         1        17       27
05/13/98          .       15        0         2        29       95
05/07/98   05/16/98        9        1         2        25       31
04/16/98   04/17/98        1        1         2        32       31
03/23/98   03/25/98        2        1         2        20       41
04/07/98   04/08/98        1        1         2        16       29
03/29/98   03/30/98        1        1         1        25       24
04/30/98   05/01/98        1        1         2        29       52
05/05/98   05/08/98        3        1         2        38       79
05/28/98   05/30/98        2        1         1         4       76
06/08/98   06/10/98        2        1         2        16       70
05/27/98   05/28/98        1        1         1         9       27
04/27/98   04/29/98        2        1         1        22       87
04/10/98   04/11/98        1        1         1        27       36
02/26/98   03/04/98        6        1         1        25       54
03/27/98   03/28/98        1        1         1        29       22
04/17/98   04/18/98        1        1         1        22       22
02/25/98   03/05/98        8        1         1        25       79
03/18/98   03/19/98        1        1         1        25       56
01/28/98   01/29/98        1        1         1        17       66
03/23/98   03/24/98        1        1         1        16       20
04/29/98   05/03/98        4        1         1        26       22
07/19/98   08/02/98       14        1         2        34       33
08/13/98   08/15/98        2        1         1        25       49
08/25/98          .       15        0         2        26       77
10/06/98   10/07/98        1        1         2        34       19
09/10/98   09/11/98        1        1         2        27       36
08/14/98   08/15/98        1        1         1        30       35
08/25/98   08/27/98        2        1         2        27       29
09/20/98   09/21/98        1        1         2        36       62
09/29/98   10/01/98        2        1         2        17       19
10/09/98          .       15        0         2        38       74

10/02/98   10/03/98        1        1          1       10    40
08/26/98   09/04/98        9        1          2       18    48
08/19/98   08/21/98        2        1          1       18    31
08/03/98   08/04/98        1        1          1       41    46
08/25/98   08/28/98        3        1          2       24    37
09/17/98          .       15        0          2       26    75
07/02/98          .       15        0          1       19    28
08/03/98   08/05/98        2        1          2       13    52
07/15/98   07/17/98        2        1          2       38    71
07/27/98   08/01/98        5        1          2       34    33
04/30/98   05/02/98        2        1          2        4    61
05/29/98   05/30/98        1        1          1       29    58
05/16/98   05/18/98        2        1          2       19    42
06/20/98   06/23/98        3        1          1       25    19
08/30/98          .       15        0          1       25    70
04/30/98   05/02/98        2        1          2       43    33
07/01/98   07/02/98        1        1          1       43    79
09/29/98          .       15        0          2       17    18
05/28/98   06/08/98       11        1          2       36    57
07/15/98   07/16/98        1        1          2       27    59
08/11/98   08/12/98        1        1          1       19    43
10/12/98   10/13/98        1        1          1       36    18
08/24/98   08/25/98        1        1          1       20    84
10/22/98          .       15        0          1       25    17
10/08/98   10/09/98        1        1          2       25    20
10/06/98          .       15        0          2       17    31
07/30/98   08/02/98        3        1          1       22    26
04/16/98   04/17/98        1        1          1       38    18
10/08/98   10/09/98        1        1          1       25    34
08/19/98   08/21/98        2        1          1       34    22
03/20/98   03/21/98        1        1          1       25    48
06/20/98   06/21/98        1        1          1       11    45
07/30/98   07/31/98        1        1          1       25    33
09/07/98          .       15        0          2       36    28
07/17/98   07/18/98        1        1          1       22    62
09/15/98   09/17/98        2        1          2       20    47
07/07/98   07/08/98        1        1          1       33    27
10/01/98   10/02/98        1        1          2       25    33
09/11/98   09/12/98        1        1          1       41    31

Specifying the analysis using Life Tables . . .

The output of LIFE TABLES
SURVIVAL
TABLE=DAYSGOAL BY GROUP(1 2)
/INTERVAL=THRU 15 BY 1
/STATUS=GOALIN14(1)
/PRINT=TABLE
/PLOTS (SURVIVAL)=DAYSGOAL BY GROUP .

Survival Analysis
G:\MdbT\P595\P595AL07-Survival analysis\PEGPEGJData.sav

Survival Variable: DAYSGOAL

Life Table

Column key: GROUP = First-order Controls; Start = Interval Start Time; Enter = Number Entering Interval;
Withd = Number Withdrawing during Interval; Exposed = Number Exposed to Risk; Events = Number of Terminal
Events; PTerm = Proportion Terminating; PSurv = Proportion Surviving; CumSurv = Cumulative Proportion
Surviving at End of Interval; SE Cum = Std. Error of Cumulative Proportion Surviving at End of Interval;
Dens = Probability Density; SE Dens = Std. Error of Probability Density; Haz = Hazard Rate;
SE Haz = Std. Error of Hazard Rate.

GROUP   Start  Enter  Withd  Exposed  Events  PTerm  PSurv  CumSurv  SE Cum   Dens  SE Dens   Haz  SE Haz
1        .000     46      0   46.000       0    .00   1.00     1.00     .00   .000     .000   .00     .00
        1.000     46      0   46.000      28    .61    .39      .39     .07   .609     .072   .88     .15
        2.000     18      0   18.000       6    .33    .67      .26     .06   .130     .050   .40     .16
        3.000     12      0   12.000       3    .25    .75      .20     .06   .065     .036   .29     .16
        4.000      9      0    9.000       3    .33    .67      .13     .05   .065     .036   .40     .23
        5.000      6      0    6.000       1    .17    .83      .11     .05   .022     .022   .18     .18
        6.000      5      0    5.000       1    .20    .80      .09     .04   .022     .022   .22     .22
        7.000      4      0    4.000       0    .00   1.00      .09     .04   .000     .000   .00     .00
        8.000      4      0    4.000       1    .25    .75      .07     .04   .022     .022   .29     .28
        9.000      3      0    3.000       0    .00   1.00      .07     .04   .000     .000   .00     .00
       10.000      3      0    3.000       0    .00   1.00      .07     .04   .000     .000   .00     .00
       11.000      3      0    3.000       0    .00   1.00      .07     .04   .000     .000   .00     .00
       12.000      3      0    3.000       0    .00   1.00      .07     .04   .000     .000   .00     .00
       13.000      3      0    3.000       0    .00   1.00      .07     .04   .000     .000   .00     .00
       14.000      3      0    3.000       0    .00   1.00      .07     .04   .000     .000   .00     .00
2        .000     43      0   43.000       0    .00   1.00     1.00     .00   .000     .000   .00     .00
        1.000     43      0   43.000      11    .26    .74      .74     .07   .256     .067   .29     .09
        2.000     32      0   32.000      15    .47    .53      .40     .07   .349     .073   .61     .15
        3.000     17      0   17.000       2    .12    .88      .35     .07   .047     .032   .13     .09
        4.000     15      0   15.000       0    .00   1.00      .35     .07   .000     .000   .00     .00
        5.000     15      0   15.000       1    .07    .93      .33     .07   .023     .023   .07     .07
        6.000     14      0   14.000       1    .07    .93      .30     .07   .023     .023   .07     .07
        7.000     13      0   13.000       0    .00   1.00      .30     .07   .000     .000   .00     .00
        8.000     13      0   13.000       0    .00   1.00      .30     .07   .000     .000   .00     .00
        9.000     13      0   13.000       2    .15    .85      .26     .07   .047     .032   .17     .12
       10.000     11      0   11.000       0    .00   1.00      .26     .07   .000     .000   .00     .00
       11.000     11      0   11.000       1    .09    .91      .23     .06   .023     .023   .10     .10
       12.000     10      0   10.000       0    .00   1.00      .23     .06   .000     .000   .00     .00
       13.000     10      0   10.000       0    .00   1.00      .23     .06   .000     .000   .00     .00
       14.000     10      0   10.000       1    .10    .90      .21     .06   .023     .023   .11     .11

Median Survival Time

First-order Controls                           Med Time
GROUP 1                                               1.82
2                                      2.70

First-order Control: GROUP

Since the outcome is a good event, the faster the curve falls to zero, the better.
So the group performing best is the group with the lowest curve.

These data are strange because the "event" is something that is sought after - reaching a feeding goal, rather
than something that is to be avoided - death or termination. So for these data, lower "survival" is preferred,
since the "event" is not death, but reaching a nutrition goal. The sooner a patient reached the nutrition goal the
better. Thus, the investigators hoped that patients in the PEGJ condition would reach those goals faster, leading
to lower "survival" curves. In this case, survival should be called "Failure to reach feeding goal."

Analysis of the same data using Kaplan-Meier

KM
DAYSGOAL BY GROUP /STATUS=GOALIN14(1)
/PRINT TABLE MEAN
/PLOT SURVIVAL HAZARD
/TEST LOGRANK BRESLOW TARONE
/COMPARE OVERALL POOLED .

Kaplan-Meier
G:\MdbT\P595\P595AL07-Survival analysis\PEGPEGJData.sav

Case Processing Summary

                                          Censored
GROUP      Total N    N of Events      N      Percent
1               46             43      3         6.5%
2               43             34      9        20.9%
Overall         89             77     12        13.5%

Means and Medians for Survival Time

                     Mean(a)                                          Median
GROUP     Estimate  Std. Error  95% CI Lower  95% CI Upper    Estimate  Std. Error  95% CI Lower  95% CI Upper
1            2.717        .527         1.685         3.750       1.000           .             .             .
2            5.488        .857         3.808         7.169       2.000        .214         1.581         2.419
Overall      4.056        .517         3.043         5.069       2.000        .211         1.587         2.413
a. Estimation is limited to the largest survival time if it is censored.

Overall Comparisons

Chi-Square             df          Sig.
Log Rank (Mantel-Cox)                          8.479                   1           .004
Breslow (Generalized Wilcoxon)                 9.588                   1           .002
Tarone-Ware                                    9.306                   1           .002

Test of equality of survival distributions for the different levels of GROUP.

The same analysis using Cox Regression

One requirement of the Cox Regression analysis is that the hazard functions be proportional. That means that
for any two values of a covariate, the ratio of the hazards for those two values is constant across time.

This rules out hazard functions which cross. On an ordinary (non-logarithmic) scale it also rules out hazard
functions which change over time while remaining parallel, since parallel curves keep a constant difference
rather than a constant ratio.

Roughly speaking, the hazard functions should look like the following . . .

That is, the hazard functions diverge over time.

COXREG
DAYSGOAL /STATUS=GOALIN14(1)
/PATTERN BY GROUP
/CONTRAST (GROUP)=Indicator(1)
/METHOD=ENTER GROUP
/PLOT SURVIVAL HAZARD
/CRITERIA=PIN(.05) POUT(.10) ITERATE(20) .

Cox Regression
G:\MdbT\P595\P595AL07-Survival analysis\PEGPEGJData.sav

Case Processing Summary

                                                                                        N    Percent
Cases available in analysis   Event(a)                                                 77      86.5%
                              Censored                                                 12      13.5%
                              Total                                                    89     100.0%
Cases dropped                 Cases with missing values                                 0        .0%
                              Cases with negative time                                  0        .0%
                              Censored cases before the earliest event in a stratum     0        .0%
                              Total                                                     0        .0%
Total                                                                                  89     100.0%
a. Dependent Variable: DAYSGOAL

Categorical Variable Codings(b)

               Frequency    (1)
GROUP(a)  1           46      0
          2           43      1
a. Indicator Parameter Coding
b. Category variable: GROUP

Block 0: Beginning Block

Omnibus Tests of Model Coefficients

-2 Log Likelihood
618.281

Block 1: Method = Enter

Omnibus Tests of Model Coefficients(a,b)

                     Overall (score)             Change From Previous Step     Change From Previous Block
-2 Log Likelihood    Chi-square    df    Sig.    Chi-square    df    Sig.      Chi-square    df    Sig.
612.895                   5.448     1    .020         5.385     1    .020           5.385     1    .020
a. Beginning Block Number 0, initial Log Likelihood function: -2 Log likelihood: 618.281
b. Beginning Block Number 1. Method = Enter

Variables in the Equation

              B        SE      Wald     df     Sig.    Exp(B)
GROUP      -.542     .235     5.332      1     .021      .582

Covariate Means and Pattern Values

                        Pattern
             Mean       1         2
GROUP        .483     .000     1.000

[Figure: Cox regression survival curves for patterns 1 and 2, with the labels "Goal" and "No Goal"]

The above graph presents predicted proportions. They are analogous to plots of y-hats vs. predictors in a
regression analysis.

When you perform a Cox-regression analysis, you may also have to run a Kaplan-Meier analysis just for the
observed survival curves the K-M procedure produces.

Multivariate analyses using Cox-Regression
An advantage of the Cox-Regression procedure over the Life Tables and Kaplan-Meier procedures is the fact
that you can include multiple covariates in the analysis. Moreover, those covariates do not have to be
categorical. They can be any combination of categorical and continuous variables.

COXREG
DAYSGOAL /STATUS=GOALIN14(1)
/PATTERN BY GROUP
/CONTRAST (GROUP)=Indicator(1)
/METHOD=ENTER GROUP ISS AGE
/PLOT SURVIVAL
/CRITERIA=PIN(.05) POUT(.10) ITERATE(20) .

Cox Regression
[DataSet2] G:\MdbT\P595\P595AL07-Survival analysis\PEGPEGJData.sav

Case Processing Summary

                                                                                        N    Percent
Cases available in analysis   Event(a)                                                 77      86.5%
                              Censored                                                 12      13.5%
                              Total                                                    89     100.0%
Cases dropped                 Cases with missing values                                 0        .0%
                              Cases with negative time                                  0        .0%
                              Censored cases before the earliest event in a stratum     0        .0%
                              Total                                                     0        .0%
Total                                                                                  89     100.0%
a. Dependent Variable: DAYSGOAL

Categorical Variable Codings(b)

               Frequency    (1)
GROUP(a)  1           46      0
          2           43      1
a. Indicator Parameter Coding
b. Category variable: GROUP

Block 0: Beginning Block

Omnibus Tests of Model Coefficients

-2 Log Likelihood
618.281

Block 1: Method = Enter

Omnibus Tests of Model Coefficients(a,b)

                     Overall (score)             Change From Previous Step     Change From Previous Block
-2 Log Likelihood    Chi-square    df    Sig.    Chi-square    df    Sig.      Chi-square    df    Sig.
612.033                   6.338     3    .096         6.247     3    .100           6.247     3    .100
a. Beginning Block Number 0, initial Log Likelihood function: -2 Log likelihood: 618.281
b. Beginning Block Number 1. Method = Enter

Variables in the Equation

              B        SE      Wald     df     Sig.    Exp(B)
GROUP      -.518     .237     4.778      1     .029      .596
ISS         .001     .014      .006      1     .938     1.001
AGE        -.005     .005      .804      1     .370      .995

Covariate Means and Pattern Values

                          Pattern
              Mean        1          2
GROUP         .483      .000      1.000
ISS         24.112    24.112     24.112
AGE         45.629    45.629     45.629

The Cox Regression plots are not presented here, since they’re very similar to those presented above.


```
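
The Life Table columns reported for the PEG vs. PEGJ example can be recomputed by hand from the interval counts. The following Python sketch is not part of the original handout. It assumes the usual actuarial conventions: the number exposed to risk is the number entering minus half of the withdrawals, and the hazard rate is the number of events divided by the interval width times (exposed minus half the events). These conventions are consistent with the printed GROUP 1 values, but they are an assumption about what the SURVIVAL procedure does. The names `LifeTableRow` and `life_table` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LifeTableRow:
    start: int               # interval start time
    entering: int            # cases still at risk when the interval opens
    exposed: float           # entering minus half of the withdrawals
    events: int              # terminal events during the interval
    prop_terminating: float  # events / exposed
    prop_surviving: float    # 1 - prop_terminating
    cum_surviving: float     # cumulative proportion surviving at end of interval
    hazard: float            # actuarial hazard rate (assumed formula, see lead-in)


def life_table(n_start, events, withdrawn, width=1.0):
    """Build an actuarial life table from per-interval event and withdrawal counts."""
    rows = []
    at_risk = n_start
    cum = 1.0
    for start, (d, w) in enumerate(zip(events, withdrawn)):
        exposed = at_risk - w / 2.0
        q = d / exposed if exposed > 0 else 0.0
        p = 1.0 - q
        cum *= p
        denom = width * (exposed - d / 2.0)
        hazard = d / denom if denom > 0 else 0.0
        rows.append(LifeTableRow(start, at_risk, exposed, d, q, p, cum, hazard))
        at_risk -= d + w  # cases leaving the risk set before the next interval
    return rows


# Terminal events per one-day interval for GROUP 1 (PEGJ), intervals 0..14,
# copied from the "Number of Terminal Events" column of the Life Table output.
group1_events = [0, 28, 6, 3, 3, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
group1_withdrawn = [0] * 15  # the output shows no withdrawals within these intervals

group1_table = life_table(46, group1_events, group1_withdrawn)
for row in group1_table[:5]:
    print(row.start, row.entering, row.events,
          f"{row.prop_terminating:.2f}", f"{row.cum_surviving:.2f}", f"{row.hazard:.2f}")
```

With the GROUP 1 counts, the cumulative proportions surviving for intervals 1 to 3 come out at about .39, .26 and .20, which agrees with the printed table. The same function can be run on the GROUP 2 event counts to check the second half of the output.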