# Introduction to Bayesian Survival Analysis

```					            Fundamental concepts
Bayesian approach
Semiparametric models
Examples

Introduction to Bayesian Survival Analysis

Tim Hanson

Division of Biostatistics
University of Minnesota, U.S.A.

IAP-Workshop 2009
Modeling Association and Dependence in Complex Data
November 19, 2009


Outline

1   Fundamental concepts

2   Bayesian approach

3   Semiparametric models

4   Examples

1   Fundamental concepts
    Time to event data
    Functions defining lifetime distribution

Survival data

Can be time to any event of interest, e.g. death, leukemia remission, bankruptcy, electrical component failure, etc.
Data $T_1, T_2, \dots, T_n$ live in $\mathbb{R}^+$.
Called: survival data, reliability data, time-to-event data.
$T_1, \dots, T_n$ can be iid, independent, partially exchangeable, dependent, etc.
Interest often focuses on relating aspects of the distribution of $T_i$ to covariates or risk factors $x_i$, possibly time-dependent $x_i(t)$. Time-dependent covariates can be external or internal.

Survival data: covariates and censoring

Uncensored data: $(x_1, t_1), \dots, (x_n, t_n)$. Observe $T_i = t_i$.
Right censored data: $(x_1, t_1, \delta_1), \dots, (x_n, t_n, \delta_n)$. Observe
$$T_i = t_i \ \text{if } \delta_i = 1, \qquad T_i > t_i \ \text{if } \delta_i = 0.$$
Interval censored data: $(x_1, a_1, b_1), \dots, (x_n, a_n, b_n)$. Observe $T_i \in [a_i, b_i]$.
Not considered here: truncated data.

Density and survival
Continuous $T$ has density $f(t)$; this is the case considered here.
Discrete $T$ has a pmf. Discrete survival regression models include continuation ratio (hazard regression), proportional odds (survival odds regression), etc.
Survival function is
$$S(t) = 1 - F(t) = P(T > t) = \int_t^\infty f(s)\,ds.$$
Regression model that focuses on survival: proportional odds.
Question: “What is the probability of making it past 40 years?” Answer: $S(40)$.
Question: “What are the odds of dying before 40?” Answer: $\{1 - S(40)\}/S(40)$.

Quantiles

The $p$th quantile $q_p$ of $T$ satisfies $P(T \le q_p) = p$, i.e. $q_p = F^{-1}(p)$.
Question: “What is the median lifetime in the population?”
Regression model that focuses on quantiles: accelerated failure time (proportional quantiles).
Quantile regression is an active area of research from both frequentist and Bayesian perspectives, e.g. Koenker’s excellent quantreg package for R.

Residual life

Mean residual life
$$m(t) = E\{T - t \mid T > t\} = \frac{\int_t^\infty S(s)\,ds}{S(t)}.$$
Question: “Given that I’ve made it up to 40 years, how much longer can I expect to live?”
Regression model that focuses on MRL: proportional mean residual life; there are others.
Also: median (or any quantile) residual life. Much harder to work with in a regression context.

Hazard function

Hazard at $t$:
$$h(t) = \lim_{dt \to 0^+} \frac{P(t \le T < t + dt \mid T \ge t)}{dt} = \frac{f(t)}{S(t)}.$$
Question: “Given that I’ve made it up to 40 years, what is the probability I die in the next day?” Approximately $h(40) \cdot \frac{1}{365}$ when time is measured in years.
Regression models that focus on the hazard function: proportional hazards (Cox) and additive hazards (Aalen) models.

Density, survival, hazard, and MRL
[Figure: four panels showing the density, survival function, hazard function, and mean residual life of an example lifetime distribution, each plotted against $t$.]

2   Bayesian approach
    Building likelihood & posterior
    Gamma process
    Mixture of Polya trees

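Bayes modifies a likelihood
Let $\theta$ index a probability density $f_\theta$.
Data $x = (x_1, \dots, x_n)$ are collected, $x \sim f_\theta$; the likelihood is $f_\theta(x)$ as a function of $\theta$.
A frequentist might estimate $\theta$ using the MLE $\hat\theta = \operatorname{argmax}_{\theta \in \Theta} f_\theta(x)$ and study the sampling distribution of $\hat\theta(x)$, often asymptotically.
A Bayesian places a prior distribution on $\theta$, $\theta \sim p(\theta)$; Bayes’ rule gives the posterior distribution:
$$p(\theta \mid x) = \frac{f_\theta(x)\,p(\theta)}{\int_\Theta f_\theta(x)\,p(\theta)\,d\theta}.$$
The Bayes estimate is typically the posterior mean, median, or mode; e.g. the posterior mode is $\operatorname{argmax}_{\theta \in \Theta} f_\theta(x)\,p(\theta)$.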
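For instance (a sketch, assuming iid exponential data with rate $\theta$ and a $\Gamma(a, b)$ prior in the shape-rate parameterization; $a$ and $b$ are illustrative symbols): the likelihood is $f_\theta(x) = \theta^n \exp(-\theta \sum_{i=1}^n x_i)$, so
$$p(\theta \mid x) \propto \theta^{a + n - 1} \exp\Big\{-\theta \Big(b + \sum_{i=1}^n x_i\Big)\Big\},$$
that is, $\theta \mid x \sim \Gamma(a + n,\ b + \sum_{i=1}^n x_i)$, with posterior mean $(a + n)/(b + \sum_{i=1}^n x_i)$.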

Parametric survival without covariates
The survival distribution is completely defined by any one of $f(t)$, $S(t)$, $h(t)$, or $m(t)$.
Each of these can be derived from any of the others.
Simplest case: iid data with (noninformative) right censoring gives
$$L(S) = \prod_{i=1}^n f(t_i)^{\delta_i} S(t_i)^{1-\delta_i} = \prod_{i=1}^n S(t_i)\,h(t_i)^{\delta_i}.$$
If $S(t)$ is parametric, e.g. the Weibull $S_\theta(t) = \exp\{-(t/\theta_2)^{\theta_1}\}$, then the likelihood is finite-dimensional:
$$L(\theta) = \prod_{i=1}^n \exp\{-(t_i/\theta_2)^{\theta_1}\} \left[(\theta_1/\theta_2)(t_i/\theta_2)^{\theta_1 - 1}\right]^{\delta_i}.$$
A Bayesian further places a prior on $\theta$, e.g. $\theta_1 \sim \Gamma(7.3, 2.4) \perp \theta_2 \sim \exp(0.74)$.
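As a quick check (a sketch, assuming $S$ is differentiable and writing $H(t) = \int_0^t h(s)\,ds$ for the cumulative hazard): the functions are linked by
$$S(t) = \exp\{-H(t)\}, \qquad f(t) = h(t)\,S(t), \qquad h(t) = -\frac{d}{dt}\log S(t).$$
For the Weibull example, $H(t) = (t/\theta_2)^{\theta_1}$, so $h(t) = (\theta_1/\theta_2)(t/\theta_2)^{\theta_1 - 1}$, and each observation contributes $S_\theta(t_i)\,h_\theta(t_i)^{\delta_i}$ to $L(\theta)$.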

Nonparametric survival without covariates

An infinite-dimensional process is defined directly on one of $h(t)$, $H(t)$, $f(t)$, or $S(t)$.
Priors on $h(t)$ include extended gamma, piecewise exponential, correlated processes, etc.
Priors on $H(t) = \int_0^t h(s)\,ds$ include gamma, beta, etc.
Priors on $S(t)$ include the Dirichlet process (DP).
Priors on $f(t)$ include DP mixtures, more general nonparametric mixtures, finite mixtures, Polya trees, log-splines, etc.
$H(t) \sim GP(c, H_\theta)$ and $S(t) \sim PT(c, \rho, S_\theta)$ are described below.

Gamma process prior on H(t)

Let $H_\theta(t)$ be increasing on $t > 0$, left-continuous, with $H_\theta(0) = 0$.
$H(t) \sim GP(c, H_\theta)$ if
    $H(0) = 0$.
    $H(t)$ has independent increments in disjoint intervals.
    $t > s$ implies $H(t) - H(s) \sim \Gamma(c(H_\theta(t) - H_\theta(s)), c)$.
Note that $E\{H(t)\} = H_\theta(t)$ and $\mathrm{var}\{H(t)\} = H_\theta(t)/c$. Also, $H(t)$ is increasing.
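As a quick check of these moments (a sketch, assuming the shape-rate parameterization in which $\Gamma(a, b)$ has mean $a/b$ and variance $a/b^2$): taking $s = 0$ gives $H(t) \sim \Gamma(cH_\theta(t), c)$, so
$$E\{H(t)\} = \frac{cH_\theta(t)}{c} = H_\theta(t), \qquad \mathrm{var}\{H(t)\} = \frac{cH_\theta(t)}{c^2} = \frac{H_\theta(t)}{c}.$$
Larger $c$ therefore concentrates $H$ more tightly around $H_\theta$.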

Piecewise exponential approximates gamma process

Let the partition $\mathbb{R}^+ = [0, a_1) \cup [a_1, a_2) \cup \cdots \cup [a_{J-1}, \infty)$ be fixed and known.
If $H(t) \sim GP(c, H_\theta)$, then
$$\lambda_j = H(a_j) - H(a_{j-1}) \stackrel{ind.}{\sim} \Gamma(c(H_\theta(a_j) - H_\theta(a_{j-1})), c).$$
Take the partition to be a fine mesh and assume the hazard is constant with value $\lambda_j$ over the interval $[a_{j-1}, a_j)$; this approximates the gamma process.
Finite dimensional. Easy to fit.
How to pick $a_1 < a_2 < \cdots < a_{J-1}$?

Piecewise constant hazard: $h(t)$

[Figure: a piecewise constant hazard $h(t)$ plotted against $t$.]

Piecewise constant hazard: $H(t) = \int_0^t h(s)\,ds$

[Figure: the corresponding cumulative hazard $H(t)$ plotted against $t$.]

Piecewise constant hazard: $S(t) = \exp\{-H(t)\}$

[Figure: the corresponding survival function $S(t)$ plotted against $t$.]

Piecewise constant hazard: $f(t) = h(t)S(t)$

[Figure: the corresponding density $f(t)$ plotted against $t$.]

Polya tree = partition + beta conditional probabilities

Notation: $S \sim PT(c, \rho(\cdot), S_\theta)$. $S$ is a random probability measure centered at $S_\theta$, a parametric distribution on $\mathbb{R}$.
A Polya tree prior on $S$ is defined through nested partitions of $\mathbb{R}$, say $\Pi_j^\theta$, and associated conditional probabilities $\mathcal{Y}_j$ at level $j$.
The partition $\Pi_j^\theta$ at level $j$ splits $\mathbb{R}$ into $2^j$ pieces of equal probability under $S_\theta$. Sets are denoted $B_\theta(\epsilon)$, where $\epsilon$ is binary.
The next slide shows $\Pi_1$, $\Pi_2$, and $\Pi_3$ for $S_\theta = N(0, 1)$.

Polya tree sets for Sθ = N(0, 1)

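Level 3:  000  001  010  011  100  101  110  111
Level 2:  00  01  10  11
Level 1:  0  1

Figure: First 3 partitions of $\mathbb{R}$ generated by $N(0, 1)$. Sets are labeled left to right; the axis is shown from $-2$ to $2$.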

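Parametric $S_\theta$ gives the partition.
Add $\mathcal{Y}_1 = \{Y_0, Y_1\}$, $\mathcal{Y}_2 = \{Y_{00}, Y_{01}, Y_{10}, Y_{11}\}$, $\mathcal{Y}_3 = \{Y_{000}, Y_{001}, Y_{010}, Y_{011}, Y_{100}, Y_{101}, Y_{110}, Y_{111}\}$, etc. to refine the density shape. Let $\mathcal{Y} = \{\mathcal{Y}_1, \mathcal{Y}_2, \dots, \mathcal{Y}_J\}$.
$Y_{\epsilon 0} = S\{B_\theta(\epsilon 0) \mid B_\theta(\epsilon)\}$ and $Y_{\epsilon 1} = 1 - Y_{\epsilon 0}$.
The next slides take $S_\theta$ to be $N(0, 1)$ and fix values of $\mathcal{Y}_1$, $\mathcal{Y}_2$, and $\mathcal{Y}_3$.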

[Figures: a sequence of slides for $S_\theta = N(0, 1)$, with the axis shown from $-2$ to $2$, in which the conditional probabilities are fixed one pair at a time.]
    Start: all pairs $(Y_{\epsilon 0}, Y_{\epsilon 1}) = (0.5, 0.5)$.
    Level $j = 1$: $(Y_0, Y_1) = (0.45, 0.55)$.
    Level $j = 2$: $(Y_{00}, Y_{01}) = (0.7, 0.3)$.
    Level $j = 2$: $(Y_{10}, Y_{11}) = (0.6, 0.4)$.
    Level $j = 3$: $(Y_{000}, Y_{001}) = (0.8, 0.2)$.
    Level $j = 3$: $(Y_{010}, Y_{011}) = (0.7, 0.3)$.
    Level $j = 3$: $(Y_{100}, Y_{101}) = (0.4, 0.6)$.
    Level $j = 3$: $(Y_{110}, Y_{111}) = (0.55, 0.45)$.
Figure: Mixture of finite Polya trees, using the same values of $\mathcal{Y}$ as the last step.

Prior on $(Y_{\epsilon 0}, Y_{\epsilon 1})$

Want $E(Y_{\epsilon 0}) = 0.5$ to center $S$ at $S_\theta$. Take
$$Y_{\epsilon 0} \sim \mathrm{beta}(c\rho(j), c\rho(j)).$$
The conjugate beta distribution gives a ‘Polya tree’; other distributions give a tailfree prior.
$c$ and $\rho(j)$ affect how quickly the data “take over” $S_\theta$.
$c$ is the weight; $\rho(j)$ affects “clumpiness.”
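To see the role of $c\rho(j)$ (a sketch using the standard variance of a symmetric beta distribution; the symbol $a$ is introduced here only for illustration): if $Y \sim \mathrm{beta}(a, a)$ with $a = c\rho(j)$, then
$$E(Y) = \frac{1}{2}, \qquad \mathrm{var}(Y) = \frac{1}{4(2a + 1)} = \frac{1}{4\{2c\rho(j) + 1\}}.$$
A large $c$ therefore holds every conditional probability close to 0.5, and a $\rho(j)$ that grows with $j$ shrinks the deeper levels of the tree more strongly toward $S_\theta$.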

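“Standard” parameterization $\rho(j) = j^2$

Level 0: $\mathbb{R}$
Level 1: $B_0, B_1$ with $(Y_0, Y_1) \sim \mathrm{Dir}(c, c)$
Level 2: $B_{00}, B_{01}, B_{10}, B_{11}$ with $(Y_{00}, Y_{01}) \sim \mathrm{Dir}(4c, 4c)$ and $(Y_{10}, Y_{11}) \sim \mathrm{Dir}(4c, 4c)$
Level 3: $B_{000}, B_{001}, \ldots, B_{111}$ with $(Y_{000}, Y_{001})$, $(Y_{010}, Y_{011})$, $(Y_{100}, Y_{101})$, $(Y_{110}, Y_{111})$ each $\sim \mathrm{Dir}(9c, 9c)$

$\Pi_1 = \{B_0, B_1\}$, $\mathcal{Y}_1 = \{Y_0, Y_1\}$.
$\Pi_2 = \{B_{00}, B_{01}, B_{10}, B_{11}\}$, $\mathcal{Y}_2 = \{Y_{00}, Y_{01}, Y_{10}, Y_{11}\}$.
$\Pi_3 = \{B_{000}, B_{001}, B_{010}, B_{011}, B_{100}, B_{101}, B_{110}, B_{111}\}$
$\mathcal{Y}_3 = \{Y_{000}, Y_{001}, Y_{010}, Y_{011}, Y_{100}, Y_{101}, Y_{110}, Y_{111}\}$
$\mathcal{Y} = \{Y_0, Y_{00}, Y_{10}, Y_{000}, Y_{010}, Y_{100}, Y_{110}\}$.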

What do random densities look like?

MPT prior $S \sim PT_5(1, \rho, \exp(\theta))$
where $\theta \sim \Gamma(10, 10)$ so $E(\theta) = 1$. So overall centering distribution is $\exp(1)$.
Take $J = 5$, and $c = 1$.
Look at 10 random $f(t)$’s from MPT prior. That is, 10 random $\mathcal{Y}$. The densities are averaged over $\theta \sim \Gamma(10, 10)$.
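
# A minimal Python sketch of one random density from the MPT prior above.
# Assumptions (not stated on the slides): exp(theta) is read as an exponential
# centering distribution with rate theta, Gamma(10, 10) is read as shape 10 and
# rate 10, rho(j) = j^2, and the integral over theta is approximated by a Monte
# Carlo average. All function names are hypothetical.
import math
import random


def draw_conditional_probs(J, c, rng):
    """Draw Y at levels 1..J: Y_{e0} ~ beta(c j^2, c j^2) and Y_{e1} = 1 - Y_{e0}."""
    Y = {}
    for j in range(1, J + 1):
        a = c * j * j
        level = []
        for _ in range(2 ** (j - 1)):
            y0 = rng.betavariate(a, a)
            level.extend([y0, 1.0 - y0])
        Y[j] = level
    return Y


def set_probabilities(Y, J):
    """Probability of each of the 2^J level-J sets: product of Y along its path."""
    probs = []
    for k in range(2 ** J):
        p = 1.0
        for j in range(1, J + 1):
            p *= Y[j][k >> (J - j)]  # index of the level-j ancestor of set k
        probs.append(p)
    return probs


def density(t, theta, Y, J):
    """Finite Polya tree density 2^J * f_theta(t) * prod_j Y_{eps_1...eps_j}(t)."""
    u = 1.0 - math.exp(-theta * t)  # centering cdf F_theta(t)
    p = 1.0
    for j in range(1, J + 1):
        k = min(int(u * 2 ** j), 2 ** j - 1)  # level-j set containing t
        p *= Y[j][k]
    return 2 ** J * p * theta * math.exp(-theta * t)


rng = random.Random(2009)
J, c = 5, 1.0
Y = draw_conditional_probs(J, c, rng)
probs = set_probabilities(Y, J)
# gammavariate takes (shape, scale); scale 1/10 corresponds to rate 10.
thetas = [rng.gammavariate(10.0, 1.0 / 10.0) for _ in range(200)]


def mpt_density(t):
    """Monte Carlo average of the density over the theta draws."""
    return sum(density(t, th, Y, J) for th in thetas) / len(thetas)


print(round(sum(probs), 6), mpt_density(1.0))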

Figure (panel title: MPT): $f_1, \ldots, f_{10} \overset{iid}{\sim} \int PT_5(1, \rho, \exp(\theta)) P(d\theta)$.

Why semiparametric?

Splits inference into two pieces: $\beta$ and $S_0(t)$.
Ideally, $\beta$ succinctly summarizes effects of risk factors $x$ on aspects of survival. Make $S_0(t)$ as flexible as possible.
Can make easily digestible statements concerning the population, e.g. “Median life for those receiving treatment A is 1.7 times that of those receiving B, adjusting for other factors.”
Good starting place for fully nonparametric models (e.g. additive models, varying coefficient models, dependent process models, MARS).
I will use mixtures of Polya trees priors on $S_0$ in examples.

Some models

PH: $h_x(t) = \exp(x'\beta) h_0(t)$.
AH: $h_x(t) = h_0(t) + \beta'x$.
AFT: $S_x(t) = S_0(e^{-x'\beta} t)$.
PO: $F_x(t)/S_x(t) = e^{\beta'x} F_0(t)/S_0(t)$.
PMRL: $m_x(t) = e^{\beta'x} m_0(t)$.
Others, but this is a nice start...

Proportional hazards (PH)

Model is:

$h_x(t) = \exp(x'\beta) h_0(t)$ or $S_x(t) = S_0(t)^{\exp(x'\beta)}$.

Extended to time dependent covariates via $h_x(t) = \exp(x(t)'\beta) h_0(t)$.
Stochastically orders $S_{x_1}$ and $S_{x_2}$.
$e^{\beta_j}$ is the factor by which risk changes when $x_j$ is increased by unity.
Priors placed on $\beta$ and one of $h_0(t)$, $H_0(t)$, or $S_0(t)$.
Cox (1972) is second most cited paper in statistics. (First is Kaplan and Meier, 1958).
Why?
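
As a brief aside on the PH model statement, here is a sketch of why its hazard and survival forms agree. It assumes time-constant covariates and writes $H_0(t) = \int_0^t h_0(s)\,ds$ for the baseline cumulative hazard, so that $S_0(t) = \exp\{-H_0(t)\}$:

\[
S_x(t) = \exp\Big\{-\int_0^t e^{x'\beta} h_0(s)\,ds\Big\} = \exp\{-e^{x'\beta} H_0(t)\} = S_0(t)^{\exp(x'\beta)}.
\]

For two covariate vectors, $h_{x_1}(t)/h_{x_2}(t) = \exp\{(x_1 - x_2)'\beta\}$ for every $t$, which is the sense in which $e^{\beta_j}$ measures the change in risk.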

Proportional hazards the “default”...

Therneau and Grambsch (2000), Modeling Survival Data: Extending the Cox Model, discuss the Cox model including many generalizations. When proportional hazards fails they recommend:
Stratification within the Cox model.
PH may hold over short time periods, so partition the time axis within the Cox model.
Time varying effects $\beta(t)$ within the Cox model.
Only as a last resort consider other models, e.g. accelerated failure time or additive hazards.
Why the reluctance to explore other semiparametric models?

Why is proportional hazards the “default?”

If you have a hammer, every problem looks like a nail.

Initially, partial likelihood made fitting relatively easy.
SAS PHREG, other software provided momentum.
Naturally generalized to time dependent covariates, time-varying effects, frailties, etc.
Highly interpretable.
But...with today’s computing power other semiparametric models may provide vastly improved fit over PH or generalizations of PH.
Having said that, there are a number of excellent packages available for fitting Bayesian PH models...

Fitting Bayesian PH in packages
SAS: BAYES statement in PROC PHREG gives piecewise exponential.
SAS: PROC MCMC.
Belitz, Brezger, Kneib, and Lang’s BayesX assigns penalized B-spline prior on $\log h_0(t)$ and allows for additive predictors, structured frailties, time-varying coefficients, etc. Free: http://www.stat.uni-muenchen.de/~bayesx/bayesx.html.
Spiegelhalter, Thomas, Best, and Lunn’s WinBUGS has example of counting process likelihood that can be easily modified to piecewise exponential. Also parametric example with frailties.
Alejandro Jara’s DPpackage for R can fit PH with piecewise constant $h_0(t)$ and nonparametric frailties.
Accelerated failure time (AFT)
Model is
$S_x(t) = S_0(e^{-x'\beta} t)$, or $\log T_x = x'\beta + e_0$.

Implies $q_p(x) = e^{x'\beta} q_p(0)$.
Stochastically orders $S_{x_1}$ and $S_{x_2}$.
$e^{\beta_j}$ is the factor by which any quantile (or the mean) changes when increasing $x_j$ by unity.
Priors can be placed on $S_0(t)$ or equivalently $e_0$. Prior elicitation in Bedrick, Christensen, and Johnson (2000).
Komarek’s bayesSurv for AFT models, spline and discrete normal mixture on error. Versions can be fit in WinBUGS.
bj() in Harrell’s Design library fits Buckley-James version.
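
# A small numerical check, offered as a sketch, of the AFT quantile property
# q_p(x) = exp(x'beta) q_p(0) stated above. Assumptions for illustration only:
# the baseline is exp(1), i.e. S_0(t) = exp(-t), the linear predictor x'beta is
# log(1.7), and quantiles are found by bisection. Function names are hypothetical.
import math


def baseline_survival(t):
    """Assumed baseline S_0(t) = exp(-t)."""
    return math.exp(-t)


def aft_survival(t, xb):
    """AFT model S_x(t) = S_0(exp(-x'beta) t), with xb standing for x'beta."""
    return baseline_survival(math.exp(-xb) * t)


def aft_quantile(p, xb, upper=1e6, iters=200):
    """Solve S_x(q) = 1 - p by bisection on [0, upper]; S_x is decreasing in t."""
    lo, hi = 0.0, upper
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if aft_survival(mid, xb) > 1.0 - p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xb = math.log(1.7)
# Ratio of covariate-specific to baseline quantiles at several probabilities p.
ratios = [aft_quantile(p, xb) / aft_quantile(p, 0.0) for p in (0.25, 0.5, 0.9)]
print(ratios)
# Each ratio should be close to exp(xb) = 1.7 whatever p is, which matches a
# statement like "median life is 1.7 times" in the AFT setting.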
Proportional odds (PO)

Model is
$\frac{1 - S_x(t)}{S_x(t)} = \exp(x'\beta) \frac{1 - S_0(t)}{S_0(t)}$.
$e^{\beta_j}$ is the factor by which the odds of the event occurring before $t$ change when $x_j$ is increased by unity (for any $t$).
Attenuation of risk:
$\lim_{t \to \infty} \frac{h_{x_1}(t)}{h_{x_2}(t)} = 1$.

Plausible in many situations.
No ready software for fitting Bayes version. timereg has frequentist version. (My code in FORTRAN.)
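
The attenuation of risk can be seen from the PO model statement directly. The following is a sketch, assuming $h_0$ exists and $S_0(t) \to 0$ as $t \to \infty$. Solving the model for $S_x$ and differentiating $-\log S_x(t)$ gives

\[
S_x(t) = \frac{S_0(t)}{S_0(t) + e^{x'\beta}\{1 - S_0(t)\}}, \qquad
h_x(t) = \frac{e^{x'\beta}\, h_0(t)}{S_0(t) + e^{x'\beta}\{1 - S_0(t)\}}.
\]

At $t = 0$ the hazard ratio $h_x(t)/h_0(t)$ equals $e^{x'\beta}$, and as $S_0(t) \to 0$ it tends to $e^{x'\beta}/e^{x'\beta} = 1$, so $h_{x_1}(t)/h_{x_2}(t) \to 1$ for any two covariate vectors.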

Additive hazards (AH)

Model is
$h_x(t) = h_0(t) + x(t)'\beta$.
$\beta_j$ is the additive change in risk when increasing $x_j$ by unity.
Can be estimated in standard software using empirical Bayes approach with gamma process prior on $H_0(t)$ (Sinha et al., 2009).
Other approaches require elaborate model specification to incorporate awkward constraints (Yin and Ibrahim, 2005; Dunson and Herring, 2005).
Non-Bayesian approach nicely implemented in Martinussen and Scheike (2006) timereg package.
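
One constraint of this kind, stated here as an assumption about what is meant rather than a quote from the cited papers, is that a hazard cannot be negative:

\[
h_x(t) = h_0(t) + x(t)'\beta \ge 0 \quad \text{for all } t \text{ and all covariate paths } x(t).
\]

With time-constant $x$ the model gives $S_x(t) = S_0(t) \exp(-t\, x'\beta)$, so a prior on $(\beta, h_0)$ has to rule out combinations for which this is not a valid survival function.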

Proportional mean residual life (PMRL)

Model is
$m_x(t) = \exp(x'\beta) m_0(t)$.
$e^{\beta_j}$ is the factor by which expected remaining lifetime from the current timepoint $t$ increases when $x_j$ is increased by unity, for any $t$. Very nice interpretation. Often what patients want to know.
Very hard to fit. Frequentist approaches exist, but a “real” Bayesian approach is not developed yet...

Super models!!!

Faster than a speeding bullet...
Generalized odds-rate model (Scharfstein et al., 1998):
$q_\rho\{S_x(t)\} = -x'\beta + q_\rho\{S_0(t)\}$
where $q_\rho(s) = \log\{\rho s^\rho / (1 - s^\rho)\}$. $\rho = 1$ gives PO and $\rho \to 0^+$ gives PH. Special case of transformation model.
Chen and Jewell (2001): $h_x(t) = h_0(t e^{x'\beta_1}) e^{x'\beta_2}$.
$\beta_1 = 0$ gives PH and $\beta_1 = \beta_2$ gives AFT.
Yin and Ibrahim (2005):
$\frac{h_x(t)^\rho - 1}{\rho} = \frac{h_0(t)^\rho - 1}{\rho} + \beta'x(t)$.
$\rho = 1$ gives AH model, $\rho \to 0$ gives PH. Authors treat $\rho$ as known when fitting.
$\beta$ loses interpretability; estimation of $\rho$ problematic.
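
The two limiting cases of the generalized odds-rate model can be checked directly. This is a sketch that uses only the definition of $q_\rho$ given above and $H(t) = -\log S(t)$. For $\rho = 1$,

\[
q_1(s) = \log\frac{s}{1 - s}
\;\Rightarrow\;
\frac{1 - S_x(t)}{S_x(t)} = e^{x'\beta}\,\frac{1 - S_0(t)}{S_0(t)},
\]

which is PO. For $\rho \to 0^+$, $1 - s^\rho = -\rho \log s + O(\rho^2)$, so

\[
q_\rho(s) \to -\log\{-\log s\}
\;\Rightarrow\;
\log H_x(t) = x'\beta + \log H_0(t),
\]

which is PH. In the same way, $(h^\rho - 1)/\rho \to \log h$ as $\rho \to 0$, so the Yin and Ibrahim model reduces to $\log h_x(t) = \log h_0(t) + \beta'x(t)$.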
Other generalizations

Frailties. $h_{ij}(t) = e^{x_{ij}'\beta + \gamma_i} h_0(t)$.
Cure rate. $P(T = \infty) > 0$.
Time dependent covariates. $h_x(t) = e^{x(t)'\beta} h_0(t)$.
Time varying effects. $h_x(t) = e^{x'\beta(t)} h_0(t)$.
Joint longitudinal/survival models. $y_i(t) = x_i(t) + e_i(t)$.
Recurrent events.
Completely nonparametric approaches.
Multistate models.
Competing risks.

Lung cancer data

Treatment of limited-stage small cell lung cancer in n = 121
patients, data presented in Maksymiuk et al. (1993).
Used in median-regression models (which have the AFT
property) by Ying et al. (1995), Walker and Mallick (1999),
Yang (1999), Kottas and Gelfand (2001), Hanson (2006).
Of interest: which sequence of cisplatin and etoposide gives
longer survival, adjusting for patient age.
Treatment A: cisplatin followed by etoposide; treatment B is
vice versa.
Treatment A administered to 62 patients, treatment B
administered to 59 patients; 23 patients right-censored.

Comparing AFT, PO, PH

Patient covariates are $x_i = (x_{i1}, x_{i2})$: age and treatment.
In three semiparametric models,
$S_0 \sim \int PT_5(1, \rho, S_\theta)\, dP(\theta)$.
For PH and AFT models $S_0$ centered at the Weibull
$\{S_\theta(t) = e^{-(t/\lambda)^\alpha} : \alpha > 0, \lambda > 0\}$.
PO model centered at log-logistic
$\{S_\theta(t) = (1 + \lambda t^\alpha)^{-1} : \alpha > 0, \lambda > 0\}$.
Parametric Weibull and log-logistic models also fit.
$p(\alpha, \lambda) \propto 1$. $p(\beta)$ flat, but calibrated to place models on
“equal ground” using Weibull baseline.

Model    Weibull   log-logistic   PO      PH      AFT
LPML     −747      −735           −734    −737    −734
Little predictive difference among the AFT, PO, and
log-logistic models.
Weibull model clearly inferior.
AFT and PO models have a pseudo Bayes factor of about
10 relative to the PH model.

Integrated Cox-Snell residual plots

Figure: Integrated Cox-Snell residual plots, four panels: Weibull, AFT, PH, PO.

Par.                MPT AFT                  MPT PO                   MPT PH
$\beta_1$ (age)     0.007 (−0.004, 0.036)    0.034 (−0.001, 0.071)    0.028 (0.003, 0.054)
$\beta_2$ (A or B)  0.345 (0.157, 0.533)     0.930 (0.292, 1.568)     0.533 (0.130, 0.926)

Posterior regression effects.
Holding age fixed, patients typically survive $e^{0.345} \approx 1.4$
times longer under treatment A versus B under the AFT
assumption.
The PO model indicates that the odds of surviving past any time $t$
are $e^{0.93} \approx 2.5$ times greater for treatment A versus B.

MPT AFT comparing treatments

Figure: Treatment A solid, B dashed.

Proportional hazards in BayesX

BayesX is a free, amazing Windows-based program for fitting
structured additive regression models; see
http://www.stat.uni-muenchen.de/~bayesx/bayesx.html.
Primarily written by Christiane Belitz, Andreas Brezger,
Thomas Kneib, and Stefan Lang.
Models can include spatial random effects (frailties), both
areal and point referenced (but not nonparametric).
Additive effects are handled primarily through penalized
B-splines.
Fitting the proportional hazards model (and important extensions)
is easy; the log-baseline hazard is modeled as a
penalized B-spline.
BayesX code using default priors
delimiter = ;
%%%%%%%%%%%%%%%%%%%%%%
% input data
%%%%%%%%%%%%%%%%%%%%%%
dataset surv;
surv.infile age time delta group, maxobs=5000
using c:\some_folder\cancer.txt;
%%%%%%%%%%%%%%%%%%%%%%
% linear predictor
%%%%%%%%%%%%%%%%%%%%%%
bayesreg lp;
lp.outfile = c:\some_folder\lp;
lp.regress delta = time(baseline) + group + age,
iterations=12000 burnin=2000 step=10 family=cox using surv;
%%%%%%%%%%%%%%%%%%%%%%
% additive age effect
%%%%%%%%%%%%%%%%%%%%%%
bayesreg aa;
aa.outfile = c:\some_folder\BayesX\aa;
aa.regress delta = time(baseline) + group + age(psplinerw2),
iterations=12000 burnin=2000 step=10 family=cox using surv;
%%%%%%%%%%%%%%%%%%%%%%
% varying coefficients
%%%%%%%%%%%%%%%%%%%%%%
bayesreg vc;
vc.outfile = c:\some_folder\BayesX\vc;
vc.regress delta = time(baseline) + group*time(baseline) + age*time(baseline),
iterations=12000 burnin=2000 step=10 family=cox using surv;
BayesX: $\lambda_i(t) = \exp\{\beta_0 + f_0(t) + \beta_g g_i + \beta_a a_i\}$

ESTIMATION RESULTS:

FixedEffects1

Acceptance rate:     75.19 %

Variable   mean           Std. Dev.         2.5% quant.         median        97.5% quant.
const      -10.2569       1.43149           -13.4               -10.1214      -7.90685
group      0.535453       0.198478          0.150767            0.546233      0.915852
age        0.0278298      0.0132894         0.00240202          0.0282356     0.0534999

Treatment B increases the hazard of death by a factor of $e^{0.535} \approx 1.7$.

Figure: $f_0(t)$, effect of time (horizontal axis: time).

BayesX: $\lambda_i(t) = \exp\{\beta_0 + f_0(t) + \beta_g g_i + f_a(a_i)\}$

Figure: $f_0(t)$, effect of time (horizontal axis: time).
Figure: $f_a(\text{age})$, effect of age (horizontal axis: age).

BayesX: $\lambda_i(t) = \exp\{\beta_0 + f_0(t) + \beta_g(t) g_i + \beta_a(t) a_i\}$

Figure: $f_0(t)$, effect of time.
Figure: $\beta_g(t)$, time-varying effect of group.
Figure: $\beta_a(t)$, time-varying effect of age.

BayesX: option predict gives DIC

$\lambda_i(t)$                                                    $\bar{D}$    $p_D$      DIC
$\exp\{\beta_0 + f_0(t) + \beta_g g_i + \beta_a a_i\}$            1449.2       8.3        1465.8
$\exp\{\beta_0 + f_0(t) + \beta_g g_i + f_a(a_i)\}$               1447.2       9.4        1466.0
$\exp\{\beta_0 + f_0(t) + \beta_g(t) g_i + \beta_a(t) a_i\}$      2638.7       -1199.8    239.2

Something is not quite right with the varying coefficient model's DIC:
its $p_D$ is negative.
In richly parameterized models, the posterior mean may not be
an ideal point at which to evaluate the deviance.

Lung cancer II
Veterans Administration (VA) Lung Cancer data introduced
by Prentice (1973).
Semiparametric PO model (Cheng et al., 1997; Murphy et
al., 1997; Yang and Prentice, 1999) & parametric models
(Farewell and Prentice, 1977; Bennett, 1983).
Survival in days of men with advanced inoperable lung
cancer.
Predictors of survival well established: tumor type (large,
adeno, small, squamous) & fitness performance score
ranging from 10 (completely hospitalized) to 90 (able to
take care of oneself).
Following others, consider a subgroup of n = 97 patients
with no prior therapy. Six of the 97 survival times are
censored.
Comparing fits

Parameter                      MPT              MPLE              MDF
Score                −0.055 (0.010)     −0.055 (0.010)   −0.034 (0.007)
Adeno vs. large       1.303 (0.559)      1.339 (0.556)    1.411 (0.674)
Small vs. large       1.362 (0.527)      1.440 (0.525)    1.353 (0.506)
Squamous vs. large   −0.173 (0.580)     −0.217 (0.589)    0.165 (0.653)

MPT PO model with J = 5 and c = 1; maximum profile
likelihood estimator (MPLE) of Murphy et al. (1997); one
minimum distance estimator (MDF) of Yang and Prentice
(1999).
Posterior medians and standard deviations obtained under
the MPT model are very close to the MPLE estimates.
Increasing performance score by 20 roughly triples the odds of
surviving past any fixed time point (an increase of about 200%):
$e^{(-20)(-0.055)} = e^{1.1} \approx 3$.
Comparing PO, AFT, PH, and parametric

              log-logistic   MPT PO         MPT gen. odds rate   MPT AFT   MPT PH
Centered at   (parametric)   log-logistic   log-logistic         Weibull   Weibull
LPML          −508           −508           −511                 −514      −516

Little predictive difference between the PO and log-logistic
models (the log-logistic has the PO property).
Weibull model clearly inferior. Pseudo Bayes factor of 3000
in favor of PO over PH model.
Proportional odds implies attenuation of risk as time goes
on.

Integrated Cox-Snell residual plots

Figure: Integrated Cox-Snell residual plots, four panels: PO, AFT, log-logistic, PH.

MPT PO comparing treatments

Figure: Predictive densities, squamous, MPT with c = 1; survival is in
days. Curves are labeled PS = 40, PS = 60, and PS = 80.
Figure: Predictive densities, performance status = 60, MPT with
c = 1. The one surviving curve label reads "large, squamous".

MPT PO median survival

Figure: Median survival (days) with 95% CI versus score for squamous,
c = 1.

Discussion...

Bayesian approach allowed comparison of parametric and
semiparametric survival regression models using standard
model selection criteria.
All fitting via MCMC routines.
Non-asymptotic inference. Everything exact up to MCMC
error.
Able to get inferences for hazard ratios, quantiles, etc.
Differences minor here, but have seen very marked
differences across survival models (e.g. PH vs. PO).


```
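
The pseudo Bayes factors and effect multipliers quoted in the two lung cancer examples can be recomputed from the reported tables. The sketch below is illustrative only. It assumes the pseudo Bayes factor of model 1 over model 2 is exp(LPML_1 − LPML_2), and it uses the rounded LPML values from the slides, so it can only be expected to match the quoted factors in order of magnitude. All function and variable names are hypothetical, not from the original.

```python
import math


def pseudo_bayes_factor(lpml_1: float, lpml_2: float) -> float:
    """Pseudo Bayes factor of model 1 over model 2.

    Assumption: PBF = exp(LPML_1 - LPML_2), with LPML on the natural-log scale.
    """
    return math.exp(lpml_1 - lpml_2)


# Rounded LPML values as reported in the slides (Lung cancer I and II).
lpml_lung1 = {"Weibull": -747, "log-logistic": -735, "PO": -734, "PH": -737, "AFT": -734}
lpml_lung2 = {"log-logistic": -508, "MPT PO": -508, "MPT gen. odds rate": -511,
              "MPT AFT": -514, "MPT PH": -516}

# AFT versus PH in Lung cancer I, and PO versus PH in Lung cancer II.
pbf_aft_vs_ph = pseudo_bayes_factor(lpml_lung1["AFT"], lpml_lung1["PH"])
pbf_po_vs_ph = pseudo_bayes_factor(lpml_lung2["MPT PO"], lpml_lung2["MPT PH"])

# Multiplicative effects implied by the reported posterior summaries.
aft_time_ratio = math.exp(0.345)               # survival-time ratio, A versus B (AFT)
po_odds_ratio = math.exp(0.930)                # survival-odds ratio, A versus B (PO)
ph_hazard_ratio = math.exp(0.535)              # hazard ratio, B versus A (BayesX PH fit)
score_odds_ratio = math.exp((-20) * (-0.055))  # 20-point score increase (PO)

print(f"PBF, AFT vs PH (Lung cancer I):  {pbf_aft_vs_ph:.1f}")
print(f"PBF, PO vs PH (Lung cancer II):  {pbf_po_vs_ph:.0f}")
print(f"AFT time ratio:   {aft_time_ratio:.2f}")
print(f"PO odds ratio:    {po_odds_ratio:.2f}")
print(f"PH hazard ratio:  {ph_hazard_ratio:.2f}")
print(f"Score odds ratio: {score_odds_ratio:.2f}")
```

With the rounded values, the Lung cancer I comparison gives exp(3), roughly 20, against the "about 10" quoted on the slide. The two agree in order of magnitude, and the gap may simply reflect rounding of the LPML values to whole numbers. The Lung cancer II comparison gives exp(8), close to the quoted 3000.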
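
The DIC table from the BayesX example can be checked by hand. The sketch below assumes the usual definitions DIC = D̄ + 2·p_D and p_D = D̄ − D(θ̄), where D(θ̄) is the deviance evaluated at the posterior mean. Whether BayesX computes DIC exactly this way is an assumption, although the reported rows agree with it up to rounding. The row labels and names are hypothetical.

```python
# (label, mean deviance barD, effective parameters pD, reported DIC), from the slide.
rows = [
    ("linear predictor",     1449.2,     8.3, 1465.8),
    ("additive age effect",  1447.2,     9.4, 1466.0),
    ("varying coefficients", 2638.7, -1199.8,  239.2),
]


def dic(dbar: float, p_d: float) -> float:
    """DIC under the assumed identity DIC = barD + 2 * pD."""
    return dbar + 2.0 * p_d


def deviance_at_posterior_mean(dbar: float, p_d: float) -> float:
    """Deviance at the posterior mean, from the assumed identity pD = barD - D(theta_bar)."""
    return dbar - p_d


results = []
for label, dbar, p_d, reported in rows:
    results.append({
        "label": label,
        "dic": dic(dbar, p_d),
        "reported": reported,
        "dev_at_mean": deviance_at_posterior_mean(dbar, p_d),
        "suspect": p_d < 0,  # a negative pD signals that the DIC is not trustworthy
    })

suspect_models = [r["label"] for r in results if r["suspect"]]

for r in results:
    print(f'{r["label"]:<22} DIC={r["dic"]:8.1f} (reported {r["reported"]:.1f}) '
          f'D(theta_bar)={r["dev_at_mean"]:8.1f} suspect={r["suspect"]}')
```

Under these assumptions, the varying coefficient model has a deviance at the posterior mean that exceeds its mean deviance, which is what makes p_D negative and is in line with the remark that the posterior mean may be a poor place to evaluate the deviance.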