Contents 1 of 15
Goodness of Fit in Relative Survival Regression
Maja Pohar Perme; Janez Stare; Robin Henderson
University of Ljubljana, Institute of Biomedical Informatics, Vrazov trg 2, 1000 Ljubljana, Slovenia.
University of Newcastle, School of Mathematics and Statistics, NE1 7RU, Newcastle upon Tyne, UK.
E-mail: firstname.lastname@example.org; email@example.com; Robin.Henderson@ncl.ac.uk
The idea of the relative survival methods is to compare the observed survival of the patients to the
survival of the population at the same period of time. These methods become important when we
wish to estimate the mortality due to a certain disease, but we do not have reliable information on
the cause of death. The relative survival is therefore crucial for all the survival analyses, in which the
patients have been followed for long periods of time. In such studies, we shall almost always observe
the negative eﬀect of age and the positive eﬀect of the calendar year, even though this might have no
relation to the disease of our interest. To answer questions like ”Has the hazard of a disease lowered
in the last decade?” one can resort to the relative survival regression models.
The data used in this ﬁeld are frequently obtained from the registries, with cancer survival studies being
the prime example. The information on the population survival is obtained from the the population
mortality tables - we can use them to estimate the hazard every person is subdued to because of his
or her age, sex and cohort year (or any other combination of covariates included in the population
mortality data). This quantity is named the population hazard λP as opposed to the observed hazard
T (λO ), which is the hazard we estimate from the data by regarding all deaths as events.
Contents 2 of 15
The most common approach of modelling relative survival is through assuming a relationship
between these two hazards, which can be either additive [7, 4, 3, 2]
(1) λO (t) = λP (t) + λE (t),
or multiplicative 
(2) λO (t) = λP (t)ν(t).
The term λE (t) is named the excess hazard and interpreted as the hazard speciﬁc for the disease in
question, while the unit-free term ν(t) can be seen as a factor of relative mortality. These are the
terms in which the relationship with the covariates is speciﬁed, we shall focus on the situation where
the excess hazard is modelled as
(3) λE (t) = λ0 (t)eβ Z
and similarly the relative mortality is set to equal
(4) ν(t) = ν0 (t)eβ Z .
In this way, both models assume that the eﬀect of the covariates Z does not change in time, i.e. the
β coeﬃcient is constant (proportional hazards assumption).
An important theoretical advantage of the multiplicative model (4) is that it can be rewritten as
(5) λO (t) = ν0 (t)eβ Z+1·ln(λP (t)) ,
Contents 3 of 15
implying that this is a special case of the Cox model with an additional time-dependent variable having
the coeﬃcient ﬁxed to 1. Therefore, this model can be ﬁtted using the partial likelihood approach and
no assumptions are needed for the ν0 (t). On the contrary, full likelihood is needed to ﬁt the additive
model and its fully parametric form poses additional assumptions.
When ﬁtting a relative survival regression model, one must therefore make a series of assump-
tions. However, the topic of goodness of ﬁt of these models has traditionally been very poorly addressed
in statistical literature and hardly ever in practice. In this paper we shall review some of the recently
introduced methods that either deal with checking or avoiding the assumptions. The next section
describes the individual measure as an alternative approach to relative survival that needs no assump-
tions about the relationship between the observed and population hazard. Section 3 presents the
methods for checking the proportional (excess) hazards assumption and Section 4 demonstrates how
the additive model can also be ﬁtted as semiparametric. Section 5 brieﬂy describes the R package
relsurv that puts into practice all the reviewed methods and Section 6 contains some discussion and
2 Individual measure of relative survival - the transformation ap-
Using the population mortality tables, the cumulative population mortality distribution can be cal-
culated for each individual. We denote this curve as FP (t) and interpret it as the probability that an
individual in our study (of a given age and sex, diagnosed in a certain calendar year) dies of ’popu-
T lation reasons’ in the course of the next t years. To evaluate the eﬀect of the disease in question we
Contents 4 of 15
then transform the actual observed time T to get
(6) Y := FP (T ).
The value Y obtained with this simple transformation is named the individual measure and can be
interpreted as how long a person has lived compared to his counterparts in the population . As
a value on the cumulative distribution curve it is always between 0 and 1, a value of 0.5 for example
implies that an individual has outlived exactly half of his population counterparts. So, in order to
compare the survival experience of two individuals that have a known diﬀerent expected survival, we
use the values Y they achieved on their expected cdf. In this way we deﬁne a new time to event
that can be used for survival analysis. The censoring mechanism remains the same, i.e. if time
t was a censoring time, the same holds for time y. Note also that if T is distributed as FP , the
measure Y is the probability integral transform and thus uniformly distributed. This property helps
us intuitively understand why this transformation is sensible in the context of relative survival. If all
our patients had only the general population risks (and none of the disease), by transforming their
times by their corresponding FP , we would make them all identically distributed, i.e. uniformly on
[0, 1]. Therefore, by transforming to the Y scale, we automatically take into account all the population
risk, consequently, what is left is precisely the excess risk, that we can thus directly model. Basically
any model that is appropriate for the survival data could be used, we shall focus here on the Cox
(7) λ( y) := λ0 (t)eβ Z .
Note that the transformation can cause dependence between the censoring times and the covariates
T used for transforming. Therefore, the model should always include these variables in order to ensure
Contents 5 of 15
conditional independence between the survival and censoring times distribution.
As both the multiplicative model and the Cox model in transformed time are just special cases of the
Cox model, no further theoretical work is needed. It should be noted, however, that the three models
introduced (1,2,7) have quite a diﬀerent interpretation. While the idea of the additive model is to use
the population tables to get information about the cause speciﬁc survival, the other two models give
information on the eﬀect of the covariates relative to the population. As an example, say we are in
year 2000 and the excess hazard for the disease in question equals to λE = 0.04 regardless of age. This
is roughly the size of the population hazard of a 70 year old man and implies about 20% of people
dying of excess risk in the course of the next 5 years. Added to the population hazard of a 50 year
old (λP = 0.007), this is a substantial increase of his observed hazard. On the other hand, a 90 year
old, whose population hazard is λP = 0.25, would hardly notice the excess hazard and the disease
would have only a negligible eﬀect on his survival. While the additive model would thus result in
no eﬀect of age, the multiplicative model and the transformation approach would give a signiﬁcantly
negative coeﬃcient for age. The interpretation would be similar in both cases, i.e. the impact of
the disease on survival relatively decreases with age. The sign of the coeﬃcient in these two models
gives an indication of how the patients live compared to the population, if β is constant in time, this
relationship remains the same throughout the follow-up time. On the other hand, in the case of the
additive model, one should always interpret also the size of the baseline excess hazard. If it becomes
very small after a few years of follow-up, even a large coeﬃcient does not carry much importance for
the observed survival.
Note also that in the case of the both, the multiplicative and the Cox model in transformed time,
the coeﬃcient for the covariates not included in the population tables (e.g. smoking) is the same as
Contents 6 of 15
if the observed hazard was ﬁtted with a Cox model. This corresponds to the fact that no additional
information on these covariates was obtained by using the population tables. This is not the case for
the additive model, where all coeﬃcients are diﬀerent and the proportional hazards assumption is also
not met simultaneously. Nevertheless, the sign of the coeﬃcients will remain the same.
In cancer survival the ﬁrst year is usually critical, regardless whether the patients are young or old.
Therefore the change of scale is probably not a good idea as it would lead to nonproportional hazards.
On the other hand, the additive model is not suitable when the excess hazard is small compared to
the population hazard. Though the multiplicative model and the Cox model in transformation time
often give similar results, the proportional hazards assumption is usually not simultaneously satisﬁed.
Therefore, we should check the goodness of ﬁt of the model before interpreting it. This is the focus of
the next section.
3 Proportional (excess) hazard assumption - partial residuals
The multiplicative model and the Cox model in transformed time can both be studied withing the
Cox model framework, whose theory provides us with several options for checking the PH assumption.
A particularly useful diagnostic tool are the Schoenfeld’s partial residuals  that arise naturally
from the estimating equations obtained from partial likelihood. On the other hand, no substantial
literature exists on the goodness of ﬁt checking for the additive model. The deviance residuals  can
only be used for grouped data and their performance is sensible to the group sizes, ad hoc testing using
interaction with time depends on the parametrization of the baseline excess hazard. The idea of this
section is to deﬁne residuals akin to Schoenfeld residuals that give good information about goodness of
ﬁt of the additive model, but are not tied to the model ﬁtting procedure and do not require grouping.
T The partial residuals for the additive model are deﬁned as 
Contents 7 of 15
Ui (β) : = Zi − E(β, ti )
where E is given by
λOj (β, ti )
(8) E (β, ti ) : = Zj .
λOk (β, ti )
k ∈R i
and can be interpreted as the expected value of the covariate Z of the person who experienced an
event at time ti . The Ri in the above formula denotes the risk set at time ti . Note that this is the
same as the deﬁnition of the Schoenfeld residuals, the only diﬀerence is that the Cox model hazard λ
is replaced by the observed hazard λOj .
Using a one-term Taylor series expansion, one can show that the true coeﬃcient β 0 in time can be
(9) β 0 (ti ) = β + E(β, ti ) E [Ui (β)|Ri ] .
A graphical test of the proportional hazards assumption can therefore be performed by plotting a
smoothed trace through scaled partial residuals, see Figure 1. This is completely analogous to the
method used in the Cox model framework, with the sole diﬀerence that the scaling factor does not
equal the variance of the residuals.
To support the graphical impression with a formal test, again the ideas for the Cox model are followed
and a cumulative sum Bn of the residuals standardized with their variance Vi is formed 
Contents 8 of 15
k 1 Ui (β)
(10) Bn (β, 0) := 0 Bn (β, ) := √ , k = 1, . . . , n.
n n i=1
[ Figure not available at time of publication ]
Figure 1: Two simulated data sets with (a) a constant β 0 (t), (b) a decreasing β 0 (t). The true value is
denoted by the dotted line, the smooth trace through scaled residuals is presented with a solid curve.
If the smoothed trace through residuals is constant, the positive and negative steps exchange randomly
in the cumulative sum (10) and therefore its value rarely digresses far from 0. On the other hand,
if a certain, e.g. negative linear trend exists, the sign of the residuals will ﬁrst be positive and later
negative with the cumulative sum departing quite far away from 0, see Figures 2(a) and (b).
To evaluate the importance of these departures, we consider the ’tied down’ process (see Figure 2(c))
k k k
(11) BBn (β, ) = Bn (β, ) − Bn (β, 1).
n n n
[ Figure not available at time of publication ]
(a) (b) (c)
Figure 2: An example of a negative linear trend: (a) standardised residuals and their smoothed average;
(b) Brownian motion process; (c) Brownian bridge process.
Contents 9 of 15
This process can be shown to converge to the Brownian bridge in the true β 0 , and is a reasonable
approximation of it using the estimated coeﬃcient β . Based on the Brownian bridge theory, we
can use several kinds of test statistics, depending on the alternative hypothesis in mind. A simple
test statistic having good power when checking for monotonic changes of β would be the maximum
absolute value of the bridge that is under the null hypothesis distributed as
(12) P ( max (|BB(t)|) ≤ x) = 1 + 2 (−1)k e−2k , x > 0.
4 A semiparametric additive model - the EM approach
In the context of relative survival literature most frequently refers to the additive models [7, 4, 3],
dominating especially in cancer research. While models (2) and (7) can proﬁt from the Cox model
theory, the theory for the additive model has been developed separately. In this section we will show
that the additive model can also be ﬁtted by iterating the Cox model estimation . This enables us
to use the methods already available and to better understand its properties. Furthermore, the ﬁtting
algorithm allows the model to be semi-parametric with the baseline excess hazard left unspeciﬁed.
This is important as assumptions on the parametric form of the baseline excess hazard can aﬀect the
coeﬃcient estimates as well as the goodness of ﬁt tests.
The main idea of this approach is to regard the cause of death (δiE = 1 for deaths due to the
disease, 0 otherwise) as the missing information. The probability of a patient dying due to the disease
(conditional on that patient dying (δi = 1) at time ti ) equals:
Contents 10 of 15
λEi (ti ) λ0 (ti ) exp(βZi )
(13) P (δEi = 1|δi , ti ) = δi = δi .
λP i (ti ) + λEi (ti ) λP i (ti ) + λ0 (ti ) exp(βZi )
If δiE was known, the relative survival methods would not be needed as one could ﬁt the cause speciﬁc
Cox model with the partial likelihood function
(14) L(β|Z) = .
Hence iterating the partial likelihood maximization and updating the value of δEi forms the EM al-
gorithm. The baseline excess hazard can be estimated on each step using the Breslow estimator, so
no assumptions are needed about its form. Nevertheless some smoothing must be applied in order to
ensure the excess hazard is positive on all intervals.
The proposed method is thus an iteration between Cox model ﬁtting and a simple ratio calculation.
This means that the additive model can be seen as a generalization of the Cox model and the wealth
of extensions available for the Cox model can be straightforwardly incorporated into relative survival.
One can also get a better understanding of the additive model properties. For example, it can easily
be seen that the precision of the additive model estimates depends on the percentage of deaths due
to the disease - it determines the number of events for the Cox model and the amount of missing
information. The interpretation of the model can also be helped by the estimated individual disease
related death probabilities (13) that are obtained in the ﬁtting procedure.
Contents 11 of 15
5 The relsurv package
The R  package relsurv  currently unites more than 25 functions that cover several phases
of the relative survival analysis, from plotting the relative survival curves to ﬁtting and evaluating
diﬀerent models with common goodness of ﬁt methods. The common input and output that follows
the organization of the existing survival package ensure easy use and comparability of results. An
important reason for the choice of R software is the ﬂexible implementation of the population tables
(via object ratetable) that allows almost any format. The package also includes a couple of functions
that enable direct usage of the mortality tables published on internet (e.g. Human Mortality
Database http://www.mortality.org ). The package is freely available at the R web repository
The relative survival regression models are quite often used when analysing cancer registries data,
but this usage is almost exclusively conﬁned to the additive model. While it is often argued that the
additive model is well suited to describing cancer survival, this is hardly ever supported by checking
the sensibility of its assumptions. To help improving the analyses, this paper therefore reviews some
of the recent advances in the ﬁeld of goodness of ﬁt of relative survival regression models.
The most crucial assumption made in any of the models is the one concerning the relationship be-
tween the observed and population hazard, unfortunately, no methods exist for checking it. The only
method that avoids this assumption is the transformation approach that presents a ﬂexible way of
Contents 12 of 15
modelling and an interesting alternative to the existing methods. It is important to note that the
additive, the multiplicative and the Cox model in transformed time all attempt to describe relative
survival, but have to be interpreted in a very diﬀerent way. While the additive model aims to estimate
the cause speciﬁc survival, the estimates in the other two models directly give information relative
to the population survival. Furthermore, these two models become indispensable when the observed
survival is close or even better than the population survival.
The statement based on any chosen model, that a certain predictor does not inﬂuence relative sur-
vival, should therefore be taken with care and interpreted accordingly, since the null hypotheses of
the models diﬀer. To get a better understanding of the data it might even be useful to ﬁt several
models. These diﬀerences between the models are problematic only for the variables included in the
population tables, though. All other variables will yield results with the same interpretation in all the
The other assumption common to all the models described is the one of proportional (excess) haz-
ards. This assumption can be checked using residuals with the same idea and interpretation in all
the models. Graphical as well as formal testing procedures can thus give a valuable and comparable
information on the ﬁt of the models.
When performing any goodness of ﬁt tests one must keep in mind that all the assumptions but the
one we are checking are assumed to be fulﬁlled. In the case of the additive model, a misspeciﬁed
parametric form of the baseline excess hazard (another non-testable assumption) might importantly
aﬀect the test results. To avoid specifying the baseline excess hazard form, one can resort to the EM
ﬁtting approach [? ]. When the assumptions about the baseline excess hazard hold, the performance
of both parametric and semi-parametric approaches will be very similar, but when the assumptions
are violated, the EM approach can be more reliable.
Contents 13 of 15
As the the EM ﬁtting method can be viewed as a generalization on the Cox model, any further exten-
sions used in the Cox model framework (splines, frailties, etc) for classical survival can be straightfor-
wardly incorporated into the study of relative survival. This makes the method very ﬂexible and frees
us of the need to ”reinvent” all the methods for relative survival use.
The relative survival methodology is in practice mostly conﬁned to the cancer registries and is routinely
following the theoretical work done several decades ago using predominantly methods for grouped
data. This is partly due also to the lack of programs that would allow the users to perform more
into depth analyses. While some software does exist (Surv3 , SAS macros and Stata functions
(www.pauldickman.com), RSurv R function ), it is mostly covering only one regression model and
has several limitations regarding the population mortality tables. A hopefully useful step forward is
the relsurv  package in R  that enables ﬁtting and checking all the models described and
provides many other functions needed for a quality analysis of the data.
 Andersen, P. K., Borch-Johnsen, K., Deckert, T., Green, A., Hougaard, P., Keiding, N., and Kreiner, S.
(1985). A Cox regression model for relative mortality and its application to diabetes mellitus survival data.
 Cortese, G. and Scheike, T. H. (2008). Dynamic regression hazards models for relative survival. Statistics
in Medicine, 27:3563–3548.
 Dickman, P. W., Sloggett, A., Hills, M., and Hakulinen, T. (2004). Regression models for relative survival.
Statistics in Medicine, 23:51–64.
Contents 14 of 15
 Est`ve, J., Benhamou, E., Croasdale, M., and Raymond, M. (1990). Relative survival and the estimation of
net survival: elements for further discussion. Statistics in Medicine, 9:529–538.
 Giorgi, R., Payan, J., and Gouvernet, J. (2005). RSURV: A function to perform relative survival analysis
with S-PLUS or R. 78:175–178.
 Hakulinen, T. and Abeywickrama, K. H. (1985). A computer program package for relative survival analysis.
Computer Methods and Programs in Biomedicine, 19 (2-3):197–207.
 Hakulinen, T. and Tenkanen, L. (1987). Regression analysis of relative survival rates. Journal of the Royal
Statistical Society — Series C, 36:309–317.
 Pohar, M. (2007). Goodness of ﬁt of relative survival models. Doctoral thesis.
 Pohar, M., Henderson, R., and Stare, J. (2008). An approach to estimation in relative survival regression.
Biostatistics, to appear.
 Pohar, M. and Stare, J. (2006). Relative survival analysis in R. Computer Methods and Programs in
 Pohar, M. and Stare, J. (2007). Making relative survival analysis relatively easy. Computers in biology and
 R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation
for Statistical Computing, Vienna, Austria.
 Schoenfeld, D. (1982). Partial residuals for the proportional hazards regression model. Biometrika, 69:239–
Contents 15 of 15
 Stare, J., Henderson, R., and Pohar, M. (2005a). An individual measure of relative survival. Journal of
the Royal Statistical Society — Series C, 54:115–126.
 Stare, J., Pohar, M., and Henderson, R. (2005b). Goodness of ﬁt of relative survival models. Statistics in