Goodness of Fit in Relative Survival Regression by sdsdfqw21


                                      Contents                                                                     1 of 15


                                  Goodness of Fit in Relative Survival Regression
               Maja Pohar Perme; Janez Stare; Robin Henderson
               University of Ljubljana, Institute of Biomedical Informatics, Vrazov trg 2, 1000 Ljubljana, Slovenia.
               University of Newcastle, School of Mathematics and Statistics, NE1 7RU, Newcastle upon Tyne, UK.

               1     Introduction
               The idea of the relative survival methods is to compare the observed survival of the patients to the

               survival of the population at the same period of time. These methods become important when we
               wish to estimate the mortality due to a certain disease, but we do not have reliable information on
               the cause of death. The relative survival is therefore crucial for all the survival analyses, in which the
               patients have been followed for long periods of time. In such studies, we shall almost always observe
               the negative effect of age and the positive effect of the calendar year, even though this might have no
               relation to the disease of our interest. To answer questions like ”Has the hazard of a disease lowered
               in the last decade?” one can resort to the relative survival regression models.
               The data used in this field are frequently obtained from the registries, with cancer survival studies being
               the prime example. The information on the population survival is obtained from the the population
               mortality tables - we can use them to estimate the hazard every person is subdued to because of his
               or her age, sex and cohort year (or any other combination of covariates included in the population
               mortality data). This quantity is named the population hazard λP as opposed to the observed hazard
           T   (λO ), which is the hazard we estimate from the data by regarding all deaths as events.
                                    Contents                                                        2 of 15

                    The most common approach of modelling relative survival is through assuming a relationship
               between these two hazards, which can be either additive [7, 4, 3, 2]

               (1)         λO (t) = λP (t) + λE (t),

               or multiplicative [1]

               (2)         λO (t) = λP (t)ν(t).

               The term λE (t) is named the excess hazard and interpreted as the hazard specific for the disease in
               question, while the unit-free term ν(t) can be seen as a factor of relative mortality. These are the

               terms in which the relationship with the covariates is specified, we shall focus on the situation where
               the excess hazard is modelled as

               (3)         λE (t) = λ0 (t)eβ Z

               and similarly the relative mortality is set to equal

               (4)         ν(t) = ν0 (t)eβ Z .

               In this way, both models assume that the effect of the covariates Z does not change in time, i.e. the
               β coefficient is constant (proportional hazards assumption).
               An important theoretical advantage of the multiplicative model (4) is that it can be rewritten as

               (5)         λO (t) = ν0 (t)eβ Z+1·ln(λP (t)) ,
                                   Contents                                                            3 of 15

               implying that this is a special case of the Cox model with an additional time-dependent variable having
               the coefficient fixed to 1. Therefore, this model can be fitted using the partial likelihood approach and
               no assumptions are needed for the ν0 (t). On the contrary, full likelihood is needed to fit the additive
               model and its fully parametric form poses additional assumptions.
                     When fitting a relative survival regression model, one must therefore make a series of assump-
               tions. However, the topic of goodness of fit of these models has traditionally been very poorly addressed
               in statistical literature and hardly ever in practice. In this paper we shall review some of the recently
               introduced methods that either deal with checking or avoiding the assumptions. The next section
               describes the individual measure as an alternative approach to relative survival that needs no assump-

               tions about the relationship between the observed and population hazard. Section 3 presents the
               methods for checking the proportional (excess) hazards assumption and Section 4 demonstrates how
               the additive model can also be fitted as semiparametric. Section 5 briefly describes the R package
               relsurv that puts into practice all the reviewed methods and Section 6 contains some discussion and
               concluding remarks.

               2    Individual measure of relative survival - the transformation ap-
               Using the population mortality tables, the cumulative population mortality distribution can be cal-
               culated for each individual. We denote this curve as FP (t) and interpret it as the probability that an
               individual in our study (of a given age and sex, diagnosed in a certain calendar year) dies of ’popu-
           T   lation reasons’ in the course of the next t years. To evaluate the effect of the disease in question we
                                    Contents                                                           4 of 15

               then transform the actual observed time T to get
               (6)         Y := FP (T ).
               The value Y obtained with this simple transformation is named the individual measure and can be
               interpreted as how long a person has lived compared to his counterparts in the population [14]. As
               a value on the cumulative distribution curve it is always between 0 and 1, a value of 0.5 for example
               implies that an individual has outlived exactly half of his population counterparts. So, in order to
               compare the survival experience of two individuals that have a known different expected survival, we
               use the values Y they achieved on their expected cdf. In this way we define a new time to event

               that can be used for survival analysis. The censoring mechanism remains the same, i.e. if time

               t was a censoring time, the same holds for time y. Note also that if T is distributed as FP , the
               measure Y is the probability integral transform and thus uniformly distributed. This property helps
               us intuitively understand why this transformation is sensible in the context of relative survival. If all
               our patients had only the general population risks (and none of the disease), by transforming their
               times by their corresponding FP , we would make them all identically distributed, i.e. uniformly on
               [0, 1]. Therefore, by transforming to the Y scale, we automatically take into account all the population
               risk, consequently, what is left is precisely the excess risk, that we can thus directly model. Basically
               any model that is appropriate for the survival data could be used, we shall focus here on the Cox
               (7)         λ( y) := λ0 (t)eβ Z .
               Note that the transformation can cause dependence between the censoring times and the covariates
           T   used for transforming. Therefore, the model should always include these variables in order to ensure
                                  Contents                                                           5 of 15

               conditional independence between the survival and censoring times distribution.
               As both the multiplicative model and the Cox model in transformed time are just special cases of the
               Cox model, no further theoretical work is needed. It should be noted, however, that the three models
               introduced (1,2,7) have quite a different interpretation. While the idea of the additive model is to use
               the population tables to get information about the cause specific survival, the other two models give
               information on the effect of the covariates relative to the population. As an example, say we are in
               year 2000 and the excess hazard for the disease in question equals to λE = 0.04 regardless of age. This
               is roughly the size of the population hazard of a 70 year old man and implies about 20% of people
               dying of excess risk in the course of the next 5 years. Added to the population hazard of a 50 year

               old (λP = 0.007), this is a substantial increase of his observed hazard. On the other hand, a 90 year
               old, whose population hazard is λP = 0.25, would hardly notice the excess hazard and the disease
               would have only a negligible effect on his survival. While the additive model would thus result in
               no effect of age, the multiplicative model and the transformation approach would give a significantly
               negative coefficient for age. The interpretation would be similar in both cases, i.e. the impact of
               the disease on survival relatively decreases with age. The sign of the coefficient in these two models
               gives an indication of how the patients live compared to the population, if β is constant in time, this
               relationship remains the same throughout the follow-up time. On the other hand, in the case of the
               additive model, one should always interpret also the size of the baseline excess hazard. If it becomes
               very small after a few years of follow-up, even a large coefficient does not carry much importance for
               the observed survival.
               Note also that in the case of the both, the multiplicative and the Cox model in transformed time,
               the coefficient for the covariates not included in the population tables (e.g. smoking) is the same as
                                   Contents                                                           6 of 15

               if the observed hazard was fitted with a Cox model. This corresponds to the fact that no additional
               information on these covariates was obtained by using the population tables. This is not the case for
               the additive model, where all coefficients are different and the proportional hazards assumption is also
               not met simultaneously. Nevertheless, the sign of the coefficients will remain the same.
               In cancer survival the first year is usually critical, regardless whether the patients are young or old.
               Therefore the change of scale is probably not a good idea as it would lead to nonproportional hazards.
               On the other hand, the additive model is not suitable when the excess hazard is small compared to
               the population hazard. Though the multiplicative model and the Cox model in transformation time
               often give similar results, the proportional hazards assumption is usually not simultaneously satisfied.

               Therefore, we should check the goodness of fit of the model before interpreting it. This is the focus of
               the next section.

               3    Proportional (excess) hazard assumption - partial residuals
               The multiplicative model and the Cox model in transformed time can both be studied withing the
               Cox model framework, whose theory provides us with several options for checking the PH assumption.
               A particularly useful diagnostic tool are the Schoenfeld’s partial residuals [13] that arise naturally
               from the estimating equations obtained from partial likelihood. On the other hand, no substantial
               literature exists on the goodness of fit checking for the additive model. The deviance residuals [3] can
               only be used for grouped data and their performance is sensible to the group sizes, ad hoc testing using
               interaction with time depends on the parametrization of the baseline excess hazard. The idea of this
               section is to define residuals akin to Schoenfeld residuals that give good information about goodness of
               fit of the additive model, but are not tied to the model fitting procedure and do not require grouping.
           T   The partial residuals for the additive model are defined as [15]
                                     Contents                                                       7 of 15

                          Ui (β) : = Zi − E(β, ti )

               where E is given by
                                                           λOj (β, ti )
               (8)        E (β, ti ) : =           Zj                     .
                                                             λOk (β, ti )
                                                        k ∈R i

               and can be interpreted as the expected value of the covariate Z of the person who experienced an
               event at time ti . The Ri in the above formula denotes the risk set at time ti . Note that this is the
               same as the definition of the Schoenfeld residuals, the only difference is that the Cox model hazard λ

               is replaced by the observed hazard λOj .

               Using a one-term Taylor series expansion, one can show that the true coefficient β 0 in time can be
               approximated by
                                    .        ∂
               (9)        β 0 (ti ) = β +      E(β, ti )              E [Ui (β)|Ri ] .
               A graphical test of the proportional hazards assumption can therefore be performed by plotting a
               smoothed trace through scaled partial residuals, see Figure 1. This is completely analogous to the
               method used in the Cox model framework, with the sole difference that the scaling factor does not
               equal the variance of the residuals.

               To support the graphical impression with a formal test, again the ideas for the Cox model are followed
               and a cumulative sum Bn of the residuals standardized with their variance Vi is formed [15]
                                    Contents                                                           8 of 15

                                                     k      1            Ui (β)
               (10)         Bn (β, 0) := 0     Bn (β, ) := √                     , k = 1, . . . , n.
                                                     n       n     i=1
                                                                          Vi (β)

                                             [ Figure not available at time of publication ]
                                     (a)                                  (b)

               Figure 1: Two simulated data sets with (a) a constant β 0 (t), (b) a decreasing β 0 (t). The true value is
               denoted by the dotted line, the smooth trace through scaled residuals is presented with a solid curve.

               If the smoothed trace through residuals is constant, the positive and negative steps exchange randomly
               in the cumulative sum (10) and therefore its value rarely digresses far from 0. On the other hand,
               if a certain, e.g. negative linear trend exists, the sign of the residuals will first be positive and later
               negative with the cumulative sum departing quite far away from 0, see Figures 2(a) and (b).
               To evaluate the importance of these departures, we consider the ’tied down’ process (see Figure 2(c))
                                   k          k   k
               (11)         BBn (β, ) = Bn (β, ) − Bn (β, 1).
                                   n          n   n

                                             [ Figure not available at time of publication ]
                      (a)                                 (b)                                (c)

               Figure 2: An example of a negative linear trend: (a) standardised residuals and their smoothed average;
               (b) Brownian motion process; (c) Brownian bridge process.
                                   Contents                                                          9 of 15

               This process can be shown to converge to the Brownian bridge in the true β 0 , and is a reasonable
               approximation of it using the estimated coefficient β [8]. Based on the Brownian bridge theory, we
               can use several kinds of test statistics, depending on the alternative hypothesis in mind. A simple
               test statistic having good power when checking for monotonic changes of β would be the maximum
               absolute value of the bridge that is under the null hypothesis distributed as
                                                                               2 x2
               (12)        P ( max (|BB(t)|) ≤ x) = 1 + 2         (−1)k e−2k          ,   x > 0.

               4      A semiparametric additive model - the EM approach

               In the context of relative survival literature most frequently refers to the additive models [7, 4, 3],
               dominating especially in cancer research. While models (2) and (7) can profit from the Cox model
               theory, the theory for the additive model has been developed separately. In this section we will show
               that the additive model can also be fitted by iterating the Cox model estimation [9]. This enables us
               to use the methods already available and to better understand its properties. Furthermore, the fitting
               algorithm allows the model to be semi-parametric with the baseline excess hazard left unspecified.
               This is important as assumptions on the parametric form of the baseline excess hazard can affect the
               coefficient estimates as well as the goodness of fit tests.
               The main idea of this approach is to regard the cause of death (δiE = 1 for deaths due to the
               disease, 0 otherwise) as the missing information. The probability of a patient dying due to the disease
               (conditional on that patient dying (δi = 1) at time ti ) equals:
                                    Contents                                                                            10 of 15

                                                               λEi (ti )                   λ0 (ti ) exp(βZi )
               (13)        P (δEi = 1|δi , ti ) = δi                           = δi                                 .
                                                        λP i (ti ) + λEi (ti )      λP i (ti ) + λ0 (ti ) exp(βZi )
               If δiE was known, the relative survival methods would not be needed as one could fit the cause specific
               Cox model with the partial likelihood function
                                                                  δEi
                                                   exp(βZi ) 
               (14)        L(β|Z) =                                       .
                                                     exp(βZj )
                                                              

               Hence iterating the partial likelihood maximization and updating the value of δEi forms the EM al-

               gorithm. The baseline excess hazard can be estimated on each step using the Breslow estimator, so
               no assumptions are needed about its form. Nevertheless some smoothing must be applied in order to
               ensure the excess hazard is positive on all intervals.
               The proposed method is thus an iteration between Cox model fitting and a simple ratio calculation.
               This means that the additive model can be seen as a generalization of the Cox model and the wealth
               of extensions available for the Cox model can be straightforwardly incorporated into relative survival.
               One can also get a better understanding of the additive model properties. For example, it can easily
               be seen that the precision of the additive model estimates depends on the percentage of deaths due
               to the disease - it determines the number of events for the Cox model and the amount of missing
               information. The interpretation of the model can also be helped by the estimated individual disease
               related death probabilities (13) that are obtained in the fitting procedure.

                                  Contents                                                         11 of 15


               5    The relsurv package
               The R [12] package relsurv [10] currently unites more than 25 functions that cover several phases
               of the relative survival analysis, from plotting the relative survival curves to fitting and evaluating
               different models with common goodness of fit methods. The common input and output that follows
               the organization of the existing survival package ensure easy use and comparability of results. An
               important reason for the choice of R software is the flexible implementation of the population tables
               (via object ratetable) that allows almost any format. The package also includes a couple of functions
               that enable direct usage of the mortality tables published on internet [11](e.g. Human Mortality

               Database ). The package is freely available at the R web repository
               called CRAN.

               6    Discussion
               The relative survival regression models are quite often used when analysing cancer registries data,
               but this usage is almost exclusively confined to the additive model. While it is often argued that the
               additive model is well suited to describing cancer survival, this is hardly ever supported by checking
               the sensibility of its assumptions. To help improving the analyses, this paper therefore reviews some
               of the recent advances in the field of goodness of fit of relative survival regression models.
               The most crucial assumption made in any of the models is the one concerning the relationship be-
               tween the observed and population hazard, unfortunately, no methods exist for checking it. The only
               method that avoids this assumption is the transformation approach that presents a flexible way of
                                   Contents                                                          12 of 15

               modelling and an interesting alternative to the existing methods. It is important to note that the
               additive, the multiplicative and the Cox model in transformed time all attempt to describe relative
               survival, but have to be interpreted in a very different way. While the additive model aims to estimate
               the cause specific survival, the estimates in the other two models directly give information relative
               to the population survival. Furthermore, these two models become indispensable when the observed
               survival is close or even better than the population survival.
               The statement based on any chosen model, that a certain predictor does not influence relative sur-
               vival, should therefore be taken with care and interpreted accordingly, since the null hypotheses of
               the models differ. To get a better understanding of the data it might even be useful to fit several

               models. These differences between the models are problematic only for the variables included in the
               population tables, though. All other variables will yield results with the same interpretation in all the
               The other assumption common to all the models described is the one of proportional (excess) haz-
               ards. This assumption can be checked using residuals with the same idea and interpretation in all
               the models. Graphical as well as formal testing procedures can thus give a valuable and comparable
               information on the fit of the models.
               When performing any goodness of fit tests one must keep in mind that all the assumptions but the
               one we are checking are assumed to be fulfilled. In the case of the additive model, a misspecified
               parametric form of the baseline excess hazard (another non-testable assumption) might importantly
               affect the test results. To avoid specifying the baseline excess hazard form, one can resort to the EM
               fitting approach [? ]. When the assumptions about the baseline excess hazard hold, the performance
               of both parametric and semi-parametric approaches will be very similar, but when the assumptions
               are violated, the EM approach can be more reliable.
                                    Contents                                                               13 of 15

               As the the EM fitting method can be viewed as a generalization on the Cox model, any further exten-
               sions used in the Cox model framework (splines, frailties, etc) for classical survival can be straightfor-
               wardly incorporated into the study of relative survival. This makes the method very flexible and frees
               us of the need to ”reinvent” all the methods for relative survival use.
               The relative survival methodology is in practice mostly confined to the cancer registries and is routinely
               following the theoretical work done several decades ago using predominantly methods for grouped
               data. This is partly due also to the lack of programs that would allow the users to perform more
               into depth analyses. While some software does exist (Surv3 [6], SAS macros and Stata functions
               (, RSurv R function [5]), it is mostly covering only one regression model and

               has several limitations regarding the population mortality tables. A hopefully useful step forward is
               the relsurv [10] package in R [12] that enables fitting and checking all the models described and
               provides many other functions needed for a quality analysis of the data.

               [1] Andersen, P. K., Borch-Johnsen, K., Deckert, T., Green, A., Hougaard, P., Keiding, N., and Kreiner, S.
                  (1985). A Cox regression model for relative mortality and its application to diabetes mellitus survival data.
                  Biometrics, 41:921–932.

               [2] Cortese, G. and Scheike, T. H. (2008). Dynamic regression hazards models for relative survival. Statistics
                  in Medicine, 27:3563–3548.

               [3] Dickman, P. W., Sloggett, A., Hills, M., and Hakulinen, T. (2004). Regression models for relative survival.
                  Statistics in Medicine, 23:51–64.
                                    Contents                                                               14 of 15

               [4] Est`ve, J., Benhamou, E., Croasdale, M., and Raymond, M. (1990). Relative survival and the estimation of
                  net survival: elements for further discussion. Statistics in Medicine, 9:529–538.

               [5] Giorgi, R., Payan, J., and Gouvernet, J. (2005). RSURV: A function to perform relative survival analysis
                  with S-PLUS or R. 78:175–178.

               [6] Hakulinen, T. and Abeywickrama, K. H. (1985). A computer program package for relative survival analysis.
                  Computer Methods and Programs in Biomedicine, 19 (2-3):197–207.

               [7] Hakulinen, T. and Tenkanen, L. (1987). Regression analysis of relative survival rates. Journal of the Royal
                  Statistical Society — Series C, 36:309–317.

               [8] Pohar, M. (2007). Goodness of fit of relative survival models. Doctoral thesis.

               [9] Pohar, M., Henderson, R., and Stare, J. (2008). An approach to estimation in relative survival regression.
                  Biostatistics, to appear.

               [10] Pohar, M. and Stare, J. (2006). Relative survival analysis in R. Computer Methods and Programs in
                  Biomedicine, 81:272–278.

               [11] Pohar, M. and Stare, J. (2007). Making relative survival analysis relatively easy. Computers in biology and
                  medicine, 37:1741–1749.

               [12] R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation
                  for Statistical Computing, Vienna, Austria.

               [13] Schoenfeld, D. (1982). Partial residuals for the proportional hazards regression model. Biometrika, 69:239–
           T      241.
                                    Contents                                                              15 of 15

               [14] Stare, J., Henderson, R., and Pohar, M. (2005a). An individual measure of relative survival. Journal of
                  the Royal Statistical Society — Series C, 54:115–126.

               [15] Stare, J., Pohar, M., and Henderson, R. (2005b). Goodness of fit of relative survival models. Statistics in
                  Medicine, 24.


To top