VIEWS: 10 PAGES: 15 POSTED ON: 2/11/2011 Public Domain
Page Contents 1 of 15 Print Goodness of Fit in Relative Survival Regression Maja Pohar Perme; Janez Stare; Robin Henderson University of Ljubljana, Institute of Biomedical Informatics, Vrazov trg 2, 1000 Ljubljana, Slovenia. University of Newcastle, School of Mathematics and Statistics, NE1 7RU, Newcastle upon Tyne, UK. E-mail: maja.pohar@mf.uni-lj.si; janez.stare@mf.uni-lj.si; Robin.Henderson@ncl.ac.uk 1 Introduction The idea of the relative survival methods is to compare the observed survival of the patients to the IPMs Sessions survival of the population at the same period of time. These methods become important when we wish to estimate the mortality due to a certain disease, but we do not have reliable information on the cause of death. The relative survival is therefore crucial for all the survival analyses, in which the patients have been followed for long periods of time. In such studies, we shall almost always observe the negative eﬀect of age and the positive eﬀect of the calendar year, even though this might have no relation to the disease of our interest. To answer questions like ”Has the hazard of a disease lowered in the last decade?” one can resort to the relative survival regression models. The data used in this ﬁeld are frequently obtained from the registries, with cancer survival studies being the prime example. The information on the population survival is obtained from the the population mortality tables - we can use them to estimate the hazard every person is subdued to because of his or her age, sex and cohort year (or any other combination of covariates included in the population mortality data). This quantity is named the population hazard λP as opposed to the observed hazard B T (λO ), which is the hazard we estimate from the data by regarding all deaths as events. Page Contents 2 of 15 Print The most common approach of modelling relative survival is through assuming a relationship between these two hazards, which can be either additive [7, 4, 3, 2] (1) λO (t) = λP (t) + λE (t), or multiplicative [1] (2) λO (t) = λP (t)ν(t). The term λE (t) is named the excess hazard and interpreted as the hazard speciﬁc for the disease in question, while the unit-free term ν(t) can be seen as a factor of relative mortality. These are the IPMs Sessions terms in which the relationship with the covariates is speciﬁed, we shall focus on the situation where the excess hazard is modelled as (3) λE (t) = λ0 (t)eβ Z and similarly the relative mortality is set to equal (4) ν(t) = ν0 (t)eβ Z . In this way, both models assume that the eﬀect of the covariates Z does not change in time, i.e. the β coeﬃcient is constant (proportional hazards assumption). An important theoretical advantage of the multiplicative model (4) is that it can be rewritten as (5) λO (t) = ν0 (t)eβ Z+1·ln(λP (t)) , B T Page Contents 3 of 15 Print implying that this is a special case of the Cox model with an additional time-dependent variable having the coeﬃcient ﬁxed to 1. Therefore, this model can be ﬁtted using the partial likelihood approach and no assumptions are needed for the ν0 (t). On the contrary, full likelihood is needed to ﬁt the additive model and its fully parametric form poses additional assumptions. When ﬁtting a relative survival regression model, one must therefore make a series of assump- tions. However, the topic of goodness of ﬁt of these models has traditionally been very poorly addressed in statistical literature and hardly ever in practice. In this paper we shall review some of the recently introduced methods that either deal with checking or avoiding the assumptions. The next section describes the individual measure as an alternative approach to relative survival that needs no assump- IPMs Sessions tions about the relationship between the observed and population hazard. Section 3 presents the methods for checking the proportional (excess) hazards assumption and Section 4 demonstrates how the additive model can also be ﬁtted as semiparametric. Section 5 brieﬂy describes the R package relsurv that puts into practice all the reviewed methods and Section 6 contains some discussion and concluding remarks. 2 Individual measure of relative survival - the transformation ap- proach Using the population mortality tables, the cumulative population mortality distribution can be cal- culated for each individual. We denote this curve as FP (t) and interpret it as the probability that an individual in our study (of a given age and sex, diagnosed in a certain calendar year) dies of ’popu- B T lation reasons’ in the course of the next t years. To evaluate the eﬀect of the disease in question we Page Contents 4 of 15 Print then transform the actual observed time T to get (6) Y := FP (T ). The value Y obtained with this simple transformation is named the individual measure and can be interpreted as how long a person has lived compared to his counterparts in the population [14]. As a value on the cumulative distribution curve it is always between 0 and 1, a value of 0.5 for example implies that an individual has outlived exactly half of his population counterparts. So, in order to compare the survival experience of two individuals that have a known diﬀerent expected survival, we use the values Y they achieved on their expected cdf. In this way we deﬁne a new time to event IPMs that can be used for survival analysis. The censoring mechanism remains the same, i.e. if time Sessions t was a censoring time, the same holds for time y. Note also that if T is distributed as FP , the measure Y is the probability integral transform and thus uniformly distributed. This property helps us intuitively understand why this transformation is sensible in the context of relative survival. If all our patients had only the general population risks (and none of the disease), by transforming their times by their corresponding FP , we would make them all identically distributed, i.e. uniformly on [0, 1]. Therefore, by transforming to the Y scale, we automatically take into account all the population risk, consequently, what is left is precisely the excess risk, that we can thus directly model. Basically any model that is appropriate for the survival data could be used, we shall focus here on the Cox model (7) λ( y) := λ0 (t)eβ Z . Note that the transformation can cause dependence between the censoring times and the covariates B T used for transforming. Therefore, the model should always include these variables in order to ensure Page Contents 5 of 15 Print conditional independence between the survival and censoring times distribution. As both the multiplicative model and the Cox model in transformed time are just special cases of the Cox model, no further theoretical work is needed. It should be noted, however, that the three models introduced (1,2,7) have quite a diﬀerent interpretation. While the idea of the additive model is to use the population tables to get information about the cause speciﬁc survival, the other two models give information on the eﬀect of the covariates relative to the population. As an example, say we are in year 2000 and the excess hazard for the disease in question equals to λE = 0.04 regardless of age. This is roughly the size of the population hazard of a 70 year old man and implies about 20% of people dying of excess risk in the course of the next 5 years. Added to the population hazard of a 50 year IPMs Sessions old (λP = 0.007), this is a substantial increase of his observed hazard. On the other hand, a 90 year old, whose population hazard is λP = 0.25, would hardly notice the excess hazard and the disease would have only a negligible eﬀect on his survival. While the additive model would thus result in no eﬀect of age, the multiplicative model and the transformation approach would give a signiﬁcantly negative coeﬃcient for age. The interpretation would be similar in both cases, i.e. the impact of the disease on survival relatively decreases with age. The sign of the coeﬃcient in these two models gives an indication of how the patients live compared to the population, if β is constant in time, this relationship remains the same throughout the follow-up time. On the other hand, in the case of the additive model, one should always interpret also the size of the baseline excess hazard. If it becomes very small after a few years of follow-up, even a large coeﬃcient does not carry much importance for the observed survival. Note also that in the case of the both, the multiplicative and the Cox model in transformed time, the coeﬃcient for the covariates not included in the population tables (e.g. smoking) is the same as B T Page Contents 6 of 15 Print if the observed hazard was ﬁtted with a Cox model. This corresponds to the fact that no additional information on these covariates was obtained by using the population tables. This is not the case for the additive model, where all coeﬃcients are diﬀerent and the proportional hazards assumption is also not met simultaneously. Nevertheless, the sign of the coeﬃcients will remain the same. In cancer survival the ﬁrst year is usually critical, regardless whether the patients are young or old. Therefore the change of scale is probably not a good idea as it would lead to nonproportional hazards. On the other hand, the additive model is not suitable when the excess hazard is small compared to the population hazard. Though the multiplicative model and the Cox model in transformation time often give similar results, the proportional hazards assumption is usually not simultaneously satisﬁed. IPMs Sessions Therefore, we should check the goodness of ﬁt of the model before interpreting it. This is the focus of the next section. 3 Proportional (excess) hazard assumption - partial residuals The multiplicative model and the Cox model in transformed time can both be studied withing the Cox model framework, whose theory provides us with several options for checking the PH assumption. A particularly useful diagnostic tool are the Schoenfeld’s partial residuals [13] that arise naturally from the estimating equations obtained from partial likelihood. On the other hand, no substantial literature exists on the goodness of ﬁt checking for the additive model. The deviance residuals [3] can only be used for grouped data and their performance is sensible to the group sizes, ad hoc testing using interaction with time depends on the parametrization of the baseline excess hazard. The idea of this section is to deﬁne residuals akin to Schoenfeld residuals that give good information about goodness of ﬁt of the additive model, but are not tied to the model ﬁtting procedure and do not require grouping. B T The partial residuals for the additive model are deﬁned as [15] Page Contents 7 of 15 Print Ui (β) : = Zi − E(β, ti ) where E is given by λOj (β, ti ) (8) E (β, ti ) : = Zj . j∈Ri λOk (β, ti ) k ∈R i and can be interpreted as the expected value of the covariate Z of the person who experienced an event at time ti . The Ri in the above formula denotes the risk set at time ti . Note that this is the same as the deﬁnition of the Schoenfeld residuals, the only diﬀerence is that the Cox model hazard λ IPMs is replaced by the observed hazard λOj . Sessions Using a one-term Taylor series expansion, one can show that the true coeﬃcient β 0 in time can be approximated by −1 . ∂ (9) β 0 (ti ) = β + E(β, ti ) E [Ui (β)|Ri ] . ∂β A graphical test of the proportional hazards assumption can therefore be performed by plotting a smoothed trace through scaled partial residuals, see Figure 1. This is completely analogous to the method used in the Cox model framework, with the sole diﬀerence that the scaling factor does not equal the variance of the residuals. To support the graphical impression with a formal test, again the ideas for the Cox model are followed and a cumulative sum Bn of the residuals standardized with their variance Vi is formed [15] B T Page Contents 8 of 15 Print k k 1 Ui (β) (10) Bn (β, 0) := 0 Bn (β, ) := √ , k = 1, . . . , n. n n i=1 Vi (β) [ Figure not available at time of publication ] (a) (b) Figure 1: Two simulated data sets with (a) a constant β 0 (t), (b) a decreasing β 0 (t). The true value is denoted by the dotted line, the smooth trace through scaled residuals is presented with a solid curve. IPMs Sessions If the smoothed trace through residuals is constant, the positive and negative steps exchange randomly in the cumulative sum (10) and therefore its value rarely digresses far from 0. On the other hand, if a certain, e.g. negative linear trend exists, the sign of the residuals will ﬁrst be positive and later negative with the cumulative sum departing quite far away from 0, see Figures 2(a) and (b). To evaluate the importance of these departures, we consider the ’tied down’ process (see Figure 2(c)) k k k (11) BBn (β, ) = Bn (β, ) − Bn (β, 1). n n n [ Figure not available at time of publication ] (a) (b) (c) Figure 2: An example of a negative linear trend: (a) standardised residuals and their smoothed average; (b) Brownian motion process; (c) Brownian bridge process. B T Page Contents 9 of 15 Print This process can be shown to converge to the Brownian bridge in the true β 0 , and is a reasonable ˆ approximation of it using the estimated coeﬃcient β [8]. Based on the Brownian bridge theory, we can use several kinds of test statistics, depending on the alternative hypothesis in mind. A simple test statistic having good power when checking for monotonic changes of β would be the maximum absolute value of the bridge that is under the null hypothesis distributed as ∞ 2 x2 (12) P ( max (|BB(t)|) ≤ x) = 1 + 2 (−1)k e−2k , x > 0. t∈[0,1] k=1 4 A semiparametric additive model - the EM approach IPMs Sessions In the context of relative survival literature most frequently refers to the additive models [7, 4, 3], dominating especially in cancer research. While models (2) and (7) can proﬁt from the Cox model theory, the theory for the additive model has been developed separately. In this section we will show that the additive model can also be ﬁtted by iterating the Cox model estimation [9]. This enables us to use the methods already available and to better understand its properties. Furthermore, the ﬁtting algorithm allows the model to be semi-parametric with the baseline excess hazard left unspeciﬁed. This is important as assumptions on the parametric form of the baseline excess hazard can aﬀect the coeﬃcient estimates as well as the goodness of ﬁt tests. The main idea of this approach is to regard the cause of death (δiE = 1 for deaths due to the disease, 0 otherwise) as the missing information. The probability of a patient dying due to the disease (conditional on that patient dying (δi = 1) at time ti ) equals: B T Page Contents 10 of 15 Print λEi (ti ) λ0 (ti ) exp(βZi ) (13) P (δEi = 1|δi , ti ) = δi = δi . λP i (ti ) + λEi (ti ) λP i (ti ) + λ0 (ti ) exp(βZi ) If δiE was known, the relative survival methods would not be needed as one could ﬁt the cause speciﬁc Cox model with the partial likelihood function δEi n exp(βZi ) (14) L(β|Z) = . exp(βZj ) i=1 j∈Ri Hence iterating the partial likelihood maximization and updating the value of δEi forms the EM al- IPMs Sessions gorithm. The baseline excess hazard can be estimated on each step using the Breslow estimator, so no assumptions are needed about its form. Nevertheless some smoothing must be applied in order to ensure the excess hazard is positive on all intervals. The proposed method is thus an iteration between Cox model ﬁtting and a simple ratio calculation. This means that the additive model can be seen as a generalization of the Cox model and the wealth of extensions available for the Cox model can be straightforwardly incorporated into relative survival. One can also get a better understanding of the additive model properties. For example, it can easily be seen that the precision of the additive model estimates depends on the percentage of deaths due to the disease - it determines the number of events for the Cox model and the amount of missing information. The interpretation of the model can also be helped by the estimated individual disease related death probabilities (13) that are obtained in the ﬁtting procedure. B T Page Contents 11 of 15 Print 5 The relsurv package The R [12] package relsurv [10] currently unites more than 25 functions that cover several phases of the relative survival analysis, from plotting the relative survival curves to ﬁtting and evaluating diﬀerent models with common goodness of ﬁt methods. The common input and output that follows the organization of the existing survival package ensure easy use and comparability of results. An important reason for the choice of R software is the ﬂexible implementation of the population tables (via object ratetable) that allows almost any format. The package also includes a couple of functions that enable direct usage of the mortality tables published on internet [11](e.g. Human Mortality IPMs Sessions Database http://www.mortality.org ). The package is freely available at the R web repository called CRAN. 6 Discussion The relative survival regression models are quite often used when analysing cancer registries data, but this usage is almost exclusively conﬁned to the additive model. While it is often argued that the additive model is well suited to describing cancer survival, this is hardly ever supported by checking the sensibility of its assumptions. To help improving the analyses, this paper therefore reviews some of the recent advances in the ﬁeld of goodness of ﬁt of relative survival regression models. The most crucial assumption made in any of the models is the one concerning the relationship be- tween the observed and population hazard, unfortunately, no methods exist for checking it. The only method that avoids this assumption is the transformation approach that presents a ﬂexible way of B T Page Contents 12 of 15 Print modelling and an interesting alternative to the existing methods. It is important to note that the additive, the multiplicative and the Cox model in transformed time all attempt to describe relative survival, but have to be interpreted in a very diﬀerent way. While the additive model aims to estimate the cause speciﬁc survival, the estimates in the other two models directly give information relative to the population survival. Furthermore, these two models become indispensable when the observed survival is close or even better than the population survival. The statement based on any chosen model, that a certain predictor does not inﬂuence relative sur- vival, should therefore be taken with care and interpreted accordingly, since the null hypotheses of the models diﬀer. To get a better understanding of the data it might even be useful to ﬁt several IPMs Sessions models. These diﬀerences between the models are problematic only for the variables included in the population tables, though. All other variables will yield results with the same interpretation in all the models. The other assumption common to all the models described is the one of proportional (excess) haz- ards. This assumption can be checked using residuals with the same idea and interpretation in all the models. Graphical as well as formal testing procedures can thus give a valuable and comparable information on the ﬁt of the models. When performing any goodness of ﬁt tests one must keep in mind that all the assumptions but the one we are checking are assumed to be fulﬁlled. In the case of the additive model, a misspeciﬁed parametric form of the baseline excess hazard (another non-testable assumption) might importantly aﬀect the test results. To avoid specifying the baseline excess hazard form, one can resort to the EM ﬁtting approach [? ]. When the assumptions about the baseline excess hazard hold, the performance of both parametric and semi-parametric approaches will be very similar, but when the assumptions B T are violated, the EM approach can be more reliable. Page Contents 13 of 15 Print As the the EM ﬁtting method can be viewed as a generalization on the Cox model, any further exten- sions used in the Cox model framework (splines, frailties, etc) for classical survival can be straightfor- wardly incorporated into the study of relative survival. This makes the method very ﬂexible and frees us of the need to ”reinvent” all the methods for relative survival use. The relative survival methodology is in practice mostly conﬁned to the cancer registries and is routinely following the theoretical work done several decades ago using predominantly methods for grouped data. This is partly due also to the lack of programs that would allow the users to perform more into depth analyses. While some software does exist (Surv3 [6], SAS macros and Stata functions (www.pauldickman.com), RSurv R function [5]), it is mostly covering only one regression model and IPMs Sessions has several limitations regarding the population mortality tables. A hopefully useful step forward is the relsurv [10] package in R [12] that enables ﬁtting and checking all the models described and provides many other functions needed for a quality analysis of the data. References [1] Andersen, P. K., Borch-Johnsen, K., Deckert, T., Green, A., Hougaard, P., Keiding, N., and Kreiner, S. (1985). A Cox regression model for relative mortality and its application to diabetes mellitus survival data. Biometrics, 41:921–932. [2] Cortese, G. and Scheike, T. H. (2008). Dynamic regression hazards models for relative survival. Statistics in Medicine, 27:3563–3548. [3] Dickman, P. W., Sloggett, A., Hills, M., and Hakulinen, T. (2004). Regression models for relative survival. Statistics in Medicine, 23:51–64. B T Page Contents 14 of 15 Print e [4] Est`ve, J., Benhamou, E., Croasdale, M., and Raymond, M. (1990). Relative survival and the estimation of net survival: elements for further discussion. Statistics in Medicine, 9:529–538. [5] Giorgi, R., Payan, J., and Gouvernet, J. (2005). RSURV: A function to perform relative survival analysis with S-PLUS or R. 78:175–178. [6] Hakulinen, T. and Abeywickrama, K. H. (1985). A computer program package for relative survival analysis. Computer Methods and Programs in Biomedicine, 19 (2-3):197–207. [7] Hakulinen, T. and Tenkanen, L. (1987). Regression analysis of relative survival rates. Journal of the Royal Statistical Society — Series C, 36:309–317. IPMs Sessions [8] Pohar, M. (2007). Goodness of ﬁt of relative survival models. Doctoral thesis. [9] Pohar, M., Henderson, R., and Stare, J. (2008). An approach to estimation in relative survival regression. Biostatistics, to appear. [10] Pohar, M. and Stare, J. (2006). Relative survival analysis in R. Computer Methods and Programs in Biomedicine, 81:272–278. [11] Pohar, M. and Stare, J. (2007). Making relative survival analysis relatively easy. Computers in biology and medicine, 37:1741–1749. [12] R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. [13] Schoenfeld, D. (1982). Partial residuals for the proportional hazards regression model. Biometrika, 69:239– B T 241. Page Contents 15 of 15 Print [14] Stare, J., Henderson, R., and Pohar, M. (2005a). An individual measure of relative survival. Journal of the Royal Statistical Society — Series C, 54:115–126. [15] Stare, J., Pohar, M., and Henderson, R. (2005b). Goodness of ﬁt of relative survival models. Statistics in Medicine, 24. IPMs Sessions B T