# Goodness-of-fit methods for regression models


```
Goodness-of-fit methods for regression models

Jelle Goeman

Department of Medical Statistics and Bioinformatics
Leiden University Medical Center
Leiden, the Netherlands

Goodness-of ﬁt methods for regression models – p. 1/3
Regression models

n subjects with data (x_i, Y_i)
Outcome Y_i is related to covariates via x_i β
with
x_i vector of covariates
β vector of regression parameters
Examples:
• Linear regression
• Poisson regression
• Logistic regression
• Multinomial logistic regression
• Cox regression

How to determine the fit of such a model?

Binary outcome Y with values 1 and 0
Model for p_i = P(Y_i = 1):

p_i = \frac{\exp(x_i \beta)}{1 + \exp(x_i \beta)}

\mathrm{logit}(p_i) = x_i \beta

Goodness-of-fit for logistic regression

Many methods exist:
Hosmer-Lemeshow test
  A chi-squared type test
  10 groups are made, based on predicted probabilities
  Implemented in standard software (SPSS, SAS)
Tests based on extending the model
Reweighting cases and controls: Nagelkerke et al. (Stat in Med 2004)

More goodness-of-fit methods

Smoothing methods
Compare model to smooth alternatives
Consider smoothed residuals

Our approach uses smoothed residuals

Residuals:
e_i = Y_i - p_i
with p_i the predicted probability that Y_i = 1.
Smoothed residual:
\tilde{e}_i = \sum_j w_{ij} e_j
with w_{ij} the smoothing coefficients:
a weighted sum of the residuals in the neighbourhood of i

Example: data from an early arthritis clinic

570 patients with undifferentiated arthritis
Outcome: rheumatoid arthritis (RA) after 1 year, yes/no
Logistic model with CRP (C-reactive protein, a blood test that
indicates inflammation) as covariate
yields: logit(p_i) = -1.063 + 0.012 CRP

Does the model fit?

Plot of residuals, with kernel smooth

Test based on smoothed residuals

T = \sum_i \tilde{e}_i^2

In matrix form
T = (Y - p)^T W^T W (Y - p)
with W the matrix of smoothing coefficients w_{ij}.

Distribution of T

T is a quadratic form. Under H_0, E(T) and var(T) can be
calculated exactly, assuming that β is known
Correction for estimation of parameters: shrinkage of E(T)
and var(T).
Distribution can be approximated by a scaled chi-square, c \chi^2_{(\nu)},
with c = var(T) / (2 E(T)) and \nu = 2 E(T)^2 / var(T).

How to smooth?

Choice of smoothing method: we use a kernel smoother
(uniform kernel)
Choice of distance measure:
  We standardize numeric covariates
  Mixture of Euclidean distance (numeric covariates) and
  number of equal values for discrete covariates

Choice of bandwidth

Choice of bandwidth is important
Small bandwidth: no power
Large bandwidth: deviations are smoothed away
Suggestions:
1. Use 0.5 × the average distance between observations
2. Compute the test statistic for different bandwidths and look at the
distribution of the maximum. The test statistics are highly
correlated, so this approach is quite powerful.

Remarks

Easy to calculate (software is available in SAS and R)
Sample size can be large: 1600 observations is no problem
Handles continuous and categorical covariates
Could be used as a global goodness-of-fit test (smooth in all directions)
Could be used in a specific direction (smooth for one covariate)
If W is the identity matrix, T equals the Copas unweighted sum of
squares statistic (which performs quite well in the simulations of
Hosmer et al. (Stat in Med 1997, 2002))

The RA example: model with only CRP as covariate

[Figure: P-value of T for different bandwidths. x-axis: bandwidth (0 to 1.5); y-axis: p-value (0.001 to 0.15); the bandwidth 0.5 × average distance is marked.]

Several models in the RA example

Covariates             p-value test   Deviance
CRP                    0.02           689.80
CRP, CRP^2             0.01           678.28
log(CRP)               0.11           677.36
CRP in 3 categories    0.79           672.63

Extension to multinomial regression models

Outcome Y has g categories
Model for s = 1, ..., g:

P(Y = s) = \frac{e^{x \beta_s}}{\sum_{t=1}^{g} e^{x \beta_t}}

Here

T = \sum_{s=1}^{g} \sum_{i=1}^{n} \left( \sum_{j=1}^{n} w_{ij} ([Y_j = s] - p_{js}) \right)^2

Derivations are more complex
Submitted for publication (Goeman and le Cessie (2005))

RA example: multinomial regression

Outcome at 1 year with 4 categories: RA (177), probable RA (93),
other disease (94), undifferentiated arthritis (205)

Results of the goodness-of-fit test:

Covariates             p-value test   Deviance
CRP                    0.0003         1484.8
log(CRP)               0.11           1471.5
CRP in 3 categories    0.79           1465.3

Residual plots, model with CRP as covariate

[Figure: four panels of residuals (-1.0 to 1.0) against CRP (0 to 200), one per outcome: RA, probable RA, other disease, undifferentiated arthritis.]

Relation with random effect models

If the model does not fit:

[Figure: Y against X (0 to 1), with regions marked where the fitted model is too low and where it is too high.]

Neighbouring points deviate in the same direction
Positive correlation between neighbouring residuals

Use random effect models

Is the model logit(p_i) = x_i β adequate?
Embed the model in a family of random effect models

\mathrm{logit}(p_i) = x_i \beta + r_i

with r_i random effects, E(r_i) = 0, cov(r) = τ^2 R
Goodness-of-fit test: score test for H_0: τ^2 = 0
Let R depend on the distance between observations in
covariate space.
Close observations have large correlations

Score test statistic has the form:

T = (Y - p)^T R (Y - p)

This is exactly the same as our smoothed residual test
with R = W^T W
For multinomial regression the same relation exists (more
complex: multivariate random effects, see our paper)


It is clear against which alternatives our goodness-of-fit test has
optimal power
Can be extended to other areas (GLM, survival, genetic
family trees, microarray data)

Why not fit the random effect model?

Can be done with standard software
No problem with choice of bandwidth (use the one
corresponding to the largest likelihood)
Use likelihood ratio test as goodness-of-fit test (distribution
\frac{1}{2}(\chi^2_{(1)} + \chi^2_{(2)}))

Posterior means of random effects as diagnostic tool

Linear regression

Data of 702 infants from Malawi (birth weight vs gestational age)

Does the linear model fit well?

Use proc mixed in SAS to fit the model with random terms
Different choices of R:
Exponential correlation function

R_{ij} = \exp(-\|x_i - x_j\| / h)

Gaussian correlation

R_{ij} = \exp(-(x_i - x_j)^2 / h)

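/* Linear model for birth weight (gew) on gestational age (age), with one */
/* random effect per infant and exponential spatial correlation in age.   */
proc mixed data=malawi cl method=ml;
  class babycode;
  /* outp= and outpm= save conditional and marginal predictions */
  model gew = age / solution outp=posterior outpm=model residual;
  random babycode / type=sp(exp)(age) solution;
run;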
Results

Sometimes convergence to the boundary → try different starting
values
Deviance of model without random effect: 3997.2
Deviance of model with exponential error structure: 3991.0
Deviance of model with Gaussian error structure: 3990.5
Estimate of τ^2 was 0.6041 (95% CI 0.18 to 12.6)
Goodness-of-fit test: 6.2, p = 0.02
A squared term in gestational age is significant (p = 0.0047)

Plot of posterior means x_i β + E[r_i | Y]

Relation with nonparametric smoothing (Verbyla et al. 1999)

Logistic regression

Fit alternative model with correlated random effects
Use SAS PROC GLIMMIX
Model almost always converges to the boundary (either 0 or
infinite estimates)
Did not work out

Back to the RA example

In total 21 possible predictors, with 28 parameters
After variable selection 11 predictors, 13 parameters

Goodness-of-fit test is not significant

[Figure: P-value of T for different bandwidths. x-axis: bandwidth (0 to 6); y-axis: p-value (0.001 to 0.40); the bandwidth 0.5 × average distance is marked.]

Is it a clinically useful model?

Histogram of predicted probabilities

Is it a clinically useful model?

Look at predicted probabilities of RA
53% have pred < 0.2 ⇒ probably no RA ⇒ no treatment
10% have pred > 0.8 ⇒ probably RA within a year ⇒ start
treatment
36% have pred between 0.2 and 0.8. We do not know what will
happen: unclassifiable
Hence we could make a "tailor-made" prediction for 63% of the
patients.
36% cannot be classified
Difference between fit and predictive value

Conclusions

A simple goodness-of-fit test for regression models
Goodness-of-fit test using smoothed residuals is equivalent to
score tests for random effects
Yields residuals for further diagnostics
It would be nice to fit the alternative model
Then posterior means of random effects can be used as
diagnostic tool
This is a kind of nonparametric smoothing
In practice not easy to do
Software on our website


```
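The smoothed-residual test from the slides can be illustrated with a short Python sketch for a logistic model with a single numeric covariate. It is not the authors' SAS or R software, and it rests on several assumptions that the slides do not state: the function names are hypothetical, the uniform-kernel weights are row-normalised, and β is treated as known, so the correction for estimated parameters mentioned in the slides is not applied. E(T) and var(T) come from the standard moment formulas for a quadratic form in independent, centred Bernoulli variables.

```python
import math


def uniform_kernel_weights(x, bandwidth):
    """Row-normalised uniform-kernel weights w_ij for one numeric covariate.

    Assumption: w_ij is 1/(number of neighbours of i) when |x_i - x_j| <= bandwidth.
    """
    weights = []
    for xi in x:
        row = [1.0 if abs(xi - xj) <= bandwidth else 0.0 for xj in x]
        total = sum(row)  # at least 1, because every point neighbours itself
        weights.append([v / total for v in row])
    return weights


def smoothed_residual_test(y, p, w):
    """Return T, its null mean and variance, and the scaled chi-square constants.

    Assumption: p holds the true probabilities (beta known), so no
    correction for parameter estimation is made.
    """
    n = len(y)
    e = [yi - pi for yi, pi in zip(y, p)]

    # Smoothed residuals e~_i = sum_j w_ij e_j and T = sum_i e~_i^2.
    smoothed = [sum(w[i][j] * e[j] for j in range(n)) for i in range(n)]
    t_stat = sum(s * s for s in smoothed)

    # R = W^T W, so that T = e^T R e.
    r = [[sum(w[k][i] * w[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]

    # Second and fourth central moments of a Bernoulli(p_i) outcome.
    v = [pi * (1.0 - pi) for pi in p]
    mu4 = [vi * (1.0 - 3.0 * vi) for vi in v]

    mean_t = sum(r[i][i] * v[i] for i in range(n))
    var_t = sum(r[i][i] ** 2 * (mu4[i] - 3.0 * v[i] ** 2) for i in range(n))
    var_t += 2.0 * sum(r[i][j] ** 2 * v[i] * v[j]
                       for i in range(n) for j in range(n))
    if var_t <= 0.0:
        raise ValueError("var(T) is not positive; approximation undefined")

    # Scaled chi-square approximation: T ~ c * chi2(nu).
    c = var_t / (2.0 * mean_t)
    nu = 2.0 * mean_t ** 2 / var_t
    return {"T": t_stat, "mean": mean_t, "var": var_t, "c": c, "nu": nu}


# Illustrative data only (made up, not taken from the arthritis study).
crp = [2.0, 5.0, 8.0, 15.0, 30.0, 60.0, 90.0, 150.0]
y = [0, 0, 1, 0, 0, 1, 1, 1]
# Probabilities from the fitted model on the slide: logit(p) = -1.063 + 0.012 CRP.
p = [1.0 / (1.0 + math.exp(-(-1.063 + 0.012 * value))) for value in crp]

w = uniform_kernel_weights(crp, bandwidth=20.0)
result = smoothed_residual_test(y, p, w)
print(result)
```

The approximate p-value is the upper tail probability of a chi-square distribution with `nu` degrees of freedom at `T / c`; if SciPy is available, `scipy.stats.chi2.sf(result["T"] / result["c"], result["nu"])` computes it. With a bandwidth smaller than the spacing between observations, W becomes the identity matrix and T reduces to the unweighted sum of squared residuals mentioned in the remarks.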