# Introduction to longitudinal analysis and repeated measures analysis

```
Introduction to longitudinal analysis and repeated measures analysis

Applied longitudinal regression
Fitzmaurice, Laird, Ware

Fundamental objectives of longitudinal studies

• Assess within-individual changes in response (individual growth curves)
• Explain differences in growth curves between individuals

Longitudinal data

• A distinctive feature of longitudinal data is that the measurements are
  clustered (within people)
• Longitudinal data also have a temporal order

Clusters

• Measurements within a cluster (e.g., a person) are more similar than
  measurements in different clusters
• An individual’s propensity to respond – be it high, medium, or low – is
  shared by all of that individual’s repeated measures
• Measurements taken more closely in time are more strongly correlated

Normal probability density

Multivariate normal probability density

Regression with repeated measures

• Model the mean
• Model the covariance (the tendency of measurements within a person to
  vary together)

Covariance patterns

• A variety of options are available to describe the covariance
• Parameters describing the covariance must be estimated along with the
  regression coefficients for the explanatory variables
• Some covariance patterns require more information than others (i.e.,
  require more parameters to be estimated)

Variance component (VC) or simple

• Assumes the repeated measurements are uncorrelated
• Usually unrealistic for longitudinal data
• Only one parameter has to be estimated (σ²)

Unstructured covariance

• Assumes a distinct correlation between every pair of measurements
• The most complex covariance pattern

Unstructured covariance matrix

Compound symmetry

• Assumes the same correlation between all pairs of data points
• Requires estimating two parameters (σ², ρ)

Compound symmetry matrix

Toeplitz

• Assumes the correlation between equally spaced pairs of points is constant

Spatial power

• Correlations decline with increasing spacing between points
• Allows unequal spacing between points
• Requires estimating two parameters (σ², ρ)

Spatial Gaussian

• Correlations decline with increasing distance between points
• Requires estimating two parameters
• Allows unequally spaced points

Autoregressive (AR(1))

Selecting covariance structures

                     Equal     Unequal   Different time points
Structure            spacing   spacing   across subjects
Compound symmetry    Yes       Yes       Yes
Unstructured         Yes       Yes       No
AR(1)                Yes       No        No
Toeplitz             Yes       No        No
Spatial structures   Yes       Yes       Yes

Modeling covariance versus modeling the mean

Generalized estimating equations (GEE) for repeated measurements

• Estimate regression coefficients as usual for the explanatory variables
• Also estimate the parameters of the working correlation matrix

How to choose the ‘best’ covariance pattern

• Fit a complex model for the mean (i.e., overfit)
• Try nested covariance patterns and select among them with likelihood
  ratio tests
• For patterns that are not nested, choose the covariance pattern with the
  lowest Akaike Information Criterion (AIC)

Misspecified covariance patterns with GEEs

• GEEs have an adjustment (empirical, or robust, standard errors) that
  corrects the standard errors if the covariance pattern is misspecified
• The adjustment relies on having replicate measurements at the various
  combinations of the explanatory variables
• The adjustment works best for similar measurement times, and with limited
  missing data

Sample size

• The number of subjects should be large relative to the number of
  measurements
• If you have 5-12 explanatory variables you need at least 100 clusters; to
  be reasonably confident, you probably need 200 clusters

Oral treatment to reduce blood lead levels

A rebound occurs from stored sources

Study data (group: P = placebo, A = succimer)

id   group   week   blood lead
1     P       0     30.8
1     P       1     26.9
1     P       4     25.8
1     P       6     23.8
2     A       0     26.5
2     A       1     14.8
2     A       4     19.5
2     A       6     21.0
3     A       0     25.8
3     A       1     23.0

Regression model

Blood lead level = Group + Week + Group*Week

Unstructured covariance matrix

Estimated R Matrix

Row   Col1       Col2       Col3       Col4
1     25.2257    19.1074    19.6995    22.2016
2     19.1074    44.3458    35.5351    29.6750
3     19.6995    35.5351    47.3778    30.6205
4     22.2016    29.6750    30.6205    58.6510

Regression results

Effect       group  time  Estimate   Std Error   DF   t Value   Pr > |t|
Intercept                  26.2720     0.7103    98    36.99    <.0001
group        A              0.2680     1.0045    98     0.27    0.7902
group        P              0           .         .      .        .
time                6      -2.6260     0.8885    98    -2.96    0.0039
time                4      -2.2020     0.8149    98    -2.70    0.0081
time                1      -1.6120     0.7919    98    -2.04    0.0445
time                0       0           .         .      .        .
group*time   A      6      -3.1520     1.2566    98    -2.51    0.0138
group*time   A      4      -8.8240     1.1525    98    -7.66    <.0001
group*time   A      1     -11.4060     1.1199    98   -10.18    <.0001

Trapezoidal differences

P-value for area differences

• A p-value for the differences in areas under the curves
• Estimate differences from baseline and compare areas

Contrasts

Label                      Num DF  Den DF  Chi-Square  F Value  Pr > ChiSq  Pr > F
3 DF Test of Interaction        3      99      111.96    37.32      <.0001  <.0001

Linear splines

Connected linear slopes

Data for fitting splines

id   succimer   time    y      time_1
1       0        0     30.8      0
1       0        1     26.9      0
1       0        4     25.8      3
1       0        6     23.8      5
2       1        0     26.5      0
2       1        1     14.8      0
2       1        4     19.5      3
2       1        6     21.0      5
3       1        0     25.8      0
3       1        1     23.0      0

Subtract the value of the break point (the knot, here time = 1) from time to
create a second time variable, time_1; set time_1 to 0 at or before the knot

Regression model

y = time + time_1 + succimer*time + succimer*time_1

Regression results

Effect            Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept          26.3422       0.4991       99    52.78    <.0001
time               -1.6296       0.7818       99    -2.08    0.0397
time_1              1.4305       0.8777       99     1.63    0.1063
time*succimer     -11.2500       1.0924       99   -10.30    <.0001
time_1*succimer    12.5822       1.2278       99    10.25    <.0001
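
Worked example (a sketch, not part of the original output; assumes time is
in weeks, the knot is at week 1, and succimer = 1 marks the treated group):
slopes per week implied by the estimates above

Placebo,  up to week 1:   -1.6296
Placebo,  after week 1:   -1.6296 + 1.4305 = -0.1991
Succimer, up to week 1:   -1.6296 - 11.2500 = -12.8796
Succimer, after week 1:   -12.8796 + 1.4305 + 12.5822 = 1.1331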

Estimated and mean blood lead levels

Study of the development of obesity

• Data on 4865 boys and girls were collected biennially (every two years)
• Five age cohorts: 5-7, 7-9, 9-11, 11-13, 13-15
• Outcome was obesity (yes or no)
• Modeled mean age of measurement and longitudinal age

Study data

id   female   baselineage   occasion   obesity   cage   cage2

1       0          6           1         1        -6      36
1       0          6           2         1        -4      16
1       0          6           3         1        -2       4
2       0          6           1         1        -6      36
2       0          6           2         1        -4      16
2       0          6           3         1        -2       4
3       0          6           1         1        -6      36
3       0          6           2         1        -4      16
3       0          6           3         1        -2       4
4       0          6           1         1        -6      36

Regression model

Obesity = female + cage + cage2 + female*cage + female*cage2

Fit using logistic regression for repeated measures

Regression results

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter      Estimate   Std Error   95% Confidence Limits      Z      Pr > |Z|
Intercept       -1.2135    0.0506     -1.3126    -1.1144      -24.00    <.0001
gender           0.1159    0.0711     -0.0235     0.2553        1.63    0.1033
cage             0.0378    0.0133      0.0118     0.0638        2.85    0.0043
cage2           -0.0175    0.0034     -0.0241    -0.0109       -5.19    <.0001
gender*cage      0.0075    0.0182     -0.0282     0.0433        0.41    0.6795
gender*cage2     0.0039    0.0046     -0.0051     0.0130        0.85    0.3949

Regression curves

Regression model

Obesity = female + cage + cage2

Regression results

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter   Estimate   Std Error   95% Confidence Limits      Z      Pr > |Z|
Intercept    -1.2283    0.0477     -1.3218    -1.1348      -25.75    <.0001
gender        0.1449    0.0627      0.0221     0.2678        2.31    0.0208
cage          0.0418    0.0091      0.0240     0.0596        4.60    <.0001
cage2        -0.0155    0.0023     -0.0200    -0.0110       -6.73    <.0001
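
Worked example (a sketch; assumes the gender parameter is the female
indicator and that the model uses the logit link): odds ratio implied by
the estimates above

Odds ratio, girls vs. boys:   exp(0.1449) ≈ 1.16
95% confidence limits:        exp(0.0221) ≈ 1.02 to exp(0.2678) ≈ 1.31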

Clinical trial of antibiotics for leprosy

• Participants randomized to placebo or to one of two antibiotics
• Measured number of leprosy bacilli at six sites of the body
• Analyzed using Poisson regression (although results showed
  overdispersion and may have been better modeled with negative binomial
  regression)

Study data
id    time   A    B

1      0    1    0
1      1    1    0
2      0    0    1
2      1    0    1
3      0    0    0
3      1    0    0
4      0    1    0
4      1    1    0
5      0    0    1
5      1    0    1

Regression model

Y = time + A*time + B*time

Note: Treatment main effects were not included because treatments were
randomized, so the difference between groups at baseline should be zero

Regression results

Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates

Parameter   Estimate   Std Error   95% Confidence Limits      Z      Pr > |Z|
Intercept     2.3734    0.1035      2.1704     2.5763       22.92    <.0001
time         -0.0138    0.1111     -0.2315     0.2039       -0.12    0.9010
time*A       -0.5406    0.1818     -0.8969    -0.1843       -2.97    0.0029
time*B       -0.4791    0.1779     -0.8278    -0.1303       -2.69    0.0071

Test that the antibiotics are equally effective

Contrast Results for GEE Analysis

Contrast                  DF   Chi-Square   Pr > ChiSq   Type
Drug x Time Interaction    2      6.99        0.0803     Wald

Regression model

Y = time + time*Antibiotic (y/n)

Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates

Parameter         Estimate   Std Error   95% Confidence Limits      Z      Pr > |Z|
Intercept           2.3734    0.1028      2.1718     2.5749       23.08    <.0001
time               -0.0108    0.1142     -0.2345     0.2130       -0.09    0.9249
time*Antibiotic    -0.5141    0.1536     -0.8152    -0.2131       -3.35    0.0008
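
Worked example (a sketch; assumes the log link of Poisson regression and
time coded 0 = baseline, 1 = follow-up): ratios implied by the estimates above

Expected count at baseline:            exp(2.3734) ≈ 10.7
Change over time, placebo:             exp(-0.0108) ≈ 0.99
Additional change with an antibiotic:  exp(-0.5141) ≈ 0.60
95% confidence limits:                 exp(-0.8152) ≈ 0.44 to exp(-0.2131) ≈ 0.81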

```
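
The covariance patterns listed in the slides can be made concrete with a small NumPy sketch. This is an illustration only, not the code behind the slides: the function names (`compound_symmetry`, `ar1`, `spatial_power`, `cov_to_corr`) and the `(sigma2, rho)` parametrization are assumptions chosen for the example. The matrix `R` is the estimated unstructured covariance matrix from the blood lead example, taken to correspond to weeks 0, 1, 4 and 6.

```python
import numpy as np

def compound_symmetry(n, sigma2, rho):
    """One variance, and the same correlation rho for every pair of occasions."""
    corr = np.full((n, n), float(rho))
    np.fill_diagonal(corr, 1.0)
    return sigma2 * corr

def ar1(n, sigma2, rho):
    """Correlation rho**|i - j| between equally spaced occasions i and j."""
    idx = np.arange(n)
    return sigma2 * float(rho) ** np.abs(idx[:, None] - idx[None, :])

def spatial_power(times, sigma2, rho):
    """Correlation rho**|t_i - t_j|, so unequally spaced times are allowed."""
    t = np.asarray(times, dtype=float)
    return sigma2 * float(rho) ** np.abs(t[:, None] - t[None, :])

def cov_to_corr(cov):
    """Rescale a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Estimated unstructured R matrix from the blood lead example.
R = np.array([
    [25.2257, 19.1074, 19.6995, 22.2016],
    [19.1074, 44.3458, 35.5351, 29.6750],
    [19.6995, 35.5351, 47.3778, 30.6205],
    [22.2016, 29.6750, 30.6205, 58.6510],
])
n = R.shape[0]

# Number of covariance parameters each pattern needs for n occasions.
n_params = {
    "VC": 1,
    "Compound symmetry": 2,
    "AR(1)": 2,
    "Toeplitz": n,
    "Unstructured": n * (n + 1) // 2,
}

R_corr = cov_to_corr(R)

if __name__ == "__main__":
    print(n_params)
    print(np.round(R_corr, 2))
    # Illustrative values only: sigma2 = 1 and rho = 0.9 are not estimates.
    print(np.round(spatial_power([0, 1, 4, 6], 1.0, 0.9), 3))
```

Comparing the correlations implied by `R` with those produced by a simpler pattern is one informal way to see what each structure assumes; the formal comparison would still use likelihood ratio tests or AIC, as the slides describe.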