Lecture_6_-_control_of_extraneous_factors

```
Adjusting for extraneous factors
Topics for today
• Stratified analysis of 2x2 tables
• Regression

• Jewell Chapter 9
A 1973 study showed that 45% of 2691 male applicants were admitted,
compared with only 30% of 1835 female applicants.

          Admit   Reject
Male      1198    1493
Female     557    1278

The odds ratio is 1.84 with 95% confidence interval (1.62, 2.08).
Is this evidence of sex bias?

Log odds ratio =
95% conf interval:
The picture changes completely once we break the data down by department!

Dept   Male                Female
       N     % admitted    N     % admitted
1      825   62%           108   82%
2      560   63%            25   68%
3      325   37%           593   34%
4      417   33%           375   35%
5      191   28%           393   24%
6      373    6%           341    7%
Bickel, P.J., E.A. Hammel and J.W. O'Connell (1975) "Sex bias in
Stratified analysis
• Consider relationship between a disease outcome (D in
Jewell, often Y in practice) and an exposure (E in Jewell,
often X in practice), but we also want to adjust for an
additional factor such as age or sex that can be divided up
into I distinct strata.
• Suppose that the data from the ith stratum can be
represented as follows:

                 Diseased   Not Diseased
Exposed          a_i        b_i
Unexposed        c_i        d_i

• Jewell Tables 9.2 & 9.3 give two examples
What do we want to do?
1. Ask whether there is a significant association
between disease (D) and exposure (E), after
adjusting for the stratification factor.
2. Estimate an adjusted odds ratio that
appropriately takes into account the stratification
factor.

Before doing either, we go over another way to assess whether
there is a significant association for a 2x2 table.
Assessing association - Berkeley Admissions again
We have already seen that there is a significant association in this 2x2
table, based on the 95% confidence interval for the odds ratio. An
alternative approach is a chi-squared test.

There are several variations, but the basic idea is to compare the observed
data to what would be expected if there were no association
(see Jewell, p. 69).

Observed data   Admit   Reject
Male            1198    1493
Female           557    1278

Expected data   Admit   Reject
Male
Female
Chi-Squared test for a 2x2 table
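The test statistic is

$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where O_ij and E_ij are the observed and expected counts in row i and column j.
And its “significance” can be determined by looking up
the chi-squared tables with 1 degree of freedom.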
For the Berkeley data, we get:
Back to the stratified analysis
The Cochran-Mantel-Haenszel test combines the differences between observed
and expected values over all the strata. It focuses only on the “a” element
of each 2x2 table.

Stratum i    D      Not D
E            a_i    b_i
Not E        c_i    d_i

$$\chi^2_{CMH} = \frac{\left[ \sum_{i=1}^{I} (a_i - A_i) \right]^2}{\sum_{i=1}^{I} V_i}$$

where $A_i = (a_i + b_i)(a_i + c_i) / n_i$
and $V_i = (a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i) / [n_i^2 (n_i - 1)]$,
with $n_i = a_i + b_i + c_i + d_i$ the total count in stratum i.
           Male           Female
stratum    a      b       c      d
1          512    313     89     19
2          353    207     17     8
3          120    205     202    391
4          138    279     131    244
5          53     138     94     299
6          22     351     24     317
Estimating a common effect
• Woolf method (averages the log odds ratios)
• Mantel-Haenszel (averages the odds ratios)
• Regression-based
Woolf’s average log-odds ratio

$$\log(\widehat{OR}_W) = \frac{\sum_{i=1}^{I} w_i \log(\widehat{OR}_i)}{\sum_{i=1}^{I} w_i}$$

where $\log(\widehat{OR}_i) = \log(a_i / b_i) - \log(c_i / d_i)$ (natural logarithms),
$1 / w_i = \widehat{var}(\log(\widehat{OR}_i)) = 1/a_i + 1/b_i + 1/c_i + 1/d_i$,
and $Var(\log(\widehat{OR}_W)) = 1 / \sum_{i=1}^{I} w_i$.

Can add .5 to cell entries if sample sizes are small
Applying Woolf method to Berkeley data
stratum    a     b     c     d     lor      v        w=1/v     w*lor
1          512   313   89    19    -1.052   0.0690   14.489   -15.244
2          353   207   17    8     -0.220   0.1915    5.222    -1.149
3          120   205   202   391    0.125   0.0207   48.264     6.029
4          138   279   131   244   -0.082   0.0226   44.321    -3.634
5          53    138   94    299    0.200   0.0401   24.939     4.993
6          22    351   24    317   -0.189   0.0931   10.738    -2.028

(lor is the natural log of the stratum odds ratio ad/bc)
Woolf estimate of LOR is -0.075, with variance .0068. What is 95% CI?
Corresponding OR estimate is
Mantel-Haenszel average odds ratio
$$\widehat{OR}_{MH} = \frac{\sum_{i=1}^{I} w_i^* \widehat{OR}_i}{\sum_{i=1}^{I} w_i^*}$$

where $\widehat{OR}_i = \frac{a_i d_i}{b_i c_i}$ and $w_i^* = \frac{b_i c_i}{n_i}$.

$Var(\widehat{OR}_{MH})$: see page 131 of Jewell!
Regression-based analysis for Berkeley data

* One row per stratum and sex: a = number admitted, b = number rejected;
data berkeley;
  input stratum male a b;
  cards;
1 1 512 313
1 0 89 19
2 1 353 207
2 0 17 8
3 1 120 205
3 0 202 391
4 1 138 279
4 0 131 244
5 1 53 138
5 0 94 299
6 1 22 351
6 0 24 317
;
run;

Code continued

data berkeley; set berkeley;
  n = a + b;
run;

* Unstratified analysis;
proc genmod;
  model a/n = male / dist=binomial;
run;
Results of unstratified analysis

                          Standard   95% Confidence        Chi-
Parameter   DF  Estimate  Error      Limits                Square   Pr > ChiSq

Intercept   1   -0.8305   0.0508     -0.9300   -0.7310     267.56   <.0001
male        1    0.6104   0.0639      0.4851    0.7356      91.25   <.0001
Scale       0    1.0000   0.0000      1.0000    1.0000

Compare with our initial analysis
Stratified analysis

proc genmod;
  class stratum;
  model a/n = male stratum / dist=binomial;
run;

                                 Standard   95% Conf              Chi-
Parameter        DF   Estimate   Error      Limits                Square   Pr > ChiSq

Intercept         1   -2.6246    0.1577     -2.9337   -2.3154     276.88   <.0001
male              1   -0.0999    0.0808     -0.2583    0.0586       1.53   0.2167
stratum     1     1    3.3065    0.1700      2.9733    3.6396     378.38   <.0001
stratum     2     1    3.2631    0.1788      2.9127    3.6135     333.12   <.0001
stratum     3     1    2.0439    0.1679      1.7149    2.3729     148.24   <.0001
stratum     4     1    2.0119    0.1699      1.6788    2.3449     140.18   <.0001
stratum     5     1    1.5672    0.1804      1.2135    1.9208      75.44   <.0001
stratum     6     0    0.0000    0.0000      0.0000    0.0000        .     .
Scale             0    1.0000    0.0000      1.0000    1.0000
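
* Sketch only, not part of the original slides. Assuming the berkeley data
  set built earlier, the ESTIMATE statement with the EXP option reports the
  male coefficient from the stratified model as an adjusted odds ratio with
  confidence limits. The label text is an arbitrary choice;
proc genmod data=berkeley;
  class stratum;
  model a/n = male stratum / dist=binomial;
  estimate 'male vs female, adjusted for stratum' male 1 / exp;
run;
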
More general modeling
Additional variables can be included in a regression model so as to obtain
an estimate of the exposure effect that is adjusted for other factors.

Example: smoking in the Epilepsy study. Let's look in SAS:
proc freq ;
table one3*cig2 /chisq;
run;
Epilepsy data in SAS

Model with DRUG only:

                                 Standard   Wald 95% Confidence   Chi-
Parameter        DF   Estimate   Error      Limits                Square   Pr > ChiSq

Intercept         1   -3.1396    0.2229     -3.5765   -2.7028     198.41   <.0001
DRUG        1     1    1.0384    0.2876      0.4748    1.6020      13.04   0.0003
DRUG        2     1   -0.2944    0.6275     -1.5243    0.9355       0.22   0.6390
DRUG        3     0    0.0000    0.0000      0.0000    0.0000        .     .
Scale             0    1.0000    0.0000      1.0000    1.0000

Model with DRUG and CIG2:

                                 Standard   Wald 95% Confidence   Chi-
Parameter        DF   Estimate   Error      Limits                Square   Pr > ChiSq

Intercept         1   -3.3872    0.2435     -3.8644   -2.9100     193.55   <.0001
DRUG        1     1    1.0712    0.2939      0.4952    1.6472      13.29   0.0003
DRUG        2     1   -0.3596    0.6337     -1.6016    0.8824       0.32   0.5704
DRUG        3     0    0.0000    0.0000      0.0000    0.0000        .     .
CIG2              1    1.0721    0.3131      0.4585    1.6857      11.73   0.0006
Scale             0    1.0000    0.0000      1.0000    1.0000
Why don’t drug estimates change much?
Hint: look at the association between drug and smoking
proc freq ;
table one3*cig2 /chisq;
run;

```
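
The chi-squared test on the pooled table and the Cochran-Mantel-Haenszel analysis described in the slides can also be requested from PROC FREQ. What follows is a minimal sketch, not part of the original lecture: the data set name `berkeley_long` and the variable names `admit` and `count` are assumptions made for illustration, and the counts are the six stratum tables listed in the slides.

```sas
/* One row per stratum x sex x outcome cell; count is the cell frequency. */
data berkeley_long;
  input stratum male admit count;
  datalines;
1 1 1 512
1 1 0 313
1 0 1 89
1 0 0 19
2 1 1 353
2 1 0 207
2 0 1 17
2 0 0 8
3 1 1 120
3 1 0 205
3 0 1 202
3 0 0 391
4 1 1 138
4 1 0 279
4 0 1 131
4 0 0 244
5 1 1 53
5 1 0 138
5 0 1 94
5 0 0 299
6 1 1 22
6 1 0 351
6 0 1 24
6 0 0 317
;
run;

/* Pooled 2x2 table: observed and expected counts, chi-squared test, odds ratio. */
proc freq data=berkeley_long order=data;
  weight count;
  tables male*admit / chisq expected relrisk;
run;

/* Stratified analysis: CMH test and common odds ratio estimates across strata. */
proc freq data=berkeley_long order=data;
  weight count;
  tables stratum*male*admit / cmh;
run;
```

With ORDER=DATA the first row is male and the first column is admitted, so the cells line up with the a, b, c, d layout used in the slides. The CMH output should list both a Mantel-Haenszel and a logit estimate of the common odds ratio; the logit estimate corresponds to the Woolf-type weighted average of log odds ratios.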
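
To see the arithmetic behind the Woolf and Mantel-Haenszel formulas, the weights can also be built by hand in a data step. This is a sketch under stated assumptions rather than the lecture's own code: the data set names `strata`, `sums` and `common` and all derived variable names are hypothetical, and the 1.96 multiplier assumes a normal approximation for the 95% interval on the log scale.

```sas
/* One row per stratum, using the a, b, c, d layout of the slides. */
data strata;
  input stratum a b c d;
  n      = a + b + c + d;
  lor    = log((a*d) / (b*c));       /* LOG is the natural logarithm in SAS */
  v      = 1/a + 1/b + 1/c + 1/d;    /* estimated variance of lor           */
  w      = 1/v;                      /* Woolf weight                        */
  wlor   = w*lor;
  mh_num = a*d/n;                    /* Mantel-Haenszel numerator term      */
  mh_den = b*c/n;                    /* Mantel-Haenszel weight w*           */
  datalines;
1 512 313 89 19
2 353 207 17 8
3 120 205 202 391
4 138 279 131 244
5 53 138 94 299
6 22 351 24 317
;
run;

/* Sum the weights and weighted terms over the strata. */
proc means data=strata noprint;
  var w wlor mh_num mh_den;
  output out=sums sum=sw swlor smh_num smh_den;
run;

/* Combine the sums into the Woolf and Mantel-Haenszel estimates. */
data common;
  set sums;
  lor_woolf = swlor / sw;
  var_woolf = 1 / sw;
  lower95   = lor_woolf - 1.96*sqrt(var_woolf);
  upper95   = lor_woolf + 1.96*sqrt(var_woolf);
  or_woolf  = exp(lor_woolf);
  or_mh     = smh_num / smh_den;
run;

proc print data=common noobs;
  var lor_woolf var_woolf lower95 upper95 or_woolf or_mh;
run;
```

The pooled log odds ratio from this calculation can be compared with the coefficient for male in the stratified regression output, since both are on the natural-log scale.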