Bias


Spring 2008

Bias, Confounding,

and Effect Modification

STAT 6395

Filardo and Ng
Bias

Any systematic error in the design or conduct
of a study that results in a mistaken estimate
of the association between an exposure and
a disease

Bias is often a major problem in observational
epidemiologic studies
Systematic error (bias) is different than random error

• Example: an association between an
exposure and a disease in which the true
relative risk is 2.0

• If the design and conduct of a study are
unbiased, and there is no confounding, and
we repeat the study an infinite number of
times, the mean relative risk will be 2.0, with
the individual relative risks from the different
studies fluctuating around 2.0

• If the design or conduct of the study is biased,
and we repeat the study an infinite number of
times, the mean relative risk will differ from
2.0 (for example, it may be 1.2), with the
individual relative risks from the different
studies fluctuating around 1.2

• Due to random variation, an association that
is far from the truth can be observed in an
unbiased study, but it usually won’t be.

• Due to random variation, the true association
can be observed in a biased study, but it
usually won’t be

Statistical significance does not protect
against bias
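
The repeated-study thought experiment above can be sketched as a small simulation. This is only an illustration under assumed values: the baseline risks (0.20 and 0.10), the group size, the number of repetitions, and the bias mechanism (40% of exposed cases are missed, so the observed risk in the exposed is 0.12) are all hypothetical choices, not part of the original example.

```python
import random

def simulate_mean_rr(risk_exposed, risk_unexposed,
                     n_per_group=1000, n_studies=500, seed=0):
    """Average the estimated relative risk over many simulated cohort studies."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_studies):
        cases_exposed = sum(rng.random() < risk_exposed for _ in range(n_per_group))
        cases_unexposed = sum(rng.random() < risk_unexposed for _ in range(n_per_group))
        if cases_unexposed == 0:
            continue  # RR undefined for this replicate
        # Equal group sizes, so the risk ratio equals the ratio of case counts
        estimates.append(cases_exposed / cases_unexposed)
    return sum(estimates) / len(estimates)

# Hypothetical unbiased study: true RR = 0.20 / 0.10 = 2.0
unbiased_mean = simulate_mean_rr(0.20, 0.10)

# Hypothetical biased study: observed risk in the exposed is 0.12, so RR centers near 1.2
biased_mean = simulate_mean_rr(0.12, 0.10)

print(f"mean RR, unbiased design: {unbiased_mean:.2f}")
print(f"mean RR, biased design:   {biased_mean:.2f}")
```

Individual replicates scatter around each mean because of random error; repeating the study more times tightens the mean in the unbiased design around 2.0 but never moves the biased design away from about 1.2.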
Two major categories of bias

• Selection bias
• Information bias
Selection bias

Error that results from criteria or procedures
used to select study subjects or from factors
that influence study participation.

With selection bias, the relation between exposure
and disease is different for those who are selected for
and participate in the study and those who should be
theoretically eligible to participate.

Selection bias can occur as a result of:
» Incorrect selection criteria for study subjects
» Differences in characteristics between eligible subjects who agree
to participate and eligible subjects who do not participate
Information bias

Error due to collection of incorrect information
about study subjects. Due to this incorrect
information, subjects are classified into
incorrect exposure or disease categories.
Selection bias is a major issue in case-control studies

• Source population: the population that gives
rise to the cases

• Cases should be selected such that the
distribution of the exposures of interest
among the cases selected for the study is the
same as it is among all cases that arise in the
source population. The cases should be
representative of all cases that arise in the
source population with respect to the
exposures of interest.
Selection bias in case-control studies (cont.)

• Controls should be selected such that the
distribution of the exposures of interest
among the controls is the same as it is in the
source population. The controls should be
representative of the source population with
respect to the exposures of interest.

• Selection bias occurs when either:
   The cases are not representative of all cases that arise in the
source population with respect to the exposures of interest and/or
   The controls are not representative of the source population with
respect to the exposures of interest.
Selection bias in case-control studies: how it works

• In the hypothetical data depicted in the
following tables, we will assume there is no:
» information bias,
» confounding, or
» random variability

so that all differences are due to differences
in selection of cases or controls
Hypothetical case-control study including all cases and all
non-cases from Source Population A

                 All Cases    All Non-cases

Exposed              500          100,000
Nonexposed         1,000          900,000
Total              1,500        1,000,000

Gold standard OR = 4.5
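
The gold standard odds ratio is the cross-product of the table above. A minimal sketch follows; the function and argument names are illustrative and not from the original.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Cross-product odds ratio for a 2x2 table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

# Source Population A: all cases and all non-cases
gold_standard_or = odds_ratio(500, 100_000, 1_000, 900_000)
print(gold_standard_or)  # 4.5
```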
Hypothetical case-control study including a 70% unbiased
sample of the cases and 0.5% unbiased sample of the
controls from Source Population A

                 Cases                    Controls

Exposed          500 x 0.7 = 350          100,000 x 0.005 = 500
Nonexposed       1,000 x 0.7 = 700        900,000 x 0.005 = 4,500
Total            1,050                    5,000

Unbiased OR = (350 x 4,500)/(500 x 700) = 4.5
This is an unbiased odds ratio because the selection of
cases and controls was unrelated to exposure.
Selection bias in choosing controls in a hypothetical case-
control study including a 70% sample of the cases and 0.5%
sample of the controls from Source Population A

                 Cases                    Controls

Exposed          500 x 0.7 = 350          100,000 x 0.0095 = 950
Nonexposed       1,000 x 0.7 = 700        900,000 x 0.0045 = 4,050
Total            1,500 x 0.7 = 1,050      1,000,000 x 0.005 = 5,000

Biased OR = (350 x 4,050)/(950 x 700) = 2.13
Selection of controls was related to exposure
Over-selecting exposed controls biases the OR downward
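
The two sampling schemes for Source Population A (unbiased sampling, then over-selection of exposed controls) can be reproduced by applying a separate sampling fraction to each cell. This is a sketch only; the dictionary keys and the helper name `sampled_or` are assumptions made for illustration.

```python
SOURCE = {"exposed_cases": 500, "unexposed_cases": 1_000,
          "exposed_noncases": 100_000, "unexposed_noncases": 900_000}

def sampled_or(source, fractions):
    """Odds ratio after sampling each cell of the source table at its own fraction."""
    n = {cell: source[cell] * fractions[cell] for cell in source}
    return (n["exposed_cases"] * n["unexposed_noncases"]) / (
        n["exposed_noncases"] * n["unexposed_cases"])

# Selection unrelated to exposure: 70% of cases, 0.5% of non-cases
unbiased = sampled_or(SOURCE, {"exposed_cases": 0.7, "unexposed_cases": 0.7,
                               "exposed_noncases": 0.005, "unexposed_noncases": 0.005})

# Exposed non-cases sampled at 0.95%, nonexposed at 0.45%
biased_controls = sampled_or(SOURCE, {"exposed_cases": 0.7, "unexposed_cases": 0.7,
                                      "exposed_noncases": 0.0095,
                                      "unexposed_noncases": 0.0045})

print(round(unbiased, 2), round(biased_controls, 2))
```

When the sampling fraction is the same within cases and the same within controls, the fractions cancel in the cross-product; they stop cancelling as soon as the control fractions differ by exposure.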
Selection bias in choosing controls in a case-control study
due to incorrect criteria for control selection

Example: A hospital-based case-control study
of the relation of smoking to a given disease.

If the control group includes persons
hospitalized for smoking-related diseases
(e.g., cardiovascular disease)…

…the control group would likely have a higher
proportion of smokers than the source
population, and the resultant odds ratio would
be biased downward
Selection bias in choosing controls in a case-control study
due to a difference in participation rates between exposed
controls and nonexposed controls

• Example: Case-control study of the relation between
housing characteristics and lead poisoning among
children 6 years of age or younger who are screened
for blood lead levels at the Hill Health Center in New
Haven

• Cases: all children with a blood lead level of >10
micrograms/dL

• Controls: a systematic sample of children with a
blood lead level of <10 micrograms/dL
Housing characteristics and lead poisoning (cont.)

• Incentive for participation: the parents of the children
were offered a free lead inspection of their homes

• Participation rate among cases: 91% (parents were
motivated by their child’s elevated blood lead level to
have the inspection)

• Participation rate among controls: 69% (parents did
not have the same motivation to participate)

The condition of the housing of the control parents
who refused to participate was better than the
condition of the housing of the control parents who
did participate

• The housing of the controls selected for the study
was in poorer condition than the housing of the
source population

The odds ratio for the association between measures
of dilapidated housing and childhood lead poisoning
would be biased downward

• Although the criteria for selecting controls were
sound, the difference in participation rate between
exposed controls and nonexposed controls resulted
in a biased odds ratio
Selection bias in choosing cases in a hypothetical case-
control study including a 70% sample of the cases and 0.5%
sample of the non-cases from Source Population A

                 Cases                    Controls

Exposed          500 x 0.9 = 450          100,000 x 0.005 = 500
Nonexposed       1,000 x 0.6 = 600        900,000 x 0.005 = 4,500
Total            1,500 x 0.7 = 1,050      1,000,000 x 0.005 = 5,000

Biased OR = (450 x 4,500)/(500 x 600) = 6.75

Selection of cases was related to exposure
Over-selecting exposed cases biases the OR upward
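
The size of the bias follows directly from the four selection fractions. In the notation below (an assumption introduced here, not from the original), a, b, c, d are the source-population counts of exposed cases, exposed non-cases, nonexposed cases, and nonexposed non-cases, and s_a, s_b, s_c, s_d are the fractions of each cell selected into the study.

```latex
\[
\mathrm{OR}_{\text{obs}}
  = \frac{(s_a a)(s_d d)}{(s_b b)(s_c c)}
  = \frac{ad}{bc}\cdot\frac{s_a s_d}{s_b s_c}
  = \mathrm{OR}_{\text{true}}\cdot\frac{s_a s_d}{s_b s_c}
\]
\[
\frac{s_a s_d}{s_b s_c} = \frac{0.9 \times 0.005}{0.005 \times 0.6} = 1.5,
\qquad
\mathrm{OR}_{\text{obs}} = 4.5 \times 1.5 = 6.75
\]
```

The observed odds ratio is unbiased whenever the selection factor equals 1, which is the case when selection is unrelated to exposure within cases and within controls.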
Selection bias in choosing cases in a case-control study

• Example: Population-based case-control study of
pancreatic cancer

• Hypothesis: vitamin C protects against development of
pancreatic cancer

Vitamin C intake assessed by food frequency
questionnaire

• Median interval between diagnosis and interview: 9
months

• One-year case fatality rate of pancreatic cancer: 80%

Many cases would die before being interviewed

Suppose vitamin C intake improves survival from
pancreatic cancer

• Then vitamin C intake among cases selected for the
study would be higher than vitamin C intake among all
cases

• Over-selection of exposed cases would bias OR
upward
Compensating Selection Bias

To avoid biased odds ratios, investigators often attempt
to equalize selection bias between cases and controls
by selecting cases and controls undergoing the same
selection processes
Compensating bias in choosing cases and controls in a
hypothetical case-control study including a 70% sample of
the cases and 0.5% sample of the non-cases from Source
Population A

                 Cases                    Controls

Exposed          500 x 0.9 = 450          100,000 x 0.00714 = 714
Nonexposed       1,000 x 0.6 = 600        900,000 x 0.004762 = 4,286
Total            1,500 x 0.7 = 1,050      1,000,000 x 0.005 = 5,000

Unbiased OR = (450 x 4,286)/(714 x 600) = 4.5

Equal over-selection (1.5x) of exposed cases and controls
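
One way to check that the over-selection is compensated is to compute the ratio of observed to true OR implied by the four selection fractions: (fraction of exposed cases x fraction of nonexposed controls) / (fraction of exposed controls x fraction of nonexposed cases). The sketch below uses the fractions from the tables; the function name is illustrative.

```python
def selection_bias_factor(s_exposed_cases, s_unexposed_cases,
                          s_exposed_controls, s_unexposed_controls):
    """Ratio of observed OR to true OR implied by the four selection fractions."""
    return (s_exposed_cases * s_unexposed_controls) / (
        s_exposed_controls * s_unexposed_cases)

TRUE_OR = 4.5

# Exposed cases over-selected, controls sampled without regard to exposure
case_bias_only = selection_bias_factor(0.9, 0.6, 0.005, 0.005)

# Exposed controls over-selected to the same relative degree as exposed cases
compensated = selection_bias_factor(0.9, 0.6, 0.00714, 0.004762)

print(round(TRUE_OR * case_bias_only, 2))
print(round(TRUE_OR * compensated, 2))
```

A factor of 1.5 reproduces the biased OR of 6.75; a factor of essentially 1 (up to rounding of the control fractions) returns the OR to 4.5.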
Hypothetical case-control study including a 70% unbiased
sample of the cases and 0.5% unbiased sample of the
controls from Source Population A

                 Cases                    Controls

Exposed          500 x 0.7 = 350          100,000 x 0.005 = 500
Nonexposed       1,000 x 0.7 = 700        900,000 x 0.005 = 4,500
Total            1,050                    5,000

Unbiased OR = (350 x 4,500)/(500 x 700) = 4.5

This is the original table
Cases and controls undergoing the same selection
processes in a case-control study of breast cancer

• Example: Cases and controls selected from among
women attending a breast cancer screening program

These women are likely to have a high prevalence of
known breast cancer risk factors (family history of
breast cancer, history of benign breast disease, late
age at first birth)

If cases from this population were compared to
controls from the general population, an overestimate
of the magnitude of some risk factors would probably
occur

• Selecting both cases and controls from the screening
program should make the bias the same in both
groups, leading to unbiased odds ratios

This is another way of saying that controls should be
selected from the source population that gave rise to
the cases
Minimizing selection bias in case-control studies

• In the study design stage, carefully consider the
criteria for selection of cases and controls,
particularly with respect to ensuring internal validity

• Choose study procedures aimed at maximizing the
participation rate of the subjects selected for the
study
Selection bias in cohort studies using internal comparison
groups is unlikely

• Selection bias would occur if participation were related
to both exposure and the subsequent development of
disease

• Because study participants are selected before the
development of disease, this is unlikely

The exposed group and nonexposed comparison group
were drawn from the same source population and went
through the same selection process

• The nurses who participated in the Nurses’ Health
Study most likely differed from the nurses who did
not, but since the same selection process was
used to select the exposed group and the
nonexposed internal comparison group, the
relative risk estimates should be unbiased.
Cohort studies using external comparison groups are prone
to selection bias

• Exposed cohort and nonexposed external comparison
group are not selected from the same source population

The exposed cohort may be selected such that it is at
higher or lower risk for disease than the external
comparison group for a reason other than the exposure
of interest
Healthy worker effect

• A selection bias in occupational cohort studies using
a general population external comparison group

Persons selected for employment are usually
healthier than and have lower mortality rates than the
general population, which includes the sick and
disabled.

The healthy worker effect makes any excess disease
or mortality associated with an occupational exposure
more difficult to detect than it would have been if a
valid comparison group had been used, biasing the
estimates of relative risk downward
Losses to follow-up in cohort studies are analogous to
selection bias in case-control studies

• When a subject in a cohort study is lost to follow-up,
we do not know whether that subject developed the
disease of interest during the remainder of the study’s
follow-up period

• If the subjects lost to follow-up have a different
incidence of the disease of interest than the subjects
not lost to follow-up, the estimates of the incidence
rate of the disease of interest in the cohort will be
biased

• However, relative risk estimates will be unbiased if
the bias on the incidence rate estimates is the same
in the exposed and nonexposed groups.

A biased relative risk estimate will occur only if losses
to follow-up are related to both disease and exposure
• The best defense against bias due to losses to
follow-up is to make intense efforts to locate each
cohort member, and thus minimize losses
Hypothetical cohort study with 100% follow-up (to keep the
examples simple, we will not use the person-years method,
but will use 10-year cumulative incidence)

               Disease    No Disease    Total      Incidence
                                                   (per 10,000 per 10 yrs)

Exposed           50        10,000      10,050       49.75
Non-exposed      100        90,000      90,100       11.10

Gold standard RR = 49.75/11.10 = 4.48
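
The incidence figures in this table are 10-year cumulative incidences per 10,000 (diseased divided by total), and the relative risk is their ratio. A minimal sketch, with illustrative function names:

```python
def cumulative_incidence(diseased, total, per=10_000):
    """Cumulative incidence expressed per `per` persons."""
    return diseased / total * per

ci_exposed = cumulative_incidence(50, 10_050)
ci_unexposed = cumulative_incidence(100, 90_100)
relative_risk = ci_exposed / ci_unexposed

print(f"{ci_exposed:.2f} {ci_unexposed:.2f} {relative_risk:.2f}")
```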
Hypothetical cohort study with 30% of the cohort lost to
follow-up: losses to follow-up independent of exposure and
disease
               Disease            No Disease               Total                    Incidence
                                                                                    (per 10,000 per 10 yrs)

Exposed        50 x 0.7 = 35      10,000 x 0.7 = 7,000     10,050 x 0.7 = 7,035     49.75
Non-exposed    100 x 0.7 = 70     90,000 x 0.7 = 63,000    90,100 x 0.7 = 63,070    11.10

Unbiased RR = 49.75/11.10 = 4.48
Hypothetical cohort study with 40% of the exposed group and
20% of the nonexposed group lost to follow-up: losses to
follow-up related to exposure, but not disease
               Disease            No Disease               Total                    Incidence
                                                                                    (per 10,000 per 10 yrs)

Exposed        50 x 0.6 = 30      10,000 x 0.6 = 6,000     10,050 x 0.6 = 6,030     49.75
Non-exposed    100 x 0.8 = 80     90,000 x 0.8 = 72,000    90,100 x 0.8 = 72,080    11.10

Unbiased RR = 49.75/11.10 = 4.48
Hypothetical cohort study with 40% of those who developed
disease and 20% of those who did not develop disease lost to
follow-up: losses to follow-up related to disease, but not
exposure
               Disease            No Disease               Total      Incidence
                                                                      (per 10,000 per 10 yrs)

Exposed        50 x 0.6 = 30      10,000 x 0.8 = 8,000      8,030      37.36
Non-exposed    100 x 0.6 = 60     90,000 x 0.8 = 72,000    72,060       8.33

Unbiased RR = 37.36/8.33 = 4.48
Hypothetical cohort study: losses to follow-up related to
disease and exposure

               Disease            No Disease               Total      Incidence
                                                                      (per 10,000 per 10 yrs)

Exposed        50 x 0.6 = 30      10,000 x 0.8 = 8,000      8,030      37.36
Non-exposed    100 x 0.8 = 80     90,000 x 0.8 = 72,000    72,080      11.10

Biased RR = 37.36/11.10 = 3.37
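
The four loss-to-follow-up scenarios can be compared side by side by applying a retention fraction to each cell of the fully followed cohort. This is a sketch; the cell keys, scenario labels, and helper names are assumptions made for illustration, while the counts and retention fractions come from the tables above.

```python
COHORT = {("exposed", "disease"): 50, ("exposed", "no_disease"): 10_000,
          ("unexposed", "disease"): 100, ("unexposed", "no_disease"): 90_000}

def retention(exp_dis, exp_no, unexp_dis, unexp_no):
    """Fraction of each cell that remains under follow-up."""
    return {("exposed", "disease"): exp_dis, ("exposed", "no_disease"): exp_no,
            ("unexposed", "disease"): unexp_dis, ("unexposed", "no_disease"): unexp_no}

def rr_after_losses(cohort, retained):
    """Relative risk among subjects remaining after losses to follow-up."""
    kept = {cell: n * retained[cell] for cell, n in cohort.items()}

    def risk(group):
        diseased = kept[(group, "disease")]
        return diseased / (diseased + kept[(group, "no_disease")])

    return risk("exposed") / risk("unexposed")

scenarios = {
    "independent of both": retention(0.7, 0.7, 0.7, 0.7),
    "related to exposure only": retention(0.6, 0.6, 0.8, 0.8),
    "related to disease only": retention(0.6, 0.8, 0.6, 0.8),
    "related to both": retention(0.6, 0.8, 0.8, 0.8),
}
results = {name: rr_after_losses(COHORT, r) for name, r in scenarios.items()}
for name, rr in results.items():
    print(f"{name}: RR = {rr:.2f}")
```

The first two scenarios reproduce the gold standard exactly. Losses related to disease only shift the unrounded RR by less than 0.01, because the disease is rare. Only losses related to both disease and exposure move the RR materially, to about 3.37.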
Information bias (error due to collection of incorrect
information about study subjects) results in misclassification
of exposure or disease

• Nondifferential exposure misclassification:
misclassification of exposure unrelated to disease

• Nondifferential disease misclassification:
misclassification of disease unrelated to exposure

• Differential misclassification: misclassification related
to both exposure and disease

• Nondifferential misclassification tends to bias an
association toward the null hypothesis (no
association)

• Differential misclassification can bias an association
either toward or away from the null hypothesis,
depending on the specific nature of the
misclassification
Nondifferential exposure misclassification in a cohort study

• Inclusion of nonexposed subjects in the exposed
group and exposed subjects in the nonexposed
group will bias the relative risk toward the null if the
exposure misclassification is unrelated to the future
development of disease, which is usually the case

Differential exposure misclassification is not likely in
cohort studies
Hypothetical cohort study with 100% follow-up and 100%
accuracy in exposure and disease classification

               Disease    No Disease    Total       Incidence
                                                    (per 10,000 per 10 yrs)

Exposed           75        15,000       15,075       49.75
Non-exposed      150       135,000      135,150       11.10

Gold standard RR = 49.75/11.10 = 4.48
Hypothetical cohort study with 20% of exposed misclassified
as nonexposed and 10% of nonexposed misclassified as
exposed, independent of disease: nondifferential exposure
misclassification
               Disease                No Disease                             Total      Incidence
                                                                                        (per 10,000 per 10 yrs)

Exposed        75 – 15 + 15 = 75      15,000 – 3,000 + 13,500 = 25,500       25,575     29.33
Non-exposed    150 – 15 + 15 = 150    135,000 – 13,500 + 3,000 = 124,500     124,650    12.03

Biased RR = 29.33/12.03 = 2.44
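
The misclassified table can be rebuilt from a sensitivity and specificity for exposure classification. In this sketch, "20% of exposed misclassified as nonexposed" is read as sensitivity 0.8 and "10% of nonexposed misclassified as exposed" as specificity 0.9; those labels and the function name are assumptions added for illustration.

```python
def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Observed (exposed, unexposed) counts given imperfect exposure classification."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return obs_exposed, obs_unexposed

# Nondifferential: the same error rates apply to diseased and nondiseased subjects
dis_e, dis_u = misclassify_exposure(75, 150, sensitivity=0.8, specificity=0.9)
nodis_e, nodis_u = misclassify_exposure(15_000, 135_000, sensitivity=0.8, specificity=0.9)

risk_exposed = dis_e / (dis_e + nodis_e)
risk_unexposed = dis_u / (dis_u + nodis_u)
observed_rr = risk_exposed / risk_unexposed

print(f"observed RR = {observed_rr:.2f}")
```

The observed exposed group is a mixture of 12,075 truly exposed and 13,500 truly nonexposed subjects, which dilutes its incidence toward that of the nonexposed and pulls the RR toward the null.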
Nondifferential exposure misclassification in a cohort study:
dietary assessment example

• At baseline, study subjects complete a food frequency
questionnaire about dietary habits over the past year.

Measurement error due to imperfect recall will result in
exposure misclassification, which will occur in both the
exposed and nonexposed groups
Hypothetical cohort study with 0.1% of nondiseased
misclassified as having developed the disease and 8% of the
diseased misclassified as nondiseased, independent of
exposure: nondifferential disease misclassification
               Disease                  No Disease                        Total       Incidence
                                                                                      (per 10,000 per 10 yrs)

Exposed        75 + 15 – 6 = 84         15,000 – 15 + 6 = 14,991          15,075       55.72
Non-exposed    150 + 135 – 12 = 273     135,000 – 135 + 12 = 134,877     135,150       20.20

Biased RR = 55.72/20.20 = 2.76
Hypothetical cohort study with 0.5% of nondiseased in the
exposed group misclassified as having developed the
disease and 0.04% of the nondiseased in the nonexposed
group misclassified as having developed the disease:
differential disease misclassification
               Disease              No Disease                  Total       Incidence
                                                                            (per 10,000 per 10 yrs)

Exposed        75 + 75 = 150        15,000 – 75 = 14,925        15,075       99.50
Non-exposed    150 + 54 = 204       135,000 – 54 = 134,946     135,150       15.09

Biased RR = 99.50/15.09 = 6.59
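
Both disease-misclassification tables can be reproduced from two error rates per exposure group: the fraction of true cases correctly classified (sensitivity) and the fraction of nondiseased wrongly classified as diseased (false-positive rate). The sketch below assumes that reading of the slide percentages; the function name is illustrative.

```python
def observed_risk(diseased, nondiseased, sensitivity, false_positive_rate):
    """Apparent cumulative incidence after disease misclassification."""
    observed_cases = diseased * sensitivity + nondiseased * false_positive_rate
    return observed_cases / (diseased + nondiseased)

# Nondifferential: 8% of diseased missed, 0.1% of nondiseased false positives,
# identical in both exposure groups
rr_nondiff = (observed_risk(75, 15_000, 0.92, 0.001)
              / observed_risk(150, 135_000, 0.92, 0.001))

# Differential: false positives are 0.5% among the exposed but 0.04% among the nonexposed
rr_diff = (observed_risk(75, 15_000, 1.0, 0.005)
           / observed_risk(150, 135_000, 1.0, 0.0004))

print(f"nondifferential RR = {rr_nondiff:.2f}, differential RR = {rr_diff:.2f}")
```

With equal error rates the RR moves from 4.48 toward the null; with a higher false-positive rate among the exposed it moves away from the null.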
Disease misclassification in cohort studies

• Disease misclassification is a particular issue when
information on disease is obtained from the members
of the cohort themselves (e.g., a health questionnaire)

Whenever possible, subject reports about disease
should be confirmed by more objective means, such
as review of medical records

• Differential misclassification is a concern if the study
members involved in data collection on disease or in
disease classification are aware of the exposure
status of the subjects
Hypothetical case-control study with no misclassification of
exposure or disease

               Cases    Controls

Exposed         350        500
Non-exposed     700      4,500

Gold standard OR = 4.50
Hypothetical case-control study with 10% of cases
misclassified as controls and 5% of controls misclassified as
cases, independent of exposure: nondifferential disease
misclassification

               Cases                      Controls

Exposed        350 – 35 + 25 = 340        500 + 35 – 25 = 510
Non-exposed    700 – 70 + 225 = 855       4,500 + 70 – 225 = 4,345

Biased OR = (340 x 4,345)/(510 x 855) = 3.39
Nondifferential disease misclassification in a case-control
study: Alzheimer’s disease

• Definitive diagnosis can only be made by brain
biopsy, which isn’t done.

We therefore must rely for diagnosis on clinical
criteria and exclusion of other diseases.       The
diagnostic criteria are imperfect and will result in
misclassification of the disease status

• Persons with other types of dementia, such as multi-infarct
dementia, may be included in the case group.

• Persons with early Alzheimer’s disease may be
included in the control group
Hypothetical case-control study with 10% of exposed
controls misclassified as cases and 1% of nonexposed
controls misclassified as cases: differential disease
misclassification

               Cases                 Controls

Exposed        350 + 50 = 400        500 – 50 = 450
Non-exposed    700 + 45 = 745        4,500 – 45 = 4,455

Biased OR = (400 x 4,455)/(450 x 745) = 5.32
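
The two disease-misclassification tables for the case-control design can be recomputed by moving a stated fraction of subjects between the case and control columns within each exposure stratum. This is a sketch using the percentages given on the slides; the function names are illustrative.

```python
def reclassify_disease(cases, controls, case_to_control, control_to_case):
    """Observed (cases, controls) in one exposure stratum after disease misclassification."""
    moved_out = cases * case_to_control      # true cases counted as controls
    moved_in = controls * control_to_case    # true controls counted as cases
    return cases - moved_out + moved_in, controls + moved_out - moved_in

def odds_ratio(exposed, unexposed):
    """Cross-product OR from (cases, controls) pairs for each exposure stratum."""
    (a, b), (c, d) = exposed, unexposed
    return (a * d) / (b * c)

# Nondifferential: 10% of cases and 5% of controls misclassified in both strata
or_nondiff = odds_ratio(reclassify_disease(350, 500, 0.10, 0.05),
                        reclassify_disease(700, 4_500, 0.10, 0.05))

# Differential: 10% of exposed controls but 1% of nonexposed controls counted as cases
or_diff = odds_ratio(reclassify_disease(350, 500, 0.0, 0.10),
                     reclassify_disease(700, 4_500, 0.0, 0.01))

print(f"nondifferential OR = {or_nondiff:.2f}, differential OR = {or_diff:.2f}")
```

From these cell counts the cross-product works out to about 3.39 for the nondifferential table and about 5.32 for the differential one, compared with the gold standard of 4.50.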
Differential disease misclassification in case-control study:
Alzheimer’s disease

• Exposure: hypertension

Hypertension is a risk factor for multi-infarct
dementia, which could be confused with Alzheimer’s
disease
Exposure misclassification in a case-control study: an
important source of both nondifferential and differential
misclassification

• Classifying exposed persons as being nonexposed
and nonexposed persons as being exposed will bias
the odds ratio toward the null if the exposure
misclassification is unrelated to disease status

• Classifying exposed persons as being nonexposed
and nonexposed persons as being exposed can bias
the odds ratio in either direction if the exposure
misclassification depends on disease status
Hypothetical case-control study with 20% of the nonexposed
misclassified as exposed and 16% of the exposed
misclassified as nonexposed, independent of disease:
nondifferential exposure misclassification

               Cases                       Controls

Exposed        350 + 140 – 56 = 434        500 + 900 – 80 = 1,320
Non-exposed    700 – 140 + 56 = 616        4,500 – 900 + 80 = 3,680

Biased OR = 1.96
Example: dietary assessment
Hypothetical case-control study with 20% of the nonexposed
cases misclassified as exposed and 5% of the nonexposed
controls misclassified as exposed: differential exposure
misclassification

               Cases                  Controls

Exposed        350 + 140 = 490        500 + 225 = 725
Non-exposed    700 – 140 = 560        4,500 – 225 = 4,275

Biased OR = 5.16
Example: Recall bias
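
The two exposure-misclassification tables above differ only in whether the error rates depend on disease status. A sketch that makes this explicit, treating the slide percentages as a sensitivity and specificity for reported exposure (labels and function names are assumptions for illustration):

```python
def observed_exposure(exposed, unexposed, sensitivity, specificity):
    """Observed (exposed, unexposed) counts within cases or within controls."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    return obs_exposed, exposed + unexposed - obs_exposed

def odds_ratio(cases, controls):
    """Cross-product OR from (exposed, unexposed) pairs for cases and controls."""
    (a, c), (b, d) = cases, controls
    return (a * d) / (b * c)

# Nondifferential: 16% of exposed and 20% of nonexposed misreported,
# with the same rates in cases and controls
or_nondiff = odds_ratio(observed_exposure(350, 700, 0.84, 0.80),
                        observed_exposure(500, 4_500, 0.84, 0.80))

# Differential: 20% of nonexposed cases but only 5% of nonexposed controls
# report exposure
or_diff = odds_ratio(observed_exposure(350, 700, 1.0, 0.80),
                     observed_exposure(500, 4_500, 1.0, 0.95))

print(f"nondifferential OR = {or_nondiff:.2f}, differential OR = {or_diff:.2f}")
```

Equal error rates in cases and controls pull the OR from 4.50 toward the null (about 1.96); over-reporting concentrated among cases pushes it away from the null (about 5.16).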
Types of information bias that can lead to differential
misclassification

• Recall bias
• Reporting bias
• Observer bias
Recall bias

Systematic error due to differences in accuracy of
recall of past exposures or diseases between study
groups

• Example: family history of prostate cancer in a case-
control study of prostate cancer

• Men diagnosed with prostate cancer are often more
aware of their family history than men who have not
been diagnosed
In a case-control study, reporting of family history of
prostate cancer could be more complete among
cases than among controls, biasing the result away
from the null hypothesis
Reporting bias

Systematic error due to selective revealing or
suppression of information about exposure or
disease due to attitudes, beliefs, or perceptions

• Example: married, apparently heterosexual men may
not reveal homosexual behavior

• Example: persons who belong to religious groups
that proscribe alcohol may lie about alcohol
consumption
Observer bias

Systematic error due to well-intentioned members of
the study team subconsciously or consciously
collecting data or making decisions about subjects’
exposure or disease status in different ways
according to study group. This may occur because
the observer has his/her own hypothesis about the
relationship between exposure and disease

• Interviewer bias: in a case-control study, an
interviewer may probe more thoroughly for an
exposure in a case than in a control

• Abstractor bias: in a cohort study, a data abstractor
may pore over the medical records of an exposed
subject more thoroughly than the medical records of
an unexposed subject to identify evidence of disease

• Bias on the part of study team members involved in
the classification of disease in a cohort study:
classification of disease may be influenced by
knowledge of the exposure status of the subject
Reducing bias

• Ensure that the study design is appropriate for

• Carefully define exposure and disease
• Choose valid measurement methods
• Train study personnel and standardize procedures
• Perform quality control on all aspects of data
collection and processing

• Make every effort to maximize participation rates and
to minimize losses to follow-up

• Apply study methods in the same manner and with
the same care to all study subjects, irrespective of
the group to which they belong
» Blind interviewers, abstractors, and other study staff involved in
data collection or exposure/disease classification to the subjects’
case-control status in case-control studies and exposure status in
cohort studies
» Blind study subjects and data collectors to the study hypothesis

• If it is possible to improve the quality of exposure
data in a case-control study in the case group or in
the control group, but not in both, the investigator
should resist the temptation to do so in order to
preserve the validity of the comparison of exposures
between cases and controls

• If it is possible to improve the quality of disease data
in a cohort study in the exposed group or in the
nonexposed comparison group, but not in both, the
investigator should resist the temptation to do so in
order to preserve the validity of the comparison of
disease outcome between the exposed and
nonexposed
Detection (surveillance) bias

Error due to persons with an exposure of interest
being under closer medical surveillance than persons
without the exposure, resulting in a higher probability
of detection of the disease of interest in exposed
persons than in nonexposed persons
Detection bias is a threat when:

• The disease has a high prevalence of asymptomatic
cases, and would thus be more likely to be diagnosed
in persons under close medical surveillance than in
persons not under medical surveillance

• The exposure of interest leads to frequent medical
checkups:
» A medical therapy
» A medical condition
» A harmful exposure
Detection bias in a case-control study: selection bias in
which selection of cases is related to the presence of the
exposure

Example: Case-control study of hormone replacement
therapy (HRT) use and breast cancer

• Women who use HRT are likely to have more medical
visits than women who do not

• They may be more likely to have a screening
mammography and have subclinical breast cancer
detected

• HRT would cause breast cancer to be detected, but not
to occur

The OR for the relationship between HRT and breast
cancer would be biased upward
Detection bias in a cohort study: information bias in which
exposed persons are under closer medical surveillance than
nonexposed persons

Example: Cohort study of statin use and prostate cancer

• Men who take statins have blood drawn periodically to
check their serum cholesterol and liver function

• They may be more likely to have a PSA test than men
not taking statins

• This would lead to a higher probability of diagnosis of
prostate cancer

• Statin use would cause prostate cancer to be detected,
but not to occur

The RR for the relationship between statin use and
prostate cancer would be biased upward
Detection bias: further observations

• In a cohort study, more likely to occur when disease
is ascertained through regular medical channels as
opposed to when all study subjects are examined for
disease using standardized methods (the same for
exposed and nonexposed subjects) by members of
the study team.

• When detection bias occurs, the disease tends to be
diagnosed in an early subclinical form in exposed
persons more often than in nonexposed persons
» The RR or OR for the association between the exposure and less
advanced disease is higher than the relative risk or odds ratio for
the association between the exposure and more advanced disease
Qualitatively assessing how biases in case-control studies
work

• In a case-control study, selection bias, information
bias resulting in differential misclassification, or
detection bias will lead to a biased distribution of
subjects in the 2x2 table that is differential between
cases and controls

Assess which cells will be over-represented under
various scenarios, as shown in the following slides
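
The direction of bias from over-representing a single cell can be checked numerically. The sketch below starts from the hypothetical case-control table used earlier (350, 500, 700, 4,500) and inflates one cell at a time; the inflation factor of 1.5 is an arbitrary value chosen for illustration.

```python
# a = exposed cases, b = exposed controls, c = nonexposed cases, d = nonexposed controls
BASE = {"a": 350, "b": 500, "c": 700, "d": 4_500}

def odds_ratio(t):
    """Cross-product OR = ad / bc."""
    return (t["a"] * t["d"]) / (t["b"] * t["c"])

def direction_when_inflated(cell, factor=1.5):
    """Direction of bias in the OR when one cell is over-represented by `factor`."""
    inflated = dict(BASE)
    inflated[cell] = BASE[cell] * factor
    return "upward" if odds_ratio(inflated) > odds_ratio(BASE) else "downward"

directions = {cell: direction_when_inflated(cell) for cell in BASE}
print(directions)
```

Cells in the numerator of the cross-product (a and d) bias the OR upward when over-represented; cells in the denominator (b and c) bias it downward.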
Over-representation of exposed cases

               Cases    Controls

Exposed          A         b
Non-exposed      c         d

OR = (Ad)/(bc)
OR is biased upward
Detection bias: HRT and breast cancer
Over-representation of nonexposed cases

               Cases    Controls

Exposed          a         b
Non-exposed      C         d

OR = (ad)/(bC)
OR is biased downward
Differential exposure misclassification:
Alcohol consumption and automobile accidents
Over-representation of exposed controls

               Cases    Controls

Exposed          a         B
Non-exposed      c         d

OR = (ad)/(Bc)
OR is biased downward
Selection (nonparticipation) bias: Poor housing and
lead poisoning
Over-representation of nonexposed controls

               Cases    Controls

Exposed          a         b
Non-exposed      c         D