UNM Health Sciences Center


    GME Evidence Based
     Medicine Course
         Module 4
    Randomized Controlled Trials
At the completion of module 4, residents should be able to:
• Describe the design of a Randomized Controlled Trial.
• Describe the importance of randomization and a control group in the
   RCT design
• Define blinding and list four levels at which studies are blinded.
• Define an "Intention to Treat" analysis and describe why it is important.
• Calculate and describe the meaning of the following:
     • Control Event Rate (CER) and Experimental Event Rate (EER)
     • Absolute Risk Reduction (ARR) and Number Needed to Treat (NNT)
     • Relative Risk (RR) and Relative Risk Reduction (RRR)

• Critically appraise an article about a Randomized Controlled Trial
       Definition & Characteristics
Randomized Controlled Trial: A study design in which subjects are
assigned at random to receive one of at least two different treatments so
that differences in outcomes between the different treatments can be
compared.

  Screening → Randomization → Treatment → Outcome assessment

 • Subjects are screened to make sure they meet study criteria.
 • Subjects are randomized to the experimental or control group.
      • By convention, the "experimental" group receives the newer or
          more novel treatment.
      • The "control" group(s) receive placebo or standard treatment.
 • After randomization, subjects receive the treatment appropriate for
   their group.
 • Outcomes are assessed following treatment.
      Definition & Characteristics

• Controlled. The control group allows differences in outcomes to be
  attributed to the differences in treatment types, rather than natural
  history or an effect that is solely determined by the passage of time.
• Randomized. Random assignment is meant to equally distribute
  prognostic factors (known, unknown, and uncontrollable) between the
  experimental and control groups.
• Blinding. Keeping knowledge of group allocation secret reduces the
  probability of the experimental and control groups being treated or
  assessed differently.
• Prospective design. Group assignment occurs prior to receipt of
  treatment, and receipt of treatment occurs prior to assessment of
  outcomes. This decreases the probability that knowledge of
  outcomes can selectively skew patient selection, study conduct, or
  handling of study information.
     Definition & Characteristics
Disadvantage: Low generalizability. Experimental conditions
are so tightly controlled that resemblance to real-world
conditions is low.

 • RCTs commonly over-estimate effectiveness.
 • RCTs may be better at ranking the efficacy of different
   treatments than at realistically estimating their
   effectiveness.
 • Costly and difficult to perform.
 • Not ethical or practical in many situations.
The critical appraisal of Randomized Controlled
Trials requires familiarity with the following six concepts:

  •   Blinding
  •   Volunteer Bias
  •   Crossover
  •   Intention to Treat analysis
  •   Dichotomization
  •   Dichotomized statistical measures
Blinding is the process of keeping group allocation secret from:
• The person performing group allocation
• The subject
• The person providing treatment
• The person assessing outcomes

"Double blind." This is an ambiguous and antiquated term that is best
avoided since it does not specify which of the four levels were blinded.
• Instead, explicitly list the roles that are blinded
• Nonetheless, you’ll still see the term used widely, whether out of habit,
   ignorance, or expectation.

Allocation concealment. A special name given to the blinding of the
person who allocates subjects to one of the study groups. Without
allocation concealment, the allocator might:
• Inadvertently un-blind the patient (or other roles).
• Influence the outcome of the randomization (such as allocating the
  sicker-appearing subjects to the experimental group).
                      Volunteer Bias
Volunteer bias: Reasons for dropping out of (or remaining in)
a study can bias the results of the study.
Examples of Volunteer bias:
• Sicker patients in the control group may drop out due to death,
  disability, or other morbidity. May cause an effective treatment to
  appear ineffective.
• Patients in the experimental group who improve may recover and lose
  interest in the study. Causes an effective treatment to appear
  less effective than it really is.
• Patients in the experimental group with significant side effects may drop
  out. Only those with a robust improvement remain, causing treatment
  to appear more effective than it really is.
• Actual reasons for dropouts can’t be known in advance (otherwise, you
  probably wouldn’t need to do the study to begin with), so the way
  dropout will influence results can’t necessarily be predicted.
In many studies, when a subject is doing poorly on one
treatment, clinical needs prevail and it is not uncommon for
the clinician to place the patient on the alternative treatment
for clinical purposes.

• This may unfairly “load” the alternative treatment with
  patients experiencing poor outcomes. This may cause the
  alternative treatment to appear less efficacious than it really is.
• It may also leave only the patients who are doing well in the
  first treatment group. This may cause the first treatment to
  appear more efficacious than it really is.
• Taken together, this would increase the apparent difference
  in efficacy between the two groups.
         Intention to Treat Analysis
An Intention to Treat Analysis is one way of controlling for
Volunteer bias and Crossover effects.
• "Once randomized, always analyzed."
• To keep dropout or crossover from biasing study results, the
  data from each subject must be analyzed according to the group the
  subject was randomized to, not the group the subject was treated in.
• When subjects have dropped out, their data must be conservatively
  estimated using one of several methods:
    1) Substitute the worst value for the entire study for any missing
       values
    2) Substitute the worst value recorded for that subject for any missing
       values for that subject
    3) "Last value carried forward": Substitute the subject's last known
       measurement for all subsequent, missing measurements.

Any type of estimation will reduce the validity of a study, so it's
important to keep dropout rates low.
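
As a minimal sketch of the third method, "last value carried forward" could look like the code below. The function name `locf` and the pain scores are hypothetical and chosen only for illustration; missing measurements are represented as `None`.

```python
from typing import List, Optional


def locf(measurements: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) with the most recent known value."""
    filled: List[Optional[float]] = []
    last_known: Optional[float] = None
    for value in measurements:
        if value is not None:
            last_known = value
        # Values missing before the first real measurement stay None.
        filled.append(last_known)
    return filled


# Hypothetical pain scores at five visits; the subject dropped out after visit 3.
visits = [7.0, 6.0, 5.0, None, None]
print(locf(visits))  # [7.0, 6.0, 5.0, 5.0, 5.0]
```

Note that the filled-in values assume the subject stopped changing after dropping out, which is one reason every estimated value reduces validity.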
Common analyses of RCT data require outcomes to
be classified as either "present" or "absent."

• Some outcomes such as death, stroke, or pregnancy are
  naturally dichotomous (present or not).
• Sometimes events are dichotomized as a "success" or a
  "failure" instead of "present" and "absent."
• Other outcomes that are naturally continuous (e.g., length
  of hospital stay, blood pressure, pain score) can be
  dichotomized by the selection of a “cutoff” score that
  separates successes from failures.
• A "two by two" table often summarizes results
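
The cutoff idea can be sketched in a few lines. This is an illustration only: the pain scores, the cutoff of 3, and the rule that a score at or below the cutoff counts as a "success" are all assumptions, not values from any study.

```python
from typing import Dict, List


def dichotomize(scores: List[float], cutoff: float) -> List[bool]:
    """Return True ("success") for every score at or below the cutoff."""
    return [score <= cutoff for score in scores]


def count_outcomes(successes: List[bool]) -> Dict[str, int]:
    """Count successes and failures, as needed for one column of a 2x2 table."""
    n_success = sum(successes)
    return {"success": n_success, "failure": len(successes) - n_success}


# Hypothetical pain scores (0-10 scale) for five subjects in one group.
pain_scores = [2, 5, 3, 8, 1]
outcomes = dichotomize(pain_scores, cutoff=3)
print(outcomes)                  # [True, False, True, False, True]
print(count_outcomes(outcomes))  # {'success': 3, 'failure': 2}
```

Moving the cutoff changes which subjects count as successes, so the cutoff should be chosen before the data are examined.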
 Example: In a study by Rouse et al.*, women were administered
 magnesium sulfate a few hours prior to delivery of pre-term (24-31 week)
 infants. Infants were evaluated for cerebral palsy at 2 years of age. Results
 can be summarized in the following table:

                         Experimental      Control
                            Group           Group
                           (MgSO4)        (placebo)     Row totals:
 Cerebral palsy (+)           20              38              58
 Cerebral palsy (-)         1021            1057            2078
 Column totals:             1041            1095            2136

*Rouse DJ, Hirtz DG, Thom E, et al. A randomized, controlled trial of magnesium sulfate for the prevention of cerebral palsy.
NEJM 2008. 359(9):895-905.
            Statistical Measures
Once outcomes have been dichotomized, six statistical
measures are used to describe the treatment effect:

• Control Event Rate (CER) & Experimental Event Rate (EER)
• Absolute Risk Reduction (ARR) & Number Needed to
  Treat (NNT)
• Relative Risk (RR) & Relative Risk Reduction (RRR)
                CER & EER
Control Event Rate and Experimental Event Rate, using the counts from the
two by two table (20 of 1041 MgSO4 infants and 38 of 1095 placebo infants
developed cerebral palsy):

CER = (number of events in ctrl group) / (number of subjects in ctrl group)
    = 38 / 1095 = 0.035 = 3.5%

EER = (number of events in exp group) / (number of subjects in exp group)
    = 20 / 1041 = 0.019 = 1.9%

Interpretation:
• CER: 3.5% of the infants born in the control group (without MgSO4)
  developed moderate to severe cerebral palsy.
• EER: 1.9% of the infants born in the experimental group (with MgSO4)
  developed moderate to severe cerebral palsy.
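
The two event rates can be checked with a short calculation. This is a sketch: the helper name `event_rate` is hypothetical, and the denominators are the column totals from the two by two table (1095 control subjects, 1041 experimental subjects).

```python
def event_rate(events: int, subjects: int) -> float:
    """Proportion of subjects in a group who experienced the event."""
    if subjects <= 0:
        raise ValueError("a group must contain at least one subject")
    return events / subjects


cer = event_rate(events=38, subjects=1095)  # control (placebo) group
eer = event_rate(events=20, subjects=1041)  # experimental (MgSO4) group

print(f"CER = {cer:.1%}")  # CER = 3.5%
print(f"EER = {eer:.1%}")  # EER = 1.9%
```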
                ARR & NNT
Absolute Risk Reduction and Number Needed to Treat

ARR = CER – EER = 3.5% – 1.9% = 1.6%

NNT = 1 / ARR = 1 / 0.016 = 62.5

Interpretation:
• ARR: Use of MgSO4 reduced the rate of moderate to severe cerebral palsy
  by 1.6 percentage points.
• NNT: Between 62 and 63 pre-term mothers need to be treated in order to
  avoid one additional case of moderate to severe cerebral palsy.

Notice how the units for CER & EER are "events per person" and the units for NNT are
"people per event." This matches the intuitive interpretation of ARR & NNT.
                RR & RRR
Relative Risk and Relative Risk Reduction

RR = EER / CER = 0.019 / 0.035 = 0.54 = 54%

RRR = ARR / CER = 0.016 / 0.035 = 0.46 = 46%

Interpretation:
• RR: Use of MgSO4 reduced the risk of moderate to severe CP to 54% of its
  original value.
• RRR: Use of MgSO4 reduced the risk of moderate to severe CP by 46% of its
  original value.

RR represents the fraction of the original risk that remains with MgSO4.
RRR represents the fraction of the original risk that is removed with MgSO4.
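
The four derived measures can be computed together from the two by two counts. The sketch below uses hypothetical names (`TreatmentEffect`, `treatment_effect`) and deliberately does not round intermediate values. The slides round CER, EER, and ARR before dividing, so the unrounded figures differ slightly from 62.5, 0.54, and 0.46.

```python
from dataclasses import dataclass


@dataclass
class TreatmentEffect:
    cer: float  # control event rate
    eer: float  # experimental event rate
    arr: float  # absolute risk reduction
    nnt: float  # number needed to treat
    rr: float   # relative risk
    rrr: float  # relative risk reduction


def treatment_effect(ctrl_events: int, ctrl_total: int,
                     exp_events: int, exp_total: int) -> TreatmentEffect:
    """Derive ARR, NNT, RR, and RRR from dichotomized outcome counts."""
    cer = ctrl_events / ctrl_total
    eer = exp_events / exp_total
    arr = cer - eer
    if arr == 0:
        raise ValueError("ARR is zero, so NNT is undefined")
    return TreatmentEffect(cer=cer, eer=eer, arr=arr,
                           nnt=1 / arr, rr=eer / cer, rrr=arr / cer)


# Counts from the Rouse et al. table: 38/1095 (placebo), 20/1041 (MgSO4).
effect = treatment_effect(38, 1095, 20, 1041)
print(f"ARR = {effect.arr:.1%}")  # ARR = 1.5%
print(f"NNT = {effect.nnt:.1f}")  # NNT = 64.6
print(f"RR  = {effect.rr:.2f}")   # RR  = 0.55
print(f"RRR = {effect.rrr:.2f}")  # RRR = 0.45
```

Because RR is the fraction of risk that remains and RRR the fraction removed, the two always sum to 1.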
                Critical Appraisal
The critical appraisal of an article is the third step of
the Evidence Based Medicine process:

 1. Formulate a focused clinical question (PICO question)
 2. Search the literature for the highest level of evidence
 3. Critically appraise the article
 4. Apply the evidence to a particular patient
                Critical Appraisal
The critical appraisal includes three major questions:

1. How valid are the study results likely to be?
2. What are the results?
3. How can the results be applied to a particular patient?
               Validity Questions
There are five sub-questions that address study validity:

1. Were subjects randomized?
2. Were experimental and control groups similar?
3. Who was blinded (allocators, subjects, treatment providers,
   outcome assessors)?
4. Was an Intention to Treat Analysis performed?
5. How complete was follow up?
               Validity Questions
1. Were subjects randomized?
• Methods of randomization should be explicitly stated.
• Ambiguities in the randomization process might allow the
  allocator to influence group assignment.
• Lack of allocation concealment can un-blind participants.

In the hypothermia study, see the last paragraph of the
   "Methods" section.
• Subjects were randomized using a telephone based system.
• "Blocking" refers to the practice of including equal numbers of
   experimental and control group assignments in "blocks" of a
   certain size. This ensures no site gets a disproportionate
   number of experimental or control subjects.
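
One way blocking could work is sketched below. It is not the procedure used in the hypothermia study: the block size of 4, the 1:1 allocation, and the function name `block_randomize` are assumptions for illustration. In a multi-site trial, a separate schedule like this would be generated for each site.

```python
import random
from typing import List, Optional


def block_randomize(n_subjects: int, block_size: int = 4,
                    seed: Optional[int] = None) -> List[str]:
    """Build an allocation list in which every full block is balanced 1:1."""
    if block_size <= 0 or block_size % 2 != 0:
        raise ValueError("block_size must be a positive even number")
    rng = random.Random(seed)
    assignments: List[str] = []
    while len(assignments) < n_subjects:
        half = block_size // 2
        block = ["experimental"] * half + ["control"] * half
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    # The final block may be cut short if n_subjects is not a multiple
    # of block_size, so only complete blocks are guaranteed to balance.
    return assignments[:n_subjects]


schedule = block_randomize(n_subjects=10, block_size=4, seed=42)
for number, group in enumerate(schedule, start=1):
    print(number, group)
```

With small fixed blocks, someone who knows the block size can sometimes predict the last assignment in a block, which is one reason allocation concealment still matters.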
              Validity Questions
2. How similar were the experimental and control groups?

 • If randomization was effective, each group of subjects
   should begin the study with similar prognostic factors.
 • See Table 1 of the hypothermia article. While there were
   some differences in CT findings and other types of
   injuries, in general both groups were relatively similar.
             Validity Questions
3. Who (group allocators, subjects, clinicians, and
   outcome assessors) was blinded?
• Blinding minimizes different types of bias that can
  influence how subjects are treated or assessed based on
  knowledge of which treatment they are receiving.

• See the Methods section.
• The physician allocator was blinded.
• It was not possible to blind the treatment providers.
• Blinding of subjects was not mentioned, although given
  the type of injury, it would not seem to be relevant.
• Outcome assessors were blinded ("without knowledge of
  the treatment assignments.")
                  Validity Questions
4. Was an intention to treat analysis performed?
• An ITT analysis is meant to avoid bias due to loss to follow up or
  crossover that could potentially load the experimental or control group
  with subjects who are doing particularly well or particularly poorly.
• The ITT analysis is not an ideal solution. It requires values to be
  estimated which reduces the validity of the study.
• See the third paragraph of the "Statistical analysis" section and the first
  paragraph of the "Study outcomes" section.
• The authors report performing an intention to treat analysis (yielding
  more unfavorable outcomes in the hypothermia group), yet the reported
  Relative Risk and confidence interval for the primary outcome (see Table
  3) were based only on subjects who completed the study.
• Authors commonly report performing an intention to treat analysis, yet
  focus on the "on protocol" analysis when reporting outcomes, especially
  if the ITT analysis does not change the overall study results
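
The effect of grouping by randomized assignment rather than by treatment received can be illustrated with made-up data. Everything in this sketch is hypothetical: ten subjects, two treatments "A" and "B", and two subjects randomized to A who did poorly and were switched to B.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Subject(NamedTuple):
    randomized_to: str  # group assigned at randomization
    treated_with: str   # treatment actually received by the end of the study
    poor_outcome: bool


def poor_outcome_rates(subjects: List[Subject], by: str) -> Dict[str, float]:
    """Rate of poor outcomes per group, grouping on the named Subject field."""
    events: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for subject in subjects:
        group = getattr(subject, by)
        totals[group] += 1
        events[group] += int(subject.poor_outcome)
    return {group: events[group] / totals[group] for group in totals}


subjects = (
    [Subject("A", "A", False)] * 3   # stayed on A, did well
    + [Subject("A", "B", True)] * 2  # did poorly on A, crossed over to B
    + [Subject("B", "B", True)] * 2
    + [Subject("B", "B", False)] * 3
)

itt = poor_outcome_rates(subjects, by="randomized_to")
as_treated = poor_outcome_rates(subjects, by="treated_with")
print({g: round(r, 2) for g, r in itt.items()})         # {'A': 0.4, 'B': 0.4}
print({g: round(r, 2) for g, r in as_treated.items()})  # {'A': 0.0, 'B': 0.57}
```

In this toy data set the two treatments perform identically under the intention to treat grouping, while grouping by treatment received makes A look far better than B.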
               Validity Questions
5. How complete was follow up?
• In order to perform an Intention to Treat analysis, missing
  values must be conservatively estimated.
• Every estimated value decreases the validity of the study.
• Even if an ITT analysis was performed, a large loss to follow
  up rate would still cause concern about the study’s validity.
• See the first paragraph of the "Study outcomes" section.
• At six months, 9% of the subjects were lost to follow up.
• While this is not an unusual loss rate for a long-term study, a
  loss of nearly 1 patient in 10 might be expected to change
  the outcome, based on the type of estimation done.
• See the first paragraph of the "Study Outcomes" section to
  see how different assumptions changed the outcome
  analysis of this study.
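
How much the assumptions about lost subjects matter can be explored with a simple best-case and worst-case calculation. The counts below are hypothetical (they are not taken from the hypothermia study), and this bounding approach is only one of several ways to handle missing outcomes.

```python
def relative_risk(exp_events: int, exp_total: int,
                  ctrl_events: int, ctrl_total: int) -> float:
    """Relative risk of the event: experimental rate over control rate."""
    return (exp_events / exp_total) / (ctrl_events / ctrl_total)


# Hypothetical trial: 100 subjects randomized per group, 10 lost in each.
exp_events, exp_completed, exp_lost = 30, 90, 10
ctrl_events, ctrl_completed, ctrl_lost = 20, 90, 10

# Analysis restricted to subjects who completed the study.
completers = relative_risk(exp_events, exp_completed,
                           ctrl_events, ctrl_completed)

# Worst case for the experimental treatment: every lost experimental
# subject had the event and no lost control subject did.
worst = relative_risk(exp_events + exp_lost, exp_completed + exp_lost,
                      ctrl_events, ctrl_completed + ctrl_lost)

# Best case for the experimental treatment: the reverse assumption.
best = relative_risk(exp_events, exp_completed + exp_lost,
                     ctrl_events + ctrl_lost, ctrl_completed + ctrl_lost)

print(f"completers only: RR = {completers:.2f}")  # completers only: RR = 1.50
print(f"worst case:      RR = {worst:.2f}")       # worst case:      RR = 2.00
print(f"best case:       RR = {best:.2f}")        # best case:       RR = 1.00
```

With 10% of each group lost, the plausible RR in this example ranges from no effect to a doubling of risk.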
            Results Questions
Two sub-questions address the results of the study:

1. How large was the point estimate of the treatment effect?
2. How precise was the confidence interval for the
   treatment effect?
               Results Questions
1. How large was the point estimate of the treatment effect?
• This question deals with both statistical significance and
  clinical significance.
• If results are not statistically significant, then the question
  of clinical significance is not relevant.
• If results are statistically significant, then it is important
  that the treatment effect be clinically significant as well.
• See the primary outcomes in Table 3.
• The point estimate for RR was 1.41. This indicates that
  the hypothermia group was 1.41 times as likely to
  have an unfavorable outcome (death, persistent
  vegetative state, or severe disability) as the
  normothermia group.
                Results Questions
2. How wide or narrow is the confidence interval?
The width of the confidence interval is important for 2 reasons:
   1) Even if there is a treatment effect, a wide CI is more likely
      to cross the “zero effect” line than a narrow one.
   2) A narrow confidence interval has less uncertainty in the
      actual value of the treatment effect than a wide one.
• In the hypothermia study, the reported 95% confidence
  interval is 0.89-2.22.
• Since this confidence interval includes 1, the RR for the entire
  population might be 1, so this result is not statistically significant.
• Compared to similar studies, this confidence interval is
  average in width.
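
A 95% confidence interval for a relative risk can be approximated from the two by two counts. The sketch below uses the common log-transform (large-sample) method, which is an assumption on my part and not necessarily the method used in either article. Because the hypothermia counts are not reproduced in this module, the worked numbers come from the MgSO4 table, and the function names are hypothetical.

```python
import math
from typing import Tuple


def rr_confidence_interval(exp_events: int, exp_total: int,
                           ctrl_events: int, ctrl_total: int,
                           z: float = 1.96) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) using the log-transform approximation."""
    rr = (exp_events / exp_total) / (ctrl_events / ctrl_total)
    # Standard error of ln(RR) for two independent groups.
    se_log_rr = math.sqrt(1 / exp_events - 1 / exp_total
                          + 1 / ctrl_events - 1 / ctrl_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def includes_no_effect(lower: float, upper: float) -> bool:
    """For a ratio such as RR, "no effect" corresponds to a value of 1."""
    return lower <= 1.0 <= upper


rr, lower, upper = rr_confidence_interval(20, 1041, 38, 1095)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
# RR = 0.55, 95% CI 0.32 to 0.95
print(includes_no_effect(lower, upper))  # False

# The interval reported for the hypothermia study includes 1.
print(includes_no_effect(0.89, 2.22))    # True
```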
   Patient Applicability Questions
These questions help decide how to apply the results
of the study to a particular patient:

1. How similar were the study patients to my patient?
2. Were all clinically relevant outcomes considered?
3. Do the benefits outweigh the costs and potential for harm?
              Patient Applicability
1. How similar were the study patients to my patient?
• Clinical judgment must determine what constitutes a
  relevant difference or how to interpret the study’s findings in
  light of a difference.
• See the first part of the "Methods" section. Subjects were
  between 1 and 17 years old, with clinical and CT evidence
  of acute (<8 hrs) TBI requiring mechanical ventilation.
• Consider the location of the study. This study took place at
  17 centers in Canada, France, and the UK.
• Also consider your practice and the treatment options
  available to you.
• The "Limitations" section describes some patient
  characteristics, such as acuity of injury and duration of
  hypothermia therapy, that could possibly influence the results.
             Patient Applicability
2. Were all clinically relevant outcomes considered?
• Consider three types of outcomes: Clinically beneficial
  outcomes; Side effects and adverse events; and social
  factors such as cost, convenience, and expertise needed.

• Table 3 shows the primary and secondary outcomes
  considered in this study. Numerous outcomes were assessed.

• Cost was not included in the analysis, although this is not
  too important since the article concludes that the treatment
  itself does not improve outcomes.
           Patient Applicability
3. Do the benefits outweigh the costs and potential
for harm?
• This question can be quite subjective and must be
  considered in light of an actual patient.
• Factors such as effect size, relative risk, NNT,
  potential side effects, and cost need to be weighed
  against patient preference, convenience, and
• The last paragraph of the paper concludes that
  hypothermia therapy does not appear to be