Randomized Controlled Trials
GME Evidence Based Medicine Course
Module 4: Randomized Controlled Trials

Objectives
At the completion of module 4, residents should be able to:
• Describe the design of a Randomized Controlled Trial.
• Describe the importance of randomization and a control group in the RCT design.
• Define blinding and list four levels at which studies are blinded.
• Define an "Intention to Treat" analysis and describe why it is important.
• Calculate and be able to describe the meaning of the following parameters:
  • Control Event Rate (CER) and Experimental Event Rate (EER)
  • Absolute Risk Reduction (ARR) & Number Needed to Treat (NNT)
  • Relative Risk (RR) and Relative Risk Reduction (RRR)
• Critically appraise an article about a Randomized Controlled Trial.

Definition & Characteristics
Randomized Controlled Trial: A study design in which subjects are assigned at random to receive one of at least two different treatments so that differences in outcomes between the different treatments can be estimated.

Screening → Randomization → Treatment → Outcome assessment
• Subjects are screened to make sure they meet study criteria.
• Subjects are randomized to the experimental or control group.
  • By convention, the "experimental" group receives the newer or more novel treatment.
  • The "control" group(s) receive placebo or standard treatment.
• After randomization, subjects receive the treatment appropriate for their group.
• Outcomes are assessed following treatment.

Characteristics:
• Controlled. The control group allows differences in outcomes to be attributed to the differences in treatment types, rather than natural history or an effect that is solely determined by the passage of time.
• Randomized. Random assignment is meant to equally distribute prognostic factors (known, unknown, and uncontrollable) between the experimental and control groups.
• Blinding. Keeping knowledge of group allocation secret reduces the probability of the experimental and control groups being treated or assessed differently.
• Prospective design. Group assignment occurs prior to receipt of treatment, and receipt of treatment occurs prior to assessment of outcomes. This decreases the probability that knowledge of outcomes can selectively skew patient selection, study conduct, or handling of study information.

Disadvantage: Low generalizability. Experimental conditions are so tightly controlled that resemblance to real-world conditions is low.
• RCTs commonly over-estimate effectiveness.
• RCTs may be better at ranking the efficacy of different treatments than at realistically estimating their effectiveness.
• Costly and difficult to perform.
• Not ethical or practical in many situations.

Concepts
The critical appraisal of Randomized Controlled Trials requires familiarity with the following six concepts:
• Blinding
• Volunteer Bias
• Crossover
• Intention to Treat analysis
• Dichotomization
• Dichotomized statistical measures

Blinding
Blinding is the process of keeping group allocation secret from:
• The person performing group allocation
• The subject
• The person providing treatment
• The person assessing outcomes

"Double blind." This is an ambiguous and antiquated term that is best avoided since it does not specify which of the four levels were blinded.
• Instead, explicitly list the roles that are blinded.
• Nonetheless, you'll still see the term used widely, whether out of habit, ignorance, or expectations.

Allocation concealment. A special name given to the blinding of the person who allocates subjects to one of the study groups. Without allocation concealment, the allocator might:
• Inadvertently un-blind the patient (or other roles).
• Influence the outcome of the randomization (such as allocating the sicker-appearing subjects to the experimental group).

Volunteer Bias
Volunteer bias: Reasons for dropping out of (or remaining in) a study can bias the results of the study.
Examples of Volunteer bias:
• Sicker patients in the control group may drop out due to death, disability, or other morbidity. This may cause an effective treatment to appear ineffective.
• Patients in the experimental group who improve may recover and lose interest in the study. This causes an effective treatment to appear ineffective.
• Patients in the experimental group with significant side effects may drop out. Only those with a robust improvement remain, causing the treatment to appear more effective than it really is.
• Actual reasons for dropouts can't be known in advance (otherwise, you probably wouldn't need to do the study to begin with), so the effect of dropout on the results cannot necessarily be predicted.

Crossover
In many studies, when a subject is doing poorly on one treatment, clinical needs prevail and it is not uncommon for the clinician to place the patient on the alternative treatment for clinical purposes.
• This may unfairly "load" the alternative treatment with patients experiencing poor outcomes. This may cause the alternative treatment to appear less efficacious than it really is.
• It may also leave only the patients who are doing well in the first treatment group. This may cause the first treatment to appear more efficacious than it really is.
• Taken together, this would increase the apparent difference in efficacy between the two groups.

Intention to Treat Analysis
An Intention to Treat Analysis is one way of controlling for Volunteer bias and Crossover effects.
• "Once randomized, always analyzed."
• In order to prevent dropout or crossover from biasing study results, the data from each subject must be analyzed according to the group the subject was randomized to, not the group the subject was treated in.
• When subjects have dropped out, their data must be conservatively estimated using one of several methods:
  1) Substitute the worst value observed in the entire study for any missing values.
  2) Substitute the worst value recorded for that subject for any of that subject's missing values.
  3) "Last value carried forward": Substitute the subject's last known measurement for all subsequent, missing measurements.
Any type of estimation will reduce the validity of a study, so it's important to keep dropout rates low.

Dichotomization
Common analyses of RCT data require outcomes to be classified as either "present" or "absent."
• Some outcomes such as death, stroke, or pregnancy are naturally dichotomous (present or not).
• Sometimes events are dichotomized as a "success" or a "failure" instead of "present" and "absent."
• Other outcomes that are naturally continuous (e.g., length of hospital stay, blood pressure, pain score) can be dichotomized by the selection of a "cutoff" score that separates successes from failures.
• A "two by two" table often summarizes results.

Dichotomization
Example: In a study by Rouse et al.*, women were administered magnesium sulfate a few hours prior to delivery of pre-term (24-31 week) infants. Infants were evaluated for cerebral palsy at 2 years of age. Results can be summarized in the following table:

                     Experimental group   Control group   Row totals
                     (MgSO4)              (placebo)
Cerebral palsy (+)   20                   38              58
Cerebral palsy (-)   1021                 1057            2078
Column totals        1041                 1095            2136

*Rouse DJ, Hirtz DG, Thom E, et al. A randomized, controlled trial of magnesium sulfate for the prevention of cerebral palsy. NEJM 2008. 359(9):895-905.
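
The cutoff-based dichotomization described above can be sketched in a few lines of Python. This is only an illustration: the pain scores, the cutoff of 5, and the helper names `dichotomize` and `two_by_two` are hypothetical and do not come from any study cited in this module.

```python
from collections import Counter

def dichotomize(value, cutoff):
    """Classify a continuous outcome: True means the event is 'present'
    (here, a score above the cutoff); False means 'absent'."""
    return value > cutoff

def two_by_two(records, cutoff):
    """Build a 2 x 2 table from (group, value) pairs.

    group is 'exp' or 'ctrl'; value is the continuous outcome.
    """
    counts = Counter()
    for group, value in records:
        counts[(group, dichotomize(value, cutoff))] += 1
    return {
        group: {"event": counts[(group, True)], "no_event": counts[(group, False)]}
        for group in ("exp", "ctrl")
    }

# Hypothetical pain scores (0-10) for six subjects; cutoff chosen arbitrarily.
records = [("exp", 2), ("exp", 7), ("exp", 3),
           ("ctrl", 8), ("ctrl", 6), ("ctrl", 1)]
table = two_by_two(records, cutoff=5)
print(table)
```

Note that the choice of cutoff is itself a judgment call: moving it changes which subjects count as events and therefore changes every measure computed from the table.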

Statistical Measures
Once outcomes have been dichotomized, six statistical measures are used to describe the treatment effect:
• Control Event Rate (CER) & Experimental Event Rate (EER)
• Absolute Risk Reduction (ARR) & Number Needed to Treat (NNT)
• Relative Risk (RR) & Relative Risk Reduction (RRR)
The worked examples use the counts from the magnesium sulfate trial: cerebral palsy (CP) developed in 20 of the 1041 infants in the experimental (MgSO4) group and in 38 of the 1095 infants in the control (placebo) group.

CER & EER: Control Event Rate and Experimental Event Rate
$$\text{CER} = \frac{\text{number of events in ctrl group}}{\text{number of subjects in ctrl group}} = \frac{38}{1095} = 0.035 = 3.5\%$$
Interpretation: 3.5% of the infants born in the control group (without MgSO4) developed moderate to severe cerebral palsy.
$$\text{EER} = \frac{\text{number of events in exp group}}{\text{number of subjects in exp group}} = \frac{20}{1041} = 0.019 = 1.9\%$$
Interpretation: 1.9% of the infants born in the experimental group (with MgSO4) developed moderate to severe cerebral palsy.

ARR & NNT: Absolute Risk Reduction and Number Needed to Treat
$$\text{ARR} = \text{CER} - \text{EER} = 3.5\% - 1.9\% = 1.6\%$$
Interpretation: Use of MgSO4 reduced the rate of moderate to severe cerebral palsy by 1.6 percentage points.
$$\text{NNT} = \frac{1}{\text{ARR}} = \frac{1}{0.016} = 62.5$$
Interpretation: Between 62 and 63 pre-term mothers need to be treated in order to avoid one additional case of moderate to severe cerebral palsy.
Notice how the units for CER & EER are "events per person" and the units for NNT are "people per event." This matches the intuitive interpretation of ARR & NNT.

RR & RRR: Relative Risk and Relative Risk Reduction
$$\text{RR} = \frac{\text{EER}}{\text{CER}} = \frac{0.019}{0.035} = 0.54 = 54\%$$
Interpretation: Use of MgSO4 reduced the risk of moderate to severe CP to 54% of its original value.
$$\text{RRR} = \frac{\text{ARR}}{\text{CER}} = \frac{0.016}{0.035} = 0.46 = 46\%$$
Interpretation: Use of MgSO4 reduced the risk of moderate to severe CP by 46% of its original value.
RR represents the fraction of the original risk that remains with MgSO4. RRR represents the fraction of the original risk that is removed with MgSO4.

Critical Appraisal
The critical appraisal of an article is the third step of the Evidence Based Medicine process:
1. Formulate a focused clinical question (PICO question)
2. Search the literature for the highest level of evidence available
3. Critically appraise the article
4. Apply the evidence to a particular patient

Critical Appraisal
The critical appraisal includes three major questions:
1. How valid are the study results likely to be?
2. What are the results?
3. How can the results be applied to a particular patient?

Validity Questions
There are five sub-questions that address study validity:
1. Were subjects randomized?
2. Were experimental and control groups similar?
3. Who was blinded (allocators, subjects, treatment providers, outcome assessors)?
4. Was an Intention to Treat Analysis performed?
5. How complete was follow up?

Validity Questions
1. Were subjects randomized?
• Methods of randomization should be explicitly stated.
• Ambiguities in the randomization process might allow the allocator to influence group assignment.
• Lack of allocation concealment can un-blind participants.
In the hypothermia study, see the last paragraph of the "Methods" section.
• Subjects were randomized using a telephone based system.
• "Blocking" refers to the practice of including equal numbers of experimental and control group assignments in "blocks" of a certain size. This ensures no site gets a disproportionate number of experimental or control subjects.

Validity Questions
2. How similar were the experimental and control groups?
• If randomization was effective, each group of subjects should begin the study with similar prognostic factors.
• See Table 1 of the hypothermia article. While there were some differences in CT findings and other types of injuries, in general both groups were relatively similar.
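
To make the idea of "blocking" from validity question 1 concrete, here is a minimal Python sketch of permuted-block randomization. It is an illustration under assumptions, not the procedure used in the hypothermia study: the block size of 4, the seed, and the function name `block_randomize` are all hypothetical.

```python
import random

def block_randomize(n_subjects, block_size=4,
                    arms=("experimental", "control"), seed=None):
    """Return a list of arm assignments built from shuffled blocks.

    Each complete block contains equal numbers of every arm, so the
    groups can never drift far out of balance within a site.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)          # seeded only so the example is repeatable
    per_arm = block_size // len(arms)
    assignments = []
    while len(assignments) < n_subjects:
        block = list(arms) * per_arm   # e.g. 2 experimental + 2 control
        rng.shuffle(block)             # random order within the block
        assignments.extend(block)
    return assignments[:n_subjects]

schedule = block_randomize(12, block_size=4, seed=2024)
for i in range(0, len(schedule), 4):
    print(schedule[i:i + 4])
```

In a real trial the schedule would be generated and held away from the clinical site (for example, behind a telephone-based system), since an allocator who can see the upcoming assignments no longer has allocation concealment.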

Validity Questions
3. Who (group allocators, subjects, clinicians, and outcome assessors) was blinded?
• Blinding minimizes different types of bias that can influence how subjects are treated or assessed based on knowledge of which treatment they are receiving.
• See the Methods section.
  • The physician allocator was blinded.
  • It was not possible to blind the treatment providers.
  • Blinding of subjects was not mentioned, although given the type of injury, it would not seem to be relevant.
  • Outcome assessors were blinded ("without knowledge of the treatment assignments").

Validity Questions
4. Was an intention to treat analysis performed?
• An ITT analysis is meant to avoid bias due to loss to follow up or crossover that could potentially load the experimental or control group with subjects who are doing particularly well or particularly poorly.
• The ITT analysis is not an ideal solution. It requires values to be estimated, which reduces the validity of the study.
• See the third paragraph of the "Statistical analysis" section and the first paragraph of the "Study outcomes" section.
• The authors report performing an intention to treat analysis (yielding more unfavorable outcomes in the hypothermia group), yet the reported Relative Risk and confidence interval for the primary outcome (see Table 3) were based only on subjects who completed the study.
• Authors commonly report performing an intention to treat analysis, yet focus on the "on protocol" analysis when reporting outcomes, especially if the ITT analysis does not change the overall study results.

Validity Questions
5. How complete was follow up?
• In order to perform an Intention to Treat analysis, missing values must be conservatively estimated.
• Every estimated value decreases the validity of the study.
• Even if an ITT analysis was performed, a large loss to follow up rate would still cause concern about the study's validity.
• See the first paragraph of the "Study outcomes" section.
• At six months, 9% of the subjects were lost to follow up.
• While this is not an unusual loss rate for a long-term study, a loss of nearly 1 patient in 10 might be expected to change the outcome, based on the type of estimation done.
• See the first paragraph of the "Study Outcomes" section to see how different assumptions changed the outcome analysis of this study.

Results Questions
Two sub-questions address the results of the study:
1. How large was the point estimate of the treatment effect?
2. How precise was the confidence interval for the treatment effect?

Results Questions
1. How large was the point estimate of the treatment effect?
• This question deals with both statistical significance and clinical significance.
• If results are not statistically significant, then the question of clinical significance is not relevant.
• If results are statistically significant, then it is important that the treatment effect be clinically significant as well.
• See the primary outcomes in Table 3.
• The point estimate for RR was 1.41. This indicates that the hypothermia group was 1.41 times more likely to have an unfavorable outcome (death, persistent vegetative state, or severe disability) than the normothermia group.

Results Questions
2. How wide or narrow is the confidence interval?
The width of the confidence interval is important for 2 reasons:
1) Even if there is a treatment effect, a wide CI is more likely to cross the "zero effect" line than a narrow one.
2) A narrow confidence interval has less uncertainty in the actual value of the treatment effect than a wide one.
• In the hypothermia study, the reported 95% confidence interval is 0.89-2.22.
• Since this confidence interval includes 1, the RR for the entire population might be 1, so this result is not statistically significant.
• Compared to similar studies, this confidence interval is average in width.
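
The "does the interval include 1?" check can be expressed in code. The sketch below also shows one common way to approximate a 95% confidence interval for a relative risk, the log-transformation method. That method is an assumption on our part: this module does not say how either cited paper computed its interval. Because the hypothermia paper's raw counts are not reproduced here, the calculation is run on the magnesium sulfate counts from the dichotomization example. The function names are hypothetical.

```python
import math

def relative_risk_ci(events_exp, n_exp, events_ctrl, n_ctrl, z=1.96):
    """Point estimate and approximate CI for RR (log-transformation method).

    z=1.96 corresponds to a 95% interval under a normal approximation.
    """
    rr = (events_exp / n_exp) / (events_ctrl / n_ctrl)
    # Standard error of ln(RR)
    se_log_rr = math.sqrt(1 / events_exp - 1 / n_exp
                          + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

def includes_no_effect(lower, upper, null_value=1.0):
    """For a ratio such as RR, 'no effect' is 1 (not 0)."""
    return lower <= null_value <= upper

# Magnesium sulfate trial counts: 20/1041 (MgSO4) vs 38/1095 (placebo).
rr, lower, upper = relative_risk_ci(20, 1041, 38, 1095)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")

# Interval reported for the hypothermia study's primary outcome.
print(includes_no_effect(0.89, 2.22))  # True: not statistically significant
```

For ratio measures the "zero effect" line sits at 1, whereas for difference measures such as ARR it sits at 0.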

Patient Applicability Questions
These questions help decide how to apply the results of the study to a particular patient:
1. How similar were the study patients to my patient?
2. Were all clinically relevant outcomes considered?
3. Do the benefits outweigh the costs and potential for harm?

Patient Applicability
1. How similar were the study patients to my patient?
• Clinical judgment must determine what constitutes a relevant difference and how to interpret the study's findings in light of a difference.
• See the first part of the "Methods" section. Subjects were between 1 and 17 years old, with clinical and CT evidence of acute (<8 hrs) TBI requiring mechanical ventilation.
• Consider the location of the study. This study took place at 17 centers in Canada, France, and the UK.
• Also consider your practice and the treatment options available to you.
• The "Limitations" section describes some patient characteristics, such as acuity of injury and duration of hypothermia therapy, that could possibly influence outcomes.

Patient Applicability
2. Were all clinically relevant outcomes considered?
• Consider three types of outcomes: clinically beneficial outcomes; side effects and adverse events; and social factors such as cost, convenience, and expertise needed.
• Table 3 shows the primary and secondary outcomes considered in this study. Numerous outcomes were considered.
• Cost was not included in the analysis, although this matters little here since the article concludes that the treatment itself does not improve outcomes.

Patient Applicability
3. Do the benefits outweigh the costs and potential for harm?
• This question can be quite subjective and must be considered in light of an actual patient.
• Factors such as effect size, relative risk, NNT, potential side effects, and cost need to be weighed against patient preference, convenience, and values.
• The last paragraph of the paper concludes that hypothermia therapy does not appear to be warranted.
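
As a closing worked example, the six measures used throughout this module (CER, EER, ARR, NNT, RR, RRR) can be computed directly from the four counts of a 2 x 2 table. The sketch below is illustrative and the function name `effect_measures` is hypothetical. It carries full precision through every step, so its results differ slightly from the module's hand calculation, which rounds the event rates first: the NNT comes out near 64.6 rather than 62.5, and the RR near 0.55 rather than 0.54.

```python
def effect_measures(events_exp, n_exp, events_ctrl, n_ctrl):
    """Compute the six dichotomized effect measures from 2 x 2 counts."""
    eer = events_exp / n_exp        # Experimental Event Rate
    cer = events_ctrl / n_ctrl      # Control Event Rate
    arr = cer - eer                 # Absolute Risk Reduction
    if arr == 0:
        raise ValueError("ARR is zero, so NNT is undefined")
    return {
        "CER": cer,
        "EER": eer,
        "ARR": arr,
        "NNT": 1 / arr,             # Number Needed to Treat
        "RR": eer / cer,            # Relative Risk
        "RRR": arr / cer,           # Relative Risk Reduction (equals 1 - RR)
    }

# Magnesium sulfate trial: 20 of 1041 (MgSO4) vs 38 of 1095 (placebo).
measures = effect_measures(20, 1041, 38, 1095)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

A negative ARR (more events in the experimental group) would produce a negative NNT, which is usually reported instead as a "number needed to harm."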