
      Why do we Need
      Randomised Controlled Trials
David Torgerson
Director, York Trials Unit
              What works?
• In most areas, education, health, criminal
  justice, etc, we want to know WHAT or
  WHETHER something works.
  » Do ‘bootcamps’ reduce criminal behaviour?
  » Are teaching volunteers effective?
  » Are computers effective at improving literacy
    and numeracy?
• Of secondary importance is HOW.
        The WHAT question
• The ONLY way we can find out whether
  something works or not is by using a
  Randomised Controlled Trial.
• All other evaluative methods are
  INFERIOR ways of answering the WHAT
  question and some cannot answer it at all
  (e.g., qualitative research).
        Structure of Session
• Randomised Controlled Trials ARE the
  ‘gold-standard’ evaluation method.
  » What is wrong with other research methods?
  » Why should we do trials?
Before and After Methods
    Clinical Practice in the 18th Century

• "It is incident to physicians, I am afraid,
  beyond all other men, to mistake
  subsequence for consequence."

  Samuel Johnson, 1734
• Traditionally most interventions have been
  evaluated using a pre-test post-test or before
  and after design.
• Participants are tested, treated and then tested
  again; any improvements are attributed to the
  intervention.
• Currently this is probably the most POPULAR
  evaluative method in most fields.
     Who uses before and after?
•   Policy makers
•   Teachers assessing individual children.
•   Action researchers.
•   Parents
•   Lecturers
•   We all do.
• Problems include:
  » Temporal changes;

  » Regression to the mean.
         Temporal Change
• Self-learning irrespective of teaching
• As children mature they will become better
  at learning.
• Any intervention or treatment is mixed up
  with these temporal changes, making its
  effect difficult to isolate.
       Changes in Outcomes

• If we measure outcomes using public
  examination results we will see an
  improvement. Is this because the
  intervention has worked? Or is it because
  exams have got easier? Or have children
  become more intelligent?
• Without a control group we CANNOT know.
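The point about temporal change can be illustrated with a small simulation. This is only a sketch under assumed numbers: the function name `simulate_before_after`, the 5-point upward trend and the score scale are all hypothetical and not taken from the talk. The intervention is given a true effect of zero, yet the before and after comparison still shows a gain; only the comparison with a control group returns an estimate near zero.

```python
import random
import statistics


def simulate_before_after(n=5000, trend=5.0, true_effect=0.0,
                          noise_sd=10.0, seed=1):
    """Return (before-and-after estimate, controlled estimate) of the effect.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)

    def gains(effect):
        # Each pupil's gain = shared temporal trend + intervention effect
        # + measurement noise on both the pre-test and the post-test.
        out = []
        for _ in range(n):
            pre = 100.0 + rng.gauss(0.0, noise_sd)
            post = 100.0 + trend + effect + rng.gauss(0.0, noise_sd)
            out.append(post - pre)
        return out

    treated = gains(true_effect)
    control = gains(0.0)
    before_after = statistics.mean(treated)
    controlled = statistics.mean(treated) - statistics.mean(control)
    return before_after, controlled


before_after, controlled = simulate_before_after()
print(f"before-and-after estimate: {before_after:.2f}")
print(f"estimate using a control group: {controlled:.2f}")
```

The before and after figure simply recovers the temporal trend, which is exactly what a design without a control group cannot separate from the intervention.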
     Regression to the Mean
• As well as temporal changes, before and
  after studies are confounded by a
  statistical phenomenon known as
  ‘regression to (or towards) the mean’.
     Regression to the mean
• This is a GROUP phenomenon and occurs
  when the group are measured with an
  inexact measurement tool and then
  remeasured. Those individuals with
  ‘extreme’ values will have a high
  probability of regressing towards the mean
  on the second measurement.
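A minimal simulation sketch of this definition follows. The function name `simulate_rtm`, the score scale and the noise levels are assumptions chosen for illustration. Each individual has a fixed true score, is measured twice with an inexact tool, and receives no intervention at all; the group with extreme low values on the first measurement still "improves" on the second.

```python
import random
import statistics


def simulate_rtm(n=10_000, true_sd=10.0, noise_sd=10.0, seed=0):
    """Measure the same people twice with noise; nothing changes in between."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n)]
    first = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    second = [t + rng.gauss(0.0, noise_sd) for t in true_scores]

    # Select the 'extreme' group: roughly the bottom 10% on the first test.
    cutoff = sorted(first)[n // 10]
    low = [i for i in range(n) if first[i] <= cutoff]

    mean_first = statistics.mean(first[i] for i in low)
    mean_second = statistics.mean(second[i] for i in low)
    return mean_first, mean_second


before, after = simulate_rtm()
print(f"bottom group, first test: {before:.1f}")
print(f"bottom group, retest:     {after:.1f}")
```

The retest mean moves back towards the overall mean of 100 without reaching it, because part of each extreme first score was measurement error that does not repeat.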
            History of RTM
• Galton’s work from 1869 started to provide
  the understanding of the phenomenon.
• By 1886 Galton had described the
  phenomenon among the heights of
  children and their parents (children of tall
  parents tend to be shorter than their parents,
  and children of short parents taller:
  ‘regression to mediocrity’).
        Economists and RTM
• “I suspect that the regression fallacy is the
  most common fallacy in the statistical
  analysis of economic data”
      Milton Friedman 1992
       Marking Exam Scripts
• For the MSc in Health Sciences there is a system
  of double marking: markers are blind to
  student identity and to the other marker’s
  marks.
• There is a tendency to disagree with
  marks at the extreme of the distribution.
• Explanation: Regression to mean.
      RTM and exam scripts

[Figure: plot of exam script marks, axis from 0 to 100; R = 0.7788]

 Annual Increase in Offences with Firearms
[Figure: chart of the annual increase in firearms offences; x-axis labels include 2000, 2001/02 and 2002/03]
      Did the Amnesty work?
• Unclear. The year preceding the amnesty
  had a large, unexpected increase in
  offences; we would expect, through
  regression to the mean, that in the
  following year the rate of increase would
  ‘regress’ back towards the ‘average’
  annual increase.
        Education intervention
• Wheldall selected 40 pupils whose reading
  was at least 2 years behind their peers.
• Half were exposed to an intervention.

 Wheldall Educational Review 2000;52:29.
       Before and after reading
[Figure: pre and post reading scores; legend: Control. Difference highly statistically significant, p < 0.001]
       Before and after reading
[Figure: pre and post reading scores; legend includes Intervention. Differences between groups NOT statistically significant]
       RTM misunderstanding
• “the mean gain scores translated to impressive
  effect sizes of 0.6.”
• “It could be argued that it is asking too much of
  any program to demonstrate enhanced efficacy
  on top of such high existing efficacy”
• “…control group gains were largely attributable
  to pre-existing …literacy programme..”
• Perhaps, BUT much of the gain will be due to
  regression to the mean.
        RTM and School Exclusions
• A qualitative and before and after
  evaluation of an intervention to reduce
  school exclusions said
  » “an RCT would not have been able to
    adequately address fundamental problems
    concerning the reliability and validity of
    quantitative data in relation to exclusions”
          Flawed Methods
• Selected schools with HIGH exclusion
  rates on which to intervene. Therefore we
  would EXPECT exclusions to fall.
• They did by 15%.
• BUT schools with the fewest exclusions
  INCREASED exclusions by 55% whilst
  schools with the highest exclusions had a
  fall of 32%.
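The pattern above (a fall among the highest-excluding schools and a rise among the lowest) is what chance alone produces when schools are ranked on one year's figures. The sketch below is an illustration under assumed numbers, not a reconstruction of the evaluation's data: `simulate_exclusions`, the number of schools, the school size and the risk range are all hypothetical. No school changes its underlying exclusion risk between the two years.

```python
import random


def simulate_exclusions(n_schools=200, pupils=500, seed=2):
    """Percentage change in exclusions for the top and bottom fifth of schools.

    Schools are ranked on year 1 counts; no intervention takes place.
    """
    rng = random.Random(seed)
    # Stable underlying per-pupil exclusion risk for each school (assumed range).
    risks = [rng.uniform(0.01, 0.03) for _ in range(n_schools)]

    def yearly_counts():
        # Observed count = number of pupils excluded by chance at that risk.
        return [sum(rng.random() < p for _ in range(pupils)) for p in risks]

    year1, year2 = yearly_counts(), yearly_counts()
    order = sorted(range(n_schools), key=lambda i: year1[i])
    fifth = n_schools // 5
    lowest, highest = order[:fifth], order[-fifth:]

    def pct_change(group):
        before = sum(year1[i] for i in group)
        after = sum(year2[i] for i in group)
        return 100.0 * (after - before) / before

    return pct_change(highest), pct_change(lowest)


high_change, low_change = simulate_exclusions()
print(f"highest-excluding fifth: {high_change:+.0f}%")
print(f"lowest-excluding fifth:  {low_change:+.0f}%")
```

Selecting schools because they had HIGH exclusions in one year therefore builds an apparent fall into the evaluation before any intervention is delivered.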
• In England, mentoring is part of the KS3 Strategy
• Backed by Government and private
• ‘Mentoring’ means a lot of different things
• Research evidence is
  » Case studies
  » Feelings and perceptions of participants
  » Completely inadequate to infer impact
       Neil Appleby’s Experiment
• A randomised controlled trial involving 20
  underachieving Y8 (12-13 year-old) students
• Matched in pairs on ability and gender
• Randomly allocated: in each pair, one mentored, the
  other not
• Mentored group had 20 mins individually every two
  weeks (11 sessions)
  » ‘It nearly killed me’
  » Cost estimated at between £170 and £410 per mentored
    pupil, which represents between 8% and 19% of the school’s
    annual per pupil funding for the whole of their education
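The allocation step described above (matched pairs, with one member of each pair chosen at random to be mentored) can be sketched as follows. The function name `randomise_pairs`, the pupil labels and the seed are hypothetical; the actual matching on ability and gender is assumed to have been done beforehand.

```python
import random


def randomise_pairs(pairs, seed=None):
    """Within each matched pair, randomly pick one member for mentoring."""
    rng = random.Random(seed)
    mentored, control = [], []
    for first, second in pairs:
        # A fair coin toss decides which member of the pair is mentored.
        if rng.random() < 0.5:
            first, second = second, first
        mentored.append(first)
        control.append(second)
    return mentored, control


# Ten hypothetical matched pairs, giving 20 pupils in total.
pairs = [(f"pupil_{2 * i + 1}", f"pupil_{2 * i + 2}") for i in range(10)]
mentored, control = randomise_pairs(pairs, seed=42)
print("mentored:", mentored)
print("control: ", control)
```

Because chance alone decides who is mentored within each pair, the two groups are balanced on the matching variables by design and, on average, on everything else.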
  What the teachers said about
   the mentored students …
• “**** is a changed person this year she
  has progressed greatly and is a superb
  helpful student.”
• “Better now, has achieved more, more
• “Generally a great improvement recently.”
• “****’s attitude and effort have improved
  over the year. He is a lot pleasanter and
  more willing to participate in lessons
  particularly oral work, he responds well to
 What they said about the control
            group …
• “Has improved overall this term.”
• “****’s attitude and effort have improved over
  the last few months, she is now trying very hard
  to achieve her target. Great effort.”
• “Commended for attitude and progress.”
• “**** has settled since the beginning of the
• “**** has undergone quite a transformation
  since September. Her attitude towards the
  teacher and her learning have improved
  drastically and she should be congratulated.”
               Change in Teachers’ Ratings
               of progress, effort and attitude
               (English, maths and science combined)
[Figure: overall rating of change for each group, x-axis from -6 to 10, with group means and group medians marked]
          What this proves
• If you identify a group of underachieving
  pupils at a particular time and then come
  back to them after a few months, many of
  them will have improved, whatever you do.
• Others (the ‘hard cases’) will not have
  improved, whether mentored or left alone.
• The interpretation of this would have been
  very different without a ‘control’ group
      RTM and League Tables

• RTM is GREAT for Governments: it helps persuade the
  credulous that what they do works.
• In any league table those at the bottom will tend
  to ‘regress’ upwards to the mean whilst those at
  the top regress down. This lends support to
  naming and shaming or extra financial help to
  those at the bottom.
         Dealing with RTM
• The only way to reliably deal with the
  problem is through randomised trials.
• Which is why before and after data are
  generally regarded, by the cognoscenti,
  as almost USELESS.
   History of Controlled Trials
• Because of temporal and regression to
  mean effects we MUST have a control
  group.
• Many researchers over the centuries have
  seen the need for a ‘control’ group to avoid
  the inherent biases in the before and after
  method.
• Controlled trials have been conducted for
  several hundred years, probably
  occasionally using randomisation.
• Scurvy was a very prevalent condition
  among sailors before the 19th Century.
• A controlled trial of 12 sailors in the middle
  of the 18th Century showed that the two
  sailors allocated to receive oranges and
  lemons recovered and were able to care for
  their ship mates allocated to other remedies
  such as vinegar or salt water.
      Lack of Dissemination

• An even earlier trial in scurvy prevention
  used a ‘cluster’ design whereby a whole
  ship’s crew were allocated citrus fruit and
  were compared with two ships’ crews who
  were not.
• The treatment worked but the lesson was forgotten.
• After the second trial it took the Navy 50 years to
  implement the results.
• Fisher is usually thought of as the
  originator of randomisation in the 1920s in
  agricultural experiments.
• He was concerned with the statistical
  properties of ‘randomness’ as well as the
  formation of unbiased groups.
• In 1937 a classic experiment, the
  Cambridge-Somerville trial, was launched.
• The aim was to show that social worker
  intervention among ‘delinquent’ boys
  would reduce ‘criminality’.
• 650 boys were identified by their teachers
  as having delinquent behaviour that put
  them at later risk of criminal activity.
• 325 pairs were formed and one from each
  pair was allocated a social worker
  supported by psychiatrists.
  Results – early follow-up % of
    boys indulging in crime.






[Figure: bar chart by offence type (Property, Assault, Sex, Drunk, Traffic); green bars indicate the intervention group]
     Results later follow-up

• In 1975 the ‘boys’ were followed up again
  as middle-aged men.
• 58% of intervention group had NOT had a
  criminal conviction
• BUT 68% of control group had NOT had a
  criminal conviction
• If a control group had not been used,
  success of the intervention would have been
  claimed.
    Consequences of the Trial
• The social work profession largely
  ABANDONED the RCT as a method of
  evaluation as it failed to give the RIGHT
  answer.
               RCTs and education
   • Lindquist, writing about experimental
     methods in 1940, noted that in advanced
     text books “all illustrations given are in
     the field of agricultural experimentation
     and are concerned with “plots” “blocks”
     “yields” “treatments” etc, rather than with
     “schools” “classes” “scores” “methods”
     “pupils” etc.”

Lindquist Statistical Analysis in Educational Research, 1940.
 The Importance of Design in Educational
        Experiments (Lindquist)
• In his 1940 book on statistics in educational
  research Lindquist quite clearly describes
  appropriate RCTs for educational research.
• His book is also the first description of the
  appropriate techniques to be used in analysing
  pupils’ scores in classes (i.e., cluster analysis),
  which was an advance on Fisher’s Design of
  Experiments.
             Cluster analysis
• In health statistics Lindquist’s statistical methods
  were largely ignored until the late 1980s, when it
  became accepted to use the methods he
  advocated to analyse clustered data, although
  even now most cluster trials are badly analysed.
• But 64 years on, what about his descriptions of
  how to rigorously evaluate educational
  interventions?
         Educational Trials: UK
• Not many trials in education have been
  undertaken in the UK.
• Most educational trials are from the USA.
• WHY? (my personal view)
  »   Futility of the ‘paradigm war’;
  »   Failure to understand their importance;
  »   Trials often give the ‘wrong’ answer;
  »   Lack of funding.
       Opposition to Trials is
• In health care many doctors will refuse to
  believe the results of a trial and argue the
  trial was faulty or poorly conducted if the
  result was ‘wrong’.
• Recent example: WHI study of hormone
  replacement. Many doctors REFUSE to
  accept the finding of this study that hormone
  replacement INCREASES risk of heart disease.
     Opposition to Polio Trial
• “I found but one person who rigidly
  adhered to the idea of a placebo control
  and he is a bio-statistician who, if he did
  not adhere to this view, would have had to
  admit his own purposelessness in life”
  (Jonas Salk).
               1950s to 1970s
• The use of trials expanded rapidly within and
  beyond medicine.
• In the social sciences experiments included:
  »   Negative income tax;
  »   Adoption;
  »   Busing;
  »   Public vs private schools;
  »   Prevention of spousal abuse.
          Health Care Trials
• Although ALL new medicines have to be
  evaluated using RCTs many medical
  treatments do not.
• HOWEVER, health care is ‘fortunate’
  because we bury our disasters we KNOW
  how important trials are as a protection for
       Health Care Disasters
• Opposition to RCTs has declined over the
  years, partly due to a number of
  catastrophes from unevaluated
  treatments.
• Harmful treatments are still in widespread
  use today – we just don’t know which
      Disasters among babies
• Routine practice in 40s and 50s to give
  premature infants pure oxygen. At the same
  time it was noted that there was an ‘epidemic’ of
  blindness among babies. Linked to oxygen use.
• Routine practice in 50s to give prophylactic
  antibiotics to premature infants, caused brain
  damage and death.
• BOTH of these problems only discovered
  AFTER an RCT was undertaken.
             Trial sabotage
• Interestingly, an early trial of pure oxygen
  for neonates was sabotaged by nurses
  who secretly gave oxygen to some of the
  controls because they KNEW that it was
  beneficial.
• Because of this ARROGANCE they
  contributed to the blinding of healthy
  babies.
       Educational Disaster?
• On the basis of ‘before and after’ studies and
  anecdote, driver education was widely
  implemented among older pupils
  (in the USA).
• It was thought that this would reduce car
  accidents.
• Did it? Fortunately, some ‘sceptics’
  undertook a series of trials in the USA.
    Driver Education - Results
• Roberts and colleagues (see Campbell
  Collaboration) reviewed these trials and
  undertook a meta-analysis.
• They found that driver education
  INCREASED the likelihood of deaths in
  car accidents as it increased the
  prevalence of young motorists.
          UK Policy makers
• Have IGNORED these results and
  implemented driver education in some
• This will directly increase deaths among
  young drivers.
        Computers in Schools
• Introduction of computers into schools has not
  been preceded by large RCTs.
• The best evidence we have is from a ‘quasi-
  experiment’ from Israel, which showed that
  introduction of computers into half the state
  schools led to no change in Hebrew literacy but
  a DECLINE in maths.
• The Israeli Government has since introduced
  computers into all schools!!!
        Volunteers in Schools
• The use of volunteers to help children learn to
  read is widespread – but are they effective?
• In a systematic review of RCTs only 7 trials
  could be identified, the largest with ONLY 99
  participants.
• The effect of volunteering was very slight (0.19,
  -0.31 to 0.68) and not statistically significant.

       Torgerson et al. 2002 Ed Studies, 28 No 4.
• Virtually all new interventions need to be
  evaluated using RCTs.
• Unlike health care, education is compulsory
  for children. Therefore it is even
  more urgent that they should not be
  exposed to ineffective educational
  interventions.
We need more trials
