1. Early Identification of Risk 2. An Evaluation of the Use of Irlen
The first and second studies summarised here describe the challenges and
some possible approaches in respect of early screening for the risk of specific
The third summary describes the ( non-supportive ) outcomes of a systematic
evaluation of the use of coloured overlays following Irlen diagnostic
Identifying Children in Need of Early Reading Intervention
The review completed by McAlenney and Coyne (2011) begins by citing the
No Child Left Behind Act ( passed in the USA in 2001 ) which includes the
requirement for early identification of children at risk for reading difficulties and
their access to intervention which has a clear evidence base.
The implication is for a tiered system of teaching within which progress is
carefully monitored and the nature and intensity of the teaching is based upon
Tier 1 may be described as the provision available for all children which is part
of the general curriculum. The contents of this early stage are based upon
the recommendations of the National Reading Panel, reporting in 2000, which
highlighted the particular significance of 5 skill areas for the early teaching
and learning of reading … phonemic awareness, phonics, fluency,
vocabulary, and comprehension.
Most children respond positively to this initial and universal teaching, but
some children do not achieve adequate progress and would be considered “at
risk”. More intensive teaching is required for these children as a Tier 2
Common components of Tier 2 programmes include a longer period of
teaching of basic reading skills, a more explicit and “scaffolded” approach,
and smaller groups or one-to-one support.
Critical for the success of a tiered system is the accurate and early
identification of those children who need the additional and more intensive
teaching. However, the development of a viable screening procedure for
children in the very early stages of schooling has proven a major challenge.
If the children are only at the beginning of a formal programme of reading
teaching, they will not score on traditional tests which tap word or text reading
; and the reading intervention needs to be in place well before the children
reach the age at which they would normally be expected to produce
meaningful results on such tests.
Accordingly, screening measures have focused upon the component skills
deemed necessary for reading acquisition … typically letter-name knowledge
and phonological awareness in which children’s scores have been shown to
be moderately or strongly correlated with later performance in reading ( see,
for example, Scarborough 1998). Combinations of scores from different
measures ( such as initial sound fluency, letter naming, and phoneme
segmentation fluency ) increase the predictive accuracy of the early
screening for reading.
However, it has been argued that the correlations between early screening
scores and later scores on reading tests are not sufficient to evaluate the
effectiveness of screening procedures. Even if the correlations are medium or
high, the screening may not consistently identify those children who are at risk
for reading difficulty. There may be insufficient sensitivity or specificity.
Sensitivity of a measure refers to the proportion of children who present with
reading difficulties having been identified as at risk by that screening measure
( i.e. the success rate in making true-positive identifications ).
Specificity refers to the proportion of children who achieve normal reading
progress who were correctly identified by the screening measure as not being
at risk ( i.e. the success rate in making true-negative identifications ).
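The two definitions above can be expressed as simple ratios. The following is a minimal sketch in Python ; the function names and the counts are hypothetical and are used only for illustration, not taken from the review.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of children with later reading difficulties who were flagged as at risk."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of children with normal reading progress who were correctly not flagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical cohort of 100 screened children (illustrative numbers only):
# 10 later show reading difficulties, 90 make normal progress.
tp, fn = 8, 2    # children with difficulties: flagged / missed
tn, fp = 70, 20  # children without difficulties: not flagged / wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

In this invented cohort the screening misses few children with difficulties but wrongly flags 20 of the 90 children who go on to read normally, which is the over-prediction pattern discussed in this review.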
The converging evidence cited by the current authors indicates that early
screening measures tend to over-predict an at-risk status … i.e. to produce
significant numbers of false-positive identifications.
This may reflect the use of rather generous criteria or cut-off points in
determining which children are at risk as a means of ensuring that no child in
this category is missed … and this increases the likelihood of false positives.
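The effect of a generous cut-off can be illustrated with a small sketch. The scores, outcomes, and cut-off values below are invented for the purpose of the example ( lower scores are assumed to indicate weaker early skills ) and do not come from any of the studies cited.

```python
# Hypothetical data: (screening_score, later_reading_difficulty).
children = [
    (3, True), (5, True), (6, False), (8, True), (9, False),
    (11, False), (12, False), (14, False), (15, False), (18, False),
]


def classify(cutoff):
    """Flag a child as at risk when the score is at or below the cut-off.

    Returns the counts (true_pos, false_pos, false_neg, true_neg).
    """
    tp = sum(1 for s, d in children if s <= cutoff and d)
    fp = sum(1 for s, d in children if s <= cutoff and not d)
    fn = sum(1 for s, d in children if s > cutoff and d)
    tn = sum(1 for s, d in children if s > cutoff and not d)
    return tp, fp, fn, tn


for cutoff in (5, 8, 12):
    tp, fp, fn, tn = classify(cutoff)
    print(f"cut-off {cutoff}: missed = {fn}, false positives = {fp}")
```

With these invented scores, raising the cut-off from 5 to 8 removes the one missed child at the cost of one false positive, and raising it further to 12 adds three more false positives without identifying any additional child in need.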
The desirability of early intervention is well established but these possible
inaccuracies in the screening processes make for difficulties in decision-
making on the part of school staff. Where the groups for Tier 2 intervention
are large because of the inclusion of some false-positives, there will be
financial and logistical pressures. There is the additional concern that
incorrect labelling of children as likely to have difficulties may be damaging to
the confidence and self-esteem of these children, inappropriately influence
opinions and expectations held by their teachers, and cause anxiety to the
The current state of play appears to involve some trade-off between timing of
screening and the accuracy of outcomes. Early screening ensures rapid
access to intervention but may well produce a number of false-positives ; later
screening is more accurate in identifications but will introduce a delay before
intervention is provided.
The current authors go on to describe 5 alternative approaches with their
various advantages and disadvantages.
Immediate intervention involves screening children at the start of the
kindergarten year so that intervention to reduce or avert reading disabilities
can start as early as possible.
The problem here is that there has not yet emerged a consensus concerning
how best to carry out screening among very young children. Existing trials
described in the literature have adopted different timing, different measures,
and different cut-off scores. Further, the earlier the screening, the greater is
the probability of large numbers of false-positives.
Waiting to screen involves allowing for greater exposure of children to
standard reading instruction, or delaying the procedure until the assessments
more closely reflect actual reading tasks.
Existing evidence has shown that this approach produces fewer children
wrongly identified as at risk of reading disability. This may reflect the
opportunity to adopt various measures within the screening ( including
phonological awareness, phonic fluency, and simple word reading ) and
teachers’ reports and recommendations.
However, the advantage gained in respect of fewer false-positives is
countered by the delay before additional intervention is initiated. A lengthy
delay to ensure accuracy of screening is not seen as a very satisfactory trade-
Waiting for response is a matter of closely monitoring responsiveness to initial
and standard instruction alongside the use of some actual screening
Children who initially present as at-risk according to the screening scores are
targeted for additional performance monitoring, with responsiveness to the
general Tier 1 teaching assessed over time so that the children included in
Tier 2 intervention groups are those with both poor scores on the screening
and poor performance in their response to the standard teaching ( as
reflected, for example, in their scores on a measure of word identification
fluency implemented at intervals over a relatively short period ).
It is this dual weakness … poor screening result and low slope of plotted
performance scores … that indicates the greatest risk.
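The dual criterion can be sketched as follows, assuming that progress is summarised by the least-squares slope of repeated scores taken at equally spaced intervals. The function names, the cut-off values, and the scores are hypothetical ; the review does not specify how the slope is calculated or what thresholds are used.

```python
def slope(scores):
    """Least-squares slope of scores against time points 0, 1, 2, ...

    Requires at least two scores.
    """
    n = len(scores)
    mean_t = (n - 1) / 2
    mean_s = sum(scores) / n
    num = sum((t - mean_t) * (s - mean_s) for t, s in enumerate(scores))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


def dual_risk(screen_score, progress_scores, screen_cutoff, slope_cutoff):
    """True only when BOTH the screening score and the growth rate are low."""
    return screen_score <= screen_cutoff and slope(progress_scores) < slope_cutoff


# Two hypothetical children with the same poor screening score (4):
# one responds well to Tier 1 teaching, the other shows little growth.
print(dual_risk(4, [2, 4, 6, 8], screen_cutoff=6, slope_cutoff=1.0))  # False
print(dual_risk(4, [5, 5, 6, 6], screen_cutoff=6, slope_cutoff=1.0))  # True
```

The first child would have been a false positive on the screening score alone ; the progress data remove that child from the at-risk group.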
However, while this approach has been shown to be more accurate in its
predictive capacity, false-positives are not eliminated and continue to be a
The combination of responsiveness to basic teaching and screening scores is
seen as a positive aid to decision-making, albeit with the concern that the low
rate of false negatives is countered by a continuing number of false positives.
Dynamic assessment differs from a standard or static assessment in that it
explores what a child can do when given different levels of support. It is not a
matter of comparing scores on a given test over time, but of measuring
performance in response to gradually increasing cues and prompts.
The basis of this approach is the belief that children who cannot manage a
task or question independently, but who can respond appropriately with some
help, will respond in much the same way to the standard teaching. Children
who initially appear at risk may make adequate progress in the classroom (
with some additional encouragement and prompting ) whereas the children
who require a constant and high level of prompting and cueing to produce a
response, or who continue to fail to produce a response, are the ones who are
likely to require the additional support of a Tier 2 intervention.
The problem here is that this kind of assessment is quite complex : its
potential has yet to be fully explored, and there is also the issue of the
time and expertise needed to implement it.
One suggestion ( Fuchs et al 2009 ) has been for a “double” process whereby
schools use a fairly basic screening procedure to identify children who may be
at risk, and follow up this sub-group with a dynamic assessment … although,
even here, there would still be the matter of sufficient staffing to complete the
In sum, dynamic assessment can assist in gaining greater accuracy in
identifying children at risk of reading disability and enable a relatively rapid
access to additional support ; but current procedures are quite complicated
and require longer sessions than traditional assessments, with implications for
staff time and training.
Data-based regrouping uses both observed responsiveness to instruction and
tiered levels of intervention to avoid false positives.
For example, Lentini and Coyne (2009) found that children who were strong
responders after initial screening and a 9-week period of intensified teaching
could be returned to the standard classroom routine and still be above the at-
risk range by the end of the kindergarten year. Such children might otherwise
have become false-positives. Whether or not the children could safely move
out of the intensive intervention group was determined by ongoing testing of
their level of phonemic and alphabetic skills ( the targets of the intervention ).
In this approach, all the children whose screening results suggest a possible
risk are given the additional intervention with a revision of the grouping after a
( with some children likely to be removed ). This reduces the overall size of
the intervention group, provides space for children newly emerging as
apparently at risk, and reduces demands upon staffing.
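The regrouping step might be sketched as below. This is an illustration only : the function name, the exit threshold, the scores, and the children's names are all hypothetical, and the study itself does not describe its decision rule in this form.

```python
def regroup(group, latest_scores, exit_threshold, newly_flagged):
    """Return the revised intervention group at a review point.

    Children whose latest skill score has reached the exit threshold return
    to standard classroom teaching; newly flagged children take up places.
    """
    staying = [child for child in group if latest_scores[child] < exit_threshold]
    joining = [child for child in newly_flagged if child not in staying]
    return staying + joining


# Hypothetical review: Ben has responded strongly and leaves the group,
# while Dee has newly emerged as apparently at risk.
group = ["Ana", "Ben", "Cal"]
scores = {"Ana": 12, "Ben": 25, "Cal": 9}
print(regroup(group, scores, exit_threshold=20, newly_flagged=["Dee"]))
```

The group size is unchanged in this example, but the place freed by a strong responder is now used by a child who needs it.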
Delay in initiating intervention is avoided, and the flexibility of re-grouping
does not require any particular training or enhanced level of expertise among
However, one potentially negative issue is that all the children who appear
initially to be at risk of reading disability ( inevitably including some false
positives ) are part of the intervention group ; and a question that needs to be
explored is the length of time required accurately to identify strong responders
who can safely return to the standard classroom instruction.
In their discussion, McAlenney and Coyne argue that it is important to
establish agreement on the benefits and costs of the various methods of
identifying children in genuine need of additional support. There is a fairly
strong consensus in respect of what should be the focus of early reading
teaching ( phonemic awareness and phonic skills ) but how best to identify
children at risk of failure and in need of intensive intervention is not yet clear
… and different approaches may suit the particular needs and circumstances
of different schools.
Screening procedures at the start of schooling have been the least accurate,
while delaying this process may achieve greater accuracy but loses valuable
intervention time. Progress monitoring can improve accuracy, but there
remains the question of delayed access to support.
Dynamic assessment and data-based regrouping reduce delay in initiating
intervention but each approach has considerable resource implications …
although data-based regrouping could be implemented by using, without any
great modification, the existing assessment procedures and tiered groupings
in place within schools.
It is appropriate that all children who appear to be at risk are included in
intervention groups. It is recognised that some will prove to be false positives,
but time is required to be certain who these children are, and the need for
large group size and resource expenditure is acknowledged to be a necessary
albeit short term concomitant.
Predicting Dyslexia among Children at Age 5 Years
Helland et al (2011) quote the British Dyslexia Association definition of
dyslexia as an impairment of constitutional origin closely related to
developmental speech and language impairments. Over time, it has been
recognised as a multi-factorial disability with idiosyncratic symptoms and
The implication of this kind of definition is that children have an in-born
susceptibility to dyslexia ; and neuro-scientific evidence has implicated
dysfunction in the parieto-cortical and occipito-temporal cortical networks in
the brain, but compensatory activation in anterior systems of the frontal lobes.
These brain regions are also involved in a range of language processes, such
as rapid naming, phonological awareness, and working memory.
In other words, Helland et al argue, there is scope for the identification of
dyslexic-risk signs before the start of formal literacy teaching. Their own
study explored the capacity of a questionnaire applied to children aged 5
years for predicting dyslexia directly observable in later childhood.
Their review of existing work in this domain highlights a number of possible
predictors such as early phonological awareness, language impairments,
limited skills of word identification, and a family history of dyslexia.
However, the problem has been that false-positives appear common. For
example, Gabrieli (2009) estimated that around 20% of children are identified
as being at risk of dyslexia on the basis of existing screening procedures
when the proportion of children with actual dyslexia is likely to be no more
than 10%.
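The arithmetic implied by these two percentages can be made explicit. If 20% of children are flagged and no more than 10% actually have dyslexia, then even on the most favourable assumption ( that every child with dyslexia is among those flagged ) at least half of the flagged children are false positives. A minimal sketch, with a hypothetical function name :

```python
def max_true_positive_share(flagged_rate, prevalence):
    """Upper bound on the share of flagged children who are truly affected.

    Assumes the best case, in which every affected child has been flagged.
    """
    return min(1.0, prevalence / flagged_rate)


best_case = max_true_positive_share(flagged_rate=0.20, prevalence=0.10)
print(f"at most {best_case:.0%} of flagged children have dyslexia")
print(f"at least {1 - best_case:.0%} of flagged children are false positives")
```

If the screening also misses some children with dyslexia, the share of false positives among those flagged would be higher still.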
It may well be the case that dyslexia tends to run in families, but various
studies have typically shown that only around a third of young children
identified as at-risk are from a family with a history of dyslexia. Further, there
is some uncertainty whether the observed pattern of a greater incidence of
dyslexia among boys compared to girls is a reflection of the true situation, and
whether existing screening measures produce artefactual results because of a
focus upon early language weaknesses or delays which are more common on
In any event, converging evidence has indicated that impairments in language
and literacy are not attributable to some unitary factor but reflect idiosyncratic
and subtle combinations of a range of factors.
The impact of these underlying factors ought to be identifiable at an early age
… via information based upon the experience and observations of caregivers ;
and reports of early and possibly disadvantaging features such as premature
birth, frequent ear infections, and autoimmune disorders would be suggestive
of increased vulnerability of the children to developmental disabilities which
could include dyslexia.
In practical terms, the authors continue, questionnaires designed to be
completed by parents or carers need to be straightforward and to provide
scope for the expression of the issues causing concern.
Accordingly, their own study used a specially-developed questionnaire which
included 28 items covering 6 domains for completion by parents and pre-
school staff, on the basis of which a risk index could be calculated.
These domains were :
Health ( history of any sensory problems or infections, chronic illness,
History of asthma or allergies ; and handedness
Requirement for any special educational provision
Familial data ( presence of any language, learning, or sensory
problems within the family ).
The children involved in this study were drawn from preschools across
Western Norway, and represented urban and rural districts.
120 children, aged 5 years, were included following appropriate permissions
and official approval ; and their parents and pre-school teachers were sent
their particular versions of the risk-index questionnaire.
From this initial probe, 25 children were placed in an at-risk group, and a
control group was also established. The remaining children were not followed
Between the ages of 5 and 8, these two groups of children were assessed on
measures of reading and spelling.
When the children were aged 11 years, a further questionnaire was
completed by the parents of 22 of the at-risk group of children, and 20 of the
parents of the control group. Literacy tests were also used, and the children
were regrouped into those who could be described as dyslexic and those
showing no dyslexic-type difficulties.
The authors argued that the results from this study indicated that dyslexia can
be predicted with good accuracy before the start of formal teaching.
The higher the calculated risk on the screener questionnaire at age 5, the
greater the likelihood of poor performance on the later tests of literacy.
11 of the children in the dyslexic group identified at age 11 years were from
the at-risk group, and only 2 from the control group, so that sensitivity and
specificity were seen as high. There were 8 girls and 5 boys in this dyslexia
The risk index produced was shown to be reliable across respondents (
parents and pre-school staff ) and over time.
What was regarded as important was that the questionnaires from which
could be calculated the risk index were completed by individuals without any
expertise or experience in the field of dyslexia … and there was positive
agreement/reliability between parents' and pre-school teachers' ratings ( albeit
with some differences noted, explicable in terms of the different settings in
which observations were made and the likely differences in expectations ).
At the cognitive level, major benchmarks of dyslexia were identified as
impaired phonological awareness or processing, impaired short-term or
working memory, impaired verbal processing, impaired visuo-spatial skills,
and impaired language development occurring in each individual at varying
Major differences between at-risk and control groups were observed in
phonological awareness, verbal learning, and letter knowledge.
The authors noted that the at-risk signs at age 5 years were not so clear in the
girls as in the boys, and it was speculated that signs develop later in girls ( in
line with more obvious morphological changes in the brain ), with the
implication that there need to be some differences in screening methods or
criteria for girls and boys.
In their conclusion, Helland et al restated the major finding that the risk index
at 5 years appeared to have the ability to differentiate children likely to be
identified with dyslexia from those who are not. Early screening followed up
by professional testing of dyslexic benchmarks would seem to have the
potential to identify children at a stage when preventive measures can be
effective and take advantage of plasticity of brain organisation and
What is important is continuing research to determine if these promising
results are replicated thus to be able to identify and support at-risk children at
a time when they would be most responsive to the support.
Irlen Overlays and Reading Difficulties
The study by Ritchie et al (2011) set out to explore the validity of
recommendations made by Irlen Institute consultants that the use of coloured
overlays can alleviate reading problems.
Reference is made to a claim that around 12% of the general population and
up to 46% of the population with dyslexia or other learning difficulties have the
Irlen Syndrome ( also described as visual stress or scotopic sensitivity ) which
is marked by visual distortions when attempting to read text. The use of the
overlays is held to compensate for these difficulties thus aiding children to
learn to read and facilitating the use of the reading skills that have been
However, these authors note that the existence of this specified syndrome,
and the effectiveness of the overlay treatment, remain the subject of some
It has been argued ( by the American Academy of Pediatrics in 2009, for
example ) that evaluative studies have been of questionable quality, with
methodological issues inhibiting meaningful conclusions ( such as small
sample sizes, limited statistical analysis, and a lack of blinding or masking ).
Claims of improved reading by means of the overlays ( eg Noble et al 2004 )
may be contrasted by other studies ( eg Christenson et al 2001 ) reporting no
In their own study, Ritchie et al tested the short-term effectiveness of overlays
in a sample of schoolchildren with reading difficulties who had been assessed
by an Irlen consultant. The authors describe this study as being the first, to
their knowledge, which has involved children with no prior knowledge of the
prescribed colour of the overlays at the time of their participation in the
There were 75 children ( age range 7-12 ) in the initial sample, attending a
primary school in Port Glasgow, Scotland. All had been selected by their
teachers for an Irlen assessment because of their being identified as reading
below the average level for their age.
The measures used in this current evaluation included a test of general
cognitive ability ; the Wilkins Rate of Reading Test ; the Gray Oral Reading
Test covering oral reading, reading fluency, and reading comprehension.
Full sets of results were available for 61 children, 44 of whom had been
diagnosed with Irlen Syndrome.
These 44 children were the focus of the current study. Differences in reading
performance were explored for 3 conditions … using an overlay of the
prescribed colour, using an overlay of a non-prescribed colour, and using no
The authors reported that Irlen coloured overlays had no immediate clinically
or statistically significant effect on the reading ability of poor readers.
Neither the overlays of the prescribed colour, nor of the non-prescribed
colour, had any impact upon reading scores according to the measures used
in this study.
It was noted that drawing no attention to the overlays along with the masked
design may have led to the absence of any effects ( even placebo effects )
despite what might be considered the salience of the use of these overlays.
It was further recognised that this study did not examine the nature of Irlen
syndrome or its validity as a diagnosis. Also, the 2 groups of children were
not only similar in showing no significant treatment effects but also in their
measured ability and baseline reading performance and orthoptic status.
This is what the proponents of the use of overlays might well have predicted
on the grounds that individuals who are diagnosed with this syndrome or
visual stress are not necessarily experiencing actual reading problems but this
specific type of visual disadvantage in accessing the material to be read.
In other words, the authors accept that their results ( especially based upon
relatively limited samples of target and control children ) do not rule out the
possibility that a small number of children might benefit from the use of
Another caveat is that the focus in this study was upon children diagnosed by
Irlen consultants and the observed results may not generalise to other
interventions which use coloured overlays such as the Wilkins’ System. While
the Irlen and Wilkins’ approaches overlap considerably, with some justification
for predicting somewhat similar outcomes, it is acknowledged that the Wilkins’
System uses a greater precision in the selection of colours for the overlays
and that this could have a bearing on outcomes.
In any event, on the basis of these current findings the authors conclude that
the use of the Irlen-identified overlays has no immediate effect upon the
reading of children experiencing difficulties … even those for whom a
diagnosis of Irlen syndrome has been applied. Their recommendation based
upon this study, and on the lack of wholly convincing evidence from existing
studies, is that parents, teachers, and other professionals should be cautious
and examine all the available evidence when considering the expenditure of
time and resources on what might still be seen as a controversial approach.
From the first study concerning the challenges surrounding how best to
organise early screening and intervention, one can only echo the point about
establishing the best available system to ensure that no child slips through the
net and to adopt a cautious approach by taking seriously any indication of risk
… even if this does allow for the inclusion of false positives in intervention
groups for at least some time.
False positives are obviously to be avoided if at all possible, but one might
argue that false negatives ( ie a child not getting support when it is actually
necessary ) are more regrettable still given the potentially serious outcomes
educationally and psycho-socially.
One might also ponder whether the very early start to formal education in this
country, and the pace demanded if imposed expectations are to be met,
increase these dilemmas and dangers.
With this in mind, it is encouraging to note the work summarised in the second
study indicating the apparent potential for early identification from a pre-
school procedure which is reported to have relatively high sensitivity and
From this study, it was interesting to note that there were more girls than boys
who proved to fit the existing criteria for dyslexia when assessed at age 11
years. It may be that this is simply a function of the small sample and that any
replication study may not produce the same pattern.
However, it does add some weight to the argument that boys and girls may
not be as different in terms of vulnerability to dyslexia as the existing statistics
would have one believe. A possible implication is that current screening
measures ( and characteristics perceived as significant ) are not as
meaningful for girls as for boys so that false negatives may be more common
among girls … hence the need to avoid placing complete reliance upon one
single screening procedure and, instead, to maintain close monitoring over a
prolonged period of early schooling.
In respect of the third study, on the theme of Irlen-type overlays for reducing
certain reading difficulties, it might be concluded that “the jury is still out”, so
that one might avoid active recommendations for its use but not stand in the
way of some experimental usage if all involved accept it as such.
As with autism and ASD, various intervention schemes have arisen and one
can appreciate that parents may well be prepared to consider any options that
could be of help to their children. However, it appears that where gains have
been reported, it may not be clear how or why they have been achieved.
Further, what might be effective for one child may not be effective for another
child if gains are dependent upon some interaction between the particular
features of the intervention and the needs of the child operating within a
The logical response when an (innovative) approach is mooted is for cautious
open-mindedness but an avoidance of high expectations ; and, if the
approach is to be adopted, it would be for an initial trial period with care taken
to produce clear baseline performance data and to maintain close monitoring
during that trial period.
* * * * *
American Academy of Pediatrics 2009 Joint statement : learning
disabilities, dyslexia, and vision. Pediatrics 124(2) 837-844
Christenson G., Griffin J., and Taylor M. 2001 Failure of blue-tinted lenses
to change reading scores of dyslexic individuals. Optometry 72(10) 627-633
Fuchs D., Compton D., and Fuchs L. 2009 Construct and predictive validity
of a dynamic assessment of beginning reading. Paper presented to the
annual meeting of the Pacific Coast Research Conference : San Diego,
Gabrieli J. 2009 Dyslexia : a new synergy between education and
neuroscience. Science 325 280-283
Helland T., Plante E., and Hugdahl K. 2011 Predicting dyslexia at age 11
from a risk index questionnaire at age 5. Dyslexia 17 207-226
Lentini A. and Coyne M. 2009 A successful return to classroom instruction :
addressing false positive risk classifications in kindergarten reading
assessment. Manuscript submitted for publication.
McAlenney A. and Coyne M. 2011 Identifying at-risk students for early
reading intervention : challenges and possible solutions. Reading and
Writing Quarterly 27 306-323
Noble J., Orton M., Irlen S., and Robinson G. 2004 A controlled field study
of the use of coloured overlays on reading achievement. Australian Journal
of Learning Disabilities 9(2) 14-22
Ritchie S., Della Sala S., and McIntosh D. 2011 Irlen colored overlays do
not alleviate reading difficulties. Pediatrics 128(4) e932-e938
Scarborough H. 1998 Early identification of children at risk for reading
disabilities. In B. Shapiro, P. Accardo, and A. Capute (Eds) Specific
Reading Disability : A View of the Spectrum. Timonium MD : York Press