CONSONANT LISTS
Introduction
Each of the consonant lists provided in the AR Sampler consists of four
quasi-random presentations of the 20 English consonants /p, t, k, b, d, g,
f, v, h, s, z, sh, ch, j, m, n, w, r, l, y/. Although there are 24 consonants in
English, I prefer to restrict the set to these 20 items because the remaining
four, voiceless [th], voiced [th], [zh], and [ng], can create many difficulties for
clients. For example, even if I can persuade a client that there really are
two “th” consonants in English, s/he usually has great difficulty
remembering the orthographic difference between the voiceless [th] and
the voiced [th]. A similar situation exists with the voiced sibilant [zh],
but here I can console myself that its low frequency of occurrence makes
it easier to omit. Finally, the velar nasal [ng] presents its own unique set
of difficulties. Many people cling to the notion that it is a combination of
[n] and [g], a view encouraged when people talk of someone “droppin’ their
g’s,” while some dialects of English, such as “conservative RP” (Crystal,
1995, p. 245), substitute final ‘in’ for ‘ing,’ resulting in words such as
“huntin” and “fishin.”
When I use these items, I usually say them in an [aCa] frame ([apa],
[ama], [afa], etc.) and ask the client to tell me which of the 20
consonants has been presented as the stimulus. I find that most clients
have little difficulty understanding the task, although their performance
can vary considerably. For example, I am currently working with two
adults with implants, and when I use these items as an auditory-only task,
one scores around 50% correct, while the other scores close to 100%.
I sometimes use other vowel environments such as [iCi] or [uCu], and find
that this can have a marked effect on a client’s score. This is especially
true when the materials are presented as a lipreading task, where the
spread and rounded lip shapes that accompany [i] and [u], respectively, can
mask visual cues that are easily recognized in the [aCa] format.
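The list structure described above (four presentations of each of the 20 consonants, spoken in a vowel frame) can be sketched in a few lines of Python. This is only an illustration and not the actual AR Sampler lists: the function name make_list, the seed argument, and the assumption that "quasi-random" means each block of 20 is an independently shuffled copy of the full set are all mine.

```python
import random

# The 20 consonants used in the lists, in the order given in the text.
CONSONANTS = ["p", "t", "k", "b", "d", "g", "f", "v", "h", "s",
              "z", "sh", "ch", "j", "m", "n", "w", "r", "l", "y"]

def make_list(vowel="a", blocks=4, seed=None):
    """Return (consonant, syllable) pairs for one list.

    Assumption: every block of 20 contains each consonant exactly once,
    in shuffled order, so a 4-block list has 80 items.
    """
    rng = random.Random(seed)
    items = []
    for _ in range(blocks):
        block = CONSONANTS[:]   # copy so the master order is untouched
        rng.shuffle(block)
        items.extend((c, f"{vowel}{c}{vowel}") for c in block)
    return items

if __name__ == "__main__":
    for consonant, syllable in make_list(vowel="a", seed=1)[:5]:
        print(consonant, syllable)
```

Passing vowel="i" or vowel="u" gives the [iCi] and [uCu] frames mentioned above.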
Presenting the lists
When I use this test with clients I provide them with a sheet (see the
example on the next page) setting out the 20 alternatives. I place this on
the table in front of the client, and introduce the items one at a time. I
point to the consonant first, and then produce it in the vowel format
chosen. I often ask the client to repeat each of the syllables after it is
produced to ensure that they understand what the task involves. Once
I’m confident that s/he is familiar with the items, I present an entire
80-item list, noting which items are perceived correctly and the direction
of any error responses. I always keep a record of error responses, as
analysis of these can yield very useful information concerning a client’s
ability to perceive consonantal cues.
p    t    k    b    d
g    f    v    h    s
z    sh   ch   j    w
r    l    y    m    n
I almost always ask the client to point to the consonant s/he thinks has
been presented each time. When I’m using this technique, however, I
take great care not to look down at the list of response alternatives until
after the client has found the consonant s/he thinks was presented. I
take this approach because I am aware that my eye gaze can sometimes
provide inadvertent cues that alert the client to the identity of the
stimulus. I’m not suggesting that clients “cheat” here, but rather that
their response can be influenced by my unintentional visual behavior.
I also encourage the client to give a spoken response, as this makes the
task of scoring the items a little easier. Some clients will repeat the
syllables, while others prefer to use the “letter names” – /pi/ for [p],
/ef/ for [f], etc. I don’t really mind which of the alternatives they
choose, because asking them to point at the list of consonants helps
ensure that I know which item they think was presented.
Presentation conditions
I can present the lists in one of three conditions – auditory only (A),
visual only (V), and auditory-visual (AV). I usually sit about 1.5 meters (4
to 5 feet) away from the client, and use a voice level that is appropriate to
the acoustic conditions in which I am working. I know that this might
sound more than a little inexact, but the voice level we use is usually
determined by what’s going on around us. If I am doing formal testing, I
am much stricter about such things, and might use a Sound Level Meter
to maintain a consistent presentation level, but for training and informal
“testing” I prefer using a less stringent approach. When I first use
these materials with a client, I always use Clear Speech to ensure that s/he
is getting the best possible auditory and/or visual signal. This style of
speech, which has been shown to result in much improved speech
perception scores by people with hearing loss, involves “talking in a clear
and concise manner,”¹ and, as a result, “speech becomes slower and louder
and the stress on certain words or syllables becomes more obvious.”²
It’s the style of speech that most therapists spontaneously start to use
soon after they start work with people with hearing loss, and it often
prompts the remark, “If everyone spoke like you, I wouldn’t
need hearing aids!”
When I present the items via A only, I cover my face using an embroidery
frame with two pieces of black loudspeaker cloth (I bought mine at Radio
Shack) stretched across it. This removes any visual cues, but ensures
that the speech signal passes through unimpeded. I sometimes still see
therapists working with deaf children or adults use their hand or a piece
of cardboard to obscure their lips, and am surprised that such practices
remain in the face of overwhelming evidence regarding the deleterious
effect this has on the speech signal. If you want to ensure that the high
frequencies are considerably dampened, use your hand or a piece of card;
but if you want the high frequencies to pass through unimpeded use an
acoustic screen.
¹ Oticon Corporation. 2002. Clear Speech, p. 3. This is a very informative pamphlet available from
Oticon Hearing Aids.
² Ibid., p. 3.
When I present materials V only, I ask the client to turn off her/his
aid(s). Once s/he has done this, I can present the materials using normal
vocal effort. I could silently mouth the items, but would not be confident
that this did not have other, potentially harmful, effects on my
production. If I am sitting in a room with a window, I always sit facing it,
so that my face is well lit and I, rather than the client, am the one
affected by any glare.
AV presentations are usually the easiest for both therapist and client.
Again, I try to use a voice level that is appropriate to the room
conditions, and provide the best possible visual conditions to ensure that
the client has a clear, well-lit view of my face and lips.
Scoring
Once I’ve presented the entire list, I count up the number of items that
were correctly perceived and multiply this score by 1.25 (100 ÷ 80 items) to
derive a percent correct score. In many ways, however, I’m far more interested
in the type of errors made, as these can reveal a great deal about the
client’s ability to perceive the acoustic/phonetic cues that we use to
discriminate between the consonants. For example, is the client able to
perceive consonant voicing, or is s/he able to reliably discriminate
between stops and continuants, or nasals and orals? In order to answer
such questions I enter the client’s responses onto prepared confusion
matrices that look at her/his ability to assign the consonants into their
correct voicing, manner of articulation, and place of articulation
categories.
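As a rough illustration of the scoring step, the sketch below computes the percent correct for a list and tallies the direction of each error. The function name score_list and the (stimulus, response) pair format are assumptions made for the example; for an 80-item list, dividing by the list length gives the same result as multiplying the number correct by 1.25.

```python
from collections import Counter

def score_list(trials):
    """Score one list of (stimulus, response) consonant pairs.

    Returns the number correct, the percent correct, and a Counter of
    error directions keyed by (stimulus, response).
    """
    correct = sum(1 for stim, resp in trials if stim == resp)
    percent = 100.0 * correct / len(trials)
    errors = Counter((stim, resp) for stim, resp in trials if stim != resp)
    return correct, percent, errors

if __name__ == "__main__":
    # Made-up data: 60 correct responses and 20 [p] -> [b] errors.
    trials = [("p", "p")] * 60 + [("p", "b")] * 20
    correct, percent, errors = score_list(trials)
    print(correct, percent)          # 60 items correct, 60 * 1.25 = 75.0
    print(errors.most_common(3))     # most frequent error directions
```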
Voicing
In the set of twenty consonants used in this test there are eight
voiceless /p, t, k, f, h, s, sh, ch/ and twelve voiced /b, d, g, v, z, j, w, r, l,
y, m, n/ consonants.
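A voicing matrix of this kind can be sketched as a 2 x 2 table of stimulus category against response category. The names voicing and voicing_matrix, and the sample trials, are hypothetical and only meant to show the bookkeeping.

```python
VOICELESS = {"p", "t", "k", "f", "h", "s", "sh", "ch"}
VOICED = {"b", "d", "g", "v", "z", "j", "w", "r", "l", "y", "m", "n"}

def voicing(consonant):
    """Return the voicing category of one of the 20 test consonants."""
    if consonant in VOICELESS:
        return "voiceless"
    if consonant in VOICED:
        return "voiced"
    raise ValueError(f"not one of the 20 test consonants: {consonant!r}")

def voicing_matrix(trials):
    """Count (stimulus, response) pairs by voicing; rows are stimuli."""
    cats = ("voiceless", "voiced")
    matrix = {row: {col: 0 for col in cats} for row in cats}
    for stim, resp in trials:
        matrix[voicing(stim)][voicing(resp)] += 1
    return matrix

if __name__ == "__main__":
    # Made-up responses: [p] heard as [b] is a voicing error,
    # [p] heard as [t] is wrong overall but correct for voicing.
    sample = [("p", "b"), ("p", "t"), ("b", "m"), ("s", "z")]
    print(voicing_matrix(sample))
```

A response can be wrong as a consonant yet still fall on the diagonal here, which is what makes the category view more informative than the raw percent correct.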
Manner of articulation
The categories I use assign the consonants to the following groups:
1. Stops /p, t, k, b, d, g/
2. Fricatives /f, v, h/
3. Sibilants /s, z, sh/
4. Affricates /ch, j/
5. Semi-vowels /w, r, l, y/
6. Nasals /m, n/
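The six manner groups listed above can be turned into a confusion matrix in much the same way. In this sketch the names MANNER, manner_matrix and percent_manner_correct are illustrative, and the sample trials are invented.

```python
# Manner-of-articulation groups exactly as listed in the text.
MANNER = {
    "stop": ["p", "t", "k", "b", "d", "g"],
    "fricative": ["f", "v", "h"],
    "sibilant": ["s", "z", "sh"],
    "affricate": ["ch", "j"],
    "semi-vowel": ["w", "r", "l", "y"],
    "nasal": ["m", "n"],
}
# Reverse lookup: consonant -> manner category.
CATEGORY_OF = {c: cat for cat, members in MANNER.items() for c in members}

def manner_matrix(trials):
    """Rows are stimulus categories, columns are response categories."""
    cats = list(MANNER)
    matrix = {row: {col: 0 for col in cats} for row in cats}
    for stim, resp in trials:
        matrix[CATEGORY_OF[stim]][CATEGORY_OF[resp]] += 1
    return matrix

def percent_manner_correct(matrix):
    """Percent of responses that landed in the correct manner category."""
    total = sum(sum(row.values()) for row in matrix.values())
    on_diagonal = sum(matrix[cat][cat] for cat in matrix)
    return 100.0 * on_diagonal / total

if __name__ == "__main__":
    sample = [("p", "t"), ("m", "n"), ("s", "f"), ("ch", "j")]
    print(percent_manner_correct(manner_matrix(sample)))
```

Off-diagonal cells show which manner categories are being confused with one another, for example sibilants heard as fricatives.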
Place of articulation
The place of articulation categories that I use are:
1. Bilabials /p, b, m/
2. Labio-dentals /f, v/
3. Alveolars /t, d, n, s, z, l/
4. Post-alveolars /sh, ch, j, r/
5. Palatals /w, y/
6. Velars /k, g/
7. Glottal /h/
In looking at V and AV performance, however, this seems to be an
incomplete set, due to the distinct lip-rounding that accompanies the
production of the semi-vowels [w] and [r]. As a result, I sometimes use a
revised set that includes this “visual place of articulation,” which I call
“rounded lips,” as a separate category.
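One way to represent the two place-of-articulation sets is shown below. The text does not spell out exactly how the revised set is built, so this sketch assumes that [w] and [r] are simply moved out of their original categories into a new "rounded lips" category; the function name visual_place is hypothetical.

```python
# Place-of-articulation groups exactly as listed in the text.
PLACE = {
    "bilabial": ["p", "b", "m"],
    "labio-dental": ["f", "v"],
    "alveolar": ["t", "d", "n", "s", "z", "l"],
    "post-alveolar": ["sh", "ch", "j", "r"],
    "palatal": ["w", "y"],
    "velar": ["k", "g"],
    "glottal": ["h"],
}

def visual_place(place=PLACE, rounded=("w", "r")):
    """Return a revised set with a separate 'rounded lips' category.

    Assumption: the rounded consonants are removed from their original
    categories so that each consonant still belongs to exactly one group.
    """
    revised = {cat: [c for c in members if c not in rounded]
               for cat, members in place.items()}
    revised["rounded lips"] = list(rounded)
    return revised

if __name__ == "__main__":
    for category, members in visual_place().items():
        print(category, members)
```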
Using confusion matrices
I have been using confusion matrices for many years, and have always
regarded them as an excellent way in which to present and interpret data.
When I speak to therapists about confusion matrices, however, I realize
that many clinicians regard their use as being restricted to research
studies. In my opinion, nothing could be further from the truth, and I
hope that the following explanation will make them more accessible to
clinicians.
On the next page, you’ll find two confusion matrices showing the manner
of articulation performance of an implant user. This subject had been
deaf for over 50 years before he finally obtained an implant, and, in
common with most people who knew his history, I wondered how much
benefit he would obtain after such a long time without hearing.
I presented five lists of consonants in /aCa/ for two conditions –
lipreading only and lipreading + the cochlear implant. The subject’s overall
scores were 49.7% and 75.5% correct respectively. When I plotted the
subject’s responses onto confusion matrices, some distinct patterns