Effects of Speech Presentation Mode on Listeners' Perceptions of AL Speakers

Paul M. Evitts, PhD, CCC-SLP, Ami Van Dine, B.A., & Aline Holler, B.A.
Department of Communicative Disorders, West Chester University
Currently, there is limited information on listeners' perceptions of an individual with a laryngectomy (IWL) based on audio-visual stimuli. The purpose of this study is to determine whether listeners have different impressions of an IWL based on mode of alaryngeal (AL) speech (i.e., tracheoesophageal [TE], esophageal [ES], electrolaryngeal [EL], Normal) and mode of presentation (i.e., audio, audio-visual). Thirty-three listeners were randomly presented with a standard reading passage produced by one of four speakers (three AL, one laryngeal speaker) in both audio-only and audio-visual presentation mode. Based on mode of speech, results suggest differences in measures of voice quality, overall speech characteristics, and personality. Based on mode of presentation, there were no significant differences among all the measures. Discussion will focus on implications for therapy, limitations of the study, and directions for future research.

Previous studies have shown that listeners have negative perceptions of people with various speech and hearing disorders based on the sound of their speech. In the laryngeal literature, Blood, Mahan, and Hyman (1979) reported that listeners’ perceptions of adults with a voice disorder were more negative than their perceptions of adults without a voice disorder. McKinnon et al. (1986) showed that college students reacted more negatively to speech disorders than to normal speech. Lass, Ruscello, and Lakawicz (1988) also showed that listeners reacted more negatively to speakers with dysarthria than to normal speakers. In the stuttering literature, Susca and Healy (2001) showed that increasing levels of dysfluency were associated with lower perceptual ratings. Together, these studies demonstrate that listeners have negative or less favorable perceptions of a speaker with a voice or speech disorder. These studies, however, are based on laryngeal speakers and may not reflect listeners’ perceptions of an IWL. Although numerous researchers have looked at listeners’ perceptions of the acceptability, naturalness, and voice quality of AL speech, there is limited information on how listeners perceive the actual person rather than the features of the acoustic signal. Evitts, Searl, and Gabel (2004) previously reported that listeners’ perceptions of personality were not significantly different based on mode of speech, but listeners’ perceptions of overall speech characteristics (e.g., how comfortable did you feel listening to this person?) were significantly affected. These results, however, were based on audio-only speech samples and may not accurately reflect the perceptions of listeners when presented with audio-visual stimuli.
Intuitively, it makes sense that listeners may have less than favorable perceptions of someone with a physical difference (e.g., anatomical changes following surgery, visible stoma) or someone who requires additional movements or equipment to produce speech (e.g., injecting air into the esophagus, digital occlusion of the stoma, use of an electrolarynx). Based on previous laryngeal and AL research, it is reasonable to suspect that listeners may have less than favorable perceptions of an IWL. The AL research is replete with studies showing that listeners perceive AL speech to be less acceptable, less natural, and of poorer quality than normal, laryngeal speech. Those findings, coupled with the physical differences and the additional movements or equipment involved when an IWL produces speech, may lead to negative perceptions of an IWL. Furthermore, those negative perceptions may not be evident from audio samples alone.

The purpose of this study is to determine whether listeners have different impressions of a laryngectomized speaker’s personality, overall speech characteristics, and voice quality based on mode of alaryngeal speech and mode of presentation.
Research Questions:
1) Is there a difference in listeners’ perceptions based on mode of presentation (Audio-only vs. AV)?
2) Within mode of speech, is there a difference in listeners’ perceptions based on mode of presentation (Audio-only vs. AV)?
3) Within mode of presentation, is there a difference in listeners’ perceptions based on mode of speech?

Mean Perceptual Ratings*

             Normal    TE      ES        EL        Total   Factor 1  Factor 2  Factor 3
Audio-only   32.12     43.34   43.59     42.21     40.31   52.35     25.27     42.29
AV           27.37**   43.88   52.74**   36.98**   40.24   53.87     26.92     36.73**

* Based on a 100 mm visual analog scale. Neutral rating = 50 mm; lower numbers indicate more positive ratings and higher numbers indicate more negative ratings. ** = Significant at p < .05.


• Three male alaryngeal speakers and one age- and gender-matched normal speaker.
• Audio and audio-visual recordings were made of each speaker producing a standard reading passage (The Grandfather Passage).
• Any acoustic or visual information that related to the sentence production (e.g., injection of air, stomal noise, digital occlusion of stoma) was included.
• In order to reduce any potential bias, each listener was presented with only one mode of speech, presented in both audio-only and audio-visual mode.
• Mode of presentation was randomized, and listeners were presented the reading passage twice as the basis for their perceptions.
• Ratings were obtained using a visual analog scale (10 cm); 0 = positive, 10 = negative.
• Ratings were divided into three categories: items 1-9 = perceptions of voice quality, 10-15 = speech, 16-22 = personality.
• One sample was repeated for each listener in order to determine intra-rater reliability.
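The grouping of the 22 rating items into three categories can be sketched as follows. This is a minimal illustration, not the authors' scoring procedure: it assumes each item is recorded as a mark in millimeters on the 100 mm line and that a category score is the mean of its items. The names `CATEGORIES` and `category_means` and the example marks are hypothetical.

```python
from statistics import mean

# Item ranges as listed in the methods above (1-based, inclusive).
CATEGORIES = {
    "voice_quality": range(1, 10),   # items 1-9
    "speech": range(10, 16),         # items 10-15
    "personality": range(16, 23),    # items 16-22
}


def category_means(marks_mm):
    """Average one listener's 22 item marks (0-100 mm) within each category.

    Assumption: a category score is the plain mean of its items.
    """
    if len(marks_mm) != 22:
        raise ValueError("expected 22 item ratings")
    if any(not 0 <= m <= 100 for m in marks_mm):
        raise ValueError("marks must lie on the 0-100 mm line")
    return {
        name: mean(marks_mm[i - 1] for i in items)
        for name, items in CATEGORIES.items()
    }


# Hypothetical marks for illustration only; not data from the study.
example = [30] * 9 + [50] * 6 + [70] * 7
scores = category_means(example)
```

With these made-up marks, the voice quality score falls on the positive side of the 50 mm neutral point, the speech score is neutral, and the personality score falls on the negative side.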

Rating Factors: Significant correlations between presentation mode and factor (r = .360-.74, p < .001). A paired-samples t-test showed a significant difference between audio-only and AV stimuli within Factor 3 (personality; t [255] = .986, p < .01). There was no difference between modes for Factors 1 and 2.

RQ 1) Results suggest that, overall, listeners had similar impressions of speakers when presented with AV and Audio-only speech stimuli and that these impressions were rated as ‘neutral’.

RQ 2) Results suggest that visual information improves listeners’ perceptions of the Normal and EL speakers. However, visual information decreased listeners’ impressions of the ES speaker. This may be related to the degree of deviancy from normal, laryngeal speech. That is, speech that is further from normal (e.g., requiring injection of air) is associated with more negative perceptions. EL speech, however, may be considered completely distinct from laryngeal speech and perceived in a different category altogether.

RQ 3) There were significant differences among speech modes for AV stimuli but not for Audio-only stimuli. This suggests that, based on Audio-only information, listeners may have similar perceptions among the modes of speech, including normal speech. Within AV mode, however, the inclusion of visual information alters listeners’ perceptions of the speaker. Future research could address why listeners perceived the EL speaker similarly to the Normal speaker. This line of research could also help determine specifically what visual information made listeners perceive the TE and ES speakers more negatively.

Rating Factors: Correlations showed that listeners rated AV and Audio-only in a similar fashion across Factors (that is, if AV was rated high, then Audio-only was rated high). Although the modes were rated similarly, a significant difference between AV and Audio-only was shown for Factor 3 (personality).

LIMITATIONS: It is difficult to generalize the results to all AL speakers due to the small number of speakers and differences in baseline intelligibility. Also, the listeners were primarily college-aged females, which may not represent the perceptions of the peers of an IWL.

Preliminary Analysis: Data from 33 listeners were included; two listeners were excluded due to poor reliability ratings (r = .419 and .398, p > .05).

RQ 1) An independent-samples t-test showed there was no significant difference between modes of presentation (AV vs. Audio-only) across all modes of speech (t [1641] = -0.573, p = .567).

RQ 2) Paired-samples t-tests showed significant differences between modes of presentation for Normal, ES, and EL (p < .05).

RQ 3) A one-way ANOVA showed there were significant differences in listener perceptions among modes of speech within AV mode (F[3, 35] = 5.31, p < .01) but not within Audio-only mode (F[3, 35] = 1.04, p = .389). Post hoc analysis of the AV data showed that Normal speech had significantly more positive ratings than the TE and ES modes of speech.
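For readers unfamiliar with the paired-samples comparison used for RQ 2, the test statistic can be sketched as below. This is an illustrative computation under stated assumptions, not the authors' analysis: the function name `paired_t` is hypothetical, the ratings are invented, and only the t statistic and degrees of freedom are computed (obtaining a p value would require a statistics package or a t table).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the per-pair differences (e.g., Audio-only minus AV).
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / sqrt(n))
    return t, n - 1


# Hypothetical ratings in mm for illustration only; not data from the study.
audio_only = [40, 52, 47, 35, 60]
audio_visual = [36, 50, 41, 33, 55]
t, df = paired_t(audio_only, audio_visual)
```

Because lower numbers on the scale indicate more positive ratings, a positive t here (Audio-only minus AV) would correspond to more positive ratings in the AV condition for these invented values.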

