Adapted from an Acoustical Society of America tutorial by Carol Espy-Wilson, firstname.lastname@example.org, Spring 2005

Current applications of speech recognition
  Machinery control: speaker-independent, small vocabulary, high-quality mike, very low error tolerance
  Voice telephone dialing: speaker-independent (digit recognition) and speaker-dependent (name recognition), small vocabulary, low-quality mike, moderate error tolerance
  Human-machine interface via telephone: speaker-independent, moderate vocabulary, low-quality mike, high error tolerance
  Dictation: speaker-dependent, large vocabulary, high-quality mike, high error tolerance

Human speech recognition (HSR) versus ASR
Percentage word error rate (Lippmann, Speech Communication 22, pp 1-15, 1997)

  %            HSR    ASR
  Grammar      0.1    3.6
  Non-grammar  2      17

Reasons for performance gap
  Poor modeling of low-level acoustic-phonetic information
  Poor modeling of speech variability
  Lack of robustness to noise and channel variability
  Inability to deal effectively with disfluencies in spontaneous speech

Major Stumbling Block: Speech variability
  Changes due to recording conditions (background noise, room reverb, mike characteristics)
  Differences in vocal tract length and shape (age, sex)
  Undershoot in articulation (effect of speaking rate and/or style)
  Voice quality changes (breathy to creaky)
  Variable degree of coarticulation (overlapping sounds/words, e.g. “cart”, “seven plus”)
  Idiolect (detailed articulatory habits of a single person)
  Differences in dialect (vowel substitutions)

Chomsky and Halle, Sound Pattern of English, 1968.
  20 or so phonetic features characterizing all of the world’s languages
  Based on position of articulators

Phonetic features come in 3 categories
  1. Manner of articulation features, related to how open the vocal tract is
  2. Place of articulation features, location of the main constriction
  3. Source features, opening of the glottis and vibration of the vocal folds

Formant Map or vowel loops
Peterson and Barney, Journal of the Acoustical Society of America 24, 1952, pp 175-184.
  heed, hid, head, had, hod, hawed, hood, who'd, hud, and heard.
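The HSR versus ASR comparison above is stated in percentage word error rate. As a rough illustration of how such a figure is commonly computed, here is a minimal sketch, assuming the usual definition (substitutions + deletions + insertions, divided by the number of reference words, found by word-level edit distance). The function name word_error_rate and the test sentences are hypothetical and not from the tutorial, and the tutorial does not say which scoring tool was used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent.

    Assumed definition: minimum number of word substitutions, deletions
    and insertions needed to turn the reference into the hypothesis,
    divided by the number of reference words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference words is dropped, so the rate is 25 percent.
    print(word_error_rate("call seven plus two", "call seven two"))
```

Because insertions count as errors, this measure can exceed 100 percent when the hypothesis is much longer than the reference.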