Spectrogram & its reading
by Tae-Yeoub Jang
What is a spectrogram?
In use since the 1940s
Another representation of frequency-domain analysis
The most popular way of representing
spectral information
Three-dimensional representation
X-axis: Time
Y-axis: Frequency
Darkness (or color): Energy
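The three dimensions listed above can be made concrete with a short sketch. The function below is an illustration under stated assumptions, not something taken from the slides: it uses NumPy, a Hamming window, and hypothetical parameter names (`window_ms`, `shift_ms`) to turn a signal into a frequency-by-time grid of energy values in dB.

```python
import numpy as np

def spectrogram(signal, sample_rate, window_ms=15.0, shift_ms=1.0):
    """Return (times_s, freqs_hz, energy_db) for a 1-D signal.

    energy_db has one row per frequency bin (Y-axis) and one column
    per analysis frame (X-axis); the values are what a display would
    render as darkness or color.
    """
    signal = np.asarray(signal, dtype=float)
    win_len = int(round(sample_rate * window_ms / 1000.0))
    hop = max(1, int(round(sample_rate * shift_ms / 1000.0)))
    window = np.hamming(win_len)  # window shape is an assumption

    # Slice the signal into overlapping, windowed frames.
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])

    # Power spectrum of each frame, converted to dB.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    energy_db = 10.0 * np.log10(power + 1e-12)

    times = (np.arange(n_frames) * hop + win_len / 2.0) / sample_rate
    freqs = np.fft.rfftfreq(win_len, d=1.0 / sample_rate)
    return times, freqs, energy_db.T

# Example: a synthetic 1000 Hz tone sampled at 8 kHz (made-up test signal).
fs = 8000
tone = np.sin(2 * np.pi * 1000 * np.arange(800) / fs)
times, freqs, energy = spectrogram(tone, fs)
print(energy.shape, freqs[np.argmax(energy[:, 0])])
```

Plotting `energy` as an image with `times` on the horizontal axis and `freqs` on the vertical axis gives the familiar display.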
Reviving Sonus 2
Spectrogram example (color
rendering of the word “compute”)
Spectrogram example (grayscale
rendering of the word “compute”)
Wideband vs. Narrowband
spectrograms of the question "Is Pat sad, or mad?" The 5th,
10th and 15th harmonics have been marked by white squares
in two of the vowels
Types of spectrogram
Wideband spectrogram
better time resolution
e.g., 15 ms window, 1 ms shift,
125 Hz bandwidth
Narrowband spectrogram
better frequency resolution
e.g., 50 ms window, 1 ms shift,
40 Hz bandwidth
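The trade-off between the two types comes down to window length. The sketch below converts the window and shift durations quoted above into samples and gives a rough analysis bandwidth. The 16 kHz sampling rate, the function name, and the factor of 2 in the bandwidth estimate are all assumptions made for illustration; the factor depends on the window shape, and with this choice the estimate comes out at about 133 Hz and 40 Hz, close to the quoted figures.

```python
def analysis_settings(window_ms, shift_ms, sample_rate=16000,
                      bandwidth_factor=2.0):
    """Convert window/shift durations to samples and estimate bandwidth.

    bandwidth_factor / window duration is an assumed rule of thumb,
    not a formula given in the slides.
    """
    return {
        "window_samples": round(sample_rate * window_ms / 1000),
        "shift_samples": round(sample_rate * shift_ms / 1000),
        "bandwidth_hz": bandwidth_factor * 1000.0 / window_ms,
    }

wideband = analysis_settings(window_ms=15, shift_ms=1)
narrowband = analysis_settings(window_ms=50, shift_ms=1)
print(wideband)
print(narrowband)
```

A shorter window means fewer samples per frame, so events are located more precisely in time but neighbouring harmonics blur together; a longer window does the reverse.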
Advantages & Disadvantages
Advantages
Time alignment
Disadvantages
Less reliable than waveform
Vowel Spectrogram
Formant frequencies are critical cues
for vowel distinction
F1: Height
high vowels: low F1
F2: Backness
back vowels: low F2
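The two cues above can be turned into a rough labelling rule. The sketch below is only an illustration: the function name and all three threshold values are hypothetical choices, not values given in the slides, and real boundaries vary by speaker and language.

```python
def describe_vowel(f1_hz, f2_hz, f1_high_max=400.0, f1_low_min=600.0,
                   f2_front_min=1500.0):
    """Label vowel height from F1 and backness from F2.

    All thresholds are assumed values chosen for illustration.
    """
    # Low F1 corresponds to a high vowel, high F1 to a low vowel.
    if f1_hz < f1_high_max:
        height = "high"
    elif f1_hz > f1_low_min:
        height = "low"
    else:
        height = "mid"
    # Low F2 corresponds to a back vowel.
    backness = "front" if f2_hz >= f2_front_min else "back"
    return height, backness

print(describe_vowel(280, 2250))
print(describe_vowel(710, 1100))
```

A fixed threshold is the simplest possible design; in practice formant values are usually normalised per speaker before comparison.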
Example formant frequencies of
English monophthongs
F3 (Hz)  2900  2550  2490  2490  2640  2380  2300  2500  2390
F2 (Hz)  2250  1900  1770  1660  1100  1030   870  1500  1190
F1 (Hz)   280   400   550   690   710   450   310   900   640
"heed, hid, head, had, hod, hawed, hood, who'd" (a male
speaker, American English)
Consonant Spectrogram
General
Acoustic structure more complicated
than vowels
Adjacent sounds (especially vowels)
convey important information: the locus
High frequency characteristics
especially for fricatives and
affricates
What is a locus?
Information carried by the formant transitions
from vowels into obstruents or from obstruents
into vowels
The target frequency that each formant
transition is heading toward as an
obstruction is made, or the frequency the
transition comes from as the obstruction is
released
Characteristic of the consonant's place
and manner, and roughly the same in
different vowel contexts
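One simple way to picture a locus is to extend a formant transition back to the moment of release. The sketch below fits a straight line to a short formant track and reads off its value at time zero. Treating the transition as linear is an assumption made for illustration, and the function name and sample track are hypothetical, not measurements from the slides.

```python
import numpy as np

def estimate_locus(times_ms, formant_hz):
    """Extrapolate a formant track linearly back to t = 0 (the release).

    times_ms are measured from the release of the obstruction.
    Linear extrapolation is an assumed simplification.
    """
    slope, intercept = np.polyfit(times_ms, formant_hz, 1)
    return float(intercept)

# Made-up F2 track falling into a vowel after a release.
print(estimate_locus([10, 20, 30], [1700, 1600, 1500]))
```

Repeating the estimate for the same consonant before different vowels shows whether the extrapolated values cluster around one frequency.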
Stops
General
Fairly distinct locus for each place
Burst
Silence during the closure (only at
syllable onset position)
Virtually no difference between stops
during the closure
Stops (cntd.)
Voicing distinction
voiced: vertical striations, less abrupt
burst, frequently weakened so that they
resemble fricatives or approximants
voiceless: generally an abrupt burst in
the higher-frequency region
Stops (cntd.)
Place distinction
bilabial
relatively low F2 and F3 loci, with transitions rising
into and falling out of the vowel
weak, spread vertical lines
alveolar
F2 locus at about 1800 Hz
strong vertical lines
velar
velar pinch: the vowel's F2 and F3 merge
often a double burst
long formant transitions
Stops (cntd.)
Manner distinction
Cues: silence duration, VOT, vowel F0
            silence   VOT     F0
aspirated   short     long    high
tense       long      short   high
lax         med       med     low
Examples -- “a bab, a dad, a gag”
Place-dependent loci
Fricatives
General
Random noise pattern especially in high
frequency regions
Place distinction
Labiodental [f, v]: rising locus into the following
vowel
Dental [θ, ð]: major energy above 6000 Hz
Alveolar [s, z]: major energy above 4000 Hz
Alveopalatal [š, ž]: major energy above 6000 Hz
Glottal [h]: traces of the formant frequencies of
neighbouring vowels
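Statements such as "major energy above 4000 Hz" can be checked numerically by measuring how much of a segment's spectral energy lies above a cutoff. The sketch below is a minimal illustration using NumPy; the function name and the synthetic test tones are assumptions, and real fricative noise would be analysed over a windowed segment rather than a pure tone.

```python
import numpy as np

def energy_fraction_above(signal, sample_rate, cutoff_hz):
    """Return the fraction of spectral energy at or above cutoff_hz."""
    signal = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = power.sum()
    if total == 0.0:
        return 0.0  # silent segment: no energy anywhere
    return float(power[freqs >= cutoff_hz].sum() / total)

# Synthetic stand-ins for a high-frequency and a low-frequency sound.
fs = 16000
t = np.arange(1600) / fs
high = np.sin(2 * np.pi * 6000 * t)
low = np.sin(2 * np.pi * 1000 * t)
print(energy_fraction_above(high, fs, 4000))
print(energy_fraction_above(low, fs, 4000))
```

Comparing this fraction at a few cutoffs gives a crude numerical counterpart to the place cues listed above.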
Fricatives (cntd.)
Weak vs. strong
Strong [s, z, š, ž]: darker bands
Weak [f, v, θ, ð]: spread and fainter
Voiced [v, ð]: often so weak that they are
confused with nasals or approximants
Cue to tell [θ] from [f]: higher formants
of [θ] fall into adjacent vowels
Example – “fie, thigh, sigh, shy”
Example – “ever, weather, fizzer,
pleasure”
Nasals
General
Formants similar to vowels but fainter
Very low F1 (about 250Hz), F2 (about
2500Hz), and F3 (about 3250Hz)
Place distinction
bilabial [m]: downward F2, F3 locus
alveolar [n]: smaller F2 transition
velar [ŋ ]: velar pinch
Examples -- “a Pam, a tan, a kang”
Liquids & Approximants
General
Formants similar to vowels but fainter
(especially at high frequency regions)
Approximately F1(250Hz), F2(1200Hz),
F3(2400Hz)
Change in formant structure
Liquids & Approximants
(cntd.)
Phone specific properties
Labial glide [w]:
very low F1 and F2 (600-1000 Hz), which come
very close to each other
relatively low F3
rapid falloff of spectral amplitude
Palatal glide [y]:
extremely low F1
extremely high F2, F3
Liquids & Approximants
(cntd.)
Phone specific properties (cntd.)
Flap [ɾ]: soft burst, short duration
Retroflex [r]:
F3 dipping down close to F2
General lowering of F3, F4
Lateral [l]:
Low F1, F2 (approx. F1 250Hz, F2 1200Hz)
usually substantial energy in the high-frequency
region
Example – “led, red, wed, yell”
Final remarks
The spectrogram is not the only cue for
distinguishing speech sounds
acoustically
Very often, the waveform is more
reliable
References & Links
http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
http://hctv.humnet.ucla.edu/departments/linguistics/VowelsandConsonants/course
http://www.cs.indiana.edu/~port/teach/306/speech.acoustics.html
http://www.phon.ucl.ac.uk/courses/spsci/b203/week2-5.pdf