					Spectrogram & its reading


   by Tae-Yeoub Jang
What is a spectrogram?
   In use since the 1940s
   Another representation of
    frequency-domain analysis
   The most popular way of representing
    spectral information
   3-dimensional representation
       X-axis: Time
       Y-axis: Frequency
       Darkness (or color): Energy
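
The three dimensions listed above can be sketched in code. This is a minimal illustration only, not the method used to produce the figures in these slides: it assumes a Hamming window and a naive DFT, and the function name spectrogram and its parameters are hypothetical. It returns a list of frame times (X-axis), a list of bin frequencies (Y-axis), and an energy value for each time/frequency cell (darkness).

```python
import cmath
import math


def spectrogram(samples, sample_rate, window_ms, shift_ms):
    """Return (times, freqs, energy); energy[t][k] is the squared DFT magnitude.

    Assumptions for illustration: Hamming window, naive O(N^2) DFT.
    """
    win_len = int(round(sample_rate * window_ms / 1000.0))
    hop = max(1, int(round(sample_rate * shift_ms / 1000.0)))
    # Hamming window coefficients (an assumed, common choice).
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (win_len - 1))
              for n in range(win_len)]
    n_bins = win_len // 2 + 1
    # Y-axis: centre frequency of each DFT bin in Hz.
    freqs = [k * sample_rate / win_len for k in range(n_bins)]
    times, energy = [], []
    for start in range(0, len(samples) - win_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(win_len)]
        row = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                      for n in range(win_len))
            row.append(abs(acc) ** 2)  # energy in this time/frequency cell
        # X-axis: time of the frame centre in seconds.
        times.append((start + win_len / 2) / sample_rate)
        energy.append(row)
    return times, freqs, energy


# Usage: a 1000 Hz tone sampled at 8000 Hz, 10 msec window, 5 msec shift.
tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(400)]
times, freqs, energy = spectrogram(tone, 8000, 10, 5)
peak_bin = max(range(len(freqs)), key=lambda k: energy[0][k])
print(freqs[peak_bin])  # frequency of the darkest cell in the first frame
```

For real recordings a library FFT would be far faster; the naive DFT is used here only to keep the sketch self-contained.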


                      Reviving Sonus       2
Spectrogram example (color display
of the word “compute”)




Spectrogram example (grayscale display
of the word “compute”)




Wideband vs. Narrowband
Spectrograms of the question "Is Pat sad, or mad?" The 5th,
10th and 15th harmonics are marked by white squares
in two of the vowels.




Types of spectrogram
   Wideband spectrogram
       better time resolution
       e.g., 15 msec window, 1 msec shift,
        125 Hz bandwidth
   Narrowband spectrogram
       better frequency resolution
       e.g., 50 msec window, 1 msec shift,
        40 Hz bandwidth
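
The trade-off between the two settings can be made concrete with a little arithmetic. The sketch below assumes a 16000 Hz sampling rate and a 1-second utterance (neither is stated in the slides), and the function name analysis_settings is hypothetical. It reports the window length in samples, the number of frames, and the DFT bin spacing, which equals 1 / window duration. The bandwidth figures quoted above are wider than the bin spacing by a factor that depends on the window shape.

```python
def analysis_settings(sample_rate, window_ms, shift_ms, duration_ms):
    """Derive frame and frequency-grid sizes for one spectrogram setting."""
    win_len = int(round(sample_rate * window_ms / 1000.0))
    hop = int(round(sample_rate * shift_ms / 1000.0))
    n_samples = int(round(sample_rate * duration_ms / 1000.0))
    return {
        "window_samples": win_len,
        "shift_samples": hop,
        # Number of complete windows that fit into the signal.
        "frames": 1 + (n_samples - win_len) // hop,
        # Spacing between adjacent DFT bins in Hz (= 1 / window duration).
        "bin_spacing_hz": sample_rate / win_len,
    }


# Assumed values: 16000 Hz sampling rate, 1000 msec of speech.
wideband = analysis_settings(16000, window_ms=15, shift_ms=1, duration_ms=1000)
narrowband = analysis_settings(16000, window_ms=50, shift_ms=1, duration_ms=1000)
print(wideband)
print(narrowband)
```

The shorter window gives a coarser frequency grid but each frame covers less time, which is why the wideband setting resolves events such as stop bursts better, while the longer window separates individual harmonics.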

Advantages & Disadvantages
   Advantages
       Time alignment


   Disadvantages
       Less reliable than waveform




Vowel Spectrogram
   Formant frequencies are critical cues
    for vowel distinction
   F1: Height
       high vowels: low F1
   F2: Backness
       back vowels: low F2



Example formant frequencies of
English monophthongs


                                                  
F3   2900  2550  2490  2490  2640  2380  2300  2500  2390
F2   2250  1900  1770  1660  1100  1030   870  1500  1190
F1    280   400   550   690   710   450   310   900   640
(values in Hz)
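
The two rules from the vowel slide (low F1 for high vowels, low F2 for back vowels) can be applied directly to the table. The sketch below is only an illustration: the function name compare_vowels is hypothetical, and the two inputs are the F1 and F2 values of the first and fifth columns of the table.

```python
def compare_vowels(a, b):
    """Describe vowel a relative to vowel b from F1 and F2 (in Hz).

    Rule of thumb from the slides: lower F1 means a higher vowel,
    lower F2 means a vowel that is further back.
    """
    def relation(x, y, if_lower, if_higher):
        if x < y:
            return if_lower
        if x > y:
            return if_higher
        return "similar"

    height = relation(a["F1"], b["F1"], "higher", "lower")
    backness = relation(a["F2"], b["F2"], "more back", "more front")
    return height, backness


first_column = {"F1": 280, "F2": 2250}
fifth_column = {"F1": 710, "F2": 1100}
print(compare_vowels(first_column, fifth_column))
```

This treats height and backness as purely relative judgments between two vowels, which is all the two rules support on their own.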




"heed, hid, head, had, hod, hawed, hood, who'd" (a male
speaker, American English)




Consonant Spectrogram
   General
       Acoustic structure more complicated
        than vowels
       Adjacent sounds (especially vowels)
        convey important information → locus
       High-frequency characteristics,
        especially for fricatives and
        affricates

What is a locus?
   Information carried by formant transitions
    from vowels into obstruents, or from
    obstruents into vowels
   The target frequency that each formant
    transition is heading toward as an
    obstruction is made, or the frequency the
    transition comes from as the obstruction is
    released
   Characteristic of the consonant's
    place and manner → roughly the same in
    different vowel contexts

Stops
   General
       Fairly distinct locus for each place
       Burst
       Silence during the closure (only at
        syllable onset position)
       Virtually no difference during the
        closure


Stops (cont'd)
   Voicing distinction
       voiced: vertical striations from
        voicing, less abrupt burst, frequently
        weakened to resemble fricatives or
        approximants
       voiceless: generally abrupt burst in
        the higher-frequency region



Stops (cont'd)
   Place distinction
       bilabial
           relatively low F2, F3 locus → rising into and falling
            out of the vowel
           weak and spread vertical lines
       alveolar
           F2 locus about 1800 Hz
           strong vertical lines
       velar
           velar pinch: the vowel's F2 and F3 merge
           often double burst
           long formant transitions


Stops (cont'd)
   Manner distinction
         Silence duration, VOT, vowel F0

                 silence    VOT       F0
aspirated        short      long      high
tense            long       short     high
lax              medium     medium    low
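
The table above can be read as a lookup from cues to manner. The sketch below simply encodes the table as given; the names STOP_MANNER_CUES and match_manner are hypothetical, and the qualitative labels (short, medium, long, high, low) are taken from the table rather than from measured thresholds.

```python
# Cue table transcribed from the slide: silence duration, VOT, vowel F0.
STOP_MANNER_CUES = {
    "aspirated": {"silence": "short", "vot": "long", "f0": "high"},
    "tense": {"silence": "long", "vot": "short", "f0": "high"},
    "lax": {"silence": "medium", "vot": "medium", "f0": "low"},
}


def match_manner(silence=None, vot=None, f0=None):
    """Return the manners consistent with whichever cues were observed."""
    observed = {"silence": silence, "vot": vot, "f0": f0}
    observed = {cue: value for cue, value in observed.items()
                if value is not None}
    return [manner for manner, cues in STOP_MANNER_CUES.items()
            if all(cues[cue] == value for cue, value in observed.items())]


print(match_manner(f0="high"))               # F0 alone is ambiguous
print(match_manner(f0="high", vot="short"))  # adding VOT narrows it down
```

The usage lines show why the three cues are listed together: a high vowel F0 alone does not separate aspirated from tense stops, but combining it with VOT or silence duration does.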

Examples -- “a bab, a dad, a gag”




Place dependent loci




Fricatives
   General
       Random noise pattern, especially in
        high-frequency regions
       Place distinction
           Labiodental [f, v]: rising locus into the following
            vowel
           Dental [θ, ð]: major energy above 6000 Hz
           Alveolar [s, z]: major energy above 4000 Hz
           Alveopalatal [š, ž]: major energy above 6000 Hz
           Glottal [h]: traces of the formant frequencies of
            neighbouring vowels
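
"Major energy above N Hz" can be checked numerically as the share of spectral energy at or above a cutoff. The sketch below is an illustration under stated assumptions: the function name energy_fraction_above is hypothetical, and the four-point spectrum is made-up toy data, not a measurement. Only the 4000 Hz and 6000 Hz cutoffs come from the list above.

```python
def energy_fraction_above(spectrum, cutoff_hz):
    """Fraction of total energy at or above cutoff_hz.

    spectrum is a list of (frequency_hz, energy) pairs, for example one
    averaged column of a spectrogram taken over the frication noise.
    """
    total = sum(energy for _, energy in spectrum)
    if total == 0:
        return 0.0
    above = sum(energy for freq, energy in spectrum if freq >= cutoff_hz)
    return above / total


# Toy spectrum (invented values, for illustration only).
toy_spectrum = [(1000, 1.0), (3000, 1.0), (5000, 6.0), (7000, 2.0)]
print(energy_fraction_above(toy_spectrum, 4000))
print(energy_fraction_above(toy_spectrum, 6000))
```

A ratio like this is only one possible way to quantify the visual impression of where the dark noise band sits; in practice the judgment is usually made by eye from the spectrogram.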


Fricatives (cont'd)
   Weak vs. strong
       Strong [s, z, š, ž]: darker bands
       Weak [f, v, θ, ð]: spread and fainter
         Voiced [v, ð]: often so weak that they are
          confused with nasals or approximants
         Cue to tell [θ] from [f]: higher formants
          of [θ] fall into adjacent vowels




Example – “fie, thigh, sigh, shy”




Example – “ever, weather, fizzer,
pleasure”




Nasals
   General
       Formants similar to vowels but fainter
       Very low F1 (about 250 Hz), F2 (about
        2500 Hz), and F3 (about 3250 Hz)
   Place distinction
       bilabial [m]: downward F2, F3 locus
       alveolar [n]: smaller F2 transition
       velar [ŋ]: velar pinch


Examples -- “a Pam, a tan, a kang”




Liquids & Approximants
   General
       Formants similar to vowels but fainter
        (especially in high-frequency regions)
       Approximately F1 (250 Hz), F2 (1200 Hz),
        F3 (2400 Hz)
       Change in formant structure




Liquids & Approximants
(cont'd)
   Phone-specific properties
       Labial glide [w]:
         very low F1 and F2 (600-1000 Hz), which
          come very close to each other
         relatively low F3
         rapid falloff of spectral amplitude

       Palatal glide [y]:
         extremely low F1
         extremely high F2, F3



Liquids & Approximants
(cont'd)
   Phone-specific properties (cont'd)
       Flap [ɾ]: soft burst, short duration
       Retroflex [r]:
         F3 dipping down close to F2
         General lowering of F3, F4

       Lateral [l]:
         Low F1, F2 (approx. F1 250 Hz, F2 1200 Hz)
         usually substantial energy in the
          high-frequency region

Example – “led, red, wed, yell”




Final remarks
   The spectrogram is not the only cue for
    the acoustic distinction of speech
    sounds
   Very often, the waveform is more
    reliable




References & Links
   http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
   http://hctv.humnet.ucla.edu/departments/linguistics/VowelsandConsonants/course
   http://www.cs.indiana.edu/~port/teach/306/speech.acoustics.html
   http://www.phon.ucl.ac.uk/courses/spsci/b203/week2-5.pdf





				