       Single End-Point DTW Algorithm for Keyword Spotting

       Yong-Sun Choi and Soo-Young Lee

           Brain Science Research Center and
Department of Electrical Engineering and Computer Science
   Korea Advanced Institute of Science and Technology


                                                            1
                      Contents                    2




• Keyword Spotting
  – Meaning & Necessity
  – Problems
• Dynamic Time Warping (DTW)
  – Advantages of DTW
  – Some conventional types & Proposed DTW type
• Experimental Results
  – Verification of proposed DTW performance
  – Standard threshold setting
  – Results of various conditions
• Conclusions
              Keyword Spotting                                   3




• Meaning
  – Detection of predefined keywords in continuous speech
  – Example)
      • Keywords : ‘open’, ‘window’
      • Input : “um…okay, uh… please open the…uh…window”



• Necessity
  – Speakers may use out-of-vocabulary (OOV) words and sometimes stammer
  – But the machine needs only a few specific words for recognition
             Problems & Goal                     4




• Difficulties
  – In processing
     • End-point detection (EPD) of speech segments
     • Rejection of OOVs
  – In implementation
     • Heavy computational load
     • Complex algorithms
     • Hard to build a real hardware system
• Goal
  – Simple & Fast Algorithm
   DTW for Keyword Spotting                                         5




• Hidden Markov Model (HMM)
  – A statistical model : needs a large amount of training data
  – Complex algorithm : hard to implement in a hardware system
  – Many parameters : can cause memory problems
• Dynamic Time Warping (DTW)
  – Advantages
      • Small amount of training data
      • Simple algorithm (addition & multiplication)
      • Small amount of stored data
  – Weak points
      • Needs an EPD process and many calculations
         General DTW Process   6




• Both end points are known
• Repeated searches
• Finding corresponding
  frames
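
As a point of reference, the general process above can be sketched as a textbook DTW over two whole patterns whose start and end frames are both fixed. This is a minimal sketch, not the authors' implementation: the Euclidean frame distance, the symmetric three-way step pattern, and the names `frame_distance` and `dtw` are all assumptions made for illustration.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature vectors (assumed local distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(ref, test):
    """Textbook DTW with both end points known.

    Returns (accumulated distance, list of corresponding (ref, test) frame pairs).
    """
    n, m = len(ref), len(test)
    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]    # accumulated distances
    back = [[None] * m for _ in range(n)]  # predecessor of each grid point
    for i in range(n):
        for j in range(m):
            d = frame_distance(ref[i], test[j])
            if i == 0 and j == 0:
                acc[i][j] = d
                continue
            # Assumed step pattern: diagonal, vertical, horizontal predecessors.
            candidates = []
            if i > 0 and j > 0:
                candidates.append((acc[i - 1][j - 1], (i - 1, j - 1)))
            if i > 0:
                candidates.append((acc[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((acc[i][j - 1], (i, j - 1)))
            best, prev = min(candidates)
            acc[i][j] = best + d
            back[i][j] = prev
    # Backtrack from the known end point to recover the corresponding frames.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        i, j = path[-1]
        path.append(back[i][j])
    path.reverse()
    return acc[n - 1][m - 1], path

ref = [[0.0], [1.0], [2.0]]
test = [[0.0], [1.0], [1.0], [2.0]]
dist, path = dtw(ref, test)
```

Every grid point is visited, so the cost grows with the product of the two pattern lengths, and the test segment must already have been cut out by an EPD stage.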
                        Advanced DTW         7




• Myers, Rabiner and Rosenberg
   – No EPD Process
   – Series of small area searches
       • Global search in one area
       • Setting next area around the best
         match point of local area
       • Reduces the amount of calculation,
         but the load is still large
   – Tested in isolated word recognition
   Proposal – Shape & Weights       8




• No EPD process
• Only one path
  – Select the best match point
    and search again from that point
  – Fewer computations
• Modifying weights
  – To compensate for weight-sum
    differences
      • For search
      • For distance accumulation
               Proposal – End Point   9




• Small search area
   – Successive local searches
   – Start search at one point


• End condition
   – When the path reaches the
     last frame of the reference pattern
   – The end point is set
     automatically
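
The single-path idea on the last two slides can be illustrated with a greedy walk: start at one point, pick the best match among a few neighbouring points, continue from there, and stop once the last reference frame is reached. The sketch below is written under stated assumptions and is not the authors' algorithm: the move set `MOVES`, the unweighted Euclidean distance, the mean-distance score, and the function names are all hypothetical, and the slides' modified weights are not reproduced.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature vectors (assumed local distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical local moves as (reference step, test step). Every move advances
# the reference by one frame; the actual search shape and weights are not
# given in the slides.
MOVES = ((1, 1), (1, 2), (1, 0))

def single_path_search(ref, test, start):
    """Follow one greedy path from (reference frame 0, test frame `start`).

    The walk ends when it reaches the last reference frame; the test index at
    that moment is taken as the end point. Returns (mean distance, end index).
    """
    i, j = 0, start
    total = frame_distance(ref[0], test[j])
    while i < len(ref) - 1:
        best = None
        for di, dj in MOVES:
            ni, nj = i + di, j + dj
            if nj >= len(test):
                continue  # this move would leave the input signal
            d = frame_distance(ref[ni], test[nj])
            if best is None or d < best[0]:
                best = (d, ni, nj)
        d, i, j = best  # the (1, 0) move is always available, so best is set
        total += d
    # The path holds exactly one point per reference frame.
    return total / len(ref), j

ref = [[0.0], [1.0], [2.0], [3.0]]
test = [[9.0], [9.0], [0.0], [1.0], [2.0], [3.0], [9.0]]
score, end = single_path_search(ref, test, start=2)
print(score, end)  # 0.0 5
```

With this move set only three frame distances are evaluated per reference frame, so the work grows with the reference length alone rather than with the product of both lengths.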
              Proposal – Distance    10




• Modifying distance
    – Using the difference between pattern
      lengths
    – Pattern lengths of the same word
      are similar to each other


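$$D' = D \cdot \exp\!\left( \frac{R_E - T_E}{R_E \,\square\, 1} \right)$$
(□ marks an operator that is illegible in the source; it is also unclear whether the numerator is an absolute value.)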
 DTW – Computation Loads                                  11




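– 3 types (the relation signs are illegible in the source and are shown here as ≈)

   Type 1:  $T_C \approx \frac{1}{K_1} N_C R_E T_E \approx \frac{1}{2} N_C R_E T_E$
   Type 2:  $T_C \approx K_2 N_C R_E \approx (2\varepsilon + 1) N_C R_E \approx 9 N_C R_E$
   Type 3:  $T_C \approx \frac{1}{K_3} N_C R_E \approx \frac{5}{8} N_C R_E$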
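
To get a feel for the gap between the three loads, the expressions can be evaluated for one pattern size. The sketch below reads them as (1/2)·N_C·R_E·T_E, (2ε+1)·N_C·R_E with ε = 4 (which gives the factor 9), and (5/8)·N_C·R_E. Mapping them, in slide order, to the general, local-area, and proposed single-path searches is an assumption, as are the function names and the 40-frame example; N_C is treated simply as a common factor.

```python
def general_load(n_c, r_e, t_e):
    """Type 1: (1/2) * N_C * R_E * T_E."""
    return 0.5 * n_c * r_e * t_e

def local_area_load(n_c, r_e, eps=4):
    """Type 2: (2*eps + 1) * N_C * R_E; eps = 4 gives the factor 9."""
    return (2 * eps + 1) * n_c * r_e

def single_path_load(n_c, r_e):
    """Type 3: (5/8) * N_C * R_E."""
    return 5 / 8 * n_c * r_e

# Hypothetical sizes: 40-frame reference and test patterns, N_C = 1.
n_c, r_e, t_e = 1, 40, 40
loads = (general_load(n_c, r_e, t_e),
         local_area_load(n_c, r_e),
         single_path_load(n_c, r_e))
print(loads)  # (800.0, 360, 25.0)
```

For these sizes the third expression is 32 times smaller than the first and 14.4 times smaller than the second; the first ratio grows with the test-pattern length, since only Type 1 depends on T_E.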
           Data Base & EX-SET                                         12




• DB
  – RoadRally
       • For keyword spotting
       • Based on telephone channel
  – Usage
       • 11 keywords (434 occurrences in total)
       • Read speech by 40 male speakers (47 min. in total), Stonehenge portion

• Set construction
  – 4 subsets (about 108 keywords per subset)
  – 3 subsets for training, 1 subset for testing
  – 2 reference patterns per keyword per subset
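
The set construction amounts to a four-way rotation in which each subset is held out once for testing while the other three supply the training material. A minimal sketch, assuming the rotation covers all four subsets; the function name and the placeholder item labels are hypothetical.

```python
def rotation_splits(subsets):
    """Yield (training_items, test_items), holding each subset out once."""
    for held_out in range(len(subsets)):
        train = [item
                 for k, subset in enumerate(subsets) if k != held_out
                 for item in subset]
        yield train, list(subsets[held_out])

# Placeholder labels standing in for keyword occurrences in each subset.
subsets = [["a1", "a2"], ["b1"], ["c1"], ["d1"]]
splits = list(rotation_splits(subsets))
```

Each element of `splits` pairs the pooled items of three subsets with the one subset left out.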
               Verification Result                       13




• Isolated Word Recognition
  – 3 sets for training, 1 set for testing

        Test                 Recognition Rate (%)
        Set          General DTW          Proposed DTW
          1                 96.3              98.2
          2                100.0              99.1
          3                 96.3              95.4
          4                 97.2              97.2
        Avg.                97.5              97.5
              Experimental Setup                           14




• Assumption
   – Any frame can be the last frame of keywords


• Threshold
   – To reject OOV
   – 1 threshold per reference pattern
   – Standard threshold : no false alarm in training set


• Result presentation
   – ROC (Receiver Operating Characteristic)
       • X-axis : false alarm / hour / keyword
       • Y-axis : recognition rate
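
One way to read the threshold rule and the ROC x-axis above is sketched below. It assumes that a match is accepted when its distance falls strictly below the threshold of its reference pattern, and that the standard threshold is the largest value that still yields no false alarm on the training set. The function names and the numbers in the example are hypothetical.

```python
def standard_threshold(false_match_distances):
    """Largest threshold with no training false alarm: the smallest distance
    observed between this reference pattern and non-keyword speech."""
    return min(false_match_distances)

def is_detected(distance, threshold):
    """Accept a match only if its distance is strictly below the threshold."""
    return distance < threshold

def false_alarms_per_hour_per_keyword(n_false_alarms, duration_minutes, n_keywords):
    """ROC x-axis value: false alarms normalised by audio length and keyword count."""
    return n_false_alarms / (duration_minutes / 60.0) / n_keywords

# Hypothetical distances from one reference pattern to non-keyword segments.
threshold = standard_threshold([4.2, 3.1, 5.0])
```

Loosening or tightening each threshold relative to this standard value would then move the operating point along the ROC curve; how the slides sweep it is not stated.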
   Threshold Setting & Recognition Rate of Training Set               15
  • Training set = Test set (No false alarm)

Keyword     Right  Total  Rate (%)     Keyword     Right  Total  Rate (%)
Mountain      21     40     52.5       Primary       34     40     85.0
Secondary     38     40     95.0       Minus         25     39     64.1
Middleton     27     37     73.0       Interstate    37     40     92.5
Boonsboro     32     39     82.1       Waterloo      35     40     87.5
Conway        33     40     82.5       Retrace       36     40     90.0
Thicket       30     39     76.9       Total        368    434     84.8
       Result – DTW & HMM   16




• ROC Curve
    Changing Conditions               17




[Figure panels: No. of Keywords; No. of References]
                        Conclusion                                           18




• Proposed DTW
  – Advantages
      •   Simple structure : addition & multiplication (good for hardware)
      •   No EPD processing
      •   Very small computation load
      •   Small amount of stored data : small memory
            : Only keyword information
  – Good performance


• Keyword Spotting
  – Better than HMM when little training data is available

								