									Query by Pitch

    Jin Yi and Russell Brennan
Introduction
   Input: Sing a snippet of a song
   Output: Name of the song, artist, genre
    etc.
   Marketable: Integrate with online music
    shops
   Useful: Provides a quick, easy solution
    for determining song information
Methodology
   Vocal delivery
       Subject to sing into microphone
   Filtering
       Filter noise with a bandpass filter passing roughly 100-800 Hz
   Pitch Detection
       Calculate difference function to determine
        fundamental frequency
   Segmentation
       Determine discrete pitches throughout signal
Methodology (continued)
   Indexing/Database Building
       Calculate ratios of pitches and pitch durations to previous
        pitches and durations
       Create database of known song ratios for comparison


   Comparison
       Compute second difference function, sliding vocal ratios
        across database ratio windows


   Result
       Song with lowest difference
Bandpass Filter
   Needed for filtering out noise
   Butterworth filter doesn’t have ripple in
    the passband, unlike the Chebyshev
    filter
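The passband behaviour can be checked numerically. The sketch below is only an illustration, not the circuit that was built: it evaluates the textbook magnitude formulas for 4th-order Butterworth and Chebyshev type I low-pass prototypes. The ripple factor `EPS` and the sampled frequencies are assumed values.

```python
import math

def butterworth_gain(x, order):
    """Magnitude of a Butterworth low-pass prototype at normalized frequency x = f/fc."""
    return 1.0 / math.sqrt(1.0 + x ** (2 * order))

def chebyshev1_gain(x, order, eps):
    """Magnitude of a Chebyshev type I low-pass prototype, valid for 0 <= x <= 1 (passband)."""
    t_n = math.cos(order * math.acos(x))  # Chebyshev polynomial T_n(x) on [0, 1]
    return 1.0 / math.sqrt(1.0 + (eps * t_n) ** 2)

ORDER = 4   # 4th order, as in the original design intent
EPS = 0.5   # assumed ripple factor (about 1 dB of passband ripple)

passband = [i / 10 for i in range(11)]  # x = 0.0, 0.1, ..., 1.0
butter = [butterworth_gain(x, ORDER) for x in passband]
cheby = [chebyshev1_gain(x, ORDER, EPS) for x in passband]

for x, b, c in zip(passband, butter, cheby):
    print(f"f/fc={x:.1f}  Butterworth={b:.3f}  Chebyshev={c:.3f}")
```

The Butterworth column falls monotonically toward the cutoff, while the Chebyshev column rises and falls inside the passband. That ripple is what the Butterworth choice avoids.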
Bandpass Filter (as originally intended: 4th-order bandpass filter)
First-Order Bandpass Filter + First-Order Lowpass Filter
   The output signal is too low since voltage is dropped across
    the resistor
   [Figure: first-order low-pass filter and first-order bandpass filter]
Inverting Amplifier
   Added op-amp
   Gain = -R2/R1
Final Circuit for Bandpass Filter
   The bandpass filter cuts off the low frequencies but has a
    long transition band at the high cutoff, so we added
    two more low-pass filters.
   Signal chain: Microphone -> Low-pass filter -> Bandpass filter ->
    Inverting op-amp -> Low-pass filter -> Inverting op-amp -> DSP
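A rough calculation shows why extra low-pass stages shorten the transition band. This is a sketch under assumptions: it uses ideal, buffered, identical first-order RC sections with an 800 Hz cutoff. The stage count of 3 and the resistor values are hypothetical, not measurements of the built circuit.

```python
import math

def lowpass_gain(f_hz, cutoff_hz, stages=1):
    """Gain of `stages` identical, buffered first-order RC low-pass sections."""
    single = 1.0 / math.sqrt(1.0 + (f_hz / cutoff_hz) ** 2)
    return single ** stages

def inverting_gain(r1_ohm, r2_ohm):
    """Ideal inverting op-amp stage: Gain = -R2/R1."""
    return -r2_ohm / r1_ohm

CUTOFF_HZ = 800.0  # upper edge of the intended band

for f in (800.0, 1600.0, 3200.0):
    one = 20 * math.log10(lowpass_gain(f, CUTOFF_HZ, stages=1))
    three = 20 * math.log10(lowpass_gain(f, CUTOFF_HZ, stages=3))
    print(f"{f:6.0f} Hz: 1 stage {one:6.1f} dB, 3 stages {three:6.1f} dB")

# Hypothetical resistor values: R1 = 1 kOhm, R2 = 10 kOhm
print(inverting_gain(1_000, 10_000))
```

Each added section steepens the roll-off above the cutoff. It also attenuates the passband edge further, which is one reason gain stages are needed to restore the signal level.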
Pitch Detection
   Vocal delivery creates a periodic signal in
    short-time…




   It should have a high correlation with itself,
    when shifted one period
Detect the Period
   A squared difference function:

   $d_t(\tau) = \sum_{j=1}^{W} (s_j - s_{j-\tau})^2$

   Can detect the offset, tau = period

   The period will be at a minimum of this difference
    function.
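The period search can be sketched in a few lines. This is an illustration rather than the DSP code that was written: the sample rate, window length, and test tone are assumed values. Because every multiple of the period also gives a minimum, the sketch takes the smallest lag that reaches the minimum, which is also an assumption.

```python
import math

def difference_function(signal, tau, start, window):
    """d(tau): sum over `window` samples of (s[j] - s[j - tau])^2, starting at index `start`."""
    return sum((signal[j] - signal[j - tau]) ** 2
               for j in range(start, start + window))

def detect_period(signal, tau_min, tau_max, window, tol=1e-6):
    """Smallest lag in [tau_min, tau_max] whose difference is within `tol` of the minimum."""
    start = tau_max  # guarantees j - tau >= 0 for every lag tested
    d = {tau: difference_function(signal, tau, start, window)
         for tau in range(tau_min, tau_max + 1)}
    best = min(d.values())
    return min(tau for tau, value in d.items() if value <= best + tol)

FS = 8000    # assumed sample rate in Hz
F0 = 200.0   # test tone inside the 100-800 Hz band
signal = [math.sin(2 * math.pi * F0 * n / FS) for n in range(400)]

# Lags covering 800 Hz down to 100 Hz at this sample rate
period = detect_period(signal, tau_min=FS // 800, tau_max=FS // 100, window=200)
print(period, FS / period)
```

The detected lag converts to a fundamental frequency as sample rate divided by period.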
Segmentation
   Exponentially Weighted Moving Average
    (EWMA)
       EWMA is often used in statistical process control
        to detect shifts in the mean of a process
       The pitch output from the DSP should be smoothed so that
        changes in pitch are easier to detect
       EWMA weights current and past values to create
        a current estimate of a signal average

       A(i)=r * signal(i) + (1-r) * A(i-1)
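A minimal sketch of the recursion follows. Seeding the average with the first sample is an assumption, as are the weight `r = 0.3` and the sample pitch track.

```python
def ewma(signal, r):
    """A(i) = r * signal(i) + (1 - r) * A(i-1), seeded with A(0) = signal(0) (an assumption)."""
    if not signal:
        return []
    averaged = [signal[0]]
    for value in signal[1:]:
        averaged.append(r * value + (1 - r) * averaged[-1])
    return averaged

# Hypothetical pitch track: a steady note, one outlier, then a step to a new note
pitch = [200, 200, 260, 200, 200, 240, 240, 240, 240]
print([round(a, 1) for a in ewma(pitch, r=0.3)])
```

A smaller `r` smooths more heavily but reacts more slowly to a real change of note.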
Segmentation (continued)
   Use EWMA thusly:
       EWMA “smooths” the signal greatly
       Detects shift in pitch by detecting a trend line
       A trend of 4 in a row increasing or decreasing
        indicates a shift in mean


   Can we trust the EWMA?
       Each trend line becomes a mark
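The trend rule can be sketched as a run-length counter. The slides do not say whether "4 in a row" counts samples or steps, or where exactly the mark is placed. This sketch assumes four consecutive steps in the same direction and places the mark where the run began.

```python
def trend_marks(values, run_length=4):
    """Indices where a run of `run_length` consecutive rises (or falls) began."""
    marks = []
    direction = 0  # +1 rising, -1 falling, 0 flat
    count = 0
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        step = (diff > 0) - (diff < 0)
        if step != 0 and step == direction:
            count += 1
        else:
            direction = step
            count = 1 if step != 0 else 0
        if count == run_length:
            marks.append(i - run_length)  # one mark per run, at its first sample
    return marks

# Hypothetical smoothed pitch: flat, four rising steps, flat again
smoothed = [5, 5, 5, 5, 6, 7, 8, 9, 9, 9]
print(trend_marks(smoothed))
```

Longer runs produce only one mark, because the mark is emitted when the count first reaches the run length.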
Segmentation (conclusion)
   By default, a mark is placed at the first and last
    samples of the pitch signal

   Calculate the mean of the pitch signal within each section
    between marks, e.g. 1-25 : 26-39 : 39-50

   If the means of adjacent sections are reasonably close, merge
    them into one section (this happens often)

   Ratios of mean(i-1) / mean(i) are used for
    comparison
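The mean-and-merge step might look like the sketch below. The 3% relative tolerance for "reasonably close", the 0-based end-exclusive mark convention, and the sample data are all assumptions, not values from the project.

```python
def segment_means(pitch, marks):
    """Mean pitch and length of each section between consecutive marks (0-based, end-exclusive)."""
    sections = []
    for start, end in zip(marks, marks[1:]):
        chunk = pitch[start:end]
        sections.append((sum(chunk) / len(chunk), len(chunk)))
    return sections

def merge_close(sections, tolerance=0.03):
    """Merge neighbouring sections whose means differ by less than `tolerance` (relative)."""
    merged = [sections[0]]
    for mean, length in sections[1:]:
        prev_mean, prev_len = merged[-1]
        if abs(mean - prev_mean) / prev_mean < tolerance:
            total = prev_len + length
            merged[-1] = ((prev_mean * prev_len + mean * length) / total, total)
        else:
            merged.append((mean, length))
    return [mean for mean, _ in merged]

# Hypothetical pitch track with marks at the first sample, two trend points, and the end
pitch = [200.0] * 25 + [202.0] * 14 + [150.0] * 11
marks = [0, 25, 39, 50]
means = merge_close(segment_means(pitch, marks))
print(means)
```

Here the first two sections differ by only 1%, so they collapse into one note, weighted by section length.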
Block Diagram for Calculating the Ratio
   Index -> EWMA -> Mark the Pitch -> Calculate Pitch ->
    Combine Close Pitch -> Calculate Ratio
Example of calculating the ratio
   Marks: 1, 36, 111, 168
   Pitches: 214.4, 161.4, 240.0
   Ratios: 161.4/214.4, 240.0/161.4 = 0.753, 1.487
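The ratio step of this example can be reproduced with a short sketch. It follows the order used in the example, dividing each pitch by the one before it. The function name and the transposition factor are illustrative assumptions.

```python
def pitch_ratios(pitches):
    """Ratio of each pitch to the one before it."""
    return [cur / prev for prev, cur in zip(pitches, pitches[1:])]

pitches = [214.4, 161.4, 240.0]  # mean pitches from the example
ratios = pitch_ratios(pitches)
print([round(r, 3) for r in ratios])

# Singing the same tune in a different key scales every pitch by the same factor
transposed = [p * 1.2 for p in pitches]
print([round(r, 3) for r in pitch_ratios(transposed)])
```

Both lines print the same ratios, which is why ratios rather than absolute pitches are stored: the match does not depend on the key the singer chooses.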
Algorithm for Finding the
Right Song
   R1 = Ratio of The Database
   R2 = Ratio of The Current Input
   Difference = (R1 – R2) ^2
   Example: the input ratios R2 = 8 9 0 3 4 2 slide one position at a
    time across the database ratios R1 = 2 3 1 4 5 6 7 8 9 0 3 4 2 1.
    At each offset the squared differences are summed, e.g. at offset 0:
    (2-8)^2 + (3-9)^2 + (1-0)^2 + (4-3)^2 + (5-4)^2 + (6-2)^2 = 91

   Offset   Database window   Sum of squared differences
   0        2 3 1 4 5 6        91
   1        3 1 4 5 6 7       138
   2        1 4 5 6 7 8       153
   3        4 5 6 7 8 9       149
   4        5 6 7 8 9 0       121
   5        6 7 8 9 0 3       125
   6        7 8 9 0 3 4        97
   7        8 9 0 3 4 2         0
   8        9 0 3 4 2 1        97

   Two further comparison windows score 91 and 64.
   Returns the minimum: the comparison result for this song is 0
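The sliding comparison can be sketched as follows. The query and the `song_a` sequence are the integers used in the slide animation. The names `song_a` and `song_b`, the second database entry, and the function names are hypothetical. The actual program was written in C++, so this is only an illustration of the idea.

```python
def sliding_differences(database, query):
    """Sum of squared differences between `query` and every same-length window of `database`."""
    n = len(query)
    return [sum((d - q) ** 2 for d, q in zip(database[i:i + n], query))
            for i in range(len(database) - n + 1)]

def best_song(songs, query):
    """Return (name, score) of the song whose best window has the lowest difference."""
    scores = {name: min(sliding_differences(ratios, query))
              for name, ratios in songs.items() if len(ratios) >= len(query)}
    return min(scores.items(), key=lambda item: item[1])

query = [8, 9, 0, 3, 4, 2]
songs = {
    "song_a": [2, 3, 1, 4, 5, 6, 7, 8, 9, 0, 3, 4, 2, 1],  # sequence from the slides
    "song_b": [5, 5, 1, 7, 2, 2, 6, 3],                     # hypothetical second entry
}
diffs = sliding_differences(songs["song_a"], query)
print(diffs)
print(best_song(songs, query))
```

Each song contributes its minimum window score, and the song with the lowest score is reported as the result.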
In case of a missing pitch
   This doesn’t work when a pitch is missed, since one missing
    pitch will cause two incorrect ratios

                       Pitches          Ratios
   Correct pitches     4 2 5 6 7 8 3    4/2 2/5 5/6 6/7 7/8 8/3 = 2, 0.4, 0.83, 0.86, 0.88, 2.67
   One pitch missing   4 2 5 6 8 3      4/2 2/5 5/6 6/8 8/3 = 2, 0.4, 0.83, 0.75, 2.67

   Missing the 7 replaces 6/7 and 7/8 with the single wrong ratio 6/8,
    leaving the sequence one ratio short
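The effect of a dropped pitch can be checked directly. This sketch uses the pitch sequences from the table and the same ratio order (each pitch divided by the next). The function and variable names are illustrative.

```python
def consecutive_ratios(pitches):
    """Ratio of each pitch to the next one, as in the table (4/2, 2/5, ...)."""
    return [a / b for a, b in zip(pitches, pitches[1:])]

correct = [4, 2, 5, 6, 7, 8, 3]
missing = [4, 2, 5, 6, 8, 3]  # the 7 was not detected

r_correct = consecutive_ratios(correct)
r_missing = consecutive_ratios(missing)
print([round(r, 2) for r in r_correct])
print([round(r, 2) for r in r_missing])

# Ratios of the correct sequence that no longer appear once the 7 is dropped
lost = [r for r in r_correct if r not in r_missing]
print(len(lost))
```

Two correct ratios disappear and one wrong ratio takes their place, so every later ratio is also shifted by one position relative to the database entry.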
Build
   Band-Pass Filter
       Capacitors, inductors and op-amps, as well
        as resistors
   Pitch Detection
       TI 54x DSP dev board
       Code Composer Studio version 1.2
   Serial Transmission of pitch indexes
       Start/Stop signal capabilities
Build (continued)
   Pitch index reception/post-processing
       Programmed as a standalone application in
        C++
       Ability to change song database on-the-fly
Testing
   Band-Pass Filter
       Input a sinusoid and observed the result on
        the oscilloscope.
       Measured the voltage at nodes to debug.
       Output had a -0.6 V offset. Removing this
        offset is not necessary since we are
        detecting only periodicity.
Testing
   Pitch Detector
       Most testing done in Matlab environment
       Sinusoid, swept sinusoid, noisy sinusoid,
        harmonic stack with noise, vocal singing
       From DSP, serial output and memory
        dumps
       Stepwise expectation verification
Testing (continued)
Testing Serial Port
   Tera Term was used to test output of DSP.
   The assembly serial port output function did not
    work (cause not determined), so we used C
    functions written for ECE 420
   ASCII code was interpreted and found to
    correspond to correct pitches. Sending
    characters to the DSP was tested using a DSP
    on/off technique.
Debugging the Software
   In the Unix programming environment, most people use ‘printf’ to
    debug
   In Visual C++ (API), printf cannot be used, so we debugged using
    popup windows
   To view intermediate values or any results, we converted floating point
    numbers or integers to strings for use with popups.
Testing Segmentation
   Segmentation was first tested in Matlab to
    facilitate quick changes
   Test clips of people singing short tunes were
    used
   Parameters such as average weight decay
    and trend length were adjusted
   Finally, the algorithm was integrated into our
    main executable
Discussion (Successes/Failures)
   Vocal extraction (failure)
   Missing pitches
   At least 5-6 pitches are needed
   The program could match some songs
    almost 90 percent of the time
Recommendations
   Missing pitches (curve fitting)
   Duration of pitches
   “De-esser”
   Harmonic search (vocal extraction)
   Double pitch output
References
   A. de Cheveigné and H. Kawahara. YIN, a fundamental frequency
         estimator for speech and music. Journal of the Acoustical Society
         of America, 111(4), 2002.
   Mark Hasegawa-Johnson. Audio Engineering Lecture Notes for ECE
         403. January 20, 2005.
   Robert Morrison, Jason Laska, Douglas Jones. Digital Signal
         Processing Laboratory. Feb. 21, 2005.
         http://cnx.rice.edu/content/col10236/latest/
   Alex Spektor, Personal Communication, Summer 2005

								