Signal Subspace Speech Enhancement




                                Page 0 of 47
Presentation Outline

 Introduction

 Principles

 Orthogonal Transforms (KLT Overview)

 Papers Review




                                         Page 1 of 47
Introduction

 Two major classes of speech enhancement

  – By modeling of noise/speech, e.g. HMM
       Highly dependent on speech-signal syntax and noise
        characteristics
  – Based on a transformation, e.g. Spectral Subtraction
       Suffers from musical noise


 Signal Subspace belongs to the second class
  (nonparametric)
                                                      Page 2 of 47
       Schematic Diagram



  Noisy signal (time domain) → Orthogonal Transform → Modifying Coefficients →
  Inverse Transform → Estimated Clean Signal

                                                         Page 3 of 47
   Schematic Diagram

  Noisy signal (time domain) → Framing (overlapping) → Orthogonal Transform →
  Estimating dimensions of subspaces → two orthogonal subspaces:

    – Signal+Noise subspace → gain Gs → estimating the clean signal from the
      Signal+Noise subspace → Inverse Transform → Clean Signal

    – Noise subspace → gain Gn (nulled)

                                                            Page 4 of 47
Principles


 Procedure
   – Estimate the dimension of the signal+noise subspace
     in each frame

   – Estimate clean signal from (S+N) subspace by
     considering some criteria (main part)
       energy of the residual noise
       energy of the signal distortion


   – Nulling the coefficients related to the noise subspace

                                                     Page 5 of 47
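
A minimal sketch of this frame-by-frame procedure in Python (illustrative only: white noise with a known variance sigma_w^2 is assumed, the dimension of the S+N subspace is picked by a simple eigenvalue threshold, and all function and parameter names are our own, not from the papers reviewed here):

import numpy as np

def subspace_enhance_frame(frame, K=32, noise_var=1e-2, mu=1.0):
    """One-frame signal-subspace enhancement sketch (white noise assumed).

    frame     : 1-D noisy speech frame (length assumed much larger than K)
    K         : dimension of the analysis vectors
    noise_var : assumed white-noise variance sigma_w^2
    mu        : Lagrange-multiplier-like gain control
    """
    # Estimate the K x K covariance from overlapping length-K vectors of the frame
    vecs = np.lib.stride_tricks.sliding_window_view(frame, K)
    Rz = vecs.T @ vecs / len(vecs)

    lam, U = np.linalg.eigh(Rz)                 # ascending eigenvalues
    lam, U = lam[::-1], U[:, ::-1]              # sort descending
    lam_y = np.maximum(lam - noise_var, 0.0)    # clean-signal eigenvalue estimates
    M = max(int(np.count_nonzero(lam_y)), 1)    # estimated dimension of the S+N subspace

    g = lam_y[:M] / (lam_y[:M] + mu * noise_var)   # TDC-like gains on principal components
    H = U[:, :M] * g @ U[:, :M].conj().T           # noise-subspace coefficients are nulled

    # Enhance non-overlapping K-vectors of the frame (overlap-add omitted for brevity)
    trimmed = frame[: (len(frame) // K) * K].reshape(-1, K)
    return (trimmed @ H.T).ravel()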
Principles

 Assumptions
   – Noise & speech are uncorrelated

   – Noise is additive & white (whitened)

   – Covariance matrix of the noise in each frame is
     positive definite and close to a Toeplitz matrix

   – Signal is more statistically structured than noise
     process

                                                        Page 6 of 47
Principles

 Key Factor in Signal Subspace method

   – Covariance matrices of the clean signal have some
     zero eigenvalues.

       The improvement in SNR is proportional to the number of
        those zeros.


       Nullifying the coefficients of the noise subspace corresponds
       to that of weak spectral components in spectral subtraction.



                                                            Page 7 of 47
Orthogonal Transforms

 Signal Subspace decomposition can be achieved by
  applying:

   – KLT
       via Eigenvalue Decomposition (ED) of signal covariance
        matrix
       via Singular Value Decomposition (SVD) of data matrix
       SVD approximation by recursive methods

   – DCT as a good approximation to the KLT

   – Walsh, Haar, Sine, Fourier,…
                                                           Page 8 of 47
Orthogonal Transforms:
Karhunen-Loeve Transform (KLT)
 Also known as “Hotelling”, “Principal Component” or
  “Eigenvector" Transform

 Decorrelates the input vector perfectly
   – Processing of one component has no effect on the
     others

 Applications
   – Compression, Pattern Recognition, Classification,
     Image Restoration, Speech Recognition, Speaker
     Recognition,…
                                                   Page 9 of 47
KLT Overview

Let R be the N×N correlation matrix of a random
complex sequence x = (x_1, x_2, ..., x_N)^T, then

    R = E[x x^H] = E[ (x_1, x_2, ..., x_N)^T (x_1^*, x_2^*, ..., x_N^*) ]

where E is the expectation operator and R is a
Hermitian matrix.
                                                          Page 10 of 47
KLT Overview

Let  be N N unitary matrix which diagonalizes R

           
            1     H


          R  
           H


           Diag 1 , 2 ,..., N 
          i , i  1,2,..., N are the eigenvalues of R.

    H
         is called the KLT matrix.
                                                          Page 11 of 47
KLT Overview

 Property of Φ^H:

 Consider the following transform:  y = Φ^H x

 The sequence y is uncorrelated because:

    E[y y^H] = E[Φ^H x x^H Φ] = Φ^H E[x x^H] Φ = Φ^H R Φ = Λ

 ⇒ y has no cross-correlation

                                              Page 12 of 47
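
A quick numerical check of this decorrelation property (illustrative code; numpy's eigendecomposition plays the role of the KLT):

import numpy as np

rng = np.random.default_rng(0)
N = 4
A = rng.standard_normal((N, N))
R = A @ A.T                                  # correlation matrix of x = A u, u white

lam, Phi = np.linalg.eigh(R)                 # columns of Phi are eigenvectors of R

X = A @ rng.standard_normal((N, 100_000))    # each column is a realization of x
Y = Phi.T @ X                                # y = Phi^H x (real case: transpose)
Ry = Y @ Y.T / X.shape[1]                    # sample correlation of y
print(np.round(Ry, 2))                       # approximately diag(lambda_1, ..., lambda_N)
print(np.round(lam, 2))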
KLT Overview

 What is Φ?

    Φ^H R Φ = Λ   ⇒   R Φ = Φ Λ

  where Φ = [φ_1, φ_2, ..., φ_N]

  and φ_i is the i-th column of Φ

    R φ_i = λ_i φ_i ,  i = 1, 2, ..., N

 Thus the φ_i's are the eigenvectors corresponding to the λ_i's
                                                    Page 13 of 47
KLT Overview

 Comments

  – The arrangement of the autocorrelations of y is the same as
    that of the λ_i's

  – KLT can also be based on the covariance matrix

  – Using only the largest eigenvalues, the sequence can be
    reconstructed with negligible error

  – KLT is optimal (best decorrelation and energy compaction)
                                                  Page 14 of 47
KLT Overview

 Difficulties

   – Computational Complexity (no fast algorithm)

   – Dependency on the statistics of the current frame

   – Makes components uncorrelated, not independent

 KLT is utilized as a benchmark in evaluating the
  performance of other transforms.

                                                    Page 15 of 47
Papers Review

1. A Signal Subspace Approach for S.E. [Ephraim 95]

2. On S.E. Algorithms based on Signal Subspace Methods [Hansen]

3. Extension of the Signal Subspace S.E. Approach to Colored Noise
   [Ephraim]
4. An Adaptive KLT Approach for S.E. [Gazor]

5. Incorporating the Human Hearing Properties in Signal Subspace
   Approach for S.E. [Jabloun]

6. An Energy-Constrained Signal Subspace Method for S.E. [Huang]

7. S.E. Based on the Subspace Method [Asano]
                                                           Page 16 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 Principle
   – Decompose the input vector of the noisy signal into a
     signal+noise subspace and a noise subspace by
     applying KLT

 Enhancement Procedure
   – Removing the noise subspace
   – Estimating the clean signal from S+N subspace
   – Two linear estimators by considering:
       Signal distortion
       Residual noise energy
                                                    Page 17 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 Notes
   – Keeping the residual noise below some threshold to
     avoid producing musical noise

   – Since DFT & KLT are related, SS is a particular case
     of this method

   – If the number of basis vectors (in the linear combination
     representing a vector) is less than the dimension of the vector,
     then its correlation matrix has some zero eigenvalues

                                                       Page 18 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 Basics
   – Speech signal: z = y + w , K-dimensional

   – y = Σ_{m=1}^{M} s_m V_m ,   M ≤ K

     s_1, ..., s_M are zero-mean complex random variables

   – y = V s
   – If M = K, the representation is always possible.
   – Otherwise the "damped complex sinusoid model" can be used.
   – span(V) produces all vectors y                   Page 19 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 When M < K, all vectors y lie in a subspace of R^K
  spanned by the columns of V
    → SIGNAL+NOISE SUBSPACE

 Covariance matrix of the clean signal y:

    y = V s
    R_y = E[y y^#] = V R_s V^#      (dimensions: K×M · M×M · M×K)

    Rank(R_y) = M
    → R_y has K−M zero eigenvalues
                                                      Page 20 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 Covariance matrix of the noise w (K-dimensional):

    R_w = E[w w^#] = σ_w² I

    Rank(R_w) = K

   – White noise vectors fill the entire Euclidean space R^K

   – Thus the noise exists in both S+N subspace and
     complementary subspace
      NOISE SUBSPACE
                                                    Page 21 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 The discussion indicates that Euclidean space of the
  noisy signal is composed of a signal subspace and a
  complementary noise subspace

 This decomposition can be performed by applying KLT to
  the noisy signal :

 Let z = V s + w
 The covariance matrix of z is:

    R_z = E[z z^#] = V R_s V^# + R_w

                                                   Page 22 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 Noise is additive ⇒ R_z = R_y + R_w

 Let R_z = U Λ_z U^# be the eigendecomposition of R_z

 Where U = [u_1, ..., u_K] are the eigenvectors of R_z and
  Λ_z = diag(λ_z(1), ..., λ_z(K))

 The eigenvalues of R_w are all σ_w²

    λ_z(k) = λ_y(k) + σ_w²   if k = 1, ..., M
           = σ_w²            if k = M+1, ..., K        Page 23 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]
 Estimating the dimension M of the signal subspace

    U = [u_1, ..., u_K]
    Let U = [U_1, U_2]

    U_1 = { u_k : λ_z(k) > σ_w² } : principal eigenvectors

 Because span(U_1) = span(V), U_1 U_1^# is the
  orthogonal projector onto the S+N subspace
                                                       Page 24 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]

 Thus a vector z of the noisy signal can be decomposed as

    U U^# = I   ⇒   U_1 U_1^# + U_2 U_2^# = I

    z = U_1 U_1^# z + U_2 U_2^# z

 U_1^# is the Karhunen-Loeve Transform matrix.

 The vector U_2 U_2^# z does not contain signal information
  and can be nulled when estimating the clean signal.

 However, M (the dimension of the S+N subspace) must be calculated
  precisely
                                                     Page 25 of 47
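
A brief sketch of this subspace split, assuming white noise with a known variance sigma_w^2; the function name and the thresholding rule are illustrative, not from [Ephraim 95]:

import numpy as np

def split_subspaces(Rz, z, noise_var):
    """Decompose a noisy vector z into its S+N-subspace and noise-subspace parts."""
    lam, U = np.linalg.eigh(Rz)              # eigendecomposition of the noisy covariance
    principal = lam > noise_var              # eigenvalues above sigma_w^2 span the S+N subspace
    U1, U2 = U[:, principal], U[:, ~principal]
    z_signal = U1 @ (U1.conj().T @ z)        # U1 U1^# z : kept for clean-signal estimation
    z_noise = U2 @ (U2.conj().T @ z)         # U2 U2^# z : carries no signal, can be nulled
    return z_signal, z_noise                 # z_signal + z_noise == z (up to rounding)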
A Signal Subspace Approach for S.E.
[Ephraim 95]

 Linear Estimation of the clean signal

  – Time Domain Constrained Estimator
      Minimize signal distortion while constraining the energy of
       residual noise in every frame below a given threshold


  – Spectral Domain Constrained Estimator
      Minimize signal distortion while constraining the energy of
       residual noise in each spectral component below a given
       threshold


                                                             Page 26 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]

 Time Domain Constrained Estimator

   – Having z = y + w,
     let ŷ = H z be a linear estimator of y,
     where H is a K×K matrix

   – The residual signal is

       r = ŷ − y = (H − I) y + H w = r_y + r_w

     representing the signal distortion and residual noise, respectively

                                                             Page 27 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]

 Defining the criterion:

    r_y = (H − I) y       Energy:  ε_y² = tr E[r_y r_y^#]
    r_w = H w             Energy:  ε_w² = tr E[r_w r_w^#]

 Solving:

    min_H  ε_y²

    subject to:  (1/K) ε_w² ≤ α σ_w² ,   0 ≤ α ≤ M/K

   Minimize signal distortion while constraining the energy of
   residual noise in the entire frame below a given threshold

                                                                  Page 28 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]

 After solving the Constrained minimization by „„Kuhn-
  Tucker‟‟ necessary conditions we obtain

                 
    H_TDC = R_y (R_y + μ σ_w² I)^{-1}

 Where μ is the Lagrange multiplier, which must satisfy

    (1/K) tr{ R_y² (R_y + μ σ_w² I)^{-2} } = α

 Eigendecomposition of H_TDC:

    H_TDC = U [ G  0 ; 0  0 ] U^#
                                                          Page 29 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]

 In order to null the noisy components:

    H_TDC = U [ G  0 ; 0  0 ] U^#

    G = Λ_y (Λ_y + μ σ_w² I)^{-1}

    ⇒ H_TDC = U_1 G U_1^#

 If α ≥ α_max (= M/K), then H_TDC = I, which means minimum
 distortion and maximum residual noise
                                                Page 30 of 47
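
A compact sketch of forming H_TDC from an estimated noisy covariance, assuming white noise and a given Lagrange multiplier mu (names are illustrative):

import numpy as np

def h_tdc(Rz, noise_var, mu):
    """Time-domain-constrained gain matrix H_TDC = U1 G U1^# (sketch)."""
    lam_z, U = np.linalg.eigh(Rz)
    lam_y = np.maximum(lam_z - noise_var, 0.0)        # estimated clean-signal eigenvalues
    keep = lam_y > 0                                  # S+N subspace (dimension M)
    g = lam_y[keep] / (lam_y[keep] + mu * noise_var)  # G = Lam_y (Lam_y + mu*sigma_w^2 I)^-1
    U1 = U[:, keep]
    return U1 @ np.diag(g) @ U1.conj().T              # noise-subspace gains are zero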
A Signal Subspace Approach for S.E.
[Ephraim 95]

 Spectral Domain Constrained Estimator
   – Minimize signal distortion while constraining the energy
     of residual noise in each spectral component below a
     given threshold.
 Results:   H = U Q U^#

    Q = diag(q_11, ..., q_KK)

    q_kk = γ_k^{1/2}    k = 1, ..., M
         = 0            k = M+1, ..., K

    γ_k = exp{ −ν σ_w² / λ_y(k) }
                                                     Page 31 of 47
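
A corresponding sketch for the SDC gain matrix, again assuming white noise and a chosen constant nu (illustrative, not the paper's reference implementation):

import numpy as np

def h_sdc(Rz, noise_var, nu):
    """Spectral-domain-constrained gain matrix H = U Q U^# (sketch)."""
    lam_z, U = np.linalg.eigh(Rz)
    lam_y = np.maximum(lam_z - noise_var, 0.0)
    q = np.zeros_like(lam_y)
    keep = lam_y > 0
    # q_kk = gamma_k^(1/2),  gamma_k = exp(-nu * sigma_w^2 / lambda_y(k)),  k = 1..M
    q[keep] = np.exp(-nu * noise_var / lam_y[keep]) ** 0.5
    return U @ np.diag(q) @ U.conj().T                # q_kk = 0 on the noise subspace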
A Signal Subspace Approach for S.E.
[Ephraim 95]

 Notes

   – Most of the computational complexity lies in the
     eigendecomposition of the estimated covariance matrix.

   – The eigendecomposition of the Toeplitz covariance matrix of
     the noisy vector is used as an approximation to the KLT

   – Compromise between a large T for estimating R_z and a
     large K to satisfy M < K, while K·T cannot be too large

                                                    Page 32 of 47
A Signal Subspace Approach for S.E.
[Ephraim 95]

 Implementation Results
   – The improvement in SNR is proportional to K /M

   – The SDC estimator is more powerful than the TDC
     estimator

   – SNR improvements in Signal Subspace and SS are
     similar

   – Subjective Test
        83.9% preferred Signal Subspace over the noisy signal
        98.2% preferred Signal Subspace over SS             Page 33 of 47
On S.E. Algorithms based on Signal
Subspace Methods [Hansen]

 The dimension of the signal subspace is chosen at a point
  with almost equal singular values

 Gain matrices for different estimators:
   – SDC, TDC: less sensitive to errors in the noise estimation
   – MV: lowest residual noise, but musical noise
   – LS: G = I; lowest signal distortion and highest residual
     noise ((M/K)·σ²_noise)

 K/M improvement in SNR

 SDC improves the SNR in the range 0-20 dB
                                                                        Page 34 of 47
Extension of the Signal Subspace S.E.
Approach to Colored Noise [Ephraim]

 Whitening approach is not desirable for SDC estimator.
 Obtaining gain matrix H for SDC estimator
    min_H  ε_d²

    subject to:  E[ |v_i^# n|² ] ≤ α_i ,   i = 1, ..., M

    H = R_w^{1/2} U H̃ U^# R_w^{-1/2}

 H̃ is not diagonal when the input noise is colored

 Whitening → orthogonal transformation U^# → modify the
  components by H̃
                                                     Page 35 of 47
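
A generic sketch of the whiten → enhance → de-whiten route that this H expresses, usable with any white-noise gain builder (e.g. the h_tdc sketch above); names and the factorization route are illustrative:

import numpy as np
from scipy.linalg import sqrtm

def enhance_colored(z, Rz, Rw, white_gain):
    """Whiten, apply a white-noise gain matrix, then de-whiten (sketch).

    white_gain(Rz_whitened) must return a K x K gain matrix assuming unit
    noise variance, e.g. lambda R: h_tdc(R, noise_var=1.0, mu=1.0).
    """
    Rw_half = np.real(sqrtm(Rw))                 # Rw^(1/2), Rw assumed positive definite
    Rw_half_inv = np.linalg.inv(Rw_half)         # Rw^(-1/2)
    Rz_w = Rw_half_inv @ Rz @ Rw_half_inv        # covariance of the whitened data
    H_white = white_gain(Rz_w)
    # Overall estimator H = Rw^(1/2) H_white Rw^(-1/2), applied to z
    return Rw_half @ H_white @ (Rw_half_inv @ z)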
An Adaptive KLT Approach for S.E.
[Gazor]
 Goal
  – Enhancement of speech degraded by additive colored
    noise

 Novelty

   – Adaptive tracking based algorithm for obtaining KLT
     components

   – A VAD based on principal eigenvalues

                                                   Page 36 of 47
An Adaptive KLT Approach for S.E.
[Gazor]
 Objective
  – Minimize the distortion when residual noise power is
    limited to a specific level

 Type of colored noise
   – Have a diagonal covariance matrix in KLT domain

                      
    G = Λ_y (Λ_y + μ σ_w² I)^{-1}

                 replaced by

    G = Λ_y (Λ_y + μ Λ_n)^{-1}
                                                   Page 37 of 47
An Adaptive KLT Approach for S.E.
[Gazor]
 Adaptive KLT tracking algorithm
   – Named "projection approximation subspace tracking" (PAST)
   – Reduces computational time

   – Eigendecomposition is formulated as a constrained
     optimization problem
   – The problem is solved by exploiting the quasi-stationarity of
     speech

   – A recursive algorithm then finds a close approximation of
     the eigenvectors of the noisy signal
                                                     Page 38 of 47
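
A minimal sketch of one PAST-style update in the spirit of this tracking idea (the forgetting factor, initialization, and names are illustrative assumptions, not Gazor's exact algorithm):

import numpy as np

def past_update(W, P, x, beta=0.995):
    """One PAST-style update of a principal-subspace estimate W (K x M).

    W : current estimate of the principal-subspace basis
    P : M x M inverse-correlation matrix of the compressed data y = W^H x
    """
    y = W.conj().T @ x                       # project the new sample onto the subspace
    h = P @ y
    g = h / (beta + y.conj() @ h)            # RLS-like gain vector
    P = (P - np.outer(g, h.conj())) / beta   # update the inverse correlation of y
    e = x - W @ y                            # projection-approximation error
    W = W + np.outer(e, g.conj())            # move the basis toward the new sample
    return W, P

Initialize W with M orthonormal columns (e.g. from a QR of random data) and P as the M×M identity, then feed each new analysis vector through past_update.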
An Adaptive KLT Approach for S.E.
[Gazor]
 Voice activity detector
   – When the current principal components' energy is above
     1/12 of its past minimum and maximum
 Implementation Results
            SNR (dB)   Non-Processed   Ephraim's   Noise Type
               10           85%            55%        white
                5           75%            69%        white
                0           64%            89%        white
               10           75%            73%        office
                5           85%            79%        office
                0           68%            89%        office
                                                   Page 39 of 47
  Incorporating the Human Hearing Properties in
  the Signal Subspace Approach for S.E. [Jabloun]

   Goal
     – Keep the residual noise as much as possible, in order
       to minimize signal distortion
   Novelty
     – Transformation from Frequency to Eigendomain for
       modeling masking threshold.
       eigendomain → IFET → Masking (frequency domain) → FET → eigendomain

     Many masking models were introduced in frequency
      domain; like Bark scale
                                                      Page 40 of 47
Incorporating the Human Hearing Properties in
the Signal Subspace Approach for S.E. [Jabloun]

 Use noise prewhitening to handle the colored noise

 Implementation results

        Input SNR   Compared with    Compared with
                    noisy signal     Signal Subspace

         20 dB           92%              71%

         10 dB           85%              78%

          5 dB           85%              92%


                                                     Page 41 of 47
An Energy-Constrained Signal
Subspace Method for S.E. [Huang]
 Novelty
   – The colored noise is modelled by an AR process.

   – Estimating energy of clean signal to adjust the speech
     enhancement

 Prewhitening filter is constructed based on the estimated
  AR parameters.
    – Optimal AR coefficients are given by [Key 98]


                                                    Page 42 of 47
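
A sketch of such a prewhitening step under simple assumptions: AR coefficients are fitted to a noise-only segment by solving the Yule-Walker equations, and the resulting FIR whitening filter A(z) is applied to the noisy signal (names and the estimation method are illustrative, not necessarily the paper's estimator):

import numpy as np
from scipy.signal import lfilter
from scipy.linalg import solve_toeplitz

def ar_prewhiten(noise_segment, signal, order=10):
    """Fit an AR model to a noise-only segment and prewhiten the noisy signal."""
    # Biased autocorrelation estimates r[0..order]
    n = noise_segment - np.mean(noise_segment)
    r = np.array([np.dot(n[:len(n) - k], n[k:]) for k in range(order + 1)]) / len(n)
    # Yule-Walker equations: Toeplitz(r[0..order-1]) a = r[1..order]
    a = solve_toeplitz((r[:-1], r[:-1]), r[1:])
    # Whitening filter A(z) = 1 - a1 z^-1 - ... - ap z^-p applied to the noisy signal
    return lfilter(np.concatenate(([1.0], -a)), [1.0], signal)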
An Energy-Constrained Signal
Subspace Method for S.E. [Huang]
 Implementation Results
 Word Recognition Accuracy for noisy digits
  Input SNR     0 dB       5 dB      10 dB     20 dB

   Baseline     40 %       70 %      90 %      100 %

    ECSS        90 %      100 %      100 %     100 %


 SNR improvement for isolated noisy digits
    Input SNR            0 dB      5 dB     10 dB     20 dB
   Improvement (dB)       7.6       6.4       5.2       2.9
                                                       Page 43 of 47
S.E. Based on the Subspace Method
[Asano]—Microphone Array
 The input spectrum observed at the m-th microphone:

    X_m(k) = Σ_{d=1}^{D} A_{m,d}(k) · S_d(k) + N_m(k)

 Vector notation for all microphones (D directional sources
  plus ambient noise, observed by a microphone array):

    x(k) = A(k) s(k) + n(k)

 The (spatial) correlation matrix of x(k) is

    R(k) = E[ x(k) x(k)^H ]

 Then eigenvalue decomposition is applied to R(k)
                                                               Page 44 of 47
S.E. Based on the Subspace Method
[Asano]—Microphone Array
 Procedure
   – Weighting the eigenvalues of spatial correlation matrix

       Energy of D directional sources is concentrated on D largest
        eigenvalues

       Ambient noise is reduced by weighting eigenvalues of the
        noise-dominant subspace
       discarding M-D smallest eigenvalues when direct-ambient
        ratio is high


   – Using MV beamformer to extract directional component
     from modified spatial correlation matrix      Page 45 of 47
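
A sketch of the eigenvalue-weighting step on the spatial correlation matrix, assuming D directional sources and more microphones than sources; the simple keep/attenuate weighting shown is an illustrative choice, not Asano's exact rule:

import numpy as np

def weight_spatial_correlation(Rk, D, noise_weight=0.0):
    """Suppress the noise-dominant subspace of a spatial correlation matrix R(k).

    Rk           : M x M spatial correlation matrix at frequency bin k (M > D assumed)
    D            : number of directional sources
    noise_weight : weight for the M-D smallest eigenvalues (0 discards them)
    """
    lam, E = np.linalg.eigh(Rk)                    # ascending eigenvalues
    lam, E = lam[::-1], E[:, ::-1]                 # descending: source-dominant first
    w = np.concatenate([np.ones(D), np.full(len(lam) - D, noise_weight)])
    Rk_mod = E @ np.diag(w * lam) @ E.conj().T     # weighted reconstruction
    return Rk_mod                                  # feed this to the MV beamformer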
S.E. Based on the Subspace Method
[Asano]—Microphone Array
 Implementation results

   – Two directional speech signals + Ambient noise
     Recognition Rate:

          SNR      MV (A)   MV (B1)   MV-NSR (A)   MV-NSR (B1)

         5 dB      66.9%     71.5%      72.3%        78.0%

         10 dB     81.1%     86.6%      81.5%        87.2%


                                                  Page 46 of 47
Thanks For Your Attention




         The End
                            Page 47 of 47

								