ADSP-20-STFT-EC623 by harshi446


    Short-Time Fourier Transform (STFT)

 • To analyze the STFT from the joint time-frequency distribution perspective
 • To understand the limitations of the STFT as a time-frequency distribution (TFD)


 • The STFT is the most widely used method for studying non-stationary signals.
 • Break up the given non-stationary signal into small segments and analyze each segment.
 • What is the typical size of a segment?
   – Approximately the duration over which the stationarity property holds
   – 10-30 ms in the case of speech
 • Can we keep shortening the segments to achieve finer time localization?
 • No, because beyond a certain narrowing the spectrum becomes meaningless and shows no relation to the original signal spectrum.
   – This is termed the windowing effect.
   – Finer time localization ⇒ small window in the time domain (TD) ⇒ larger window in the frequency domain (FD) ⇒ convolved with the signal spectrum ⇒ more smoothing of the true spectrum information
 • This should be attributed to a limitation of the STFT, which creates short segments.
 • The limiting factor is not the uncertainty principle as applied to the signal; it is the uncertainty principle associated with the short segments.
 • The distinction between the two uncertainty principles should be kept in mind, and the two should not be confused.
 • In the STFT, the properties of the signal are scrambled with the properties of the window function.
 • Unscrambling is required for proper interpretation and estimation of the original signal.
 • In spite of these limitations, the STFT provides an excellent time-frequency structure for some signals under a proper choice of window function and duration.
 • However, it may not be the best tool for all non-stationary signals.
 • Hence the search for new time-frequency analysis (TFA) tools.
The STFT and Spectrogram

 • To study the properties of the signal at time $t$, emphasize the signal around $t$ and suppress it at other times:
$$ s_t(\tau) = s(\tau)\,h(\tau - t) $$

 • The modified signal is a function of two times: the fixed time $t$ and the running time $\tau$.

 • The window function is chosen to leave the signal more or less unaltered around time $t$ but to suppress it for times distant from $t$, i.e., $s_t(\tau) \approx s(\tau)$ for $\tau$ near $t$ and $s_t(\tau) \approx 0$ for $\tau$ far from $t$.

 • Since the modified signal emphasizes the signal around $t$, its Fourier transform (FT) reflects the distribution of frequency around that time:
$$ S_t(\omega) = \frac{1}{\sqrt{2\pi}} \int e^{-j\omega\tau}\, s_t(\tau)\, d\tau = \frac{1}{\sqrt{2\pi}} \int e^{-j\omega\tau}\, s(\tau)\, h(\tau - t)\, d\tau $$
 • The energy density spectrum at time $t$ is therefore
$$ P_{SP}(t,\omega) = |S_t(\omega)|^2 = \left| \frac{1}{\sqrt{2\pi}} \int e^{-j\omega\tau}\, s(\tau)\, h(\tau - t)\, d\tau \right|^2 $$
 • For each different time we get a different spectrum, and the totality of these spectra is the time-frequency distribution $P_{SP}$.

 • It is more commonly termed the spectrogram.
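
The definition above can be evaluated numerically. The following is a minimal sketch, assuming a unit-energy Gaussian window and a midpoint-rule approximation of the integral; the names `gaussian_window`, `stft_at` and `spectrogram_at` and the test signal are illustrative and not part of the slides.

```python
import cmath
import math


def gaussian_window(u, a=1.0):
    """Unit-energy Gaussian window h(u) = (a/pi)^(1/4) * exp(-a u^2 / 2)."""
    return (a / math.pi) ** 0.25 * math.exp(-a * u * u / 2.0)


def stft_at(s, h, t, w, half_width=10.0, n=2000):
    """Midpoint-rule approximation of
    S_t(w) = (1/sqrt(2 pi)) * integral of exp(-j w tau) s(tau) h(tau - t) d tau,
    integrating over tau in [t - half_width, t + half_width]."""
    d_tau = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        tau = t - half_width + (k + 0.5) * d_tau
        total += cmath.exp(-1j * w * tau) * s(tau) * h(tau - t)
    return total * d_tau / math.sqrt(2.0 * math.pi)


def spectrogram_at(s, h, t, w):
    """P_SP(t, w) = |S_t(w)|^2."""
    return abs(stft_at(s, h, t, w)) ** 2


# Illustrative test signal (an assumption): complex sinusoid at w0 = 3 rad/s.
w0 = 3.0


def sinusoid(tau):
    return cmath.exp(1j * w0 * tau)


freqs = [0.5 * k for k in range(13)]  # 0.0, 0.5, ..., 6.0 rad/s
spectrum_at_t1 = [spectrogram_at(sinusoid, gaussian_window, 1.0, w) for w in freqs]
peak_freq = freqs[spectrum_at_t1.index(max(spectrum_at_t1))]
```

Sweeping `t` as well as `w` would fill in the whole time-frequency plane; here only one time slice is computed, and its peak falls at the sinusoid frequency.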

Short-Frequency Time Transform

 • In the STFT we emphasize the desire to study frequency properties at time $t$.

 • Conversely, we may wish to study time properties at a particular frequency.
 • We then window the spectrum $S(\omega)$ with a frequency window function $H(\omega)$ and take the time transform, which, of course, is the inverse Fourier transform (IFT).
 • This is termed the short-frequency time transform and is defined as
$$ S_\omega(t) = \frac{1}{\sqrt{2\pi}} \int e^{j\omega' t}\, S(\omega')\, H(\omega - \omega')\, d\omega' $$
 • If we relate $h(t)$ and $H(\omega)$ by
$$ H(\omega) = \frac{1}{\sqrt{2\pi}} \int h(t)\, e^{-j\omega t}\, dt, $$
   then
$$ S_t(\omega) = e^{-j\omega t}\, S_\omega(t) $$

 • The STFT is the same as the short-frequency time transform except for the phase factor $e^{-j\omega t}$.

 • Since the distribution is the absolute square, the phase factor $e^{-j\omega t}$ does not enter into it.
 • Either the STFT or the short-frequency time transform can be used to define the joint distribution:
$$ P(t,\omega) = |S_t(\omega)|^2 = |S_\omega(t)|^2 $$

 • This shows that the spectrogram can be used to study the behavior of time properties at a particular frequency.
 • This is done by choosing an $H(\omega)$ that is narrow or, equivalently, an $h(t)$ that is broad.
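
The phase-factor relation between the two transforms can be checked numerically. This is a sketch under stated assumptions: the Gaussian signal and window below are illustrative choices whose unitary Fourier transforms are known in closed form, and `stft`, `sftt` and `integrate` are hypothetical helper names.

```python
import cmath
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


# Illustrative pair (assumption): Gaussian pulse modulated to 1 rad/s and its transform.
def s(t):
    return math.exp(-t * t / 2.0) * cmath.exp(1j * t)


def S(w):
    return math.exp(-(w - 1.0) ** 2 / 2.0)


# Unit-energy Gaussian window and its transform.
def h(t):
    return math.pi ** -0.25 * math.exp(-t * t / 2.0)


def H(w):
    return math.pi ** -0.25 * math.exp(-w * w / 2.0)


def integrate(f, lo=-12.0, hi=12.0, n=2000):
    """Midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * step) for k in range(n)) * step


def stft(t, w):
    """S_t(w): window the signal in time, then transform."""
    return integrate(lambda tau: cmath.exp(-1j * w * tau) * s(tau) * h(tau - t)) / SQRT_2PI


def sftt(t, w):
    """S_w(t): window the spectrum in frequency, then inverse transform."""
    return integrate(lambda wp: cmath.exp(1j * wp * t) * S(wp) * H(w - wp)) / SQRT_2PI


t, w = 0.7, 1.5
lhs = stft(t, w)
rhs = cmath.exp(-1j * w * t) * sftt(t, w)
```

Both routes give the same magnitude, so either one yields the same spectrogram value at `(t, w)`.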

Narrowband and Wideband Spectrogram

 • If the time window $h(t)$ is of short duration, the frequency window $H(\omega)$ is broad; in that case the spectrogram is termed wideband.

 • Alternatively, if the time window $h(t)$ is of long duration, the frequency window $H(\omega)$ is narrow, and we say we have a narrowband spectrogram.

Characteristic Function
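$$ M_{SP}(\theta,\tau) = \iint |S_t(\omega)|^2\, e^{j\theta t + j\tau\omega}\, dt\, d\omega = A_s(\theta,\tau)\, A_h(-\theta,\tau) $$

$$ A_s(\theta,\tau) = \int s^*\!\left(t - \tfrac{1}{2}\tau\right) s\!\left(t + \tfrac{1}{2}\tau\right) e^{j\theta t}\, dt $$

$$ A_h(-\theta,\tau) = \int h^*\!\left(t - \tfrac{1}{2}\tau\right) h\!\left(t + \tfrac{1}{2}\tau\right) e^{-j\theta t}\, dt $$

$A_s$ and $A_h$ are termed the ambiguity functions of the signal $s(t)$ and the window function $h(t)$, respectively.
The results we will obtain are revealing when expressed in terms of the phases and amplitudes of the signal and window and their transforms:
$$ s(t) = A(t)\, e^{j\phi(t)} \quad\text{and}\quad S(\omega) = B(\omega)\, e^{j\psi(\omega)} $$

$$ h(t) = A_h(t)\, e^{j\phi_h(t)} \quad\text{and}\quad H(\omega) = B_H(\omega)\, e^{j\psi_H(\omega)} $$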

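In the calculation of global averages (e.g., mean frequency, bandwidth) we have to indicate which density function is being used. We use the superscripts (SP), (s), and (h) to indicate the spectrogram, the signal, and the window, respectively:
$$ \langle \omega \rangle^{(SP)} = \iint \omega\, |S_t(\omega)|^2\, d\omega\, dt $$

$$ \langle \omega \rangle^{(s)} = \int \omega\, |S(\omega)|^2\, d\omega $$

$$ \langle \omega \rangle^{(h)} = \int \omega\, |H(\omega)|^2\, d\omega $$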

Characteristic Function:

 • The joint characteristic function of a time-frequency density is
$$ M(\theta,\tau) = \left\langle e^{j\theta t + j\tau\omega} \right\rangle = \iint P(t,\omega)\, e^{j\theta t + j\tau\omega}\, dt\, d\omega $$
General Properties:
Total Energy

 • Integrating over all time and frequency, we get the total energy:

$$ E_{SP} = \iint P_{SP}(t,\omega)\, d\omega\, dt = M_{SP}(0,0) = A_s(0,0)\, A_h(0,0) = \int |s(t)|^2\, dt \int |h(t)|^2\, dt $$

 • If the energy of the window is taken to be one, then the energy of the spectrogram is equal to the total energy of the signal.
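  Time Marginal
$$ P(t) = \int |S_t(\omega)|^2\, d\omega $$
$$ = \frac{1}{2\pi} \iiint s(\tau)\, h(\tau - t)\, s^*(\tau')\, h^*(\tau' - t)\, e^{-j\omega(\tau - \tau')}\, d\tau\, d\tau'\, d\omega $$
$$ = \iint s(\tau)\, h(\tau - t)\, s^*(\tau')\, h^*(\tau' - t)\, \delta(\tau - \tau')\, d\tau\, d\tau' $$
$$ = \int |s(\tau)|^2\, |h(\tau - t)|^2\, d\tau $$
$$ = \int A^2(\tau)\, A_h^2(\tau - t)\, d\tau $$

Freq. Marginal
$$ P(\omega) = \int B^2(\omega')\, B_H^2(\omega - \omega')\, d\omega' $$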

As can be seen from these equations, the marginals of the spectrogram generally do not equal the correct marginals, namely $|s(t)|^2$ and $|S(\omega)|^2$:

$$ P(t) \neq A^2(t) = |s(t)|^2 $$

$$ P(\omega) \neq B^2(\omega) = |S(\omega)|^2 $$

  – The reason is that the spectrogram scrambles the energy distribution of the window with that of the signal.
  – This introduces effects unrelated to the properties of the original signal.
  – Notice that the time marginal of the spectrogram depends only on the magnitudes of the signal and window and not on their phases.
  – Similarly, the frequency marginal depends only on the amplitudes of the Fourier transforms.
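
A numerical sketch of the time-marginal derivation, assuming an illustrative Gaussian-envelope chirp and a unit-energy Gaussian window (neither is taken from the slides). It integrates the spectrogram over frequency, compares the result with the time-side integral, and shows that both differ from $|s(t)|^2$.

```python
import cmath
import math


# Illustrative signal and window (assumptions, not from the slides).
def s(tau):
    return math.exp(-tau * tau / 2.0) * cmath.exp(1j * tau * tau)  # Gaussian-envelope chirp


def h(u):
    return math.pi ** -0.25 * math.exp(-u * u / 2.0)  # unit-energy Gaussian window


def midpoint_grid(lo, hi, n):
    step = (hi - lo) / n
    return [lo + (k + 0.5) * step for k in range(n)], step


t = 0.5
taus, d_tau = midpoint_grid(-8.0, 8.0, 400)
windowed = [s(tau) * h(tau - t) for tau in taus]

# Time side: integral of |s(tau)|^2 |h(tau - t)|^2 d tau.
marginal_time_side = sum(abs(x) ** 2 for x in windowed) * d_tau

# Frequency side: integrate the spectrogram |S_t(w)|^2 over w.
ws, d_w = midpoint_grid(-15.0, 15.0, 300)
marginal_freq_side = 0.0
for w in ws:
    S_t = sum(x * cmath.exp(-1j * w * tau) for x, tau in zip(windowed, taus))
    S_t *= d_tau / math.sqrt(2.0 * math.pi)
    marginal_freq_side += abs(S_t) ** 2 * d_w

# |s(t)|^2, the time marginal a distribution with correct marginals would give.
true_density = abs(s(t)) ** 2
```

For this signal and window the marginal has the closed form $e^{-t^2/2}/\sqrt{2}$, a smoothed version of $|s(t)|^2 = e^{-t^2}$.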
 Averages of time and freq. fns:
 • Since the marginals are not satisfied, averages of time and frequency functions will not be correctly given:
$$ \langle g_1(t) + g_2(\omega) \rangle = \iint \{ g_1(t) + g_2(\omega) \}\, P_{SP}(t,\omega)\, d\omega\, dt $$

$$ \neq \int g_1(t)\, |s(t)|^2\, dt + \int g_2(\omega)\, |S(\omega)|^2\, d\omega $$

 • This is in contrast to other distributions we will be studying, where these types of averages are always correctly given.
 Finite Support
 • According to the finite support property, the distribution should be zero before the signal starts and after it ends.
 • In the case of the spectrogram, if a time $t$ is chosen before the signal starts, will the spectrogram be zero for that time?
 • Generally, no, because the modified signal as a function of $t$ will not necessarily be zero, since the window may pick up some of the signal.
 • Even though $s(t)$ may be zero for a time $t$, $s(\tau)\,h(\tau - t)$ may not be zero for that time.
 • A similar condition applies in the frequency domain.
 • The spectrogram does not possess the finite support property in either the time or the frequency domain.
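
A small sketch of the finite-support failure, assuming a unit-amplitude pulse on $[0, 1]$ and a Gaussian window with $a = 4$ (both illustrative). It evaluates the time marginal $P(t) = \int |s(\tau)|^2 |h(\tau - t)|^2 d\tau$ at a time before the pulse starts.

```python
import math


def window_energy_density(u, a=4.0):
    """|h(u)|^2 for the unit-energy Gaussian window h(u) = (a/pi)^(1/4) exp(-a u^2 / 2)."""
    return math.sqrt(a / math.pi) * math.exp(-a * u * u)


def time_marginal(t, start=0.0, end=1.0, n=1000):
    """P(t) for a unit-amplitude pulse that is nonzero only on [start, end].
    The integral is taken over the pulse support with the midpoint rule."""
    step = (end - start) / n
    return sum(window_energy_density(start + (k + 0.5) * step - t) for k in range(n)) * step


t_before = -0.5  # 0.5 time units before the pulse starts, where s(t) = 0
p_before = time_marginal(t_before)

# Closed form for this particular pulse and window: (erf(3) - erf(1)) / 2.
closed_form = 0.5 * (math.erf(3.0) - math.erf(1.0))
```

Although the signal is zero at `t_before`, the window centred there still overlaps the pulse, so the spectrogram carries energy at that time.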
 Localization Trade-Off
 • Good time localization ⇒ narrow window $h(t)$ in the time domain
 • Good frequency localization ⇒ narrow window $H(\omega)$ in the frequency domain
 • But $h(t)$ and $H(\omega)$ cannot both be made arbitrarily narrow.
 • This is the inherent trade-off between time and frequency localization in the spectrogram for a particular window.
 • The degree of trade-off depends on the window, signal, time, and frequency.
 • The uncertainty principle for the spectrogram quantifies these trade-off dependencies.
 Entanglement and Symmetry Between Window and Signal
 • Results obtained using the spectrogram generally do not describe the signal alone, because the STFT entangles the signal and window.
 • Therefore we must be cautious in interpreting the results, and we must attempt to disentangle the window. This is not always easy.
 • In fact, because of the basic symmetry between the window and signal in the definition of the STFT, we have to be careful that we are not using the signal to study the window.
Uncertainty Principle for the STFT:
 – A short-duration signal obtained by windowing is given by
$$ s_t(\tau) = s(\tau)\, h(\tau - t) $$

 – The normalized short-duration signal at time $t$ is given by
$$ \eta_t(\tau) = \frac{s(\tau)\, h(\tau - t)}{\sqrt{\int |s(\tau)\, h(\tau - t)|^2\, d\tau}} $$

 – The denominator is the square root of the total energy in the windowed signal.
 – This normalization ensures that
$$ \int |\eta_t(\tau)|^2\, d\tau = 1, $$
   i.e., the total normalized energy is unity for any $t$.
 – The Fourier transform of $\eta_t(\tau)$ is given by
$$ F_t(j\omega) = \frac{1}{\sqrt{2\pi}} \int \eta_t(\tau)\, e^{-j\omega\tau}\, d\tau $$
 – We can define all the relevant quantities, such as mean time, duration, and bandwidth, in the standard way, but they will be time dependent.
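Mean Time:
$$ \langle \tau \rangle_t = \int \tau\, |\eta_t(\tau)|^2\, d\tau = \frac{\int \tau\, |s(\tau)\, h(\tau - t)|^2\, d\tau}{\int |s(\tau)\, h(\tau - t)|^2\, d\tau} $$
Duration:
$$ T_t^2 = \int (\tau - \langle \tau \rangle_t)^2\, |\eta_t(\tau)|^2\, d\tau = \frac{\int (\tau - \langle \tau \rangle_t)^2\, |s(\tau)\, h(\tau - t)|^2\, d\tau}{\int |s(\tau)\, h(\tau - t)|^2\, d\tau} $$
Mean Frequency:
$$ \langle \omega \rangle_t = \int \omega\, |F_t(j\omega)|^2\, d\omega $$
Bandwidth:
$$ B_t^2 = \int (\omega - \langle \omega \rangle_t)^2\, |F_t(j\omega)|^2\, d\omega $$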

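$$ T_t^2 = \int (\tau - \langle \tau \rangle_t)^2\, |\eta_t(\tau)|^2\, d\tau $$

 – Let $\langle \tau \rangle_t = 0$; then
$$ T_t^2 = \int \tau^2\, |\eta_t(\tau)|^2\, d\tau $$

 – Similarly, let $\langle \omega \rangle_t = 0$; then
$$ B_t^2 = \int \omega^2\, |F_t(j\omega)|^2\, d\omega = \int |\eta_t'(\tau)|^2\, d\tau $$

$$ T_t^2 B_t^2 = \int |\tau\, \eta_t(\tau)|^2\, d\tau \cdot \int |\eta_t'(\tau)|^2\, d\tau $$

 – By the Schwarz inequality,
$$ \int |f(x)|^2\, dx \int |g(x)|^2\, dx \;\geq\; \left| \int f^*(x)\, g(x)\, dx \right|^2 $$
 – Let $f = \tau\,\eta_t$ and $g = \eta_t'$; then
$$ T_t^2 B_t^2 \;\geq\; \left| \int \tau\, \eta_t^*(\tau)\, \eta_t'(\tau)\, d\tau \right|^2 $$

 – Substituting and simplifying (the real part of the integral equals $-\tfrac{1}{2}$ after integration by parts, because $\int |\eta_t(\tau)|^2 d\tau = 1$), we get
$$ T_t^2 B_t^2 \;\geq\; \frac{1}{4}, \qquad\text{i.e.,}\qquad T_t B_t \geq \frac{1}{2} $$
 – This is the uncertainty principle for the STFT.
 – It is a function of time, the signal, and the window.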
 – It should not be compared with the uncertainty principle applied to the signal.
 – It is important to understand this uncertainty principle, because it places limits on the technique of the STFT procedure.
 – However, it places no constraints on the original signal.
 – It is true that if we modify the signal by the technique of the STFT, we limit our abilities in terms of resolution.
 – Hence the search for new time-frequency analysis tools.
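
A numerical sketch of the time-dependent duration and bandwidth, assuming a Gaussian window with $a = 1$ and two illustrative signals (a constant and a linear chirp). The product $T_t B_t$ is computed from $\eta_t(\tau)$ on a grid, with central differences standing in for the derivative; in the standard result this product cannot fall below $1/2$. The helper name `duration_and_bandwidth` is hypothetical.

```python
import cmath
import math


def duration_and_bandwidth(s, h, t, half_width=8.0, n=4000):
    """Return (T_t, B_t) for the normalized windowed signal eta_t(tau),
    using grid sums for the integrals and central differences for d(eta_t)/d(tau)."""
    d = 2.0 * half_width / n
    taus = [t - half_width + k * d for k in range(n + 1)]
    x = [s(tau) * h(tau - t) for tau in taus]
    norm = math.sqrt(sum(abs(v) ** 2 for v in x) * d)
    eta = [v / norm for v in x]

    mean_tau = sum(tau * abs(v) ** 2 for tau, v in zip(taus, eta)) * d
    T2 = sum((tau - mean_tau) ** 2 * abs(v) ** 2 for tau, v in zip(taus, eta)) * d

    # Derivative at interior grid points; eta_t is negligible at the grid edges.
    d_eta = [(eta[k + 1] - eta[k - 1]) / (2.0 * d) for k in range(1, n)]
    # <w>_t = integral of eta* (1/j) eta' = Im(integral of eta* eta')
    mean_w = sum((eta[k].conjugate() * d_eta[k - 1]).imag for k in range(1, n)) * d
    # <w^2>_t = integral of |eta'|^2
    mean_w2 = sum(abs(v) ** 2 for v in d_eta) * d
    B2 = mean_w2 - mean_w ** 2
    return math.sqrt(T2), math.sqrt(B2)


def window(u):
    return math.pi ** -0.25 * math.exp(-u * u / 2.0)  # Gaussian window, a = 1


def constant(tau):
    return 1.0 + 0j


def chirp(tau):
    return cmath.exp(1j * tau * tau)  # phase tau^2, so phi'(tau) = 2 tau


T_c, B_c = duration_and_bandwidth(constant, window, 1.0)
T_q, B_q = duration_and_bandwidth(chirp, window, 1.0)
```

The Gaussian-windowed constant sits at the bound, while the chirp gives a larger product ($\sqrt{1.25}$ for these parameters), which illustrates that the product depends on the signal as well as the window.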
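 Global Quantities
 Mean Time: $\langle t \rangle^{(SP)} = \iint t\, |S_t(\omega)|^2\, dt\, d\omega$

 Mean Freq: $\langle \omega \rangle^{(SP)} = \iint \omega\, |S_t(\omega)|^2\, dt\, d\omega$

 • Direct evaluation leads to
$$ \langle t \rangle^{(SP)} = \langle t \rangle^{(s)} - \langle t \rangle^{(h)} $$

$$ \langle \omega \rangle^{(SP)} = \langle \omega \rangle^{(s)} + \langle \omega \rangle^{(h)} $$

 • If the window is chosen so that its mean time and frequency are zero, which can be done by choosing a window symmetrical in time whose spectrum is symmetrical in frequency, then the mean time and frequency of the spectrogram will be those of the signal.
 • The second moments are calculated to be
$$ \langle \omega^2 \rangle^{(SP)} = \langle \omega^2 \rangle^{(s)} + \langle \omega^2 \rangle^{(h)} + 2\, \langle \omega \rangle^{(s)} \langle \omega \rangle^{(h)} $$

$$ \langle t^2 \rangle^{(SP)} = \langle t^2 \rangle^{(s)} + \langle t^2 \rangle^{(h)} - 2\, \langle t \rangle^{(s)} \langle t \rangle^{(h)} $$

 By combining these with the mean time and frequency definitions, we find that the durations and bandwidths are related by
$$ T_{(SP)}^2 = T_s^2 + T_h^2 $$
$$ B_{(SP)}^2 = B_s^2 + B_h^2 $$

 which indicates how the duration and bandwidth of the spectrogram are related to those of the signal and window.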
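
The mean-time and duration relations can be checked on the time marginal $P(t) = \int A^2(\tau) A_h^2(\tau - t)\, d\tau$. This is a sketch with illustrative Gaussian energy densities (assumptions, not from the slides); the window is deliberately off-centre so that the sign of the window term in the mean is visible.

```python
import math


def mean_and_variance(grid, density, step):
    """First moment and central second moment of a (possibly unnormalized) density."""
    total = sum(density) * step
    mean = sum(x * p for x, p in zip(grid, density)) * step / total
    var = sum((x - mean) ** 2 * p for x, p in zip(grid, density)) * step / total
    return mean, var


step = 0.05
grid = [-10.0 + k * step for k in range(401)]

# Signal energy density A^2(tau): mean 1.0, variance 0.5.
signal_density = [math.exp(-(tau - 1.0) ** 2) for tau in grid]


def window_density(u):
    """Window energy density A_h^2(u): mean 0.3, variance 0.25."""
    return math.exp(-2.0 * (u - 0.3) ** 2)


window_samples = [window_density(u) for u in grid]

# Time marginal of the spectrogram on the same grid.
marginal = [
    sum(a2 * window_density(tau - t) for tau, a2 in zip(grid, signal_density)) * step
    for t in grid
]

mean_s, var_s = mean_and_variance(grid, signal_density, step)
mean_h, var_h = mean_and_variance(grid, window_samples, step)
mean_sp, var_sp = mean_and_variance(grid, marginal, step)
```

The spectrogram mean time comes out as the signal mean minus the window mean, and the squared durations add.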
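 Covariance and Correlation Coefficient
 • The first mixed moment is
$$ \langle t\omega \rangle^{(SP)} = \iint t\,\omega\, |S_t(\omega)|^2\, dt\, d\omega $$

$$ = \langle t\phi' \rangle^{(s)} - \langle t\phi_h' \rangle^{(h)} - \langle t \rangle^{(h)} \langle \phi' \rangle^{(s)} + \langle t \rangle^{(s)} \langle \phi_h' \rangle^{(h)} $$

 Subtracting $\langle t \rangle^{(SP)} \langle \omega \rangle^{(SP)}$ from both sides, we find that the covariance of the spectrogram is
$$ \mathrm{Cov}_{t\omega}^{(SP)} = \langle t\omega \rangle^{(SP)} - \langle t \rangle^{(SP)} \langle \omega \rangle^{(SP)} = \mathrm{Cov}_{t\omega}^{(s)} - \mathrm{Cov}_{t\omega}^{(h)} $$

 The covariance of a real signal is zero.
 • If we take a real window, the covariance of the spectrogram will be the covariance of the signal:
$$ \mathrm{Cov}_{t\omega}^{(SP)} = \mathrm{Cov}_{t\omega}^{(s)} \quad\text{for real windows} $$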
Local Averages
Method of Calculation

 • The STFT $S_t(\omega)$ and the windowed signal $s_t(\tau) = s(\tau)\,h(\tau - t)$ form a Fourier transform pair between $\tau$ and $\omega$:
$$ s(\tau)\, h(\tau - t) \;\leftrightarrow\; S_t(\omega) $$

 • The modified signal expressed in terms of phase and amplitude is
$$ s_t(\tau) = s(\tau)\, h(\tau - t) = A(\tau)\, A_h(\tau - t)\, e^{j[\phi(\tau) + \phi_h(\tau - t)]} $$

 • A fruitful way to look at the situation is that we are dealing with a signal in the variable $\tau$ whose amplitude is $A(\tau)\,A_h(\tau - t)$ and whose phase is $\phi(\tau) + \phi_h(\tau - t)$.
 • The normalized modified signal is given by
$$ \eta_t(\tau) = \frac{s(\tau)\, h(\tau - t)}{\sqrt{\int |s(\tau)\, h(\tau - t)|^2\, d\tau}} = \frac{s(\tau)\, h(\tau - t)}{\sqrt{P(t)}} $$

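Conditional Average
$$ \langle g(\omega) \rangle_t = \frac{1}{P(t)} \int g(\omega)\, |S_t(\omega)|^2\, d\omega = \int \eta_t^*(\tau)\; g\!\left( \frac{1}{j} \frac{d}{d\tau} \right) \eta_t(\tau)\, d\tau $$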

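Local Frequency
Instantaneous Freq.
 • We know that
$$ \langle \omega \rangle = \int \phi'(t)\, |s(t)|^2\, dt = \int \phi'(t)\, A^2(t)\, dt $$

 • Let $A^2$ be replaced by $A^2(\tau)\,A_h^2(\tau - t)$ and $\phi'$ by $\phi'(\tau) + \phi_h'(\tau - t)$ to obtain
$$ \langle \omega \rangle_t = \frac{1}{P(t)} \int \omega\, |S_t(\omega)|^2\, d\omega = \int \eta_t^*(\tau)\, \frac{1}{j} \frac{d}{d\tau}\, \eta_t(\tau)\, d\tau \tag{1} $$
$$ = \frac{1}{P(t)} \int A^2(\tau)\, A_h^2(\tau - t)\, \{ \phi'(\tau) + \phi_h'(\tau - t) \}\, d\tau \tag{2} $$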
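
Equation (2) can be evaluated directly once the amplitudes and phase derivatives are specified. The following is a sketch, assuming unit-amplitude signals and a real Gaussian window with $a = 2$; the function name `local_frequency` and the two test phases are illustrative.

```python
import math


def local_frequency(t, amp2, dphase, win_amp2, win_dphase, half_width=8.0, n=4000):
    """Equation (2): <w>_t = (1/P(t)) * integral of
    A^2(tau) A_h^2(tau - t) [phi'(tau) + phi_h'(tau - t)] d tau (midpoint rule)."""
    step = 2.0 * half_width / n
    weight_sum = 0.0  # accumulates P(t)
    moment_sum = 0.0
    for k in range(n):
        tau = t - half_width + (k + 0.5) * step
        weight = amp2(tau) * win_amp2(tau - t)
        weight_sum += weight * step
        moment_sum += weight * (dphase(tau) + win_dphase(tau - t)) * step
    return moment_sum / weight_sum


A_WIN = 2.0  # Gaussian window parameter a (illustrative)


def unit_amp2(tau):
    return 1.0


def gauss_win2(u):
    return math.sqrt(A_WIN / math.pi) * math.exp(-A_WIN * u * u)


def real_window_dphase(u):
    return 0.0  # a real window has zero phase derivative


def linear_chirp_dphase(tau):
    return 3.0 * tau  # phi(tau) = 1.5 tau^2


def cubic_dphase(tau):
    return tau * tau  # phi(tau) = tau^3 / 3


est_linear = local_frequency(1.0, unit_amp2, linear_chirp_dphase, gauss_win2, real_window_dphase)
est_cubic = local_frequency(1.0, unit_amp2, cubic_dphase, gauss_win2, real_window_dphase)
```

For the linear chirp the estimate equals $\phi'(t) = 3$ at $t = 1$; for the cubic phase the window averaging adds a bias of $1/(2a)$ to $\phi'(t) = 1$.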
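Local Square Frequency
 • We know that
$$ \langle \omega^2 \rangle = \int \omega^2\, |S(\omega)|^2\, d\omega = \int \left( \frac{A'(t)}{A(t)} \right)^2 A^2(t)\, dt + \int \phi'^2(t)\, A^2(t)\, dt $$

 • For the windowed signal,
$$ \langle \omega^2 \rangle_t = \int \eta_t^*(\tau) \left( \frac{1}{j} \frac{d}{d\tau} \right)^2 \eta_t(\tau)\, d\tau = \int \left| \frac{d}{d\tau}\, \eta_t(\tau) \right|^2 d\tau $$
 • In terms of amplitudes and phases,
$$ \langle \omega^2 \rangle_t = \frac{1}{P(t)} \int \left( \frac{d}{d\tau} \big[ A(\tau)\, A_h(\tau - t) \big] \right)^2 d\tau + \frac{1}{P(t)} \int A^2(\tau)\, A_h^2(\tau - t)\, \{ \phi'(\tau) + \phi_h'(\tau - t) \}^2\, d\tau $$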

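Conditional or Instantaneous Bandwidth
 • Bandwidth equation:
$$ B^2 = \int \left( \frac{A'(t)}{A(t)} \right)^2 A^2(t)\, dt + \int (\phi'(t) - \langle \omega \rangle)^2\, A^2(t)\, dt $$

 • For the spectrogram at time $t$,
$$ B_t^2 = \sigma_{\omega|t}^2 = \frac{1}{P(t)} \int (\omega - \langle \omega \rangle_t)^2\, |S_t(\omega)|^2\, d\omega $$
$$ \begin{aligned} B_t^2 ={}& \frac{1}{P(t)} \int \left( \frac{d}{d\tau} \big[ A(\tau)\, A_h(\tau - t) \big] \right)^2 d\tau \\ &+ \frac{1}{2P^2(t)} \iint A^2(\tau_1)\, A^2(\tau_2)\, A_h^2(\tau_1 - t)\, A_h^2(\tau_2 - t) \\ &\quad \times \big[ \phi'(\tau_1) - \phi'(\tau_2) + \phi_h'(\tau_1 - t) - \phi_h'(\tau_2 - t) \big]^2\, d\tau_1\, d\tau_2 \end{aligned} \tag{3-5} $$
 • For convenience we define
$$ \langle \omega^2 \rangle_t^0 = \frac{1}{P(t)} \int \left( \frac{d}{d\tau} \big[ A(\tau)\, A_h(\tau - t) \big] \right)^2 d\tau $$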

Narrowing and Broadening Window

 • All the results obtained above using the spectrogram suffer from the windowing effect.
 • Thus these quantities are estimates and depend on the window function chosen.
 • By narrowing the window we get better temporal resolution and better estimates, and we may be tempted to do so. But this gives very poor frequency resolution.
 • For instance, consider a window such that $A_h^2(t) \to \delta(t)$.

 • In the limit, $\langle \omega \rangle_t \to \phi'(t)$.
 • If the window is narrowed to increase time resolution, the limiting value of the estimated instantaneous frequency is the derivative of the phase, which is the instantaneous frequency.
 • This is a pleasing and important result, but there is a penalty to pay for it: $\sigma_{\omega|t} \to \infty$.
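
The two limits above can be seen numerically. This is a sketch, assuming a unit-amplitude signal with cubic phase ($\phi'(\tau) = \tau^2$) and a Gaussian window $h(u) = (a/\pi)^{1/4} e^{-a u^2/2}$; both are illustrative choices, and `estimates` is a hypothetical helper. Larger $a$ means a narrower window.

```python
import math


def estimates(t, a, n=2000):
    """Return (<w>_t, amplitude term of <w^2>_t) for a unit-amplitude signal with
    phi'(tau) = tau^2 and a Gaussian window, by midpoint integration over u = tau - t."""
    half_width = 8.0 / math.sqrt(a)  # scale the grid with the window width
    step = 2.0 * half_width / n
    p = mean_w = amp_term = 0.0
    for k in range(n):
        u = -half_width + (k + 0.5) * step
        win2 = math.sqrt(a / math.pi) * math.exp(-a * u * u)  # A_h^2(u)
        p += win2 * step
        mean_w += win2 * (t + u) ** 2 * step                  # A_h^2 * phi'(t + u)
        amp_term += (a * u) ** 2 * win2 * step                # (dA_h/du)^2 = (a u)^2 A_h^2
    return mean_w / p, amp_term / p


t = 1.0
results = {a: estimates(t, a) for a in (1.0, 10.0, 100.0)}
```

As $a$ grows, the frequency estimate approaches $\phi'(t) = 1$ (the bias is $1/(2a)$), while the amplitude contribution to the conditional spread grows as $a/2$ without bound.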

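Some Examples to Illustrate Limitations of STFT
Ex1: Sinusoid with Gaussian Window
$$ s(t) = e^{j\omega_0 t} \quad\text{and}\quad h(t) = (a/\pi)^{1/4}\, e^{-a t^2/2} $$
STFT:
$$ S_t(\omega) = \frac{1}{(a\pi)^{1/4}}\, e^{-j(\omega - \omega_0)t}\, \exp\!\left[ -\frac{(\omega - \omega_0)^2}{2a} \right] $$
TFD:
$$ P_{SP}(t,\omega) = |S_t(\omega)|^2 = \frac{1}{(a\pi)^{1/2}}\, \exp\!\left[ -\frac{(\omega - \omega_0)^2}{a} \right] $$

Using it we have $\langle \omega \rangle_t = \omega_0$ and $\sigma_{\omega|t}^2 = a/2$.
The average value of frequency for a given time is always $\omega_0$, but the width about that frequency depends on the window width.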
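
A short sketch that integrates the closed-form spectrogram of this example over frequency, assuming the Gaussian window $h(t) = (a/\pi)^{1/4} e^{-a t^2/2}$. The conditional mean stays at $\omega_0$ and the conditional variance comes out as $a/2$; the values of $\omega_0$ and $a$ below are illustrative.

```python
import math


def spectrogram_sinusoid(w, w0, a):
    """Closed-form P_SP(t, w) for s(t) = exp(j w0 t) with the Gaussian window.
    It does not depend on t."""
    return math.exp(-(w - w0) ** 2 / a) / math.sqrt(a * math.pi)


def conditional_mean_and_variance(w0, a, half_width=40.0, n=8000):
    """Return (P(t), <w>_t, sigma^2_{w|t}) by midpoint integration over w."""
    step = 2.0 * half_width / n
    ws = [w0 - half_width + (k + 0.5) * step for k in range(n)]
    p = [spectrogram_sinusoid(w, w0, a) for w in ws]
    total = sum(p) * step
    mean = sum(w * v for w, v in zip(ws, p)) * step / total
    var = sum((w - mean) ** 2 * v for w, v in zip(ws, p)) * step / total
    return total, mean, var


long_window = conditional_mean_and_variance(w0=5.0, a=0.5)   # broad h(t), narrowband
short_window = conditional_mean_and_variance(w0=5.0, a=8.0)  # narrow h(t), wideband
```

Shortening the window in time (larger $a$) leaves the mean at $\omega_0$ but widens the spread in frequency.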
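Ex2: Impulse
For an impulse at $t = t_0$, with the same window as above,
$$ s(t) = \sqrt{2\pi}\, \delta(t - t_0) \quad\text{and}\quad h(t) = (a/\pi)^{1/4}\, e^{-a t^2/2} $$
We have
$$ S_t(\omega) = (a/\pi)^{1/4}\, e^{-j\omega t_0}\, e^{-a(t - t_0)^2/2} $$
$$ P_{SP}(t,\omega) = |S_t(\omega)|^2 = (a/\pi)^{1/2}\, e^{-a(t - t_0)^2} $$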

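Ex3: Sinusoid Plus Impulse
Consider the sum of a sinusoid and an impulse:
$$ s(t) = e^{j\omega_0 t} + \sqrt{2\pi}\, \delta(t - t_0) \quad\text{and}\quad h(t) = (a/\pi)^{1/4}\, e^{-a t^2/2} $$
We have
$$ S_t(\omega) = \frac{1}{(a\pi)^{1/4}}\, e^{-j(\omega - \omega_0)t}\, \exp\!\left[ -\frac{(\omega - \omega_0)^2}{2a} \right] + (a/\pi)^{1/4}\, e^{-j\omega t_0}\, e^{-a(t - t_0)^2/2} $$
and
$$ P_{SP}(t,\omega) = |S_t(\omega)|^2 = \frac{1}{\sqrt{a\pi}}\, e^{-(\omega - \omega_0)^2/a} + (a/\pi)^{1/2}\, e^{-a(t - t_0)^2} + \frac{2}{\sqrt{\pi}}\, e^{-(\omega - \omega_0)^2/(2a) - a(t - t_0)^2/2}\, \cos[\omega(t - t_0) - \omega_0 t] $$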
This example illustrates one of the fundamental difficulties with the spectrogram: for one window we cannot have high resolution in both time and frequency.
The broadness of the self terms, the first two terms of the above equation, depends on the window size in an inverse relation. If we try to make one narrow, then the other must be broad. That will not be the case with other distributions.
In the figure shown below, the spectrogram is plotted for different window sizes. The cosine term is an example of the so-called cross terms. Note that they essentially fall on the self terms in the spectrogram.
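
A sketch that checks the three-term expansion for the sinusoid-plus-impulse example against $|S_t(\omega)|^2$ computed from the sum of the two closed-form STFTs, for several window parameters $a$. The values of $\omega_0$, $t_0$, $a$ and the sample points are illustrative assumptions.

```python
import cmath
import math


def stft_sinusoid(t, w, w0, a):
    """Closed-form STFT of exp(j w0 t) with the Gaussian window."""
    return ((a * math.pi) ** -0.25
            * cmath.exp(-1j * (w - w0) * t)
            * math.exp(-(w - w0) ** 2 / (2.0 * a)))


def stft_impulse(t, w, t0, a):
    """Closed-form STFT of sqrt(2 pi) delta(t - t0) with the Gaussian window."""
    return ((a / math.pi) ** 0.25
            * cmath.exp(-1j * w * t0)
            * math.exp(-a * (t - t0) ** 2 / 2.0))


def spectrogram_terms(t, w, w0, t0, a):
    """Return (sinusoid self term, impulse self term, cross term) of P_SP(t, w)."""
    self_sin = math.exp(-(w - w0) ** 2 / a) / math.sqrt(a * math.pi)
    self_imp = math.sqrt(a / math.pi) * math.exp(-a * (t - t0) ** 2)
    cross = (2.0 / math.sqrt(math.pi)
             * math.exp(-(w - w0) ** 2 / (2.0 * a) - a * (t - t0) ** 2 / 2.0)
             * math.cos(w * (t - t0) - w0 * t))
    return self_sin, self_imp, cross


w0, t0 = 4.0, 2.0
checks = []
for a in (0.5, 2.0, 8.0):
    for t, w in ((1.5, 3.5), (2.0, 4.0), (2.4, 5.0)):
        direct = abs(stft_sinusoid(t, w, w0, a) + stft_impulse(t, w, t0, a)) ** 2
        checks.append((direct, sum(spectrogram_terms(t, w, w0, t0, a))))
```

The sinusoid self term has a frequency width that grows with $a$, while the impulse self term has a time width that shrinks with $a$, so no single $a$ sharpens both.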
