# Expectation-Maximization


```
Expectation-Maximization

Fatih Gelgi, ASU, 2005

6/17/2012                                      1
Outline
   What is EM?
   Intuitive Explanation
   Example: Gaussian Mixture
   Algorithm
   Generalized EM
   Discussion
   Applications
   HMM – Baum-Welch
   K-means

What is EM?
   Two main applications:
     The data has missing values, due to problems with or limitations of the observation process.

     Optimizing the likelihood function directly is extremely hard, but it can be simplified by assuming the existence of, and values for, additional missing or hidden parameters.

     \Theta^* = \arg\max_\Theta L(\Theta \mid U) = \arg\max_\Theta p(U \mid \Theta)
              = \arg\max_\Theta \prod_{i=1}^{N} p(u_i \mid \Theta)
              = \arg\max_\Theta \prod_{i=1}^{N} \sum_{j=1}^{M} \alpha_j \, p_j(u_i \mid \theta_j)

Key Idea…
   The observed data U is generated by some distribution and is called the incomplete data.

   Assume that a complete data set Z = (U, J) exists, where J is the missing or hidden data.

   Maximize the posterior probability of the parameters Θ given the data U, marginalizing over J:

     \Theta^* = \arg\max_\Theta \sum_{J \in \mathcal{J}^n} P(\Theta, J \mid U)


Intuitive Explanation of EM
   Alternate between estimating the unknowns Θ and the hidden variables J.

   In each iteration, instead of finding the best J ∈ 𝒥, compute a distribution over the space 𝒥.

   EM is a lower-bound maximization process (Minka, 1998).

   E-step: construct a local lower-bound to the posterior
distribution.

   M-step: optimize the bound.

Intuitive Explanation of EM
   Lower-bound approximation method

** Sometimes provides faster convergence than alternatives such as Newton’s method.

Example: Mixture Components

Example (cont’d): True Likelihood of Parameters

Example (cont’d): Iterations of EM

Lower-bound Maximization
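   Posterior probability → logarithm of the joint distribution:

     \Theta^* = \arg\max_\Theta \sum_{J \in \mathcal{J}^n} P(\Theta, J \mid U)
              = \arg\max_\Theta \log P(U, \Theta)
              = \arg\max_\Theta \log \sum_{J \in \mathcal{J}^n} P(U, J, \Theta)    (difficult!)

   Start from a guess Θ^t, compute an easily computed lower bound B(Θ; Θ^t) to the function log P(Θ | U), and maximize the bound instead.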

Lower-bound Maximization (cont.)
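   Construct a tractable lower bound B(Θ; Θ^t) that contains a sum of logarithms.

   Let f^t(J) be an arbitrary probability distribution over J.
   By Jensen’s inequality,

     \log P(U, \Theta) = \log \sum_{J \in \mathcal{J}^n} f^t(J) \frac{P(U, J, \Theta)}{f^t(J)}
                       \ge \sum_{J \in \mathcal{J}^n} f^t(J) \log \frac{P(U, J, \Theta)}{f^t(J)} = B(\Theta; \Theta^t)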

Optimal Bound
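   The optimal bound B(Θ; Θ^t) touches the objective function log P(U, Θ) at Θ^t.
   Maximize B(Θ^t; Θ^t) with respect to f^t(J):

     B(\Theta^t; \Theta^t) = \sum_{J} f^t(J) \log P(U, J, \Theta^t) - \sum_{J} f^t(J) \log f^t(J)

   Introduce a Lagrange multiplier λ to enforce the constraint \sum_{J} f^t(J) = 1:

     G(f^t) = B(\Theta^t; \Theta^t) + \lambda \Bigl[ 1 - \sum_{J} f^t(J) \Bigr]

Optimal Bound (cont.)
   Derivative with respect to f^t(J):

     \frac{\partial G}{\partial f^t(J)} = \log P(U, J, \Theta^t) - \log f^t(J) - 1 - \lambda

   Setting the derivative to zero and normalizing, G is maximized at:

     f^t(J) = \frac{P(U, J, \Theta^t)}{\sum_{J'} P(U, J', \Theta^t)} = P(J \mid U, \Theta^t)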

Maximizing the Bound
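   Rewrite B(Θ; Θ^t) in terms of expectations:

     B(\Theta; \Theta^t) = \langle \log P(U, J, \Theta) \rangle + H^t
                         = Q^t(\Theta) + \log P(\Theta) + H^t

   where \langle \cdot \rangle is the expectation with respect to f^t(J) = P(J \mid U, \Theta^t), and

     Q^t(\Theta) = \langle \log P(U, J \mid \Theta) \rangle,    H^t = -\langle \log f^t(J) \rangle

   Finally,

     \Theta^{t+1} = \arg\max_\Theta B(\Theta; \Theta^t) = \arg\max_\Theta \bigl[ Q^t(\Theta) + \log P(\Theta) \bigr]

EM Algorithm
   E-step: compute f^t(J) = P(J | U, Θ^t).
   M-step: Θ^{t+1} = argmax_Θ [Q^t(Θ) + log P(Θ)].

   EM converges to a local maximum of log P(U, Θ), which is also a local maximum of log P(Θ | U).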

A Relation to the Log-Posterior
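   An alternative is to compute the expected log-posterior, with the expectation \langle \cdot \rangle taken over f^t(J) = P(J \mid U, \Theta^t):

     \langle \log P(\Theta \mid U, J) \rangle = \langle \log P(U, J \mid \Theta) \rangle + \log P(\Theta) - \langle \log P(U, J) \rangle

   The last term does not depend on Θ, so maximizing the expected log-posterior is the same as maximizing the bound with respect to Θ:

     \Theta^{t+1} = \arg\max_\Theta \bigl[ \langle \log P(U, J \mid \Theta) \rangle + \log P(\Theta) \bigr]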

Generalized EM
   Assume ln p(X | Θ) and the B function are differentiable in Θ. The EM likelihood converges to a point where

     \frac{\partial}{\partial \Theta} \ln p(X \mid \Theta) = 0

   GEM: instead of setting Θ^{t+1} = argmax_Θ B(Θ; Θ^t), just find a Θ^{t+1} such that
     B(Θ^{t+1}; Θ^t) > B(Θ^t; Θ^t)

   GEM is also guaranteed to converge.

HMM – Baum-Welch Revisited
Estimate the parameters (a, b, π) such that the number of correct individual states is maximized.

γ_t(i) is the probability of being in state S_i at time t.

ξ_t(i,j) is the probability of being in state S_i at time t and in state S_j at time t+1.

Baum-Welch: E-step

Baum-Welch: M-step

K-Means
   Problem: Given data X and the number of clusters K, find the clusters.
   Clustering is based on centroids:

     \mu(c) = \frac{1}{|c|} \sum_{x \in c} x

   A point belongs to the cluster with the closest centroid.
   Hidden variables: the centroids of the clusters!

K-Means (cont.)
   Starting with an initial Θ^0 (the centroids):
   E-step: Split the data into K clusters according to distances to the centroids (calculate the distribution f^t(J)).

   M-step: Update the centroids (calculate Θ^{t+1}).

K-Means Example (K=2)
   Pick seeds
   Reassign clusters
   Compute centroids
   Reassign clusters
   Compute centroids
   Reassign clusters
   Converged!

Discussion
   Is EM a Primal-Dual algorithm?

References:
   A. P. Dempster et al., “Maximum Likelihood from Incomplete Data via the EM Algorithm”, Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, No. 1 (1977), pp. 1-38.
   F. Dellaert, “The Expectation Maximization Algorithm”, Tech. Rep. GIT-GVU-02-20, 2002.
   T. Minka, “Expectation-Maximization as lower bound maximization”, 1998.
   Y. Chang, M. Kölsch, Presentation: Expectation Maximization, UCSB, 2002.
   K. Andersson, Presentation: Model Optimization using the EM algorithm, COSC 7373, 2001.

Thanks!


```
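The E-step and M-step outlined in the slides can be illustrated on the Gaussian mixture example. The following is a minimal sketch, not the author's implementation: it assumes 1-D data, a flat prior on the parameters (so the M-step reduces to maximum likelihood), and a quantile-based initial guess. The function name `em_gmm_1d` and all variable names are hypothetical.

```python
import numpy as np

def em_gmm_1d(u, n_components=2, n_iter=100):
    """EM sketch for a 1-D Gaussian mixture (flat prior on the parameters assumed)."""
    u = np.asarray(u, dtype=float)
    n = u.shape[0]
    # Initial guess Theta^0 (an assumption of this sketch): uniform weights,
    # means at evenly spaced data quantiles, shared variance.
    weights = np.full(n_components, 1.0 / n_components)
    means = np.quantile(u, (np.arange(n_components) + 0.5) / n_components)
    variances = np.full(n_components, u.var())
    log_liks = []
    for _ in range(n_iter):
        # E-step: f^t(J) = P(J | U, Theta^t), one row of responsibilities per point.
        diff = u[:, None] - means[None, :]
        dens = np.exp(-0.5 * diff**2 / variances) / np.sqrt(2.0 * np.pi * variances)
        joint = weights * dens                      # alpha_j * p_j(u_i | theta_j)
        evidence = joint.sum(axis=1, keepdims=True) # p(u_i | Theta^t)
        resp = joint / evidence
        log_liks.append(float(np.log(evidence).sum()))
        # M-step: closed-form maximizer of the lower bound for Gaussian components.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp * u[:, None]).sum(axis=0) / nk
        variances = (resp * (u[:, None] - means) ** 2).sum(axis=0) / nk
    return weights, means, variances, log_liks

# Synthetic data from two well-separated Gaussians (illustrative values only).
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(3.0, 1.0, 300)])
weights, means, variances, log_liks = em_gmm_1d(data)
print("weights:", weights)
print("means:", means)
```

The recorded log-likelihood should not decrease from one iteration to the next, which is the lower-bound maximization property described in the slides. This sketch has no safeguard against a component collapsing onto a single point, so it is only suitable for well-behaved data.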
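The K-means slides describe a hard-assignment variant of the same alternation: the E-step assigns each point to its closest centroid, and the M-step recomputes each centroid as the mean of its cluster. Below is a minimal sketch under stated assumptions: the seeds are simply the first K data points, distances are Euclidean, and the function name `kmeans` is hypothetical.

```python
import numpy as np

def kmeans(x, k, n_iter=100):
    """K-means sketch: alternate hard assignments and centroid updates."""
    x = np.asarray(x, dtype=float)
    # "Pick seeds": the first k points serve as Theta^0 (an arbitrary choice here).
    centroids = x[:k].copy()
    labels = np.full(len(x), -1)
    for _ in range(n_iter):
        # E-step: assign every point to the cluster with the closest centroid.
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments unchanged: converged
        labels = new_labels
        # M-step: each centroid becomes the mean of the points assigned to it.
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:  # leave an empty cluster's centroid untouched
                centroids[c] = members.mean(axis=0)
    return centroids, labels

# Two small, well-separated groups of 2-D points (illustrative values only).
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]])
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

Even though both seeds start in the same group, the reassign and recompute steps separate the two groups after two updates, mirroring the "Reassign clusters" and "Compute centroids" sequence in the example slide.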