Expectation Maximization - UCLA

Expectation Maximization
• First introduced in 1977

• Lots of mathematical derivation

• Problem: we are given a set of data that is incomplete or
has missing values.

• Goal: assuming the data come from an underlying
distribution, estimate the most likely (maximum
likelihood) parameters of that model.
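
The goal above can be written compactly. This is a sketch that uses the symbols X (the data) and Θ (the parameters) from the example, and it assumes the data points are independent, which the slides do not state explicitly:

```latex
\hat{\Theta} = \arg\max_{\Theta} \, p(X \mid \Theta)
             = \arg\max_{\Theta} \, \sum_{n} \log p(x_n \mid \Theta)
```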
Example
                                                           
• Given a set of data points in R^2: $X = \{x_1, x_2, \ldots, x_n\}$

• Assume underlying distribution is mixture of Gaussians

• Goal: estimate the parameters of each Gaussian
distribution

• Θ is the set of parameters: it consists of the mean and
variance of each Gaussian, and k is the number of Gaussians
in the mixture.

  $\Theta = \{\theta_1, \theta_2, \ldots, \theta_k\}$
Steps of EM algorithm (1)
• randomly pick initial values for θ_k (mean and variance)

• for each x_n, associate it with a responsibility value r

• r_{n,k}: how likely it is that the nth point comes from
(belongs to) the kth mixture

• how to find r?

(Figure: assume the data come from these two distributions)
Steps of EM algorithm (2)

$$ r_{n,k} = \frac{p(x_n \mid \theta_k)}{\sum_{i=1}^{k} p(x_n \mid \theta_i)} $$

• $p(x_n \mid \theta_k)$: probability that we observe x_n in the
data set, provided it comes from the kth mixture

• p is the distribution given by θ_k

• it depends on the distance between x_n and the center of
the kth mixture
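
The responsibility formula can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's demo code: the function names `gaussian_pdf` and `responsibilities` are hypothetical, each Gaussian is assumed isotropic (one scalar variance), and, as in the formula on this slide, all components are weighted equally.

```python
import numpy as np

def gaussian_pdf(X, mean, var):
    """Isotropic Gaussian density (assumed form) at each row of X."""
    d = X.shape[1]
    sq_dist = np.sum((X - mean) ** 2, axis=1)  # squared distance to the center
    return np.exp(-sq_dist / (2.0 * var)) / (2.0 * np.pi * var) ** (d / 2.0)

def responsibilities(X, means, variances):
    """r[n, k] = p(x_n | theta_k) / sum_i p(x_n | theta_i)."""
    p = np.column_stack(
        [gaussian_pdf(X, m, v) for m, v in zip(means, variances)]
    )
    return p / p.sum(axis=1, keepdims=True)

# Three points in R^2 and two candidate Gaussians (made-up values).
X = np.array([[0.0, 0.0], [0.2, -0.1], [5.0, 5.0]])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.array([1.0, 1.0])

r = responsibilities(X, means, variances)
print(r.round(3))  # one row per point, one column per Gaussian
```

Each row of `r` sums to 1, and a point close to one center gets nearly all of its responsibility from that Gaussian. For points very far from every center the densities can underflow to zero, so a practical version would work with log densities.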
Steps of EM algorithm (3)
• each data point is now associated with (r_{n,1}, r_{n,2}, …, r_{n,k}),
where r_{n,k} is how likely the point belongs to the kth mixture, 0<r<1

• using r, compute the weighted mean and variance for each
Gaussian model

• We get a new Θ; set it as the new parameter and iterate
the process (find new r -> new Θ -> ……)

• Each iteration consists of an expectation step and a
maximization step
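
The weighted mean and variance update can be sketched as follows. This is an illustrative version under the same assumption of isotropic Gaussians (one scalar variance each); `m_step` is a hypothetical name, and the responsibilities `r` are taken as given.

```python
import numpy as np

def m_step(X, r):
    """Weighted mean and isotropic variance for each Gaussian.

    X: (n, d) data points; r: (n, k) responsibilities, rows sum to 1.
    """
    n, d = X.shape
    weight = r.sum(axis=0)                    # total responsibility per Gaussian
    means = (r.T @ X) / weight[:, None]       # responsibility-weighted means, (k, d)
    # Squared distance from every point to every new mean, shape (n, k).
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    # Weighted variance, averaged over the d coordinates.
    variances = (r * sq_dist).sum(axis=0) / (d * weight)
    return means, variances

# Four points with hard 0/1 responsibilities (made-up values).
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]])
r = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

means, variances = m_step(X, r)
print(means)      # [[ 1.  0.] [11. 10.]]
print(variances)  # [0.5 0.5]
```

With 0/1 responsibilities the update reduces to the ordinary per-group mean and variance; with soft responsibilities every point contributes to every Gaussian in proportion to r. A Gaussian whose total responsibility is zero would cause a division by zero, which this sketch does not guard against.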
Ideas and Intuition
• given a set of incomplete (observed) data

• assume observed data come from a specific model

• pick some parameters for that model and use them to
guess the missing values/data (expectation step)

• from the guessed missing data and the observed data, find
the most likely parameters (maximization step)

• iterate the expectation and maximization steps until
convergence
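
The whole loop can be sketched on one-dimensional data, where the "missing data" is which Gaussian generated each point. This is a minimal sketch under stated assumptions, not a definitive implementation: `em_1d` is a hypothetical name, all Gaussians are weighted equally, a fixed number of iterations stands in for a convergence test, and the initial means are spread evenly over the data range instead of being picked at random so that the run is reproducible.

```python
import numpy as np

def em_1d(x, k=2, n_iter=50):
    # Initial guess (assumption: evenly spread means, shared variance).
    means = np.linspace(x.min(), x.max(), k)
    variances = np.full(k, x.var())
    for _ in range(n_iter):
        # Expectation step: guess the missing labels as responsibilities.
        diff = x[:, None] - means                      # shape (n, k)
        dens = np.exp(-diff ** 2 / (2 * variances)) / np.sqrt(2 * np.pi * variances)
        r = dens / dens.sum(axis=1, keepdims=True)
        # Maximization step: weighted mean and variance per Gaussian.
        weight = r.sum(axis=0)
        means = (r * x[:, None]).sum(axis=0) / weight
        variances = (r * (x[:, None] - means) ** 2).sum(axis=0) / weight
    return means, variances, r

# Synthetic data: two well-separated Gaussians (made-up parameters).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(8.0, 1.0, 200)])

means, variances, r = em_1d(x)
print(np.sort(means))  # estimated centers, expected near 0 and 8
print(variances)
```

The deterministic start is a design choice for this example only: when the initial means are nearly identical, the iteration can take a long time to pull them apart.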
Application
• Parameter estimation for Gaussian mixture (demo)

• Baum-Welch algorithm used in Hidden Markov Models

• Difficulties
   • How to model the missing data?

   • How to determine the number of Gaussians in the mixture?

   • What model should be used?

				