A Global Geometric Framework for Nonlinear Dimensionality

Nonlinear Dimensionality Reduction Approaches

Dimensionality Reduction

   The goal:
        To discover the meaningful low-dimensional structures hidden in
        high-dimensional observations.
   Classical techniques
        Principal Component Analysis: preserves the variance
        Multidimensional Scaling: preserves inter-point distances
   Nonlinear techniques
        Isomap
        Locally Linear Embedding
Common Framework

   Algorithm
        Given data D = {x_1, ..., x_n}, construct an n x n affinity
             matrix M.
        Normalize M, yielding M~.
        Compute the m largest eigenvalues lambda_j and eigenvectors v_j
             of M~. Only positive eigenvalues should be considered.
        The embedding of each example x_j is the vector y_j, with y_ij
             the i-th element of the j-th principal eigenvector v_j of M~.
        Alternatively (MDS and Isomap), the embedding is e_i, with
             e_ij = sqrt(lambda_j) y_ij. If the first m eigenvalues are
             positive, then e_i . e_j is the best approximation of M~
             using only m coordinates, in the sense of squared error.
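As an illustrative sketch (not part of the slides), the common framework above fits in a few lines of NumPy. The function name `spectral_embedding` and the toy rank-1 Gram matrix are assumptions made for the demo:

```python
import numpy as np

def spectral_embedding(M, m):
    """Embed n points given an n x n (already normalized) affinity
    matrix M, keeping at most the m largest positive eigenvalues."""
    # Symmetric eigendecomposition; eigh returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1]            # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals[:m] > 0                       # positive eigenvalues only
    lam, V = eigvals[:m][keep], eigvecs[:, :m][:, keep]
    # MDS/Isomap-style coordinates: e_ij = sqrt(lambda_j) * y_ij
    return V * np.sqrt(lam)

# Toy example: Gram matrix of three points on a line.
y = np.array([[-1.0], [0.0], [1.0]])
M = y @ y.T
E = spectral_embedding(M, 1)
```

Since M here is a rank-1 Gram matrix with one positive eigenvalue, `E @ E.T` reconstructs M exactly, illustrating the squared-error optimality claim.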
Linear Dimensionality Reduction

   PCA
       Finds a low-dimensional embedding of the data
        points that best preserves their variance as
        measured in the high-dimensional input space

   MDS
       Finds an embedding that preserves the inter-point
        distances, equivalent to PCA when the distances
        are Euclidean.
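The stated PCA/MDS equivalence for Euclidean distances can be checked numerically. This is a hedged sketch on synthetic Gaussian data (all variable names are illustrative); the two embeddings should agree up to the sign of each coordinate axis:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Xc = X - X.mean(axis=0)

# PCA: project the centered data onto the top eigenvectors of Xc^T Xc.
vals, vecs = np.linalg.eigh(Xc.T @ Xc)
pca = Xc @ vecs[:, np.argsort(vals)[::-1][:2]]

# Classical MDS: double-center the squared Euclidean distances,
# then scale the top eigenvectors by sqrt(eigenvalue).
n = len(Xc)
D2 = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(-1)
S = D2.sum(axis=1)
B = -0.5 * (D2 - S[:, None] / n - S[None, :] / n + S.sum() / n**2)
bvals, bvecs = np.linalg.eigh(B)
idx = np.argsort(bvals)[::-1][:2]
mds = bvecs[:, idx] * np.sqrt(bvals[idx])
```

Comparing `np.abs(pca)` with `np.abs(mds)` shows the coordinates coincide up to axis sign flips.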
Multi-Dimensional Scaling

   MDS starts from a notion of distance or affinity that
    is computed for each pair of training examples.
   The normalization step converts distances to dot products
    using the "double-centering" formula:

        M~_ij = -1/2 (M_ij - (1/n) S_i - (1/n) S_j + (1/n^2) sum_k S_k),
        where S_i = sum_j M_ij.

   The embedding of example x_i is given by e_ik = sqrt(lambda_k) v_ki,
    where v_k is the k-th eigenvector of M~. Note that if
    M_ij = ||y_i - y_j||^2, then M~_ij = (y_i - y_bar) . (y_j - y_bar),
    where y_bar is the average value of the y_i.
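A minimal sketch of the double-centering step, assuming `M` holds squared Euclidean distances; `double_center` is a hypothetical helper name. It also verifies the dot-product identity stated above:

```python
import numpy as np

def double_center(M):
    """Convert squared distances to inner products:
    M~_ij = -1/2 (M_ij - S_i/n - S_j/n + sum_k S_k / n^2),
    with S_i = sum_j M_ij."""
    n = M.shape[0]
    S = M.sum(axis=1)
    return -0.5 * (M - S[:, None] / n - S[None, :] / n + S.sum() / n**2)

# Check on random points: squared distances double-center to the
# Gram matrix of the centered points, (y_i - y_bar) . (y_j - y_bar).
rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 3))
D2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)   # squared distances
Yc = Y - Y.mean(axis=0)                               # centered points
```

Here `double_center(D2)` equals `Yc @ Yc.T`, matching the identity on the slide.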
Nonlinear Dimensionality Reduction

   Many data sets contain essential nonlinear
    structures that are invisible to PCA and MDS.
   This motivates nonlinear dimensionality
    reduction approaches.
A Global Geometric Framework
for Nonlinear Dimensionality
Reduction (Isomap)
    Joshua B. Tenenbaum, Vin de Silva,
    John C. Langford
    [Figure: 64x64 input face images. Intrinsically, three dimensions
     suffice: two pose parameters and the azimuthal lighting angle.]
Isomap Advantages

   Combines the major algorithmic features of
    PCA and MDS
       Computational efficiency
       Global optimality
       Asymptotic convergence guarantees
   Flexibility to learn a broad class of
    nonlinear manifolds
Example of Nonlinear Structure
   Swiss roll
      Only the geodesic distances reflect the true low-dimensional
      geometry of the manifold.

   Isomap is built on top of MDS.
   It captures the geodesic path between any
    two points on the manifold by concatenating
    shortest paths between neighboring points.
   These in-between shortest paths are
    approximated given only input-space distances.
Algorithm Description

   Step 1
      Determining the neighboring points within a fixed radius based on
      the input-space distances d_X(i, j).
      These neighborhood relations are represented as a weighted
      graph G over the data points.
   Step 2
      Estimating the geodesic distances d_M(i, j) between all pairs of
      points on the manifold M by computing their shortest-path
      distances d_G(i, j) in the graph G.
   Step 3
      Constructing an embedding of the data in d-dimensional
      Euclidean space Y that best preserves the manifold's geometry.
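The three steps might be sketched as follows. This is an illustrative implementation, not the authors' code: it uses k nearest neighbors rather than a fixed radius, and plain Floyd-Warshall for the shortest paths:

```python
import numpy as np

def isomap(X, n_neighbors=6, d=2):
    """Minimal Isomap sketch: kNN graph -> shortest paths -> MDS."""
    n = X.shape[0]
    # Input-space Euclidean distances d_X(i, j).
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # Step 1: weighted neighborhood graph G (inf = no edge).
    G = np.full((n, n), np.inf)
    for i in range(n):
        nbrs = np.argsort(D[i])[1:n_neighbors + 1]
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[nbrs, i]
    np.fill_diagonal(G, 0.0)
    # Step 2: geodesic estimates d_G(i, j) by Floyd-Warshall.
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # Step 3: classical MDS (double-centering + top-d eigenvectors).
    S = (G ** 2).sum(axis=1)
    B = -0.5 * (G ** 2 - S[:, None] / n - S[None, :] / n + S.sum() / n**2)
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# Points along a half-circle: intrinsically one-dimensional.
t = np.linspace(0, np.pi, 40)
Y = isomap(np.c_[np.cos(t), np.sin(t)], n_neighbors=4, d=1)
```

The half-circle unrolls to a line: the single recovered coordinate should track the arc-length parameter t up to sign and scale.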
Construct Embeddings

   The coordinate vectors y_i for points in Y are chosen
    to minimize the cost function

        E = || tau(D_G) - tau(D_Y) ||_{L^2}

      where D_Y denotes the matrix of Euclidean distances
      d_Y(i, j) = ||y_i - y_j||, and ||A||_{L^2} is the L^2 matrix
      norm sqrt(sum_{i,j} A_ij^2). The tau operator converts
      distances to inner products.

   The true dimensionality of the data can be
    estimated from the decrease in error as the
    dimensionality of Y is increased.
Manifold Recovery Guarantee

   Isomap is guaranteed asymptotically to recover the
    true dimensionality and geometric structure of
    nonlinear manifolds.
   As the number of sample points increases, the graph
    distances d_G(i, j) provide increasingly better
    approximations to the intrinsic geodesic distances d_M(i, j).

   [Figure: interpolations between distant points in the
    low-dimensional coordinate space.]

   Isomap handles nonlinear manifolds.
   Isomap keeps the advantages of PCA and MDS:
       Non-iterative procedure
       Polynomial-time procedure
       Guaranteed convergence
   Isomap represents the global structure of a
    data set within a single coordinate system.
Nonlinear Dimensionality
Reduction by Locally Linear
Embedding (LLE)
    Sam T. Roweis and Lawrence K. Saul

   Neighborhood-preserving embeddings
   Mapping to a global coordinate system of
    low dimensionality
   No need to estimate pairwise distances
    between widely separated points
   Recovers global nonlinear structure from
    locally linear fits
Algorithm Description

   We expect each data point and its neighbors to lie
    on or close to a locally linear patch of the manifold.
   We reconstruct each point from its neighbors by
    minimizing the error

        epsilon(W) = sum_i | x_i - sum_j W_ij x_j |^2

        where W_ij summarizes the contribution of the j-th data point
        to the reconstruction of the i-th, and is estimated by
        optimizing this error, subject to two constraints:
       Each point is reconstructed only from its neighbors
        (W_ij = 0 if x_j is not a neighbor of x_i)
       The rows of W sum to 1 (sum_j W_ij = 1)
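A sketch of the constrained least-squares solve for W, assuming k-nearest-neighbor neighborhoods; the small regularizer is a standard practical addition for stability, not something stated on the slides:

```python
import numpy as np

def lle_weights(X, n_neighbors=5, reg=1e-3):
    """Reconstruction weights W: W_ij nonzero only for neighbors of i,
    each row summing to 1, minimizing |x_i - sum_j W_ij x_j|^2."""
    n = X.shape[0]
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:n_neighbors + 1]
        Z = X[nbrs] - X[i]                  # shift neighbors to origin
        C = Z @ Z.T                         # local covariance
        C += reg * np.trace(C) * np.eye(len(nbrs))   # regularize
        w = np.linalg.solve(C, np.ones(len(nbrs)))
        W[i, nbrs] = w / w.sum()            # enforce sum-to-one
    return W

# Toy data just to exercise the function.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
W = lle_weights(X)
```

Solving `C w = 1` and normalizing is the closed-form solution of the row-wise least-squares problem under the sum-to-one constraint.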
Algorithm Description

   A linear mapping transforms the high-dimensional
    coordinates of each neighbor to global internal
    coordinates on the manifold:

        min_Y Phi(Y) = sum_i | y_i - sum_j W_ij y_j |^2

    Note that the cost defines a quadratic form

        Phi(Y) = sum_ij M_ij (y_i . y_j)

    where M_ij = delta_ij - W_ij - W_ji + sum_k W_ki W_kj.

    The optimal embedding is found by computing the
    bottom d eigenvectors of M, where d is the dimension
    of the embedding.
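A sketch of this final eigenvector step, assuming the rows of W sum to 1; note that in standard practice the very bottom eigenvector of M is the constant vector (eigenvalue near 0) and is discarded, so the next d eigenvectors give the embedding:

```python
import numpy as np

def lle_embed(W, d=2):
    """Build M = (I - W)^T (I - W), whose entries are
    delta_ij - W_ij - W_ji + sum_k W_ki W_kj, and return the
    bottom eigenvectors, skipping the constant one."""
    n = W.shape[0]
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    vals, vecs = np.linalg.eigh(M)          # ascending eigenvalues
    return vecs[:, 1:d + 1]                 # skip the constant eigenvector

# Toy row-stochastic W just to exercise the function.
rng = np.random.default_rng(0)
A = rng.random((10, 10))
np.fill_diagonal(A, 0.0)
W = A / A.sum(axis=1, keepdims=True)        # rows sum to 1
Y = lle_embed(W, d=2)
```

Because each row of W sums to 1, (I - W) annihilates the all-ones vector, which is why M always has the constant vector in its null space.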
   [Figure: two-dimensional embeddings of faces.]
Thank you
