Basic principles of probability theory

					                          Procrustes analysis
•   Purpose of procrustes analysis
•   Algorithm
•   Various modifications
                           Purpose of procrustes analysis
There are many situations in which different techniques produce different configurations of the
       same observations. For example, metric and non-metric scaling may generate different
       configurations. Even when only metric scaling is used, different proximity (dissimilarity)
       matrices can produce different configurations. Since these techniques are applied to the
       same multivariate observations, each observation in one configuration corresponds to
       exactly one observation in the other. Most of these techniques produce a configuration
       that is determined only up to rotation.
Scores in factor analysis can also be considered as one of the possible configurations.
There are other situations in which configurations need to be compared. For example, in
       macromolecular biology the 3-dimensional structures of different proteins are derived using
       some experimental technique. One interesting question is whether two different proteins
       are similar and, if they are, how similar they are. To measure similarity it is
       necessary to match the configurations of the two protein structures.
All these questions can be addressed using procrustes analysis.
Suppose we have two configurations $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, where each
       $x_i$ and $y_i$ is a vector in p-dimensional space. We want to find an orthogonal matrix A and
       a vector b so that:

$$M^2 = \sum_{i=1}^{n} \| x_i - A y_i - b \|^2 = \sum_{i=1}^{n} \sum_{j=1}^{p} \left( x_{ij} - (A y_i + b)_j \right)^2 \to \min$$
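The objective above can be evaluated directly for any candidate A and b. The following is a minimal sketch, assuming NumPy and assuming the points are stored as the rows of (n, p) arrays; the function name `procrustes_objective` and the small data set are illustrative and not part of the original notes.

```python
import numpy as np

def procrustes_objective(X, Y, A, b):
    """Return M^2 = sum_i ||x_i - A y_i - b||^2.

    X, Y: (n, p) arrays whose rows are the points x_i and y_i.
    A: (p, p) orthogonal matrix; b: (p,) translation vector.
    """
    Z = Y @ A.T + b              # row i of Z is (A y_i + b)^T
    return float(np.sum((X - Z) ** 2))

# Illustrative data: X is Y rotated by 90 degrees and then shifted.
Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
A_true = np.array([[0.0, -1.0], [1.0, 0.0]])
b_true = np.array([3.0, -1.0])
X = Y @ A_true.T + b_true

m2_exact = procrustes_objective(X, Y, A_true, b_true)            # the generating A, b
m2_identity = procrustes_objective(X, Y, np.eye(2), np.zeros(2))  # no rotation, no shift
```

With the generating rotation and translation the objective vanishes, while leaving Y untouched gives a strictly positive value.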
                         Procrustes analysis: vector and matrix
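It can be shown that finding the translation (b) and the rotation matrix (A) can be considered
       separately. The translation is easily found by centring each configuration: once the
       rotation is known, the translation follows from it. Let us denote $z_i = A y_i + b$, and let
       $\bar{x}$, $\bar{y}$ and $\bar{z}$ be the centroids of the $x_i$, $y_i$ and $z_i$. Then we can write:

$$\sum_{i=1}^{n} \| x_i - z_i \|^2 = \sum_{i=1}^{n} \| (x_i - \bar{x}) - (z_i - \bar{z}) \|^2 + n \| \bar{x} - \bar{z} \|^2$$

Since $z_i - \bar{z} = A (y_i - \bar{y})$, the first term on the right does not depend on b.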

It is minimised when the centroids of x and z coincide, i.e. $\bar{x} = \bar{z} = A\bar{y} + b$, which gives:

$$b = \bar{x} - A \bar{y}$$
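The separation of translation from rotation can be checked numerically. This is a sketch under stated assumptions: NumPy, points stored as rows, random illustrative data, and an arbitrary orthogonal A taken from a QR factorisation (none of these appear in the original notes). It verifies the centroid decomposition of the sum of squares and that $b = \bar{x} - A\bar{y}$ is no worse than another choice of b.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, p))

# Any orthogonal A will do here; take the Q factor of a random matrix.
A, _ = np.linalg.qr(rng.normal(size=(p, p)))

def m2(b):
    """Sum of squared distances between x_i and z_i = A y_i + b."""
    Z = Y @ A.T + b
    return float(np.sum((X - Z) ** 2))

x_bar = X.mean(axis=0)
y_bar = Y.mean(axis=0)
b_opt = x_bar - A @ y_bar          # makes the centroids of x and z coincide

# Decomposition for an arbitrary translation b_any.
b_any = rng.normal(size=p)
Z = Y @ A.T + b_any
z_bar = Z.mean(axis=0)
centred_term = float(np.sum(((X - x_bar) - (Z - z_bar)) ** 2))  # independent of b
offset_term = n * float(np.sum((x_bar - z_bar) ** 2))           # zero when x_bar == z_bar
```

Only the offset term changes with b, so choosing b to cancel it leaves the centred term as the remaining quantity to minimise over A.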
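We want the centroids of the two configurations to match. This can be done by subtracting from x
     and y their respective centroids. The remaining problem is to find the orthogonal matrix (a
     rotation or an inversion). Let X and Y be the n x p matrices whose rows are the centred
     $x_i^T$ and $y_i^T$. Applying the orthogonal matrix to the rows of Y as YA (that is, to each
     point as $A^T y_i$, which only relabels A because A is an arbitrary orthogonal matrix), we
     can write:

$$M^2 = \sum_{i=1}^{n} \| x_i - A^T y_i \|^2 = \operatorname{tr}\left( (X - YA)(X - YA)^T \right) = \operatorname{tr}(X X^T) + \operatorname{tr}(Y A A^T Y^T) - 2 \operatorname{tr}(X^T Y A) = \operatorname{tr}(X X^T) + \operatorname{tr}(Y Y^T) - 2 \operatorname{tr}(X^T Y A)$$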
Here we used the fact that A is an orthogonal matrix:

$$A A^T = I$$
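The trace form of the criterion can be compared with the direct sum of squares. This is a minimal sketch, assuming NumPy, centred (n, p) arrays with points as rows, and the rotation applied to the rows of Y as `Y @ A`; the random data and the orthogonal matrix are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, p))
X = X - X.mean(axis=0)             # centre both configurations
Y = Y - Y.mean(axis=0)

A, _ = np.linalg.qr(rng.normal(size=(p, p)))   # some orthogonal matrix, A A^T = I

# Direct sum of squared differences between the rows of X and Y A.
direct = float(np.sum((X - Y @ A) ** 2))

# tr(XX^T) + tr(YY^T) - 2 tr(X^T Y A), which relies on A A^T = I.
via_trace = float(
    np.trace(X @ X.T) + np.trace(Y @ Y.T) - 2.0 * np.trace(X.T @ Y @ A)
)
```

Only the last trace depends on A, which is why the minimisation of the criterion turns into a maximisation of that single term.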
Then we want to perform the constrained maximisation:

$$\operatorname{tr}(X^T Y A) \to \max \quad \text{subject to} \quad A A^T = I$$

This can be done using the technique of Lagrange multipliers.
                                          Rotation matrix using SVD
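Let us denote the symmetric matrix of Lagrange multipliers for the constraints by $\frac{1}{2}\Lambda$. Then we want to maximise:

$$V = \operatorname{tr}\left( X^T Y A - \tfrac{1}{2} \Lambda (A A^T - I) \right) \to \max$$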
If we take the derivatives of this expression with respect to the elements of A and set them to zero, we get:

$$Y^T X = \Lambda A$$

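Here we used the following facts (r and s index the elements of A):

$$\operatorname{tr}(BA) = \sum_{j=1}^{p} \sum_{i=1}^{n} b_{ji} a_{ij}, \qquad \frac{\partial \operatorname{tr}(BA)}{\partial a_{rs}} = b_{sr} \;\Rightarrow\; \frac{\partial \operatorname{tr}(BA)}{\partial A} = B^T$$

and, remembering that the matrix of the constraints is symmetric:

$$\operatorname{tr}(\Lambda A A^T) = \sum_{m=1}^{p} \sum_{i=1}^{p} \sum_{k=1}^{p} \lambda_{mi} a_{ik} a_{mk}, \qquad \frac{\partial \operatorname{tr}(\Lambda A A^T)}{\partial a_{rs}} = \sum_{i=1}^{p} \lambda_{ri} a_{is} + \sum_{m=1}^{p} \lambda_{mr} a_{ms} \;\Rightarrow\; \frac{\partial \operatorname{tr}(\Lambda A A^T)}{\partial A} = 2 \Lambda A$$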
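The two matrix-derivative facts can be checked with finite differences. The sketch below assumes NumPy; the helper `numerical_gradient` and the random matrices are hypothetical illustrations, not part of the original notes. The derivatives are taken at a general (not necessarily orthogonal) A, because the constraint is handled by the multipliers.

```python
import numpy as np

def numerical_gradient(f, A, eps=1e-6):
    """Central-difference estimate of the matrix of partials d f / d a_rs."""
    G = np.zeros_like(A)
    for r in range(A.shape[0]):
        for s in range(A.shape[1]):
            E = np.zeros_like(A)
            E[r, s] = eps
            G[r, s] = (f(A + E) - f(A - E)) / (2.0 * eps)
    return G

rng = np.random.default_rng(2)
p = 3
B = rng.normal(size=(p, p))
S = rng.normal(size=(p, p))
Lam = S + S.T                      # symmetric matrix of multipliers
A = rng.normal(size=(p, p))

# d tr(BA) / dA should equal B^T.
grad_tr_BA = numerical_gradient(lambda M: np.trace(B @ M), A)

# d tr(Lam A A^T) / dA should equal 2 Lam A when Lam is symmetric.
grad_tr_LAAt = numerical_gradient(lambda M: np.trace(Lam @ M @ M.T), A)
```

If `Lam` were not symmetric, the second gradient would be `(Lam + Lam.T) @ A` instead, which is where the symmetry of the multiplier matrix enters.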

We now have the equations needed to find the required orthogonal matrix. Let us use the SVD of $Y^T X$:

$$Y^T X = U D V^T$$

U and V are p x p orthogonal matrices. D is the diagonal matrix of the singular values.
                                         Rotation matrix and SVD
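If we use the fact that A is orthogonal then we can write:

$$Y^T X = \Lambda A \;\Rightarrow\; Y^T X X^T Y = \Lambda A A^T \Lambda \;\Rightarrow\; (U D V^T)(V D U^T) = \Lambda^2 \;\Rightarrow\; U D^2 U^T = \Lambda^2 \;\Rightarrow\; \Lambda = U D U^T$$

$$Y^T X = \Lambda A \;\Rightarrow\; U D V^T = U D U^T A \;\Rightarrow\; U V^T = A$$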
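This gives the solution for the rotation (orthogonal) matrix. Now we can calculate the least-squares
       difference between the configurations:

$$M_0^2 = \operatorname{tr}(X X^T) + \operatorname{tr}(Y Y^T) - 2 \operatorname{tr}(X^T Y A) = \operatorname{tr}(X X^T) + \operatorname{tr}(Y Y^T) - 2 \operatorname{tr}(V D U^T U V^T) = \operatorname{tr}(X X^T) + \operatorname{tr}(Y Y^T) - 2 \operatorname{tr}(V D V^T)$$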

Thus we have expressions for the rotation matrix and for the difference between the configurations
     after matching. It is interesting to note that to find the difference between configurations
     it is not necessary to rotate them. Since $\operatorname{tr}(V D V^T) = \operatorname{tr}(D)$, this expression can also be written:

$$M_0^2 = \operatorname{tr}(X X^T) + \operatorname{tr}(Y Y^T) - 2 \operatorname{tr}(D)$$
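The SVD solution can be sketched in a few lines. This is an illustrative sketch, not a reference implementation: it assumes NumPy, centred (n, p) arrays with points as rows, and the rotation applied as `Y @ A`; the function name `procrustes_rotation`, the matrix `R` and the noisy test data are hypothetical.

```python
import numpy as np

def procrustes_rotation(X, Y):
    """Return (A, m0_sq) for centred (n, p) arrays X and Y.

    A = U V^T from the SVD Y^T X = U D V^T, so that Y @ A is matched to X,
    and m0_sq = tr(XX^T) + tr(YY^T) - 2 tr(D).
    """
    U, d, Vt = np.linalg.svd(Y.T @ X)      # d holds the singular values
    A = U @ Vt
    m0_sq = float(np.sum(X ** 2) + np.sum(Y ** 2) - 2.0 * np.sum(d))
    return A, m0_sq

rng = np.random.default_rng(3)
n, p = 8, 3
Y = rng.normal(size=(n, p))
Y = Y - Y.mean(axis=0)
R, _ = np.linalg.qr(rng.normal(size=(p, p)))   # illustrative orthogonal matrix
X = Y @ R + 0.05 * rng.normal(size=(n, p))     # rotated copy of Y plus small noise
X = X - X.mean(axis=0)

A, m0_sq = procrustes_rotation(X, Y)

# The residual computed after actually rotating Y agrees with the
# trace formula, which needs only the singular values.
residual_after_rotation = float(np.sum((X - Y @ A) ** 2))
residual_with_R = float(np.sum((X - Y @ R) ** 2))
```

Note that A is only constrained to be orthogonal, so it may include a reflection (determinant -1); restricting to proper rotations would need an extra step that is not covered here.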

One more useful expression is:

$$M_0^2 = \operatorname{tr}(X X^T) + \operatorname{tr}(Y Y^T) - 2 \operatorname{tr}\left( (X^T Y Y^T X)^{1/2} \right)$$

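This expression shows that it is not even necessary to do an SVD to find the difference between
      configurations. (The trace of the symmetric square root of $X^T Y Y^T X$ is the sum of the
      square roots of its eigenvalues. Note that a Cholesky factor is not the symmetric square
      root and in general has a different trace.)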
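The square-root form can be evaluated without an SVD. The sketch below assumes NumPy and takes the square root to be the symmetric (principal) square root, whose trace is the sum of the square roots of the eigenvalues; that choice, the function name `residual_without_svd` and the random data are assumptions of this example. The SVD is used only to cross-check the result.

```python
import numpy as np

def residual_without_svd(X, Y):
    """M0^2 = tr(XX^T) + tr(YY^T) - 2 tr((X^T Y Y^T X)^{1/2})."""
    C = X.T @ Y @ Y.T @ X                    # symmetric, positive semi-definite
    eigvals = np.linalg.eigvalsh(C)
    eigvals = np.clip(eigvals, 0.0, None)    # guard against tiny negative round-off
    trace_sqrt = float(np.sum(np.sqrt(eigvals)))
    return float(np.sum(X ** 2) + np.sum(Y ** 2) - 2.0 * trace_sqrt)

rng = np.random.default_rng(4)
n, p = 8, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, p))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

m0_sq_eig = residual_without_svd(X, Y)

# Cross-check against the singular-value form tr(XX^T) + tr(YY^T) - 2 tr(D).
d = np.linalg.svd(Y.T @ X, compute_uv=False)
m0_sq_svd = float(np.sum(X ** 2) + np.sum(Y ** 2) - 2.0 * np.sum(d))
```

The two values agree because the eigenvalues of $X^T Y Y^T X$ are the squared singular values of $Y^T X$.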
                                      Some modifications
There are some situations where problems may occur:
1)    The dimensions of the configurations can be different. There are two ways of handling this
      problem. The first way is to pad the low-dimensional (k) configuration with zeros to make it
      high (p) dimensional. Here we assume that the k-dimensional configuration lies in the
      k-dimensional subspace of the p-dimensional space spanned by the first k coordinates.
      The second way is to collapse the high-dimensional configuration to a low-dimensional one.
      For this we need to project the p-dimensional configuration onto a k-dimensional space.
2)    The second problem arises when the scales of the configurations are different. In this case
      we can add a scale factor c to the function we want to minimise:

$$z_i = c A y_i + b \;\Rightarrow\; M^2 = \operatorname{tr}(X X^T) + c^2 \operatorname{tr}(Y Y^T) - 2 c \operatorname{tr}(X^T Y A)$$

If we find the orthogonal matrix as before, then the scale factor is:

$$c = \operatorname{tr}\left( (X^T Y Y^T X)^{1/2} \right) / \operatorname{tr}(Y Y^T)$$

As a result M is no longer symmetric with respect to X and Y.
3) Sometimes it is necessary to weight some variables down and others up. In this case procrustes
       analysis can be performed using weights. With a weight matrix W, we want to minimise the function:

$$M^2 = \operatorname{tr}\left( W (X - YA)(X - YA)^T \right)$$
This modification can be taken into account. Analysis becomes easy when weight matrix is
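The first two modifications in the list above (padding a lower-dimensional configuration with zeros, and fitting a scale factor) can be sketched together. This is a hedged illustration assuming NumPy, centred configurations with points as rows, and the fit `c * Y @ A`; the function names `pad_to_dimension` and `procrustes_with_scale` and the test data are hypothetical and not from the original notes.

```python
import numpy as np

def pad_to_dimension(Y_low, p):
    """Append zero columns so an (n, k) configuration becomes (n, p), k <= p."""
    n, k = Y_low.shape
    if k > p:
        raise ValueError("target dimension p must be at least k")
    return np.hstack([Y_low, np.zeros((n, p - k))])

def procrustes_with_scale(X, Y):
    """Return (A, c, m_sq) for centred (n, p) arrays, fitting c * Y @ A to X."""
    U, d, Vt = np.linalg.svd(Y.T @ X)
    A = U @ Vt
    # c = tr((X^T Y Y^T X)^{1/2}) / tr(YY^T); the numerator equals sum(d).
    c = float(np.sum(d) / np.sum(Y ** 2))
    m_sq = float(np.sum(X ** 2) + c ** 2 * np.sum(Y ** 2) - 2.0 * c * np.sum(d))
    return A, c, m_sq

rng = np.random.default_rng(5)
n, k, p = 7, 2, 3
Y_low = rng.normal(size=(n, k))
Y_low = Y_low - Y_low.mean(axis=0)
Y_pad = pad_to_dimension(Y_low, p)           # 2-D configuration embedded in 3-D

R, _ = np.linalg.qr(rng.normal(size=(p, p))) # illustrative orthogonal matrix
c_true = 2.5
X = c_true * Y_pad @ R                       # scaled, rotated copy; still centred

A, c, m_sq = procrustes_with_scale(X, Y_pad)
residual = float(np.sum((X - c * Y_pad @ A) ** 2))
```

Because the scale is estimated relative to Y, swapping the roles of X and Y in `procrustes_with_scale` generally gives a different value of the criterion.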

