Canonical Correlation


      Mechanics
                              Data
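• The input correlation setup is the full correlation matrix, partitioned into four blocks:
      $$\begin{pmatrix} R_{xx} & R_{xy} \\ R_{yx} & R_{yy} \end{pmatrix}$$
• The canonical correlation matrix is the
  product of four correlation matrices:
  between DVs (inverse of Ryy), between IVs
  (inverse of Rxx), and between DVs
  and IVs (Ryx and Rxy)
      $$R = R_{yy}^{-1} R_{yx} R_{xx}^{-1} R_{xy}$$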
• It also can be thought of as a product
  of regression coefficients for
  predicting Xs from Ys, and Ys from Xs
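
As a rough illustration of the matrix product on this slide, the NumPy sketch below builds the four blocks from raw scores and forms R. The data, the seed, and the helper name `canonical_matrix` are made up for the example; they do not come from the slides.

```python
import numpy as np

def canonical_matrix(X, Y):
    """Form R = Ryy^-1 Ryx Rxx^-1 Rxy from raw scores (rows are cases)."""
    kx = X.shape[1]
    # Full correlation matrix of all variables, then split into the four blocks
    R_full = np.corrcoef(np.hstack([X, Y]), rowvar=False)
    Rxx, Rxy = R_full[:kx, :kx], R_full[:kx, kx:]
    Ryx, Ryy = R_full[kx:, :kx], R_full[kx:, kx:]
    # Ryy^-1 Ryx: standardized weights for predicting the Xs from the Ys
    # Rxx^-1 Rxy: standardized weights for predicting the Ys from the Xs
    return np.linalg.solve(Ryy, Ryx) @ np.linalg.solve(Rxx, Rxy)

# Hypothetical data: 200 cases, 3 IVs (X) and 2 DVs (Y)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X[:, :2] + rng.normal(size=(200, 2))

R = canonical_matrix(X, Y)
print(R.shape)  # (2, 2): one row and one column per DV
```

Using `np.linalg.solve` instead of an explicit inverse is only a numerical convenience; the product is the same.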
           What does it mean?
• In this context the eigenvalues of that R matrix represent
  the proportion of overlapping variance between the
  canonical variate pairs
• To get the canonical correlations, you get the
  eigenvalues of R and take the square root
      $$r_{ci}^2 = \lambda_i$$
      $$r_{ci} = \sqrt{\lambda_i}$$
• The eigenvector corresponding to each eigenvalue is
  transformed into the coefficients that specify the linear
  combination that will make up a canonical variate
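
The eigenvalue step can be sketched as follows, assuming NumPy and a made-up data set (the data and seed are hypothetical). Because R is not symmetric, the sketch uses the general eigenvalue routine and keeps only the real parts, which in theory are the whole answer.

```python
import numpy as np

# Hypothetical data: 300 cases, 3 IVs (X) and 2 DVs (Y)
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
Y = 0.8 * X[:, :2] + rng.normal(size=(300, 2))

kx = X.shape[1]
R_full = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy = R_full[:kx, :kx], R_full[:kx, kx:]
Ryx, Ryy = R_full[kx:, :kx], R_full[kx:, kx:]
R = np.linalg.solve(Ryy, Ryx) @ np.linalg.solve(Rxx, Rxy)

# Eigenvalues = squared canonical correlations, largest first
eigvals = np.sort(np.linalg.eigvals(R).real)[::-1]

# Clip guards against tiny round-off outside [0, 1] before the square root
r_c = np.sqrt(np.clip(eigvals, 0.0, 1.0))
print(r_c)
```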
        Canonical Coefficients
• Two sets of canonical coefficients
  (weights) are required
  – One set to combine the Xs
  – One to combine the Ys
  – Same interpretation as regression coefficients
                         Equations
$$B_y = (R_{yy}^{-1/2})' \hat{B}_y$$

Where $(R_{yy}^{-1/2})'$ is the transpose of the inverse of the square root of the
correlation matrix of DVs, and $\hat{B}_y$ is a normalized matrix
of eigenvectors for the DVs

$$B_x = R_{xx}^{-1} R_{xy} B_y^*$$

Where $B_y^*$ is $B_y$ from above with each entry divided by its corresponding
canonical correlation.
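
A minimal NumPy sketch of the two coefficient equations follows. It makes one assumption the slides do not spell out: the normalized eigenvectors are taken from the symmetric matrix Ryy^(-1/2) Ryx Rxx^(-1) Rxy Ryy^(-1/2), which has the same eigenvalues as R and yields variates with unit variance. The data and seed are hypothetical.

```python
import numpy as np

# Hypothetical data: 300 cases, 3 IVs (X) and 2 DVs (Y)
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))
Y = 0.8 * X[:, :2] + rng.normal(size=(300, 2))

kx = X.shape[1]
R_full = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy = R_full[:kx, :kx], R_full[:kx, kx:]
Ryy = R_full[kx:, kx:]

# Symmetric inverse square root of Ryy via its eigendecomposition
w, V = np.linalg.eigh(Ryy)
Ryy_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

# Assumed symmetric form of the eigenproblem (same eigenvalues as R)
Rxx_inv_Rxy = np.linalg.solve(Rxx, Rxy)
S = Ryy_inv_sqrt @ Rxy.T @ Rxx_inv_Rxy @ Ryy_inv_sqrt
lam, B_hat = np.linalg.eigh(S)            # orthonormal eigenvectors
order = np.argsort(lam)[::-1]             # largest eigenvalue first
lam, B_hat = lam[order], B_hat[:, order]
r_c = np.sqrt(lam)                        # canonical correlations

# B_y = (Ryy^-1/2)' B_hat
B_y = Ryy_inv_sqrt.T @ B_hat

# B_x = Rxx^-1 Rxy B_y*, where B_y* divides each column of B_y by its r_c
B_x = Rxx_inv_Rxy @ (B_y / r_c)
```

With these definitions, `B_x.T @ Rxy @ B_y` comes out as a diagonal matrix holding the canonical correlations, which is a convenient sanity check.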
             Is it significant?
• Testing Canonical Correlations
  – There will be as many canonical correlations as
    there are variables in the smaller set
  – Not all will be statistically significant
• Bartlett’s chi-square test (Wilks’ lambda on
  printouts)
  – Tests whether an eigenvalue and the ones that
    follow are significantly different from zero.
        $$\chi^2 = -\left[ N - 1 - \frac{k_x + k_y + 1}{2} \right] \ln \Lambda_m$$
       Where N is number of cases, $k_x$ is number of X variables and
       $k_y$ is number of Y variables
        $$\Lambda_m = (1 - \lambda_1)(1 - \lambda_2) \cdots (1 - \lambda_m)$$
       Lambda, Λ, is the product of the differences between 1 and the eigenvalues
       (the squared canonical correlations, $r_c^2$), generated across m canonical correlations.



• Essentially it is an omnibus test of whether the
  eigenvalues are significantly different from zero
• It is possible this test would be significant even
  though a test for the correlation itself would not
  be
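
The test statistic can be sketched in a few lines of standard-library Python. The function name `bartlett_chi2`, the `skip` argument, and the example numbers are hypothetical. The degrees of freedom, (kx - skip)(ky - skip), follow the usual textbook convention and are not stated on the slide.

```python
import math

def bartlett_chi2(eigvals, n_cases, kx, ky, skip=0):
    """Bartlett's chi-square for the canonical correlations that remain
    after the first `skip` (largest) ones have been set aside."""
    lam = sorted(eigvals, reverse=True)[skip:]
    # Wilks' lambda: product of (1 - eigenvalue) over the remaining roots
    wilks = math.prod(1.0 - l for l in lam)
    chi2 = -(n_cases - 1 - (kx + ky + 1) / 2) * math.log(wilks)
    df = (kx - skip) * (ky - skip)  # assumed textbook convention
    return chi2, df

# Hypothetical example: 100 cases, 3 X variables, 2 Y variables
eigvals = [0.5, 0.2]
chi2_all, df_all = bartlett_chi2(eigvals, n_cases=100, kx=3, ky=2)
chi2_rest, df_rest = bartlett_chi2(eigvals, n_cases=100, kx=3, ky=2, skip=1)
print(chi2_all, df_all)
print(chi2_rest, df_rest)
```

The first call tests all canonical correlations together; the second tests what remains once the first is removed. Each statistic would then be compared against a chi-square distribution with the stated degrees of freedom.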
                      Variate Scores
• Canonical Variate Scores
  – Like factor scores (we’ll get there later)
  – What a subject would score if you could measure them directly on the
    canonical variate
     • The values on a canonical variable for a given case, based on the canonical
       coefficients for that variable.
• Canonical coefficients are multiplied by the standardized
  scores of the cases and summed to yield the canonical
  scores for each case in the analysis



                             $$X = Z_x B_x$$
                             $$Y = Z_y B_y$$
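
The score equations can be sketched end to end as below, assuming NumPy and hypothetical data. The helper names `zscore` and `canonical_coefficients` are made up, and the coefficients are obtained from a symmetric form of the eigenproblem (an assumption, chosen so the variates have unit variance). The scores are named `X_scores` and `Y_scores` to avoid overwriting the raw data.

```python
import numpy as np

def zscore(A):
    """Standardize each column using the sample SD (ddof=1)."""
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

def canonical_coefficients(Rxx, Rxy, Ryy):
    """Return (Bx, By, r_c) from the symmetric eigenproblem on the Y side."""
    w, V = np.linalg.eigh(Ryy)
    Ryy_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    Rxx_inv_Rxy = np.linalg.solve(Rxx, Rxy)
    lam, B_hat = np.linalg.eigh(Ryy_inv_sqrt @ Rxy.T @ Rxx_inv_Rxy @ Ryy_inv_sqrt)
    order = np.argsort(lam)[::-1]
    r_c = np.sqrt(lam[order])
    By = Ryy_inv_sqrt @ B_hat[:, order]
    Bx = Rxx_inv_Rxy @ (By / r_c)
    return Bx, By, r_c

# Hypothetical data: 250 cases, 3 IVs (X) and 2 DVs (Y)
rng = np.random.default_rng(3)
X = rng.normal(size=(250, 3))
Y = 0.8 * X[:, :2] + rng.normal(size=(250, 2))

Zx, Zy = zscore(X), zscore(Y)
n = len(Zx)
# Correlation blocks computed directly from the standardized scores
Rxx, Rxy, Ryy = Zx.T @ Zx / (n - 1), Zx.T @ Zy / (n - 1), Zy.T @ Zy / (n - 1)
Bx, By, r_c = canonical_coefficients(Rxx, Rxy, Ryy)

# Canonical variate scores: standardized scores times canonical coefficients
X_scores = Zx @ Bx
Y_scores = Zy @ By
```

The correlation between the first column of `X_scores` and the first column of `Y_scores` should reproduce the first canonical correlation.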
Loadings (structure coefficients)
•   Loadings or structure coefficients
     – Key question: how well do the variate(s) on either side relate to their own set of
       measured variables?
     – Bivariate correlation between a variable and its respective variate
     – Would equal the canonical coefficients if all variables were uncorrelated with one
       another
     – Its square is the proportion of variance linearly shared by a variable with the
       variable’s canonical composite
•   Found by multiplying the matrix of correlations between variables in a set by the
    matrix of canonical coefficients

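                       Loadings

[Path diagram: X1 to X4 connect to canonical variate V1 through loadings rX1 to rX4; Y1 to Y3 connect to canonical variate W1 through loadings ry1 to ry3]

$$A_x = R_{xx} B_x$$
$$A_y = R_{yy} B_y$$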
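
The relation "loadings = within-set correlations times canonical coefficients" can be tried on a toy case. All numbers below are hypothetical; the raw weights are rescaled so the variate has unit variance, which is what makes the products proper correlations.

```python
import numpy as np

def loadings(R_within, B):
    """Loadings A = R B for one set of variables."""
    return R_within @ B

def unit_variance(b, R_within):
    """Rescale raw weights b so that b' R b = 1."""
    return b / np.sqrt(b.T @ R_within @ b)

# Hypothetical raw weights for one canonical variate on two X variables
b = np.array([[0.6], [0.5]])

# Case 1: the two X variables correlate 0.4
Rxx = np.array([[1.0, 0.4],
                [0.4, 1.0]])
Bx = unit_variance(b, Rxx)
Ax = loadings(Rxx, Bx)

# Case 2: uncorrelated X variables, so loadings equal the coefficients
I2 = np.eye(2)
Bx_unc = unit_variance(b, I2)
Ax_unc = loadings(I2, Bx_unc)

print(Ax.ravel(), Bx.ravel())
print(np.allclose(Ax_unc, Bx_unc))  # True
```

In the correlated case each loading is larger than its coefficient, because each variable also shares variance with the variate through the other variable.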
          Redundancy Equations
•   Redundancy
•   Within
    – Proportion of variance in a set of variables extracted by the canonical variate
    – Adequacy coefficient
    – Is the average of the squared correlations (loadings)
          $$pv_{xc} = \sum_{i=1}^{k_x} \frac{a_{ixc}^2}{k_x}$$
          $$pv_{yc} = \sum_{i=1}^{k_y} \frac{a_{iyc}^2}{k_y}$$
•   Across
    – Variance in IVs explained by the DVs and vice versa
    – Take how much variance is accounted for with its own variables and
      multiply that by the canonical correlation squared
          $$R_d = (pv)(r_c^2)$$
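
Both redundancy quantities reduce to a column average and a product, as in this NumPy sketch. The loadings, the canonical correlations, and the function name `redundancy` are hypothetical values chosen for illustration.

```python
import numpy as np

def redundancy(loadings, r_c):
    """Return (pv, rd): variance extracted by each variate from its own set,
    and the redundancy obtained by scaling pv with the squared canonical r."""
    A = np.asarray(loadings, dtype=float)
    pv = (A ** 2).mean(axis=0)                     # average squared loading per variate
    rd = pv * np.asarray(r_c, dtype=float) ** 2    # Rd = (pv)(r_c^2)
    return pv, rd

# Hypothetical loadings: 3 X variables (rows) on 2 canonical variates (columns)
Ax = np.array([[0.8,  0.3],
               [0.6, -0.5],
               [0.7,  0.2]])
r_c = np.array([0.7, 0.4])   # hypothetical canonical correlations

pv_x, rd_x = redundancy(Ax, r_c)
print(pv_x)
print(rd_x)
```

Summing `rd_x` across variates gives the total share of variance in the X set that the variates of the other set account for.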

				