Principal Components Analysis with SPSS
     Karl L. Wuensch
    Dept of Psychology
  East Carolina University
           When to Use PCA

• You have a set of p continuous variables.
• You want to repackage their variance into
  m components.
• You will usually want m to be < p, but not
  always.
   Components and Variables
• Each component is a weighted linear
  combination of the variables
  $C_i = W_{i1} X_1 + W_{i2} X_2 + \dots + W_{ip} X_p$
• Each variable is a weighted linear
  combination of the components.

  $X_j = A_{1j} C_1 + A_{2j} C_2 + \dots + A_{mj} C_m$
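
The two equations above can be sketched numerically. This is a minimal NumPy illustration on made-up random data (not the beer data), assuming the weights are taken from the eigenvectors of the correlation matrix and scaled so each component has unit variance. The names `W`, `A`, `C`, and `Z_back` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up correlated data: 200 cases, p = 4 variables
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized variables

R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]                  # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

W = eigvecs / np.sqrt(eigvals)         # column i holds W_i1 ... W_ip
A = (eigvecs * np.sqrt(eigvals)).T     # row i holds A_i1 ... A_ip (the loadings)

C = Z @ W          # each component is a weighted combination of the variables
Z_back = C @ A     # each variable is a weighted combination of the m = p components
```

With all p components retained, `Z_back` reproduces `Z` exactly (up to floating-point error); with m < p only part of each variable's variance is reproduced.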
       Factors and Variables
• In Factor Analysis, we exclude from the
  solution any variance that is unique, not
  shared by the variables.
  $X_j = A_{1j} F_1 + A_{2j} F_2 + \dots + A_{mj} F_m + U_j$

• $U_j$ is the unique variance for $X_j$
       Goals of PCA and FA
• Data reduction.
• Discover and summarize pattern of
  intercorrelations among variables.
• Test theory about the latent variables
  underlying a set of measurement variables.
• Construct a test instrument.
• There are many other uses of PCA and FA.
            Data Reduction
• Ossenkopp and Mazmanian (Physiology
  and Behavior, 34: 935-941).
• 19 behavioral and physiological variables.
• A single criterion variable, physiological
  response to four hours of cold-restraint
• Extracted five factors.
• Used multiple regression to develop a
  model for predicting the criterion from the
  five factors.
   Exploratory Factor Analysis
• Want to discover the pattern of
  intercorrelations among variables.
• Wilt et al., 2005 (thesis).
• Variables are items on the SOIS at ECU.
• Found two factors, one evaluative, one on
  difficulty of course.
• Compared FTF students to DE students,
  on structure and means.
  Confirmatory Factor Analysis
• Have a theory regarding the factor
  structure for a set of variables.
• Want to confirm that the theory describes
  the observed intercorrelations well.
• Thurstone: Intelligence consists of seven
  independent factors rather than one global
  factor.
• Often done with SEM software
   Construct A Test Instrument
• Write a large set of items designed to test
  the constructs of interest.
• Administer the survey to a sample of
  persons from the target population.
• Use FA to help select those items that will
  be used to measure each of the constructs
  of interest.
• Use Cronbach alpha to check reliability of
  resulting scales.
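
As a rough illustration of the last step, Cronbach's alpha for one scale can be computed from the item variances and the variance of the total score. The function name `cronbach_alpha` and the response data below are made up for this sketch.

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = respondents, columns = items of one scale."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                              # number of items
    item_vars = items.var(axis=0, ddof=1)           # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)       # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Made-up responses from five people to a three-item scale
scores = np.array([[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1], [3, 3, 3]])
alpha = cronbach_alpha(scores)
```

Items that rise and fall together across respondents push alpha toward 1.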
      An Unusual Use of PCA
• Poulson, Braithwaite, Brondino, and Wuensch
  (1997, Journal of Social Behavior and
  Personality, 12, 743-758).
• Simulated jury trial, seemingly insane
  defendant killed a man.
• Criterion variable = recommended verdict
  – Guilty
  – Guilty But Mentally Ill
  – Not Guilty By Reason of Insanity.
• Predictor variables = jurors’ scores on 8
• Discriminant function analysis.
• Problem with multicollinearity.
• Used PCA to extract eight orthogonal
  components.
• Predicted recommended verdict from
  these 8 components.
• Transformed results back to the original
 A Simple, Contrived Example
• Consumers rate importance of seven
  characteristics of beer.
  – low Cost
  – high Size of bottle
  – high Alcohol content
  – Reputation of brand
  – Color
  – Aroma
  – Taste
  SPSS-Data.htm .
• Analyze, Data Reduction, Factor.
• Scoot beer variables into box.
• Click Descriptives and then check Initial
  Solution, Coefficients, KMO and Bartlett’s
  Test of Sphericity, and Anti-image. Click
  Continue.
• Click Extraction and then select Principal
  Components, Correlation Matrix,
  Unrotated Factor Solution, Scree Plot, and
  Eigenvalues Over 1. Click Continue.
• Click Rotation. Select Varimax and
  Rotated Solution. Click Continue.
• Click Options. Select Exclude Cases
  Listwise and Sorted By Size. Click
  Continue.

• Click OK, and SPSS completes the
  Principal Components Analysis.
Checking for Unique Variables 1
• Check the correlation matrix.
• If there are any variables not well
  correlated with some others, might as well
  delete them.
 Checking for Unique Variables 2
Correlation Matrix

            cost     size    alcohol reputat color   aroma taste
  cost      1.00     .832    .767    -.406   .018    -.046   -.064
  size      .832     1.00    .904    -.392   .179    .098    .026
  alcohol   .767     .904    1.00    -.463   .072    .044    .012
  reputat   -.406    -.392   -.463   1.00    -.372   -.443   -.443
  color     .018     .179    .072    -.372   1.00    .909    .903
  aroma     -.046    .098    .044    -.443   .909    1.00    .870
  taste     -.064    .026    .012    -.443   .903    .870    1.00
Checking for Unique Variables 3
• Bartlett’s test of sphericity tests null that
  the matrix is an identity matrix, but does
  not help identify individual variables that
  are not well correlated with others.
                      KMO and Bartlett's Test

      Kaiser-Meyer-Olkin Measure of Sampling Adequacy

      Bartlett's Test of      Approx. Chi-Square 1637.9
      Sphericity              df                     21
                              Sig.                 .000
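
For readers who want to see where the chi-square and df come from, here is a minimal sketch of the usual Bartlett formula applied to the correlation matrix shown earlier. The sample size is not stated in these slides, so `n_cases=220` is only an assumed placeholder; the chi-square depends directly on it.

```python
import numpy as np

# Correlation matrix from the slides (cost, size, alcohol, reputat, color, aroma, taste)
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

def bartlett_sphericity(R, n_cases):
    """Chi-square and df for H0: the population correlation matrix is an identity matrix."""
    p = R.shape[0]
    chi_square = -(n_cases - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi_square, df

# n_cases is an assumption for illustration, not a value given in the slides
chi_square, df = bartlett_sphericity(R, n_cases=220)
```

The df is p(p - 1)/2, which is 21 for seven variables, matching the SPSS output.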
Checking for Unique Variables 4
• For each variable, check $R^2$ between it
  and the remaining variables.
• SPSS reports these as the
  initial communalities when
  you do a principal axis
  factor analysis.
• Delete any variable with a
  low $R^2$.
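
These squared multiple correlations can also be obtained directly from the inverse of the correlation matrix, since $R^2_j = 1 - 1/r^{jj}$, where $r^{jj}$ is the j-th diagonal element of the inverse. A minimal sketch using the rounded correlation matrix from the earlier slide (so results may differ a little from SPSS, which works from the raw data):

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

# R^2 of each variable regressed on the other six
smc = 1 - 1 / np.diag(np.linalg.inv(R))

for name, value in zip(names, smc):
    print(f"{name:8s} {value:.3f}")
```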
  Checking for Unique Correlations
• Look at partial correlations – pairs of
  variables with large partial correlations
  share variance with one another but not
  with the remaining variables – this is
• Kaiser’s MSA will tell you, for each
  variable, how much of this problem exists.
• The smaller the MSA, the greater the
  problem.
Checking for Unique Correlations 2
• An MSA of .9 is marvelous, .5 miserable.
• Variables with small MSAs should be
  deleted.
• Or additional variables added that will
  share variance with the troublesome
  variables.
   Checking for Unique Correlations 3
                                   Anti-image Matrices

                              cost     size   alcohol  reputat    color    aroma    taste
Anti-image       cost        .779a    -.543     .105     .256     .100     .135    -.105
Correlation      size        -.543    .550a    -.806    -.109    -.495     .061     .435
                 alcohol      .105    -.806    .630a     .226     .381    -.060    -.310
                 reputat      .256    -.109     .226    .763a    -.231     .287     .257
                 color        .100    -.495     .381    -.231    .590a    -.574    -.693
                 aroma        .135     .061    -.060     .287    -.574    .801a    -.087
                 taste       -.105     .435    -.310     .257    -.693    -.087    .676a

a. Measures of Sampling Adequacy (MSA) on main diagonal. Off-diagonal entries are partial correlations x -1.
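
A sketch of how such a matrix can be computed: the partial correlations come from the inverse of the correlation matrix, and each MSA compares a variable's squared correlations with its squared partial correlations. This uses the rounded correlations from the earlier slide, so the numbers are not expected to match the SPSS table exactly.

```python
import numpy as np

R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

R_inv = np.linalg.inv(R)
d = np.sqrt(np.diag(R_inv))
partial = -R_inv / np.outer(d, d)    # partial r of each pair, holding the other variables constant
np.fill_diagonal(partial, 0.0)
anti_image = -partial                # off-diagonals as SPSS prints them (partials x -1)

r_off = R - np.eye(len(R))           # simple correlations with the diagonal zeroed
sum_r2 = (r_off ** 2).sum(axis=0)
sum_p2 = (partial ** 2).sum(axis=0)
msa = sum_r2 / (sum_r2 + sum_p2)                      # one MSA per variable
kmo = sum_r2.sum() / (sum_r2.sum() + sum_p2.sum())    # overall KMO
```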
Extracting Principal Components 1
• From p variables we can extract p components.
• Each of p eigenvalues represents the amount of
  standardized variance that has been captured
  by one component.
• The first component accounts for the largest
  possible amount of variance.
• The second captures as much as possible of
  what is left over, and so on.
• Each is orthogonal to the others.
Extracting Principal Components 2
• Each variable has standardized variance =
  1.
• The total standardized variance in the p
  variables = p.
• The sum of the m = p eigenvalues = p.
• All of the variance is extracted.
• For each component, the proportion of
  variance extracted = eigenvalue / p.
Extracting Principal Components 3
• For our beer data, here are the
  eigenvalues and proportions of variance
  for the seven components:

                        Initial Eigenvalues
                               % of     Cumulative
       Component Total      Variance        %
       1          3.313       47.327       47.327
       2          2.616       37.369       84.696
       3           .575         8.209      92.905
       4           .240         3.427      96.332
       5           .134         1.921      98.252
       6         9.E-02         1.221      99.473
       7         4.E-02           .527    100.000
      Extraction Method: Principal Component Analysis.
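
The eigenvalues and percentages in this table can be approximated directly from the correlation matrix shown earlier. A minimal NumPy sketch (the matrix entries are rounded to three decimals, so small discrepancies from SPSS are expected):

```python
import numpy as np

R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

p = R.shape[0]
eigvals = np.linalg.eigvalsh(R)[::-1]   # eigvalsh returns ascending order; reverse it
pct = 100 * eigvals / p                 # proportion of variance = eigenvalue / p
cum = np.cumsum(pct)

for i, (ev, pc, cu) in enumerate(zip(eigvals, pct, cum), start=1):
    print(f"{i}  {ev:6.3f}  {pc:7.3f}  {cu:8.3f}")
```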
 How Many Components to Retain
• From p variables we can extract p
  components.
• We probably want fewer than p.
• Simple rule: Keep as many as have
  eigenvalues ≥ 1.
• A component with eigenvalue < 1 captured
  less than one variable’s worth of variance.
• Visual Aid: Use a Scree Plot
• Scree is rubble at base of cliff.
• For our beer data,
  [Figure: Scree Plot of eigenvalue against Component Number, 1 through 7]
• Only the first two components have
  eigenvalues greater than 1.
• Big drop in eigenvalue between
  component 2 and component 3.
• Components 3-7 are scree.
• Try a 2 component solution.
• Should also look at solution with one fewer
  and with one more component.
     Less Subjective Methods
• Parallel Analysis and Velicer’s MAP test.
• SAS, SPSS, Matlab scripts available at
          Parallel Analysis
• How many components account for more
  variance than do components derived from
  random data?
• Create 1,000 or more sets of random data.
• Each with same number of cases and
  variables as your data set.
• For each set, find the eigenvalues.
• For the eigenvalues from the random sets,
  find the 95th percentile for each
  component.
• Retain as many components for which the
  eigenvalue from your data exceeds the
  95th percentile from the random data sets.
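
The procedure just described can be sketched in a few lines of NumPy. This is an illustration, not the SAS/SPSS/Matlab scripts referred to above: the function name `parallel_analysis` is made up, normally distributed random data are assumed, and `n_cases=200` is a placeholder because the sample size is not given in these slides.

```python
import numpy as np

def parallel_analysis(observed_eigvals, n_cases, n_sets=1000, percentile=95, seed=0):
    """Count how many leading observed eigenvalues beat the random-data percentile."""
    rng = np.random.default_rng(seed)
    observed = np.asarray(observed_eigvals, dtype=float)
    p = len(observed)
    random_eigvals = np.empty((n_sets, p))
    for s in range(n_sets):
        data = rng.normal(size=(n_cases, p))            # same cases x variables as the real data
        corr = np.corrcoef(data, rowvar=False)
        random_eigvals[s] = np.linalg.eigvalsh(corr)[::-1]
    threshold = np.percentile(random_eigvals, percentile, axis=0)
    exceeds = observed > threshold
    # index of the first component that fails to exceed its threshold
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, threshold

# Eigenvalues from the beer data; n_cases is an assumed value
observed = [3.313, 2.616, 0.575, 0.240, 0.134, 0.09, 0.04]
n_retain, threshold = parallel_analysis(observed, n_cases=200, n_sets=500)
```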
   Random Data Eigenvalues
         Root      Prcntyle
     1.000000     1.344920
     2.000000     1.207526
     3.000000     1.118462
     4.000000     1.038794
     5.000000      .973311
     6.000000      .907173
     7.000000      .830506

• Our data yielded eigenvalues of 3.313,
  2.616, and 0.575.
• Retain two components
        Velicer’s MAP Test
• Step by step, extract increasing numbers
  of components.
• At each step, determine how much
  common variance is left in the residuals.
• Retain all steps up to and including that
  producing the smallest residual common
  variance.
Velicer's Minimum Average Partial (MAP) Test:

Velicer's Average Squared Correlations
     .000000     .266624
    1.000000     .440869
    2.000000     .129252
    3.000000     .170272
    4.000000     .331686
    5.000000     .486046
    6.000000    1.000000

The smallest average squared correlation is
     .129252

The number of components is 2
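
A minimal sketch of the MAP computation, assuming the original (squared partial correlation) version of the test and working from the rounded correlation matrix on the earlier slide. The function name `velicer_map` is illustrative, and the averages it produces may differ slightly from the output above, which was presumably computed from the raw data.

```python
import numpy as np

R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

def velicer_map(R):
    """Return (number of components, average squared partial r after removing 0..p-2 components)."""
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    off_diag = ~np.eye(p, dtype=bool)
    avg_sq = []
    for m in range(p - 1):                      # m components partialled out
        A = loadings[:, :m]
        residual = R - A @ A.T                  # covariance left in the residuals
        d = np.sqrt(np.diag(residual))
        partial = residual / np.outer(d, d)     # rescale residuals to partial correlations
        avg_sq.append(np.mean(partial[off_diag] ** 2))
    avg_sq = np.array(avg_sq)
    return int(np.argmin(avg_sq)), avg_sq

n_components, avg_sq = velicer_map(R)
```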
         Which Test to Use?
• Parallel analysis tends to overextract.
• MAP tends to underextract.
• If they disagree, increase number of
  random sets in the parallel analysis
• And inspect carefully the two smallest
  values from the MAP test.
• May need to apply the meaningfulness
  criterion.
 Loadings, Unrotated and Rotated
• loading matrix = factor pattern matrix =
  component matrix.
• Each loading is the Pearson r between one
  variable and one component.
• Since the components are orthogonal, each
  loading is also a β weight from predicting X from
  the components.
• Here are the unrotated loadings for our 2
  component solution:
               Component Matrix

                         Component
                           1      2
      COLOR               .760  -.576
      AROMA               .736  -.614
      REPUTAT            -.735  -.071
      TASTE               .710  -.646
      COST                .550   .734
      ALCOHOL             .632   .699
      SIZE                .667   .675
      Extraction Method: Principal Component Analysis.
         a. 2 components extracted.
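
Unrotated loadings like these are the eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues. A sketch using the rounded correlation matrix from the earlier slide; the sign of an eigenvector is arbitrary, so the code flips each column to have a positive sum, which is an assumption chosen here only because it happens to match the orientation of the SPSS output.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigvals, eigvecs = np.linalg.eigh(R)
keep = np.argsort(eigvals)[::-1][:2]                   # the two largest eigenvalues
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])   # r between each variable and component
loadings *= np.sign(loadings.sum(axis=0))              # resolve the arbitrary sign of each column

for name, row in zip(names, loadings):
    print(f"{name:8s} {row[0]:6.3f} {row[1]:6.3f}")
```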

• All variables load well on first component,
  economy and quality vs. reputation.
• Second component is more interesting,
  economy versus quality.
• Rotate these axes so that the two
  dimensions pass more nearly through the
  two major clusters (COST, SIZE, ALCH
• The number of degrees by which I rotate
  the axes is the angle PSI. For these data,
  rotating the axes -40.63 degrees has the
  desired effect.
• Component 1 = Quality versus reputation.
• Component 2 = Economy (or cheap drunk)
  versus reputation.
         Rotated Component Matrix

                       Component
                         1      2
     TASTE              .960  -.028
     AROMA              .958 1.E-02
     COLOR              .952 6.E-02
     SIZE             7.E-02   .947
     ALCOHOL          2.E-02   .942
     COST              -.061   .916
     REPUTAT           -.512  -.533
     Extraction Method: Principal Component Analysis.
     Rotation Method: Varimax with Kaiser Normalization.
       a. Rotation converged in 3 iterations.
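
The rotated matrix can be reproduced, to rounding, by post-multiplying the unrotated loadings by a 2 x 2 rotation matrix for the angle -40.63 degrees. The sketch below assumes the standard counterclockwise rotation-matrix convention; the loadings are copied from the unrotated Component Matrix.

```python
import numpy as np

# Unrotated loadings, rows in the order printed in the Component Matrix
labels = ["COLOR", "AROMA", "REPUTAT", "TASTE", "COST", "ALCOHOL", "SIZE"]
unrotated = np.array([
    [ .760, -.576],
    [ .736, -.614],
    [-.735, -.071],
    [ .710, -.646],
    [ .550,  .734],
    [ .632,  .699],
    [ .667,  .675],
])

psi = np.deg2rad(-40.63)
T = np.array([[np.cos(psi), -np.sin(psi)],
              [np.sin(psi),  np.cos(psi)]])   # orthogonal transformation matrix
rotated = unrotated @ T

for label, row in zip(labels, rotated):
    print(f"{label:8s} {row[0]:6.3f} {row[1]:6.3f}")
```

Because `T` is orthogonal, the rotation changes how the explained variance is split between the two components but not the total.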
    Number of Components in the
         Rotated Solution
• Try extracting one fewer component, try one
  more component.
• Which produces the more sensible solution?
• Error = difference in obtained structure and true
  structure.
• Overextraction (too many components)
  produces less error than underextraction.
• If there is only one true factor and no unique
  variables, can get “factor splitting.”
• In this case, first unrotated factor ≈ true
  factor.
• But rotation splits the factor, producing an
  imaginary second factor and corrupting the
• Can avoid this problem by including a
  garbage variable that will be removed prior
  to the final solution.
         Explained Variance
• Square the loadings and then sum them across
  variables.
• Get, for each component, the amount of
  variance explained.
• Prior to rotation, these are eigenvalues.
• Here are the SSL for our data, after rotation:
                  Total Variance Explained

                   Rotation Sums of Squared Loadings
                                % of    Cumulative
   Component          Total   Variance      %
   1                  3.017     43.101     43.101
   2                  2.912     41.595     84.696
  Extraction Method: Principal Component Analysis.

• After rotation the two components together
  account for (3.02 + 2.91) / 7 = 85% of the
  total variance.
• If the last component has a small SSL,
  one should consider dropping it.
• If SSL = 1, the component has extracted
  one variable’s worth of variance.
• If only one variable loads well on a
  component, the component is not well
  defined.
• If only two load well, it may be reliable, if
  the two variables are highly correlated with
  one another but not with other variables.
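
The sums of squared loadings (SSL) reported for the rotated solution can be checked by squaring the rotated loadings and summing down each column. A minimal sketch using the values printed in the Rotated Component Matrix (entries such as 1.E-02 are entered as .01, so the totals agree only to rounding):

```python
import numpy as np

# Rotated loadings: TASTE, AROMA, COLOR, SIZE, ALCOHOL, COST, REPUTAT
rotated = np.array([
    [ .960, -.028],
    [ .958,  .01 ],
    [ .952,  .06 ],
    [ .07 ,  .947],
    [ .02 ,  .942],
    [-.061,  .916],
    [-.512, -.533],
])

ssl = (rotated ** 2).sum(axis=0)        # variance explained by each component
pct = 100 * ssl / rotated.shape[0]      # as a percentage of p = 7 standardized variances
print(ssl.round(3), pct.round(1), pct.sum().round(1))
```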
        Naming Components
• For each component, look at how it is
  correlated with the variables.
• Try to name the construct represented by
  that factor.
• If you cannot, perhaps you should try a
  different solution.
• I have named our components “aesthetic
  quality” and “cheap drunk.”
• For each variable, sum the squared
  loadings across components.
• This gives you the $R^2$ for predicting the
  variable from the components,
• which is the proportion of the variable’s
  variance which has been extracted by the
  components.
• Here are the communalities for our beer
  data. “Initial” is with all 7 components,
  “Extraction” is for our 2 component
  solution.
                        Communalities

                               Initial    Extraction
            COST               1.000           .842
            SIZE               1.000           .901
            ALCOHOL            1.000           .889
            REPUTAT            1.000           .546
            COLOR              1.000           .910
            AROMA              1.000           .918
            TASTE              1.000           .922
            Extraction Method: Principal Component Analysis.
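
The Extraction column can be checked by squaring the loadings and summing across the two components for each variable. A sketch using the unrotated loadings printed earlier, reordered to match the rows of this table (agreement is to rounding only):

```python
import numpy as np

labels = ["COST", "SIZE", "ALCOHOL", "REPUTAT", "COLOR", "AROMA", "TASTE"]
loadings = np.array([
    [ .550,  .734],
    [ .667,  .675],
    [ .632,  .699],
    [-.735, -.071],
    [ .760, -.576],
    [ .736, -.614],
    [ .710, -.646],
])

communality = (loadings ** 2).sum(axis=1)   # sum of squared loadings across components

for label, h2 in zip(labels, communality):
    print(f"{label:8s} {h2:.3f}")
```

Using the rotated loadings instead gives the same communalities, since an orthogonal rotation leaves each row's sum of squares unchanged.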
        Orthogonal Rotations
• Varimax -- minimize the complexity of the
  components by making the large loadings
  larger and the small loadings smaller
  within each component.
• Quartimax -- makes large loadings larger
  and small loadings smaller within each
  variable.
• Equamax – a compromise between these
  two.
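
These three rotations belong to one family (often called orthomax) and can be sketched with a single routine. The code below is a common SVD-based iteration, offered as an illustration rather than SPSS's algorithm: it omits Kaiser normalization, so its varimax loadings may differ somewhat from the SPSS output. The function name `orthomax` and its `gamma` parameter are assumptions of this sketch (gamma = 1 for varimax, 0 for quartimax, half the number of components for equamax).

```python
import numpy as np

def orthomax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonally rotate loading matrix L; returns (rotated loadings, transformation T)."""
    p, k = L.shape
    T = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        B = L @ T
        grad = L.T @ (B ** 3 - (gamma / p) * B @ np.diag((B ** 2).sum(axis=0)))
        U, s, Vt = np.linalg.svd(grad)
        T = U @ Vt                      # nearest orthogonal matrix to the gradient
        crit = s.sum()
        if crit < crit_old * (1 + tol): # stop when the criterion no longer improves
            break
        crit_old = crit
    return L @ T, T

# Unrotated loadings: COLOR, AROMA, REPUTAT, TASTE, COST, ALCOHOL, SIZE
unrotated = np.array([
    [ .760, -.576], [ .736, -.614], [-.735, -.071], [ .710, -.646],
    [ .550,  .734], [ .632,  .699], [ .667,  .675],
])

varimax_loadings, T = orthomax(unrotated, gamma=1.0)
quartimax_loadings, _ = orthomax(unrotated, gamma=0.0)
equamax_loadings, _ = orthomax(unrotated, gamma=unrotated.shape[1] / 2)
```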
           Oblique Rotations
• Axes drawn through the two clusters in the
  upper right quadrant would not be
  perpendicular.
• May better fit the data with axes that are
  not perpendicular, but at the cost of having
  components that are correlated with one
  another.
• More on this later.
