multivariate statistical analysis 2 by jpl7986

Multivariate Analysis: Factor Analysis, Clustering Methods,
Multidimensional Scaling, and Conjoint Analysis
  Chapter 19
            Factor Analysis
• Factor analysis is a class of techniques
  which reduce and summarize data

  – For example, taking 14 variables, and finding
    similarities and reducing those 14 variables to
    4 factors

     (These reduced variables are known as factors)
           Factor Analysis
• Factor analysis is not about making
  predictions from variables—it is about
  finding relationships between whole sets
  of variables, and finding the strength of
  those relationships
    Factor Analysis: Key Terms
Factor Analysis:    A set of techniques for finding the number and characteristics of variables underlying
                    a large number of measurements made on individuals or objects.

Factor:             A variable or construct that is not directly observable but is developed as a linear
                    combination of observed variables.

Factor Loading:     The correlation between a variable and a factor. It is computed by correlating
                    factor scores with observed manifest variable scores.

Factor Score:       A value for each factor that is assigned to each person. It is derived from a
                    summation of the derived weights which are applied to the original data.

Communality (h²):   The common variance of each variable summarized by the factors, or the amount
                    (percent) of each variable that is explained by the factors. The uniqueness
                    component of a variable’s variance is 1 - h².

Eigenvalue:         The sum of squares of loadings of each factor. It is a measure of the variance of
                    each factor, and if divided by the number of variables (i.e., the total variance), it
                    is the percent of variance summarized by the factor.
       Factor Analysis—Example
• A grocery store administers a survey to customers, asking them to
  rate stores on a variety of traits
   –   Convenient / inconvenient location
   –   Low-quality / high-quality products
   –   Modern / old-fashioned
   –   Unfriendly / friendly staff
   –   Sophisticated / unsophisticated customers
   –   Cluttered / spacious
   –   Fast / slow check-out
   –   Unorganized / organized layout
   –   Enjoyable / unenjoyable shopping experience
   –   Bad / good reputation
   –   Good / bad service
   –   Unhelpful / helpful clerks
   –   Good / bad selection
   –   Dull / exciting
                                Initial Eigenvalues
       Factor                Total           % of Variance       Cumulative %
1                    5.448                38.917               38.917
2                    1.523                10.882               49.799
3                    1.245                8.890                58.689
4                    1.096                7.827                66.516
5                    .875                 6.247                72.763

Here are the eigenvalues for the first five factors.

Where a factor's eigenvalue is less than one, that factor explains less of the
variance than a single original variable does. Here, since the eigenvalues are
greater than one for the first four factors, there are four factors to be retained.

The right-hand column displays the cumulative percentage of the variance
explained by the factors.
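
The eigenvalue rule above can be sketched in a few lines of Python. This is only an illustration built from the five eigenvalues printed in the table; the function names are hypothetical, and the percentages differ from the table in the third decimal because the table was computed from unrounded eigenvalues.

```python
# Eigenvalues copied from the "Initial Eigenvalues" table (first five factors).
eigenvalues = [5.448, 1.523, 1.245, 1.096, 0.875]
n_variables = 14  # the survey rates stores on 14 traits


def percent_of_variance(eigenvalue, n_variables):
    """Percent of total variance summarized by one factor."""
    return 100.0 * eigenvalue / n_variables


def factors_to_retain(eigenvalues, threshold=1.0):
    """Count the factors whose eigenvalue exceeds the threshold."""
    return sum(1 for ev in eigenvalues if ev > threshold)


percents = [percent_of_variance(ev, n_variables) for ev in eigenvalues]

# Running total of the percentages, matching the "Cumulative %" column.
cumulative = []
running = 0.0
for p in percents:
    running += p
    cumulative.append(running)

retained = factors_to_retain(eigenvalues)

for number, (ev, p, c) in enumerate(zip(eigenvalues, percents, cumulative), start=1):
    print(f"factor {number}: eigenvalue={ev:.3f}  %var={p:.2f}  cumulative={c:.2f}")
print("factors with eigenvalue > 1:", retained)
```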
                         Factor Components

 Variable                          1           2           3           4   Communalities
 Location                  1.255E-02        .218   1.075E-02        .735            .587
 Quality of products            .789   3.318E-02        .237  -8.115E-02            .687
 Modern                        -.665        .216  -7.144E-02       -.221            .543
 Friendliness of cl…            .199       -.298        .606        .433            .683
 Customers                     -.235        .781       -.139   6.850E-02            .689
 Cluttered                 7.027E-02       -.162        .894   5.166E-02            .834
 Check-out                      .170        .720  -3.992E-02       -.326            .655
 Layout                         .323  -5.814E-02        .742        .150            .681
 Shopping experie…             -.353        .448       -.183       -.552            .664
 Reputation                     .724       -.283   9.555E-02        .296            .702
 Service                       -.257        .339       -.393       -.588            .680
 Helpfulness of clerks          .281       -.338        .290        .597            .634
 Selection of prod…             .799  -6.586E-02        .184        .126            .692
 Dull                          -.284        .668       -.227   4.610E-02            .581

The factor loading scores show the correlation between factors and individual variables on a scale from
-1 to 1. Where a factor loading is less than -0.5 or greater than 0.5 (more or less, depending on
researcher’s judgment), the variable is a component of that factor.
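
As a rough check on the table, a variable's communality can be recomputed as the sum of its squared loadings across the four factors, assuming the factors are uncorrelated (the slides do not state the rotation used). The sketch below uses three rows copied from the table; the helper names and the 0.5 cutoff default are illustrative choices, not part of the original analysis.

```python
# Loadings for three variables, copied from the factor components table.
loadings = {
    "Location": [0.01255, 0.218, 0.01075, 0.735],
    "Quality of products": [0.789, 0.03318, 0.237, -0.08115],
    "Cluttered": [0.07027, -0.162, 0.894, 0.05166],
}


def communality(row):
    """Sum of squared loadings across factors (assumes uncorrelated factors)."""
    return sum(loading * loading for loading in row)


def dominant_factor(row, cutoff=0.5):
    """Return the 1-based factor with the largest absolute loading,
    or None if no loading exceeds the cutoff in absolute value."""
    best = max(range(len(row)), key=lambda i: abs(row[i]))
    return best + 1 if abs(row[best]) > cutoff else None


for name, row in loadings.items():
    print(f"{name}: h2={communality(row):.3f}  factor={dominant_factor(row)}")
```

The recomputed values land within about 0.001 of the Communalities column, which is consistent with rounding in the printed loadings.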
       Factor Analysis—Example
• Factor analysis shows that the 14 variables fit into 4 factors

Factor 1           Factor 2    Factor 3                 Factor 4
Quality products   Customers   Friendliness of clerks   Location
Modern stores      Check-out   Cluttered                Shopping experience
Reputation         Dull        Layout                   Service
Selection                                               Helpfulness of clerks

• For each respondent, a factor score to be used in future
  analysis is generated by taking the sum of the products
  of each variable's value and a weighting for that
  variable
   –   Factor1= (Variable1 x Weight1) + (V2 x W2) + …
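
The weighted sum above can be written as a short function. The ratings and weights below are made-up numbers for illustration only; in practice the weights come from the factor analysis output.

```python
def factor_score(values, weights):
    """Weighted sum: (V1 x W1) + (V2 x W2) + ..."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical ratings from one respondent and hypothetical weights.
ratings = [5, 3, 6, 4]
weights = [0.5, -0.25, 0.25, 0.125]

score = factor_score(ratings, weights)
print(score)
```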
     Cluster Analysis Defined
• Grouping instances so that differences
  between instances within a group are as
  small as possible, and differences between
  the different groups are as large as possible
• Techniques designed to identify objects,
  people, or variables that are similar with
  respect to some criteria or characteristics
           Cluster Analysis
• There are many approaches to finding
  "clusters" or groups of data points with
  similar values, always through use of
  mathematical formulas
• Most statistical software packages have
  tools to do all of the calculations
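
One widely used approach is k-means, shown here as a minimal sketch. The slides do not name a specific clustering method, so this is a swapped-in example; the data points and starting centroids are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, centroids, max_iterations=100):
    """Alternate between assigning each point to its nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing
        assignments = new_assignments
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # leave an empty cluster's centroid where it is
                centroids[k] = mean_point(members)
    return assignments, centroids


# Hypothetical two-dimensional data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = k_means(points, centroids=[points[0], points[3]])
print(labels)
print(centers)
```

Results depend on the starting centroids and on the chosen number of clusters, which is one reason different packages and researchers can reach different groupings from the same data.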
Applications of Cluster Analysis
• Segmentation
  – Breaking consumers into different groups so
    that they have similar preferences and
    reactions to product configurations or
• Product positioning
  – Allows marketers to see how various products
    are positioned relative to competing brands
Multidimensional Scaling (MDS)
• Using distances, or differences between
  data points to create a 2- or 3-dimensional
  map to represent data
• "psychological dissimilarity as geometric distance"
MDS models can display data a few
different ways. In this example,
different breakfast breads are scored on
a variety of traits.
In this case, the arrows represent
different vectors, or traits. They begin
at the origin and head outward. Vectors
heading in similar directions are
related: "inexpensive" and "not highly
filling" are related. Vectors heading off
at a 90 degree angle are unrelated, and
vectors heading in opposite directions
are negatively correlated.
The closer a point is to a vector, the
more it exemplifies the trait. Hard rolls
are "eaten with other foods", somewhat
"hard to prepare" and not "mainly for
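
The reading of vector directions described above corresponds to the cosine of the angle between two trait vectors: near 1 for similar directions, 0 at a right angle, near -1 for opposite directions. The coordinates below are hypothetical, since the original map is not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical trait vectors starting at the origin of a 2-D map.
inexpensive = (1.0, 0.2)
not_filling = (0.9, 0.3)      # points in nearly the same direction
right_angle = (-0.2, 1.0)     # at 90 degrees to "inexpensive"
opposite = (-1.0, -0.2)       # points the opposite way

print(cosine(inexpensive, not_filling))
print(cosine(inexpensive, right_angle))
print(cosine(inexpensive, opposite))
```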
        Applications of MDS
• Brand positioning—shows positioning of
  all brands relative to product attributes
          Conjoint Analysis
• Shows the economic trade-offs people
  make when different product traits (brand,
  configuration, packaging, etc.) are
• Helps identify both important attributes
  and ideal product configurations
        Conjoint Methodologies
• Several methodologies exist, each with
  –   Two-factor
  –   Full profile
  –   Adaptive conjoint analysis (ACA)
  –   Choice-based conjoint
  –   Self-explicated conjoint
  –   Hybrid conjoint
  –   Hierarchical Bayes
• Ultimately, all methods measure how choices
  change when several attributes are combined
               Conjoint Example
• Product manager for dairy has the following ice cream
   – 3 ice cream formulations (gelato, premium, cheap)
   – 12 different base flavors (vanilla, chocolate, strawberry, mocha,
   – 6 different flavor swirls (marshmallow, chocolate syrup, fudge,
     strawberry, caramel)
   – 12 different mix-ins (pralines, peanuts, brownie bits, cookie
     dough, etc.)
• (3 x 12 x 6 x 12) = 2,592 different flavor combinations
• By giving test subjects the choice of different
  combinations of attributes, the relative importance of
  each category emerges, and the preferred level or trait in
  each falls out
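
The combination count above, and one common way of turning conjoint results into attribute importance, can be sketched as follows. The level counts come from the example; the part-worth utilities are invented for illustration, and the range-based importance formula is a common convention rather than something stated in the slides.

```python
# Number of levels per attribute, taken from the ice cream example.
levels = {"formulation": 3, "base flavor": 12, "flavor swirl": 6, "mix-in": 12}

total_profiles = 1
for count in levels.values():
    total_profiles *= count
print(total_profiles)


def relative_importance(part_worths):
    """Importance of each attribute as its utility range divided by
    the sum of all attributes' utility ranges."""
    ranges = {
        attribute: max(utilities.values()) - min(utilities.values())
        for attribute, utilities in part_worths.items()
    }
    total = sum(ranges.values())
    return {attribute: r / total for attribute, r in ranges.items()}


# Hypothetical part-worth utilities for two of the attributes.
part_worths = {
    "formulation": {"gelato": 1.5, "premium": 0.0, "cheap": -1.5},
    "flavor swirl": {"caramel": 0.5, "fudge": 0.0, "marshmallow": -0.5},
}

importance = relative_importance(part_worths)
print(importance)
```

With these made-up utilities, formulation accounts for three quarters of the total utility range, and the preferred level within each attribute is simply the one with the highest utility.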
          Points to Consider
• Different techniques exist for clustering,
  MDS, and conjoint
  – Different researchers have different
  – Math behind these techniques is not
    complicated, and not statistically validated
  – Using these techniques is more an art than a
    science
