




10. Structural Equation Models with Latent Variables

Exercise 10.2

I went back to the original article by Crosby, Evans and Cowles (1990) and obtained the standard
deviations for the variables examined in this exercise. This makes it possible to analyze the covariance
matrix rather than the correlation matrix. The data are shown below (in the format used by AMOS):


  rowtype_  varname_     Y1      Y2      Y3      Y4      X1      X2      X3      X4      X5      X6
  n                     151     151     151     151     151     151     151     151     151     151
  corr      Y1         1.00
  corr      Y2         0.63    1.00
  corr      Y3         0.28    0.22    1.00
  corr      Y4         0.23    0.24    0.51    1.00
  corr      X1         0.38    0.33    0.29    0.20    1.00
  corr      X2         0.42    0.28    0.36    0.39    0.57    1.00
  corr      X3         0.37    0.30    0.39    0.29    0.48    0.59    1.00
  corr      X4         0.30    0.36    0.21    0.18    0.15    0.29    0.30    1.00
  corr      X5         0.45    0.37    0.31    0.39    0.29    0.41    0.35    0.44    1.00
  corr      X6         0.56    0.56    0.24    0.29    0.18    0.33    0.30    0.46    0.63    1.00
  stddev               0.78    1.32    0.83    1.01    1.36    1.09    1.28    0.73    1.29    1.17



The measurement equations and structural equations for the model of salesperson service outcomes are
shown below. The command syntax is taken from AMOS. There are two independent constructs
("similarity," measured by X1, X2, and X3, and "interaction," measured by X4, X5, and X6) and one
dependent construct ("attitude," measured by Y1 and Y2) in the model. With eight observed variables,
the total number of observed variances and covariances available to estimate the model parameters is
8(9)/2 = 36; the total number of free parameters is 19 (four factor loadings, six error variances, and
two factor variances for the independent constructs; one covariance between the independent constructs;
two structural equation coefficients; one variance for the error term in the structural equation; and one
factor loading and two error variances for the dependent construct). Thus, there are 36 - 19 = 17
degrees of freedom associated with the model specified below.

  Measurement and Structural Equations for 10.2:
           "Y1 = (1) attitude + (1) eps1"
           "Y2 =     attitude + (1) eps2"

           "X1 = (1) similarity + (1) eps3"
           "X2 =     similarity + (1) eps4"
           "X3 =     similarity + (1) eps5"

           "X4 = (1) interaction + (1) eps6"
           "X5 =     interaction + (1) eps7"
           "X6 =     interaction + (1) eps8"

           "attitude = similarity + interaction + (1) zeta"




The goodness-of-fit statistics for the calibrated model are shown below. The chi-square test is
significant at the p = 0.025 level: χ²(17) = 30.14. Strictly speaking, this suggests we should reject the
model (i.e., if this were in fact the true model, we would expect a fit this discrepant less than three
percent of the time). However, both GFI (0.95) and AGFI (0.90) indicate good levels of fit. Thus, we
are inclined to accept the model fit.

    Summary of models
                   Model   NPAR         CMIN      DF            P        CMIN/DF
        ----------------   ----    ---------      --    ---------      ---------
           Default model     19       30.144      17        0.025          1.773
         Saturated model     36        0.000       0
      Independence model      8      468.748      28        0.000        16.741


                   Model          RMR          GFI           AGFI           PGFI
        ----------------   ----------   ----------     ----------     ----------
           Default model        0.061        0.954          0.903          0.451
         Saturated model        0.000        1.000
      Independence model        0.475        0.453          0.297         0.352



The parameter estimates (non-standardized, estimated from the covariance matrix) are shown below.
The column "C.R." reports the critical ratio, given by the estimate divided by its standard error (S.E.).
This is effectively a z-score, since the maximum likelihood routine provides asymptotic standard
errors. These results suggest (and the standardized solution confirms) that "interaction" has a greater
impact on "attitude" (in terms of explaining variance in the construct) than "similarity."
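For example, the C.R. for the similarity path is 0.177/0.071 ≈ 2.49, which matches the value reported in
the table once rounding of the displayed estimate and standard error is taken into account.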

    Maximum Likelihood Estimates and Standard Errors:
    Regression Weights:                         Estimate       S.E.         C.R.     Label
    -------------------                         --------     -------      -------   -------
           attitude <-------- similarity          0.177       0.071        2.489
           attitude <------- interaction          1.042       0.213        4.891
           Y1 <---------------- attitude          1.000
           Y2 <---------------- attitude          1.537       0.183       8.385
           X1 <-------------- similarity          1.000
           X2 <-------------- similarity          0.967       0.128       7.530
           X3 <-------------- similarity          0.986       0.137       7.185
           X4 <------------- interaction          1.000
           X5 <------------- interaction          2.356       0.380       6.198
           X6 <------------- interaction          2.487       0.384       6.472

    Variances:                                  Estimate       S.E.         C.R.     Label
    ----------                                  --------     -------      -------   -------
                               similarity         0.858       0.200        4.281
                              interaction         0.162       0.048        3.379
                                     zeta         0.146       0.044        3.350
                                     eps1         0.185       0.044        4.192
                                     eps2         0.741       0.125        5.920
                                     eps3         0.979       0.144        6.799
                                     eps4         0.377       0.088        4.265

                                     eps5          0.793       0.124       6.370
                                     eps6          0.368       0.046       7.919
                                     eps7          0.756       0.117       6.444
                                     eps8          0.360       0.093       3.888

  Covariances:                                 Estimate      S.E.           C.R.     Label
  ------------                                 --------    -------        -------   -------
         similarity <----> interaction           0.191      0.051          3.728




  Standardized Solution:
  Regression Weights:                          Estimate
  --------------------------------             --------
         attitude <-------- similarity           0.254
         attitude <------- interaction           0.647
         Y1 <---------------- attitude           0.833
         Y2 <---------------- attitude           0.756
         X1 <-------------- similarity           0.683
         X2 <-------------- similarity           0.825
         X3 <-------------- similarity           0.716
         X4 <------------- interaction           0.552
         X5 <------------- interaction           0.737
         X6 <------------- interaction           0.857

  Correlations:                                Estimate
  -------------                                --------
         similarity <----> interaction           0.513



For those using SAS, I have included the relevant portions of the SAS output for the purposes of
comparison with AMOS.

  Results from PROC CALIS Model in SAS:
        Goodness of Fit Index (GFI)                                     0.9542
        GFI Adjusted for Degrees of Freedom (AGFI)                      0.9030
        Root Mean Square Residual (RMR)                                 0.0448
        Parsimonious GFI (Mulaik, 1989)                                 0.5793
        Chi-Square                                                     30.1438
        Chi-Square DF                                                       17
        Pr > Chi-Square                                                 0.0253

  Standardized Solution:
  Manifest Variable Equations:
                y1      =   0.8329 f_sales     +   0.5535 e1
                y2      =   0.7564*f_sales     +   0.6541 e2
                                   c2
                x1      =   0.6835*f_sim       +   0.7300 e3
                                   c3
                x2      =   0.8249*f_sim       +   0.5653 e4
                                   c4
                x3      =   0.7162*f_sim       +   0.6979 e5
                                   c5

                  x4       =    0.5524*f_int     +   0.8336 e6
                                       c6
                  x5       =    0.7366*f_int     +   0.6764 e7
                                       c7
                  x6       =    0.8573*f_int     +   0.5149 e8
                                       c8

    Latent Variable Equations:
         f_sales =   0.2536*f_sim      +   0.6467*f_int     +   0.5910 d
                            g1                    g2

    Correlations Among Exogenous Variables:
                    Var1    Var2    Parameter             Estimate
                    f_sim   f_int   phi                    0.51311




Exercise 10.4

Using AMOS, it is possible to perform a multiple group analysis on these data (where one group
consists of 154 men and the other of 125 women). We begin by specifying a
model of the impact of personableness (denoted "person") and quality of argument (denoted "quality")
on the perceived outcome of the debate (denoted "success"). We allow all coefficients of the model to
differ across groups; that is, we allow different factor loadings, error variances, path coefficients, etc.,
for men and women. The equations describing this model are shown below:

    Measurement and Structural Equations for Two Groups:

    Men:
           "S1 = (1) success + (1) eps1"
           "S2 =     success + (1) eps2"
           "S3 =     success + (1) eps3"

           "P1 = (1) person + (1) eps4"
           "P2 =     person + (1) eps5"

           "Q1 = (1) quality + (1) eps6"
           "Q2 =     quality + (1) eps7"

           "success = person + quality + (1) zeta"

    Women:
           "S1 = (1) success + (1) eps1"
           "S2 =     success + (1) eps2"
           "S3 =     success + (1) eps3"

           "P1 = (1) person + (1) eps4"
           "P2 =     person + (1) eps5"

           "Q1 = (1) quality + (1) eps6"
           "Q2 =     quality + (1) eps7"


The total number of observed variances and covariances we have to estimate the model parameters is
equal to 56 (28 each for men and women). The number of parameters estimated is 34 (17 each for
men and women), leaving 56 - 34 = 22 degrees of freedom. The goodness-of-fit results are
shown below. Note that the fit of the model is exceptionally good.

  Summary of Results: Model I
                 Model    NPAR        CMIN      DF            P        CMIN/DF
      ----------------    ----   ---------      --    ---------      ---------
         Default model      34      15.255      22        0.851          0.693

                 Model           RMR          GFI          AGFI           PGFI
      ----------------    ----------   ----------    ----------     ----------
         Default model         0.207        0.985         0.961          0.387



Note that most of the parameter estimates are closely comparable across groups. For example, the
factor loadings for men and women seem remarkably similar. One noticeable difference is in the
path coefficient that describes the impact of "person" on "success." The estimate appears to be much
greater among women (1.879) than among men (0.955). This is something we can subject to a more
rigorous statistical test.

  Results for group: men
  Regression Weights:                         Estimate       S.E.         C.R.     Label
  -------------------                         --------     -------      -------   -------
                 success <----- person          0.955       0.137        6.962
                 success <---- quality          1.116       0.154        7.227
                 S1 <--------- success          1.000
                 S2 <--------- success          0.996       0.045      22.347
                 S3 <--------- success          1.035       0.039      26.606
                 P1 <---------- person          1.000
                 P2 <---------- person          0.931       0.110       8.470
                 Q1 <--------- quality          1.000
                 Q2 <--------- quality          1.044       0.120        8.711

  Covariances:                                Estimate       S.E.         C.R.     Label
  ------------                                --------     -------      -------   -------
                  person <----> quality         0.066       0.479        0.137

  Results for group: women
  Regression Weights:                         Estimate       S.E.         C.R.     Label
  -------------------                         --------     -------      -------   -------
                 success <----- person          1.879       0.199        9.463
                 success <---- quality          1.231       0.229        5.374
                 S1 <--------- success          1.000
                 S2 <--------- success          0.994       0.031      31.635
                 S3 <--------- success          1.014       0.031      32.645
                 P1 <---------- person          1.000
                 P2 <---------- person          0.884       0.090       9.801
                 Q1 <--------- quality          1.000
                 Q2 <--------- quality          1.196       0.204       5.853

    Covariances:                                Estimate       S.E.         C.R.     Label
    ------------                                --------     -------      -------   -------
                   person <----> quality          0.159       0.500        0.318



We are now in a position to test for significant differences in model parameters between groups. We
do this by constraining parameter values to be the same across groups. By convention, we proceed by
first testing for equality of the measurement models and then testing for equality of the structural
equation models. Model II, shown below, constrains the factor loadings and measurement error variances
for all three constructs to be the same across groups. (Note that we might test these one at a time;
we've accelerated the process somewhat to save space.) This is done in AMOS by giving corresponding
parameters the same label in each group (e.g., the covariance between "person" and "quality" is
given the same label, phi, in the models for men and women).

In all, we constrain 12 parameters to be the same across groups: four factor loadings (those that are not
already set to 1.0 to identify the model), seven measurement error variances, and one covariance
between the independent factors "person" and "quality." Thus, Model II has 34 degrees of freedom.
As shown below, this constrained model still fits extremely well. The chi-square test is even less
significant than in Model I. Thus, we cannot reject the hypothesis that the factor structures are the
same across groups.
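Framed as an explicit nested test, the chi-square difference between Model II and Model I is
22.829 - 15.255 = 7.57 on 34 - 22 = 12 degrees of freedom, which is far below any conventional
critical value.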

    Summary of Results: Model II
                   Model   NPAR         CMIN      DF            P        CMIN/DF
        ----------------   ----    ---------      --    ---------      ---------
           Default model     22       22.829      34        0.927          0.671

                   Model          RMR           GFI          AGFI           PGFI
        ----------------   ----------    ----------    ----------     ----------
           Default model        0.228         0.976         0.960          0.592

    Results for group: men
    Regression Weights:                         Estimate       S.E.         C.R.     Label
    -------------------                         --------     -------      -------   -------
                   success <----- person          0.934       0.126        7.399
                   success <---- quality          1.164       0.154        7.569
                   S1 <--------- success          1.000
                   S2 <--------- success          0.995       0.026      37.737     f1
                   S3 <--------- success          1.022       0.024      42.491     f2
                   P1 <---------- person          1.000
                   P2 <---------- person          0.885       0.066      13.338     f3
                   Q1 <--------- quality          1.000
                   Q2 <--------- quality          1.097       0.105      10.399     f4

    Covariances:                                Estimate       S.E.         C.R.     Label
    ------------                                --------     -------      -------   -------
                   person <----> quality          0.146       0.356        0.411    phi

    Results for group: women
    Regression Weights:                         Estimate       S.E.        C.R.      Label
    -------------------                         --------     -------     -------    -------
                   success <----- person          1.856       0.183      10.126

                      success <---- quality        1.154      0.201       5.748
                      S1 <--------- success        1.000
                      S2 <--------- success        0.995      0.026      37.737     f1
                      S3 <--------- success        1.022      0.024      42.491     f2
                      P1 <---------- person        1.000
                      P2 <---------- person        0.885      0.066      13.338     f3
                      Q1 <--------- quality        1.000
                      Q2 <--------- quality        1.097      0.105      10.399     f4

  Covariances:                                   Estimate      S.E.         C.R.     Label
  ------------                                   --------    -------      -------   -------
                      person <----> quality        0.146      0.356        0.411    phi



We now move on to test for equality of the structural equation model coefficients across groups. (As
before, we can test these parameters separately, but for our purposes it serves to test all parameters at
once). In addition to constraining the parameters of the measurement model, we now also constrain
the structural equation coefficients. The fit results (shown below) suggest that Model III still fits
reasonably well on its own (the chi-square test is only marginally significant, p = 0.089). However, the
change in fit from Model II to Model III is 49.06 - 22.83 = 26.2 on a change of only three degrees of
freedom. This is a significant deterioration in fit, which leads us to conclude that there are in fact
differences across groups in the values of the path coefficients.
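
The corresponding p-value can be computed in a short DATA step; the numbers below are simply the
CMIN and DF values from the Model II and Model III summaries.

  /* Chi-square difference test: Model III (constrained) vs. Model II */
  data _null_;
     diff_chisq = 49.057 - 22.829;     /* change in chi-square         */
     diff_df    = 37 - 34;             /* change in degrees of freedom */
     p_value    = 1 - probchi(diff_chisq, diff_df);
     put diff_chisq= diff_df= p_value=;
  run;

The resulting p-value is well below 0.001, confirming that constraining the structural coefficients
significantly worsens the fit.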

  Summary of Results: Model III
                 Model      NPAR         CMIN     DF            P        CMIN/DF
      ----------------      ----    ---------     --    ---------      ---------
         Default model        19       49.057     37        0.089          1.326

                 Model             RMR          GFI          AGFI           PGFI
      ----------------      ----------   ----------    ----------     ----------
         Default model           3.804        0.952         0.927          0.629



It is not currently possible to perform a multiple group analysis using PROC CALIS in SAS. One approach
we might use to test for a difference between groups is to calibrate the model on one set of data (say,
the men), and then use the estimated parameter values to predict the other sample (i.e., the women). If
the only differences between the two groups reflect non-systematic sampling error, then the predicted
values should not be significantly different from those observed. If there is a real difference, then the
parameters estimated from the men's data should not fit the women's data very well.

The model fits the men's data very well: χ²(11) = 4.43, which is not significant.


         Goodness of Fit Index (GFI)                                   0.9917
         GFI Adjusted for Degrees of Freedom (AGFI)                    0.9790
         Root Mean Square Residual (RMR)                               0.1114
         Parsimonious GFI (Mulaik, 1989)                               0.5195
         Chi-Square                                                    4.4300
         Chi-Square DF                                                     11
         Pr > Chi-Square                                               0.9556

                      Manifest Variable Equations with Estimates
                 s1         =   1.0000 f_succ   + 1.0000 e1
                 s2         =   0.9960*f_succ   + 1.0000 e2

                 Std Err        0.0446 c2
                 t Value       22.3546
                 s3        =    1.0347*f_succ   +   1.0000 e3
                 Std Err        0.0389 c3
                 t Value       26.6158
                 p1        =    2.3637*f_pers   +   1.0000 e4
                 Std Err        0.2173 c4
                 t Value       10.8782
                 p2        =    2.1999*f_pers   +   1.0000 e5
                 Std Err        0.1966 c5
                 t Value       11.1889
                 q1        =    2.1214*f_qual   +   1.0000 e6
                 Std Err        0.1917 c6
                 t Value       11.0683
                 q2        =    2.2155*f_qual   +   1.0000 e7
                 Std Err        0.1974 c7
                 t Value       11.2260

                    Latent Variable Equations with Estimates
      f_succ =     2.2570*f_pers   + 2.3670*f_qual    + 1.0000 d
      Std Err      0.3083 g1          0.3106 g2
      t Value      7.3210             7.6209

                    Covariances Among Exogenous Variables
                                                   Standard
        Var1   Var2   Parameter      Estimate         Error         t Value
        f_pers f_qual phi             0.01315       0.09613            0.14



When we use the parameters estimated from the men's data to predict the data observed for the women,
the correspondence is not very good. Note that there is no estimation going on at this stage: no model
parameters are being fitted. Because the chi-square statistic is highly significant, we reject the model
and conclude that the two groups are indeed different.
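
One way to set up this "prediction" step in PROC CALIS is to enter every coefficient as a fixed numeric
constant taken from the men's solution, so that no free parameters remain and the chi-square simply
measures the discrepancy between the women's covariance matrix and the implied matrix. The sketch
below is illustrative only: the data set name women_cov is an assumption, and the variance values
marked as placeholders must be replaced by the corresponding estimates from the men's run (they are
not reproduced above).

  /* Hedged sketch: fit the women's covariance matrix with ALL parameters fixed */
  /* at the men's estimates (zero free parameters, 28 degrees of freedom).      */
  proc calis data=women_cov method=ml nobs=125 cov;
     lineqs
        s1 = 1.0000 * f_succ + e1,
        s2 = 0.9960 * f_succ + e2,
        s3 = 1.0347 * f_succ + e3,
        p1 = 2.3637 * f_pers + e4,
        p2 = 2.1999 * f_pers + e5,
        q1 = 2.1214 * f_qual + e6,
        q2 = 2.2155 * f_qual + e7,
        f_succ = 2.2570 * f_pers + 2.3670 * f_qual + d1;
     std
        f_pers = 1, f_qual = 1,                 /* fixed at 1, as in the men's run     */
        d1 = 9.99,                              /* PLACEHOLDER: men's disturbance var. */
        e1 = 9.99, e2 = 9.99, e3 = 9.99,        /* PLACEHOLDERS: men's error variances */
        e4 = 9.99, e5 = 9.99, e6 = 9.99,
        e7 = 9.99;
     cov
        f_pers f_qual = 0.01315;                /* fixed at the men's estimate of phi  */
  run;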

    Predicting WOMEN Using Model Calibrated on MEN:
         Goodness of Fit Index (GFI)                                0.8216
         GFI Adjusted for Degrees of Freedom (AGFI)                 0.8216
         Root Mean Square Residual (RMR)                            8.9304
         Parsimonious GFI (Mulaik, 1989)                            1.0955
         Chi-Square                                                86.9466
         Chi-Square DF                                                  28
         Pr > Chi-Square                                            <.0001



The problem is that we do not know exactly the basis of the difference between the two groups. One
thing we can do is increase the generality of the model by relaxing the constraints on the structural
equation model parameters. Allowing these three parameters (the two path coefficients and the error
variance) to be different across groups, we get the following result.



  Relaxing Constraints on Structural Equation Parameters:
        Goodness of Fit Index (GFI)                                   0.9406
        GFI Adjusted for Degrees of Freedom (AGFI)                    0.9335
        Root Mean Square Residual (RMR)                               0.6635
        Parsimonious GFI (Mulaik, 1989)                               1.1197
        Chi-Square                                                   26.0779
        Chi-Square DF                                                     25
        Pr > Chi-Square                                               0.4035

  Latent Variable Equations with Estimates
     f_succ =    4.3922*f_pers   + 2.2710*f_qual           +    1.0000 d
     Std Err     0.3848 g1          0.3854 g2
     t Value    11.4127             5.8931

  Standardized:
        f_succ =       0.7352*f_pers +   0.3801*f_qual +       0.5547 d
                              g1                g2




11. Analysis of Variance

Exercise 11.4

As currently stated, this problem asks for an ANOVA that involves a within-subjects treatment (i.e.,
each subject is asked to taste a product and a modified form of that product, which means we observe
two different treatment levels within the same subject). Since we have not discussed these designs, I
changed the format of the problem and asked students to look at the differences in perceptions of the
two products (across six attributes) and test to see if there is a difference due to order of taste. The
modified data format is shown below (the two products are identified as 27 and 45; the first product
tasted is listed in column 2; columns 3-8 contain the six attribute ratings for product 27 minus the
corresponding ratings for product 45):



      1    27   -2    -1        0     0    0     0
      2    27   -1     1        0    -2   -3    -1
      3    27    0     0        0    -1    0     0
      4    27    2     1       -1    -1    0    -1
      5    27    1     1        0     0    0     0
      6    27   -1    -1        1    -2    2    -1
      7    27   -2     0        1    -1    0     0
      8    27   -4     0        1    -1    1    -1
      9    27    1    -2       -1     0    0     0
     10    27    2    -1        0     0    0     0
  < 89 rows omitted >
    100    27    1     0        1     1    0     2
    101    45   -1    -1        0     1   -2     0
    102    45    1    -1        1     0    0     0
    103    45    0     0        0     1    0     0
    104    45   -1    -1       -1    -1   -1     0
    105    45    1     0        0    -1   -1     2
    106    45    1     2        0     0    0     1
    107    45    0     0        0     0    0     0
    108    45   -1     0        0     1    0     0
    109    45   -2     2        0    -1    0     0
    110    45   -2     0        1    -1    0    -1
  < 90 rows omitted >




The appropriate test is a MANOVA across all six attributes (one can look at the correlation matrix and
verify that there is a high level of collinearity across attributes; Bartlett's test of sphericity could also be
used here). As shown below, the null hypothesis is rejected, which means there are significant
differences. This is interesting, since one would not necessarily expect the perceptions of the products
to depend on which is tasted first. However, there are studies in the marketing literature that
substantiate the existence of this type of first-mover reference effect.
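
A minimal sketch of this analysis in SAS is given below; the data set name taste, the order factor first,
and the difference scores a1-a6 are names assumed to be consistent with the output that follows.

  /* Hedged sketch: MANOVA of the six attribute difference scores on tasting order */
  proc glm data=taste;
     class first;                 /* product tasted first: 27 or 45                  */
     model a1-a6 = first;         /* six difference scores (product 27 - product 45) */
     manova h=first;              /* multivariate test of the order effect           */
     means first;                 /* treatment group means, as reported below        */
  run;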

                MANOVA Test Criteria and Exact F Statistics for
                   the Hypothesis of No Overall first Effect
                       H = Type III SSCP Matrix for first
                             E = Error SSCP Matrix

                                S=1     M=2    N=95.5

   Statistic                          Value   F Value    Num DF   Den DF   Pr > F
   Wilks' Lambda                 0.82829345      6.67         6      193   <.0001
   Pillai's Trace                0.17170655      6.67         6      193   <.0001
   Hotelling-Lawley Trace        0.20730159      6.67         6      193   <.0001
   Roy's Greatest Root           0.20730159      6.67         6      193   <.0001

                              Treatment group means

   Level of         ------------a1------------      ------------a2------------
   first        N           Mean       Std Dev              Mean       Std Dev
   27         100    -0.55000000    1.69595478       -0.20000000    1.29490064
   45         100     0.37000000    1.66153756        0.21000000    1.17460503

   Level of         ------------a3------------      ------------a4------------
   first        N           Mean       Std Dev              Mean       Std Dev
   27         100     0.21000000    0.70057696        0.08000000    1.07007033
   45         100    -0.01000000    0.67412495       -0.32000000    0.91981553

   Level of         ------------a5------------      ------------a6------------
   first        N           Mean       Std Dev              Mean       Std Dev
   27         100     0.09000000    1.15553127       -0.02000000    1.01483939
   45         100    -0.11000000    1.01399301       -0.18000000    1.08599905




A canonical correlation analysis helps to facilitate the interpretation. Note that the test (based on
Wilks's Λ) is exactly the same. We can also see the canonical structure (i.e., the correlations
between the attributes and their canonical variables). It shows that the product tasted first benefits on
the first two attributes, but does more poorly on the last four.
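
The canonical correlation analysis reported below can be requested with PROC CANCORR (same assumed
data set as above; the VAR and WITH sets mirror the canonical structure output).

  /* Hedged sketch: canonical correlation between tasting order and the six attributes */
  proc cancorr data=taste;
     var  first;                  /* order-of-tasting indicator   */
     with a1-a6;                  /* attribute difference scores  */
  run;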




                          Canonical Correlation Analysis

                                Canonical Structure

    Correlations Between the VAR Variables and Their Canonical Variables
                                             V1
                            first        1.0000

    Correlations Between the WITH Variables and Their Canonical Variables
                                            W1
                              a1        0.6407
                              a2        0.3967
                              a3       -0.3832
                              a4       -0.4766
                              a5       -0.2222
                              a6       -0.1841




Exercise 11.6

These data are meant to mimic the results of an experiment exploring the impact of competitive
expectations. The results are not particularly exciting. Single ANOVAs reveal that both share and
profit are influenced by the experimental manipulation.



  Dependent Variable: share

                                      Sum of
     Source               DF         Squares      Mean Square    F Value     Pr > F
     Model                 2     10290.70000       5145.35000      12.76     <.0001
     Error                57     22982.90000        403.20877
     Corrected Total      59     33273.60000


  Dependent Variable: profit

                                               Sum of
     Source               DF         Squares       Mean Square   F Value     Pr > F
     Model                 2     10157.70000        5078.85000      7.05     0.0018
     Error                57     41043.90000         720.06842
     Corrected Total      59     51201.60000




These results from the single ANOVAs are supported by the MANOVA. The pattern of means shows
that the "competitive" treatment group performed relatively better in terms of profitability, while the
"cooperative" treatment group performed relatively better on share. Both treatment groups seemed to
perform better than the control.
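
A hedged sketch of the SAS steps behind these analyses is given below; the data set name expect is an
assumption, while treat, share, profit, and past_exp are the variable names appearing in the output.

  /* Hedged sketch: ANOVAs and MANOVA for the experimental manipulation */
  proc glm data=expect;
     class treat;
     model share profit = treat;       /* univariate tests for each dependent variable */
     manova h=treat;                   /* joint multivariate test                      */
     means treat;
  run;

  /* Adding past work experience as a covariate gives the results reported further below. */
  proc glm data=expect;
     class treat;
     model share profit = treat past_exp;
     manova h=treat;
     lsmeans treat;                    /* covariate-adjusted (least squares) means */
  run;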



                 MANOVA Test Criteria and F Approximations for
                   the Hypothesis of No Overall treat Effect
                       H = Type III SSCP Matrix for treat
                             E = Error SSCP Matrix

  Statistic                         Value F Value Num DF Den DF Pr > F
   Wilks' Lambda                0.45856613   13.35      4    112 <.0001
   Pillai's Trace               0.62548024   12.97      4    114 <.0001
   Hotelling-Lawley Trace       0.99742972   13.88      4 66.174 <.0001
   Roy's Greatest Root          0.75451889   21.50      2     57 <.0001

          NOTE: F Statistic for Roy's Greatest Root is an upper bound.
                 NOTE: F Statistic for Wilks' Lambda is exact.


      Level of       -----------share----------     ----------profit----------
      treat      N           Mean       Std Dev             Mean       Std Dev

    1          20       92.750000    18.8815560           91.600000      21.7894228
    2          20       99.550000    16.5449149          122.650000      30.9078001
    3          20      123.300000    24.0702918          113.350000      27.0209957



Almost nothing changes if we include past work experience as a covariate:


  Dependent Variable: share

                                      Sum of
   Source                 DF         Squares       Mean Square        F Value     Pr > F
   Model                   3     13707.42043        4569.14014          13.08     <.0001
   Error                  56     19566.17957         349.39606
   Corrected Total        59     33273.60000

  Source                   DF    Type III SS       Mean Square        F Value     Pr > F
   treat                    2    9929.133346       4964.566673          14.21     <.0001
   past_exp                 1    3416.720431       3416.720431           9.78     0.0028


  Dependent Variable: profit

                                      Sum of
   Source                 DF         Squares       Mean Square        F Value     Pr > F
   Model                   3     13252.53660        4417.51220           6.52     0.0007
   Error                  56     37949.06340         677.66185
   Corrected Total        59     51201.60000

  Source                   DF    Type III SS       Mean Square        F Value     Pr > F
   treat                    2    9949.292723       4974.646362           7.34     0.0015
   past_exp                 1    3094.836603       3094.836603           4.57     0.0370




                MANOVA Test Criteria and F Approximations for
                  the Hypothesis of No Overall treat Effect
                      H = Type III SSCP Matrix for treat
                            E = Error SSCP Matrix

  Statistic                          Value     F Value    Num DF    Den DF     Pr > F

   Wilks' Lambda                0.44174276       13.88          4        110   <.0001
   Pillai's Trace               0.64961118       13.47          4        112   <.0001
   Hotelling-Lawley Trace       1.05695747       14.45          4     64.974   <.0001
   Roy's Greatest Root          0.79771155       22.34          2         56   <.0001

        NOTE: F Statistic for Roy's Greatest Root is an upper bound.
               NOTE: F Statistic for Wilks' Lambda is exact.


                                Least Squares Means
                                                        profit
                     treat      share LSMEAN            LSMEAN
                     1             93.015415         91.852604
                     2             99.583177        122.681575

                     3           123.001408        113.065821




Canonical correlation also helps with the interpretation of the MANOVA. In this case, we have three
treatment groups and two dependent variables, so there are two canonical correlations. As shown by
the sequential test based on Wilks's Λ, both are significant.



                         Canonical Correlation Analysis

                                     Adjusted      Approximate          Squared
                   Canonical        Canonical         Standard        Canonical
                 Correlation      Correlation            Error      Correlation
            1       0.655777         0.637219         0.074202         0.430043
            2       0.442083          .               0.104745         0.195437

                 Test of H0: The canonical correlations in the
                   current row and all that follow are zero

                 Likelihood      Approximate
                      Ratio          F Value      Num DF     Den DF     Pr > F
            1    0.45856613            13.35           4        112     <.0001
            2    0.80456294            13.85           1         57     0.0005




Looking at the canonical structure, we see that the first pair of variables differentiates between the
"competitive" and "cooperative" treatments (note that comp loads positively on V1, while coop loads
negatively). Looking at W1, we see that it reflects the difference in performance on the two variables
(loading positively on profit and negatively on share). The second pair of canonical variables reflects
the difference in performance between the two treatment groups and the control group (which does
worse in terms of profit and share).



                               Canonical Structure

     Correlations Between the VAR Variables and Their Canonical Variables
                                       V1            V2
                       comp        0.7937        0.6083
                       coop       -0.9236        0.3833

     Correlations Between the WITH Variables and Their Canonical Variables
                                        W1            W2
                      share        -0.6966        0.7175
                      profit        0.1121        0.9937


Exercise 11.7

It is possible to approach this problem using MANOVA to examine the impact of the experimental
factors on all four dependent variables (Y1, Y2, Y3 and Y4). However, it seems clear that these four
dependent measures are designed to measure two distinct constructs: liking (Y1 and Y2) and purchase
intention (Y3 and Y4).

We begin by using exploratory factor analysis to examine the factor structure of the four dependent
variables. As shown below, we find that a two-factor solution accounts for most of the variation in the
data: the sum of the communalities is 3.23, which suggests that two common factors account for
3.23/4.00 or over 80 percent of the variation in the data.


  Exploratory Factor Analysis of Four Dependent Variables in Exercise 11.7
                       Prior Communality Estimates: SMC
               like1             like2            buy1                buy2
          0.80643919        0.80162491      0.71857006          0.69248843

                Eigenvalues of the Reduced Correlation Matrix:
                    Total = 3.0191226 Average = 0.75478065

                 Eigenvalue      Difference      Proportion     Cumulative
            1    2.74173543      2.24935193          0.9081         0.9081
            2    0.49238350      0.58158550          0.1631         1.0712
            3    -.08920200      0.03659233         -0.0295         1.0417
            4    -.12579433                         -0.0417         1.0000

             2 factors will be retained by the NFACTOR criterion.

                                 Factor Pattern
                                    Factor1           Factor2
                       like1        0.86169          -0.33046
                       like2        0.85272          -0.34438
                       buy1         0.81587           0.34354
                       buy2         0.77875           0.38284

                          Variance Explained by Each Factor
                                Factor1         Factor2
                              2.7417354       0.4923835

                 Final Communality Estimates: Total = 3.234119
               like1           like2            buy1            buy2
          0.85170968      0.84573766      0.78365811      0.75301347




An oblique rotation of the factor solution results in two clear factors: “product liking” and “purchase
intention.” The factor loadings exhibit simple structure (Y1 and Y2 load clearly on the first factor and
Y3 and Y4 load clearly on the second). The two factors exhibit a correlation of 0.62 (i.e., product
liking is strongly positively associated with purchase intention).

  Results of Oblique Rotation for Exercise 11.7
                       Rotation Method: Promax (power = 3)

                             Inter-Factor Correlations
                                      Factor1         Factor2
                     Factor1          1.00000         0.61950
                     Factor2          0.61950         1.00000


         Rotated Factor Pattern (Standardized Regression Coefficients)
                                   Factor1         Factor2
                     like1         0.88132         0.06483
                     like2         0.89175         0.04397
                     buy1          0.09325         0.82444
                     buy2          0.02698         0.85079




Using the factor score coefficients from the factor analysis, we calculated factor scores and used these
as the dependent variables in a MANOVA. (Note that the results would be similar if we simply took
the sum of Y1 and Y2 and the sum of Y3 and Y4 and subjected them to the same analysis).

The results of the simple ANOVAs of Factor1 (“Liking”) and Factor 2 (“Purchase Intent”) are shown
below. Interestingly, we see that the inclusion of a uniqueness claim has a significant impact on liking
(p < 0.05) and the inclusion of a competitive claim has a significant impact on purchase intent (p <
0.05). There is also a significant interaction between claims on liking (p < 0.05) and purchase intent (p
< 0.05).


  Simple ANOVA Results for Exercise 11.7
  Dependent Variable: Factor1

                                             Sum of
     Source                      DF         Squares      Mean Square     F Value    Pr > F
     Model                        3      9.12005460       3.04001820        3.66    0.0150
     Error                       96     79.64772132       0.82966376
     Corrected Total             99     88.76777592

              R-Square      Coeff Var       Root MSE      Factor1 Mean
              0.102741     2.56384E17       0.910859        3.5527E-16

     Source                      DF     Type III SS      Mean Square     F Value    Pr > F
     compet                       1      0.00543779       0.00543779        0.01    0.9356
     unique                       1      4.53521680       4.53521680        5.47    0.0215
     compet*unique                1      4.57940001       4.57940001        5.52    0.0209


  Dependent Variable: Factor2

                                             Sum of
     Source                      DF         Squares      Mean Square     F Value    Pr > F
     Model                        3      9.34817426       3.11605809        4.03    0.0096
     Error                       96     74.24853373       0.77342223

   Corrected Total               99     83.59670799

            R-Square        Coeff Var        Root MSE     Factor2 Mean
            0.111825       8.25139E17        0.879444       1.0658E-16

   Source                        DF     Type III SS      Mean Square      F Value   Pr > F
   compet                         1      3.94340484       3.94340484         5.10   0.0262
   unique                         1      0.59738878       0.59738878         0.77   0.3817
   compet*unique                  1      4.80738064       4.80738064         6.22   0.0144




The statistical tests from the MANOVA of Factor1 and Factor2 yield the same conclusions: all effects
significant at p < 0.05.


  MANOVA Results for Exercise 11.7
                        Multivariate Analysis of Variance

  COMPETITIVE CLAIM:

   Statistic                         Value    F Value   Num DF   Den DF   Pr > F
   Wilks' Lambda                0.90767677       4.83        2       95   0.0100

  UNIQUENESS CLAIM:

   Statistic                         Value    F Value   Num DF   Den DF   Pr > F
   Wilks' Lambda                0.93481584       3.31        2       95   0.0407

  INTERACTION:

   Statistic                         Value    F Value   Num DF   Den DF   Pr > F
   Wilks' Lambda                0.93300518       3.41        2       95   0.0371




The effects themselves are revealed by the mean values of Factor1 and Factor2, shown below:


  Mean Values on Factor1 and Factor 2
    Level of          ----------Factor1---------   ----------Factor2---------
    compet        N           Mean       Std Dev           Mean       Std Dev
    0            50     0.00737414    0.94288753     0.19858008    0.89527570
    1            50    -0.00737414    0.96043709    -0.19858008    0.90777697

    Level of          ----------Factor1---------   ----------Factor2---------
    unique        N           Mean       Std Dev           Mean       Std Dev
    0            50     0.21296048    1.00187542     0.07729093    0.96192245
    1            50    -0.21296048    0.84574078    -0.07729093    0.87668059

  Level of Level of    ---------Factor1--------- ---------Factor2---------
  compet   unique    N         Mean      Std Dev         Mean      Std Dev
  0        0        25   0.00633929   1.07020031   0.05661361   1.00625374

  0           1       25    0.00840898     0.81840458    0.34054655       0.76282283
  1           0       25    0.41958167     0.90280717    0.09796825       0.93579176
  1           1       25   -0.43432995     0.82974666   -0.49512841       0.78964385




Note that the main effect of the inclusion of a competitive claim is largest (and negative) with respect
to purchase intent. It suggests that a competitive claim has little effect on product liking and a
pronounced negative effect on purchase intent. Similarly, the effect of the inclusion of a uniqueness
claim is largest (and negative) with respect to product liking. It suggests that a uniqueness claim has
little effect on purchase intent and a pronounced negative effect on product liking.

These seemingly counterintuitive results make more sense when we examine the interaction effect
revealed by the cell means. These suggest that the inclusion of a competitive claim alone has a
positive effect on liking (but not purchase intent) relative to no competitive claim, and that the
inclusion of a uniqueness claim alone has a positive effect on purchase intent (but not liking) relative
to no uniqueness claim. However, when both claims are included, the effect on both liking and
purchase intent is substantially negative relative to no claims at all.

We can compare the results above to those from a MANOVA of the four original dependent variables.
As shown below, while the main effects of the competitive claim and the uniqueness claim are both
significant (at p < 0.05), the interaction effect is not significant. This seems unusual, in light of the
fact that the interaction effect is significant in each of the simple ANOVAs.


  MANOVA for all four dependent variables in Exercise 11.7
                       Multivariate Analysis of Variance

  COMPETITIVE CLAIM:

      Statistic                      Value    F Value   Num DF   Den DF    Pr > F
      Wilks' Lambda             0.81605120       5.24        4       93    0.0007

  UNIQUENESS CLAIM:

      Statistic                      Value    F Value   Num DF   Den DF    Pr > F
      Wilks' Lambda             0.87563562       3.30        4       93    0.0141

  INTERACTION:

      Statistic                      Value    F Value   Num DF   Den DF    Pr > F
      Wilks' Lambda             0.93077476       1.73        4       93    0.1501




The reason may be that the pattern of mean values differs across the four dependent variables. As
shown below, the combination of a competitive claim and a uniqueness claim results in the lowest
average score on each of the four dependent measures. But there appears to be more noise across the
four separate measures, and the interaction pattern does not hold up in this analysis.

  Mean Values on Four Dependent Variables for Exercise 11.7

  Level of Level of         ----------like1---------- ----------like2----------
  compet   unique       N           Mean      Std Dev         Mean      Std Dev

  0        0           25    49.6400000   9.34469546   49.4000000   10.0332780
  0        1           25    49.2400000   8.00666389   49.5200000    7.4056285
  1        0           25    53.3200000   8.24984848   53.3600000    8.3260635
  1        1           25    45.4000000   8.50490055   46.0400000    7.1032856

  Level of Level of         -----------buy1---------- -----------buy2----------
  compet   unique       N           Mean      Std Dev         Mean      Std Dev

  0        0           25    51.0000000   8.96753403   50.3600000   8.37595766
  0        1           25    52.8800000   6.55311631   53.9200000   7.69696910
  1        0           25    52.6000000   8.91160292   48.4800000   7.94837510
  1        1           25    46.5200000   7.93263302   45.4000000   6.74536878



 SAS PROGRAM FOR EXERCISE 11.7
 options ls=72;

 data claims;
     infile 'AD_CLAIM_TEST.txt';
     input subject compet $ unique $ like1 like2 buy1 buy2;
 run;

 proc factor data=claims method=principal priors=smc n=2
     rotate=promax score outstat=stat;
     var like1 like2 buy1 buy2;
 run;

 proc score data=claims score=stat out=scores;
     var like1 like2 buy1 buy2;
 run;

 proc glm data=scores;
     class compet unique;
     model factor1 factor2 = compet unique compet*unique;
     manova h=compet unique compet*unique;
     means compet unique compet*unique;
 run;

 proc glm data=claims;
     class compet unique;
     model like1 like2 buy1 buy2 = compet unique compet*unique;
     manova h=compet unique compet*unique;
     means compet unique compet*unique;
 run;


Exercise 11.8

Ever wondered how different methods of cooking fish impact the aroma, flavor, texture and moisture
of the final product? Here is your chance to find out! All four of these measures are relatively highly
correlated, so MANOVA is appropriate here.



                        Pearson Correlation Coefficients, N = 36
                                Prob > |r| under H0: Rho=0

                             aroma         flavor       texture      moisture
        aroma              1.00000        0.73024       0.49060       0.39795
                                           <.0001        0.0024        0.0162
        flavor             0.73024        1.00000       0.36140       0.41800
                            <.0001                       0.0303        0.0112
        texture            0.49060        0.36140       1.00000       0.59300
                            0.0024         0.0303                      0.0001
        moisture           0.39795        0.41800       0.59300       1.00000
                            0.0162         0.0112        0.0001




The results of the MANOVA show a significant difference across cooking methods. Looking at the
patterns of means, we see the largest differences across groups in terms of flavor.



                   MANOVA Test Criteria and F Approximations for
                     the Hypothesis of No Overall method Effect
                        H = Type III SSCP Matrix for method
                               E = Error SSCP Matrix

  Statistic                              Value F Value Num DF Den DF Pr > F
   Wilks' Lambda                     0.24182119    7.75      8     60 <.0001
   Pillai's Trace                    0.85684365    5.81      8     62 <.0001
   Hotelling-Lawley Trace            2.72727944   10.04      8 40.602 <.0001
   Roy's Greatest Root               2.56842429   19.91      4     31 <.0001

          NOTE: F Statistic for Roy's Greatest Root is an upper bound.
                 NOTE: F Statistic for Wilks' Lambda is exact.


    Level of            -----------aroma----------    ----------flavor----------
    method         N            Mean       Std Dev            Mean       Std Dev

    1              12     5.38333333     0.58439298    5.70833333    0.44201673
    2              12     5.25833333     0.76331613    5.23333333    0.57419245
    3              12     4.97500000     0.54292976    4.83333333    0.45990776

    Level of            ----------texture---------    ---------moisture---------
    method         N            Mean       Std Dev            Mean       Std Dev

    1              12     5.52500000     0.65244296    5.98333333    0.69652819

     2          12      5.30833333      0.59460962       5.87500000      0.51720402
     3          12      5.90833333      0.51071845       6.23333333      0.45593726




Canonical correlation provides some additional interpretation. Again, with three methods and four
variables, we have two canonical correlations. In this case, however, only the first is significant, so
we interpret only the first pair of canonical variates. Looking at the canonical structure, we see
that methods 1 and 2 load positively on the first canonical variable (so we may interpret this variable as
being associated with these two cooking methods and not with method 3). Looking at the loadings on
W1, we see that flavor (especially) and to some extent aroma load positively, while texture and
moisture load negatively. Putting these together, we can conclude that methods 1 and 2 tend to lead to
higher evaluations on flavor and aroma, but lower on texture and moisture than method 3. This is
borne out by the patterns of treatment group means.
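
The canonical structure shown below is based on two dummy variables, m1 and m2, representing the
three cooking methods. A sketch of how that analysis might be set up follows; the data set name fish
and the 0/1 coding of the dummies are assumptions inferred from the output.

  /* Hedged sketch: canonical correlation between method dummies and the four ratings */
  data fish2;
     set fish;
     m1 = (method = 1);                /* dummy for cooking method 1 */
     m2 = (method = 2);                /* dummy for cooking method 2 */
  run;

  proc cancorr data=fish2;
     var  m1 m2;                            /* method dummies           */
     with aroma flavor texture moisture;    /* the four sensory ratings */
  run;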



                           Canonical Correlation Analysis

                                        Adjusted      Approximate         Squared
                       Canonical       Canonical         Standard       Canonical
                     Correlation     Correlation            Error     Correlation
            1           0.848389        0.832288         0.047368        0.719764
            2           0.370242        0.312605         0.145860        0.137079

                     Test of H0: The canonical correlations in the
                       current row and all that follow are zero

                     Likelihood     Approximate
                          Ratio         F Value      Num DF     Den DF     Pr > F
            1        0.24182119            7.75           8         60     <.0001
            2        0.86292062            1.64           3         31     0.1999


                                   Canonical Structure

     Correlations Between the VAR Variables and Their Canonical Variables
                                      V1            V2
                        m1        0.7259        0.6878
                        m2        0.2327       -0.9726

     Correlations Between the WITH Variables and Their Canonical Variables
                                         W1            W2
                     aroma           0.3177        0.0106
                     flavor          0.6811        0.4560
                     texture        -0.3770        0.6613
                     moisture       -0.2618        0.3999

Exercise 11.11

This problem calls for a test of the effect of an orientation program on outcome measures of anxiety,
depression, and anger.  Presumably, the goal of the program is to increase the effectiveness of
psychotherapy and lead to lower scores on these variables.



                      Pearson Correlation Coefficients, N = 46
                             Prob > |r| under H0: Rho=0

                                anxiety        depress         anger
               anxiety          1.00000        0.85554       0.65612
                                                <.0001        <.0001
               depress          0.85554        1.00000       0.63084
                                 <.0001                       <.0001
               anger            0.65612        0.63084       1.00000
                                 <.0001         <.0001




A MANOVA shows that the effect of the program is not significant at the 0.05 level.  Thus, even
though the pattern of means is consistent with the goals of the program, we cannot reject the null
hypothesis of no treatment effect; the observed differences could have arisen by chance.
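A sketch of the SAS call that would produce the MANOVA and group means shown below (the data set
name therapy is an assumption; treat, anxiety, depress, and anger are the variable names in the output):

 proc glm data=therapy;
     class treat;                             /* 0/1 indicator (1 = program, presumably) */
     model anxiety depress anger = treat;     /* the three outcome measures              */
     manova h=treat;                          /* multivariate test of the treat effect   */
     means treat;                             /* group means and standard deviations     */
 run;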



                MANOVA Test Criteria and Exact F Statistics for
                   the Hypothesis of No Overall treat Effect
                       H = Type III SSCP Matrix for treat
                             E = Error SSCP Matrix

  Statistic                              Value F Value Num DF Den DF Pr > F
   Wilks' Lambda                     0.85025028    2.47      3     42 0.0754
   Pillai's Trace                    0.14974972    2.47      3     42 0.0754
   Hotelling-Lawley Trace            0.17612428    2.47      3     42 0.0754
   Roy's Greatest Root               0.17612428    2.47      3     42 0.0754

    Level of          ----------anxiety---------     ----------depress---------
    treat        N            Mean       Std Dev             Mean       Std Dev
    0           26      158.769231     80.173466       182.884615    124.205419
    1           20      104.350000    105.812782       154.250000    137.158715

                     Level of        ------------anger------------
                     treat       N           Mean          Std Dev
                     0          26     66.0000000       51.1890613
                     1          20     51.0500000       51.1967053



Exercise 11.12

This problem is a relatively straightforward application of MANOVA. This is a one factor design with
three factor levels: control group, behavioral rehearsal treatment group, and behavioral rehearsal with
cognitive restructuring. There are four dependent variables in the study: anxiety, social skills,
appropriateness, and assertiveness. All four variables are highly intercorrelated (see below). It might
be argued that all four are indicators of one underlying factor.


  Correlation matrix for four dependent variables in Exercise 11.12
                         Pearson Correlation Coefficients, N = 33
                                Prob > |r| under H0: Rho=0

                           anxiety         social            approp        assert

          anxiety          1.00000       -0.82209       -0.85925         -0.89866
                                           <.0001         <.0001           <.0001

          social          -0.82209        1.00000           0.87183       0.83709
                            <.0001                           <.0001        <.0001

          approp          -0.85925        0.87183           1.00000       0.93552
                            <.0001         <.0001                          <.0001

          assert          -0.89866        0.83709           0.93552       1.00000
                            <.0001         <.0001            <.0001




We first analyze the four dependent measures using MANOVA. The results (shown below) show a
statistically significant effect (p < 0.001). Thus, we reject the null hypothesis that the means of the
dependent variables are the same across treatment groups.


  MANOVA test of all four dependent variables for Exercise 11.12
                            Multivariate Analysis of Variance

     Statistic                            Value   F Value    Num DF   Den DF   Pr > F
     Wilks' Lambda                   0.38164971      4.18         8       54   0.0006




Examining the means of the variables across groups, we see that the most dramatic differences are
between the control group and the two treatment groups (behavioral rehearsal and behavioral rehearsal
plus cognitive restructuring).


  Treatment group means for four dependent variables
      Level of           ----------anxiety---------    ----------social----------
      group         N            Mean       Std Dev            Mean       Std Dev

      1             11     4.27272727     0.64666979        4.27272727    0.64666979
      2             11     4.09090909     0.30151134        4.27272727    0.90453403
      3             11     5.45454545     0.82019953        2.54545455    1.03572548


    Level of          ----------approp----------       ----------assert----------
    group        N            Mean       Std Dev               Mean       Std Dev

    1            11    4.18181818         0.60302269        3.81818182    0.75075719
    2            11    4.27272727         0.78624539        4.09090909    0.70064905
    3            11    2.54545455         0.93419873        2.54545455    0.93419873




This raises an interesting question: is there any incremental effect of cognitive restructuring over and
above behavioral rehearsal? To test this, we test the contrast between the two treatment groups (1
versus 2). The result (shown below) is not significant.


  MANOVA: Contrast of Group 1 versus Group 2
                           Multivariate Analysis of Variance

   Statistic                             Value    F Value    Num DF   Den DF   Pr > F
   Wilks' Lambda                    0.94238950       0.41         4       27   0.7980




We also analyzed the data by factor analyzing the four dependent variables, extracting a single factor
(which accounted for almost 85 percent of the variation in the data), and conducting a simple ANOVA
on the factor scores.  The results are similar: the treatment effects are significant (relative to control),
and the incremental impact of adding cognitive restructuring to behavioral rehearsal (relative to
behavioral rehearsal alone) is not significant.


  ANOVA of single factor extracted from four dependent variables in Exercise 11.12
  Dependent Variable: Factor1

                                                 Sum of
   Source                            DF         Squares        Mean Square     F Value   Pr > F
   Model                              2     16.22951626         8.11475813       16.60   <.0001
   Error                             30     14.66709800         0.48890327
   Corrected Total                   32     30.89661426

             R-Square         Coeff Var          Root MSE      Factor1 Mean
             0.525285        6.49479E17          0.699216        1.0766E-16

   Contrast                          DF     Contrast SS       Mean Square      F Value   Pr > F
   1 versus 2                         1      0.14401863        0.14401863         0.29   0.5913

  GROUP MEANS:

                Level of                  -----------Factor1-----------
                group           N                 Mean          Std Dev
                1              11           0.41277050       0.59258357
                2              11           0.57458893       0.60827970
            3            11      -0.98735943       0.86345255




 SAS PROGRAM FOR EXERCISE 11.12
 options ls=72;

 data skills;
     infile 'SOC_SKILLS.txt';
     input group $ anxiety social approp assert;
 run;

 proc corr data=skills;
     var anxiety social approp assert;

 proc glm data=skills;
     class group;
     model anxiety social approp assert = group;
     contrast '1 versus 2' group -1 1 0;
     manova h=group;
     means group;

 proc factor data=skills method=principal priors=smc n=1
     score outstat=stat;
     var anxiety social approp assert;
 run;
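The program above stops at the factor extraction.  A sketch of how the Factor1 scores used in the
ANOVA might then be computed and analyzed (PROC SCORE applies the scoring coefficients saved in
the outstat= data set; the output data set name scored is an assumption):

 proc score data=skills score=stat out=scored;
     var anxiety social approp assert;          /* variables used to form Factor1      */
 run;

 proc glm data=scored;
     class group;
     model Factor1 = group;                     /* one-way ANOVA on the factor scores  */
     contrast '1 versus 2' group -1 1 0;        /* incremental effect of restructuring */
     means group;
 run;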








12. Discriminant Analysis

Exercise 12.7

The data are collected from two groups of patients: 15 classified as ill and 30 classified as well. There
is no information accompanying the problem to say whether this constitutes a representative sample
from any population, so it is hard to say how to generalize this analysis. In the absence of any other
specific information, we will assume that the sample proportions (1/3, 2/3) represent our priors.

We can test for a difference between group centroids (i.e., the mean value on each of the five measures
of everyday functionality) using discriminant analysis. Because discriminant analysis is a special case
of canonical correlation, the results from PROC CANCORR in SAS are shown below. The correlation
is 0.70 and the associated F-test (based on Wilks's Λ) is significant at the 0.0001 level.



                                      Adjusted      Approximate          Squared
                    Canonical        Canonical         Standard        Canonical
                  Correlation      Correlation            Error      Correlation
            1        0.696192         0.666624         0.077687         0.484683

                  Test of H0: The canonical correlations in the
                    current row and all that follow are zero

                  Likelihood      Approximate
                       Ratio          F Value     Num DF      Den DF     Pr > F
            1     0.51531669             7.34          5          39     <.0001




We can look at the canonical loadings (i.e., the correlations between original variables and canonical
variates) to interpret the results. They show that four of the five measures load highly (i.e., all but
"feeling capable of making decisions"). This suggests that this one variable is perhaps a less reliable
indicator of illness than the other four. Note that the standardized canonical coefficients (i.e., the
weights used to form the linear combinations that are the canonical variates) suggest the same pattern
(i.e., closest to zero for the decision-making variable).



  Total Canonical Structure
                         Variable                    Can1
                         useful                  0.875057
                         content                 0.663438
                         decide                  0.381177
                         nostart                 0.778799
                         dread                   0.789652

  Total-Sample Standardized Canonical Coefficients
                         Variable              Can1
                            useful         0.6069552778
                            content        0.2857822573
                            decide         -.0828879014
                            nostart        0.3842507350
                            dread          0.4922908730

  Class Means on Canonical Variables
                             ill                   Can1
                               1           -1.340710160
                               2            0.670355080




To test the predictive performance of the discriminant function, we use a one-at-a-time holdout cross-
validation (in this case, using PROC DISCRIM in SAS). The results, shown below, suggest a hit rate
of 38/45 = 84.4 percent.  The proportional chance criterion is only (1/3)² + (2/3)² = 55.6 percent.  How
likely are we to achieve a hit rate of 38 out of 45 if the true performance of our linear discriminant
function were no better than 0.556? The standard deviation of the number of expected hits under this
null hypothesis is equal to sqrt( 45 x 0.556 x 0.444 ) = 3.33. The t-ratio is t = (38 - 25) / 3.33 = 3.9,
which is significant.
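A minimal sketch of this calculation (the figures are those quoted above):

 data hit_test;
     n    = 45;                          /* observations classified              */
     hits = 38;                          /* correct classifications              */
     cpro = (1/3)**2 + (2/3)**2;         /* proportional chance criterion, 0.556 */
     sd   = sqrt(n*cpro*(1 - cpro));     /* sd of the hit count under the null   */
     t    = (hits - n*cpro) / sd;        /* approximately 3.9                    */
     put cpro= sd= t=;
 run;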



                           Linear Discriminant Function

                     _      -1 _                                           -1 _
      Constant = -.5 X' COV    X + ln PRIOR            Coefficient = COV      X
                       j        j           j          Vector                  j

                      Variable                1                2
                      Constant        -11.12483        -18.48019
                      useful            0.18715          1.48700
                      content           3.02631          3.72234
                      decide            5.22426          4.94589
                      nostart          -0.79384         -0.10000
                      dread             3.17374          4.54374

  Cross-Validation:    Number of Observations and Percent Classified into ill

              From ill               1             2          Total
                     1              12             3             15
                                 80.00         20.00         100.00
                      2              4            26             30
                                 13.33         86.67         100.00
                  Total             16            29             45
                                 35.56         64.44         100.00




We also test the assumption that the within-group covariance matrices are the same across ill and well
patients.  As shown below, this assumption clearly does not hold; in fact, the covariance matrix for the
ill group is not of full rank (two or more of the variables are perfectly collinear within that sample).
However, if we use a quadratic discriminant function instead of a linear one, we find that predictive
performance goes down (when evaluated using one-at-a-time holdout cross-validation) compared to the
linear function.  This suggests that the quadratic rule is overly sensitive to sampling differences in the
estimated within-group covariance matrices across groups.
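Both sets of cross-validation results reported for this exercise can be produced with PROC DISCRIM
along the following lines (the data set name patients is an assumption; the variable names are those in
the output).  With POOL=TEST, SAS prints the test of homogeneity and switches to the within-group
(quadratic) rule when the test rejects; POOL=YES forces the pooled (linear) rule.

 proc discrim data=patients pool=test crossvalidate;
     class ill;                                  /* 1 = ill, 2 = well                    */
     priors proportional;                        /* priors set to the sample proportions */
     var useful content decide nostart dread;    /* five measures of functionality       */
 run;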



  Within Covariance Matrix Information
                                             Natural Log of the
                               Covariance    Determinant of the
                    ill       Matrix Rank     Covariance Matrix
                      1                 4             -27.23757
                      2                 5              -2.80739
                 Pooled                 5              -4.06452

                           Chi-Square       DF    Pr > ChiSq
                           245.651366       15        <.0001


  Cross-Validation:    Number of Observations and Percent Classified into ill

              From ill                1           2         Total
                     1               15           0            15
                                 100.00        0.00        100.00
                       2             15          15            30
                                  50.00       50.00        100.00
                 Total               30          15            45
                                  66.67       33.33        100.00


Exercise 12.8

These data are from an exercise that appeared in the original text by Green and Carroll, developed back
in the day when students were asked to do some of these calculations by hand. There is a typo in the
statement of the problem in the textbook: the data are in fact coded Y=1 for a vote in favor of the gun
control bill and Y=0 for a vote against it.  The data are hypothetical.  Since the data are not drawn from an actual legislative body,
we cannot say that they are representative of any true population of voters. We will assume that the
sample proportions reflect our priors about voting behavior.

A plot of the data (shown below) suggests that the dividing line between those for and those against the
gun control bill might be positively sloped: those voting in favor of the bill (Y=1) fall below this
positively sloped line.



                Plot of age*guns$vote.    Symbol points to label.

     age |
         |
      70 +
         |
         |
         |
      60 +                                                    > 0
         |                                                              > 0
         |                                          > 0
         |                                > 0
      50 +
         |            > 0                 > 0
         |                              0 2 1
         |                      > 0
      40 + > 1
         | > 0        > 1
         |
         |
      30 + 2 1
         | > 1
         |
         |
      20 +
         |
         ---+---------+---------+---------+---------+---------+---------+--
            0         1         2         3         4         5         6
                                        guns




The results from applying Fisher's approach to discriminant analysis (given by PROC CANDISC in
SAS) are shown below.  The result is significant at roughly the 0.01 level: Wilks's Λ = 0.468, with an
F-test p-value of 0.0105.

                                      Adjusted       Approximate          Squared
                    Canonical        Canonical          Standard        Canonical
                  Correlation      Correlation             Error      Correlation
            1        0.729673         0.718972          0.124965         0.532423

                  Test of H0: The canonical correlations in the
                    current row and all that follow are zero

                  Likelihood      Approximate
                       Ratio          F Value      Num DF      Den DF     Pr > F
            1     0.46757711             6.83           2          12     0.0105




Fisher's discriminant function coefficients are proportional to the standardized canonical coefficients
shown below.  These are the weights k used to form the discriminant function score t = x'k.  Note that
the coefficient for age is positive and the coefficient for guns is negative. This is consistent with the
interpretation from the scatter plot above: if the line dividing the two groups of observations (i.e., the
Mahalanobis locus of points) is positively sloped, then the Fisher discriminant function axis must be
negatively sloped. By looking at the group means, we can see that those voting against the legislation
(i.e., vote = 0) have a positive mean discriminant score, while those voting in favor (vote = 1) have a
negative mean score.  This suggests that, holding the number of guns constant, the older the legislator, the
more likely he/she is to vote against the bill. Holding the age of the legislator constant, the greater the
number of guns owned, the more likely he/she is to vote in favor of the gun control bill.

For an interpretation of the discriminant function, we look at the loadings. Here, we see (due to the
positive correlation between age and number of guns owned) that both are positively correlated with
the discriminant function.



  Total Canonical Structure
                         Variable                    Can1
                         age                     0.986597
                         guns                    0.818619

  Total-Sample Standardized Canonical Coefficients
                         Variable              Can1
                         age            1.868967847
                         guns          -0.531004155

  Class Means on Canonical Variables
                            vote                   Can1
                               0            0.811114481
                               1           -1.216671722




The hypothesis of homogeneous within-group covariance matrices across groups cannot be rejected, so
it is appropriate to pool the estimates and use a linear discriminant function.

  Within Covariance Matrix Information
                                               Natural Log of the
                               Covariance      Determinant of the
                    vote      Matrix Rank       Covariance Matrix
                       0                2                 3.81676
                       1                2                 3.27740
                  Pooled                2                 3.72112

                           Chi-Square         DF     Pr > ChiSq
                             1.193126          3         0.7547




PROC DISCRIM in SAS gives the coefficients of the Mahalanobis (linear classification) discriminant
functions, i.e., the distance-based scoring functions for the two groups.  Note that age has a bigger
impact on the score for those voting against the gun control bill (coefficient equal to 2.51 for vote = 0)
than for those voting in favor (coefficient = 2.14 for vote = 1).  Thus, holding all else constant, the
greater the age of the legislator, the greater the discriminant function score for vote = 0 (i.e., those
voting against the bill).



  Linear Discriminant Function

                     _      -1 _                                           -1 _
      Constant = -.5 X' COV    X + ln PRIOR            Coefficient = COV      X
                       j        j           j          Vector                  j


  Linear Discriminant Function for vote

                       Variable                 0             1
                       Constant         -50.04961     -35.72468
                       age                2.51012       2.13764
                       guns              -8.34471      -7.80113




These Mahalanobis distances can be used to calculate posterior probabilities of group membership.
This is done below using one-at-a-time holdout validation (i.e., each observation is classified with
discriminant function coefficients estimated from the remaining n-1 observations only).
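A sketch of how the posterior probabilities are formed from the linear classification scores given above
(the age and guns values here are hypothetical, and the constants already include the log of the prior;
the table below uses leave-one-out coefficients, so its values differ slightly):

 data posterior;
     age = 45;  guns = 2;                              /* hypothetical legislator        */
     L0  = -50.04961 + 2.51012*age - 8.34471*guns;     /* classification score, vote = 0 */
     L1  = -35.72468 + 2.13764*age - 7.80113*guns;     /* classification score, vote = 1 */
     p0  = exp(L0 - L1) / (1 + exp(L0 - L1));          /* posterior Pr(vote = 0 | x)     */
     p1  = 1 - p0;                                     /* posterior Pr(vote = 1 | x)     */
     put p0= p1=;
 run;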



                  Posterior Probability of Membership in vote

                           From    Classified
                Obs        vote     into vote            0          1
                  1           1           1         0.0343     0.9657
                  2           1           1         0.2757     0.7243
                  3           1           0 *       0.8221     0.1779
                  4           1           1         0.0220     0.9780
                  5           1           1         0.0721     0.9279
                  6           1            0 *     0.9309     0.0691
                  7           0            0       0.8801     0.1199
                  8           0            0       0.9608     0.0392
                  9           0            0       0.9981     0.0019
                 10           0            0       0.9800     0.0200
                 11           0            1 *     0.2168     0.7832
                 12           0            0       0.9783     0.0217
                 13           0            1 *     0.4903     0.5097
                 14           0            0       0.5108     0.4892
                 15           0            0       0.8576     0.1424

                           * Misclassified observation




The "hits and misses" table below summarizes the results from the classification shown above.  The hit
rate is 11/15 = 73.3 percent.  While this seems an improvement over the proportional chance hit rate of
(0.6)² + (0.4)² = 52 percent, the difference is not statistically significant given the small sample size.



           Number of Observations and Percent Classified into vote

             From vote                0              1        Total
                     0                7              2            9
                                  77.78          22.22       100.00
                       1              2              4            6
                                  33.33          66.67       100.00
                  Total               9              6           15
                                  60.00          40.00       100.00
                 Priors             0.6            0.4




Exercise 12.12

This data set is used in the SYSTAT manual (Wilkinson) in the chapter on “Discriminant Analysis” by
Englemann. I do not know about the sampling scheme; it seems unlikely that these countries are a
representative sample of the countries of the world. For the purposes of this exercise, proportional
priors are assumed.

a) Are the differences across groups of countries significant? The differences are clearly significant.
Wilks’s Λ is 0.051 (p < 0.0001); in fact, each of the independent variables in the data set is by itself a
significant discriminator. (Note that the results below are based on including 11 of the independent
variables; the 12th, which is a ratio of birth to death rate, is left out because it is a combination of
information already present in the model).

Because there are three groups of countries, there are two discriminant functions. We can test the
second discriminant function alone and we find that it is also significant: Wilks’s Λ = 0.402 (p <
0.0001).
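A sketch of the canonical discriminant analysis call (the data set name world is an assumption; the
eleven predictor names are those listed in the canonical structure below):

 proc candisc data=world out=canout;
     class group;                                       /* Europe, Islamic, NewWorld     */
     var citypop birth death infdeath gdppc educ health
         military lifex_m lifex_f literacy;             /* birth-to-death ratio excluded */
 run;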


                                     Adjusted      Approximate             Squared
                    Canonical       Canonical         Standard           Canonical
                  Correlation     Correlation            Error         Correlation

            1        0.933842         0.920284        0.017410           0.872061
            2        0.773599         0.735910        0.054643           0.598456

                  Test of H0: The canonical correlations in the
                    current row and all that follow are zero

                  Likelihood     Approximate
                       Ratio         F Value      Num DF      Den DF      Pr > F
            1     0.05137337           13.03          22          84      <.0001
            2     0.40154436            6.41          10          43      <.0001




b) Interpret each discriminant function.  For the purpose of interpretation, it is probably best to look
at the discriminant function loadings (called the “canonical structure” in PROC CANDISC in SAS).
The discriminant function coefficients, by contrast, are interpretable as partial regression coefficients;
i.e., they describe the impact of each predictor variable holding all other variables in the model constant.

As shown below, the first discriminant function is negatively correlated with birth rate and infant
mortality, and positively correlated with life expectancy, literacy, per capita GDP, and spending on
health and education. As shown by the group means, this discriminant function separates countries in
Europe (which score most highly) from New World and Islamic countries. The second discriminant
function is strongly positively correlated with death rate, and negatively correlated with literacy and
life expectancy. Unlike the first discriminant function, the second is negatively correlated with percent
of population living in cities. As shown by the group means, the second discriminant function serves
to separate the Islamic countries (which score the highest) from the New World countries.



                             Total Canonical Structure
                  Variable                Can1                  Can2
                  citypop             0.644525             -0.427281
                  birth              -0.912000              0.351380
                  death              -0.140933              0.752922
                  infdeath           -0.794393              0.453549
                  gdppc               0.860334              0.132150
                  educ                0.678699              0.124903
                  health              0.757760              0.127102
                  military            0.498877              0.368253
                  lifex_m             0.719396             -0.477567
                  lifex_f             0.779220             -0.465591
                  literacy            0.773787             -0.560221

                        Class Means on Canonical Variables
                  group                  Can1              Can2
                  Europe          3.381723396       0.411490370
                  Islamic        -2.719775968       1.462927569
                  NewWorld       -1.116957381      -1.417249074




                 Plot of Can2*Can1$group.    Symbol points to label.

  Can2 |
       |
     3 + Islamic Islamic
       |     ^     ^     > Islamic
       |    Islamic > Islamic
     2 +     ^ ^ ^ > Islamic                                   Europe
       | Islamic < > NewWorld                                    ^ Europe
       |           > Islamic                             Europe <v ^   Europe
     1 +          > Islamic                                  EuropevEurope^
       |          > Islamic                           Europe < Europ^     v
       |                                               Europe <^ ^     Europe
     0 +            NewWorld > Islamic      NewWorld 2 Europe v> Europe
       |    Islamic   ^^ > Islamic                  Europe < Europe
       |       ^ > NewWorld                             2 Europe
    -1 +                > NewWorld <        > NewWorld       > Europe
       |        NewWorld <> NewWorld NewWorld
       |       NewWorld < > NewWorld      ^     > NewWorld
    -2 +                      > NewWorld
       |         NewWorld < NewWorld      > NewWorld
       |        NewWorld <    ^     > NewWorld
    -3 +               > NewWorld
       |
       --+------+------+------+------+------+------+------+------+------+-
         -4     -3     -2       -1     0       1      2      3       4      5
                                         Can1




c) How well does linear discriminant analysis perform in correctly classifying countries by group?
To answer this question, we assume that it is appropriate to pool across groups in estimating the
within-group covariance matrix. (We address the question of whether this assumption is appropriate in
part d below).  We use the cross-validation option in SAS to assess predictive validity and calculate the
hit rate.  The result is that 45 out of 55 countries (two have missing data) are correctly classified, a hit
rate of 82 percent.  This compares favorably to the proportional chance hit rate of (19/55)² + (15/55)² +
(21/55)² = 34 percent.
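The same variables can be passed to PROC DISCRIM to obtain the hit rates in parts c and d (same
assumed data set name as above).  POOL=YES gives the linear cross-validation shown below; changing
to POOL=TEST prints the Box's test of part d and, because that test rejects, applies the quadratic rule.

 proc discrim data=world pool=yes crossvalidate;
     class group;
     priors proportional;
     var citypop birth death infdeath gdppc educ health
         military lifex_m lifex_f literacy;
 run;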



  Cross-Validation:     Number of Observations and Percent Classified into group

         From                                              New
         group            Europe       Islamic           World         Total

         Europe                18               0            1            19
                            94.74            0.00         5.26        100.00
         Islamic                0              12            3            15
                             0.00           80.00        20.00        100.00
         NewWorld               2               4           15            21
                             9.52           19.05        71.43        100.00
         Total                 20              16           19            55
                            36.36           29.09        34.55        100.00




d) Test the assumption that linear discriminant analysis is appropriate.  This we do by using Box’s
test of the equality of covariance matrices across groups.  Shown below are the log-determinants of
each group covariance matrix.  Note that the log-determinant for the Islamic group is smaller than for
the others; the differences across groups turn out to be highly significant.  This calls into question our
decision in part c to pool across groups in estimating the within-group covariance matrix.

The results from the quadratic discriminant analysis are also shown below.  Clearly, the quadratic
analysis does not outperform the linear analysis in classification accuracy when using a cross-
validation approach.  Part of the problem is that the procedure assigns too few countries to the Islamic
group and too many to the New World group.  This is because the log-determinant of the within-group
covariance matrix is smallest for the Islamic group (and largest for the New World group).



                      Within Covariance Matrix Information
                                             Natural Log of the
                               Covariance    Determinant of the
                  group       Matrix Rank     Covariance Matrix

                  Europe                 11                52.25294
                  Islamic                11                49.66450
                  NewWorld               11                55.19725
                  Pooled                 11                64.02932

                         Chi-Square            DF   Pr > ChiSq
                         412.706444           132       <.0001

  Cross-Validation (Quadratic Discriminant Function)

           Number of Observations and Percent Classified into group

         From                                              New
         group            Europe       Islamic           World         Total
         Europe               17             0               2            19
                           89.47          0.00           10.53        100.00
         Islamic               0             7               8            15
                            0.00         46.67           53.33        100.00
         NewWorld              2             1              18            21
                            9.52          4.76           85.71        100.00
         Total                19             8              28            55
                           34.55         14.55           50.91        100.00




Exercise 12.13

Marketers are often concerned with the accuracy of new product testing methods. Here is an example
data set in which each of 24 new products is tested using two methods: a concept test and a panel test.
Although the question is not posed explicitly in the exercise, it might be interesting to investigate
which of these two tests is more valuable in discriminating successful new products from failures. One
could also ask whether it is worth purchasing the second test in addition to the first (i.e., is the
improvement in discrimination worthwhile?). Of course, to answer these questions, one would need to
know something about the costs of the tests, the cost of product development, the profits from a
successful new product, and the cost of launching a failure.

The graph below shows that the combined information from the panel and concept tests clearly helps
to discriminate between successes and failures.



          Plot of p_score*c_score$success.     Symbol points to label.

     p_score |
             |
          20 +
             |
             |                                        > 1
             |                                                      > 1
             |                              > 1
          15 +         > 0
             |                      > 1                   > 1      > 1
             |                              > 1
             |             > 0               > 1 > 0           > 1
             |                  > 0              > 0     > 1
          10 +                            > 1
             |                 > 0      > 0          > 0
             |     > 0
             |                                 > 1       > 0
             |                        > 0
           5 +     > 0
             |
             ---+------------+------------+------------+------------+--
               20           40              60              80          100

                                           c_score




A single canonical discriminant function based on both concept and panel scores proves to be a
significant discriminator between successes and failures.  Wilks’s Λ = 0.562, significant at p < 0.01.
The canonical structure suggests that concept score and panel score load almost equally on the
discriminant function.



              Multivariate Statistics and Exact F Statistics
  Statistic                      Value F Value Num DF Den DF Pr > F
   Wilks' Lambda             0.56156799     8.20       2     21 0.0023
   Pillai's Trace            0.43843201     8.20       2     21 0.0023
   Hotelling-Lawley Trace    0.78072828     8.20       2     21 0.0023
   Roy's Greatest Root       0.78072828     8.20       2     21 0.0023

                           Total Canonical Structure
                            Variable                  Can1
                            c_score               0.849651
                            p_score               0.820445

                       Class Means on Canonical Variables
                           success              Can1
                                 0      -.8459713895
                                 1      0.8459713895




The performance of the discriminant function based on the two tests in correctly classifying successes
and failures is reasonably good. Due to the small sample size, there is some capitalization on chance.
Using resubstitution, the hit rate is 20/24 = 83 percent; using cross-validation, the estimate of the hit
rate drops to 16/24 = 67 percent. There is, of course, some question about the appropriate priors for
such an analysis. The literature suggests that despite best efforts to the contrary, “most new products
fail.”  In addition, there are asymmetric costs of misclassification that should be taken into account when
deciding whether a new product should be classified as a potential success or a failure.
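If one wanted to build in the belief that most new products fail, the priors could be set directly instead
of being taken from the sample; a sketch (the data set name tests and the 0.8/0.2 split are purely
illustrative):

 proc discrim data=tests pool=yes crossvalidate;
     class success;                       /* 0 = failure, 1 = success         */
     priors '0' = 0.8  '1' = 0.2;         /* illustrative prior probabilities */
     var c_score p_score;                 /* concept and panel test scores    */
 run;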



                           Linear Discriminant Function

                      Variable                0               1
                      Constant         -9.18461       -17.57076
                      c_score           0.16171         0.23381
                      p_score           0.97063         1.33842

          Resubstitution Summary using Linear Discriminant Function
          Number of Observations and Percent Classified into success

           From success              0                1       Total
                      0             10                2          12
                                 83.33            16.67      100.00
                       1             2               10          12
                                 16.67            83.33      100.00
                   Total            12               12          24
                                 50.00            50.00      100.00




         Cross-validation Summary using Linear Discriminant Function
          Number of Observations and Percent Classified into success

           From success              0                1       Total
                      0              7                5          12
                                 58.33            41.67      100.00
                       1             3                9          12
                                 25.00            75.00      100.00
                   Total            10               14          24
                                 41.67            58.33      100.00

Exercise 12.14

These data on vehicle ownership require multiple discriminant analysis because there are three
categories of ownership: car only, van only, and both car and van. We seek to discriminate among
these three categories of ownership using data on income, family size, and age of head of household.
The results of canonical discriminant analysis show that there are indeed significant differences
among these groups.  Wilks’s Λ = 0.556, which is significant at the 0.01 level.

In fact, we can form two discriminant functions, both of which are significant.  As shown in the table
below, after removing the first discriminant function, Wilks’s Λ for the second is equal to 0.797, which
is significant at the 0.05 level.  The discriminant function loadings suggest that the first function is
primarily associated with higher-income families with older heads of household.  The second function
is primarily associated with large families that tend to be younger and have lower income.
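A sketch of the calls that would produce the canonical results and the labeled plot shown below (the
data set name vehicles is an assumption; income, fam_size, and age_hh are the variable names in the
output):

 proc candisc data=vehicles out=canout;
     class own;                           /* three ownership categories (0, 1, 2) */
     var income fam_size age_hh;
 run;

 proc plot data=canout;
     plot can2*can1$own;                  /* symbol points to the ownership label */
 run;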



                 Multivariate Statistics and F Approximations

  Statistic                         Value F Value Num DF Den DF Pr > F
   Wilks' Lambda                0.55620611    3.52      6     62 0.0046
   Pillai's Trace               0.50511605    3.60      6     64 0.0039
   Hotelling-Lawley Trace       0.68764390    3.50      6 39.604 0.0071
   Roy's Greatest Root          0.43305620    4.62      3     32 0.0085

                                     Adjusted      Approximate         Squared
                    Canonical       Canonical         Standard       Canonical
                  Correlation     Correlation            Error     Correlation
            1        0.549719        0.449059         0.117951        0.302191
            2        0.450472         .               0.134730        0.202925

                  Test of H0: The canonical correlations in the
                    current row and all that follow are zero

                  Likelihood     Approximate
                       Ratio         F Value     Num DF      Den DF     Pr > F
            1     0.55620611            3.52          6          62     0.0046
            2     0.79707461            4.07          2          32     0.0265

                             Total Canonical Structure
                  Variable                Can1              Can2
                  income              0.771759         -0.631940
                  fam_size            0.362713          0.839026
                  age_hh              0.607869         -0.430959




A plot of the observations in the discriminant function space shows that the first discriminant function
separates families who own both a car and a van from families who own only a car or only a van.
Thus, it appears that older, wealthier families are more likely to own both types of vehicle.  The second
discriminant function separates families who own only a van (i.e., the larger, younger, lower-income
families) from the other two groups.


               Plot of Can2*Can1$own.    Symbol points to label.

     Can2 |
          |
        3 +
          |
          |                                      > 0
        2 +
          |                                      > 1        > 2
          |                              > 1 2             > 1
        1 +                          > 0 > 1 ^> 1
          |                         > 1 0 > 1               > 2
          |                     0 0     1 2> 2 > 1                > 2
        0 +                     ^^> 1               > 2
          |                           > 1      > 2 > 2
          |                                         > 0
       -1 +        > 0        > 0              > 2      2 > 2
          |                               0 < > 0 < ^ > 2
          |
       -2 +                                     > 0       > 0
          |
          ---+-------+-------+-------+-------+-------+-------+-------+--
            -4      -3      -2       -1       0         1       2     3
                                        Can1

                       Class Means on Canonical Variables
                      own              Can1               Can2
                        0      -.5276820278      -.4991924912
                        1      -.3413484420      0.6796258485
                        2      0.8845582686      -.0821984957




The two discriminant functions do a reasonable job of accurately classifying households into
ownership categories. Based on cross-validation, the estimated hit rate is 21/36 = 58.3 percent, which
compares favorably to the proportional chance hit rate of 33.5 percent.




                              Linear Discriminant Function

                     _      -1 _                                              -1 _
      Constant = -.5 X' COV    X + ln PRIOR              Coefficient = COV       X
                       j        j           j            Vector                   j

              Variable                 0               1                  2
              Constant         -25.98898       -27.44166          -36.75413
              income             0.61400         0.53149            0.67633
              fam_size           3.01115         3.74437            3.90843
              age_hh             0.18752         0.23891            0.26387


         Cross-validation Summary using Linear Discriminant Function
            Number of Observations and Percent Classified into own

        From own                 0             1              2          Total
               0                 7             3              3             13
                             53.85         23.08          23.08         100.00
                1                3             7              1             11
                             27.27         63.64           9.09         100.00
                2                2             3              7             12
                             16.67         25.00          58.33         100.00
           Total                12            13             11             36
                             33.33         36.11          30.56         100.00




If we test the homogeneity of within-group covariance matrices across groups, the result is marginally
significant: Box’s chi-square test gives χ²(12) = 21.4, which is significant at the 0.05 level but not at the
0.01 level.



                       Within Covariance Matrix Information

                                                   Natural Log of the
                                Covariance         Determinant of the
                       own     Matrix Rank          Covariance Matrix
                         0               3                    9.40027
                         1               3                    8.10301
                         2               3                    8.79143
                    Pooled               3                    9.54972

              Test of Homogeneity of Within Covariance Matrices
                       Chi-Square        DF    Pr > ChiSq
                        21.351007        12        0.0455

      Since the Chi-Square value is significant at the 0.1 level, the
      within covariance matrices will be used in the discriminant
      function.

If we run a quadratic discriminant analysis, we find that the fitted classification accuracy (based on
resubstitution) improves over the linear analysis, but the predictive accuracy actually declines to a hit
rate of 16/36 = 44.4 percent.  Thus, even though the test suggests that there are differences across
groups in their covariance structure, we are better off (from a classification accuracy standpoint)
pooling our estimates across groups and using linear (rather than quadratic) discriminant analysis.



         Resubstitution Summary using Quadratic Discriminant Function
            Number of Observations and Percent Classified into own

       From own               0             1              2         Total
              0               9             3              1            13
                          69.23         23.08           7.69        100.00
               1              2             8              1            11
                          18.18         72.73           9.09        100.00
               2              2             2              8            12
                          16.67         16.67          66.67        100.00
           Total             13            13             10            36
                          36.11         36.11          27.78        100.00


       Cross-validation Summary using Quadratic Discriminant Function
           Number of Observations and Percent Classified into own

       From own               0             1              2         Total
              0               5             5              3            13
                          38.46         38.46          23.08        100.00
               1              5             4              2            11
                          45.45         36.36          18.18        100.00
               2              2             3              7            12
                          16.67         25.00          58.33        100.00
           Total             12            12             12            36
                          33.33         33.33          33.33        100.00


Exercise 12.15

Once again, we revisit the famous Iris data.  The data set consists of 50 specimens of each of three
different species of iris: Iris setosa (1), Iris versicolor (2), and Iris virginica (3). Four characteristics
of each are measured: sepal length, sepal width, petal length, and petal width.

The results from a multiple discriminant analysis indicate highly significant differences across species
of iris.  Wilks’s Λ = 0.02, which is significant at p < 0.0001.  In fact, both discriminant functions are
significant.  For the second function alone, Wilks’s Λ = 0.78, also significant at p < 0.0001.  However,
while both are significant, the first is by far the more important, accounting for much more of the
variance in the data. The first function, which is highly correlated with all but sepal width, clearly
separates Iris setosa (1) from Iris versicolor (2) and Iris virginica (3): the difference in means is quite
large. By contrast, the second function (which is correlated with sepal width) provides a subtle
distinction between (2) and (3): the difference in group means is about 1.2 (versus 4.0 on the first
function).



                                       Adjusted      Approximate            Squared
                    Canonical         Canonical         Standard          Canonical
                  Correlation       Correlation            Error        Correlation
            1        0.984821          0.984508         0.002468           0.969872
            2        0.471197          0.461445         0.063734           0.222027

                  Test of H0: The canonical correlations in the
                    current row and all that follow are zero

                  Likelihood      Approximate
                       Ratio          F Value       Num DF      Den DF     Pr > F
            1     0.02343863           199.15            8         288     <.0001
            2     0.77797337            13.79            3         145     <.0001

                             Total Canonical Structure
                  Variable                Can1                   Can2
                  sep_l               0.791888               0.217593
                  sep_w              -0.530759               0.757989
                  pet_l               0.984951               0.046037
                  pet_w               0.972812               0.222902

                       Class Means on Canonical Variables
                  species              Can1               Can2
                        1      -7.607599927       0.215133017
                        2       1.825049490      -0.727899622
                        3       5.782550437       0.512766605




              Plot of Can2*Can1$species.      Symbol points to label.

              Can2 |
                   |
                 3 +
                    |     > 1                             > 3
                    |                                  3 <> 3
                    |                                   > 3 > 3
                2   +                               3 < > 3
                    | 1 < > 1                          > 3 3
                    |1 < 1 > 1                      3 > 3 2 3
                    | 1 < ^^ > 1               > 2 ^^> 3
                1   +   1 <v > 1                  > 2 3
                    |   1 22 1                  > 2 ^2 2 3
                    |   1 <2^^> 1              > 2 2 3 v> 3
                    |   1 <2 > 1             2 < ^^^^3 > 3
                0   +     1 <2^1          2 <^^^>22 ^^3> 3
                    |     1 <32 1           2 2>22 2 v2 3 > 3
                    |       1 2> 1         2 24 > 2 > 3
                    |      1 <5 1         > 2 >^2 ^ > 3       > 3
               -1   +      1 2^> 1         2 < v2 v> 2 ^ > 3
                    |                     > 2 <> 2 3
                    |                    2 < 2 3 > 2 > 3
                    |                     2 <3 2 > 2
               -2   +            > 1        > 2 > 2
                    |                                 2 3
                    |
                    |                        2 2
               -3   +
                    |
                    ---+---------+---------+---------+---------+--
                      -10        -5        0          5         10

                                          Can1




According to Box’s test, we reject the homogeneity of covariance matrices across groups. However,
unlike previous examples in this chapter, the classification accuracy of the quadratic discriminant
function (hit rate of 97 percent based on holdout validation) is closely comparable to that of the linear
discriminant function (hit rate of 98 percent based on holdout validation).



                    Linear Discriminant Function for species

              Variable              1               2             3
              Constant      -86.30847       -72.85261    -104.36832
              sep_l          23.54417        15.69821      12.44585
              sep_w          23.58787         7.07251       3.68528
              pet_l         -16.43064         5.21145      12.76654
              pet_w         -17.39841         6.43423      21.07911

  Cross-Validation:    Number of Observations and Percent Classified into species

     From species             1             2            3         Total
                1            50             0            0            50
                         100.00          0.00         0.00        100.00
               2              0            48            2            50
                           0.00         96.00         4.00        100.00
               3              0             1           49            50
                           0.00          2.00        98.00        100.00
          Total             50           49           51            150
                         33.33        32.67        34.00         100.00




                      Within Covariance Matrix Information
                                            Natural Log of the
                              Covariance    Determinant of the
                  species    Matrix Rank     Covariance Matrix
                        1              4             -13.06736
                        2              4             -10.87433
                        3              4              -8.92706
                   Pooled              4              -9.95854

             Test of Homogeneity of Within Covariance Matrices
                      Chi-Square        DF    Pr > ChiSq
                      140.943050        20        <.0001


  Cross-validation:    Number of Observations and Percent Classified into species
   From species              1            2            3        Total
              1             50            0            0           50
                        100.00         0.00         0.00       100.00
              2              0           47            3           50
                          0.00        94.00         6.00       100.00
              3              0            1           49           50
                          0.00         2.00        98.00       100.00
          Total             50           48           52          150
                         33.33        32.00        34.67       100.00
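
For readers who want to re-create this comparison outside SAS, the following is a minimal sketch
assuming scikit-learn and its built-in iris data (whose four measurements correspond to the sep_l,
sep_w, pet_l, and pet_w variables above); the leave-one-out hit rates it prints should be close to the
cross-validation percentages in the tables.

    # A sketch only, not the original SAS discriminant-analysis run.
    from sklearn.datasets import load_iris
    from sklearn.discriminant_analysis import (
        LinearDiscriminantAnalysis,
        QuadraticDiscriminantAnalysis,
    )
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    X, y = load_iris(return_X_y=True)

    for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                      ("QDA", QuadraticDiscriminantAnalysis())]:
        # Leave-one-out cross-validated hit rate, analogous to the
        # cross-validation tables above.
        hit_rate = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
        print(f"{name} cross-validated hit rate: {hit_rate:.3f}")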




13. Logit Choice Models

Exercise 13.1

 a.    We first calibrate a logit choice model conditional on making a purchase in the category on
       the trip. That is, we only account for trips with a purchase in the category. The model is

$$S_{1t} = \frac{1}{1 + \exp\left(-\alpha - \beta\,(price_{1t} - price_{2t})\right)}$$


 Model Fit Statistics
                                   Intercept
                   Intercept          and
 Criterion           Only         Covariates
 AIC                1718.496        1311.916
 SC                 1723.621        1322.167
 -2 Log L           1716.496        1307.916

         Testing Global Null Hypothesis: BETA=0
 Test                 Chi-Square       DF     Pr > ChiSq
 Likelihood Ratio       408.5798        1         <.0001
 Score                  382.6007        1         <.0001
 Wald                   329.0601        1         <.0001

                 Analysis of Maximum Likelihood Estimates
                                   Standard          Wald
 Parameter       DF    Estimate       Error    Chi-Square                  Pr > ChiSq
 Intercept        1     -0.2596      0.0693       14.0435                      0.0002
 Coke_rp          1     -1.2234      0.0674      329.0601                      <.0001
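

To make the mapping from prices to the conditional choice probability concrete, here is a minimal
Python sketch (not the original estimation program) that plugs the estimates reported above into the
logistic formula; the function and variable names are ours, and Coke_rp is treated as the
Coke-minus-Pepsi price difference, consistent with the model equation above.

    import math

    ALPHA = -0.2596   # intercept estimate from the output above
    BETA = -1.2234    # coefficient on the Coke-minus-Pepsi price difference (Coke_rp)

    def coke_share(price_coke, price_pepsi):
        """Conditional probability of choosing Coke, given a category purchase."""
        return 1.0 / (1.0 + math.exp(-(ALPHA + BETA * (price_coke - price_pepsi))))

    print(coke_share(4.33, 4.33))   # equal prices: share driven by the intercept (~0.44)
    print(coke_share(3.24, 4.33))   # Coke priced lower: share rises to about 0.75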



 b.    For the conditional logit choice model calibrated in part a, the formula for the price
       elasticity for Coke is

$$\eta_{1t} = \hat{\beta}\,(1 - S_{1t})\,price_{1t}$$

 The estimated elasticity for each week is listed in the table below (column eta):


 Obs      week     Coke_p         eta         Pep_p          cv              eta2
   1        1       4.33       -2.99052       4.33        -4.72561         -5.03855
   2        2       4.33       -2.99052       4.33        -4.72561         -5.03855
   3        3       4.33       -2.99052       4.33        -4.72561         -5.03855
   4        4       3.24       -1.00941       4.33        -3.92952         -3.40798
   5        5       3.24       -1.00941       4.33        -3.92952         -3.40798
   6        6       4.33       -4.40235       3.24        -3.77878         -5.11123
   7        7       4.33       -4.40235       3.24        -3.77878         -5.11123
   8        8       3.24       -1.00941       4.33        -3.92952         -3.40798
    9          9      3.24         -1.00941    4.33        -3.92952     -3.40798
   10         10      3.24         -1.00941    4.33        -3.92952     -3.40798
   11         11      4.33         -4.38400    3.26        -3.79908     -5.10995
   12         12      4.33         -4.38400    3.26        -3.79908     -5.10995
   13         13      3.24         -1.00941    4.33        -3.92952     -3.40798
   14         14      4.33         -4.40235    3.24        -3.77878     -5.11123
   15         15      4.33         -4.40235    3.24        -3.77878     -5.11123
   16         16      3.24         -1.00941    4.33        -3.92952     -3.40798
   17         17      3.24         -1.00941    4.33        -3.92952     -3.40798
   18         18      4.33         -4.39321    3.25        -3.78894     -5.11059
   19         19      3.15         -0.90303    4.33        -3.84632     -3.26703
   20         20      4.33         -4.30764    3.34        -3.87938     -5.10482
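

The eta values above can be reproduced directly from this elasticity formula; a minimal self-contained
sketch using the part-a estimates (illustrative, not the original program):

    import math

    ALPHA = -0.2596   # intercept from part a
    BETA = -1.2234    # price-difference coefficient from part a

    def conditional_elasticity(price_coke, price_pepsi):
        """Own-price elasticity of the conditional Coke share: beta * (1 - S1t) * price."""
        share = 1.0 / (1.0 + math.exp(-(ALPHA + BETA * (price_coke - price_pepsi))))
        return BETA * (1.0 - share) * price_coke

    print(conditional_elasticity(4.33, 4.33))   # about -2.99, matching eta for week 1
    print(conditional_elasticity(3.24, 4.33))   # about -1.01, matching eta for week 4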



  c. We build a model that takes into account both the purchase-incidence probability and the
  conditional brand choice. That is:

$$P(coke) = P(coke \mid c)\,P(c) = \frac{1}{1 + \exp\left(-\alpha - \beta\,(price_{1t} - price_{2t})\right)} \cdot \frac{1}{1 + \exp\left(-a - \gamma\,cv\right)}$$

  where $cv = \ln\left(\exp(\alpha + \beta\,price_{1t}) + \exp(\beta\,price_{2t})\right)$.
  After some calculation, the derivative of the purchase probability with respect to price is:

$$\begin{aligned}
\partial P(coke)/\partial price_{1t}
  &= \frac{\partial P(coke \mid c)}{\partial price_{1t}}\,P(c) + \frac{\partial P(c)}{\partial price_{1t}}\,P(coke \mid c) \\
  &= \beta\,P(coke \mid c)\,(1 - P(coke \mid c))\,P(c) + \beta\,\gamma\,P(coke \mid c)^{2}\,P(c)\,(1 - P(c)) \\
  &= \beta\,P(c)\,P(coke \mid c)\,\bigl[\,1 - P(coke \mid c) + \gamma\,(1 - P(c))\,P(coke \mid c)\,\bigr]
\end{aligned}$$

  The elasticity in this case is then

$$\eta_{2t} = \frac{\partial P(coke)}{\partial price_{1t}} \cdot \frac{price_{1t}}{P(coke)}.$$

  Because the category-purchase probability now also responds to price, the price elasticity is larger
  in magnitude than before (eta is the conditional price elasticity, eta2 the unconditional price
  elasticity).
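
The derivative and elasticity above can be coded directly. In the sketch below the incidence-model
intercept and inclusive-value coefficient (a and gamma) are placeholders, because those estimates are
not reproduced in this answer, so the printed value is illustrative only.

    import math

    ALPHA, BETA = -0.2596, -1.2234   # brand-choice estimates from part a
    A_INC, GAMMA = 0.0, 1.0          # placeholder purchase-incidence parameters

    def unconditional_elasticity(price_coke, price_pepsi):
        """Own-price elasticity of P(coke) = P(coke | c) * P(c)."""
        s_cond = 1.0 / (1.0 + math.exp(-(ALPHA + BETA * (price_coke - price_pepsi))))
        cv = math.log(math.exp(ALPHA + BETA * price_coke) + math.exp(BETA * price_pepsi))
        p_c = 1.0 / (1.0 + math.exp(-(A_INC + GAMMA * cv)))     # purchase incidence
        d_p = BETA * p_c * s_cond * ((1.0 - s_cond) + GAMMA * s_cond * (1.0 - p_c))
        return d_p * price_coke / (s_cond * p_c)

    print(unconditional_elasticity(4.33, 4.33))   # illustrative value only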


Exercise 13.2

a. We estimate a simple binary logit model with age, income and mobility as explanatory variables.


            Model Fit Statistics
                                       Intercept
                      Intercept           and
  Criterion             Only          Covariates
  AIC                   139.628          114.107
  SC                 142.233           124.528
  -2 Log L           137.628           106.107

          Testing Global Null Hypothesis: BETA=0
  Test                 Chi-Square       DF     Pr > ChiSq
  Likelihood Ratio        31.5207        3         <.0001
  Score                   27.9401        3         <.0001
  Wald                    21.5713        3         <.0001

  Analysis of Maximum Likelihood Estimates
                                 Standard                 Wald
  Parameter    DF    Estimate       Error           Chi-Square    Pr > ChiSq
  Intercept     1      0.2810      0.4312               0.4246        0.5147
  age           1      1.0691      0.5550               3.7105        0.0541
  income        1     -2.2221      0.5430              16.7481        <.0001
  mobility      1     -0.9101      0.5218               3.0420        0.0811




As we can see, the model is jointly significant. Age and mobility are significant at the 10 percent
level but not at the 5 percent level; income is highly significant.
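
A minimal sketch of an equivalent fit, assuming statsmodels and pandas; the file name and the response
column name (choice) are illustrative, not the actual data set.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical file and column names; substitute the actual Exercise 13.2 data.
    data = pd.read_csv("exercise_13_2.csv")

    # Binary logit with age, income, and mobility as explanatory variables.
    fit = smf.logit("choice ~ age + income + mobility", data=data).fit()
    print(fit.summary())   # coefficients, standard errors, and Wald tests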

b. We perform a stepwise model selection for models with interaction terms. As we can see from the
output, none of the terms remains significant when all of the interaction terms are added.


  Analysis of Maximum Likelihood Estimates

                                         Standard          Wald
  Parameter           DF    Estimate        Error    Chi-Square    Pr > ChiSq
  Intercept            1      0.2777       0.4705        0.3483        0.5551
  age                  1      0.7576       0.6791        1.2444        0.2646
  income               1     -1.7300       1.2267        1.9891        0.1584
  mobility             1     -0.7703       1.3378        0.3316        0.5647
  age*income           1      0.3109       1.3886        0.0501        0.8228
  age*mobility         1      0.4076       1.4206        0.0823        0.7742
  income*mobility      1     -1.4765       1.1666        1.6019        0.2056



The stepwise selection procedure chooses the model with age, income, and mobility as explanatory
variables. There do not appear to be any significant interaction effects; once all of the interaction
terms are added, even the income effect is no longer significant.
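
A sketch of the corresponding fit with all pairwise interaction terms added, using the same
illustrative file and column names as in part a:

    import pandas as pd
    import statsmodels.formula.api as smf

    data = pd.read_csv("exercise_13_2.csv")   # hypothetical file name, as in part a

    # Main effects plus all pairwise interaction terms.
    full = smf.logit(
        "choice ~ age + income + mobility"
        " + age:income + age:mobility + income:mobility",
        data=data,
    ).fit()
    print(full.summary())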


Exercise 13.3

a. As we can see from the estimation results, after controlling for the effect of gun ownership there
is no statistically significant relationship between age and voting behavior; the age coefficient is
negative but not statistically significant.


     Analysis of Maximum Likelihood Estimates
                                 Standard                Wald
    Parameter      DF     Estimate        Error      Chi-Square      Pr > ChiSq
    Intercept       1      17.0683      11.6918          2.1312          0.1443
    age             1      -0.4372       0.3081          2.0134          0.1559
    gun_own         1       0.5301       0.9835          0.2905          0.5899



b. With our logit model, we use the following classification rule: if the predicted probability that a
subject votes for gun control is greater than 0.5, we classify the subject as voting = 1. Given this
rule, we perform a cross-validation. For the original data, we correctly predict the behavior of 12 of
the 15 subjects. The hit rate obtained from a simple linear discriminant analysis and its predictions
is the same.
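
A sketch of this classification rule, assuming statsmodels and illustrative file and column names
(voting, age, gun_own):

    import pandas as pd
    import statsmodels.formula.api as smf

    gun = pd.read_csv("exercise_13_3.csv")   # hypothetical file name

    # Fit the logit from part a, then apply the 0.5 rule to the estimation sample.
    fit = smf.logit("voting ~ age + gun_own", data=gun).fit()
    predicted = (fit.predict(gun) > 0.5).astype(int)
    hits = int((predicted == gun["voting"]).sum())
    print(f"hit rate: {hits} / {len(gun)}")   # the text reports 12 of 15 correct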


Exercise 13.5

a. We estimate logit models with brand dummies only, and with price and display added to the brand
dummies as covariates. The output is as follows:


                              Model Fit Statistics
                                      Without           With
                    Criterion      Covariates     Covariates
                    -2 LOG L           87.889         87.542
                    AIC                87.889         91.542
                    SBC                87.889         94.920

                   Testing Global Null Hypothesis: BETA=0
           Test                 Chi-Square       DF     Pr > ChiSq
           Likelihood Ratio         0.3466        2         0.8409
           Score                    0.3500        2         0.8395
           Wald                     0.3489        2         0.8399

                    Analysis of Maximum Likelihood Estimates
                    Parameter    Standard                                  Hazard
   Variable   DF     Estimate       Error Chi-Square Pr > ChiSq             Ratio
   dummya      1      0.14310     0.37893      0.1426      0.7057           1.154
   dummyb      1     -0.08005     0.40032      0.0400      0.8415           0.923




                              Model Fit Statistics
                                     Without           With
                    Criterion      Covariates     Covariates
                    -2 LOG L           87.889         68.241
                    AIC                87.889         76.241
                    SBC                87.889         82.997

                   Testing Global Null Hypothesis: BETA=0
           Test                 Chi-Square       DF     Pr > ChiSq
           Likelihood Ratio        19.6475        4         0.0006
           Score                   19.0897        4         0.0008
           Wald                    14.1197        4         0.0069

                    Analysis of Maximum Likelihood Estimates
                     Parameter       Standard                                   Hazard
     Variable   DF    Estimate          Error    Chi-Square     Pr > ChiSq       Ratio
     dummya      1     2.98015        0.95903        9.6564         0.0019      19.691
     dummyb      1     0.91253        0.56518        2.6069         0.1064       2.491
     price       1    -5.49366        1.70215       10.4166         0.0012       0.004
     disp        1     0.72674        0.68873        1.1134         0.2913       2.068



The information added by price and display can be computed using the following statistic:

$$\rho^{2} = 1 - \frac{LL_r}{LL_0} = 1 - \frac{68.241}{87.542} \approx 0.22$$

That is, about 22 percent of the uncertainty in choice under the model with brand dummies only is
explained by the full model with price and display covariates. This is not a bad fit.

To test whether price and display are significant in explaining the choice behavior of the household,
we use the likelihood-ratio test; that is, $\chi^{2}(2) = -2(LL_r - LL_f) = 87.542 - 68.241 = 19.301$,
which is significant at the 0.05 level. This is consistent with what we estimated from the $\rho^{2}$
statistic. When we look at price and display separately, the Wald statistic for price is highly
significant, while the Wald statistic for display is not. But the $\chi^{2}$ test says that they are
jointly significant.
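
Both the ρ² measure and the likelihood-ratio test can be recomputed directly from the −2 log L values
in the output; a minimal sketch assuming SciPy:

    from scipy import stats

    neg2ll_base = 87.542   # -2 log L, brand dummies only
    neg2ll_full = 68.241   # -2 log L, brand dummies plus price and display

    rho_sq = 1.0 - neg2ll_full / neg2ll_base   # McFadden-style rho-squared
    lr_stat = neg2ll_base - neg2ll_full        # likelihood-ratio chi-square, 2 df
    p_value = stats.chi2.sf(lr_stat, df=2)

    print(f"rho-squared      = {rho_sq:.3f}")              # about 0.22
    print(f"LR chi-square(2) = {lr_stat:.3f}, p = {p_value:.5f}")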



b. The probability that each brand is chosen, given price = [1.50, 0.75, 0.80] and disp = 0, is

$$S_A = \frac{\exp(a + \beta\,price_A)}{\exp(a + \beta\,price_A) + \exp(b + \beta\,price_B) + \exp(\beta\,price_C)} \approx 0.08957$$

$$S_B = \frac{\exp(b + \beta\,price_B)}{\exp(a + \beta\,price_A) + \exp(b + \beta\,price_B) + \exp(\beta\,price_C)} \approx 0.69761$$

$$S_C = \frac{\exp(\beta\,price_C)}{\exp(a + \beta\,price_A) + \exp(b + \beta\,price_B) + \exp(\beta\,price_C)} \approx 0.21282$$

where a and b are the coefficients on dummya and dummyb and $\beta$ is the price coefficient. When the
price of brand A changes from 1.50 to 1.20, the probabilities that each brand will be chosen become
(0.33832, 0.50701, 0.15467).
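
These shares follow directly from the multinomial logit formula with the part-a estimates; a minimal
sketch (variable names are ours):

    import math

    A_DUM, B_DUM = 2.98015, 0.91253   # brand intercepts (dummya, dummyb) from part a
    B_PRICE = -5.49366                # price coefficient from part a

    def shares(prices):
        """Multinomial logit choice probabilities for brands A, B, and C (disp = 0)."""
        utils = [A_DUM + B_PRICE * prices[0],
                 B_DUM + B_PRICE * prices[1],
                         B_PRICE * prices[2]]
        expu = [math.exp(u) for u in utils]
        total = sum(expu)
        return [e / total for e in expu]

    print(shares([1.50, 0.75, 0.80]))   # about (0.090, 0.698, 0.213)
    print(shares([1.20, 0.75, 0.80]))   # about (0.338, 0.507, 0.155)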


Exercise 13.6

a. Simple logistic regression shows that the order of the samples does not really matter for the
rating:


                              Model Fit Statistics
                                         Intercept
                          Intercept         and
       Criterion            Only        Covariates
       AIC                  278.759        278.328
       SC                   282.057        284.924
       -2 Log L             276.759        274.328

               Testing Global Null Hypothesis: BETA=0
       Test                 Chi-Square       DF     Pr > ChiSq
       Likelihood Ratio         2.4310        1         0.1190
       Score                    2.4261        1         0.1193
       Wald                     2.4161        1         0.1201

                     Analysis of Maximum Likelihood Estimates
                                       Standard          Wald
       Parameter     DF    Estimate       Error    Chi-Square          Pr > ChiSq
       Intercept      1      0.1201      0.2004        0.3596              0.5487
       ord            1     -0.4429      0.2849        2.4161              0.1201



     The coefficient for the sample-order factor is not significant, and the model as a whole is not
     significant either. The probability being modeled is that of choosing 45.

     b. Adding the attribute ratings to the model does not change the overall preference.


       Model Fit Statistics
                                         Intercept
                          Intercept         and
       Criterion            Only        Covariates
       AIC                  278.759        275.712
       SC                   282.057        321.888
       -2 Log L             276.759        247.712

               Testing Global Null Hypothesis: BETA=0
       Test                 Chi-Square       DF     Pr > ChiSq
       Likelihood Ratio        29.0471       13         0.0064
       Score                   26.6749       13         0.0138
       Wald                    23.0145       13         0.0415

                                        Standard          Wald
       Parameter     DF      Estimate         Error    Chi-Square      Pr > ChiSq
       Intercept      1        1.3350        1.7544        0.5791          0.4467
       ord            1       -0.4758        0.3151        2.2808          0.1310
       rate271        1       -0.3982        0.1656        5.7805          0.0162
       rate272        1        0.4840        0.1806        7.1846          0.0074
       rate273        1       -0.6150        0.3300        3.4732          0.0624
       rate274        1        0.3954        0.2536        2.4311          0.1189
       rate275        1       -0.2127        0.2019        1.1095          0.2922
       rate276        1        0.0970        0.2465        0.1549          0.6939
       rate451        1       -0.2833        0.1561        3.2938          0.0695
       rate452        1       -0.1333        0.2012        0.4389          0.5077
       rate453        1        0.1378        0.2857        0.2326          0.6296
  rate454         1      -0.3989        0.2107          3.5860          0.0583
  rate455         1       0.4548        0.1954          5.4188          0.0199
  rate456         1       0.2289        0.2299          0.9911          0.3195




c. To see the contribution of the attribute ratings to preference formation, we use the following
statistic:

$$\rho^{2} = 1 - \frac{LL_r}{LL_0} = 1 - \frac{247.712}{274.328} \approx 0.097$$

This statistic also says that adding the attribute ratings does not contribute much to the model.

								