Answers to Exercises

10. Structural Equation Models with Latent Variables

Exercise 10.2

I went back to the original article by Crosby, Evans and Cowles (1990) and obtained the standard
deviations for the variables examined in this exercise. This makes it possible to analyze the covariance
matrix rather than the correlation matrix. The data are shown below (in the format used by AMOS):

row    var
type_ name_     Y1       Y2       Y3       Y4       X1       X2       X3       X4       X5      X6
n               151      151      151      151      151      151      151      151      151    151
corr     Y1     1.00
corr     Y2     0.63     1.00
corr     Y3     0.28     0.22     1.00
corr     Y4     0.23     0.24     0.51     1.00
corr     X1     0.38     0.33     0.29     0.20     1.00
corr     X2     0.42     0.28     0.36     0.39     0.57     1.00
corr     X3     0.37     0.30     0.39     0.29     0.48     0.59     1.00
corr     X4     0.30     0.36     0.21     0.18     0.15     0.29     0.30     1.00
corr     X5     0.45     0.37     0.31     0.39     0.29     0.41     0.35     0.44    1.00
corr     X6     0.56     0.56     0.24     0.29     0.18     0.33     0.30     0.46    0.63 1.00
stddev          0.78     1.32     0.83     1.01     1.36     1.09     1.28     0.73     1.29   1.17
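For readers working in other packages, the conversion from the correlation matrix and standard deviations to a covariance matrix can be sketched in Python (a hand check, not part of the AMOS input; only the first three variables are shown):

```python
# Convert correlations to covariances: cov(a, b) = corr(a, b) * sd(a) * sd(b).
# Entries for Y1-Y3 are copied from the table above (lower triangle only).
corr = {("Y1", "Y1"): 1.00,
        ("Y2", "Y1"): 0.63, ("Y2", "Y2"): 1.00,
        ("Y3", "Y1"): 0.28, ("Y3", "Y2"): 0.22, ("Y3", "Y3"): 1.00}
sd = {"Y1": 0.78, "Y2": 1.32, "Y3": 0.83}

def covariance(a, b):
    # Look up the correlation in either triangle, then rescale.
    r = corr[(a, b)] if (a, b) in corr else corr[(b, a)]
    return r * sd[a] * sd[b]

cov_y1_y2 = covariance("Y1", "Y2")   # 0.63 * 0.78 * 1.32
var_y1 = covariance("Y1", "Y1")      # 0.78 ** 2
```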

The measurement equations and structural equations for the model of salesperson service outcomes are
shown below. The command syntax is taken from AMOS. There are two independent constructs
("similarity," measured by X1, X2, and X3, and "interaction," measured by X4, X5, and X6) and one
dependent construct ("attitude," measured by Y1 and Y2) in the model. The total number of observed
variances and covariances available to estimate the model parameters is 36; the total number of
parameters estimated is 19 (four free factor loadings, six error variances, and two factor variances for
the two independent constructs; one covariance between the independent constructs; two structural
equation coefficients; one variance for the error term in the structural equation; and one free factor
loading and two error variances for the dependent construct). Thus, there are 17 degrees of freedom
associated with the model specified below.
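The degrees-of-freedom arithmetic can be verified directly (a sketch; the count of 19 free parameters is taken from the AMOS output below):

```python
# Eight observed variables enter this model: Y1, Y2, and X1-X6
# (Y3 and Y4 appear in the data table but are not used in Exercise 10.2).
p = 8
n_moments = p * (p + 1) // 2   # distinct variances and covariances: 36
n_params = 19                  # free parameters itemized above
df = n_moments - n_params      # 17 degrees of freedom
```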

Measurement and Structural Equations for 10.2:
"Y1 = (1) attitude + (1) eps1"
"Y2 =     attitude + (1) eps2"

"X1 = (1) similarity + (1) eps3"
"X2 =     similarity + (1) eps4"
"X3 =     similarity + (1) eps5"

"X4 = (1) interaction + (1) eps6"
"X5 =     interaction + (1) eps7"
"X6 =     interaction + (1) eps8"

"attitude = similarity + interaction + (1) zeta"

The goodness-of-fit statistics for the calibrated model are shown below. The chi-square test is
significant at the p = 0.025 level: χ²(17) = 30.14. Strictly speaking, this suggests we should reject the
model (i.e., if this were in fact the true model, we would expect a fit this discrepant less than three
percent of the time). However, both GFI (0.95) and AGFI (0.90) indicate good levels of fit. Thus, we
are inclined to accept the model fit.

Summary of models
Model   NPAR         CMIN      DF            P        CMIN/DF
----------------   ----    ---------      --    ---------      ---------
Default model     19       30.144      17        0.025          1.773
Saturated model     36        0.000       0
Independence model      8      468.748      28        0.000        16.741

Model          RMR          GFI           AGFI           PGFI
----------------   ----------   ----------     ----------     ----------
Default model        0.061        0.954          0.903          0.451
Saturated model        0.000        1.000
Independence model        0.475        0.453          0.297         0.352

The parameter estimates (non-standardized, estimated from the covariance matrix) are shown below.
The column "C.R." reports the critical ratio, given by the estimate divided by its standard error (S.E.).
This is effectively a z-score, since the maximum likelihood routine provides asymptotic standard
errors. These results suggest (and the standardized solution confirms) that "interaction" has a greater
impact on "attitude" (in terms of explaining variance in the construct) than "similarity."

Maximum Likelihood Estimates and Standard Errors:
Regression Weights:                         Estimate       S.E.         C.R.     Label
-------------------                         --------     -------      -------   -------
attitude <-------- similarity          0.177       0.071        2.489
attitude <------- interaction          1.042       0.213        4.891
Y1 <---------------- attitude          1.000
Y2 <---------------- attitude          1.537       0.183       8.385
X1 <-------------- similarity          1.000
X2 <-------------- similarity          0.967       0.128       7.530
X3 <-------------- similarity          0.986       0.137       7.185
X4 <------------- interaction          1.000
X5 <------------- interaction          2.356       0.380       6.198
X6 <------------- interaction          2.487       0.384       6.472

Variances:                                  Estimate       S.E.         C.R.     Label
----------                                  --------     -------      -------   -------
similarity         0.858       0.200        4.281
interaction         0.162       0.048        3.379
zeta         0.146       0.044        3.350
eps1         0.185       0.044        4.192
eps2         0.741       0.125        5.920
eps3         0.979       0.144        6.799
eps4         0.377       0.088        4.265

eps5          0.793       0.124       6.370
eps6          0.368       0.046       7.919
eps7          0.756       0.117       6.444
eps8          0.360       0.093       3.888

Covariances:                                 Estimate      S.E.           C.R.     Label
------------                                 --------    -------        -------   -------
similarity <----> interaction           0.191      0.051          3.728

Standardized Solution:
Regression Weights:                          Estimate
--------------------------------             --------
attitude <-------- similarity           0.254
attitude <------- interaction           0.647
Y1 <---------------- attitude           0.833
Y2 <---------------- attitude           0.756
X1 <-------------- similarity           0.683
X2 <-------------- similarity           0.825
X3 <-------------- similarity           0.716
X4 <------------- interaction           0.552
X5 <------------- interaction           0.737
X6 <------------- interaction           0.857

Correlations:                                Estimate
-------------                                --------
similarity <----> interaction           0.513
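As a check, the standardized structural coefficients and the factor correlation can be recovered from the unstandardized estimates above (a hand calculation; small discrepancies reflect rounding in the printed output):

```python
import math

# Unstandardized estimates from the AMOS output above.
g_sim, g_int = 0.177, 1.042      # structural coefficients
var_sim, var_int = 0.858, 0.162  # factor variances
cov_si = 0.191                   # covariance between the factors
var_zeta = 0.146                 # structural disturbance variance

# Model-implied variance of the dependent construct "attitude".
var_att = (g_sim**2 * var_sim + g_int**2 * var_int
           + 2 * g_sim * g_int * cov_si + var_zeta)

# Standardized path = unstandardized path * sd(predictor) / sd(outcome).
std_sim = g_sim * math.sqrt(var_sim) / math.sqrt(var_att)   # ~0.254
std_int = g_int * math.sqrt(var_int) / math.sqrt(var_att)   # ~0.647

# Factor correlation from the covariance.
corr_si = cov_si / (math.sqrt(var_sim) * math.sqrt(var_int))  # ~0.513
```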

For those using SAS, I have included the relevant portions of the SAS output for the purposes of
comparison with AMOS.

Results from PROC CALIS Model in SAS:
Goodness of Fit Index (GFI)                                     0.9542
GFI Adjusted for Degrees of Freedom (AGFI)                      0.9030
Root Mean Square Residual (RMR)                                 0.0448
Parsimonious GFI (Mulaik, 1989)                                 0.5793
Chi-Square                                                     30.1438
Chi-Square DF                                                       17
Pr > Chi-Square                                                 0.0253

Standardized Solution:
Manifest Variable Equations:
y1      =   0.8329 f_sales     +   0.5535 e1
y2      =   0.7564*f_sales     +   0.6541 e2
                c2
x1      =   0.6835*f_sim       +   0.7300 e3
                c3
x2      =   0.8249*f_sim       +   0.5653 e4
                c4
x3      =   0.7162*f_sim       +   0.6979 e5
                c5
x4      =   0.5524*f_int       +   0.8336 e6
                c6
x5      =   0.7366*f_int       +   0.6764 e7
                c7
x6      =   0.8573*f_int       +   0.5149 e8
                c8

Latent Variable Equations:
f_sales =   0.2536*f_sim      +   0.6467*f_int     +   0.5910 d
g1                    g2

Correlations Among Exogenous Variables:
Var1    Var2    Parameter             Estimate
f_sim   f_int   phi                    0.51311

Exercise 10.4

Using AMOS, it is possible to perform a multiple group analysis on these data (where one group is
comprised of 154 men and the other group is comprised of 125 women). We begin by specifying a
model of the impact of personableness (denoted "person") and quality of argument (denoted "quality")
on the perceived outcome of the debate (denoted "success"). We allow all coefficients of the model to
differ across groups; that is, we allow different factor loadings, error variances, path coefficients, etc.,
for men and women. The equations describing this model are shown below:

Measurement and Structural Equations for Two Groups:

Men:
"S1 = (1) success + (1) eps1"
"S2 =     success + (1) eps2"
"S3 =     success + (1) eps3"

"P1 = (1) person + (1) eps4"
"P2 =     person + (1) eps5"

"Q1 = (1) quality + (1) eps6"
"Q2 =     quality + (1) eps7"

"success = person + quality + (1) zeta"

Women:
"S1 = (1) success + (1) eps1"
"S2 =     success + (1) eps2"
"S3 =     success + (1) eps3"

"P1 = (1) person + (1) eps4"
"P2 =     person + (1) eps5"

"Q1 = (1) quality + (1) eps6"
"Q2 =     quality + (1) eps7"

The total number of observed variances and covariances we have to estimate the model parameters is
equal to 56 (28 each for men and women). The number of parameters estimated is 34 (17 each for
men and women), leaving a total of 22 degrees of freedom. The goodness-of-fit results are
shown below. Note that the fit of the model is exceptionally good.

Summary of Results: Model I
Model    NPAR        CMIN      DF            P        CMIN/DF
----------------    ----   ---------      --    ---------      ---------
Default model      34      15.255      22        0.851          0.693

Model           RMR          GFI          AGFI           PGFI
----------------    ----------   ----------    ----------     ----------
Default model         0.207        0.985         0.961          0.387

Note that most of the parameter estimates are quite closely comparable across groups. For example,
the factor loadings for men and women seem remarkably similar. One noticeable difference is in the
path coefficient that describes the impact of "person" on "success." The estimate appears to be much
greater among women (1.879) than among men (0.955). This is something we can subject to a more
rigorous statistical test.

Results for group: men
Regression Weights:                         Estimate       S.E.         C.R.     Label
-------------------                         --------     -------      -------   -------
success <----- person          0.955       0.137        6.962
success <---- quality          1.116       0.154        7.227
S1 <--------- success          1.000
S2 <--------- success          0.996       0.045      22.347
S3 <--------- success          1.035       0.039      26.606
P1 <---------- person          1.000
P2 <---------- person          0.931       0.110       8.470
Q1 <--------- quality          1.000
Q2 <--------- quality          1.044       0.120        8.711

Covariances:                                Estimate       S.E.         C.R.     Label
------------                                --------     -------      -------   -------
person <----> quality         0.066       0.479        0.137

Results for group: women
Regression Weights:                         Estimate       S.E.         C.R.     Label
-------------------                         --------     -------      -------   -------
success <----- person          1.879       0.199        9.463
success <---- quality          1.231       0.229        5.374
S1 <--------- success          1.000
S2 <--------- success          0.994       0.031      31.635
S3 <--------- success          1.014       0.031      32.645
P1 <---------- person          1.000
P2 <---------- person          0.884       0.090       9.801
Q1 <--------- quality          1.000
Q2 <--------- quality          1.196       0.204       5.853

Covariances:                                Estimate       S.E.         C.R.     Label
------------                                --------     -------      -------   -------
person <----> quality          0.159       0.500        0.318

We are now in a position to test for significant differences in model parameters between groups. We
do this by constraining parameter values to be the same across groups. By convention, we proceed by
first testing for equality in the measurement models, and then testing for equality in the structural equation
models. Model II, shown below, constrains the factor loadings and measurement error variances for all
three constructs to be the same across groups. (Note that we might proceed by testing one at a time;
we've accelerated the process somewhat to save space). This is done in AMOS by labeling the
parameters across groups with the same label (e.g., the correlation between "person" and "quality" is
given the same label, phi, in the models for men and women).

In all, we constrain 12 parameters to be the same across groups: four factor loadings (those that are not
already set to 1.0 to identify the model), seven measurement error variances, and one covariance
between the independent factors "person" and "quality." Thus, Model II has 34 degrees of freedom.
As shown below, this constrained model still fits extremely well. The chi-square test is even less
significant than in Model I. Thus, we are unable to reject the hypothesis that the factor structures are
the same across groups.

Summary of Results: Model II
Model   NPAR         CMIN      DF            P        CMIN/DF
----------------   ----    ---------      --    ---------      ---------
Default model     22       22.829      34        0.927          0.671

Model          RMR           GFI          AGFI           PGFI
----------------   ----------    ----------    ----------     ----------
Default model        0.228         0.976         0.960          0.592

Results for group: men
Regression Weights:                         Estimate       S.E.         C.R.     Label
-------------------                         --------     -------      -------   -------
success <----- person          0.934       0.126        7.399
success <---- quality          1.164       0.154        7.569
S1 <--------- success          1.000
S2 <--------- success          0.995       0.026      37.737     f1
S3 <--------- success          1.022       0.024      42.491     f2
P1 <---------- person          1.000
P2 <---------- person          0.885       0.066      13.338     f3
Q1 <--------- quality          1.000
Q2 <--------- quality          1.097       0.105      10.399     f4

Covariances:                                Estimate       S.E.         C.R.     Label
------------                                --------     -------      -------   -------
person <----> quality          0.146       0.356        0.411    phi

Results for group: women
Regression Weights:                         Estimate       S.E.        C.R.      Label
-------------------                         --------     -------     -------    -------
success <----- person          1.856       0.183      10.126

success <---- quality        1.154      0.201       5.748
S1 <--------- success        1.000
S2 <--------- success        0.995      0.026      37.737     f1
S3 <--------- success        1.022      0.024      42.491     f2
P1 <---------- person        1.000
P2 <---------- person        0.885      0.066      13.338     f3
Q1 <--------- quality        1.000
Q2 <--------- quality        1.097      0.105      10.399     f4

Covariances:                                   Estimate      S.E.         C.R.     Label
------------                                   --------    -------      -------   -------
person <----> quality        0.146      0.356        0.411    phi

We now move on to test for equality of the structural equation model coefficients across groups. (As
before, we can test these parameters separately, but for our purposes it serves to test all parameters at
once). In addition to constraining the parameters of the measurement model, we now also constrain
the structural equation coefficients. The fit results (shown below) suggest that this Model III is only
marginally significant (p = 0.089). However, the change in fit from Model II to Model III is 26.2 on a change of
only three degrees of freedom. This suggests a significant deterioration in fit, which leads us to
conclude that there are in fact differences across groups in the values of the path coefficients.
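The chi-square difference test can be sketched as follows (the 5% critical value of 7.81 for three degrees of freedom is a standard table value):

```python
# Nested-model comparison: Model III adds equality constraints on the
# structural coefficients to Model II.
cmin_II, df_II = 22.829, 34
cmin_III, df_III = 49.057, 37

delta_chi2 = cmin_III - cmin_II   # ~26.2
delta_df = df_III - df_II         # 3

# Compare against the chi-square critical value at the 5% level, 3 df.
significant = delta_chi2 > 7.81   # fit deteriorates significantly
```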

Summary of Results: Model III
Model      NPAR         CMIN     DF            P        CMIN/DF
----------------      ----    ---------     --    ---------      ---------
Default model        19       49.057     37        0.089          1.326

Model             RMR          GFI          AGFI           PGFI
----------------      ----------   ----------    ----------     ----------
Default model           3.804        0.952         0.927          0.629

It is not currently possible to perform a multiple group analysis using PROC CALIS in SAS. One approach
we might use to test for a difference between groups is to calibrate the model on one set of data (say,
men), and then use the estimated parameter values to predict to the other sample (i.e., women). If the
only differences between the two groups are non-systematic sampling error, then the predicted values
should not be significantly different from the observed values. If there is a difference, then the
parameters estimated using the data from the men should not fit the women's data very well.

The model fits the men's data very well: χ²(11) = 4.4, which is not significant.

Goodness of Fit Index (GFI)                                   0.9917
GFI Adjusted for Degrees of Freedom (AGFI)                    0.9790
Root Mean Square Residual (RMR)                               0.1114
Parsimonious GFI (Mulaik, 1989)                               0.5195
Chi-Square                                                    4.4300
Chi-Square DF                                                     11
Pr > Chi-Square                                               0.9556

Manifest Variable Equations with Estimates
s1         =   1.0000 f_succ   + 1.0000 e1
s2         =   0.9960*f_succ   + 1.0000 e2

Std Err        0.0446 c2
t Value       22.3546
s3        =    1.0347*f_succ   +   1.0000 e3
Std Err        0.0389 c3
t Value       26.6158
p1        =    2.3637*f_pers   +   1.0000 e4
Std Err        0.2173 c4
t Value       10.8782
p2        =    2.1999*f_pers   +   1.0000 e5
Std Err        0.1966 c5
t Value       11.1889
q1        =    2.1214*f_qual   +   1.0000 e6
Std Err        0.1917 c6
t Value       11.0683
q2        =    2.2155*f_qual   +   1.0000 e7
Std Err        0.1974 c7
t Value       11.2260

Latent Variable Equations with Estimates
f_succ =     2.2570*f_pers   + 2.3670*f_qual    + 1.0000 d
Std Err      0.3083 g1          0.3106 g2
t Value      7.3210             7.6209

Covariances Among Exogenous Variables
Standard
Var1   Var2   Parameter      Estimate         Error         t Value
f_pers f_qual phi             0.01315       0.09613            0.14

When we use the parameters estimated from the men's data to predict the data observed for the women,
the correspondence is not very good. Note that there is no estimation going on at this stage: no model
parameters are being fitted. Because the chi-square statistic is highly significant, we reject the model
and conclude that the two groups are indeed different.

Predicting WOMEN Using Model Calibrated on MEN:
Goodness of Fit Index (GFI)                                0.8216
GFI Adjusted for Degrees of Freedom (AGFI)                 0.8216
Root Mean Square Residual (RMR)                            8.9304
Parsimonious GFI (Mulaik, 1989)                            1.0955
Chi-Square                                                86.9466
Chi-Square DF                                                  28
Pr > Chi-Square                                            <.0001

The problem is that we do not know exactly the basis of the difference between the two groups. One
thing we can do is increase the generality of the model by relaxing the constraints on the structural
equation model parameters. Allowing these three parameters (the two path coefficients and the error
variance) to be different across groups, we get the following result.

Relaxing Constraints on Structural Equation Parameters:
Goodness of Fit Index (GFI)                                   0.9406
GFI Adjusted for Degrees of Freedom (AGFI)                    0.9335
Root Mean Square Residual (RMR)                               0.6635
Parsimonious GFI (Mulaik, 1989)                               1.1197
Chi-Square                                                   26.0779
Chi-Square DF                                                     25
Pr > Chi-Square                                               0.4035

Latent Variable Equations with Estimates
f_succ =    4.3922*f_pers   + 2.2710*f_qual           +    1.0000 d
Std Err     0.3848 g1          0.3854 g2
t Value    11.4127             5.8931

Standardized:
f_succ =       0.7352*f_pers +   0.3801*f_qual +       0.5547 d
g1                g2

11. Analysis of Variance

Exercise 11.4

As currently stated, this problem asks for an ANOVA that involves a within-subjects treatment (i.e.,
each subject is asked to taste a product and a modified form of that product, which means we observe
two different treatment levels within the same subject). Since we have not discussed these designs, I
changed the format of the problem and asked students to look at the differences in perceptions of the
two products (across six attributes) and test to see if there is a difference due to order of taste. The
modified data format is shown below (the two products are identified as 27 and 45; the first product
tasted is listed in column 2; columns 3-8 contain the difference in attribute ratings, product 27 minus product 45):

1    27   -2    -1        0     0    0     0
2    27   -1     1        0    -2   -3    -1
3    27    0     0        0    -1    0     0
4    27    2     1       -1    -1    0    -1
5    27    1     1        0     0    0     0
6    27   -1    -1        1    -2    2    -1
7    27   -2     0        1    -1    0     0
8    27   -4     0        1    -1    1    -1
9    27    1    -2       -1     0    0     0
10    27    2    -1        0     0    0     0
< 89 rows omitted >
100    27    1     0        1     1    0     2
101    45   -1    -1        0     1   -2     0
102    45    1    -1        1     0    0     0
103    45    0     0        0     1    0     0
104    45   -1    -1       -1    -1   -1     0
105    45    1     0        0    -1   -1     2
106    45    1     2        0     0    0     1
107    45    0     0        0     0    0     0
108    45   -1     0        0     1    0     0
109    45   -2     2        0    -1    0     0
110    45   -2     0        1    -1    0    -1
< 90 rows omitted >

The appropriate test is a MANOVA across all six attributes (one can look at the correlation matrix and
verify that there is a high level of collinearity across attributes; Bartlett's test of sphericity could also be
used here). As shown below, the null hypothesis is rejected, which means there are significant
differences. This is interesting, since one would not necessarily expect the perceptions of the products
to depend on which is tasted first. However, there are studies in the marketing literature that
substantiate the existence of this type of first-mover reference effect.

MANOVA Test Criteria and Exact F Statistics for
the Hypothesis of No Overall first Effect
H = Type III SSCP Matrix for first
E = Error SSCP Matrix

S=1     M=2    N=95.5

Statistic                          Value   F Value    Num DF   Den DF   Pr > F
Wilks' Lambda                 0.82829345      6.67         6      193   <.0001
Pillai's Trace                0.17170655      6.67         6      193   <.0001
Hotelling-Lawley Trace        0.20730159      6.67         6      193   <.0001
Roy's Greatest Root           0.20730159      6.67         6      193   <.0001
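With only two levels of the order factor (so S = 1), all four statistics are functions of Wilks' Λ and the F test is exact; the table above can be reproduced from Λ alone (a hand check):

```python
# Exact F for Wilks' lambda with two groups:
#   F = ((1 - L) / L) * ((nu - p + 1) / p),  where nu = N - g is the error df.
N, g, p = 200, 2, 6          # subjects, groups, dependent variables
wilks = 0.82829345           # Wilks' lambda from the output above

nu = N - g                                  # error df = 198
F = (1 - wilks) / wilks * (nu - p + 1) / p  # ~6.67
num_df, den_df = p, nu - p + 1              # 6 and 193

# When S = 1 the other criteria are simple transforms of lambda.
pillai = 1 - wilks                 # Pillai's trace
hotelling = (1 - wilks) / wilks    # Hotelling-Lawley trace (= Roy's root)
```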

Treatment group means

Level of         ------------a1------------      ------------a2------------
first        N           Mean       Std Dev              Mean       Std Dev
27         100    -0.55000000    1.69595478       -0.20000000    1.29490064
45         100     0.37000000    1.66153756        0.21000000    1.17460503

Level of         ------------a3------------      ------------a4------------
first        N           Mean       Std Dev              Mean       Std Dev
27         100     0.21000000    0.70057696        0.08000000    1.07007033
45         100    -0.01000000    0.67412495       -0.32000000    0.91981553

Level of         ------------a5------------      ------------a6------------
first        N           Mean       Std Dev              Mean       Std Dev
27         100     0.09000000    1.15553127       -0.02000000    1.01483939
45         100    -0.11000000    1.01399301       -0.18000000    1.08599905

A canonical correlation analysis helps to facilitate the interpretation. Note that the test (based on
Wilks's Λ) is exactly the same. What we also can see is the canonical structure (i.e., the correlations
between the attributes and their canonical variables). It shows that the product tasted first benefits on
the first two attributes, but does more poorly on the last four.

Canonical Correlation Analysis

Canonical Structure

Correlations Between the VAR Variables and Their Canonical Variables
V1
first        1.0000

Correlations Between the WITH Variables and Their Canonical Variables
W1
a1        0.6407
a2        0.3967
a3       -0.3832
a4       -0.4766
a5       -0.2222
a6       -0.1841

Exercise 11.6

These data are meant to mimic the results of an experiment exploring the impact of competitive
expectations. The results are not particularly exciting. Single ANOVAs reveal that both share and
profit are influenced by the experimental manipulation.

Dependent Variable: share

Sum of
Source               DF         Squares      Mean Square    F Value     Pr > F
Model                 2     10290.70000       5145.35000      12.76     <.0001
Error                57     22982.90000        403.20877
Corrected Total      59     33273.60000

Dependent Variable: profit

Sum of
Source               DF         Squares       Mean Square   F Value     Pr > F
Model                 2     10157.70000        5078.85000      7.05     0.0018
Error                57     41043.90000         720.06842
Corrected Total      59     51201.60000
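The F ratios above follow directly from the sums of squares (a sketch of the usual one-way ANOVA arithmetic):

```python
# F = (SS_model / df_model) / (SS_error / df_error) for each ANOVA table.
def f_ratio(ss_model, df_model, ss_error, df_error):
    return (ss_model / df_model) / (ss_error / df_error)

F_share = f_ratio(10290.70, 2, 22982.90, 57)   # ~12.76
F_profit = f_ratio(10157.70, 2, 41043.90, 57)  # ~7.05
```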

These results from the single ANOVAs are supported by the MANOVA. The pattern of means shows
that the "competitive" treatment group performed relatively better in terms of profitability, while the
"cooperative" treatment group performed relatively better on share. Both treatment groups seemed to
perform better than the control.

MANOVA Test Criteria and F Approximations for
the Hypothesis of No Overall treat Effect
H = Type III SSCP Matrix for treat
E = Error SSCP Matrix

Statistic                         Value F Value Num DF Den DF Pr > F
Wilks' Lambda                0.45856613   13.35      4    112 <.0001
Pillai's Trace               0.62548024   12.97      4    114 <.0001
Hotelling-Lawley Trace       0.99742972   13.88      4 66.174 <.0001
Roy's Greatest Root          0.75451889   21.50      2     57 <.0001

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.

Level of       -----------share----------     ----------profit----------
treat      N           Mean       Std Dev             Mean       Std Dev

1          20       92.750000    18.8815560           91.600000      21.7894228
2          20       99.550000    16.5449149          122.650000      30.9078001
3          20      123.300000    24.0702918          113.350000      27.0209957

Almost nothing changes if we include past work experience as a covariate:

Dependent Variable: share

Sum of
Source                 DF         Squares       Mean Square        F Value     Pr > F
Model                   3     13707.42043        4569.14014          13.08     <.0001
Error                  56     19566.17957         349.39606
Corrected Total        59     33273.60000

Source                   DF    Type III SS       Mean Square        F Value     Pr > F
treat                    2    9929.133346       4964.566673          14.21     <.0001
past_exp                 1    3416.720431       3416.720431           9.78     0.0028

Dependent Variable: profit

Sum of
Source                 DF         Squares       Mean Square        F Value     Pr > F
Model                   3     13252.53660        4417.51220           6.52     0.0007
Error                  56     37949.06340         677.66185
Corrected Total        59     51201.60000

Source                   DF    Type III SS       Mean Square        F Value     Pr > F
treat                    2    9949.292723       4974.646362           7.34     0.0015
past_exp                 1    3094.836603       3094.836603           4.57     0.0370

MANOVA Test Criteria and F Approximations for
the Hypothesis of No Overall treat Effect
H = Type III SSCP Matrix for treat
E = Error SSCP Matrix

Statistic                          Value     F Value    Num DF    Den DF     Pr > F

Wilks' Lambda                0.44174276       13.88          4        110   <.0001
Pillai's Trace               0.64961118       13.47          4        112   <.0001
Hotelling-Lawley Trace       1.05695747       14.45          4     64.974   <.0001
Roy's Greatest Root          0.79771155       22.34          2         56   <.0001

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.

Least Squares Means
profit
treat      share LSMEAN            LSMEAN
1             93.015415         91.852604
2             99.583177        122.681575

3           123.001408        113.065821

Canonical correlation also helps the interpretation of the MANOVA. In this case, we have three
treatment groups and two dependent variables, so there are two pairs of canonical correlations. As
shown by the sequential test based on Wilks's Λ, both pairs are significant.

Canonical Correlation Analysis

                                Adjusted     Approximate        Squared
                 Canonical     Canonical        Standard      Canonical
               Correlation   Correlation           Error    Correlation
        1         0.655777      0.637219        0.074202       0.430043
        2         0.442083       .              0.104745       0.195437

Test of H0: The canonical correlations in the
current row and all that follow are zero

Likelihood      Approximate
Ratio          F Value      Num DF     Den DF     Pr > F
1    0.45856613            13.35           4        112     <.0001
2    0.80456294            13.85           1         57     0.0005
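The likelihood ratios in this sequential test are just products of (1 - squared canonical correlation) over the remaining pairs (a hand check against the tables above):

```python
# Squared canonical correlations from the output above.
r2 = [0.430043, 0.195437]

# Row k tests that canonical correlations k, k+1, ... are all zero.
lr_row1 = (1 - r2[0]) * (1 - r2[1])   # ~0.4586 (= Wilks' lambda for the MANOVA)
lr_row2 = 1 - r2[1]                   # ~0.8046
```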

Looking at the canonical structure, we see that the first pair of variables differentiates between the
"competitive" and "cooperative" treatments (note that comp loads positively on V1, while coop loads
negatively). Looking at W1, we see that it reflects the difference in performance on the two variables
(loading positively on profit and negatively on share). The second pair of canonical variables reflects
the difference in performance between the two treatment groups and the control group (which does
worse in terms of profit and share).

Canonical Structure

Correlations Between the VAR Variables and Their Canonical Variables
V1            V2
comp        0.7937        0.6083
coop       -0.9236        0.3833

Correlations Between the WITH Variables and Their Canonical Variables
W1            W2
share        -0.6966        0.7175
profit        0.1121        0.9937

Exercise 11.7

It is possible to approach this problem using MANOVA to examine the impact of the experimental
factors on all four dependent variables (Y1, Y2, Y3 and Y4). However, it seems clear that these four
dependent measures are designed to measure two distinct constructs: liking (Y1 and Y2) and purchase
intention (Y3 and Y4).

We begin by using exploratory factor analysis to examine the factor structure of the four dependent
variables. As shown below, we find that a two-factor solution accounts for most of the variation in the
data: the sum of the communalities is 3.23, which suggests that two common factors account for
3.23/4.00 or over 80 percent of the variation in the data.

Exploratory Factor Analysis of Four Dependent Variables in Exercise 11.7
Prior Communality Estimates: SMC
0.80643919        0.80162491      0.71857006          0.69248843

Eigenvalues of the Reduced Correlation Matrix:
Total = 3.0191226 Average = 0.75478065

Eigenvalue      Difference      Proportion     Cumulative
1    2.74173543      2.24935193          0.9081         0.9081
2    0.49238350      0.58158550          0.1631         1.0712
3    -.08920200      0.03659233         -0.0295         1.0417
4    -.12579433                         -0.0417         1.0000

2 factors will be retained by the NFACTOR criterion.

Factor Pattern
Factor1           Factor2
like1        0.86169          -0.33046
like2        0.85272          -0.34438

Variance Explained by Each Factor
Factor1         Factor2
2.7417354       0.4923835

Final Communality Estimates: Total = 3.234119
0.85170968      0.84573766      0.78365811      0.75301347

An oblique rotation of the factor solution results in two clear factors: “product liking” and “purchase
intention.” The factor loadings exhibit simple structure (Y1 and Y2 load clearly on the first factor and
Y3 and Y4 load clearly on the second). The two factors exhibit a correlation of 0.62 (i.e., product
liking is strongly positively associated with purchase intention).

Results of Oblique Rotation for Exercise 11.7
Rotation Method: Promax (power = 3)

Inter-Factor Correlations
Factor1         Factor2
Factor1          1.00000         0.61950
Factor2          0.61950         1.00000

Rotated Factor Pattern (Standardized Regression Coefficients)
Factor1         Factor2
like1         0.88132         0.06483
like2         0.89175         0.04397

Using the factor score coefficients from the factor analysis, we calculated factor scores and used these
as the dependent variables in a MANOVA. (Note that the results would be similar if we simply took
the sum of Y1 and Y2 and the sum of Y3 and Y4 and subjected them to the same analysis).

The results of the simple ANOVAs of Factor1 (“Liking”) and Factor 2 (“Purchase Intent”) are shown
below. Interestingly, we see that the inclusion of a uniqueness claim has a significant impact on liking
(p < 0.05) and the inclusion of a competitive claim has a significant impact on purchase intent (p <
0.05). There is also a significant interaction between claims on liking (p < 0.05) and purchase intent (p
< 0.05).
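For a balanced two-factor design such as this 2×2, the Type III sums of squares reported in the ANOVA tables coincide with the textbook decomposition into main-effect, interaction, and error sums of squares. A small numpy sketch (illustrative only; the function name and test data are my own, not this exercise's data):

```python
import numpy as np

def twoway_ss(y, a, b):
    """Sums of squares for a balanced two-factor ANOVA:
    returns (SS_A, SS_B, SS_AB, SS_error)."""
    y, a, b = np.asarray(y, float), np.asarray(a), np.asarray(b)
    grand = y.mean()
    la, lb = np.unique(a), np.unique(b)
    ma = {i: y[a == i].mean() for i in la}          # marginal means, factor A
    mb = {j: y[b == j].mean() for j in lb}          # marginal means, factor B
    mab = {(i, j): y[(a == i) & (b == j)].mean() for i in la for j in lb}
    ss_a = sum((a == i).sum() * (ma[i] - grand) ** 2 for i in la)
    ss_b = sum((b == j).sum() * (mb[j] - grand) ** 2 for j in lb)
    ss_ab = sum(((a == i) & (b == j)).sum()
                * (mab[i, j] - ma[i] - mb[j] + grand) ** 2
                for i in la for j in lb)
    ss_e = sum(((y[(a == i) & (b == j)] - mab[i, j]) ** 2).sum()
               for i in la for j in lb)
    return ss_a, ss_b, ss_ab, ss_e
```

With purely additive data the interaction and error terms are zero, which makes the decomposition easy to check by hand.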

Simple ANOVA Results for Exercise 11.7
Dependent Variable: Factor1

Sum of
Source                      DF         Squares      Mean Square     F Value    Pr > F
Model                        3      9.12005460       3.04001820        3.66    0.0150
Error                       96     79.64772132       0.82966376
Corrected Total             99     88.76777592

R-Square      Coeff Var       Root MSE      Factor1 Mean
0.102741     2.56384E17       0.910859        3.5527E-16

Source                      DF     Type III SS      Mean Square     F Value    Pr > F
compet                       1      0.00543779       0.00543779        0.01    0.9356
unique                       1      4.53521680       4.53521680        5.47    0.0215
compet*unique                1      4.57940001       4.57940001        5.52    0.0209

Dependent Variable: Factor2

Sum of
Source                      DF         Squares      Mean Square     F Value    Pr > F
Model                        3      9.34817426       3.11605809        4.03    0.0096
Error                       96     74.24853373       0.77342223
Corrected Total             99     83.59670799

R-Square        Coeff Var        Root MSE     Factor2 Mean
0.111825       8.25139E17        0.879444       1.0658E-16

Source                        DF     Type III SS      Mean Square      F Value   Pr > F
compet                         1      3.94340484       3.94340484         5.10   0.0262
unique                         1      0.59738878       0.59738878         0.77   0.3817
compet*unique                  1      4.80738064       4.80738064         6.22   0.0144

The statistical tests from the MANOVA of Factor1 and Factor2 yield the same conclusions: all effects
significant at p < 0.05.

MANOVA Results for Exercise 11.7
Multivariate Analysis of Variance

COMPETITIVE CLAIM:

Statistic                         Value    F Value   Num DF   Den DF   Pr > F
Wilks' Lambda                0.90767677       4.83        2       95   0.0100

UNIQUENESS CLAIM:

Statistic                         Value    F Value   Num DF   Den DF   Pr > F
Wilks' Lambda                0.93481584       3.31        2       95   0.0407

INTERACTION:

Statistic                         Value    F Value   Num DF   Den DF   Pr > F
Wilks' Lambda                0.93300518       3.41        2       95   0.0371

The effects themselves are revealed by the mean values of Factor1 and Factor2, shown below:

Mean Values on Factor1 and Factor 2
Level of          ----------Factor1---------   ----------Factor2---------
compet        N           Mean       Std Dev           Mean       Std Dev
0            50     0.00737414    0.94288753     0.19858008    0.89527570
1            50    -0.00737414    0.96043709    -0.19858008    0.90777697

Level of          ----------Factor1---------   ----------Factor2---------
unique        N           Mean       Std Dev           Mean       Std Dev
0            50     0.21296048    1.00187542     0.07729093    0.96192245
1            50    -0.21296048    0.84574078    -0.07729093    0.87668059

Level of Level of    ---------Factor1---------   ---------Factor2---------
compet   unique    N         Mean      Std Dev         Mean      Std Dev
0        0        25   0.00633929   1.07020031   0.05661361   1.00625374
0        1        25   0.00840898   0.81840458   0.34054655   0.76282283
1        0        25   0.41958167   0.90280717   0.09796825   0.93579176
1        1        25  -0.43432995   0.82974666  -0.49512841   0.78964385

Note that the main effect of the inclusion of a competitive claim is largest (and negative) with respect
to purchase intent. It suggests that a competitive claim has little effect on product liking and a
pronounced negative effect on purchase intent. Similarly, the effect of the inclusion of a uniqueness
claim is largest (and negative) with respect to product liking. It suggests that a uniqueness claim has
little effect on purchase intent and a pronounced negative effect on product liking.

These seemingly counterintuitive results make more sense when we examine the interaction effect
revealed by the cell means. These suggest that the inclusion of a competitive claim alone has a
positive effect on liking (but not purchase intent) relative to no competitive claim, and that the
inclusion of a uniqueness claim alone has a positive effect on purchase intent (but not liking) relative
to no uniqueness claim. However, when both claims are included, the effect on both liking and
purchase intent is substantially negative relative to no claims at all.

We can compare the results above to those from a MANOVA of all four dependent variables
separately. As shown below, while the main effects of competitive claim and uniqueness claim are
both significant (at p < 0.05), the interaction effect is not. This seems unusual, given that the
interaction effect is significant in each of the simple ANOVAs.

MANOVA for all four dependent variables in Exercise 11.7
Multivariate Analysis of Variance

COMPETITIVE CLAIM:

Statistic                      Value    F Value   Num DF   Den DF    Pr > F
Wilks' Lambda             0.81605120       5.24        4       93    0.0007

UNIQUENESS CLAIM:

Statistic                      Value    F Value   Num DF   Den DF    Pr > F
Wilks' Lambda             0.87563562       3.30        4       93    0.0141

INTERACTION:

Statistic                      Value    F Value   Num DF   Den DF    Pr > F
Wilks' Lambda             0.93077476       1.73        4       93    0.1501

The reason may be that the pattern of mean values differs across the four dependent variables. As
shown below, the combination of a competitive claim and a uniqueness claim results in the lowest
average score on each of the four dependent measures, but there is more noise across the four
variables, and the pattern of interaction does not hold up in this analysis.

Mean Values on Four Dependent Variables for Exercise 11.7

Level of Level of         ----------like1---------- ----------like2----------
compet   unique       N           Mean      Std Dev         Mean      Std Dev

0        0           25    49.6400000   9.34469546   49.4000000   10.0332780
0        1           25    49.2400000   8.00666389   49.5200000    7.4056285
1        0           25    53.3200000   8.24984848   53.3600000    8.3260635
1        1           25    45.4000000   8.50490055   46.0400000    7.1032856

Level of Level of         ----------- Y3 ----------- ----------- Y4 -----------
compet   unique       N           Mean      Std Dev         Mean      Std Dev

0        0           25    51.0000000   8.96753403   50.3600000   8.37595766
0        1           25    52.8800000   6.55311631   53.9200000   7.69696910
1        0           25    52.6000000   8.91160292   48.4800000   7.94837510
1        1           25    46.5200000   7.93263302   45.4000000   6.74536878

SAS PROGRAM FOR EXERCISE 11.7
options ls=72;

data claims;
infile 'CLAIMS.txt';   /* file name assumed; not given in the original */
input compet unique like1 like2 intent1 intent2;   /* names for Y3 and Y4 assumed */
run;

proc factor data=claims method=principal priors=smc n=2
rotate=promax score outstat=stat;
var like1 like2 intent1 intent2;
run;

proc score data=claims score=stat out=scores;
run;

proc glm data=scores;
class compet unique;
model factor1 factor2 = compet unique compet*unique;
manova h=compet unique compet*unique;
means compet unique compet*unique;
run;

proc glm data=claims;
class compet unique;
model like1 like2 intent1 intent2 = compet unique compet*unique;
manova h=compet unique compet*unique;
means compet unique compet*unique;
run;

Exercise 11.8

Ever wondered how different methods of cooking fish impact the aroma, flavor, texture and moisture
of the final product? Here is your chance to find out! All four of these measures are relatively highly
correlated, so MANOVA is appropriate here.

Pearson Correlation Coefficients, N = 36
Prob > |r| under H0: Rho=0

              aroma      flavor     texture    moisture
aroma       1.00000     0.73024     0.49060     0.39795
                         <.0001      0.0024      0.0162
flavor      0.73024     1.00000     0.36140     0.41800
             <.0001                  0.0303      0.0112
texture     0.49060     0.36140     1.00000     0.59300
             0.0024      0.0303                  0.0001
moisture    0.39795     0.41800     0.59300     1.00000
             0.0162      0.0112      0.0001

The results of the MANOVA show a significant difference across cooking methods. Looking at the
patterns of means, we see the largest differences across groups in terms of flavor.

MANOVA Test Criteria and F Approximations for
the Hypothesis of No Overall method Effect
H = Type III SSCP Matrix for method
E = Error SSCP Matrix

Statistic                              Value F Value Num DF Den DF Pr > F
Wilks' Lambda                     0.24182119    7.75      8     60 <.0001
Pillai's Trace                    0.85684365    5.81      8     62 <.0001
Hotelling-Lawley Trace            2.72727944   10.04      8 40.602 <.0001
Roy's Greatest Root               2.56842429   19.91      4     31 <.0001

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.
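All four criteria in the table above are functions of the eigenvalues of E⁻¹H, where H and E are the hypothesis and error SSCP matrices. A minimal numpy sketch (illustrative, not SAS; the function name is my own):

```python
import numpy as np

def manova_criteria(H, E):
    """The four standard MANOVA statistics from the hypothesis (H)
    and error (E) SSCP matrices, via the eigenvalues of inv(E) @ H."""
    eigs = np.linalg.eigvals(np.linalg.solve(E, H)).real
    wilks = float(np.prod(1.0 / (1.0 + eigs)))          # Wilks' Lambda
    pillai = float(np.sum(eigs / (1.0 + eigs)))         # Pillai's Trace
    hotelling = float(np.sum(eigs))                     # Hotelling-Lawley Trace
    roy = float(np.max(eigs))                           # Roy's Greatest Root
    return wilks, pillai, hotelling, roy
```

Because all four are monotone functions of the same eigenvalues, they tend to agree when the first eigenvalue dominates, as it does in this exercise.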

Level of            -----------aroma----------    ----------flavor----------
method         N            Mean       Std Dev            Mean       Std Dev

1              12     5.38333333     0.58439298    5.70833333    0.44201673
2              12     5.25833333     0.76331613    5.23333333    0.57419245
3              12     4.97500000     0.54292976    4.83333333    0.45990776

Level of            ----------texture---------    ---------moisture---------
method         N            Mean       Std Dev            Mean       Std Dev

1              12     5.52500000     0.65244296      5.98333333    0.69652819
2              12     5.30833333     0.59460962      5.87500000    0.51720402
3              12     5.90833333     0.51071845      6.23333333    0.45593726

Canonical correlation provides some additional interpretation. Again, with three methods and four
variables, we have two pairs of canonical correlations. In this case, however, only the first pair is
significant, so we interpret only that one. Looking at the canonical structure, we see
that methods 1 and 2 load positively on the first canonical variable (so we may interpret this variable as
being associated with these two cooking methods and not with method 3). Looking at the loadings on
W1, we see that flavor (especially) and to some extent aroma load positively, while texture and
moisture load negatively. Putting these together, we can conclude that methods 1 and 2 tend to lead to
higher evaluations on flavor and aroma, but lower on texture and moisture than method 3. This is
borne out by the patterns of treatment group means.

Canonical Correlation Analysis

                           Adjusted    Approximate      Squared
            Canonical     Canonical       Standard    Canonical
          Correlation   Correlation          Error   Correlation
   1         0.848389      0.832288       0.047368      0.719764
   2         0.370242      0.312605       0.145860      0.137079

Test of H0: The canonical correlations in the
current row and all that follow are zero

Likelihood     Approximate
Ratio         F Value      Num DF     Den DF     Pr > F
1        0.24182119            7.75           8         60     <.0001
2        0.86292062            1.64           3         31     0.1999

Canonical Structure

Correlations Between the VAR Variables and Their Canonical Variables
V1            V2
m1        0.7259        0.6878
m2        0.2327       -0.9726

Correlations Between the WITH Variables and Their Canonical Variables
W1            W2
aroma           0.3177        0.0106
flavor          0.6811        0.4560
texture        -0.3770        0.6613
moisture       -0.2618        0.3999

Exercise 11.11

This problem calls for a test of an orientation program on outcome measures of anxiety, depression,
and anger. Presumably, the goal of the program is to increase the effectiveness of psychotherapy and
lead to lower measures on these variables.

Pearson Correlation Coefficients, N = 46
Prob > |r| under H0: Rho=0

             anxiety     depress       anger
anxiety      1.00000     0.85554     0.65612
                          <.0001      <.0001
depress      0.85554     1.00000     0.63084
              <.0001                  <.0001
anger        0.65612     0.63084     1.00000
              <.0001      <.0001

A MANOVA shows that the effect of the program is not significant at the 0.05 level. Thus, even
though the pattern of means is consistent with the goals of the program, we cannot reject the null
hypothesis of no treatment effect: the observed differences could have arisen by chance.

MANOVA Test Criteria and Exact F Statistics for
the Hypothesis of No Overall treat Effect
H = Type III SSCP Matrix for treat
E = Error SSCP Matrix

Statistic                              Value F Value Num DF Den DF Pr > F
Wilks' Lambda                     0.85025028    2.47      3     42 0.0754
Pillai's Trace                    0.14974972    2.47      3     42 0.0754
Hotelling-Lawley Trace            0.17612428    2.47      3     42 0.0754
Roy's Greatest Root               0.17612428    2.47      3     42 0.0754
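Note that the four criteria coincide here: with only two groups, the MANOVA reduces to a two-sample Hotelling T² test, whose exact F transform has (p, n₁+n₂−p−1) degrees of freedom, matching the (3, 42) shown above. A numpy sketch (the function name and test data are illustrative, not this exercise's data):

```python
import numpy as np

def hotelling_two_sample(X0, X1):
    """Two-sample Hotelling T^2 with its exact F transform."""
    X0, X1 = np.asarray(X0, float), np.asarray(X1, float)
    n0, n1 = len(X0), len(X1)
    p = X0.shape[1]
    d = X1.mean(axis=0) - X0.mean(axis=0)
    S0 = np.cov(X0, rowvar=False) * (n0 - 1)     # within-group SSCP matrices
    S1 = np.cov(X1, rowvar=False) * (n1 - 1)
    Sp = (S0 + S1) / (n0 + n1 - 2)               # pooled covariance
    t2 = n0 * n1 / (n0 + n1) * d @ np.linalg.solve(Sp, d)
    f = (n0 + n1 - p - 1) / ((n0 + n1 - 2) * p) * t2
    return t2, f, (p, n0 + n1 - p - 1)
```

With p = 1 variable this collapses to the square of the ordinary two-sample t statistic, which is a handy sanity check on any implementation.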

Level of          ----------anxiety---------     ----------depress---------
treat        N            Mean       Std Dev             Mean       Std Dev
0           26      158.769231     80.173466       182.884615    124.205419
1           20      104.350000    105.812782       154.250000    137.158715

Level of        ------------anger------------
treat       N           Mean          Std Dev
0          26     66.0000000       51.1890613
1          20     51.0500000       51.1967053

Exercise 11.12

This problem is a relatively straightforward application of MANOVA. This is a one-factor design with
three factor levels: control group, behavioral rehearsal treatment group, and behavioral rehearsal with
cognitive restructuring. There are four dependent variables in the study: anxiety, social skills,
appropriateness, and assertiveness. All four variables are highly intercorrelated (see below). It might
be argued that all four are indicators of one underlying factor.

Correlation matrix for four dependent variables in Exercise 11.12
Pearson Correlation Coefficients, N = 33
Prob > |r| under H0: Rho=0

             anxiety      social      approp      assert
anxiety      1.00000    -0.82209    -0.85925    -0.89866
                          <.0001      <.0001      <.0001
social      -0.82209     1.00000     0.87183     0.83709
              <.0001                  <.0001      <.0001
approp      -0.85925     0.87183     1.00000     0.93552
              <.0001      <.0001                  <.0001
assert      -0.89866     0.83709     0.93552     1.00000
              <.0001      <.0001      <.0001

We first analyze the four dependent measures using MANOVA. The results (shown below) show a
statistically significant effect (p < 0.001). Thus, we reject the null hypothesis that the means of the
dependent variables are the same across treatment groups.

MANOVA test of all four dependent variables for Exercise 11.12
Multivariate Analysis of Variance

Statistic                            Value   F Value    Num DF   Den DF   Pr > F
Wilks' Lambda                   0.38164971      4.18         8       54   0.0006

Examining the means of the variables across groups, we see that the most dramatic differences are
between the control group and the two treatment groups (behavioral rehearsal and behavioral rehearsal
plus cognitive restructuring).

Treatment group means for four dependent variables
Level of           ----------anxiety---------    ----------social----------
group         N            Mean       Std Dev            Mean       Std Dev

1             11     4.27272727     0.64666979        4.27272727    0.64666979
2             11     4.09090909     0.30151134        4.27272727    0.90453403
3             11     5.45454545     0.82019953        2.54545455    1.03572548

Level of          ----------approp----------       ----------assert----------
group        N            Mean       Std Dev               Mean       Std Dev

1            11    4.18181818         0.60302269        3.81818182    0.75075719
2            11    4.27272727         0.78624539        4.09090909    0.70064905
3            11    2.54545455         0.93419873        2.54545455    0.93419873

This raises an interesting question: is there any incremental effect of cognitive restructuring over and
above behavioral rehearsal? To test this, we test the contrast between the two treatment groups (1
versus 2). The result (shown below) is not significant.

MANOVA: Contrast of Group 1 versus Group 2
Multivariate Analysis of Variance

Statistic                             Value    F Value    Num DF   Den DF   Pr > F
Wilks' Lambda                    0.94238950       0.41         4       27   0.7980

We also analyzed the data by factor analyzing the four dependent variables and extracting one factor
(which accounted for almost 85 percent of the variation in the data) and conducting a simple ANOVA.
The results are similar: the treatment effects are significant (relative to control), and the impact of
cognitive restructuring and behavioral rehearsal (relative to behavioral rehearsal alone) is not
significant.
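The contrast sum of squares in this follow-up ANOVA follows the usual single-degree-of-freedom formula. A short Python check (the function name is my own) using the Factor1 group means from the output reproduces the "1 versus 2" Contrast SS of 0.14401863:

```python
def contrast_ss(means, ns, coefs):
    """Sum of squares for a single-df contrast among group means:
    (sum c_i * m_i)^2 / sum(c_i^2 / n_i)."""
    num = sum(c * m for c, m in zip(coefs, means)) ** 2
    den = sum(c * c / n for c, n in zip(coefs, ns))
    return num / den

# the '1 versus 2' contrast on the three Factor1 group means (n = 11 each)
ss = contrast_ss([0.41277050, 0.57458893, -0.98735943], [11, 11, 11], [-1, 1, 0])
```

Dividing this contrast SS by the error mean square from the ANOVA gives the F ratio reported for the contrast.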

ANOVA of single factor extracted from four dependent variables in Exercise 11.12
Dependent Variable: Factor1

Sum of
Source                            DF         Squares        Mean Square     F Value   Pr > F
Model                              2     16.22951626         8.11475813       16.60   <.0001
Error                             30     14.66709800         0.48890327
Corrected Total                   32     30.89661426

R-Square         Coeff Var          Root MSE      Factor1 Mean
0.525285        6.49479E17          0.699216        1.0766E-16

Contrast                          DF     Contrast SS       Mean Square      F Value   Pr > F
1 versus 2                         1      0.14401863        0.14401863         0.29   0.5913

GROUP MEANS:

Level of                  -----------Factor1-----------
group           N                 Mean          Std Dev
1              11           0.41277050       0.59258357
2              11           0.57458893       0.60827970
3              11          -0.98735943       0.86345255

SAS PROGRAM FOR EXERCISE 11.12
options ls=72;

data skills;
infile 'SOC_SKILLS.txt';
input group $ anxiety social approp assert;
run;

proc corr data=skills;
var anxiety social approp assert;
run;

proc glm data=skills;
class group;
model anxiety social approp assert = group;
contrast '1 versus 2' group -1 1 0;
manova h=group;
means group;
run;

proc factor data=skills method=principal priors=smc n=1
score outstat=stat;
var anxiety social approp assert;
run;


12. Discriminant Analysis

Exercise 12.7

The data are collected from two groups of patients: 15 classified as ill and 30 classified as well. There
is no information accompanying the problem to say whether this constitutes a representative sample
from any population, so it is hard to say how to generalize this analysis. In the absence of any other
specific information, we will assume that the sample proportions (1/3, 2/3) represent our priors.

We can test for a difference between group centroids (i.e., the mean value on each of the five measures
of everyday functionality) using discriminant analysis. Because discriminant analysis is a special case
of canonical correlation, the results from PROC CANCORR in SAS are shown below. The correlation
is 0.70 and the associated F-test of significance (based on Wilks's Λ) is significant at the 0.0001 level.

                           Adjusted    Approximate      Squared
            Canonical     Canonical       Standard    Canonical
          Correlation   Correlation          Error   Correlation
   1         0.696192      0.666624       0.077687      0.484683

Test of H0: The canonical correlations in the
current row and all that follow are zero

Likelihood      Approximate
Ratio          F Value     Num DF      Den DF     Pr > F
1     0.51531669             7.34          5          39     <.0001

We can look at the canonical loadings (i.e., the correlations between original variables and canonical
variates) to interpret the results. They show that four of the five measures load highly (i.e., all but
"feeling capable of making decisions"). This suggests that this one variable is perhaps a less reliable
indicator of illness than the other four. Note that the standardized canonical coefficients (i.e., the
weights used to form the linear combinations that are the canonical variates) suggest the same pattern
(i.e., closest to zero for the decision-making variable).

Total Canonical Structure
Variable                    Can1
useful                  0.875057
content                 0.663438
decide                  0.381177
nostart                 0.778799

Total-Sample Standardized Canonical Coefficients
Variable              Can1

useful         0.6069552778
content        0.2857822573
decide         -.0828879014
nostart        0.3842507350

Class Means on Canonical Variables
ill                   Can1
1           -1.340710160
2            0.670355080

To test the predictive performance of the discriminant function, we use a one-at-a-time holdout cross-
validation (in this case, using PROC DISCRIM in SAS). The results, shown below, suggest a hit rate
of 38/45 = 84.4 percent. The proportional chance criterion is only (1/3)² + (2/3)² = 55.6 percent. How
likely are we to achieve a hit rate of 38 out of 45 if the true performance of our linear discriminant
function were no better than 0.556? The standard deviation of the number of expected hits under this
null hypothesis is equal to sqrt( 45 x 0.556 x 0.444 ) = 3.33. The t-ratio is t = (38 - 25) / 3.33 = 3.9,
which is significant.
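This calculation is easy to verify; a short Python sketch (the function name is my own) using the numbers from this exercise:

```python
import math

def hit_rate_z(hits, n, priors):
    """z-ratio of an observed hit count against the proportional
    chance criterion C_pro = sum of squared prior probabilities."""
    c_pro = sum(p * p for p in priors)
    expected = n * c_pro                         # hits expected under H0
    sd = math.sqrt(n * c_pro * (1.0 - c_pro))    # binomial standard deviation
    return c_pro, (hits - expected) / sd

c_pro, z = hit_rate_z(38, 45, (1 / 3, 2 / 3))    # the numbers from this exercise
```

Here c_pro is 5/9 ≈ 0.556, the expected hit count is 25, and z works out to exactly 3.9, matching the t-ratio in the text.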

Linear Discriminant Function

Constant_j = -0.5 Xbar_j' COV^(-1) Xbar_j + ln PRIOR_j        Coefficient Vector_j = COV^(-1) Xbar_j

Variable                1                2
Constant        -11.12483        -18.48019
useful            0.18715          1.48700
content           3.02631          3.72234
decide            5.22426          4.94589
nostart          -0.79384         -0.10000

Cross-Validation:    Number of Observations and Percent Classified into ill

From ill           1          2       Total
1                 12          3          15
               80.00      20.00      100.00
2                  4         26          30
               13.33      86.67      100.00
Total             16         29          45
               35.56      64.44      100.00
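The one-at-a-time (leave-one-out) holdout procedure itself can be sketched in a few lines of numpy. This is an illustrative re-implementation of a pooled-covariance (linear) rule with class priors, not PROC DISCRIM itself, and all names are my own:

```python
import numpy as np

def loo_lda_hits(X, y, priors):
    """Leave-one-out hit count for a linear (pooled-covariance)
    discriminant rule with the given class priors."""
    X, y = np.asarray(X, float), np.asarray(y)
    n = len(y)
    classes = np.unique(y)
    hits = 0
    for i in range(n):
        mask = np.arange(n) != i                 # hold out observation i
        Xt, yt = X[mask], y[mask]
        # pooled within-group covariance from the training fold
        Sp = sum(np.cov(Xt[yt == c], rowvar=False) * (np.sum(yt == c) - 1)
                 for c in classes) / (len(yt) - len(classes))
        scores = []
        for c, pr in zip(classes, priors):
            m = Xt[yt == c].mean(axis=0)
            w = np.linalg.solve(Sp, m)
            # linear discriminant score: x'w - 0.5 m'w + ln(prior)
            scores.append(X[i] @ w - 0.5 * m @ w + np.log(pr))
        if classes[int(np.argmax(scores))] == y[i]:
            hits += 1
    return hits
```

Because each observation is scored with coefficients estimated from the other n-1 cases, the resulting hit rate is a nearly unbiased estimate of out-of-sample performance.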

We also test the assumption that the within-group covariance matrices are the same across ill and well
patients. As shown below, this assumption clearly does not hold; among ill patients, two or more
variables are perfectly collinear within sample. However, if we use a quadratic discriminant function
instead of linear, we find that the prediction performance goes down (when evaluated using
one-at-a-time holdout cross-validation) compared to the linear function. This suggests that the
quadratic rule is too sensitive to the differences in the estimated within-group sample covariance
matrices across groups.

Within Covariance Matrix Information
Natural Log of the
Covariance    Determinant of the
ill       Matrix Rank     Covariance Matrix
1                 4             -27.23757
2                 5              -2.80739
Pooled                 5              -4.06452

Chi-Square       DF    Pr > ChiSq
245.651366       15        <.0001

Cross-Validation:    Number of Observations and Percent Classified into ill

From ill           1          2       Total
1                 15          0          15
              100.00       0.00      100.00
2                 15         15          30
               50.00      50.00      100.00
Total             30         15          45
               66.67      33.33      100.00

Exercise 12.8

These data are from an exercise that appeared in the original text by Green and Carroll, developed back
in the day when students were asked to do some of these calculations by hand. There is a typo in the
statement of the problem in the textbook: the data are in fact coded Y=1 for a vote in favor of the gun
control bill and Y=0 for a vote against it. The data are hypothetical. Since the data are not drawn from an actual legislative body,
we cannot say that they are representative of any true population of voters. We will assume that the
sample proportions reflect our priors about voting behavior.

A plot of the data (shown below) suggests that the dividing line between those for and those against the
gun control bill might be positively sloped: those voting in favor of the bill (Y=1) fall below this
positively sloped line.

Plot of age*guns$vote.    Symbol points to label.

age |
|
70 +
|
|
|
60 +                                                    > 0
|                                                              > 0
|                                          > 0
|                                > 0
50 +
|            > 0                 > 0
|                              0 2 1
|                      > 0
40 + > 1
| > 0        > 1
|
|
30 + 2 1
| > 1
|
|
20 +
|
---+---------+---------+---------+---------+---------+---------+--
0         1         2         3         4         5         6
guns

The results from applying Fisher's approach to discriminant analysis (given by PROC CANDISC in
SAS) are shown below. The result is statistically significant: Wilks's Λ = 0.468, with an F-test
p-value of 0.0105 (just above the 0.01 level).

                           Adjusted    Approximate      Squared
            Canonical     Canonical       Standard    Canonical
          Correlation   Correlation          Error   Correlation
   1         0.729673      0.718972       0.124965      0.532423

Test of H0: The canonical correlations in the
current row and all that follow are zero

Likelihood      Approximate
Ratio          F Value      Num DF      Den DF     Pr > F
1     0.46757711             6.83           2          12     0.0105

Fisher's discriminant function coefficients are proportional to the standardized canonical coefficients
shown below. These are the weights k used to form the discriminant function score t = k'x. Note that
the coefficient for age is positive and the coefficient for guns is negative. This is consistent with the
interpretation from the scatter plot above: if the line dividing the two groups of observations (i.e., the
locus of points equidistant from the two group centroids in Mahalanobis distance) is positively sloped,
then the Fisher discriminant function axis must be negatively sloped. By looking at the group means,
we can see that those voting against the legislation (i.e., vote = 0) have a positive mean discriminant
score, while those voting in favor (vote = 1) have a negative score. This suggests that, holding the
number of guns constant, the older the legislator, the
more likely he/she is to vote against the bill. Holding the age of the legislator constant, the greater the
number of guns owned, the more likely he/she is to vote in favor of the gun control bill.
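The coefficient vector behind these scores is, up to scale, w ∝ S⁻¹(x̄₁ − x̄₀), where S is the pooled within-group covariance matrix. A numpy sketch (illustrative function name and test data, not this exercise's data):

```python
import numpy as np

def fisher_direction(X0, X1):
    """Fisher discriminant direction: w proportional to
    inv(S_pooled) @ (mean(X1) - mean(X0))."""
    X0, X1 = np.asarray(X0, float), np.asarray(X1, float)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    S0 = np.cov(X0, rowvar=False) * (len(X0) - 1)   # within-group SSCP
    S1 = np.cov(X1, rowvar=False) * (len(X1) - 1)
    Sp = (S0 + S1) / (len(X0) + len(X1) - 2)        # pooled covariance
    return np.linalg.solve(Sp, m1 - m0)
</parameter>```

Because the pooled covariance whitens the within-group scatter, the direction need not point along the raw mean difference; correlated predictors can flip a coefficient's sign relative to its loading, exactly the age/guns pattern discussed here.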

For an interpretation of the discriminant function, we look at the loadings. Here, we see (due to the
positive correlation between age and number of guns owned) that both are positively correlated with
the discriminant function.

Total Canonical Structure
Variable                    Can1
age                     0.986597
guns                    0.818619

Total-Sample Standardized Canonical Coefficients
Variable              Can1
age            1.868967847
guns          -0.531004155

Class Means on Canonical Variables
vote                   Can1
0            0.811114481
1           -1.216671722

A test of the homogeneity of within-group covariance matrices does not reject equality across the two
groups (p = 0.75), so it is appropriate to pool the estimates and use a linear discriminant function.

Within Covariance Matrix Information
Natural Log of the
Covariance      Determinant of the
vote      Matrix Rank       Covariance Matrix
0                2                 3.81676
1                2                 3.27740
Pooled                2                 3.72112

Chi-Square         DF     Pr > ChiSq
1.193126          3         0.7547

PROC DISCRIM in SAS gives the coefficients of the Mahalanobis discriminant functions (i.e., the
distance calculation for group 1 and group 2). Note that age has a bigger impact on the score for those
voting against the gun control bill (coefficient equal to 2.51 for vote = 0) than for those voting in favor
(coefficient = 2.14 for vote = 1). Thus, holding all else constant, the greater the age of the legislator,
the greater the discriminant function score for vote = 0 (i.e., those voting against the bill).

Linear Discriminant Function

Constant_j = -0.5 Xbar_j' COV^(-1) Xbar_j + ln PRIOR_j        Coefficient Vector_j = COV^(-1) Xbar_j

Linear Discriminant Function for vote

Variable                 0             1
Constant         -50.04961     -35.72468
age                2.51012       2.13764
guns              -8.34471      -7.80113

These Mahalanobis distances can be used to calculate posterior probabilities of group membership.
This is done below using one-at-a-time holdout validation (which means each observation is classified
using the discriminant function coefficients calculated using only the data from the remaining n-1
observations).

Posterior Probability of Membership in vote

From    Classified
Obs        vote     into vote            0          1
1           1           1         0.0343     0.9657
2           1           1         0.2757     0.7243
3           1           0 *       0.8221     0.1779
4           1           1         0.0220     0.9780
5           1           1         0.0721     0.9279

6           1            0 *     0.9309     0.0691
7           0            0       0.8801     0.1199
8           0            0       0.9608     0.0392
9           0            0       0.9981     0.0019
10           0            0       0.9800     0.0200
11           0            1 *     0.2168     0.7832
12           0            0       0.9783     0.0217
13           0            1 *     0.4903     0.5097
14           0            0       0.5108     0.4892
15           0            0       0.8576     0.1424

* Misclassified observation
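For any single observation, these posteriors come from exponentiating the group discriminant scores. A sketch using the full-sample coefficients from the table above (the SAS table itself uses leave-one-out coefficients, so its values differ slightly; the age and guns values below are hypothetical):

```python
import math

# Full-sample linear discriminant coefficients for vote = 0 and vote = 1
# (from the "Linear Discriminant Function for vote" table above).
const = {0: -50.04961, 1: -35.72468}
b_age = {0: 2.51012, 1: 2.13764}
b_guns = {0: -8.34471, 1: -7.80113}

def posteriors(age, guns):
    """Posterior group probabilities from the two discriminant scores."""
    score = {j: const[j] + b_age[j] * age + b_guns[j] * guns for j in (0, 1)}
    m = max(score.values())                        # stabilize the exponentials
    num = {j: math.exp(score[j] - m) for j in (0, 1)}
    tot = sum(num.values())
    return {j: num[j] / tot for j in (0, 1)}

p = posteriors(age=60, guns=10)   # hypothetical legislator
print(p)  # the two posteriors sum to 1; here vote = 0 is the likelier class
```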

The "hits and misses" table below summarizes the results from the classification shown above. The hit
rate is 11/15 = 73.3 percent. While this seems an improvement over the proportional chance hit rate of
55.6 percent, the difference is not statistically significant.
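The significance claim can be checked with an exact binomial tail probability (a sketch; it treats the 15 holdout classifications as independent Bernoulli trials with the chance rate 0.556, which is an approximation):

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 11 or more hits out of 15 under a chance hit rate of 0.556
p_tail = binom_tail(11, 15, 0.556)
print(round(p_tail, 3))  # well above 0.05, so not significant
```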

Number of Observations and Percent Classified into vote

From vote                0              1        Total
0                7              2            9
77.78          22.22       100.00
1              2              4            6
33.33          66.67       100.00
Total               9              6           15
60.00          40.00       100.00
Priors             0.6            0.4

Exercise 12.12

This data set is used in the SYSTAT manual (Wilkinson) in the chapter on “Discriminant Analysis” by
Englemann. I do not know about the sampling scheme; it seems unlikely that these countries are a
representative sample of the countries of the world. For the purposes of this exercise, proportional
priors are assumed.

a) Are the differences across groups of countries significant? The differences are clearly significant.
Wilks’s Λ is 0.051 (p < 0.0001); in fact, each of the independent variables in the data set is by itself a
significant discriminator. (Note that the results below are based on including 11 of the independent
variables; the 12th, which is the ratio of the birth rate to the death rate, is left out because it is a
combination of information already present in the model.)

Because there are three groups of countries, there are two discriminant functions. We can test the
second discriminant function alone and we find that it is also significant: Wilks’s Λ = 0.402 (p <
0.0001).

                     Adjusted     Approximate         Squared
      Canonical     Canonical        Standard       Canonical
    Correlation   Correlation           Error     Correlation

1      0.933842      0.920284        0.017410        0.872061
2      0.773599      0.735910        0.054643        0.598456

Test of H0: The canonical correlations in the
current row and all that follow are zero

Likelihood     Approximate
Ratio         F Value      Num DF      Den DF      Pr > F
1     0.05137337           13.03          22          84      <.0001
2     0.40154436            6.41          10          43      <.0001

b) Interpret each discriminant function. For the purpose of interpretation, it is probably best to look
at the discriminant function loadings (called the “canonical structure” in PROC CANDISC in SAS).
The discriminant function coefficients, by contrast, are interpretable as partial regression coefficients;
i.e., they describe the impact of each independent variable holding all other variables in the model
constant.
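A loading is simply the Pearson correlation between a predictor and the discriminant scores, which can be sketched as follows (the data and discriminant coefficients here are hypothetical, not from the exercise):

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical predictors and (unstandardized) discriminant coefficients
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
scores = [0.8 * a - 0.2 * b for a, b in zip(x1, x2)]  # discriminant scores

loading_x1 = pearson(x1, scores)  # "canonical structure" entry for x1
loading_x2 = pearson(x2, scores)
print(round(loading_x1, 4), round(loading_x2, 4))
```

Note how x2 gets a clearly positive loading even though its coefficient is negative: loadings and coefficients answer different questions.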

As shown below, the first discriminant function is negatively correlated with birth rate and infant
mortality, and positively correlated with life expectancy, literacy, per capita GDP, and spending on
health and education. As shown by the group means, this discriminant function separates countries in
Europe (which score most highly) from New World and Islamic countries. The second discriminant
function is strongly positively correlated with death rate, and negatively correlated with literacy and
life expectancy. Unlike the first discriminant function, the second is negatively correlated with percent
of population living in cities. As shown by the group means, the second discriminant function serves
to separate the Islamic countries (which score the highest) from the New World countries.

Total Canonical Structure
Variable                Can1                  Can2
citypop             0.644525             -0.427281
birth              -0.912000              0.351380
death              -0.140933              0.752922
infdeath           -0.794393              0.453549
gdppc               0.860334              0.132150
educ                0.678699              0.124903
health              0.757760              0.127102
military            0.498877              0.368253
lifex_m             0.719396             -0.477567
lifex_f             0.779220             -0.465591
literacy            0.773787             -0.560221

Class Means on Canonical Variables
group                  Can1              Can2
Europe          3.381723396       0.411490370
Islamic        -2.719775968       1.462927569
NewWorld       -1.116957381      -1.417249074

Plot of Can2*Can1 by group (symbol points to label): the Europe countries cluster at high values of
Can1, the Islamic countries at high values of Can2, and the New World countries at low values of
Can2.

c) How well does linear discriminant analysis perform in correctly classifying countries by group?
To answer this question, we assume that it is appropriate to pool across groups in estimating the
within-group covariance matrix. (We address the question of whether this assumption is appropriate in
part d below.) We use the cross-validation option in SAS to assess predictive validity and calculate the
hit rate. The result is that 45 out of 55 countries (two have missing data) are correctly categorized, a hit
rate of 82 percent. This compares favorably to the proportional chance hit rate of (19/55)² + (15/55)² +
(21/55)² = 34 percent.
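The proportional chance criterion above is just the sum of squared group proportions:

```python
# Proportional chance criterion: sum of squared prior group proportions.
# Group sizes from the exercise: 19 Europe, 15 Islamic, 21 New World.
sizes = [19, 15, 21]
n = sum(sizes)
c_pro = sum((k / n) ** 2 for k in sizes)
print(round(c_pro, 4))  # ≈ 0.3395, i.e., about 34 percent
```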

Cross-Validation:     Number of Observations and Percent Classified into group

From                                              New
group            Europe       Islamic           World         Total

Europe                18               0            1            19
94.74            0.00         5.26        100.00
Islamic                0              12            3            15
0.00           80.00        20.00        100.00
NewWorld               2               4           15            21
9.52           19.05        71.43        100.00
Total                 20              16           19            55
36.36           29.09        34.55        100.00

d) Test the assumption that linear discriminant analysis is appropriate. This we do by using Box’s
test of the equality of covariance matrices across groups. Shown below are the log-determinants of
each group covariance matrix. Note that the log-determinant for the Islamic group is smaller than for
the others; the differences across groups turn out to be highly significant. This calls into question our
decision in part c to pool across groups in estimating the within-group covariance matrix.

The results from the quadratic discriminant analysis are also shown below. Clearly, the quadratic
analysis does not outperform the linear analysis in classification accuracy when using a cross-
validation approach. Part of the problem is that the procedure assigns too few countries to the Islamic
group and too many to the New World group. This is because the log-determinant of the within-group
covariance matrix is smallest for the Islamic group (and largest for the New World group).

Within Covariance Matrix Information
Natural Log of the
Covariance    Determinant of the
group       Matrix Rank     Covariance Matrix

Europe                 11                52.25294
Islamic                11                49.66450
NewWorld               11                55.19725
Pooled                 11                64.02932

Chi-Square            DF   Pr > ChiSq
412.706444           132       <.0001

Number of Observations and Percent Classified into group

From                                              New
group            Europe       Islamic           World         Total
Europe               17             0               2            19
89.47          0.00           10.53        100.00
Islamic               0             7               8            15
0.00         46.67           53.33        100.00
NewWorld              2             1              18            21
9.52          4.76           85.71        100.00
Total                19             8              28            55
34.55         14.55           50.91        100.00

Exercise 12.13

Marketers are often concerned with the accuracy of new product testing methods. Here is an example
data set in which each of 24 new products is tested using two methods: a concept test and a panel test.
Although the question is not posed explicitly in the exercise, it might be interesting to investigate

which of these two tests is more valuable in discriminating successful new products from failures. One
could also ask whether it is worth purchasing the second test in addition to the first (i.e., is the
improvement in discrimination worthwhile?). Of course, to answer these questions, one would need to
know something about the costs of the tests, the cost of product development, the profits from a
successful new product, and the cost of launching a failure.

The graph below shows that the combined information from the panel and concept tests clearly helps
discriminate between successes and failures.

Plot of p_score*c_score$success.     Symbol points to label.

p_score |
|
20 +
|
|                                        > 1
|                                                      > 1
|                              > 1
15 +         > 0
|                      > 1                   > 1      > 1
|                              > 1
|             > 0               > 1 > 0           > 1
|                  > 0              > 0     > 1
10 +                            > 1
|                 > 0      > 0          > 0
|     > 0
|                                 > 1       > 0
|                        > 0
5 +     > 0
|
---+------------+------------+------------+------------+--
20           40              60              80          100

c_score

A single canonical discriminant function based on both concept and panel scores proves to be a
significant discriminator between successes and failures: Wilks’s Λ = 0.562, significant at p < 0.01.
The canonical structure suggests that concept score and panel score load almost equally on the
discriminant function.

Multivariate Statistics and Exact F Statistics
Statistic                      Value F Value Num DF Den DF Pr > F
Wilks' Lambda             0.56156799     8.20       2     21 0.0023
Pillai's Trace            0.43843201     8.20       2     21 0.0023
Hotelling-Lawley Trace    0.78072828     8.20       2     21 0.0023
Roy's Greatest Root       0.78072828     8.20       2     21 0.0023

Total Canonical Structure

Variable                  Can1
c_score               0.849651
p_score               0.820445

Class Means on Canonical Variables
success              Can1
0      -.8459713895
1      0.8459713895

The performance of the discriminant function based on the two tests in correctly classifying successes
and failures is reasonably good. Due to the small sample size, there is some capitalization on chance:
using resubstitution, the hit rate is 20/24 = 83 percent; using cross-validation, the estimate of the hit
rate drops to 16/24 = 67 percent. There is, of course, some question about the appropriate priors for
such an analysis. The literature suggests that despite best efforts to the contrary, “most new products
fail.” There are also asymmetric costs of misclassification that should be taken into account when
trying to decide whether a new product should be classified as a potential success or failure.

Linear Discriminant Function

Variable                0               1
Constant         -9.18461       -17.57076
c_score           0.16171         0.23381
p_score           0.97063         1.33842
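A hedged sketch of how these coefficients classify a product (the test scores used below are hypothetical inputs, not observations from the data set):

```python
import math

# Linear discriminant functions from the table above (success = 0 and 1)
const = {0: -9.18461, 1: -17.57076}
b_c = {0: 0.16171, 1: 0.23381}
b_p = {0: 0.97063, 1: 1.33842}

def classify(c_score, p_score):
    """Return (predicted class, posterior P(success)) for one product."""
    score = {j: const[j] + b_c[j] * c_score + b_p[j] * p_score for j in (0, 1)}
    pred = max(score, key=score.get)
    p1 = 1.0 / (1.0 + math.exp(score[0] - score[1]))  # posterior of class 1
    return pred, p1

print(classify(80, 15))  # strong concept and panel scores -> success (1)
print(classify(30, 6))   # weak scores -> failure (0)
```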

Resubstitution Summary using Linear Discriminant Function
Number of Observations and Percent Classified into success

From success              0                1       Total
0             10                2          12
83.33            16.67      100.00
1             2               10          12
16.67            83.33      100.00
Total            12               12          24
50.00            50.00      100.00

Cross-validation Summary using Linear Discriminant Function
Number of Observations and Percent Classified into success

From success              0                1       Total
0              7                5          12
58.33            41.67      100.00
1             3                9          12
25.00            75.00      100.00
Total            10               14          24
41.67            58.33      100.00

Exercise 12.14

These data on vehicle ownership require multiple discriminant analysis because there are three
categories of ownership: car only, van only, and both car and van. We seek to discriminate among
these three categories of ownership using data on income, family size, and age of head of household.
The results of the canonical discriminant analysis show that there are indeed significant differences
among these groups: Wilks’s Λ = 0.556, which is significant at the 0.01 level.

In fact, we can form two discriminant functions, both of which are significant. As shown in the table
below, after removing the first discriminant function, Wilks’s Λ for the second is equal to 0.797, which
is significant at the 0.05 level. The discriminant function loadings suggest that the first function is
primarily associated with higher-income families with older heads of household. The second function
is primarily associated with large families that tend to be younger and to have lower incomes.

Multivariate Statistics and F Approximations

Statistic                         Value F Value Num DF Den DF Pr > F
Wilks' Lambda                0.55620611    3.52      6     62 0.0046
Pillai's Trace               0.50511605    3.60      6     64 0.0039
Hotelling-Lawley Trace       0.68764390    3.50      6 39.604 0.0071
Roy's Greatest Root          0.43305620    4.62      3     32 0.0085

                    Adjusted    Approximate        Squared
     Canonical     Canonical       Standard      Canonical
   Correlation   Correlation          Error    Correlation
1     0.549719      0.449059       0.117951       0.302191
2     0.450472      .              0.134730       0.202925

Test of H0: The canonical correlations in the
current row and all that follow are zero

Likelihood     Approximate
Ratio         F Value     Num DF      Den DF     Pr > F
1     0.55620611            3.52          6          62     0.0046
2     0.79707461            4.07          2          32     0.0265

Total Canonical Structure
Variable                Can1              Can2
income              0.771759         -0.631940
fam_size            0.362713          0.839026
age_hh              0.607869         -0.430959

A plot of the observations in the discriminant function space shows that the first discriminant function
is separating families who own both car and van from families who own only a car or only a van.
Thus, it appears that older, wealthier families are more likely to own both types of vehicle. The second
discriminant function is separating families who own only a van (i.e., the larger, younger, lower
income families).

Plot of Can2*Can1 by own (symbol points to label): families owning both a car and a van (2) lie at
higher values of Can1, van-only families (1) at higher values of Can2, and car-only families (0) at
lower values of both.

Class Means on Canonical Variables
own              Can1               Can2
0      -.5276820278      -.4991924912
1      -.3413484420      0.6796258485
2      0.8845582686      -.0821984957

The two discriminant functions do a reasonable job of accurately classifying households into
ownership categories. Based on cross-validation, the estimated hit rate is 21/36 = 58.3 percent, which
compares favorably to the proportional chance hit rate of 33.5 percent.

Linear Discriminant Function

Constant_j = -0.5 X̄_j' COV⁻¹ X̄_j + ln PRIOR_j          Coefficient Vector_j = COV⁻¹ X̄_j

Variable                 0               1                  2
Constant         -25.98898       -27.44166          -36.75413
income             0.61400         0.53149            0.67633
fam_size           3.01115         3.74437            3.90843
age_hh             0.18752         0.23891            0.26387

Cross-validation Summary using Linear Discriminant Function
Number of Observations and Percent Classified into own

From own                 0             1              2          Total
0                 7             3              3             13
53.85         23.08          23.08         100.00
1                3             7              1             11
27.27         63.64           9.09         100.00
2                2             3              7             12
16.67         25.00          58.33         100.00
Total                12            13             11             36
33.33         36.11          30.56         100.00

If we test the homogeneity of within-group covariance matrices across groups, the result is marginally
significant: Box’s chi-square test gives χ²(12) = 21.4, which is significant at the 0.05 level but not at
the 0.01 level.

Within Covariance Matrix Information

Natural Log of the
Covariance         Determinant of the
own     Matrix Rank          Covariance Matrix
0               3                    9.40027
1               3                    8.10301
2               3                    8.79143
Pooled               3                    9.54972

Test of Homogeneity of Within Covariance Matrices
Chi-Square        DF    Pr > ChiSq
21.351007        12        0.0455

Since the Chi-Square value is significant at the 0.1 level, the
within covariance matrices will be used in the discriminant
function.

If we run a quadratic discriminant analysis, we find that the fitted classification accuracy (based on
resubstitution) improves over the linear analysis, but the predictive accuracy actually declines to a hit
rate of 16/36 = 44.4 percent. Thus, even though the test suggests that there are differences across
groups in their covariance structure, we are better off (from a classification accuracy standpoint)
pooling our estimates across groups and using linear (rather than quadratic) discriminant analysis.

Resubstitution Summary using Quadratic Discriminant Function
Number of Observations and Percent Classified into own

From own               0             1              2         Total
0               9             3              1            13
69.23         23.08           7.69        100.00
1              2             8              1            11
18.18         72.73           9.09        100.00
2              2             2              8            12
16.67         16.67          66.67        100.00
Total             13            13             10            36
36.11         36.11          27.78        100.00

Cross-validation Summary using Quadratic Discriminant Function
Number of Observations and Percent Classified into own

From own               0             1              2         Total
0               5             5              3            13
38.46         38.46          23.08        100.00
1              5             4              2            11
45.45         36.36          18.18        100.00
2              2             3              7            12
16.67         25.00          58.33        100.00
Total             12            12             12            36
33.33         33.33          33.33        100.00

Exercise 12.15

Once again, we revisit the famous iris data. The data set comprises 50 specimens of each of three
species of iris: Iris setosa (1), Iris versicolor (2), and Iris virginica (3). Four characteristics of each
specimen are measured: sepal length, sepal width, petal length, and petal width.

The results from a multiple discriminant analysis indicate highly significant differences across species
of iris: Wilks’s Λ = 0.02, which is significant at p < 0.0001. In fact, two discriminant functions are
significant; for the second function alone, Wilks’s Λ = 0.78, also significant at p < 0.0001. However,
while both are significant, the first is by far the more important, accounting for much more of the
variance in the data. The first function, which is highly correlated with all measurements but sepal
width, clearly separates Iris setosa (1) from Iris versicolor (2) and Iris virginica (3): the difference in
means is quite large. By contrast, the second function (which is correlated with sepal width) provides a
subtler distinction between (2) and (3): the difference in group means is about 1.2 (versus 4.0 on the
first function).

                     Adjusted     Approximate         Squared
      Canonical     Canonical        Standard       Canonical
    Correlation   Correlation           Error     Correlation
1      0.984821      0.984508        0.002468        0.969872
2      0.471197      0.461445        0.063734        0.222027

Test of H0: The canonical correlations in the
current row and all that follow are zero

Likelihood      Approximate
Ratio          F Value       Num DF      Den DF     Pr > F
1     0.02343863           199.15            8         288     <.0001
2     0.77797337            13.79            3         145     <.0001

Total Canonical Structure
Variable                Can1                   Can2
sep_l               0.791888               0.217593
sep_w              -0.530759               0.757989
pet_l               0.984951               0.046037
pet_w               0.972812               0.222902

Class Means on Canonical Variables
species              Can1               Can2
1      -7.607599927       0.215133017
2       1.825049490      -0.727899622
3       5.782550437       0.512766605

Plot of Can2*Can1 by species (symbol points to label): Iris setosa (1) forms a well-separated cluster at
low values of Can1, while Iris versicolor (2) and Iris virginica (3) lie at higher values of Can1 and
overlap somewhat, with (3) scoring higher than (2) on both functions.

According to Box’s test, we reject the homogeneity of covariance matrices across groups. However,
unlike previous examples in this chapter, the classification accuracy of the quadratic discriminant
functions (hit rate of 97 percent based on holdout validation) is closely comparable to that of the linear
discriminant functions (hit rate of 98 percent based on holdout validation).

Linear Discriminant Function for species

Variable              1               2             3
Constant      -86.30847       -72.85261    -104.36832
sep_l          23.54417        15.69821      12.44585
sep_w          23.58787         7.07251       3.68528
pet_l         -16.43064         5.21145      12.76654
pet_w         -17.39841         6.43423      21.07911
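These coefficients can be applied directly to a flower's four measurements (a sketch; the measurements below are hypothetical setosa-like and virginica-like values, not rows of the data set):

```python
# Linear discriminant functions for the three species (table above).
const = {1: -86.30847, 2: -72.85261, 3: -104.36832}
coef = {1: [23.54417, 23.58787, -16.43064, -17.39841],
        2: [15.69821, 7.07251, 5.21145, 6.43423],
        3: [12.44585, 3.68528, 12.76654, 21.07911]}

def classify_iris(x):
    """x = [sep_l, sep_w, pet_l, pet_w]; return species with highest score."""
    score = {j: const[j] + sum(b * v for b, v in zip(coef[j], x))
             for j in (1, 2, 3)}
    return max(score, key=score.get)

# Hypothetical setosa-like flower: short petals relative to sepals (cm)
print(classify_iris([5.0, 3.4, 1.5, 0.2]))  # -> 1 (Iris setosa)
# Hypothetical virginica-like flower: long, wide petals
print(classify_iris([6.5, 3.0, 5.5, 2.0]))  # -> 3 (Iris virginica)
```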

Cross-Validation:    Number of Observations and Percent Classified into species

From species             1             2            3         Total
1            50             0            0            50
100.00          0.00         0.00        100.00
2              0            48            2            50
0.00         96.00         4.00        100.00
3              0             1           49            50
0.00          2.00        98.00        100.00

Total             50           49           51            150
33.33        32.67        34.00         100.00

Within Covariance Matrix Information
Natural Log of the
Covariance    Determinant of the
species    Matrix Rank     Covariance Matrix
1              4             -13.06736
2              4             -10.87433
3              4              -8.92706
Pooled              4              -9.95854

Test of Homogeneity of Within Covariance Matrices
Chi-Square        DF    Pr > ChiSq
140.943050        20        <.0001

Cross-validation:    Number of Observations and Percent Classified into species
From species              1            2            3        Total
1             50            0            0           50
100.00         0.00         0.00       100.00
2              0           47            3           50
0.00        94.00         6.00       100.00
3              0            1           49           50
0.00         2.00        98.00       100.00
Total             50           48           52          150
33.33        32.00        34.67       100.00

13. Logit Choice Models

Exercise 13.1

a.    We first calibrate a logit choice model conditional on making a purchase in the category on
the trip. That is, we only consider trips with a purchase in the category. The model is

S_1t = 1 / (1 + exp(-(α + β(price_1t - price_2t))))

Model Fit Statistics
Intercept
Intercept          and
Criterion           Only         Covariates
AIC                1718.496        1311.916
SC                 1723.621        1322.167
-2 Log L           1716.496        1307.916

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square       DF     Pr > ChiSq
Likelihood Ratio       408.5798        1         <.0001
Score                  382.6007        1         <.0001
Wald                   329.0601        1         <.0001

Analysis of Maximum Likelihood Estimates
Standard          Wald
Parameter       DF    Estimate       Error    Chi-Square                  Pr > ChiSq
Intercept        1     -0.2596      0.0693       14.0435                      0.0002
Coke_rp          1     -1.2234      0.0674      329.0601                      <.0001

b.    For the conditional logit choice model calibrated in part a, the formula for the own-price
elasticity for Coke is

η_1t = β̂ (1 - S_1t) price_1t

The estimated elasticity for each week is listed in the table below (eta).

Obs      week     Coke_p         eta         Pep_p          cv              eta2
1        1       4.33       -2.99052       4.33        -4.72561         -5.03855
2        2       4.33       -2.99052       4.33        -4.72561         -5.03855
3        3       4.33       -2.99052       4.33        -4.72561         -5.03855
4        4       3.24       -1.00941       4.33        -3.92952         -3.40798
5        5       3.24       -1.00941       4.33        -3.92952         -3.40798
6        6       4.33       -4.40235       3.24        -3.77878         -5.11123
7        7       4.33       -4.40235       3.24        -3.77878         -5.11123
8        8       3.24       -1.00941       4.33        -3.92952         -3.40798

9          9      3.24         -1.00941    4.33        -3.92952     -3.40798
10         10      3.24         -1.00941    4.33        -3.92952     -3.40798
11         11      4.33         -4.38400    3.26        -3.79908     -5.10995
12         12      4.33         -4.38400    3.26        -3.79908     -5.10995
13         13      3.24         -1.00941    4.33        -3.92952     -3.40798
14         14      4.33         -4.40235    3.24        -3.77878     -5.11123
15         15      4.33         -4.40235    3.24        -3.77878     -5.11123
16         16      3.24         -1.00941    4.33        -3.92952     -3.40798
17         17      3.24         -1.00941    4.33        -3.92952     -3.40798
18         18      4.33         -4.39321    3.25        -3.78894     -5.11059
19         19      3.15         -0.90303    4.33        -3.84632     -3.26703
20         20      4.33         -4.30764    3.34        -3.87938     -5.10482
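The week-1 entries above can be reproduced from the estimates in part a (a sketch using the reported α = -0.2596 and β = -1.2234; small rounding differences are expected):

```python
import math

alpha, beta = -0.2596, -1.2234   # estimates from the logistic regression above

def coke_share(p1, p2):
    """Conditional choice probability S1t for Coke given category purchase."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * (p1 - p2))))

def own_elasticity(p1, p2):
    """Conditional own-price elasticity: beta * (1 - S1t) * price1t."""
    s1 = coke_share(p1, p2)
    return beta * (1.0 - s1) * p1

# Week 1: both brands priced at 4.33 (tabled eta = -2.99052)
print(round(own_elasticity(4.33, 4.33), 5))
# Week 4: Coke at 3.24, Pepsi at 4.33 (tabled eta = -1.00941)
print(round(own_elasticity(3.24, 4.33), 5))
```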

c. We build a model that takes into account both the purchase incidence probability and the
conditional brand choice. That is:

P(coke) = P(coke | c) × P(c)
        = [1 / (1 + exp(-(α + β(price_1t - price_2t))))] × [1 / (1 + exp(-(a + γ cv)))]

where cv = ln(exp(α + β price_1t) + exp(β price_2t)).

After some calculation, we find that the derivative of the purchase probability with respect to price is:

∂P(coke)/∂price_1t = [∂P(coke | c)/∂price_1t] P(c) + P(coke | c) [∂P(c)/∂price_1t]
                   = β P(coke | c)(1 - P(coke | c)) P(c) + γ β P(coke | c)² P(c)(1 - P(c))
                   = β P(c) P(coke | c)[1 - P(coke | c) + γ (1 - P(c)) P(coke | c)]

The elasticity for this case is then η_2t = [∂P(coke)/∂price_1t] × price_1t / P(coke). Since we now
also take into account the category purchase probability, we get a larger price elasticity in magnitude
(eta is the conditional price elasticity, eta2 is the unconditional price elasticity).
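The incidence-model parameters a and γ are not reported in this excerpt, but the inclusive value cv in the table of part b can be reproduced from α and β alone (a sketch; small rounding differences are expected):

```python
import math

alpha, beta = -0.2596, -1.2234   # brand-choice estimates from part a

def inclusive_value(p1, p2):
    """cv = ln(exp(alpha + beta*price1) + exp(beta*price2))."""
    return math.log(math.exp(alpha + beta * p1) + math.exp(beta * p2))

# Week 1: both brands at 4.33 (tabled cv = -4.72561)
print(round(inclusive_value(4.33, 4.33), 5))
# Week 6: Coke at 4.33, Pepsi at 3.24 (tabled cv = -3.77878)
print(round(inclusive_value(4.33, 3.24), 5))
```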

Exercise 13.2

a. We estimate a simple binary logit model with age, income and mobility as explanatory variables.

Model Fit Statistics
Intercept
Intercept           and
Criterion             Only          Covariates
AIC                   139.628           114.107
SC                    142.233           124.528
-2 Log L              137.628           106.107

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square       DF     Pr > ChiSq
Likelihood Ratio        31.5207        3         <.0001
Score                   27.9401        3         <.0001
Wald                    21.5713        3         <.0001

Analysis of Maximum Likelihood Estimates
Standard                 Wald
Parameter    DF    Estimate       Error           Chi-Square    Pr > ChiSq
Intercept     1      0.2810      0.4312               0.4246        0.5147
age           1      1.0691      0.5550               3.7105        0.0541
income        1     -2.2221      0.5430              16.7481        <.0001
mobility      1     -0.9101      0.5218               3.0420        0.0811

As we can see, the model is jointly significant. Age and mobility are significant at the 10%
level but not at the 5% level; income is highly significant.
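For intuition about what the estimation behind this output is doing, here is a minimal Newton-Raphson fit of a binary logit with an intercept and a single covariate, on made-up toy data (not the exercise data):

```python
import math

def fit_logit_1var(x, y, iters=30):
    """Binary logit with intercept and one covariate, fit by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # Fisher information entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01   # invert the 2x2 information matrix
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data (hypothetical, NOT the exercise data): outcome more likely at larger x
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logit_1var(x, y)
```

At convergence the score equations are satisfied, so the fitted probabilities average out to the observed outcome frequencies, which is the defining property of the logit MLE.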

b. We perform a stepwise model selection among models with interaction terms. As we can see
from the output, none of the terms remains significant when all the interaction terms are added.

Analysis of Maximum Likelihood Estimates

Standard          Wald
Parameter           DF    Estimate        Error    Chi-Square    Pr > ChiSq
Intercept            1      0.2777       0.4705        0.3483        0.5551
age                  1      0.7576       0.6791        1.2444        0.2646
income               1     -1.7300       1.2267        1.9891        0.1584
mobility             1     -0.7703       1.3378        0.3316        0.5647
age*income           1      0.3109       1.3886        0.0501        0.8228
age*mobility         1      0.4076       1.4206        0.0823        0.7742
income*mobility      1     -1.4765       1.1666        1.6019        0.2056

Our stepwise selection chooses the model with age, income and mobility as explanatory
variables. There do not appear to be any significant interaction effects; after adding all the
interaction terms, even the income effect is no longer significant.

Exercise 13.3

a. As we can see from the estimation results, after controlling for gun ownership there is no
statistically significant relationship between age and voting behavior: the age coefficient is
negative but not statistically significant.

Analysis of Maximum Likelihood Estimates
Standard                Wald

Parameter      DF     Estimate        Error      Chi-Square      Pr > ChiSq
Intercept       1      17.0683      11.6918          2.1312          0.1443
age             1      -0.4372       0.3081          2.0134          0.1559
gun_own         1       0.5301       0.9835          0.2905          0.5899

b. With our logit model, we set the following rule: if the predicted probability of a subject
voting for gun control is greater than 0.5, we classify the subject as voting = 1. Given this
rule, we perform a cross-validation. For the original data, we correctly predict 12 of the 15
subjects' behavior. Using simple linear discriminant analysis for classification instead, the
hit rate is the same.
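The classification rule takes only a few lines. The predicted probabilities and outcomes below are hypothetical stand-ins (chosen only to reproduce a 12-of-15 hit rate), not the actual subjects:

```python
# Hypothetical predicted probabilities and observed outcomes for 15 subjects;
# the real values would come from the fitted logit model in part (a).
p_hat = [0.91, 0.12, 0.78, 0.45, 0.88, 0.30, 0.66, 0.09,
         0.72, 0.55, 0.21, 0.83, 0.40, 0.95, 0.18]
y_obs = [1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]

# Classify as voting = 1 when the predicted probability exceeds 0.5
y_pred = [1 if p > 0.5 else 0 for p in p_hat]
hits = sum(yp == yo for yp, yo in zip(y_pred, y_obs))
print(f"hit rate: {hits}/{len(y_obs)}")   # prints: hit rate: 12/15
```

The misclassified cases are exactly those whose predicted probability falls on the wrong side of the 0.5 cutoff relative to their observed vote.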

Exercise 13.5

a. We estimate logit models with brand dummies only, and with price and display added as
covariates. The output is as follows:

Model Fit Statistics
Without           With
Criterion      Covariates     Covariates
-2 LOG L           87.889         87.542
AIC                87.889         91.542
SBC                87.889         94.920

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square       DF     Pr > ChiSq
Likelihood Ratio         0.3466        2         0.8409
Score                    0.3500        2         0.8395
Wald                     0.3489        2         0.8399

Analysis of Maximum Likelihood Estimates
                Parameter      Standard                                Hazard
Variable   DF    Estimate         Error    Chi-Square    Pr > ChiSq     Ratio
dummya      1      0.14310     0.37893      0.1426      0.7057           1.154
dummyb      1     -0.08005     0.40032      0.0400      0.8415           0.923

Model Fit Statistics
Without           With
Criterion      Covariates     Covariates
-2 LOG L           87.889         68.241
AIC                87.889         76.241
SBC                87.889         82.997

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square       DF     Pr > ChiSq
Likelihood Ratio        19.6475        4         0.0006
Score                   19.0897        4         0.0008
Wald                    14.1197        4         0.0069

Analysis of Maximum Likelihood Estimates

Parameter       Standard                                   Hazard
Variable   DF    Estimate          Error    Chi-Square     Pr > ChiSq       Ratio
dummya      1     2.98015        0.95903        9.6564         0.0019      19.691
dummyb      1     0.91253        0.56518        2.6069         0.1064       2.491
price       1    -5.49366        1.70215       10.4166         0.0012       0.004
disp        1     0.72674        0.68873        1.1134         0.2913       2.068

The information added by price and display can be summarized with the following statistic:

  ρ² = 1 - LLf/LL0 = 1 - 68.241/87.889 = 0.224

where LL0 and LLf denote the -2 Log L values for the no-covariates model and the full model.
That is, about 22% of the uncertainty in choice in the no-covariates model is explained by the
full model with brand, price and display covariates. This is not a bad fit.

To test whether price and display are significant in explaining the household's choice behavior,
we use the likelihood ratio test; that is, χ²(2) = 87.542 - 68.241 = 19.301, which is
significant at the 0.05 level. This is consistent with what we found from the ρ² statistic.
Looking at price and display separately, the t-statistic for price is highly significant while
the t-statistic for display is not; the χ² test, however, says that they are jointly significant.
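Both the ρ² index and the likelihood ratio statistic follow directly from the -2 Log L values reported above; a minimal sketch:

```python
# -2 Log L values from the output above
neg2ll_0 = 87.889   # no covariates
neg2ll_r = 87.542   # brand dummies only
neg2ll_f = 68.241   # brand dummies + price + display

rho2 = 1 - neg2ll_f / neg2ll_0   # likelihood-ratio index against the null model
lr = neg2ll_r - neg2ll_f         # LR statistic for price and display jointly, df = 2
crit_95 = 5.991                  # chi-square(2) critical value at the 0.05 level

print(round(rho2, 3), round(lr, 3), lr > crit_95)   # prints: 0.224 19.301 True
```

Note the two statistics use different baselines: ρ² compares the full model to the no-covariates null, while the LR test isolates the contribution of price and display beyond the brand dummies.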

b. The probability of each brand being chosen, given price = [1.50, 0.50, 0.80] and disp = 0, is:

  S_A = exp(a + β*price_A) / [exp(a + β*price_A) + exp(b + β*price_B) + exp(β*price_C)] = 0.089571
  S_B = exp(b + β*price_B) / [exp(a + β*price_A) + exp(b + β*price_B) + exp(β*price_C)] = 0.69761
  S_C = exp(β*price_C) / [exp(a + β*price_A) + exp(b + β*price_B) + exp(β*price_C)] = 0.21282

When the price of A changes from 1.50 to 1.20, the probabilities that each brand will be chosen
become (0.33832, 0.50701, 0.15467).
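The share calculation can be sketched in a few lines. This uses the coefficient estimates from the output above (dummya, dummyb, price) and checks structural properties of the result; it illustrates the mechanics rather than reproducing the exact figures reported:

```python
import math

def logit_shares(utilities):
    """Multinomial logit choice probabilities from deterministic utilities."""
    m = max(utilities)                        # subtract the max for numerical stability
    e = [math.exp(u - m) for u in utilities]
    total = sum(e)
    return [v / total for v in e]

def shares(prices, a, b, b_price):
    # u_A = a + beta*price_A, u_B = b + beta*price_B, u_C = beta*price_C (C normalized)
    u = [a + b_price * prices[0],
         b + b_price * prices[1],
         b_price * prices[2]]
    return logit_shares(u)

base = shares([1.50, 0.50, 0.80], 2.98015, 0.91253, -5.49366)
cut = shares([1.20, 0.50, 0.80], 2.98015, 0.91253, -5.49366)
```

As in the text, cutting brand A's price raises S_A and pulls share away from B and C, with the shares always summing to one.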

Exercise 13.6

a. A simple logistic regression shows that the order of the samples does not really matter for the rating:

Model Fit Statistics
Intercept
Intercept         and
Criterion            Only        Covariates
AIC                  278.759        278.328
SC                   282.057        284.924
-2 Log L             276.759        274.328

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square       DF     Pr > ChiSq
Likelihood Ratio         2.4310        1         0.1190
Score                    2.4261        1         0.1193
Wald                     2.4161        1         0.1201

Analysis of Maximum Likelihood Estimates
Standard          Wald
Parameter     DF    Estimate       Error    Chi-Square          Pr > ChiSq
Intercept      1      0.1201      0.2004        0.3596              0.5487
ord            1     -0.4429      0.2849        2.4161              0.1201

The coefficient for the order-of-sample factor is not significant, and the model as a whole is
not significant either. The probability modeled is for the choice of 45.

b. Adding the attribute ratings to the model does not change the overall preference result.

Model Fit Statistics
Intercept
Intercept         and
Criterion            Only        Covariates
AIC                  278.759        275.712
SC                   282.057        321.888
-2 Log L             276.759        247.712

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square       DF     Pr > ChiSq
Likelihood Ratio        29.0471       13         0.0064
Score                   26.6749       13         0.0138
Wald                    23.0145       13         0.0415

Standard          Wald
Parameter     DF      Estimate         Error    Chi-Square      Pr > ChiSq
Intercept      1        1.3350        1.7544        0.5791          0.4467
ord            1       -0.4758        0.3151        2.2808          0.1310
rate271        1       -0.3982        0.1656        5.7805          0.0162
rate272        1        0.4840        0.1806        7.1846          0.0074
rate273        1       -0.6150        0.3300        3.4732          0.0624
rate274        1        0.3954        0.2536        2.4311          0.1189
rate275        1       -0.2127        0.2019        1.1095          0.2922
rate276        1        0.0970        0.2465        0.1549          0.6939
rate451        1       -0.2833        0.1561        3.2938          0.0695
rate452        1       -0.1333        0.2012        0.4389          0.5077
rate453        1        0.1378        0.2857        0.2326          0.6296

rate454         1      -0.3989        0.2107          3.5860          0.0583
rate455         1       0.4548        0.1954          5.4188          0.0199
rate456         1       0.2289        0.2299          0.9911          0.3195

c. To see the contribution of the attribute ratings to preference formation, we use the
following statistic:

  ρ² = 1 - LLf/LLr = 1 - 247.712/274.328 = 0.097

where LLr and LLf denote the -2 Log L values for the order-only model and for the model with
the attribute ratings added. This statistic also says that adding the attribute ratings
contributes relatively little to the model.

```