# Proactive Monte Carlo Analysis in Structural Equation Modeling

James H. Steiger
Vanderbilt University
## Some Unhappy Scenarios

- A Confirmatory Factor Analysis
  - You fit a 3-factor model to 9 variables with N = 150
  - You obtain a Heywood case
- Comparing Two Correlation Matrices
  - You wish to test whether two population matrices are equivalent, using ML estimation
  - You obtain an unexpected rejection
- Fitting a Trait-State Model
  - You fit the Kenny-Zautra TSE model to 4 waves of panel data with N = 200, and obtain a variance estimate of zero
- Writing a Program Manual
  - You include an example analysis in your widely distributed computer manual
  - The analysis remains in your manuals for more than a decade
  - The analysis is fundamentally flawed, and gives incorrect results
## Some Common Elements

- Models of covariance or correlation structure
- Potential problems could have been identified before data were ever gathered, using "proactive Monte Carlo analysis"
## Confirmatory Factor Analysis

| Variable | Factor 1 | Factor 2 | Factor 3 |
|----------|----------|----------|----------|
| VIS_PERC | X        |          |          |
| CUBES    | X        |          |          |
| LOZENGES | X        |          |          |
| PAR_COMP |          | X        |          |
| SEN_COMP |          | X        |          |
| WRD_MNG  |          | X        |          |
| CNT_DOT  |          |          | X        |
| ST_CURVE |          |          | X        |
## Confirmatory Factor Analysis

| Variable | Factor 1 | Factor 2 | Factor 3 | Unique Var. |
|----------|----------|----------|----------|-------------|
| VIS_PERC | 0.46     |          |          | 0.79        |
| CUBES    | 0.65     |          |          | 0.58        |
| LOZENGES | 0.25     |          |          | 0.94        |
| PAR_COMP |          | 1.00     |          | 0.00        |
| SEN_COMP |          | 0.41     |          | 0.84        |
| WRD_MNG  |          | 0.22     |          | 0.95        |
| CNT_DOT  |          |          | 1.00     | 0.00        |
| ST_CURVE |          |          | 0.30     | 0.91        |
## Confirmatory Factor Analysis

| Variable | Factor 1 | Factor 2 | Factor 3 | Unique Var. |
|----------|----------|----------|----------|-------------|
| VIS_PERC | 0.60     |          |          | 0.64        |
| CUBES    | 0.60     |          |          | 0.64        |
| LOZENGES | 0.60     |          |          | 0.64        |
| PAR_COMP |          | 0.60     |          | 0.64        |
| SEN_COMP |          | 0.60     |          | 0.64        |
| WRD_MNG  |          | 0.60     |          | 0.64        |
| CNT_DOT  |          |          | 0.60     | 0.64        |
| ST_CURVE |          |          | 0.60     | 0.64        |
## Proactive Monte Carlo Analysis

- Take the model you anticipate fitting
- Insert reasonable parameter values
- Generate a population covariance or correlation matrix, and fit this matrix to assess identification problems
- Examine Monte Carlo performance over the range of sample sizes you are considering
- Assess convergence problems, frequency of improper estimates, Type I error, and accuracy of fit indices
- Preliminary investigations may take only a few hours
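The steps above can be sketched end to end for the all-0.60 CFA shown earlier. The sketch below is illustrative, not the slide's actual simulation: it assumes orthogonal factors, exploits the fact that a one-factor model on three indicators is exactly identified (so loadings can be solved directly from the sample correlations), and counts a Heywood case whenever an implied squared loading reaches 1 (unique variance ≤ 0).

```python
import numpy as np

rng = np.random.default_rng(0)

# Population: 3 orthogonal factors (an assumption of this sketch),
# 3 indicators each, all loadings 0.60, unique variances 0.64.
lam = 0.60
L = np.zeros((9, 3))
for f in range(3):
    L[3 * f:3 * f + 3, f] = lam
Sigma = L @ L.T + np.diag(np.full(9, 1 - lam ** 2))  # implied correlation matrix

def heywood_in_triad(R):
    """One factor with three indicators is exactly identified:
    lambda_1^2 = r12*r13/r23 (and cyclically).  A Heywood case is
    any implied squared loading >= 1."""
    r12, r13, r23 = R[0, 1], R[0, 2], R[1, 2]
    return any(s >= 1 for s in (r12 * r13 / r23,
                                r12 * r23 / r13,
                                r13 * r23 / r12))

def heywood_rate(n, reps=500):
    """Fraction of Monte Carlo samples of size n with at least one Heywood case."""
    hits = 0
    for _ in range(reps):
        X = rng.multivariate_normal(np.zeros(9), Sigma, size=n)
        R = np.corrcoef(X, rowvar=False)
        hits += any(heywood_in_triad(R[np.ix_(t, t)])
                    for t in ([0, 1, 2], [3, 4, 5], [6, 7, 8]))
    return hits / reps

for n in (75, 150, 500):
    print(n, heywood_rate(n))  # Heywood rate by sample size
```

With strong loadings, the rate falls off quickly as N grows; the slide's table shows how much worse things get under weaker loading conditions.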
## Confirmatory Factor Analysis

Model setup in path syntax, with population values in braces:

```
(Speed)-1{.3}->[VIS_PERC]
(Speed)-2{.4}->[CUBES]
(Speed)-3{.5}->[LOZENGES]

(Verbal)-4{.6}->[PAR_COMP]
(Verbal)-5{.3}->[SEN_COMP]
(Verbal)-6{.4}->[WRD_MNG]

(Visual)-8{.6}->[CNT_DOT]
(Visual)-9{.3}->[ST_CURVE]
```
## Confirmatory Factor Analysis

Model setup with an alternative set of population loadings:

```
(Speed)-1{.53}->[VIS_PERC]
(Speed)-2{.54}->[CUBES]
(Speed)-3{.55}->[LOZENGES]

(Verbal)-4{.6}->[PAR_COMP]
(Verbal)-5{.3}->[SEN_COMP]
(Verbal)-6{.4}->[WRD_MNG]

(Visual)-8{.6}->[CNT_DOT]
(Visual)-9{.3}->[ST_CURVE]
```
## Confirmatory Factor Analysis

| Variable | Factor 1 | Factor 2 | Factor 3 | Unique Var. |
|----------|----------|----------|----------|-------------|
| VIS_PERC | 0.60     |          |          | 0.64        |
| CUBES    | 0.60     |          |          | 0.64        |
| LOZENGES | 0.60     |          |          | 0.64        |
| PAR_COMP |          | 0.60     |          | 0.64        |
| SEN_COMP |          | 0.60     |          | 0.64        |
| WRD_MNG  |          | 0.60     |          | 0.64        |
| CNT_DOT  |          |          | 0.60     | 0.64        |
| ST_CURVE |          |          | 0.60     | 0.64        |
## Proactive Monte Carlo Analysis: Percentage of Heywood Cases

| N   | Condition 1 | Condition 2 | Condition 3 |
|-----|-------------|-------------|-------------|
| 75  | 80%         | 30%         | 0%          |
| 100 | 78%         | 11%         | 0%          |
| 150 | 62%         | 3%          | 0%          |
| 300 | 21%         | 0%          | 0%          |
| 500 | 1%          | 0%          | 0%          |
## Standard Errors

## Distribution of Estimates

[Figure: histogram of estimates for Parameter 1 (PAR_1) at N = 72; x-axis from about −0.04 to 1.00, y-axis "No of obs".]
## Standard Errors (N = 300)

## Distribution of Estimates

[Figure: histogram of estimates for Parameter 1 (PAR_1) at N = 300; x-axis from about 0.38 to 0.85, y-axis "No of obs".]
## Correlational Pattern Hypotheses

- A "pattern hypothesis" is a statistical hypothesis specifying that parameters or groups of parameters are equal to each other, and/or to specified numerical values
- Pattern hypotheses are only about equality, so they are invariant under nonlinear monotonic transformations (e.g., the Fisher transform)
- Caution! You cannot use the Fisher transform to construct confidence intervals for differences of correlations
  - For an example of this error, see Glass and Stanley (1970, pp. 311–312)
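A quick numerical check (values chosen arbitrarily) of why the Fisher transform cannot be carried through a difference: atanh is nonlinear, so back-transforming a difference of z values does not recover the difference of the correlations.

```python
import math

r1, r2 = 0.8, 0.3
z1, z2 = math.atanh(r1), math.atanh(r2)   # Fisher transform z = atanh(r)

# Back-transforming the difference of z's is NOT the difference of r's:
print(math.tanh(z1 - z2))   # about 0.66
print(r1 - r2)              # 0.5
```

Equality hypotheses survive the transform (r1 = r2 iff z1 = z2), which is why pattern hypotheses are safe; interval estimates for r1 − r2 are not.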
## Comparing Two Correlation Matrices in Two Independent Samples

- Jennrich (1970)
  - Method of Maximum Likelihood (ML)
  - Method of Generalized Least Squares (GLS)
  - Example: two 11 × 11 matrices, sample sizes of 40 and 89
- ML Approach
  - Model each group's matrix as a diagonally rescaled common correlation matrix: Σ1 = D1 R D1, Σ2 = D2 R D2, with D1, D2 diagonal
  - Minimizes the ML discrepancy function
  - Can be programmed with standard SEM software packages that have multi-sample capability
- Generalized Least Squares Approach
  - Minimizes the GLS discrepancy function
  - SEM programs will iterate the solution
  - Freeware (Steiger, 2005, in press) will perform a direct analytic solution
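For the GLS route, Jennrich's (1970) chi-square statistic has a closed form and is easy to program directly. The sketch below implements the statistic as given in Jennrich (1970); the AR(1)-style population matrix and sample sizes in the usage example are illustrative, not the slide's data.

```python
import numpy as np

def jennrich_chi2(R1, R2, n1, n2):
    """Jennrich (1970) chi-square test of the hypothesis that two
    independent samples come from populations with equal correlation
    matrices.  Returns (stat, df) with df = p*(p-1)/2."""
    p = R1.shape[0]
    Rbar = (n1 * R1 + n2 * R2) / (n1 + n2)   # pooled correlation matrix
    Rinv = np.linalg.inv(Rbar)
    c = n1 * n2 / (n1 + n2)
    Z = np.sqrt(c) * Rinv @ (R1 - R2)
    S = np.eye(p) + Rbar * Rinv              # elementwise (Hadamard) product
    dgZ = np.diag(Z)
    stat = 0.5 * np.trace(Z @ Z) - dgZ @ np.linalg.solve(S, dgZ)
    return stat, p * (p - 1) // 2

# Illustrative usage: two samples from the same AR(1)-patterned population.
rng = np.random.default_rng(1)
p = 5
P = 0.3 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X1 = rng.multivariate_normal(np.zeros(p), P, size=40)
X2 = rng.multivariate_normal(np.zeros(p), P, size=89)
stat, df = jennrich_chi2(np.corrcoef(X1, rowvar=False),
                         np.corrcoef(X2, rowvar=False), 40, 89)
print(stat, df)
```

Wrapping the usage in a replication loop gives exactly the kind of proactive Monte Carlo check of the null distribution that the following slides report.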
## Monte Carlo Results: Chi-Square Statistic

|          | Mean | S.D. |
|----------|------|------|
| Observed | 75.8 | 13.2 |
| Expected | 66   | 11.5 |
## Monte Carlo Results: Distribution of p-Values

[Figure: histogram "Comparing Two Correlation Matrices (ML), N1 = 40, N2 = 89"; x-axis p value from 0 to 1, y-axis "No of obs".]
## Monte Carlo Results: Distribution of Chi-Square Statistics

[Figure: observed vs. expected frequencies of the chi-square statistic.]
## Monte Carlo Results (ML): Empirical vs. Nominal Type I Error Rate

| Nominal α   | .010 | .050 |
|-------------|------|------|
| Empirical α | .076 | .208 |
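Empirical Type I error rates like those above are tabulated by comparing each replication's statistic to the nominal critical value. A minimal sketch, using synthetic chi-square draws as a stand-in for real Monte Carlo output (so here the empirical rates should roughly match the nominal ones):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
df = 66                                    # df from the example above
# Stand-in for Monte Carlo output: draws from the nominal reference
# distribution, so empirical alpha should be close to nominal alpha.
stats = chi2.rvs(df, size=2000, random_state=rng)

for alpha in (0.01, 0.05):
    crit = chi2.ppf(1 - alpha, df)         # nominal critical value
    print(alpha, np.mean(stats > crit))    # empirical Type I error rate
```

Feeding actual simulated test statistics into `stats` is what reveals inflation like the .208 empirical rate at a nominal .05.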
## Monte Carlo Results (ML): Empirical vs. Nominal Type I Error Rate, N = 250 per Group

| Nominal α   | .010 | .050 |
|-------------|------|------|
| Empirical α | .011 | .068 |
## Monte Carlo Results: Chi-Square Statistic, N = 250 per Group

|          | Mean | S.D. |
|----------|------|------|
| Observed | 67.7 | 11.6 |
| Expected | 66   | 11.5 |
## Kenny-Zautra TSE Model

[Figure: path diagram of the TSE model: a trait factor T and autoregressive state paths b linking occasion factors O1, O2, …, OJ, each measured by Y1, …, YJ, with state disturbances e2, …, eJ and uniquenesses d1, …, dJ.]

## Likelihood of Improper Values in the TSE Model
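Under a simplified stationary version of the TSE model (a single autoregressive parameter b and a stationary state variance across waves — assumptions of this sketch, since the Kenny-Zautra model permits wave-specific parameters), the implied covariance matrix of the observed waves can be written down directly, which is all that is needed to generate population data for a proactive Monte Carlo run:

```python
import numpy as np

def tse_implied_cov(J, var_T, var_S, var_e, b):
    """Implied covariance matrix of J observed waves under a simplified
    stationary trait-state-error model:
        O_t = T + S_t + e_t,   S_t = b * S_{t-1} + zeta_t,
    so Var(O_t) = var_T + var_S + var_e and
    Cov(O_t, O_{t+k}) = var_T + b**k * var_S for k > 0."""
    Sigma = np.empty((J, J))
    for i in range(J):
        for j in range(J):
            k = abs(i - j)
            Sigma[i, j] = var_T + b ** k * var_S + (var_e if k == 0 else 0.0)
    return Sigma

# Example: 4 waves; trait variance 1.0, state variance 0.5,
# error variance 0.25, autoregressive parameter 0.8 (all illustrative).
print(tse_implied_cov(4, 1.0, 0.5, 0.25, 0.8))
```

Sampling from this matrix over candidate Ns and refitting the model shows in advance how often a variance estimate pins at zero, as in the unhappy scenario above.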
## Constraint Interaction

- Steiger, J. H. (2002). When constraints interact: A caution about reference variables, identification constraints, and scale dependencies in structural equation modeling. *Psychological Methods, 7*, 210–227.
## Constraint Interaction

[Figure: path diagram of the peer-influences model: respondent's and best friend's parental aspiration, intelligence, and socioeconomic status (X1–X6) predicting the latent Ambition factors η1 and η2 (reciprocal paths with β1,2 = β2,1), each measured by occupational and educational aspirations (Y1–Y4), with loading constraints λ1,1 = 1 and λ4,2 = 1.]
## Constraint Interaction

[Figure: path diagram of a two-factor structural model with ULI constraints λ1,1 = 1 and λ3,2 = 1: endogenous factors η1 and η2 (path β2,1) measured by Y1–Y4, and an exogenous factor ξ1 measured by X1 and X2.]
## Constraint Interaction

[Figure: the same path diagram with the ULI constraints removed; the loadings λ1,1, λ2,1, λ3,2, λ4,2 are free, with factor variances fixed instead.]
## Constraint Interaction: Model without ULI Constraints (Constrained Estimation)

```
(XI1)-1->[X1]
(XI1)-2->[X2]
(XI1)-{1}-(XI1)

(DELTA1)-->[X1]
(DELTA2)-->[X2]

(DELTA1)-3-(DELTA1)
(DELTA2)-4-(DELTA2)

(ETA1)-98->[Y1]
(ETA1)-5->[Y2]

(ETA2)-99->[Y3]
(ETA2)-6->[Y4]

(EPSILON1)-->[Y1]
(EPSILON2)-->[Y2]
(EPSILON3)-->[Y3]
(EPSILON4)-->[Y4]

(EPSILON1)-7-(EPSILON1)
(EPSILON2)-8-(EPSILON2)
(EPSILON3)-9-(EPSILON3)
(EPSILON4)-10-(EPSILON4)

(ZETA1)-->(ETA1)
(ZETA2)-->(ETA2)

(ZETA1)-11-(ZETA1)
(ZETA2)-12-(ZETA2)

(XI1)-13->(ETA1)
(XI1)-13->(ETA2)
```
## Constraint Interaction: Model With ULI Constraints

```
(XI1)-->[X1]
(XI1)-2->[X2]
(XI1)-1-(XI1)

(DELTA1)-->[X1]
(DELTA2)-->[X2]

(DELTA1)-3-(DELTA1)
(DELTA2)-4-(DELTA2)

(ETA1)-->[Y1]
(ETA1)-5->[Y2]

(ETA2)-->[Y3]
(ETA2)-6->[Y4]

(EPSILON1)-->[Y1]
(EPSILON2)-->[Y2]
(EPSILON3)-->[Y3]
(EPSILON4)-->[Y4]

(EPSILON1)-7-(EPSILON1)
(EPSILON2)-8-(EPSILON2)
(EPSILON3)-9-(EPSILON3)
(EPSILON4)-10-(EPSILON4)

(ZETA1)-->(ETA1)
(ZETA2)-->(ETA2)

(ZETA1)-11-(ZETA1)
(ZETA2)-12-(ZETA2)

(XI1)-13->(ETA1)
(XI1)-13->(ETA2)
(ETA1)-15->(ETA2)
```
## Typical Characteristics of Statistical Computing Cycles

- Occur late in the research cycle, after data are gathered
- Reactive: often occur in support of analytic activities that are reactions to previous analysis results
- Data come first
- Analyses come second
- Analyses are well-understood and will work
- Before the data arrive, there is nothing to analyze and no reason to start analyzing
## Modern Statistical World View

- Planning comes first
  - Power analysis, precision analysis, etc.
- Planning may require some substantial computing
  - The goal is to estimate required sample size
- Data analysis must wait for data
## Proactive SEM Statistical World View

- SEM involves interaction between specific model(s) and data
  - Some models may not "work" with many data sets
- Planning involves:
  - Power analysis
  - Precision analysis
  - Confirming identification
  - Proactive analysis of model performance
- Without proper proactive analysis, research can be stopped cold with an "unhappy surprise"
## Barriers

- Software
  - Design
  - Availability
- Education