Proactive Monte Carlo Analysis in Structural Equation Modeling


                James H. Steiger
                Vanderbilt University
Some Unhappy Scenarios

   A Confirmatory Factor Analysis
    –   You fit a 3-factor model to 9 variables with N=150
    –   You obtain a Heywood Case
   Comparing Two Correlation Matrices
    –   You wish to test whether two population correlation
        matrices are equal, using ML estimation
    –   You obtain an unexpected rejection
Some Unhappy Scenarios

   Fitting a Trait-State Model
    –   You fit the Kenny-Zautra TSE model to 4 waves of panel
        data with N=200. You obtain a variance estimate of zero.
   Writing a Program Manual
    –   You include an example analysis in your widely distributed
        computer manual
    –   The analysis remains in your manuals for more than a
        decade
    –   The analysis is fundamentally flawed, and gives incorrect
        results
Some Common Elements

   Models of covariance or correlation structure
   Potential problems could have been
    identified before data were ever gathered,
    using “proactive Monte Carlo analysis”
Confirmatory Factor Analysis

Variable   Factor 1   Factor 2   Factor 3
VIS_PERC        X
CUBES           X
LOZENGES        X
PAR_COMP                   X
SEN_COMP                   X
WRD_MNG                    X
ADDITION                              X
CNT_DOT                               X
ST_CURVE                              X
Confirmatory Factor Analysis

 Variable   Factor 1 Factor 2 Factor 3   Unique Var.

VIS_PERC     0.46                          0.79
 CUBES       0.65                          0.58
LOZENGES     0.25                          0.94
PAR_COMP               1.00                0.00
SEN_COMP               0.41                0.84
WRD_MNG                0.22                0.95
 ADDITION                       0.38       0.85
 CNT_DOT                        1.00       0.00
ST_CURVE                        0.30       0.91
Confirmatory Factor Analysis

 Variable   Factor 1 Factor 2 Factor 3   Unique Var.

VIS_PERC     0.60                          0.64
 CUBES       0.60                          0.64
LOZENGES     0.60                          0.64
PAR_COMP               0.60                0.64
SEN_COMP               0.60                0.64
WRD_MNG                0.60                0.64
 ADDITION                       0.60       0.64
 CNT_DOT                        0.60       0.64
ST_CURVE                        0.60       0.64
Proactive Monte Carlo Analysis

   Take the model you anticipate fitting
   Insert reasonable parameter values
   Generate a population covariance or correlation
     matrix and fit this matrix to assess identification
     problems
   Examine Monte Carlo performance over a range of
    sample sizes that you are considering
   Assess convergence problems, frequency of
    improper estimates, Type I Error, accuracy of fit
    indices
   Preliminary investigations may take only a few hours
Confirmatory Factor Analysis

 (Speed)-1{.3}->[VIS_PERC]
 (Speed)-2{.4}->[CUBES]
 (Speed)-3{.5}->[LOZENGES]

 (Verbal)-4{.6}->[PAR_COMP]
 (Verbal)-5{.3}->[SEN_COMP]
 (Verbal)-6{.4}->[WRD_MNG]

 (Visual)-7{.5}->[ADDITION]
 (Visual)-8{.6}->[CNT_DOT]
 (Visual)-9{.3}->[ST_CURVE]
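The specification above supplies everything needed for the "generate" step of a proactive analysis. The sketch below is a minimal stand-in in Python with NumPy, not the run used in this talk; the uncorrelated-factors assumption and the candidate sample sizes are illustrative choices.

import numpy as np

rng = np.random.default_rng(12345)

# Population loadings copied from the specification above
# (Speed, Verbal, Visual), with the factors assumed uncorrelated
lam = np.zeros((9, 3))
lam[0:3, 0] = [0.3, 0.4, 0.5]    # VIS_PERC, CUBES, LOZENGES
lam[3:6, 1] = [0.6, 0.3, 0.4]    # PAR_COMP, SEN_COMP, WRD_MNG
lam[6:9, 2] = [0.5, 0.6, 0.3]    # ADDITION, CNT_DOT, ST_CURVE

# Unique variances that give each variable unit variance, so the
# implied matrix is a correlation matrix
psi = 1.0 - np.sum(lam**2, axis=1)
sigma = lam @ lam.T + np.diag(psi)

def sample_correlation(n):
    """Draw one sample of size n from N(0, sigma) and return its correlation matrix."""
    x = rng.multivariate_normal(np.zeros(9), sigma, size=n)
    return np.corrcoef(x, rowvar=False)

# Examine sampling behavior at the sample sizes under consideration
for n in (75, 100, 150, 300, 500):
    r = sample_correlation(n)
    print(f"N = {n:4d}: max |r - rho| = {np.max(np.abs(r - sigma)):.3f}")

Fitting the exact population matrix sigma with the intended program, before any data exist, is also a quick way to expose identification problems.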
Confirmatory Factor Analysis

  (Speed)-1{.53}->[VIS_PERC]
  (Speed)-2{.54}->[CUBES]
  (Speed)-3{.55}->[LOZENGES]

  (Verbal)-4{.6}->[PAR_COMP]
  (Verbal)-5{.3}->[SEN_COMP]
  (Verbal)-6{.4}->[WRD_MNG]

  (Visual)-7{.5}->[ADDITION]
  (Visual)-8{.6}->[CNT_DOT]
  (Visual)-9{.3}->[ST_CURVE]
Confirmatory Factor Analysis

 Variable   Factor 1 Factor 2 Factor 3   Unique Var.

VIS_PERC     0.60                          0.64
 CUBES       0.60                          0.64
LOZENGES     0.60                          0.64
PAR_COMP               0.60                0.64
SEN_COMP               0.60                0.64
WRD_MNG                0.60                0.64
 ADDITION                       0.60       0.64
 CNT_DOT                        0.60       0.64
ST_CURVE                        0.60       0.64
Percentage of Heywood Cases

N     Loading .4   Loading .6   Loading .8

 75       80%          30%           0%
100       78%          11%           0%
150       62%           3%           0%
300       21%           0%           0%
500        1%           0%           0%
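Percentages like those above come from refitting the model to many simulated samples and counting improper solutions. Below is a rough sketch in Python with NumPy/SciPy; the model setup, starting values, replication count, and the boundary criterion for flagging a Heywood case are illustrative choices, and a general-purpose optimizer stands in for a real SEM program, so the exact percentages will not reproduce the table.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
P = 9                                   # 9 observed variables, 3 orthogonal factors

def implied_sigma(theta):
    """Model-implied covariance: Lambda Lambda' + diag(psi) for a simple structure."""
    lam_free, psi = theta[:P], theta[P:]
    lam = np.zeros((P, 3))
    lam[0:3, 0], lam[3:6, 1], lam[6:9, 2] = lam_free[0:3], lam_free[3:6], lam_free[6:9]
    return lam @ lam.T + np.diag(psi)

def f_ml(theta, S):
    """ML discrepancy: ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    sigma = implied_sigma(theta)
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        return 1e10                     # keep the search away from indefinite Sigma
    return logdet + np.trace(S @ np.linalg.inv(sigma)) - np.linalg.slogdet(S)[1] - P

def heywood_rate(loading, n, reps=200):
    """Proportion of replications with a unique variance estimate on its zero bound."""
    theta_true = np.concatenate([np.full(P, loading), np.full(P, 1 - loading**2)])
    sigma_true = implied_sigma(theta_true)
    start = np.full(2 * P, 0.5)
    bounds = [(None, None)] * P + [(0.0, None)] * P   # unique variances kept >= 0
    hits = 0
    for _ in range(reps):
        x = rng.multivariate_normal(np.zeros(P), sigma_true, size=n)
        S = np.cov(x, rowvar=False)
        res = minimize(f_ml, start, args=(S,), method="L-BFGS-B", bounds=bounds)
        if np.any(res.x[P:] < 1e-4):    # an estimate driven to the boundary
            hits += 1
    return hits / reps

for n in (75, 150, 300):
    print(f"N = {n}, loadings .4: Heywood rate ~ {heywood_rate(0.4, n):.2f}")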
Standard Errors
Distribution of Estimates
[Histogram of Monte Carlo parameter estimates: Estimates for Parameter 1, N = 72; horizontal axis from about −0.04 to 1.00, vertical axis = number of observations]
Standard Errors (N = 300)
Distribution of Estimates
[Histogram of Monte Carlo parameter estimates: Estimates for Parameter 1, N = 300; horizontal axis from about 0.38 to 0.85, vertical axis = number of observations]
Correlational Pattern Hypotheses

   “Pattern Hypothesis”
    –   A statistical hypothesis that specifies that
        parameters or groups of parameters are equal to
        each other, and/or to specified numerical values
   Advantages of Pattern Hypotheses
    –   Only about equality, so they are invariant under
        nonlinear monotonic transformations such as the Fisher
        transform, z = (1/2) ln[(1 + r)/(1 − r)]
Correlational Pattern Hypotheses

   Caution! You cannot use the Fisher
    transform to construct confidence intervals
    for differences of correlations
    –   For an example of this error, see Glass and
        Stanley (1970, pp. 311–312).
Comparing Two Correlation Matrices in
Two Independent Samples

   Jennrich (1970)
    –   Method of Maximum Likelihood (ML)
    –   Method of Generalized Least Squares (GLS)
    –   Example
            Two 11x11 matrices
            Sample sizes of 40 and 89
Comparing Two Correlation Matrices in
Two Independent Samples

   ML Approach

   Σ1 = D1RD1;   Σ2 = D2RD2
    (R = the common correlation matrix; Di = diagonal matrices of standard deviations)
   Minimizes ML discrepancy function
   Can be programmed with standard SEM
    software packages that have multi-sample
    capability
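In this approach each group's covariance matrix is modeled as Σi = DiRDi, with R the common correlation matrix and Di a diagonal matrix of group standard deviations. A rough sketch of the multi-sample discrepancy that such a package would minimize (Python with NumPy; the function names are mine, and the n-weighting shown is one common convention, so check your package's documentation for its exact definition):

import numpy as np

def ml_discrepancy(S, sigma):
    """Single-group ML discrepancy: ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    p = S.shape[0]
    return (np.linalg.slogdet(sigma)[1] + np.trace(S @ np.linalg.inv(sigma))
            - np.linalg.slogdet(S)[1] - p)

def two_group_ml(R, d1, d2, S1, S2, n1, n2):
    """Multi-sample ML discrepancy for the model Sigma_i = D_i R D_i."""
    sigma1 = np.diag(d1) @ R @ np.diag(d1)
    sigma2 = np.diag(d2) @ R @ np.diag(d2)
    # Weight each group's discrepancy by its share of the total sample size
    return (n1 * ml_discrepancy(S1, sigma1)
            + n2 * ml_discrepancy(S2, sigma2)) / (n1 + n2)

Minimizing this over R (unit diagonal), d1, and d2, then rescaling the minimum by the total sample size (conventions differ slightly across programs), yields the likelihood-ratio chi-square for the equality hypothesis.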
Comparing Two Correlation Matrices in
Two Independent Samples

   Generalized Least Squares Approach
   Minimizes GLS discrepancy function
   SEM programs will iterate the solution
   Freeware (Steiger, 2005, in press) will
     perform a direct analytic solution
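For the direct analytic route, Jennrich's (1970) chi-square can be computed without iteration. The sketch below (Python with NumPy/SciPy) writes the statistic out by hand as it is commonly presented in the literature; it is not taken from the freeware mentioned above, so verify the details against the original paper before relying on it.

import numpy as np
from scipy.stats import chi2

def jennrich_chi2(R1, R2, n1, n2):
    """Chi-square test for equality of two correlation matrices (Jennrich, 1970)."""
    p = R1.shape[0]
    R_bar = (n1 * R1 + n2 * R2) / (n1 + n2)       # pooled correlation matrix
    R_inv = np.linalg.inv(R_bar)
    c = n1 * n2 / (n1 + n2)
    Z = np.sqrt(c) * R_inv @ (R1 - R2)
    S = np.eye(p) + R_bar * R_inv                 # elementwise (Hadamard) product term
    dgZ = np.diag(Z)
    stat = 0.5 * np.trace(Z @ Z) - dgZ @ np.linalg.solve(S, dgZ)
    df = p * (p - 1) // 2
    return stat, df, chi2.sf(stat, df)

# Proactive use: simulate pairs of sample correlation matrices from ONE population
# at the planned sample sizes and tabulate the empirical rejection rate
rng = np.random.default_rng(7)
R_pop = np.eye(11)                                # illustrative null population
def sample_R(n):
    x = rng.multivariate_normal(np.zeros(11), R_pop, size=n)
    return np.corrcoef(x, rowvar=False)
pvals = [jennrich_chi2(sample_R(40), sample_R(89), 40, 89)[2] for _ in range(1000)]
print("empirical Type I error at nominal .05:", np.mean(np.array(pvals) < 0.05))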
Monte Carlo Results – Chi-Square
Statistic



                Mean     S.D.
Observed        75.8     13.2
Expected        66.0     11.5
Monte Carlo Results – Distribution of
p-Values

[Histogram: Comparing Two Correlation Matrices (ML), N1 = 40, N2 = 89 – distribution of p values across Monte Carlo replications]
Monte Carlo Results – Distribution of
Chi-Square Statistics

[Plot: observed vs. expected frequencies of the chi-square statistic across Monte Carlo replications]
Monte Carlo Results (ML) – Empirical
vs. Nominal Type I Error Rate


        Nominal α     .010   .050

        Empirical α   .076   .208
Monte Carlo Results (ML)
Empirical vs. Nominal Type I Error Rate
N = 250 per Group



          Nominal α    .010   .050

          Empirical α   .011   .068
Monte Carlo Results – Chi-Square
Statistic, N = 250 per Group



                Mean     S.D.
Observed        67.7     11.6
Expected        66.0     11.5
Kenny-Zautra TSE Model

   TSE model

[Path diagram of the TSE model: a single trait factor T; autoregressive occasion (state) factors O1 → O2 → … → OJ linked by paths b, with disturbances e2 … eJ; observed variables Y1 … YJ with measurement errors d1 … dJ]
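A proactive check of this model starts from its implied covariance structure. Under the usual trait-state-error decomposition (observed score = stable trait + autoregressive state + measurement error, with a stationary state process), the covariance between waves t and t + k is the trait variance plus b^k times the state variance. A minimal sketch in Python with NumPy (the parameter values and number of waves are illustrative assumptions) builds that matrix, which can then be fed into the same simulate-and-refit loop sketched earlier.

import numpy as np

def tse_implied_cov(var_trait, var_state, var_error, b, waves=4):
    """Implied covariance matrix of a trait-state-error model with a stationary
    AR(1) state process: cov(Y_t, Y_s) = var_trait + b**|t-s| * var_state,
    plus var_error on the diagonal."""
    sigma = np.empty((waves, waves))
    for t in range(waves):
        for s in range(waves):
            sigma[t, s] = var_trait + b ** abs(t - s) * var_state
    return sigma + np.eye(waves) * var_error

# Illustrative values: a weak trait component is the kind of configuration that
# can yield zero (or negative) trait-variance estimates in modest samples
print(tse_implied_cov(var_trait=0.1, var_state=0.6, var_error=0.3, b=0.5))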
Likelihood of Improper Values in the
TSE Model
Constraint Interaction

   Steiger, J.H. (2002). When constraints
    interact: A caution about reference variables,
    identification constraints, and scale
    dependencies in structural equation
    modeling. Psychological Methods, 7, 210-
    227.
Constraint Interaction

[Path diagram: Respondent's Ambition (η1) and Best Friend's Ambition (η2), each predicted by parental aspiration, intelligence, and socioeconomic status (X1–X6, paths γ1,1–γ2,6); the indicators are occupational and educational aspirations (Y1–Y4, with measurement errors), with reference-variable constraints λ1,1 = 1 and λ4,2 = 1 and reciprocal paths constrained β1,2 = β2,1]
Constraint Interaction

[Path diagram of the model with ULI (unit loading identification) constraints: loadings λ1,1 and λ3,2 fixed at 1; indicators Y1–Y4 with error terms e1–e4, X1 and X2 with error terms d1–d2, latent variables η1 and η2 with disturbances, and paths γ1,1, γ2,1, and β2,1]
Constraint Interaction

[Path diagram of the same model without ULI constraints: all loadings λ1,1, λ2,1, λ3,2, λ4,2 free, with latent variances constrained instead]
Constraint Interaction – Model without
ULI Constraints (Constrained
Estimation)

   (XI1)-1->[X1]
   (XI1)-2->[X2]
   (XI1)-{1}-(XI1)

   (DELTA1)-->[X1]
   (DELTA2)-->[X2]

   (DELTA1)-3-(DELTA1)
   (DELTA2)-4-(DELTA2)


   (ETA1)-98->[Y1]
   (ETA1)-5->[Y2]

   (ETA2)-99->[Y3]
   (ETA2)-6->[Y4]

   (EPSILON1)-->[Y1]
   (EPSILON2)-->[Y2]
   (EPSILON3)-->[Y3]
   (EPSILON4)-->[Y4]

   (EPSILON1)-7-(EPSILON1)
   (EPSILON2)-8-(EPSILON2)
   (EPSILON3)-9-(EPSILON3)
   (EPSILON4)-10-(EPSILON4)

   (ZETA1)-->(ETA1)
   (ZETA2)-->(ETA2)

   (ZETA1)-11-(ZETA1)
   (ZETA2)-12-(ZETA2)

   (XI1)-13->(ETA1)
   (XI1)-13->(ETA2)
Constraint Interaction – Model With ULI
Constraints
   (XI1)-->[X1]
   (XI1)-2->[X2]
   (XI1)-1-(XI1)

   (DELTA1)-->[X1]
   (DELTA2)-->[X2]

   (DELTA1)-3-(DELTA1)
   (DELTA2)-4-(DELTA2)


   (ETA1)-->[Y1]
   (ETA1)-5->[Y2]

   (ETA2)-->[Y3]
   (ETA2)-6->[Y4]

   (EPSILON1)-->[Y1]
   (EPSILON2)-->[Y2]
   (EPSILON3)-->[Y3]
   (EPSILON4)-->[Y4]

   (EPSILON1)-7-(EPSILON1)
   (EPSILON2)-8-(EPSILON2)
   (EPSILON3)-9-(EPSILON3)
   (EPSILON4)-10-(EPSILON4)

   (ZETA1)-->(ETA1)
   (ZETA2)-->(ETA2)

   (ZETA1)-11-(ZETA1)
   (ZETA2)-12-(ZETA2)

   (XI1)-13->(ETA1)
   (XI1)-13->(ETA2)
   (ETA1)-15->(ETA2)
Typical Characteristics of Statistical
Computing Cycles

   Back-loaded
    –   Occur late in the research cycle, after data are
        gathered
   Reactive
    –   Often occur in support of analytic activities that
        are reactions to previous analysis results
Traditional Statistical World View

   Data come first
   Analyses come second
   Analyses are well-understood and will work
   Before the data arrive, there is nothing to
    analyze and no reason to start analyzing
Modern Statistical World View

   Planning comes first
    –   Power Analysis, Precision Analysis, etc.
   Planning may require some substantial
    computing
    –   Goal is to estimate required sample size
   Data analysis must wait for data
Proactive SEM Statistical World View

   SEM involves interaction between specific model(s)
    and data.
    –   Some models may not “work” with many data sets
   Planning involves:
    –   Power Analysis
    –   Precision Analysis
    –   Confirming Identification
    –   Proactive Analysis of Model Performance
            Without proper proactive analysis, research can be stopped
             cold with an “unhappy surprise.”
Barriers

   Software
    –   Design
    –   Availability
   Education

				