    Path Analysis and Structural Equation Modeling:

    Part I: Path Analysis

       David L. Streiner, Ph.D.
    Professor, Dep’t of Psychiatry, U of T
 Professor, Dep’t of Psychiatry & Behavioural
     Neurosciences, McMaster University
         Senior Editor, Health Reports
A Bit of Philosophy of Science
                          Experimental     Correlational
Control of variables      Yes              No
Subject assignment        Possible         No
Typical design            RCT              Cross-sectional
Statistics                ANOVA            Correlation
Homogeneity               High             Low
Search for ...            Group effects    Relationships
Show causation            Yes              No
The Problems:
  Science does not require control.
  But, cannot draw causation from
   correlation.
  Can we make any causal statements
   from non-experimental studies?
  One attempt was path analysis.
  It does not establish causation, but it
   remains a powerful tool.
The Path to Path Analysis:
    Step 1 - Bivariate correlation
     – Limited to two variables
     – No distinction between DV and IV
The Path to Path Analysis:
  Step 1 - Bivariate correlation
  Step 2 - Multiple correlation
     – Distinguishes between DV and IVs
     – Unlimited number of IVs
          But:
     – Assumes IVs measured without error
     – Variable must be either DV or IV (DV in
       one step can’t be IV in next)
The Path to Path Analysis:
  Step 1 - Bivariate correlation
  Step 2 - Multiple correlation
  Step 3 - Path analysis
     – Can have many DVs
     – DV at one step can be IV in next
An Example:

  Well-being   is a function of:

   – Symptoms

   – Wealth

   – Intelligence
In Regression Terms:



Y_{Well-Being} = b_0 + b_1 Symptoms + b_2 Wealth + b_3 IQ + e
In Path Analysis Terms:
  [Path diagram: Symptoms, Wealth, and IQ each point to Well-Being, which also has an error term e]
But Maybe...
  [Path diagram: an alternative arrangement of Symptoms, Wealth, IQ, Well-Being, and e]
Correlation Matrix:

               Well-Being   Symptoms   Wealth    IQ
  Well-Being     1.000        -.608     .807    .677
  Symptoms                    1.000    -.505   -.433
  Wealth                               1.000    .678
  IQ                                            1.000
Adding Correlations:
             Symptoms

         -.433

                           .677
 -.505           Wealth                Well-Being
                          (.199)

         .678

                  IQ
                                   (β weights in parentheses)
Relationship Between r and β:
  Correlation between Well-Being and
   Symptoms is -0.608
  β weight between Well-Being and
   Symptoms is -0.245

    Is there a relationship between these
     parameters?
Relationship Between r and :
  Using Symptoms and Well-Being:
    – Its  weight is -0.245
    – Exerts indirect effect through Wealth:
       -0.433 x 0.199 = -0.086
    – Also indirect effect through IQ:
       -0.505 x 0.548 = -0.277
    – So, total effect is:
       (-0.245) + (-0.086) + (-0.277) = -0.608
       which is the correlation
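The decomposition above can be checked numerically. The sketch below is an illustration in Python with NumPy, not part of the original slides: it takes the correlations from the matrix as printed, solves for the standardized β weights, and splits the Well-Being/Symptoms correlation into its direct and indirect parts. All variable names are assumptions made for the example. With the matrix labels as printed, the weight of about 0.548 falls on Wealth and the weight of about 0.199 on IQ, the reverse of the pairing used in the arithmetic above; the three terms and their total come out the same either way.

```python
import numpy as np

# Correlations copied from the slides' matrix, in the order
# Symptoms, Wealth, IQ (labels as printed in the matrix).
predictors = ["Symptoms", "Wealth", "IQ"]
R_xx = np.array([
    [1.000, -0.505, -0.433],
    [-0.505, 1.000, 0.678],
    [-0.433, 0.678, 1.000],
])
r_xy = np.array([-0.608, 0.807, 0.677])  # each predictor with Well-Being

# Standardized regression (beta) weights solve R_xx @ beta = r_xy.
beta = np.linalg.solve(R_xx, r_xy)

# Decompose r(Well-Being, Symptoms): the direct effect plus one indirect
# term per other predictor, r(Symptoms, other) * beta(other).
j = predictors.index("Symptoms")
direct = beta[j]
indirect = {
    name: R_xx[j, k] * beta[k]
    for k, name in enumerate(predictors)
    if k != j
}
total = direct + sum(indirect.values())

print(f"direct effect: {direct:.3f}")
for name, value in indirect.items():
    print(f"indirect via {name}: {value:.3f}")
print(f"total (should match r = -0.608): {total:.3f}")
```

Because the model uses every predictor and every correlation among them, the total reproduces the observed correlation up to rounding.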
Relationship Between r and :
 So, the correlation, r, is the sum of:

    the direct effect of the IV on the DV
                    plus
    its indirect effects through its
       correlation with the other IVs
Relationship Between r and :
For the correlation between Well-Being and
  Symptoms:

rWB-Sx = Sx + (rWB-Wealth X Wealth) + (rWB-IQ X
 IQ)
Correlation and Reproduced Matrix:

               Well-Being   Symptoms   Wealth    IQ
  Well-Being     1.000        -.608     .807    .677
  Symptoms       -.608        1.000    -.505   -.433
  Wealth          .807        -.505    1.000    .678
  IQ              .677        -.433     .678    1.000
The Alternative Model:
         Symptoms




 -.505    Wealth               Well-Being
                      (.200)

             (.678)


            IQ
Correlation and Reproduced Matrix:

               Well-Being   Symptoms   Wealth    IQ
  Well-Being     1.000        -.608     .807    .677
  Symptoms       -.593        1.000    -.505   -.433
  Wealth          .810        -.505    1.000    .678
  IQ              .573        -.342     .678    1.000
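One way to read this matrix is to subtract the reproduced correlations from the observed ones and look at what is left over. The sketch below is an illustration in Python with NumPy, not part of the original slides. It assumes that the upper triangle holds the observed correlations and the lower triangle the correlations implied by the alternative model; the array and variable names are made up for the example.

```python
import numpy as np

labels = ["Well-Being", "Symptoms", "Wealth", "IQ"]

# Observed correlations (assumed to be the upper triangle of the slide).
observed = np.array([
    [1.000, -0.608, 0.807, 0.677],
    [-0.608, 1.000, -0.505, -0.433],
    [0.807, -0.505, 1.000, 0.678],
    [0.677, -0.433, 0.678, 1.000],
])

# Reproduced correlations (assumed to be the lower triangle of the slide).
reproduced = np.array([
    [1.000, -0.593, 0.810, 0.573],
    [-0.593, 1.000, -0.505, -0.342],
    [0.810, -0.505, 1.000, 0.678],
    [0.573, -0.342, 0.678, 1.000],
])

# Residual = observed minus reproduced; zero means the model
# reproduces that correlation exactly.
residual = observed - reproduced

for a in range(len(labels)):
    for b in range(a + 1, len(labels)):
        print(f"{labels[a]} / {labels[b]}: {residual[a, b]:+.3f}")

# Locate the pair the model reproduces worst.
i, j = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
print(f"largest misfit: {labels[i]} / {labels[j]}")
```

Pairs with a residual of zero are reproduced exactly; the nonzero entries show where the alternative model departs from the data.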
Rules for Following Paths:
 1 For any single path you can go through
   a given variable only once.
 2 Once you’ve gone forward along a path
   using one arrow, you can’t go back on a
   path using a different arrow.
 3 You can’t go through a double-headed
   curved arrow more than one time.
 4 You can’t enter a variable on one
   arrowhead and leave it on another
   arrowhead.
Valid Paths For Symptoms:
  [Path diagram: Symptoms, Wealth, IQ, Well-Being]
Valid Paths For Wealth:
  [Path diagram: Symptoms, Wealth, IQ, Well-Being]
Valid Paths For Symptoms:
  [Path diagram: Symptoms, Wealth, IQ, Well-Being]
An Invalid Path For Symptoms:
  [Path diagram: Symptoms, Wealth, IQ, Well-Being]
Path Analysis ≠ Causality:
         Symptoms




 -.505    Wealth               Well-Being
                      (.200)

             (.678)


            IQ
Some Terminology:
    Exogenous variables:
     – Have straight arrows emerging from them
       and none pointing to them.


    Endogenous variables:
     – Have at least one straight arrow pointing
       to them.
Why the Change in Terms?
                    Independent Variable

         Symptoms
                                           Dependent Variable
     ?


          Wealth                      Well-Being



            IQ

                    Independent Variable
Why the Change in Terms?
                        Exogenous Variable
         Symptoms
                                             Endogenous Variable
  Endogenous Variable


          Wealth                        Well-Being



             IQ

                        Exogenous Variable
Types of Path Models:
  [Path diagram: X1, X2, Y]
Types of Path Models:
  [Path diagram: X1, X2, Y]
Types of Path Models:
  [Path diagram: X1, X2, Y1, Y2]
For Example:
  [Path diagram: Mom’s Anxiety, Mom’s Depression, Kid’s Anxiety, Kid’s Depression]
For Example:
  [Path diagram: Anxiety (Time 1), Depression (Time 1), Anxiety (Time 2), Depression (Time 2)]
Types of Path Models:
  [Path diagram: X1, X2, Y]
For Example:
  [Path diagram: Medication, Family EE, Symptoms]
Types of Path Models:
  [Path diagram: X1, X2, Y]
For Example:
  [Path diagram: Having a child, Social Isolation, Depression]
Types of Path Models:
  [Path diagram: X1, X2, Y1, Y2]
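Nonrecursive Models:
  [Path diagram: X1, X2, Y1, Y2]
For Example:
  [Path diagram: Mom’s Anxiety, Mom’s Depression, Kid’s Anxiety, Kid’s Depression]
For Example:
  [Path diagram: Mom’s Anxiety, Mom’s Depression, Kid’s Anxiety, Kid’s Depression]
Disturbance Terms:
  [Path diagram: X1, X2, Y1 with disturbance D1, Y2 with disturbance D2]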
K.I.S.S.


Number of Parameters ≤ Number of Observations
K.I.S.S.


 Number of Observations = [k x (k + 1)] / 2



    where k = number of variables
How Many Parameters?
    Purpose to determine what affects
     endogenous variables:
     – Which paths are important (straight paths)
     – How exogenous variables work together
       (curved paths)
     – Variances of exogenous variables
     – Disturbances of endogenous variables
    Not variances of endogenous variables
Counting Parameters:
                    7

         Symptoms

     4
                    8
                                         10
                        2
 6        Wealth            Well-Being   D1

     5
                    9

            IQ
Counting Parameters:
    3 exogenous variables + 1 endogenous
     variable, so k = 4
    Number of observations = (4 x 5) / 2 =
     10
    Number of parameters = Number of
     observations
Counting Parameters:
                    6

         Symptoms


     9                                   8
                        2
    D2    Wealth            Well-Being   D1

5            4
                    7

            IQ
Counting Parameters:
    3 exogenous variables + 1 endogenous
     variable, so k = 4
    Number of observations = (4 x 5) / 2 =
     10
    Number of parameters < Number of
     observations
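The counting on these slides can be written as a few lines of code. The sketch below is an illustration in Python, not part of the original slides: it computes the number of observations from k, subtracts the number of parameters to get the degrees of freedom, and labels the result by the counting rule alone (which is a necessary check, not a guarantee that a model can be estimated). The function names are assumptions made for the example; the parameter counts of 10 and 9 are the ones given for the two diagrams.

```python
def n_observations(k: int) -> int:
    """Number of distinct variances and covariances among k variables."""
    return k * (k + 1) // 2


def classify(k: int, n_parameters: int) -> tuple:
    """Return (df, label) using the counting rule only."""
    df = n_observations(k) - n_parameters
    if df < 0:
        label = "under-identified"
    elif df == 0:
        label = "just-identified"
    else:
        label = "over-identified"
    return df, label


# Parameter counts taken from the two counting slides (k = 4 variables).
for name, n_parameters in [("first diagram", 10), ("second diagram", 9)]:
    df, label = classify(4, n_parameters)
    print(f"{name}: df = {df}, {label}")
```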
Counting Parameters:
    Why not count variance of Well-Being?

    Why variance of Wealth counted in 1st
     diagram but not 2nd?

    Why no more parameters than
     observations?
Why Not Variance of Well-Being?
    Endogenous variable

    Not free to vary; dependent on values
     of exogenous variables

    Goal of PA to explain variances of
     variables and covariances between
     variables that can vary
Why Count Wealth in 1st Diagram
But Not 2nd?

              Exogenous                 Endogenous
   Symptoms
                             Symptoms




    Wealth      Well-Being    Wealth       Well-Being




      IQ                        IQ
Why No More Parameters Than
Observations?

a=b+c
 If a = 5, what are b and c?

   – Infinite number of solutions

   – Model is undefined (under-identified)

   – There isn’t a unique solution
Why No More Parameters Than
Observations?

a=b+c
 If a = 5 and b = -3 what is c?

   – Only one solution

   – Model is defined (just-identified)
Why No More Parameters Than
Observations?

a=b+c
 If a = 5, b = -3 and c is 8

   – Model is correct

   – Nothing to identify (trivial)

   – Model is over-defined (over-identified)
As Good As It Gets (Goodness-of-Fit):

     Significance of path coefficients



     Reproduced (implied) correlation matrix



     Model as a whole
Significance of Paths:

 Path coefficients are parameters


 Therefore, estimated with some error


  z = Path Coefficient / SE_{Estimate}
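As a small worked illustration of this ratio (Python, not part of the original slides), the sketch below divides a path coefficient by its standard error and converts the result to a two-sided p-value from the standard normal distribution. The coefficient 0.199 is one of the β weights shown earlier; the standard error of 0.080 is purely hypothetical, since the slides do not report one, and the function name is an assumption made for the example.

```python
import math


def path_z_test(coefficient: float, standard_error: float) -> tuple:
    """Return (z, two-sided p) for a path coefficient and its standard error."""
    z = coefficient / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# 0.199 is a beta weight from the slides; 0.080 is a HYPOTHETICAL standard error.
z, p = path_z_test(0.199, 0.080)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```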
Reproduced Correlation Matrix:
    In 1st diagram, reproduced correlations
     = actual correlations
    In 2nd diagram, reproduced
     correlations ≠ actual correlations
    Model 1 better than Model 2, but:
     – Model 1 too good (10 Pars, 10 Obs)
     – In Model 2, Obs = 10, Parameters = 9
The Model as a Whole:

    Goodness-of-Fit Chi-Squared (χ²_{GoF})

    In most tests, bigger is better

    Here, we want χ²_{GoF} to be as small as
     possible
Why χ²_{GoF} Should be Small:
    χ² tests difference between observed
     and expected findings.
    Usually, expected values determined
     under H0 of no effect.
    We want findings to be different from
     this.
Why χ²_{GoF} Should be Small:
    For goodness of fit, we are not testing
     difference between observed and H0.
    Testing difference between observed
     and hypothesized models.
    Do not want there to be a difference.
    df = (#Observations - #Parameters)
Interpreting χ²_{GoF}:

    Greatly affected by sample size:
     – If low, SEs large, so hard to find difference
     – If high, every model differs from data

    A good fit does not rule out a
     better model.
    Does not indicate causality!
Two Different Models.
  [Two path diagrams side by side, each containing Symptoms, Wealth, IQ, and Well-Being; each is labelled χ²_{GoF}(1) = 2.044]
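To see what a goodness-of-fit value of 2.044 on 1 degree of freedom means, it can be converted to a p-value. The sketch below is an illustration in Python, not part of the original slides. It relies on the fact that a chi-squared variate with 1 df is the square of a standard normal variate, so the upper-tail probability can be written with the complementary error function; the function name is an assumption made for the example, and the shortcut applies to df = 1 only.

```python
import math


def chi2_upper_tail_df1(chi2: float) -> float:
    """P(X > chi2) for a chi-squared variate X with exactly 1 df."""
    # X = Z**2 with Z standard normal, so P(X > x) = P(|Z| > sqrt(x)).
    return math.erfc(math.sqrt(chi2 / 2.0))


p = chi2_upper_tail_df1(2.044)  # value shared by both models on the slide
print(f"p = {p:.3f}")
```

The p-value comes out at roughly 0.15, so neither model would be rejected at the usual 0.05 level, and because the two models share the same statistic, the test cannot choose between them.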
A Just-Identified Model:

    Symptoms

                            # Observations = 10
                            # Parameters = 10
     Wealth    Well-Being
                            df = 0
                            Untestable

       IQ
Assumptions:
  Similar to OLS regression.
  Exogenous variables measured without
   error.
     – If violated, overestimates direct paths,
       underestimates indirect paths
  All important variables included.
  Additive model.
  Only moderate correlations among
   exogenous variables.
Sample Size:
    Affects SEs of path coefficients,
     variances, and covariances.
    No formulae for calculating N.
    Minimum of 10 subjects per parameter
     (some argue for 20).
    Minimum of 100 (some say 200).
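The two rules of thumb can be combined into a quick calculation. The sketch below is an illustration in Python, not part of the original slides. Taking the larger of the per-parameter figure and the overall minimum is an assumption about how the two rules fit together, and the function and argument names are made up for the example.

```python
def minimum_sample_size(n_parameters: int,
                        per_parameter: int = 10,
                        floor: int = 100) -> int:
    """Larger of (subjects per parameter x parameters) and an overall minimum."""
    return max(per_parameter * n_parameters, floor)


# A model with 9 parameters under the lenient and the stricter rules.
lenient = minimum_sample_size(9)
strict = minimum_sample_size(9, per_parameter=20, floor=200)
print(f"lenient rule: N >= {lenient}")
print(f"strict rule:  N >= {strict}")
```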

				