									              Group analyses of fMRI data

Klaas Enno Stephan
Laboratory for Social and Neural Systems Research
Institute for Empirical Research in Economics
University of Zurich

Functional Imaging Laboratory (FIL)
Wellcome Trust Centre for Neuroimaging
University College London


With many thanks for slides & images to:
FIL Methods group,
particularly Will Penny



     Methods & models for fMRI data analysis in neuroeconomics
                        11 November 2009
                                Overview of SPM

[Figure: SPM analysis pipeline. Image time-series → realignment → normalisation (template) → smoothing (kernel) → general linear model (design matrix) → parameter estimates → statistical inference (Gaussian field theory) → statistical parametric map (SPM), thresholded at p < 0.05]
         Why hierarchical models?
  fMRI, single subject             EEG/MEG, single subject




fMRI, multi-subject                  ERP/ERF, multi-subject


              Hierarchical models for all imaging
                            data!
Reminder: voxel-wise time series analysis!

[Figure: the BOLD signal time series of a single voxel passes through model specification, parameter estimation, hypothesis and statistic, yielding the SPM]
              The model: voxel-wise GLM

$y = X\beta + e$
$e \sim N(0, \sigma^2 I)$

Dimensions: $y$ is $N \times 1$, $X$ is $N \times p$, $\beta$ is $p \times 1$, $e$ is $N \times 1$.

Model is specified by
1. Design matrix X
2. Assumptions about e

N: number of scans
p: number of regressors

The design matrix embodies all available knowledge about
experimentally controlled factors and potential confounds.
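
As an illustration of the voxel-wise GLM on this slide, here is a minimal sketch that simulates one voxel's time series and fits it by ordinary least squares. The boxcar regressor, the sizes and the "true" parameter values are assumptions made up for the example; they do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

N, p = 100, 2                          # N scans, p regressors (illustrative sizes)

# Hypothetical design matrix: a boxcar task regressor plus a constant term.
boxcar = ((np.arange(N) // 10) % 2).astype(float)
X = np.column_stack([boxcar, np.ones(N)])

beta_true = np.array([2.0, 10.0])      # assumed effects, for simulation only
sigma = 1.0
y = X @ beta_true + sigma * rng.standard_normal(N)    # y = X beta + e, e ~ N(0, sigma^2 I)

# Ordinary least squares estimate of beta.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals and the unbiased estimate of the error variance.
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (N - p)
```

Under the i.i.d. error assumption, `beta_hat` is also the maximum likelihood estimate, and the residuals are orthogonal to every column of `X`.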
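GLM assumes Gaussian “spherical” (i.i.d.) errors

sphericity = i.i.d.: error covariance is a scalar multiple of the identity matrix:
$\mathrm{Cov}(e) = \sigma^2 I$, e.g. $\mathrm{Cov}(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$

Examples of non-sphericity:
non-identity: $\mathrm{Cov}(e) = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}$
non-independence: $\mathrm{Cov}(e) = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$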
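
The three example matrices on this slide can be told apart mechanically. The helper below is a small sketch; the function name `describe_covariance` and its return labels are hypothetical and simply reuse the slide's wording.

```python
import numpy as np

def describe_covariance(C, tol=1e-12):
    """Label a covariance matrix using the terms from the slide."""
    C = np.asarray(C, dtype=float)
    off_diagonal = C - np.diag(np.diag(C))
    independent = bool(np.all(np.abs(off_diagonal) < tol))   # no covariances
    identical = bool(np.allclose(np.diag(C), C[0, 0]))       # equal variances
    if independent and identical:
        return "spherical"            # scalar multiple of the identity
    if independent:
        return "non-identity"         # unequal variances, no correlation
    return "non-independence"         # correlated errors

labels = [
    describe_covariance([[1, 0], [0, 1]]),
    describe_covariance([[4, 0], [0, 1]]),
    describe_covariance([[2, 1], [1, 2]]),
]
```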
   Multiple covariance components at 1st level

$e \sim N(0, \sigma^2 V)$   (enhanced noise model)
$V \propto \mathrm{Cov}(e)$
$V = \sum_i \lambda_i Q_i$   (error covariance components $Q$ and hyperparameters $\lambda$)

With two components: $V = \lambda_1 Q_1 + \lambda_2 Q_2$

Estimation of hyperparameters $\lambda$ with ReML (restricted maximum
likelihood).
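            t-statistic based on ML estimates

$Wy = WX\beta + We$
$W = V^{-1/2}$
$V \propto \mathrm{Cov}(e)$, $V = \sum_i \lambda_i Q_i$   (ReML estimates)

$\hat{\beta} = (WX)^+ Wy$
$c = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)^T$

$t = \dfrac{c^T \hat{\beta}}{\widehat{\mathrm{std}}(c^T \hat{\beta})}$

$\widehat{\mathrm{std}}(c^T \hat{\beta}) = \sqrt{\hat{\sigma}^2 \, c^T (WX)^+ (WX)^{+T} c}$

$\hat{\sigma}^2 = \dfrac{\| Wy - WX\hat{\beta} \|^2}{\mathrm{tr}(R)}$

$R = I - WX (WX)^+$

For brevity: $(WX)^+ = (X^T W X)^{-1} X^T$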
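
The whitening and t-statistic on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not SPM's implementation: the hyperparameters `lam` are treated as known (in practice ReML would estimate them), and the two covariance components `Q1` and `Q2`, the design and the contrast are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 60
X = np.column_stack([np.sin(np.arange(N) / 5.0), np.ones(N)])
c = np.array([1.0, 0.0])                       # contrast on the first regressor

# Assumed covariance components: white noise plus short-range serial correlation.
Q1 = np.eye(N)
lags = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
Q2 = np.exp(-lags)
lam = np.array([1.0, 0.5])                     # treated as known in this sketch
V = lam[0] * Q1 + lam[1] * Q2

# Whitening matrix W = V^{-1/2} via the eigendecomposition of the symmetric V.
evals, evecs = np.linalg.eigh(V)
W = evecs @ np.diag(evals ** -0.5) @ evecs.T

# Simulate data whose errors have covariance V.
y = X @ np.array([1.5, 3.0]) + np.linalg.cholesky(V) @ rng.standard_normal(N)

WX, Wy = W @ X, W @ y
pWX = np.linalg.pinv(WX)                       # (WX)^+
beta_hat = pWX @ Wy
R = np.eye(N) - WX @ pWX                       # residual-forming matrix
sigma2_hat = np.sum((Wy - WX @ beta_hat) ** 2) / np.trace(R)
t = (c @ beta_hat) / np.sqrt(sigma2_hat * (c @ pWX @ pWX.T @ c))
```

After whitening, `W @ V @ W` is the identity, so the whitened errors are spherical and the usual t-statistic applies with `tr(R) = N - p` degrees of freedom.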
    Group level inference: fixed effects (FFX)

• assumes that parameters are “fixed properties of the
  population”

• all variability is only intra-subject variability, e.g. due to
  measurement errors

• Laird & Ware (1982): the probability distribution of the data
  has the same form for each individual and the same
  parameters

• In SPM: simply concatenate the data and the design
  matrices
  → lots of power (proportional to number of scans),
      but results are only valid for the group studied and
      cannot be generalized to the population
  Group level inference: random effects (RFX)

• assumes that model parameters are probabilistically
  distributed in the population

• variance is due to inter-subject variability

• Laird & Ware (1982): the probability distribution of the data
  has the same form for each individual, but the parameters
  vary across individuals

• In SPM: hierarchical model
  → much less power (proportional to number of
      subjects), but results can be generalized to the
      population
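
The difference between the two kinds of inference can be made concrete with a small simulation. The sketch below assumes a deliberately simple setting (each subject's effect is the mean of that subject's scans) and invented values for the within- and between-subject standard deviations; it only illustrates why the fixed-effects standard error shrinks with the number of scans while the random-effects standard error shrinks with the number of subjects.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sub, n_scans = 12, 200
sigma_within, sigma_between = 1.0, 0.5         # assumed values for the simulation

# Each subject has its own effect, drawn from a population distribution.
subject_effects = 0.3 + sigma_between * rng.standard_normal(n_sub)
data = subject_effects[:, None] + sigma_within * rng.standard_normal((n_sub, n_scans))
subject_means = data.mean(axis=1)

# FFX: all scans concatenated; only within-subject (scan-to-scan) variance counts.
resid = data - subject_means[:, None]
sigma2_within_hat = np.sum(resid ** 2) / (n_sub * n_scans - n_sub)
ffx_se = np.sqrt(sigma2_within_hat / (n_sub * n_scans))
t_ffx = subject_means.mean() / ffx_se

# RFX: one value per subject; variability between subjects determines the error.
rfx_se = subject_means.std(ddof=1) / np.sqrt(n_sub)
t_rfx = subject_means.mean() / rfx_se
```

Both statistics share the same numerator (the average effect); only the standard error differs, which is why the fixed-effects statistic is much larger here.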
              Linear hierarchical model

Hierarchical model:
$y = X^{(1)} \theta^{(1)} + \varepsilon^{(1)}$
$\theta^{(1)} = X^{(2)} \theta^{(2)} + \varepsilon^{(2)}$
$\vdots$
$\theta^{(n-1)} = X^{(n)} \theta^{(n)} + \varepsilon^{(n)}$

Multiple variance components at each level:
$C_\varepsilon^{(i)} = \sum_k \lambda_k^{(i)} Q_k^{(i)}$

At each level, distribution of parameters
is given by level above.

What we don’t know: distribution of parameters
and variance parameters (hyperparameters).
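                    Example: Two-level model

$y = X^{(1)} \theta^{(1)} + \varepsilon^{(1)}$
$\theta^{(1)} = X^{(2)} \theta^{(2)} + \varepsilon^{(2)}$

[Figure: First level: $y = X^{(1)} \theta^{(1)} + \varepsilon^{(1)}$, with $X^{(1)}$ block-diagonal, built from the blocks $X_1^{(1)}$, $X_2^{(1)}$, $X_3^{(1)}$. Second level: $\theta^{(1)} = X^{(2)} \theta^{(2)} + \varepsilon^{(2)}$]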
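                                          Two-level model

$y = X^{(1)} \theta^{(1)} + \varepsilon^{(1)}$
$\theta^{(1)} = X^{(2)} \theta^{(2)} + \varepsilon^{(2)}$

Substituting the second level into the first:

$y = X^{(1)} \left( X^{(2)} \theta^{(2)} + \varepsilon^{(2)} \right) + \varepsilon^{(1)}$
$\;\; = X^{(1)} X^{(2)} \theta^{(2)} + X^{(1)} \varepsilon^{(2)} + \varepsilon^{(1)}$

fixed effects: $X^{(1)} X^{(2)} \theta^{(2)}$
random effects: $X^{(1)} \varepsilon^{(2)} + \varepsilon^{(1)}$

Friston et al. 2002, NeuroImage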
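                   Mixed effects analysis

Non-hierarchical model:
$y = X^{(1)} X^{(2)} \theta^{(2)} + X^{(1)} \varepsilon^{(2)} + \varepsilon^{(1)}$

Estimating 2nd level effects:
$\hat{\theta}^{(1)} = X^{(1)+} y$
$\;\; = X^{(2)} \theta^{(2)} + \varepsilon^{(2)} + X^{(1)+} \varepsilon^{(1)}$
$\;\; = X^{(2)} \theta^{(2)} + \tilde{\varepsilon}^{(2)}$, where $\tilde{\varepsilon}^{(2)} = \varepsilon^{(2)} + X^{(1)+} \varepsilon^{(1)}$

Variance components at 2nd level:
$\mathrm{Cov}(\tilde{\varepsilon}^{(2)}) = C_\varepsilon^{(2)} + X^{(1)+} C_\varepsilon^{(1)} X^{(1)+T}$
(first term: within-level non-sphericity; second term: between-level non-sphericity)

Within-level non-sphericity at both levels: multiple covariance components
$C_\varepsilon^{(i)} = \sum_k \lambda_k^{(i)} Q_k^{(i)}$

Friston et al. 2005, NeuroImage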
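
The variance decomposition at the second level can be checked numerically. The sketch below assumes a small block-diagonal first-level design and spherical covariances at both levels (all values invented for the example), builds the covariance of the data implied by the two-level model, and confirms that projecting it through the pseudoinverse of the first-level design gives the second-level covariance plus the projected first-level covariance.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sub, n_scans = 3, 20

# Hypothetical first-level design: one regressor per subject, block-diagonal.
x_block = rng.standard_normal((n_scans, 1))
X1 = np.kron(np.eye(n_sub), x_block)            # shape (n_sub * n_scans, n_sub)

C1 = 2.0 * np.eye(n_sub * n_scans)              # assumed 1st-level error covariance
C2 = 0.5 * np.eye(n_sub)                        # assumed 2nd-level error covariance

# Covariance of the data implied by the two-level model (random part only).
cov_y = X1 @ C2 @ X1.T + C1

# Covariance of the first-level estimates, theta1_hat = pinv(X1) @ y.
X1_pinv = np.linalg.pinv(X1)
cov_theta1_hat = X1_pinv @ cov_y @ X1_pinv.T

# Decomposition: 2nd-level covariance plus projected 1st-level covariance.
expected = C2 + X1_pinv @ C1 @ X1_pinv.T
```

The two matrices agree because `X1_pinv @ X1` is the identity whenever `X1` has full column rank.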
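                               Estimation

$y = X\theta + \varepsilon$, with $y$: $N \times 1$, $X$: $N \times p$, $\theta$: $p \times 1$, $\varepsilon$: $N \times 1$

EM-algorithm

E-step:
$C_{\theta|y} = (X^T C_\varepsilon^{-1} X)^{-1}$
$\eta_{\theta|y} = C_{\theta|y} X^T C_\varepsilon^{-1} y$

M-step (GN gradient ascent): maximise $L = \ln p(y \mid \lambda)$
$g = \dfrac{dL}{d\lambda}$
$J = \dfrac{d^2 L}{d\lambda^2}$
$\lambda \leftarrow \lambda - J^{-1} g$

$C_\varepsilon = \sum_k \lambda_k Q_k$

Assume, at voxel j: $\lambda_{jk} = \lambda_j \lambda_k$

Friston et al. 2002, NeuroImage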
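
To make the hyperparameter estimation step more tangible, here is a minimal Fisher-scoring sketch of ReML for a covariance of the form C = sum_k lam_k Q_k. It is not SPM's implementation: the function name `reml_fisher_scoring`, the starting values and the simulated data are assumptions, and the update uses the expected information H (the negative expected curvature), so the step is written as lam + H^{-1} g.

```python
import numpy as np

def reml_fisher_scoring(S, X, Q, n_iter=50, tol=1e-10):
    """Estimate lam in C = sum_k lam[k] * Q[k] from a sample covariance S.

    S : (N, N) sample covariance of the data, Y @ Y.T / n
    X : (N, p) design matrix (its effects are projected out)
    Q : list of (N, N) covariance components
    """
    lam = np.ones(len(Q))                         # crude starting point
    for _ in range(n_iter):
        C = sum(l * q for l, q in zip(lam, Q))
        iC = np.linalg.inv(C)
        iCX = iC @ X
        # P discounts the part of the data explained by X (the "restricted" part).
        P = iC - iCX @ np.linalg.solve(X.T @ iCX, iCX.T)
        PQ = [P @ q for q in Q]
        # Gradient of the restricted log-likelihood, per observation.
        g = np.array([0.5 * np.trace(pq @ P @ S) - 0.5 * np.trace(pq) for pq in PQ])
        # Expected information (negative expected curvature).
        H = np.array([[0.5 * np.trace(a @ b) for b in PQ] for a in PQ])
        step = np.linalg.solve(H, g)
        lam = lam + step
        if np.max(np.abs(step)) < tol:
            break
    return lam

# Simulated example: two variance components, the second affecting half the scans.
rng = np.random.default_rng(0)
N, n = 20, 2000                                   # N scans, n samples (e.g. voxels)
X = np.ones((N, 1))
Q = [np.eye(N), np.diag(np.r_[np.zeros(N // 2), np.ones(N // 2)])]
lam_true = np.array([1.0, 0.5])
C_true = lam_true[0] * Q[0] + lam_true[1] * Q[1]
noise = np.sqrt(np.diag(C_true))[:, None] * rng.standard_normal((N, n))
Y = X @ rng.standard_normal((1, n)) + noise
lam_hat = reml_fisher_scoring(Y @ Y.T / n, X, Q)
```

Passing the sample covariance `Y @ Y.T / n` rather than a single time series pools information over many samples, which is what keeps the hyperparameter estimates stable in this toy example.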
                Algorithmic equivalence

Hierarchical model → Parametric Empirical Bayes (PEB):
$y = X^{(1)} \theta^{(1)} + \varepsilon^{(1)}$
$\theta^{(1)} = X^{(2)} \theta^{(2)} + \varepsilon^{(2)}$
$\vdots$
$\theta^{(n-1)} = X^{(n)} \theta^{(n)} + \varepsilon^{(n)}$

EM = PEB = ReML

Single-level model → Restricted Maximum Likelihood (ReML):
$y = \varepsilon^{(1)} + X^{(1)} \varepsilon^{(2)} + \dots + X^{(1)} \cdots X^{(n-1)} \varepsilon^{(n)} + X^{(1)} \cdots X^{(n)} \theta^{(n)}$
       Practical problems

Most 2-level models are just too big to
              compute.


   And even if they can be computed, it takes a long time!


   Moreover, sometimes we are only
interested in one specific effect and do
     not want to model all the data.


    Is there a fast approximation?
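               Summary statistics approach

First level: Data → Design Matrix → Contrast Images
(one set of estimates per subject: $\hat{\beta}_1, \hat{\sigma}_1^2$; $\hat{\beta}_2, \hat{\sigma}_2^2$; ...; $\hat{\beta}_{11}, \hat{\sigma}_{11}^2$; $\hat{\beta}_{12}, \hat{\sigma}_{12}^2$)

Second level: one-sample t-test @ 2nd level → SPM(t)

$t = \dfrac{c^T \hat{\beta}}{\sqrt{\widehat{\mathrm{Var}}(c^T \hat{\beta})}}$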
Validity of the summary statistics approach


     The summary stats approach is exact if for each
                   session/subject:

                  Within-session covariance the same

                      First-level design the same

                       One contrast per session



  But:   Summary stats approach is fairly robust
         against violations of these conditions.
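
The two stages of the summary statistics approach can be sketched as follows. The simulation satisfies the conditions listed above (same first-level design for every subject, one contrast per subject); the design, effect sizes and noise levels are assumptions invented for the example. The second stage is written both as a one-sample t-test and as a GLM whose design matrix is a column of ones, to show that the two coincide.

```python
import numpy as np

rng = np.random.default_rng(4)
n_sub, n_scans = 12, 100

# Same first-level design for every subject: boxcar plus constant (hypothetical).
boxcar = ((np.arange(n_scans) // 10) % 2).astype(float)
X1 = np.column_stack([boxcar, np.ones(n_scans)])
X1_pinv = np.linalg.pinv(X1)
c = np.array([1.0, 0.0])                        # one contrast per subject

# First level: fit each subject separately, keep one contrast value each.
con = np.empty(n_sub)
for s in range(n_sub):
    beta_s = np.array([1.0 + 0.5 * rng.standard_normal(), 5.0])  # subject-specific effect
    y = X1 @ beta_s + rng.standard_normal(n_scans)
    con[s] = c @ (X1_pinv @ y)

# Second level, written as a one-sample t-test on the contrast values.
t_manual = con.mean() / (con.std(ddof=1) / np.sqrt(n_sub))

# Second level, written as a GLM with a column of ones as design matrix.
X2 = np.ones((n_sub, 1))
b2 = np.linalg.pinv(X2) @ con
resid = con - X2 @ b2
sigma2_hat = resid @ resid / (n_sub - 1)
t_glm = b2[0] / np.sqrt(sigma2_hat * np.linalg.inv(X2.T @ X2)[0, 0])
```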
                                  Mixed effects analysis

Data: $y$

Summary statistics approach:
$X = [X^{(0)} \;\; X^{(1)}]$
$V = I$

EM approach (non-hierarchical model):
$X = [X^{(0)} \;\; X^{(1)} X^{(2)}]$
$Q = \{ Q_1^{(1)}, \dots, X^{(1)} Q_1^{(2)} X^{(1)T}, \dots \}$
$\lambda = \mathrm{ReML}\{ yy^T / n, X, Q \}$   (pooling over voxels)

Step 1:
$\hat{\theta}^{(1)} = (X^T V^{-1} X)^{-1} X^T V^{-1} y$

Then set:
$Y = \hat{\theta}^{(1)}$
$X = X^{(2)}$
$V = \sum_i \lambda_i^{(1)} X^{(1)+} Q_i^{(1)} X^{(1)+T} + \sum_j \lambda_j^{(2)} Q_j^{(2)}$
(first sum: 1st level non-sphericity; second sum: 2nd level non-sphericity)

Step 2 (with $Y$ in place of $y$):
$\hat{\theta}^{(2)} = (X^T V^{-1} X)^{-1} X^T V^{-1} Y$

Friston et al. 2005, NeuroImage
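                 Reminder: sphericity

$y = X\theta + \varepsilon$
$C_\varepsilon = \mathrm{Cov}(\varepsilon) = E(\varepsilon \varepsilon^T)$

“sphericity” means:
$\mathrm{Cov}(\varepsilon) = \sigma^2 I$
i.e. $\mathrm{Var}(\varepsilon_i) = \sigma^2$

[Figure: scans × scans error covariance matrix, $\mathrm{Cov}(\varepsilon) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$]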
             2nd level: non-sphericity

                                       Error
                                     covariance


     Errors are independent
          but not identical:
  e.g. different groups (patients,
              controls)


   Errors are not independent
        and not identical:
e.g. repeated measures for each
subject (multiple basis functions,
     multiple conditions etc.)
Example 1: non-identical & independent errors

Stimuli:         Auditory Presentation (SOA = 4 secs) of
               (i) words and (ii) words spoken backwards


                                       e.g.
                                      “Book”
                                        and
                                      “Koob”




Subjects:   (i) 12 control subjects
            (ii) 11 blind subjects


Scanning:      fMRI, 250 scans per
                 subject, block design
                                                           Noppeney et al.
1st level:   contrast images for Controls and Blinds

2nd level:   [Figure: design matrix X and error covariance V], contrast $c^T = [1 \;\; -1]$
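
A second-level comparison of two groups with unequal error variances can be sketched as a whitened GLM. The group sizes follow the slide (12 controls, 11 blind subjects); everything else is assumed for illustration, including the simulated contrast values and the use of per-group sample variances as a stand-in for ReML hyperparameter estimates.

```python
import numpy as np

rng = np.random.default_rng(5)
n1, n2 = 12, 11                                   # group sizes from the slide
con = np.concatenate([1.0 + 0.5 * rng.standard_normal(n1),    # controls (simulated)
                      0.4 + 1.5 * rng.standard_normal(n2)])   # blind subjects (simulated)

# Second-level design: one column per group; the contrast compares the groups.
X = np.zeros((n1 + n2, 2))
X[:n1, 0] = 1.0
X[n1:, 1] = 1.0
c = np.array([1.0, -1.0])

# One variance component per group (independent but not identical errors).
s2 = np.array([con[:n1].var(ddof=1), con[n1:].var(ddof=1)])
V = np.diag(np.r_[np.full(n1, s2[0]), np.full(n2, s2[1])])

W = np.diag(np.diag(V) ** -0.5)                   # W = V^{-1/2}; V is diagonal here
WX, Wy = W @ X, W @ con
pWX = np.linalg.pinv(WX)
beta_hat = pWX @ Wy                               # the two group means
R = np.eye(n1 + n2) - WX @ pWX
sigma2_hat = np.sum((Wy - WX @ beta_hat) ** 2) / np.trace(R)
t = (c @ beta_hat) / np.sqrt(sigma2_hat * (c @ pWX @ pWX.T @ c))

# The same statistic written as an unequal-variance two-sample t.
t_unequal = (con[:n1].mean() - con[n1:].mean()) / np.sqrt(s2[0] / n1 + s2[1] / n2)
```

In this simple design the whitened GLM reproduces the familiar unequal-variance two-sample statistic, which is one way to see what modelling non-identical errors buys at the second level.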
Example 2: non-identical & non-independent errors

     Stimuli:    Auditory Presentation (SOA = 4 secs) of words

                 1. Motion (“jump”): Words referred to body motion. Subjects decided
                    if the body movement was slow.
                 2. Sound (“click”): Words referred to auditory features. Subjects
                    decided if the sound was usually loud.
                 3. Visual (“pink”): Words referred to visual features. Subjects
                    decided if the visual form was curved.
                 4. Action (“turn”): Words referred to hand actions. Subjects decided
                    if the hand action involved a tool.

     Subjects:   (i) 12 control subjects

     Scanning:   fMRI, 250 scans per subject, block design

     Question:   What regions are generally affected by the semantic content
                 of the words?

     Contrast:   semantic decisions > auditory decisions on reversed
                 words (gender identification task)

                                                                 Noppeney et al. 2003, Brain
             Repeated measures ANOVA

1st level:   contrast images for 1. Motion, 2. Sound, 3. Visual, 4. Action
             (Motion =? Sound =? Visual =? Action)

2nd level:   [Figure: design matrix X and error covariance V], contrast

$c^T = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}$
                      Practical conclusions
• Linear hierarchical models are used for group analyses of multi-
  subject imaging data.

• The main challenge is to model non-sphericity (i.e. non-identity
  and non-independence of errors) within and between levels of
  the hierarchy.

• This is done using EM or ReML (which are equivalent for linear
  models).

• The summary statistics approach is a robust approximation to a
  full mixed-effects analysis.
   – Use a mixed-effects model only if seriously in doubt about the validity of
     the summary statistics approach.
                   Recommended reading



Linear hierarchical models




Mixed effect models
Thank you

								