Multilevel Models for Family and Child Development Data by sdfwerte


									Why use multilevel modelling?

             Clio Day

          Jon Rasbash
         November 2006
                           A simple question

How much of the variability in pupil attainment is attributable to
schools level factors and how much to pupil level factors?

First of all we have to define what we mean by “variability”
                    A toy example – first we calculate the mean

Two schools each with two pupils.


                                            Overall mean(0)


               School 1   School 2

 Overall mean= (3+2+(-1)+(-4))/4=0
                             Calculating the “variance”


                                           Overall mean(0)


             School 1   School 2

The total variance is the sum of the squares of the departures of the
observations around mean divided by the sample size(4) =

             The variance of the school means around the overall mean


                                                Overall mean(0)

               School 1   School 2

             The variance of the school means around the overall mean=

             Total variance =7.5
     The variance of the pupils scores around their school’s mean



               School 1   School 2
The variance of the pupils scores around their school’s mean=
 ((3-2.5)2 + (2-2.5)2 + (-1-(-2.5))2 + (-4-(-2.5))2 )/4 =1.25
The variance of the school means around overall mean = (2.52+(-
Total variance =7.5=6.25+1.25
                        Returning to our question

How much of the variability in pupil attainment is attributable to
schools level factors and how much to pupil level factors?

In terms of our toy example we can now say

6.25/7.5= 82% of the total variation of pupils attainment is
attributable to school level factors

1.25/7.5= 18% of the total variation of pupils attainment is
attributable to pupil level factors
    Now lets do the same thing on real data(65 schools+4000
                                                             Overall mean=0
                                                            (attainment scaled
                                                            to have 0 mean

                                                             Total variation = 1

                                                             Variance of school
                                                             means around
                                                             overall mean=0.15

                                                              Variance of
                                                              pupil’s attainment
                                                              scores around
85% of variability due to pupil level factors, 15% due to     school mean=0.85
school level factors
                  Estimating parameters of distributions

The multilevel model assumes the school means and the pupils
departures around their school means are Normally distributed.

The Normal distribution has two
parameters: the mean and the
variance. Our model estimates
N(0,0.85) for pupil within school
effects and N(0,0.15) for school

We gain a great deal of
modelling power and
flexibility by making these
Normality assumption
  Can we explain the variation at the school and pupil levels?
With educational data we typically want to take account of pupil intake ability
when they enter school. In our data set the plot looks like this:
  Can we explain the variation at the school and pupil levels?

What might happen to the between school and between pupil within school
variances when we correct for prior ability?

         Prior ability                     Prior ability

 Both the between school and between pupil within school
 variation will be reduced when we model mean attainment as a
 function of prior ability
                      And on the real world data…

Recall that the between school and between pupil variances before
taking account of prior ability were 0.15 and 0.85 respectively.

 After taking account of prior ability between school and
 between pupil variances are reduced to 0.092 an 0.566.

 So accounting for prior ability explains 39% and 35% of between school
 and between pupil variation.

The effect of taking account of prior attainment is important but not
entirely surprising. School variability is partly determined by school
intake profile and pupil prior attainment is a good predictor of
subsequent attainment.
                        Obvious next questions
Are there other school level variables that can explain why schools
Does school gender(mixed school, boy school, girl school) explain
some of the between school variation?

                                                If we (additionally to prior
                                                ability) allow the mean to
                                                be a function of school
                                                gender the between school
                                                variance is reduced by
Focussing on school gender differences

                       mixed school   boy school   girl school
                      Problems with traditional techniques

How much between school variation is there? What school level predictor
variables can explain some of this variation?

This line of enquiry is powerful. Traditional statistical analysis techniques can
not pursue this exploratory avenue.

Estimate the between school variability OR

Estimate school level predictors such as school gender

                              But not both
                                      It gets worse
                                                      If we fit a single level
                                                      model we get incorrect
                                                      uncertainty intervals
                                                      leading to incorrect

                                                      This is because the
                                                      single level model
                                                      ignores the clustering
                                                      effect of schools

   mixed school   boy school   girl school

The distributional assumptions made by the multilevel model allows the
estimation of between school variance and school level predictors.
                         Variance is our business

We have modelled mean attainment as a function of prior ability
and school gender

We have simultaneously modelled the total variation in attainment as
function of school and pupil levels.

Traditional modelling techniques are unable to partition the variation in
this way and just estimate a single term and refer to it as “error”.

 We think this variation is not error, it contains a lot of interesting
 structure. Multilevel modelling is a great tool for exploring the structure
 in the “error” term.
                       Is there a family effect?

Recent studies in developmental psychology and behavioural
genetics(BG) emphasise non-shared environment and genetic
influences are much more important in explaining children’s
adjustment than shared environment has led to a focus on non-
shared environment.(Plomin et al, 1994; Turkheimer&Waldron,

Multilevel modelling can replicate the BG analysis. It can also
extend them to more reasonably represent the complexities of
family structures and processes. When this is done persistent
family effects are found.
10 schools two scenarios
                       Is there a family effect?
Recent studies in developmental psychology and behavioural
genetics emphasise non-shared environment and genetic
influences are much more important in explaining children’s
adjustment than shared environment has led to a focus on non-
shared environment.(Plomin et al, 1994; Turkheimer&Waldron,

My collaborators from psychology Jenny Jenkins(Toronto University)
and Tom O’Connor(Rochester University) were concerned that perhaps
the analytic techniques being used might have some simplifying
assumptions that made it difficult to pick up the shared family context.
They were interested to see if applying multilevel models, with the
recognised strengths in exploring contextual effects might turn up some
different findings.
                            Two analyses

1. Understanding the sources of differential parenting: the role of child
and family level effects. Jenny Jenkins, Jon Rasbash and Tom O’Connor
Developmental Psychology 2003(1) 99-113

2. Applying social network models to within family processes.
Currently being written up for publication.
                Differential parental treatment
•One key aspect of the non-shared environment that has been
investigated is differential parental treatment of siblings.
•Differential treatment predicts differences in sibling
•What are the sources of differential treatment?
•Child specific/non-shared: age, temperament, biological
•Can family level shared environmental factors influence
differential treatment?
The Stress/Resources Hypothesis

Do family contexts(shared environment) increase or decrease the extent
to which children within the same family are treated differently?

“Parents have a finite amount of resources in terms of time, attention,
patience and support to give their children. In families in which most of
these resources are devoted to coping with economic stress, depression
and/or marital conflict, parents may become less consciously or
intentionally equitable and more driven by preferences or child
characteristics in their childrearing efforts”. Henderson et al 1996.

This is the hypothesis we wish to test. We operationalised the
stress/resources hypothesis using four contextual variables:
socioeconomic status, single parenthood, large family size, and marital
                                  Modelling the mean and variance simultaneously

                     We show a possible pattern of how the mean, within family variance and
                     between family variance might behave as functions of HSES in the schematic
                     diagram below.
                                                               Here are 5 families of increasing
                                                               HSES(in the actual data set there
                                                               are 3900 families.
                                                               We can fit a linear function of SES
positive parenting

                                                               to the mean.
                                                                The family means now vary around
                                                                the dashed trend line. This is now
                                                                the between family variation;
                                                                which is pretty constant wrt HSES

                     However, the within family variation(measure of differential
                     parenting) decreases with HSES – this supports the SR hypothesis.
               Conclusion on differential parental treatment

• We have found strong support for the stress/resources hypothesis. That
  is although differential parenting is a child specific factor that drives
  differential adjustment, differential parenting itself is influenced by
  family factors such as HSES.
• This challenges the current tendency in developmental psychology and
  behavioural genetics to focus on child specific factors.
• Multilevel models which model both the mean and the variability
  simultaneously are needed to uncover these relationships.
Deconstructing relationships: what determines how people
get on within a family?

family Culture
 the individual

the dyad(the two people relating)

   Applying models from social network theory to family data

Non-Shared Environment Adolescent Development(NEAD) data set, Reiss et al(1994).

•2 wave longitudinal family study, designed for testing hypothesis about genetic
and environmental effects
•277 full-sib pairs, 109 half-sib pairs, 130 unrelated pairs, 93 DZ twins and 99 MZ
twins, aged between 9 and 18 years
•Wave 2 followed 3 years after wave 1 and any families where the older sib was
older than 18 were not followed up.
•A wide range of self-report, parental-report and observer variables were collected.
•All families had 2 parents and 2 kids of the same sex.
•We focus here on data on relationship quality collected by observers.
                                  Within family structure

    We start with 12 relationship scores in each family. These can be classified :

     actor partner dyad
                                     Family 1…

  Dyad          d1           d2          d3            d4          d5         d6

Actor:               c1                       c2                  m                      f

Relationship: c1c2 c1m c1f        c2c1 c2m c2f        mc1 mc2 mf          fc1 fc2 fm

Partner:        c1                 c2                    m                           f

   This model is the multilevel social relations model-Snijders+Kenny(1999)
        Useful diagrams for thinking about multilevel structure

         The relationship scores are contained within a cross classification of actor, dyad and
         partner and all of this structure is nested within families. This can structure can be
         shown diagramatically with:

        A unit diagram – one node per unit                                       A classification diagram with one node per

                                          Family 1…

  Dyad       d1         d2           d3           d4     d5         d6

                                                                                               actor        partner         dyad
Actor:             c1                     c2            m                    f

Relationship: c1c2 c1m c1f   c2c1 c2m c2f     mc1 mc2 mf    fc1 fc2 fm

Partner:      c1                c2              m                        f                             Relationship score
                    Interpretation of variance components
Family:the extent to which family level factors effect all the relationships in a
Actor: the extent to which individuals act similarly across relationships with
other family members(actor stability, trait-like behaviour)
Partner: We actually have two traits operating, in addition to the trait of
common acting to other family members we also have the trait of elicitation
from other family members. The greater the partner variance component the
greater the evidence for such a trait operating.
Dyad: The extent to which relationship quality is specific to the dyad. A high
dyad random effect means that the relationship score from joe->fred is similar of
that from fred->joe. In social network theory this is known as reciprocity.
Reciprocity is a context specific effect(non trait-like)
Relationship: residual variation across relationships in relationship quality.
                           Results of SRM more detail
Table shows variance partition coefficients

                  Pos         Neg                       For positivity 44% of the
                  SRM         SRM                       variablity is attributable to actors
                                                        indicating that individuals act in a
      Family      0.12        0.19                      consistent way across relationships
      Actor       0.44        0.12                      with other family members. There
      Partner     0.01        0.03                      is a strong actor trait component to
      Dyad        0.18        0.41                      For negativity 0.41 of the
      Relat.      0.25        0.24                      variability is attributable to dyad.
      -2loglike   10225.7     17800.9                   Indicating the dyad is an
                                                        important structure in
                                                        determining negativity in
                                                        relationships. There is a strong
                                                        context specific component to
     There is little evidence of an elicitation or partner trait for either response.

  At the family level there are stronger effects for negativity than positivity.
      Modelling the mean relationship quality in terms of role
 The basic unit, a relationship, has an actor and a partner. Actors and
 partners are classified into the roles of children, mothers and fathers by the
 two categorical variables actor_role and partner_role.
relation         Actor_role                    Partner_role
ship     child    mother      father   child     mother   father
c1c2    1        0           0        1         0        0             We use child as the
c1m     1        0           0        0         1        0           reference category for
c1f     1        0           0        0         0        1               actor_role and
c2c1    1        0           0        1         0        0
c2m     1        0           0        1         1        0
c2f     1        0           0        0         0        1
mc1     0        1           0        1         0        0
mc2     0        1           0        1         0        0
mf      0        1           0        0         0        1
fc1     0        0           1        1         0        0
fc2     0        0           1        1         0        0
fm      0        0           1        0         1        0
               Including actor and partner roles-positivity
                param(se)      param(se)       Modelling actor and partner role
                                               drops likelihood by over 1000
                                               units with 4df.
intercept       2.834(0.011)   2.263(0.014)
                                                 The effect is dominated by the actor
a_mother        -              0.502(0.016)      role categories. With mothers and
                                                 then fathers being much more
a_father        -              0.351(0.016)
                                                 positive as actors than the reference
p_mother        -              0.021(0.011)      category child.

p_father        -              -0.032(0.011)      These actor_role role variables
                                                  explain over 50% of the actor level
random                                            variance.
family          0.034(0.004)   0.050(0.004)       Adding interactions between
actor           0.124(0.005)   0.061(0.004)       actor_role and partner-role does not
                                                  improve the model.
partner         0.003(0.002)   0.001(0.002)
                                                 Since we have explained actor level
dyad            0.050(0.003)   0.051(0.003)      variance this means actor role
relationship    0.073(0.002)   0.073(0.002)      explains the some of the trait
                                                 component of relationship positivity.
-2loglike       10225.7        9092.64
              Graphing actor and partner role effects for positivity

                                               The graph shows actor_role having a
                                               big effect on relationship quality and
                                               partner role having a marginal effect.

actor child          actor m       actor f
               Including actor and partner roles-negativity
                   param(se)      param(se)
                                                      Now an interaction is required
fixed                                                 between actor_role and
intercept          0.348(0.018)   0.729(0.027)        partner_role. Note the interaction
                                                      categories a_moth*p_moth and
a_mother           -              -0.375(0.030)       a_fath*p_fath structurally do not
a_father           -              -0.516(0.031)       exist.
p_mother           -              -0.319(0.028)       Modelling actor and partner role
p_father           -              -0.625(0.028)       and the interaction drops the
                                                      loglike by 500 units with 6df.
a_moth*p_fath      -              0.359(0.040)
a_fath*p_moth      -              0.563(0.040)    }   Note the main drop in the variance
random                                                occurs at the dyad level which
                                                      reduces by 15%. This means
family             0.137(0.012)   0.144(0.012)        modelling actor and partner roles
actor              0.082(0.006)   0.087(0.006)        has explained context specific
partner            0.022(0.005)   0.018(0.004)        variation in relationship quality for
dyad               0.282(0.010)   0.239(0.009)
relationship       0.165(0.005)   0.162(0.005)
-2loglike          17800.9        17305.18
                   Graphing actor and partner role effects for negativity

                                                     With respect to actor and partner roles
                                                     the main context specific effects for
                                                     relationship quality occur in
                                                     relationships where the child is an

                                                     Whether the partner is another child, a
                                                     mother or a father greatly effects the
                                                     negativity of the predicted relationship
     actor child           actor m       actor f

A possible psychological explanation for this pattern is that negativity is “ high
stakes” behaviour. The amount of negativity a child feels “safe” to express is
determined by the power/authority of the partner.

Note that parents are trait-like wrt actor negativity effects.
                                      Genetic effects

 Individuals exhibit some trait-like behaviour for both relationship positivity and
 negativity. With individuals exhibiting stronger trait-like behaviour for relationship

 Such trait-like behaviour may have a genetic component.

 The standard behavioural genetics model for children within families estimates
 shared environment(family), non-shared environment(individual) and genetic
 components of variation.
Our structure is more complex in that the lowest level is not the individual but a
relationship between two individuals. Also we have a dyad component of variation
and the individual component of variation is split into actor and partner components.

However, we can extend the basic BG model (which incorporates some questionable
assumptions) to our structure. The extended model gives heritabilities (genetic
variance)/(total variance) of 0.42 and 0.16 for positivity and negativity respectively.
The actor and partner variance components were reduced with the inclusion of genetic
effects but the family variance component was undiminished.
                             Stability of effects over time
The data has two waves where the same relationships were measured three years later.
This allows us to explore the stability of family, actor, partner, dyad and relationship
effects over time.
We can operationalise the longitudinal structure by fitting a multivariate response social
relations model where the first response is the time 1 relationship score and the second the
time 2 relationship score.
We simultaneously estimate all variance components for each response
 and the following correlations
     time 1 relationship score                              time 2 relationship score

              family                                                  family

   actor     partner     dyad                              actor     partner     dyad

           Relationship score                                      Relationship score
                      Stability – results of two bivariate SRM
                    Positivity                     Negativity           The basic patterns of the
           w1 vpc    w2 vpc      12      w1 vpc     w2 vpc      12    vpc’s found in wave 1 are
                                                                        repeated in wave 2 for
family     0.11      0.12        0.77     0.20       0.17        0.8    both positivity and
actor      0.44      0.46        0.87     0.11       0.11        0.67
                                                                        Family effects are
partner    0.01      0.01        1.5??    0.03       0.04        0.88   very stable over time
dyad       0.17      0.12        0.15     0.42       0.41        0.34   for both positivity (12
                                                                        = 0.77) and negativity
relat.     0.26      0.29        0.11     0.25       0.27        0.16   (12=0.8). Family
                                                                        effects are a bit
 Actor effects are stronger for positivity than negativity but          stronger for negativity.
 stability across time is high for both actor behaviours(0.87
 and 0.67)

  Dyad effects are much stronger for negativity than positivity. But
  the stability of dyad effects for both behaviours is lower than
  actor, partner and family effect stabilities. Dyads are more stable
  for negativity than positivity.
                        A comment on family effects

Developmental psychology and behavioural genetics, .(Plomin et al, 1994;
Turkheimer&Waldron, 2000). Have suggested that after taking account of
genetic and individual level factors there is scant evidence for family level
effects. Our work shows strong family level effects, that persist over time, even
when genetic, actor, partner, dyad and relationship level variance components
are included in the model.
Part of the previous failure to find family effects may be the analytical strategy
of breaking down families into series of overlapping dyads and analyising each
dyad separately. This strategy is probably in part determined by the
methodology available to the researchers.
    A comment on dyad effects for relationship negativity

For relationship negativity we saw large dyad effects and relatively low stability
over time.
This means that at wave 1 there is a large within family variability in dyad
negativity and likewise at wave 2. However the dyads which are most and least
negative within the family are to an extent switching around. The next step is to see
if we can find some systematic pattern to these dyadic dynamics for relationship
  Alspac data – an example of highly complex multilevel
 All the children born in the Avon area in 1990 followed up

 Many measurements made including educational
 attainment measures

Children span 3 school year cohorts(say 1994,1995,1996)
Suppose we wish to model development of numeracy over
the schooling period. We may have the following attainment
measures on a child :

 m1 m2 m3 m4              m5 m6 m7 m8
 primary school           secondary school
                    Structure for primary schools

                       Primary school

                       P School Cohort

                            Pupil               P. Teacher

                     M. Occasion

 •Measurement occasions within pupils
•At each occasion there may be a different teacher

 •Pupils are nested within primary school cohorts
•All this structure is nested within primary school
• Pupils are nested within residential areas
              A mixture of nested and crossed relationships

                         Primary school

                         P School Cohort

                               Pupil              P. Teacher

                        M. occasions
  Nodes directly connected by a single arrow are nested, otherwise nodes are cross-
  classified. For example, measurement occasions are nested within pupils. However,
  cohort are cross-classified with primary teachers, that is teachers teach more than one
  cohort and a cohort is taught by more than one teacher.
                          T1           T2    T3
             Cohort 1     95           96    97
             Cohort 2     96           97    98
             Cohort 3     98           99    00
                                  Multiple membership

It is reasonable to suppose the attainment of a child in a particualr year is
influenced not only by the current teacher, but also by teachers in previous
years. That is measurements occasions are “multiple members” of teachers.
         m1      m2          m3         m4

         t1     t2      t3         t4

                                                    Primary school

                                                    P School Cohort
We represent this in          Area
the classification
diagram by using a                                        Pupil                 P. Teacher
double arrow.

                                                   M. occasions
                              What happens if pupils move area?
                    Primary school

                                                                               Classification diagram
 Area          P School Cohort                                                 without pupils moving
                                                P. Teacher
                                                                               residential areas

                 M. occasions

   If pupils move area, then pupils are no longer nested within areas. Pupils and areas are cross-classified.
   Also it is reasonable to suppose that pupils measured attainments are effected by the areas they have
   previously lived in. So measurement occasions are multiple members of areas

                     Primary school

                                                                            Classification diagram
                P School Cohort                  P. Teacher                 where pupils move between
                                                                            residential areas
Area                  Pupil

                  M. occasions
              If pupils move area they will also move schools
                    Primary school

                                                                  Classification diagram
              P School Cohort                                     where pupils move between
                                              P. Teacher
                                                                  areas but not schools
Area                 Pupil

                  M. occasions

If pupils move schools they are no longer nested within primary school or primary school
cohort. Also we can expect, for the mobile pupils, both their previous and current cohort
and school to effect measured attainments

                             Primary school
                                                                          Classification diagram
                                                                          where pupils move
Area      Pupil                    P School Cohort         P. Teacher     between schools and

                    M. occasions
       If pupils move area they will also move schools cnt’d

  And secondary schools…

                        Primary school

Area     Pupil                  P School Cohort   P. Teacher

                 M. occasions

  We could also extend the above model to take account of Secondary school,
  secondary school cohort and secondary school teachers.
                       So why use multilevel models?
  It gives the correct answers for the standard errors of regression
  coefficients(in the presence of clustering). Thereby protecting against
  incorrect inferences(school gender example).

Modelling the variance(in addition to the mean) gives a framework that
allows a greater range of questions. For example, how does variability in
parental treatment of sibs partition between and within families? Does the
within family variance change as a function of social class?(As in the
differential parenting example)

Multilevel models extend to handle situations where there are multiple
classifications arranged in nested, crossed and multiple membership
relations. For example in the social relations model with relationship score,
actor, partner, dyad, family and genetic effects.
                          Other predictor variables

Remember we are partitioning the variability in attainment over time
between primary school, residential area, pupil, p. school cohort,
teacher and occasion. We also have predictor variables for these
classifications, eg pupil social class, teacher training, school budget and
so on. We can introduce these predictor variables to see to what extent
they explain the partitioned variability.

To top