Why use multilevel modelling?
Clio Day
Jon Rasbash
GSOE, November 2006

A simple question
How much of the variability in pupil attainment is attributable to school level factors and how much to pupil level factors?
First of all we have to define what we mean by “variability”.

A toy example – first we calculate the mean
Two schools, each with two pupils.
School 1 attainment: 3, 2
School 2 attainment: -1, -4
Overall mean = (3 + 2 + (-1) + (-4))/4 = 0

Calculating the “variance”
The total variance is the sum of the squares of the departures of the observations from the overall mean, divided by the sample size (4):
(9 + 4 + 1 + 16)/4 = 7.5

The variance of the school means around the overall mean
School 1 mean = 2.5, School 2 mean = -2.5, overall mean = 0
The variance of the school means around the overall mean = (2.5² + (-2.5)²)/2 = 6.25
Total variance = 7.5

The variance of the pupils' scores around their school’s mean
The variance of the pupils' scores around their school’s mean = ((3-2.5)² + (2-2.5)² + (-1-(-2.5))² + (-4-(-2.5))²)/4 = 1.25
The variance of the school means around the overall mean = (2.5² + (-2.5)²)/2 = 6.25
Total variance = 7.5 = 6.25 + 1.25

Returning to our question
How much of the variability in pupil attainment is attributable to school level factors and how much to pupil level factors?
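The toy calculation can be reproduced in a few lines of Python. This is only a sketch of the arithmetic on the four toy scores; the variable names are illustrative, and the simple average of school means works here because both schools have the same number of pupils.

```python
# Toy data from the slides: two schools, two pupils each.
schools = {"School 1": [3, 2], "School 2": [-1, -4]}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


scores = [y for pupils in schools.values() for y in pupils]
overall_mean = mean(scores)

# Total variance: squared departures from the overall mean, divided by n.
total_var = mean([(y - overall_mean) ** 2 for y in scores])

# Between-school variance: squared departures of the school means
# from the overall mean, divided by the number of schools.
school_means = {name: mean(pupils) for name, pupils in schools.items()}
between_var = mean([(m - overall_mean) ** 2 for m in school_means.values()])

# Within-school variance: squared departures of each pupil
# from their own school's mean, divided by n.
within_var = mean(
    [(y - school_means[name]) ** 2 for name, pupils in schools.items() for y in pupils]
)

print(total_var, between_var, within_var)  # 7.5 6.25 1.25
print(between_var / total_var, within_var / total_var)  # school and pupil shares
```

The two shares printed on the last line sum to one, which is the partition of the total variance that the question asks about.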
In terms of our toy example we can now say
6.25/7.5 = 83% of the total variation of pupils' attainment is attributable to school level factors
1.25/7.5 = 17% of the total variation of pupils' attainment is attributable to pupil level factors

Now let's do the same thing on real data (65 schools, 4000 pupils)
Overall mean = 0 (attainment scaled to have 0 mean overall)
Total variation = 1
Variance of school means around overall mean = 0.15
Variance of pupils' attainment scores around school mean = 0.85
85% of variability is due to pupil level factors, 15% is due to school level factors

Estimating parameters of distributions
The multilevel model assumes the school means and the pupils' departures around their school means are Normally distributed.
The Normal distribution has two parameters: the mean and the variance.
Our model estimates N(0, 0.85) for pupil within school effects and N(0, 0.15) for school effects.
We gain a great deal of modelling power and flexibility by making these Normality assumptions.

Can we explain the variation at the school and pupil levels?
With educational data we typically want to take account of pupils' ability at intake, when they enter school. In our data set the plot looks like this:
[Figure: attainment plotted against prior ability]

Can we explain the variation at the school and pupil levels?
What might happen to the between school and between pupil within school variances when we correct for prior ability?
[Figure: two panels, horizontal axis: prior ability]
Both the between school and between pupil within school variation will be reduced when we model mean attainment as a function of prior ability.

And on the real world data…
Recall that the between school and between pupil variances before taking account of prior ability were 0.15 and 0.85 respectively.
After taking account of prior ability the between school and between pupil variances are reduced to 0.092 and 0.566.
So accounting for prior ability explains 39% and 35% of between school and between pupil variation.
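One common way to write down the model behind these figures is the two-level variance components model. The notation below (y for attainment, i for pupil, j for school, x for prior ability, and the sigma symbols) is assumed here for illustration and does not appear on the slides; the numbers are the ones quoted above.

```latex
\begin{align*}
y_{ij} &= \beta_0 + u_j + e_{ij}, \qquad
  u_j \sim N(0, \sigma^2_u), \quad e_{ij} \sim N(0, \sigma^2_e) \\[4pt]
\hat\sigma^2_u &= 0.15, \quad \hat\sigma^2_e = 0.85, \qquad
  \frac{\sigma^2_u}{\sigma^2_u + \sigma^2_e} = \frac{0.15}{0.15 + 0.85} = 0.15 \\[4pt]
y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \qquad
  \hat\sigma^2_u = 0.092, \quad \hat\sigma^2_e = 0.566
\end{align*}
```

The first line is the model without predictors, the second gives the share of variance at the school level, and the third adds prior ability to the mean, after which both variance estimates shrink.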
The effect of taking account of prior attainment is important but not entirely surprising. School variability is partly determined by school intake profile, and pupil prior attainment is a good predictor of subsequent attainment.

Obvious next questions
Are there other school level variables that can explain why schools differ?
Does school gender (mixed school, boy school, girl school) explain some of the between school variation?
If, in addition to prior ability, we allow the mean to be a function of school gender, the between school variance is reduced by 13%.

Focussing on school gender differences
[Figure: mixed school, boy school, girl school]

Problems with traditional techniques
How much between school variation is there?
What school level predictor variables can explain some of this variation?
This line of enquiry is powerful. Traditional statistical analysis techniques cannot pursue this exploratory avenue. They can
estimate the between school variability
OR
estimate school level predictors such as school gender
but not both.

It gets worse
If we fit a single level model we get incorrect uncertainty intervals, leading to incorrect inferences.
This is because the single level model ignores the clustering effect of schools.
[Figure: mixed school, boy school, girl school]
The distributional assumptions made by the multilevel model allow the estimation of between school variance and school level predictors.

Variance is our business
We have modelled mean attainment as a function of prior ability and school gender.
We have simultaneously modelled the total variation in attainment as a function of school and pupil levels.
Traditional modelling techniques are unable to partition the variation in this way; they just estimate a single term and refer to it as “error”.
We think this variation is not error: it contains a lot of interesting structure. Multilevel modelling is a great tool for exploring the structure in the “error” term.

Is there a family effect?
Recent studies in developmental psychology and behavioural genetics (BG) emphasise that non-shared environment and genetic influences are much more important in explaining children’s adjustment than shared environment. This has led to a focus on non-shared environment (Plomin et al, 1994; Turkheimer & Waldron, 2000).
Multilevel modelling can replicate the BG analyses. It can also extend them to represent the complexities of family structures and processes more reasonably. When this is done, persistent family effects are found.
My collaborators from psychology, Jenny Jenkins (Toronto University) and Tom O’Connor (Rochester University), were concerned that the analytic techniques being used might rest on simplifying assumptions that made it difficult to pick up the shared family context.
They were interested to see whether applying multilevel models, with their recognised strengths in exploring contextual effects, might turn up some different findings.

Two analyses
1. Understanding the sources of differential parenting: the role of child and family level effects. Jenny Jenkins, Jon Rasbash and Tom O’Connor. Developmental Psychology 2003(1) 99-113
2. Applying social network models to within family processes. Currently being written up for publication.

Differential parental treatment
•One key aspect of the non-shared environment that has been investigated is differential parental treatment of siblings.
•Differential treatment predicts differences in sibling adjustment
•What are the sources of differential treatment?
•Child specific/non-shared: age, temperament, biological relatedness
•Can family level shared environmental factors influence differential treatment?

The Stress/Resources Hypothesis
Do family contexts (shared environment) increase or decrease the extent to which children within the same family are treated differently?
“Parents have a finite amount of resources in terms of time, attention, patience and support to give their children. In families in which most of these resources are devoted to coping with economic stress, depression and/or marital conflict, parents may become less consciously or intentionally equitable and more driven by preferences or child characteristics in their childrearing efforts”. Henderson et al 1996.
This is the hypothesis we wish to test. We operationalised the stress/resources hypothesis using four contextual variables: socioeconomic status, single parenthood, large family size, and marital conflict.

Modelling the mean and variance simultaneously
We show a possible pattern of how the mean, within family variance and between family variance might behave as functions of HSES in the schematic diagram below.
[Schematic diagram: positive parenting against HSES]
Here are 5 families of increasing HSES (in the actual data set there are 3900 families).
We can fit a linear function of SES to the mean of positive parenting.
The family means now vary around the dashed trend line. This is now the between family variation, which is pretty constant with respect to HSES.
However, the within family variation (a measure of differential parenting) decreases with HSES. This supports the stress/resources hypothesis.

Conclusion on differential parental treatment
• We have found strong support for the stress/resources hypothesis. That is, although differential parenting is a child specific factor that drives differential adjustment, differential parenting itself is influenced by family factors such as HSES.
• This challenges the current tendency in developmental psychology and behavioural genetics to focus on child specific factors.
• Multilevel models which model both the mean and the variability simultaneously are needed to uncover these relationships.

Deconstructing relationships: what determines how people get on within a family?
[Diagram: culture, family, the individual, the dyad (the two people relating), genes]

Applying models from social network theory to family data
Non-Shared Environment Adolescent Development (NEAD) data set, Reiss et al (1994).
•2 wave longitudinal family study, designed for testing hypotheses about genetic and environmental effects
•277 full-sib pairs, 109 half-sib pairs, 130 unrelated pairs, 93 DZ twins and 99 MZ twins, aged between 9 and 18 years
•Wave 2 followed 3 years after wave 1, and any families where the older sib was older than 18 were not followed up.
•A wide range of self-report, parental-report and observer variables were collected.
•All families had 2 parents and 2 kids of the same sex.
•We focus here on data on relationship quality collected by observers.

Within family structure
We start with 12 relationship scores in each family. These can be classified by actor, partner and dyad:
Family 1…
Dyad: d1 d2 d3 d4 d5 d6
Actor: c1 c2 m f
Relationship: c1c2 c1m c1f c2c1 c2m c2f mc1 mc2 mf fc1 fc2 fm
Partner: c1 c2 m f
This model is the multilevel social relations model (Snijders & Kenny, 1999).

Useful diagrams for thinking about multilevel structure
The relationship scores are contained within a cross classification of actor, dyad and partner, and all of this structure is nested within families. This structure can be shown diagrammatically with:
A unit diagram, with one node per unit
A classification diagram, with one node per classification
[Unit diagram: Family 1 with dyads d1 to d6, actors and partners c1, c2, m, f, and the 12 relationships]
[Classification diagram: family, actor, partner, dyad, relationship score]

Interpretation of variance components
Family: the extent to which family level factors affect all the relationships in a family.
Actor: the extent to which individuals act similarly across relationships with other family members (actor stability, trait-like behaviour).
Partner: we actually have two traits operating. In addition to the trait of acting in a common way towards other family members, we also have the trait of elicitation from other family members. The greater the partner variance component, the greater the evidence for such a trait operating.
Dyad: the extent to which relationship quality is specific to the dyad. A high dyad random effect means that the relationship score from joe->fred is similar to that from fred->joe. In social network theory this is known as reciprocity. Reciprocity is a context specific effect (non trait-like).
Relationship: residual variation across relationships in relationship quality.

Results of SRM, more detail
Table shows variance partition coefficients

             Pos SRM    Neg SRM
Family       0.12       0.19
Actor        0.44       0.12
Partner      0.01       0.03
Dyad         0.18       0.41
Relat.       0.25       0.24
-2loglike    10225.7    17800.9

For positivity 44% of the variability is attributable to actors, indicating that individuals act in a consistent way across relationships with other family members. There is a strong actor trait component to positivity.
For negativity 41% of the variability is attributable to dyad, indicating that the dyad is an important structure in determining negativity in relationships. There is a strong context specific component to negativity.
There is little evidence of an elicitation or partner trait for either response.
At the family level there are stronger effects for negativity than positivity.

Modelling the mean relationship quality in terms of role
The basic unit, a relationship, has an actor and a partner. Actors and partners are classified into the roles of children, mothers and fathers by the two categorical variables actor_role and partner_role.
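As a rough illustration of this structure, the sketch below enumerates the 12 directed relationships in one family, assigns each to its dyad, and builds the actor_role and partner_role indicators. The member labels follow the slides (c1, c2, m, f); the dictionary layout and column names such as actor_child are assumptions made for the example, not the authors' data format.

```python
from itertools import permutations

# One family: two children, a mother and a father (labels as on the slides).
members = {"c1": "child", "c2": "child", "m": "mother", "f": "father"}
roles = ["child", "mother", "father"]

rows = []
for actor, partner in permutations(members, 2):
    # A dyad is the unordered pair, so c1->m and m->c1 share one dyad.
    dyad = "-".join(sorted([actor, partner]))
    row = {
        "relationship": actor + partner,
        "actor": actor,
        "partner": partner,
        "dyad": dyad,
    }
    # 0/1 indicators for the role of the actor and of the partner.
    for role in roles:
        row["actor_" + role] = int(members[actor] == role)
        row["partner_" + role] = int(members[partner] == role)
    rows.append(row)

for row in rows:
    print(row["relationship"], row["dyad"],
          [row["actor_" + r] for r in roles],
          [row["partner_" + r] for r in roles])
```

With child as the reference category, only the mother and father indicators would enter the fixed part of a model; the child indicators are kept here just to make the full coding visible.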
                actor_role                 partner_role
relationship    child  mother  father      child  mother  father
c1c2            1      0       0           1      0       0
c1m             1      0       0           0      1       0
c1f             1      0       0           0      0       1
c2c1            1      0       0           1      0       0
c2m             1      0       0           0      1       0
c2f             1      0       0           0      0       1
mc1             0      1       0           1      0       0
mc2             0      1       0           1      0       0
mf              0      1       0           0      0       1
fc1             0      0       1           1      0       0
fc2             0      0       1           1      0       0
fm              0      0       1           0      1       0

We use child as the reference category for the actor_role and partner_role variables.

Including actor and partner roles - positivity

                without roles    with roles
                param(se)        param(se)
fixed
intercept       2.834(0.011)     2.263(0.014)
a_mother        -                0.502(0.016)
a_father        -                0.351(0.016)
p_mother        -                0.021(0.011)
p_father        -                -0.032(0.011)
random
family          0.034(0.004)     0.050(0.004)
actor           0.124(0.005)     0.061(0.004)
partner         0.003(0.002)     0.001(0.002)
dyad            0.050(0.003)     0.051(0.003)
relationship    0.073(0.002)     0.073(0.002)
-2loglike       10225.7          9092.64

Modelling actor and partner role drops the -2loglike by over 1000 units with 4df.
The effect is dominated by the actor role categories, with mothers and then fathers being much more positive as actors than the reference category child.
These actor_role variables explain over 50% of the actor level variance.
Adding interactions between actor_role and partner_role does not improve the model.
Since we have explained actor level variance, this means actor role explains some of the trait component of relationship positivity.

Graphing actor and partner role effects for positivity
The graph shows actor_role having a big effect on relationship quality and partner role having a marginal effect.
[Graph panels: actor child, actor m, actor f]

Including actor and partner roles - negativity

                 without roles    with roles
                 param(se)        param(se)
fixed
intercept        0.348(0.018)     0.729(0.027)
a_mother         -                -0.375(0.030)
a_father         -                -0.516(0.031)
p_mother         -                -0.319(0.028)
p_father         -                -0.625(0.028)
a_moth*p_fath    -                0.359(0.040)
a_fath*p_moth    -                0.563(0.040)
random
family           0.137(0.012)     0.144(0.012)
actor            0.082(0.006)     0.087(0.006)
partner          0.022(0.005)     0.018(0.004)
dyad             0.282(0.010)     0.239(0.009)
relationship     0.165(0.005)     0.162(0.005)
-2loglike        17800.9          17305.18

Now an interaction is required between actor_role and partner_role. Note that the interaction categories a_moth*p_moth and a_fath*p_fath structurally do not exist.
Modelling actor and partner role and the interaction drops the -2loglike by about 500 units with 6df.
Note that the main drop in the variance occurs at the dyad level, which reduces by 15%. This means modelling actor and partner roles has explained context specific variation in relationship quality for negativity.

Graphing actor and partner role effects for negativity
With respect to actor and partner roles, the main context specific effects for relationship quality occur in relationships where the child is an actor. Whether the partner is another child, a mother or a father greatly affects the negativity of the predicted relationship quality.
[Graph panels: actor child, actor m, actor f]
A possible psychological explanation for this pattern is that negativity is “high stakes” behaviour. The amount of negativity a child feels “safe” to express is determined by the power/authority of the partner. Note that parents are trait-like with respect to actor negativity effects.

Genetic effects
Individuals exhibit some trait-like behaviour for both relationship positivity and negativity, with stronger trait-like behaviour for relationship positivity. Such trait-like behaviour may have a genetic component.
The standard behavioural genetics model for children within families estimates shared environment (family), non-shared environment (individual) and genetic components of variation.
Our structure is more complex in that the lowest level is not the individual but a relationship between two individuals. Also we have a dyad component of variation, and the individual component of variation is split into actor and partner components.
However, we can extend the basic BG model (which incorporates some questionable assumptions) to our structure.
The extended model gives heritabilities, (genetic variance)/(total variance), of 0.42 and 0.16 for positivity and negativity respectively.
The actor and partner variance components were reduced with the inclusion of genetic effects, but the family variance component was undiminished.

Stability of effects over time
The data have two waves, where the same relationships were measured three years later. This allows us to explore the stability of family, actor, partner, dyad and relationship effects over time.
We can operationalise the longitudinal structure by fitting a multivariate response social relations model where the first response is the time 1 relationship score and the second the time 2 relationship score.
We simultaneously estimate all variance components for each response and the correlations between the time 1 and time 2 effects.
[Diagram: family, actor, partner and dyad classifications for the time 1 relationship score and for the time 2 relationship score]

Stability – results of two bivariate SRM

            Positivity                        Negativity
            w1 vpc   w2 vpc   corr(w1,w2)     w1 vpc   w2 vpc   corr(w1,w2)
family      0.11     0.12     0.77            0.20     0.17     0.8
actor       0.44     0.46     0.87            0.11     0.11     0.67
partner     0.01     0.01     1.5??           0.03     0.04     0.88
dyad        0.17     0.12     0.15            0.42     0.41     0.34
relat.      0.26     0.29     0.11            0.25     0.27     0.16

The basic patterns of the vpc’s found in wave 1 are repeated in wave 2 for both positivity and negativity.
Family effects are very stable over time for both positivity (correlation 0.77) and negativity (correlation 0.8). Family effects are a bit stronger for negativity.
Actor effects are stronger for positivity than negativity, but stability across time is high for both behaviours (0.87 and 0.67).
Dyad effects are much stronger for negativity than positivity, but the stability of dyad effects for both behaviours is lower than the actor, partner and family effect stabilities. Dyads are more stable for negativity than positivity.

A comment on family effects
Developmental psychology and behavioural genetics (Plomin et al, 1994; Turkheimer & Waldron, 2000)
Have suggested that after taking account of genetic and individual level factors there is scant evidence for family level effects.
Our work shows strong family level effects that persist over time, even when genetic, actor, partner, dyad and relationship level variance components are included in the model.
Part of the previous failure to find family effects may be the analytical strategy of breaking down families into series of overlapping dyads and analysing each dyad separately. This strategy is probably in part determined by the methodology available to the researchers.

A comment on dyad effects for relationship negativity
For relationship negativity we saw large dyad effects and relatively low stability over time. This means that at wave 1 there is a large within family variability in dyad negativity, and likewise at wave 2. However, the dyads which are most and least negative within the family are to an extent switching around.
The next step is to see if we can find some systematic pattern to these dyadic dynamics for relationship negativity.

Alspac data – an example of highly complex multilevel structure
All the children born in the Avon area in 1990 followed up longitudinally
Many measurements made, including educational attainment measures
Children span 3 school year cohorts (say 1994, 1995, 1996)
Suppose we wish to model development of numeracy over the schooling period. We may have the following attainment measures on a child:
[Timeline: measurements m1 to m8, spanning primary school and secondary school]

Structure for primary schools
[Classification diagram: Primary school, P School Cohort, Area, Pupil, P. Teacher, M. occasions]
•Measurement occasions within pupils
•At each occasion there may be a different teacher
•Pupils are nested within primary school cohorts
•All this structure is nested within primary school
•Pupils are nested within residential areas

A mixture of nested and crossed relationships
[Classification diagram: Primary school, P School Cohort, Area, Pupil, P. Teacher, M. occasions]
Nodes directly connected by a single arrow are nested; otherwise nodes are cross-classified. For example, measurement occasions are nested within pupils. However, cohorts are cross-classified with primary teachers: teachers teach more than one cohort and a cohort is taught by more than one teacher.

            T1    T2    T3
Cohort 1    95    96    97
Cohort 2    96    97    98
Cohort 3    98    99    00

Multiple membership
It is reasonable to suppose the attainment of a child in a particular year is influenced not only by the current teacher, but also by teachers in previous years. That is, measurement occasions are “multiple members” of teachers.
[Diagram: measurement occasions m1 to m4 and teachers t1 to t4]
We represent this in the classification diagram by using a double arrow.
[Classification diagram: Primary school, P School Cohort, Area, Pupil, P. Teacher, M. occasions]

What happens if pupils move area?
[Classification diagram without pupils moving residential areas: Primary school, Area, P School Cohort, P. Teacher, Pupil, M. occasions]
If pupils move area, then pupils are no longer nested within areas. Pupils and areas are cross-classified. Also it is reasonable to suppose that pupils' measured attainments are affected by the areas they have previously lived in. So measurement occasions are multiple members of areas.
[Classification diagram where pupils move between residential areas: Primary school, P School Cohort, P. Teacher, Area, Pupil, M. occasions]
BUT…

If pupils move area they will also move schools
[Classification diagram where pupils move between areas but not schools: Primary school, P School Cohort, P. Teacher, Area, Pupil, M. occasions]
If pupils move schools they are no longer nested within primary school or primary school cohort. Also we can expect, for the mobile pupils, both their previous and current cohort and school to affect measured attainments.
[Classification diagram where pupils move between schools and areas: Primary school, Area, Pupil, P School Cohort, P. Teacher, M. occasions]

If pupils move area they will also move schools cnt’d
And secondary schools…
[Classification diagram: Primary school, Area, Pupil, P School Cohort, P. Teacher, M. occasions]
We could also extend the above model to take account of secondary school, secondary school cohort and secondary school teachers.

So why use multilevel models?
They give correct standard errors for regression coefficients in the presence of clustering, thereby protecting against incorrect inferences (school gender example).
Modelling the variance, in addition to the mean, gives a framework that allows a greater range of questions. For example, how does variability in parental treatment of sibs partition between and within families? Does the within family variance change as a function of social class? (As in the differential parenting example.)
Multilevel models extend to handle situations where there are multiple classifications arranged in nested, crossed and multiple membership relations. For example, the social relations model with relationship score, actor, partner, dyad, family and genetic effects.

Other predictor variables
Remember we are partitioning the variability in attainment over time between primary school, residential area, pupil, p. school cohort, teacher and occasion.
We also have predictor variables for these classifications, eg pupil social class, teacher training, school budget and so on. We can introduce these predictor variables to see to what extent they explain the partitioned variability.
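One simple way to summarise "to what extent they explain the partitioned variability" is the proportional reduction in each variance component when predictors are added. The sketch below applies that idea to the social relations model estimates quoted earlier in the talk (models without and with actor and partner roles); the function name and dictionary layout are illustrative assumptions, not part of any multilevel modelling package.

```python
def proportion_explained(before, after):
    """Proportional reduction in each variance component after adding predictors.

    A negative value means the component grew rather than shrank.
    """
    return {level: (before[level] - after[level]) / before[level] for level in before}


# Variance component estimates as quoted on the slides.
positivity_without_roles = {"family": 0.034, "actor": 0.124, "partner": 0.003,
                            "dyad": 0.050, "relationship": 0.073}
positivity_with_roles = {"family": 0.050, "actor": 0.061, "partner": 0.001,
                         "dyad": 0.051, "relationship": 0.073}

negativity_without_roles = {"family": 0.137, "actor": 0.082, "partner": 0.022,
                            "dyad": 0.282, "relationship": 0.165}
negativity_with_roles = {"family": 0.144, "actor": 0.087, "partner": 0.018,
                         "dyad": 0.239, "relationship": 0.162}

pos_reduction = proportion_explained(positivity_without_roles, positivity_with_roles)
neg_reduction = proportion_explained(negativity_without_roles, negativity_with_roles)

for level in positivity_without_roles:
    print(level, round(pos_reduction[level], 2), round(neg_reduction[level], 2))
```

For positivity the actor component falls by roughly half, and for negativity the dyad component falls by about 15%, in line with the statements made when those models were presented. The same calculation could be applied level by level to the school, area, pupil, cohort, teacher and occasion components once estimates are available.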