Mixture Modeling - FHSS Research Support Center by malj


									Mixture Modeling

   Chongming Yang
Research Support Center
     FHSS College
Mixture of Distributions
       Classification Techniques
• Latent Class Analysis (categorical indicators)
• Latent Profile Analysis (continuous indicators)
• Finite Mixture Modeling (multivariate normal variables)
• …
    Integrate Classification Models into 
               Other Models
•   Mixture Factor Analysis
•   Mixture Regressions
•   Mixture Structural Equation Modeling
•   Growth Mixture Modeling
•   Multilevel Mixture Modeling 
Disadvantages of Multistep Practice
• Multistep practice
  – Run classification model 
  – Save membership variable
  – Model membership variable and other variables 
• Disadvantages
  – Biases in parameter estimates
  – Biases in standard errors 
     • Significance
     • Confidence Intervals
        Latent Class Analysis (LCA)
• Setting
  – Latent trait assumed to be categorical
  – Trait measured with multiple categorical indicators
  – Examples: drug addiction, schizophrenia
• Aim
  – Identify heterogeneous classes/groups 
  – Estimate class probabilities
  – Identify good indicators of classes
  – Relate covariates to classes
           Graphic LCA Model
• Categorical indicators u: u1, u2, u3, …, ur
• Categorical latent variable C: C = 1, 2, …, or K
                 Probabilistic Model 
• Assumption: conditional independence of u given C,
        so that interdependence among the indicators is explained by C, as by the factor in a factor analysis model

• An item probability

• Joint Probability of all indicators 
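
Under the conditional independence assumption, the two quantities above are usually written as follows. The notation is an assumption (binary indicators, classes indexed by k), chosen to match the symbols u, C, r and K used on these slides.

```latex
% Marginal probability of endorsing item j
P(u_j = 1) = \sum_{k=1}^{K} P(C = k)\, P(u_j = 1 \mid C = k)

% Joint probability of a full response pattern
P(u_1, u_2, \ldots, u_r) = \sum_{k=1}^{K} P(C = k) \prod_{j=1}^{r} P(u_j \mid C = k)
```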

             LCA Parameters
• Class probabilities: number of classes - 1 are free
• Item probabilities: number of response categories - 1 per item in each class
      Class Means (Logit)
 Latent Class Analysis with Covariates
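
A common way to relate a covariate x to class membership is multinomial logistic regression. The form below is a sketch in assumed notation (intercepts a_k, slopes b_k, last class as reference), not necessarily the exact parameterization on the original slide.

```latex
P(C = k \mid x) = \frac{\exp(a_k + b_k x)}{\sum_{m=1}^{K} \exp(a_m + b_m x)},
\qquad a_K = b_K = 0
```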
 Posterior Probability
(membership/classification of cases)
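
By Bayes' rule, the posterior probability that a case with responses u1, …, ur belongs to class k can be written as below (notation assumed, consistent with the conditional independence model).

```latex
P(C = k \mid u_1, \ldots, u_r) =
\frac{P(C = k) \prod_{j=1}^{r} P(u_j \mid C = k)}
     {\sum_{m=1}^{K} P(C = m) \prod_{j=1}^{r} P(u_j \mid C = m)}
```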
• Maximum likelihood estimation via the Expectation-Maximization (EM) algorithm
  – E (expectation) step: compute average posterior 
    probabilities for each class and item
  – M (maximization) step: estimate class and item probabilities
  – Iterate the E and M steps to maximize the likelihood
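
The EM steps above can be sketched in plain Python for binary indicators. This is a minimal illustration, not the routine Mplus uses: the function name `lca_em`, the random starting values, the fixed iteration count, and the clamping constant are all assumptions made for the example.

```python
import random


def lca_em(data, n_classes, n_iter=200, seed=0):
    """Fit a latent class model to binary items with a plain EM loop.

    data: list of response patterns, each a list of 0/1 values.
    Returns (class_probs, item_probs, posteriors), where
    item_probs[k][j] = P(u_j = 1 | C = k).
    """
    rng = random.Random(seed)
    n, r = len(data), len(data[0])
    class_probs = [1.0 / n_classes] * n_classes
    # Random starting values keep the classes from starting out identical.
    item_probs = [[rng.uniform(0.25, 0.75) for _ in range(r)]
                  for _ in range(n_classes)]
    posteriors = []
    for _ in range(n_iter):
        # E step: posterior P(C = k | u_i) for every case, by Bayes' rule.
        posteriors = []
        for u in data:
            joint = []
            for k in range(n_classes):
                p = class_probs[k]
                for j in range(r):
                    p *= item_probs[k][j] if u[j] == 1 else 1.0 - item_probs[k][j]
                joint.append(p)
            total = sum(joint)
            posteriors.append([p / total for p in joint])
        # M step: update class and item probabilities from the posteriors.
        for k in range(n_classes):
            weight = sum(post[k] for post in posteriors)
            if weight <= 0.0:
                continue  # leave an empty class unchanged
            class_probs[k] = weight / n
            for j in range(r):
                ones = sum(post[k] * u[j] for post, u in zip(posteriors, data))
                # Clamp away from 0 and 1 so later products stay positive.
                item_probs[k][j] = min(max(ones / weight, 1e-6), 1.0 - 1e-6)
    return class_probs, item_probs, posteriors
```

The returned posteriors are the case-level membership probabilities; assigning each case to its highest-probability class gives the membership variable. In practice, several random starts are advisable because EM can stop at a local maximum.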
            Test against Data
• O = observed frequency of each response pattern
• E = model-estimated frequency of each response pattern
• Pearson chi-square

• Chi-square based on likelihood ratio  
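
Both statistics compare observed and model-estimated response-pattern frequencies. In their standard forms, the Pearson statistic is the sum of (O - E)^2 / E and the likelihood-ratio statistic is 2 times the sum of O ln(O / E), each summed over response patterns. The sketch below assumes hypothetical frequency lists and a helper name `fit_statistics` chosen for illustration.

```python
import math


def fit_statistics(observed, expected):
    """Return (Pearson chi-square, likelihood-ratio chi-square)."""
    pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Patterns with O = 0 contribute nothing to the likelihood-ratio sum.
    lr = 2.0 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)
    return pearson, lr


# Hypothetical frequencies for four response patterns.
pearson, lr = fit_statistics([10, 20, 30, 40], [25.0, 25.0, 25.0, 25.0])
```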
     Determine Number of Classes 
•   Substantive theory (parsimonious, interpretable)
•   Predictive validity
•   Auxiliary variables / covariates
•   Statistical information and tests
    – Bayesian Information Criterion (BIC)
    – Entropy
    – Testing K against K-1 Classes
       • Vuong-Lo-Mendell-Rubin likelihood-ratio test
       • Bootstrapped likelihood ratio test
Bayesian Information Criterion 

L = likelihood
h = number of parameters
N = sample size
Choose the model with the smallest BIC
A BIC difference > 4 is considered appreciable 
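
With the symbols defined above, the usual form is BIC = -2 ln L + h ln N. A small sketch follows; the log-likelihoods, parameter counts, and sample size are hypothetical values used only to show the comparison.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: -2 ln L + h ln N."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical results for 2-class and 3-class models fitted to N = 500 cases.
bic_2 = bic(-1000.0, 9, 500)
bic_3 = bic(-990.0, 14, 500)
preferred = 2 if bic_2 < bic_3 else 3  # the smaller BIC is preferred
```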
      Quality of Classification
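
Entropy, listed earlier among the statistical indices, summarizes how cleanly cases are classified. A commonly used relative-entropy form (assumed here) is 1 - [sum over cases and classes of -p ln p] / (N ln K), where p is a posterior class probability. Values near 1 indicate clear classification and values near 0 indicate poor separation.

```python
import math


def relative_entropy(posteriors):
    """Relative entropy of a table of posterior class probabilities.

    posteriors: one row per case, one column per class (K >= 2).
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # p ln p tends to 0 as p tends to 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))
```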
     Testing K against K-1 Classes 
• Bootstrapped likelihood ratio test
    LRT = 2[logL(model 1) - logL(model 2)], where
        model 2 (K-1 classes) is nested in model 1 (K classes).
Bootstrap steps:
1. Estimate both models and compute the observed LRT
2. Generate bootstrap samples under the K-1 class model, refit both 
     models to each sample, and collect the resulting LRT values
3. Compare the observed LRT with this distribution to get the p value
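
The final comparison can be sketched as below, assuming the bootstrap LRT values have already been obtained by refitting both models to each bootstrap sample. The function names are hypothetical, and the p value is taken here as the proportion of bootstrap LRT values at least as large as the observed one, which is one common convention.

```python
def lrt(loglik_k, loglik_k_minus_1):
    """Likelihood ratio statistic for K versus K-1 classes."""
    return 2.0 * (loglik_k - loglik_k_minus_1)


def bootstrap_p_value(observed_lrt, bootstrap_lrts):
    """Share of bootstrap LRT values at least as large as the observed LRT."""
    exceed = sum(1 for value in bootstrap_lrts if value >= observed_lrt)
    return exceed / len(bootstrap_lrts)
```

A small p value favors the K-class model over the K-1 class model.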
    Testing K against K-1 Classes 
• Vuong-Lo-Mendell-Rubin likelihood-ratio test
  Determine Quality of Indicators
• Good indicators
  – Item response probability is close to 0 or 1 in each class
• Bad indicators
  – Item response probability is high in more than one 
    class, like cross-loading in factor analysis
  – Item response probability is low in all classes, like 
    low loading in factor analysis
              LCA Examples
• LCA with covariates
• Class predicts a categorical outcome 
       Save Membership Variable
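    idvar = id;                ! VARIABLE command: ID variable to carry into the saved file

Savedata: File = cmmber.txt;   ! name of the saved data file
          Save = cprob;        ! save posterior class probabilities and most likely class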
      Latent Profile Analysis
          Finite Mixture Modeling
           (multivariate normal variables)
• Finite = finite number of subgroups/classes
• Variables are normally distributed in each class
• Means differ across classes 
• Variances are the same across classes
• Covariances can differ without restrictions or 
  be equal with restrictions across classes
• Latent profile analysis is the special case with 
  covariances fixed at zero.
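
To make the latent profile special case concrete, the sketch below evaluates the mixture density of one case when within-class covariances are zero, so each class density is a product of univariate normal densities. Means differ by class and variances are shared across classes, as in the bullets above. Function names and argument layout are assumptions for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Univariate normal density."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def profile_density(y, class_probs, means, variances):
    """Mixture density of one case under a latent profile model.

    means[k][j] is the class-k mean of variable j; variances[j] is shared
    by all classes; within-class covariances are fixed at zero.
    """
    total = 0.0
    for pi_k, mu_k in zip(class_probs, means):
        dens = pi_k
        for y_j, mu_kj, var_j in zip(y, mu_k, variances):
            dens *= normal_pdf(y_j, mu_kj, var_j)
        total += dens
    return total
```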
       Mixture Factor Analysis
• Allows one to examine measurement 
  properties of items in heterogeneous 
  subgroups / classes
• Measurement invariance is not required, 
  because heterogeneity is assumed
• Factor structure can change across classes
• See Mplus outputs
            Factor Mixture Analysis
• Parental Control
    Parents let you make your own decisions about the time you must be home on weekend nights
    Parents let you make your own decisions about the people you hang around with
    Parents let you make your own decisions about what you wear
    Parents let you make your own decisions about which television programs you watch
    Parents let you make your own decisions about what time you go to bed on week nights
    Parents let you make your own decisions about what you eat

• Parental Acceptance
    Feel people in your family understand you
    Feel you want to leave home
    Feel you and your family have fun together
    Feel that your family pays attention to you
    Feel your parents care about you
    Feel close to your mother
    Feel close to your father
Two dimensions of Parenting 
              Mixture SEM
• See mixture growth modeling
Mixture Modeling with Known Classes
• Identify hidden classes within known groups
• Under nonrandomized experiments 
  – Impose equality constraints on covariates to 
    identify similar classes from known groups 
  – Compare classes that differ in covariates
