Online material supplementary to Linscott et al.

ONLINE MATERIAL SUPPLEMENTARY TO:

Seeking verisimilitude in a class: A systematic review of evidence that the criterial clinical symptoms of schizophrenia are taxonic

Richard J. Linscott1-3, Judith Allardyce3, and Jim van Os3,4

1 To whom correspondence should be addressed; tel: +64 3 479 5689, fax: +64 3 479 8335, e-mail: email@example.com
2 Department of Psychology, University of Otago, P. O. Box 56, Dunedin 9054, New Zealand
3 Department of Psychiatry and Neuropsychology, South Limburg Mental Health Research and Teaching Network, EURON, Maastricht University, P. O. Box 616 (DRT 10), 6200 MD Maastricht, The Netherlands
4 Division of Psychological Medicine, Institute of Psychiatry, De Crespigny Park, Denmark Hill, London SE5 8AF, UK

Overview

In this online supplement, we first describe two statistical approaches that may yield evidence potentially relevant to questions about the latent structure (see Footnote 1) of the criterion symptoms of schizophrenia. Following these descriptions, we present a pair of simulations to illustrate their application. Finally, we illustrate reverse association using evidence reported by Kendler et al.2

Footnote 1. “Latent” in this context is statistical parlance for underlying or unobservable or indirectly inferred, and is contrasted with “manifest,” which means observable or directly measurable.1 Thus, symptoms of schizophrenia, reflected in scores on symptom rating scales, are manifest but underlying symptom factors (e.g., reality distortion, disorganization, and negative) are latent. Equally, the endophenotypes of schizophrenia, reflected in performance accuracy or reaction time or volume or current and so on, are also manifest.

Coherent Cut Kinetic (CCK) Methods

There are at least half a dozen CCK procedures, one of the most commonly used of which is maximum covariance (MAXCOV) analysis. MAXCOV analysis is a theorem-based bootstrapping procedure. It is unconventional in that it relies on visual analysis of descriptive statistics displayed graphically and is supplemented by substantive tests of consistency among the bootstrapped parameters.

CCK methods can be understood by the following example. If a sample of men (n = 100) and another of women (n = 100) are mixed and the association between the indicator variables strength and muscle mass is plotted, a linear association will be visible in the plot with no evidence for subgroups of men and women. If, however, the covariance between strength and muscle mass (on the y-axis) is plotted against a “cut” variable, shoe size (on the x-axis), the covariance will be noticeably high at the point of shoe size that overlaps most between men and women, and much lower on the right and left sides, where groups are homogeneously male or female. (An example of this pattern of covariance is illustrated in the top left panel of Figure S2.)

The principle underlying these analyses is that if two discrete groups exist that are discriminated by an indicator variable, y, it follows that the two groups will differ in the mean of y. This means that if the cases are sorted into taxon and complement groups, these groups must differ in their means on y by some value, d_y. If a second variable, x, exists that also discriminates the taxon from the complement group but is not correlated with y within either of the groups, then any cut of the distribution at a given value of x will lead to a degree of separation between the means of y above and below the cut.

The MAXCOV procedure requires a minimum of three indicators, X, Y, and Z, about which it can be assumed there is negligible within-class covariance. (This is referred to as the assumption of conditional independence or local independence in latent variable modeling.)
Under this assumption, all observed covariance in a two-class mix is determined by the class sizes (taxon base rate) and the separation between taxon and complement groups. The commingled sample is ordered along one indicator, X, cut into a sequence of sub-samples, and the covariance of the remaining indicators, cov_YZ, within each sub-sample is calculated. If the latent structure of the data is taxonic, the covariance coefficients will rise to a cusp or peak in the sub-sample in which the base rate is closest to 50%. If the latent structure is dimensional, covariance coefficients will be comparatively stable. The analysis is repeated by ordering on Y (taking cov_XZ coefficients) and then on Z (cov_XY), and median or mean covariance curves are obtained. If a peak or cusp is present, its location is used to estimate the taxon base rate, and Bayesian posterior taxon membership probabilities are obtained using the corresponding indicator cut scores. Case membership then serves as a basis for finding taxon-complement separation and within-class covariance, and for estimating goodness of fit.

As with all the methods considered here, indicators must be carefully selected. In the case of MAXCOV and other CCK methods, the method should be used iteratively to identify and eliminate inappropriate indicators.3 Indicator screening should also be performed prior to analysis where possible. Specific considerations include separation and conditional independence. For example, MAXCOV is not recommended in situations where separation between classes is likely to be less than 1.2 SD (see Footnote 2), or when the degree of correlation between indicators is so high that the assumption of conditional independence (zero nuisance covariance) is substantially violated.
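The ordering-and-cutting step of MAXCOV described in this section can be sketched as follows. This is a minimal illustration, not the procedure used to produce the results in this supplement: it uses NumPy, ten equal-sized sub-samples, and hypothetical illustration values (n = 20,000, base rate 0.25, separation of 2 within-class SD). It omits base-rate estimation, Bayesian classification, and the consistency tests. The function names `maxcov_curve` and `mean_maxcov_curve` are our own.

```python
import numpy as np

def maxcov_curve(data, cut_col, n_windows=10):
    """Covariance of the two non-cut indicators within ordered sub-samples.

    data: (n, 3) array of indicators; cut_col: column used to order and cut.
    """
    order = np.argsort(data[:, cut_col])
    y_col, z_col = [j for j in range(3) if j != cut_col]
    curve = []
    for window in np.array_split(order, n_windows):
        y, z = data[window, y_col], data[window, z_col]
        curve.append(np.cov(y, z)[0, 1])
    return np.array(curve)

def mean_maxcov_curve(data, n_windows=10):
    """Average the curves obtained by cutting on X, then Y, then Z."""
    return np.mean([maxcov_curve(data, c, n_windows) for c in range(3)], axis=0)

rng = np.random.default_rng(0)
n, base_rate, separation = 20_000, 0.25, 2.0  # hypothetical illustration values

# Taxonic data: two classes with zero within-class covariance.
is_taxon = rng.random(n) < base_rate
taxonic = rng.standard_normal((n, 3)) + separation * is_taxon[:, None]

# Dimensional data: one homogeneous population with the same overall correlation.
pq = base_rate * (1 - base_rate)
r = pq * separation**2 / (1 + pq * separation**2)
corr = np.full((3, 3), r) + (1 - r) * np.eye(3)
dimensional = rng.multivariate_normal(np.zeros(3), corr, size=n)

taxonic_curve = mean_maxcov_curve(taxonic)
dimensional_curve = mean_maxcov_curve(dimensional)
print("taxonic:    ", np.round(taxonic_curve, 2))
print("dimensional:", np.round(dimensional_curve, 2))
```

Under these assumptions the taxonic curve should rise to a peak to the right of center (where the sub-sample base rate is nearest 50%) and fall toward zero at the extremes, whereas the dimensional curve should stay comparatively flat.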
Footnote 2. In the class simulation used here, we are purposively overlooking this recommendation. Use of lesser separation may generate unstable parameter estimates.

Latent Variable Methods

A latent variable is a random variable that is unobservable and that cannot be expressed as a function of manifest or observable indicators alone.4 Instead, its existence is inferred from patterns (correlation or covariation) among observed indicators. Latent variables may be continuous (factors, traits, random effects) or categorical (finite mixture or mixture distributions, latent class variables). Continuous latent variables model unobserved dimensional constructs by explaining the observed correlations, and are found using exploratory or confirmatory factor analysis (including structural equation modeling / item response theory). Categorical latent variables model unobserved typologies or classes of individuals and are found using latent class analysis (LCA) or latent profile analysis (LPA). That is, these statistical procedures can be used to provide competing (or complementary) dimensional and categorical representations, respectively, of the same data, and interpretations of such representations are constrained by the statistical procedures themselves. However, both approaches are used simultaneously in hybrid models that explain covariance using both continuous and categorical latent variables within a generalized latent variable framework.

Exploratory and confirmatory factor analysis methods are based on the common factor model. This model specifies that single or multiple continuous latent variables (common factors) explain the correlations among the measured indicators, whereas a unique factor (with a specific and an error component) influences only one indicator and does not account for the observed correlations.
The common factor model is used to estimate the pattern of association between the common factors and each indicator; this association is indexed by loadings, while residuals measure the unique factor. Consequently, scores on one or more dimensions locate each individual. Assumptions associated with exploratory and confirmatory factor analyses are that: (a) the sample is homogeneous; (b) the latent variables are multivariate normal; (c) residuals are normally distributed; (d) factors and residuals are independent; (e) there is zero autocorrelation of residuals; and (f) relationships between indicators and common factors are linear. Under these assumptions, the distributions of the indicators are multivariate normal with no skewness or kurtosis.

Finite mixture modeling identifies categorical latent variables representing two or more subpopulations that differ qualitatively or quantitatively. Classical LCA and LPA are special cases of mixture modeling. LCA uses categorical indicators as input and LPA uses continuously scaled indicators. As in factor analysis, a single categorical latent variable explains the covariance of the measured variables, and error is indexed by residuals. In contrast to factor analysis, mixture models allow the assignment of individuals to particular classes. Classical LCA and LPA have a number of assumptions: (a) the sample is heterogeneous, that is, it is a mixture of two or more subpopulations; (b) there is local independence, that is, within-class covariance is zero; (c) the variance-covariance matrices are homogeneous across classes; and (d) the relationship between indicators and latent classes is linear. Given these assumptions, the higher order moments (i.e., skewness and kurtosis) will deviate from zero as differences between latent classes (i.e., of proportions, means, and variances) increase.
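The dependence of skewness and kurtosis on class differences can be made concrete for the simplest case. The sketch below is our own illustration, not part of the original analyses. It assumes a single indicator, two normal classes with equal unit variance, and the class with proportion `base_rate` having the higher mean. Under those assumptions the mixture has variance 1 + pqd^2, third central moment pq(q - p)d^3, and excess fourth moment pq(1 - 6pq)d^4, where d is the separation in within-class SD units and q = 1 - p.

```python
def mixture_shape(base_rate, separation):
    """Skewness and excess kurtosis of a two-class normal mixture.

    Assumes unit within-class variance in both classes; `separation` is the
    difference between class means in within-class SD units.
    """
    p, q = base_rate, 1.0 - base_rate
    variance = 1.0 + p * q * separation**2
    third = p * q * (q - p) * separation**3
    excess_fourth = p * q * (1.0 - 6.0 * p * q) * separation**4
    return third / variance**1.5, excess_fourth / variance**2

# Hypothetical separations, chosen only to show the trend.
for d in (0.0, 1.0, 2.0, 3.0):
    skew, kurt = mixture_shape(0.25, d)
    print(f"d = {d:.1f}: skewness = {skew:+.3f}, excess kurtosis = {kurt:+.3f}")
```

With a base rate of 0.25, skewness grows steadily with separation while excess kurtosis moves away from zero more slowly, so at modest separations the departure from normality can be small.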
A common assumption of latent continuous and categorical models is that the indicators are independent, conditional on the class or common factor. In factor analysis, this corresponds to the specification of uncorrelated residuals; in LCA and LPA, to conditional independence. However, recently developed generalized latent variable models take both categorical and continuous observed indicators as input. This permits the relaxation of the conditional independence assumption.5-7 One such hybrid approach, factor mixture modeling (FMM), allows complex structural relationships by simultaneously modeling common factor models within two or more latent classes. Thus, FMM is useful when there is reason to expect within-class correlations of observed variables. FMM assumptions include that: (a) the observed sample is heterogeneous, that is, the joint distribution of the observed variables is a mixture distribution; and (b) there is multivariate normality within classes. Deviation from the latter assumption may lead to over-extraction of classes. Recent simulation studies suggest that hybrid modeling distinguishes correctly between simulations with categorical and continuous latent structures, although the degree of class separation and class base rates affect performance.5-7

There are several important general considerations to bear in mind when conducting or evaluating modeling research. Firstly, at the conceptual level, the difference between categorical and continuous latent variables has important heuristic significance. However, statistically, these are structurally equivalent; the K-class model is structurally equivalent to a K – 1 factor model with continuous indicators.8, 9 Therefore, if one assumes sample heterogeneity but, in fact, indicators derive from a latent continuous variable in a homogeneous population, LCA will extract (over-extract) classes.
Likewise, assuming sample homogeneity when, in fact, indicators derive from a mixed distribution, exploratory factor analyses will extract (over-extract) factors.7

Secondly, during model fitting, comparison is made among solutions from a series of models with increasing numbers of classes, factors, or both. Interpretation of the results (the choice of which heuristic model best fits the data) cannot be made solely on the basis of fit or other indices,1, 10-12 although some have attempted to do so.13, 14 Rather, interpretation depends on both statistical fit criteria and substantive reasoning. Several fit indices are used in latent variable modeling. If the maximum likelihood algorithm is used to generate solutions, the log-likelihood indicates model fit. However, as log-likelihood increases with the number of parameters used, fit is better determined using both log-likelihood and a log-likelihood-based information criterion that favors parsimony. The most commonly encountered information criteria are the Bayesian information criterion (BIC), the sample size adjusted BIC (BIC´), and the Akaike information criterion (AIC). These three fit indices differ in the ways they correct for sample size and free parameters. Higher log-likelihood and lower information criteria scores indicate better fit. The likelihood ratio test (LRT) allows comparison of a K – 1 class mixture model with a K class mixture model. The most widely used LRT is the bootstrapped LRT.15 Finally, some software packages provide estimates of overall classification quality, or entropy, for which unity indicates perfect classification. Substantive considerations that may influence model selection involve analyses of antecedents, covariates, and distal outcomes. Multiple methodological and design variables also should not be overlooked.
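To show how the three information criteria correct the log-likelihood, the sketch below applies their standard definitions to the one-class LPA row of Table S3 (log likelihood = -5956.54, n = 700). The parameter count of 12 is our own inference (six indicators, each with a mean and a variance) and is not stated in the text; with that assumption the tabled values are reproduced. The function name is hypothetical.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, BIC, sample-size adjusted BIC) for a fitted model."""
    deviance = -2.0 * log_likelihood
    aic = deviance + 2 * n_params
    bic = deviance + n_params * math.log(n_obs)
    # The adjusted BIC replaces n with (n + 2) / 24 in the penalty term.
    adj_bic = deviance + n_params * math.log((n_obs + 2) / 24)
    return aic, bic, adj_bic

# n_params = 12 is an assumption inferred from the tabled values.
aic, bic, adj_bic = information_criteria(-5956.54, n_params=12, n_obs=700)
print(f"AIC = {aic:.2f}, BIC = {bic:.2f}, adjusted BIC = {adj_bic:.2f}")
```

Because all three indices share the same deviance term, differences among them for a given model arise only from how heavily each penalizes the number of free parameters.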
Thirdly, when the assumptions of the common factor model are met, an unmixed or single-class distribution will have near-zero skewness and kurtosis. In contrast, skewness and kurtosis deviate significantly from zero in mixed or commingled distributions. Thus, maximum likelihood estimations of latent class and exploratory factor models should be comparable using log-likelihood-based information criteria. Here, again, better fit is indicated by larger log-likelihood values.7

Simulations

Two Sample Simulations with Different Latent Structures

Consider a sample of 700 patients from a population in which all members are believed to be affected by Disorder A or Disorder B. For each individual, eight normally distributed continuous measures are available, representing scores on measures of attributes that, although not pathognomonic, are known to be associated with the two disorders. Indicators X1 to X5 are associated with Disorder A, and indicators X4 to X8 are associated with Disorder B. Thus, among these, indicators X4 and X5 are not discriminating, measuring unitary features associated with both disorders. For each measure, a higher score indicates more of the associated attribute.

Two separate data sets were constructed, one representing a latent dimensional structure in the population, the other representing a latent class structure where the true prevalence of Disorder A is 25%. The latter was constructed first, using the Stata random normal number generator, drawnorm, to generate n = 700 cases each with eight normally distributed continuous scores. Subsequently, a latent structure was introduced into the data by adding a constant to 25% of the data values, and subtracting the same constant from the remaining data values. The constant was that which resulted in a separation of 1 SD between groups’ mean scores on the discriminating indices (i.e., X1 to X3 and X6 to X8) once the data were restandardized.
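The construction of the class simulation described above could be sketched as follows. This is an approximation in NumPy rather than the Stata drawnorm routine actually used, so it will not reproduce the reported data. It assumes that X4 and X5 are left unshifted and that the shift is applied with opposite sign to X6 to X8, which is our reading of the signs in Table S1. The function names and the seed are hypothetical.

```python
import numpy as np

def shift_constant(base_rate, target_sep=1.0):
    """Constant c such that shifting the classes by +c and -c yields
    `target_sep` SD of separation after the pooled data are restandardized."""
    pq = base_rate * (1 - base_rate)
    # Raw separation d = 2c; standardized separation = d / sqrt(1 + pq * d**2).
    raw_sep = target_sep / np.sqrt(1 - pq * target_sep**2)
    return raw_sep / 2

def simulate_class_data(n=700, base_rate=0.25, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 8))
    is_a = np.zeros(n, dtype=bool)
    is_a[: round(n * base_rate)] = True           # Disorder A cases
    direction = np.array([1, 1, 1, 0, 0, -1, -1, -1])  # X4, X5 not shifted
    class_sign = np.where(is_a, 1.0, -1.0)[:, None]
    x = x + shift_constant(base_rate) * direction * class_sign
    x = (x - x.mean(axis=0)) / x.std(axis=0)      # restandardize
    return x, is_a

data, is_a = simulate_class_data()
separation = data[is_a].mean(axis=0) - data[~is_a].mean(axis=0)
print(np.round(separation, 2))
```

With only 175 cases in the smaller class, the observed separations scatter around the targets of +1, 0, and -1 SD, much as the XA - XB column of Table S1 does.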
To simulate a dimensional data set, the correlation matrix for the class simulation was obtained and used with the Stata drawnorm function to generate dimensional data with the same correlation matrix. For all indicators in both simulations, M = 0.0 and SD = 1.0. Other statistics for the data sets are presented in Table S1 and the correlation matrices in Table S2. Univariate and bivariate density plots for two example indicators from each set are illustrated in Figure S1.

Table S1. Descriptive statistics for the class and dimension simulated data sets.

| Indicator | Class: Skewness | Class: Kurtosis | Class: XA - XB | Class: Odds | Dimension: Skewness | Dimension: Kurtosis |
|---|---|---|---|---|---|---|
| X1 | 0.12 | -0.19 | 1.00 | 6.53 | 0.05 | 0.26 |
| X2 | 0.10 | -0.06 | 0.94 | 5.15 | -0.04 | -0.01 |
| X3 | 0.05 | 0.01 | 1.09 | 8.42 | -0.07 | -0.12 |
| X4 | -0.04 | -0.01 | 0.08 | 1.10 | -0.07 | -0.41 |
| X5 | 0.08 | -0.18 | 0.09 | 1.26 | -0.09 | -0.21 |
| X6 | -0.20 | -0.04 | -1.06 | 7.17 | 0.15 | 0.20 |
| X7 | -0.07 | 0.25 | -0.93 | 5.27 | 0.03 | -0.02 |
| X8 | -0.11 | -0.08 | -0.98 | 6.40 | 0.05 | 0.34 |

Table S2. Covariance / correlation matrices for the class (beneath diagonal) and dimension (above diagonal) simulated data. Parenthetical values along the diagonal are the mean absolute differences in coefficients between the two simulations.

| Indicator | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 |
|---|---|---|---|---|---|---|---|---|
| X1 | (0.03) | 0.14 | 0.25 | 0.07 | -0.09 | -0.20 | -0.19 | -0.20 |
| X2 | 0.17 | (0.03) | 0.23 | 0.01 | 0.03 | -0.20 | -0.14 | -0.23 |
| X3 | 0.20 | 0.21 | (0.03) | -0.02 | -0.04 | -0.25 | -0.14 | -0.28 |
| X4 | 0.02 | 0.03 | -0.03 | (0.03) | 0.07 | -0.03 | -0.02 | -0.08 |
| X5 | -0.02 | 0.07 | 0.04 | 0.04 | (0.05) | -0.03 | 0.00 | 0.09 |
| X6 | -0.19 | -0.14 | -0.23 | -0.02 | 0.00 | (0.02) | 0.22 | 0.18 |
| X7 | -0.19 | -0.14 | -0.15 | 0.02 | -0.01 | 0.19 | (0.02) | 0.14 |
| X8 | -0.20 | -0.21 | -0.22 | 0.00 | 0.02 | 0.19 | 0.20 | (0.04) |

Figure S1. Illustrative bivariate and univariate density plots for indicators X1 and X2 from the class (upper) and dimensional (lower) simulations.

Indicator Selection and MAXCOV Analyses of Simulations

Two observations affect indicator selection. First, high scores on indicators X1 to X3 characterize Disorder A and high scores on X6 to X8 characterize Disorder B. There is no a priori reason for including indicators X4 and X5 in an analysis because these are not discriminative attributes. Secondly, and following from the first, low scores on indicators X6 to X8 are possibly characteristic of Disorder A, given the sampling population. So, as the six discriminating indicators must be consistent in the direction of their discrimination, the scoring of X6 to X8 should be reversed.

If we were not in a position to know as much about the population sample at the outset, the same selection decisions could be made on the basis of scrutiny of scatter plots and the correlation matrix (Table S2) and some preliminary MAXCOV iterations. First, if indicators are sensitive, with separations of between 0.9 and 1.1 SD, and the base rate is assumed to range between 0.2 and 0.3, given conditional independence, the pairwise correlation coefficients should fall between r = 0.13 and r = 0.25.16, 17 Consequently, the near-zero pairwise correlations of X4 and X5 suggest these measures cannot be used to separate the taxon from the complement and so warrant exclusion. Second, the negative or parataxonic correlations among indicators X1 to X3 and X6 to X8 imply that these indicators are not sensitive to the same latent class.17 One might arrive at the conclusion that three scores should be reversed after demonstrating taxon overlap when the two sets of three indicators are analyzed separately.

Thus, for both the class and dimensional simulations, the scores on X6 to X8 were reversed and these along with X1 to X3 were subjected to MAXCOV analyses. With six indicators, there are 60 indicator triplet combinations to be analyzed. The principal graphical results are presented in Figure S2.
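Two of the figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is our own and rests on two assumptions that appear to match the text: first, that separations are expressed in pooled-sample SD units and within-class covariance is zero, so that the expected correlation between two indicators is r = P(1 - P) d_x d_y; second, that a triplet consists of one cut indicator plus an unordered pair from the remaining indicators. The function name is hypothetical.

```python
from itertools import combinations

def expected_correlation(base_rate, sep_x, sep_y):
    """Correlation induced purely by class mixing, given conditional
    independence and separations in pooled-sample SD units."""
    return base_rate * (1 - base_rate) * sep_x * sep_y

r_low = expected_correlation(0.2, 0.9, 0.9)
r_high = expected_correlation(0.3, 1.1, 1.1)
print(f"expected r between {r_low:.2f} and {r_high:.2f}")

# Each triplet: one indicator to cut on, plus a pair whose covariance is taken.
indicators = ["X1", "X2", "X3", "X6", "X7", "X8"]
triplets = [
    (cut, pair)
    for cut in indicators
    for pair in combinations([i for i in indicators if i != cut], 2)
]
print(len(triplets), "indicator triplets")
```

Under these assumptions the bounds round to 0.13 and 0.25, and the six retained indicators yield 60 triplets.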
The covariance curve for the class simulation has a pronounced peak to the right of center, whereas for the dimensional simulation, the covariance curve is elevated and reveals no clear peak, suggesting the latent structure is not taxonic. The observed base rates for the class and dimensional simulations were M = 0.270 and M = 0.458, respectively.

Figure S2. MAXCOV covariance curves and class membership probability histograms obtained from analysis of six indicators for the class and dimensional simulations. The solid line in the covariance curves provides a loess-smoothed covariance plot. [Panels for each simulation: covariance (0.0 to 0.3) against subsample z-coordinate (-2 to 2), and frequency (0 to 300) against taxon membership probability (0.0 to 1.0).]

Several observations corroborate the findings from the class simulation: (a) the mean within-class correlation coefficients for the taxon and complement were low, r = -0.027 and r = 0.058, respectively, suggesting conditional independence (minimal nuisance covariance); (b) the mean of indicator separations between the taxon and complement classes identified from posterior probabilities was M = 0.95 (SD = 0.12); (c) the Bayesian classification of individuals to classes resulted in a clear U-shaped distribution of membership probabilities (Figure S2); (d) the mean residual pairwise covariance in the sample was cov_residual = 0.036; and (e) the Jöreskog and Sörbom goodness of fit index was GFI = 0.992. Base-rate variance is also used as a consistency test, with large variance suggesting an unstable and therefore inconsistent solution. In the class simulation, the observed standard deviation of base rates was SD = 0.199, a value that is larger than desired.
However, when the separation was increased to 1.2 SD, the base rates were much more consistent, with SD = 0.109. Stronger corroborative evidence would be obtained through the use of one or two other coherent cut kinetic procedures, and such an approach is strongly recommended.

Strictly speaking, the failure to observe a clear peak in the dimensional simulation (Figure S2) means that there is no taxonicity to be corroborated by consistency tests. Consistency tests are not dispositive given a flat or ambiguous covariance curve. For comparison, however, the consistency indices from the dimensional simulation were: (a) r = 0.050 and r = 0.114; (b) indicator separation M = 0.74, SD = 0.16; (c) Figure S2; (d) cov_residual = 0.066; and (e) GFI = 0.974. The base-rate SD = 0.322; analyses of a dimensional simulation corresponding to the 1.2 SD separation class simulation also yielded high base-rate variance, SD = 0.286.

Given our omniscience with respect to true class membership in the class simulation, it is possible to determine the classification accuracy of the MAXCOV analysis for that simulation. There were 77 misclassifications: 31 false positives and 46 false negatives, giving sensitivity and specificity estimates of 0.74 and 0.94, respectively, and a likelihood ratio of 12.4 (a likelihood ratio of 10 or greater is generally considered diagnostic).

Latent Variable Modeling of Simulations

We fitted a series of models to the dimensional and two-class simulated data sets (Tables S3 and S4). For fitted FMM models, factor variance was fixed at zero and factor loadings were constrained so that all the factor parameters were class specific. That is, the analysis is fully exploratory. Log-likelihood ratios and information criteria obtained for the dimensional and class simulations are shown in Tables S3 and S4, respectively.
Given the hypothetical nature of the simulation, we set aside consideration of potential substantive issues; models were rejected on the basis of fit alone.

Table S3. Fit indices obtained for the dimensional simulation.

| Model | Log likelihood | AIC | BIC | BIC´ | LRT p-value (a) |
|---|---|---|---|---|---|
| LPA models | | | | | |
| 1C | -5956.54 | 11937.08 | 11991.69 | 11953.59 | n/a |
| 2C | -5830.11 | 11698.23 | 11784.70 | 11724.37 | 0.00 |
| 3C | -5803.86 | 11659.73 | 11778.06 | 11695.50 | 0.00 |
| 4C | -5795.65 | 11657.30 | 11807.50 | 11702.65 | 0.18 |
| FMM models | | | | | |
| 1F1C (b) | -5806.76 | 11649.53 | 11731.45 | 11674.29 | n/a |
| 2F1C | -5801.99 | 11649.99 | 11754.66 | 11681.63 | n/a |
| 3F1C | -5800.06 | 11654.11 | 11776.99 | 11691.26 | n/a |
| 1F2C | -5792.02 | 11634.04 | 11747.82 | 11668.44 | 1.00 |

Note. C = class; F = factor; 1F2C = 1-factor 2-class; AIC = Akaike information criterion; BIC = Bayesian information criterion; BIC´ = sample size adjusted BIC; LRT = bootstrapped likelihood ratio test.
(a) The LRT is not calculable for models with one class.
(b) The true latent structure.

Table S4. Fit indices obtained for the class simulation.

| Model | Log likelihood | AIC | BIC | BIC´ | LRT p-value (a) |
|---|---|---|---|---|---|
| LPA models | | | | | |
| 1C | -5956.54 | 11937.08 | 11991.69 | 11953.59 | n/a |
| 2C (b) | -5794.02 | 11626.03 | 11712.50 | 11652.18 | 0.00 |
| 3C | -5787.85 | 11627.70 | 11746.03 | 11663.47 | 1.00 |
| FMM models | | | | | |
| 1F1C | -5822.02 | 11680.04 | 11761.96 | 11704.81 | n/a |
| 2F1C | -5820.25 | 11686.50 | 11791.16 | 11718.13 | n/a |
| 3F1C | -5819.11 | 11692.23 | 11815.10 | 11729.37 | n/a |
| 1F2C | -5800.13 | 11650.27 | 11764.05 | 11668.67 | 0.43 |

Note. See note to Table S3.
(a) The LRT is not calculable for models with one class.
(b) The true latent structure.

Analysis and model appraisal proceeded as follows for both simulations. Beginning with one-class (1C) LPA, classes were added (i.e., 1C, 2C, 3C, ...) until the bootstrapped LRT statistic was not significant, indicating the model should be rejected in favor of one with fewer classes. Subsequently, increasing numbers of factors (F) were added to one-class and then two-class FMM (i.e., 1F1C, 2F1C, 3F1C, ..., 1F2C, 2F2C, ...) until the model with the larger number of components was rejected.

To evaluate results for the dimensional simulation, consider Table S3. Among the LPA models it can be seen that the information criteria generally improve (decrease) as the number of classes increases (Table S3). Critically, the bootstrapped LRT for the four-class model is nonsignificant (p = 0.18), suggesting the three-class model is preferable to the four-class model. Although neither the two- nor the three-class models have good classification qualities, there were no obvious signs of inadmissible parameter estimates (residuals) or extremely small classes to suggest over-extraction of classes. Thus, had we only performed LPA, results from the dimensional simulation would have likely been interpreted as supporting the two- or three-class models rather than the correct 1F1C model. However, we went on to perform FMM. Comparing only the exploratory factor analytic models (i.e., 1F1C, 2F1C, and 3F1C), the correct model, 1F1C, would have been chosen on the basis of any of the information criteria in Table S3. When the results of all the fitted models are compared, the AIC and BIC´ favor the one-factor two-class (1F2C) solution. However, 1F2C is not supported by the bootstrapped LRT, for which p = 1.0. That is, 1F2C is rejected because of the bootstrapped LRT and 1F1C, the correct model, has the best fit.

Turning to the class simulation (Table S4), the information criteria improve dramatically in the step from the one- to the two-class LPA model, but begin to deteriorate uniformly in the subsequent transition to a three-class LPA model. If the analyses had been restricted to LPA only, the correct model (2C) would have been chosen given any statistical criterion, including the bootstrapped LRT.
In contrast, if only FMM were performed, the erroneous 1F1C solution would have been selected on the basis of a non-significant LRT for the 1F2C solution, although moderate factor loadings obtained with this might have cast doubt on the solution. Comparison across all LPA and FMM models shows the correct model, 2C, would be chosen using any of the statistical indices.

As with MAXCOV, we compared true class membership and membership assigned using the 2C LPA model. There were 49 misclassifications: 29 false positives and 20 false negatives, giving sensitivity and specificity estimates of 0.89 and 0.94, respectively, and a likelihood ratio of 16.0.

Illustration of Reverse Association

To illustrate reverse association, we will consider data reported by Kendler et al.2 that were used to test the validity of a six-class solution describing the latent structure of features of schizophrenia and affective disorders. The six classes were labeled classic schizophrenia, major depression, schizophreniform, bipolar schizomania, schizodepression, and hebephrenia. The validating evidence included differences in the frequencies of probands’ social and psychopathology outcomes and relatives’ morbidity risk for psychosis and affective disorder among the six classes. For simplicity, we will restrict our discussion to the latter, which included rates for six morbid outcomes: three schizophrenia-related outcomes (schizophrenia, nonaffective psychoses, schizophrenia spectrum disorders) and three affective outcomes (bipolar affective illness, unipolar affective illness, and affective illness). These outcome data are shown in Figure S3.

Figure S3. Relatives’ schizophrenia-related and affective illness morbidity risks for six classes identified by Kendler et al.2 [Two panels plot morbid risk (0 to 50) for classes I to VI. Schizophrenia-Related Outcomes: schizophrenia, nonaffective psychoses, schizophrenia spectrum. Affective Illness Outcomes: bipolar illness, unipolar illness, affective illness.] Class abbreviations: I = classic schizophrenia; II = major depression; III = schizophreniform disorder; IV = bipolar-schizomania; V = schizodepression; and VI = hebephrenia.

For the three schizophrenia-related outcomes, the differences in relatives’ morbidity risk across the six classes (Figure S3) are, in essence, a series of single dissociations. If the same-process hypothesis were correct, plots of three or more values of one parameter against the corresponding values of the other parameter must yield monotonic-decreasing or monotonic-increasing curves. Pairwise plots of the schizophrenia-related outcomes are monotonic: as can be seen in the top panel of Figure S4, monotonic curves (specifically, curves that have no negatively sloping segments) can be drawn over this panel to adequately represent the position of all six classes. This suggests a single process could indeed account for the differences among the six classes on these three variables. If such plots yielded nonmonotonic curves (i.e., increases in one variable are sometimes associated with decreases in the other variable and are sometimes associated with increases), this would be logically incompatible with the same-process hypothesis. The situation appears to be different for the affective illness outcomes; one could not draw monotonic curves to represent the groups shown in each of the plots in the bottom panel of Figure S4. That is, monotonic curves do not adequately capture the relationships of bipolar illness with the other two outcomes (affective illness, unipolar illness) because curves that will adequately represent the position of all six classes must have segments that are positively sloping (i.e., go up) as well as segments that are negatively sloping (i.e., go down).
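The monotonicity check at the heart of reverse association can be sketched as below. This is our own simplification: it treats each class as a single point, ignores the standard errors shown in the figures, and assumes no tied values on the first variable. The risk values are hypothetical toy numbers chosen to show the two patterns; they are not the values reported by Kendler et al. The function name is also hypothetical.

```python
def is_monotonic_association(x, y):
    """True if y never reverses direction when the points are ordered by x."""
    ordered_y = [b for _, b in sorted(zip(x, y))]
    steps = list(zip(ordered_y, ordered_y[1:]))
    non_decreasing = all(later >= earlier for earlier, later in steps)
    non_increasing = all(later <= earlier for earlier, later in steps)
    return non_decreasing or non_increasing

# Hypothetical morbid-risk values for six classes (toy data only).
outcome_1 = [1.0, 2.0, 4.0, 5.0, 7.0, 9.0]
outcome_2 = [2.0, 3.0, 3.5, 6.0, 8.0, 12.0]   # rises with outcome_1 throughout
outcome_3 = [4.0, 9.0, 2.0, 7.0, 3.0, 8.0]    # rises and falls

print(is_monotonic_association(outcome_1, outcome_2))  # compatible with one process
print(is_monotonic_association(outcome_1, outcome_3))  # reversal present
```

In practice a reversal would need to exceed what sampling error could produce before it is taken as evidence against a single process, which this point-based check does not assess.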
Consequently, given the assumption that a monotonic process leads to unipolar morbidity in relatives, these plots allow one to reject the notion that the same process could possibly lead to bipolar morbidity in relatives.

Figure S4. Pairwise associations among relatives’ morbidity risks for six classes identified by Kendler et al.2 Error bars represent standard errors of risk. [Top panel, Schizophrenia-Related Outcomes: axes are schizophrenia, nonaffective psychosis, and schizophrenia spectrum. Bottom panel, Affective Illness Outcomes: axes are unipolar illness, bipolar illness, and affective illness. Points are labeled by class, I to VI.] Class abbreviations: I = classic schizophrenia; II = major depression; III = schizophreniform disorder; IV = bipolar-schizomania; V = schizodepression; and VI = hebephrenia.

References

1. Meehl PE. What's in a taxon? Journal of Abnormal Psychology 2004;113:39-43.
2. Kendler KS, Karkowski LM, Walsh D. The structure of psychosis: latent class analysis of probands from the Roscommon Family Study. Archives of General Psychiatry 1998;55:492-499.
3. Golden RR, Meehl PE. Detection of the schizoid taxon with MMPI indicators. Journal of Abnormal Psychology 1979;88:217-233.
4. Bentler PM. Linear systems with multiple levels and types of latent variables. In: Jöreskog KG, Wold H, eds. Systems under indirect observation: Causality, structure, prediction. Vol 1. Amsterdam: North Holland; 1982:101-130.
5. Lubke GH, Muthén B. Investigating population heterogeneity with factor mixture models. Psychological Methods 2005;10:21-39.
6. Lubke GH, Muthén B. Performance of factor mixture models as a function of model size, covariate effects, and class-specific parameters. Structural Equation Modeling 2007;14:26-47.
7. Lubke GH, Neale MC. Distinguishing between latent classes and continuous factors: Resolution by maximum likelihood? Multivariate Behavioral Research 2006;41:499-532.
8. Bauer DJ, Curran PJ. The integration of continuous and discrete latent variable models: Potential problems and promising opportunities. Psychological Methods 2004;9:3-29.
9. Bartholomew DJ. Latent variable models and factor analysis. London: Charles Griffin; 1987.
10. Armstrong JS. Derivation of theory by means of factor analysis or Tom Swift and his electrical factor analysis machine. The American Statistician 1967;21:17-21.
11. Armstrong JS, Soelberg P. On the interpretation of factor analysis. Psychological Bulletin 1968;70:361-364.
12. Waller NG, Meehl PE. Multivariate taxometric procedures: Distinguishing types from continua. Thousand Oaks, CA: Sage; 1998.
13. Jablensky A, Woodbury MA. Dementia praecox and manic-depressive insanity in 1908: A grade of membership analysis of the Kraepelinian dichotomy. European Archives of Psychiatry and Clinical Neuroscience 1995;245:202-209.
14. Boks MPM, Leask S, Vermunt JK, Kahn RS. The structure of psychosis revisited: the role of mood symptoms. Schizophrenia Research 2007;93:178-185.
15. McLachlan GJ, Peel D. Finite mixture models. New York: John Wiley; 2000.
16. Meehl PE. Factors and taxa, traits and types, differences of degree and differences in kind. Journal of Personality 1992;60:117-174.
17. Meehl PE. Clarifications about taxometric method. Applied and Preventive Psychology 1999;8:165-174.