Heterogeneity and Bias in Models of Vote Choice

Shared by: Myrna Carlson
Posted: 7/1/2008
Voters in the United States do not behave in a homogenous manner. Voting models typically account for such heterogeneity by seeking to decompose the process of vote choice into a number of distinct components. By examining voting choice data in this way, researchers are able to ascertain reasonable estimates of the average effect of various socio-economic and political variables on the candidate selection process. Models of this sort, while plausible, may not properly reflect the true heterogeneity of the American voter. At their core, simple models assume that voters use a common and uniform decision rule when deciding between candidates. But, it is possible, if not likely, that different groups and classes of citizens use differently structured processes to determine their choice of candidates. Researchers have attempted to account for this heterogeneity in a variety of ways. Rivers (1988), for example, has accounted for differences in the voting behavior of individuals by allowing the mean effect of theoretically important variables to vary across those individuals. Once voter heterogeneity is taken into account in estimating issue voting models, Rivers concludes, few voters resemble the “average” voter supposedly described by traditional voting models. Jackson (1992) uses different statistical methods than Rivers to estimate the nature of voter heterogeneity, but comes to the same conclusion. Individuals do appear to use non-uniform decision rules when choosing among presidential candidates. While these approaches are extremely promising, in this paper I will take a different approach and examine heterogeneity in the vote choice process in three more subtle ways. The first form of heterogeneity I discuss in this paper concerns the possible links between the turnout and vote choice decisions. 
Landmark voting studies, such as Voting (1954) and The American Voter (1960) set an early agenda for empirical studies of the electorate that have been followed and expanded through the years. But while the Michigan and Columbia models both made important contributions to the studies of vote choice and turnout, neither gave much in-depth consideration to the possible linkages between those processes. To this day, in fact, models of turnout and vote choice often remain distinct studies. Work on selection bias, however, indicates that this approach may be problematic. A number of authors (Dubin and Rivers 1990; Sanders 1995) have recently constructed models that attempt to correct for such selection bias. In this paper, I will examine one particular model — the bivariate probit model advanced by Dubin and Rivers (1990). To account for a second source of heterogeneity — non-constant variance — I will extend this model to account for heteroskedasticity in the vote choice equation. I will then estimate the bivariate probit selection model and the heteroskedastic bivariate probit selection model using data from the 1984 National Election Study (NES), under two specifications of the vote choice process; a retrospective voting model and a spatial voting model. I will then compare these estimates to estimates obtained through an independent model of the vote choice to gauge the differences across the models and estimate the degree of bias in standard analyses of vote choice. Finally, I will examine a third source of heterogeneity in vote choice models by estimating how patterns of missing data may introduce bias into estimates of vote choice model parameters. 
This paper, then, analyzes the effects of three forms of sample-based heterogeneity in models of vote choice: (1) heterogeneity induced by non-random selection from the full population of citizens into the vote choice model sample; (2) heterogeneity due to the interaction of selection bias and non-constant variance; and (3) heterogeneity in the patterns of missing data across groups of respondents. While much of the discussion in the paper is focused on the first two forms of heterogeneity, as I will demonstrate below, it is the third form of heterogeneity, one not typically addressed in the political science literature, that is the most important determinant of the degree of bias in vote choice models.

MODEL CONSTRUCTION

While several recent American presidential elections and most elections in other countries involve several alternatives, the situation where representatives of two parties compete for office is the simplest case to model. In the interests of “building from the ground up,” then, I will restrict the analyses in this paper to the situation where an individual faces three choices: (1) vote for Candidate A, (2) vote for Candidate B, or (3) do not vote.

Independent Probit Model

The probabilities of falling into the three different response categories can most simply be calculated using the multiplication rule for conditional probabilities. In the two-candidate case, the probability of not voting is simply Pr(Don’t vote). The probability of voting for Candidate A is Pr(Vote) × Pr(Vote Candidate A | Vote). Finally, the probability of voting for Candidate B is Pr(Vote) × Pr(Vote Candidate B | Vote). The most obvious starting point is to assume that the processes of candidate choice and turnout are unrelated. In probability terms, then, we would say that the two decisions are independent events.
Under such circumstances, the probability of voting for Candidate A collapses to Pr(Vote) × Pr(Vote Candidate A), the probability of voting for Candidate B collapses to Pr(Vote) × Pr(Vote Candidate B), and the probability of not voting remains Pr(Don’t vote). The important point for the purposes of this paper, however, is that because vote choice and turnout are treated as separate processes under this specification, the effects of the variables in the vote choice model can be estimated independently of the turnout model without bias.

Selection Bias

Most analysts of voting behavior do not go beyond this simple starting point. Traditional studies of vote choice, for example, include only the population of individuals who come to the polls on election day. Such a “naive” approach is potentially problematic if researchers attempt, as they often do, to draw inferences about the political preferences of the U.S. population, because the naive approach ignores the selection mechanism at work in electoral data, namely the process by which people decide whether to turn out. As Dubin and Rivers (1990) note, “restricting data analysis to the sample of voters leaves us with a self-selected sample,” that is, those respondents who have chosen to go to the polls on election day. There is good reason to believe that this self-selected sample is not a random subset of the full population. Empirical work on turnout (see, for example, Wolfinger and Rosenstone 1980; Rosenstone and Hansen 1993) has demonstrated that, on average, voters possess higher levels of education, have larger incomes, and are older than non-voters. These differences may make it difficult to translate inferences about political preferences from the subsample of voters to the entire population because the characteristics identified by Rosenstone, Hansen, and Wolfinger may be related to vote choice as well as the decision to turn out.
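A small simulation can make the concern about self-selection concrete. The sketch below is illustrative only and is not part of the original analysis: the correlation of 0.6, the turnout rule, the sample size, and all function names are assumptions chosen for the example. It draws correlated unobserved factors for turnout and vote choice and compares the average of the unobserved vote choice factor in the full sample with its average among those who turn out. For standard normal errors the voters-only average should be close to the correlation multiplied by pdf(c)/cdf(c), a quantity commonly called the inverse Mills ratio.

```python
import math
import random


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_selection(rho=0.6, c=0.0, n=200_000, seed=1984):
    """Return (mean choice error in full sample, mean choice error among voters)."""
    rng = random.Random(seed)
    total_all = 0.0
    total_voters = 0.0
    n_voters = 0
    for _ in range(n):
        u_turnout = rng.gauss(0.0, 1.0)
        # The choice error shares a fraction rho of the turnout error.
        u_choice = rho * u_turnout + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        total_all += u_choice
        # Hypothetical turnout rule: vote when the latent index c + u is positive.
        if c + u_turnout > 0.0:
            total_voters += u_choice
            n_voters += 1
    return total_all / n, total_voters / n_voters


def expected_shift(rho=0.6, c=0.0):
    """Theoretical mean of the choice error among voters: rho * pdf(c) / cdf(c)."""
    return rho * normal_pdf(c) / normal_cdf(c)


mean_all, mean_voters = simulate_selection()
print(f"full-sample mean of choice error: {mean_all:.3f}")
print(f"voters-only mean of choice error: {mean_voters:.3f}")
print(f"theoretical voters-only mean:     {expected_shift():.3f}")
```

Setting `rho=0.0` in this sketch brings the voters-only mean back to roughly zero, which corresponds to the case in which the vote choice model can be estimated on voters alone without bias.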
As Achen (1986) argues, the effects of selection bias can be avoided in regression analysis if and only if the unobserved factors influencing selection are uncorrelated with the unobserved factors influencing outcomes. Such a state of affairs may arise if (1) the turnout and vote choice processes are independent events or (2) every variable influencing selection is controlled in the outcome equation. But previous empirical research strongly suggests that the processes of turnout and vote choice are linked, calling into question the first condition. Moreover, as Achen points out, it is practically impossible to fully “control” for the selection mechanism in the outcome equation and achieve the second condition. In sum, if there are common factors that determine both turnout and preferences, as there almost certainly are, turnout will be a source of selection bias in analyses of voting behavior.¹

Such selection bias can have highly deleterious effects on our inferences. Ignoring the sample selection mechanism in effect omits a variable, the effect of selection, from the outcome equation. This form of omitted variable bias is illustrated most transparently in the context of the Heckman selection model. In that instance, the outcome equation, estimated by itself, omits the expected value of the error term of the outcome equation under censoring (Achen 1986).² This analogy between the ignored selection mechanism and omitted variable bias, identified first by Heckman, is transferable to other selection models, such as the Tobit (Breen 1996) or the bivariate probit selection model (Dubin and Rivers 1990).³ In all these cases, ignoring the sample selection mechanism in the presence of selection bias will lead to two problems. First, estimation will produce biased estimates of β because, in essence, the variable measuring λ has been omitted. More precisely, we say that the estimates of β are inconsistent (Maddala 1983).
In addition, the estimates of β will be inefficient, because the error term of the outcome equation is heteroskedastic.⁴ Thus, t-tests and other classical hypothesis tests will lead to incorrect inferences concerning the statistical significance of variables in the outcome equation (Greene 1993).

1 Specifically, as Dubin and Rivers note, errors occurring in the outcome equation sample do not have zero mean because the sampling procedure has picked out those observations that are, in terms of the theory, “unusual.” Achen (1986) makes the same point in the case where the outcome equation dependent variable is continuous (and is, therefore, estimated using OLS). Specifically, he notes that if the error terms in the selection and outcome equations (u1i and u2i, respectively) are correlated in the censored sample, the disturbance term of the outcome equation, u2i, has neither mean zero nor zero correlation with the outcome independent variables, even though it has both properties in the full sample. Thus, when the sample selection process is related to the error of the outcome equation, separate estimation of the selection and outcome equations will lead to faulty inferences concerning the effect of the variables of interest in the outcome equation.

2 The fix for this misspecification in the case where the outcome equation is estimated using OLS is to add a new variable to the outcome equation, λ, which is the expected value of the outcome equation error term under censoring. This variable is also known as the “inverse Mills ratio” or the “hazard rate” (Breen 1996). As Achen (1986) notes, asymptotically, this procedure removes from the disturbance that part of the outcome equation error term which is correlated with the independent variables, creating a new disturbance term that is uncorrelated with the independent variables.
While this procedure will not be undertaken in this paper, the bivariate probit selection model is simply another “fix” for the omitted variable problem identified by Heckman and Achen, undertaken in a full maximum likelihood context.

3 The Heckman approach cannot be used in the case where the outcome equation is specified as a probit because when we add λ to the outcome equation, the resulting disturbances are no longer normally distributed. Thus, the coefficients in the outcome probit will be inconsistent (Achen 1986).

4 This heteroskedasticity results because the disturbance term is correlated with the independent variables. Thus, the variance of u2i varies systematically with the values of the independent variables.

In sum, if we want to make accurate inferences about the relationship between the socioeconomic and political characteristics of all citizens and their political preferences, we must in some way account for the selection bias introduced when the subpopulation of voters is drawn from the full sample. In Greene’s terms, we must somehow account for the incidental truncation of the vote choice dependent variable. Otherwise, as Dubin and Rivers note, inferences drawn from the subsample of non-missing observations (that is, those individuals who turn out to the polls) are likely to produce misleading conclusions regarding the relationships of interest.⁵

Dubin-Rivers Model

Several remedies exist for the problem specified above. One correction for selection bias in the case of an outcome equation with a binary dependent variable is the bivariate probit selection model, described by Dubin and Rivers (1990).
Their model combines information about the untruncated successes and failures (the outcome equation) with information about the truncated observations (those excluded by the selection equation) in a maximum likelihood switching model.⁶ The Dubin-Rivers model can be represented as a bivariate probit with one of the quadrants collapsed over those individuals who are selected out of the outcome equation. Mathematically, this model is written as:

\[
\begin{aligned}
\ln L(\beta_1, \beta_2, \rho \mid y) = \sum_i \Big\{ & y_{2i}\, y_{1i} \ln \mathrm{BiNorm}(X_1\beta_1, X_2\beta_2, \rho) \\
& + y_{2i}\,(1 - y_{1i}) \ln\big[\Phi(X_2\beta_2) - \mathrm{BiNorm}(X_1\beta_1, X_2\beta_2, \rho)\big] \\
& + (1 - y_{2i}) \ln\big[1 - \Phi(X_2\beta_2)\big] \Big\}
\end{aligned}
\tag{1}
\]

where:

$Y_{1i} \sim f_{\mathrm{bern}}(y_{1i} \mid \pi_{1i})$, with $\pi_{1i}$ defined by the underlying probability term $Y_{1i}^* = X_1\beta_1 + u_{1i}$, is the outcome process;
$Y_{2i} \sim f_{\mathrm{bern}}(y_{2i} \mid \pi_{2i})$, with $\pi_{2i}$ defined by the underlying probability term $Y_{2i}^* = X_2\beta_2 + u_{2i}$, is the selection process;
$y_{1i} = 0$ and $y_{2i} = 1$ is an untruncated failure;
$y_{1i} = 1$ and $y_{2i} = 1$ is an untruncated success;
$y_{2i} = 0$ is a truncated observation;
$\mathrm{BiNorm}(X_1\beta_1, X_2\beta_2, \rho)$ is the cumulative bivariate normal function defined by $X_1\beta_1$, $X_2\beta_2$, and $\rho$; and
$u_{1i}$ and $u_{2i}$ are bivariate normally distributed iid, with $\sigma_{u_1,u_2} = \rho$.⁷

5 It is important to note that the joint model of turnout and vote choice advanced in this paper is designed only to correct for selection bias in the vote choice equation. In particular, it is not an attempt to construct a unified model of turnout and vote choice (see Sanders 1995; 1996).

6 Bivariate probit estimation is feasible in the case of most survey data on elections because, more often than not, the sample of voters is censored, not truncated. That is, electoral survey data often contain some information about the people whose vote choice values fall in the truncated portion of the distribution (Breen 1996).

7 In practice, this model is identified as long as the selection and outcome equations do not include exactly the same variables.
That is, while it is possible to identify the model through the nonlinearity of the selection equation, identification should proceed from exclusion restrictions.

Estimation of this model yields three sets of parameters: β₁, the effect parameters for the outcome equation; β₂, the effect parameters for the selection equation; and ρ, the correlation of the errors between the two equations. It should be noted that the non-zero correlation between the error terms, denoted by ρ, should not result from the omission of explanatory variables common to both equations. Instead, as Breen (1996) notes, “the correlation should be thought of as intrinsic to the model. In other words, we assume ρ ≠ 0 in the theoretic model that we posit for the population and not simply for the sample in which we may have omitted the measurement of a variable common” to the selection and outcome equations. The errors, therefore, covary despite proper model specification. The cause of the correlation terms should, then, be unmeasurable. “In essence,” Breen concludes, “both equations are affected (in part) by the same random perturbations (or random perturbations that tend to covary)” (p. 35). Thus ρ can be seen as representing the link between the unmeasured factors in the selection equation and in the outcome equation.⁸

Heteroskedasticity and Selection Bias

While vote choice models seem to be a prime candidate for selection bias effects, the evidence to date using the Dubin-Rivers correction has been spotty (see Dubin and Rivers 1990; Sanders 1995). Many researchers, therefore, have concluded that selection bias, though problematic in theory, is inconsequential for estimation purposes. Thus, it appears that with regard to selection bias, there is a large disjunct between theoretical expectations and empirical reality. One reason for this discrepancy might be that the selection bias models currently in use are inappropriate for voting data.
Monte Carlo simulations indicate that models of selection bias will fail in the presence of heteroskedasticity (Amemiya 1981; Greene 1993; Breen 1996). The model used by Dubin and Rivers assumes constant variance. But there are reasons to question this assumption. Recent work by Alvarez and Brehm (1995; 1996) indicates that, under certain circumstances, survey attitudes exhibit heteroskedasticity. While attitudes and vote choice decisions are not necessarily realizations of the same underlying processes, it may be reasonable to assume that the vote choices of individuals would also have non-constant variance. Thus, to obtain reliable estimates of the relationship between political preferences and vote choice, theory indicates that we should develop voting models that allow for both selection bias, by incorporating the preferences of those who do not turn out, and heteroskedasticity, by allowing for diversity in the variance ascribed to an individual’s vote choice.

A model that incorporates both of these effects is the heteroskedastic bivariate probit selection model, which is similar to the model presented in Equation 1, but parameterizes the variance of the outcome equation as a function of Zi, a set of explanatory variables.⁹ This parameterization follows the technique described by Greene (1993) and utilized by Alvarez and Brehm (1995, 1996). Instead of assuming that the probability (π1i) of individual i achieving a “success” in the outcome equation is generated by a homogenous process, we assume that some individuals have a wider underlying distribution of choices than others. This heterogeneity is modeled in the likelihood function through a slightly different systematic component than is used in the traditional probit model, which assumes a constant variance. Specifically, a variance model is included in the denominator of the systematic component of the outcome equation (Alvarez and Brehm 1995, pp. 1061-2). Mathematically, the heteroskedastic bivariate probit selection model can be represented as:

\[
\begin{aligned}
\ln L(\beta_1, \beta_2, \gamma, \rho \mid y) = \sum_i \Big\{ & y_{2i}\, y_{1i} \ln \mathrm{BiNorm}\big(X_1\beta_1 / \exp(Z_i\gamma),\, X_2\beta_2,\, \rho\big) \\
& + y_{2i}\,(1 - y_{1i}) \ln\big[\Phi(X_2\beta_2) - \mathrm{BiNorm}\big(X_1\beta_1 / \exp(Z_i\gamma),\, X_2\beta_2,\, \rho\big)\big] \\
& + (1 - y_{2i}) \ln\big[1 - \Phi(X_2\beta_2)\big] \Big\}
\end{aligned}
\tag{2}
\]

where:

$Y_{1i} \sim f_{\mathrm{bern}}(y_{1i} \mid \pi_{1i})$, with $\pi_{1i}$ defined by the underlying probability term $Y_{1i}^* = X_1\beta_1 + u_{1i}$, is the outcome process;
$Y_{2i} \sim f_{\mathrm{bern}}(y_{2i} \mid \pi_{2i})$, with $\pi_{2i}$ defined by the underlying probability term $Y_{2i}^* = X_2\beta_2 + u_{2i}$, is the selection process;
$y_{1i} = 0$ and $y_{2i} = 1$ is an untruncated failure;
$y_{1i} = 1$ and $y_{2i} = 1$ is an untruncated success;
$y_{2i} = 0$ is a truncated observation;
$\mathrm{BiNorm}(X_1\beta_1 / \exp(Z_i\gamma), X_2\beta_2, \rho)$ is the cumulative bivariate normal function defined by $X_1\beta_1 / \exp(Z_i\gamma)$, $X_2\beta_2$, and $\rho$;
$\mathrm{Var}(u_{1i}) = [\exp(Z_i\gamma)]^2$ and $\mathrm{Var}(u_{2i}) = 1$;¹⁰ and
$u_{1i}$ and $u_{2i}$ are bivariate normally distributed iid, with $\sigma_{u_1,u_2} = \rho$.

Estimation of this model yields four sets of parameters: β₁, the effect parameters for the outcome equation; β₂, the effect parameters for the selection equation; γ, the effect parameters for the variance term in the outcome equation; and ρ, the correlation of the errors between the two equations.

8 The selection and outcome equations may also be correlated through measured variables. If this is the case, the relevant independent variables should be included in both equations to account for selection effects.

9 In the heteroskedastic bivariate probit selection models estimated in this paper, I parameterized the variance term only in the outcome equation because the theoretical and empirical concerns cited above indicate that heteroskedasticity could exist in the vote choice process.

Summary of Models

The three models of the vote choice decision presented above proceed from different assumptions concerning the link between the vote choice and turnout processes.
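To make the relationship among the three models concrete, the following sketch codes the log-likelihood of the heteroskedastic selection model. It is a minimal illustration under stated assumptions, not the estimation code used for this paper: the function names are hypothetical, the bivariate normal probability is approximated with a simple Simpson's rule integration rather than a library routine, and the nonvoter term is taken to be the log probability of not voting, ln[1 − Φ(X₂β₂)]. Setting every element of γ to zero gives the homoskedastic selection model, and additionally setting ρ to zero makes the likelihood factor into two independent probits.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binorm_cdf(a, b, rho, steps=2000):
    """Pr(U1 <= a, U2 <= b) for standard bivariate normals with correlation rho.

    Integrates pdf(x) * cdf((b - rho * x) / sqrt(1 - rho^2)) over x from -8 to a
    with Simpson's rule. Requires |rho| < 1 and an even number of steps.
    """
    lower = -8.0
    if a <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)
    width = (a - lower) / steps

    def integrand(x):
        return normal_pdf(x) * normal_cdf((b - rho * x) / scale)

    total = integrand(lower) + integrand(a)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(lower + k * width)
    return total * width / 3.0


def dot(row, coefs):
    """Inner product of one observation's covariates with a coefficient list."""
    return sum(x * c for x, c in zip(row, coefs))


def selection_loglik(beta1, beta2, gamma, rho, X1, X2, Z, y1, y2):
    """Log-likelihood of the heteroskedastic bivariate probit selection model."""
    loglik = 0.0
    for x1, x2, z, choice, voted in zip(X1, X2, Z, y1, y2):
        turnout_index = dot(x2, beta2)
        if not voted:
            # Truncated observation: only the turnout decision is observed.
            # 1 - cdf(t) is computed as cdf(-t) for numerical stability.
            loglik += math.log(normal_cdf(-turnout_index))
            continue
        # Outcome index divided by the standard deviation exp(Z * gamma).
        choice_index = dot(x1, beta1) / math.exp(dot(z, gamma))
        p_vote_and_success = binorm_cdf(choice_index, turnout_index, rho)
        if choice:
            loglik += math.log(p_vote_and_success)
        else:
            loglik += math.log(normal_cdf(turnout_index) - p_vote_and_success)
    return loglik
```

In this sketch `y1` is ignored for nonvoters, so any placeholder (for example `None`) can be stored there. A numerical optimizer could be wrapped around `selection_loglik` to obtain estimates, although a production implementation would normally rely on a tested bivariate normal routine.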
The separate model presumes that the two acts are independent. The bivariate probit model, on the other hand, presumes that voting and turnout are distinct but linked processes. Finally, the heteroskedastic bivariate probit also assumes that the vote choice and turnout processes are linked, but allows for non-constant variance in the outcome equation.

MODEL SPECIFICATION

Many of the previous efforts to detail selection bias effects have gauged the degree of such bias by examining only one model of vote choice (see Dubin and Rivers 1990; Sanders 1995, 1996). While such an approach is certainly valid, estimating only a single model of vote choice allows us only a partial view of any selection bias because we can gauge only the presence or absence of selection bias under a single set of assumptions concerning the vote choice process. A potentially more comprehensive procedure is to examine the relative degree of selection bias across a variety of models which operationalize different theories of the vote choice. Adopting such a research strategy allows us to examine several situations where we would expect to find selection bias and gauge that bias in shades of gray, not simply by its presence or absence.

10 I assume here that the variance of the selection process is constant and, following convention, I have standardized the variance to 1.

Vote Choice (Outcome) Equations

In the analyses that will follow, therefore, I will estimate two equations, each designed to operationalize a theory of the vote choice process: (1) the retrospective voting paradigm, and (2) the spatial voting paradigm. These are described below, in turn.

1. Retrospective Voting Paradigm

Several authors have posited that voting is inherently retrospective in nature (Downs 1957; Fiorina 1981). That is, they argue that individuals make their vote choice on the basis of how the candidates, or their parties, have performed in the past.
Whether retrospective voting proceeds from Key’s “reward/punishment” framework or a Downsian pseudo-prospective voting perspective (Fiorina 1981), the same model results: vote choice is primarily determined by retrospective evaluation of the performance of the in-party. While retrospective assessments of party performance may encompass a number of factors, previous work has indicated that economic performance is critical (Fiorina 1981). Thus, the two central variables in the retrospective voting framework are measures of a citizen’s evaluation of the economy. The first, national economy, captures sociotropic concerns (Kinder and Kiewiet 1981) by gauging how well an individual feels the nation is doing as a whole, relative to the previous year. The second, personal economic situation, captures “pocketbook concerns” by measuring an individual’s assessment of their own economic situation compared to a year earlier.

While economic conditions play an important role in the vote choice process, other factors also play a part. If there is one thing that 40 years of voting research has taught us, it is that party identification is a major determinant of vote choice. Whether partisanship is seen as a fixed perceptual screen (Campbell, Converse, Miller, and Stokes 1960) or a more fluid summary view of the major political parties (Jackson 1975; Jackson and Franklin 1983), it seems that at the moment of vote decision, party identification plays a major role in determining candidate selection.¹¹ More importantly, partisan effects may condition economic assessments, so partisanship should be included as a control variable. Partisanship, then, is the third variable in my retrospective vote choice model.¹² Finally, I included race to avoid specification error in my equation, as race is highly correlated with both vote choice and party identification.

11 The two perspectives, in practice, lead to very different evaluations of how party identification should be treated in the analysis of vote choice. Where the Michigan model presumes that party identification is stable over an individual’s lifetime and may, therefore, be treated as an exogenous variable, the Jackson/Franklin perspective would treat party identification as an endogenous variable, because retrospective evaluations not only affect the direction of vote choice, but also play a major role in determining shifts in partisan identification from election to election. While Jackson and Franklin’s perspective certainly has a great deal of merit, for computational ease, partisan identification will be treated as an exogenous variable for the purposes of this paper.

12 For the purposes of these analyses, I have collapsed party identification to a five-point scale from the traditional seven-point scale. In light of the findings of Keith, et al. (1992) that independent leaners “vote very much like the outright partisans of the parties toward which they incline,” I undertook additional analyses to determine whether independent leaners should be collapsed into the weak partisan categories. Probit analyses in which each of the seven NES partisan identification categories was coded as a dummy variable (with “independent” excluded to avoid perfect multicollinearity) indicated that independent leaners were, if anything, more likely to behave like strong partisans than were weak partisans.

2. Spatial Voting Model

The spatial theory of voting, in the tradition of Downs (1957), presumes that voters will cast their vote for the candidate closest to themselves in a space that describes all the factors that are of concern to the voters. These factors may, as Enelow and Hinich note, include classic campaign issues, such as defense spending and unemployment, or they may encompass candidate attributes, such as integrity and leadership ability (Enelow and Hinich 1984). For the purposes of my analysis, I have constructed a four-dimensional evaluative space. Because the partisanship of the candidates in a presidential election is readily apparent, the party identification variable is sufficient to capture the importance of “partisan distance” in the vote choice decision. In order to capture the effect of distance along issue dimensions, I included three measures of issue distance between the respondent and each of the major party candidates in the 1984 presidential election.¹³ The first, “ideological distance,” captures the general issue distance on a liberal-conservative continuum. The second measure, distance concerning position on general social services, is intended to capture differences on domestic policy, while the third measure, distance on views concerning cooperation with Russia, captures feelings about the Cold War, the most important foreign policy issue in the early 1980s.

13 I calculated the issue distances as quadratic loss functions, as is the norm in analyses of spatial voting. That is, I presume that the importance of absolute distance between the voter and the candidate increases as an individual moves further from the candidate on a particular issue continuum. While there is, of course, no a priori reason to believe that the quadratic term is the most appropriate functional form for the loss function (see Jackson 1992 for a discussion of the plausibility of alternative forms of the loss function), I chose the quadratic form to follow convention.

Turnout (Selection Equation)

While I posit two models of the vote choice equation, there is no a priori reason to think that the decision to turn out would be different across those equations. Thus, I use the same model of turnout across both models. Following Rosenstone and Hansen (1993), I propose a multi-dimensional model of political participation, which includes variables accounting for the effect of individual resources, mobilization activity by the political parties, social involvement, and the benefits of voting, as the framework for my turnout equation.

First, following those authors, I recognize that voting is a costly activity, in terms of acquiring information and bearing the costs of getting to the polls. Thus, individuals with ample political resources are better able to participate than people with meager resources. In order to account for the effects of these resources, I include in my model measures of four variables: income, education, political information, and age. These resource variables are both theoretically and empirically important determinants of the decision to vote. Those with greater income are better able to subsidize the costs of voting. Moreover, they are more likely to move in social networks where voting is viewed as a desirable activity. Education gives people the skills and knowledge necessary to understand politics and the bureaucratic requirements of registration and voting. Along the same lines, those with high levels of political information are better positioned to bear the costs of participation. In particular, pre-existing political knowledge reduces the marginal costs of becoming informed about the issues at stake in a particular election. Finally, I include age as a variable in my model, not so much for its own effects as for a proxy for life experience. That is, over the course of a lifetime, people acquire knowledge, skills, and attachments that better enable them to participate in elections. At the same time, the benefits of life experience are not boundless. Previous empirical work has found that participation drops off among older individuals. To capture this curvilinear relationship, therefore, I also include a quadratic age term in my specification of the turnout equation.
While individual resources are important determinants of the turnout decision, political actors may also affect that decision through mobilization activities undertaken by the political parties. To account for the external reduction of costs undertaken by the parties, then, I included a measure of whether respondents were contacted by the parties (and given “free” political information, thus subsidizing the costs of their participation).

Social involvement may also play a role in the turnout decision process. As Rosenstone and Hansen (1993) argue, “people receive information and rewards through their social networks, and the better placed they are within it, the more likely they are to take part in electoral politics” (p. 159). In other words, the greater the extent of an individual’s social involvement, the more likely they are to vote in national elections. To account for the effect of these social networks, I included a variable measuring the length of an individual’s residence in their current city or town.¹⁴

Finally, it should be recognized that turnout may not reach 100 percent even among those individuals with ample political resources and deep ties to the community, because some costs of participation, no matter how small, must always be borne by the individual. But though participation is certainly a costly activity, participating in the political system brings its own rewards, which may offset some of the residual costs of participation. To capture the potential benefits of voting, I included a measure of the strength of partisan identification.¹⁵ Deep attachment to the major parties will presumably increase an individual’s personal stake in the outcome of the election. This increase should, in turn, increase the probability that an individual will vote.
As Rosenstone and Hansen (1993) write, “when citizens expect to get a benefit out of participation, whether it is policy or simple satisfaction, they are more inclined to devote their efforts to electoral politics.”

14 Because relevant social networks will presumably develop within the first few years of residence, indicating a “ceiling effect,” I used the natural log of length of residence in my analysis in place of the raw scale.

15 Party strength is measured by the square of partisan identification.

Variance Term

The construction of my variance term is based on the assumption that those who are better engaged with the campaign should have more structured vote choices. Specifically, I assume that individuals who pay closer attention to the campaign should have a more stable basis upon which to make their vote choice and, therefore, should exhibit less variable vote decisions than similar individuals who take only a passing interest in the campaign. The most obvious measure of such engagement is the “political information” variable described by Zaller (1992) (see Verba, Schlozman, and Brady 1995). But while political information, in theory, is a natural predictor of vote choice variance, in practice, the use of that variable is questionable. Specifically, when entered as an independent variable in both the retrospective and spatial voting models, political information acts as a powerful predictor not merely of the dispersion of an individual’s vote choice, but also of the mean of that choice. This result is problematic because the variance term in the heteroskedastic probit will confound the effects of non-constant variance with variables omitted from the outcome equation (Achen 1996). Thus, using the political information variable to model variance could confound the outcome equation variable coefficients with the variance term coefficients.
Because this confound has potentially serious consequences for the plausibility of our estimates, I searched for measures of campaign engagement to model the variance term that did not also predict an individual’s vote choice. I found four such measures: campaign attention (measured in both the pre- and post-election surveys), attention given to campaign news in newspapers, and attention given to campaign news on television. Unfortunately, while these variables seemed capable in theory, they did not perform well as predictors of vote choice variance. In each of the models, then, I used the single engagement variable that best predicted vote choice variance. Specifically, for the retrospective voting model, I used the post-election measure of campaign attention; for the spatial voting model, I used newspaper attention.

INSERT TABLE 1 ABOUT HERE

MODEL ESTIMATION AND INTERPRETATION 1: THE RETROSPECTIVE VOTING MODEL

Table 1 presents my coefficient estimates for the independent probit (model 1), the bivariate probit selection model (model 2), and the heteroskedastic bivariate probit selection model (model 3). The vote predictors are all highly significant in both a statistical and a substantive sense. The more interesting question, however, is whether introducing controls for selection bias and heterogeneity alters our interpretation of the power of the predictors. A comparison of the independent probit (model 1) results and the bivariate probit (model 2) results indicates that correcting for selection bias in the analysis of vote choice does not greatly alter the substantive performance of the vote choice variables. While the coefficient estimate of r is highly significant in a substantive sense and is statistically significant at the 95 percent level, none of the point estimates of the vote choice coefficients moves more than three or four percent once the selection bias correction is undertaken.
This result in and of itself is not reason enough to dismiss the utility of correcting for selection bias. As will be shown below, the predictions of the voting behavior of individuals differ systematically across the independent probit model and the bivariate probit selection model. The slight movement of the coefficients does, however, indicate that ignoring the selection bias in the retrospective voting vote choice equation will produce coefficient estimates rather close to the coefficients obtained when selection effects are modeled.

Adding a heteroskedastic term to the bivariate probit selection model yields rather inconclusive results (see model 3). The coefficient on the post-election assessment of campaign attention is in the expected direction and is substantively significant. More importantly, the coefficient estimates of the model are all attenuated, in some cases significantly, once the heteroskedasticity is modeled by campaign attention. For example, the power of the party identification coefficient is reduced by over 10 percent in the heteroskedastic bivariate probit relative to the homoskedastic bivariate probit. But while the introduction of the variance term alters the substantive predictions of the model, the campaign attention coefficient is statistically insignificant. Moreover, the likelihood-ratio test for heteroskedasticity in the probit model, described by Greene (1993, p. 650), indicates that we cannot reject the null hypothesis of homoskedasticity at the 99 percent level of confidence. In sum, then, the statistical findings of homoskedasticity in the heteroskedastic bivariate probit selection model are at odds with the substantive effect of introducing a variance term into the selection bias correction model.
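The likelihood-ratio logic can be sketched with the log likelihoods printed in Table 1 for the separate probit (model 1) and the heteroskedastic probit (model 4). The sketch assumes that these two models are the restricted and unrestricted pair and that the variance equation adds a single parameter (campaign attention), so the statistic is compared with a chi-square distribution with one degree of freedom. It illustrates the arithmetic and is not a reproduction of the test as actually run.

```python
import math


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood-ratio statistic: twice the gain in log likelihood."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_sf_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Log likelihoods as printed in Table 1 (models 1 and 4, N = 1162).
LOGLIK_PROBIT = -386.687
LOGLIK_HET_PROBIT = -386.066

lr = lr_statistic(LOGLIK_PROBIT, LOGLIK_HET_PROBIT)
p_value = chi2_sf_1df(lr)
# 3.841 is the chi-square(1) critical value at the 5 percent level.
reject_at_95 = lr > 3.841
```

Under these assumptions the statistic falls well below the critical value, which is consistent with the failure to reject homoskedasticity reported above.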
It is possible that this apparent paradox (the attenuation of the coefficients in the heteroskedastic bivariate probit relative to the homoskedastic probit specification in the presence of apparent homoskedasticity) is more the result of adding a variance term to the model than an indication of the failure of the Dubin-Rivers model in the presence of heteroskedasticity. In an attempt to separate the independent power of the heteroskedastic correction from the interactive effect of heteroskedasticity and selection bias, I estimated a heteroskedastic probit model of vote choice (model 4), which does not incorporate the Dubin-Rivers selection bias correction (see Table 1). As the results in Table 1 demonstrate, it is indeed the variance term, not the interaction of selection bias and heteroskedasticity, that drives the differences between the homoskedastic and heteroskedastic bivariate probit results.16 The heteroskedastic bivariate probit results (model 3), while significantly different from the bivariate probit results (model 2), are rather close to the results of the heteroskedastic probit (model 4). Thus, the “odd” results obtained in Table 1 are a result of the attenuation of coefficients due to the introduction of the variance term, not the failure of the selection bias model in the presence of heteroskedasticity.

In sum, then, the choice of which model (the homoskedastic or the heteroskedastic bivariate probit selection model) is the best correction for the selection bias present in the data depends on whether we accept the statistical test findings of homoskedasticity, or whether we chalk up the “null result” to a sample size problem and are more persuaded by the substantive changes in the coefficient estimates produced by introducing the heteroskedastic term into the model.
Such a decision is not, however, critical for the purposes of this paper, because the variance term was introduced to allay concerns about the performance of the Dubin-Rivers model in the presence of heteroskedasticity, and the introduction of that term does not affect our conclusions regarding the presence of selection bias in the retrospective voting model.

16 One possible reason for this result is that campaign attention, while a statistically and substantively insignificant predictor of vote choice, does have minimal predictive power when entered into the vote choice equation. It could be that the variance term is picking up this “omitted variable,” leading to the attenuation of the other coefficients in the model.

To get a better notion of the contrasts among the models, I undertook two additional analyses of the differences in their predictions. First, I constructed a table of “first differences,” which presents the change in the predicted probability of a “success” given a particular change in a single explanatory variable, while holding the other variables in the equation constant. In the context of my model, I prepared a table of first differences to illustrate the “broad range effect” of the independent variables in the model. Specifically, I calculated the change in the predicted probability of voting for Mondale given a movement from the minimum in-sample value of a particular independent variable to the maximum value of that variable (while holding the other variables constant at their means).17

INSERT TABLE 2 ABOUT HERE

Although the general pattern of variable effects remains the same from model to model, some variation does exist in the maximum predictive power of the specific variables across the models.
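The first-difference calculation described above can be sketched for a simple probit as follows. The coefficient values, variable names, and means in the example are hypothetical placeholders chosen for illustration; they are not the estimates or sample means behind Table 2, since the variable codings are not given in this section.

```python
import math


def normal_cdf(value):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(value / math.sqrt(2.0)))


def first_difference(coefs, means, name, low, high):
    """Change in P(y = 1) as `name` moves from low to high.

    All other variables are held at the values given in `means`.
    """
    def prob(value):
        index = coefs["constant"]
        for var, beta in coefs.items():
            if var == "constant":
                continue
            index += beta * (value if var == name else means[var])
        return normal_cdf(index)

    return prob(high) - prob(low)


# Hypothetical coefficients and means, for illustration only.
EXAMPLE_COEFS = {"constant": -1.0, "party_id": 1.5, "economy": 1.0}
EXAMPLE_MEANS = {"party_id": 0.5, "economy": 0.4}
```

Calling `first_difference(EXAMPLE_COEFS, EXAMPLE_MEANS, "party_id", 0.0, 1.0)` gives the change in predicted probability as the hypothetical partisanship scale moves across its full range. Because the probit link is nonlinear, the result depends on where the other variables are held, which is why the choice of means (or of a modal category for a dichotomous variable) matters.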
In particular, it appears that some of the variables, most notably party identification and personal economic situation, are attenuated in the independent probit model relative to the situation where selection bias has been controlled using the Dubin-Rivers method. Furthermore, the introduction of the variance term decreases the maximum power of all the variables, as is to be expected from the coefficient attenuation demonstrated in Table 1.

INSERT FIGURE 1 ABOUT HERE

The second set of analyses examined the effect of partisanship on the probability of voting for Mondale across all five categories of party identification. These results are presented in Figure 1. The clearest result is that, for all five categories, the bivariate probit predicts greater Mondale support than the independent probit model. Mathematically, this difference in prediction is a result of the larger intercept term for the bivariate selection probit, which pushes the entire curve of predicted probabilities closer to the “Mondale” end of the preference spectrum, thereby increasing support for the Democratic candidate across the board. Conceptually, this phenomenon reflects the fact that the negative estimate of r means that those individuals with a large negative error term in the selection equation are more likely to support Mondale. Thus, since those who do not vote are more likely to favor Mondale than Reagan, once we account for selection bias, overall support for Mondale increases.

Introducing a variance term into the bivariate probit selection model gives a slight boost in support for Mondale to those individuals least predisposed to support that candidate (strong Republicans and weak Republicans), but this Mondale “boost” trails off for weak Democrats and strong Democrats, bringing the model predictions in line with the independent probit.
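The conceptual point about the negative estimate of r can be illustrated with a small simulation of correlated turnout and vote choice errors. The correlation of -0.357 is the estimate printed in Table 1 and the turnout index of 0.643 is chosen to give roughly the 74 percent turnout shown in Table 5. The vote index of zero, the function name, the number of draws, and the seed are assumptions made purely for illustration.

```python
import math
import random


def simulate_support(rho, turnout_index, vote_index, draws, seed):
    """Return (Mondale support among everyone, support among those who turn out)."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    all_votes = 0
    voters = 0
    voter_votes = 0
    for _ in range(draws):
        u_turnout = rng.gauss(0.0, 1.0)
        # Vote choice error correlated with the turnout error at rho.
        u_vote = rho * u_turnout + scale * rng.gauss(0.0, 1.0)
        prefers_mondale = 1 if vote_index + u_vote > 0.0 else 0
        all_votes += prefers_mondale
        if turnout_index + u_turnout > 0.0:
            voters += 1
            voter_votes += prefers_mondale
    return all_votes / draws, voter_votes / voters
```

With a negative correlation, `simulate_support(-0.357, 0.643, 0.0, 50000, 1984)` returns lower support among those who turn out than among the simulated population as a whole, which is the direction of the shift that the selection correction recovers.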
Mathematically, this result is obtained because the less negative intercept of the heteroskedastic bivariate probit model (-1.076, versus -1.217 and -1.350 in Table 1) favors Mondale relative to the other two models. But as the variable values move in a manner that favors Mondale, those movements give a smaller boost relative to the other models because of the attenuated vote choice coefficients.

In sum, then, while the model estimates do not diverge wildly from each other, significant differences do exist in the performance of those models. First, while the bivariate probit and independent probit coefficient estimates are very similar, the bivariate probit estimates encompass a slight pro-Mondale bias relative to the independent probit model. Second, the results demonstrate that our assumptions concerning the constancy of variance across the population greatly affect our interpretation of the coefficient estimates. However, this change is due to the independent effect of modeling the variance term, not the interactive effects of heteroskedasticity and selection bias. In fact, the introduction of the variance term into the model does not affect our conclusion that the voting model suffers from selection bias, as reflected by the steady, albeit small, movements in the vote choice coefficients and the statistically and substantively significant coefficient estimate of r.

17 The race independent variable is dichotomous, so instead of holding its value at its mean, I held it at the value most likely to occur within the sample. Because 90 percent of respondents were white, the first differences in the table are calculated under the situation in which the individual is white.

INSERT TABLE 3 ABOUT HERE

MODEL ESTIMATION AND INTERPRETATION 2: THE SPATIAL VOTING MODEL

Table 3 presents the parameter estimates for the independent probit (model 1), the bivariate probit selection model (model 2), and the heteroskedastic bivariate probit selection model (model 3).
While the variables in the vote choice equation are very different from those used in the retrospective voting framework, the general pattern of results has a similar flavor to that of the first setup. All the coefficients are statistically and substantively significant predictors of the vote choice decision. In addition, once again, the coefficients all run in the expected direction. As was the case with the retrospective voting model, the vote choice equation coefficients move only slightly when the Dubin-Rivers control for selection bias is introduced. But, unlike the results presented above, r is statistically indistinguishable from zero at even the 70 percent level of confidence. In fact, the standard error of r is larger than the coefficient estimate. At the same time, the estimate of r is substantively significant. The diminished statistical significance of the coefficient may well be a reflection of the fact that the spatial voting model sample size is only about sixty percent the size of the retrospective voting sample. Still, the available evidence does seem to suggest that whereas a slight degree of selection bias appears to be present in the retrospective voting framework, excluding the population of nonvoters when estimating the spatial model has almost no effect on the parameter estimates.

As with the retrospective voting model, adding a heteroskedastic term muddles the interpretation of the models considerably. The coefficient on newspaper attention is statistically insignificant, though it is in the expected direction. However, the introduction of a variance term does attenuate the coefficients in the model, in some cases significantly. Once again, it appears that these movements are the result of introducing the variance term, not the interaction between heteroskedasticity and selection effects.
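The significance statements about r in this discussion can be checked with a simple two-sided z-test on the estimates and standard errors printed in Tables 1 and 3. The sketch assumes an asymptotically normal test statistic formed as the estimate divided by its standard error; whether the reported significance levels were computed in exactly this way is not stated, so treat this as an approximation.

```python
import math


def two_sided_p(estimate, std_error):
    """Two-sided p-value for a z-statistic under a standard normal reference."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


# Estimates of r and their standard errors as printed for model 2.
P_RETROSPECTIVE = two_sided_p(-0.357, 0.165)  # Table 1
P_SPATIAL = two_sided_p(-0.246, 0.266)        # Table 3
```

Under this approximation the retrospective estimate clears the 95 percent threshold, while the spatial estimate has a p-value above 0.30, in line with the statement that it is indistinguishable from zero at even the 70 percent level of confidence.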
As Table 3 shows, while the differences between the coefficient estimates in the homoskedastic and heteroskedastic bivariate probit models (models 2 and 3 respectively) may be large, the differences between the heteroskedastic probit (model 4) and heteroskedastic bivariate probit (model 3) are minimal. Thus, even though introducing the heteroskedastic term into the selection model affects our coefficient estimates, the strong finding of nonsignificance on the newspaper attention variable indicates that we should pause before concluding that these coefficient movements are more than a statistical artifact. Most important for the purposes of this paper, the introduction of the variance term does not affect our conclusion regarding the presence of selection bias. Specifically, the estimate of r in the heteroskedastic bivariate probit selection model is, as was the case in the homoskedastic version of that model, smaller in magnitude than its standard error and almost surely insignificant from a statistical standpoint.

INSERT TABLE 4 ABOUT HERE

The retrospective and spatial voting setups again resemble each other when the analysis turns to first differences (see Table 4).18 Again, the bivariate probit model produces predicted probabilities that favor Mondale slightly, relative to the independent probit model, for the issue distance variables, though the maximum effect of party identification is slightly attenuated. Still, the degree of this pro-Mondale advantage is slightly reduced relative to the retrospective voting paradigm, as the parameter estimates reported in Table 3 would suggest. Once again, the introduction of the variance term diminishes the power of all the variables in the model, underscoring the importance of our assumptions regarding the model of variance in our data.

INSERT FIGURE 2 ABOUT HERE

Figure 2 details the effects of movements in partisanship on the “average” individual’s probability of voting for Mondale.
The findings mirror those in the retrospective voting model, though the pro-Mondale boost of the bivariate probit selection model is reduced relative to the retrospective voting model estimates.

While the model estimates do not diverge wildly from each other, the analysis above again indicates that differences do exist among these models. The bivariate probit estimates contain a slight pro-Mondale bias, relative to the independent probit model. However, the bivariate probit and independent probit estimates are relatively close. Moreover, the minimal degree of selection bias, as measured by the change in the coefficients in the vote choice equation, and the interval estimate of r, indicate that this difference may be more the result of a statistical artifact than selection bias. The introduction of the variance term, while it changes our estimates of the coefficients, does not change our conclusions regarding the absence of selection bias. In sum, then, the selection bias present in the retrospective voting analysis appears absent in the spatial voting models presented above.

MISSING DATA

We should not, however, be so quick as to dismiss the possibility that selection bias exists in the spatial voting sample. For one, as noted above, the finding of “no significance” on the r coefficient could reflect the relatively small size of the spatial voting sample rather than the true relationship between the voting and turnout processes in the population. In addition, and perhaps more importantly, the estimates of selection bias generated by the bivariate probit model could have been confounded by a second instance of incidental truncation which removed a significant source of heterogeneity from the sample. Typically, when certain variables are not measured for some of the cases in the data matrix, those cases with incomplete or “missing” data are discarded and analysis proceeds only with those units with complete data.
This procedure, known as listwise deletion of missing data, is the norm in social science analysis. In fact, most computer packages automatically delete missing data listwise. Thus, following convention, I simply removed those cases which had missing values on any of the variables before proceeding with my analysis.

18 The “ideology,” “Russia,” and “government services” first differences represent the change in predicted probabilities for an individual who is average in every way (except that they are a political independent who was not mobilized by a political party) and who moves from being 0 points from Reagan and 6 points from Mondale to being 0 points from Mondale and 6 points from Reagan.

This procedure, as Little and Rubin (1987) note, may be satisfactory with small amounts of missing data. However, under certain circumstances, listwise deletion of missing data can lead to serious biases. Little and Rubin list three conditions under which missing data may have been generated in the bivariate case. In the first, the probability of response is independent of both the dependent and independent variables. In the second, the probability of observing the dependent variable depends on the independent variable, but not on the value of the dependent variable. In the third, the probability of response depends on the value of the dependent variable and possibly the independent variable. If the first condition holds, the missing data are missing at random (MAR) and the observed data are observed at random (OAR). Alternatively, we may say that the missing data are missing completely at random (MCAR). Under such circumstances, the missing data mechanism may be ignored without any resulting bias in the analysis. In the second case, the missing data are only MAR.
Thus, the observed values of the dependent variable are, as Little and Rubin note, “not necessarily a random subsample of the sampled values, but they are a random sample of the sampled values within subclasses defined by values” of the independent variables. If the data are MAR, but not OAR, the missing data mechanism is ignorable for likelihood-based inference, though not for sample-based inference.19 Finally, if the data are neither MAR nor OAR, as under the third condition, the mechanism is non-ignorable. These intuitions are readily transferable to the multivariate case. Thus, unless the process governing whether data are missing or observed is independent of the dependent variable, the data matrix will not be a random subsample of the full data matrix, but rather a censored subsample of that matrix.20

INSERT TABLE 5 ABOUT HERE

Unfortunately, it appears that the third, non-ignorable case described by Little and Rubin applies to the spatial voting sample. While the decision to turn out does not determine whether an individual will respond to an issue or ideological placement question, the same variables which determine whether an individual will turn out also predict whether an individual will answer the spatial voting placement questions. Specifically, those who do not vote are likely to be less educated, have lower incomes, and possess lower levels of political information than respondents who turn out. These individuals are, in turn, more likely to abstain from the issue placement questions than the rest of the population and, as a result, are more likely to be recorded as missing data for those items than voters. To the extent that any selection bias exists in the data, this process introduces the non-ignorable missing data problem described by Little and Rubin.
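The three missing data conditions can be illustrated with a small simulation of listwise deletion in the bivariate case. Everything in the sketch is hypothetical: the data-generating process (y equals x plus noise), the retention probabilities, and the function name are assumptions chosen so that the true mean of y is 0 and the true slope of y on x is 1.

```python
import random


def complete_case_summary(mechanism, draws=40000, seed=7):
    """Mean of y and OLS slope of y on x after listwise deletion."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(draws):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, 1.0)  # true mean of y is 0, true slope is 1
        if mechanism == "mcar":
            keep_prob = 0.5                       # first condition
        elif mechanism == "depends_on_x":
            keep_prob = 0.9 if x > 0.0 else 0.1   # second condition
        else:
            keep_prob = 0.9 if y > 0.0 else 0.1   # third, non-ignorable condition
        if rng.random() < keep_prob:
            xs.append(x)
            ys.append(y)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return mean_y, sxy / sxx
```

In this setup the complete cases recover both the mean and the slope under the first condition. Under the second condition the mean of y is biased but the slope is not. Under the third condition both are distorted, with the slope attenuated toward zero, which parallels the censoring problem described for the issue placement questions.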
The differences between the “placers” and the “non-placers” can be seen most clearly in Table 5, which compares those respondents who were willing to answer all the issue placement questions to those who did not give an answer to at least one of the questions. The subsample of “issue placers” used to estimate the spatial voting models is by no means a random subset of the full sample. As Table 5 demonstrates, the sample with which the spatial voting analysis was conducted was more financially well off, better educated, had higher levels of political information and, most damning, turned out at a rate eight points higher than the full sample.

19 As Little and Rubin (1987) note, “if interest is in the marginal distribution of Y, or summary measures such as the mean of Y, then an analysis based on the m complete units is generally biased unless the data are MCAR” (p. 15).
20 In other words, if the probability that the ith case is not observed depends on the value of the dependent variable, we say that the sample has been censored.

In sum, then, the listwise deletion of the missing data for the issue placement questions introduces a non-ignorable missing data problem into our analyses. Thus, though the bivariate probit model may have corrected for one form of selection bias, the spatial voting analysis remains contaminated by a sample censoring problem.

INSERT TABLE 6 ABOUT HERE

One way to test the implications of this proposition would be to reintroduce the deleted cases into the spatial voting sample through a missing data replacement technique and reestimate the models presented above. Several methods for missing data replacement have been proposed, most notably multiple imputation (see Little and Rubin 1987). Such procedures are beyond the scope of this paper.
However, the effect of the missing data mechanism on parameter estimates in vote choice models can be estimated a second way: by inducing the spatial voting model missing data mechanism in the retrospective voting sample. Table 6 presents the retrospective voting estimates for the independent probit and bivariate probit selection models, using a reduced sample of the original observations.21 This new sample was constructed by removing from the original retrospective voting sample those respondents who were unable to place either themselves or one of the candidates on the issue and ideology scales used to estimate the spatial voting model.

Reanalysis of the retrospective voting model indicates that the spatial voting model missing data bias may be even more severe than the bias introduced by the sample censoring that results from independent estimation of the vote choice equation. Many of the point estimates of the coefficients in the vote choice equation move greatly relative to those presented in Table 1 (the race coefficient, for example, is attenuated by 10 percent) and, more importantly, the significance, both statistical and substantive, of the personal economic situation variable is attenuated by almost 50 percent. Most tellingly, the estimate of r has been attenuated by over 10 percent relative to the full sample and no longer appears to be statistically significant.

These results are not surprising. By deleting those cases with missing data on the issue placement questions, we expunge from the sample the highly uneducated, the apoliticals, and the chronic non-participators. Previous work on political reasoning (Sniderman, Brody, and Tetlock 1991; Rosenberg 1988) indicates that this segment of the population thinks very differently about politics than the highly educated and politically sophisticated. In other words, by performing listwise deletion on the missing data, we are, in effect, removing a significant source of heterogeneity from our sample.
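The attenuation figures quoted in this passage can be reproduced from the printed separate-probit coefficients. The sketch assumes that the Table 6 column with N = 750 is the limited-sample separate probit and compares it with model 1 of Table 1; the function name is illustrative.

```python
def percent_attenuation(full_sample, limited_sample):
    """Percent by which a coefficient shrinks toward zero in the limited sample."""
    return 100.0 * (abs(full_sample) - abs(limited_sample)) / abs(full_sample)


# Separate-probit coefficients as printed: Table 1 (model 1) versus
# Table 6 (limited sample, assumed to be the N = 750 column).
RACE = percent_attenuation(1.092, 0.976)
PERSONAL_ECONOMY = percent_attenuation(0.487, 0.268)
```

On these inputs the race coefficient shrinks by roughly a tenth and the personal economic situation coefficient by a little under half, matching the magnitudes described in the text.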
It is not surprising, then, that the political preferences of the full sample would look similar to the observed sample of voters under such circumstances, as they do given the limited movement in the coefficient estimates from the independent probit to the bivariate probit and the finding of “no significance” on r in Table 6.

In sum, then, it seems that we should take the finding of “no bias” in the spatial voting model with a grain of salt. It could well be that under the vote choice conditions specified in that model, the vote choice and turnout decisions are not linked in any significant way. But such a conclusion must be withheld until a full analysis of the entire sample is undertaken, a task for future research.

21 I do not present the results of the heteroskedastic probits in this table because the introduction of the variance term did not change the model results significantly. Thus, I use only the homoskedastic model results for illustrative purposes.

This result also gives pause to accepting the estimates of the retrospective voting model. While the sample used to estimate that model is significantly larger than that used to estimate the spatial voting model, it is still a subset of the full sample that has been reduced through listwise deletion of missing data. Deleting those cases with missing data on the economic situation and party identification questions might have introduced a bias similar to that identified in the spatial voting sample.22 Thus, it could be that the degree of selection bias estimated in the retrospective voting framework above is attenuated relative to its “true” value.

CONCLUSION

On their face, the analyses presented in this paper do not present strong evidence to suggest that independent estimation of vote choice and turnout models (the preferred mode of analysis in political science research) is a problematic estimation strategy.
Specifically, correcting for selection bias using the method proposed by Dubin and Rivers does not significantly change the coefficient estimates across the retrospective and spatial voting models. Modeling the variance term through levels of engagement with the electoral campaign does not, contrary to theoretical expectation, change this conclusion. Thus, if a researcher is only interested in estimating the importance of various factors in the vote choice decision, it seems that acceptable estimates of the “true” coefficients may be produced by independent analysis of the vote choice equation, which ignores the selection mechanism at work in the data. The coefficients obtained through such estimation may be biased and inefficient, but the degree of these inaccuracies appears to be so slight as to call into question the need for corrective techniques, such as the bivariate probit selection model.

But while this picture may, at first glance, appear to be rather rosy, there are several reasons why we should temper our original optimism. First, on a substantive level, ignoring the selection mechanism at work in survey research data throws away important information about the process which generated those data. In the case of the dataset at hand, the estimates of a significant coefficient on r indicate that, in the context of the 1984 elections, individuals who turned out were more likely to vote for Reagan than Mondale. This result is substantively interesting in and of itself and is potentially important for researchers and professional politicians alike.

Other factors also argue against a quick dismissal of mechanisms which account for selection bias. Specifically, the dataset under consideration here may be flawed in ways that do not allow for the estimation of the true degree and effect of selection bias in the population at large.
If Brehm is correct and the survey sample differs in important ways from the population about which we are attempting to draw inferences, we should be cautious about making a firm conclusion of “no difference” between the preferences of voters and the preferences of the population at large.

22 It is also possible that all of the analyses reported in this paper have been contaminated by a missing data bias. Research by Brehm (1993) indicates that reluctant respondents (e.g., those respondents who require multiple callbacks and/or additional persuasion before agreeing to be interviewed) are less informed about politics and are less likely to participate in the political process than amenable respondents. Brehm argues that this result suggests that individuals who refuse to be interviewed for the NES are less interested in politics than those who agree to be interviewed. Thus, if Brehm’s intuition is correct, the full NES sample may, as a whole, be more politically informed and more likely to vote than the target population about which we wish to draw inferences. If this is the case, our coefficient estimates will be biased and lead to incorrect inferences about the relationships of interest, as described above.

Most importantly, even if Brehm is not correct, the use of ideology and issue placement variables (and perhaps other measures of political attitudes, such as partisanship and assessments of the economy) may introduce a missing data mechanism into our analyses which removes from the sample those individuals who are less politically knowledgeable and more likely to abstain from political activity than the rest of the population. As the reanalysis of the retrospective voting data suggests, this missing data problem may greatly affect our estimates of the parameters of interest in vote choice models.
Thus, the incidental truncation of the less politically engaged through listwise missing data deletion should give us pause before accepting the “minimal bias” findings of the retrospective voting and, especially, the spatial voting model at face value.

In all likelihood, the missing data concerns identified in this paper are not universal to models of vote choice. Future work should address the portability of the missing data problem to identify circumstances where (as is the case with the issue placement questions in the 1984 NES dataset) the use of certain vote choice predictor variables introduces biases into our coefficient estimates of the effects of those variables on the vote choice decision. Based on the analyses above, my intuition is that the degree of selection bias present in the data will affect the degree to which the missing data concerns identified in this paper bias the coefficient estimates. In any event, though, the fact that a non-ignorable missing data bias was identified in the analysis reported in this paper underscores the need to recognize and account for the processes that generate our data when examining individual political behavior, not merely in electoral contexts, but in the examination of models of political attitudes as well.

Thus, heterogeneity within the sample of respondents affects the vote choice model estimates, just not in the way I originally envisioned. It is not just heterogeneity in the variance term, or in the selection into the vote choice process, that poses a threat to accurate estimates of the power of the predictors in our vote choice models. Rather, it is the failure to preserve or account for the heterogeneity of the paths by which people answer survey questions that is the real bogeyman of vote choice models.
In a context like the retrospective voting model, where selection bias exists, we need to think carefully before introducing variables, like ideological distance, that “kick out” significant portions of the sample who may think about politics differently from the rest of the sample.

In sum, then, while the analyses reported in this paper indicate that acceptable estimators of the processes governing the vote choice decision can be obtained if the non-voters are excluded from analysis, potentially substantial problems with both our data collection undertakings and our statistical techniques should give pause to such a rosy conclusion. After all, the findings of “minimal bias” in this paper, which seem so reassuring on their face, may be more the result of imperfect data collection and analysis than a reflection of the absence of a potentially disruptive selection bias in the underlying processes which generated that data.

TABLE 1: RETROSPECTIVE VOTING MODEL ESTIMATE RESULTS

Entries are coefficients with standard errors in parentheses.

Variable                      Model 1:            Model 2:            Model 3:            Model 4:
                              Separate Probit     Bivariate Probit    Heteroskedastic     Heteroskedastic
                                                                      Bivariate Probit    Probit
VOTE CHOICE
Constant                      -1.350 (0.124)**    -1.217 (0.143)**    -1.076 (0.159)**    -1.212 (0.155)**
Party Identification           1.618 (0.092)**     1.559 (0.097)**     1.384 (0.151)**     1.456 (0.154)**
National Economy               1.277 (0.228)**     1.309 (0.219)**     1.145 (0.240)**     1.130 (0.243)**
Personal Economic Sit.         0.487 (0.191)**     0.488 (0.175)*      0.444 (0.157)**     0.446 (0.161)*
Black                          1.092 (0.211)**     1.110 (0.181)**     1.021 (0.174)**     1.011 (0.180)**
Heteroskedastic Term:
  Campaign Attention           --                  --                  -0.180 (0.134)      -0.164 (0.145)
TURNOUT
Constant                       --                  -2.761 (0.300)**    -2.756 (0.300)**    --
Education                      --                   1.034 (0.204)**     1.033 (0.204)**    --
Age                            --                   4.720 (1.312)**     4.706 (1.311)**    --
Age squared                    --                  -3.473 (1.365)**    -3.459 (1.364)**    --
Length of Residence            --                   0.117 (0.031)**     0.117 (0.031)**    --
Strength of Partisanship       --                   0.591 (0.123)**     0.591 (0.123)**    --
Political Information          --                   1.894 (0.233)**     1.894 (0.233)**    --
Mobilized                      --                   0.504 (0.104)**     0.504 (0.104)**    --
Income                         --                   0.135 (0.053)**     0.134 (0.053)**    --
CORRELATION
Correlation (ρ)                --                  -0.357 (0.165)**    -0.367 (0.162)**    --
N/Log Likelihood               1162/-386.687       1548/-1,061.337     1548/-1,060.578     1162/-386.066

* = p < .10; ** = p < .05

TABLE 2: FIRST DIFFERENCES FOR INDEPENDENT VARIABLES IN THE RETROSPECTIVE VOTING MODEL

Entries are changes in the probability of voting for Mondale.

Variable                  Separate Probit   Bivariate Probit   Heteroskedastic   Heteroskedastic
                                                               Probit            Bivariate Probit
Party Identification      +.838             +.848              +.802             +.804
National Economy          +.443             +.473              +.401             +.423
Personal Economic Sit.    +.174             +.184              +.162             +.177
Black                     +.414             +.413              +.386             +.382

Note: First differences are calculated with all other variables held at their means, except that it is assumed that the individual is Black.

TABLE 3: SPATIAL VOTING MODEL ESTIMATE RESULTS

Entries are coefficients with standard errors in parentheses.

Variable                      Model 1:            Model 2:            Model 3:            Model 4:
                              Separate Probit     Bivariate Probit    Heteroskedastic     Heteroskedastic
                                                                      Bivariate Probit    Probit
VOTE CHOICE
Constant                      -0.581 (0.138)**    -0.491 (0.171)**    -0.453 (0.149)**    -0.519 (0.129)**
Party Identification           1.201 (0.135)**     1.180 (0.133)**     1.039 (0.152)**     1.048 (0.150)**
Dist to Reagan: Ideology       1.683 (0.458)**     1.660 (0.424)**     1.462 (0.389)**     1.466 (0.391)**
Dist to Mondale: Ideology     -1.769 (0.703)**    -1.761 (0.711)**    -1.494 (0.636)**    -1.491 (0.632)**
Dist to Reagan: Russia         0.973 (0.298)**     0.990 (0.249)**     0.791 (0.251)**     0.768 (0.247)**
Dist to Mondale: Russia       -1.298 (0.520)**    -1.274 (0.437)**    -1.058 (0.427)**    -1.064 (0.427)**
Dist to Reagan: Services       1.547 (0.483)**     1.518 (0.414)**     1.399 (0.380)**     1.411 (0.380)**
Dist to Mondale: Services     -1.100 (0.540)**    -1.108 (0.544)*     -1.039 (0.485)**    -1.026 (0.491)**
Heteroskedastic Term:
  Political Information        --                  --                  -0.300 (0.221)      -0.317 (0.217)
TURNOUT
Constant                       --                  -3.095 (0.471)**    -3.098 (0.471)**    --
Education                      --                   0.950 (0.290)**     0.950 (0.290)**    --
Age                            --                   5.026 (2.289)**     5.017 (2.294)**    --
Age squared                    --                  -4.137 (2.564)      -4.128 (2.570)      --
Length of Residence            --                   0.112 (0.045)**     0.112 (0.045)**    --
Strength of Partisanship       --                   0.746 (0.190)**     0.745 (0.190)**    --
Political Information          --                   2.252 (0.351)**     2.254 (0.351)**    --
Mobilized                      --                   0.532 (0.155)**     0.531 (0.155)**    --
Income                         --                   0.190 (0.080)**     0.192 (0.080)**    --
CORRELATION
Correlation (ρ)                --                  -0.246 (0.266)      -0.218 (0.260)      --
N/Log Likelihood               741/-207.565        907/-539.891        907/-538.949        741/-206.502

* = p < .10; ** = p < .05

TABLE 4: FIRST DIFFERENCES FOR INDEPENDENT VARIABLES IN THE SPATIAL VOTING MODEL

Entries are changes in the probability of voting for Mondale.

Variable                  Separate Probit   Bivariate Probit   Heteroskedastic   Heteroskedastic
                                                               Probit            Bivariate Probit
Party Identification      +.725             +.710              +.646             +.656
Ideology                  +.865             +.875              +.814             +.825
[Russia]                  +.633             +.660              +.552             +.575
Social Services           +.782             +.791              +.754             +.758

Note: First differences are calculated with all other variables held at their means.

TABLE 5: SAMPLE COMPARISONS

                          Full Sample       Issue Distance Missing Data Removed
Turnout Percentage        .74               .82
Education                 .46               .55
Income                    9.25              11.37
Political Information     .38               .51

                          Issue Placers     Excluded Population
Turnout Percentage        82                64
Education                 Some College      High School Degree, No College
Income                    $15,000-$16,999   $11,000-$11,999
Political Information     Upper Half        Lower Quarter

TABLE 6: RETROSPECTIVE VOTING MODEL ESTIMATE RESULTS, LIMITED SAMPLE

Entries are coefficients with standard errors in parentheses.

Variable                      Separate Probit     Separate Probit     Bivariate Probit    Bivariate Probit
                              Full Sample         Limited Sample      Full Sample         Limited Sample
VOTE CHOICE
Constant                      -1.350 (0.124)**    -1.191 (0.150)**    -1.217 (0.143)**    -1.091 (0.160)**
Party Identification           1.618 (0.092)**     1.681 (0.116)**     1.559 (0.097)**     1.643 (0.122)**
National Economy               1.277 (0.228)**     1.280 (0.290)**     1.309 (0.219)**     1.302 (0.276)**
Personal Economic Sit.         0.487 (0.191)**     0.268 (0.245)       0.488 (0.175)*      0.260 (0.219)
Black                          1.092 (0.211)**     0.976 (0.277)**     1.110 (0.181)**     0.999 (0.244)**
CORRELATION
Correlation (ρ)                --                  --                  -0.357 (0.165)**    -0.318 (0.229)
N/Log Likelihood               1162/-386.687       750/-245.524        1548/-1,061.337     920/-585.484

* = p < .10; ** = p < .05

APPENDIX A: CODING PROTOCOL

VOTE CHOICE VARIABLES

Party Identification: 5 category partisanship variable. This variable is simply the NES 7-category partisanship variable with the independent leaners collapsed with the weak partisans (-1=Strong Republican; 1=Strong Democrat).

National Economy: 5 category variable gauging respondent’s assessment of the national economy as compared to one year before the interview (0=much better; 1=much worse).

Personal Economics: 5 category variable gauging respondent’s personal financial situation as compared to one year before the interview (0=much better; 1=much worse).

Race: Dummy indicating the race of the respondent (0=nonblack; 1=black).

Ideological Distance from Candidate: 7 category variable gauging the respondent’s perceived ideological distance from themselves to the candidate. This variable is calculated by the following formula: (respondent position - candidate position)^2 (0=no distance; 1=maximal distance).

Distance from Candidate on Russia: Same as ideological distance, but gauging distance on cooperation with the Soviet Union.

Distance from Candidate on Services: Same as ideological distance, but gauging distance on desired level of government services.

TURNOUT VARIABLES

Partisan Strength: 3 category variable measuring strength of partisanship. This variable is simply the square of the party identification variable (0=independent; 1=strong partisan).

Political Information: 27 category NES variable measuring knowledge of politics. See Zaller (1992) for coding details (0=low; 1=high).

Income: Natural log of the 22 category NES income variable (0=low; 3.09=high).

Length of Residence: Natural log of the number of years an individual has resided in their current city or town (+1) (0=less than six months; 4.50=entire life).

Education: 7 category NES education variable measuring highest level of education (0=grade school; 1=advanced degree).

Age: Actual respondent age divided by 100.

Mobilized: Dummy variable indicating whether the respondent was contacted by a representative of one or both of the major political parties (0=not contacted; 1=contacted).

VARIANCE VARIABLES

Newspaper Attention: 5 category variable measuring attention paid to newspaper stories about the campaign. Individuals who do not read newspapers are coded as paying “no attention” to newspaper campaign news (0=no attention; 1=a great deal of attention).

Television Attention: 5 category variable measuring attention paid to television stories about the campaign. Individuals who do not watch television are coded as paying “no attention” to television campaign news (0=no attention; 1=a great deal of attention).

Campaign Attention (Pre-Election): 3 category variable, probed in the Pre-Election survey, measuring the amount of interest in political campaigns “so far this year” (0=not much interested; 1=very much interested).

Campaign Attention (Post-Election): 3 category variable, probed in the Post-Election survey, measuring the amount of interest in political campaigns “so far this year” (0=not much interested; 1=very much interested).

REFERENCES

Achen, Christopher H. 1986. The Statistical Analysis of Quasi-Experiments. Berkeley: University of California Press.

Achen, Christopher H. 1996. Discussant Remarks at the 1996 Annual Meeting of the Midwest Political Science Association, April 1996, Chicago, IL.

Amemiya, Takeshi. 1984. “Tobit Models: A Survey.” Journal of Econometrics 24:3-61.

Berelson, Bernard R., Paul F. Lazarsfeld, and William N. McPhee. 1954. Voting: A Study of Opinion Formation in a Presidential Campaign. Chicago: University of Chicago Press.

Breen, Richard. 1996. Regression Models: Censored, Sample Selected, or Truncated Data. Sage University Paper series on Quantitative Applications in the Social Sciences, 07-111. Thousand Oaks, CA: Sage.

Brehm, John. 1993. The Phantom Respondents: Opinion Surveys and Political Representation. Ann Arbor: University of Michigan Press.

Brehm, John and R. Michael Alvarez. 1995. “American Ambivalence Towards Abortion Policy: A Heteroskedastic Probit Method for Assessing Conflicting Values.” American Journal of Political Science 39:1055-82.

Brehm, John and R. Michael Alvarez. 1996. “Uncertainty and Ambivalence in the Ecology of Race.” Paper presented at the 1996 Annual Conference of the American Political Science Association, San Francisco, CA.

Campbell, Angus, Philip E. Converse, Warren E. Miller, and Donald E. Stokes. 1960. The American Voter. Chicago: University of Chicago Press.

Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper and Row.

Dubin, Jeffrey A. and Douglas Rivers. 1990. “Statistical Bias in Linear Regression, Logit and Probit Models.” Sociological Methods and Research 18:360-90.

Enelow, James M. and Melvin J. Hinich. 1984. The Spatial Theory of Voting. Cambridge: Cambridge University Press.

Fiorina, Morris. 1981. Retrospective Voting in American National Elections. New Haven: Yale University Press.

Franklin, Charles H. and John E. Jackson. 1983. “The Dynamics of Party Identification.” American Political Science Review 77:957-73.

Greene, William H. 1993. Econometric Analysis, Second Edition. New York: Macmillan Publishing Company.

Jackson, John E. 1975. “Issues, Party Choices, and Presidential Votes.” American Journal of Political Science 19:161-85.

Jackson, John E. 1992. “Estimation of Models with Variable Coefficients.” Political Analysis 3:27-49.

Keith, Bruce E., David B. Magleby, Candice J. Nelson, Elizabeth Orr, Mark C. Westlye, and Raymond E. Wolfinger. 1992. The Myth of the Independent Voter. Berkeley: University of California Press.

Kinder, Donald R. and D. Roderick Kiewiet. 1981. “Sociotropic Politics: The American Case.” British Journal of Political Science 11:129-61.

Little, Roderick J.A. and Donald Rubin. 1987. Statistical Analysis with Missing Data. New York: Wiley.

Maddala, G. S. 1983. Limited-Dependent and Qualitative Variables in Econometrics. Cambridge: Cambridge University Press.

Rivers, Douglas. 1988. “Heterogeneity in Models of Electoral Choice.” American Journal of Political Science 32:737-57.

Rosenberg, Shawn W. 1988. Reason, Ideology and Politics. Princeton, NJ: Princeton University Press.

Rosenstone, Steven J. and John Mark Hansen. 1993. Mobilization, Participation, and Democracy in America. New York: Macmillan Publishing Company.

Sanders, Mitchell S. 1995. “Unified Models of Turnout and Vote Choice for 2- and 3-Candidate Elections.” Paper presented at the 1995 Annual Conference of the American Political Science Association, Chicago, IL.

Sniderman, Paul M., Richard A. Brody, and Philip E. Tetlock. 1991. Reasoning and Choice. Cambridge: Cambridge University Press.

Verba, Sidney, Kay Lehman Schlozman, and Henry E. Brady. 1995. Voice and Equality: Civic Voluntarism in American Politics. Cambridge, MA: Harvard University Press.

Wolfinger, Raymond E. and Steven J. Rosenstone. 1980. Who Votes? New Haven: Yale University Press.

Zaller, John. 1992. The Nature and Origins of Mass Opinion. Cambridge: Cambridge University Press.

HETEROGENEITY AND BIAS IN MODELS OF VOTE CHOICE [1]

Adam J. Berinsky
University of Michigan
Department of Political Science
April 1997

[1] Paper presented at the 1997 Annual Meeting of the Midwest Political Science Association, April 10-12, 1997, Chicago, IL. For the impetus behind this project and many helpful comments on an earlier draft of this paper, I would like to thank Chris Achen, Steven Rosenstone and Nancy Burns. I would also like to thank Scott Allard and, especially, Fred Cutler for their advice and suggestions. Finally, I would like to thank William Greene for his speedy efforts in correcting a bug in LIMDEP, which made possible the estimation of the models reported in this paper. I, of course, am responsible for any errors that remain. The data used in this paper were made available by the Inter-University Consortium for Political and Social Research. Neither the collector of the original data nor the consortium bears any responsibility for the analyses or interpretations presented here. This material is based upon work supported under a National Science Foundation Graduate Fellowship. Any opinions, findings, conclusions, or recommendations expressed in this paper are those of the author and do not necessarily reflect the views of the National Science Foundation. Any questions or comments may be directed to the author (e-mail: berinsky@umich.edu).
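As a worked illustration of how first differences like those in Tables 2 and 4 are computed from probit coefficients, the sketch below uses the separate probit coefficients for the retrospective voting model from Table 1. Several things here are assumptions rather than reported facts: the sample means are not given in this paper, so the profile values of 0.5 for the two economic variables and 0 for party identification are hypothetical; party identification is assumed to run from -1 to 1 (the appendix defines partisan strength as its square); and the function and variable names are made up for the example. The output therefore illustrates the calculation and is not expected to reproduce the table entries exactly.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Separate probit coefficients, retrospective voting model (Table 1, Model 1).
COEF = {
    "constant": -1.350,
    "party_id": 1.618,
    "national_economy": 1.277,
    "personal_economy": 0.487,
    "black": 1.092,
}

# Hypothetical baseline profile: the paper holds other variables at their
# sample means (not reported here) and sets Black to 1.
ASSUMED_PROFILE = {
    "constant": 1.0,
    "party_id": 0.0,          # assumed midpoint of a -1..1 scale
    "national_economy": 0.5,  # assumed, not a reported mean
    "personal_economy": 0.5,  # assumed, not a reported mean
    "black": 1.0,
}


def first_difference(variable, low, high, coef=COEF, profile=ASSUMED_PROFILE):
    """Change in Pr(vote for Mondale) when `variable` moves from low to high."""
    def probability(value):
        x = dict(profile)
        x[variable] = value
        index = sum(coef[name] * x[name] for name in coef)
        return normal_cdf(index)
    return probability(high) - probability(low)


if __name__ == "__main__":
    print(f"party identification: {first_difference('party_id', -1.0, 1.0):+.3f}")
    print(f"national economy:     {first_difference('national_economy', 0.0, 1.0):+.3f}")
```

Because the probit link is nonlinear, the size of a first difference depends on where the other variables are held, which is why the baseline profile has to be stated alongside the numbers.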
