Quant. Methodology I
Prof. Noah Kaplan
The following questions are based upon “The Public Consequences of Private Inequality:
Family Life and Citizen Participation” by Nancy Burns, Kay Lehman Schlozman and
1) What is the dependent variable? What is/are the advantage(s) of constructing the
dependent variable in this fashion (hint: think about an OLS assumption)? What
drawbacks, if any, might there be to constructing the dependent variable in such a
“The dependent variable in this analysis is an overall summary of an individual’s political
activity, an additive scale based on eight activities: voting, working campaigns, making
campaign contributions, contacting public officials, taking part in protests, working
informally with others to solve community problems, belonging to local governing
boards, and affiliating with political organizations” (380).
In short, the advantage is that the dependent variable is ordinal and can be treated as
continuous. Remember that OLS assumes that the error term is normally distributed with
a mean of 0. By definition, the error term must be continuous for it to be normally
distributed. But since the error term is simply a linear transformation of the dependent
variable, the dependent variable must be continuous.
One potential downside (among others) is that the additive index mixes apples with
oranges. In other words, one might think that fundamentally different processes drive
turnout versus, for example, belonging to a local governing board. One might argue that
it would be better to model each of these activities separately (which would permit
different factors to be integrated into each model). Of course, OLS might then be a
suboptimal means of estimating these relationships (e.g., if a dependent variable is
binary, OLS is often a suboptimal approach to modeling).
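The apples-and-oranges concern can be made concrete with a minimal sketch. It assumes each of the eight activities is recorded as a 0/1 indicator (an assumption for illustration; the activity names and respondents below are hypothetical, not taken from the article's data).

```python
# Hypothetical 0/1 indicators for the eight activities in the additive scale.
ACTIVITIES = [
    "voting",
    "campaign_work",
    "campaign_contribution",
    "contacting_officials",
    "protest",
    "informal_community_work",
    "local_board_membership",
    "political_organization_affiliation",
]


def additive_scale(respondent):
    """Sum the eight 0/1 activity indicators into a single 0-8 score."""
    return sum(respondent[activity] for activity in ACTIVITIES)


# Two hypothetical respondents who each engaged in exactly one activity.
voter_only = {activity: 0 for activity in ACTIVITIES}
voter_only["voting"] = 1

board_member_only = {activity: 0 for activity in ACTIVITIES}
board_member_only["local_board_membership"] = 1

print(additive_scale(voter_only), additive_scale(board_member_only))
```

Both respondents receive the same score, even though voting and sitting on a local governing board are plausibly driven by quite different processes.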
2) The authors write the following in the appendix:
The political interest scale is based on the following question: “How
interested are you in politics and public affairs? Are you very interested,
somewhat interested, only slightly interested, or not at all interested in
politics and public affairs?” The variable is coded 1 to 4, with 4 indicating
very interested (386).
A) Are the responses to this question nominal, ordinal or cardinal in
B) How are they treated in the OLS analyses reported in Table 6?
C) How else might the variable have been operationalized? What are the
advantages and disadvantages of such an alternative operationalization?
The analysts could have dummied out each value of the variable as a separate variable
(though one would function as the base category). The upside is that this
operationalization does not assume a linear relationship between responses and permits
the analyst to examine whether a linear specification is appropriate. The downside is that
it is non-parsimonious (i.e., the analyst has three variables in the model rather than one)
and interpretation is a bit less obvious.
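As a rough illustration of dummying out the 1 to 4 interest scale, the sketch below builds three indicator variables with the lowest category as the base. The function name, variable names, and choice of base category are assumptions made for the example.

```python
def dummy_code(value, levels=(1, 2, 3, 4), base=1):
    """Return 0/1 indicators for every level except the base category."""
    return {f"interest_{level}": int(value == level)
            for level in levels if level != base}


# A "very interested" respondent (coded 4) and a "not at all interested"
# respondent (coded 1, the base category, so all indicators are 0).
print(dummy_code(4))
print(dummy_code(1))
```

Each of the three indicators then gets its own coefficient, so the gaps between adjacent categories are free to differ rather than being forced to be equal.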
3) How do you interpret the Adjusted R² of 0.24 in the “Wives” model in Table 6?
Why is Adjusted R² reported rather than just R²? What is another term for R²?
The model explains 24% of the variance of the wives’ political activity (the dependent
variable). Adjusted R² is reported rather than just R² because the latter can never go down
as new variables are added to the model, encouraging analysts to include problematic
and/or irrelevant variables in the model (in contrast to the former, which penalizes the
analyst for a less parsimonious model, all other factors being equal). “Coefficient of
Determination” is another term for R² (though I accepted those responses that categorized
it as a “goodness of fit measure”).
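The penalty for extra variables can be seen in the standard adjustment formula, sketched below. The sample size and number of predictors are hypothetical values chosen for illustration; they are not the figures from Table 6.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with n observations and k predictors
    (not counting the constant)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical case: the same raw R-squared with 10 versus 11 predictors.
with_10 = adjusted_r2(0.25, n=101, k=10)
with_11 = adjusted_r2(0.25, n=101, k=11)
print(with_10, with_11)
```

If an added variable leaves the raw value unchanged, the adjusted value falls, which is the sense in which the adjusted measure rewards parsimony.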
4) How do you interpret the coefficient associated with Family Income (0.11)
reported in the “Wives” model of Table 6? How do you interpret the coefficient
associated with Family Income (0.04) in the “Husbands” model reported in Table
A $10,000 increase in family income is associated with a 0.11 increase in wives’ political
activity, on average, holding constant all other variables in the model. This variable is
statistically significant at the p < 0.05 level.
A $10,000 increase in family income is associated with a 0.04 increase in husbands’
political activity, on average, holding constant all other variables in the model. However,
this variable is not statistically significant at the p < 0.05 level, indicating that we cannot
infer any systematic relationship between family income and the political activity of
5) How do you interpret the constant of -2.97 in the “Wives” model of Table 6?
What does it mean that this constant is twice as small as the constant in the
“Husbands” model of Table 6?
This turned out to be a tough question and I accepted a variety of answers. I was looking
for something along the following lines:
The constant indicates that the model predicts that a wife will participate in -2.97
political acts, on average, given that all the independent variables are set to 0. However,
the constant is not statistically significant at the p < 0.05 level, indicating that we cannot
reject the null that the wife will participate in 0 political acts on average given that all the
independent variables are set to 0. The inability to reject the null that the constant is equal
to 0 makes sense since it is logically impossible to participate in negative political acts.
Though the wives’ constant is twice as small as the constant in the “Husbands” model of
Table 6, neither is statistically significant at the p < 0.05 level (i.e., we fail to reject the null
of the constant equaling zero in both the husbands’ and wives’ models). Consequently, the
difference has little overt meaning.
6) How do you interpret the coefficient associated with Control over major financial
decisions (1.52) in the “Husbands” model of Table 6? How do you interpret the
coefficient associated with Control over major financial decisions (-0.68) in the
“Wives” model of Table 6?
A one unit increase in control over major financial decisions is associated with a 1.52
increase in husbands’ political activity, on average, holding constant all other variables in
the model. This variable is statistically significant at the p < 0.05 level.
A one unit increase in control over major financial decisions is associated with a 0.68
decrease in wives’ political activity, on average, holding constant all other variables in
the model. This variable is not statistically significant at the p < 0.05 level (so we cannot
infer any systematic relationship between control over major financial decisions and
wives’ political activity at the population level).
7) How do you interpret the Beta (the standardized coefficient) associated with
Control over major financial decisions (0.11) in the “Husbands” model of Table
A one standard deviation increase in control over major financial decisions is associated
with a 0.11 standard deviation increase in husbands’ political activity, on average,
holding constant all other variables in the model. This variable is statistically significant
at the p < 0.05 level.
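For reference, a standardized coefficient is the unstandardized coefficient rescaled by the ratio of the two standard deviations. The sketch below uses made-up numbers purely to show the arithmetic; none of them come from the article.

```python
def standardized_beta(b, sd_x, sd_y):
    """Convert an unstandardized OLS coefficient into a standardized one."""
    return b * sd_x / sd_y


# Hypothetical values: b = 2.0, SD of x = 0.5, SD of y = 4.0.
beta = standardized_beta(2.0, sd_x=0.5, sd_y=4.0)
print(beta)
```

Here a one standard deviation increase in x (0.5 units) moves y by 1.0 unit, which is a quarter of a standard deviation of y.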
8) How do you interpret the value associated with the maximum effect of the wives’
belief in equality at home (+0.96) reported in Table 7? How was this value
This turned out to be a tricky question since 0.96 is incorrect (i.e., the authors
miscalculated the maximum effect or the typesetters mis-set the number). Consequently, I
accepted a variety of answers. I was looking for something along the following lines:
The maximum change in a wife’s belief in equality at home is associated with a +0.96
increase in a wife’s political activity (on average, controlling for all other variables in the
model). So the maximum effect of a wife’s belief in equality at home is smaller than the
maximum effect associated with any of the participatory factors (e.g., education, family
income). Of the variables found to be statistically significant, only relative respect
as an advisor has a smaller maximum effect. For OLS, the maximum change is calculated
by subtracting the minimum value of the variable times the coefficient associated with
that variable from the maximum value of the variable times the coefficient associated
with that variable (maximum_value * coefficient - minimum_value * coefficient).
In footnote 25, the authors wrote “The maximum effect was calculated by multiplying the
regression coefficient from Table 6 by the difference between the highest and the lowest
values the associated variable takes in our data.”
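The two descriptions of the calculation (mine and the authors' footnote) are algebraically the same, as the short sketch below shows. The coefficient and the range of the variable are hypothetical; I have not tried to reproduce any entry of Table 7.

```python
def maximum_effect(coefficient, min_value, max_value):
    """Footnote-style calculation: coefficient times the observed range."""
    return coefficient * (max_value - min_value)


# Hypothetical variable coded 1 to 5 with a coefficient of 0.3.
coef, lo, hi = 0.3, 1, 5
by_range = maximum_effect(coef, lo, hi)
by_difference = hi * coef - lo * coef   # max*coef minus min*coef

print(by_range, by_difference)
```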
9) Why do the authors report maximum effects in Table 7? Do the maximum effects
reported in Table 7 appear consistent with the Betas (standardized coefficients)
reported in Table 6? How so?
The maximum effect is reported in order to provide a means of comparing the relative
substantive effect of the various independent variables (upon political activity, the
dependent variable). So for example, as mentioned above, the maximum effect of a
wife’s belief in equality at home is smaller than the maximum effect associated with any
of the participatory factors such as education, family income, etc.
Yes, the maximum effects reported in Table 7 appear to be consistent with the betas
reported in Table 6 since the rank order of the two is very similar. For example, for
wives, family income has the greatest beta and education has the second greatest beta in
Table 6 and family income has the greatest maximum effect and education has the second
greatest maximum effect in Table 7. So the two tell a very similar story with regard to the
relative influence of the various independent variables on the dependent variable.
10) Why do the authors write “Many of the factors that we expected to enhance
wives’ activity – for example, bringing in a high proportion of family income… –
seem not to affect their participation” (382).
As the authors outlined in the earlier parts of the paper, many (feminist) theorists
hypothesized a positive relationship between the relative influence/position of the wife in
the private sphere and the relative level of participation in the public sphere. However,
many of the variables used to tap the wife’s relative influence/position in the
private sphere did not have a statistically significant relation to political activity. For
example, the percentage of the family income brought in by the wife was not a
statistically significant predictor – indicating that we cannot infer any systematic
relationship between the percentage of family income brought in by the wife and her
level of political activity.
11) Are the authors concerned with the issue of omitted variables (i.e., model
specification)? Are they concerned with the issue of variable operationalization?
If so, what, if anything, do they do to address these concerns?
The authors are very concerned about omitted variables and variable operationalization.
With regard to the former, they write “For example, we estimated versions of the model in
which we included measures of two additional participatory resources: job level and
leisure time. As expected, neither variable has a significant effect on the activity of either
husbands or wives, and the results presented in Table 6 were unchanged” (382).
With regard to the latter, they write in footnote 23 that “[i]t could be argued that the
appropriate specification of the model is to use only individual-level measures rather than
the relative measures. For the results of analyses based on individual-level measures, see
Appendix D.” They expand upon this point in detail in Appendix D, and indicate that the
overall picture does not change if one uses individual rather than relative measures.
12) Why do the authors write that “[w]hat is so striking about the findings in Table 7
is that the potential consequences of domestic hierarchy for husbands’
participation are greater than the effects for wives” (382).
In short, this is because 1.41 is greater than 0.84. However, this difference is not in orders
of magnitude. Furthermore, it appears that this would not be accurate if they had correctly
estimated the maximum effect associated with a wife’s belief in equality at home.
However, the result appears striking to the authors because it was hypothesized that a
woman’s political activity would increase as her relative position in the domestic
hierarchy improved, whereas two of the three coefficients associated with domestic
hierarchy are negative in the wives’ model (even though neither of these estimates is
statistically significant). I suspect that the results also appear striking to the authors
because most of the hypotheses centered on women’s political activity, and
comparatively little thought was focused on how the features of the domestic
arrangement influenced/benefited the political activity of husbands.
13) Could there be an issue of endogeneity (or reversed causality) between any of the
independent variables and the dependent variable? If yes, how might such a story
go? If yes, what implications would this have for the results reported in Table 6?
Yes, it is easy to imagine a number of such stories. For example, political activity might
stimulate political interest. Likewise, one could imagine that a wife’s activity in the
public sphere might influence her beliefs about the private sphere. In other words:
Political Interest ↔ Political Activity
Beliefs In Equality in the Home ↔ Political Activity
OLS estimates are biased and inconsistent in the presence of a non-recursive relationship.
In other words, you cannot believe any results from OLS if you believe the relationship
has been misspecified (i.e., if it has been specified as a recursive relationship when it is
actually a non-recursive relationship).
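A small simulation gives a feel for the size of the problem. This is only a sketch under assumed values: a two-equation system in which x affects y with a true coefficient of 0.5 and y feeds back on x with a coefficient of 0.5, with independent standard normal errors. All names and numbers are hypothetical and unrelated to the article's data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

b_true = 0.5   # assumed effect of x on y
g_true = 0.5   # assumed feedback effect of y on x

u = rng.normal(size=n)   # error in the y equation
v = rng.normal(size=n)   # error in the x equation

# Solve the structural equations y = b*x + u and x = g*y + v jointly.
denom = 1.0 - b_true * g_true
x = (g_true * u + v) / denom
y = (b_true * v + u) / denom

# Bivariate OLS slope of y on x, which ignores the feedback from y to x.
slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
print(round(slope, 2))
```

With these assumed values the OLS slope converges to 0.8 rather than the true 0.5, and collecting more data does not help: the estimate is inconsistent, not merely noisy.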
14) The authors ran and report two OLS multivariate regressions in Table 6 (one for
wives and one for husbands). How could they have combined these two analyses
into a single OLS multivariate analysis (and still have obtained the same results
reported in Table 6)? What would be the advantage(s) of doing so (if any)? What
would be the disadvantage(s) of doing so (if any)?
Technically, they could have introduced a binary variable for gender and interacted this
gender variable with all the independent variables used in their model. This should have
produced results that would be identical to the two sets of OLS results presented in Table
At a minimum, the advantage of the above specification would be that they could have
run an F-test to determine if the interactions were jointly significant. In other words, an
F-test would have permitted them to determine whether a single model for both husbands
and wives would have been superior to the more complicated model with interactions
(i.e., to determine whether separate models for husbands and wives were preferable to a
single model for both).
Done correctly, including interactions permits the analyst to determine if the difference in
slopes for husbands versus wives is statistically significant. For example, the estimated
coefficient associated with political interest is 0.36 for wives and 0.54 for husbands. But
there is a confidence interval around each point estimate. So the obvious question is
whether the difference between the 0.36 for wives and the 0.54 for husbands is
statistically significant. The hypothesis that the population level coefficients are equal to
one another can easily be tested in a model that includes the interactions.
An obvious downside to including interactions is that the model looks less parsimonious
and is a bit trickier to interpret.
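The equivalence between the two separate regressions and a single fully interacted model can be checked on simulated data. The sketch below assumes one stand-in independent variable and made-up coefficients; the variable names are hypothetical and nothing here uses the article's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
wife = rng.integers(0, 2, size=n)   # simulated indicator: 1 = wife, 0 = husband
x = rng.normal(size=n)              # one stand-in independent variable
# Simulated outcome with a different intercept and slope in each group.
y = np.where(wife == 1, 1.0 + 0.4 * x, 2.0 + 0.6 * x) + rng.normal(size=n)


def ols(X, y):
    """Return (coefficients, residual sum of squares) from a least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)


ones = np.ones(n)
X_simple = np.column_stack([ones, x])

# Two separate regressions, one per group.
b_husbands, _ = ols(X_simple[wife == 0], y[wife == 0])
b_wives, _ = ols(X_simple[wife == 1], y[wife == 1])

# One pooled regression with the dummy and its interaction with x.
X_full = np.column_stack([ones, x, wife, wife * x])
b_full, rss_full = ols(X_full, y)

# Restricted model: a single intercept and slope for everyone.
_, rss_restricted = ols(X_simple, y)

# F statistic for the joint null that the dummy and interaction are both zero.
q = 2
f_stat = ((rss_restricted - rss_full) / q) / (rss_full / (n - X_full.shape[1]))

print(b_husbands, b_full[:2])
print(b_wives, b_full[:2] + b_full[2:])
print(f_stat)
```

The husbands' coefficients are the first two pooled coefficients, and the wives' coefficients are those plus the dummy and interaction terms. The point estimates match exactly; the standard errors can differ somewhat, because the pooled model assumes a single error variance for both groups.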
Extra Credit: How might the independent variables have been recoded (in a
reasonable fashion) such that the constants in the OLS results reported in Table 6
would have been positive?
Recode all the variables such that the mean (or median) value of the variable was coded 0
(i.e., rather than range from 1-5, or 0-4, recode it so that the variable ranged from -2 to
+2). If all the independent variables are set to 0, and 0 represents the mean of all the
independent variables, then the constant should be approximately equal to the mean of
the dependent variable (which, for political activity, was a bit over 2).
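The effect of centering on the constant can be verified with simulated data. The sketch below assumes two made-up survey items and coefficients chosen so that the uncentered constant is negative; none of the values come from Table 6.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.integers(1, 6, size=n).astype(float)   # simulated item coded 1-5
x2 = rng.integers(0, 5, size=n).astype(float)   # simulated item coded 0-4
y = -3.0 + 0.9 * x1 + 0.6 * x2 + rng.normal(size=n)


def ols(X, y):
    """Return least-squares coefficients, constant first."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


ones = np.ones(n)
b_raw = ols(np.column_stack([ones, x1, x2]), y)
b_centered = ols(np.column_stack([ones, x1 - x1.mean(), x2 - x2.mean()]), y)

print(b_raw)
print(b_centered, y.mean())
```

Centering at the exact sample means leaves the slopes untouched and makes the constant equal to the sample mean of the dependent variable. Recoding to a rounded midpoint (e.g., -2 to +2) gets only approximately there, which is why the answer above says "approximately".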