# Quantitative Methods II by 2WhS2A

```
Quantitative Methods II
Dummy Variables & Interaction Effects

Edmund Malesky, Ph.D., UCSD

The Homogeneity Assumption
• OLS assumes all cases in your data are comparable
• x’s are a sample drawn from a single population
• But we may analyze distinct groups of cases together in one analysis
• Mean value of y may differ by group

Qualitative Variables
• These group effects remain as part of the error term
• If groups differ in their distribution of x’s, then we get a correlation between the X variables and the error term
• Violates assumption: cov(Xi, ui) = E(u) = 0
• Omitted Variable Bias!

Testing for Differences Across Groups (p. 249-252)
The Chow Test
i.e., testing for a difference between males and females

F = \frac{SSR_P - (SSR_1 + SSR_2)}{SSR_1 + SSR_2} \cdot \frac{n - 2(k+1)}{k+1}

SSR_1 = males only; SSR_2 = females only
SSR_{ur} = SSR_1 + SSR_2
SSR_P = SSR_r = pooling across both groups

1. Is only valid under homoskedasticity (the error variance for the two groups must be equal).
2. The null hypothesis is that there is no difference at all, either in the intercept or in the slope, between the two groups.
3. This may be too restrictive; in these cases, we should use dummy variables and dummy interactions to allow us to predict different slopes and intercepts for the two groups.

Example: Democracy & Tariffs
• Here we see that democracies have lower tariffs
[Figure: Percent Tariffs by regime type (Dictator, Oligarch, Anocracy, Democracy); series: Pooled Data]
• Here we see that states in Regional Trade Arrangements (RTA’s) have lower tariffs
[Figure: Percent Tariffs by regime type (Dictator, Oligarch, Anocracy, Democracy); series: RTA, No RTA, Pooled Data]
• But if democracies are more likely to be in RTA’s, then pooling RTA and non-RTA states biases the coefficient

Solution: The Qualitative Variable
• Measure this group difference (RTA vs. Non-RTA) and specify it as an x
• This eliminates bias
• But we have no numerical scale to measure RTA’s
• Create a categorical variable that captures this group difference

The Qualitative “Dummy”
• Create a variable that equals 1 when a case is part of a group, 0 otherwise
• This variable creates a new intercept for the cases in the group marked by the dummy
• Specifically, how would we interpret:

TARIFF = \beta_0 + \beta_1 DEM + \beta_2 RTA + u

Democracy and Tariff Barriers
[Figure: Percent Tariffs by regime type (Dictator, Oligarch, Anocracy, Democracy); series: RTA, No RTA]

TARIFF = \hat{\beta}_0 + \hat{\beta}_1 DEM + \hat{\beta}_2 RTA + \hat{u}
\hat{\beta}_0 = 50, \hat{\beta}_1 = -5, and \hat{\beta}_2 = -10

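Graphical Depiction of a Dummy
[Figure: y plotted against x1 (could be continuous, categorical, or dichotomous); two parallel lines with common slope \hat{\beta}_1]
• Line for x2 = 1: \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2, with intercept \hat{\beta}_0 + \hat{\beta}_2
• Line for x2 = 0: \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1, with intercept \hat{\beta}_0
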
Multiple Category Dummies
• Dummy variables are a very flexible way to assess categorical differences in the mean of y
• We can use dummies even for concepts with multiple categories
• Imagine we want to capture the impact of global region on tariffs
• Regions: Americas, Europe, Asia, Africa

Warning!
• Do not fall into the dummy variable trap!
• The trap is entering both values of a dummy variable in the same regression.
• Together with the constant, these two variables are linearly dependent. One will drop out.

Multiple Category Dummies
• Create 4 separate dummy variables, 1 for each region
• Include all except one of these dummies in the equation
• If you include all 4 dummies you get perfect collinearity with the constant. The fourth dummy will drop out.
• Americas + Europe + Asia + Africa = 1

Interpreting Multi-Category Dummies
• Each coefficient compares the mean for that group to the mean in the excluded category
• Thus if:

TARIFF = \hat{\beta}_0 + \hat{\beta}_1 DEM + \hat{\beta}_2 EUR + \hat{\beta}_3 ASIA + \hat{\beta}_4 AFR + \hat{u}

• \hat{\beta}_2 through \hat{\beta}_4 compare the mean tariff in each region to the mean in the Americas
• Mean in Americas is \hat{\beta}_0 (when DEM = 0)
• An alternative strategy is to drop the constant and run all dummies, as discussed last week.

Dumb Dummies
• Dummy variables are easy, flexible ways to measure categorical concepts
• They CAN be just labels for ignorance
• Try to use dummies to capture theoretical constructs, not empirical observations
• If possible, measure the theoretical construct more directly

Interaction Effects
• Dummy variables specify new intercepts
• Other slope coefficients in the equation do not change
• OLS assumes that the slopes of continuous variables are constant across all cases
• What if slopes are different for different groups in our sample?

Interaction Effects: An Example
• What if the effect of democracy on tariffs depends on whether the state is in an RTA?

TARIFF = \hat{\beta}_0 + \hat{\beta}_1 DEM + \hat{\beta}_2 RTA + \hat{u}
\hat{\beta}_1 = \hat{\gamma}_0 + \hat{\gamma}_1 RTA

Interaction Effects: An Illustration
(Notice that democracy has been converted to a dummy as well for illustration purposes)
[Figure: Percent Tariffs for Non-Dem vs. Democracy; series: RTA, No RTA]

TARIFF = \hat{\beta}_0 + \hat{\beta}_1 DEM + \hat{\beta}_2 RTA + \hat{u}
\hat{\beta}_1 = -5 if RTA = 0
\hat{\beta}_1 = -6 if RTA = 1

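How Do We Estimate This Set of Relationships?
• We begin with:

TARIFF = \hat{\beta}_0 + \hat{\beta}_1 DEM + \hat{\beta}_2 RTA + \hat{u}
\hat{\beta}_1 = \hat{\gamma}_0 + \hat{\gamma}_1 RTA

• Substituting for \hat{\beta}_1, we get:

TARIFF = \hat{\beta}_0 + (\hat{\gamma}_0 + \hat{\gamma}_1 RTA) DEM + \hat{\beta}_2 RTA + \hat{u}
TARIFF = \hat{\beta}_0 + \hat{\gamma}_0 DEM + \hat{\gamma}_1 RTA*DEM + \hat{\beta}_2 RTA + \hat{u}

• In STATA, they will appear as regular coefficients: \hat{\gamma}_0 as Βhat1, \hat{\gamma}_1 as Βhat2, and \hat{\beta}_2 as Βhat3
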
What Do These Coefficients Mean?

TARIFF = \hat{\beta}_0 + \hat{\gamma}_0 DEM + \hat{\gamma}_1 RTA*DEM + \hat{\beta}_2 RTA + \hat{u}

• \hat{\beta}_0 is the intercept for DEM when RTA=0
• \hat{\beta}_0 + \hat{\beta}_2 is the new intercept for DEM when RTA=1
• \hat{\gamma}_0 is the slope of DEM when RTA=0
• \hat{\gamma}_1 is the impact of RTA on the coefficient for DEM
• So if RTA=1, the slope of DEM is \hat{\gamma}_0 + \hat{\gamma}_1

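Interpreting the Interaction
• Recall that:

TARIFF = \hat{\beta}_0 + \hat{\beta}_1 DEM + \hat{\beta}_2 RTA + \hat{u}
\hat{\beta}_1 = \hat{\gamma}_0 + \hat{\gamma}_1 RTA
TARIFF = \hat{\beta}_0 + (\hat{\gamma}_0 + \hat{\gamma}_1 RTA) DEM + \hat{\beta}_2 RTA + \hat{u}
TARIFF = \hat{\beta}_0 + \hat{\gamma}_0 DEM + \hat{\gamma}_1 RTA*DEM + \hat{\beta}_2 RTA + \hat{u}

• RTA is a dummy variable taking on the values 0 or 1
• Thus if RTA=0, then \hat{\beta}_1 = \hat{\gamma}_0
• But if RTA=1, then \hat{\beta}_1 = \hat{\gamma}_0 + \hat{\gamma}_1
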
An Illustration of the Coefficients
• Imagine we estimate:

TARIFF = 30 - 5(DEM) - 1(RTA*DEM) - 10(RTA)

[Figure: Percent Tariffs for Non-Dem vs. Democracy; series: RTA, No RTA]

Substantive Effects of Dummy Interactions

                 No RTA                  RTA
Non-Democracy    Βhat0 = 30              Βhat0 + Βhat3 = 20
Democracy        Βhat0 + Βhat1 = 25      Βhat0 + Βhat1 + Βhat2 + Βhat3 = 14

Interactions with Continuous Variables
• The exact same logic about interactions applies if \hat{\beta}_1 depends on a continuous variable

y = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{u}
\hat{\beta}_1 = \hat{\gamma}_0 + \hat{\gamma}_1 x_2

• \hat{\gamma}_0 is the impact of x1 when x2 = 0
• \hat{\gamma}_1 is the change in \hat{\beta}_1 for each one unit increase in x2
• \hat{\beta}_2 is the impact of x2 when x1 = 0

Example: Democracy, Tariffs & Unemployment
[Figure: predicted tariffs (yhat) against Democracy 1-4 (Dictator, Oligarch, Anocrat, Demo); one line each for Unemployment == 0, 2, 4, 6]

TARIFF = 28 - 2(DEM) - 1(DEM*UNEMP) + 5(UNEMP)

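Graphical Depiction of a Dummy/Continuous Interaction
[Figure: y plotted against x1 (could be continuous, categorical, or dichotomous); lines with different intercepts and slopes for x2 = 0 and x2 = 1]
• Line for x2 = 1: \hat{y} = \hat{\beta}_0 + \hat{\gamma}_0 x_1 + \hat{\gamma}_1 (x_1 * x_2) + \hat{\beta}_3 x_2, with intercept \hat{\beta}_0 + \hat{\beta}_3 and slope \hat{\beta}_1 = \hat{\gamma}_0 + \hat{\gamma}_1
• Line for x2 = 0: same equation, with intercept \hat{\beta}_0 and slope \hat{\beta}_1 = \hat{\gamma}_0
• For comparison, the dummy-only line for x2 = 1: \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2, with slope \hat{\beta}_1
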
What if a Variable Interacts with Itself?
• What if \hat{\beta}_1 depends on the value of x1?

y = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{u}
\hat{\beta}_1 = \hat{\gamma}_0 + \hat{\gamma}_1 x_1

• Then we substitute in as before:

y = \hat{\beta}_0 + (\hat{\gamma}_0 + \hat{\gamma}_1 x_1) x_1 + \hat{\beta}_2 x_2 + \hat{u}
y = \hat{\beta}_0 + \hat{\gamma}_0 x_1 + \hat{\gamma}_1 x_1^2 + \hat{\beta}_2 x_2 + \hat{u}

• Curvilinear (Quadratic) effect is a type of interaction

More Complex Interactions
• We can use this method to specify the functional form of \hat{\beta}_1 in any way we choose
• Simply substitute the function in for \hat{\beta}_1, multiply out the terms and estimate
• Only limitations are theories of interaction and levels of collinearity

Examples of interaction effects from my own research

Governance and Economic Welfare
Figure 4: PCI Performance and Economic Welfare
[Figure: x-axis: Structural Endowments (Infrastructure, Human Capital, Proximity to Markets), 0 to 100; series: Low PCI, High PCI]
• Better governed (high PCI) provinces are able to generate higher living standards from the same level of development

Predicted Number of Loans by Legal Status among Vietnamese Private Firms

                      Land Use Rights Certificate
Registered at DPI     None      Partial     Full
No                    0.83      0.99        1.2
Yes                   2.73      3.27        3.98

Predicted Probability of Provincial Division in Vietnam
(By State Sector Output with Number of Cabinet Officials)
[Figure: predicted probability (.4 to .8) against State Contribution to Provincial Output (0 to 1); series: No Cabinet Members, 1 Cabinet Member, 2+ Cabinet Members; contribution of covariates at 75th percentile]

```
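
As a rough companion to the Chow test slide ("Testing for Differences Across Groups"), the sketch below computes the F statistic from the pooled SSR and the two group SSRs. The function name `chow_f` and every numeric input are hypothetical, chosen only for illustration; none of them come from the lecture.

```python
def chow_f(ssr_pooled, ssr_1, ssr_2, n, k):
    """Chow F statistic for a two-group comparison.

    ssr_pooled   -- SSR from the regression pooling both groups (restricted)
    ssr_1, ssr_2 -- SSRs from the two separate group regressions
    n            -- total number of observations across both groups
    k            -- number of slope coefficients in each regression
    """
    ssr_ur = ssr_1 + ssr_2            # unrestricted SSR
    df_num = k + 1                    # restrictions: one intercept plus k slopes
    df_den = n - 2 * (k + 1)          # residual df of the unrestricted model
    if df_den <= 0:
        raise ValueError("need n > 2(k + 1) observations")
    return ((ssr_pooled - ssr_ur) / ssr_ur) * (df_den / df_num)


# Hypothetical inputs, for illustration only.
f_stat = chow_f(ssr_pooled=120.0, ssr_1=40.0, ssr_2=50.0, n=100, k=2)
print(round(f_stat, 3))
```

The result would be compared with an F critical value on (k + 1, n - 2(k + 1)) degrees of freedom, and, as the slide notes, the comparison is only valid when the two groups share the same error variance.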
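
The 2x2 table on the "Substantive Effects of Dummy Interactions" slide can be reproduced by plugging 0/1 values into the illustrative equation TARIFF = 30 - 5(DEM) - 1(RTA*DEM) - 10(RTA). The following is a minimal sketch; the constant and function names are hypothetical, and the coefficient signs are the ones implied by the table's cell values (30, 20, 25, 14).

```python
# Coefficients from the slide's illustrative equation (signs implied by the table).
B_CONST = 30      # intercept: non-democracy, no RTA
B_DEM = -5        # slope of DEM when RTA = 0
B_INTER = -1      # change in the DEM slope when RTA = 1
B_RTA = -10       # shift in the intercept when RTA = 1


def predict_tariff(dem, rta):
    """Predicted tariff for 0/1 values of the DEM and RTA dummies."""
    return B_CONST + B_DEM * dem + B_INTER * rta * dem + B_RTA * rta


def dem_slope(rta):
    """Effect of DEM on the predicted tariff, conditional on RTA."""
    return B_DEM + B_INTER * rta


for dem, dem_label in ((0, "Non-Democracy"), (1, "Democracy")):
    for rta, rta_label in ((0, "No RTA"), (1, "RTA")):
        print(f"{dem_label:14s} {rta_label:7s} {predict_tariff(dem, rta)}")

print("DEM slope without RTA:", dem_slope(0))
print("DEM slope with RTA:   ", dem_slope(1))
```

Writing the conditional slope as its own function mirrors the substitution step in the slides: the coefficient on DEM is itself a linear function of RTA.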