Differences-in-Differences

We have already come across the idea of ‘differencing’ as a way to deal with the
problem of omitted variables. In the context of the analysis of experimental data the
simple comparison of the mean of the outcome in treatment and control groups is
justified on the grounds that the randomization guarantees they should not have any
systematic differences in any other pre-treatment variable. And in the analysis of
twins data the use of the differences estimator was justified on the grounds that the
twins could be assumed to be identical in their ‘ability’ (though one might debate
whether this is really a good assumption as discussed earlier).

This idea of trying to mimic an experiment suggests looking for equivalents of
‘treatment’ and ‘control’ groups in which everything apart from the variable of
interest (or other things that can be controlled for) is assumed to be the same. But
this is often a very difficult claim to make: it is rarely possible to do this perfectly,
in which case observed differences between treatment and control groups may be the
result of some other omitted factors.

But, even if one might not be prepared to make the assumption that the treatment and
control groups are the same in every respect apart from the treatment one might be
prepared to make the assumption that, in the absence of treatment, the unobserved
differences between treatment and control groups are the same over time.

In this case one could use data on the treatment and control groups before the
treatment to estimate the ‘normal’ difference between the two groups, and then
compare this with the difference after the receipt of treatment. Perhaps a graph will
make the idea clearer.




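[Figure: the outcome y plotted against time (pre- and post-treatment) for the
treatment and control groups. In the post-treatment period, A is the observed outcome
of the treatment group, B is the observed outcome of the control group, and C lies
between them, at the level implied for the treatment group if the pre-treatment gap
between the two groups had persisted.]
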
If one just used data from the post-treatment period then one would estimate the
treatment effect as the distance AB – this estimate being based on the assumption that
the only reason for observing a difference in outcome between treatment and control
group is the receipt of treatment.

In contrast, the ‘difference-in-difference’ estimator takes the ‘normal’ difference
between the treatment and control group to be the distance CB and estimates the
treatment effect as the distance AC. Note that the validity of this rests on the
assumption that the ‘trend’ in y is the same in both treatment and control group. If,
for example, the trend was greater in the treatment group, then AC would be an
over-estimate of the treatment effect. With only two observations on the treatment and
control groups one cannot test this assumption of a common underlying trend, though I
will discuss a bit later how one can do this if one has more than two observations.
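
To see why differing trends bias the estimate, here is a minimal sketch using made-up
numbers (the group levels, trends and true effect are all hypothetical, chosen only for
illustration). The treatment group is given a steeper underlying trend than the control
group, and the difference-in-difference calculation then attributes the extra trend to
the treatment.

```python
# All numbers are hypothetical, for illustration only.
true_effect = 1.5
control_trend = 2.0   # change in y for the control group between periods
treated_trend = 3.0   # change the treatment group would have had WITHOUT treatment

control_pre, treated_pre = 10.0, 14.0
control_post = control_pre + control_trend
treated_post = treated_pre + treated_trend + true_effect

# 'Normal' gap measured before treatment (distance CB in the figure).
normal_gap = treated_pre - control_pre
# Post-treatment gap (distance AB in the figure).
post_gap = treated_post - control_post
# Difference-in-difference estimate (distance AC in the figure).
did_estimate = post_gap - normal_gap

# The bias equals the gap between the two underlying trends.
bias = did_estimate - true_effect
print(did_estimate, bias)
```

With these numbers the estimate exceeds the true effect by exactly the difference
between the two underlying trends; setting the two trends equal makes the bias vanish.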

Let's introduce some notation. Define $\mu_{it}$ to be the mean of the outcome in
group i at time t. Define i=0 for the control group and i=1 for the treatment group.
Define t=0 to be the pre-treatment period and t=1 to be the post-treatment period
(though only the treatment group gets the treatment).

The difference estimator we have discussed so far simply uses the difference in means
between treatment and control group post-treatment as the estimate of the treatment
effect, i.e. it uses an estimate of $(\mu_{11} - \mu_{01})$. However, this assumes that
the treatment and control groups have no other differences apart from the treatment, a
very strong assumption with non-experimental data. A weaker assumption is that any
difference in the change in means between treatment and control groups is the result
of the treatment, i.e. to use an estimate of
$(\mu_{11} - \mu_{01}) - (\mu_{10} - \mu_{00})$ as an estimate of the treatment effect.
This is the differences-in-differences estimator.
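
As a small worked illustration, the two estimators can be computed directly from the
four group-by-period means. The data below are invented for the example, and the
dictionary layout keyed by (group, period) is just one convenient assumption.

```python
from statistics import mean

# Hypothetical outcomes, keyed by (group i, period t):
# i = 0 control, i = 1 treatment; t = 0 pre-treatment, t = 1 post-treatment.
outcomes = {
    (0, 0): [10.0, 12.0, 11.0],
    (0, 1): [12.0, 14.0, 13.0],
    (1, 0): [14.0, 15.0, 16.0],
    (1, 1): [19.0, 20.0, 21.0],
}

# mu[(i, t)] is the sample analogue of mu_it.
mu = {cell: mean(values) for cell, values in outcomes.items()}

# Difference estimator: post-treatment gap only, (mu_11 - mu_01).
difference_estimate = mu[(1, 1)] - mu[(0, 1)]

# Differences-in-differences: (mu_11 - mu_01) - (mu_10 - mu_00).
did_estimate = (mu[(1, 1)] - mu[(0, 1)]) - (mu[(1, 0)] - mu[(0, 0)])

print(difference_estimate, did_estimate)
```

Here the post-treatment gap is 7 but the groups already differed by 4 before treatment,
so the differences-in-differences estimate is 3.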

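How can one estimate this in practice? One way is to write the D-in-D estimator as
$(\mu_{11} - \mu_{10}) - (\mu_{01} - \mu_{00})$. Note that the first term is the change
in outcome for the treatment group and the second term the change in outcome for the
control group. One can then simply estimate the model:
$$\Delta y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \tag{1.1}$$
where:
$$\Delta y_i = y_{i1} - y_{i0} \tag{1.2}$$
and $X_i$ is a dummy variable taking the value 1 if individual i is in the treatment
group and 0 otherwise, so that $\beta_1$ is the treatment effect.
Note that this is simply the differences estimator applied to differenced data.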
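
A minimal sketch of (1.1), assuming a small invented panel in which each individual is
observed in both periods. The helper `ols_simple` is a hypothetical name for a plain
one-regressor least-squares fit written out by hand, not a library routine.

```python
from statistics import mean

# Hypothetical panel: (treatment dummy X_i, y_i0, y_i1) for each individual.
panel = [
    (0, 10.0, 12.0), (0, 12.0, 14.0), (0, 11.0, 13.0),
    (1, 14.0, 19.0), (1, 15.0, 20.0), (1, 16.0, 21.0),
]

x = [float(xi) for xi, _, _ in panel]
dy = [y1 - y0 for _, y0, y1 in panel]   # equation (1.2)

def ols_simple(x, y):
    """Intercept and slope from regressing y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope

beta0, beta1 = ols_simple(x, dy)        # equation (1.1)
print(beta0, beta1)
```

Because the regressor is a dummy, the slope is just the mean change in the treatment
group minus the mean change in the control group, and the intercept is the mean change
in the control group.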
To implement the difference-in-difference estimator in the form (1.1) requires data
on the same individuals in both the pre- and post-treatment periods. But it might be
the case that the individuals observed in the two periods are different, so that those
in the treatment group who are observed in the pre-period are not observed again after
the treatment. If we use t=0 to denote the pre-period and t=1 to denote the
post-period, and $y_{it}$ to denote the outcome for individual i in period t, then an
alternative regression-based estimator that just uses the level of the outcome
variable is to estimate the model:
$$y_{it} = \beta_0 + \beta_1 X_i + \beta_2 T_t + \beta_3 X_i T_t + \varepsilon_{it} \tag{1.3}$$
where $X_i$ is a dummy variable taking the value 1 if the individual is in the
treatment group and 0 if they are in the control group, and $T_t$ is a dummy variable
taking the value 1 in the post-treatment period and 0 in the pre-treatment period.

The D-in-D estimator is going to be the OLS estimate of $\beta_3$, the coefficient on
the interaction between $X_i$ and $T_t$. Note that this interaction is itself a dummy
variable that takes the value one only for the treatment group in the post-treatment
period.

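From what you have done already you should be able to work out that in the
estimation of (1.3) we will have that:
$$
\begin{aligned}
\operatorname{plim} \hat{\beta}_0 &= \mu_{00} \\
\operatorname{plim} \hat{\beta}_1 &= \mu_{10} - \mu_{00} \\
\operatorname{plim} \hat{\beta}_2 &= \mu_{01} - \mu_{00} \\
\operatorname{plim} \hat{\beta}_3 &= (\mu_{11} - \mu_{01}) - (\mu_{10} - \mu_{00})
\end{aligned}
\tag{1.4}
$$
so that $\hat{\beta}_3$ is a consistent estimator of the D-in-D estimate of the
treatment effect.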
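
One way to see where these limits come from is to write out what (1.3) implies for
each of the four group-by-period cells. The model has one parameter per cell, so the
OLS fitted values are simply the cell sample means. The conditional-expectation
notation below is added here for the sketch and is not used elsewhere in these notes.

```latex
\begin{align*}
E[y_{it} \mid X_i = 0, T_t = 0] &= \beta_0 = \mu_{00} \\
E[y_{it} \mid X_i = 1, T_t = 0] &= \beta_0 + \beta_1 = \mu_{10} \\
E[y_{it} \mid X_i = 0, T_t = 1] &= \beta_0 + \beta_2 = \mu_{01} \\
E[y_{it} \mid X_i = 1, T_t = 1] &= \beta_0 + \beta_1 + \beta_2 + \beta_3 = \mu_{11}
\end{align*}
```

Subtracting the first line from the second and third gives $\beta_1$ and $\beta_2$,
and substituting into the fourth gives
$\beta_3 = \mu_{11} - \mu_{10} - \mu_{01} + \mu_{00}$, which rearranges to the last
line of (1.4).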

Where one has repeated observations on the same individuals one can use both
estimation methods, (1.1) and (1.3), on the same data and they will give exactly the
same estimate of the treatment effect. However, the standard error of that estimate
will be different in the two cases; the class exercise asks you about the reasons for
that.
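
The equality of the two point estimates can be checked numerically. The sketch below
is only an illustration under stated assumptions: the panel is invented, and `ols` is
a hypothetical hand-written helper that solves the normal equations, not a routine
from any statistics package. It does not compute standard errors.

```python
def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]

# Hypothetical panel: (treatment dummy X_i, y_i0, y_i1).
panel = [
    (0, 10.0, 12.5), (0, 12.0, 13.0), (0, 11.0, 13.5),
    (1, 14.0, 19.0), (1, 15.0, 21.0), (1, 16.0, 20.5),
]

# Method (1.1): regress the change in y on a constant and X_i.
fd_rows = [[1.0, float(x)] for x, _, _ in panel]
dy = [y1 - y0 for _, y0, y1 in panel]
beta_fd = ols(fd_rows, dy)

# Method (1.3): stack both periods and regress the level of y on
# a constant, X_i, T_t and the interaction X_i * T_t.
level_rows, level_y = [], []
for x, y0, y1 in panel:
    level_rows.append([1.0, float(x), 0.0, 0.0])
    level_y.append(y0)
    level_rows.append([1.0, float(x), 1.0, float(x)])
    level_y.append(y1)
beta_levels = ols(level_rows, level_y)

print(beta_fd[1], beta_levels[3])
```

The coefficient on $X_i$ in the differenced regression and the coefficient on the
interaction in the levels regression agree up to floating-point rounding.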

Other Regressors
You can include other regressors in either (1.1) or (1.3). Note that if you think it
is the level of some variable that affects the level of y, then you should probably
include the change in that variable as one of the other regressors if you are
estimating the model (1.1), i.e. in differenced form.

Differential Trends in Treatment and Control Groups
The validity of the differences-in-differences estimator is based on the assumption
that the underlying ‘trend’ in the outcome variable is the same for both treatment and
control group. With only two observations this is not testable, but with more than two
observations we can get some idea of its plausibility.
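
One informal way to use extra periods is sketched below, with invented group means and
the treatment assumed to occur between the third and fourth periods. It computes the
treatment-control gap in every period and then a ‘placebo’ difference-in-difference
between successive pre-treatment periods, which should be close to zero if the trends
are common. This is an eyeball check under these assumptions, not a formal test.

```python
# Hypothetical group means of y by period; treatment happens after period index 2.
control = [10.0, 11.0, 12.0, 13.0]
treated = [13.0, 14.0, 15.0, 19.0]
last_pre = 2  # index of the last pre-treatment period (an assumption of the example)

# Gap between treatment and control group in each period.
gaps = [t - c for t, c in zip(treated, control)]

# Placebo D-in-D between successive pre-treatment periods: with common
# trends these should be near zero, since no one has been treated yet.
placebo = [gaps[k + 1] - gaps[k] for k in range(last_pre)]

# Actual D-in-D using the last pre-period and the first post-period.
did_estimate = gaps[last_pre + 1] - gaps[last_pre]

print(gaps, placebo, did_estimate)
```

In this example the pre-treatment gap is constant, so the placebo estimates are zero
and the estimate of 3 is not obviously contaminated by differential trends.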

To give an example, consider the paper “Vertical Relationships and Competition in
Retail Gasoline Markets”, by Justine Hastings, published in the American Economic
Review, 2004. She was interested in the effect on retail petrol prices of an
increase in vertical integration when a chain of independent Californian ‘Thrifty’
petrol stations was acquired by ARCO, who also have interests in petrol refining.
She defined a petrol station as being in the ‘treatment’ group if it was within one mile
of a Thrifty station (i.e. one can think of it as having a competitor that was a ‘Thrifty’)
and in the ‘control’ group if it was not. Because there are likely to be all sorts of
factors that cause petrol prices to differ across locations, this lends itself to a
difference-in-difference approach. The basic conclusions can be summarized in the
following graph.




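[Figure not reproduced: average retail petrol prices over time in the treatment and
control groups, before and after the acquisition.]
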
Before the acquisition, prices in the ‘treatment’ group were, on average, 2-3 cents
lower than in the control group, but after the acquisition they were 2-3 cents higher.
Hence, the difference-in-difference estimate of the effect of the acquisition is about
5 cents.
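
The arithmetic can be written out as follows, taking the midpoint of each quoted range
(2.5 cents) purely as an assumption for illustration, and defining each gap as the
treatment-group mean price minus the control-group mean price.

```latex
\[
\underbrace{(+2.5)}_{\text{gap after the acquisition}}
\;-\;
\underbrace{(-2.5)}_{\text{gap before the acquisition}}
\;=\; 5 \text{ cents}
\]
```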

This picture also presents information on prices not just in the periods immediately
prior to the acquisition and immediately afterwards but also in other periods. One can
see that the trends in prices in treatment and control groups are similar in these other
periods suggesting that the assumption of common trends is a reasonable one.

Let's also give a famous example where the D-in-D assumption does not seem so
reasonable. In “Estimating the Effect of Training Programs on Earnings”, Review of
Economics and Statistics, 1978, Orley Ashenfelter was interested in estimating the
effect of government-sponsored training on earnings. He took a sample of trainees
under the Manpower Development and Training Act (MDTA) who started their
training in the first 3 months of 1964. Their earnings were tracked before, during
and after training from social security records. A random sample of the working
population was used as a comparison group.

The average earnings for white males for the two groups in the years 1959-69
inclusive are shown in the following Figure.




[Figure: log mean annual earnings by year, 1959 to 1969, for the comparison group and
the trainees.]




There are several points worth noting.

First, the earnings of the trainees in 1964 are very low because they were training and
not working for much of this year, so we should not pay much attention to this.

Secondly, the trainee and comparison groups are clearly different in some way
unconnected to training, as their earnings differ both pre- and post-training.
This means that the differences estimator based, say, on 1965 data would be a very
misleading measure of the impact of training on earnings. This suggests a
differences-in-differences approach.

A simple-minded approach would be to use the data from 1963 and 1965 to give a
difference-in-difference estimate of the effect of training on earnings. However,
inspection of the figure suggests that the earnings of the trainees were rather low not
just in the training year, 1964, but also in the previous year, 1963 (see footnote 1).
This is what is known as ‘Ashenfelter’s Dip’. The most likely explanation is that the
trainees had a bad year in 1963 (e.g. they lost a job) and it was this that caused them
to enter training in 1964. Because the earnings of the trainees are rather low in 1963,
a simple differences-in-differences comparison of earnings in 1965 and 1963 is likely
to over-estimate the returns to training. Ashenfelter then describes a number of ways
to deal with this problem that I am not going to discuss here. The point is that
observations on multiple years can be used to shed light on whether the assumption
underlying the use of differences-in-differences is a good one.
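
The sensitivity to the choice of base year can be illustrated with a toy calculation.
The log earnings below are invented and are NOT Ashenfelter's data, and comparing base
years is only a simple check chosen for this sketch, not necessarily one of the
approaches in his paper.

```python
# Hypothetical log mean annual earnings (NOT the figures from the paper).
comparison = {1962: 8.00, 1963: 8.05, 1965: 8.15}
trainees = {1962: 7.50, 1963: 7.40, 1965: 7.75}   # note the dip in 1963

def did(base_year, post_year):
    """Difference-in-difference estimate using the given base and post years."""
    trainee_change = trainees[post_year] - trainees[base_year]
    comparison_change = comparison[post_year] - comparison[base_year]
    return trainee_change - comparison_change

estimate_from_dip_year = did(1963, 1965)    # base year affected by the dip
estimate_from_earlier_year = did(1962, 1965)

print(round(estimate_from_dip_year, 2), round(estimate_from_earlier_year, 2))
```

With these made-up numbers the estimate based on the dip year is much larger than the
one based on the earlier year, because part of the measured gain is just recovery from
an unusually bad year.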




1. Note that the trainees' 1963 earnings are only slightly lower than their 1962
earnings, but the usual pattern is for earnings to grow, so they are quite a lot lower
than what one might have expected to see.

