Impact Evaluation









Causal Inference



Sebastian Galiani



November 2006

Motivation

• The research questions that motivate most studies in the health sciences are causal in nature. For example:

  • What is the efficacy of a given drug in a given population?
  • What fraction of deaths from a given disease could have been avoided by a given treatment or policy?









HDN SAR WBI 2

Motivation

• The most challenging empirical questions in economics also involve causal-effect relationships:

  • Does school decentralization improve school quality?










Motivation

• Interest in these questions is motivated by:

  • Policy concerns
    • Does privatization of water systems improve child health?
  • Theoretical considerations
  • Problems facing individual decision makers






Causal Analysis

• The aim of standard statistical analysis, typified by likelihood and other estimation techniques, is to infer parameters of a distribution from samples drawn from that distribution.

• With the help of such parameters, one can:

1. Infer association among variables,
2. Estimate the likelihood of past and future events,
3. Update the likelihood of events in light of new evidence or new measurements.




Causal Analysis

• These tasks are managed well by standard statistical analysis as long as experimental conditions remain the same.

• Causal analysis goes one step further:

  • Its aim is to infer aspects of the data generation process.
  • With the help of such aspects, one can deduce not only the likelihood of events under static conditions, but also the dynamics of events under changing conditions.








Causal Analysis

• This capability includes:

1. Predicting the effects of interventions
2. Predicting the effects of spontaneous changes
3. Identifying causes of reported events

• This distinction implies that causal and associational concepts do not mix.










Causal Analysis

• The word “cause” is not in the vocabulary of standard probability theory.

• All probability theory allows us to say is that two events are mutually correlated, or dependent, meaning that if we find one, we can expect to encounter the other.

• Scientists seeking causal explanations for complex phenomena or rationales for policy decisions must therefore supplement the language of probability with a vocabulary for causality.




Causal Analysis

• Two languages for causality have been proposed:

1. Structural equation modeling (SEM) (Haavelmo 1943).
2. The Neyman-Rubin potential outcome model (RCM) (Neyman, 1923; Rubin, 1974).




The Rubin Causal Model

• Define the population by U. Each unit in U is denoted by u.

• For each u ∈ U, there is an associated value Y(u) of the variable of interest Y, which we call the response variable.

• Let A be a second variable defined on U. We call A an attribute of the units in U.




• The key notion is the potential for exposing or not exposing each unit to the action of a cause:

  • Each unit has to be potentially exposable to any one of the causes.

• Thus, Rubin takes the position that causes are only those things that could be treatments in hypothetical experiments.

• An attribute cannot be a cause in an experiment, because the notion of potential exposability does not apply to it.






• For simplicity, we assume that there are just two causes or levels of treatment.

• Let D be a variable that indicates the cause to which each unit in U is exposed:

D = t if unit u is exposed to treatment
D = c if unit u is exposed to control

• In a controlled study, D is constructed by the experimenter. In an uncontrolled study, it is determined by factors beyond the experimenter’s control.


• The values of Y are potentially affected by the particular cause, t or c, to which the unit is exposed.

• Thus, we need two response variables:

Yt(u), Yc(u)

• Yt is the value of the response that would be observed if the unit were exposed to t, and
• Yc is the value that would be observed on the same unit if it were exposed to c.






• Let D also be expressed as a binary variable:

D = 1 if D = t and D = 0 if D = c

• Then, writing Y1 for Yt and Y0 for Yc, the outcome of each individual can be written as:

Y(u) = D Y1(u) + (1 – D) Y0(u)






• Definition: For every unit u, treatment {Du = 1 instead of Du = 0} causes the effect

$$\delta_u = Y_1(u) - Y_0(u)$$

• This definition of a causal effect assumes that the treatment status of one individual does not affect the potential outcomes of other individuals.

• Fundamental Problem of Causal Inference: It is impossible to observe the values of Y1(u) and Y0(u) on the same unit and, therefore, it is impossible to observe the effect of t on u.

• Another way to express this problem is to say that we cannot infer the effect of treatment because we do not have the counterfactual evidence, i.e. what would have happened in the absence of treatment.
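The missing-counterfactual problem can be made concrete with a small simulation. This is only a sketch under stated assumptions: the population of five units, the normal baseline outcomes and the constant unit effect of 3 are all hypothetical, and both potential outcomes are visible only because the data are simulated.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: both potential outcomes are generated here,
# which is only possible in a simulation, never in real data.
units = []
for _ in range(5):
    y0 = rng.gauss(10.0, 2.0)        # Y0(u): response under control
    y1 = y0 + 3.0                    # Y1(u): assumed unit effect of 3
    d = rng.randint(0, 1)            # D: 1 = treatment, 0 = control
    y = d * y1 + (1 - d) * y0        # observed outcome Y(u) = D*Y1 + (1 - D)*Y0
    units.append({"d": d, "y": y, "y1": y1, "y0": y0})


def observed_record(unit):
    """What an analyst sees: D, Y, and only the realised potential outcome."""
    return {
        "d": unit["d"],
        "y": unit["y"],
        "y1": unit["y1"] if unit["d"] == 1 else None,  # missing when D = 0
        "y0": unit["y0"] if unit["d"] == 0 else None,  # missing when D = 1
    }


records = [observed_record(u) for u in units]
for record in records:
    print(record)
```

Every printed record has exactly one of `y1` and `y0` missing, so the unit-level difference Y1(u) – Y0(u) cannot be computed from observed data alone.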




• Given that the causal effect for a single unit u cannot be observed, we aim to identify the average causal effect for the entire population or for sub-populations.

• The average treatment effect (ATE) of t (relative to c) over U (or any sub-population) is given by:

$$\mathrm{ATE} = E[Y_1(u) - Y_0(u)] = E[Y_1(u)] - E[Y_0(u)]$$

$$\bar{\delta} = \bar{Y}_1 - \bar{Y}_0 \qquad (1)$$






• The statistical solution replaces the impossible-to-observe causal effect of t on a specific unit with the possible-to-estimate average causal effect of t over a population of units.

• Although E(Y1) and E(Y0) cannot both be calculated, they can be estimated.

• Most econometric methods attempt to construct from observational data consistent estimates of $\bar{Y}_1$ and $\bar{Y}_0$.


• Consider the following simple estimator of ATE:

$$\hat{\bar{\delta}} = [\hat{\bar{Y}}_1 \mid D = 1] - [\hat{\bar{Y}}_0 \mid D = 0] \qquad (2)$$

• Note that equation (1) is defined for the whole population, whereas equation (2) represents an estimator to be evaluated on a sample drawn from that population.






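• Let π equal the proportion of the population that would be assigned to the treatment group.

• Decomposing ATE, we have:

$$
\begin{aligned}
\bar{\delta} &= \pi\,\bar{\delta}_{\{D=1\}} + (1-\pi)\,\bar{\delta}_{\{D=0\}} \\
&= \pi\,[\bar{Y}_1 - \bar{Y}_0 \mid D=1] + (1-\pi)\,[\bar{Y}_1 - \bar{Y}_0 \mid D=0] \\
&= \{\pi\,[\bar{Y}_1 \mid D=1] + (1-\pi)\,[\bar{Y}_1 \mid D=0]\} - \{\pi\,[\bar{Y}_0 \mid D=1] + (1-\pi)\,[\bar{Y}_0 \mid D=0]\} \\
&= \bar{Y}_1 - \bar{Y}_0
\end{aligned}
$$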


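• If we assume that

$$[\bar{Y}_1 \mid D=1] = [\bar{Y}_1 \mid D=0] \quad \text{and} \quad [\bar{Y}_0 \mid D=1] = [\bar{Y}_0 \mid D=0]$$

then

$$
\begin{aligned}
\bar{\delta} &= \{\pi\,[\bar{Y}_1 \mid D=1] + (1-\pi)\,[\bar{Y}_1 \mid D=1]\} - \{\pi\,[\bar{Y}_0 \mid D=0] + (1-\pi)\,[\bar{Y}_0 \mid D=0]\} \\
&= [\bar{Y}_1 \mid D=1] - [\bar{Y}_0 \mid D=0]
\end{aligned}
$$

• This is consistently estimated by its sample analog estimator:

$$\hat{\bar{\delta}} = [\hat{\bar{Y}}_1 \mid D = 1] - [\hat{\bar{Y}}_0 \mid D = 0]$$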


• Thus, a sufficient condition for the standard estimator to consistently estimate the true ATE is that:

$$[\bar{Y}_1 \mid D=1] = [\bar{Y}_1 \mid D=0] \quad \text{and} \quad [\bar{Y}_0 \mid D=1] = [\bar{Y}_0 \mid D=0]$$

  • In this situation, the average outcome under the treatment and the average outcome under the control do not differ between the treatment and control groups.

  • In order to satisfy these conditions, it is sufficient that treatment assignment D be uncorrelated with the potential outcome distributions of Y1 and Y0.

  • The principal way to achieve this uncorrelatedness is through random assignment of treatment.
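A short simulation may help show why random assignment matters. The sketch below is illustrative only: the function `simulate`, the standard normal baseline outcomes, the constant effect of 2 and the self-selection rule (units with a positive baseline outcome take the treatment) are all assumptions made for this example.

```python
import random
from statistics import mean


def simulate(n, randomized, seed=1):
    """Return (true ATE, difference in means) for a hypothetical population."""
    rng = random.Random(seed)
    y0s, y1s, ds = [], [], []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)       # potential outcome under control
        y1 = y0 + 2.0                  # assumed constant treatment effect of 2
        if randomized:
            d = 1 if rng.random() < 0.5 else 0   # coin flip, unrelated to (y0, y1)
        else:
            d = 1 if y0 > 0 else 0               # self-selection on the baseline outcome
        y0s.append(y0)
        y1s.append(y1)
        ds.append(d)
    ate = mean(y1s) - mean(y0s)
    treated_mean = mean(y1 for y1, d in zip(y1s, ds) if d == 1)
    control_mean = mean(y0 for y0, d in zip(y0s, ds) if d == 0)
    return ate, treated_mean - control_mean


ate, diff_random = simulate(20_000, randomized=True)
_, diff_selected = simulate(20_000, randomized=False)
print(f"true ATE:                       {ate:.3f}")
print(f"difference in means, random D:  {diff_random:.3f}")
print(f"difference in means, selected D:{diff_selected:.3f}")
```

Under the coin-flip assignment the difference in means lands close to the true ATE. Under self-selection it overstates the effect, because the treated units would have had higher outcomes even without treatment.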


• In most circumstances, there is simply no information available on how those in the control group would have reacted if they had received the treatment instead.

• This is the basis for an important insight into the potential biases of the standard estimator (2).

• After a bit of algebra, it can be shown that, in large samples:

$$\hat{\bar{\delta}} = \bar{\delta} + \underbrace{\{[\bar{Y}_0 \mid D=1] - [\bar{Y}_0 \mid D=0]\}}_{\text{Baseline Difference}} + \underbrace{(1-\pi)\,\{\bar{\delta}_{\{D=1\}} - \bar{\delta}_{\{D=0\}}\}}_{\text{Treatment Heterogeneity}}$$






• This equation specifies the two sources of bias that need to be eliminated from estimates of causal effects from observational studies.

1. Selection Bias: Baseline difference.
2. Treatment Heterogeneity.

• Most of the methods available only deal with selection bias, either by simply assuming that the treatment effect is constant in the population or by redefining the parameter of interest in the population.
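The split of the bias into a baseline difference and a treatment-heterogeneity term can be checked numerically. The following is a sketch on simulated data, not an estimator for real data: the population, the selection rule and all variable names are hypothetical, and both terms can be computed only because both potential outcomes are simulated.

```python
import random
from statistics import mean

rng = random.Random(2)

# Hypothetical population with heterogeneous effects and self-selection.
pop = []
for _ in range(10_000):
    y0 = rng.gauss(0.0, 1.0)                 # outcome without treatment
    effect = 1.0 + rng.gauss(0.0, 0.5)       # unit-level effect varies across units
    # Units with a high baseline outcome or a high gain tend to take treatment.
    d = 1 if y0 + effect + rng.gauss(0.0, 1.0) > 1.0 else 0
    pop.append((y0, y0 + effect, d))

treated = [(y0, y1) for y0, y1, d in pop if d == 1]
control = [(y0, y1) for y0, y1, d in pop if d == 0]
pi = len(treated) / len(pop)                 # share assigned to treatment

ate = mean(y1 - y0 for y0, y1, _ in pop)
delta_t = mean(y1 - y0 for y0, y1 in treated)    # average effect among the treated
delta_c = mean(y1 - y0 for y0, y1 in control)    # average effect among the controls

naive = mean(y1 for _, y1 in treated) - mean(y0 for y0, _ in control)
baseline = mean(y0 for y0, _ in treated) - mean(y0 for y0, _ in control)
heterogeneity = (1 - pi) * (delta_t - delta_c)

print(f"naive estimate          : {naive:.3f}")
print(f"ATE                     : {ate:.3f}")
print(f"baseline difference     : {baseline:.3f}")
print(f"treatment heterogeneity : {heterogeneity:.3f}")
print(f"ATE + both bias terms   : {ate + baseline + heterogeneity:.3f}")
```

In this simulated population the naive difference in means equals the ATE plus the two bias terms up to floating-point error, and both terms are positive because selection favours units with high baseline outcomes and high gains.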






Treatment on the Treated

• ATE is not always the parameter of interest.

• In a variety of policy contexts, it is the average treatment effect for the treated (TOT) that is of substantive interest:

TOT = E[Y1(u) – Y0(u) | D = 1]
    = E[Y1(u) | D = 1] – E[Y0(u) | D = 1]










Treatment on the Treated

• The standard estimator (2) consistently estimates TOT if:

$$[\bar{Y}_0 \mid D=1] = [\bar{Y}_0 \mid D=0]$$
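As a rough illustration of this condition, the sketch below simulates a hypothetical population in which selection depends only on the unit's gain from treatment, so the baseline outcome Y0 has the same mean in both groups. The selection rule and all numbers are assumptions made for the example.

```python
import random
from statistics import mean

rng = random.Random(5)

pop = []
for _ in range(20_000):
    y0 = rng.gauss(0.0, 1.0)                 # baseline outcome, unrelated to D
    gain = 1.0 + rng.gauss(0.0, 1.0)         # heterogeneous unit effect
    # Selection depends on the gain only, not on the baseline outcome.
    d = 1 if gain + rng.gauss(0.0, 0.5) > 1.0 else 0
    pop.append((y0, y0 + gain, d))

ate = mean(y1 - y0 for y0, y1, _ in pop)
tot = mean(y1 - y0 for y0, y1, d in pop if d == 1)
naive = (mean(y1 for _, y1, d in pop if d == 1)
         - mean(y0 for y0, _, d in pop if d == 0))

print(f"ATE   : {ate:.3f}")
print(f"TOT   : {tot:.3f}")
print(f"naive : {naive:.3f}")
```

In this setup the naive difference in means tracks TOT rather than ATE: the baseline-difference term is approximately zero, while treatment heterogeneity still separates TOT from ATE.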










Structural Equation Modeling

• Structural equation modeling was originally developed by geneticists (Wright 1921) and economists (Haavelmo 1943).










Structural Equations

• Definition: An equation

y = β x + ε (8)

is said to be structural if it is to be interpreted as follows:

  • In an ideal experiment where we control X to x and any other set Z of variables (not containing X or Y) to z, the value y of Y is given by β x + ε, where ε is not a function of the settings x and z.

• This definition is in the spirit of Haavelmo (1943), who explicitly interpreted each structural equation as a statement about a hypothetical controlled experiment.




• Thus, to the often asked question, “Under what conditions can we give a causal interpretation to structural coefficients?”, Haavelmo would have answered: Always!

• According to the founding father of SEM, the conditions that make the equation y = β x + ε structural are precisely those that make the causal connection between X and Y have no other value but β, and that ensure that nothing about the statistical relationship between x and ε can ever change this interpretation of β.




• The average causal effect: The average causal effect on Y of treatment level x is the difference in the conditional expectations:

E(Y | X = x) – E(Y | X = 0)

• In the context of dichotomous interventions (x = 1), this causal effect is called the average treatment effect (ATE).




Representing Interventions

• Consider the structural model M:

z = fz(w)
x = fx(z, ν)
y = fy(x, u)

• We represent an intervention in the model through a mathematical operator denoted do(x).

• do(x) simulates physical interventions by deleting certain functions from the model, replacing them by a constant X = x, while keeping the rest of the model unchanged.


• To emulate an intervention do(x0) that holds X constant (at X = x0) in model M, replace the equation for x with x = x0, and obtain a new model, Mx0:

z = fz(w)
x = x0
y = fy(x, u)

• The joint distribution associated with the modified model, denoted P(z, y | do(x0)), describes the post-intervention (“experimental”) distribution.

• From this distribution, one is able to assess treatment efficacy by comparing aspects of this distribution at different levels of x0.
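The replacement of one equation by a constant can be sketched in code. The functional forms below (`f_z`, `f_x`, `f_y`), the standard normal background variables and the helper `solve` are hypothetical choices for illustration; the slides leave the functions unspecified.

```python
import random
from statistics import mean


# Hypothetical functional forms for the structural model M.
def f_z(w):
    return w


def f_x(z, nu):
    return 1 if z + nu > 0 else 0


def f_y(x, u):
    return 2.0 * x + u


def solve(w, nu, u, do_x=None):
    """Solve model M, or the modified model M_x0 when do_x is given.

    Under an intervention the equation for x is replaced by x = do_x;
    the equations for z and y are left unchanged.
    """
    z = f_z(w)
    x = f_x(z, nu) if do_x is None else do_x
    y = f_y(x, u)
    return z, x, y


rng = random.Random(3)
# Draws of the background variables (w, nu, u), assumed independent here.
background = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
              for _ in range(5_000)]

# Post-intervention means of Y at two treatment levels.
y_do1 = mean(solve(w, nu, u, do_x=1)[2] for w, nu, u in background)
y_do0 = mean(solve(w, nu, u, do_x=0)[2] for w, nu, u in background)
effect = y_do1 - y_do0
print(f"E[Y | do(x=1)] - E[Y | do(x=0)] = {effect:.3f}")
```

Comparing the post-intervention means at x0 = 1 and x0 = 0 recovers the coefficient 2.0 built into the assumed `f_y`, and the values of z are the same with and without the intervention because its equation was not touched.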






Structural Parameters

• Definition: The interpretation of a structural equation as a statement about the behavior of Y under a hypothetical intervention yields a simple definition for the structural parameters.

• The meaning of β in the equation y = β x + ε is simply

$$\beta = \frac{\partial}{\partial x} E[Y \mid do(x)]$$
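To see how this definition differs from a regression slope, the sketch below simulates a linear structural equation in which x is correlated with ε. The value of `BETA`, the way x depends on ε, and the helper names are assumptions made for illustration only.

```python
import random
from statistics import mean

BETA = 1.5  # hypothetical structural coefficient

rng = random.Random(4)

# Observational regime: x is partly driven by eps, so the two are correlated.
draws = []
for _ in range(20_000):
    eps = rng.gauss(0.0, 1.0)
    x = 0.8 * eps + rng.gauss(0.0, 1.0)
    draws.append((x, eps))


def y_of(x, eps):
    """Structural equation y = beta * x + eps."""
    return BETA * x + eps


# Slope of the least-squares regression of y on x in the observational data.
xs = [x for x, _ in draws]
ys = [y_of(x, eps) for x, eps in draws]
mx, my = mean(xs), mean(ys)
ols_slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))


def mean_y_do(x0):
    """E[Y | do(x0)]: set x to x0 for every unit, leaving eps untouched."""
    return mean(y_of(x0, eps) for _, eps in draws)


# Finite-difference slope of E[Y | do(x)] with respect to x.
do_slope = (mean_y_do(1.0) - mean_y_do(0.0)) / 1.0

print(f"structural beta    : {BETA}")
print(f"slope under do(x)  : {do_slope:.3f}")
print(f"regression slope   : {ols_slope:.3f}")
```

The slope of E[Y | do(x)] reproduces the assumed β, while the observational regression slope is pulled away from it by the correlation between x and ε.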


Counterfactual Analysis in Structural Models

• Consider again model Mx0. Call the solution of Y the potential response of Y to x0.

• We denote it as Yx0(u, ν, w).

• This entity can be given a counterfactual interpretation, for it stands for the way an individual with characteristics (u, ν, w) would respond, had the treatment been x0, rather than the x = fx(z, ν) actually received by the individual.






• In our example,

Yx0(u, ν, w) = Yx0(u) = y = fy(x0, u)

• This interpretation of counterfactuals, cast as solutions to modified systems of equations, provides the conceptual and formal link between structural equation modeling and the Rubin potential-outcome framework.

• It assures us that the end results of the two approaches will be the same.

• Thus, the choice of model is strictly a matter of convenience or insight.
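The link between the two frameworks can be illustrated for a single unit. In this sketch the functional forms and the unit's characteristics (u, ν, w) are hypothetical; the point is only that the potential responses obtained from the modified model play the role of Y1(u) and Y0(u) in the Rubin model.

```python
# Hypothetical functional forms for the structural model M.
def f_z(w):
    return w


def f_x(z, nu):
    return 1 if z + nu > 0 else 0


def f_y(x, u):
    return 2.0 * x + u


def potential_response(x0, u):
    """Y_x0(u): the solution for Y in the modified model M_x0."""
    return f_y(x0, u)


# One hypothetical individual with characteristics (u, nu, w).
unit = {"w": 0.3, "nu": -1.0, "u": 0.5}

z = f_z(unit["w"])
x_actual = f_x(z, unit["nu"])              # treatment actually received
y_observed = f_y(x_actual, unit["u"])      # outcome actually observed

y1 = potential_response(1, unit["u"])      # corresponds to Y1(u)
y0 = potential_response(0, unit["u"])      # corresponds to Y0(u)
unit_effect = y1 - y0

# The observed outcome matches Y = D*Y1 + (1 - D)*Y0 with D = x_actual.
switching = x_actual * y1 + (1 - x_actual) * y0

print(f"x actually received: {x_actual}")
print(f"observed y: {y_observed}, Y1: {y1}, Y0: {y0}, unit effect: {unit_effect}")
```

For this unit the observed outcome coincides with the potential response at the treatment level actually received, which is the same relation that ties Y to Y1 and Y0 in the potential-outcome notation.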










References

• Judea Pearl (2000): Causality: Models, Reasoning and Inference, CUP. Chapters 1, 5 and 7.
• Trygve Haavelmo (1944): “The probability approach in econometrics”, Econometrica 12, pp. iii-vi+1-115.
• Arthur Goldberger (1972): “Structural Equations Methods in the Social Sciences”, Econometrica 40, pp. 979-1002.
• Donald B. Rubin (1974): “Estimating causal effects of treatments in randomized and nonrandomized experiments”, Journal of Educational Psychology 66, pp. 688-701.
• Paul W. Holland (1986): “Statistics and Causal Inference”, Journal of the American Statistical Association 81, pp. 945-70, with discussion.






