CAUSAL INFERENCE &
BENEFIT-COST ANALYSIS



         Jens Ludwig
University of Chicago & NBER
                     MOTIVATION

Benefit-cost analysis (BCA) tries to assign dollar values
to program benefits and costs

Necessary precondition is to understand program’s
benefits and costs
 •   For that we need causal evidence
 •   Applying BCA to biased program impact estimates might do
     more harm than good
              EXPERIMENTS &
            THEIR ALTERNATIVES

1. What to do with social experiments that deviate
   from our super-clean ideal?
2. How good are our non-experimental alternatives?
3. Value of thinking about non-experiments through
   experimental lens
       THE (NEARLY) IDEAL HEAD START
                EXPERIMENT

We owe a great debt to HHS & Westat
 •   Representative sample of 383 over-subscribed Head Start
     centers, nearly 5,000 children, starting fall 2002
 •   Random assignment

But as often happens in real world, complications:
 •   Response rate differences across groups
 •   Imperfect compliance with treatment assignment
 •   Other center-based care alternatives
      SOME DIFFERENCES IN RESPONSE RATES
     ACROSS TREATMENT & CONTROL GROUPS

                            % with valid Fall 2002     % with valid Fall 2003
                            child assessment           child assessment
Treatment group             85%                        88%
Control group               72%                        77%

What do we want to know to determine if this is a real problem?
 •    Are T and C samples similar with respect to baseline Xs?
 •    Are results sensitive to conditioning on baseline Xs? (tells us
      something about importance of any differences in Xs)
 •    Worst-case bounds analysis (a minimal bounding sketch follows below)
      o Selective attrition w/ respect to things that change after baseline?
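
A minimal sketch of what a worst-case (Manski-style) bounds calculation could look like under differential attrition. Only the response rates echo the table above; the outcome means and the assumed test-score range are purely illustrative.

```python
# Hypothetical worst-case bounds on the ITT under differential attrition.
# Only the response rates echo the table above; everything else is made up.

def worst_case_bounds(y_t, r_t, y_c, r_c, y_min, y_max):
    """Bound the ITT when outcomes are missing for non-respondents.

    y_t, y_c : mean outcome among treatment / control respondents
    r_t, r_c : response rates in the treatment / control groups
    y_min, y_max : logical bounds on the outcome (e.g., the test-score range)
    """
    # Lower bound: impute the floor for missing treatment outcomes
    # and the ceiling for missing control outcomes.
    lo = (r_t * y_t + (1 - r_t) * y_min) - (r_c * y_c + (1 - r_c) * y_max)
    # Upper bound: the reverse imputation.
    hi = (r_t * y_t + (1 - r_t) * y_max) - (r_c * y_c + (1 - r_c) * y_min)
    return lo, hi

print(worst_case_bounds(y_t=0.30, r_t=0.85, y_c=0.10, r_c=0.72,
                        y_min=-2.0, y_max=2.0))
```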
          IMPERFECT COMPLIANCE WITH
            TREATMENT ASSIGNMENT
Cleanest thing to do is compare everyone assigned to T group with
everyone assigned to C group
  • Known as intent to treat (ITT) in program evaluation
  • Understates impacts of actually participating (effects of
     treatment on the treated, or TOT)

Distinction between ITT and TOT is crucial for BCA because:
 • Cost-effectiveness studies usually compare TOTs
 • BCA usually uses cost per child of program participation
      (don’t want ITT benefits vs. TOT costs)
                  CALCULATION OF TOT
                         Control Group
Treatment Group          Non-participant       Participant
Non-participant          Never-taker           Defier
Participant              Complier              Always-taker

If Head Start center quality is similar for T and C participants, and
never-takers are unaffected by T assignment, then the ITT is a weighted
average (Bloom, 1984; Angrist, Imbens, and Rubin, 1996):

ITT = (% Never-takers)×(0) + (% Always-takers)×(0) + (% Compliers)×(TOT)
                CALCULATION OF TOT
Note that:
   % Compliers = (% T group in HS) – (% C group in HS)

Then we can re-arrange terms from previous slide:

   TOT = (Intent-to-Treat Effect) / (% Compliers)

       = [(T group avg. score) – (C group avg. score)] / [(% of T group in HS) – (% of C group in HS)]

Note that both numerator and denominator are fully experimental
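
A minimal sketch of the Bloom/Wald calculation the slide describes, using only the four experimental quantities; the numbers in the example call are illustrative, not estimates from the Head Start Impact Study.

```python
# Bloom (1984) / Wald estimator for the TOT from experimental quantities only.

def bloom_tot(mean_t, mean_c, part_t, part_c):
    """TOT = ITT / (share of compliers).

    mean_t, mean_c : average outcome among everyone assigned to T / C
    part_t, part_c : Head Start participation rates in the T / C groups
    """
    itt = mean_t - mean_c                # experimental reduced form
    compliers = part_t - part_c          # experimental "first stage"
    return itt / compliers

# Illustrative numbers only: 0.10 SD ITT, 80% vs. 15% participation.
print(bloom_tot(mean_t=0.10, mean_c=0.00, part_t=0.80, part_c=0.15))   # ~0.154 SD
```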
  INTUITIVE ALTERNATIVES TO BLOOM TOT
Bloom estimator for TOT very counter-intuitive
 • ITTs in numerators and denominators pool everyone together
 • Never directly compare participants with non-participants


Westat approach to control group HS participants: throw them out
 • Then adjust for baseline covariates
 • They note this is now non-experimental
  •   That also means it is not clear how useful this is for BCA

Cure for control group cross-over and possibility of T / C group
differences in Head Start center quality may well be worse
(perhaps far worse) than the disease
         ALSO IMPORTANT TO NOTE THE
               COUNTERFACTUAL
43% of control group 4 year olds participate in some other form of
center-based care

So TOT is in large part Head Start vs. other center-based care
 • Different from whether Head Start is better than nothing
 •   We'll want to think about how to count costs of counterfactual
     center-based care in the BCA, particularly if some of that care gets
     some government support (one possible accounting is sketched below)
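
One possible accounting of that point, sketched with placeholder numbers. Only the 43% crossover rate comes from the slides; the dollar figures and the public-funding share are assumptions.

```python
# Hedged sketch: cost side of the BCA when the counterfactual includes other
# (partly publicly funded) center-based care. Only the 43% is from the slides.

head_start_cost  = 9000.0   # assumed gross public cost per Head Start participant
other_care_cost  = 5000.0   # assumed public cost of alternative center-based care
other_care_share = 0.43     # control-group share in other center-based care
public_share     = 0.50     # assumed fraction of that care that is publicly funded

# Incremental public cost relative to the actual (not "nothing") counterfactual:
net_cost = head_start_cost - other_care_share * public_share * other_care_cost
print(net_cost)   # TOT benefits should be weighed against this, not the gross cost
```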
       WHAT ARE OUR ALTERNATIVES TO
        RANDOMIZED EXPERIMENTS?
How bad can selection bias really be, anyway?

Pretty bad!
  • Starting with LaLonde (1986), job training
      o See if we can use non-experimental comparison group &
        methods to reproduce correct experimental answer
  • Not clear we have good diagnostic tests for bias
  • Might miss the right answer by more than a little
       o E.g., hormone replacement therapy, class size
  • Bias likely to be application specific (depends on program
      self-selection process and available data quality)
  •   Would be great to “do a LaLonde” to the Head Start experiment (a rough
      sketch of the exercise follows below)
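
A rough sketch of what such an exercise could look like: estimate the same effect from the experimental contrast and from a non-experimental, regression-adjusted comparison, then treat the gap as the selection bias. The data frame and column names are hypothetical.

```python
# LaLonde (1986)-style validation sketch; `df` and its columns are hypothetical.
import numpy as np

def lalonde_check(df):
    # 1. Experimental benchmark: simple T-vs-C contrast from random assignment.
    experimental = (df.loc[df.assigned == 1, "score"].mean()
                    - df.loc[df.assigned == 0, "score"].mean())

    # 2. Non-experimental estimate: participants vs. an outside comparison group,
    #    regression-adjusting for observed baseline covariates.
    sub = df[df.group.isin(["participant", "outside_comparison"])]
    X = np.column_stack([
        np.ones(len(sub)),
        (sub.group == "participant").to_numpy(float),
        sub[["baseline_score", "mom_hs_dropout"]].to_numpy(float),
    ])
    beta, *_ = np.linalg.lstsq(X, sub["score"].to_numpy(float), rcond=None)
    nonexperimental = beta[1]

    # 3. The gap estimates the selection bias in the non-experimental design.
    return experimental, nonexperimental, nonexperimental - experimental
```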
                REGRESSION DISCONTINUITY (RD)
                   “Nature does not make jumps”
A growing body of LaLonde-esque studies suggests RD might be a good
alternative to randomized experiments (a minimal estimation sketch follows
the figure below)
[Figure: regression discontinuity plots against the 1960 county poverty rate (x-axis, 40 to 80), each panel shown with nonparametric and flexible quadratic fits. Left panel: Head Start funding per capita, 1968. Right panel: mortality from causes screened for by Head Start, 1973-83.]

Source: Jens Ludwig and Douglas Miller, November 2007 Quarterly Journal of Economics
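
A minimal local-linear sharp-RD sketch in the spirit of the figure. The cutoff value (roughly 59.2% 1960 poverty, the Ludwig-Miller threshold) is from the paper; the bandwidth and the input arrays are illustrative assumptions.

```python
# Minimal sharp-RD sketch: local linear fits on each side of the poverty cutoff.
import numpy as np

def sharp_rd(poverty_1960, outcome, cutoff=59.2, bandwidth=8.0):
    """Estimate the jump in `outcome` at the cutoff using observations within
    `bandwidth` of it; returns (right-side minus left-side) fitted values at the cutoff."""
    x = np.asarray(poverty_1960, dtype=float) - cutoff
    y = np.asarray(outcome, dtype=float)
    keep = np.abs(x) <= bandwidth

    def value_at_cutoff(mask):
        X = np.column_stack([np.ones(mask.sum()), x[mask]])   # intercept + slope
        beta, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
        return beta[0]                                         # fitted value at x = 0

    return value_at_cutoff(keep & (x >= 0)) - value_at_cutoff(keep & (x < 0))
```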
   VALUE OF EXPERIMENTAL THINKING FOR RD
Tulsa RD study, Gormley, Gayer, Phillips and Dawson
                             Just barely too young for       Just barely old enough for
                             pre-K in 2002                    pre-K in 2002
                             (so pre-K in 2003)               (so K in 2003)
Select into pre-K            1,587                            1,349
                             60% free / reduced lunch         67% free / reduced lunch
                             30% black                        39% black
                             24% mom HS dropout               17% mom HS dropout

Do not select into pre-K     ????                             1,800

Comparison of self-selected compliers around DOB threshold; assume selection rule varies
continuously, but choice set changes discretely (choose in 2002 vs. 2003); does selection process
also change discretely? Note differences in Xs and Ns

Easy fix – sample EVERYONE (not just those in school in 2003) and do ITT analog of RD estimate
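
A sketch of that fix under stated assumptions: keep every child near the date-of-birth threshold, estimate the reduced-form (ITT) jump, and, if a TOT analog is wanted, rescale by the jump in pre-K take-up. The variable names and bandwidth are hypothetical.

```python
# ITT analog of the RD estimate around the DOB threshold; inputs are hypothetical.
import numpy as np

def itt_rd(days_past_cutoff, outcome, enrolled_prek, bandwidth=60):
    """days_past_cutoff > 0 means old enough for pre-K in 2002 (the 'treated' side).
    Returns (reduced-form ITT jump, TOT analog rescaled by the take-up jump)."""
    d = np.asarray(days_past_cutoff, dtype=float)
    y = np.asarray(outcome, dtype=float)
    p = np.asarray(enrolled_prek, dtype=float)

    keep = np.abs(d) <= bandwidth
    eligible = d[keep] >= 0

    itt = y[keep][eligible].mean() - y[keep][~eligible].mean()
    first_stage = p[keep][eligible].mean() - p[keep][~eligible].mean()
    return itt, itt / first_stage
```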
                       BOTTOM LINES
Rapid changes in technology of program evaluation
   First study of Head Start “fade out” was 1966 (short honeymoon!)
       But relied on cross-section comparisons (participants vs. non-participants)
   1990s, Janet Currie sibling-difference estimates
   2000s, randomized experiments & growing body of RD evidence

Further refinements to evaluation technology would be useful
for improving BCA in early childhood education

Great value in LaLonde-ing Head Start experiment so we can
learn more about value of alternatives to experimentation
   HHS, please either pay Westat to do that or give out the data!