					   RealWorld Evaluation
Designing Evaluations under Budget, Time, Data
           and Political Constraints
       American Evaluation Association
      Professional pre-session workshop
                    Denver
                November 5, 2008



           Facilitated by:
         Michael Bamberger
           and Jim Rugh


                                          1
          Workshop Objectives
1. The seven steps of the RealWorld Evaluation
   approach for addressing common issues and
   constraints faced by evaluators, such as: when the
   evaluator is not called in until the project is nearly
   completed and there was no baseline or
   comparison group; when the evaluation must be
   conducted with an inadequate budget and insufficient
   time; and when there are political pressures and
   expectations for how the evaluation should be
   conducted and what the conclusions should say


                                                    2
          Workshop Objectives
2. Identifying and assessing various design options
   that could be used in a particular evaluation setting
3. Ways to reconstruct baseline data when the
   evaluation does not begin until the project is well
   advanced or completed.
4. How to identify and address threats to the validity
   or adequacy of quantitative, qualitative and mixed
   methods designs with reference to the specific
   context of RealWorld evaluations



                                                    3
      Workshop Objectives
Note: Given time constraints the workshop
will focus on project-level impact
evaluations. However, if the results of a pre-
workshop survey of participants call for it, a
brief introduction to the application of RWE
techniques in other forms of evaluation,
including the assessment of country
programs and policy interventions, could be
included.

                                         4
                 Workshop agenda

8.00 – 8.20: Session 1: Introduction:
    •   Workshop objectives
    •   Feedback from participant survey
    •   Handout: RealWorld Evaluation Overview (summary chapter of book)

8.20 – 8.50: Session 2: RealWorld Evaluation overview and
addressing the counterfactual
    •   Handout: “Why evaluators can’t sleep at night”

8.50 - 9.20: Session 3: Small Group discussions.
    •   Participants will introduce themselves and then share experiences on the types of constraints they have
        faced when designing and conducting evaluations, and what they did to try to address those constraints.

9:20-10.00: Session 4: RWE Steps 1, 2 and 3: Scoping the
evaluation and strategies for addressing budget and time
constraints
    •   Presentation and discussion

10.00-10.15: BREAK
        Workshop agenda, cont.
10:15 – 10:45: Session 5: RWE Step 4: Addressing data constraints
    •   Presentation and discussion

10.45 – 11.15: Session 6: Mixed methods
    •   Presentation and discussion

11.15 – 12.00: Session 7: Small groups read their case studies and
begin to discuss the learning exercise.
    •   We will use a low-cost housing case study. All four groups will discuss the same project but from different
        perspectives.

12:00 – 1:00: LUNCH
1.10 – 1.45: Session 8: Identifying and addressing threats to the
validity of the evaluation design and conclusions
1.45 – 2.30: Session 9: Small groups complete exercise.
    •   Negotiate with your paired group how you propose to modify the ToR of your case study.

2.30 – 2.45: Session 10: Feedback from exercise.
    •   Discussion of lessons learned from the case study or the RealWorld Evaluation approach in general.

2.45 – 3.00: Session 11: Wrap up and workshop evaluation
RealWorld Evaluation
Designing Evaluations under Budget,

Time, Data and Political Constraints


          Session 2.a
      OVERVIEW OF THE
        RWE APPROACH


                                       7
RealWorld Evaluation Scenarios
Scenario 1: Evaluator(s) not brought in until near
   end of project
For political, technical or budget reasons:
   • There was no baseline survey
   • Project implementers did not collect
      adequate data on project participants at the
      beginning or during the life of the project
   • It is difficult to collect data on comparable
      control groups


                                            8
RealWorld Evaluation Scenarios
Scenario 2: The evaluation team is called in
   early in the life of the project
But for budget, political or methodological
   reasons:
 The ‘baseline’ was a needs assessment,
   not comparable to eventual evaluation
 It was not possible to collect baseline data
   on a comparison group

                                          9
Reality Check – Real-World
Challenges to Evaluation
•   All too often, project designers do not think
    evaluatively – evaluation not designed until the
    end
•   There was no baseline – at least not one with data
    comparable to evaluation
•   There was/can be no control/comparison group.
•   Limited time and resources for evaluation
•   Clients have prior expectations for what the
    evaluation findings will say
•   Many stakeholders do not understand evaluation;
    distrust the process; or even see it as a threat
    (dislike of being judged)
                                                11
RealWorld Evaluation
Quality Control Goals
   Achieve maximum possible evaluation rigor
    within the limitations of a given context
   Identify and control for methodological
    weaknesses in the evaluation design
   Negotiate with clients trade-offs between
    desired rigor and available resources
   Presentation of findings must recognize
    methodological weaknesses and how they
    affect generalization to broader populations

                                            12
    The Need for the RealWorld
    Evaluation Approach

   As a result of these kinds of constraints, many
    of the basic principles of impact evaluation
     design (comparable pre-test/post-test design,
    comparison group, instrument development and
    testing, random sample selection, control for
    researcher bias, thorough documentation of the
    evaluation methodology etc.) are often
    sacrificed.



                                            13
The RealWorld Evaluation
Approach

                 An integrated approach to
                 ensure acceptable standards
                 of methodological rigor while
                 operating under real-world
                 budget, time, data and
                 political constraints.


     See handout summary chapter extracted from
      RealWorld Evaluation book for more details

                                                   15
The RealWorld Evaluation
approach
   Developed to help evaluation practitioners
    and clients
    • managers, funding agencies and external
      consultants
   A work in progress
   Originally designed for developing countries,
    but equally applicable in industrialized
    nations


                                            16
Special Evaluation Challenges in
Developing Countries
   Unavailability of needed data
   Scarce local evaluation resources
   Limited budgets for evaluations
   Institutional and political constraints
   Lack of an evaluation culture
   Many evaluations are designed by, and for,
    external funding agencies and seldom reflect
    local and national stakeholder priorities

                                            17
Special Evaluation Challenges in
Developing Countries

 Despite these challenges, there is a
 growing demand for methodologically
 sound evaluations which assess the
 impacts, sustainability and replicability of
 development projects and programs
 …………………….



                                        18
Most RealWorld Tools are not New—
Only the Integrated Approach is New

    Most of the RealWorld Evaluation data
     collection and analysis tools will be familiar to
     most evaluators
    What is new is the integrated approach
     which combines a wide range of tools to
     produce the best quality evaluation under
     real-world constraints



                                                19
Who Uses RealWorld Evaluation
and When?
   Two main users:
     • Evaluation practitioners
     • Managers, funding agencies and external
       consultants
   The evaluation may start at:
     • The beginning of the project
     • After the project is fully operational
     • During or near the end of project
       implementation
     • After the project is finished
                                             21
What is Special About the
RealWorld Evaluation Approach?

   There is a series of steps, each with
    checklists for identifying constraints and
    determining how to address them
   These steps are summarized on the following
    slide and then the more detailed flow-chart
    …
                 (See page 6 of handout)




                                          22
The Steps of the RealWorld
Evaluation Approach

Step 1: Planning and scoping the evaluation
Step 2: Addressing budget constraints
Step 3: Addressing time constraints
Step 4: Addressing data constraints
Step 5: Addressing political constraints
Step 6: Assessing and Addressing the strengths
   and weaknesses of the evaluation design
Step 7: Helping clients use the evaluation


                                           23
                     The Real-World Evaluation Approach

Step 1: Planning and scoping the evaluation
   A. Defining client information needs and understanding the political context
   B. Defining the program theory model
   C. Identifying time, budget, data and political constraints to be addressed by the RWE
   D. Selecting the design that best addresses client needs within the RWE constraints

Step 2: Addressing budget constraints
   A. Modify evaluation design
   B. Rationalize data needs
   C. Look for reliable secondary data
   D. Revise sample design
   E. Economical data collection methods

Step 3: Addressing time constraints
   All Step 2 tools, plus:
   F. Commissioning preparatory studies
   G. Hire more resource persons
   H. Revising format of project records to include critical data for impact analysis
   I. Modern data collection and analysis technology

Step 4: Addressing data constraints
   A. Reconstructing baseline data
   B. Recreating comparison groups
   C. Working with non-equivalent comparison groups
   D. Collecting data on sensitive topics or from difficult to reach groups
   E. Multiple methods

Step 5: Addressing political influences
   A. Accommodating pressures from funding agencies or clients on evaluation design
   B. Addressing stakeholder methodological preferences
   C. Recognizing influence of professional research paradigms

Step 6: Assessing and addressing the strengths and weaknesses of the evaluation design
   An integrated checklist for multi-method designs:
   A. Objectivity/confirmability
   B. Replicability/dependability
   C. Internal validity/credibility/authenticity
   D. External validity/transferability/fittingness

Step 7: Helping clients use the evaluation
   A. Utilization
   B. Application
   C. Orientation
   D. Action
                                                                                    24
RealWorld Evaluation
Designing Evaluations under Budget,

Time, Data and Political Constraints



          Session 2.b
     The challenge of the
       counterfactual
Attribution and counterfactuals

 How do we know if the observed changes in
 the project participants or communities
  •    income, health, attitudes, school attendance etc
 are due to the implementation of the project
  •    credit, water supply, transport vouchers, school
      construction etc
 or to other unrelated factors?
  •    changes in the economy, demographic movements,
      other development programs etc


                                                      26
The Counterfactual
   What would have been the condition of
    the project population at the time of the
    evaluation if the project had not taken
    place?




                                         27
Where is the counterfactual?

After families had been living
  in a new housing project for
  3 years, a study found
  average household income
  had increased by 50%

Does this show that housing is
  an effective way to raise
  income?

                                 28
      Comparing the project with two
      possible comparison groups
[Chart: average household income (vertical axis: 250, 500, 750) measured in 2000 and 2002]

Project group: 50% increase in income.
Scenario 1: 50% increase in comparison group income. No evidence of project impact.
Scenario 2: No increase in comparison group income. Potential evidence of project impact.
5 main evaluation strategies
for addressing the counterfactual


Randomized designs
I. True experimental designs
II. Randomized field designs
Quasi-experimental designs
III. Strong quasi-experimental designs
IV. Weaker quasi-experimental designs
Non-experimental designs.
V. No logically defensible counterfactual


                                            30
       The best statistical design option in most field
       settings: Randomized or strong quasi-experimental
       evaluation designs

                        T1          T2           T3
                     Pre-test   Treatment    Post-test
                                [project]
Project group           P1          X            P2
Control group           C1                       C2

Subjects randomly assigned to the project and control groups, or
control group selected using statistical or judgmental matching.

        Gain score [impact] = (P2 – P1) – (C2 – C1)

Note: conditions of both groups are not controlled during the project.
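
To make the gain score concrete, here is a minimal Python sketch of the calculation (the dataset, column names and numbers are hypothetical illustrations, not workshop data):

# Minimal difference-in-differences (gain score) sketch.
# One row per subject: group membership plus pre-test (T1) and
# post-test (T3) scores. All values are illustrative.
import pandas as pd

df = pd.DataFrame({
    "group":     ["project"] * 4 + ["control"] * 4,
    "pre_test":  [100, 110,  90, 105,  98, 102,  95, 101],
    "post_test": [150, 160, 135, 155, 120, 125, 118, 124],
})

means = df.groupby("group")[["pre_test", "post_test"]].mean()

# P2 - P1: average change observed in the project group
project_gain = means.loc["project", "post_test"] - means.loc["project", "pre_test"]
# C2 - C1: average change in the control group (the counterfactual trend)
control_gain = means.loc["control", "post_test"] - means.loc["control", "pre_test"]

# Gain score [impact] = (P2 - P1) - (C2 - C1)
impact = project_gain - control_gain
print(f"Project gain {project_gain:.1f}, control gain {control_gain:.1f}, "
      f"estimated impact {impact:.1f}")

Subtracting the control group's gain strips out change that would have happened anyway (economy-wide trends, maturation), which is why the post-test-only designs shown later in this deck are weaker.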
         Control group and comparison group

   Control group = randomized allocation of
    subjects to project and non-treatment group
   Comparison group = separate procedure for
    sampling project and non-treatment groups




                                            32
Reference sources for
randomized field trial designs
1. MIT Poverty Action Lab
                www.povertyactionlab.org

2. Center for Global Development
“When will we ever learn?”
     http://www.cgdev.org/content/publications/detail/7973

3. International Initiative for Impact Evaluation = 3ie
                   http://www.3ieimpact.org/




                                                         33
The limited use of strong
evaluation designs
   It is estimated that
     • Only 5-10% of impact evaluations use a
        strong quasi-experimental design
    •   Significantly less than 5% use randomized
        control trials




                                               34
TIME FOR DISCUSSION   35
Introductory small-group
discussions
 Introduce yourselves, including something
 about your experience in coordinating or
 conducting evaluations.

 In particular share experiences on the types
 of constraints you have faced when
 designing and conducting evaluations, and
 what you did to try to address those
 constraints.

                                         36
RealWorld Evaluation
Designing Evaluations under Budget,

Time, Data and Political Constraints


       Session 4, Step #1
PLANNING AND SCOPING THE
       EVALUATION


                                   37
Step 1: Planning and Scoping the
Evaluation

   Understanding client information needs
   Defining the program theory model
   Preliminary identification of constraints to
    be addressed by the RealWorld
    Evaluation




                                           38
A. Understanding client information
needs

Typical questions clients want answered:
 Is the project achieving its objectives?

 Are all sectors of the target population
  benefiting?
 Are the results sustainable?

 Which contextual factors determine the
  degree of success or failure?

                                       39
A. Understanding client information
needs

A full understanding of client information
  needs can often reduce the types of
  information collected and the level of
  detail and rigor necessary.

However, this understanding could also
 increase the amount of information
 required!

                                        40
B. Defining the program theory
model
All programs are based on a set of assumptions
   (hypotheses) about how the project’s
   interventions should lead to desired outcomes.
 Sometimes this is clearly spelled out in project
   documents.
 Sometimes it is only implicit and the evaluator
   needs to help stakeholders articulate the
   hypothesis through a logic model.


                                             41
B. Defining the program theory
model
   Defining and testing critical assumptions
     are essential (but often ignored)
    elements of program theory models.

   The following is an example of a model
    to assess the impacts of microcredit on
    women’s social and economic
    empowerment

                                         42
      Critical Hypotheses for a Gender-Inclusive
     Micro-Credit Program

   Outputs
     • If credit is available women will be willing and able to obtain loans
       and technical assistance.
   Short-term outcomes
     • If women obtain loans they will start income-generating activities.
     • Women will be able to control the use of loans and reimburse them.
   Medium/long-term impacts
     • Economic and social welfare of women and their families will
       improve.
     • Increased women’s economic and social empowerment.
   Sustainability
     • Structural changes will lead to long-term impacts.
                                                                 43
C. Determining appropriate (and
feasible) evaluation design

   Based on an understanding of client
    information needs, required level of rigor,
    and what is possible given the
    constraints, the evaluator and client
    need to determine what evaluation
    design is required and possible under
    the circumstances.


                                          44
Let’s focus for a while on evaluation
design (a quick review)
1: Review different evaluation
  (experimental/research) designs
2: Develop criteria for determining appropriate
  Terms of Reference (ToR) for evaluating a
  project, given its own (planned or un-
  planned) evaluation design.
3: Define levels of rigor
4: Adopt a life-of-project evaluation design
  perspective.

                                           45
      An introduction to various evaluation designs
       Illustrating the need for quasi-experimental
        longitudinal time series evaluation design
[Chart: scale of major impact indicator over time for project participants vs. comparison group, measured at baseline, end-of-project evaluation, and post-project evaluation]
                                                             46
  OK, let’s stop the action to
  identify each of the major
types of evaluation (research)
            design …


  … one at a time, beginning with the
       most rigorous design.


                                        47
   First of all: the key to the traditional symbols:

       X = Intervention (treatment), i.e. what the
       project does in a community
      O = Observation event (e.g. baseline, mid-term
       evaluation, end-of-project evaluation)

      P (top row): Project participants
      C (bottom row): Comparison (control) group

Note: the RWE evaluation designs are laid out in Table 3 on page 46 of your handout



                                                                                      48
            Design #1: Longitudinal Quasi-experimental
       P1       X         P2        X      P3       P4
       C1                 C2               C3       C4


[Chart: project participants vs. comparison group at baseline, midterm, end-of-project and post-project evaluations]
                                                             49
     Design #1+: Longitudinal Randomized Control Trial
       P1       X         P2            X       P3    P4
       C1                 C2                    C3    C4


Research subjects randomly assigned either to project or control group.

[Chart: project participants vs. control group at baseline, midterm, end-of-project and post-project evaluations]
                                                               50
              Design #2: Randomized Control Trial
       P1                    X                 P2
       C1                                      C2


Research subjects randomly assigned either to project or control group.

[Chart: project participants vs. control group at baseline and end-of-project evaluation]
                                                    51
Design #3: Quasi-experimental (pre+post, with comparison)
       P1                   X             P2
       C1                                 C2


[Chart: project participants vs. comparison group at baseline and end-of-project evaluation]
                                                        52
                Design #7: Truncated Longitudinal
                X         P1        X      P2
                          C1               C2


[Chart: project participants vs. comparison group at midterm and end-of-project evaluation; no baseline]
                                                    53
    Design #8: Pre+post of project; post-only comparison
       P1                X                P2
                                          C


[Chart: project participants measured at baseline and end-of-project evaluation; comparison group measured at end-of-project only]
                                                           54
     Design #9: Post-test only of project and comparison
                         X                P
                                          C


[Chart: project participants and comparison group both measured at end-of-project evaluation only]
                                                           55
       Design #10: Pre+post of project; no comparison
       P1              X             P2




[Chart: project participants measured at baseline and end-of-project evaluation; no comparison group]
                                                        56
       Design #11: Post-test only of project participants
                       X               P




[Chart: project participants measured at end-of-project evaluation only; no baseline, no comparison group]
                                                            57
     Some of the questions to consider as
     you customize an evaluation Terms of
     Reference (ToR):

1.    Who asked for the evaluation? (Who are
      the key stakeholders?)
2.    What are the key questions to be
      answered?
3.    Will this be a formative or summative
      evaluation?
4.    Will there be a next phase, or other
      projects designed based on the findings of
      this evaluation?
                                                   58
 Other questions to answer as
 you customize an evaluation
 ToR:
5.   What decisions will be made in response
     to the findings of this evaluation?
6.   What is the appropriate level of rigor?
7.   What is the scope / scale of the
     evaluation / evaluand (thing to be
     evaluated)?
8.   How much time will be needed /
     available?
9.   What financial resources are needed /
     available?
                                               59
 Other questions to answer as
 you customize an evaluation
 ToR:
10.   Should the evaluation rely mainly on
      quantitative or qualitative methods?
11.   Should participatory methods be used?
12.   Can / should there be a household
      survey?
13.   Who should be interviewed?
14.   Who should be involved in planning /
      implementing the evaluation?
15.   What are the most appropriate media
      for communicating the findings to
      different stakeholder audiences?        60
  Evaluation (research) design?      Key questions?
  Evaluand (what to evaluate)?       Scope?
  Qualitative?  Quantitative?        Appropriate level of rigor?
  Resources available?               Time available?
  Skills available?                  Participatory?  Extractive?
  Evaluation FOR whom?

 Does this help, or just confuse things more? Who
   said evaluations (like life) would be easy?!!                  61
TIME FOR DISCUSSION
                65
Now, where were we?



Oh, yes, we’re ready for Steps 2 and
  3 of the RealWorld Evaluation
  Approach.

Let’s continue …

                                66
RealWorld Evaluation
Designing Evaluations under Budget,

Time, Data and Political Constraints


           Steps 2 + 3
  ADDRESSING BUDGET AND
    TIME CONSTRAINTS


                                   67
Step 2: Addressing budget
constraints
A.   Simplifying the evaluation design
B.   Rationalizing data needs (clarifying client
     information needs)
C.   Looking for reliable secondary data
D.   Reviewing sample size
E.   Reducing costs of data collection and
     analysis


                                       68
2A: Simplifying the evaluation
design
   For quantitative evaluations it is possible
    to select among the most common
    evaluation designs (noting the trade-offs
    when using a simpler design).
   For qualitative evaluations the options
    will vary depending on the type of
    design.


                                          69
2A (cont): Qualitative designs
   Depending upon the design, some of the
    options might include:
    • Reducing the number of units studied
        (communities, families, schools)
    •   Reducing the number of case studies or the
        duration and complexity of the cases.
    •   Reducing the duration or frequency of
        observations


                                               70
2.B. Rationalize data needs


   Use information from Step 1 to identify
    client information needs
   Review all data collection instruments
    and cut out any questions not directly
    related to the objectives of the
    evaluation.



                                       71
2.C. Look for reliable
secondary sources
   Planning studies, project administrative
    records, government ministries, other
    NGOs, universities / research institutes,
    mass media.




                                        72
2.C. Look for reliable
secondary sources, cont.
Assess the relevance and reliability of
  sources for the evaluation with respect
  to:
 Coverage of the target population

 Time period

 Relevance of the information collected

 Reliability and completeness of the data

 Potential biases
                                      73
2.D. Seeking ways to reduce
sample size
Accepting a lower level of precision
  significantly reduces the required
  number of interviews:
 To test for a 5% change in proportions
  requires a maximum sample of 1086
 To test for a 10% change in proportions
  requires a maximum sample of up to 270


                                    74
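
As a rough illustration of how the required precision drives sample size, here is a Python sketch using statsmodels' power calculation for comparing two proportions. The significance level, power and 0.50 baseline proportion below are illustrative assumptions, so the results will not exactly reproduce the 1086 and 270 figures above (which rest on the RealWorld Evaluation book's own assumptions); the point to notice is that halving the detectable change roughly quadruples the required sample.

# Sketch: interviews per group needed to detect a given change in
# proportions. Alpha, power and the 0.50 baseline are assumptions
# for illustration, so results differ from the slide's figures.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

def n_per_group(p_baseline, change, alpha=0.05, power=0.80):
    effect = proportion_effectsize(p_baseline, p_baseline + change)
    return NormalIndPower().solve_power(effect_size=effect, alpha=alpha,
                                        power=power, alternative="two-sided")

for change in (0.05, 0.10):
    print(f"Detecting a {change:.0%} change: "
          f"~{n_per_group(0.50, change):.0f} interviews per group")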
2.E. Reducing costs of data
collection and analysis
   Use self-administered questionnaires
   Reduce length and complexity of
    instrument
   Use direct observation
   Obtain estimates from focus groups and
    community forums
   Key informants
   Participatory assessment methods
   Multi-methods and triangulation
                                      76
Step 3: Addressing time
constraints
In addition to Step 2 methods:
 Reduce time pressures on external
  consultants
    • Commission preparatory studies
    • Video conferences
   Hire more consultants/researchers
   Incorporate outcome indicators in project
    monitoring systems and documents
   Technology for data inputting/coding

                                            77
Addressing time constraints

   It is important to distinguish between approaches
    that reduce the:
    a) duration in terms of time over the life of the
    project (e.g. from baseline to final evaluation over 5
    years)
    b) duration in terms of the time needed to undertake
    the actual evaluation study/studies (e.g. 6 weeks,
    whether completed in an intensive consecutive 6
    weeks or a cumulative total of 6 weeks periodically
    over the course of a year), and
    c) the level of effort (person-days, i.e. number of
    staff x total days required).


                                                    78
Addressing time constraints

Negotiate with the client to discuss questions such as the
   following:
1. What information is essential and what could be
   dropped or reduced?
2. How much precision and detail is required for the
   essential information? E.g. is it necessary to have
   separate estimates for each geographical region or
   sub-group or is a population average acceptable?
3. Is it necessary to analyze all project components and
   services or only the most important?
4. Is it possible to obtain additional resources (money,
   staff, computer access, vehicles etc) to speed up the
   data collection and analysis process?
                                                    79
TIME FOR DISCUSSION
                80
                      80
RealWorld Evaluation
Designing Evaluations under Budget,

Time, Data and Political Constraints



          Session 5
       Addressing data
         constraints
                     Step 4: Addressing data constraints

Step 1: Planning and scoping the evaluation

Step 2: Addressing budget constraints
Step 3: Addressing time constraints
Step 4: Addressing data constraints
Step 5: Addressing political constraints

Step 6: Assessing the strengths and weaknesses of the evaluation design

Step 7: Helping clients use the evaluation

Step 4: Addressing data constraints
   A. Reconstructing baseline data
   B. Special challenges in working with comparison groups
   C. Collecting data on sensitive topics
   D. Collecting data on difficult to reach groups
Two kinds of data constraints:


 1.   Reconstructing baseline
      data
 2.   Special data issues for
      comparison groups


                                83
1. Reconstructing baseline
   conditions for project
   and comparison groups
   [see Table 10, p. 59]
               1. The importance
                  of baseline data
   Hard to assess change without data on pre-
    project conditions
   Post-test comparisons do not fully address:
    •   Selection bias: initial differences between participants
        and non-participants
         • Propensity score matching and instrumental variables
            partially address this
    •   Historical factors influencing outcomes that were
        assumed to have been caused by the project
        intervention


                                                          85
1. Ways to reconstruct baseline
conditions
A.   Secondary data.
B.   Project records.
C.   Recall
D.   Key informants
E.   PRA and other participatory techniques
     such as timelines, and critical incidents
     to help establish the chronology of
     important changes in the community
                                         86
1-A. Assessing the utility of
potential secondary data
   Reference period
   Population coverage
   Inclusion of required indicators
   Completeness
   Accuracy
   Free from bias


                                       87
1-A. Using secondary data to
reconstruct baselines

   Census
   Surveys
   Project administrative data
   Agency reports
   Special studies by NGOs, donors
   University studies
   Mass media (newspapers, radio, TV)

                                     88
1-A. Using secondary data to
reconstruct baselines

   Community organization records
   Notices in offices, community centers etc
   Posters
   Birth/death records
   Wills and documents concerning
    property
   Private Sector data

                                        89
1-B. Using project records
Types of data
 Feasibility/planning studies
 Application/registration forms
 Supervision reports
 MIS data
 Meeting reports
 Community and agency meeting minutes
 Progress reports
 Construction costs


                                         90
    1-B. Assessing the reliability of
            project records
   Who collected the data and for what
    purpose?
   Were they collected for record-keeping or to
    influence policymakers or other groups?
   Do monitoring data only refer to project
    activities or do they also cover changes in
    outcomes?
   Were the data intended exclusively for
    internal use? For use by a restricted group?
    Or for public use?
                                            91
1-B. Assessing the reliability of
project records

    How accurate and complete are the
     data? Are there obvious gaps? Were
     these intentional or due to poor record-
     keeping?
   Potential biases with respect to the key
    indicators required for the impact
    evaluation?


                                         92
1-B. Working with the client to improve the
utility of project data for evaluation

   Collecting additional information on
    applicants or participants
   Ensure identification data is included and
    accurate.
   Ensure data organized in the way
    needed for evaluation [by community/
    types of service/ family rather than just
    individuals/ economic level etc]

                                         93
1-C. Using recall to reconstruct
baseline data
   School attendance and time/cost of travel
   Sickness/use health facilities
   Income and expenditures
   Community/individual knowledge and skills
   Social cohesion/conflict
   Water usage/quality/cost
   Periods of stress
   Travel patterns


                                          94
1-C. Where Knowledge about
Recall is Greatest
   Areas where most research has been
    done on the validity of recall
    • Income and expenditure surveys
    • Demographic data and fertility behavior
   Types of Questions
    • Yes/No; fact
    • Scaled
    • Easily related to major events
                                                95
1-C. Limitations of recall


   Generally not reliable for precise
    quantitative data
   Sample selection bias
   Deliberate or unintentional distortion
   Few empirical studies (except on
    expenditure) to help adjust estimates.



                                        96
1-C. Sources of bias in recall

   Who provides the information
   Under-estimation of small and routine expenditures
   “Telescoping” of recall concerning major expenditures.
   Distortion to conform to accepted behavior.
    •   Intentional
    •   Romanticizing the past
   Contextual factors:
    •   Time intervals used in question
     •   Respondents’ expectations of what the interviewer wants to
        know
   Implications for the interview protocol


                                                          97
1-C. Improving the validity of
recall
   Conduct small studies to compare recall
    with survey or other findings.
   Ensure all groups interviewed
   Triangulation
   Link recall to important reference events
    • Elections
    • Drought/floods
    • Construction of road, school etc

                                         98
1-D. Key informants
   Not just officials and high status people
   Everyone can be a key informant on
    their own situation:
    • Single mothers
    • Factory workers
    • Users of public transport
    • Sex-workers
    • Street children
                                          99
1-D. Guidelines for key-
informant analysis
   Triangulation greatly enhances validity
    and understanding
   Include informants with different
    experiences and perspectives
   Understand how each informant fits into
    the picture.
   Employ multiple rounds if necessary
   Carefully manage ethical issues

                                       100
1-E. PRA and related participatory
techniques

   PRA techniques collect data at the group
    or community [rather than individual]
    level.
   Can either seek to identify consensus or
    identify different perspectives.
   Risk of bias:
    • Only certain sectors of the community attend
    • Certain people dominate the discussion
                                             101
1-E. Time-related PRA techniques
useful for reconstructing the past

   Time line
   Trend analysis
   Historical transect
   Seasonal diagram
   Daily activity schedule
   Participatory genealogy
   Dream map
   Critical incidents
                                 102
1-E. Using PRA recall methods: seasonal calendars

Seasonal Calendar of Poverty Drawn by Villagers in Nyamira,
Kenya
Light meals:    Jan OOO, Feb OOO, Mar O, Apr O, Dec OO
Begging:        Jan OOOOOO, Feb OOOOOO, Mar O, Dec OOOOO
Migration:      Jan OOO, Feb OOO, Mar OO, Apr O, May O, Jun OO
Unemployment:   Jan OOOOOO, Feb OOOOOO, Mar OO
Income:         Mar O, Apr OOOO, May OOOO, Jun OOOO, Jul OOOOOO, Aug OOOOOO, Sep OOOOO, Oct O, Nov O, Dec O
Disease:        Mar O, Apr OOOO, May OOOO, Jun OOO, Jul OO, Aug OOO, Sep OO
Rainfall:       Mar OOOO, Apr OOOO, Aug O, Sep O, Oct OOO, Nov OOO, Dec O

(The number of O symbols indicates the relative intensity reported for each month.)

Source: Rietbergen-McCracken and Narayan 1997
1-F. Issues in baseline
reconstruction
   Variations in reliability of recall.
   Memory distortion.
   Secondary data not easy to use
   Secondary data incomplete or unreliable.
   Key informants may distort the past




                                       104
2. Reconstructing comparison
(control) groups




                          105
2. Ways to reconstruct control
groups
   Judgmental matching of communities.
   When project services are introduced in
    phases, beneficiaries entering in later
    phases can be used as a “pipeline” control
    group.
   Internal controls when different subjects
    receive different combinations and levels
    of services

                                        106
2. Using propensity scores to
strengthen comparison groups
   Propensity score matching
   Rapid assessment studies can compare
    characteristics of project and control groups
    using:
    •   Observation
    •   Key informants
    •   Focus groups
    •   Secondary data
    •   Aerial photos and GIS data


                                             107
2. Using propensity scores to
strengthen comparison groups
   Logistic regression (logit) on project and
    comparison population to identify determinants
    of project participation
   Select “nearest neighbors” (usually around 5)
    from comparison group who most closely
    match a participant.
   Project impact = gain score = difference
    between project participant score and mean
    score for nearest neighbors.

                                            108
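
A minimal sketch of the procedure just described, using scikit-learn on synthetic data (variable names and values are hypothetical; a real analysis would also check covariate balance and common support before trusting the estimate):

# Propensity score matching sketch: logit participation model,
# 5 nearest comparison-group neighbors per participant, impact =
# participant score minus mean score of matched neighbors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))                      # observed covariates
treated = (X[:, 0] + rng.normal(size=n) > 0).astype(int)
y = X @ np.array([1.0, 0.5, 0.2]) + 2.0 * treated + rng.normal(size=n)

# 1. Logit model of project participation -> propensity scores
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. For each participant, find the 5 nearest "neighbors" in the
#    comparison group on the propensity score
comp_ps = ps[treated == 0].reshape(-1, 1)
nn = NearestNeighbors(n_neighbors=5).fit(comp_ps)
_, idx = nn.kneighbors(ps[treated == 1].reshape(-1, 1))

# 3. Gain score: participant outcome minus matched-neighbor mean
comp_y = y[treated == 0]
impact = (y[treated == 1] - comp_y[idx].mean(axis=1)).mean()
print(f"Estimated impact: {impact:.2f}")         # data built with true effect 2.0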
Issues in reconstructing control
groups
   Project areas often selected purposively and
    difficult to match.
   Differences between project and control
    groups - difficult to assess if outcomes due to
    project or to these initial differences.
   Lack of good data to select control groups
   Contamination
   Econometric methods cannot fully adjust for
    initial differences between the groups
    [unobservables].

                                             109
References
   Bamberger, Rugh and Mabry (2006).
    RealWorld Evaluation. Chapter 5
   Kumar, S. (2002). Methods for Community
    Participation: A complete guide for
    practitioners.
   Patton, M.Q. (2002). Qualitative research
    and evaluation methods. Chapters 6 and 7.
   Roche, C. (1999). Impact assessment for
    development agencies. Chapter 5.
                                          110
Pause for DISCUSSION
RealWorld Evaluation
Designing Evaluations under Budget,

Time, Data and Political Constraints



         Session 6
  Mixed-method evaluations
     It should NOT be a fight between pure

QUALITATIVE                 QUANTITATIVE
(verbiage alone)   OR       (numbers alone)


  Quantoid!                        Qualoid!




                                              113
QUANTITATIVE: “Your human interest story
sounds nice, but let me show you the
statistics.”

QUALITATIVE: “Your numbers look impressive,
but let me tell you the human interest
story.”
                                             114
What’s needed is the right combination of
   BOTH QUALITATIVE methods
    AND QUANTITATIVE methods




                                            115
I. Mixed Method Designs
1. Quantitative data collection
   methods
   Structured surveys (household, farm,
    transport usage etc)
   Structured observation
   Anthropometric methods
   Aptitude and behavioral tests

                                      116
    1. Quantitative data collection methods
          Strengths and weaknesses

Strengths:
   Generalization
   Statistically representative
   Estimate magnitude and distribution of impacts
   Clear documentation of methods
   Standardized approach
   Statistical control of bias and external factors

Weaknesses:
   Surveys cannot capture many types of information
   Do not work for difficult to reach groups
   No analysis of context
   Survey situation may alienate respondents
   Long delay in obtaining results
   Data reduction loses information



                                                                  117
2. Qualitative data collection methods

Interviewing:
   Structured
   Semi-structured
   Unstructured
   Focus groups
   Community interviews
   PRA
   Audio recording

Observation:
   Participant observation
   Structured observation
   Unstructured observation
   Photography and video recording

Analysis of documents and artifacts:
   Project documents
   Published reports
   E-mail
   Legal documents: birth and death certificates, property transfer
      documents, marriage certificates
   Posters
   Decorations in the house
   Clothing and gang insignia
                                                                             118
2. Qualitative data collection methods
Characteristics

   The researcher’s perspective is an integral part of
    what is recorded about the social world
   Scientific detachment is not possible
   Meanings given to social phenomena and situations
    must be understood
   Programs cannot be studied independently of their
    context.
   Cause and effect cannot be defined and change
    must be studied holistically.



                                                 119
2. Qualitative data collection methods
Strengths and weaknesses

Strengths:
   Flexible to evolve
   Sampling focuses on high value subjects
   Holistic focus (“the big picture”)
   Multiple sources provide complex understanding
   Narrative more accessible to non-specialists
   Triangulation strengthens validity of findings

Weaknesses:
   Lack of clear design may frustrate clients
   Lack of generalizability
   Multiple perspectives: hard to reach consensus
   Individual factors not isolated
   Interpretive methods appear too subjective



                                                                   120
3. Mixed method evaluation designs

   Combine the strengths of both QUANT and QUAL
    approaches
   One approach (QUANT or QUAL) is often
    dominant and the other complements it
   Can have both approaches equal but harder to
    design and manage.
   Can be used sequentially or concurrently



                                               121
     Determining appropriate precision and mix of multiple methods

[Diagram: data collection methods arrayed on two axes. One axis runs from
extractive/quantitative to participatory/qualitative; the other runs from
“high rigor, high quality, more time & expense” down to “low rigor,
questionable quality, quick and cheap.” Example methods, roughly in that
order: nutritional measurements, household (HH) surveys, key informant
interviews, focus groups, large group discussions.]
      3. Mixed method evaluation designs
      How quantitative and qualitative methods
             complement each other

A. Broaden the conceptual framework
     • Combining theories from different disciplines:
     • Exploratory QUAL studies can help define framework
B. Combine generalizability with depth and context
      • Random subject selection ensures representativeness and generalizability
     • Case studies, focus groups etc can help understand the characteristics of the
             different groups selected in the sample
C. Permit access to difficult to reach groups [QUAL]
     • PRA, focus groups, case studies, snowball samples, etc can be effective
             ways to reach women, ethnic minorities and other vulnerable groups
         •   Direct observation can provide information on groups difficult to interview.
             For example, informal sector and illegal economic activities
D. Enable Process analysis [QUAL]
     • Observation, focus groups and informal conversations are more effective for
             understanding group processes or interaction between people and public
             agencies, and studying the organization
                                                                              123
         3. Mixed method evaluation designs
         How quantitative and qualitative methods
             complement each other (cont.)
E.   Analysis and control for underlying structural factors [QUANT]
     •   Sampling and statistical analysis can avoid misleading conclusions
     •   Propensity scores and multivariate analysis can statistically control for
         differences between project and control groups
     Example:
     •   Meetings with women may suggest gender biases in local firms’ hiring
         practices; however,
     •   Using statistical analysis to control for years of education or experience
         may show there are no differences in hiring policies for workers with
         comparable qualifications
     Example:
     •   Participants who volunteer to attend a focus group may be strongly in
         favor or opposed to a certain project, but
     •   A rapid sample survey may show that most community residents have
         different views
                                                                         124
   3. Mixed method evaluation designs
  How quantitative and qualitative methods
      complement each other (cont.)

F. Triangulation and consistency checks
    •   Direct observation may identify inconsistencies in interview responses
    •   Examples:
        •   A family may say they are poor but observation shows they have new
            furniture, good clothes etc.
        •   A woman may say she has no source of income, but an early morning visit
            may show she operates an illegal beer brewing business


G. Broadening the interpretation of findings:
    •   Combining personal experience with “social facts”
    •   Statistical analysis frequently includes unexpected or interesting
        findings which cannot be explained through the statistics. Rapid
        follow-up visits may help explain the findings


                                                                     125
3. Mixed method evaluation designs
     How quantitative and qualitative methods
         complement each other (cont.)

G. Interpreting findings (cont.)
    Example:
    • A QUANT survey of community water management in
       Indonesia found that with only one exception all village water
       supply was managed by women
    • Follow-up visits found that in the one exceptional village
       women managed a very profitable dairy farming business –
       so men were willing to manage water to allow women time to
       produce and sell dairy produce
       Source: Brown (2000)




                                                           126
Using Qualitative methods to improve
the Evaluation design and results

 Use recall to reconstruct pre-test situation
 Interviews with key informants to identify other changes
  in the community or in gender relations
 Interviews or focus groups with women and men to
   •   assess the effect of loans on gender relations within the
       household, such as
        • changes in control of resources and decision-making
   •   identify other important results or unintended consequences:
         • increase in women’s work load,
         • increase in incidence of gender-based or domestic violence

                                                            127
     Enough of our
presentations: it’s time for
  you (THE RealWorld
PEOPLE!) to get involved
Small group case study work
1.   Some of you are playing the role of
     evaluation consultants, others are clients
     coordinating the evaluation.
2.   Decide what your group will do to
     address the given constraints/
     challenges.
3.   Prepare to negotiate the ToR with the
     other group after lunch.

                                         129
RealWorld Evaluation
Designing Evaluations under Budget,

Time, Data and Political Constraints


              Session 8
Identifying and addressing threats
  to the validity of the evaluation
      design and conclusions


                                   130
                  The Real World Evaluation [RWE] Approach

Step 1: Planning and scoping the evaluation

Step 2: Addressing budget constraints
Step 3: Addressing time constraints
Step 4: Addressing data constraints
Step 5: Addressing political influences

Step 6: Strengthening the evaluation design and validity

Step 7: Helping clients use the evaluation

Step 6: Strengthening the evaluation design and the validity of the conclusions
   A. Identifying threats to validity of quasi-experimental designs
   B. Assessing the adequacy of qualitative designs
   C. An integrated checklist for mixed-method designs
   D. Addressing threats to quantitative evaluation designs
   E. Addressing threats to the adequacy of qualitative designs
   F. Addressing threats to mixed-method designs
Session outline

1.   What is validity and why does it
     matter?
2.   General guidelines for assessing
     validity
3.   Additional threats to validity for
     quantitative evaluation designs
4.   Strategies for addressing threats to
     validity
                                       132
1. What is validity and
why does it matter?
Defining validity
The degree to which the evaluation findings and
  recommendations are supported by:
 The conceptual framework describing how the
  project is supposed to achieve its objectives
 Statistical techniques (including sample design)

 How the project and the evaluation were
  implemented
 The similarities between the project population and
  the wider population to which findings are
  generalized

                                                134
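The slide above lists statistical techniques, including sample design, among the supports for validity. As one hedged illustration (ours, not from the RealWorld Evaluation materials; the effect size and sample numbers are assumed), the Python sketch below checks whether a planned sample would be large enough to detect a plausible program effect:

```python
# A minimal sketch, assuming hypothetical numbers throughout, of one
# 'statistical technique' underlying validity: power analysis for a
# two-group comparison.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Smallest standardized effect the evaluation should detect
# (an assumed 'small-to-medium' effect in Cohen's d terms).
target_effect = 0.3

# Sample size per group needed for 80% power at alpha = 0.05.
n_per_group = analysis.solve_power(effect_size=target_effect,
                                   alpha=0.05, power=0.8)
print(f"Needed per group: {n_per_group:.0f}")  # roughly 175 per group

# Conversely: with only 50 households per group (a RealWorld budget),
# what power does the evaluation actually have?
power = analysis.solve_power(effect_size=target_effect,
                             nobs1=50, alpha=0.05, ratio=1.0)
print(f"Power with n=50 per group: {power:.2f}")  # well below 0.8
```

An underpowered sample of this kind is one reason a genuinely effective program can appear to have no effect.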
Importance of validity
Evaluations provide recommendations for
  future decisions and action. If the
  findings and interpretation are not valid:
 Programs which do not work may
  continue or even be expanded
 Good programs may be discontinued

 Priority target groups may not have
  access or benefit
                                        135
RWE quality control goals
   The evaluator must achieve the greatest possible
    methodological rigor within the limitations of a given
    context
   Standards must be appropriate for different types of
    evaluation
   The evaluator must identify and control for
    methodological weaknesses in the evaluation design.
   The evaluation report must identify methodological
    weaknesses and how these affect generalization to
    broader populations.



                                                     136
2. General guidelines for assessing the
   validity of all evaluation designs
   [see Overview Handbook Appendix 1]


A.   Confirmability
B.   Reliability
C.   Credibility
D.   Transferability
E.   Utilization

                                      137
A.    Confirmability
Are the conclusions drawn from the available evidence
  and is the research relatively free of researcher
  bias?
Examples:
A-1: Inadequate documentation of methods and
  procedures
A-2: Are data presented to support the conclusions, and
  are the conclusions consistent with the findings?
  [Compare the executive summary with the data in
  the main report]



                                               138
B.    Reliability
Is the process of the study consistent, reasonably
   stable over time and across researchers and
   methods?
Examples:
B-2: Data were collected only from people who
   attended focus groups or community meetings
B-4: Were coding and quality checks made and
   did they show agreement?


                                            139
C. Credibility

Are the findings credible to the people studied and to
     readers? Is there an authentic picture of what is being
     studied?
Examples:
C-1: Is there sufficient information to provide a credible
     description of the subjects or situations studied?
C-3: Was triangulation among methods and data sources
     systematically applied? Were findings generally
     consistent? What happened if they were not?




                                                      140
D. Transferability

Do the conclusions fit other contexts and how
    widely can they be generalized?
Examples:
D-1: Are the characteristics of the sample
    described in enough detail to permit
    comparisons with other samples?
D-4: Does the report present enough detail for
    readers to assess potential transferability?


                                             141
E. Utilization


Were findings useful to clients,
   researchers and communities studied?
Examples:
E-1: Were findings intellectually and
   physically accessible to potential
   users?
E-3: Do the findings provide guidance for
   future action?

                                    142
3. Additional threats to validity for
   Quasi-Experimental Designs [QED]
     [see Overview Handbook Appendix 1]

F.   Threats to statistical conclusion validity:
     why inferences about statistical association between two variables
     (for example, project intervention and outcome) may not be valid

G.   Threats to internal validity:
     why assumptions that project interventions have caused observed
     outcomes may not be valid

H.   Threats to construct validity:
     why selected indicators may not adequately describe the constructs
     and causal linkages in the evaluation model

I.   Threats to external validity:
     why assumptions about the potential replicability of a project in
     other locations or with other groups may not be valid
                                                                      143
F. Statistical conclusion validity
The statistical design and analysis may incorrectly
  assume that program interventions have contributed
  to the observed outcomes.
   The wrong tests are used or they are
    applied/interpreted incorrectly
   Problems with sample design
   Measurement errors


                                              144
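To make the first two bullets above concrete, here is a minimal sketch (our illustration, not from the workshop materials; all data are simulated) comparing a t-test with a rank-based alternative on skewed outcome data of the kind income variables often produce:

```python
# A hedged sketch of one statistical conclusion validity threat:
# applying a test whose assumptions the data violate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical household incomes: heavily right-skewed, as income
# data usually are. Samples and effect size are invented.
comparison = rng.lognormal(mean=3.0, sigma=1.0, size=40)
treatment = rng.lognormal(mean=3.3, sigma=1.0, size=40)

# A t-test assumes roughly normal (or large) samples; with small,
# skewed samples its p-value can mislead.
t_stat, t_p = stats.ttest_ind(treatment, comparison)

# A rank-based test makes much weaker distributional assumptions.
u_stat, u_p = stats.mannwhitneyu(treatment, comparison,
                                 alternative="two-sided")

print(f"t-test p = {t_p:.3f}; Mann-Whitney p = {u_p:.3f}")
# If the two p-values lead to different conclusions, the evaluation's
# 'statistical conclusion' rests on an unchecked assumption.
```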
G. Threats to internal validity
It may be incorrectly assumed that there is a
   causal relationship between project
   interventions and observed outcomes.
   Unclear temporal sequence between the
    project and the observed outcomes.
   Need to control for external factors
   Effects of time
   Unreliable measures

                                          145
Example of threat to internal
validity: The assumed causal model

Women join the village bank, where they
receive loans, learn skills and gain
self-confidence, WHICH:
   → increases women's income
   → increases women's control over
     household resources
                                        146
An alternative causal model

Some women had previously taken literacy
training, which increased their
self-confidence and work skills
   → Women who had taken literacy training
     are more likely to join the village
     bank, and their literacy and
     self-confidence make them more
     effective entrepreneurs
   → Women's income and control over
     household resources increased as a
     combined result of literacy,
     self-confidence and loans
                                        147
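To show why the choice between these two causal models matters, here is a minimal simulation sketch (our illustration; the variable names and effect sizes are hypothetical, not taken from the workshop or any cited study). When prior literacy training drives both bank membership and income, a naive estimate credits the loans with the literacy effect; controlling for literacy recovers the true loan effect:

```python
# A hedged simulation of the omitted-variable problem in the
# alternative causal model above. All numbers are invented.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000

# Some women previously took literacy training (assumed 40%).
literacy = rng.binomial(1, 0.4, size=n)

# Literate women are more likely to join the village bank.
joins_bank = rng.binomial(1, 0.2 + 0.4 * literacy)

# Income depends on BOTH loans and literacy (true loan effect = 5).
income = 20 + 5 * joins_bank + 8 * literacy + rng.normal(0, 3, size=n)

# Naive model: omits literacy, so its effect is credited to the bank.
naive = sm.OLS(income, sm.add_constant(joins_bank)).fit()

# Better-specified model: controls for the confounder.
X = sm.add_constant(np.column_stack([joins_bank, literacy]))
adjusted = sm.OLS(income, X).fit()

print(f"naive loan effect:    {naive.params[1]:.2f}")     # well above 5
print(f"adjusted loan effect: {adjusted.params[1]:.2f}")  # close to 5
```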
H. Threats to construct validity
The indicators of outputs, impacts and contextual
  variables may not adequately describe and
  measure the constructs [hypotheses/concepts]
  on which the program theory is based.
 Indicators may not adequately measure key
  concepts
 The program theory model and the interactions
  between stages of the model may not be
  adequately specified.
 Reactions to the experimental context are not
  well understood.

                                           148
I. Threats to external validity
Assumptions about how the findings could be
  generalized to other contexts may not be valid.
 Some important characteristics of the project
  context may not be understood.
 Important characteristics of the project
  participants may not be understood.
 Seasonal and other cyclical effects may have
  been overlooked.


                                           149
RealWorld Evaluation book
• Appendix 2 gives a worksheet for
assessing the quality and validity of an
evaluation design
• Appendix 3 provides worked examples



                                           150
4. Addressing generic
threats to validity for
all evaluation designs
A. Confirmability



Example: Threat A-1: inadequate
  documentation of methods and procedures
Possible ways to address:
 Request that the researchers revise their
  documentation to explain their methodology more
  fully or to provide missing material.
 Use rapid data collection methods (surveys, desk
  research, secondary data) to fill gaps.

                                        152
B. Reliability

Example: Threat B-4: data were not collected across
  the full range of appropriate settings, times,
  respondents etc.

Possible ways to address

   If the study has not yet been conducted, revise the
    sample design or use qualitative methods to cover
    the missing settings, times or respondents.
   If data collection has already been completed,
    consider using rapid assessment methods such as
    focus groups, interviews with key informants and
    participant observation to fill in some of the gaps.

                                                    153
C. Credibility
Example: Threat C-2: The account does not ring true
  and does not reflect the local context
Possible ways to address

   If the study has not yet been conducted, revise the
    sample design or use qualitative methods to cover the
    missing settings, times or respondents.
   If data collection has already been completed,
    consider using rapid assessment methods such as
    focus groups, interviews with key informants and
    participant observation to fill in some of the gaps.



                                                        154
D. Transferability
Example: Threat D-3: Sample does not permit
  generalization to other populations
Possible ways to address:
   Organize workshops or consult key informants to assess
    whether the problems concern missing information,
    factual issues or how the material was interpreted by the
    evaluator.

   Return to the field to fill in the gaps or include the
    impressions of key informants, focus group participants,
    or participant observers to provide different perspectives.



                                                        155
E. Utilization
Example: Threat E-2: The findings do not provide
  guidance for future action
Possible ways to address
 If the researchers have the necessary information, ask
  them to make their recommendations more explicit.
 If they do not have the information, organize
  brainstorming sessions with community groups or the
  implementing agencies to develop more specific
  recommendations for action.



                                                  156
Lightning feedback

 What are some of the most serious threats to
  validity affecting your evaluations?
 How can they be addressed?
                                        157

Time for more discussion
                                        158
Small group case study work,
cont.
1.   Evaluation ‘consultants’ meet with
     ‘clients’ working on same case study
     (1A+1B) and (2A+2B)
2.   Negotiate your proposed modification of
     the ToR in order to cope with the given
     constraints
3.   Be prepared to summarize lessons
     learned from this exercise (and
     workshop)
                                        159
In conclusion:
Evaluators must be prepared to:
1. Enter at a late stage in the project cycle;
2. Work under budget and time restrictions;
3. Not have access to comparative baseline
   data;
4. Not have access to identified comparison
   groups;
5. Work with very few well-qualified evaluation
   researchers;
6. Reconcile different evaluation paradigms and
   information needs of different stakeholders.
                                        160
     Main workshop messages
1.   Evaluators must be prepared for real-world
     evaluation challenges
2.   There is considerable experience to draw on
3.   A toolkit of rapid and economical “RealWorld”
     evaluation techniques is available
4.   Never use time and budget constraints as an
     excuse for sloppy evaluation methodology
5.   A “threats to validity” checklist helps keep you
     honest by identifying potential weaknesses in
     your evaluation design and analysis
                                             161