According to Bachman and Palmer
"Test usefulness, consisting of several
 qualities (reliability, construct validity,
 authenticity, interactiveness, impact, and
 practicality), is an overriding consideration for quality
 control throughout the process of
 designing, developing, and using a
 particular test." (Bachman & Palmer,
 1996, p. 9)
         Usefulness principles
Usefulness = Reliability + Construct validity + Authenticity
+ Interactiveness + Impact + Practicality
1. It is the overall usefulness of the test that is to be
    maximized, rather than the individual qualities that affect
    usefulness.
2. The individual test qualities cannot be evaluated
    independently, but must be evaluated in terms of their
    combined effect on the overall usefulness of the test.
3. Test usefulness and the appropriate balance among the
    different qualities cannot be prescribed in general, but
    must be determined for each testing situation.

   Other needs for usefulness
• To be useful, any given test must be developed with
  – A specific purpose
  – A particular group of test takers
  – A specific language use domain (Target
    Language Use, TLU)
     • TLU domain
     • TLU tasks

               Reliability
• Consistency of measurement
  – Between equivalent forms of the test:
    equivalent-forms reliability
  – Between and within raters: inter-rater and
    intra-rater reliability
  – Between sittings: test-retest reliability
  – Within test: internal consistency
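
Internal consistency is often summarised with Cronbach's alpha. The slides do not name a statistic, so that choice is an assumption here, and the function name and the score layout (one row per test taker, one column per item) are purely illustrative. A minimal sketch using only the Python standard library:

```python
from statistics import variance


def cronbach_alpha(scores):
    """Estimate internal consistency (Cronbach's alpha).

    scores: list of rows, one row per test taker, one column per item.
    This layout is an assumption made for the example.
    """
    k = len(scores[0])  # number of items on the test
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two test takers")

    # Transpose so that each tuple holds every test taker's score on one item.
    items = list(zip(*scores))
    item_variance_sum = sum(variance(col) for col in items)

    # Variance of each test taker's total score across all items.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: two items that rank three test takers identically.
consistent = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(consistent)
```

Values of alpha close to 1 suggest that the items behave consistently with one another; what counts as acceptable depends on the testing situation, in line with principle 3 above.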

             Construct Validity
• What is a construct?
• Does the test measure what its stated purpose
  says it measures?
• Meaningfulness and appropriateness of the
  interpretations that we make on the basis of the
  test scores.
• Domain of generalization (prediction outside of
  test tasks).
• Other sorts of validity: content validity (all
  facets), predictive validity, concurrent/criterion
  validity, face validity.
Quiz question

Can a test be valid
but not reliable,
and vice versa?

              Authenticity
Relation between target language use
(TLU) or language methodology
(structural, communicative, etc.) and the
characteristics of the test tasks

       Interactiveness p. 26 Figure 2.5
• Difference between interactiveness in
  today's language teaching methodology and what is
  meant here.
• “The extent and type of involvement of the
  test taker’s individual characteristics in
  accomplishing a test task” (Bachman & Palmer,
  1996, p. 25)

• Ways in which language ability + topical
  knowledge + affective schemata are
  engaged by the test tasks
                 Impact (1)
• Impact at the macro level (society, education
  system) and micro level (individual).
• Some vocabulary:
  – Stakeholders: test takers, test users/decision
    makers
  – High-stakes tests / high-stakes nationwide
    public examinations.
  – Washback: effect of test on teaching and
    learning, or on educational practices (beneficial
    or harmful).
                 Impact (2)
• Test takers: prep time, topical content, language
  knowledge improved, strategies (Involving test
  takers in the preparation of the test?), type of
  feedback, fair decisions and fair test use
• Teachers: How can we minimise the potential for
  negative impact on instruction? Make the test
  correspond to the instructional program, or use
  the test to bring about program change.
• Society: test use reflecting the values and goals
  of a small number of people in society.
• What are the solutions? (p. 35)
                Practicality
• See the formula from the Bachman and Palmer (1996)
  chapter, written below
• Practicality = Available resources* / Required
  resources (must be 1 or greater for the test to be
  practical)
  – *Resources = human resources, material
    resources, time.
• Note that Brown’s definition in chapter 2 is
  different (from the user’s point of view).
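
As a rough illustration of the practicality ratio, the sketch below computes available / required separately for each of the three resource types named on the slide. Checking each resource on its own, the threshold of 1.0, the function names, and all the figures are assumptions made for the example; they are not taken from Bachman and Palmer.

```python
RESOURCE_TYPES = ("human", "material", "time")


def practicality_ratios(available, required):
    """Return available / required for each resource type.

    Both arguments are dicts keyed by RESOURCE_TYPES. The units are
    whatever the test developer chooses, as long as each pair matches.
    """
    ratios = {}
    for kind in RESOURCE_TYPES:
        if required[kind] <= 0:
            raise ValueError(f"required {kind} resources must be positive")
        ratios[kind] = available[kind] / required[kind]
    return ratios


def is_practical(available, required, threshold=1.0):
    """Assumption: the test is practical only if every ratio meets the threshold."""
    ratios = practicality_ratios(available, required)
    return all(ratio >= threshold for ratio in ratios.values())


# Hypothetical figures: person-hours, budget units, and weeks.
available = {"human": 40, "material": 500, "time": 6}
required = {"human": 32, "material": 400, "time": 8}

ratios = practicality_ratios(available, required)
practical = is_practical(available, required)
```

With these made-up numbers the human and material ratios are 1.25 but the time ratio is 0.75, so the plan would need either more time or a less demanding design before it could be called practical.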
          Group question

Considering the six qualities for test
usefulness, could there be a trade-off
between the first two qualities and the
others (think of the language teaching
methodology)?

             Question 4 p. 40
• Suppose someone were to propose that the test
  task* for Project 3 in Part Three (p. 296) be used
  for the purpose described in Project 4 in Part
  Three.
• In what specific ways would the various qualities
  of usefulness be reduced?
*Test takers are given a specific set of words to be
  combined into sentences which together form a
  business letter similar to the one elicited in
  project 2 (formal letter to a client who has
                Purposes of Projects 3 and 4
• Project 3: The purpose of this test is to decide whether the job
  applicants described in Project 2 (who will work for a telephone company,
  responding in writing to customers’ complaints about phone company
  service), and who have been admitted and sent to take an English for
  Specific Purposes course at the phone company, have mastered specific
  course content. The test will also be used to provide diagnostic
  feedback to those applicants who are judged not to have mastered the
  specific course content, as well as information that course designers and
  teachers can use to tailor the course more closely to the needs of this
  group of students.

• Project 4: This is a fairly low-stakes test for the purpose of providing
  evidence of the test takers’ ability to participate in small talk.
  Interpretation of scores will be included in an end-of-course certificate
  and scores will be used as a basis for assigning course grades.

 Next topic: Test Usefulness: Application to a Test
• Readings:
• Bachman & Palmer (1996) chap. 7 - Developing a plan for the
  evaluation of test usefulness. Read also Projects 1, 2, and 3.
• Answer question 3 of chap. 7, which involves Projects 2 and 3. A
  model answer is given for Project 1; refer to it when writing yours.
• One group (as decided in class) will do the first part of the
  question (with project 2) and the other group will do the second
  part of the question (with project 3). Everyone should know
  about projects 2 and 3 in general for the discussion.
• Note: The meaning of “unmotivated way” is given on page 139.

