									 Creating Valid and
 Reliable Classroom Tests
James A. Wollack, PhD
John Siegler, PhD
Taehoon Kang
Craig S. Wells
Testing & Evaluation Services
      Creating Valid and Reliable Classroom Tests
Session III: Writing and Scoring Essay and Short-Answer Questions

    Recap of Session II
        Writing Multiple-Choice Questions
    Sharing of Homework
    Writing and Scoring Essay and Short-Answer Questions
      Essay Grading Exercise
      Item Writing Rules
      Scoring Guidelines
      Essay Grading Exercise, Revisited
    Question and Answer Session
    Preview of Session IV
        Collect answer sheets from quiz
                           Recap of Session II
                   Writing Multiple-Choice Tests
   Types of multiple-choice items
   Rules for writing multiple-choice items
      Make item as clear as possible
      Item should have one and only one unambiguously correct answer
      Distractors should be plausible
      Avoid irrelevant clues to the right answer or clues that certain distractors
       are wrong
      Design items so that common test-taking strategies are ineffective
   Multiple-choice item writing exercise
                        Sharing of Homework

   In groups of 3, review your constructed-response items and select
    the one item that you like best.

   As a group, develop the scoring rules/criteria for this item. Include
    the following:
        Information for the student on how the item will be scored.
        Information for instructor or TA on how to grade the item.

   You decide: The item under consideration could be a single item on
    a test or homework assignment, or perhaps the entire test/homework.
Writing and Scoring Essay and Short-Answer Items
                       Essay Grading Exercise

   Consider the following essay question:

    While taking multiple-choice tests, students often use various
    test-taking strategies to help them select the correct response.
    Students are frequently able to use these strategies effectively
    because the items are poorly constructed. Provide examples of two
    different types of item-writing problems that provide irrelevant
    clues that can be used to the advantage of the test-wise student.
    Also, provide an example of how instructors can use their knowledge
    of test-taking strategies to develop multiple-choice items that will
    be harder for students who are overly reliant on test-taking
    strategies. Justify your choices of examples. (10 points)
Writing and Scoring Essay and Short-Answer Items
                       Essay Grading Exercise

   Initial Score for Student A

   Initial Score for Student B
Writing and Scoring Essay and Short-Answer Items
   Constructed-response (CR) items are ideal for assessing whether
    students possess a rich understanding of material.

   Useful for learning the processes a student uses to solve problems.
    Students decide
        How to approach problems
        How to set up problems
        What factual information or opinions to use
        How much emphasis to devote to various parts
        How to express their answer
Writing and Scoring Essay and Short-Answer Items
   Examples of situations well-suited for CR items.
        assessing writing ability
        solving math or science problems
        comparing and contrasting opposing viewpoints
        recalling and describing important information
        developing a plan for solving a problem
        criticizing or defending an important theory

   CR items are easy to write, but difficult to score
        Grading issues
             Consistency
             Fairness
                        Rules for Writing CR Items
   Use CR items to measure complex objectives only
       Information that can be easily obtained using MC items or other
        objective-type items should not be tested in CR format
            list
            define
            identify

       Advantages gained by asking students to produce the answer, rather
        than recognize it, are more than offset by disadvantages associated
        with scoring such items.

       Reserve CR items for situations where supplying answer is essential
        and where MC items are of limited value
                  why            create
                  describe       relate
                  explain        interpret
                  compare        analyze
                  contrast       evaluate
                  criticize
                   Rules for Writing CR Items
   The shorter the required answer, the better
      Allows for more items to be administered
      Reduces influence of verbal fluency, spelling, etc.
      Easier to grade

   Focus question on a single issue
      Don’t give examinee or grader too much freedom in determining what
       the correct answer should be.
      Issue should be directly linked to course objectives
      Keep blueprint in mind (i.e., weighting of objectives) when determining
       how many points to assign to an item and how much time students will
       require to answer.
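The blueprint-weighting advice above reduces to simple arithmetic. A minimal sketch in Python, assuming hypothetical objective weights and a 50-point essay section:

```python
# Hypothetical blueprint: each course objective's weight, summing to 1.0.
blueprint = {"objective_A": 0.5, "objective_B": 0.3, "objective_C": 0.2}
total_points = 50

# Allocate item points in proportion to each objective's weight.
allocation = {obj: round(weight * total_points)
              for obj, weight in blueprint.items()}
# objective_A gets 25 points, objective_B 15, objective_C 10
```

The same proportions can guide how much testing time to budget per item.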

   Provide students enough time to answer
                    Rules for Writing CR Items
   Models for essay tests
       Take home vs. in class
       Inform students ahead of time of the possible topics
             e.g., Show students 6 questions and tell them that 2 will be on the test.
       Provide students with a number of questions and allow students
        to select the one(s) on which to write.
                   Rules for Writing CR Items

   Models for essay tests
       Take home vs. in class
            Take home essays allow students to give more thoughtful,
             complete, and better answers.
            Take home essays afford limited security in terms of item
             exposure, available resources, and identity of test taker.
            In class essays measure organizational skills and writing and
             thinking speed in addition to knowledge of the topic.
            It is possible to require that take home essays be typed, thereby
             making it easier and quicker to read, eliminating handwriting from
             consideration, and reducing the scoring impact of spelling errors.
       Inform students ahead of time of the possible topics
            e.g., Show students 6 questions and tell them that 2 will be on the test.
       Provide students with a number of questions and allow students to
        select the one(s) on which to write.
                      Rules for Writing CR Items

   Models for essay tests
       Inform students ahead of time of the possible topics
            e.g., Show students 6 questions and tell them that 2 will be on the test.
                  Students don’t like surprises
                  Allows students maximum opportunity to prepare their answers
                      Could include memorizing information supplied by a friend.
                      Opens possibility of someone cheating by coming to class with essay already written.
                  Unless the initial set of possible topics covers the entire blueprint very
                   well, this may result in students not bothering to study other areas you
                   regard as important
                      Rules for Writing CR Items

   Models for essay tests
       Provide students with a number of questions and allow students to
        select the one(s) on which to write.
            It is very difficult to compare performances of two individuals who
             answered different items
                  Not all items are equally difficult or easy to grade
            If students select items to answer, it is very hard for you to control the
             content of the exam.
                  Students will choose to answer items on familiar topics
                  Performance will not represent how well they have mastered entire
                   domain of interest.
            Avoid this model, if at all possible.
                      Rules for Scoring CR Items
   CR items are very difficult to grade
       Grading difficulty increases with the number of scale points
            Item may be easy to grade with 3 points, but hard with 10.
       Grading difficulty increases with item complexity.
   When grading, focus on consistency and fairness
       Consistency refers to the extent to which the same points are
        awarded or subtracted for comparable information across students.
            Two students making comparable misinterpretations or mistakes should
             receive the same deductions
            More consistent grading will produce more reliable scores
       Fairness refers to the extent to which the points assigned or
        deducted reflect the weighting of objectives in the test blueprint.
            In answering related questions where the answer for one question is
             used as input into another question, getting an intermediate step wrong
             should only result in losing points once.
                  The second problem likely relates to a different objective.
                      Rules for Scoring CR Items
   Construct a detailed scoring rubric that identifies the basis for awarding
    and subtracting points at each phase of each item.
      May help to develop a model answer and think about essential elements in
       producing this answer
      Evaluate answers in terms of the learning outcomes measured
      Pay careful attention to errors of omission and commission
      Keep in mind the total number of points for the item
             Little mistakes may result in deductions on items worth a lot of points, but maybe
              not on items worth few points.
        Give careful thought to the basis for awarding or subtracting points on
         essays where examinee is asked for his/her opinion
             Given some scenario, argue for or against something.
             Students should be allowed to reach different conclusions, and they shouldn’t be
              graded down for political, ideological, or philosophical differences from grader.
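One way to act on this advice is to write the rubric down as an explicit point map before grading. A sketch for the 10-point sample essay question, where the element labels and weights are illustrative assumptions rather than an official key:

```python
# Hypothetical analytic rubric for the 10-point sample essay question:
# each essential element of the model answer carries a fixed point value.
rubric = {
    "first item-writing flaw, with example": 3,
    "second item-writing flaw, with example": 3,
    "strategy-resistant item example": 2,
    "justification of the chosen examples": 2,
}

# The element weights must sum to the item's point total.
assert sum(rubric.values()) == 10
```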
       Rubric for Wisconsin Student Assessment System
             Knowledge & Concepts Examinations
6 Response is complete and superior in development, with fine use of language and mechanics. The
  writing is clearly focused on a topic and is logical and well developed. There is a clear sense of
  voice, purpose, and audience. Balance, precise vocabulary, and sophistication set this response
  apart.

5 Response is clear and well organized. There is a clear sense of purpose and few errors in
  mechanics or language. There is logical development of topic. Response shows a good command
  of language, with spelling errors on above grade level words only. This response is balanced and complete.

4 Response is completely organized and developed with adequate use of language and mechanics.
  The piece follows an organizational plan to closure. Development may be brief with few examples,
  but it is focused on a topic. Vocabulary is good, and common words are spelled correctly.

3 Response is somewhat developed. Frequent errors in mechanics and language detract from the
  whole. There is some focus on a topic, though lapses in logic or balance may occur.

2 Response is poor. Errors in language and mechanics may obscure the meaning. There is little
  evidence of focus on a topic or of an organized plan. Poor vocabulary and spelling inhibit communication.

1 Response is scarcely coherent. Errors obscure the meaning. There is no balance, little or no logic,
  or attention to the topic.
                     Rules for Scoring CR Items
   CR items should be graded anonymously, if possible.
      Reduces grader subjectivity.
      Can use a code number to identify examinees
      Examinees could put their name only on the front sheet, which graders are
       instructed not to look at.
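The code-number idea can be sketched in a few lines of Python; the student names and the code range here are placeholders:

```python
import random

# Assign each examinee a random code number so graders see codes, not names.
students = ["Student A", "Student B", "Student C"]
codes = random.sample(range(1000, 10000), k=len(students))  # unique codes

# The instructor keeps the code-to-name roster; graders never see it.
roster = dict(zip(codes, students))
```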

   Grade all students’ responses one question at a time.
      Grade item 1 for all students before moving on to grade item 2.
      Helps grader maintain a single set of criteria for awarding points.
      Reduces influence of examinee’s previous performance on other items.
      If multiple graders are used and it is not possible for all graders to rate all
       items for all students, it is better to have each grader score a particular
       problem or two for all students than to have each grader score all problems
       for only a subset of students.
            e.g., Don’t have TA grade exams for only the students in their discussion section
            This strategy eliminates effects due to one person grading harder than another.
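The recommended grader-to-item split can be sketched as a simple assignment map; grader and item labels here are placeholders:

```python
# Each grader scores a particular item or two for ALL students,
# rather than all items for a subset of students.
graders = ["TA 1", "TA 2"]
items = [1, 2, 3, 4]

# Round-robin item-to-grader map, applied identically to every exam.
assignment = {item: graders[i % len(graders)] for i, item in enumerate(items)}
# Every student's item 1 is scored by the same grader, so between-grader
# severity differences cannot advantage one group of students over another.
```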
                       Rules for Scoring CR Items
   While grading a question, maintain a log of the types of errors observed
    and their corresponding deductions.
      It is very difficult to anticipate every error you will see
      Allows for consistency across exams.
             May be necessary to re-examine some questions that had already been graded
              to verify that the point deductions are consistent and fair.
                   Some mistakes may be more common than you had anticipated when you first started grading.
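A running log like this can be kept as a simple lookup table; a sketch in Python with hypothetical error labels:

```python
# Error log: the first time an error type is seen, its deduction is recorded;
# every later occurrence reuses the logged value, keeping deductions
# consistent across students.
error_log = {}  # error description -> points deducted

def deduct(error, points=None):
    """Return the deduction for an error, logging it on first occurrence."""
    if error not in error_log:
        if points is None:
            raise ValueError(f"first occurrence of {error!r} needs a deduction")
        error_log[error] = points
    return error_log[error]

first = deduct("omitted justification", 2)   # logged: 2 points off
later = deduct("omitted justification")      # same mistake, same 2 points off
```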

   Use multiple raters, if possible

   Unless writing skill is one of the course objectives, do not deduct points
    for poor grammar, spelling errors, or failure to punctuate properly.
        Points can be reduced if quality of writing clearly interferes with your ability
         to understand whether the student has adequately grasped the material.

   Never grade on the basis of penmanship or length.
        Can grade down for length if it is clearly outside the length parameters
         identified on assignment.
             Other Considerations with CR Items
   Well-developed CR items can provide a richness of information not
    available with MC testing. HOWEVER,
       They are harder and more time-consuming to grade
            Less reliable than MC tests.
            More likely to result in students contesting their grades.

       Because they take longer to answer, many more MC items can be
        administered in the same time period.
            CR items don’t sample the domain of interest as thoroughly.

       A common criticism of MC testing is that many students aren’t good at that
        type of assessment, so it doesn’t allow for them to show what they know.
            Research shows very clearly that students’ writing varies by genre.
                  Students may be good at writing compare-contrast or an objective piece, but may not
                   be good at expressing opinions.
                     Summary of MC versus CR items
                                        MC items                            CR items

Learning outcomes           Good for measuring outcomes at     Inefficient for measuring
measured                    lower levels of learning (e.g.,    knowledge outcomes; best for
                            knowledge, comprehension, and      ability to organize, integrate, and
                            application); inadequate for       express ideas.
                            organizing and expressing ideas.

Sampling of content         The use of a large number of       The use of a small number of
                            items results in broad coverage    items limits coverage which makes
                            which makes representative         representative sampling of content
                            sampling of content feasible.      infeasible.

Preparation of items        Preparation of good items is       Preparation of good items is
                            difficult and time consuming.      difficult but easier than MC items

Scoring                     Objective, simple, and highly      Subjective, difficult, and less
                            reliable.                          reliable.

Factors distorting scores   Reading ability and guessing.      Writing ability and bluffing.

Probable effect on          Encourages students to             Encourages students to organize,
learning                    remember, interpret, and use the   integrate, and express their own
                            ideas of others.                   ideas.
         Scoring Rubric for Sample Essay Question

   Consider scoring rubric for the sample essay question.
        Re-evaluate the responses to the two sample essays.

             Final score for Student A

             Final score for Student B

         By show of hands…
              Who saw their scores stay the same for both essays?
              Who saw only one of their scores stay the same?
              Who saw both their scores change?
              Who saw the difference between the two scores change?
              Who saw the ranking of the two scores change?
              Who scored item 2 higher than item 1?
                          Group Activity

   Re-assemble into groups and discuss any possible revisions to your
    scoring rubrics.

   Share items and rubrics
Questions from the first three sessions?
                         Preview of Session IV
   Collect answer sheets from quiz
       You may hold onto the actual quiz

   Evaluating the Test—The Final Step
     Overview of item analysis
     Overview of scanning and reporting options
     Review item analysis from quiz
            Evaluate items
            Revise items

   Final Questions

   Evaluation
The Room for Session IV Has Changed

  Session IV will take place in Union South.
       Please check the room postings
       when you arrive to see which room
       we will be in.
