Rating Differences of Experienced Raters in Assessment of English

A Study of the Decision-making Behavior of Markers in E-C Sentence Translation Assessment

            Wen Hui
           Nie Jianzhong
                             Abstract
    Second language testing increasingly uses rated tasks in place of
objectively scored items. An understanding of raters’ decision-making
behavior is of particular importance for both reliability and
validity. This paper investigates the decision-making
behavior of experienced raters with similar backgrounds in their
assessment of an English-Chinese sentence translation task, on the
basis of quantitative and qualitative analysis. Rating scores and raters’
verbal protocols indicated that the raters in this study differed
significantly in their decision-making processes.

Key words:
    English-Chinese sentence translation; Think-aloud verbal
protocol; Decision-making behavior; Inter-rater difference;
Translation assessment


                           Introduction

Objective
Significance
Data collection & analysis
Results
Limitations & recommendations

       Current research on rater behavior
Bachman et al.          1995   speaking
Brown                   1995   writing
Cumming et al.          2002   writing
Lumley & McNamara       1995   writing
Lumley                  2002   writing
Milanovic et al.        1996   writing
Pollitt & Murray        1996   speaking
Weigle                  1994   writing
Weigle et al.           2003   writing
Zhang Wenxia (张文霞)   2004   writing
                         Objective
To explore decision-making behavior of experienced raters during the
process of translation assessment.

                         Significance
Pilot study
Rating scale
Rater training
Score interpretation

                 Research Design and Methodology

Data collection
  quantitative: scripts, test scores
  qualitative: TAP, written reports

Data analysis
  quantitative: frequency analysis, variance analysis
  qualitative: protocol analysis, written reports analysis

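The frequency analysis named above amounts to counting how often each rater awarded each point on the scale. A minimal sketch follows, assuming a five-point scale; the scores are hypothetical placeholders, not the study's data, and the function name `score_frequencies` is introduced here only for illustration.

```python
from collections import Counter

# Hypothetical scores on an assumed 1-5 scale; NOT the study's data.
scores_by_rater = {
    "R1": [3, 4, 2, 3, 5, 3, 4, 1, 3, 2],
    "R2": [2, 3, 3, 4, 4, 3, 5, 2, 3, 3],
}

def score_frequencies(scores, scale=range(1, 6)):
    """Count how many times each scale point was awarded (zero if never)."""
    counts = Counter(scores)
    return {point: counts.get(point, 0) for point in scale}

for rater, scores in scores_by_rater.items():
    print(rater, score_frequencies(scores))
```

Listing every scale point, including those never used, makes it easy to compare how widely different raters spread their scores.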
              E-C translation assessment

Nature
  Translation competence
  Characteristics of E-C translation test

Scale development
  Nature & role
  Current problems (reliability & validity)
  Two scoring procedures (holistic & analytic)
  Development procedure: comments-construct-piloting & application

                        Verbal protocol analysis

Applications
  Psychology
  Language testing
  Translation

Limitations & cautions
  Influence on participants’ internal process
  Incompleteness of protocols
  Inconsistency between reports and behavior

Reliability & validity
  Appropriate instruction
  Minimize time between process and report
  Adequate coding scheme

                         Results and discussion
RQ1: How is the students’ E-C sentence translation assessed by the raters
in terms of scores?

[Figure 4.1a: Frequency analysis on the category of accuracy (ACC); frequency plotted against score for raters R1-R6]

[Figure 4.1b: Frequency analysis on the category of expression (EXP); frequency plotted against score for raters R1-R6]

Table 4.3 ANOVA table for the data of table 4.1

Category     Source           Sum of Squares   df      Mean Square   F      Sig.
Accuracy     Between Groups    1.35             5.00   0.27          0.21   0.95
             Within Groups    67.90            54.00   1.26
             Total            69.25            59.00
Expression   Between Groups    3.40             5.00   0.68          0.67   0.65
             Within Groups    55.00            54.00   1.02
             Total            58.40            59.00

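The F ratios in Table 4.3 follow from the reported sums of squares and degrees of freedom: each mean square is a sum of squares divided by its df, and F is the between-groups mean square divided by the within-groups mean square. A minimal sketch of that arithmetic, using only the figures from the table, is given below. The names `AnovaRow` and `f_ratio` are introduced here for illustration. The Sig. values need the F distribution and are not recomputed.

```python
from dataclasses import dataclass

@dataclass
class AnovaRow:
    ss_between: float  # sum of squares between groups (raters)
    df_between: int    # degrees of freedom between groups
    ss_within: float   # sum of squares within groups
    df_within: int     # degrees of freedom within groups

def f_ratio(row: AnovaRow) -> float:
    """Return F = (SS_between / df_between) / (SS_within / df_within)."""
    ms_between = row.ss_between / row.df_between
    ms_within = row.ss_within / row.df_within
    return ms_between / ms_within

# Values copied from Table 4.3.
accuracy = AnovaRow(ss_between=1.35, df_between=5, ss_within=67.90, df_within=54)
expression = AnovaRow(ss_between=3.40, df_between=5, ss_within=55.00, df_within=54)

print(round(f_ratio(accuracy), 2))
print(round(f_ratio(expression), 2))
```

Rounded to two decimals, the results agree with the F column of the table (0.21 for accuracy, 0.67 for expression).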
    RQ2: Are there any remarkable differences among raters in their qualitative judgments
    of the same translation?
    RQ3: What elements do these raters focus on while marking these two translation tasks, and
    how do they focus on these elements?
                     A model of the decision-making process in translation marking

First impression (good / bad)
  1. Task completion
  2. Legibility
Macro-assessment
  Qualitative judgments
Micro-assessment
  Elements
  1. Locating errors
  2. Judging gravities of errors
Tentative marking
Decide final mark

  1. Except for extremely good or bad performances, raters usually hold different perceptions of the same
performance.
  2. In E-C sentence translation assessment, raters showed great differences in deciding on the gravity of the errors located.
                E-C translation elements raters focused on in marking

                Sentence translation                                             Discourse translation
criterion-related elements         non-criterion elements          criterion-related elements         non-criterion elements
Comprehension of the source text   Layout                          Comprehension of the source text    Layout
Mistranslation                     Spelling mistakes               Mistranslation                     Spelling mistakes
Omission                           Rater’s affective responses     Omission                           Rater’s affective responses
Incorrect addition                 Comparison with other scripts   Incorrect addition                 Comparison with other scripts
Uncontrolled translation                                           Obscurity of language
Lexical choice                                                     Uncontrolled translation
Chinese conventions                                                Lexical choice
Fluency in expression                                              Sentence structure
Stylistic inappropriateness                                        Pauses of sentences
Task completion                                                    Coherence of sentences
Error types                                                        Literary grace
Error gravity                                                      Chinese conventions
Error frequency                                                    Fluency in expression
                                                                   Succinctness of language
                                                                   Stylistic inappropriateness
                                                                   Global communicative effect
                                                                   Task completion
                                                                   Error types
                                                                   Error gravity
                                                                   Error frequency
                                Conclusion


Implications
  Rater training
  Usefulness of scales
  Score interpretation
  Research methodology

Limitations
  Incomplete investigation of protocols
  Small sample size
  Reliability & validity (TEM8 & NCEC)
  Methodological problem of TAP

Recommendations
  C-E translation assessment
  Rater backgrounds

THANKS !

				