Re-scoring / re-ranking of n-best translation hypothesis lists

							                        2006 Fall
           COMP 526 Natural Language Processing
                     Course Project




           Re-scoring / re-ranking of
       n-best translation hypothesis lists




                          Instructed by

                         Prof. Dekai Wu




              Human Language Technology Center

        Department of Computer Science and Engineering
        Hong Kong University of Science and Technology

                        December, 2006




Name of Student: Lo Chi Kiu   Email: jackielo Student ID: 01161253
                             Architectural Descriptions
This project consists of two modules.
1. Training
    The training module performs the following tasks:
    i.    Evaluate the n-best list against the reference translations using the BLEU score.
   ii.    Assign the actual rank to each translation in the n-best list using the BLEU score obtained
          in the previous task (i.e. label the translation with the best BLEU score).
  iii.    Prepare the training data for the maximum entropy model: the class (the actual rank in the
          n-best list) and the evidence (the scores from the additional score functions).
   iv.    Train the maximum entropy model using the training data prepared in the previous task.
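
Tasks i and ii above can be illustrated with a minimal Python sketch. It is not the project's BLEU module (which is adapted from an existing toolkit); it assumes whitespace tokenization and an add-one smoothed sentence-level BLEU, and the names sentence_bleu and label_nbest are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, references, max_order=4):
    """Add-one smoothed sentence-level BLEU (an assumed variant, not the project's)."""
    hyp = hypothesis.split()
    refs = [r.split() for r in references]
    if not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_order + 1):
        hyp_counts = ngram_counts(hyp, n)
        # Clip each hypothesis n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngram_counts(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        matched = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        # Add-one smoothing keeps the score non-zero when an order has no match.
        log_precision += math.log((matched + 1.0) / (total + 1.0))
    # Brevity penalty against the reference length closest to the hypothesis.
    ref_len = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    brevity = 1.0 if len(hyp) >= ref_len else math.exp(1.0 - ref_len / len(hyp))
    return brevity * math.exp(log_precision / max_order)


def label_nbest(nbest, references):
    """Return (BLEU scores, index of the best hypothesis, actual rank of each hypothesis)."""
    scores = [sentence_bleu(h, references) for h in nbest]
    # sorted() is stable, so ties keep the baseline order.
    order = sorted(range(len(nbest)), key=lambda i: -scores[i])
    ranks = [0] * len(nbest)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return scores, order[0], ranks
```

Here rank 0 denotes the translation with the best BLEU score, matching the 0-based ids used in the test result format.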


   Input:
             1.   reference translation files
                       File name: [filestem].[reference id]       e.g. test_set.ref.txt.0
                       Format:       Each reference translation file contains one set of translations of all
                                     training sentences. Each line in the file represents one sentence.
                                     The sentences should be sorted by sentence id.
                                     i.e.
                                     <reference translation of sentence 0>
                                     <reference translation of sentence 1>
                                     …
                                     <reference translation of sentence k>
             2.   n-best translation files of the baseline system
                       File name: [filestem].[sentence id].[n]best.txt e.g. test_set.0000.1000best.txt
                       Format:       Each n-best translation file contains the n-best translations of one
                                     sentence with the sentence id stated in the filename. Each line in the
                                     file represents one possible translation of that sentence. The
                                     translations should be sorted according to the rank given by the
                                     baseline.
                                     i.e.
                                     <1st best translation of sentence 0 by baseline system>
                                     <2nd best translation of sentence 0 by baseline system>
                                     …
                                     <nth best translation of sentence 0 by baseline system>
             3.   score files generated by the score functions
                        File name: [filestem].[sentence id].[n]best.txt e.g. jackielo.0000.1000best.txt
                        Format:       Each score file contains the scores of the n-best
                                      translations of one sentence, with the sentence id stated in the
                                      filename. Each line in the file holds the score of one translation.
                                      A score can be any real number; a larger number means a higher
                                      rank. The scores must be in the same order as the corresponding
                                      n-best translations.
                                      i.e.
                                      <score of 1st best translation of sentence 0 by baseline>
                                      <score of 2nd best translation of sentence 0 by baseline>
                                      …
                                      <score of nth best translation of sentence 0 by baseline>
   Output:        Weighting of each scoring function.
   Command:       train.perl [no. of sentence] [n-best] [no. of score function] [reference filestem] [n-best
                  filestem] [score function filestem]+ [weight filename]
   E.g.:          train.perl 92 1000 2 reference/test_set.ref.txt n-best/test_set test_data/jackielo
                  test_data/csjackie weight.txt
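
The input conventions above can be sketched in Python as follows. This is an illustration only, not the train.perl implementation: the 4-digit zero padding of the sentence id is inferred from the example filenames, and the names nbest_path, load_evidence and training_event are hypothetical.

```python
def nbest_path(filestem, sentence_id, n):
    """Build '[filestem].[sentence id].[n]best.txt' (4-digit padding is assumed)."""
    return "%s.%04d.%dbest.txt" % (filestem, sentence_id, n)


def read_lines(path):
    """Read a text file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_evidence(score_filestems, sentence_id, n):
    """Return one feature vector per hypothesis, with one score per score function."""
    columns = []
    for stem in score_filestems:
        lines = read_lines(nbest_path(stem, sentence_id, n))
        columns.append([float(x) for x in lines if x.strip()])
    # Every score function must score the same number of hypotheses.
    if len({len(c) for c in columns}) != 1:
        raise ValueError("score files disagree on the number of hypotheses")
    return [list(row) for row in zip(*columns)]


def training_event(best_index, evidence):
    """Pair the class (id of the BLEU-best hypothesis) with the evidence."""
    if not 0 <= best_index < len(evidence):
        raise ValueError("best_index out of range")
    return {"class": best_index, "evidence": evidence}
```

For the example command, load_evidence(["test_data/jackielo", "test_data/csjackie"], 0, 1000) would read test_data/jackielo.0000.1000best.txt and test_data/csjackie.0000.1000best.txt and return one two-element score vector per hypothesis.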


2. Testing
    The testing module performs the following tasks:
    i.    Prepare the testing data for the maximum entropy model: the evidence (the scores from the
          additional score functions).
   ii.    Test the maximum entropy model using the testing data prepared in the previous task.
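
One plausible reading of the testing step is a log-linear combination of the score functions, sketched below in Python. It assumes one trained weight per score function and a softmax over the n-best list of one sentence; the exact form used by the maximum entropy toolkit may differ, and the names rank_scores and best_index are hypothetical.

```python
import math


def rank_scores(evidence, weights):
    """Softmax over the weighted sum of each hypothesis's scores."""
    for row in evidence:
        if len(row) != len(weights):
            raise ValueError("each hypothesis needs one score per weight")
    totals = [sum(w * x for w, x in zip(weights, row)) for row in evidence]
    peak = max(totals)  # subtract the maximum for numerical stability
    exps = [math.exp(t - peak) for t in totals]
    norm = sum(exps)
    return [e / norm for e in exps]


def best_index(scores):
    """Id of the highest-scoring hypothesis; ties go to the better baseline rank."""
    return max(range(len(scores)), key=lambda i: scores[i])
```

With this form, a hypothesis moves up the ranking whenever a positively weighted score function rates it higher than its competitors.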


   Input:
             1.   score files generated by the score functions
                      File name: [filestem].[sentence id].[n]best.txt e.g. jackielo.0000.1000best.txt
                      Format:       Each score file contains the scores of the n-best
                                    translations of one sentence, with the sentence id stated in the
                                    filename. Each line in the file holds the score of one translation.
                                    A score can be any real number; a larger number means a higher
                                    rank. The scores must be in the same order as the corresponding
                                    n-best translations.
                                    i.e.
                                    <score of 1st best translation of sentence 0 by baseline>
                                    <score of 2nd best translation of sentence 0 by baseline>
                                    …
                                    <score of nth best translation of sentence 0 by baseline>
             2.   weight file generated in the training module
   Output:        test result
                  Format: Each line represents the test result of one sentence. The first number is the id
                             of the baseline translation (starting from 0, i.e. 0 represents the 1st best
                             translation) that should rank first. The following numbers are the ranking
                             scores of each translation; the higher the score, the higher the rank.
                             i.e.
                             [id of 1st rank of sentence 0] [rank score of 1st best trans.] [r. score of 2nd best
                             trans] …
                             [id of 1st rank of sentence 1] [rank score of 1st best trans.] [r. score of 2nd best
                             trans] …
                             …
                             [id of 1st rank of sentence k] [rank score of 1st best trans.] [r. score of 2nd best
                             trans] …
   Command:       test.perl [no. of sentence] [n-best] [no. of score function] [score function filestem]+
                  [weight filename] [result filename]
   E.g.:          test.perl 92 1000 2 test_data/jackielo test_data/csjackie weight.txt test_result.txt
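
The result format above can be written and read back with a short Python sketch. It is an illustration rather than the test.perl code; the six-decimal precision and the names format_result_line and parse_result_line are assumptions.

```python
def format_result_line(scores):
    """Build '[id of top-ranked translation] [score of 1st best] [score of 2nd best] ...'."""
    top = max(range(len(scores)), key=lambda i: scores[i])
    return " ".join([str(top)] + ["%.6f" % s for s in scores])


def parse_result_line(line):
    """Return (top id, rank scores, hypothesis ids ordered from best to worst)."""
    fields = line.split()
    top = int(fields[0])
    scores = [float(x) for x in fields[1:]]
    # Recover the full re-ranked order from the per-hypothesis scores.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return top, scores, order
```

Because every line carries the rank score of each translation, the complete re-ranked order can be recovered from the result file, not only the top choice.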




                                Platform Requirements
   GNU C++ v3.2.2, Perl v5.8.0 on x86 architectures




                                    Background Survey
Hasan, Sasa, Evgeny Matusov, Arne Mauser, David Vilar, Richard Zens and Hermann Ney. 2006.
   Improving SMT by Using Multiple Translation Hypotheses. TC-STAR OpenLab on Speech
   Translation. Trento, Italy.
Och, Franz Josef. 2003. Minimum Error Rate Training in Statistical Machine Translation. In
   Proceedings of the ACL. Sapporo, Japan.
Ratnaparkhi, Adwait. 1997. A Simple Introduction to Maximum Entropy Models for Natural Language
   Processing. Technical Report 97-98, Institute for Research in Cognitive Science, University of
   Pennsylvania.
Ravichandran, Deepak, Eduard Hovy and Franz Josef Och. 2003. Statistical QA – Classifier vs.
   Re-ranker: What’s the difference? In Proceedings of the ACL Workshop on Multilingual
   Summarization and Question Answering.
                               Acknowledgement
     The BLEU score evaluation module is modified from the open source toolkit for statistical machine
translation provided at the JHU Summer Workshop 2006.
     The maximum entropy model (YASMET2) is provided by Franz Josef Och.

						