2006 Fall
COMP 526 Natural Language Processing
Course Project
Re-scoring / re-ranking of
n-best translation hypothesis lists
Instructed by
Prof. Dekai Wu
Human Language Technology Center
Department of Computer Science and Engineering
Hong Kong University of Science and Technology
December, 2006
Name of Student: Lo Chi Kiu Email: jackielo Student ID: 01161253
Architectural Descriptions
This project consists of two modules.
1. Training
The training module performs the following tasks:
i. Evaluate the n-best list against the reference translations using the BLEU score.
ii. Assign the actual rank to each translation in the n-best list using the BLEU scores obtained in
the previous task. (Label the translation with the highest BLEU score as the best.)
iii. Prepare the training data for the maximum entropy model: the class (the actual rank in the
n-best list) and the evidence (the scores from the additional score functions).
iv. Train the maximum entropy model on the training data prepared in the previous task.
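Tasks ii and iii can be sketched as follows. This is only an illustration in Python, not the project's Perl code: the function name build_training_event, the in-memory layout, and the tie-breaking rule are assumptions, and the event format actually passed to the maximum entropy trainer is not reproduced here.

```python
from typing import List, Tuple


def build_training_event(
    bleu_scores: List[float],
    feature_scores: List[List[float]],
) -> Tuple[int, List[List[float]]]:
    """Return (label, evidence) for one source sentence.

    bleu_scores[j]        BLEU score of hypothesis j (baseline order).
    feature_scores[i][j]  score given by score function i to hypothesis j.
    """
    n = len(bleu_scores)
    if n == 0:
        raise ValueError("empty n-best list")
    for column in feature_scores:
        if len(column) != n:
            raise ValueError("score list length does not match the n-best list")

    # Class label: index of the hypothesis with the highest BLEU score.
    # Assumption: ties go to the hypothesis the baseline ranked higher.
    label = max(range(n), key=lambda j: (bleu_scores[j], -j))

    # Evidence: one feature vector per hypothesis, one entry per score function.
    evidence = [[column[j] for column in feature_scores] for j in range(n)]
    return label, evidence
```

For example, with BLEU scores [0.2, 0.5, 0.5] the label is 1, the higher-ranked of the two tied hypotheses.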
Input:
1. reference translation files
File name: [filestem].[reference id] e.g. test_set.ref.txt.0
Format: Each reference translation file contains one set of translations for all
training sentences. Each line in the file represents one sentence.
The sentences should be sorted by sentence id.
i.e.
<reference translation of sentence 0>
<reference translation of sentence 1>
…
<reference translation of sentence k>
2. n-best translation files of the baseline system
File name: [filestem].[sentence id].[n]best.txt e.g. test_set.0000.1000best.txt
Format: Each n-best translation file contains the n-best translations of one
sentence with the sentence id stated in the filename. Each line in the
file represents one possible translation of that sentence. The
translations should be sorted according to the rank given by the
baseline.
i.e.
<1st best translation of sentence 0 by baseline system>
<2nd best translation of sentence 0 by baseline system>
…
<nth best translation of sentence 0 by baseline system>
3. score files generated by the score functions
File name: [filestem].[sentence id].[n]best.txt e.g. jackielo.0000.1000best.txt
Format: Each score file contains the scores generated for the n-best
translations of one sentence, with the sentence id stated in the
filename. Each line in the file holds the score of one translation.
A score can be any real number; a larger number indicates a higher
rank. The scores must be in the same order as the corresponding
n-best translations.
i.e.
<score of 1st best translation of sentence 0 by baseline>
<score of 2nd best translation of sentence 0 by baseline>
…
<score of nth best translation of sentence 0 by baseline>
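The score file convention above can be read with a few lines of code. The sketch below is in Python rather than the project's Perl, and the helper names score_file_path and read_scores are hypothetical. It assumes the sentence id is zero-padded to four digits, as the example filename jackielo.0000.1000best.txt suggests, and that blank lines can be ignored.

```python
from pathlib import Path
from typing import List


def score_file_path(filestem: str, sentence_id: int, n: int) -> Path:
    """Build [filestem].[sentence id].[n]best.txt for one sentence."""
    # Assumption: four-digit zero padding, inferred from the example filename.
    return Path(f"{filestem}.{sentence_id:04d}.{n}best.txt")


def read_scores(path: Path) -> List[float]:
    """Read one real-valued score per line, in baseline n-best order."""
    with open(path, encoding="utf-8") as handle:
        # Assumption: blank lines carry no score and are skipped.
        return [float(line) for line in handle if line.strip()]
```

A caller would typically check that the number of scores read equals the number of lines in the matching n-best translation file before using them.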
Output: The weight of each score function.
Command: train.perl [no. of sentence] [n-best] [no. of score function] [reference filestem] [n-best
filestem] [score function filestem]+ [weight filename]
E.g.: train.perl 92 1000 2 reference/test_set.ref.txt n-best/test_set test_data/jackielo
test_data/csjackie weight.txt
2. Testing
The testing module performs the following tasks:
i. Prepare the testing data for the maximum entropy model: the evidence (the scores from the
additional score functions).
ii. Test the maximum entropy model on the testing data prepared in the previous task.
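The re-ranking step can be illustrated with the usual log-linear form of a maximum entropy model. This sketch assumes one weight per score function and a ranking score equal to the normalised exponential of the weighted sum of scores. The report does not state the exact score computed by test.perl, so the function rerank below is a hypothetical stand-in written in Python, not the project's implementation.

```python
import math
from typing import List, Tuple


def rerank(weights: List[float], features: List[List[float]]) -> Tuple[int, List[float]]:
    """Return (index of the top-ranked hypothesis, ranking score per hypothesis).

    features[j][i] is the score given by score function i to hypothesis j.
    """
    if not features:
        raise ValueError("empty n-best list")
    for row in features:
        if len(row) != len(weights):
            raise ValueError("one weight is needed per score function")

    # Weighted sum of the score functions for each hypothesis.
    raw = [sum(w * f for w, f in zip(weights, row)) for row in features]

    # Normalise with a numerically stable softmax so the scores sum to 1.
    shift = max(raw)
    exps = [math.exp(r - shift) for r in raw]
    total = sum(exps)
    scores = [e / total for e in exps]

    # A higher score means a higher rank; ties keep the baseline order.
    best = max(range(len(scores)), key=lambda j: (scores[j], -j))
    return best, scores
```

Because the normalisation is monotonic, the resulting order is the same as ranking by the raw weighted sum; the normalised form only makes the scores comparable across sentences.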
Input:
1. score files generated by the score functions
File name: [filestem].[sentence id].[n]best.txt e.g. jackielo.0000.1000best.txt
Format: Each score file contains the scores generated for the n-best
translations of one sentence, with the sentence id stated in the
filename. Each line in the file holds the score of one translation.
A score can be any real number; a larger number indicates a higher
rank. The scores must be in the same order as the corresponding
n-best translations.
i.e.
<score of 1st best translation of sentence 0 by baseline>
<score of 2nd best translation of sentence 0 by baseline>
…
<score of nth best translation of sentence 0 by baseline>
2. weight file generated in the training module
Output: test result
Format: Each line represents the test result of one sentence. The first number is the id
of the baseline translation (starting from 0, i.e. 0 represents the 1st best translation)
that should be ranked first. The numbers that follow are the ranking scores of
each translation; the higher the score, the higher the rank.
i.e.
[id of 1st rank of sentence 0] [rank score of 1st best trans.] [r. score of 2nd best
trans] …
[id of 1st rank of sentence 1] [rank score of 1st best trans.] [r. score of 2nd best
trans] …
…
[id of 1st rank of sentence k] [rank score of 1st best trans.] [r. score of 2nd best
trans] …
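A line of the result file described above could be written and read back as follows. This is a Python sketch with hypothetical helper names. The whitespace separator follows the format shown above, while the six-decimal number formatting is an assumption.

```python
from typing import List, Tuple


def format_result_line(best_id: int, rank_scores: List[float]) -> str:
    """Join the id of the top-ranked translation and all ranking scores."""
    # Assumption: fields are space-separated and scores use six decimals.
    return " ".join([str(best_id)] + [f"{score:.6f}" for score in rank_scores])


def parse_result_line(line: str) -> Tuple[int, List[float]]:
    """Split a result line into (id of the top-ranked translation, ranking scores)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty result line")
    return int(fields[0]), [float(field) for field in fields[1:]]
```

The id is 0-based, so a leading 0 means the baseline's 1st best translation stays on top after re-ranking.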
Command: test.perl [no. of sentence] [n-best] [no. of score function] [score function filestem]+
[weight filename] [result filename]
E.g.: test.perl 92 1000 2 test_data/jackielo test_data/csjackie weight.txt test_result.txt
Platform Requirements
GNU C++ v3.2.2 and Perl v5.8.0 on x86 architectures
Background Survey
Hasan, Sasa, Evgeny Matusov, Arne Mauser, David Vilar, Richard Zens and Hermann Ney. 2006.
Improving SMT by Using Multiple Translation Hypotheses. TC-STAR OpenLab on Speech
Translation. Trento, Italy.
Och, Franz Josef. 2003. Minimum Error Rate Training in Statistical Machine Translation. In
Proceedings of the ACL. Sapporo, Japan.
Ratnaparkhi, Adwait. 1997. A Simple Introduction to Maximum Entropy Models for Natural Language
Processing. Technical Report 97-98, Institute for Research in Cognitive Science, University of
Pennsylvania.
Ravichandran, Deepak, Eduard Hovy and Franz Josef Och. 2003. Statistical QA – Classifier vs.
Re-ranker: What’s the difference? In Proceedings of the ACL Workshop on Multilingual
Summarization and Question Answering.
Acknowledgement
The BLEU score evaluation module is modified from the open source toolkit for statistical machine
translation provided at the JHU Summer Workshop 2006.
The maximum entropy model (YASMET2) is provided by Franz Josef Och.