# Machine-Translation by lanyuehua

```
Machine Translation

A Presentation by:
Julie Conlonova,
Rob Chase,
and Eric Pomerleau
Overview

Language Alignment System
Datasets
Sentence-aligned sets for training (e.g., the
Hansards Corpus, European Parliament
Proceedings Parallel Corpus)
A word-aligned set for testing and evaluation,
to measure accuracy and precision
Decoding
Language Alignment

Goal: Produce a word-aligned set from
a sentence-aligned dataset
First step on the road toward Statistical
Machine Translation
Example Problem:
The motion to adjourn the House is now
La motion portant que la Chambre s'ajourne
IBM Models 1 and 2
-Kevin Knight, A Statistical MT Tutorial Workbook, 1999

 Each capable of being used to produce a
word-aligned dataset separately.
 EM Algorithm
 Model 1 produces T-values based on
normalized fractional counting of
corresponding words.
 Additionally, Model 2 uses A-values for
“reverse distortion probabilities” –
probabilities based on the positions of the
words
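
(Sketch in the notation of Knight's workbook; the count symbol c is
introduced here for illustration, and the NULL word is ignored. For a
sentence pair with English words e_1..e_l and foreign words f_1..f_m,
one EM iteration of Model 1 can be written as:

  E-step:  c(f_j, e_i) \leftarrow c(f_j, e_i) + \frac{t(f_j \mid e_i)}{\sum_{i'=1}^{l} t(f_j \mid e_{i'})}
  M-step:  t(f \mid e) = \frac{c(f, e)}{\sum_{f'} c(f', e)}

Model 2 also weights each link by position, so the E-step fraction becomes

  \frac{t(f_j \mid e_i) \, a(i \mid j, l, m)}{\sum_{i'=1}^{l} t(f_j \mid e_{i'}) \, a(i' \mid j, l, m)}

and the A-values are re-estimated from the same fractional counts.)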
Training Data
 European Parliament Proceedings Parallel
Corpus 1996-2003
 Aligned Languages:
English - French
English - Dutch
English - Italian
English - Finnish
English - Portuguese
English - Spanish
English - Greek
Training Data cont.

Eliminated
Misaligned sentences
Sentences with 50 or more words
XML tags
Symbols and numerical characters other than
commas and periods
Ideally…

http://www.cs.berkeley.edu/~klein/cs294-5
Bypassing Interlingua: Models I-III

Variables contributing to the probability
of a sentence:
Correlation between words in the
source/target languages
Fertility of a word
Correlation between order of words in
source sentence and order of words
in target
A Translation Matrix
        Rob    Cat    is     Dog

Rob     1      0      0      0
Gato    0      1      0      0
es      0      0      .5     0
esta    0      0      .5     0
Perro   0      0      0      1
Building the Translation Matrix: Starting
from alignments

Find the sentence alignment
If a word in the source aligns with a word
in the target, then increment the
translation matrix.
Normalize the translation matrix
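
Worked example (the raw counts here are assumed for illustration): if "is"
was aligned once with "es" and once with "esta", its counts are
(es: 1, esta: 1), and dividing by the total gives

  t(\text{es} \mid \text{is}) = \frac{1}{1 + 1} = 0.5, \qquad t(\text{esta} \mid \text{is}) = \frac{1}{1 + 1} = 0.5

which matches the .5 entries in the matrix on the previous slide.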
Can’t find alignments

Most sentences in the Hansards corpus
are 60 words long, and many run to
over 100.
100^100 possible alignments for a pair of 100-word sentences
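
Worked count (assuming, as in Models 1 and 2, that each of the m words on
one side links to exactly one of the l words on the other, and ignoring NULL):

  \#\text{alignments} = l^{m}, \qquad 100^{100} = 10^{200}

Model 1 never has to enumerate them, because the sum over alignments
factors word by word:

  \sum_{a} \prod_{j=1}^{m} t(f_j \mid e_{a_j}) = \prod_{j=1}^{m} \sum_{i=1}^{l} t(f_j \mid e_i)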
Counting

Rob is a boy.     Rob es nino.
Rob is tall.      Rob es alto.
Eric is tall.     Eric es alto.
…                    …
Base counts on co-occurrence, weighting
based on sentence length.
Iterative Convergence
 Use the Expectation-Maximization (EM)
algorithm
 Creates the translation matrix

        Rob    is     tall   boy
Rob     .66    .33    .25    .25
es      .30    .66    .25    .25
alto    .2     .05    .5     0
nino    .2     .05    0      .5
Distorting the Sentence

Word order changes between languages
How is a sentence with 2 words distorted?
How is a sentence with 3 words distorted?
How is a sentence with       …

To keep track of this information we use…
A tesseract!

dictionary)
This could be a problem if there
are more than 100 words in a
sentence.
100x100x100x100 = too big for
RAM and takes too much time
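
Back-of-the-envelope size (assuming the tesseract is the table of A-values
a(i | j, l, m), one 8-byte float per entry, and a 100-word cap on both
sentence lengths):

  100 \times 100 \times 100 \times 100 = 10^{8} \text{ entries} \approx 8 \times 10^{8} \text{ bytes} \approx 800 \text{ MB}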

 “The translation process can be
described simply as:
1. Decoding the meaning of the source text, and
2. Re-encoding this meaning in the target
language.”
- “Translation Process”, Wikipedia, May 2006
Decoding

How to go from the T-matrix and A-matrix
to a word alignment?

There are several approaches…
Viterbi

If only doing alignment, much smaller
memory and time requirements.
Returns optimal path.

T-Matrix probabilities function as the
“emission” matrix
A-Matrix probabilities concerned with
the positioning of words
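
Sketch (a general property of Models 1 and 2, not necessarily how this
system was coded): when both sentences are given, each word chooses its
link independently, so the best alignment can be found one position at a
time:

  \hat{a}_j = \arg\max_{1 \le i \le l} \; t(f_j \mid e_i) \, a(i \mid j, l, m), \qquad j = 1, \dots, m

That costs on the order of l x m table lookups per sentence pair.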
Decoding as a Translator

Without supplying a translated sentence
to the program, it is capable of being a
stand-alone translator instead of a word
aligner.

However, while the Viterbi algorithm with
pruning runs quickly when aligning, its
run time skyrockets when translating.
Greedy Hill Climbing
Knight & Koehn, What’s New in Statistical Machine Translation, 2003

Best first search
2-step look ahead to avoid getting stuck in
most probable local maxima
Beam Search
Knight & Koehn, What’s New in Statistical Machine Translation, 2003

Optimization of Best First Search with
heuristics and “beam” of choices
“beam” width
Other Decoding Methods
Knight & Koehn, What’s New in Statistical Machine Translation, 2003

Finite State Transducer
Mapping between languages based on a finite
automaton
Parsing
String to Tree Model
Problem: One to Many

Necessary to take all alignments over a
certain probability in order to capture the
“probability that e has fertility at least a
given value”

Al-Onaizan, Curin, Jahr, et al., Statistical Machine Translation, 1999
Results

Study done in 2003 on word alignment
error rates in Hansards corpus:
Model 2 –
29.3% on 8K training sentence pairs
19.5% on 1.47M training sentence pairs
Optimized Model 6 –
20.3% on 8K training sentence pairs
8.7% on 1.47M training sentence pairs
Och and Ney, A Systematic Comparison of Various Statistical Alignment
Models, 2003
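
For reference, Och and Ney define alignment error rate (AER) over a
proposed alignment A, sure gold links S, and possible gold links P
(with S a subset of P):

  \text{AER}(S, P; A) = 1 - \frac{|A \cap S| + |A \cap P|}{|A| + |S|}

Lower is better; a perfect alignment scores 0.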
Expected Accuracy

70% overall
Language performance:
 Dutch
 French
 Italian, Spanish, Portuguese
 Greek
 Finnish
Possible Future Work

 Given more time, we would have implemented IBM
Model 3
 Additionally uses n, p, and d parameters for weighted
alignments:
 n, fertility: the number of words produced by one word
 d, distortion
 p, a parameter for spurious words, those with no direct
counterpart in the other sentence
 Invokes Model 2 for scoring
Another Possible Translation Scheme

Example-Based Machine Translation
Translation-by-Analogy
Can sometimes achieve better than the “gist”
translations from other models
Why Is Improving Machine
Translation Necessary?
A Chinese to English Translation
The End
Are there any