In this section, we describe various issues in the efficient implementation of a search algorithm for the alignment template approach.
Search hypothesis representation
A very important design decision in the implementation is the representation of a hypothesis. Theoretically, it would be possible to represent a search hypothesis only by the associated decision and a back-pointer to the previous hypothesis. Yet, this would be a very inefficient representation for the operations that have to be performed in search. The hypothesis representation should contain all the information needed to perform the search computations efficiently, but nothing more, so that memory consumption stays small.
In search, we produce partial hypotheses, each of which contains the following information:
1. $e$: the final target word,
2. $h$: the state of the language model,
3. $c = c_1^J$: the coverage vector representing the already covered positions of the source sentence ($c_j = 1$ means the position $j$ is covered, $c_j = 0$ the position $j$ is not covered),
4. $Z$: a reference to the alignment template instantiation which produced the final target word,
5. $i$: the position of the final target word in the alignment template instantiation,
6. $Q$: the accumulated probability of all previous decisions,
7. $B$: a reference to the previous partial hypothesis.
Using this representation, we can perform the following operations very efficiently:
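• Checking whether a specific alignment template instantiation can be used to extend a hypothesis. To do that, we check whether the source positions of the alignment template instantiation are still free in the coverage vector of the hypothesis.
• Checking whether a hypothesis is final. To do that, we check whether the coverage vector contains no uncovered position. If the coverage vector is internally represented as a bit vector, this operation can be implemented very efficiently.
• Checking whether two hypothesis extensions can be recombined. The criterion to recombine two hypotheses $H_1 = (e_1, h_1, c_1, Z_1, i_1)$ and $H_2 = (e_2, h_2, c_2, Z_2, i_2)$, each given by its final target word $e$, language model state $h$, coverage vector $c$, alignment template instantiation $Z$, and position $i$ within $Z$, is:
\begin{align*}
& h_1 \oplus e_1 = h_2 \oplus e_2 && \text{(identical language model state)} \\
\wedge\; & c_1 = c_2 && \text{(identical coverage vector)} \\
\wedge\; & \big( (Z_1 = Z_2 \wedge i_1 = i_2) && \text{(alignment template instantiation is identical)} \\
& \;\vee\; (i_1 = I(Z_1) \wedge i_2 = I(Z_2)) \big) && \text{(alignment template instantiation finished)}
\end{align*}
Here, $h \oplus e$ denotes the new language model state, which is obtained if the word $e$ is used to extend the language model state $h$, and $I(Z)$ denotes the number of target words of the alignment template instantiation $Z$.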
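The three operations above can be illustrated with a minimal C++ sketch. It is not the implementation used in this work: the type and function names (`Hypothesis`, `TemplateInstance`, `LmState`, `canExtend`, `isFinal`, `canRecombine`), the fixed maximum sentence length, and the modelling of the language model state as a two-word history (as for a trigram model) are all assumptions made for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <memory>

// Assumed upper bound on the source sentence length (illustrative only).
constexpr std::size_t kMaxSourceLen = 128;
using Coverage = std::bitset<kMaxSourceLen>;

// Hypothetical language model state: the last two words of the history.
struct LmState { int w1; int w2; };
inline bool operator==(const LmState& a, const LmState& b) {
    return a.w1 == b.w1 && a.w2 == b.w2;
}
// New language model state obtained when word e extends state h.
inline LmState extendLm(const LmState& h, int e) { return {h.w2, e}; }

// Hypothetical stand-in for an alignment template instantiation.
struct TemplateInstance {
    Coverage sourcePositions;  // source positions covered by the instantiation
    std::size_t targetLength;  // number of target words it produces
};

// Partial hypothesis with the seven fields listed in the text.
struct Hypothesis {
    int word;                                // final target word (vocabulary id)
    LmState lm;                              // language model state
    Coverage coverage;                       // covered source positions
    const TemplateInstance* tmpl;            // instantiation that produced the word
    std::size_t pos;                         // 1-based position of the word in tmpl
    double logProb;                          // accumulated log-probability
    std::shared_ptr<const Hypothesis> prev;  // back-pointer
};

// An instantiation can extend a hypothesis if none of its source
// positions is already covered: a single bitwise AND.
inline bool canExtend(const Hypothesis& h, const TemplateInstance& z) {
    return (h.coverage & z.sourcePositions).none();
}

// A hypothesis is final if all J source positions are covered
// (assumes only bits below J are ever set).
inline bool isFinal(const Hypothesis& h, std::size_t J) {
    return h.coverage.count() == J;
}

// Recombination criterion: same extended language model state, same
// coverage, and either the same instantiation at the same position
// or both instantiations finished.
inline bool canRecombine(const Hypothesis& a, const Hypothesis& b) {
    assert(a.tmpl != nullptr && b.tmpl != nullptr);
    if (!(extendLm(a.lm, a.word) == extendLm(b.lm, b.word))) return false;
    if (a.coverage != b.coverage) return false;
    const bool identical = (a.tmpl == b.tmpl) && (a.pos == b.pos);
    const bool finished = (a.pos == a.tmpl->targetLength) &&
                          (b.pos == b.tmpl->targetLength);
    return identical || finished;
}
```

With a bit-vector coverage, both `canExtend` and `isFinal` reduce to a few word-level operations, which is the efficiency argument made in the text.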
In beam search, we compare hypotheses that cover different parts of the input sentence. This makes a direct comparison of their probabilities problematic. Therefore, we integrate an admissible estimate of the remaining probability needed to arrive at a complete translation. Details of the heuristic function for the alignment templates are described in Section 6.4.
Efficient search
In the following, we discuss various methods that significantly speed up the search. A significantly faster search is obtained using the direct search criterion of Eq. 6.29 instead of the Bayes approach. Here, both the language model and the translation model predict target language words. Figure 6.5 shows a graphical representation of the resulting dependencies. Hence, for each extension, we can directly compute the translation model contribution. The probability of extending a hypothesis with one target language word can then directly include the translation model contribution. This allows a more efficient pruning of hypotheses.
Figure 6.5: Dependencies of a log-linear combination of a left-to-right language model and the direct alignment template translation model.
The probability of starting a new alignment template (Eq. 6.32) is:
[equation not recoverable from the extracted text] (6.35)
The probability to extend an alignment template (Eq. 6.33) is:
[equation not recoverable from the extracted text] (6.36)
An even more efficient search can be obtained when a segmentation of the input sentence is performed beforehand. This can be done by determining the sequence of phrases for which the most probable alignment templates exist. We search for a sequence of phrases
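$\tilde{f}_1 \circ \cdots \circ \tilde{f}_K = f_1^J$ with:
$$\operatorname*{argmax}_{\tilde{f}_1 \circ \cdots \circ \tilde{f}_K = f_1^J} \; \prod_{k=1}^{K} \max_{z} \, p(z \mid \tilde{f}_k) \qquad (6.37)$$
This is computed efficiently by dynamic programming. This approximation might be useful in applications where very efficient search is important. All results in this thesis are obtained without performing this approximation.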
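The dynamic programming step for the segmentation can be sketched as follows. This is a minimal illustration, not the procedure used in this work: the function name `bestSegmentation` and the input table `score` are hypothetical, and it is assumed that `score[i][j]` already holds the log-probability of the most probable alignment template for the source phrase spanning positions i+1 to j (negative infinity if no template exists).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the phrase boundaries 0 = b_0 < b_1 < ... < b_K = J of the
// segmentation that maximizes the sum of the per-phrase log-scores.
// score is a (J+1) x (J+1) table; only entries with i < j are read.
// An empty vector is returned if no complete segmentation exists.
inline std::vector<std::size_t> bestSegmentation(
        const std::vector<std::vector<double>>& score) {
    assert(!score.empty());
    const double kNegInf = -std::numeric_limits<double>::infinity();
    const std::size_t J = score.size() - 1;

    std::vector<double> best(J + 1, kNegInf);  // best[j]: best score of f_1..f_j
    std::vector<std::size_t> back(J + 1, 0);   // back[j]: start of the last phrase
    best[0] = 0.0;                             // empty prefix

    for (std::size_t j = 1; j <= J; ++j) {
        for (std::size_t i = 0; i < j; ++i) {
            if (best[i] == kNegInf || score[i][j] == kNegInf) continue;
            const double cand = best[i] + score[i][j];
            if (cand > best[j]) {
                best[j] = cand;
                back[j] = i;
            }
        }
    }

    std::vector<std::size_t> bounds;
    if (best[J] == kNegInf) return bounds;     // no segmentation covers the sentence
    for (std::size_t j = J; j > 0; j = back[j]) bounds.push_back(j);
    bounds.push_back(0);
    std::reverse(bounds.begin(), bounds.end());
    return bounds;
}
```

The two nested loops give a running time quadratic in the sentence length; restricting the maximum phrase length would make it linear.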
An additional important element of the search algorithm is an efficient and early garbage collection of those hypotheses that are pruned away. This is important to keep the dynamic memory requirement as small as possible. We use a garbage collection algorithm based on reference counting. Each hypothesis contains an additional integer that counts the number of pointers and back-pointers that refer to this hypothesis. If this count reaches zero, the hypothesis can be freed. Hence, unnecessary search hypotheses are removed from memory as early as possible. In C++, this type of garbage collection can be performed efficiently and safely using the concept of smart pointers [Koenig & Moo 97].
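The effect of reference counting on pruned hypotheses can be illustrated with the standard `std::shared_ptr`. This is a sketch under assumptions, not the smart pointer class used in this work: the names `Hyp` and `extendHyp` and the `live` counter are hypothetical, and `std::shared_ptr` stands in for whatever reference-counted pointer the actual implementation uses.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Minimal hypothesis node; the static counter exists only to make
// deallocation observable in this illustration.
struct Hyp {
    static int live;                   // number of currently allocated nodes
    int word;                          // decision associated with this node
    std::shared_ptr<const Hyp> prev;   // reference-counted back-pointer

    Hyp(int w, std::shared_ptr<const Hyp> p) : word(w), prev(std::move(p)) {
        ++live;
    }
    ~Hyp() { --live; }
    Hyp(const Hyp&) = delete;
    Hyp& operator=(const Hyp&) = delete;
};
int Hyp::live = 0;

// Creates a new hypothesis that extends prev (prev may be null for the root).
inline std::shared_ptr<const Hyp> extendHyp(
        const std::shared_ptr<const Hyp>& prev, int word) {
    return std::make_shared<Hyp>(word, prev);
}
```

When the last pointer to a pruned hypothesis is released, the hypothesis is destroyed, its back-pointer is released in turn, and every predecessor that is no longer shared with a surviving hypothesis is freed as well. Predecessors that are still referenced by other hypotheses stay alive.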
6.4 Heuristic Function
To improve the comparability of search hypotheses, we introduce heuristic functions. An admissible heuristic function optimistically estimates the probability of reaching the goal node from