Bettina Schrader
Room: 31/431, Phone: +49-(0)-541-969-2706

Developing a hybrid (word) alignment system using linguistic rules and statistical similarity measures
Advised by Peter Bosch and Helmar Gust
Standard word alignment systems use statistical means to determine correspondences between two languages, i.e. they compute which word of a source language may be translated by which word of the target language.
Statistical similarity measures establish such word correspondences from word occurrence patterns: two words "probably mean" the same if, for example, their frequencies are similar or they appear at similar positions in the text.
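As an illustration of such a measure, the following minimal sketch scores word pairs by how often they occur in the same aligned sentence pair. It assumes a sentence-aligned corpus and uses the Dice coefficient, which is chosen here only as a common example and is not necessarily the measure used in the dissertation. The function name `dice_scores` and the toy corpus are invented for this example.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs):
    """Return {(src_word, tgt_word): Dice score} for a sentence-aligned corpus.

    Dice(s, t) = 2 * cooc(s, t) / (freq(s) + freq(t)), where all counts are
    numbers of sentence pairs (each word is counted once per sentence).
    """
    src_freq = Counter()
    tgt_freq = Counter()
    pair_freq = Counter()
    for src_sentence, tgt_sentence in sentence_pairs:
        # Naive tokenization on whitespace; a real system would do better.
        src_words = set(src_sentence.lower().split())
        tgt_words = set(tgt_sentence.lower().split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        pair_freq.update(product(src_words, tgt_words))
    return {
        (s, t): 2 * n / (src_freq[s] + tgt_freq[t])
        for (s, t), n in pair_freq.items()
    }


# Invented toy corpus (German-English), for illustration only.
corpus = [
    ("das Haus ist klein", "the house is small"),
    ("das Haus ist alt", "the house is old"),
    ("der Hund ist klein", "the dog is small"),
]
scores = dice_scores(corpus)
print(scores[("haus", "house")])  # words that always co-occur score highest
print(scores[("haus", "small")])  # words that only sometimes co-occur score lower
```

A high score only says that two words tend to occur together, which is why such correspondences hold "probably" rather than with certainty.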
While word occurrence patterns are useful for detecting word correspondences or "alignments", they are far from the only source of information. Every utterance in every language is a highly structured entity that can be described on various levels of linguistic description, involving information on e.g. word category, morphological features and syntactic constituents. This information can be used to establish rules of translational equivalence, which in turn serve to determine more exact word correspondences, i.e. that two words "do mean" the same.
The task of my dissertation is to develop an alignment architecture that uses both forms of information: on the one hand, linguistic rules are defined and used to determine word correspondences; on the other hand, statistical similarity measures are used for all cases for which rules are not (yet) given.
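The division of labour described above can be sketched as follows. This is a hypothetical illustration, not the architecture developed in the dissertation: the function `align_hybrid`, the rule format (a function over two `(word, pos_tag)` tokens), the threshold and the toy data are all assumptions made for the example.

```python
def align_hybrid(src_tokens, tgt_tokens, rules, similarity, threshold=0.5):
    """Align each source token: linguistic rules first, statistics as fallback.

    src_tokens, tgt_tokens: lists of (word, pos_tag) tuples.
    rules: functions (src_token, tgt_token) -> bool (hypothetical rule format).
    similarity: function (src_word, tgt_word) -> float.
    Returns a list of (src_index, tgt_index, origin) links.
    """
    links = []
    for i, src in enumerate(src_tokens):
        # 1) Try the linguistic rules.
        rule_hits = [
            j for j, tgt in enumerate(tgt_tokens)
            if any(rule(src, tgt) for rule in rules)
        ]
        if rule_hits:
            links.extend((i, j, "rule") for j in rule_hits)
            continue
        # 2) No rule applies: fall back to the statistical measure.
        best_j = max(
            range(len(tgt_tokens)),
            key=lambda j: similarity(src[0], tgt_tokens[j][0]),
            default=None,
        )
        if best_j is not None and similarity(src[0], tgt_tokens[best_j][0]) >= threshold:
            links.append((i, best_j, "statistical"))
    return links


def same_proper_noun(src, tgt):
    """Toy rule (assumption): identical proper nouns translate each other."""
    return src[1] == "PROPN" and tgt[1] == "PROPN" and src[0] == tgt[0]


# Invented similarity scores standing in for a real statistical measure.
toy_scores = {("liest", "reads"): 0.9}


def toy_similarity(src_word, tgt_word):
    return toy_scores.get((src_word, tgt_word), 0.0)


src = [("Anna", "PROPN"), ("liest", "VERB")]
tgt = [("Anna", "PROPN"), ("reads", "VERB")]
links = align_hybrid(src, tgt, [same_proper_noun], toy_similarity)
print(links)
```

Each link records whether it came from a rule or from the statistical fallback, so it stays visible which correspondences still lack a rule.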