Vivien LE POUPON, Taro Watanabe
Implementation of EM-IS Algorithm
for Machine Translation
Abstract: We set out to implement the EM-IS algorithm to build a lexicon model (and then
to use it in conjunction with an IBM model). We followed a Maximum Entropy (ME)
approach, but extended it to Latent ME, because standard ME is limited by the scarcity of
empirical data. The principle is to embed the iterative scaling loop in an EM
procedure in order to determine the weighting parameters associated with the different
features, so that these parameters naturally match the information contained in the
corpus. We developed this method using a corpus of English-Japanese pairs.
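
The nesting described above (an iterative scaling loop inside an EM loop) can be illustrated with a minimal sketch on a toy discrete model. This is not the authors' implementation: the names `joint`, `em_is`, `F` and `p_obs` are hypothetical, the observed variable y and hidden variable z range over small finite sets, and Generalized Iterative Scaling (GIS) is assumed as the iterative scaling variant, since the abstract does not say which one was used.

```python
import numpy as np

def joint(lam, F):
    """Joint p(y, z) proportional to exp(lam . f(y, z)) on a finite grid."""
    scores = F @ lam                    # shape (Ny, Nz)
    p = np.exp(scores - scores.max())   # subtract the max for numerical stability
    return p / p.sum()

def em_is(F, p_obs, em_iters=50, is_iters=20):
    """Toy EM-IS loop (illustrative assumption, not the paper's code).

    F     : array (Ny, Nz, K) of non-negative feature values f_k(y, z)
    p_obs : empirical distribution over the observed variable y
    """
    # GIS assumes the features sum to a constant C, so append a slack feature.
    C = F.sum(axis=2).max()
    slack = C - F.sum(axis=2, keepdims=True)
    F = np.concatenate([F, slack], axis=2)
    lam = np.zeros(F.shape[2])
    history = []
    for _ in range(em_iters):
        # E-step: complete the observed data with the posterior p(z | y)
        # and compute the target feature expectations.
        p = joint(lam, F)
        posterior = p / p.sum(axis=1, keepdims=True)
        q = p_obs[:, None] * posterior
        target = np.einsum("yz,yzk->k", q, F)
        # M-step: a few GIS updates pull the model expectations toward the targets.
        for _ in range(is_iters):
            model = np.einsum("yz,yzk->k", joint(lam, F), F)
            ok = (target > 0) & (model > 0)   # skip degenerate features
            lam[ok] += np.log(target[ok] / model[ok]) / C
        # Log-likelihood of the observed marginal, recorded after each EM round.
        marginal = joint(lam, F).sum(axis=1)
        history.append(float(p_obs @ np.log(marginal)))
    return lam, history

# Toy example: 3 observed values, 2 hidden values, 3 indicator features.
F = np.zeros((3, 2, 3))
F[0, 0, 0] = 1.0     # fires when y == 0 and z == 0
F[1:, 1, 1] = 1.0    # fires when y in {1, 2} and z == 1
F[2, :, 2] = 1.0     # fires when y == 2
p_obs = np.array([0.5, 0.3, 0.2])
lam, history = em_is(F, p_obs)
```

In this sketch the recorded log-likelihood should not decrease from one EM round to the next, which gives a simple sanity check. A real lexicon model would replace the dense array `F` with sparse features over word pairs.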