張樹武
Distance-related Unit Association Maximum Entropy (DUAME) Language Modeling
Abstract: In this report, we propose distance-related unit association maximum
entropy (DUAME) language modeling. Compared with conventional N-gram modeling,
DUAME modeling has three major characteristics: 1) Instead of using longer
N-grams, it models an event (a unit subsequence) through the exponential
co-occurrence of full-distance unit association (UA) features. It is therefore
functionally comparable to a higher-order N-gram model. 2) DUAME modeling can
smooth the distribution of a partially unobserved event through the exponential
co-occurrence of decreasing UA features. It predicts such events more accurately
than conventional backoff or interpolation smoothing in N-gram modeling.
3) Because every UA feature in a DUAME model involves only two units, storing
the feature parameters requires much less memory, which makes it more practical
to exploit longer-distance language correlations than with higher-order N-gram
features. Preliminary experimental results show that DUAME modeling is very
useful for improving current N-gram language modeling in speech recognition.
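
To make the idea of two-unit, distance-indexed features more concrete, the following is a minimal sketch of a log-linear (maximum entropy style) predictor, not the report's actual implementation. It assumes each UA feature is a triple (distance, earlier unit, predicted unit) with one weight, and that a feature with no stored weight simply contributes nothing to the exponent. The function name `duame_prob`, the `max_dist` parameter, the vocabulary, and all weight values are hypothetical and hand-picked for illustration; no training procedure is shown.

```python
import math

def duame_prob(history, vocab, weights, max_dist=3):
    """Return P(w | history) for each w in vocab under a log-linear model
    whose features are (distance, earlier_unit, predicted_unit) triples.
    This feature layout is an assumption made for illustration."""
    scores = {}
    for w in vocab:
        total = 0.0
        # Look back up to max_dist units; d = 1 is the immediately preceding unit.
        for d in range(1, min(max_dist, len(history)) + 1):
            prev = history[-d]
            # A feature without a stored weight adds 0 to the exponent,
            # so only the remaining pairwise features shape the score.
            total += weights.get((d, prev, w), 0.0)
        scores[w] = math.exp(total)
    z = sum(scores.values())  # normalization over the vocabulary
    return {w: s / z for w, s in scores.items()}

# Hypothetical weights: each key involves only two units plus a distance.
weights = {
    (1, "speech", "recognition"): 1.2,
    (2, "automatic", "recognition"): 0.7,
    (1, "speech", "model"): 0.1,
}
vocab = ["recognition", "model", "entropy"]

probs = duame_prob(["automatic", "speech"], vocab, weights)
partial = duame_prob(["robust", "speech"], vocab, weights)  # distance-2 pair unseen
```

In this sketch every stored parameter is keyed by a pair of units and a distance, so the table grows with the number of observed pairs per distance rather than with the number of observed long unit sequences. When the distance-2 pair is missing, as in `partial`, the distance-1 feature still determines the prediction.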