Takashi Komori and Shigeru Katagiri
GPD Training of Dynamic
Programming-Based Speech Recognizers
Abstract: Although many pattern classifiers based on artificial neural networks have been
vigorously studied, they are still inadequate for classifying dynamic
(variable- and unspecified-duration) speech patterns. To cope with this problem, the
generalized probabilistic descent method (GPD) has recently been proposed. GPD not only
allows one to train a discriminative system classifying dynamic patterns, but also possesses
a remarkable advantage, namely learning optimality guaranteed in the sense of
probabilistic descent search. A practical implementation of this theory, however, remains to
be evaluated. In this light, we focus on evaluating GPD in the design of a widely used
speech recognizer based on dynamic time warping (DTW) distance measurement. We also
show that the design algorithm appraised in this paper can be considered a new version of
learning vector quantization that incorporates dynamic programming.
Experimental evaluation results on syllable- and phoneme-classification tasks clearly
demonstrate GPD's superiority.
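As background for the abstract, the DTW distance measurement it refers to can be sketched with the classic dynamic-programming recursion below. This is an illustrative sketch only, not code from the paper; the function name and the choice of a Euclidean local distance are assumptions.

```python
import math

def dtw_distance(x, y):
    """Illustrative DTW distance between two feature sequences
    (lists of frame vectors), computed by dynamic programming."""
    n, m = len(x), len(y)
    INF = float("inf")
    # D[i][j]: minimal accumulated distance aligning x[:i] with y[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # local Euclidean distance between the two frames
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x[i - 1], y[j - 1])))
            # DP recursion: best of insertion, match, and deletion moves
            D[i][j] = d + min(D[i - 1][j], D[i - 1][j - 1], D[i][j - 1])
    return D[n][m]
```

Because the warping path can repeat frames, a time-stretched copy of a sequence still attains zero distance, which is what makes the measure suitable for variable-duration speech patterns.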