TR-A-0130: 1992.1.17

Takashi Komori and Shigeru Katagiri

GPD Training of Dynamic Programming-Based Speech Recognizers

Abstract: Although many pattern classifiers based on artificial neural networks have been vigorously studied, they remain inadequate from the viewpoint of classifying dynamic (variable- and unspecified-duration) speech patterns. To cope with this problem, the generalized probabilistic descent method (GPD) has recently been proposed. GPD not only allows one to train a discriminative system for classifying dynamic patterns, but also possesses a remarkable advantage: learning optimality guaranteed in the sense of probabilistic descent search. A practical implementation of this theory, however, remains to be evaluated. In this light, we focus on evaluating GPD in the design of a widely used speech recognizer based on dynamic time warping (DTW) distance measurement. We also show that the design algorithm appraised in this paper can be considered a new version of learning vector quantization (LVQ) incorporating dynamic programming. Experimental results on syllable and phoneme classification tasks clearly demonstrate GPD's superiority.
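
To make the idea concrete, the following is a minimal illustrative sketch, not the paper's exact formulation: it pairs a standard DTW distance with a probabilistic-descent-style update in which a sigmoid-smoothed loss of a simple misclassification measure (own-class distance minus the best rival distance) is reduced by adjusting reference templates along the optimal warping paths. The function names (dtw, dtw_grad, gpd_step), the squared-Euclidean local distance, the sigmoid smoothing, and the fixed learning rate are assumptions chosen for brevity; the report's actual misclassification measure, loss, and learning-rate schedule may differ.

import numpy as np

def dtw(x, r):
    """DTW distance between test pattern x (T x D frames) and reference r (J x D),
    using an insertion/match/deletion local path; also returns the optimal
    alignment path so gradients can be accumulated along it."""
    T, J = len(x), len(r)
    cost = np.full((T + 1, J + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, T + 1):
        for j in range(1, J + 1):
            d = np.sum((x[i - 1] - r[j - 1]) ** 2)            # local frame distance
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    path, i, j = [], T, J                                      # backtrack the best path
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        i, j = (i - 1, j - 1) if step == 0 else (i - 1, j) if step == 1 else (i, j - 1)
    return cost[T, J], path[::-1]

def dtw_grad(x, r, path):
    """Gradient of the DTW distance with respect to the reference frames,
    accumulated along the fixed optimal warping path."""
    g = np.zeros_like(r)
    for i, j in path:
        g[j] += 2.0 * (r[j] - x[i])
    return g

def gpd_step(x, refs, label, lr=0.05, alpha=1.0):
    """One probabilistic-descent-style step (illustrative): decrease a smoothed
    loss of the misclassification measure by moving the correct-class template
    toward x and the nearest rival template away, i.e. a DP-based LVQ-like update."""
    results = [dtw(x, r) for r in refs]
    dists = [d for d, _ in results]
    rival = min((k for k in range(len(refs)) if k != label), key=lambda k: dists[k])
    d_mis = dists[label] - dists[rival]                        # misclassification measure
    s = 1.0 / (1.0 + np.exp(-alpha * d_mis))                   # sigmoid-smoothed loss
    dl = alpha * s * (1.0 - s)                                 # its derivative w.r.t. d_mis
    refs[label] -= lr * dl * dtw_grad(x, refs[label], results[label][1])
    refs[rival] += lr * dl * dtw_grad(x, refs[rival], results[rival][1])
    return dists, s

In this sketch, refs is a list of per-class reference templates (each a J x D array) and x is a D-dimensional frame sequence; repeating gpd_step over a labeled training set performs the descent-style discriminative adjustment of the templates.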