Sophie Aveline, Toshiaki Fukada
A Study on Continuous Speech Recognition
Based on Polynomial Segment Models
Abstract: In this paper, we present a speech recognition system based on polynomial segment models (PSMs). To date, several PSM-based studies have shown that
PSMs perform better than conventional HMMs. However,
most of the comparisons have been done for classification tasks or for rescoring the
HMM-based recognition results, because the computational requirements of PSMs
are quite high. In our approach, to reduce these requirements dramatically,
a recurrent neural network (RNN) based landmark detector, which can estimate
boundary candidates of phonemes accurately, is first developed. Then, PSM-based
recognition is performed by evaluating landmark candidates obtained from the
landmark detector. Our preliminary experimental results on the TIMIT database
show that the proposed system achieves recognition performance equivalent to that of a
conventional HMM system.
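
As a rough illustration of why restricting segment boundaries to landmark candidates reduces the cost of segment-based recognition, the following Python sketch counts the candidate segments that would have to be scored with and without landmarks. It is not the system described above: the function name candidate_segments, the utterance length, the maximum segment duration, and the landmark positions are all hypothetical values chosen for the example, and the landmarks are written by hand rather than produced by an RNN.

```python
def candidate_segments(boundaries, max_frames):
    """Enumerate (start, end) frame pairs whose endpoints are both allowed
    boundaries and whose duration does not exceed max_frames."""
    bounds = sorted(set(boundaries))
    segments = []
    for i, start in enumerate(bounds):
        for end in bounds[i + 1:]:
            if end - start > max_frames:
                break  # bounds are sorted, so later ends are even longer
            segments.append((start, end))
    return segments


# Hypothetical utterance: 300 frames, segments of at most 30 frames.
num_frames = 300
max_frames = 30

# Without landmarks, every frame is a possible phoneme boundary.
every_frame = range(num_frames + 1)

# Hand-written stand-in for the output of a landmark detector.
landmarks = [0, 12, 31, 47, 70, 88, 110, 135, 150,
             171, 196, 214, 240, 262, 281, 300]

exhaustive = candidate_segments(every_frame, max_frames)
constrained = candidate_segments(landmarks, max_frames)

print("segments to score, every frame a boundary:", len(exhaustive))
print("segments to score, landmarks only:        ", len(constrained))
```

Each enumerated pair stands for one segment that a segment model would need to score, so the ratio of the two counts gives a rough sense of the savings under these assumed values. A real detector may also produce spurious or missing landmarks, which this sketch does not model.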