Hidefumi Sawai
Connectionist Large-Vocabulary
Continuous Speech Recognition
Abstract: This paper describes connectionist approaches to large-vocabulary
continuous speech recognition that integrate speech recognition with
language processing. The speech recognition part consists of
Large Phonemic Time-Delay Neural Networks (TDNNs), which can
automatically spot all Japanese phonemes by simply scanning across
the input speech. The language processing part is made up of a
predictive LR parser, which predicts subsequent phonemes based on
the phonemes processed so far. Recognition experiments using ATR's
large-vocabulary speech database of 5,240 words and the "Conference
Registration" task yielded high recognition performance.
Furthermore, we discuss some extensions of the current system for
robust speech recognition, speaker adaptation, and speaker-independent
recognition.
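
The interplay between phoneme spotting and phoneme prediction can be
illustrated with a minimal sketch. Everything in it is an assumption made
for illustration and is not taken from the system described here: the names
spot_phonemes and toy_scorer are hypothetical, the toy scorer is a trivial
stand-in for a trained TDNN, and the "allowed" set stands in for the phonemes
a predictive parser might propose at a given point.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

Frame = Sequence[float]
# A scorer maps a window of frames to one score per phoneme label.
Scorer = Callable[[Sequence[Frame]], Dict[str, float]]


def spot_phonemes(
    frames: Sequence[Frame],
    scorer: Scorer,
    window: int,
    allowed: Optional[Set[str]] = None,
) -> List[Tuple[int, str, float]]:
    """Slide a fixed-length window over the frames and keep, at each
    position, the best-scoring phoneme among the allowed ones."""
    spotted = []
    for start in range(len(frames) - window + 1):
        scores = scorer(frames[start:start + window])
        # Restrict to the predicted phonemes, if a prediction was given.
        candidates = {
            p: s for p, s in scores.items()
            if allowed is None or p in allowed
        }
        if candidates:
            best = max(candidates, key=candidates.get)
            spotted.append((start, best, candidates[best]))
    return spotted


def toy_scorer(window_frames: Sequence[Frame]) -> Dict[str, float]:
    """Hypothetical stand-in for a TDNN: scores two phonemes from the
    mean of the first feature in the window."""
    mean = sum(f[0] for f in window_frames) / len(window_frames)
    return {"a": mean, "k": 1.0 - mean}


frames = [[0.1], [0.2], [0.9], [0.8]]
free = spot_phonemes(frames, toy_scorer, window=2)
constrained = spot_phonemes(frames, toy_scorer, window=2, allowed={"k"})
print([label for _, label, _ in free])
print([label for _, label, _ in constrained])
```

Passing an "allowed" set shows how a prediction narrows the candidates that
the scanner has to consider at each position; without it, every phoneme the
scorer knows about competes.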