TR-IT-0261 :1998.04.21

Petra Philips, Mike Schuster

Adaptation of BRNN Speech Recognition Systems

Abstract: Because speaker-independent large-vocabulary systems need huge amounts of training data, the parameters of the acoustical units have a high variance and thus model individual utterances poorly, which makes the systems sensitive to changes of environment (speaker or channel). One approach to this problem is to transform the feature and/or model space in order to reduce the mismatch between the acoustical data and the acoustical models of the system. We present experimental results for supervised and unsupervised adaptation of a hybrid BRNN (Bidirectional Recurrent Neural Network) phoneme recognition system on TIMIT data using:

1. a Linear Input Network (LIN)

2. retraining the BRNN with weight-sharing
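
As an illustration of the first method, the following is a minimal sketch of a linear input layer as it is commonly formulated: an affine transform applied to every feature frame before it reaches the recognizer, initialised to the identity so that the unadapted system is unchanged. The abstract does not give these details, so the function names (`identity_lin`, `apply_lin`), the bias term, and the three-dimensional example frame are assumptions made for illustration only. In the usual setup only this layer would be trained on adaptation data while the recognizer weights stay fixed; that training step is not shown.

```python
def identity_lin(dim):
    """Return (W, b) for a linear input layer initialised to the identity.

    Hypothetical helper: starting from the identity means the adapted
    system initially behaves exactly like the unadapted one.
    """
    W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    b = [0.0] * dim
    return W, b


def apply_lin(W, b, frame):
    """Transform one feature frame: y = W x + b."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, frame)) + b_i
        for row, b_i in zip(W, b)
    ]


# Toy example with an assumed 3-dimensional feature frame.
W, b = identity_lin(3)
frame = [0.5, -1.2, 2.0]
adapted = apply_lin(W, b, frame)  # equal to `frame` before any adaptation
```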

We also show how unsupervised adaptation can be improved using only a simple acoustical confidence measure based on the posterior probability of the recognized class for every frame.
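
One way such a frame-level confidence measure could be used is sketched below: frames whose recognized class has a low posterior are dropped from the unsupervised adaptation data. This is a hedged illustration, not the procedure used in the report. The function name `select_confident_frames`, the threshold value of 0.7, and the use of the framewise maximum as the "recognized class" are all assumptions; in a full system the recognized class would more likely come from the decoder output.

```python
def select_confident_frames(posteriors, threshold=0.7):
    """Keep frames whose recognized-class posterior reaches the threshold.

    posteriors: list of per-frame lists of class posteriors.
    Returns a list of (frame_index, class_index, confidence) tuples.
    The threshold is an assumed value chosen for illustration.
    """
    selected = []
    for t, frame_post in enumerate(posteriors):
        # Assumption: the recognized class is the framewise argmax.
        label = max(range(len(frame_post)), key=frame_post.__getitem__)
        confidence = frame_post[label]
        if confidence >= threshold:
            selected.append((t, label, confidence))
    return selected


# Toy posteriors for three frames and three classes.
posteriors = [
    [0.90, 0.05, 0.05],  # confident, class 0
    [0.40, 0.35, 0.25],  # uncertain, dropped
    [0.10, 0.80, 0.10],  # confident, class 1
]
selected = select_confident_frames(posteriors)
```

Only the retained frames, paired with their recognized labels, would then serve as targets for the adaptation step.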