TR-I-0233 :1991.12

Hiroaki HATTORI

Text-Independent Speaker Recognition Using Neural Networks

Abstract: This paper describes a text-independent speaker recognition method using predictive neural networks. Because speech production is regarded as a non-linear process, the speaker individuality carried by the speech signal is also non-linear. The predictive neural network, a non-linear prediction model based on multi-layer perceptrons, is therefore expected to be a more suitable model for representing speaker individuality. For text-independent speaker recognition, an ergodic model, which allows transitions from each state to any state, including self-transitions, is adopted as the speaker model, and one predictive neural network is assigned to each state. The proposed method was compared with distortion-based methods, HMM-based methods, and a discriminative neural network-based method in text-independent speaker recognition experiments on 24 female speakers. The proposed method gave the highest recognition accuracy, 100.0%, demonstrating the effectiveness of predictive neural networks for representing speaker individuality.
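The speaker model described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation, and several points are assumptions made only for illustration: each state's predictor is a one-hidden-layer perceptron that predicts the current feature frame from the preceding `order` frames, transitions between states carry no cost (so the best state is chosen independently at every frame), and the speaker whose model gives the smallest average squared prediction error is selected. The names `init_mlp`, `mlp_predict`, `utterance_error`, and `identify` are hypothetical, and the weights here are random rather than trained.

```python
import numpy as np

def init_mlp(rng, in_dim, hidden_dim, out_dim):
    """Randomly initialised one-hidden-layer perceptron (untrained, for illustration)."""
    return (rng.normal(scale=0.1, size=(in_dim, hidden_dim)),
            np.zeros(hidden_dim),
            rng.normal(scale=0.1, size=(hidden_dim, out_dim)),
            np.zeros(out_dim))

def mlp_predict(params, context):
    """Predict the current frame from a flattened context of preceding frames."""
    W1, b1, W2, b2 = params
    hidden = np.tanh(context @ W1 + b1)
    return hidden @ W2 + b2

def utterance_error(speaker_model, frames, order=2):
    """Average prediction error of an utterance under one speaker model.

    speaker_model: list of MLP parameter tuples, one predictor per state.
    frames: array of shape (T, D) holding T feature frames.
    Assumption: transitions are free, so each frame takes the state
    (predictor) with the smallest squared prediction error.
    """
    T = frames.shape[0]
    total = 0.0
    for t in range(order, T):
        context = frames[t - order:t].reshape(-1)
        errors = [np.sum((frames[t] - mlp_predict(p, context)) ** 2)
                  for p in speaker_model]
        total += min(errors)
    return total / (T - order)

def identify(models, frames, order=2):
    """Return the speaker whose model predicts the utterance best, plus all scores."""
    scores = {name: utterance_error(m, frames, order) for name, m in models.items()}
    return min(scores, key=scores.get), scores
```

With trained predictors, `identify` would be called with one model per enrolled speaker and the feature frames of a test utterance. Because the state is chosen freely at each frame, adding states can only lower a model's error on a given utterance, which is one reason a real system would need a principled training procedure and possibly transition constraints.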