TR-S-0025

TR-S-0025: 2001.3.30

Shigeki Matsuda, Kuldip Paliwal, Satoshi Nakamura

A Study of Speech Recognition based on Segmental Feature Model

Abstract: We introduce a segmental feature model (SFM) that represents temporal relationships between feature vectors. A conventional HMM divides a feature vector sequence into its most likely periods. Because the HMM consists of multiple states connected in temporal order, it represents the temporal relationships between these periods. However, the temporal relationships between the feature vectors within each period are not modeled. If these relationships are modeled as well, feature vector sequences can be expected to be modeled more efficiently than with a conventional HMM. The SFM calculates the probability of a fixed-dimension segmental feature vector, which is extracted from the variable-length period allocated to each state of the SFM. We propose a segmental feature vector based on average values, from which temporal covariances can be calculated. We also propose a new SFM that has variances within a segment (period), in order to reduce mismatches between a feature vector sequence and its segmental feature vector. Using the SFM with the average-based segmental feature vector, we performed speech recognition experiments on phoneme classification and continuous phoneme recognition. The SFMs achieved higher recognition rates than conventional HMMs in the phoneme classification experiments. In the continuous phoneme recognition experiments, however, the SFMs achieved lower recognition rates than conventional HMMs. A likely cause is that the SFM does not estimate phoneme boundaries correctly.
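
The abstract does not spell out how the average-based segmental feature vector is built, so the following is only a minimal sketch of one plausible reading: a variable-length period is split into a fixed number of contiguous regions, the frames in each region are averaged, and the averages are concatenated into a fixed-dimension vector. The function name `segmental_feature`, the parameter `num_regions`, and the region-splitting rule are assumptions made for illustration and are not taken from the report.

```python
def segmental_feature(frames, num_regions=3):
    """Map a variable-length list of frame vectors to one fixed-length vector.

    Assumption (not from the report): the period is cut into `num_regions`
    contiguous regions of near-equal length, and each region is replaced
    by the per-dimension average of its frames.
    """
    if not frames:
        raise ValueError("segment must contain at least one frame")
    num_frames = len(frames)
    dim = len(frames[0])
    feature = []
    for k in range(num_regions):
        # Integer boundaries of region k; force at least one frame per region
        # so that periods shorter than num_regions still yield a full vector.
        start = k * num_frames // num_regions
        end = max((k + 1) * num_frames // num_regions, start + 1)
        region = frames[start:end]
        for d in range(dim):
            feature.append(sum(frame[d] for frame in region) / len(region))
    return feature


# Two periods of different length map to vectors of the same dimension.
long_period = [[0, 0], [2, 4], [4, 0], [6, 4], [8, 8], [10, 0]]
short_period = [[1, 1], [3, 5]]
print(len(segmental_feature(long_period)), len(segmental_feature(short_period)))
```

Because every period is mapped to the same dimension (`num_regions` times the frame dimension), a single distribution per state can score the whole period at once, and the covariance between entries that come from different regions is one way temporal covariances could be captured under this assumption.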