TR-I-0170 :1990.8

Toshiya Sakano, Tsuyoshi Morimoto

Design Principle of Language Model for Speech Recognition

Abstract: In speech recognition, the recognition rate can be improved by using statistical information extracted from linguistic texts. However, there is no established criterion for selecting sample texts from those available. If an unbalanced set of text samples is selected, the extracted information will not be suitable for a linguistic model. Solving this problem requires quantitative analysis of linguistic texts. In this report, we introduce a method for quantitatively analyzing linguistic texts and propose a method for selecting them. We also discuss the possibility of developing a criterion for selecting the test texts needed for system evaluation, and show the relationship between recognition system performance and the features of the sample text group used for the linguistic model.