TR-I-0170 :1990.8

坂野俊哉, 森元逞

A Study on the Construction of Language Models for Speech Recognition

Abstract: In speech recognition, the recognition rate can be improved by using statistical information extracted from linguistic texts. However, there is no established criterion for selecting sample texts from those available. If an unbalanced set of text samples is selected, the extracted information will not be suitable for a language model. Solving this problem requires quantitative analysis of linguistic texts. In this report, we introduce a method for quantitatively analyzing linguistic texts and propose a method for selecting them. We also discuss the possibility of developing a criterion for selecting the test texts needed for system evaluation, and show the relationship between recognition system performance and the features of the sample text group used to build the language model.