TR-I-0167 :1990

Masahide SUGIYAMA

AUTOMATIC LANGUAGE RECOGNITION USING ACOUSTIC FEATURES

Abstract: Language recognition (e.g., Japanese, English, German) using acoustic features is an important yet difficult problem for current speech technology. This report proposes two language recognition algorithms and describes experimental results. The speech database used in this report contains 20 languages. The speech data was carefully divided into training and test sets, and the recognition experiments were designed to be both speaker-independent and text-independent. The first algorithm is based on the standard Vector Quantization (VQ) technique. The second algorithm is based on a single universal (common) VQ codebook for all languages and the occurrence probability histograms of its codewords. The experimental results show recognition rates of 65% and 80% for the first and second algorithms, respectively, each using just 8 sentences of unknown speech (about 64 seconds). With sufficient input speech, the second algorithm outperforms the first.
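
The two approaches named in the abstract can be illustrated with a minimal sketch. The abstract does not give the details, so the following are assumptions: the first algorithm is read as one VQ codebook per language with a minimum average distortion decision, the second scores the codeword sequence of the unknown speech by log-likelihood under each language's add-one-smoothed histogram, codebooks are trained with plain k-means under squared Euclidean distance, and all function names and the toy 2-D data are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, v):
    """Return (index, squared distance) of the codeword closest to v."""
    dists = [sq_dist(c, v) for c in codebook]
    i = min(range(len(codebook)), key=dists.__getitem__)
    return i, dists[i]

def train_codebook(vectors, size, iters=10):
    """Plain k-means; an assumption, not necessarily the report's training procedure."""
    step = max(1, len(vectors) // size)
    codebook = [list(v) for v in vectors[::step][:size]]  # deterministic init
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in vectors:
            cells[nearest(codebook, v)[0]].append(v)
        for k, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

# Approach 1 (assumed form): one codebook per language, pick minimum distortion.
def classify_by_distortion(codebooks, frames):
    def avg_distortion(cb):
        return sum(nearest(cb, f)[1] for f in frames) / len(frames)
    return min(codebooks, key=lambda lang: avg_distortion(codebooks[lang]))

# Approach 2 (assumed form): one shared codebook, per-language codeword histograms.
def histogram(codebook, frames, floor=1.0):
    counts = [floor] * len(codebook)  # add-one smoothing avoids log(0)
    for f in frames:
        counts[nearest(codebook, f)[0]] += 1
    total = sum(counts)
    return [c / total for c in counts]

def classify_by_histogram(codebook, histograms, frames):
    indices = [nearest(codebook, f)[0] for f in frames]
    def loglik(h):
        return sum(math.log(h[i]) for i in indices)
    return max(histograms, key=lambda lang: loglik(histograms[lang]))

# Toy training data for two hypothetical languages (invented for illustration).
train = {
    "A": [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]],
    "B": [[5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [6.0, 6.0], [6.1, 6.0], [6.0, 6.1]],
}
test_a = [[0.05, 0.05], [0.95, 1.05]]
test_b = [[5.05, 5.0], [6.0, 6.05]]

codebooks = {lang: train_codebook(vs, size=2) for lang, vs in train.items()}
pooled = [v for vs in train.values() for v in vs]
universal = train_codebook(pooled, size=4)
histograms = {lang: histogram(universal, vs) for lang, vs in train.items()}

result_1 = (classify_by_distortion(codebooks, test_a),
            classify_by_distortion(codebooks, test_b))
result_2 = (classify_by_histogram(universal, histograms, test_a),
            classify_by_histogram(universal, histograms, test_b))
```

In this sketch the first approach needs one codebook search per language for every input frame, while the second quantizes each frame once against the shared codebook and then only compares histograms. Real acoustic features would replace the toy 2-D vectors.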