TR-H-0006

TR-H-0006: 1993.3.31

Kazuaki OBARA, Kiyoaki AIKAWA, Hideki KAWAHARA

Word Recognition Using Auditory Model Front-End Incorporating Spectro-Temporal Masking

Abstract: An auditory model front-end that reflects spectro-temporal masking characteristics is proposed. The model achieves excellent performance in a multi-speaker word recognition system. Recent auditory perception research shows that the forward masking pattern spreads more widely over the frequency axis as the masker-signal interval increases. This spectro-temporal masking characteristic is modeled and implemented in the cochlear filter front-end for speech recognition. The current masking level is calculated as the weighted sum of the smoothed preceding spectra. As the masker-signal interval increases, the weight values become smaller and the smoothing window becomes wider on the frequency axis. The current masked spectrum is obtained by subtracting the masking level from the current spectrum. Word recognition experiments demonstrated that incorporating the masking effect into the cochlear filter front-end improves recognition performance. The performance was better than that of traditional LPC-based word recognizers.
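
The masking computation described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`smooth`, `masked_spectrum`), the moving-average smoother, the geometric weight decay, the linear widening of the window, and the parameter values (`max_lag`, `decay`, `gain`) are all assumptions chosen for readability. The abstract states only that the weights shrink and the window widens as the masker-signal interval grows.

```python
def smooth(spectrum, half_width):
    """Moving average over the frequency axis (assumed smoother).

    The window spans up to 2 * half_width + 1 channels and is
    truncated at the edges of the spectrum.
    """
    n = len(spectrum)
    out = []
    for k in range(n):
        lo = max(0, k - half_width)
        hi = min(n, k + half_width + 1)
        out.append(sum(spectrum[lo:hi]) / (hi - lo))
    return out


def masked_spectrum(frames, t, max_lag=3, decay=0.5, gain=0.4):
    """Return frame t with the masking level of preceding frames subtracted.

    frames: list of spectra, one list of channel values per time frame.
    max_lag, decay, gain: hypothetical parameters, not taken from the report.
    """
    current = frames[t]
    masking = [0.0] * len(current)
    for lag in range(1, max_lag + 1):
        if t - lag < 0:
            break  # no earlier frame available
        # Assumed geometric decay: older frames contribute less.
        weight = gain * decay ** (lag - 1)
        # Assumed linear widening: older frames are smoothed more broadly.
        smoothed = smooth(frames[t - lag], lag)
        for k, value in enumerate(smoothed):
            masking[k] += weight * value
    # Masked spectrum = current spectrum minus accumulated masking level.
    return [c - m for c, m in zip(current, masking)]


# Usage: a strong peak in the previous frame masks nearby channels now.
frames = [
    [0.0, 0.0, 10.0, 0.0, 0.0],  # preceding frame (masker)
    [1.0, 1.0, 1.0, 1.0, 1.0],   # current frame
]
result = masked_spectrum(frames, 1)
```

In this toy example the channels adjacent to the earlier peak are reduced while the outermost channels are left unchanged, which is the qualitative behaviour the abstract describes. A practical front-end would also need to choose the spectral domain (for example, log power) and decide how to handle negative values after subtraction; the abstract does not specify either.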