TR-SLT-0045

TR-SLT-0045: 2003.08.28

Alex Park

Relating Phonetic Feature Stream Reliability to Noise Robust Speech Recognition

Abstract: This report describes an attempt to improve robustness in automatic speech recognition by optimizing the extraction of phonetic feature streams independently of the recognition process. The eventual goal is to use a bank of feature extraction modules that apply specialized signal processing and statistical techniques to reliably extract speech-relevant features from the acoustic signal. In this work, we demonstrate the viability of using a sparse set of feature streams for a simple connected digit recognition task. We then illustrate some techniques for improving the reliability of the voicing feature module and evaluate several alternative modules using both clean and noisy data. Finally, we relate the extraction reliability of the individual voicing module to the overall performance of the recognizer by performing recognition experiments on the Aurora 2 database.