TR-I-0139

TR-I-0139 :1990. 2

Alain de Cheveigne

Auditory nerve fiber spike generation model

Abstract: Speech recognition systems are far from reproducing the capabilities of human listeners. Recognition rates remain low even in the best recording conditions, and deteriorate further as conditions approach those of everyday use: noise, reverberation, interfering voices, etc. In contrast, people can hear and understand speech in very adverse conditions. To a certain degree, we can parse the complex spectrum of superimposed speech and noise into separate "streams", and concentrate on the speech while eliminating the noise. We can also "reconstruct" portions of speech that have been obliterated. Such capabilities likely involve a combination of peripheral processing (under central control via efferent nerve pathways) and central processing of auditory nerve fiber discharge patterns. Physiological experiments (recordings of discharge patterns within the auditory nerve) provide a picture of the data that central processing operates on, and are a starting point for models of central processing. Despite the large amount of published data available, questions may arise for which these data are insufficient. A model of auditory nerve fiber spike generation is useful in this context, because it allows us to: a) experiment with models of higher-level auditory processing, b) extrapolate from existing experimental data to new stimulus conditions without carrying out physiological experiments, c) clarify the relation between the shape of the driving function that controls nerve fiber discharge and the shape of the histograms that can be measured from single units of the auditory nerve, d) assess the degree to which experimental data have been affected by the choice of histogram format used to report them (different histograms are not equivalent).
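Points c) and d) can be made concrete with a small simulation. The sketch below is not the model described in this report. It assumes a generic inhomogeneous Poisson spike generator with an absolute dead time, driven by a hypothetical half-wave rectified sinusoid. All names and parameter values (driving_function, generate_spikes, the 500 Hz stimulus, the 200 spikes/s peak rate, the 1 ms dead time) are illustrative assumptions. It shows how the same spike train yields two differently shaped histograms, neither of which is a plain copy of the driving function.

```python
import math
import random


def driving_function(t, peak_rate_hz=200.0, freq_hz=500.0):
    """Hypothetical driving function (spikes/s): half-wave rectified sinusoid."""
    return peak_rate_hz * max(0.0, math.sin(2.0 * math.pi * freq_hz * t))


def generate_spikes(duration_s, dt_s=1e-5, dead_time_s=1e-3, rng=None):
    """Assumed model: inhomogeneous Poisson process with an absolute dead time.

    Time is discretized in steps of dt_s. In each step outside the dead time,
    a spike occurs with probability driving_function(t) * dt_s.
    """
    rng = rng or random.Random(0)
    spikes = []
    last = -math.inf
    for i in range(int(round(duration_s / dt_s))):
        t = i * dt_s
        if t - last < dead_time_s:
            continue  # still refractory after the previous spike
        if rng.random() < driving_function(t) * dt_s:
            spikes.append(t)
            last = t
    return spikes


def period_histogram(spikes, freq_hz=500.0, n_bins=20):
    """Count spikes by their phase within the stimulus period."""
    period = 1.0 / freq_hz
    counts = [0] * n_bins
    for t in spikes:
        phase = (t % period) / period
        counts[min(int(phase * n_bins), n_bins - 1)] += 1
    return counts


def interval_histogram(spikes, bin_s=1e-3, n_bins=20):
    """Count intervals between successive spikes (first-order intervals)."""
    counts = [0] * n_bins
    for a, b in zip(spikes, spikes[1:]):
        k = int((b - a) / bin_s)
        if k < n_bins:
            counts[k] += 1
    return counts


if __name__ == "__main__":
    spikes = generate_spikes(1.0)
    print("spike count:", len(spikes))
    print("period histogram:  ", period_histogram(spikes))
    print("interval histogram:", interval_histogram(spikes))
```

Under these assumptions the period histogram follows the rectified shape of the driving function, whereas the interval histogram is empty below the dead time and reflects the stimulus period only indirectly. This is one way to see why different histogram formats are not interchangeable.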