TR-H-0195: 1996.5.27

Alain de Cheveigné

Speech Fundamental Frequency Estimation

Abstract: Several methods for estimating the fundamental frequency (F₀) of voiced speech were implemented and evaluated on a database of speech recorded together with a laryngograph signal. The first approach was based on an algorithm originally designed for F₀ estimation of concurrent (two-voice) speech. The idea was that, by modeling the speech signal as the sum of two periodic signals, the algorithm would handle common causes of F₀ estimation failure: strong harmonics or subharmonics (diplophony), changes in amplitude and spectrum (modeled by the algorithm as a local beat pattern), periodic interference (hum, interfering speech, reverberation), etc. This approach turned out to be less effective than expected and was abandoned. The second approach, based on a careful error analysis of the classic AMDF algorithm, resulted in several new schemes for avoiding errors. Combined, these schemes reduced errors over the database by a factor of 3.5 for a male voice and 9 for a female voice, and allowed the algorithm to outperform the standard ESPS get_f0 algorithm by a factor of about 2.
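For readers unfamiliar with the AMDF (average magnitude difference function) mentioned above, the following is a minimal sketch of the textbook form of the method only. It does not reproduce the error-avoidance schemes developed in the report. The function names, the search range (f0_min, f0_max), and the synthetic test signal are assumptions made for illustration.

```python
import math

def amdf(frame, max_lag):
    """Average magnitude difference function for lags 0..max_lag."""
    n = len(frame) - max_lag  # same number of sample pairs at every lag
    return [
        sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n
        for lag in range(max_lag + 1)
    ]

def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Return sample_rate / lag for the lag with the deepest AMDF dip.

    The search range (f0_min, f0_max) is an illustrative assumption,
    not a value taken from the report.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    d = amdf(frame, max_lag)
    best_lag = min(range(min_lag, max_lag + 1), key=lambda lag: d[lag])
    return sample_rate / best_lag

# Synthetic test signal (assumed): 100 Hz fundamental plus second harmonic.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 100 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 200 * t / sample_rate)
    for t in range(400)
]
f0 = estimate_f0(frame, sample_rate)
```

For a perfectly periodic signal the AMDF falls to zero at every multiple of the period, so a search range wide enough to include two periods makes the subharmonic an equally good candidate. This is one simple illustration of why the basic method is prone to the harmonic and subharmonic errors listed in the abstract.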