TR-IT-0230: July 24, 1997

Christophe d'Alessandro

Voice quality modification using periodic-aperiodic decomposition and spectral processing of the voice source signal

Abstract: Voice quality is currently a key issue in speech synthesis. On the one hand, the lack of realistic intra-speaker voice quality variation results in poor naturalness in synthesis methods using small corpora and signal processing (e.g. diphone synthesis). On the other hand, voice quality mismatch is one of the main sources of concern for methods based on large corpora and labelling (e.g. word or subword unit concatenation systems, like CHATR). A new method for voice quality modification is designed. It takes advantage of a spectral theory for voice source signal representation. An algorithm based on periodic-aperiodic decomposition and spectral processing (using the short-term Fourier transform) is described. The use of adaptive inverse filtering in this framework is also discussed. Applications of this algorithm may include pre-processing of speech corpora, modification of voice quality parameters together with intonation in synthesis, and voice transformation. Some experiments were performed, showing convincing voice quality modifications for various speakers.