TR-H-0200

TR-H-0200: 1996.8.30

Hideki Kawahara

Speech transformation using adaptive interpolation of time-frequency representation and all-pass filters

Abstract: A simple new procedure called STRAIGHT (Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum) has been developed. It combines pitch-adaptive spectral analysis with a surface reconstruction method in the time-frequency region and an excitation source design based on phase manipulation of all-pass filters. The proposed interpolation preserves the bilinear surface in the time-frequency region and allows over 600% manipulation of speech parameters such as pitch, vocal tract length, and speaking rate without introducing further degradation from the manipulation itself. A new design procedure for all-pass filters also reduces the characteristic degradation caused by the usual pulse excitation, which can be annoying, especially under headphone listening conditions.
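The all-pass idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the design procedure of the report, which the abstract does not specify. It only demonstrates the general property that an all-pass filter changes phase while leaving the magnitude spectrum flat, so a single pulse can be spread out in time without altering its spectral envelope. The function names `allpass_pulse` and `magnitude_spectrum`, the uniformly random phase, and the parameter `max_phase` are all assumptions made for illustration.

```python
import cmath
import math
import random


def allpass_pulse(n, max_phase, seed=0):
    """Return a real length-n pulse with unit magnitude in every DFT bin.

    Hypothetical helper: the phase of each bin is drawn uniformly from
    [-max_phase, max_phase]. This random choice is an assumption for
    illustration, not the phase design used in the report.
    """
    rng = random.Random(seed)
    spectrum = [0j] * n
    spectrum[0] = 1 + 0j  # the DC bin must be real for a real signal
    for k in range(1, (n + 1) // 2):
        phase = rng.uniform(-max_phase, max_phase)
        spectrum[k] = cmath.exp(1j * phase)
        # Hermitian symmetry guarantees a real time-domain pulse.
        spectrum[n - k] = spectrum[k].conjugate()
    if n % 2 == 0:
        spectrum[n // 2] = 1 + 0j  # the Nyquist bin must also be real
    # Naive inverse DFT, kept dependency-free; fine for small n.
    pulse = []
    for t in range(n):
        acc = sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                  for k in range(n))
        pulse.append(acc.real / n)
    return pulse


def magnitude_spectrum(x):
    """Magnitude of the naive forward DFT of the real sequence x."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


if __name__ == "__main__":
    impulse = allpass_pulse(64, max_phase=0.0)       # zero phase: a plain pulse
    dispersed = allpass_pulse(64, max_phase=math.pi)  # phase-manipulated pulse
    print("peak of plain pulse:    ", max(abs(v) for v in impulse))
    print("peak of dispersed pulse:", max(abs(v) for v in dispersed))
    print("max deviation from flat magnitude:",
          max(abs(m - 1.0) for m in magnitude_spectrum(dispersed)))
```

With `max_phase=0.0` the result is the ordinary unit pulse. With a nonzero phase range the same energy is spread over time, which lowers the peak while every frequency bin keeps unit magnitude.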