 
 
 
 
Hideki Kawahara
 
 
Speech transformation using adaptive interpolation of 
time-frequency representation and all-pass filters
 
Abstract: A simple new procedure called STRAIGHT (Speech Transformation and Representation
using Adaptive Interpolation of weiGHTed spectrum) has been developed, using pitch-adaptive
spectral analysis combined with a surface reconstruction method in the time-frequency
region, and an excitation source design based on phase manipulation of all-pass
filters. The proposed interpolation preserves the bilinear surface in the time-frequency region
and allows speech parameters such as pitch, vocal tract length, and speaking rate to be
manipulated by more than 600% without introducing further degradation due to the
manipulation. A new design procedure for all-pass filters also reduces the characteristic
degradation caused by conventional pulse excitation, which can be annoying, especially under
headphone listening conditions.
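
The idea of replacing a plain pulse with an all-pass excitation can be illustrated with a minimal sketch. It is not the design procedure of the paper: the function name `allpass_pulse`, the random phase, and the linear growth of the phase spread with frequency are all assumptions made for illustration. The sketch only shows the defining property of an all-pass filter, namely that the magnitude spectrum stays flat (so the spectral envelope is untouched) while the phase spreads the pulse energy over time.

```python
import cmath
import math
import random


def allpass_pulse(n_fft=256, max_phase=math.pi / 2, seed=0):
    """Return (pulse, spectrum) for one real, all-pass-shaped excitation pulse.

    Hypothetical helper for illustration only. The spectrum has unit
    magnitude in every bin; only the phase differs from a unit impulse.
    """
    rng = random.Random(seed)
    half = n_fft // 2
    phase = [0.0] * n_fft  # DC and Nyquist bins keep zero phase
    for k in range(1, half):
        # Assumption: random phase whose spread grows linearly with frequency.
        phase[k] = rng.uniform(-max_phase, max_phase) * k / half
        # Hermitian symmetry makes the time-domain signal real.
        phase[n_fft - k] = -phase[k]

    # Unit-magnitude (all-pass) frequency response.
    spectrum = [cmath.exp(1j * p) for p in phase]

    # Direct inverse DFT, written out to avoid external dependencies.
    pulse = []
    for n in range(n_fft):
        acc = sum(
            spectrum[k] * cmath.exp(2j * math.pi * k * n / n_fft)
            for k in range(n_fft)
        )
        pulse.append(acc.real / n_fft)
    return pulse, spectrum


if __name__ == "__main__":
    pulse, spectrum = allpass_pulse()
    energy = sum(x * x for x in pulse)
    peak = max(abs(x) for x in pulse)
    print(f"energy = {energy:.6f}, peak = {peak:.4f}")
```

With `max_phase=0.0` the function returns an ordinary unit impulse. A nonzero phase keeps the same energy and the same flat magnitude spectrum but lowers the peak, which is the property that lets phase manipulation change the character of the excitation without altering the spectral envelope.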