Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki, Kiyohiro Shikano
Optimizing Segment Selection for High-Quality Text-to-Speech
Abstract: This report addresses the problem of how to improve the naturalness of synthetic
speech in corpus-based Text-to-Speech. To deal with this problem, we focus on two
factors: (1) an algorithm for selecting the most appropriate synthesis units from a
speech corpus, and (2) an evaluation measure for selecting the synthesis units. We
confirm that the proposed segment selection algorithm and the proposed cost
function based on perceptual evaluation are effective for improving the naturalness
of synthetic speech.