TR-SLT-0033: March 05, 2003

Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki, Kiyohiro Shikano

Optimizing Segment Selection for High-Quality Text-to-Speech

Abstract: This report addresses the problem of improving the naturalness of synthetic speech in corpus-based Text-to-Speech. We focus on two factors: (1) an algorithm for selecting the most appropriate synthesis units from a speech corpus, and (2) an evaluation measure for selecting the synthesis units. We confirm that the proposed segment selection algorithm and the proposed cost function based on perceptual evaluation are effective for improving the naturalness of synthetic speech.