戸田智基,河井恒,津崎実,鹿野清宏
Optimization of Segment Selection for
High-Quality Text-to-Speech Synthesis
Abstract: This report addresses the problem of how to improve the naturalness of synthetic
speech in corpus-based Text-to-Speech. To deal with this problem, we focus on two
factors: (1) an algorithm for selecting the most appropriate synthesis units from a
speech corpus, and (2) an evaluation measure for selecting the synthesis units. We
confirm that both the proposed segment selection algorithm and the proposed cost
function, which is based on perceptual evaluation, are effective in improving the
naturalness of synthetic speech.