Kei Usami, Jin-Song Zhang
A design of phonetic-balanced transcript for Chinese
Abstract: This technical report presents our efforts to develop phonetic-balanced transcripts for Chinese,
which are considered necessary for building a Chinese speech corpus. The search algorithm
is based on a maximum entropy rule, under which an equal number of occurrences of each phonetic context is
preferred. In preliminary experiments, we found a 7-utterance set in which nearly all
Initials and Finals appear, and a 25-utterance set in which 134 di-phones appear. The system
is intended for use in generating more specifically phonetic-balanced transcripts for Chinese in
the future.
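
The maximum entropy rule can be illustrated with a minimal sketch. The abstract does not specify the search procedure, so the greedy strategy below is an assumption and not necessarily the method used in this report; the function names `entropy` and `greedy_select` and the toy unit labels are likewise hypothetical. Each candidate utterance is represented as a list of phonetic units (Initials, Finals, or di-phones), and at each step the utterance that yields the highest entropy of the accumulated unit distribution is added to the set.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of the unit distribution given by counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def greedy_select(candidates, k):
    """Greedily pick up to k utterances that maximize unit-distribution entropy.

    candidates: list of utterances, each a list of phonetic unit labels.
    Returns the indices of the chosen utterances and the accumulated counts.
    This greedy search is an assumed strategy, used here for illustration only.
    """
    selected = []
    counts = Counter()
    remaining = list(range(len(candidates)))
    for _ in range(min(k, len(candidates))):
        # Choose the utterance whose addition gives the flattest distribution.
        best = max(remaining,
                   key=lambda i: entropy(counts + Counter(candidates[i])))
        selected.append(best)
        counts.update(candidates[best])
        remaining.remove(best)
    return selected, counts


# Toy data: each utterance is reduced to a few unit labels (not real corpus data).
toy_candidates = [["b", "a"], ["b", "a"], ["p", "o"], ["m", "i"]]
chosen, unit_counts = greedy_select(toy_candidates, 3)
print(chosen, round(entropy(unit_counts), 3))
```

In this toy case the duplicate utterance is skipped because adding it would repeat units already covered and lower the entropy, which is the behavior the maximum entropy rule is meant to encourage.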