Yoshinori KITAHARA and Yoh'ichi TOHKURA
Prosody and Expression of Emotions in Speech
Abstract: The role of prosody in speech perception has been studied with
the aim of applying it to natural, high-quality speech synthesis.
The prosodic components that contribute to the expression of emotions and
their intensity are clarified by analyzing emotional speech and by
performing listening tests on synthetic speech. It has been confirmed
that the prosodic components, namely pitch structure, temporal structure
and amplitude structure, contribute to the expression of emotions more
than the spectral structure of speech does. The results of
listening tests using prosody-substituted speech showed that temporal
structure was the most important for the expression of anger, while for
the intensity of anger, all three components were much more
important. Pitch structure also played a significant role in the
expression of joy and sadness and their intensity. These results made it
possible to convert neutral utterances (i.e., ones with no particular
emotion) into utterances expressing various kinds of emotions. The
results can also be applied to controlling the emotional characteristics
of speech in synthesis by rule.
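
The idea of prosody-substituted speech mentioned above can be illustrated
with a minimal sketch. It is not the authors' implementation. The
Utterance type, its field names and the substitute_prosody helper are all
hypothetical, and the numeric values are made-up placeholders. The sketch
only shows the bookkeeping: selected prosodic components are taken from an
emotional utterance while the spectral structure stays that of the neutral
one.

```python
from dataclasses import dataclass, replace
from typing import List, Set


@dataclass(frozen=True)
class Utterance:
    """Hypothetical container for one analyzed utterance."""
    pitch: List[float]      # pitch structure: F0 values in Hz
    durations: List[float]  # temporal structure: segment durations in seconds
    amplitude: List[float]  # amplitude structure: relative levels
    spectrum: str           # stand-in label for the spectral structure


# The three prosodic components named in the abstract.
PROSODIC_COMPONENTS = {"pitch", "durations", "amplitude"}


def substitute_prosody(neutral: Utterance, emotional: Utterance,
                       components: Set[str]) -> Utterance:
    """Return a copy of `neutral` whose selected prosodic components
    are replaced by those of `emotional`.

    Assumption: both utterances share the same segmentation, so the
    component lists can be swapped directly. A real system would need
    time alignment before resynthesis.
    """
    unknown = set(components) - PROSODIC_COMPONENTS
    if unknown:
        raise ValueError(f"not a prosodic component: {sorted(unknown)}")
    swapped = {name: getattr(emotional, name) for name in components}
    return replace(neutral, **swapped)


# Made-up example values, for illustration only.
neutral = Utterance(pitch=[120.0, 118.0], durations=[0.10, 0.12],
                    amplitude=[0.5, 0.4], spectrum="neutral")
angry = Utterance(pitch=[180.0, 150.0], durations=[0.07, 0.08],
                  amplitude=[0.9, 0.8], spectrum="angry")

# Substitute only the temporal structure.
hybrid = substitute_prosody(neutral, angry, {"durations"})
```

Here hybrid keeps the neutral pitch, amplitude and spectrum but carries
the temporal structure of the angry utterance. Stimuli of this kind let
listeners judge how much a single component contributes to the perceived
emotion.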