Wenzel Svojanovsky, Rainer Gruhn
Clustering of Backchannels
in Japanese Spontaneous Speech
Abstract: Human language, especially spontaneous speech, carries more information than just
spoken words. This research analyzes prosodic features of the backchannel "うん"
based on F0, duration, and energy of the signal.
Training and test data are subsets extracted from a 150-hour corpus of spontaneous
conversational speech from a single Japanese female speaker, collected in the ESP project.
The data is partially labeled by human experts with eight types of intention labels.
The "うん" segments are automatically clustered and classified into one of several
speech act classes.
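
As a rough illustration of clustering segments by prosodic features, the sketch below groups a few "うん" tokens with k-means. The abstract does not name the clustering algorithm, so k-means is an assumption made here, as is the three-dimensional feature vector (mean F0 in Hz, duration in seconds, mean energy in dB). The numeric values are invented toy data, not measurements from the ESP corpus, and the helper names zscore and kmeans are hypothetical.

```python
import math

def zscore(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant to avoid division by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def kmeans(points, k, iters=50):
    """Plain k-means, initialized deterministically with the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers

# Toy feature vectors (assumed, not from the corpus):
# (mean F0 [Hz], duration [s], mean energy [dB])
segments = [
    (180.0, 0.15, 60.0),
    (260.0, 0.40, 72.0),
    (185.0, 0.18, 61.0),
    (255.0, 0.45, 70.0),
    (175.0, 0.14, 59.0),
    (265.0, 0.42, 71.0),
]

# Standardize first so that F0 (hundreds of Hz) does not dominate duration.
labels, centers = kmeans(zscore(segments), k=2)
print(labels)
```

Standardizing the features before clustering is a design choice of this sketch: F0, duration, and energy live on very different scales, and an unscaled Euclidean distance would be driven almost entirely by F0. Mapping the resulting clusters to speech act classes would additionally need the expert labels, which this sketch does not model.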