Hideki Tanaka
Probabilistic Speech Act Type
Tagging System and Its Application
to Speech Translation Systems
Abstract: This technical report describes the author's one-year study on discourse processing, which
covered two research subjects: devising a new speech act type tagging system and investigating
the use of the tags in machine translation. The first part of the report describes a new, efficient
speech act type tagging system. This system covers the tasks of (1) segmenting
a turn into the optimal number of speech act units (SA units), and (2) assigning a speech act
type tag (SA tag) to each SA unit. Our method is based on a theoretically clear statistical model
that integrates linguistic, acoustic and situational information. We report tagging experiments
on Japanese and English dialogue corpora manually labeled with SA tags and then discuss the
performance difference between the two languages. The second half of the report describes the use
of SA tags for translating Japanese and English positive response expressions.
Positive responses appear quite often in task-oriented dialogues like those in our tasks. They are
often highly ambiguous and problematic in speech translation. We show that these expressions
can be translated effectively with the help of dialogue information, namely SA tags.