TR-IT-0294: February 1999

Kristiina Jokinen, Hideaki Iwamoto, Hideki Tanaka

Manual for tagging the SLDB English dialogues with speech acts and topic

Abstract: We describe a dialogue tagging project on the bilingual ATR spoken dialogue corpus. The project is part of a larger research effort whose goal is to promote research and development on speech translation systems based on multi-level information. The discourse-level information consists of dialogue act and topic types. The novel feature of our tagging research is the use of topic tags, which represent the information content of utterances and thus complement dialogue act tags, which represent the speakers' intentions. In this report we describe the design and use of the tag sets for these two features. We also report on a tag browser that we have developed to check the consistency of the tag sets, and on preliminary results concerning tag prediction and bilingual surveys on our corpus.