Dieter Huber
A Bilingual Dialogue Database for
Automatic Spoken Language Interpretation
between Japanese and English
Abstract: This report presents a bilingual Japanese-English dialogue database that
is currently being constructed at ATR for research in spoken language
interpretation via telephone. Ten subjects participated in the recording
of the dialogues, five of whom are native speakers of Standard Japanese
(2 female, 3 male), the other five native speakers of British (1 male)
and American (2 female, 2 male) English. The material consists of seven
short dialogues that were chosen from the ATR Linguistic Database and
represent typical conversations that may reasonably be expected to
occur in conference registration by telephone. The Japanese data were
spoken in both continuous and isolated-phrase (Bunsetsu) modes, the
English data in continuous mode only. The material was recorded in an
anechoic, sound-insulated studio at the ATR Auditory and Visual
Perception Research Laboratories, using high-quality digital recording
equipment. In total, the material comprises approximately five
hours of recorded dialogues. It is intended for the dedicated study
of prosodic transfer in spoken language interpretation and for training
continuous speech recognition systems within the field of interpreting
telephony.