Rachael Serrell, Ohta Yoko, Nick Campbell
Speech Alignment and Prosodic Transcription
Abstract: This paper examines the process of aligning speech files in terms
of words and syllables taken from an utterance of spontaneous or non-spontaneous
speech. It is designed to be used as an instruction manual with example files listed
for reference. The same example files are used throughout so that the reader can always refer back to known data. The example files are taken from two databases
currently under analysis for their spontaneous speech content. The first, Sally,
is a monologue consisting in its original form of twenty minutes of the speaker's
recollections and feelings about Japan. The full sequence of files used in aligning Sally is given in brackets after each command. The second, emmi,
taken from a different speaker, is a set of dialogues collected from a multi-media
experiment where a travel agent advises a client on the various ways to get to his
desired destination. The example files given here are taken from the agent's side
of the conversation.