TR-IT-0268 :1998.07.31

Thomas Wahl

A Speech Recognition Database Library

Abstract: This Technical Report describes a C++ database object that supports experimental work in speech recognition. Such experiments are often inconvenient for the following reasons:

・the number of utterances is often very high,

・the waveform and feature data files can be very large, and

・utterances from different sources may have different formats.

The database is designed as an interface that allows experimenting with many utterances in a standardized fashion, without having to deal directly with the vast amounts of data present in the form of the samples and (after preprocessing) the features.

Section 1 gives an introduction to the purpose of the database, including how to make it ready for use.

Section 2 explains its functionality in detail.

Section 3 contains a description of how to use scripting languages along with the database library.

Section 4 provides examples of using the library both from C++ and from scripting languages.

Section 5 gives some information on extensibility and maintenance, including a description of the data structures.

Finally, Section 6 is intended as a quick reference manual for important details that may get lost in a long report but may be needed quickly when working with the database library.