TR-IT-0072 :1994.9.10

Christian BOITET

On the design of MIDDIM-DB, a data base of ambiguities and disambiguation methods

Abstract: Interactive disambiguation will be necessary in future Machine Translation and Machine Interpretation systems to be used in non-restricted contexts by the general public. In order to determine which ambiguities end users could and should help the system disambiguate, and which (possibly multimodal) disambiguation methods are most appropriate, we have started to build MIDDIM-DB, a data base of real ambiguities and simulated or implemented disambiguation methods. A first version has already been prototyped in HyperCard, and has shown the need for some design improvements. This paper elaborates on the motivations for building such a data base, and presents the main aspects of the design of a second version. An important byproduct of this enterprise is an exact and formal definition of what an ambiguity is, so that an ambiguity becomes a computational object.