TR-IT-0162: March 1996

Kwang-jun Seo, Osamu Furuse

Two Distance Computation Models for TDMT Using Narrow and Broad Concepts on a Thesaurus

Abstract: This paper discusses two distance computation models for TDMT: a distance model and a similarity model. Both models use a semantic dictionary (or thesaurus) whose concept classification system has a Directed Acyclic Graph (DAG) structure, named the Concept Classification Graph (CCG). The distance model computes the distance between two expressions using the narrow concept set on a CCG. The similarity model computes the similarity between two expressions using the broad concept set on a CCG. Each model is divided into three phases of calculation: for expressions, for words, and for concepts. The concept-level calculation uses concept sets on the CCG, and each of the other phases uses the results of the phase that follows it (expressions use word-level results, and words use concept-level results). In addition, a heuristic method for searching for the shortest path between two concepts is provided with the distance model. The experimental results show that the proposed models produce few ambiguities but are slow. A comparison of the two models shows that the similarity model computes about 13 times as fast as the distance model, although it is slightly more ambiguous.
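
The three-phase structure described above can be illustrated with a minimal sketch of the similarity side. The abstract gives no formulas, so everything below is an assumption made for illustration: the toy CCG, the word-to-concept dictionary, the reading of "broad concept set" as a concept together with all of its ancestors, the set-overlap (Jaccard) score at the concept phase, the maximum over concept pairs at the word phase, and the positional average at the expression phase. It is not the authors' model.

```python
from itertools import product

# Hypothetical toy CCG: each concept lists its parent (broader) concepts.
# A concept may have several parents, so the structure is a DAG, not a tree.
PARENTS = {
    "entity": [],
    "animal": ["entity"],
    "pet": ["entity"],
    "device": ["entity"],
    "dog": ["animal", "pet"],
    "cat": ["animal", "pet"],
    "mouse_animal": ["animal"],
    "mouse_device": ["device"],
}

# Hypothetical dictionary: a word may map to more than one concept.
WORD_CONCEPTS = {
    "dog": ["dog"],
    "cat": ["cat"],
    "mouse": ["mouse_animal", "mouse_device"],
}


def broad_concepts(concept):
    """Assumed reading of 'broad concept set': the concept and all ancestors."""
    seen = {concept}
    stack = [concept]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def concept_similarity(a, b):
    """Concept phase (assumed): Jaccard overlap of the two broad concept sets."""
    sa, sb = broad_concepts(a), broad_concepts(b)
    return len(sa & sb) / len(sa | sb)


def word_similarity(w1, w2):
    """Word phase (assumed): best score over all concept pairs of the two words."""
    pairs = product(WORD_CONCEPTS[w1], WORD_CONCEPTS[w2])
    return max(concept_similarity(a, b) for a, b in pairs)


def expression_similarity(e1, e2):
    """Expression phase (assumed): average over positionally aligned words."""
    pairs = list(zip(e1, e2))
    return sum(word_similarity(a, b) for a, b in pairs) / len(pairs)
```

Each phase calls only the phase below it, which mirrors the layering in the abstract. Because the concept phase here needs only two ancestor-set lookups and a set intersection, rather than a shortest-path search over the graph, a sketch like this also suggests why a set-based similarity could be cheaper to compute than a path-based distance.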