Kwang-jun Seo, Osamu Furuse
Two Distance Computation Models for TDMT
Using Narrow and Broad Concepts
on a Thesaurus
Abstract: This paper discusses two distance computation models for TDMT: a distance model and
a similarity model. These models use a semantic dictionary (or thesaurus) whose concept
classification system is structured as a Directed Acyclic Graph (DAG), named the
Concept Classification Graph (CCG). The distance model computes the distance
between two expressions using the narrow concept set on a CCG. The similarity model
computes the similarity between two expressions using the broad concept set on a CCG.
Each model is computed in three phases: for expressions, for words, and for
concepts. The concept-level calculation uses concept sets on the CCG, and each of the other
phases uses the results of the phase after it (the expression phase uses word results, and the
word phase uses concept results). In addition, a heuristic method for searching for
the shortest path between two concepts is provided with the distance model.
Experimental results show that the proposed models produce few ambiguities
but are slow. A comparison of the two models shows that the similarity model
runs about 13 times as fast as the distance model, although it is slightly more
ambiguous.