Tam Wai Lok, Kyonghee Paik
Comparative Analysis of Chinese,
Japanese and Korean Numeral Classifier
Abstract: We present our analysis of numeral classifiers extracted from Japanese,
Korean, and Chinese corpora. We compare how numeral classifiers are
matched with their referents in our corpora against the results produced by
the algorithm of Bond and Paik (2000), which generates classifiers using
semantic classes from the Goi-Taikei ontology. We also attempt
to automatically analyze the Japanese sentences containing classifiers by
typing the classifiers they contain following Bond (2001) and the syntactic
construction following Asahioka et al. (1990). We have identified some
problematic constructions in Chinese and Japanese and pointed out that
classifier types change in the course of translation. We
have also shown that the anaphoric use of numeral classifiers is
problematic for machine translation. In conclusion, we point out the
difficulty of predicting the correct numeral classifier to use when
translating between Chinese, Japanese and Korean, since both the domain
covered by the same type of classifier and the constructions containing
numeral classifiers vary. For further work, we suggest analyzing classifier
constructions using a statistical model based on the data produced here and
applying word sense disambiguation techniques to the referents.