TR-IT-0149: January 31, 1996

Stephane Auberger, Eiichiro Sumita, Hitoshi Iida

A comparative study of Query Reformulation methods on Vector-Space Models

Abstract: Relevance feedback is a well-known method for improving the effectiveness of information retrieval systems. It is based on the automatic, iterative improvement of the textual queries supplied by users. After a brief overview of the system performance, this paper describes several approaches to, and further refinements of, a standard query reformulation method ("Ide-dec hi formula"). The research focused mainly on using information about the origin of each term in the query ("modified Ide-dec hi formula") and, especially, information on terms in non-relevant documents ("Common Term System"). Also reviewed are less expensive methods that reduce the retrieval time as well as the size of the query and document vectors ("fixed expansion size" and "fixed vector size"). In general, all methods are found to be effective in improving performance on a standard test suite, the Virginia Disc One collections.
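As a rough illustration of the standard reformulation step named in the abstract, the sketch below follows the form in which this formula is usually stated in the relevance feedback literature: the new query is the old query, plus the vectors of all retrieved documents judged relevant, minus the vector of the single highest-ranked non-relevant document. It is not taken from this report. The function name, the argument names, and the toy four-term vectors are assumptions made for illustration only, and the refinements studied in the report (the modified formula, the Common Term System, fixed expansion size, fixed vector size) are not modeled.

```python
def ide_dec_hi(query, relevant_docs, nonrelevant_ranked):
    """Sketch of one feedback iteration (hypothetical helper, not the authors' code).

    query              -- list of term weights for the current query
    relevant_docs      -- list of term-weight vectors judged relevant
    nonrelevant_ranked -- non-relevant vectors, best-ranked first
    """
    new_query = list(query)

    # Add every relevant document vector to the query.
    for doc in relevant_docs:
        new_query = [q + d for q, d in zip(new_query, doc)]

    # Subtract only the highest-ranked non-relevant document, if any.
    if nonrelevant_ranked:
        top = nonrelevant_ranked[0]
        new_query = [q - d for q, d in zip(new_query, top)]

    return new_query


# Toy vocabulary of four terms; the weights are made up for the example.
query = [1.0, 0.0, 1.0, 0.0]
relevant = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
nonrelevant = [[0.0, 0.0, 1.0, 1.0], [1.0, 0.0, 0.0, 1.0]]

reformulated = ide_dec_hi(query, relevant, nonrelevant)
print(reformulated)  # [2.0, 2.0, 1.0, -1.0]
```

In this toy case the second term, absent from the original query, enters the reformulated query because both relevant documents contain it, while the fourth term receives a negative weight because it occurs only in the top-ranked non-relevant document. Whether negative weights are kept or clipped to zero is a design choice this sketch leaves open.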