Exploiting Disambiguation and Discrimination in Information Retrieval Systems

Basile, Pierpaolo; Caputo, Annalina; Semeraro, Giovanni

doi:10.1109/WI-IAT.2009.344

Polysemous words have more than one possible meaning, so word ambiguity is a key issue for systems that access textual information. Computational linguistics offers two main methods for coping with word ambiguity: sense disambiguation and sense discrimination. Word Sense Disambiguation is the task of selecting a sense for a word from a set of predefined possibilities, whereas Word Sense Discrimination is the task of dividing the usages of a word into different meanings, relying only on information found in unannotated corpora. This paper proposes a strategy for comparing disambiguation and discrimination systems through an in vivo evaluation in an Information Retrieval scenario. The goal of the evaluation is to establish how disambiguation and discrimination affect retrieval performance.
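
As a rough illustration of the distinction drawn above, the following Python sketch contrasts the two tasks on a toy example. It is not the method evaluated in the paper: the sense inventory `SENSES`, the stopword list, the sample contexts, and the function names `disambiguate` and `discriminate` are all hypothetical. Gloss overlap (a simplified Lesk-style heuristic) and greedy overlap clustering are used here only as stand-ins for real disambiguation and discrimination systems.

```python
# Toy contrast between sense disambiguation and sense discrimination.
# All data and names below are illustrative assumptions, not from the paper.

STOPWORDS = {"the", "a", "an", "at", "in", "and", "to", "of", "or"}

# Hypothetical predefined sense inventory (needed for disambiguation only).
SENSES = {
    "bank": {
        "finance": "institution that accepts money deposits and makes loans",
        "river": "sloping land beside a river or stream",
    }
}

CONTEXTS = [
    "she deposits money at the bank",
    "the bank approved loans and held money",
    "we walked along the river bank",
    "fish swim in the river near the bank",
]


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, context):
    """Select one PREDEFINED sense: the one whose gloss shares the most
    words with the context (simplified Lesk-style overlap)."""
    ctx = tokens(context)
    glosses = SENSES[word]
    return max(glosses, key=lambda sense: len(ctx & tokens(glosses[sense])))


def discriminate(contexts, target, min_overlap=1):
    """Group usages WITHOUT any sense inventory: each context joins the
    existing cluster it overlaps most, or starts a new cluster."""
    clusters = []  # accumulated vocabulary of each induced cluster
    labels = []
    for ctx in contexts:
        bag = tokens(ctx) - {target}
        best, best_overlap = None, 0
        for i, vocab in enumerate(clusters):
            overlap = len(bag & vocab)
            if overlap > best_overlap:
                best, best_overlap = i, overlap
        if best is None or best_overlap < min_overlap:
            clusters.append(set(bag))
            labels.append(len(clusters) - 1)
        else:
            clusters[best] |= bag
            labels.append(best)
    return labels


if __name__ == "__main__":
    for c in CONTEXTS:
        print(disambiguate("bank", c), "<-", c)
    print(discriminate(CONTEXTS, "bank"))  # [0, 0, 1, 1]
```

Disambiguation returns a named sense from the inventory, while discrimination returns only anonymous cluster labels. In a retrieval setting, either kind of label could in principle be indexed alongside or in place of the surface word, which is what makes a common in vivo comparison possible.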