WordNet-based Word Sense Disambiguation for Learning User Profiles
Degemmis, Marco; Lops, Pasquale; Semeraro, Giovanni
2005-01-01
Abstract
The amount of information available on the Web and in Digital Libraries keeps growing, and with it the importance of user modeling and personalized information access. This paper addresses the problem of choosing a document representation that is suitable both for inducing concept-based user profiles and for supporting a content-based retrieval process. We propose a framework for content-based retrieval that integrates a word sense disambiguation algorithm, based on a semantic similarity measure between concepts (synsets) in the WordNet IS-A hierarchy, with a relevance feedback method to induce semantic user profiles. The document representation adopted in the framework, which we call Bag-Of-Synsets (BOS), extends and slightly improves on the classic Bag-Of-Words (BOW) approach, as shown by an extensive experimental session.
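To make the idea of a Bag-Of-Synsets representation concrete, the following is a minimal sketch, not the algorithm used in the paper. It assumes a tiny hand-made IS-A hierarchy (`HYPERNYM`) and sense inventory (`SENSES`) in place of WordNet, and a simple path-based similarity in place of the paper's measure, which the abstract does not specify. All names in the block are hypothetical.

```python
from collections import Counter

# Hypothetical toy IS-A hierarchy (child -> parent); not real WordNet data.
HYPERNYM = {
    "entity": None,
    "animal": "entity",
    "feline": "animal",
    "cat.animal": "feline",
    "rodent": "animal",
    "mouse.animal": "rodent",
    "artifact": "entity",
    "device": "artifact",
    "mouse.device": "device",
    "keyboard.device": "device",
}

# Hypothetical sense inventory: word -> candidate synsets.
SENSES = {
    "cat": ["cat.animal"],
    "mouse": ["mouse.animal", "mouse.device"],
    "keyboard": ["keyboard.device"],
}


def ancestors(synset):
    """Return the path from a synset up to the root, starting with itself."""
    path = []
    while synset is not None:
        path.append(synset)
        synset = HYPERNYM[synset]
    return path


def similarity(a, b):
    """Assumed measure: 1 / (1 + number of edges on the IS-A path from a to b)."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {s: i for i, s in enumerate(path_b)}
    for i, s in enumerate(path_a):
        if s in depth_in_b:  # first shared node is the lowest common ancestor
            return 1.0 / (1 + i + depth_in_b[s])
    return 0.0


def disambiguate(word, context):
    """Pick the sense of `word` that is most similar to the context words' senses."""
    def score(candidate):
        return sum(
            max(similarity(candidate, s) for s in SENSES[other])
            for other in context
            if other != word and other in SENSES
        )
    return max(SENSES[word], key=score)


def bag_of_synsets(tokens):
    """Replace each known token by its chosen synset and count occurrences."""
    return Counter(disambiguate(t, tokens) for t in tokens if t in SENSES)


print(bag_of_synsets(["mouse", "keyboard", "mouse"]))
print(bag_of_synsets(["cat", "mouse"]))
```

In this toy setting, "mouse" resolves to the device sense next to "keyboard" and to the animal sense next to "cat". A Bag-Of-Words representation would map both occurrences to the same feature.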