The JIGSAW Algorithm for Word Sense Disambiguation and Semantic Indexing of Documents

Degemmis, Marco; Lops, Pasquale; Semeraro, Giovanni
2007-01-01

Abstract

Word Sense Disambiguation (WSD) is traditionally considered an AI-hard problem. A breakthrough in this field would have a significant impact on many related fields, such as information retrieval and information extraction. This paper describes JIGSAW, a knowledge-based WSD algorithm that attempts to disambiguate all words in a text by exploiting WordNet senses. The main assumption is that a Part-Of-Speech (POS)-dependent strategy for WSD can be more effective than a single, uniform strategy. The semantics provided by WSD adds value to applications centred on human users. Two empirical evaluations are described in the paper. First, we evaluated the accuracy of JIGSAW on Task 1 of the SEMEVAL-1 competition, which measures the effectiveness of a WSD algorithm in an Information Retrieval system. For the second evaluation, we used semantically indexed documents obtained through a WSD process to train a naive Bayes learner that infers semantic, sense-based user profiles as binary text classifiers. The goal of the second evaluation was to measure the accuracy of the user profiles in selecting relevant documents to be recommended within a document collection.
2007
978-3-540-74781-9

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/112425