UNIBA-SENSE @ CLEF 2009: Robust WSD task
Basile P.;Caputo A.;Semeraro G.
2009-01-01
Abstract
This paper presents the participation of the semantic N-levels search engine SENSE in the CLEF 2009 Ad Hoc Robust-WSD Task. In the same task at CLEF 2008, SENSE showed that WSD can help to improve retrieval, even though the overall performance was not exciting, mainly due to the adoption of a pure Vector Space Model with no heuristics. In this edition, our aim is to demonstrate that the combination of the N-levels model and WSD can improve retrieval performance even when an effective retrieval model is adopted. To reach this aim, we worked on two different strategies. On one hand, a new model, based on Okapi BM25, was adopted at each level. Moreover, we improved the word stemming algorithm and normalized words by removing some characters that made the word mismatch problem more evident. These simple heuristics allowed us to increase the MAP value by 106% compared to our best result obtained at CLEF 2008. On the other hand, we integrated a local relevance feedback technique, called Local Context Analysis, in both indexing levels of the system (keyword and word meaning). The hypothesis that Local Context Analysis can be effective even when it works on word meanings coming from a WSD algorithm is supported by experimental results. In the monolingual task, MAP increased by about 2% when exploiting disambiguation, while GMAP increased from 4% to 9% when we used WSD in both the monolingual and cross-lingual tasks.
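For readers unfamiliar with the Okapi BM25 model mentioned in the abstract, the following is a minimal sketch of the standard BM25 scoring formula, not the SENSE implementation. The function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, and the smoothed IDF variant are assumptions chosen for illustration; the abstract does not state which variant or parameter values were used. Documents are plain token lists, so the same function could in principle score either keywords or word-meaning identifiers, which is how one might picture a per-level ranking.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per tokenized document in `docs`.

    `k1` and `b` are common textbook defaults, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant that never becomes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy collection: the tokens could equally be word-meaning identifiers.
docs = [
    ["bank", "river", "water"],
    ["bank", "money", "loan"],
    ["tree", "leaf"],
]
scores = bm25_scores(["money", "bank"], docs)
```

In this toy collection the second document matches both query terms and therefore ranks above the first, while the third matches neither and scores zero.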