UNIBA-CORE: Combining Strategies for Semantic Textual Similarity.
Caputo, Annalina; Basile, Pierpaolo; Semeraro, Giovanni
2013-01-01
Abstract
This paper describes the UNIBA participation in the Semantic Textual Similarity (STS) core task 2013. We exploited three different systems for computing the similarity between two texts. One system is used as a baseline and represents the best model that emerged from our previous participation in STS 2012. This system is based on a distributional model of semantics capable of also taking into account the syntactic structures that glue words together. In addition, we investigated two different learning strategies exploiting both syntactic and semantic features. The former uses a combination strategy to combine the best machine learning techniques trained on the 2012 training and test sets. The latter tries to overcome the limitations of working with datasets of varying characteristics by selecting only the most suitable dataset for training.
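The core idea behind a distributional approach to STS can be illustrated with a minimal sketch: represent each text as a vector built from word vectors and score the pair by cosine similarity. This is a generic illustration, not the authors' actual system; the toy word vectors below are placeholders standing in for a model trained on corpus co-occurrence statistics, and the real UNIBA system additionally encodes syntactic structure, which this sketch omits.

```python
# Generic sketch of distributional text similarity (NOT the UNIBA system):
# each text vector is the sum of its word vectors, and the similarity of
# two texts is the cosine of the angle between their vectors.
import math

# Toy 3-dimensional word space; a real distributional model would learn
# these vectors from corpus statistics.
word_vectors = {
    "a": [1.0, 0.2, 0.0],
    "man": [0.1, 0.9, 0.3],
    "person": [0.2, 0.8, 0.4],
    "is": [0.5, 0.5, 0.5],
    "walking": [0.0, 0.3, 0.9],
    "strolling": [0.1, 0.2, 0.8],
}

DIMS = 3

def text_vector(text):
    """Sum the vectors of the known words in a whitespace-tokenised text."""
    vec = [0.0] * DIMS
    for token in text.lower().split():
        for i, v in enumerate(word_vectors.get(token, [0.0] * DIMS)):
            vec[i] += v
    return vec

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

score = cosine(text_vector("a man is walking"),
               text_vector("a person is strolling"))
print(round(score, 3))  # high (close to 1.0) for these related texts
```

In STS systems such a raw similarity score is typically one feature among many; the learning strategies mentioned in the abstract would combine it with other syntactic and semantic features in a supervised regressor.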