Mining Linked Open Data through Semi-supervised Learning Methods based on Self-training
FANIZZI, Nicola; D'AMATO, Claudia; ESPOSITO, Floriana
2012-01-01
Abstract
The paper tackles the problem of mining Linked Open Data. The inherent lack of knowledge caused by the open-world assumption made on the semantics of the data model leads to an abundance of data of uncertain classification. We present a semi-supervised machine learning approach. Specifically, a self-training strategy is adopted, which iteratively uses labeled instances to predict labels for unlabeled instances as well. The approach is empirically evaluated through extensive experiments involving several different algorithms, demonstrating the added value of a semi-supervised approach over standard supervised methods.
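The self-training loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the base learners or the confidence criterion used in the paper, so everything here is an assumption made for illustration: a toy nearest-centroid classifier, a distance-margin confidence score, and the hypothetical names `self_train`, `threshold`, and `max_rounds`.

```python
from math import sqrt


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroids(X, y):
    """Toy base learner (an assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(model, x):
    """Return (label, confidence), with confidence a margin in [0, 1]."""
    ranked = sorted(model, key=lambda c: euclidean(model[c], x))
    if len(ranked) == 1:
        return ranked[0], 1.0
    d1 = euclidean(model[ranked[0]], x)
    d2 = euclidean(model[ranked[1]], x)
    total = d1 + d2
    return ranked[0], ((d2 - d1) / total if total > 0 else 0.0)


def self_train(X_lab, y_lab, X_unlab, threshold=0.5, max_rounds=10):
    """Iteratively move confidently predicted instances into the labeled set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        model = centroids(X_lab, y_lab)  # retrain on the current labeled set
        remaining, added = [], False
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                # Accept the predicted label as if it were a true label.
                X_lab.append(x)
                y_lab.append(label)
                added = True
            else:
                remaining.append(x)
        pool = remaining
        if not added or not pool:
            break  # nothing new was labeled, or nothing is left to label
    return centroids(X_lab, y_lab), pool
```

Calling `self_train` with a few labeled points and a pool of unlabeled ones returns the final model together with the instances that never reached the confidence threshold. A stricter `threshold` reduces the risk of reinforcing early mistakes, at the cost of leaving more instances unlabeled.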