
Instance Selection for Semi-Supervised Learning in Multi-Expert Systems: A Comparative Analysis

IMPEDOVO, Sebastiano;
2014

Abstract

Semi-supervised learning methods utilize abundant unlabeled data to enlarge the training set and update classifiers. For this purpose, standard methods label and select unknown samples that are classified with high confidence by the current classifier. This paper presents an experimental investigation of semi-supervised learning and discusses three methods (our feedback-based technique and two algorithms known in the literature) for retraining individual classifiers in a multi-expert scenario. More specifically, we analyze the entire system so that a sample misclassified by a particular expert, with respect to the final decision, can be used to update that expert if the sample is classified with a confidence greater than a specific threshold by the multi-expert system. Experimental tests, carried out on the CEDAR handwritten-digit database, are presented, along with comparisons of accuracy, space, and time among the different methods. For this purpose, an SVM classifier and five different combination techniques at the abstract and measurement levels have been used. The results show that our feedback-based algorithm outperforms the Self-Training and Co-Training algorithms when the training set is very small and a suitable number of iterations is performed in the feedback process.
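The instance-selection step described above — pseudo-labeling only those unlabeled samples that a classifier predicts with confidence above a threshold, then retraining — can be sketched as a basic self-training loop for a single expert. This is a minimal illustration, not the paper's actual implementation: the `self_train` function name, the 0.9 threshold, and the iteration count are assumptions, and a standard scikit-learn SVM stands in for the paper's expert.

```python
# Minimal sketch of confidence-thresholded self-training for one expert.
# Assumptions (not from the paper): function name, threshold=0.9, iterations=3.
import numpy as np
from sklearn.svm import SVC

def self_train(clf, X_lab, y_lab, X_unlab, threshold=0.9, iterations=3):
    """Iteratively pseudo-label high-confidence unlabeled samples and retrain."""
    X_lab, y_lab = X_lab.copy(), y_lab.copy()
    for _ in range(iterations):
        clf.fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        conf = proba.max(axis=1)          # confidence = highest class probability
        pick = conf >= threshold          # select only confident samples
        if not pick.any():
            break
        # Pseudo-label the selected samples and move them into the training set.
        pseudo = clf.classes_[proba[pick].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[pick]])
        y_lab = np.concatenate([y_lab, pseudo])
        X_unlab = X_unlab[~pick]
    clf.fit(X_lab, y_lab)
    return clf
```

In the paper's multi-expert variant, the confidence used for selection would come from the combined decision of the whole ensemble rather than from the single classifier being retrained.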
Use this identifier to cite or link to this document: http://hdl.handle.net/11586/39158