Evaluating Threshold for Retraining Rule in Semi-Supervised Learning using Multi-Expert System
IMPEDOVO, Sebastiano; PIRLO, Giuseppe
2014-01-01
Abstract
The creation of a training set for pattern recognition is a difficult, expensive, and time-consuming task because it requires the efforts of experienced human annotators. On the other hand, unlabeled data can be obtained cheaply, but there are few ways to use them. Semi-supervised learning uses both labeled and unlabeled data for classification tasks. In this paper we propose to apply semi-supervised learning, together with three methods, to re-train individual classifiers in a multi-expert scenario. More specifically, the experiments focus on the acceptance threshold that defines which data are selected in the feedback-based process. Our approach analyzes the entire system, so that a sample misclassified by a particular expert with respect to the final decision can still be used to update that expert, provided the sample is classified with a confidence greater than a specific threshold. Experimental results, carried out on the CEDAR (handwritten digits) database, show a comparison between our approach and the Self-Training and Co-Training algorithms. An SVM classifier and two different combination techniques at the measurement level have been used.
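The selection rule described in the abstract — accept an unlabeled (or system-misclassified) sample for re-training an expert only when that expert's confidence exceeds a threshold — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `select_for_retraining` and the dictionary-of-confidences format are assumptions for the example.

```python
def select_for_retraining(confidences, threshold):
    """Return (sample_index, predicted_label) pairs whose top per-class
    confidence exceeds the acceptance threshold; in the feedback-based
    process these pseudo-labeled samples would be fed back to re-train
    the individual expert."""
    selected = []
    for i, scores in enumerate(confidences):
        # Label with the highest confidence for this sample.
        best_label = max(scores, key=scores.get)
        if scores[best_label] > threshold:
            selected.append((i, best_label))
    return selected

# Toy confidences for three samples over two digit classes.
conf = [
    {"0": 0.95, "1": 0.05},  # confidently "0" -> accepted
    {"0": 0.55, "1": 0.45},  # ambiguous       -> rejected
    {"0": 0.10, "1": 0.90},  # confidently "1" -> accepted
]
print(select_for_retraining(conf, 0.8))
```

Raising the threshold trades coverage for reliability: fewer samples are fed back, but each pseudo-label is more likely to be correct — which is precisely the trade-off the paper's threshold experiments evaluate.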