
Ensembles of density estimators for positive-unlabeled learning

Basile T. M. A.; Di Mauro N.; Esposito F.; Ferilli S.; Vergari A.
2019

Abstract

Positive-Unlabeled (PU) learning considers a set of positive samples and a (usually larger) set of unlabeled ones. This challenging setting requires algorithms to cleverly exploit dependencies hidden in the unlabeled data in order to build models that accurately discriminate between positive and negative samples. We propose to exploit probabilistic generative models to characterize the distribution of the positive samples, and to label as reliable negatives those unlabeled samples lying in the lowest-density regions of that distribution. The overall framework is flexible enough to be applied to many domains, leveraging tools provided by years of research in the probabilistic generative model community. In addition, we show how to create mixtures of generative models by adopting a well-known bagging method from the discriminative framework as an effective and cheap alternative to classical Expectation Maximization. Results on several benchmark datasets show the performance and flexibility of the proposed approach.
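The core idea of the abstract — fit a density model to the positive samples, then treat the lowest-density unlabeled points as reliable negatives — can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: it uses a single Gaussian fit as a stand-in for the richer generative models (and their bagged mixtures) the authors employ, and the toy data, the 50% negative fraction, and the `log_density` helper are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: positives from N(0, I); unlabeled is a mix of
# hidden positives and hidden negatives (a shifted cluster).
positives = rng.normal(0.0, 1.0, size=(200, 2))
unlabeled = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),  # hidden positives
    rng.normal(4.0, 1.0, size=(100, 2)),  # hidden negatives
])

# Fit a single Gaussian density to the positive samples
# (a crude stand-in for the paper's generative models).
mu = positives.mean(axis=0)
cov = np.cov(positives, rowvar=False)
inv_cov = np.linalg.inv(cov)

def log_density(x):
    """Unnormalized Gaussian log-density under the positive model."""
    d = x - mu
    return -0.5 * np.einsum('ij,jk,ik->i', d, inv_cov, d)

# Label the lowest-density unlabeled points as reliable negatives
# (here: the bottom half by log-density, an arbitrary choice).
scores = log_density(unlabeled)
reliable_negatives = unlabeled[np.argsort(scores)[:100]]
```

The reliable negatives recovered this way, together with the original positives, can then be fed to any standard binary classifier; the paper's bagging step would instead fit several density estimators on bootstrap resamples of the positives and aggregate their scores.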
Files in this record:
File: Basile2019_Article_EnsemblesOfDensityEstimatorsFo.pdf
Access: open access
Description: Main article
Type: Publisher's version
License: Creative Commons
Size: 594.14 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11586/236580
Citations
  • PMC: ND
  • Scopus: 4
  • Web of Science: 2