Incremental adaptive semi-supervised fuzzy clustering for data stream classification


Data stream mining refers to methods that can mine continuously arriving and evolving data sequences, or even large-scale static databases. Most data stream classification methods are supervised, so they require labeled samples, which are more difficult and expensive to obtain than unlabeled ones. Semi-supervised learning algorithms address this problem by using unlabeled samples together with a few labeled ones to build classification models. We recently introduced a method for data stream classification based on an incremental semi-supervised fuzzy clustering algorithm. This method processes data belonging to different classes, assuming that they become available over time as chunks. It creates a fixed number of clusters, set equal to the number of classes. In real-world contexts, a fixed number of clusters may not adequately capture the evolving structure of streaming data. To overcome this limitation, in this work we extend our method with a dynamic component that adapts the number of clusters. Preliminary experimental results on a real-world benchmark dataset show the effectiveness of the dynamic mechanism introduced in the method.
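To make the chunk-wise idea in the abstract concrete, here is a minimal sketch, not the authors' algorithm. It assumes plain fuzzy c-means (without the supervised term of a semi-supervised objective), warm-starts each chunk from the previous prototypes, and uses the few labeled samples only to name the clusters. The number of clusters is fixed to the number of classes, as in the original method; the dynamic component is not modelled, since the abstract does not specify it. All function names, the `-1` convention for unlabeled samples, and the toy data generator are hypothetical.

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0):
    """Standard fuzzy c-means membership update (fuzzifier m > 1)."""
    # Squared distances, shape (n_samples, n_clusters); epsilon avoids division by zero.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def update_centers(X, U, m=2.0):
    """Prototype update: mean of the samples weighted by membership^m."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]

def cluster_chunk(X, centers, m=2.0, n_iter=50):
    """Run a fixed number of fuzzy c-means iterations on one chunk."""
    for _ in range(n_iter):
        U = fuzzy_memberships(X, centers, m)
        centers = update_centers(X, U, m)
    return centers, fuzzy_memberships(X, centers, m)

def label_clusters(U, y, n_classes):
    """Name each cluster after the class with the largest summed membership
    among labeled samples (assumption: y == -1 marks an unlabeled sample)."""
    labeled = y >= 0
    onehot = np.eye(n_classes)[y[labeled]]
    votes = U[labeled].T @ onehot  # shape (n_clusters, n_classes)
    return votes.argmax(axis=1)

def run_stream(chunks, n_classes, m=2.0):
    """Process chunks in order, carrying the prototypes from one chunk to the next."""
    centers, cluster_labels = None, np.arange(n_classes)
    for X, y in chunks:
        if centers is None:
            # Assumption: the first chunk holds at least one labeled sample per class.
            centers = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
        centers, U = cluster_chunk(X, centers, m)
        if (y >= 0).any():
            cluster_labels = label_clusters(U, y, n_classes)
    return centers, cluster_labels

def predict(X, centers, cluster_labels, m=2.0):
    """Classify each sample with the label of its highest-membership cluster."""
    U = fuzzy_memberships(X, centers, m)
    return cluster_labels[U.argmax(axis=1)]

def make_chunk(rng, n=60, labeled_per_class=3):
    """Synthetic two-class toy chunk (hypothetical data, for illustration only)."""
    y_true = np.arange(n) % 2
    means = np.array([[0.0, 0.0], [5.0, 5.0]])
    X = means[y_true] + 0.5 * rng.standard_normal((n, 2))
    y = np.full(n, -1)
    for c in range(2):
        y[np.flatnonzero(y_true == c)[:labeled_per_class]] = c
    return X, y
```

A stream could then be simulated with `chunks = [make_chunk(np.random.default_rng(0)) for _ in range(3)]`, followed by `run_stream(chunks, n_classes=2)` and `predict(...)` on new samples. Carrying only the prototypes forward keeps memory bounded regardless of stream length, which is the main appeal of processing data in chunks.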


Gabriella Casalino; Giovanna Castellano; Corrado Mencar

2018-01-01



Year	2018
ISBN	978-1-5386-1376-4
Publication type	4.1 Conference proceedings contribution

Files in this record:

File	Availability	Type	License	Size	Format
eais2018.pdf	Not available	Publisher's version	Not public (private/restricted access)	645.81 kB	Adobe PDF
EAIS_camera_ready.pdf	Not available	Post-print (pre-editorial version of the article)	Creative Commons	646.54 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/224264
