Incremental adaptive semi-supervised fuzzy clustering for data stream classification

Gabriella Casalino; Giovanna Castellano; Corrado Mencar
2018-01-01

Abstract

Data stream mining refers to methods that can mine continuously arriving, evolving data sequences, or even large-scale static databases. Most data stream classification methods are supervised, so they require labeled samples, which are more difficult and expensive to obtain than unlabeled ones. Semi-supervised learning algorithms address this problem by using unlabeled samples together with a few labeled ones to build classification models. We recently introduced a method for data stream classification based on an incremental semi-supervised fuzzy clustering algorithm. This method processes data belonging to different classes under the assumption that they become available over time as chunks. It creates a fixed number of clusters, set equal to the number of classes. In real-world contexts, a fixed number of clusters may not adequately capture the evolving structure of streaming data. To overcome this limitation, in this work we extend our method with a dynamic component that adapts the number of clusters. Preliminary experimental results on a real-world benchmark dataset show the effectiveness of the dynamic mechanism introduced in the method.
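As a rough illustration of the chunk-based scheme described in the abstract, the following Python sketch refines one cluster prototype per class as chunks arrive, using a few labeled samples alongside unlabeled ones. It is not the authors' algorithm: the function names, the hard clamping of labeled samples to their class cluster, the fuzzifier `m = 2`, the standard fuzzy c-means membership formula, and the toy data are all assumptions made for illustration. The dynamic adjustment of the number of clusters is not shown.

```python
import math

def memberships(x, centers, m=2.0):
    """Standard fuzzy c-means membership of point x in each cluster."""
    dists = [math.dist(x, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    total = sum(inv)
    return [v / total for v in inv]

def process_chunk(chunk, labels, centers, m=2.0, iters=20):
    """Refine centers on one chunk; labels[i] is a class index or None."""
    for _ in range(iters):
        u = []
        for x, y in zip(chunk, labels):
            if y is None:
                u.append(memberships(x, centers, m))
            else:
                # Assumption: a labeled sample is clamped to its class cluster.
                u.append([1.0 if k == y else 0.0 for k in range(len(centers))])
        new_centers = []
        for k, old in enumerate(centers):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # no support for this cluster in the chunk
                continue
            new_centers.append(tuple(
                sum(w * x[d] for w, x in zip(weights, chunk)) / total
                for d in range(len(old))))
        centers = new_centers
    return centers

def classify(x, centers, m=2.0):
    """Assign x to the cluster (class) with the highest membership."""
    u = memberships(x, centers, m)
    return max(range(len(u)), key=u.__getitem__)

# Toy stream: two classes, two chunks, only some samples labeled.
centers = [(0.0, 0.0), (1.0, 1.0)]  # one prototype per class, arbitrary start
stream = [
    ([(0.1, 0.2), (0.0, 0.1), (5.0, 5.1), (4.9, 5.0)], [0, None, 1, None]),
    ([(0.3, 0.1), (5.2, 4.8), (5.1, 5.3)], [None, None, 1]),
]
for chunk, labels in stream:
    # Prototypes from the previous chunk seed the clustering of the next one.
    centers = process_chunk(chunk, labels, centers)

print(classify((0.2, 0.2), centers), classify((5.0, 5.0), centers))  # prints: 0 1
```

Carrying the prototypes from one chunk into the next is what makes the sketch incremental: no earlier chunk needs to be kept in memory.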
2018
ISBN: 978-1-5386-1376-4
Files in this item:

File: eais2018.pdf (not available)
Type: Publisher's version
License: Not public - private/restricted access
Size: 645.81 kB
Format: Adobe PDF

File: EAIS_camera_ready.pdf (not available)
Description: Pre-editorial version of the article
Type: Post-print
License: Creative Commons
Size: 646.54 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/224264
Citations
  • PMC: N/A
  • Scopus: 22
  • ISI: 10