Incremental adaptive semi-supervised fuzzy clustering for data stream classification
Gabriella Casalino; Giovanna Castellano; Corrado Mencar
2018-01-01
Abstract
Data stream mining refers to methods that can mine continuously arriving, evolving data sequences, or even large-scale static databases. Most data stream classification methods are supervised and therefore require labeled samples, which are more difficult and expensive to obtain than unlabeled ones. Semi-supervised learning algorithms address this problem by using unlabeled samples together with a few labeled ones to build classification models. We recently introduced a method for data stream classification based on an incremental semi-supervised fuzzy clustering algorithm. This method processes data belonging to different classes, assuming that they become available over time as chunks. It creates a fixed number of clusters, set equal to the number of classes. In real-world contexts, a fixed number of clusters may not adequately capture the evolving structure of streaming data. To overcome this limitation, in this work we extend our method with a dynamic component that adapts the number of clusters. Preliminary experimental results on a real-world benchmark dataset show the effectiveness of the dynamic mechanism.

File | Description | Type | License | Size | Format | Availability
---|---|---|---|---|---|---
eais2018.pdf | | Publisher's version | Non-public (private/restricted access) | 645.81 kB | Adobe PDF | Not available
EAIS_camera_ready.pdf | Pre-editorial version of the article | Post-print | Creative Commons | 646.54 kB | Adobe PDF | Not available
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.