Data stream mining refers to methods for mining continuously arriving, evolving data sequences, or even large-scale static databases. Most data stream classification methods are supervised and therefore require labeled samples, which are more difficult and expensive to obtain than unlabeled ones. Semi-supervised learning algorithms address this problem by using unlabeled samples together with a few labeled ones to build classification models. We recently introduced a method for data stream classification based on an incremental semi-supervised fuzzy clustering algorithm. This method processes data belonging to different classes under the assumption that they become available over time as chunks. It creates a fixed number of clusters, set equal to the number of classes. In real-world contexts, a fixed number of clusters may not adequately capture the evolving structure of streaming data. To overcome this limitation, in this work we extend our method with a dynamic component that adapts the number of clusters as the stream evolves. Preliminary experimental results on a real-world benchmark dataset show the effectiveness of this dynamic mechanism.
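The chunk-wise idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: plain fuzzy c-means stands in for the semi-supervised objective, prototypes are simply carried over from one chunk to the next, and the rule for adding a cluster (a purity threshold on the labeled samples in each cluster) is a hypothetical stand-in for the dynamic component. All function names and the `purity` parameter are assumptions made for illustration.

```python
import math
from collections import Counter


def memberships(x, prototypes, m=2.0):
    """Fuzzy c-means memberships of point x in each prototype (they sum to 1)."""
    d = [math.dist(x, p) for p in prototypes]
    if any(di == 0.0 for di in d):
        # x coincides with at least one prototype: share membership among those.
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    inv = [di ** (-2.0 / (m - 1.0)) for di in d]
    total = sum(inv)
    return [v / total for v in inv]


def update_prototypes(points, prototypes, m=2.0, iters=20):
    """Run a few fuzzy c-means iterations, warm-started from `prototypes`."""
    for _ in range(iters):
        u = [memberships(x, prototypes, m) for x in points]
        new = []
        for j, old in enumerate(prototypes):
            w = [row[j] ** m for row in u]
            total = sum(w)
            if total == 0.0:
                new.append(old)  # no point supports this prototype; keep it
                continue
            new.append(tuple(
                sum(wi * x[k] for wi, x in zip(w, points)) / total
                for k in range(len(old))
            ))
        prototypes = new
    return prototypes


def hard_assign(x, prototypes):
    """Index of the prototype with the highest membership for x."""
    u = memberships(x, prototypes)
    return max(range(len(u)), key=u.__getitem__)


def process_chunk(points, labels, prototypes, purity=0.8):
    """Process one chunk; labels[i] is a class name or None if unlabeled.

    Returns (prototypes, cluster_labels). The purity rule below is a
    hypothetical stand-in for the dynamic component, not the paper's rule.
    """
    prototypes = update_prototypes(points, prototypes)
    members = [[] for _ in prototypes]
    for x, y in zip(points, labels):
        if y is not None:
            members[hard_assign(x, prototypes)].append((x, y))
    added = []
    for group in members:
        if not group:
            continue
        top, n_top = Counter(y for _, y in group).most_common(1)[0]
        if n_top / len(group) < purity:
            # Impure cluster: seed a new prototype at the mean of the
            # labeled samples that disagree with the majority class.
            rest = [x for x, y in group if y != top]
            added.append(tuple(
                sum(x[k] for x in rest) / len(rest) for k in range(len(rest[0]))
            ))
    if added:
        prototypes = update_prototypes(points, list(prototypes) + added)
    # Name each cluster by majority vote of its labeled samples.
    votes = [Counter() for _ in prototypes]
    for x, y in zip(points, labels):
        if y is not None:
            votes[hard_assign(x, prototypes)][y] += 1
    cluster_labels = [v.most_common(1)[0][0] if v else None for v in votes]
    return prototypes, cluster_labels
```

In use, the prototypes returned for one chunk would be passed as the starting prototypes for the next, so the model is updated incrementally and the number of clusters can grow when a chunk reveals structure the current clusters do not separate.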
|Title:||Incremental adaptive semi-supervised fuzzy clustering for data stream classification|
|Publication date:||2018|
|Appears in categories:||4.1 Contribution in conference proceedings|