On Homogeneity Evaluation and Seed Selection in Clustering Relational Data
APPICE, Annalisa; LANZA, Antonietta
2007-01-01
Abstract
Clustering is a data mining task that groups objects so that the data inside each cluster model the continuity of some environment, while separate clusters model variation over it. CORSO is a multi-relational data mining method that discovers clusters of structured objects which may be related to each other according to some relation defining a discrete data structure. Clusters are built by merging partially overlapping sets of neighbors that are homogeneous with respect to the cluster description. The quality of the clusters depends on how cluster homogeneity is evaluated and on which objects are selected as seeds in the neighborhood construction. To address these issues, we illustrate some solutions whose validity is confirmed by experimental results on artificial and real data.
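The neighborhood-merging idea in the abstract can be illustrated with a minimal sketch. This is not CORSO itself: CORSO works on structured objects with relational cluster descriptions, whereas the sketch assumes each object carries a single numeric value, that a set is homogeneous when the spread of its values stays within a threshold, and that seeds are tried in order of decreasing number of neighbors. The names `is_homogeneous`, `cluster`, and `max_spread` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def is_homogeneous(members: Set[str], values: Dict[str, float], max_spread: float) -> bool:
    """Assumed homogeneity test: the value range of the set is within max_spread."""
    vals = [values[m] for m in members]
    return max(vals) - min(vals) <= max_spread


def cluster(adjacency: Dict[str, Set[str]], values: Dict[str, float],
            max_spread: float) -> List[Set[str]]:
    """Grow clusters by merging overlapping homogeneous neighborhoods."""
    # Assumed seed selection: objects with more neighbors first, ties by name.
    seeds = sorted(adjacency, key=lambda o: (-len(adjacency[o]), o))
    clusters: List[Set[str]] = []
    assigned: Set[str] = set()

    for seed in seeds:
        if seed in assigned:
            continue
        # Neighborhood of the seed under the discrete relation.
        hood = {seed} | adjacency[seed]
        if not is_homogeneous(hood, values, max_spread):
            hood = {seed}  # fall back to the seed alone
        new_members = hood - assigned  # keep the result a partition

        # Merge into the first cluster that overlaps the neighborhood,
        # provided the merged set is still homogeneous.
        for existing in clusters:
            if hood & existing and is_homogeneous(existing | new_members, values, max_spread):
                existing |= new_members
                break
        else:
            clusters.append(set(new_members))
        assigned |= new_members

    return clusters


# Toy chain a - b - c - d - e with two value regimes.
adjacency = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
values = {"a": 1.0, "b": 1.2, "c": 1.1, "d": 5.0, "e": 5.3}
print(cluster(adjacency, values, max_spread=0.5))
```

On this toy chain the sketch separates the low-valued objects `a`, `b`, `c` from the high-valued objects `d`, `e`. Changing the seed order or the homogeneity test can change the resulting partition, which is the sensitivity the abstract points to.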