Classifying Aggregated Data: a Symbolic Data Analysis Approach
Appice, Annalisa; Esposito, Floriana; Malerba, Donato
2006-01-01
Abstract
As society comes to depend more and more on information, statistical confidentiality remains an essential issue. Data confidentiality means that neither the legally protected stored data nor the activity results of an individual respondent can be identified. This is a particular concern for official data. One solution to confidentiality problems is to aggregate micro-data concerning groups of individuals and to make only the resulting datasets available to external agencies and institutes. This approach underlies Symbolic Data Analysis, which generalizes data mining methods, such as those developed for classification tasks, to the case of symbolic objects (SOs). These objects synthesize information about a group of individuals in a population, possibly stored in a relational database, and preserve the confidentiality of the original micro-data. In this paper, a lazy-learning approach to classifying SOs is presented. The method, named SO-NN, is evaluated on symbolic datasets.
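The two ideas in the abstract, aggregating micro-data into symbolic objects and classifying them lazily, can be illustrated with a minimal Python sketch. This is not the SO-NN method itself: the record does not describe its dissimilarity measure or voting scheme, so the variable names (`age`, `sector`), the functions `aggregate`, `dissimilarity` and `classify`, and the distance used here are all assumptions chosen only for illustration.

```python
from collections import Counter


def aggregate(records):
    """Summarise a group of micro-data records as one symbolic object.

    Individual values are replaced by an interval and a set, so no single
    respondent's record is exposed. (Hypothetical variables: age, sector.)
    """
    ages = [r["age"] for r in records]
    return {
        "age": (min(ages), max(ages)),             # interval-valued variable
        "sector": {r["sector"] for r in records},  # set-valued variable
    }


def dissimilarity(a, b):
    """Illustrative distance (an assumption, not the measure used by SO-NN).

    Sums the gaps between interval endpoints and the Jaccard distance
    between the category sets; the two terms are left unnormalised.
    """
    interval_gap = abs(a["age"][0] - b["age"][0]) + abs(a["age"][1] - b["age"][1])
    union = a["sector"] | b["sector"]
    jaccard = 1 - len(a["sector"] & b["sector"]) / len(union) if union else 0.0
    return interval_gap + jaccard


def classify(query, training, k=3):
    """Lazy learning: no model is built; neighbours are ranked at query time."""
    nearest = sorted(training, key=lambda pair: dissimilarity(query, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Here `training` is assumed to be a list of `(symbolic_object, label)` pairs. A query object built with `aggregate` is assigned the majority label among its `k` least dissimilar training objects.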