Inductive Learning from Numerical and Symbolic Data: An Integrated Framework

ESPOSITO, Floriana; MALERBA, Donato
2001-01-01

Abstract

Numerical induction from data is a statistical data analysis task that uses a tabular model, with almost exclusively numerical features, as its data representation formalism. The output representations vary: from functions to probability distributions, from surface equations to tables of indexes. One approach to extending classical data analysis techniques to symbolic objects is Symbolic Data Analysis: the input and output of classical techniques are expressed symbolically, which guarantees the comprehensibility of both the observations and the results, while the processing techniques, although appropriately adapted, maintain the efficiency of classical statistical inferential models. In Machine Learning, too, several methods have been proposed to extend some inductive approaches from statistical data analysis to data represented as attribute-value pairs. Some of these approaches transform ideas and principles from numerical induction to handle propositional calculus descriptions; others combine different techniques in order to treat numerical-continuous data and algebraic-symbolic data differently. The aim is to improve the efficiency and preserve the expressive power of the representations during the learning process, and to retain the accuracy and flexibility of the numerical techniques during the recognition phase. This kind of integration becomes increasingly complex with first-order computational learning models, which are useful for handling object descriptions in structured domains, where not only the properties of objects but also the relations between different objects must be considered. The need arises to integrate different computational strategies, different knowledge representations and different processing methods, either in a naive combination of classifiers or, more meaningfully, in a real integration within a single theoretical framework.
When building machine learning systems, increasing attention is given to handling symbolic and numerical information within the same system. In this paper, we address the problem of handling both numerical and symbolic data in first-order models, distinguishing between the phase of model generation from examples and the phase of model recognition by means of a flexible probabilistic subsumption test.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/133311