In this paper we propose an extension of the naive Bayes classification method to the multi-relational setting. In this setting, training data are stored in several tables related by foreign key constraints, and each example is represented by a set of related tuples rather than by a single row, as in the classical data mining setting. This work is characterized by three aspects. First, it takes an integrated approach to computing the posterior probabilities for each class, making use of first-order classification rules. Second, it applies to both discrete and continuous attributes by means of supervised discretization. Third, it takes into account knowledge of the data model embedded in the database schema when generating classification rules. The proposed method has been implemented in the new system Mr-SBC, which is tightly integrated with a relational DBMS. Testing has been performed on two datasets and four benchmark tasks. Results on predictive accuracy and efficiency are in favour of Mr-SBC for the most complex tasks.
Mr-SBC: a Multi-Relational Naive Bayes Classifier / CECI M; APPICE A; MALERBA D. - 2838 (2003), pp. 95-106. (Paper presented at the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2003, held in Cavtat-Dubrovnik, Croatia, September 22-26, 2003.)
Title: | Mr-SBC: a Multi-Relational Naive Bayes Classifier |
Authors: | |
Publication date: | 2003 |
Journal: | |
Handle: | http://hdl.handle.net/11586/136736 |
ISBN: | 3-540-20085-1 |
Appears in categories: | 4.1 Contribution in conference proceedings |