ANALYTICAL PROFILE ESTIMATION IN DATABASE-SYSTEMS

LEFONS, Ezio;TANGORRA, Filippo
1995-01-01

Abstract

Most of the parameters that constitute a database statistical profile are related to record selectivity. For estimating record selectivity factors, nonparametric methods are preferable to parametric ones because they make no a priori assumptions about the data distribution and generally provide accurate results. Nonparametric methods fall into two classes: the usual scale-based methods, which work by partitioning attribute ranges into scales, and the analytic methods discussed in this paper, which are scale independent. Our analytic method is based on computing a set of parameters, the so-called Canonical Coefficients, which characterize the multivariate distribution of the data. From the canonical coefficients, the main parameters of database statistical profiles can be easily defined and calculated efficiently, in terms of both computation time and estimation accuracy. In addition, some important applications of particular interest to statistical database systems can be developed. Experimental results on real databases are presented that demonstrate the versatility and reliability of the analytic approach.
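The abstract does not define the Canonical Coefficients, so the following is only an illustrative sketch of what a scale-independent, analytic selectivity estimate could look like for a single attribute. It assumes, purely for illustration, that the attribute's density is approximated by a truncated Legendre series whose coefficients are sample means of the Legendre polynomials; the function names `fit_coefficients` and `range_selectivity`, the `degree` parameter, and the univariate setting are all hypothetical and are not taken from the paper.

```python
import numpy as np
from numpy.polynomial import legendre as L


def fit_coefficients(values, lo, hi, degree=8):
    """Summarize an attribute by degree+1 series coefficients (assumed form).

    Values are mapped from [lo, hi] onto [-1, 1]; the density is then
    approximated as sum_k coeffs[k] * P_k(x), where
    coeffs[k] = (2k + 1) / 2 * mean(P_k(x_i)).
    """
    x = 2.0 * (np.asarray(values, dtype=float) - lo) / (hi - lo) - 1.0
    means = L.legvander(x, degree).mean(axis=0)  # sample mean of each P_k
    k = np.arange(degree + 1)
    return (2 * k + 1) / 2.0 * means


def range_selectivity(coeffs, lo, hi, a, b):
    """Estimate the fraction of records with a <= value <= b.

    Integrates the Legendre series analytically between the rescaled
    bounds, so no histogram buckets (scales) are involved.
    """
    a_s, b_s = 2.0 * (np.clip([a, b], lo, hi) - lo) / (hi - lo) - 1.0
    antideriv = L.legint(coeffs)  # Legendre series of the antiderivative
    est = L.legval(b_s, antideriv) - L.legval(a_s, antideriv)
    return float(np.clip(est, 0.0, 1.0))
```

In this sketch the stored profile is just the short coefficient vector, and a range predicate is answered by evaluating an antiderivative at two points. For example, `range_selectivity(fit_coefficients(data, 0, 100), 0, 100, 25, 75)` estimates the selectivity of `25 <= value <= 75` for an attribute with domain `[0, 100]`.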

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/129727