An Analytic Approach to Statistical Databases

Lefons, Ezio; Silvestri, A; Tangorra, Filippo

In the commonly adopted data models (for example, Chen's entity-relationship data model [1]), an attribute is a mapping from an entity set or a relationship set to a value set. The intension of such a mapping is given implicitly or explicitly in the data model, whereas its extension can generally be represented by the set {<entity, value>}, as in the relational model. We propose an alternative data model for statistical databases in which an attribute is represented by its analytic properties, namely the distribution function of its values. These analytic properties are described by a set of parameters, which we call the canonical coefficients of the attribute. The canonical coefficients can be used to answer the usual statistical queries without accessing the data. In particular, we present: 1) methods for computing and updating the canonical coefficients, and 2) the use of the canonical coefficients to answer the main statistical queries, including in distributed statistical database environments. In addition, we discuss an application of these parameters to query decomposition in distributed database environments.
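To make the idea concrete, the following is a minimal sketch of how a small set of coefficients might stand in for an attribute's values. The abstract does not state the form of the canonical coefficients, so the sketch assumes, purely for illustration, that they are the Legendre series coefficients of the value density on a known interval [low, high]. The class `AttributeSummary`, its methods, and the default degree are hypothetical names and choices, not the authors' definitions.

```python
from dataclasses import dataclass, field


def legendre_values(t, degree):
    """Return [P_0(t), ..., P_degree(t)] using Bonnet's three-term recurrence."""
    values = [1.0, t]
    for k in range(1, degree):
        values.append(((2 * k + 1) * t * values[k] - k * values[k - 1]) / (k + 1))
    return values[: degree + 1]


@dataclass
class AttributeSummary:
    """Hypothetical coefficient-based summary of one numeric attribute."""
    low: float            # assumed known lower bound of the value set
    high: float           # assumed known upper bound of the value set
    degree: int = 8       # assumed truncation order of the series (>= 1)
    count: int = 0
    sums: list = field(default_factory=list)  # running sums of P_k(t_i)

    def __post_init__(self):
        if not self.sums:
            self.sums = [0.0] * (self.degree + 1)

    def _scale(self, x):
        # Map [low, high] onto [-1, 1], where Legendre polynomials are orthogonal.
        return (2.0 * x - self.low - self.high) / (self.high - self.low)

    def update(self, x, sign=1):
        """Insert (sign=1) or delete (sign=-1) one value without rescanning data."""
        for k, p in enumerate(legendre_values(self._scale(x), self.degree)):
            self.sums[k] += sign * p
        self.count += sign

    def merge(self, other):
        """Combine the summary kept at another site (distributed setting)."""
        if (self.low, self.high, self.degree) != (other.low, other.high, other.degree):
            raise ValueError("summaries must share interval and degree")
        self.sums = [a + b for a, b in zip(self.sums, other.sums)]
        self.count += other.count

    def coefficients(self):
        # Density on [-1, 1] is approximated by sum_k c_k * P_k(t),
        # with c_k = (2k + 1) / 2 * mean(P_k(t_i)).
        return [(2 * k + 1) / 2 * s / self.count for k, s in enumerate(self.sums)]

    def mean(self):
        # P_1(t) = t, so the mean of the scaled values is sums[1] / count.
        t_mean = self.sums[1] / self.count
        return (t_mean * (self.high - self.low) + self.low + self.high) / 2

    def estimate_count(self, x1, x2):
        """Approximate number of values in [x1, x2] from the coefficients alone."""
        p1 = legendre_values(self._scale(x1), self.degree + 1)
        p2 = legendre_values(self._scale(x2), self.degree + 1)
        c = self.coefficients()
        total = c[0] * (self._scale(x2) - self._scale(x1))
        for k in range(1, self.degree + 1):
            # Antiderivative of P_k is (P_{k+1} - P_{k-1}) / (2k + 1) for k >= 1.
            total += c[k] * ((p2[k + 1] - p2[k - 1]) - (p1[k + 1] - p1[k - 1])) / (2 * k + 1)
        return self.count * total
```

Under these assumptions, the summary is updated additively on insertion or deletion, summaries from different sites combine by adding their running sums, and queries such as `mean()` or `estimate_count(x1, x2)` read only the stored coefficients. Range counts are approximations whose quality depends on the chosen degree.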