Benchmark for evaluating approximate query processing on data streams
Di Tria, Francesco;Lefons, Ezio;Tangorra, Filippo
2017-01-01
Abstract
Outsourcing the processing of data streams requires a service provider to collect and store data on behalf of a company that lacks the resources to manage those streams itself. If the company does not trust the service provider, it must check the validity of the answers returned when querying the data store, since the results may not be reliable. Methods for approximate query processing can be used to evaluate these answers. Such methods return fast answers computed from data synopses, and these approximate results can be used to validate those obtained from the provider on the basis of an accuracy estimate. In this paper, an extension of the traditional TPC-H benchmark is used to compare three methods for approximate query processing in terms of performance and accuracy.
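To make the validation idea concrete, here is a minimal sketch of how a company might check a provider's SUM answer against a locally kept synopsis. The abstract does not name the three compared methods, so this sketch assumes a plain uniform reservoir sample as the synopsis and a normal-approximation confidence interval as the accuracy estimate; the names `ReservoirSynopsis`, `estimate_sum`, and `is_plausible` are hypothetical and not taken from the paper.

```python
import math
import random
import statistics


class ReservoirSynopsis:
    """Fixed-size uniform random sample of a stream (reservoir sampling).

    Assumption: this stands in for a generic data synopsis; it is not
    one of the methods evaluated in the paper.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.sample = []
        self.seen = 0  # number of stream items observed so far
        self._rng = random.Random(seed)

    def add(self, value):
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(value)
        else:
            # Keep the new item with probability capacity / seen.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = value

    def estimate_sum(self, z=1.96):
        """Return (estimate, half_width) of an approximate interval for SUM."""
        n = len(self.sample)
        if n == 0:
            return 0.0, 0.0
        estimate = self.seen * statistics.fmean(self.sample)
        if n == self.seen:
            return estimate, 0.0  # the sample holds the whole stream: exact
        if n < 2:
            return estimate, math.inf  # spread cannot be estimated
        std_err = statistics.stdev(self.sample) / math.sqrt(n)
        # Finite population correction for sampling without replacement.
        fpc = math.sqrt((self.seen - n) / (self.seen - 1))
        return estimate, z * self.seen * std_err * fpc


def is_plausible(provider_answer, synopsis, z=1.96):
    """Accept the provider's SUM if it lies inside the approximate interval."""
    estimate, half_width = synopsis.estimate_sum(z)
    return abs(provider_answer - estimate) <= half_width
```

In use, the company would feed each incoming value to `add` and later call `is_plausible` with the provider's answer. An answer outside the interval is only a signal for closer inspection, since with `z=1.96` a correct answer is still expected to fall outside roughly 5% of the time.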