Evaluation of data warehouse design methodologies in the context of big data
Di Tria, Francesco; Lefons, Ezio; Tangorra, Filippo
2017-01-01
Abstract
Data warehouse design methodologies require a novel approach in the Big Data context, because they must address the issues raised by the 5 Vs (Volume, Velocity, Variety, Veracity, and Value). It is therefore mandatory to support the designer with automatic techniques that can quickly produce a multidimensional schema by using and integrating several data sources, which may also be unstructured and therefore require ontology-based reasoning. Accordingly, the methodologies have to adopt agile techniques, so that the multidimensional schema can be changed as the business requirements change, without repeating the complete design process. Furthermore, hybrid approaches must be used instead of the traditional data-driven or requirement-driven approaches, in order to preserve adherence to user requirements while producing a valuable multidimensional schema that is compliant with the data sources. In this paper, we perform a metric comparison among different methodologies, in order to demonstrate that methodologies classified as hybrid, ontology-based, automatic, and agile are tailored for the Big Data context.