Harmonised Administrative Databases: a new approach in the era of Big Data
Vittorio Nicolardi; Caterina Marini
2019-01-01
Abstract
The real challenge that today's society must face scientifically is to handle accurately the enormous flow of information that, in an IT world, can be tremendously powerful for analysing social and economic changes but dangerous when underestimated. The official statistics produced by the National Institutes of Statistics are reaching their limits in all research areas, and the problem is already recognised and discussed in the international scientific community. Undoubtedly, Public Administration (PA) datasets can be a very useful source of additional, detailed data to complete the statistical information on phenomena that official data still depict only partially. In some cases, the limitations in capturing the real size of many socio-economic territorial developments are significant. In this context, however, it is important to underline that administrative databases present two main problems: their purpose is administrative, not statistical, and they store a huge amount of data. An additional problem, sometimes encountered, is the completeness of the administrative data on the phenomenon analysed, when complete information requires merging two or more databases that often belong to independent PA offices. In this paper, we focus on the Italian real estate sector and on how powerful administrative data are in adding new information on it in terms of both volume and value. In particular, the analysis is restricted to the city of Bari, in Southern Italy, because it is part of a national research project. We built a single administrative database from 4 independent administrative databases, normally managed by independent PA offices (the Italian Real Estate Registry and the National Revenue Agency), which provide autonomous information.
The record linkage required basic big data practices to deal with missing data, duplication and erroneous information, and to identify the variables useful for merging the 4 data sources. Although we restricted the analysis to one city, the amount of data also required the application of GIS processes to guarantee the exact matching of data and to depict the real estate framework in detail. As a result, our work allows researchers and policy makers to analyse the territory in depth, down to the single building. Moreover, the differentials between the monetary values of the real estate market and the real estate values have shown significant results in terms of potential revaluations of city districts or areas. The analysis presented in this paper is original in its attempt to describe an economic phenomenon that, in Italy, still suffers from the lack of a complete and harmonised data warehouse. Our work is the first attempt in this sense, and the outcomes can be replicated in a territorial area of any size.

File | Access | Description | Type | Licence | Size | Format
---|---|---|---|---|---|---
Harmonised Administrative Databases_ a new approach in the era of Big Data.pdf | Not available | Full article | Publisher's version | Creative Commons | 273.34 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.