Grid-based data mining for market basket analysis in the retail sector

MALERBA, Donato
2007-01-01

Abstract

Recent advances in computing, communications, and digital storage technologies, together with the development of high-throughput data-acquisition technologies, have made it possible to gather and store very large volumes of data. The data warehouses of international retailers (such as Wal-Mart) are typically multi-terabyte databases that contain information about retail transactions by customers all over the world. The emergence of these large data sets creates a growing need to analyze them across geographical boundaries using distributed and parallel systems such as the Grid infrastructure, thereby unlocking the intelligence hidden within these geographically distributed databases. Market basket analysis is a method for discovering consumer purchasing patterns by extracting associations or co-occurrences from a store's transaction database. This is a typical association rule mining task, for which the Apriori algorithm is widely adopted to find the large (frequent) itemsets. Because the traditional sequential Apriori algorithm cannot cope with such large amounts of data, this paper outlines a strategy for a parallel and distributed association rule mining algorithm.
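
To make the idea in the abstract concrete, the following is a minimal sketch of Apriori in which support counting is split across data partitions (for example, one per store or Grid node) and the per-partition counts are summed. It is an illustration under stated assumptions, not the strategy the paper itself outlines: the function names `local_counts` and `apriori_partitioned`, the representation of baskets as Python sets, and the absolute-count `min_support` threshold are all hypothetical choices made for this example, and the partitions are processed in a plain loop rather than on real distributed nodes.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Count how many local transactions contain each candidate itemset."""
    counts = Counter()
    for basket in transactions:
        for cand in candidates:
            if cand <= basket:  # candidate is a subset of the basket
                counts[cand] += 1
    return counts


def apriori_partitioned(partitions, min_support):
    """Return {itemset: global count} for itemsets with count >= min_support.

    `partitions` is a list of transaction lists (hypothetical layout: one
    list per site); each transaction is a set of items.
    """
    items = {item for part in partitions for basket in part for item in basket}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while candidates:
        # Each partition is counted independently; summing gives global counts.
        total = Counter()
        for part in partitions:
            total.update(local_counts(part, candidates))
        level = {c: n for c, n in total.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates,
        # then prune any candidate that has an infrequent k-item subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

In a real deployment the loop over `partitions` would be the part that runs in parallel, with only the small count tables exchanged between sites rather than the transactions themselves.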
2007
978-1-84564-081-1

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/70163