Market Basket Analysis Algorithm with MapReduce Using HDFS
Abstract
Market basket analysis techniques are important to everyday business decisions. The traditional computing approach, which relies on a single processor and main memory, cannot handle ever-growing volumes of transactional data. MapReduce has become a popular approach for processing huge volumes of data, and existing sequential algorithms can be converted to the MapReduce framework for big data.
This paper presents a Market Basket Analysis (MBA) algorithm with MapReduce on Hadoop that generates the complete set of maximal frequent item sets. The algorithm sorts the data sets and converts them to (key, value) pairs to fit the MapReduce concept. The framework sorts the outputs of the map tasks, which then become the input to the reduce tasks. The experimental results show that the MapReduce code performs better as more nodes are added, until performance reaches saturation.
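The sort, (key, value) conversion, and map-to-reduce flow described above can be illustrated with a minimal sketch. It is an in-memory Python simulation, not the paper's Hadoop code, and it rests on several assumptions: the keys are item pairs (the abstract does not fix the item set size), the function names map_transaction, shuffle, reduce_counts, and run_job are hypothetical, and the sample transactions are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_transaction(transaction):
    """Map step (assumed form): emit ((item_a, item_b), 1) per item pair."""
    # Sorting the items makes (a, b) and (b, a) produce the same key.
    items = sorted(set(transaction))
    for pair in combinations(items, 2):
        yield pair, 1


def shuffle(mapped):
    """Stand-in for the framework: group map outputs by key, sorted by key."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_counts(key, values):
    """Reduce step: sum the occurrences recorded for one key."""
    return key, sum(values)


def run_job(transactions):
    """Run map, shuffle, and reduce over all transactions in memory."""
    mapped = (kv for t in transactions for kv in map_transaction(t))
    return dict(reduce_counts(k, vs) for k, vs in shuffle(mapped))


# Hypothetical sample data, one list of items per basket.
transactions = [
    ["milk", "bread", "eggs"],
    ["bread", "milk"],
    ["eggs", "milk"],
]
pair_counts = run_job(transactions)
```

Here pair_counts maps each sorted item pair to the number of baskets that contain both items. Pairs whose count meets a chosen support threshold would be the frequent ones; deriving maximal frequent item sets from them is a further step that this sketch does not cover.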