MapReduce real world example on e-commerce transactions data is described here using Python streaming. The example does not require Hadoop installation. However, if you have Hadoop already installed it will run just fine on it. Python programming language is used because it is easy to read and understand.
A real world e-commerce transactions dataset from a UK based retailer is used. The best way to learn with this example is to use an Ubuntu machine with Python 2 or 3 installed on it.
The dataset consists of real world e-commerece data from UK based retailer
The dataset is provided by Kaggle
It contains 5.42k records (which is not small)
Our goal is to find out coun