Python MapReduce with Hadoop Streaming in Hortonworks Sandbox

The Hortonworks Sandbox for the Hortonworks Data Platform (HDP) is a quick and easy personal desktop environment for learning, developing, testing and trying out new features. It saves you from installing and configuring Hadoop and its related tools yourself. This article explains how to run a Python MapReduce word count example using Hadoop Streaming.

Requirements: The sandbox VM needs at least 8 GB of RAM, so in practice the host machine should have 10 GB or more to run a VM with 8 GB allocated. If your machine does not meet this requirement, you can try the sandbox on a cloud service such as Azure, AWS or Google Cloud. This article uses examples based on HDP 2.3.2 running in Oracle VirtualBox hosted on Ubuntu 16.04. Download and
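
As a rough illustration of the word count that Hadoop Streaming runs, here is a minimal sketch rather than the article's own script. The file name wordcount.py, the function names and the map/reduce command-line argument are all assumptions made for this example. Hadoop Streaming passes data to the script on standard input and reads tab-separated key/value lines from its standard output. The code avoids newer syntax so it should also run on the older Python shipped with a sandbox, though that has not been verified against HDP 2.3.2.

```python
#!/usr/bin/env python
"""Word count sketch for Hadoop Streaming (hypothetical file: wordcount.py)."""
import sys
from itertools import groupby


def map_words(lines):
    """Yield a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word, 1


def parse_pairs(lines):
    """Turn 'word<TAB>count' text lines back into (word, int) pairs."""
    for line in lines:
        word, _, count = line.rstrip("\n").partition("\t")
        if count.strip().isdigit():  # silently skip malformed lines
            yield word, int(count)


def reduce_counts(pairs):
    """Sum the counts of consecutive pairs that share a key.

    This relies on the input being sorted by key, which is how the
    shuffle phase delivers mapper output to a reducer.
    """
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def run(role):
    """Act as mapper or reducer on stdin, depending on the role argument."""
    if role == "map":
        results = map_words(sys.stdin)
    elif role == "reduce":
        results = reduce_counts(parse_pairs(sys.stdin))
    else:
        raise SystemExit("usage: wordcount.py map|reduce")
    for word, count in results:
        sys.stdout.write("%s\t%d\n" % (word, count))


if __name__ == "__main__" and len(sys.argv) == 2:
    run(sys.argv[1])
```

The script can be checked locally without Hadoop by piping a text file through "python wordcount.py map", then "sort", then "python wordcount.py reduce", where sort stands in for the shuffle phase. On the cluster, the same script would be passed to the Hadoop Streaming jar using its -files, -mapper, -reducer, -input and -output options. The jar location differs between installations, so it is not given here.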

About the author

Devji Chhanga

I have taught computer science at the University of Kutch since 2011. Kutch is the westernmost district of India. At iDevji, I share tech stories that excite me. You will love reading the blog if you too believe in the disruptive power of technology. Some stories are purely technical, while others involve an empathetic approach to problem solving using technology.
