This guide walks you through installing a single-node Apache Hadoop cluster on your machine.
#### System Requirements
* Ubuntu 16.04
* Java 8 installed
#### 1. Download Hadoop
```
wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.0/hadoop-2.7.0.tar.gz
```
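Before unpacking, it can help to confirm that the download finished cleanly. The following is a minimal sketch, assuming a `tar` with gzip support is installed; `archive_ok` is a hypothetical helper name and not part of Hadoop.

```shell
# Hypothetical helper: succeeds only if $1 is a readable gzip-compressed tar archive.
archive_ok() {
    # -t lists the contents without extracting anything;
    # a truncated or corrupt download makes tar exit with an error.
    tar -tzf "$1" > /dev/null 2>&1
}
```

For example, `archive_ok hadoop-2.7.0.tar.gz && echo "archive looks complete"`. This only detects a damaged file; it does not verify that the archive is authentic.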
#### 2. Prepare for Installation
```
tar xfz hadoop-2.7.0.tar.gz
sudo mv hadoop-2.7.0 /usr/local/hadoop
```
#### 3. Create Dedicated Group and User
```
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
sudo adduser hduser sudo
```
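To confirm that the account ended up in both groups, the group list reported by `id` can be checked. This is a rough sketch; `in_group` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: succeeds when user $1 is a member of group $2.
in_group() {
    # id -nG prints the user's group names separated by spaces;
    # grep -x matches a whole line, -F treats the name as a fixed string.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}
```

For example, `in_group hduser hadoop && in_group hduser sudo && echo "hduser is ready"`. If hduser later turns out to be unable to write under /usr/local/hadoop (for instance when the daemons create their log directory), one possible fix is `sudo chown -R hduser:hadoop /usr/local/hadoop`; that command is an assumption and not one of the original steps.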
#### 4. Switch to Newly Created User Account
Log in as the new user with `su - hduser`.
#### 5. Add Variables to ~/.bashrc
```
#Begin Hadoop Variables
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#End Hadoop Variables
```
#### 6. Source ~/.bashrc
Reload the file so the new variables take effect in the current shell: `source ~/.bashrc`.
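After reloading, a quick check can show whether the key variables from step 5 are visible in the shell. The sketch below is illustrative only; `check_hadoop_env` is a hypothetical helper, and the three names it inspects are a chosen subset of the variables above.

```shell
# Hypothetical helper: lists step 5 variables that are unset or empty in this shell.
check_hadoop_env() {
    missing=0
    for name in JAVA_HOME HADOOP_HOME HADOOP_INSTALL; do
        # Indirect lookup of the variable called $name
        # (the names are fixed literals, so eval is safe here).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    # Exit status 0 means every variable was set.
    return "$missing"
}
```

A typical use is `check_hadoop_env && hadoop version`, which prints the Hadoop version only when the environment looks complete.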
#### 7. Set Java Home for Hadoop
* Open the file: /usr/local/hadoop/etc/hadoop/hadoop-env.sh
* Find the line that sets JAVA_HOME and edit it to read:
```
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
```
#### 8. Edit core-site.xml
* Open the file: /usr/local/hadoop/etc/hadoop/core-site.xml
* Add the following lines between the _<configuration> ... </configuration>_ tags.
```
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
```
#### 9. Edit yarn-site.xml
* Open the file: /usr/local/hadoop/etc/hadoop/yarn-site.xml
* Add the following lines between the _<configuration> ... </configuration>_ tags.
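```
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```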
#### 10. Edit mapred-site.xml
* Copy the mapred-site.xml template first using: `cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml`
* Open the file: /usr/local/hadoop/etc/hadoop/mapred-site.xml
* Add the following lines between the _<configuration> ... </configuration>_ tags.
```
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```
#### 11. Edit hdfs-site.xml
First, create the following directories:
```
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown hduser:hadoop -R /usr/local/hadoop_store
sudo chmod 777 -R /usr/local/hadoop_store
```
Now open /usr/local/hadoop/etc/hadoop/hdfs-site.xml and add the following content between the _<configuration> ... </configuration>_ tags:
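```
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
```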
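With all four files edited, a simple text search can confirm that each expected property made it into its file. This is a minimal sketch, assuming the properties are written as `<name>`/`<value>` pairs inside `<property>` elements (the usual Hadoop site-file layout); `has_property` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when config file $1 has a <name> entry for property $2.
has_property() {
    # -F treats the pattern as a fixed string, so dots in property names stay literal.
    grep -qF "<name>$2</name>" "$1"
}
```

For example, `has_property /usr/local/hadoop/etc/hadoop/hdfs-site.xml dfs.replication && echo found`. This checks only that the text is present, not that the XML is well formed. Running `hdfs getconf -confKey dfs.replication` should print the value Hadoop actually resolves.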
#### 12. Format NameNode
```
cd /usr/local/hadoop/
bin/hdfs namenode -format
```
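Formatting initializes the NameNode metadata directory, so running it again on a cluster that already holds data discards the existing HDFS metadata. The following sketch shows one way to guard against that, relying on the `current/VERSION` file that a formatted NameNode directory contains; `namenode_is_formatted` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when the NameNode directory $1 has already been formatted.
namenode_is_formatted() {
    # A formatted NameNode directory holds a VERSION file under current/.
    [ -f "$1/current/VERSION" ]
}
```

It could be used as `namenode_is_formatted /usr/local/hadoop_store/hdfs/namenode || bin/hdfs namenode -format`, which formats only when no earlier format is detected.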
#### 13. Start Hadoop Daemons
```
cd /usr/local/hadoop/
sbin/start-dfs.sh
sbin/start-yarn.sh
```
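Once the daemons are up, writing a small file into HDFS and reading it back is a quick way to see whether the file system works end to end. The sketch below is illustrative only: `hdfs_smoke_test` and the `smoke-test` scratch directory are hypothetical names, and it assumes `hdfs` is on the PATH as set up in step 5.

```shell
# Hypothetical smoke test: write a small file into HDFS and read it back.
hdfs_smoke_test() {
    dir="/user/${USER:-hduser}/smoke-test"   # scratch directory name is an assumption
    tmp=$(mktemp) || return 1
    echo "hello hdfs" > "$tmp"
    # Create the directory, upload the file (-f overwrites an earlier copy), then print it.
    hdfs dfs -mkdir -p "$dir" &&
        hdfs dfs -put -f "$tmp" "$dir/hello.txt" &&
        hdfs dfs -cat "$dir/hello.txt"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

On a working cluster, calling `hdfs_smoke_test` should end by printing the file's contents; a non-zero exit status points at whichever HDFS command failed first.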
#### 14. Check Service Status
```
jps
```
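On a single-node setup started with the two scripts from step 13, `jps` would normally list NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. As a rough sketch, the output can be scanned for any that are absent; `missing_daemons` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `jps` output on stdin and prints each expected daemon that is absent.
missing_daemons() {
    output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <ClassName>"; compare the second field exactly,
        # so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$daemon"
    done
}
```

Run it as `jps | missing_daemons`; empty output means all five daemons were found. The logs directory under /usr/local/hadoop is a reasonable place to look when one is missing.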
#### 15. Check Running Jobs
Type the following into your browser's address bar:
http://localhost:8088
#### Done!