Commonly Used HDFS Commands : Learn Data Science
The Hadoop Distributed File System (HDFS) is the primary storage layer for Hadoop applications. HDFS can be accessed programmatically through the Java API or the REST API (WebHDFS), but the HDFS shell is the most commonly used option. Below is a list of ten commonly used HDFS shell commands.
1. Invoking the File System
The HDFS shell supports several file systems, not just HDFS, so the same subcommands can target the local FS, HFTP FS, S3 FS, and others. The generic entry point, which works with any of the file systems listed above, is shown below. The remaining examples use hdfs dfs, which accepts the same subcommands.
hadoop fs
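As a rough illustration of the point above, the target file system is normally selected by the URI scheme of the path. The sketch below wraps hadoop fs -ls in a small helper; the function name list_any_fs, the host namenode.example.com, port 8020, and the paths are all placeholders chosen for this example, not values from a real cluster.

```shell
# Minimal sketch: list a path on whichever file system its URI scheme selects.
# Assumes the hadoop launcher is on PATH. list_any_fs is a hypothetical helper.
list_any_fs() {
  hadoop fs -ls "$1"
}

# Example calls (host, port, and paths are placeholders):
#   list_any_fs file:///tmp                                    # local file system
#   list_any_fs hdfs://namenode.example.com:8020/user/hadoop   # HDFS
```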
2. Listing Directory Contents
hdfs dfs -ls /user/hadoop/file1
3. Creating Directories
hdfs dfs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
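In scripts it is often convenient to create a directory only when it is missing, along with any missing parents. The sketch below assumes the -p option of -mkdir and the -test -d check (which exits with status 0 when the path is an existing directory). The helper name ensure_hdfs_dir is made up for this example.

```shell
# Minimal sketch: create an HDFS directory (and missing parents) only if absent.
# ensure_hdfs_dir is a hypothetical helper name, not part of Hadoop.
ensure_hdfs_dir() {
  if hdfs dfs -test -d "$1"; then
    # -test -d returned 0: the directory is already there
    echo "exists: $1"
  else
    # -p creates parent directories along the path as needed
    hdfs dfs -mkdir -p "$1" && echo "created: $1"
  fi
}

# Example call:
#   ensure_hdfs_dir /user/hadoop/dir1/subdir
```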
4. Copying Local Files to HDFS
hdfs dfs -put localfile /user/hadoop/hadoopfile
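By default -put fails when the destination already exists. The sketch below shows one way to handle repeated uploads, assuming the -f option of -put (overwrite the destination). The helper name upload_overwrite is hypothetical.

```shell
# Minimal sketch: upload a local file, overwriting the HDFS destination if present.
# upload_overwrite is a hypothetical helper name, not part of Hadoop.
upload_overwrite() {
  # Fail early with a clear message if the local source is missing
  [ -f "$1" ] || { echo "no such local file: $1" >&2; return 1; }
  # -f overwrites the destination when it already exists
  hdfs dfs -put -f "$1" "$2"
}

# Example call:
#   upload_overwrite localfile /user/hadoop/hadoopfile
```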
5. Copying From HDFS to Local FS
hdfs dfs -get /user/hadoop/file1 localfile
6. Renaming or Moving Files
hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2
7. Copying Files Within HDFS
hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2
Tip: Use the distcp command (hadoop distcp) to copy data between HDFS clusters.
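For reference, a distcp invocation takes a source URI and a destination URI. The sketch below wraps it in a helper; the name copy_between_clusters and the NameNode hosts nn1.example.com and nn2.example.com are placeholders for this example.

```shell
# Minimal sketch: copy a path from one HDFS cluster to another with distcp.
# copy_between_clusters is a hypothetical helper name; distcp runs as a
# distributed job, so it needs a working cluster to execute.
copy_between_clusters() {
  hadoop distcp "$1" "$2"
}

# Example call (hosts and port are placeholders):
#   copy_between_clusters hdfs://nn1.example.com:8020/user/hadoop/dir1 \
#                         hdfs://nn2.example.com:8020/user/hadoop/dir1
```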
8. Deleting Files From HDFS
hdfs dfs -rm /user/hadoop/file1
Tip: The -skipTrash option bypasses the trash and deletes the file immediately. Tip: Use -rm -r for recursive deletion; the older -rmr command is deprecated.
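Recursive deletion is easy to get wrong in scripts, so a small guard can help. The sketch below assumes -rm -r for recursive deletion and -skipTrash for bypassing the trash. The helper name remove_hdfs_path and its --skip-trash argument are inventions of this example, not Hadoop options.

```shell
# Minimal sketch: recursively delete an HDFS path, with a guard against
# empty or root paths. remove_hdfs_path and --skip-trash are hypothetical.
remove_hdfs_path() {
  case "$1" in
    ""|"/") echo "refusing to delete '$1'" >&2; return 1 ;;
  esac
  if [ "${2:-}" = "--skip-trash" ]; then
    # Permanent deletion: the data does not go to the trash
    hdfs dfs -rm -r -skipTrash "$1"
  else
    # Default: deleted data moves to the trash when trash is enabled
    hdfs dfs -rm -r "$1"
  fi
}

# Example calls:
#   remove_hdfs_path /user/hadoop/dir1
#   remove_hdfs_path /user/hadoop/dir2 --skip-trash
```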
9. Displaying File Contents
hdfs dfs -cat /user/hadoop/file1
Tip: To see only the first lines of a file, pipe the -cat output to the native head utility; older Hadoop releases have no -head command (a -tail command does exist).
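The pipe mentioned in the tip can be wrapped in a small helper. This is only a sketch: the function name hdfs_head and its default of 10 lines are choices made for this example.

```shell
# Minimal sketch: print the first N lines of an HDFS file (default 10).
# hdfs_head is a hypothetical helper name, not part of Hadoop.
hdfs_head() {
  # -cat streams the file to stdout; the native head keeps the first N lines
  hdfs dfs -cat "$1" | head -n "${2:-10}"
}

# Example calls:
#   hdfs_head /user/hadoop/file1        # first 10 lines
#   hdfs_head /user/hadoop/file1 3      # first 3 lines
```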
10. Emptying the Trash
hdfs dfs -expunge
Tip: -expunge permanently deletes trash checkpoints older than the retention threshold (fs.trash.interval) and creates a new checkpoint, so recently deleted files stay in the trash until their checkpoint expires.
Please feel free to ask questions in the comments!