The hadoop fs -rm command deletes files and, with the -r flag, directories in the Hadoop Distributed File System (HDFS).
In this blog post, we will explore the usage of hadoop fs -rm and its most commonly used flags, with examples to help you become proficient in removing files and directories in HDFS.
A related command is hadoop fs -rmdir, which deletes directories only. While hadoop fs -rm can delete both files and directories, hadoop fs -rmdir is the recommended choice for removing empty directories. For more on hadoop fs -rmdir, see this blog post.
hadoop fs -rm [options] <directory_path>
directory_path: The path to the file or directory to be deleted.
options: The following options are available:
-f: Ignore nonexistent paths. No diagnostic message is displayed and the exit status does not report an error if the file or directory does not exist.
hadoop fs -rm -f /user/hadoop/myfile.txt
-r: Recursively delete a directory, including all its subdirectories and files.
hadoop fs -rm -r /user/hadoop/mydir
-skipTrash: Delete the file or directory directly, bypassing the trash feature.
hadoop fs -rm -skipTrash /user/hadoop/myfile.txt
Delete File or Directory
This basic usage removes the specified file from HDFS. To remove a directory, add the -r flag; without it, the command fails when the path is a directory.
hadoop fs -rm <directory_path>
In this example, the command removes the file file.txt from the /user/hadoop/ directory in HDFS.
hadoop fs -rm /user/hadoop/file.txt
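In a script, it can be useful to check that a path exists before trying to delete it. The following is a minimal sketch, assuming the hadoop client is on your PATH; the function name remove_if_exists is hypothetical and not part of Hadoop. It relies on hadoop fs -test -e, which exits with status 0 when the path exists.

```shell
# Hypothetical helper (not part of Hadoop): delete an HDFS file only if it exists.
# `hadoop fs -test -e PATH` exits 0 when PATH exists, nonzero otherwise.
remove_if_exists() {
    if hadoop fs -test -e "$1"; then
        hadoop fs -rm "$1"
    else
        echo "skip: $1 not found"
    fi
}

# Usage (requires access to an HDFS cluster):
# remove_if_exists /user/hadoop/file.txt
```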
Recursively Delete Directory (-r)
The -r flag deletes directories and their contents recursively. This is especially useful when you need to remove entire directory structures.
hadoop fs -rm -r <directory_path>
In this example, the -r flag recursively deletes the data directory and all of its contents.
hadoop fs -rm -r /user/hadoop/data/
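Because a recursive delete removes everything under the given path, a small guard in front of it can catch mistakes such as an empty variable. The sketch below is one possible approach, not a Hadoop feature: the function name safe_rm_r and the list of refused paths are assumptions chosen for illustration.

```shell
# Hypothetical guard (not part of Hadoop): refuse obviously dangerous targets
# before running a recursive delete. The refused paths are illustrative only.
safe_rm_r() {
    case "$1" in
        ""|"/"|"/user"|"/user/")
            echo "refusing to delete '$1'" >&2
            return 1
            ;;
    esac
    hadoop fs -rm -r "$1"
}

# Usage (requires access to an HDFS cluster):
# safe_rm_r /user/hadoop/data/
```

Adjust the refused paths to match the directories that matter in your own cluster.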
Forceful Deletion (-f)
The -f flag tells the command to ignore nonexistent files: no diagnostic message is displayed, and the exit status does not report an error.
hadoop fs -rm -f <directory_path>
In this example, the file does not exist, and the -f flag lets the command complete without reporting an error.
hadoop fs -rm -f /user/hadoop/nonexistent_file.txt
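This behavior is handy for cleanup steps that may run more than once. The following sketch wraps hadoop fs -rm -f so that a rerun does not fail when the files are already gone. The function name clean_outputs and the paths in the usage comment are hypothetical.

```shell
# Hypothetical cleanup step (not part of Hadoop): safe to run repeatedly,
# because -f keeps a missing file from being reported as an error.
# `hadoop fs -rm` accepts several paths in one call, so all arguments are passed on.
clean_outputs() {
    hadoop fs -rm -f "$@"
}

# Usage (requires access to an HDFS cluster; the paths are placeholders):
# clean_outputs /user/hadoop/old_a.txt /user/hadoop/old_b.txt
```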
Direct Deletion (-skipTrash)
The -skipTrash flag deletes files and directories immediately, bypassing the trash feature. Data deleted this way cannot be restored from the trash.
hadoop fs -rm -skipTrash <directory_path>
In this example, the -skipTrash flag, combined with -r because the target is a directory, deletes the temp directory and its contents directly, bypassing the trash.
hadoop fs -rm -r -skipTrash /user/hadoop/temp/
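For comparison, when trash is enabled and -skipTrash is not used, a deleted path is moved under the user's .Trash directory rather than removed. The sketch below prints where a path would be expected to land, assuming the default layout /user/<user>/.Trash/Current/<original path>; your cluster's configuration may differ. The function name trash_path_for is hypothetical.

```shell
# Hypothetical helper (not part of Hadoop): print the expected trash location
# of a deleted path, ASSUMING the default layout
# /user/<user>/.Trash/Current/<original path>.
trash_path_for() {
    echo "/user/$1/.Trash/Current$2"
}

# Restoring a file is then an ordinary move (requires access to an HDFS cluster):
# hadoop fs -mv "$(trash_path_for hadoop /user/hadoop/myfile.txt)" /user/hadoop/myfile.txt
```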
Combining Multiple Flags
The hadoop fs -rm command lets you combine multiple flags in a single command to tailor the deletion to your needs. Here is an example that uses all three flags (-f, -r, and -skipTrash) together:
hadoop fs -rm -f -r -skipTrash /user/hadoop/data/
The command performs the following actions:
- Deletes the /user/hadoop/data/ directory and its contents.
- Does not report an error if the path does not exist.
- Recursively deletes subdirectories and their contents.
- Directly deletes the specified items without moving them to the trash.
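Since this combination is irreversible, it may help to preview the command before running it. The following is a minimal sketch of a dry-run wrapper; the function name hdfs_purge and the DRY_RUN variable are assumptions introduced for illustration, not Hadoop features.

```shell
# Hypothetical wrapper (not part of Hadoop): with DRY_RUN=1 it only prints the
# command it would run; otherwise it performs the permanent, recursive delete.
hdfs_purge() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "hadoop fs -rm -f -r -skipTrash $*"
    else
        hadoop fs -rm -f -r -skipTrash "$@"
    fi
}

# Preview first, then run for real (the second call requires an HDFS cluster):
# DRY_RUN=1 hdfs_purge /user/hadoop/data/
# hdfs_purge /user/hadoop/data/
```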
By combining these flags, you can remove an entire directory tree in a single command while controlling exactly how the deletion behaves. Because -skipTrash bypasses the trash, double-check the path before running a command like this one.