HDFS Delete Files and Directories: hadoop fs -rm

The hadoop fs -rm command deletes files from the Hadoop Distributed File System (HDFS). With the -r flag, it also deletes directories and their contents.

In this blog post, we will explore the command’s usage and its most commonly used flags, with examples of removing files and directories in HDFS.

A related command, hadoop fs -rmdir, deletes directories only, and only when they are empty. Use hadoop fs -rm -r for directories that still contain files, and hadoop fs -rmdir when you want the deletion to fail rather than remove data you did not expect to be there. The hadoop fs -rmdir command is covered in more detail in a separate blog post.

Command Syntax:

hadoop fs -rm [options] <directory_path>
  • directory_path: The path to the file or directory to be deleted. Several paths can be given in one command.
  • options: The following options are available:
    • -f: Do not print an error message or return a failing exit status if the file or directory does not exist.
      • Example: hadoop fs -rm -f /user/hadoop/myfile.txt
    • -r: Recursively delete a directory, including all its subdirectories and files. Required when the path is a directory; -R is accepted as an equivalent.
      • Example: hadoop fs -rm -r /user/hadoop/mydir
    • -skipTrash: Delete the file or directory immediately and permanently instead of moving it to the trash (when trash is enabled).
      • Example: hadoop fs -rm -skipTrash /user/hadoop/myfile.txt

Delete File or Directory

This basic usage removes the specified file from HDFS. Without the -r flag, the command refuses to delete a directory and reports an error instead.

hadoop fs -rm <directory_path>

Example:

In this example, the command removes the file file.txt from the /user/hadoop/ directory in HDFS.

hadoop fs -rm /user/hadoop/file.txt

Recursively Delete Directory (-r)

The -r flag is used to delete directories and their contents recursively. This is especially useful when you need to remove entire directory structures.

hadoop fs -rm -r <directory_path>

Example:

Here, the -r flag ensures the recursive deletion of the data directory and its contents from /user/hadoop/.

hadoop fs -rm -r /user/hadoop/data/
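
Because a recursive delete can remove a large tree in one step, it can help to look at what is about to go before running it. The sketch below is one way to do that, not a standard tool: preview_then_delete is a hypothetical helper name, and it assumes a Bash-style shell with the hadoop client on the PATH. It uses hadoop fs -test -d to confirm the path is a directory and hadoop fs -count to show how much it holds, then asks for confirmation.

```shell
# Hypothetical helper: preview a directory, then delete it recursively on "y".
preview_then_delete() {
  local path="$1" answer

  # -test -d exits with 0 only if the path exists and is a directory.
  if ! hadoop fs -test -d "$path"; then
    echo "not a directory: $path" >&2
    return 1
  fi

  # -count prints DIR_COUNT, FILE_COUNT, CONTENT_SIZE and the path name.
  hadoop fs -count "$path"

  printf 'Delete %s recursively? [y/N] ' "$path"
  read -r answer
  if [ "$answer" = "y" ]; then
    hadoop fs -rm -r "$path"
  else
    echo "aborted"
  fi
}
```

For example, preview_then_delete /user/hadoop/data/ prints the counts for the data directory and waits for a y before deleting anything.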

Forceful Deletion (-f)

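The -f flag makes the command ignore nonexistent paths: if the file or directory is not there, no error message is printed and the exit status still indicates success. It does not override permissions, and hadoop fs -rm does not prompt for confirmation with or without it.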

hadoop fs -rm -f <directory_path>

Example:

Because of the -f flag, the command completes without an error even though the file does not exist.

hadoop fs -rm -f /user/hadoop/nonexistent_file.txt
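
This behavior matters most in scripts, where a failing exit status from a missing file would otherwise stop a cleanup step. The following is a minimal sketch, assuming a Bash-style shell with the hadoop client on the PATH; ensure_absent is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper: make sure a file is gone before a job writes it again.
ensure_absent() {
  local path="$1"

  # With -f, a missing file produces no error message and exit status 0,
  # so this step succeeds whether or not the file was there.
  if hadoop fs -rm -f "$path"; then
    echo "absent: $path"
  else
    # Other failures (for example, a permission error) still end up here.
    echo "could not remove: $path" >&2
    return 1
  fi
}
```

Calling ensure_absent /user/hadoop/nonexistent_file.txt twice in a row should succeed both times, which makes the step safe to rerun.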

Direct Deletion (-skipTrash)

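When trash is enabled, a normal delete moves the path into the user's trash directory, where it can be recovered until the trash is emptied. The -skipTrash flag bypasses the trash and deletes files and directories immediately and permanently.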

hadoop fs -rm -skipTrash <directory_path>

Example:

The -skipTrash flag, combined with -r because temp is a directory, deletes the temp directory and its contents directly, bypassing the trash.

hadoop fs -rm -r -skipTrash /user/hadoop/temp/
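
For comparison, a delete without -skipTrash on a cluster with trash enabled (fs.trash.interval set to a value greater than zero) moves the path into the deleting user's trash directory, by default /user/<username>/.Trash/Current/ followed by the original path. The sketch below shows how such a path could be located and moved back. trash_location and restore_from_trash are hypothetical helper names, and the layout assumes the default trash policy outside encryption zones.

```shell
# Hypothetical helper: print where a deleted path is expected to sit in the trash.
# Assumes the default layout: /user/<username>/.Trash/Current/<original path>
trash_location() {
  local user="$1" path="$2"
  printf '/user/%s/.Trash/Current%s\n' "$user" "$path"
}

# Hypothetical helper: move a trashed path back to its original location.
restore_from_trash() {
  local user="$1" path="$2"
  hadoop fs -mv "$(trash_location "$user" "$path")" "$path"
}
```

Once -skipTrash is used, there is nothing in the trash to restore, which is why the flag is best kept for data you are sure you no longer need.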

Combining Multiple Flags

The flags of hadoop fs -rm can be combined in a single command to tailor the deletion to your needs. Here’s an example that uses -f, -r, and -skipTrash together:

hadoop fs -rm -f -r -skipTrash /user/hadoop/data/

The command performs the following actions:

  1. Deletes the /user/hadoop/data/ directory and its contents.
  2. With -f, reports no error and returns a successful exit status if the directory does not exist.
  3. With -r, recursively deletes subdirectories and their contents.
  4. With -skipTrash, deletes the specified items permanently without moving them to the trash.

Combining the flags gives you one command for a complete cleanup, but with -skipTrash the deletion cannot be undone, so double-check the path before running it.
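
Since this combination is both silent about missing paths and irreversible, a small guard around it can catch obvious mistakes such as an empty variable. This is a minimal sketch under stated assumptions rather than a complete safety net: purge_dir and the DRY_RUN variable are hypothetical names, and the check only rejects empty, relative, and root paths.

```shell
# Hypothetical helper: permanently delete a directory tree, with basic guards.
# Set DRY_RUN=1 to print the command instead of running it.
purge_dir() {
  local path="$1"

  # Refuse an empty path, the root directory, and anything that is not absolute.
  case "$path" in
    ""|"/"|[!/]*)
      echo "refusing to purge: '$path'" >&2
      return 2
      ;;
  esac

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: hadoop fs -rm -f -r -skipTrash $path"
    return 0
  fi

  hadoop fs -rm -f -r -skipTrash "$path"
}
```

Running it once with DRY_RUN=1 shows the exact command before anything is deleted.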
