HDFS List Files and Directories: hadoop fs -ls

The hadoop fs -ls command lists the contents of directories in the Hadoop Distributed File System (HDFS). For each entry it reports the permissions, replication factor, owner, group, size, modification time, and full path.

In this blog post, we will explore the hadoop fs -ls command’s usage and its most commonly used flags with examples to help you become proficient in listing files and directories in HDFS.

Command Syntax:

hadoop fs -ls [options] <directory_path>
  • directory_path: The path to the directory to be listed.
  • options: The following options are available:
    • -R: Recursively list the contents of the directory, including all its subdirectories.
      • Example: hadoop fs -ls -R /user/hadoop
    • -h: Print sizes in human-readable format.
      • Example: hadoop fs -ls -h /user/hadoop
    • -d: List directories as plain entries instead of listing their contents.
      • Example: hadoop fs -ls -d /user/hadoop
  • Note: hadoop fs -ls has no -l (or -1) option. The default output is already a long format listing with permissions, replication factor, owner, group, size, and modification date.

List Files and Directories

This usage presents a detailed list of files and subdirectories within the specified directory.

hadoop fs -ls <directory_path>

Example:

Suppose you have a directory named /user/hadoop/documents and want an overview of its contents. The following command prints one line per entry. The columns are permissions, replication factor (shown as - for directories), owner, group, size in bytes, modification date and time, and path.

hadoop fs -ls /user/hadoop/documents
Found 3 items
-rw-r--r--   1 hadoop supergroup        210 2023-08-01 10:00 /user/hadoop/documents/file1.txt
-rw-r--r--   1 hadoop supergroup       1245 2023-08-01 09:30 /user/hadoop/documents/file2.txt
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 09:45 /user/hadoop/documents/folder1
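
Because every entry uses the same column layout, the listing is easy to post-process with standard tools. The following is a minimal sketch, not an official Hadoop utility: file_sizes is a hypothetical helper name, the sample text stands in for real command output so the snippet runs without a cluster, and it assumes paths contain no spaces.

```shell
# Sample text in the same format as `hadoop fs -ls` output.
# In practice you would pipe the real command into file_sizes instead.
sample_listing='Found 3 items
-rw-r--r--   1 hadoop supergroup        210 2023-08-01 10:00 /user/hadoop/documents/file1.txt
-rw-r--r--   1 hadoop supergroup       1245 2023-08-01 09:30 /user/hadoop/documents/file2.txt
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 09:45 /user/hadoop/documents/folder1'

# file_sizes (hypothetical helper): print "<size> <path>" for regular files.
# Lines with fewer than 8 columns (such as "Found 3 items") are skipped.
# Assumption: paths contain no whitespace, so the path is column 8.
file_sizes() {
  awk 'NF >= 8 && $1 ~ /^-/ { print $5, $8 }'
}

printf '%s\n' "$sample_listing" | file_sizes
```

Against a live cluster the same helper could be used as: hadoop fs -ls /user/hadoop/documents | file_sizes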

Recursive Listing (-R)

When dealing with complex directory structures, understanding the contents of subdirectories becomes crucial. Using the -R flag in the command, you can obtain a recursive listing of all files and subdirectories.

hadoop fs -ls -R <directory_path>

Example:

Notice how all the files, including those in the subdirectory “folder1”, are listed.

hadoop fs -ls -R /user/hadoop/documents
-rw-r--r--   1 hadoop supergroup      210 2023-08-01 10:00 /user/hadoop/documents/file1.txt
-rw-r--r--   1 hadoop supergroup     1245 2023-08-01 09:30 /user/hadoop/documents/file2.txt
drwxr-xr-x   - hadoop supergroup        0 2023-08-01 09:45 /user/hadoop/documents/folder1
-rw-r--r--   1 hadoop supergroup      932 2023-08-01 09:15 /user/hadoop/documents/folder1/file3.txt
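
A recursive listing is also a convenient input for quick summaries, such as counting files or totalling their sizes. The sketch below is illustrative only: summarize_listing is a hypothetical helper name, the sample text stands in for real `hadoop fs -ls -R` output, and paths are assumed to contain no spaces.

```shell
# Sample text in the same format as `hadoop fs -ls -R` output.
recursive_listing='-rw-r--r--   1 hadoop supergroup      210 2023-08-01 10:00 /user/hadoop/documents/file1.txt
-rw-r--r--   1 hadoop supergroup     1245 2023-08-01 09:30 /user/hadoop/documents/file2.txt
drwxr-xr-x   - hadoop supergroup        0 2023-08-01 09:45 /user/hadoop/documents/folder1
-rw-r--r--   1 hadoop supergroup      932 2023-08-01 09:15 /user/hadoop/documents/folder1/file3.txt'

# summarize_listing (hypothetical helper): count files and directories
# and add up the size column (column 5) of the regular files.
summarize_listing() {
  awk '
    NF >= 8 && $1 ~ /^d/ { dirs++ }
    NF >= 8 && $1 ~ /^-/ { files++; bytes += $5 }
    END { printf "files=%d dirs=%d bytes=%d\n", files, dirs, bytes }
  '
}

printf '%s\n' "$recursive_listing" | summarize_listing
# prints: files=3 dirs=1 bytes=2387
```

The sizes must be raw byte counts for the total to be meaningful, so this helper should not be combined with the -h flag.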

Display Human-Readable Sizes (-h)

Raw byte counts are hard to read for large files. The -h flag prints sizes with a unit suffix (for example K, M, or G) instead.

hadoop fs -ls -h <directory_path>

Example:

Notice how the size of file2.txt is shown as 1.2 K instead of 1245. Sizes under 1024 bytes, such as 210, are printed without a suffix. The exact formatting may vary between Hadoop versions.

hadoop fs -ls -h /user/hadoop/documents
Found 3 items
-rw-r--r--   1 hadoop supergroup        210 2023-08-01 10:00 /user/hadoop/documents/file1.txt
-rw-r--r--   1 hadoop supergroup      1.2 K 2023-08-01 09:30 /user/hadoop/documents/file2.txt
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 09:45 /user/hadoop/documents/folder1

List Directories Only (-d)

With the -d flag, a directory is listed as a plain entry: the command shows the directory itself rather than its contents. The flag does not filter out files.

hadoop fs -ls -d [path]

Example:

In this example, the wildcard (*) matches every entry directly under /user/hadoop/, and -d keeps each matched directory from being expanded. The command therefore prints one line of metadata per directory without listing what is inside it. Any regular files matched by the wildcard would be listed as well.

hadoop fs -ls -d /user/hadoop/*
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 12:34 /user/hadoop/data
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 12:34 /user/hadoop/documents
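
A wildcard such as /user/hadoop/* also matches regular files, and -d lists those too. If you want directory entries only, one option is to filter on the first character of the permissions column. The sketch below assumes this approach: only_dirs is a hypothetical helper name, the sample text stands in for real command output, and paths are assumed to contain no spaces.

```shell
# Sample text in the same format as `hadoop fs -ls -d /user/hadoop/*` output,
# here with one regular file among the matches.
glob_listing='drwxr-xr-x   - hadoop supergroup          0 2023-08-01 12:34 /user/hadoop/data
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 12:34 /user/hadoop/documents
-rw-r--r--   1 hadoop supergroup       1024 2023-08-01 12:34 /user/hadoop/file.txt'

# only_dirs (hypothetical helper): keep entries whose permission string
# starts with "d" and print just their paths (column 8).
only_dirs() {
  awk 'NF >= 8 && $1 ~ /^d/ { print $8 }'
}

printf '%s\n' "$glob_listing" | only_dirs
```

Against a live cluster this could be used as: hadoop fs -ls -d '/user/hadoop/*' | only_dirs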

Long Format Listing (Default)

Unlike the Unix ls command, hadoop fs -ls does not have a -l flag. Its default output is already a long format listing that includes permissions, replication factor, owner, group, size, and modification date, so no extra flag is needed.

hadoop fs -ls [path]

Example:

In this example, hadoop fs -ls is run without any flags on the /user/hadoop/ directory. The output includes permissions, owner, group, size, and modification date for each directory and file within the specified path.

hadoop fs -ls /user/hadoop/
Found 3 items
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 12:34 /user/hadoop/data
drwxr-xr-x   - hadoop supergroup          0 2023-08-01 12:34 /user/hadoop/documents
-rw-r--r--   1 hadoop supergroup       1024 2023-08-01 12:34 /user/hadoop/file.txt

Combining Multiple Flags

You can combine multiple flags in a single hadoop fs -ls command to customize the listing output according to your needs. Here’s an example that combines -R and -h:

hadoop fs -ls -R -h /user/hadoop/

The command performs the following actions:

  1. Lists files and directories in the /user/hadoop/ directory and its subdirectories recursively.
  2. Displays file sizes in a human-readable format (e.g., K, M).

The -d flag is left out on purpose. It lists a directory as a plain entry, which prevents -R from descending into it, so combining the two only prints the directory itself.

By using multiple flags in a single command, you have the flexibility to extract specific information from your HDFS directories and files in the format that suits your needs.
