Hadoop HDFS Commands

2025-03-31 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/03 report

Basic command format: hadoop fs -cmd <args> (or hadoop dfs -cmd <args>)

1. ls

hadoop fs -ls /

Lists the directories and files under the root directory of the HDFS file system.

hadoop fs -ls -R /

Recursively lists all directories and files in the HDFS file system.

2. put

hadoop fs -put <local file> <hdfs file>

The parent directory of the HDFS file must already exist, otherwise the command fails.

hadoop fs -put <local file or dir> ... <hdfs dir>

The HDFS directory must already exist, otherwise the command fails.

hadoop fs -put - <hdfs file>

Reads from standard input (the keyboard) into the HDFS file; press Ctrl+D to end input. The HDFS file must not already exist, otherwise the command fails.
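A hypothetical upload session illustrating the three forms above (the file names and the HDFS directory /user/demo are placeholders, and /user/demo is assumed to already exist on the cluster):

```shell
# Create a small local file to upload (hypothetical name)
echo "hello hdfs" > notes.txt

# Upload it; the parent directory /user/demo must already exist
hadoop fs -put notes.txt /user/demo/notes.txt

# Upload several local paths into an existing HDFS directory
hadoop fs -put notes.txt ./logs /user/demo

# Read from standard input instead of a local file;
# the destination file must not already exist
echo "typed content" | hadoop fs -put - /user/demo/stdin.txt
```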

2.1. moveFromLocal

hadoop fs -moveFromLocal <local src> ... <hdfs dst>

Similar to put, except that the local source file is deleted after the command completes; it can also read from standard input into an HDFS file.

2.2. copyFromLocal

hadoop fs -copyFromLocal <local src> ... <hdfs dst>

Similar to put; it can also read from standard input into an HDFS file.

3. get

hadoop fs -get <hdfs file> <local file or dir>

If a local file with the same name already exists, the command reports that the file already exists; only files without a name conflict are copied locally.

hadoop fs -get <hdfs file or dir> ... <local dir>

When copying multiple files or directories, the local destination must be a directory path.

Note: if the user is not root, the local path should be under that user's home directory, otherwise there may be permission problems.
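A hypothetical download session (assuming /user/demo/notes.txt and /user/demo/logs exist on HDFS):

```shell
# Copy a single HDFS file into the current directory
hadoop fs -get /user/demo/notes.txt .

# Copy several HDFS paths at once; the local target must be a directory
mkdir -p ./downloads
hadoop fs -get /user/demo/notes.txt /user/demo/logs ./downloads
```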

3.1. moveToLocal

This command has not been implemented in the current version.

3.2. copyToLocal

hadoop fs -copyToLocal <hdfs src> ... <local dst>

Similar to get.

4. rm

hadoop fs -rm <hdfs file> ...

hadoop fs -rm -r <hdfs dir> ...

Multiple files or directories can be deleted at a time.

5. mkdir

hadoop fs -mkdir <hdfs path>

Creates only a single directory level; if the parent directory does not exist, the command reports an error.

hadoop fs -mkdir -p <hdfs path>

Creates parent directories as needed if they do not exist.

6. getmerge

hadoop fs -getmerge <hdfs dir> <local file>

Sorts all files under the specified HDFS directory and merges them into the specified local file. The local file is created if it does not exist, and its contents are overwritten if it does.

hadoop fs -getmerge -nl <hdfs dir> <local file>

With -nl, a newline is added between the HDFS files merged into the local file.
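The effect of -nl can be sketched with ordinary local files, no cluster needed: each source file is concatenated with a newline appended after it (file names here are hypothetical):

```shell
# Two source files without trailing newlines
printf 'part1' > a.txt
printf 'part2' > b.txt

# Simulate getmerge -nl: concatenate each file, then add a newline,
# so the parts land on separate lines of the merged output
for f in a.txt b.txt; do cat "$f"; echo; done > merged.txt
cat merged.txt
```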

7. cp

hadoop fs -cp <hdfs file> <hdfs file>

The target file must not already exist, otherwise the command fails; this is equivalent to saving the file under a new name, and the source file still exists.

hadoop fs -cp <hdfs file or dir> ... <hdfs dir>

The destination directory must exist, otherwise the command fails.

8. mv

hadoop fs -mv <hdfs file> <hdfs file>

The target file must not already exist, otherwise the command fails; this is equivalent to renaming the file, and the source file no longer exists afterwards.

hadoop fs -mv <hdfs file or dir> ... <hdfs dir>

When there is more than one source path, the destination path must be a directory and must exist.

Note: moving across file systems (local to HDFS, or vice versa) is not allowed.

9. count

hadoop fs -count <hdfs path>

Counts the number of directories and files and the total file size under the given HDFS path.

The output columns are: directory count, file count, total file size, input path.

10. du

hadoop fs -du <hdfs path>

Displays the size of each folder and file under the given HDFS path.

hadoop fs -du -s <hdfs path>

Displays the summed size of all files under the given HDFS path.

hadoop fs -du -h <hdfs path>

Displays the size of each folder and file under the given HDFS path in human-readable form, for example 64M instead of 67108864.
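The human-readable form is a plain power-of-two conversion, which can be checked without a cluster:

```shell
# 64M = 64 * 1024 * 1024 bytes, the raw number -du prints without -h
echo $((64 * 1024 * 1024))
```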

11. text

hadoop fs -text <hdfs file>

Outputs a file in text form; this works for plain text files and also for certain non-text formats (for example, compressed files).

12. setrep

hadoop fs -setrep -R 3 <hdfs path>

Changes the replication factor of files in HDFS. The number 3 in the command above is the new replication count. The -R option recursively changes the replication factor of all directories and files under the given directory.

13. stat

hadoop fs -stat [format] <hdfs path>

Returns status information for the given path.

The optional [format] parameters are: %b (file size), %o (block size), %n (file name), %r (replication count), %y (date and time of last modification).

You can combine them, as in hadoop fs -stat %b%o%n <hdfs path>, but this is not recommended because the fields run together in the output and are hard to distinguish.
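A more readable invocation separates the format specifiers with spaces (the path /user/demo/notes.txt is a placeholder):

```shell
# Print name, size, replication count and modification time as separate fields
hadoop fs -stat "%n %b %r %y" /user/demo/notes.txt
```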

14. tail

hadoop fs -tail <hdfs file>

Displays the last 1KB of the file on standard output.

15. archive

hadoop archive -archiveName name.har -p <hdfs parent dir> <src>* <hdfs dst>

name.har: the name of the archive file; choose it freely.

<hdfs parent dir>: the parent directory of the files to be archived.

<src>: the names of the files to be archived.

<hdfs dst>: the path where the archive file is stored.

Example: hadoop archive -archiveName hadoop.har -p /user 1.txt 2.txt /des

In the example, the files 1.txt and 2.txt under the HDFS /user directory are archived into a file named hadoop.har stored in the HDFS /des directory. If 1.txt 2.txt is omitted, all the directories and files under /user are archived into a file named hadoop.har stored in /des.

To display the contents of the har file, use:

hadoop fs -ls /des/hadoop.har

To show which files the har archive contains, use:

hadoop fs -ls -R har:///des/hadoop.har

Note: har files cannot be modified after creation. To add a file to a .har, you must go back to the original files and create a new archive. Archiving does not change the data of the original files; the real purpose of har files is to reduce the excessive space overhead on the NameNode and DataNodes caused by many small files.

16. balancer

hdfs balancer

If the administrator finds that some DataNodes store too much data while others store relatively little, this command manually starts the internal rebalancing process.

17. dfsadmin

hdfs dfsadmin -help

Administrators manage HDFS through dfsadmin; the command above shows its usage.

hdfs dfsadmin -report

Displays basic statistics about the file system.

hdfs dfsadmin -safemode <enter | leave | get | wait>

enter: enter safe mode; leave: leave safe mode; get: report whether safe mode is on; wait: wait until safe mode is exited.
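A hypothetical maintenance sequence using the safemode subcommand (requires administrator privileges on a running cluster):

```shell
# Check whether safe mode is currently on
hdfs dfsadmin -safemode get

# Enter safe mode before maintenance, then leave it afterwards
hdfs dfsadmin -safemode enter
hdfs dfsadmin -safemode leave
```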

18. distcp

Used to copy data between two HDFS clusters.
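A hypothetical invocation (the NameNode hostnames nn1 and nn2 and the paths are placeholders):

```shell
# Copy a directory from one cluster to another; distcp runs as a
# distributed MapReduce job, so it scales to large datasets
hadoop distcp hdfs://nn1:8020/user/demo hdfs://nn2:8020/user/demo
```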
