[summary] Hadoop configuration file hdfs-site.xml


dfs.ha.automatic-failover.enabled

true

Enables automatic failover. When this item is set to true, ha.zookeeper.quorum must also be configured in core-site.xml.
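A minimal sketch of how the two files fit together; the ZooKeeper hosts below are placeholders, not values from this article:

<!-- hdfs-site.xml -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: the ZooKeeper ensemble used by the failover controllers -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.ol:2181,zk2.ol:2181,zk3.ol:2181</value>
</property>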

dfs.nameservices

ns1

The logical name of the nameservice being provided, matching the reference in core-site.xml; a comma-separated list of nameservices.

dfs.ha.namenodes.ns1

nn1,nn2

Lists the logical NameNode names under this nameservice; in the key dfs.ha.namenodes.EXAMPLENAMESERVICE, EXAMPLENAMESERVICE stands for the value of dfs.nameservices.

dfs.namenode.rpc-address.ns1.nn1

nn1.ol:9000

Specifies the RPC address of this NameNode; the RPC address that handles all client requests.

dfs.namenode.http-address.ns1.nn2

nn2.ol:50070

The NameNode web UI address and port, for example: http://nn1.ol:50070/

dfs.namenode.shared.edits.dir

qjournal://nn1.ol:8485;nn2.ol:8485;s1.ol:8485/ns1

Specifies the shared storage used by the HA NameNodes for edits: a qjournal:// URI when using the Quorum Journal Manager (as here), or an NFS mount point when using shared storage. The active NameNode writes edits there and the standby NameNode reads from it to keep the namespaces synchronized. This setting is left empty in non-HA clusters.
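Assembling the HA naming properties above into one hdfs-site.xml fragment, using the article's hosts (the rpc-address for nn2 and the http-address for nn1 are filled in by symmetry, since the article lists only one of each):

<property>
  <name>dfs.nameservices</name>
  <value>ns1</value>
</property>
<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn1</name>
  <value>nn1.ol:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn2</name>
  <value>nn2.ol:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1.nn1</name>
  <value>nn1.ol:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1.nn2</name>
  <value>nn2.ol:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://nn1.ol:8485;nn2.ol:8485;s1.ol:8485/ns1</value>
</property>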

dfs.journalnode.edits.dir

/hadoop/1/jn/

The directory on the JournalNode's local filesystem where it stores the edits it receives.

dfs.ha.fencing.methods

sshfence

A configuration item that also appears in core-site.xml; it specifies the fencing (isolation) method for HA. It can be set to sshfence or to shell, which is described later.

sshfence logs in to the previously active NameNode over SSH and kills its process. For this mechanism to succeed,

passwordless SSH login must be configured, and a private key file must be specified through the parameter dfs.ha.fencing.ssh.private-key-files.

dfs.ha.fencing.ssh.private-key-files

/home/hadoop/.ssh/id_rsa

The SSH private key files to use with the built-in sshfence fencer.
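A sketch pairing the two fencing properties, with the values given in this article:

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hadoop/.ssh/id_rsa</value>
</property>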

dfs.namenode.name.dir

file:///hadoop/1/dfs/nn

The directory on the local filesystem where the NameNode stores the name table (fsimage files). If the value is a comma-separated list of directories, the name table is replicated into each of them for redundancy.

The fsimage file contains the inode information for all directories and files of the entire HDFS filesystem. For files this includes block descriptions, modification time, access time, and so on; for directories it includes modification time and access-control information (owner, group, etc.).

The edits file records the update operations performed on HDFS since the NameNode started; every write operation issued by an HDFS client is recorded in the edits file.

dfs.datanode.data.dir

file:///hadoop/1/dfs/dn,file:///hadoop/2/dfs/dn

The local filesystem directories where the DataNode stores block files. After a file is split into blocks, the blocks are stored across the configured directories in turn (directories that do not exist are ignored).

The point of listing several directories is to spread writes across multiple hard disks; placing one location on each disk of every node balances disk I/O and significantly improves disk I/O performance.
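A sketch of the two storage settings side by side. For dfs.namenode.name.dir the comma-separated directories are redundant copies, while for dfs.datanode.data.dir the blocks are spread across them; the second name directory is an assumed example, the data directories are the article's:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///hadoop/1/dfs/nn,file:///hadoop/2/dfs/nn</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///hadoop/1/dfs/dn,file:///hadoop/2/dfs/dn</value>
</property>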

Test environment configuration: the original article showed the DataNode disk layout of the test environment here.

dfs.blocksize

268435456

The HDFS block size; 268435456 bytes = 256 MB. The value may also be written with a size suffix such as k, m, or g, e.g. 256m.
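The two equivalent ways of writing a 256 MB block size:

<property>
  <name>dfs.blocksize</name>
  <value>268435456</value>
</property>

<!-- equivalent, using a size suffix -->
<property>
  <name>dfs.blocksize</name>
  <value>256m</value>
</property>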

dfs.support.append

true

Allows content to be appended to existing HDFS files; if set to false, a tool such as Flume that continuously writes data into a single HDFS file would presumably run into problems (the author's guess, not verified).

fs.trash.interval

10080 (unit: minutes)

How long a file stays in the .Trash directory before it is deleted; 10080 = 1440 × 7, i.e. seven days. If the parameter is set to 0, files are deleted immediately and are not moved to .Trash.
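A sketch of this setting where it belongs, in core-site.xml:

<!-- core-site.xml -->
<property>
  <name>fs.trash.interval</name>
  <value>10080</value> <!-- minutes; 1440 x 7 = seven days -->
</property>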

dfs.datanode.failed.volumes.tolerated

1

The number of failed volumes a DataNode tolerates; once more volumes than this fail, the DataNode shuts down.

dfs.datanode.balance.bandwidthPerSec

104857600

The bandwidth, in bytes per second, that balancing is allowed to use (104857600 = 100 MB/s); capping it keeps the balancer from grabbing network resources needed by MapReduce jobs and data ingestion.

dfs.namenode.handler.count

46

The number of server threads the NameNode uses to handle RPC requests.

dfs.datanode.max.xcievers

8192

Roughly equivalent to the maximum number of open files under Linux. The parameter does not appear in the documentation; it needs to be increased when DataXceiver errors occur. Default: 256.

This parameter is deprecated; use dfs.datanode.max.transfer.threads instead (see DFSConfigKeys.java in the source code).

dfs.datanode.max.transfer.threads

8192

The current name for the parameter above, with the same meaning; set it to the same value.
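A sketch using the current parameter name in place of the deprecated one:

<!-- replaces the deprecated dfs.datanode.max.xcievers -->
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value>
</property>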

dfs.datanode.socket.write.timeout

100000000

Write-side I/O timeout, in milliseconds; 0 means no limit. Increase it when you see errors such as: java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write (480000 being the configured value).

dfs.client.socket-timeout

100000000

Client-side I/O timeout, in milliseconds, with the same semantics and the same error symptom as the previous parameter.
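A sketch of the two timeout settings together, with the article's values (milliseconds; 0 would disable the timeout):

<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>100000000</value>
</property>
<property>
  <name>dfs.client.socket-timeout</name>
  <value>100000000</value>
</property>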

dfs.datanode.du.reserved

21474836480

Disk space reserved on each storage volume for other (non-DFS) use, in bytes per volume; 21474836480 = 20 GB.

dfs.datanode.fsdataset.volume.choosing.policy

org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy

The strategy for choosing which disk (volume) stores a block replica.

The first option is the Hadoop 1.0 behavior of polling the disk directories round-robin, implemented by RoundRobinVolumeChoosingPolicy.java (the default if nothing is configured).

The second option chooses a volume with enough free space, implemented by AvailableSpaceVolumeChoosingPolicy.java.

dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold

2147483648

The allowed difference in remaining space between volumes before they are considered imbalanced; 2147483648 bytes = 2 GB.
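A sketch pairing the policy with its threshold, as the article configures them:

<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>2147483648</value> <!-- 2 GB -->
</property>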

dfs.hosts.exclude

/usr/local/hadoop/etc/hadoop/exclude_hosts

A file listing hosts that are not permitted to connect to the NameNode.

dfs.client.read.shortcircuit

true

Enables short-circuit local reads.

dfs.domain.socket.path

/hadoop/1/dfs/dn_socket_PORT

The local path of the UNIX domain socket file used for communication between the DataNode and the DFSClient; if the string _PORT appears in the path, it is replaced by the DataNode's TCP port.
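Short-circuit reads require both settings together; a sketch with the article's values:

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/hadoop/1/dfs/dn_socket_PORT</value>
</property>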

dfs.permissions.enabled

true

Whether HDFS permission checking is enabled.

dfs.permissions.superusergroup

hadoop

The name of the superuser group; the user who formatted the NameNode is the HDFS superuser.

fs.permissions.umask-mode

022

A core-site.xml configuration item: the umask applied when files and directories are created. Note that the umask affects files and directories differently.

dfs.namenode.acls.enabled

true

Enables POSIX-style ACL support on the NameNode; when false, ACL operations are rejected.
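A sketch gathering the permission-related settings above; fs.permissions.umask-mode goes in core-site.xml, the rest in hdfs-site.xml:

<property>
  <name>dfs.permissions.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.permissions.superusergroup</name>
  <value>hadoop</value>
</property>
<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml -->
<property>
  <name>fs.permissions.umask-mode</name>
  <value>022</value>
</property>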
