
HDFS Advanced Applications: Configuring the NFS Gateway


Purpose of NFS Gateway

-1. Users can browse the HDFS file system through an NFSv3-compatible client on the local operating system

-2. Users can download files from the HDFS file system to the local file system

-3. Users can stream data directly through the mount point; file appends are supported, but random writes are not (see the sketch after this list)

-the NFS gateway supports NFSv3 and allows HDFS to be mounted as part of the client file system
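As a quick illustration of the append-but-no-random-write behavior, a hypothetical session against a gateway already mounted at /mnt (the file path is made up):

$ echo "more data" >> /mnt/tmp/log.txt // appending through the mount works

$ dd if=/dev/zero of=/mnt/tmp/log.txt bs=1 count=4 seek=2 conv=notrunc // overwriting bytes inside the file is a random write and fails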

Features and precautions

-Random writes are not supported

-in non-secure mode, the user running the gateway is the proxy user

-in secure mode, the user in the Kerberos keytab is the proxy user

-AIX NFS has some known issues that prevent the default HDFS NFS gateway from working properly. To access the NFS gateway from AIX, configure the following parameter:

<property>
    <name>nfs.aix.compatibility.mode.enabled</name>
    <value>true</value>
</property>

Features and precautions

-the HDFS superuser is the user with the same identity as the NameNode process itself. The superuser can perform any action, because permission checks never fail for the superuser. It can be designated with the following parameter:

<property>
    <name>nfs.superuser</name>
    <value>the_name_of_hdfs_superuser</value>
</property>

-if the client mount allows access time updates, note that on some Unix systems users can disable them with the "noatime" mount option. Setting the following parameter to 0 disables access time updates on the NameNode:

<property>
    <name>dfs.namenode.accesstime.precision</name>
    <value>0</value>
</property>

-nfs.dump.dir

-the user needs to update the file dump directory parameter. NFS clients often reorder writes, so sequential writes can arrive at the NFS gateway out of order. This directory is used to temporarily store out-of-order writes: for each file, out-of-order writes are dumped once they accumulate past a certain threshold in memory (e.g. 1 MB). Make sure the directory has enough space; for example, if an application uploads 10 files of 100 MB each, the dump directory should have about 1 GB of space to cover the worst case for every file. Only the NFS gateway needs to be restarted after setting this property.
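A minimal sketch of the corresponding property, sized for the 10 x 100 MB example above (the path is illustrative):

<property>
    <name>nfs.dump.dir</name>
    <!-- needs roughly 1 GB free for 10 concurrent 100 MB uploads -->
    <value>/tmp/.hdfs-nfs</value>
</property>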

-nfs.exports.allowed.hosts

-by default, the export can be mounted by any client. To control access more tightly, set this property. The value string is a machine name and an access policy, separated by spaces. The machine name can be a single host, a Java regular expression, or an IPv4 address. The access policy is rw or ro, giving read/write or read-only access to the exported directory. If no access policy is provided, the default is read-only. Entries are separated by ";".
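A hypothetical value combining the allowed formats (the hosts and network below are made up; the last entry falls back to read-only because no policy is given):

<property>
    <name>nfs.exports.allowed.hosts</name>
    <value>192.168.4.0/24 rw ; host1.example.com ro ; .*\.example\.com</value>
</property>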

Debugging and logging

-various errors are often encountered while configuring the NFS gateway. If an error occurs, enabling the debug log is a good choice.

log4j.properties

log4j.logger.org.apache.hadoop.hdfs.nfs=DEBUG

log4j.logger.org.apache.hadoop.oncrpc=DEBUG

core-site.xml

-hadoop.proxyuser.{nfsuser}.groups

-hadoop.proxyuser.{nfsuser}.hosts

-{nfsuser} here is the user that actually runs the nfs gateway (nfsgw) on your machine

-in non-secure mode, the user running the nfs gateway is the proxy user

-groups are the groups allowed for users of the mount point

-hosts are the host addresses allowed for the mount point
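As a sketch with nfsuser substituted for the {nfsuser} placeholder, the wildcard values below simply allow all groups and hosts (this matches what the walkthrough later in this article uses):

<property>
    <name>hadoop.proxyuser.nfsuser.groups</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.nfsuser.hosts</name>
    <value>*</value>
</property>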

hdfs-site.xml

-nfs.exports.allowed.hosts

-set which hosts are allowed to access NFS and with what permissions. The default is "ro".

<property>
    <name>nfs.exports.allowed.hosts</name>
    <value>* rw</value>
</property>

-dfs.namenode.accesstime.precision

-access time update precision in milliseconds (0 disables access time updates)

<property>
    <name>dfs.namenode.accesstime.precision</name>
    <value>3600000</value>
</property>

-nfs.dump.dir

-set the dump directory

<property>
    <name>nfs.dump.dir</name>
    <value>/tmp/.hdfs-nfs</value>
</property>

-nfs.rtmax & nfs.wtmax

-users can access HDFS as if it were part of the local file system, though hard links and random writes are not supported. To optimize NFS I/O for large files, you can increase the NFS transfer sizes (rsize and wsize) at mount time. By default, the NFS gateway supports 1 MB as the maximum transfer size. For larger transfer sizes, set "nfs.rtmax" and "nfs.wtmax" in hdfs-site.xml.

<property>
    <name>nfs.rtmax</name>
    <value>4194304</value>
</property>
<property>
    <name>nfs.wtmax</name>
    <value>1048576</value>
</property>
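With these server-side maxima set, a client can request matching transfer sizes at mount time; a hypothetical example (rsize and wsize are in bytes and are capped by nfs.rtmax and nfs.wtmax):

# mount -t nfs -o vers=3,proto=tcp,nolock,sync,rsize=4194304,wsize=1048576 192.168.4.5:/ /mnt/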

-nfs.port.monitoring.disabled

-controls whether NFS mounts from clients on non-privileged ports are allowed (set to true to allow them)

<property>
    <name>nfs.port.monitoring.disabled</name>
    <value>false</value>
</property>

nfs.map

-the system administrator must ensure that users on the NFS client have the same name and UID as users on the NFS gateway host. Users created on different hosts need their UID changed (for example, with "usermod -u 123 myusername") on either the NFS client or the NFS gateway host. If the client user's UID and the NFS gateway user's UID cannot be kept consistent, a static mapping relationship must be configured in nfs.map.
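A sketch of a static mapping file (the Hadoop documentation uses /etc/nfs.map by default; the IDs below are made up):

uid 10 100 # map the remote (client) UID 10 to the local (gateway) UID 100
gid 11 101 # map the remote GID 11 to the local GID 101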

For the experimental environment setup, refer to https://blog.51cto.com/13558754/2066708

Configure NFS Gateway

# cd /usr/local/hadoop/

# ./sbin/stop-all.sh

# jps

6598 Jps

# vim /etc/hosts

192.168.4.1 master

192.168.4.2 node1

192.168.4.3 node2

192.168.4.4 node3

192.168.4.5 nfsgw // add the new host

# for i in {1..5}

> do

> rsync -a /etc/hosts 192.168.4.${i}:/etc/hosts

> done

# scp /etc/yum.repos.d/yum.repo nfsgw:/etc/yum.repos.d/

yum.repo    100%   61   0.1KB/s   00:00

# ssh nfsgw

Last login: Wed Jan 31 08:20:55 2018 from master

# sed -ri "s/^(SELINUX=).*/\1disabled/" /etc/selinux/config ; yum -y remove firewalld

# reboot

// add users

[root@nfsgw ~]# adduser -g 10 -u 1001 nfsuser

[root@nfsgw ~]# id nfsuser

uid=1001(nfsuser) gid=10(wheel) groups=10(wheel)

[root@master ~]# adduser -g 10 -u 1001 nfsuser

[root@master ~]# id nfsuser

[root@master ~]# cd /usr/local/hadoop/

[root@master hadoop]# cd etc/hadoop/

[root@master hadoop]# vim core-site.xml

<property>
    <name>hadoop.proxyuser.nfsuser.groups</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.nfsuser.hosts</name>
    <value>*</value>
</property>

[root@master hadoop]# for i in node{1..3}

> do

> rsync -a /usr/local/hadoop/etc/hadoop/ ${i}:/usr/local/hadoop/etc/hadoop/ -e "ssh"

> done

[root@master ~]# ssh nfsgw

[root@nfsgw ~]# yum -y install java-1.8.0-openjdk-devel.x86_64

[root@nfsgw ~]# cd /usr/local/

[root@nfsgw local]# rsync -azSH --delete master:/usr/local/hadoop . -e "ssh" // sync the hadoop installation

[root@nfsgw local]# yum -y remove rpcbind nfs-utils

[root@master ~]# cd /usr/local/hadoop/

[root@master hadoop]# ./sbin/start-dfs.sh // start the cluster

[root@master hadoop]# jps

6755 NameNode

7062 Jps

6953 SecondaryNameNode

[root@master hadoop]# ./bin/hdfs dfsadmin -report // check the nodes

[root@master hadoop]# ssh nfsgw

Last login: Wed Jan 31 08:26:48 2018 from master

[root@nfsgw ~]# cd /usr/local/hadoop/

[root@nfsgw hadoop]# cd etc/hadoop/

[root@nfsgw hadoop]# vim hdfs-site.xml

...

<property>
    <name>nfs.exports.allowed.hosts</name>
    <value>* rw</value> <!-- which hosts may mount -->
</property>
<property>
    <name>dfs.namenode.accesstime.precision</name>
    <value>3600000</value> <!-- access time update precision -->
</property>
<property>
    <name>nfs.dump.dir</name>
    <value>/var/nfstemp</value> <!-- dump directory -->
</property>
<property>
    <name>nfs.rtmax</name>
    <value>4194304</value> <!-- read transfer size -->
</property>
<property>
    <name>nfs.wtmax</name>
    <value>1048576</value> <!-- write transfer size -->
</property>
<property>
    <name>nfs.port.monitoring.disabled</name>
    <value>false</value> <!-- port monitoring; true would allow non-privileged client ports -->
</property>

...

[root@nfsgw ~]# mkdir /var/nfstemp

[root@nfsgw ~]# chown 1001.10 /var/nfstemp/

[root@nfsgw ~]# setfacl -m u:nfsuser:rwx /usr/local/hadoop/logs/

[root@nfsgw ~]# cd /usr/local/hadoop/

-start the portmap service

[root@nfsgw hadoop]# ./sbin/hadoop-daemon.sh --script ./bin/hdfs start portmap

Starting portmap, logging to /usr/local/hadoop/logs/hadoop-root-portmap-nfsgw.out

[root@nfsgw hadoop]# su nfsuser

-start nfs3

[nfsuser@nfsgw hadoop]$ ./sbin/hadoop-daemon.sh --script ./bin/hdfs start nfs3

Starting nfs3, logging to /usr/local/hadoop/logs/hadoop-nfsuser-nfs3-nfsgw.out

[nfsuser@nfsgw hadoop]$ jps

2728 Jps

2671 Nfs3

[nfsuser@nfsgw hadoop]$ exit

exit

[root@nfsgw hadoop]# jps

2738 Jps

2588 Portmap

2671 -- process information unavailable

-Special attention should be paid here:

-portmap must be started by the root user

-nfs3 must be started by the proxy user configured in core-site.xml (nfsuser in this setup)
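Before mounting from a client, you can check that the gateway has registered its services; rpcinfo and showmount are standard client-side tools from the rpcbind and nfs-utils packages:

# rpcinfo -p 192.168.4.5

# showmount -e 192.168.4.5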

Mount nfs

-currently, NFSv3 only uses TCP as the transport protocol. NLM is not supported, so the mount option "nolock" is required. The mount option "sync" is strongly recommended, since it minimizes or avoids reordered writes and therefore gives more predictable throughput. Failing to specify the sync option may lead to unreliable behavior when uploading large files.

-if you must use a soft mount, give it a relatively long timeout (at least no shorter than the default timeout on the host)
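If a soft mount is unavoidable, a hypothetical example with a long timeout (timeo is expressed in tenths of a second):

# mount -t nfs -o vers=3,proto=tcp,nolock,sync,soft,timeo=600 192.168.4.5:/ /mnt/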

# mount -t nfs -o vers=3,proto=tcp,nolock,sync,noatime,noacl 192.168.4.5:/ /mnt/

# ls /mnt/

input output tmp
