Today I will talk about how to build a Hadoop cluster, something many people may not understand well. To help you understand it better, the editor has summarized the following; I hope you get something out of this article.
Installation options for setting up a cluster
The Apache binary tarball: the most flexible option, but also the most installation work (see the sketch after this list).
Packages provided by the various Linux distributions.
Cluster management tools such as Cloudera Manager and Apache Ambari.
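To make the tarball route concrete, here is a minimal sketch; the version number, download URL, and install path are assumptions for illustration, not taken from the original.

    # Download and unpack the Apache binary tarball (version and paths are examples).
    wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
    tar -xzf hadoop-3.3.6.tar.gz -C /opt

    # Point the environment at the new installation.
    export HADOOP_HOME=/opt/hadoop-3.3.6
    export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"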
Cluster specification
Hadoop runs on commodity hardware.
Commodity hardware is not the same as low-end hardware.
Large, database-class machines are not recommended; their price/performance ratio is too poor.
Machines typically have multicore CPUs and multiple disks.
RAID is used for the HDFS namenode's disks, but RAID is not recommended for datanodes.
Cluster scale
How fast does your cluster need to grow?
Network topology
DNSToSwitchMapping, the interface Hadoop uses to map nodes to racks; the default ScriptBasedMapping implementation runs a user-supplied topology script (see the sketch below).
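As a hedged illustration, such a script is wired in through the net.topology.script.file.name property in core-site.xml; the rack names and subnets below are assumptions, and the script must print one rack path per node name it is given.

    #!/bin/bash
    # Example topology script: ScriptBasedMapping calls it with one or more
    # host names or IP addresses and expects one rack path per argument.
    for node in "$@"; do
      case "$node" in
        10.1.1.*) echo "/dc1/rack1" ;;    # assumed subnet for rack 1
        10.1.2.*) echo "/dc1/rack2" ;;    # assumed subnet for rack 2
        *)        echo "/default-rack" ;;
      esac
    done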
Cluster construction and installation
Install Java
Create a Unix user account
Unpack the installation; it is best not to put it in the home directory, because home directories may be NFS-mounted.
SSH configuration (distributed shell, public-key sharing; see the sketch after this list).
Configure Hadoop
Format HDFS file system
Start and stop daemons
Create a user directory
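A minimal sketch of the public-key sharing step mentioned above; the user and the worker host names are assumptions.

    # On the machine the control scripts run from, as the hadoop user:
    ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ''     # empty passphrase, or protect it with ssh-agent
    # Copy the public key to every machine listed in the slaves file.
    for host in worker1 worker2 worker3; do
      ssh-copy-id -i ~/.ssh/id_rsa.pub "$host"
    done
    ssh worker1 hostname                         # verify that passwordless login works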
start-dfs.sh starts a namenode on each machine returned by executing hdfs getconf -namenodes, a datanode on each machine listed in the slaves file, and a secondary namenode on each machine returned by executing hdfs getconf -secondarynamenodes.
start-yarn.sh starts a resource manager on the local machine and a node manager on each machine listed in the slaves file.
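A hedged sketch of formatting the filesystem and bringing the daemons up; which user runs which command follows the account split described below, and is an assumption here.

    # As the hdfs user: format the namenode's storage directories (first run only).
    hdfs namenode -format
    # As the hdfs user: start the HDFS daemons described above.
    start-dfs.sh
    # As the yarn user: start the resource manager and node managers.
    start-yarn.sh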
When creating a user's home directory, this is also a good time to set a space quota on the directory.
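For illustration, a sketch of creating a user directory and capping its space; the user name and quota size are assumptions.

    # Create the user's home directory in HDFS and hand ownership to them.
    hadoop fs -mkdir /user/alice
    hadoop fs -chown alice:alice /user/alice
    # Limit the directory to 1 TB of space.
    hdfs dfsadmin -setSpaceQuota 1t /user/alice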
It is best to create dedicated Unix user accounts so that Hadoop processes are kept separate from other services running on the same machine.
The HDFS, MapReduce, and YARN services typically run as separate users named hdfs, mapred, and yarn; they all belong to the same hadoop group.
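A minimal sketch of creating those accounts; the exact flags vary slightly between Linux distributions.

    # Create the shared group and one account per service.
    groupadd hadoop
    useradd -g hadoop -m hdfs
    useradd -g hadoop -m mapred
    useradd -g hadoop -m yarn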
Install hadoop
Hadoop configuration
The configuration files live in the etc/hadoop directory of the Hadoop distribution package.
They can be relocated by setting the HADOOP_CONF_DIR environment variable.
hadoop-env.sh, mapred-env.sh, yarn-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, slaves, hadoop-metrics2.properties, log4j.properties, hadoop-policy.xml
Configuration management
Each node in the cluster keeps its own set of configuration files; it is recommended to keep them in sync with configuration-management or parallel-shell tools.
Environment settings
fs.defaultFS, dfs.namenode.name.dir, dfs.datanode.data.dir, dfs.namenode.checkpoint.dir
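As a hedged illustration, these properties might look like the following in core-site.xml and hdfs-site.xml; the host name and disk paths are assumptions, and the surrounding <configuration> element is omitted.

    <!-- core-site.xml -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://namenode-host:8020</value>            <!-- assumed namenode host and port -->
    </property>

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/disk1/hdfs/name,/remote/hdfs/name</value>   <!-- namenode metadata, with a remote copy -->
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/disk1/hdfs/data,/disk2/hdfs/data</value>    <!-- one entry per physical disk -->
    </property>
    <property>
      <name>dfs.namenode.checkpoint.dir</name>
      <value>/disk1/hdfs/namesecondary</value>            <!-- secondary namenode checkpoints -->
    </property>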
Daemon memory: each daemon is allocated 1000 MB of heap by default; yarn.nodemanager.resource.memory-mb sets the total memory a node manager may hand out to containers.
By default, each task is assumed to occupy one (virtual) core.
Address and port number of the Hadoop daemon
CPU settings in yarn and MapReduce
By default, the HDFS storage directories are placed under the directory set by the hadoop.tmp.dir property (/tmp/hadoop-${user.name}), so they need to be changed manually to point at durable locations.
Memory settings in yarn and MapReduce
Memory heap size: by default, each daemon is allocated 1000 MB of memory.
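A hedged yarn-site.xml and mapred-site.xml sketch of the memory and CPU knobs discussed above; the sizes are example values, not recommendations from the original.

    <!-- yarn-site.xml -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>16384</value>      <!-- memory available to containers on this node (example) -->
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>8</value>          <!-- virtual cores available to containers (example) -->
    </property>

    <!-- mapred-site.xml -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>       <!-- container size requested by each map task (example) -->
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>2048</value>       <!-- container size requested by each reduce task (example) -->
    </property>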
System log files
SSH settings
Some key properties of the Hadoop daemons
Other Hadoop properties
Buffer size: the I/O buffer (io.file.buffer.size) defaults to 4 KB.
HDFS block size: 128 MB by default (dfs.blocksize).
Trash (the recycle bin): how long deleted files are kept is controlled by fs.trash.interval.
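For illustration, a hedged core-site.xml and hdfs-site.xml sketch of these properties; the buffer and trash values are assumptions, and the block size shown is the default.

    <!-- core-site.xml -->
    <property>
      <name>io.file.buffer.size</name>
      <value>131072</value>     <!-- raised from the 4 KB default to 128 KB (example) -->
    </property>
    <property>
      <name>fs.trash.interval</name>
      <value>1440</value>       <!-- keep deleted files in the trash for one day (example, in minutes) -->
    </property>

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.blocksize</name>
      <value>134217728</value>  <!-- 128 MB, the default -->
    </property>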
Security.
Kerberos
Delegation token
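As a hedged sketch, turning on Kerberos authentication and service-level authorization starts from core-site.xml settings like the following; principal and keytab configuration for each daemon is not shown.

    <!-- core-site.xml -->
    <property>
      <name>hadoop.security.authentication</name>
      <value>kerberos</value>   <!-- the default is "simple", i.e. no authentication -->
    </property>
    <property>
      <name>hadoop.security.authorization</name>
      <value>true</value>       <!-- enables the service-level ACLs in hadoop-policy.xml -->
    </property>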
After reading the above, do you have a better understanding of how to build a Hadoop cluster? If you want to learn more, please follow the industry information channel. Thank you for your support.