How to create a Hadoop federation

2025-03-28 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

This article introduces how to create a Hadoop federation. Many people run into questions about this in day-to-day operations, so the editor has consulted various sources and put together a simple, easy-to-follow procedure. Hopefully it answers your questions about how to create a Hadoop federation.

I. Why does federation exist?

The resources available to a Hadoop NameNode (NN) are limited by the physical resources of the server it runs on, so a single NN may not meet actual production needs.

II. How federation is implemented

A federation is formed by using multiple NameNodes. The NameNodes are independent and do not need to coordinate with each other. Every NameNode belongs to the federation, and the DataNodes (DNs) serve as common block storage for all of them.

Federation introduces the concept of a block pool. Each namespace has its own block pool, and the DataNodes store blocks for all the block pools in the cluster. Block pools are managed independently of one another: a namespace does not need to coordinate with other namespaces when it generates a block ID, and the failure of one NameNode does not stop the DataNodes from serving the other NameNodes.

A namespace and its block pool together form one management unit. When a namespace is deleted, the corresponding block pool on the DataNodes is deleted as well. During a cluster upgrade, each such unit is upgraded independently.

A clusterID is introduced to identify all the nodes in the cluster. The ID is generated (or supplied) when the first NameNode is formatted, and the same ID must be used when formatting the other NameNodes in the cluster.
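As a rough illustration of where these identifiers show up: a formatted NameNode records its clusterID and blockpoolID in a VERSION file under the `current` subdirectory of its name directory. The sketch below reads both fields from such a file. The sample contents, the IDs, and the helper name `version_field` are invented for illustration; on a real node you would point it at the VERSION file under your own `dfs.namenode.name.dir`.

```shell
#!/bin/sh
# Print the value of one key from a properties-style VERSION file.
# version_field is a hypothetical helper, not part of Hadoop.
version_field() {
    # $1 = key name, $2 = path to the VERSION file
    sed -n "s/^$1=//p" "$2"
}

# Build a sample VERSION file so the sketch runs without a cluster.
# Every ID below is made up.
sample_dir=$(mktemp -d)
cat > "$sample_dir/VERSION" <<'EOF'
namespaceID=1234567
clusterID=CID-example-cluster
cTime=0
storageType=NAME_NODE
blockpoolID=BP-1-10.0.0.1-1700000000000
layoutVersion=-63
EOF

echo "clusterID:   $(version_field clusterID "$sample_dir/VERSION")"
echo "blockpoolID: $(version_field blockpoolID "$sample_dir/VERSION")"
```

NameNodes that belong to the same federation should report the same clusterID but different blockpoolID values.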

III. Main advantages:

Namespace scalability: federation adds horizontal scaling of the namespace. More NameNodes can be added to the cluster, and DataNode storage can be expanded along with them.

Performance: file system throughput is no longer limited by a single NameNode. Adding more NameNodes to the cluster scales file system read/write throughput.

Isolation: different types of applications can be isolated in separate namespaces, which gives some control over how resources are allocated.

IV. Configuration:

Federation configuration is backward compatible: an existing single-NameNode setup keeps working without any configuration change. The new configuration scheme is designed so that the configuration files can be identical on every node in the cluster. Federation introduces the NameServiceID, which identifies a NameNode and is appended as a suffix to that NameNode's configuration parameters.

Step 1: Set the dfs.nameservices property to a comma-separated list of NameServiceIDs. DataNodes use this list to identify the NameNodes in the cluster.

Step 2: For each NameNode, add its configuration parameters with the corresponding NameServiceID as a suffix.
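The two steps can be sketched as a small script that prints an hdfs-site.xml fragment. This is only an illustration under assumptions: the NameServiceIDs (`ns1`, `ns2`), the host names, and the ports 8020 and 9870 are placeholders, and `property` and `federation_site` are hypothetical helpers. Only the RPC and HTTP addresses are shown; a real cluster will usually need further suffixed parameters.

```shell
#!/bin/sh
# Emit one <property> element: $1 = name, $2 = value.
property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print an hdfs-site.xml fragment for a federation.
# Arguments are "NameServiceID=host" pairs, e.g. ns1=nn-host1.
federation_site() {
    # Step 1: collect the NameServiceIDs into a comma-separated list.
    ids=""
    for pair in "$@"; do
        ids="${ids:+$ids,}${pair%%=*}"
    done
    echo "<configuration>"
    property dfs.nameservices "$ids"
    # Step 2: suffix each NameNode parameter with its NameServiceID.
    for pair in "$@"; do
        ns=${pair%%=*}
        host=${pair#*=}
        property "dfs.namenode.rpc-address.$ns" "$host:8020"    # assumed RPC port
        property "dfs.namenode.http-address.$ns" "$host:9870"   # assumed HTTP port
    done
    echo "</configuration>"
}

federation_site ns1=nn-host1 ns2=nn-host2
```

Because the output depends only on the argument list, the same fragment can be generated once and copied unchanged to every node.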

V. Operation:

#Create the federation by formatting a NameNode; if no cluster ID is specified, one is generated automatically

$HADOOP_HOME/bin/hdfs namenode -format [-clusterId <cluster_id>]
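Since every NameNode in the federation has to be formatted with the same cluster ID, it can help to fix the ID in one place. The sketch below is a dry run: it only prints the format command, so it is safe on a machine without Hadoop. The install path, the cluster ID, and the helper name `format_command` are assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the format command instead of running it.
HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}   # assumed install path

# Print the format command for one NameNode.
# $1 = cluster ID; leave empty to let Hadoop generate one (first NameNode only).
format_command() {
    if [ -n "$1" ]; then
        echo "$HADOOP_HOME/bin/hdfs namenode -format -clusterId $1"
    else
        echo "$HADOOP_HOME/bin/hdfs namenode -format"
    fi
}

CLUSTER_ID=CID-example-cluster   # invented ID

echo "# on the first NameNode:"
format_command "$CLUSTER_ID"
echo "# on every additional NameNode (same ID):"
format_command "$CLUSTER_ID"
```

A NameNode formatted with a different cluster ID would not join the same federated cluster, which is why the ID is passed explicitly to each one here.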

#Upgrade an older single-NameNode release to enable federation; a cluster ID can be supplied during the upgrade

$HADOOP_HOME/bin/hdfs start namenode --config $HADOOP_CONF_DIR -upgrade -clusterId <cluster_ID>

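#Add a new NameNode to an existing federation: once it is configured and started, refresh each DataNode so it picks up the new NameNode

$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes <datanode_host_name>:<datanode_ipc_port>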
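The refresh has to be issued against every DataNode in the cluster, so a loop over a host list is a natural fit. The following is a dry-run sketch that only prints the commands. The host names, the install path, the helper name `refresh_commands`, and the IPC port 9867 are assumptions; check `dfs.datanode.ipc.address` in your own configuration for the actual port.

```shell
#!/bin/sh
# Dry-run sketch: prints one refresh command per DataNode.
HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}   # assumed install path
IPC_PORT=${IPC_PORT:-9867}                # assumed DataNode IPC port

# Read DataNode host names (one per line) from stdin and print the
# matching refresh command; blank lines and # comments are skipped.
refresh_commands() {
    while IFS= read -r host; do
        case $host in
            ''|'#'*) continue ;;
        esac
        echo "$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes $host:$IPC_PORT"
    done
}

# Hypothetical host names for illustration.
printf '%s\n' dn-host1 dn-host2 | refresh_commands
```

In practice the host list could come from the cluster's own DataNode list file rather than from `printf`, and the printed commands can be reviewed before being run.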

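#Decommission DataNodes from the federation: distribute the exclude file to all NameNodes

$HADOOP_HOME/sbin/distribute-exclude.sh <exclude_file>

#Refresh all NameNodes so they pick up the new exclude file

$HADOOP_HOME/sbin/refresh-namenodes.sh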
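A minimal sketch of preparing the exclude file for these two scripts, assuming it lists one host per line. The host names, the install path, and the helper name `write_exclude_file` are invented for illustration, and the two Hadoop scripts are only printed, not executed.

```shell
#!/bin/sh
# Write an exclude file: one host per line, sorted, duplicates removed.
# $1 = output path, remaining arguments = hosts to decommission.
write_exclude_file() {
    out=$1
    shift
    printf '%s\n' "$@" | sort -u > "$out"
}

HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}   # assumed install path
exclude_file=$(mktemp)
write_exclude_file "$exclude_file" dn-host3 dn-host1 dn-host3   # invented hosts

cat "$exclude_file"
# Dry run: print the follow-up steps rather than executing them.
echo "$HADOOP_HOME/sbin/distribute-exclude.sh $exclude_file"
echo "$HADOOP_HOME/sbin/refresh-namenodes.sh"
```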

CDH (Cloudera's Distribution, including Apache Hadoop) is one of many branches of Hadoop. It is maintained by Cloudera, built on stable versions of Apache Hadoop, and integrates many patches, so it can be used directly in production environments. See //archive.cloudera.com/cdh6/cdh/5/

Advantages of CDH:

Clear division of versions

Fast version updates

Support for Kerberos security authentication

Clear documentation

Support for multiple installation methods (Cloudera Manager, YUM, RPM, Tarball)

What is CM (Cloudera Manager)?

Cloudera Manager is a tool that makes Hadoop easier to run in a cluster. It greatly simplifies the installation and configuration management of hosts and of services such as Hadoop, Hive, and Spark in the cluster.

Cloudera Manager has four main functions:

(1) Management: Manage clusters, such as adding and deleting nodes.

(2) Monitoring: Monitor the health of the cluster, with comprehensive monitoring of the configured metrics and of overall system operation.

(3) Diagnosis: Diagnose problems that occur in the cluster and suggest solutions for them.

(4) Integration: Integrate multiple Hadoop components.

That concludes this look at how to create a Hadoop federation; hopefully it has resolved your questions. Theory sticks best when it is paired with practice, so go and try it!
