What is the principle of high availability implementation in Hadoop 2.2.0?


2025-04-02 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31

This article explains the principle behind the HDFS high availability (HA) implementation in Hadoop 2.2.0: why it is needed, how the two NameNodes stay in sync, and what is required to deploy it.

Before Hadoop 2.0.0, the NameNode (NN) was a single point of failure in an HDFS cluster. Each cluster had exactly one NameNode, and if the machine hosting it failed, the entire cluster was unavailable until the NN was restarted or an NN process was started on another host.

This affected HDFS availability in two main ways:

(1) Unplanned events: if the machine hosting the NN crashes, the entire cluster is unavailable until the NN is restarted;

(2) Planned events: maintenance such as a hardware or software upgrade on the machine hosting the NN causes cluster downtime.

HDFS high availability solves both of these problems by running two NNs (an Active NN and a Standby NN) in the same cluster. If a machine crashes or needs maintenance, the other NN can take over quickly.

In a typical HA cluster, two separate machines are configured as NNs. At any time, exactly one of them is in the Active state and the other is in the Standby state. The Active NN handles all client operations in the cluster; the Standby NN acts as a hot spare, maintaining enough state to provide a fast failover when necessary.

To keep the Standby NN's state in sync with the Active NN, i.e. to keep their metadata consistent, both NNs communicate with a group of JournalNode (JN) daemons. When the Active NN performs any namespace change, it must persist a record of that change (an edit log entry) to more than half of the JournalNodes. The Standby NN watches the edit log for changes, reads the edits from the JNs, and applies them to its own in-memory namespace. When the Active NN fails, the Standby NN makes sure it has read all edits from the JNs before switching to the Active state. This guarantees that its namespace state is fully synchronized with that of the former Active NN before failover occurs.
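The edit flow described above can be sketched in a few lines of Python. This is a toy model, not Hadoop code: the names `JournalNode`, `quorum_write`, and `StandbyNameNode` are hypothetical, edits are simple `(kind, path)` tuples, and the standby replays whatever any reachable JN holds, whereas the real implementation includes a recovery step for edits that did not reach a majority.

```python
class JournalNode:
    """Toy stand-in for a JournalNode: keeps edits in memory (hypothetical)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.edits = {}  # txid -> (kind, path)

    def write(self, txid, op):
        # A JN that is down cannot acknowledge the write.
        if not self.up:
            return False
        self.edits[txid] = op
        return True


def quorum_write(journal_nodes, txid, op):
    """Return True only if more than half of all JNs persisted the edit."""
    acks = sum(1 for jn in journal_nodes if jn.write(txid, op))
    return acks > len(journal_nodes) // 2


class StandbyNameNode:
    """Toy Standby NN that replays edits from the JNs into its namespace."""

    def __init__(self):
        self.namespace = set()
        self.last_applied_txid = 0
        self.state = "standby"

    def tail_edits(self, journal_nodes):
        # Simplification: merge the edits visible on every reachable JN.
        merged = {}
        for jn in journal_nodes:
            if jn.up:
                merged.update(jn.edits)
        # Apply unseen edits in transaction-id order.
        for txid in sorted(merged):
            if txid <= self.last_applied_txid:
                continue
            kind, path = merged[txid]
            if kind == "mkdir":
                self.namespace.add(path)
            elif kind == "delete":
                self.namespace.discard(path)
            self.last_applied_txid = txid

    def become_active(self, journal_nodes):
        # Catch up on all edits first, then change state.
        self.tail_edits(journal_nodes)
        self.state = "active"
```

With three JNs of which one is down, `quorum_write` still succeeds because two acknowledgements form a majority; with two down it fails. Calling `become_active` replays every outstanding edit before the state flips, which mirrors the "read all edits, then switch" order in the text.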

To provide a fast failover, the Standby NN also needs up-to-date information about where the blocks of each file are stored in the cluster. To achieve this, every DataNode in the cluster is configured with the locations of both the Active NN and the Standby NN, and sends block location information and heartbeats to both of them, as shown in the following figure:

[Figure: High Availability Implementation Principle of HDFS in Hadoop 2.2.0]
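The dual reporting by DataNodes can be illustrated with a small Python sketch. The classes and method names here (`NameNode.receive_heartbeat`, `NameNode.receive_block_report`, `DataNode.report`) are hypothetical and chosen for readability; they do not correspond to Hadoop's actual RPC interfaces.

```python
class NameNode:
    """Toy NN that tracks block locations and DataNode liveness."""

    def __init__(self, name):
        self.name = name
        self.block_map = {}       # block id -> set of DataNode names
        self.last_heartbeat = {}  # DataNode name -> time of last heartbeat

    def receive_heartbeat(self, dn_name, now):
        self.last_heartbeat[dn_name] = now

    def receive_block_report(self, dn_name, block_ids):
        for block_id in block_ids:
            self.block_map.setdefault(block_id, set()).add(dn_name)


class DataNode:
    """Toy DataNode configured with every NN in the HA pair."""

    def __init__(self, name, namenodes):
        self.name = name
        self.namenodes = list(namenodes)  # both Active and Standby
        self.blocks = set()

    def report(self, now):
        # Send the same heartbeat and block report to both NNs, so the
        # Standby already knows block locations when it takes over.
        for nn in self.namenodes:
            nn.receive_heartbeat(self.name, now)
            nn.receive_block_report(self.name, sorted(self.blocks))
```

Because both NNs receive identical reports, their block maps match without any transfer of block information at failover time.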

It is critical that only one NN in the cluster is Active at any one time. Otherwise, the namespace state would diverge between the two NNs (a "split-brain" scenario), leading to data loss or other incorrect results. To prevent this, the JNs allow only one NN to act as a writer at any time. During a failover, the NN that is about to become Active takes over the writer role, which prevents the other NN from continuing to act as Active.
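One common way to enforce a single writer is an increasing epoch number: each new writer claims a higher epoch from a majority of JNs, and JNs reject writes carrying an older epoch. The sketch below illustrates that idea in Python. The class names `FencedJournalNode` and `Writer` and their methods are hypothetical, and the article itself does not describe the exact mechanism, so treat this as an assumption about how the rule could be enforced rather than a description of Hadoop's code.

```python
def majority(results):
    """True if more than half of the boolean results are True."""
    return sum(results) > len(results) // 2


class FencedJournalNode:
    """Toy JN that accepts edits only from the newest writer epoch."""

    def __init__(self):
        self.promised_epoch = 0
        self.edits = []

    def new_epoch(self, epoch):
        # Accept a new writer only if its epoch is higher than any seen.
        if epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch, edit):
        # Reject writes that do not carry the currently promised epoch.
        if epoch != self.promised_epoch:
            return False
        self.edits.append(edit)
        return True


class Writer:
    """Toy NN-side writer that must win a majority before writing."""

    def __init__(self, name, journal_nodes):
        self.name = name
        self.jns = journal_nodes
        self.epoch = None

    def claim(self, epoch):
        ok = majority([jn.new_epoch(epoch) for jn in self.jns])
        if ok:
            self.epoch = epoch
        return ok

    def write(self, edit):
        return majority([jn.journal(self.epoch, edit) for jn in self.jns])
```

Once a second writer claims a higher epoch, every write from the first writer is rejected by the JNs, so the old Active NN cannot keep modifying the shared edit log even if it has not yet noticed the failover.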

To deploy an HA cluster, you need to prepare the following:

NameNode machines: the machines that run the Active NN and the Standby NN should have equivalent hardware.

JournalNode machines: the machines that run the JNs. The JN daemon is relatively lightweight, so it can be co-located with other daemons (e.g. NN, YARN ResourceManager). A cluster must run at least three JN daemons, because edits have to be written to a majority of JNs; three JNs allow the system to tolerate the failure of one of them. You can run more than three JNs, but to actually increase the number of failures the system can tolerate, you should run an odd number (3, 5, 7, etc.). With N JNs, the system tolerates up to (N-1)/2 JN failures (rounded down) and continues to function normally.
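The (N-1)/2 rule is easy to check with a little arithmetic. The following Python sketch uses two hypothetical helper functions, `quorum_size` and `tolerated_jn_failures`, to show why an even number of JNs adds no extra fault tolerance over the odd number just below it.

```python
def quorum_size(n):
    """Smallest number of JNs that is more than half of n."""
    if n < 1:
        raise ValueError("need at least one JournalNode")
    return n // 2 + 1


def tolerated_jn_failures(n):
    """Maximum number of JNs that may fail while a quorum remains."""
    if n < 1:
        raise ValueError("need at least one JournalNode")
    return (n - 1) // 2
```

For 3, 4, 5, and 7 JNs the tolerated failures are 1, 1, 2, and 3 respectively: going from 3 to 4 JNs raises the quorum size from 2 to 3 but still tolerates only one failure.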

In an HA cluster, the Standby NN also performs checkpoints of the namespace state, so there is no need to run a Secondary NameNode, CheckpointNode, or BackupNode. In fact, running any of these daemons in an HA cluster is an error.


