How does HBase work?

2025-07-03 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)05/31 Report--

In this article, the editor walks you through the working principles of HBase. The content is detailed and written from a professional point of view; I hope you take something away from it.

1. HBase system architecture diagram

The HBase architecture is built around several main parts: HMaster, HRegionServer, ZooKeeper, and HRegion (which internally includes HLog, StoreFile, and MemStore).

2. HMaster introduction

An HBase cluster uses a master/slave architecture. HMaster (hereinafter Master) is the leader of the cluster and is responsible for overall management. It does little of the low-level work itself, so its load is not high.

2.1 Master responsibilities

(1) Assign Regions to RegionServers.

(2) Balance the load across RegionServers.

(3) Detect offline or dead RegionServers and reassign the Regions they hosted.

(4) Garbage-collect obsolete files on HDFS.

(5) Process schema update requests.

2.2 Master working mechanism

(1) Master goes online

1) Acquire the unique lock that represents the active master from ZooKeeper, which prevents any other master from becoming active.

2) Scan the server parent node on ZooKeeper to get the list of currently available region servers.

3) Communicate with each region server to learn which regions are currently assigned to which region server.

4) Scan the set of .META. regions, work out which regions are currently unassigned, and put them on the list of regions to be assigned.
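
Step 4 above amounts to a set difference between the regions listed in .META. and the regions the live region servers report. A minimal sketch, assuming regions are identified by plain strings (the function name, region names, and server names are hypothetical and not part of any HBase API):

```python
def find_unassigned(meta_regions, server_reports):
    """Return regions listed in .META. that no live region server reports."""
    assigned = set()
    for regions in server_reports.values():
        assigned.update(regions)
    # Keep the .META. order so the result is deterministic.
    return [r for r in meta_regions if r not in assigned]


# Hypothetical data: four regions known to .META., two of them being served.
meta_regions = ["t1,a", "t1,m", "t2,a", "t2,k"]
server_reports = {"rs1": ["t1,a"], "rs2": ["t2,a"]}

to_assign = find_unassigned(meta_regions, server_reports)
print(to_assign)  # ['t1,m', 't2,k']
```

The result is the list of regions to be assigned; how the master then distributes them is a separate policy decision.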

(2) Master goes offline

Because the master only maintains the metadata of tables and regions and does not take part in table data I/O (addressing goes through ZooKeeper and the RegionServers; reads and writes go to the RegionServers), its load is very low. When the master goes offline, all metadata modifications are frozen: tables cannot be created or deleted, table schemas cannot be modified, regions cannot be load-balanced, region online and offline events cannot be handled, and regions cannot be merged. The only exception is region split, which proceeds normally because only the region server is involved. Table data can still be read and written normally, so a master outage has no impact on the HBase cluster for a short period of time. As the startup process shows, all the information held by the master is redundant (it can all be collected or computed from other parts of the system). In general, therefore, an HBase cluster always has one master providing service and one or more standby masters waiting for the chance to take its place.
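
The active/standby behaviour can be illustrated with a toy lock service. This is only a sketch: `LockService` is a hypothetical in-memory stand-in for the active master lock in ZooKeeper, not the real ZooKeeper API, and the master names are made up.

```python
class LockService:
    """Toy stand-in for the ZooKeeper active-master lock (not the real API)."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, name):
        # Only one caller can hold the lock at a time.
        if self.holder is None:
            self.holder = name
        return self.holder == name

    def release(self, name):
        # Models the lock disappearing when its holder goes offline.
        if self.holder == name:
            self.holder = None


zk = LockService()
first = zk.try_acquire("master-a")      # master-a becomes the active master
second = zk.try_acquire("master-b")     # master-b stays on standby
zk.release("master-a")                  # the active master goes offline
takeover = zk.try_acquire("master-b")   # the standby takes over
print(first, second, takeover)
```

In a real cluster the standby would be notified of the change rather than retrying by hand; the sketch only shows that at most one master holds the lock at any moment.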

3. HRegionServer introduction

HRegionServer (hereinafter RegionServer) is the slave in the cluster. It handles the actual read and write requests and carries out operations such as data compaction and split.

3.1 RegionServer responsibilities

(1) Maintain the Regions assigned to it by the Master and process the I/O requests for those Regions.

(2) Split Regions that grow too large while the cluster is running.

3.2 RegionServer working mechanism

(1) RegionServer goes online

The master uses ZooKeeper to obtain region server information. When a region server starts, it first creates a file of its own under the server directory in ZooKeeper and acquires an exclusive lock on that file. Because the master subscribes to change messages for the server directory, ZooKeeper notifies the master promptly whenever files in that directory are added or changed.

(2) RegionServer goes offline

When a region server goes offline, its session with ZooKeeper is closed and ZooKeeper automatically releases the exclusive lock on the file that represents the server. Meanwhile, the master keeps polling the lock status of the files in the server directory. If the master finds that a region server has lost its exclusive lock (or the master has failed to communicate with the region server several times in a row), the master tries to acquire the lock that represents that region server. Once it succeeds, one of the following must be true:

1) The network between the region server and ZooKeeper is disconnected.

2) The region server is dead.

In either case, the region server can no longer serve its regions, so the master deletes the file that represents the region server from the server directory and assigns its regions to other region servers that are still alive.

If a brief network problem causes a region server to lose its lock, then after it reconnects to ZooKeeper it keeps trying to reacquire the lock on its file, as long as that file still exists. Once it has the lock again, it can continue to provide service.
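
A minimal sketch of the reassignment step, assuming a simple round-robin policy. The article does not say how the master picks the new servers, so the policy, the function name, and the server and region names are all hypothetical.

```python
from itertools import cycle


def reassign(assignments, dead_server):
    """Move every region of dead_server to the surviving servers, round-robin."""
    survivors = sorted(s for s in assignments if s != dead_server)
    if not survivors:
        raise RuntimeError("no live region server left")
    orphaned = assignments.pop(dead_server, [])
    for region, server in zip(orphaned, cycle(survivors)):
        assignments[server].append(region)
    return assignments


# Hypothetical cluster state: rs1 serves three regions and then dies.
assignments = {"rs1": ["r1", "r2", "r3"], "rs2": ["r4"], "rs3": ["r5"]}
reassign(assignments, "rs1")
print(assignments)  # {'rs2': ['r4', 'r1', 'r3'], 'rs3': ['r5', 'r2']}
```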

4. Zookeeper introduction

ZooKeeper is a popular choice for master-slave designs in the Hadoop ecosystem, where it coordinates the whole cluster so that it works in a unified and orderly way. In the HBase architecture, ZooKeeper provides file-system-like access to directories and files (called znodes), which distributed systems typically use to coordinate ownership, register services, and listen for updates.

Each region server registers its own ephemeral node in ZooKeeper. The master server uses these ephemeral nodes to discover available servers, and can also use them to track machine failures and network partitions.

On the ZooKeeper server, each ephemeral node belongs to a session, which is created automatically when a client connects to the ZooKeeper server. Each session has a unique id on the server, and the client keeps sending heartbeats to ZooKeeper with that id. If a failure occurs and the ZooKeeper client process dies, the ZooKeeper server decides that the session has timed out and automatically deletes the ephemeral nodes that belong to it.
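
The session and heartbeat mechanism can be modelled in a few lines. This is a toy sketch, not the ZooKeeper API: the `SessionTracker` class, the explicit `now` timestamps, the 30-unit timeout, and the `/hbase/rs/...` paths are assumptions made for illustration.

```python
class SessionTracker:
    """Toy model of ZooKeeper sessions and ephemeral nodes (not the real API)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_beat = {}   # session id -> time of the last heartbeat
        self.ephemeral = {}   # node path -> owning session id

    def heartbeat(self, session_id, now):
        self.last_beat[session_id] = now

    def create_ephemeral(self, path, session_id):
        self.ephemeral[path] = session_id

    def expire(self, now):
        """Drop sessions that exceeded the timeout and delete their nodes."""
        dead = {s for s, t in self.last_beat.items() if now - t > self.timeout}
        for session_id in dead:
            del self.last_beat[session_id]
        self.ephemeral = {
            path: s for path, s in self.ephemeral.items() if s not in dead
        }
        return sorted(dead)


tracker = SessionTracker(timeout=30)
tracker.heartbeat("s1", now=0)
tracker.heartbeat("s2", now=0)
tracker.create_ephemeral("/hbase/rs/rs1", "s1")
tracker.create_ephemeral("/hbase/rs/rs2", "s2")

tracker.heartbeat("s2", now=25)      # only s2 keeps sending heartbeats
expired = tracker.expire(now=40)     # s1 has been silent for 40 > 30
live_nodes = sorted(tracker.ephemeral)
print(expired, live_nodes)
```

After the expiry pass only the node owned by the session that kept sending heartbeats remains, which is what lets the master notice that the other server is gone.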

HBase also uses ZooKeeper to ensure that only one master server is running, to store the bootstrap location used for region discovery, as a registry for region servers, and for other purposes. ZooKeeper is a key component; HBase cannot work without it.

The main functions of ZooKeeper are:

(1) Guarantee the uniqueness of the master. How it works: when a master starts, it acquires the active master lock from ZooKeeper, which prevents other nodes from becoming the master.

(2) Monitor the status of RegionServers in real time and promptly notify the master when a RegionServer goes online or offline.

(3) Store the addressing entry point for all Regions (that is, which server hosts the -ROOT- table).

(4) Store the HBase schema, including which tables exist and which column families each table has. (ZooKeeper stores only the locations of -ROOT- and .META.; the two tables themselves live in HBase.)

5. Region introduction

A Region is the basic unit in which HBase stores and manages data. Regions and RegionServers have a many-to-one relationship: a Region can be served by only one RegionServer at a time, while a RegionServer can serve multiple Regions at the same time.

5.1 Relationship between Region and Table

A Region is a partition of a Table along the row direction. In general, a table starts with a single Region. As data grows and a certain threshold is reached, the RegionServer splits the Region into two Regions, and so on.
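
A minimal sketch of a row-direction split. It assumes the threshold is a row count and that the split point is the middle row key; both are simplifications chosen for illustration, and the function name and row keys are hypothetical.

```python
def split_region(rows, max_rows):
    """Split a sorted list of row keys in half once it exceeds max_rows."""
    if len(rows) <= max_rows:
        return [rows]
    mid = len(rows) // 2
    # Each half covers a contiguous range of row keys.
    return [rows[:mid], rows[mid:]]


region = ["row01", "row02", "row03", "row04", "row05"]
parts = split_region(region, max_rows=4)
print(parts)  # [['row01', 'row02'], ['row03', 'row04', 'row05']]
```

Because the row keys are sorted, each resulting Region still covers one contiguous key range, so a row key maps to exactly one Region.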

5.2 Region subdivision

Digging deeper, a Region can be subdivided further: it is composed of one or more Stores. To understand this, you need a few basic concepts: HFile, HLog, StoreFile, and MemStore.

The figure above shows the underlying flow of a client data request. Put simply: HLog keeps a backup of the data for disaster recovery; MemStore caches data to make reads and writes more efficient; StoreFile is the lowest storage layer and is essentially the same thing as HFile. The former is the logical storage unit from HBase's point of view, and the latter is the physical storage on HDFS (that is, HBase itself does not handle storage; the binary data files are stored on HDFS as HFiles).

5.2.1 About HLog

(1) HLog, also known as the WAL (Write-Ahead Log), is similar to the binlog in MySQL and is used for disaster recovery. HLog records every data change as it happens, so data can be recovered from it.

(2) Each RegionServer maintains a single HLog rather than one per Region, so log entries from different Regions (belonging to different tables) are mixed together.

Pros: all log entries are written to one file, which reduces disk seeks and improves HBase write performance.

Cons: if a RegionServer goes offline and its Regions have to be restored, the HLog must be split by Region and sent to other RegionServers before it can be replayed.

Note: HLog is an ordinary Hadoop SequenceFile. The SequenceFile key is an HLogKey object, which records where the data belongs (table name, region, and so on) together with a sequence number and a timestamp; the value is the actual HBase KeyValue object.
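
The cost described under "Cons" comes from having to separate a mixed log by Region before replaying it. A minimal sketch, assuming each log entry is a plain dictionary carrying a sequence number and a region name (the field names and sample data are hypothetical and much simpler than a real HLogKey):

```python
from collections import defaultdict


def split_log(entries):
    """Group a mixed per-server log by region, preserving sequence order."""
    per_region = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e["seq"]):
        per_region[entry["region"]].append((entry["row"], entry["value"]))
    return dict(per_region)


# Hypothetical log of one RegionServer that was serving two Regions.
hlog = [
    {"seq": 1, "region": "t1,a", "row": "a1", "value": "x"},
    {"seq": 2, "region": "t2,a", "row": "b7", "value": "y"},
    {"seq": 3, "region": "t1,a", "row": "a2", "value": "z"},
]

recovered = split_log(hlog)
print(recovered)  # {'t1,a': [('a1', 'x'), ('a2', 'z')], 't2,a': [('b7', 'y')]}
```

Each per-region list can then be replayed by whichever RegionServer takes over that Region.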

5.2.2 About MemStore

MemStore is a cache layer that sits in front of StoreFile. When a request reaches a Region, it is looked up in MemStore first. On a hit the result is returned directly, which avoids a large-scale StoreFile scan.

6. Data reading and writing process

When data is updated, it is first written to the log (WAL) and to memory (MemStore). The data in MemStore is kept sorted. When MemStore grows to a certain threshold, a new MemStore is created, the old one is added to the flush queue, and a separate thread flushes it to disk, where it becomes a StoreFile. At the same time, the system records a redo point in ZooKeeper, indicating that the changes made before that moment have been persisted (this is the minor compact). If the system fails unexpectedly, the data in memory (MemStore) may be lost; in that case the log (WAL) is used to recover the data written after the checkpoint.

A StoreFile is read-only and cannot be modified once created, so an update in HBase is really a continuous append. When the StoreFiles in a Store reach a certain threshold, a major compact is performed, which merges the changes to the same key and produces one large StoreFile. When the size of a StoreFile reaches a certain threshold, it is split into two StoreFiles.

Because updates to a table are continuously appended, a read request has to access all the StoreFiles and the MemStore in a Store and merge them by row key. Since StoreFiles and the MemStore are both sorted, and StoreFiles have in-memory indexes, the merge is relatively fast.
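
The flush, append-only update, and compaction behaviour can be sketched with a toy Store. This is an illustration under simplifying assumptions: the flush threshold is an entry count, keys are plain strings without timestamps, and the `Store` class is hypothetical rather than an HBase class.

```python
class Store:
    """Toy Store: one MemStore plus a list of immutable StoreFiles."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.memstore = {}
        self.storefiles = []   # oldest first; each is a sorted tuple of (key, value)

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the MemStore out as a sorted, read-only StoreFile.
        self.storefiles.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, key):
        # Newest data wins: MemStore first, then StoreFiles from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for storefile in reversed(self.storefiles):
            for k, v in storefile:
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all StoreFiles into one, keeping the latest value per key.
        merged = {}
        for storefile in self.storefiles:   # oldest first, so newer files overwrite
            merged.update(storefile)
        self.storefiles = [tuple(sorted(merged.items()))]


store = Store(flush_threshold=2)
store.put("k1", "v1")
store.put("k2", "v2")     # MemStore is full: first flush
store.put("k1", "v1b")    # an "update" is just a newer appended entry
store.put("k3", "v3")     # second flush
files_before = len(store.storefiles)
store.compact()
print(files_before, len(store.storefiles), store.get("k1"))
```

Before compaction a read of `k1` has to consult two StoreFiles; afterwards a single merged file holds only the latest value.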

6.1 Read request

(1) The client locates the RegionServer that holds the target data through ZooKeeper, the -ROOT- table, and the .META. table.

(2) The client asks that RegionServer for the target data.

(3) The RegionServer locates the Region that holds the target data and issues the query.

(4) The Region checks MemStore first and returns the result on a hit (MemStore is in memory, so the lookup is fast).

(5) If the data is not in MemStore, the StoreFiles are scanned (this may touch many StoreFiles; a Bloom filter helps skip files that cannot contain the key).
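
Steps (4) and (5) can be sketched as follows. The Bloom filter here is a small hand-written one, sized arbitrarily at 64 bits with 3 hash functions, and StoreFiles are modelled as dictionaries; all names and sizes are assumptions made for illustration, not HBase's implementation.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size=64, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def make_storefile(data):
    """Pair a StoreFile's data with a Bloom filter over its keys."""
    bloom = BloomFilter()
    for key in data:
        bloom.add(key)
    return bloom, data


def read(key, memstore, storefiles):
    """Return (value, number of StoreFiles actually scanned)."""
    if key in memstore:
        return memstore[key], 0
    scanned = 0
    for bloom, data in reversed(storefiles):   # newest first
        if not bloom.might_contain(key):
            continue                           # definitely not in this file
        scanned += 1
        if key in data:
            return data[key], scanned
    return None, scanned


memstore = {"row9": "m"}
storefiles = [make_storefile({"row1": "a"}), make_storefile({"row2": "b"})]

hit = read("row9", memstore, storefiles)       # served from MemStore
found = read("row1", memstore, storefiles)     # served from a StoreFile
missing = read("rowX", memstore, storefiles)   # not stored anywhere
print(hit, found[0], missing[0])
```

A MemStore hit scans no StoreFile at all, and for other keys the filter lets the read skip files that certainly do not contain the key.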

6.2 Write request

(1) The client submits a write request to the RegionServer.

(2) The RegionServer finds the target Region.

(3) The Region checks whether the data is consistent with the schema.

(4) If the client does not specify a timestamp, the current time is used by default.

(5) The update is written to the WAL.

(6) The data is written to MemStore.

(7) The Region determines whether MemStore needs to be flushed to a StoreFile.
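
Steps (3) to (7) can be sketched as a single `put` method. The `Region` class below is a hypothetical in-memory model: the schema is just a set of column family names, the flush threshold is an entry count, and the WAL is a Python list.

```python
import time


class Region:
    """Toy Region showing the order of operations on a write."""

    def __init__(self, families, flush_threshold=3):
        self.families = set(families)      # column families from the table schema
        self.flush_threshold = flush_threshold
        self.wal = []
        self.memstore = {}
        self.storefiles = []

    def put(self, row, family, qualifier, value, timestamp=None):
        # (3) reject writes to column families the schema does not define
        if family not in self.families:
            raise ValueError(f"unknown column family: {family}")
        # (4) default to the current time when the client gives no timestamp
        if timestamp is None:
            timestamp = int(time.time() * 1000)
        cell = (row, family, qualifier, timestamp)
        # (5) append to the log first, (6) then update the in-memory store
        self.wal.append((cell, value))
        self.memstore[cell] = value
        # (7) flush when the MemStore has grown large enough
        if len(self.memstore) >= self.flush_threshold:
            self.storefiles.append(sorted(self.memstore.items()))
            self.memstore = {}
        return timestamp


region = Region(families=["cf"], flush_threshold=2)
region.put("r1", "cf", "a", "1", timestamp=100)
assigned_ts = region.put("r2", "cf", "b", "2")   # no timestamp: server picks one

try:
    region.put("r3", "bad", "c", "3")
    rejected = False
except ValueError:
    rejected = True

print(len(region.wal), len(region.storefiles), rejected)
```

Writing to the log before the MemStore is what makes recovery possible: anything in the MemStore at the time of a crash also exists in the log.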

7. Introduction to the two built-in tables, -ROOT- and .META.

HBase has two built-in tables (-ROOT- and .META.) that manage region-related information; the two tables have the same structure.

General addressing process: client --> ZooKeeper --> -ROOT- table --> .META. table --> RegionServer --> Region --> client
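
The addressing chain can be sketched as three successive lookups. The cluster state below is hypothetical and heavily simplified: each catalog table is reduced to a sorted list of region start keys plus the server hosting each region, whereas real catalog rows carry more information.

```python
from bisect import bisect_right


def locate(sorted_start_keys, row):
    """Index of the region whose start key is the largest one <= row."""
    return bisect_right(sorted_start_keys, row) - 1


# Hypothetical cluster state.
zookeeper = {"/hbase/root-region-server": "rs0"}            # where -ROOT- lives
root_table = {"start_keys": [""], "locations": ["rs1"]}      # where .META. regions live
meta_table = {"start_keys": ["", "m"], "locations": ["rs2", "rs3"]}  # user regions


def find_region_server(row):
    """Return the servers visited: -ROOT- host, .META. host, data host."""
    hops = [zookeeper["/hbase/root-region-server"]]
    hops.append(root_table["locations"][locate(root_table["start_keys"], row)])
    hops.append(meta_table["locations"][locate(meta_table["start_keys"], row)])
    return hops


print(find_region_server("apple"))  # ['rs0', 'rs1', 'rs2']
print(find_region_server("zebra"))  # ['rs0', 'rs1', 'rs3']
```

The last entry in each result is the RegionServer the client finally contacts for the row.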

For a detailed introduction to the two tables and a worked addressing example, see: http://greatwqs.iteye.com/blog/1838904

Note a few points:

(1) Why is it necessary to design two identical tables?

First of all, note that these two tables are ordinary tables too, so either could in principle have multiple regions (in practice only the .META. table may; -ROOT- is very small).

The .META. table stores region-related information, and each row represents one region. The .META. table may itself have many regions scattered across different RegionServers, so the -ROOT- table is used to manage the regions of the .META. table (a little convoluted, but it is simply a hierarchy, much like management levels in a company). The address of the -ROOT- table is then placed in ZooKeeper (default path: /hbase/root-region-server), which gives every query a known entry point.

(2) The -ROOT- table never grows very large and is never split, so it never has multiple regions, which avoids having to nest lookup tables indefinitely.

(3) The -ROOT- table is held in memory, so queries against it are fast.

(4) In practice, a client request does not always go through such a complicated process. The client caches the location information it has already looked up, and the cache is not proactively invalidated.
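
A minimal sketch of such a client-side cache. It assumes that a stale entry is dropped only after a request to the cached server has failed, and it caches per row key for simplicity; the `LocationCache` class and the sample data are hypothetical.

```python
class LocationCache:
    """Toy client cache in front of the full addressing lookup."""

    def __init__(self, lookup):
        self.lookup = lookup   # slow path: the full ZooKeeper/-ROOT-/.META. lookup
        self.cache = {}
        self.lookups = 0       # how many times the slow path was taken

    def locate(self, row):
        if row not in self.cache:
            self.lookups += 1
            self.cache[row] = self.lookup(row)
        return self.cache[row]

    def invalidate(self, row):
        # Assumed to be called only after a request to the cached server fails.
        self.cache.pop(row, None)


locations = {"row1": "rs1"}
client = LocationCache(lambda row: locations[row])

client.locate("row1")            # slow path
client.locate("row1")            # served from the cache
locations["row1"] = "rs2"        # the region moves to another server
stale = client.locate("row1")    # the cache still answers with the old server
client.invalidate("row1")        # the client notices the failure and drops the entry
fresh = client.locate("row1")    # slow path again, now with the new location
print(stale, fresh, client.lookups)
```

The cached answer stays stale until the client drops it, which matches the point above that the cache is not invalidated proactively.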

That concludes the editor's walkthrough of how HBase works. If you have had similar questions, the analysis above may help you work through them.
