This article introduces how the HBase architecture works in practice: how reads are implemented, and how Compaction, HRegion Split, Load Balancing, and HRegionServer recovery operate. I hope you read it carefully and learn something useful!
Implementation of HBase Reading
From the previous description, we know that when HBase writes, versions of the same Cell (same RowKey/ColumnFamily/Column) are not guaranteed to be stored together; even deleting a Cell just writes a new Cell carrying a Delete tombstone, rather than physically removing anything. So where can the same Cell live? A newly written Cell sits in the MemStore; a Cell that has already been flushed to HDFS sits in one or more StoreFiles (HFiles); and a Cell that was recently read may sit in the BlockCache. Since the same Cell may exist in these three places, a read only needs to scan all three and merge the results (Merge Read). HBase scans in the order BlockCache, MemStore, then StoreFile (HFile). When scanning StoreFiles, it first uses Bloom Filters to skip HFiles that cannot contain the target Cell, then uses the Block Index to locate the Cell quickly, loads the block into the BlockCache, and reads it from there. Because an HStore may contain many StoreFiles (HFiles), a read may have to scan many of them, and too many HFiles degrades read performance.
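As a minimal, self-contained sketch of the Merge Read idea (this is illustrative toy code, not HBase's actual implementation; the Cell record and read method here are hypothetical), consider:

```java
// Toy illustration of Merge Read: the same cell may live in the MemStore,
// the BlockCache, and one or more StoreFiles, so a read merges all sorted
// sources and keeps the newest version, honoring Delete tombstones.
import java.util.*;

public class MergeReadSketch {
    // Simplified cell: row key, timestamp, value; a null value marks a Delete.
    record Cell(String rowKey, long timestamp, String value) {}

    // Merge several sources for one row and return the newest visible value,
    // or null if the newest entry is a tombstone.
    static String read(String rowKey, List<List<Cell>> sources) {
        // Collect every version of the cell from MemStore, BlockCache, HFiles.
        PriorityQueue<Cell> newestFirst =
            new PriorityQueue<>(Comparator.comparingLong(Cell::timestamp).reversed());
        for (List<Cell> source : sources) {
            for (Cell c : source) {
                if (c.rowKey().equals(rowKey)) newestFirst.add(c);
            }
        }
        Cell newest = newestFirst.peek();
        // A Delete is just the newest Cell carrying a tombstone: it masks
        // older versions without physically removing them.
        return (newest == null || newest.value() == null) ? null : newest.value();
    }

    public static void main(String[] args) {
        List<Cell> memStore  = List.of(new Cell("row1", 300L, null));      // tombstone
        List<Cell> storeFile = List.of(new Cell("row1", 100L, "v1"),
                                       new Cell("row1", 200L, "v2"));
        // The tombstone in the MemStore hides the older versions in the HFile.
        System.out.println(read("row1", List.of(memStore, storeFile)));    // null
    }
}
```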
Compaction
Every MemStore Flush creates a new HFile, and too many HFiles hurt read performance, so how is this solved? HBase uses a Compaction mechanism, which is somewhat similar to Java's GC: at first Java keeps allocating memory without freeing it, which improves performance, but there is no free lunch; eventually garbage must be collected, often with a Stop-The-World pause, and such pauses can sometimes cause serious problems (as I have discussed in an earlier article). Design is always a trade-off; nothing is perfect. Like GC in Java, Compaction in HBase comes in two kinds: Minor Compaction and Major Compaction.
Minor Compaction takes small, adjacent StoreFiles and merges them into one larger StoreFile, without processing Deleted or Expired Cells along the way. The result of a Minor Compaction is fewer, larger StoreFiles. (A terminology caveat: the Bigtable paper uses "minor compaction" differently. There, as write operations execute, the memtable grows; when it reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS. That minor compaction has two goals: it shrinks the tablet server's memory usage, and it reduces the amount of commit log that must be read during recovery if the server dies; incoming reads and writes continue while compactions occur. That is to say, Bigtable calls the flush of memtable data into an HFile/SSTable a minor compaction.)
Major Compaction merges all StoreFiles into a single StoreFile; during the merge, Cells marked Deleted are removed, Expired Cells are discarded, and Cells beyond the maximum number of versions are discarded. The result of a Major Compaction is that each HStore has exactly one StoreFile. A Major Compaction can be triggered manually or automatically, but because it generates heavy IO and can hurt performance, it is usually scheduled for idle periods such as weekends or the early morning.
[Diagrams: a Minor Compaction and a Major Compaction, respectively.]
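Flushes and both kinds of Compaction can also be requested by hand through the HBase client's Admin API. A minimal sketch, assuming an HBase 2.x client on the classpath and an illustrative table named my_table (error handling omitted):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CompactionAdminSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            TableName table = TableName.valueOf("my_table"); // illustrative name
            admin.flush(table);        // flush MemStores to new HFiles (Bigtable's "minor compaction")
            admin.compact(table);      // request a Minor Compaction of small, adjacent StoreFiles
            admin.majorCompact(table); // request a Major Compaction: one StoreFile per HStore,
                                       // dropping Deleted/Expired/over-versioned Cells
        }
    }
}
```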
HRegion Split
Initially, a Table has only one HRegion. As data writes accumulate, once an HRegion reaches a certain size it must be Split into two HRegions. That size is set by hbase.hregion.max.filesize, which defaults to 10GB. During the split, two new HRegions are created on the same HRegionServer, each holding half of the parent HRegion's data. When the split completes, the parent HRegion goes offline, and the two new child HRegions register with the HMaster and come online. For Load Balancing reasons, the HMaster may later assign these two new HRegions to other HRegionServers.
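From the client side, splits can also be influenced directly. A sketch assuming an HBase 2.x client, with an illustrative table name, column family, and split points: a table can be pre-split at creation time so writes spread across HRegions immediately, and a split can also be requested manually:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionSplitSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            TableName table = TableName.valueOf("my_table"); // illustrative name
            TableDescriptor desc = TableDescriptorBuilder.newBuilder(table)
                .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
                .build();
            // Pre-split into four HRegions so writes spread out immediately
            // instead of waiting for hbase.hregion.max.filesize to be reached.
            byte[][] splitKeys = {Bytes.toBytes("g"), Bytes.toBytes("n"), Bytes.toBytes("u")};
            admin.createTable(desc, splitKeys);
            // A split can also be requested by hand; HBase picks the split point.
            admin.split(table);
        }
    }
}
```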
HRegion Load Balancer
After an HRegion Split, the two new HRegions initially live on the same HRegionServer as the parent HRegion. For Load Balancing reasons, the HMaster may move one or even both of them to other HRegionServers. Those HRegionServers then serve data whose blocks still reside on remote DataNodes, losing data locality, until the next Major Compaction rewrites the data from the remote node onto the local node.
HRegionServer Recovery
When an HRegionServer goes down, it stops sending Heartbeats to ZooKeeper, so the failure is detected there. ZooKeeper notifies the HMaster, which determines which HRegionServer failed, reassigns that server's HRegions to other HRegionServers, and splits the failed server's WAL, distributing the pieces to the corresponding HRegionServers (the split WAL files are written into the WAL directories of the destination HRegionServers, and thus onto the corresponding DataNodes). Those HRegionServers can then Replay their portion of the WAL to rebuild the lost MemStores.
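A toy sketch of the Replay idea (purely illustrative, not HBase's recovery code; every name in it is hypothetical): edits above the last flushed sequence id are re-applied in order to rebuild the MemStore:

```java
import java.util.*;

public class WalReplaySketch {
    // Simplified WAL entry: a monotonically increasing sequence id plus an edit.
    record WalEntry(long sequenceId, String rowKey, String value) {}

    // Rebuild a MemStore (sorted map of rowKey -> value) from split WAL entries,
    // skipping anything at or below the last sequence id already persisted in HFiles.
    static NavigableMap<String, String> replay(List<WalEntry> walEntries, long lastFlushedSeqId) {
        NavigableMap<String, String> memStore = new TreeMap<>();
        walEntries.stream()
            .filter(e -> e.sequenceId() > lastFlushedSeqId) // already durable in HFiles
            .sorted(Comparator.comparingLong(WalEntry::sequenceId))
            .forEach(e -> memStore.put(e.rowKey(), e.value()));
        return memStore;
    }

    public static void main(String[] args) {
        List<WalEntry> wal = List.of(
            new WalEntry(1, "row1", "v1"),   // flushed before the crash
            new WalEntry(2, "row2", "v2"),   // lost from memory, recovered here
            new WalEntry(3, "row1", "v3"));
        System.out.println(replay(wal, 1L)); // {row1=v3, row2=v2}
    }
}
```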
A Brief Summary of the HBase Architecture
NoSQL has the famous CAP theorem: Consistency, Availability, and Partition Tolerance cannot all be fully achieved. Essentially every NoSQL system on the market today chooses Partition Tolerance in order to scale data horizontally, addressing data volumes (or the resulting performance problems) that a Relational Database cannot handle; that leaves a choice between C and A. HBase chooses Consistency, and then uses multiple HMasters, failure monitoring of HRegionServers, and ZooKeeper as a coordinator, among other means, to improve Availability. Even so, when a network Split-Brain (Network Partition) occurs, the Availability problem cannot be completely solved. From this perspective, Cassandra chooses Availability: it can keep accepting writes during a network Split-Brain, and uses other techniques to restore Consistency, such as consistency checks and repair triggered at read time. This is a fundamental design trade-off.
Advantages of the implementation:
HBase uses a strong consistency model: once a write returns, all readers see the same data.
Automatic scaling through dynamic HRegion Split (and Merge), with HDFS's multi-replica storage providing data durability.
HRegionServers are co-located with DataNodes, keeping data local to the server, which improves read/write performance and reduces network load.
Built-in HRegionServer crash recovery: the WAL is Replayed to recover data not yet persisted to HDFS.
Seamless integration with Hadoop/MapReduce.
Disadvantages of the implementation:
WAL's Replay process can be slow.
Disaster recovery is complex and slow.
Major Compaction can cause an IO storm.
"How to master the HBase architecture" content is introduced here, thank you for reading. If you want to know more about industry-related knowledge, you can pay attention to the website. Xiaobian will output more high-quality practical articles for everyone!