Summary Review of HBaseCon Asia 2019 Track 1


Shulou (Shulou.com) 06/02 report --

Couldn't make it to HBaseCon? Can't listen to three Tracks at the same time, and haven't mastered the art of being in several places at once?

It doesn't matter! Xiaomi Cloud Technology will bring you all the highlights in three installments.

Previous installment: Track 2 Highlights Review

Track 1: HBase Internals

Track 1 is the forum focused on HBase internals; its talks are aimed mainly at HBase developers.

Xiaomi gave two talks in Track 1. One introduced offheap optimizations on the HBase read path, aimed chiefly at reducing GC pressure and P99 latency; the other covered the Split WAL/ACL reimplementation on Procedure V2, which not only guarantees correctness but also reduces the dependence on ZooKeeper, making HBase easier to deploy in the future.

Intel introduced how to place the Bucket Cache on Persistent Memory, a new hardware tier that offers a good trade-off between price and performance.

Cloudera shared the status of HBCK2. Since HBase 2.0, hbck1 retains only its check function; repairs must be made with the HBCK2 tool. Improvements to HBCK2 therefore matter a great deal to the stability of HBase 2.0, and the community is continuing to improve it.

Flipkart introduced their work on data consistency testing. At present the community release process only runs ITBLL, with no thorough check of replication data consistency. Flipkart engineers also mentioned in the RoundTable that they plan to contribute this work to the community.

The talks from Alibaba and Huawei focused more on improvements made to HBase inside each company; we look forward to seeing them contributed to the open source community as soon as possible.

1. HBCK2: Concepts, trends and recipes for fixing issues within HBase 2

PPT download link: http://t.cn/AijGUxMa

Wellington Chevreuil, an engineer from Cloudera, shared the latest developments in HBCK2.

HBCK1 is a relatively mature tool: it can check whether every Region in the cluster is healthy and can repair a variety of common problems. Because HBase 2.x redesigns almost all operational workflows around Procedure V2, the probability of state inconsistency should in theory drop sharply; but since bugs in the implementation are still possible, HBCK2 was designed to fix those abnormal states.

Currently HBCK2 is a very lightweight repair tool whose code lives in a separate repository called hbase-operator-tools. You first compile it to get a JAR, then run repair operations through the hbase command. The core repair operations are:

Assign and unassign region:

hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar assigns 1588230740

When tableState inconsistencies are found, setTableState can fix them.

The bypass option skips Procedures that are stuck.

Beyond repair operations, a cluster still needs a tool for global checking. For now, global checking can still be done with HBCK1, but HBCK1's repair functions have been disabled; when a problem is found, use HBCK2 to repair it.

2. Further GC optimization for HBase 2.x: Reading HFileBlock into offheap directly

PPT download link: http://t.cn/AijGUQqC

This topic was shared by Anoop, a senior PMC member from Intel, and Hu Zhan, an engineer at Xiaomi. Anoop introduced the background of offheap and the HBase 2.0 read/write path. The fundamental goal is to minimize the impact of GC on p99 and p999 latency by moving the core memory allocations to GC-independent offheap, so that request latency is no longer affected by stop-the-world pauses. But the Xiaomi HBase team found that even with the offheap read/write path in HBase 2.0, p999 was still limited by Young GC. Investigation showed that on a cache miss, reading a block from an HDFS file still allocates on heap; at 64KB per block, memory easily piles up in the Young generation.

The most direct solution is to read blocks into an offheap ByteBuffer pool, but because of the RAMCache it is hard to find the right moment to release a buffer. So the Xiaomi team used reference counting to solve the memory reclamation problem.

The design is shown in the figure above. The RegionServer treats the RAMCache and the RpcHandler on the read path as two independent reference paths: a block cannot be released while either path still references it, and must be released once neither does. This is the key point of the talk.
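The two-reference-path idea can be sketched in a few lines. This is an illustrative model only (the class and pool below are hypothetical, not HBase's actual ByteBuff/RefCnt classes): a buffer returns to the offheap pool only after both the RAMCache and the in-flight RPC handler have released it.

```python
class RefCountedBlock:
    """Toy model of a refcounted HFile block backed by a pooled buffer."""

    def __init__(self, data, pool):
        self.data = data
        self._pool = pool
        self._refs = 1          # created with one reference (the reading handler)

    def retain(self):
        self._refs += 1          # e.g. the RAMCache takes a second reference
        return self

    def release(self):
        self._refs -= 1
        if self._refs == 0:      # no path references the block any more
            self._pool.append(self.data)   # recycle buffer into the offheap pool
            self.data = None

pool = []
block = RefCountedBlock(bytearray(64 * 1024), pool)  # block read from HDFS
block.retain()         # RAMCache references it
block.release()        # RPC handler finishes and releases
assert pool == []      # still cached: buffer must not be recycled yet
block.release()        # block evicted from RAMCache
assert len(pool) == 1  # only now is the buffer returned to the pool
```

The invariant matters because either path may finish first: the RPC may complete before eviction, or eviction may happen mid-scan; refcounting handles both orders.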

In tests on Xiaomi's large data sets, enabling this feature reduced the number of Young GCs by roughly 15-20%, reduced the memory reclaimed per Young GC by about 80%, and improved both throughput and latency to varying degrees. Since cache hit rates are rarely 100%, this feature is a genuinely useful performance optimization. It will be released in HBase 2.3.0 and HBase 3.0.0.

3. BDS: A data synchronization platform for HBase

PPT download link: http://t.cn/AijGUg1X

This topic was shared by Xiong Jiannan, head of data links at Ali-HBase. It covered the design of cross-cluster data migration for HBase in the cloud. For community HBase users, the standard solution for cross-cluster migration is to combine snapshots and replication, which handle the full and incremental data respectively.

Ali's BDS adopts a similar idea: multiple workers copy HFiles concurrently to migrate the full data set. Notably, the process does not depend on a YARN cluster, and BDS can control the overall migration rate by dynamically adjusting the workers. In addition, the locality of the target cluster is taken into account during migration, which is very friendly for cloud users.
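The worker-pool-with-throttling pattern described above can be sketched as follows. This is a minimal stand-in, not BDS code: `copy_fn` is a hypothetical placeholder for the real HDFS copy, and the sleep-based throttle is a crude substitute for a proper token bucket.

```python
import queue
import threading
import time

def copy_hfiles(files, workers=4, max_bytes_per_sec=None, copy_fn=lambda name: None):
    """Drain a queue of (name, size) HFiles with a fixed worker pool,
    optionally throttling against a byte-rate budget."""
    q = queue.Queue()
    for f in files:
        q.put(f)
    lock = threading.Lock()
    copied = []

    def worker():
        while True:
            try:
                name, size = q.get_nowait()
            except queue.Empty:
                return                       # queue drained, worker exits
            if max_bytes_per_sec:
                time.sleep(size / max_bytes_per_sec)  # crude rate limiting
            copy_fn(name)                    # real HDFS file copy goes here
            with lock:
                copied.append(name)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return copied
```

Raising `workers` or `max_bytes_per_sec` speeds the migration up; lowering them protects the source cluster, which mirrors how BDS tunes its rate by adjusting workers.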

For the incremental data generated during the full copy, BDS scans the HLog directly and writes the incremental entries to the peer cluster. The whole process reads HDFS directly and is decoupled from the HBase cluster on the source side.

For cloud users this scheme serves not only data migration but also data backup. Packaging it as a standalone system makes for a very friendly user experience.

4. The Procedure v2 Implementation of WAL Splitting and ACL

PPT download link: http://t.cn/AijG4w1R

The topic was shared by HBase Committer Mei from Xiaomi, who is the only female Committer in China.

The talk was divided into three parts:

The first part introduced the core principles of Procedure V2. Her walkthrough of the Procedure V2 components and her demonstration of the rollback process are among the clearest materials on Procedure V2 I have seen; anyone interested in Procedure V2 is strongly encouraged to study the slides.
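The rollback process demonstrated in the slides boils down to an undo stack. The sketch below is an illustrative simplification (not HBase's Procedure API): each completed step pushes its rollback action, and a failure unwinds the stack in reverse order.

```python
def run_procedure(steps):
    """Run (execute, rollback) pairs; on failure, undo completed steps
    in reverse order, as a Procedure-V2-style framework would."""
    done = []                      # rollback stack of completed steps
    for execute, rollback in steps:
        try:
            execute()
            done.append(rollback)
        except Exception:
            while done:
                done.pop()()       # undo most-recent step first
            return "ROLLED_BACK"
    return "SUCCESS"

# A two-step procedure whose second step fails:
log = []
state = run_procedure([
    (lambda: log.append("create-dir"), lambda: log.append("undo-create-dir")),
    (lambda: 1 / 0,                    lambda: log.append("undo-step-2")),
])
assert state == "ROLLED_BACK"
assert log == ["create-dir", "undo-create-dir"]   # only completed work is undone
```

The real framework also persists each state transition to a WAL so the stack survives a Master crash; that durability is the part this sketch leaves out.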

The second part introduced how the community's Grant/Revoke ACL flow was rebuilt on Procedure V2. The refactoring has several goals:

The original design used ZooKeeper notifications to propagate ACL updates to each RegionServer. The whole process depended on ZooKeeper and was effectively asynchronous: if the ACL cache update failed on some RegionServers (unlikely but possible), ACL permissions could become inconsistent across nodes. Rewritten on Procedure V2, the process becomes synchronous, so this problem disappears, and the dependency on ZooKeeper is removed.
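The asynchronous-versus-synchronous difference can be shown with a toy fan-out. This is a hedged sketch with made-up class and method names (`FakeRS`, `update_acl_cache`), not the HBase implementation: the grant only completes once every server acknowledges, instead of being fired at ZooKeeper and forgotten.

```python
class FakeRS:
    """Stand-in for a RegionServer holding an ACL cache."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.acl = []

    def update_acl_cache(self, entry):
        if self.healthy:
            self.acl.append(entry)
        return self.healthy        # ack (or not) the update

def grant_sync(region_servers, acl_entry):
    """Synchronous model: the Grant is not successful until every
    RegionServer has acknowledged the new ACL entry."""
    for rs in region_servers:
        if not rs.update_acl_cache(acl_entry):
            # a real procedure framework would retry this step, not just fail
            raise RuntimeError(f"ACL update failed on {rs.name}")
    return "GRANT_COMPLETE"

servers = [FakeRS("rs1"), FakeRS("rs2")]
assert grant_sync(servers, ("alice", "RW")) == "GRANT_COMPLETE"
assert all(s.acl == [("alice", "RW")] for s in servers)   # no node diverges
```

In the old asynchronous model a failed `update_acl_cache` on one node would go unnoticed; here the failure surfaces immediately and the procedure can retry until the cluster converges.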

Another motivation for the refactoring was to expose Coprocessor hooks during Grant and Revoke. A classic scenario: some users want to run offline jobs by scanning a Snapshot, but scanning a Snapshot requires HDFS permissions, and HBase permission management is completely separate from HDFS's. With the new hooks, a Coprocessor can synchronize HBase permissions to HDFS during Grant and Revoke, so that a user with table permissions can immediately access the table's Snapshots. This feature is tracked in HBASE-18659.
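The shape of such a hook can be sketched abstractly. Everything here is hypothetical (the real code would be a Java Coprocessor using Hadoop's ACL APIs, and the path layout is invented for illustration): a post-grant callback mirrors the HBase table permission onto an HDFS-side ACL map.

```python
# Hypothetical mapping from HBase table permissions to HDFS-style bits.
HBASE_TO_HDFS = {"R": "r-x", "RW": "rwx"}

def post_grant(hdfs_acls, user, table, perm):
    """Hook invoked after a successful Grant: add a matching HDFS ACL
    entry for the table's snapshot directory (layout is illustrative)."""
    hdfs_acls.setdefault(table, {})[user] = HBASE_TO_HDFS[perm]

def post_revoke(hdfs_acls, user, table):
    """Hook invoked after a Revoke: drop the mirrored HDFS ACL entry."""
    hdfs_acls.get(table, {}).pop(user, None)

acls = {}
post_grant(acls, "alice", "orders", "R")
assert acls["orders"]["alice"] == "r-x"   # alice can now scan orders snapshots
post_revoke(acls, "alice", "orders")
assert "alice" not in acls["orders"]
```

The point of running this inside the synchronous Grant procedure is ordering: by the time `grant` returns to the client, the HDFS side is already consistent, so a snapshot scan started immediately afterwards does not hit a permission race.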

The third part covered rewriting WAL splitting on Procedure V2. The considerations are similar to the ACL case: turn an asynchronous process into a more controllable synchronous one while removing the dependency on ZooKeeper. See the slides and video for details.

5. HBase Bucket Cache On Persistent Memory

PPT download link: http://t.cn/AijG4MFz

This talk was given by Anoop and Ramkrishna, senior PMC members from Intel, with their Intel colleague Xu Kai also presenting. Persistent Memory is a new type of persistent memory developed by Intel; according to Intel, it costs only about one third as much as DRAM while delivering roughly 90% of its performance, and it is durable. It is a new storage medium with a high performance-to-price ratio.

Take Xiaomi's machines as an example: each HBase machine has 128GB of memory plus about 12 x 900GB SSDs. A single machine stores nearly 10TB of data but has only 128GB of memory, a memory-to-disk ratio of little more than 1%. Latency-sensitive businesses demand higher cache hit rates from HBase, so what can be done? Intel's idea is to put the cache on higher-capacity Persistent Memory with a controlled performance loss; for example, using 1TB of Persistent Memory for the BucketCache on a 10TB machine would greatly increase the cache hit rate.

From their test results, we can see that there is indeed a great improvement in performance.

Of course, as we have discussed internally, even without special hardware like Persistent Memory one can mix the BucketCache across memory and SSD: keep the hottest data in memory and the second-hottest on SSD, where it is at least read directly from the local disk. Neither HDFS short-circuit local reads nor HDFS remote reads can beat reading the local SSD while skipping the HDFS protocol entirely.
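The memory+SSD tiering just described can be sketched as a two-level LRU. This is an illustrative model under stated assumptions (both tiers are plain dicts here; a real BucketCache would serialize blocks to SSD files): blocks evicted from the memory tier demote to the larger "SSD" tier instead of being dropped, and an SSD hit promotes the block back.

```python
from collections import OrderedDict

class TieredCache:
    """Toy two-tier LRU block cache: small hot tier + larger warm tier."""

    def __init__(self, mem_slots, ssd_slots):
        self.mem = OrderedDict()   # hottest blocks, stand-in for DRAM
        self.ssd = OrderedDict()   # second-hottest blocks, stand-in for SSD
        self.mem_slots = mem_slots
        self.ssd_slots = ssd_slots

    def get(self, key):
        if key in self.mem:
            self.mem.move_to_end(key)        # refresh LRU position
            return self.mem[key]
        if key in self.ssd:
            val = self.ssd.pop(key)
            self.put(key, val)               # promote on SSD-tier hit
            return val
        return None                          # miss: caller must read HDFS

    def put(self, key, val):
        self.mem[key] = val
        self.mem.move_to_end(key)
        if len(self.mem) > self.mem_slots:   # demote coldest block to SSD
            k, v = self.mem.popitem(last=False)
            self.ssd[k] = v
            if len(self.ssd) > self.ssd_slots:
                self.ssd.popitem(last=False)  # finally evict for real
```

The benefit mirrors the Persistent Memory argument: total cache capacity grows from the memory tier alone to memory plus SSD, so far fewer reads fall through to the HDFS path.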

6. Distributed Bitmap Index Solution & Lightweight SQL Engine-Lemon SQL

PPT download link: http://t.cn/AijG4pjC

This topic, shared by Huawei engineers Hao Xingjun and Liu Zhi, actually covers two relatively independent pieces of work: a Bitmap index built on HBase, and a lightweight SQL engine built on HBase.

First, Huawei explained that in the security domain they attach many tags to users, and the business layer then queries for sets of users by combining tags (with AND, OR, NOT, and so on). Huawei therefore designed an HBase-based bitmap index, using a Coprocessor to update the main table and the index table together.
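The tag-combination queries reduce to bitwise operations once each tag owns a bitmap. A minimal sketch, not Huawei's implementation (Python integers stand in for the bitmaps; a real system would use compressed bitmaps stored in an HBase index table): bit i of a tag's bitmap is set iff user i carries that tag.

```python
def build_index(user_tags):
    """Build {tag: bitmap} from {user_id: set_of_tags}."""
    index = {}
    for uid, tags in user_tags.items():
        for tag in tags:
            index[tag] = index.get(tag, 0) | (1 << uid)
    return index

def users_of(bitmap, num_users):
    """Decode a bitmap back into the matching user ids."""
    return [i for i in range(num_users) if bitmap >> i & 1]

index = build_index({0: {"vip"}, 1: {"vip", "fraud"}, 2: {"fraud"}})
# "vip AND NOT fraud" is a single bitwise expression:
vip_not_fraud = index["vip"] & ~index["fraud"]
assert users_of(vip_not_fraud, 3) == [0]
```

AND, OR, and NOT map directly onto `&`, `|`, and `~`, which is why the bitmap layout makes arbitrary tag combinations cheap: the cost is per-bitmap, not per-user.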

In the second part, Liu Zhi introduced their lightweight SQL query engine on HBase. Compared with Phoenix, their implementation is lighter, faster, and more scalable. Interested readers can scan the WeChat QR codes at the end of the slides to talk with the two engineers directly.

7. Test-suite for Automating Data-consistency checks on HBase

PPT download link: http://t.cn/AijG4nma

This was the last talk in Track 1, shared by Flipkart engineer Pradeep. (Flipkart, founded in 2007 by two former Amazon employees, is India's largest e-commerce retailer.)

Because Flipkart operates in e-commerce, they demand very strong data consistency from their distributed databases, so they designed a series of test suites to evaluate whether each HBase version meets those stringent requirements. Typical test cases include ZooKeeper network partitions, packet loss between components, clock drift, disks suddenly becoming read-only, and so on. This work is very helpful for making HBase a reliably consistent data store.
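One common building block for such suites is a cheap whole-table comparison between the source and replica clusters after a fault is injected. The sketch below is a generic illustration, not Flipkart's suite (which is not yet public): hash every (rowkey, value) pair of a scan so two clusters can be compared with one digest each.

```python
import hashlib

def table_digest(rows):
    """Digest a scan result ({rowkey: value}) in sorted key order, so equal
    tables produce equal digests regardless of scan order."""
    h = hashlib.sha256()
    for key, value in sorted(rows.items()):
        h.update(key.encode())
        h.update(value.encode())
    return h.hexdigest()

source = {"row1": "v1", "row2": "v2"}
replica = {"row1": "v1", "row2": "v2"}
assert table_digest(source) == table_digest(replica)   # clusters agree

replica["row2"] = "stale"                              # inject a divergence
assert table_digest(source) != table_digest(replica)   # check catches it
```

Run after each fault scenario (partition, clock drift, read-only disk), a digest mismatch pinpoints which injected fault broke replication consistency without shipping full table contents between clusters.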

In the future, Flipkart plans to open-source the test suite on GitHub for other HBase users to reference and evaluate.

Follow "Xiaomi Cloud Technology"

The third installment will bring you the remaining HBaseCon highlights.

What are you waiting for?
