How to implement region merge and split in HBase?

2025-02-02 Update From: SLTechnology News&Howtos shulou

1. Region split mechanism

A region stores a large number of rows, ordered by rowkey. When a region holds too much data, query efficiency suffers directly. When a region grows too large, HBase splits it, which is one of HBase's strengths.

HBase has the following region split strategies:

1. ConstantSizeRegionSplitPolicy

Default split policy before version 0.94.

When the size of a region exceeds a fixed threshold (hbase.hregion.max.filesize=10G), a split is triggered and the region is divided evenly into two regions.

In production, however, this policy has a considerable drawback: it makes no distinction between large tables and small tables. A large threshold (hbase.hregion.max.filesize) suits large tables, but small tables may never trigger a split, and in the extreme case a small table has only one region, which is bad for the business. A small threshold suits small tables, but a large table will then produce a very large number of regions across the cluster, which is bad for cluster management, resource usage, and failover.
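The trade-off of a fixed threshold can be made concrete with a little arithmetic. The following is only a minimal sketch: the function names are hypothetical, the table sizes (2 GB and 10 TB) and the 512 MB alternative threshold are made-up illustration values, and the region count is a lower bound that assumes every region stays at or below the threshold.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def should_split(region_size: int, max_filesize: int = 10 * GB) -> bool:
    """Fixed-threshold rule: split once the region exceeds max_filesize."""
    return region_size > max_filesize


def min_region_count(table_size: int, max_filesize: int) -> int:
    """Lower bound on the region count if no region may exceed the threshold."""
    return max(1, math.ceil(table_size / max_filesize))


small_table = 2 * GB          # hypothetical small table
large_table = 10 * 1024 * GB  # hypothetical 10 TB table

# With a 10 GB threshold the small table never splits and stays in one region.
print(should_split(small_table))                # False
# The large table needs at least 1024 regions at 10 GB ...
print(min_region_count(large_table, 10 * GB))   # 1024
# ... but at least 20480 regions if the threshold is lowered to 512 MB.
print(min_region_count(large_table, 512 * MB))  # 20480
```

Lowering the threshold to help the small table multiplies the region count of the large one, which is the problem the later policies try to address.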

2. IncreasingToUpperBoundRegionSplitPolicy

Default split policy from version 0.94 up to version 2.0.

This policy is slightly more complicated. In general it works like ConstantSizeRegionSplitPolicy: a region larger than the threshold triggers a split. The threshold, however, is not a fixed value. It is adjusted continuously, and the adjustment depends on how many regions of the same table are hosted on the current RegionServer.

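The split threshold is:

min(regioncount ^ 3 * 128 MB * 2, hbase.hregion.max.filesize)

Here regioncount is the number of regions of the table on the current RegionServer, and 128 MB is the memstore flush size. A region is split when it reaches this size.

For example:

First split: 1 ^ 3 * 256 = 256 MB

Second split: 2 ^ 3 * 256 = 2048 MB

Third split: 3 ^ 3 * 256 = 6912 MB

Fourth split: 4 ^ 3 * 256 = 16384 MB > 10 GB, so the lower value, 10 GB, is used

From then on, the split size is always 10 GB.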
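The threshold sequence above can be reproduced with a few lines of Python. This is a sketch of the formula as described in this article, not the HBase source code; the function name is hypothetical, and the defaults (128 MB flush size, 10 GB maximum file size) are the values used in the example.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def increasing_split_threshold(region_count: int,
                               flush_size: int = 128 * MB,
                               max_filesize: int = 10 * GB) -> int:
    """Threshold in bytes: region_count^3 * 2 * flush_size, capped at max_filesize."""
    return min(region_count ** 3 * 2 * flush_size, max_filesize)


for count in range(1, 6):
    print(count, increasing_split_threshold(count) // MB, "MB")
# 1 256 MB
# 2 2048 MB
# 3 6912 MB
# 4 10240 MB
# 5 10240 MB
```

From the fourth split on, the cubic term exceeds 10 GB, so the cap applies and the threshold stays constant.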

3. SteppingSplitPolicy

Default split policy in version 2.0.

The threshold changes again in this policy, and it is simpler than in IncreasingToUpperBoundRegionSplitPolicy. It still depends on the number of regions of the table on the current RegionServer: if that number is 1, the split threshold is flush size * 2; otherwise it is MaxRegionFileSize.

For large clusters with both large and small tables, this policy is friendlier than IncreasingToUpperBoundRegionSplitPolicy: small tables no longer produce a large number of small regions, because the early, aggressive splitting stops after the first split.
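The SteppingSplitPolicy rule is small enough to write down directly. The following is a minimal sketch of the rule as stated above, with a hypothetical function name and assumed defaults of 128 MB for the flush size and 10 GB for MaxRegionFileSize.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def stepping_split_threshold(region_count: int,
                             flush_size: int = 128 * MB,
                             max_filesize: int = 10 * GB) -> int:
    """flush_size * 2 while the table has a single region here, else max_filesize."""
    if region_count == 1:
        return 2 * flush_size
    return max_filesize


print(stepping_split_threshold(1) // MB, "MB")  # 256 MB
print(stepping_split_threshold(2) // MB, "MB")  # 10240 MB
```

Compared with the cubic formula of the previous policy, there is only one small step: the first region splits early, and every later split waits for the full maximum size.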

4. KeyPrefixRegionSplitPolicy

Rows are grouped by rowkey prefix, where the prefix is a specified number of leading characters of the rowkey. For example, if every rowkey is 16 characters long and the first 5 characters are the prefix, then rows whose first 5 characters are the same are kept in the same region when a region is split.
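One way to picture the prefix policy is that the candidate split key is truncated to the prefix length, so the boundary can never fall between two rows that share a prefix. The sketch below illustrates that idea only; the function name and the sample rowkeys are hypothetical.

```python
def key_prefix_split_point(candidate: bytes, prefix_length: int) -> bytes:
    """Truncate the candidate split key to its prefix."""
    return candidate[:min(prefix_length, len(candidate))]


# Hypothetical 16-character rowkeys with a 5-character prefix.
keys = [
    b"abcdd00000000001",
    b"abcde00000000001",
    b"abcde00000012345",
    b"abcde99999999999",
    b"abcdf00000000001",
]

# Suppose the midpoint chosen for the split is a key in the middle of the "abcde" group.
split_point = key_prefix_split_point(b"abcde00000012345", 5)
print(split_point)  # b'abcde'

# Rows sort bytewise; everything below the split point goes to the first daughter region.
left = [k for k in keys if k < split_point]
right = [k for k in keys if k >= split_point]
print(left)   # only the b"abcdd..." row
print(right)  # all b"abcde..." rows stay together, followed by b"abcdf..."
```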

5. DelimitedKeyPrefixRegionSplitPolicy

Ensures that rows with the same prefix are in the same region. For example, if the rowkey format is userid_eventtype_eventid and the specified delimiter is _, a split keeps all rows with the same userid in the same region.

6. DisabledRegionSplitPolicy

Automatic splitting is disabled. Regions have to be split manually.
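For DelimitedKeyPrefixRegionSplitPolicy, the same truncation idea applies, except that the prefix ends at the first delimiter instead of at a fixed length. The following is a sketch of that idea under stated assumptions: the function name and the sample rowkeys are hypothetical, and returning the key unchanged when no delimiter is found is a choice made for this sketch.

```python
def delimited_prefix_split_point(candidate: bytes, delimiter: bytes = b"_") -> bytes:
    """Truncate the candidate split key at the first delimiter, if there is one."""
    index = candidate.find(delimiter)
    if index > 0:
        return candidate[:index]
    return candidate  # no usable delimiter: leave the key unchanged


# Hypothetical rowkeys in the form userid_eventtype_eventid.
keys = [
    b"user17_click_0001",
    b"user42_click_0001",
    b"user42_click_0007",
    b"user42_view_0003",
    b"user99_view_0002",
]

split_point = delimited_prefix_split_point(b"user42_click_0007")
print(split_point)  # b'user42'

left = [k for k in keys if k < split_point]
right = [k for k in keys if k >= split_point]
print(left)   # only the user17 row
print(right)  # every user42 row lands on the same side of the split
```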

2. Region merging mechanism

1.1 About region merging

Region merging is done for maintenance, not for performance. For example, after a large amount of data has been deleted, each region becomes very small, and keeping many small regions is wasteful. The regions can then be merged, which reduces the number of regions that the RegionServer nodes have to host.

1.2 How to merge regions

1.2.1 Cold merge of regions through the Merge class

Before performing the merge, you must shut down the HBase cluster.

Create an HBase table:

create 'test','info1',SPLITS => ['1000','2000','3000']

Then view the table's regions.
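A pre-split table with n split points starts with n + 1 regions, each covering a half-open rowkey range. The sketch below lists those ranges, assuming the table was pre-split at 1000, 2000 and 3000 (an inference from the region names used later in this article). The function name is hypothetical, and an empty string stands for an open boundary.

```python
def region_ranges(split_points):
    """Return (start_key, end_key) pairs; '' means unbounded on that side."""
    bounds = [""] + list(split_points) + [""]
    return list(zip(bounds[:-1], bounds[1:]))


for start, end in region_ranges(["1000", "2000", "3000"]):
    print(f"start={start!r:8} end={end!r}")
# start=''       end='1000'
# start='1000'   end='2000'
# start='2000'   end='3000'
# start='3000'   end=''
```

The start key of each range is what appears as the second field of the region name, which is why the first region of the table has an empty second field.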

Requirement:

Merge the following two regions of the test table:

test,,1565940912661.62d28d7d20f18debd2e7dac093bc09d8.

test,1000,1565940912661.5b6f9e8dad3880bcc825826d12e81436.

This is done with the org.apache.hadoop.hbase.util.Merge class. Do not enter hbase shell; run the command directly from the command line (the HBase cluster must be shut down first):

hbase org.apache.hadoop.hbase.util.Merge test test,,1565940912661.62d28d7d20f18debd2e7dac093bc09d8. test,1000,1565940912661.5b6f9e8dad3880bcc825826d12e81436.

After the merge succeeds, check the result in the web UI.

1.2.2 Hot merge of Region through online_merge

There is no need to shut down the HBase cluster; the merge is performed online.

Unlike the cold merge, online_merge takes the hash value of each region as its parameter. The hash value of a region is the last segment of the region name: the string between the final two periods.
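Extracting the hash value from a full region name is a simple string operation. The helper below is a hypothetical sketch that takes the segment between the last two periods and then assembles the shell command text for two regions; it only builds the string and does not talk to HBase.

```python
def encoded_region_name(region_name: str) -> str:
    """Return the segment between the last two periods of a region name."""
    return region_name.rstrip(".").rsplit(".", 1)[-1]


region_a = "test,2000,1565940912661.c2212a3956b814a6f0d57a90983a8515."
region_b = "test,3000,1565940912661.553dd4db667814cf2f050561167ca030."

hash_a = encoded_region_name(region_a)
hash_b = encoded_region_name(region_b)
print(hash_a)  # c2212a3956b814a6f0d57a90983a8515
print(hash_b)  # 553dd4db667814cf2f050561167ca030

# Command text to type into hbase shell.
print(f"merge_region '{hash_a}','{hash_b}'")
```

Splitting from the right keeps the helper correct even if a rowkey in the region name happens to contain a period.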

Requirement: merge the following two regions of the test table:

test,2000,1565940912661.c2212a3956b814a6f0d57a90983a8515.

test,3000,1565940912661.553dd4db667814cf2f050561167ca030.

Enter hbase shell and run:

merge_region 'c2212a3956b814a6f0d57a90983a8515','553dd4db667814cf2f050561167ca030'

After the merge succeeds, check the result in the web UI.
