Deep Analysis of HBase-Memstore Flush and flush shell Operation

2025-07-12 Update From: SLTechnology News&Howtos

// Memstore flush mechanism and the flush shell command

// Memstore is one of the most important components of HBase and is central to its high-performance random reads and writes. A solid understanding of how Memstore works, how it operates, and how it is configured is a great help in HBase cluster management and performance tuning.

Writing mechanism (approximately)

1. HBase is based on the LSM-Tree model.

2. All data updates and inserts are first written to the Memstore (and appended sequentially to the HLog).

3. Once the Memstore reaches a specified size, these modifications are written to disk in a batch, producing a new HFile. This design greatly improves HBase write performance.

4. HBase requires all data in an HFile to be sorted by RowKey so that it can be retrieved efficiently by RowKey.

5. Memstore data is sorted before it is flushed to an HFile, so the data in the file is ordered.
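
The write path above can be illustrated with a minimal Python sketch. It is a toy model, not HBase code: the class `TinyStore`, its `flush_size` threshold (counted in entries rather than bytes), and the in-memory lists standing in for the HLog and HFiles are all assumptions made for illustration.

```python
class TinyStore:
    """Toy LSM-style store: log append, in-memory buffer, sorted flush."""

    def __init__(self, flush_size=3):
        self.flush_size = flush_size  # hypothetical threshold, counted in entries
        self.wal = []                 # stands in for the HLog (append-only)
        self.memstore = {}            # stands in for the Memstore
        self.hfiles = []              # each flushed "file" is a list sorted by row key

    def put(self, row_key, value):
        self.wal.append((row_key, value))  # sequential append to the log
        self.memstore[row_key] = value     # write into the in-memory buffer
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def flush(self):
        if not self.memstore:
            return
        # Sort by row key so the new file is ordered, then start a fresh buffer.
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}


store = TinyStore(flush_size=3)
for key in ["row3", "row1", "row2", "row5"]:
    store.put(key, "v-" + key)

print(store.hfiles)    # one flushed file, ordered by row key
print(store.memstore)  # the entry written after the flush
```

The batch write in `flush` is what turns many small random writes into one sequential write of an ordered file.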

Reading mechanism (approximately)

1. According to the locality principle, newly written data is more likely to be read.

2. Therefore, when reading data, HBase first checks whether the requested data is in Memstore.

3. If the write cache (Memstore) misses, HBase looks in the read cache (BlockCache); only if the read cache also misses does it read the HFile. Finally, the merged result is returned to the user.

// As this shows, Memstore is critical to both the write and the read performance of HBase, and flush is its core operation.
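
The lookup order described above can be sketched as follows. This is a simplified illustration only: the function `get` and the dictionaries standing in for the Memstore, the read cache, and the HFiles are hypothetical, and the cache here is keyed by row, whereas the real read cache holds HFile blocks.

```python
def get(row_key, memstore, block_cache, hfiles):
    """Return (value, source) following the simplified read order above."""
    if row_key in memstore:              # newest data lives in the write cache
        return memstore[row_key], "memstore"
    if row_key in block_cache:           # then the read cache
        return block_cache[row_key], "blockcache"
    for hfile in reversed(hfiles):       # finally the files, newest first
        if row_key in hfile:
            block_cache[row_key] = hfile[row_key]  # keep it for later reads
            return hfile[row_key], "hfile"
    return None, "miss"


memstore = {"r1": "new"}
block_cache = {}
hfiles = [{"r1": "old", "r2": "b"}]

print(get("r1", memstore, block_cache, hfiles))
print(get("r2", memstore, block_cache, hfiles))
print(get("r2", memstore, block_cache, hfiles))
```

The second read of `r2` is served from the cache because the first read populated it.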

Question:

Next, we focus on the in-depth analysis of the flush operation of Memstore:

1. First analyze the scenarios in which HBase will trigger flush.

2. Then analyze the operation flow of the whole flush with the source code.

3. Finally, look at the configuration parameters related to flush, which matter a great deal for performance tuning and troubleshooting.

In which cases will HBase trigger flush operations?

// Note:

Note that the smallest flush unit is the HRegion, not a single MemStore. // This can be understood as follows:

When a region has two column families, it has two Memstores. For example, if one of them reaches 128 MB and triggers a flush, the other is flushed along with it, even if it has not reached 128 MB.

In other words, a flush is carried out at the region level: all the Memstores in the region are flushed together rather than one at a time.

Clearly, if an HRegion contains too many Memstores, every flush is expensive, so keep the number of ColumnFamily as small as possible when designing a table. One to three column families is the suggested range, and a table serving a hot workload should use a single column family.

1. Memstore level limit: when the size of any MemStore in a Region reaches the upper limit (hbase.hregion.memstore.flush.size, default 128 MB), a Memstore flush is triggered.

2. Region level limit: when the total size of all Memstores in a Region reaches the upper limit (hbase.hregion.memstore.block.multiplier * hbase.hregion.memstore.flush.size, default 2 * 128 MB = 256 MB), a Memstore flush is triggered.

3. Region Server level limit: when the total size of all Memstores on a Region Server reaches the upper limit (hbase.regionserver.global.memstore.upperLimit * hbase_heapsize, by default 40% of the JVM heap), some Memstores are flushed. Flushes proceed from the largest Memstore to the smallest: the Region with the largest Memstore is flushed first, then the next largest, until total Memstore memory falls below the threshold (hbase.regionserver.global.memstore.lowerLimit * hbase_heapsize, by default 38% of the JVM heap).

// This one has a large impact.

4. HLog count limit: when the number of HLogs on a Region Server reaches the upper limit (configurable through hbase.regionserver.maxlogs), the system picks the Region or Regions that correspond to the earliest HLog and flushes them.

5. Periodic flush: HBase flushes Memstores periodically, by default every hour, so that no Memstore goes unpersisted for a long time. To avoid the problems caused by all MemStores flushing at once, a periodic flush is given a random delay of about 20000 ms.

6. Manual flush: users can run the shell command flush 'tablename' or flush 'regionname' to flush a table or a single Region, respectively.

// These are the 6 ways a flush can be triggered. All of the parameters above are available in CDH.
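
The size-based limits (conditions 1 to 3) can be sketched in Python as below. This is an illustration under stated assumptions, not HBase's implementation: the function names `flush_reason` and `regions_to_flush` are hypothetical, and the defaults simply mirror the values quoted in the list above.

```python
MB = 1024 * 1024


def flush_reason(memstore_sizes, flush_size=128 * MB, block_multiplier=2):
    """memstore_sizes: sizes in bytes of the Memstores of one Region."""
    if sum(memstore_sizes) >= block_multiplier * flush_size:
        return "region-level limit"
    if any(size >= flush_size for size in memstore_sizes):
        return "memstore-level limit"
    return None


def regions_to_flush(region_sizes, heap_bytes, upper=0.40, lower=0.38):
    """region_sizes: dict of Region name -> total Memstore bytes.

    Returns the Regions to flush, largest first, once the upper
    watermark is reached, stopping when usage drops below the lower one.
    """
    total = sum(region_sizes.values())
    if total < upper * heap_bytes:
        return []
    chosen = []
    by_size = sorted(region_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        if total < lower * heap_bytes:
            break
        chosen.append(name)
        total -= size
    return chosen


print(flush_reason([130 * MB, 10 * MB]))
print(flush_reason([130 * MB, 130 * MB]))
print(regions_to_flush({"a": 250 * MB, "b": 240 * MB, "c": 230 * MB}, 1000 * MB))
```

The gap between the two watermarks is what keeps the Region Server from flushing again immediately after it drops back under the upper limit.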

Memstore Flush process // see the blog for details

To reduce the impact of flushing on reads and writes, HBase uses an approach similar to two-phase commit and divides the whole flush into three phases:

1. Prepare stage

(1) Iterate over all the Memstores in the current Region, take a snapshot of the current dataset (kvset) in each Memstore, and then create a new kvset.

(2) All subsequent writes go to the new kvset. For the duration of the flush, reads first check the kvset and the snapshot, and go to the HFiles only if the data is found in neither.

(3) The prepare phase takes the updateLock to block write requests and releases it when the phase ends.

(4) Because this phase involves no time-consuming operation, the lock is held only very briefly.

2. Flush stage

(1) traverse all Memstore and persist the snapshot generated in prepare phase into temporary files, which will be uniformly placed in the .tmp directory.

(2) this process is relatively time-consuming because it involves disk IO operations.

3. Commit stage

(1) iterate through all the Memstore and move the temporary files generated in the flush phase to the specified ColumnFamily directory

(2) generate corresponding storefile and Reader for HFile, and add storefile to the storefiles list of HStore

(3) finally, clear the snapshot generated in the prepare phase.
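
The three phases can be sketched in Python as follows. This is a toy model under stated assumptions: `ToyMemstore`, `ToyRegion`, and the lists standing in for temporary files and store files are hypothetical names, and only the lock scope and the kvset/snapshot swap follow the description above.

```python
import threading


class ToyMemstore:
    def __init__(self):
        self.kvset = {}     # active dataset that receives writes
        self.snapshot = {}  # frozen dataset being flushed


class ToyRegion:
    def __init__(self, families):
        self.update_lock = threading.Lock()
        self.memstores = {cf: ToyMemstore() for cf in families}
        self.storefiles = {cf: [] for cf in families}

    def put(self, cf, row, value):
        with self.update_lock:
            self.memstores[cf].kvset[row] = value

    def flush(self):
        # 1. Prepare: under the lock, swap each kvset into a snapshot.
        #    Nothing slow happens here, so writes are blocked only briefly.
        with self.update_lock:
            for m in self.memstores.values():
                m.snapshot, m.kvset = m.kvset, {}
        # 2. Flush: persist each snapshot to a temporary "file" (no lock held).
        tmp = {cf: sorted(m.snapshot.items()) for cf, m in self.memstores.items()}
        # 3. Commit: move the temporary files into place, then clear snapshots.
        for cf, m in self.memstores.items():
            if tmp[cf]:
                self.storefiles[cf].append(tmp[cf])
            m.snapshot = {}


region = ToyRegion(["cf1", "cf2"])
region.put("cf1", "r2", "b")
region.put("cf1", "r1", "a")
region.put("cf2", "r9", "z")
region.flush()
print(region.storefiles)
```

Only the swap in step 1 runs under the lock, which is why the slow disk write in step 2 does not block writers in this model.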

The influence of Memstore Flush on Business Reading and Writing

1. For HBase users, what they are most concerned about is what impact flush behavior will have on read and write requests and how to avoid it.

2. Flushes triggered in different ways affect user requests differently, so the trigger modes are grouped below by how large their impact is:

(1) Minimal impact.

Normally, most Memstore flushes have little effect on business reads and writes. This covers periodic flushes, manual flushes, and flushes triggered by the Memstore level limit, the HLog count limit, or the Region level limit. In these scenarios only writes to the affected Region are blocked, and only briefly, on the order of milliseconds.

(2) Large impact.

However, a flush caused by hitting the Region Server level limit has a much greater impact on user requests. It blocks all updates that land on that Region Server for a long time, possibly for minutes. The Region Server level limit is normally hard to hit, but in some extreme cases it cannot be ruled out. Here is a scenario that may trigger this kind of flush:

Related JVM and HBase configuration:

maxHeap = 71 G

hbase.regionserver.global.memstore.upperLimit = 0.35

hbase.regionserver.global.memstore.lowerLimit = 0.30

With this configuration, the Region Server level limit is triggered when total Memstore memory reaches 24.9 G, as the log shows:

2015-10-12 13:05 INFO [regionserver60020] regionserver.MemStoreFlusher: globalMemStoreLimit=24.9 G, globalMemStoreLimitLowMark=21.3 G, maxHeap=71 G

// Applying condition 3 above:

71 * 0.30 = 21.3 and 71 * 0.35 = 24.85 // that is, total Memstore memory on the Region Server moves between 21.3 G and 24.85 G: once it rises above 24.85 G, flushes are triggered and continue until it falls below 21.3 G.

Analysis:

Assuming each Memstore can grow to the default 128 MB, then under the above configuration, if every Region has two Memstores and 100 Regions are running on the Region Server, total memory consumption works out to 128 MB * 100 * 2 = 25.6 G > 24.9 G. In that case the Region Server level limit is clearly triggered, which has a large impact on users.
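
The arithmetic in this scenario can be checked with a few lines of Python. The variable names are chosen here for illustration, and the conversion from MB to G divides by 1000 to match the article's 25.6 figure.

```python
max_heap_gb = 71
upper_limit = 0.35   # hbase.regionserver.global.memstore.upperLimit
lower_limit = 0.30   # hbase.regionserver.global.memstore.lowerLimit

high_mark_gb = round(max_heap_gb * upper_limit, 2)  # flushing starts here
low_mark_gb = round(max_heap_gb * lower_limit, 2)   # flushing stops here

memstore_mb = 128
regions = 100
memstores_per_region = 2
worst_case_gb = round(memstore_mb * regions * memstores_per_region / 1000, 2)

print(high_mark_gb, low_mark_gb, worst_case_gb)
print("limit exceeded:", worst_case_gb > high_mark_gb)
```

Halving either the Region count or the number of column families brings the worst case down to 12.8 G, well under the limit, which is the reasoning behind the recommendations in the summary.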

Summary:

1. According to the above analysis, two main factors lead to the Region Server level limit being triggered. The first is the total number of Regions running on a Region Server.

2. The second is the number of Stores per Region (that is, the number of ColumnFamily in the table).

3. For the former, depending on read and write load, it is generally recommended to keep the number of Regions on a production Region Server at about 50-80. Too few wastes resources; too many may trigger other problems.

4. For the latter, use as few ColumnFamily as possible. If the data model really needs several, keep it to 3 at most.

Reader question 1:

A query has to look at both the Memstore and the files in a Store.

If the needed block is found in the BlockCache, there is no need to read that file.

The Memstore is always read (otherwise the data returned could be wrong).

The cache and the file are not necessarily both read, but a newly generated file that has not yet been read is certainly not in the BlockCache.

Answer: Basically correct.

Reader question 2:

In the Memstore flush mechanism:

One trigger condition refers to "the number of HLog in a Region Server". Why would a Region Server have a number of HLogs, when the blogger also said in "HBase-data writing process parsing" that "each Region Server has one HLog"? Is there something wrong with the description here? I don't quite understand it and would appreciate the blogger's answer.

Answer: "HBase-data writing process parsing" says that "each Region Server has one HLog" to stress that all Regions share the HLog. All Regions on a Region Server write to the same HLog, and when that HLog exceeds a size threshold, a new HLog file is created to receive new writes. So there are in fact multiple HLog files.

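How one shared log turns into several files can be sketched as below. This is a toy model, not HBase code: the class `ToyWal`, its `roll_size` threshold (counted in entries rather than bytes), and the method `regions_in_oldest` are hypothetical names used only for illustration.

```python
class ToyWal:
    """One shared log per server that rolls to a new file when full."""

    def __init__(self, roll_size=2):
        self.roll_size = roll_size  # hypothetical threshold, counted in entries
        self.files = [[]]           # oldest file first

    def append(self, region, edit):
        if len(self.files[-1]) >= self.roll_size:
            self.files.append([])   # roll: start a new log file
        self.files[-1].append((region, edit))

    def regions_in_oldest(self):
        """Regions with edits in the earliest file (flush candidates)."""
        return sorted({region for region, _ in self.files[0]})


wal = ToyWal(roll_size=2)
for region in ["r1", "r2", "r1", "r3", "r2"]:
    wal.append(region, "edit")

print(len(wal.files))
print(wal.regions_in_oldest())
```

In this model, a check such as `len(wal.files)` against a configured maximum would correspond to the HLog count trigger, and `regions_in_oldest` to the Regions chosen for flushing.
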
Reader question 3:

"Therefore, when reading data, HBase first checks whether the requested data is in Memstore. If the write cache misses, it looks in the read cache, and only if the read cache also misses does it read the HFile."

Hello! I have a question here: for a scan query, if only part of the matching data is in the write cache or read cache, does the server return that part directly, or does it go on to look in the HFiles?

Answer: Every query returns only after all of its results have been gathered.

Reference link: http://hbasefly.com/2016/03/23/hbase-memstore-flush/
