This article mainly shows you how to optimize the configuration of HBase. The content is straightforward and clearly organized; I hope it helps resolve your doubts as you study and learn from it.
Configuration Optimization

zookeeper.session.timeout
Default: 3 minutes (180000ms)
Description: The connection timeout between a RegionServer and ZooKeeper. When the timeout expires, ZooKeeper removes the RegionServer from the cluster's RS list. After receiving the removal notification, HMaster rebalances the regions this server was responsible for, letting the other surviving RegionServers take them over.
Tuning:
This timeout determines whether a RegionServer can fail over in a timely manner. Setting it to one minute or less shortens the failover delay caused by waiting for the timeout to expire.
However, note that for some online applications, the time from RegionServer downtime to recovery is very short (network glitches, crashes, and other failures that operations staff can quickly fix), and lowering the timeout is then not worth it. Once a RegionServer is officially removed from the RS cluster, HMaster starts balancing (letting other RSs recover from the WAL logs recorded by the failed machine); when the faulty RS is later restored by manual intervention, that balancing was pointless, leaving the load uneven and adding burden to the other RSs, especially in deployments with fixed region assignments.
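A minimal sketch of lowering the timeout, assuming a programmatic Configuration override (in a real deployment this property normally lives in hbase-site.xml on every node):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
// Detect a dead RegionServer sooner by shortening the ZooKeeper session
// timeout from the 3-minute default to 1 minute. Keep it longer if ops can
// usually restore a crashed RS before the rebalance would pay off.
conf.setInt("zookeeper.session.timeout", 60000); // milliseconds
```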
hbase.regionserver.handler.count
Default: 10
Description: The number of IO (RPC handler) threads the RegionServer uses to serve requests.
Tuning:
The tuning of this parameter is memory dependent.
Fewer IO threads suit scenarios where each request consumes a lot of memory (a high-capacity single PUT, or a scan configured with a large cache, both of which count as Big PUTs), or where the RegionServer's memory is tight.
More IO threads suit scenarios where each request consumes little memory and the TPS requirement is very high. When setting this value, use memory monitoring as the primary reference.
Note also that if the server hosts few regions and a large number of requests land on a single region, the read-write lock taken when the rapidly filling memstore triggers a flush will affect global TPS; a higher IO thread count is not always better.
Enable RPC-level logging to monitor the memory consumption and GC behavior of each request, then settle on a reasonable IO thread count through repeated load tests.
For reference only: in one load test the IO thread count was set to 100.
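Continuing the conf object from the sketch above, a hedged example of the load-test reference value (validate it with your own memory monitoring and pressure tests before adopting it):

```java
// 100 handler threads: a reference point only, suited to many small,
// low-memory requests at high TPS; lower it for Big PUT or tight-memory RSs.
conf.setInt("hbase.regionserver.handler.count", 100);
```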
hbase.hregion.max.filesize
Default: 256M
Description: The maximum size of a single region on the current RegionServer. When a region exceeds this value, it is automatically split into smaller regions.
Tuning:
Small regions are friendly to splitting and compaction, because splitting a region or compacting the storefiles of a small region is fast and has a low memory footprint. The downside is that splits and compactions become frequent.
In particular, large numbers of small regions that keep splitting and compacting cause large fluctuations in cluster response time. Too many regions not only complicate management but can even trigger bugs in HBase.
Generally, regions of 512MB and below are considered small.
Large regions are not suited to frequent splits and compactions, because a single compaction or split causes a long pause, which greatly impacts the application's read and write performance. In addition, a large region implies larger storefiles, and compacting those is also a memory challenge.
Of course, large regions have their uses. If your application's traffic is low at certain times, doing compactions and splits during those windows completes them successfully while keeping read and write performance stable the rest of the time.
Since splits and compactions affect performance so much, is there a way to avoid them?
Compaction is unavoidable, but splitting can be changed from automatic to manual.
By raising this parameter to a value that is hard to reach, such as 100G, you can indirectly disable automatic splitting (the RegionServer will not split regions that have not reached 100G).
Combined with the RegionSplitter tool, you can then split manually whenever a split is actually needed.
Manual splitting is much more flexible and stable than automatic splitting, while the management cost barely increases; it is recommended for online real-time systems. A sketch follows below.
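A minimal sketch of the manual-split setup, continuing the same conf object (the 100G figure is the article's example; manual splits can then be driven by the RegionSplitter utility or the shell's split command):

```java
// Raise the split threshold beyond any realistic region size so regions
// never split automatically; split by hand during low-traffic windows.
conf.setLong("hbase.hregion.max.filesize", 100L * 1024 * 1024 * 1024); // 100G
```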
In terms of memory, small regions are flexible about the memstore flush size, whereas with large regions the value always ends up too large or too small: too large, and the application's IO wait rises during flushes; too small, and read performance suffers from too many storefiles.
hbase.regionserver.global.memstore.upperLimit/lowerLimit
Default: 0.4/0.35
upperLimit description: hbase.hregion.memstore.flush.size causes all memstores in a region to be flushed when their combined size exceeds that value. RegionServer flushes are processed asynchronously by putting requests on a queue, in a producer-consumer pattern. The problem: when the queue cannot keep up and a large backlog of flush requests accumulates, memory may rise sharply, in the worst case triggering an OOM.
This parameter exists to prevent excessive memory use. When the total memory occupied by the memstores of all regions on a RegionServer reaches 40% of the heap, HBase forcibly blocks all updates and flushes those regions to release the memory held by all memstores.
lowerLimit description: The same as upperLimit, except that when all memstores together reach 35% of the heap, lowerLimit does not flush all of them; it finds the region whose memstore uses the most memory and flushes that one individually, while write updates are still blocked. lowerLimit is a remedy applied before a forced flush of all regions degrades performance. (The corresponding log message reads: "** Flush thread woke up with memory above low water.")
Tuning: This is a heap memory protection parameter, and the default is already suitable for most scenarios.
Adjusting it affects both reads and writes. If write pressure is high and often exceeds this threshold, shrink the read cache hfile.block.cache.size so this threshold can be raised; or, if there is plenty of heap headroom, raise the threshold without touching the read cache size.
If the threshold is not exceeded even under heavy pressure, it is advisable to lower it moderately and load-test to make sure it is not triggered too often; then, when heap headroom remains, increase hfile.block.cache.size to improve read performance.
Another possibility: hbase.hregion.memstore.flush.size stays unchanged, but the RS hosts too many regions; keep in mind that the region count directly determines how much memory the memstores occupy.
hfile.block.cache.size
Default: 0.2
Description: The percentage of the heap used as the storefile read cache; 0.2 means 20%. This value directly affects read performance.
Tuning: Bigger is of course better for reads. If writes are far fewer than reads, opening it up to 0.4-0.5 is fine; if reads and writes are roughly balanced, about 0.3; if writes outnumber reads, decisively keep the default. When setting this value, also consider hbase.regionserver.global.memstore.upperLimit, the maximum percentage of heap the memstores may occupy: one parameter affects reads, the other writes. If the two add up to more than 80-90%, there is a risk of OOM, so set them carefully.
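A sketch of setting the memstore limits and read cache together, with the sum check from the paragraph above made explicit (the 0.8 ceiling is an assumption matching the 80-90% guidance; the check itself is illustrative, not part of the HBase API):

```java
conf.setFloat("hfile.block.cache.size", 0.2f);                         // reads
conf.setFloat("hbase.regionserver.global.memstore.upperLimit", 0.4f);  // writes
conf.setFloat("hbase.regionserver.global.memstore.lowerLimit", 0.35f);

// Guard against the OOM risk: read cache + memstore upper limit must
// leave the rest of the heap for everything else.
float sum = conf.getFloat("hfile.block.cache.size", 0.2f)
        + conf.getFloat("hbase.regionserver.global.memstore.upperLimit", 0.4f);
if (sum > 0.8f) {
    throw new IllegalStateException("block cache + memstore upper limit too high: " + sum);
}
```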
hbase.hstore.blockingStoreFiles
Default: 7
Description: During flushes, when any Store (column family) in a region has more than 7 storefiles, all write requests are blocked and a compaction is run to reduce the storefile count.
Tuning: Blocking write requests severely affects the current RegionServer's response time, but too many storefiles also hurt read performance. From a practical standpoint, to obtain smoother response times you can set the value close to infinity; if you can tolerate large peaks and troughs in response time, keep the default or adjust it to your own scenario.
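A sketch of the smoother-latency option, continuing the same conf object (Integer.MAX_VALUE stands in for the article's "infinity"):

```java
// Never block writes because of storefile count; accept that reads may
// slow down while compactions catch up.
conf.setInt("hbase.hstore.blockingStoreFiles", Integer.MAX_VALUE);
```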
hbase.hregion.memstore.block.multiplier
Default: 2
Description: When a region's memstore grows to more than twice hbase.hregion.memstore.flush.size, all requests to that region are blocked, the memstore is flushed, and memory is freed.
Although we set a total memstore size for the region, say 64MB, imagine that at 63.9MB you put in a 200MB record: the memstore size would instantly balloon to several times the expected hbase.hregion.memstore.flush.size. This parameter's effect is to block all requests once the memstore exceeds twice the flush size, curbing the risk of further expansion.
Tuning: The default value of this parameter is fairly reliable. If you expect your normal application scenario (excluding anomalies) to have no write bursts, or only controllable ones, keep the default. If your write volume regularly surges to several times the normal rate, raise this multiplier and adjust other parameters (such as hfile.block.cache.size and hbase.regionserver.global.memstore.upperLimit/lowerLimit) to reserve more memory and prevent the HBase server from OOMing.
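A sketch for bursty-write workloads, continuing the same conf object (the multiplier of 4 is an assumption for illustration):

```java
// With flush.size = 64MB, a multiplier of 4 only blocks a region's writes
// once its memstore passes 256MB, absorbing larger bursts before blocking.
conf.setInt("hbase.hregion.memstore.block.multiplier", 4);
```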
hbase.hregion.memstore.mslab.enabled
Default value: true
Description: Enables MemStore-Local Allocation Buffers (MSLAB), which allocate memstore memory in fixed-size chunks to reduce heap fragmentation and help avoid long full-GC pauses.
Enable LZO compression
LZO offers higher performance than HBase's default GZip, while GZip compresses better. See Using LZO Compression for details. For developers who want to improve HBase's read and write performance, LZO is a good choice; for developers who care greatly about storage space, keeping the default is recommended.
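A minimal sketch of enabling LZO on a column family with the classic client API; the Compression enum's package has moved between HBase versions, so treat the import as an assumption for your version, and note that the native LZO libraries must be installed on every node first:

```java
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.io.compress.Compression;

HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
HColumnDescriptor cf = new HColumnDescriptor("cf");
cf.setCompressionType(Compression.Algorithm.LZO); // requires native LZO on all nodes
desc.addFamily(cf);
```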
Do not define too many Column Families in one table
HBase currently does not handle tables with more than two or three column families well: when one CF is flushed, neighboring CFs are also flushed through the correlation effect, ultimately causing the system to generate more IO.
Batch Import
Before bulk-importing data into HBase, you can balance the load by creating regions in advance; see Table Creation: Pre-Creating Regions. A sketch follows below.
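A minimal sketch of pre-creating regions with the classic HBaseAdmin API, reusing the desc and conf from the sketches above (the split keys are hypothetical; derive real boundaries from your row-key distribution, or use the RegionSplitter utility):

```java
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

HBaseAdmin admin = new HBaseAdmin(conf);
// Pre-split into four regions so a bulk import is spread across
// RegionServers instead of hammering a single initial region.
byte[][] splits = new byte[][] {
        Bytes.toBytes("25"), Bytes.toBytes("50"), Bytes.toBytes("75")
};
admin.createTable(desc, splits);
admin.close();
```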
Avoid CMS concurrent mode failure
HBase uses the CMS garbage collector, which by default starts collecting when old-generation memory usage reaches 90%. That percentage is set with -XX:CMSInitiatingOccupancyFraction=N. A concurrent mode failure happens in the following scenario:
CMS begins a concurrent collection when the old generation reaches 90%, while the young generation keeps rapidly promoting objects into it. If the old generation fills up before CMS finishes its concurrent marking, tragedy strikes: with no memory left, CMS has to abandon the mark, trigger a stop-the-world pause (suspending all JVM threads), and then clean up the garbage objects single-threaded. That process can take a very long time. To avoid concurrent mode failure, make GC trigger before 90% by setting -XX:CMSInitiatingOccupancyFraction=N lower.
This percentage is easy to calculate: if your hfile.block.cache.size and hbase.regionserver.global.memstore.upperLimit add up to 60% (the default), you can set N to 70-80, generally about 10% above that sum.
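A hedged sketch of the corresponding JVM options in hbase-env.sh, assuming the default 60% steady-state occupancy discussed above:

```
# hbase-env.sh: start CMS at 70% old-gen occupancy, ~10% above the steady
# state of block cache + memstore upper limit (0.2 + 0.4 = 60% of heap).
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"
```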
HBase Client Optimization

AutoFlush
Set HTable's setAutoFlush to false to enable client-side batching of updates: Puts are only sent to the server once they fill the client's write buffer.
The default is true.
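A minimal sketch with the classic HTable client (table, family, and buffer size are hypothetical):

```java
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

HTable table = new HTable(conf, "mytable");
table.setAutoFlush(false);                 // buffer Puts on the client side
table.setWriteBufferSize(2 * 1024 * 1024); // flush to the server every ~2MB
for (int i = 0; i < 10000; i++) {
    Put put = new Put(Bytes.toBytes("row-" + i));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
    table.put(put);                        // sent in batches as the buffer fills
}
table.flushCommits();                      // push whatever is still buffered
```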
Scan Caching
How much data the scanner caches per fetch when scanning (that is, how many rows are pulled from the server in one round trip).
The default value is 1, fetching one row at a time. See the combined scan sketch after "Close ResultScanners" below.
Scan Attribute Selection
When scanning, it is recommended to specify the required column families to reduce traffic; otherwise a scan returns all data in the whole row (all column families) by default. This too is shown in the combined sketch below.
Close ResultScanners
Remember to close the ResultScanner after reading the data; otherwise the RegionServer may run into problems (the corresponding server-side resources cannot be released).
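A combined sketch of the three scan recommendations above: caching, column-family selection, and closing the scanner (names are hypothetical, reusing the table from the AutoFlush sketch):

```java
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

Scan scan = new Scan();
scan.setCaching(100);                // fetch 100 rows per round trip, not 1
scan.addFamily(Bytes.toBytes("cf")); // only the column family we need
ResultScanner scanner = table.getScanner(scan);
try {
    for (Result r : scanner) {
        // ... process each row ...
    }
} finally {
    scanner.close();                 // release the server-side resources
}
```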
Optimal Loading of Row Keys
When scanning a table where only the row keys are needed (no column families, qualifiers, values, or timestamps), you can add a FilterList with the MUST_PASS_ALL operator to the Scan instance and put a FirstKeyOnlyFilter and a KeyOnlyFilter in the FilterList. This reduces network traffic.
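A minimal sketch of the row-key-only scan described above:

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

Scan keysOnly = new Scan();
FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
filters.addFilter(new FirstKeyOnlyFilter()); // at most one KeyValue per row
filters.addFilter(new KeyOnlyFilter());      // drop the value bytes, keep keys
keysOnly.setFilter(filters);
```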
Turn off WAL on Puts
When putting data that is not critical, you can call writeToWAL(false) to further improve write performance. writeToWAL(false) skips writing the WAL log during the Put. The risk is that if the RegionServer goes down, data you just put may be lost and unrecoverable.
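A sketch with the classic Put API (setWriteToWAL was later deprecated in favor of setDurability; shown here as in the API generation this article targets):

```java
Put riskyPut = new Put(Bytes.toBytes("row-1"));
riskyPut.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
riskyPut.setWriteToWAL(false); // faster, but lost forever if the RS dies
table.put(riskyPut);
```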
Enable Bloom Filter
Bloom Filter improves read performance by exchanging space for time.
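A minimal sketch of enabling a row-level Bloom filter on a column family; the BloomType enum's package differs across versions, so treat the import as an assumption:

```java
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.regionserver.BloomType;

HColumnDescriptor bloomCf = new HColumnDescriptor("cf");
// ROW: each storefile records which row keys it contains, letting reads
// skip files that cannot hold the requested row.
bloomCf.setBloomFilterType(BloomType.ROW);
```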
That is all the content of this article on how to optimize the configuration of HBase. Thank you for reading!