HBase client Rpc retry mechanism and client parameter optimization.

Updated 2025-03-31. From: SLTechnology News&Howtos (shulou)

How the HBase client retry mechanism ensures fault tolerance and low latency

Retry Mechanism of HBase client based on Backoff algorithm

1. On the one hand, business users care most about the read-write performance of the HBase service itself: throughput and read-write latency.

2. On the other hand, we also need to pay attention to problems in how the HBase client is used, mainly two: Is a retry mechanism provided to ensure fault tolerance? Is there a timeout mechanism to ensure fail-fast behavior and low latency?

3. The HBase client does provide a retry mechanism, and with reasonably configured parameters the client can ensure both a degree of fault tolerance and low latency.

The RpcRetryingCall function (RpcRetryingCaller.callWithRetries in the HBase client source) implements the Rpc request retry mechanism, so two inferences can be drawn:

1. The HBase client request failed because of a network exception during that period, and the rpc request entered the retry logic.

2. Under HBase's retry mechanism (the Backoff mechanism), the thread sleeps for a period of time between every two retries (line 115 of the code in the original figure, not reproduced here). This sleep time was so long that the thread stayed in the TIMED_WAITING state the whole time.

A request that stays blocked for several hours can only be explained by one of the following two cases.

Case 1: the configuration is wrong. The client needs to check whether the hbase.client.pause and hbase.client.retries.number parameters are set to abnormal values. For example, if hbase.client.pause is set to 10000, a request may block for several hours.

// HBase client pause
hbase.client.pause = 100 (ms, default)

// Maximum number of HBase client retries
hbase.client.retries.number = 35 (default)

Case 2: the network has a persistent problem. Thread 1 holds the global lock, retries until it fails, and exits. Thread 2 then wins the lock while the network is still faulty, so it enters the retry logic in turn and also gives up after retrying for about 8 min. This cycle repeats across threads, so requests can also stay blocked for several hours.

Summary: case 1 is very unlikely. When client requests get no response and stay blocked, the most likely cause is a persistent network problem.

Monitoring data and confirmation from partner teams showed that many services did experience jitter at the time of the incident (0:00 a.m. to 6:00 a.m.) because of an abnormal cloud network upgrade.

HBase Rpc retry mechanism

The retry mechanism of HBase is the key to this exception, so it is worth examining. HBase retries an rpc after it fails to execute.

The maximum number of retries can be set in the configuration file. The corresponding parameter is hbase.client.retries.number, which defaults to 31 in version 0.98.

If repeated retries within a certain period of time still get no successful response, the client eventually gives up the connection to the cluster.
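The retry-then-sleep behaviour described above can be sketched as a small loop. This is a minimal illustration, not the HBase implementation: the class and method names are hypothetical, the multiplier table is an assumption based on the HConstants.RETRY_BACKOFF array in the HBase source, and the small random jitter that the real client adds is left out.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetrySketch {
    // Assumed backoff multipliers (modelled on HBase's HConstants.RETRY_BACKOFF).
    static final int[] RETRY_BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Pause before the next attempt; the last multiplier is reused once the table runs out.
    static long pauseTime(long pauseMs, int tries) {
        int idx = Math.min(tries, RETRY_BACKOFF.length - 1);
        return pauseMs * RETRY_BACKOFF[idx];
    }

    // Hypothetical helper: run the call, sleep between failures, give up after maxAttempts.
    static <T> T callWithRetries(Callable<T> call, long pauseMs, int maxAttempts) throws Exception {
        Exception last = null;
        for (int tries = 0; tries < maxAttempts; tries++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (tries == maxAttempts - 1) {
                    break; // no point sleeping after the final attempt
                }
                Thread.sleep(pauseTime(pauseMs, tries)); // thread is TIMED_WAITING here
            }
        }
        throw new Exception("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated rpc that fails twice with a network error, then succeeds.
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated network error");
            }
            return "ok";
        }, 1, 5);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

The sleep inside the catch block is where a thread dump would show the thread waiting, which is why a large pause value or a long multiplier tail translates directly into long blocking.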

Practice of client parameter optimization

As the discussion above shows, once network jitter occurs, a single thread by default spends about 8 min retrying, and during that time other threads block on the global lock regionLockObject. To build a more stable, lower-latency HBase system, client-side parameters need to be adjusted in addition to the various server-side parameters:

hbase.client.pause: 100 by default, which can be reduced to 50

// HBase client pause in CDH
hbase.client.pause = 100ms

hbase.client.retries.number: default is 31, which can be reduced to 21

// Maximum number of HBase client retries in CDH
hbase.client.retries.number = 35

After this change, the backoff algorithm gives the following pause time (in ms) before each retry of the cluster connection:

[50, 100, 150, 250, 500, 1000, 2000, 5000, 5000, 5000, 5000, 10000, …, 10000]

The client retries 20 times within about 2 min, then gives up the connection to the cluster and releases the global lock so that other threads can execute their requests.
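The figures above can be checked with a short calculation. The sketch below is illustrative only: the class name is hypothetical, the multiplier table is an assumption based on the HConstants.RETRY_BACKOFF array in the HBase source, jitter is ignored, and the number of sleeps is taken as hbase.client.retries.number minus one, following the "retries 20 times" reading of a setting of 21.

```java
import java.util.ArrayList;
import java.util.List;

final class BackoffSchedule {
    // Assumed backoff multipliers (modelled on HBase's HConstants.RETRY_BACKOFF).
    static final int[] RETRY_BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Pause (ms) before each retry; the last multiplier is reused past the end of the table.
    static List<Long> schedule(long pauseMs, int retries) {
        List<Long> pauses = new ArrayList<>();
        for (int i = 0; i < retries; i++) {
            int idx = Math.min(i, RETRY_BACKOFF.length - 1);
            pauses.add(pauseMs * RETRY_BACKOFF[idx]);
        }
        return pauses;
    }

    // Total time (ms) spent sleeping across all retries.
    static long totalMs(long pauseMs, int retries) {
        long sum = 0;
        for (long p : schedule(pauseMs, retries)) {
            sum += p;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Tuned values: pause = 50, retries.number = 21, so 20 sleeps.
        System.out.println(schedule(50, 20));
        System.out.println("tuned total ms:         " + totalMs(50, 20));
        // 0.98 defaults: pause = 100, retries.number = 31, so 30 sleeps.
        System.out.println("default total ms:       " + totalMs(100, 30));
        // Misconfigured pause = 10000 with retries.number = 35, so 34 sleeps.
        System.out.println("misconfigured total ms: " + totalMs(10000, 34));
    }
}
```

Under these assumptions the tuned settings sleep for 114050 ms in total (about 1.9 min), the 0.98 defaults for 428100 ms (a little over 7 min, in line with the "about 8 min" above once rpc time is added), and a pause of 10000 for roughly 14 hours, which illustrates how a misconfigured pause can block a request for hours.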

The reference below explains the HBase Rpc retry mechanism and the optimization of client parameters in more detail.

Reference link

HBase best practices: client retry mechanism, http://hbasefly.com/2016/06/04/hbase-client-1/
