2025-04-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article shares an example-based analysis of HBase load balancing and performance metrics. Many readers may be unfamiliar with the topic, so it is offered here for reference.
HBase Load Balancing and Performance Metrics
In distributed systems, load balancing is a very important function. HBase balances load by the number of Regions on each RegionServer, and a custom load balancing algorithm can be plugged in through the hbase.master.loadbalancer.class property.
Content
Load balancing in HBase is a periodic operation that distributes Regions evenly across the RegionServers. The interval between runs is controlled by the hbase.balancer.period property; the default is 5 minutes. A run is conditional, and a load balancing operation is not triggered in any of the following cases:
• Automatic load balancing is switched off, i.e. balance_switch false has been executed;
• The HBase Master node is still initializing;
• The HBase cluster has Regions in transition (RIT), i.e. Regions are being migrated;
• The HBase cluster is processing offline RegionServers.
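The four conditions above can be read as a single guard. The following is a minimal sketch of that guard, not HBase's own code: the `ClusterState` class and its field names are hypothetical and only mirror the list.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of the four conditions above (not an HBase API)."""
    balancer_enabled: bool        # False after `balance_switch false`
    master_initializing: bool     # Master has not finished starting up
    regions_in_transition: int    # Regions currently migrating (RIT)
    offline_servers_pending: int  # offline RegionServers still being processed


def balancer_may_run(state: ClusterState) -> bool:
    """Return True only if none of the blocking conditions holds."""
    return (
        state.balancer_enabled
        and not state.master_initializing
        and state.regions_in_transition == 0
        and state.offline_servers_pending == 0
    )
```

A single blocking condition is enough to skip the run, which is why the checks are combined with `and`.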
Load Balancing Algorithm
When HBase runs a load balancing operation, how does it judge whether the number of Regions on each RegionServer node is balanced? The judgment is made through the following steps:
• Calculate the balanced range: compute the average number of Regions from the total number of Regions and the number of RegionServer nodes, then derive the minimum and maximum values from that average;
• Traverse the RegionServer nodes whose Region count exceeds the maximum, and migrate Regions off each such node until its Region count is less than or equal to the maximum;
• Traverse the RegionServer nodes whose Region count is below the minimum, and assign Regions to these RegionServers until their Region count is greater than or equal to the minimum;
• Repeat the above operations until the number of Regions on every RegionServer in the cluster is between the minimum and maximum values; only then is the cluster balanced. After that, even if the balancing command is executed manually again, the underlying HBase logic ignores the operation.
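The steps above can be sketched on Region counts alone. This is a simplified illustration, not the HBase implementation: it ignores which individual Regions move, and the function name `balance_counts`, the 0.2 slop default, and the tie-breaking choices are assumptions made for the sketch.

```python
import math


def balance_counts(counts, slop=0.2):
    """Return new per-server Region counts that all lie within [low, high].

    `counts` maps a server name to its Region count. Only counts are
    tracked; a real balancer would plan moves of specific Regions.
    """
    average = sum(counts.values()) / len(counts)
    low = math.floor(average * (1 - slop))
    high = math.ceil(average * (1 + slop))

    result = dict(counts)
    pool = 0  # Regions taken off servers and not yet reassigned

    # Shed Regions from servers above the maximum.
    for server in list(result):
        if result[server] > high:
            pool += result[server] - high
            result[server] = high

    # Fill servers below the minimum, borrowing from the fullest
    # server when the pool runs dry.
    for server in list(result):
        need = low - result[server]
        if need > 0:
            while pool < need:
                donor = max(result, key=result.get)
                result[donor] -= 1
                pool += 1
            result[server] = low
            pool -= need

    # Hand any leftover Regions to the least loaded servers.
    while pool > 0:
        target = min(result, key=result.get)
        result[target] += 1
        pool -= 1
    return result
```

Because every Region removed from one server is added to another, the total count is preserved; only the distribution changes.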
Example of algorithm flow
Below, the author walks through the HBase load balancing algorithm using a practical scenario. Suppose we have a 5-node HBase cluster consisting of 2 Masters and 3 RegionServers. The number of Regions on each RegionServer is shown in the figure:
Fig. 1
Before executing the load balancing operation, first calculate the total number of Regions in the cluster; in this example it is 175+56+99=330. Then calculate the average number of Regions that each RegionServer should hold:
Average (110) = Total Regions (330) / Total RegionServers (3)
Next, calculate the minimum and maximum values used to decide whether the HBase cluster needs a load balancing operation. The formulas are:
# hbase.regions.slop is the weight (slop) value; the default is 0.2
Min = Math.floor(Average * (1 - 0.2))
Max = Math.ceil(Average * (1 + 0.2))
If HBase finds that the smallest Region count on any RegionServer is no lower than the calculated minimum, and the largest Region count is no higher than the calculated maximum, it returns directly without triggering a load balancing operation. With the Region counts given in the example, the minimum works out to floor(110 * 0.8) = 88 and the maximum to ceil(110 * 1.2) = 132.
Since RegionServer2 holds 56 Regions, fewer than the minimum of 88, and RegionServer1 holds 175 Regions, more than the maximum of 132, load balancing is required.
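The arithmetic can be checked with a few lines of Python. This is only a worked example of the formulas as written, with the slop value of 0.2 taken from the text; the function names are illustrative and not part of HBase.

```python
import math


def region_bounds(region_counts, slop=0.2):
    """Return (average, minimum, maximum) for a list of per-server Region counts."""
    average = sum(region_counts) / len(region_counts)
    minimum = math.floor(average * (1 - slop))
    maximum = math.ceil(average * (1 + slop))
    return average, minimum, maximum


def needs_balancing(region_counts, slop=0.2):
    """True if any server falls below the minimum or exceeds the maximum."""
    _, minimum, maximum = region_bounds(region_counts, slop)
    return min(region_counts) < minimum or max(region_counts) > maximum


counts = [175, 56, 99]  # RegionServer1, RegionServer2, RegionServer3
print(region_bounds(counts))    # (110.0, 88, 132)
print(needs_balancing(counts))  # True
```

Applied to the example counts, the formulas give a range of 88 to 132, and both the 56 and the 175 fall outside it.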
HBase provides administrator commands for operating the load balancer:
# Use the hbase shell command to enter the HBase console and turn on automatic load balancing
hbase(main):001:0> balance_switch true
The balance_switch command is implemented in balance_switch.rb and admin.rb; the source code:
Fig. 2
The command outputs the previous state of the load balancer switch. Next, look at the source code that handles the balance_switch command:
Fig. 3
At this point automatic load balancing is enabled, but what if we need to balance the number of Regions in the cluster immediately? HBase provides a management command for this as well, the balancer command:
hbase(main):001:0> balancer
The balancer command is implemented in balancer.rb and admin.rb; the source code:
Fig. 4
Fig. 5
This command runs the cluster's load balancing operation: it calls the load balancer's balanceCluster() method to generate a balancing plan, which the Master then executes. The underlying Master source code:
Figure 6-1
Figure 6-2
However, a single manual run may not move enough Regions to meet the requirements, so we can wrap the command in a script that executes it repeatedly. The implementation:
Fig. 7
This script executes the command 20 times by default, but you can customize the number of executions by passing an integer argument.
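The original script is only available as a figure. A minimal sketch of the same idea follows, assuming the `hbase` launcher is on the PATH and that `hbase shell` reads commands from standard input; the function names and the optional pause between runs are assumptions for illustration.

```python
import subprocess
import time

DEFAULT_ROUNDS = 20  # the default run count described above


def parse_rounds(argv, default=DEFAULT_ROUNDS):
    """Return the run count: the first CLI argument if present, else the default."""
    return int(argv[1]) if len(argv) > 1 else default


def run_balancer(rounds, pause_seconds=0, runner=subprocess.run):
    """Feed `balancer` to `hbase shell` on stdin `rounds` times; return exit codes."""
    exit_codes = []
    for _ in range(rounds):
        done = runner(["hbase", "shell"], input="balancer\n",
                      capture_output=True, text=True)
        exit_codes.append(done.returncode)
        if pause_seconds:
            # Assumed pause so that Region moves can settle between runs.
            time.sleep(pause_seconds)
    return exit_codes
```

Calling `run_balancer(parse_rounds(sys.argv))` from a small wrapper (after `import sys`) would reproduce the default of 20 runs with an optional integer override. The `runner` parameter exists only so the loop can be exercised without a live cluster.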
Once the HBase cluster finds that the number of Regions on every RegionServer is within range, the load balancing operation is complete. If it is not, run the script again until all Region counts are between the minimum and maximum. After all RegionServers in the HBase cluster have been balanced, the distribution of Regions across the RegionServers in the example is shown in the figure below:
Fig. 8
At this point the number of Regions on each RegionServer node is within the range between the minimum and maximum values, and the Regions across the HBase cluster's RegionServer nodes are in a balanced state.
Performance Metrics
A very important performance metric in HBase is the latency of processing requests in the cluster. HBase provides a tool class that reports the time spent processing requests within the cluster:
org.apache.hadoop.hbase.tool.Canary
This class is mainly used to check how long operations take in the HBase system. If you don't know how to use it, run the help command to see the specific usage.
hbase org.apache.hadoop.hbase.tool.Canary -help
(1) View the elapsed time for each Region of every table in the cluster
hbase org.apache.hadoop.hbase.tool.Canary
(2) View the elapsed time for each Region of the money table; separate multiple tables with spaces
# View the money table and the person table
hbase org.apache.hadoop.hbase.tool.Canary money person
(3) View the elapsed time for each RegionServer
hbase org.apache.hadoop.hbase.tool.Canary -regionserver dn1
Usually we care more about the elapsed time of each RegionServer node, so we can wrap the command in a script that prints the elapsed time of every RegionServer in the cluster. Script implementation:
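The original script is not reproduced in the text. A minimal sketch of such a wrapper is given here, assuming the `hbase` launcher is on the PATH; it reuses the `-regionserver` option shown in (3), while the function names and any host names passed in are placeholders.

```python
import subprocess

CANARY_CLASS = "org.apache.hadoop.hbase.tool.Canary"


def canary_command(host):
    """Build the Canary command line for one RegionServer, as in (3) above."""
    return ["hbase", CANARY_CLASS, "-regionserver", host]


def check_regionservers(hosts, runner=subprocess.run):
    """Run Canary once per host and return {host: combined output text}."""
    report = {}
    for host in hosts:
        done = runner(canary_command(host), capture_output=True, text=True)
        # Keep both streams, since the tool's log output may land on either one.
        report[host] = (done.stdout or "") + (done.stderr or "")
    return report
```

For example, `check_regionservers(["dn1"])` would run the same command as (3) and return its output keyed by host; printing the dictionary entries gives a per-RegionServer view.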