This article explains in detail how to analyze a RocketMQ performance stress test. The editor finds it very practical, so it is shared here as a reference; I hope you get something out of reading it.
1. Single-machine deployment
1.1 Machine composition
1 nameserver
1 broker (asynchronous disk flush)
2 producers
2 consumers
1.2 Hardware configuration
CPU: two x86_64 CPUs, 12 cores each, 24 cores in total
Memory: 48 GB
Network card: gigabit
Disk: the broker machine uses a RAID10 array of about 1.1 TB; the other machines use ordinary disks of about 500 GB
1.3 Deployment structure
In the deployment diagram, the orange arrows indicate data flow and the black lines indicate network connections.
1.4 Kernel parameters
The broker is a storage system: it has its own flush strategy for disk reads and writes, makes heavy use of memory-mapped files, and consumes a large number of file handles and a lot of memory. With the system defaults, RocketMQ cannot perform well, so specific settings are needed for the system's page cache, memory allocation, I/O scheduling, and file handle limits.
System I/O and virtual memory settings
echo 'vm.overcommit_memory=1' >> /etc/sysctl.conf
echo 'vm.min_free_kbytes=5000000' >> /etc/sysctl.conf
echo 'vm.drop_caches=1' >> /etc/sysctl.conf
echo 'vm.zone_reclaim_mode=0' >> /etc/sysctl.conf
echo 'vm.max_map_count=655360' >> /etc/sysctl.conf
echo 'vm.dirty_background_ratio=50' >> /etc/sysctl.conf
echo 'vm.dirty_ratio=50' >> /etc/sysctl.conf
echo 'vm.page-cluster=3' >> /etc/sysctl.conf
echo 'vm.dirty_writeback_centisecs=360000' >> /etc/sysctl.conf
echo 'vm.swappiness=10' >> /etc/sysctl.conf
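To apply the settings above without rebooting, the file can be reloaded and a few values spot-checked; a minimal sketch, not part of the original article:
sysctl -p                               # reload /etc/sysctl.conf
sysctl vm.swappiness vm.max_map_count   # confirm the values just written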
System file handle settings
echo 'ulimit -n 1000000' >> /etc/profile
echo 'admin hard nofile 1000000' >> /etc/security/limits.conf
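After logging in again, the new limit can be confirmed for the current session; a small sketch (also not from the original article):
ulimit -n    # soft limit, should report 1000000
ulimit -Hn   # hard limit, as set in limits.conf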
System I/O scheduling algorithm
deadline
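The scheduler is selected per block device through sysfs; a hedged example in which sda is an assumed device name to be replaced by the broker's data disk:
cat /sys/block/sda/queue/scheduler                # available schedulers, current one shown in brackets
echo deadline > /sys/block/sda/queue/scheduler    # switch this disk to the deadline scheduler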
1.5 JVM parameters
Use RocketMQ default settings
-server -Xms4g -Xmx4g -Xmn2g -XX:PermSize=128m -XX:MaxPermSize=320m -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:SurvivorRatio=8 -XX:+DisableExplicitGC -verbose:gc -Xloggc:/root/rocketmq_gc.log -XX:+PrintGCDetails -XX:-OmitStackTraceInFastThrow
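These options are normally assembled in the broker start script rather than passed by hand. A minimal sketch, assuming the bin/runbroker.sh layout of the RocketMQ distribution, where options are appended to the JAVA_OPT variable (only a few of the flags above are repeated here):
JAVA_OPT="${JAVA_OPT} -server -Xms4g -Xmx4g -Xmn2g"
JAVA_OPT="${JAVA_OPT} -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=70"
JAVA_OPT="${JAVA_OPT} -verbose:gc -Xloggc:/root/rocketmq_gc.log -XX:+PrintGCDetails"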
2. Performance evaluation
2.1 Evaluation purpose
Stress-test the single-machine TPS in order to evaluate single-machine capacity.
2.2 Evaluation indicators
The highest achievable TPS is not necessarily the most suitable TPS; a trade-off has to be made between TPS and the system resource indicators. The appropriate TPS is one at which system resources are close to their limits yet the system still operates normally: for example, I/O util should not exceed about 75%, the CPU load should not greatly exceed the total number of cores, and there should be no frequent swapping causing heavy memory thrashing. So do not focus only on TPS; also watch the following indicators (the commands sketched after this list are one way to sample them):
Message: TPS
CPU: load, sy, us
Memory: used, free, swap, cache, buffer
Disk I/O: util, await, svctm
Network: network card traffic
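While the stress test runs, these indicators can be sampled with standard Linux tools; a minimal sketch (1-second intervals chosen arbitrarily, not from the original article):
top -b -n 1 | head -5    # load average and us/sy CPU percentages
free -m                  # used, free, swap, cache and buffer memory in MB
iostat -x 1              # per-device iops, await, svctm and %util
sar -n DEV 1             # per-interface network traffic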
2.3 Evaluation method
Two producers continuously send 2 KB messages (roughly 1000 Chinese characters) to the broker. A message of this size is relatively large and fully covers the needs of the business.
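The article does not name the load generator, but Apache RocketMQ distributions ship benchmark scripts that can drive a comparable load. A hedged sketch, in which the topic name and the flag spellings (-t topic, -w thread count, -s message size in bytes) are assumptions to be checked against the producer.sh shipped with your version:
sh benchmark/producer.sh -t BenchmarkTest -w 22 -s 2048    # 2 KB messages; 22 threads per producer machine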
2.4 Evaluation results
TPS is relatively high
After long-running testing and observation, a single broker sustains a TPS as high as 16,000, i.e. the server processes 16,000 messages per second while the consumers keep up. The average latency from the broker storing a message to a consumer consuming it is about 1.3 s, and this latency does not grow as TPS increases, so it is a fairly stable value.
Broker has high stability.
The two producers run a total of 44 threads sending messages non-stop for 10 hours, and the broker stays very stable. In practical terms this means that in a production environment dozens of producers can send messages to a single broker at high frequency and the broker will remain stable. Under this pressure the broker's load peaks at only about 3 (on a 24-core CPU) and plenty of memory remains available.
Moreover, during the more than 10 hours of continuous testing the broker's JVM ran very smoothly: not a single full GC occurred, young-generation GC reclaimed memory very efficiently, and there was no memory pressure. The following line is taken from the GC log:
2014-07-17T22:43:07.407+0800: 79696.377: [GC2014-07-17T22:43:07.407+0800: 79696.377: [ParNew: 1696113K->18686K(1887488K), 0.1508800 secs] 2120430K->443004K(3984640K), 0.1513730 secs] [Times: user=1.36 sys=0.00, real=0.16 secs]
The young generation is 2 GB; occupancy is about 1.7 GB before a collection and roughly 18 MB after it, so collection efficiency is very high.
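The claim that no full GC occurred can be checked directly against the log file configured in the JVM options; a small sketch:
grep -c 'Full GC' /root/rocketmq_gc.log    # expected to print 0 for the whole run
grep -c 'ParNew' /root/rocketmq_gc.log     # number of young-generation collections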
Disk I/O and memory
A single physical I/O takes about 0.06 ms on average, and there is almost no extra I/O wait, since await and svctm are essentially equal. No physical disk reads occurred during the whole test: thanks to file memory mapping, a large amount of page cache held the file contents, and plenty of free memory remained.
Performance bottleneck of the system
After TPS reaches 16,000 it cannot go any higher. At that point the gigabit network card carries about 100 MB per second, which is essentially its limit, so the network card is the performance bottleneck. Meanwhile the peak I/O util has already reached about 40%, which is not low, so even if network bandwidth were increased the I/O indicators might become unhealthy. Overall, a single-machine TPS of 16,000 is a relatively safe value.
The following is the trend of each indicator
TPS
TPS peaks at about 16,000; pushing the load further causes TPS to trend downward.
Memory
The memory is very stable, with a total of 48 GB, and the actual available memory is very high.
No swapping occurs, so system performance does not thrash from frequent disk access.
A large amount of memory is used as file cache, as shown by the cached metric, which largely avoids physical disk reads.
Disk throughput
As the number of threads increases, physical disk I/O reads and writes about 70 MB of data per second.
I/O util percentage
As the number of threads increases, the I/O util finally stabilizes at around 40%, which is acceptable.
This concludes the article on RocketMQ performance stress test analysis. I hope the content above is of some help and lets you learn something new; if you found the article good, please share it so more people can see it.