2025-02-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article describes in detail how ARMS performs in practice when selecting an APM tool. It is intended as a reference for readers facing the same choice.
Preface
Driven by digital transformation and the adoption of Internet-style architectures, microservice architecture is used more and more widely in current systems. Microservices bring benefits (high development efficiency, independent deployment, horizontal scaling, fault and resource isolation, and so on), but they also bring difficulties in testing, transactions, application monitoring, and other areas.
Under a distributed Internet architecture, the calls between applications become more and more complex. The traditional approach, in which development engineers add instrumentation points by hand and operations staff log in to hosts to check logs and piece the call chain together, is increasingly inadequate for monitoring how applications are running.
To monitor better at the application level, covering infrastructure data of the runtime environment, business calls, and performance analysis, and to locate and solve problems quickly when performance issues, anomalies, or faults occur, many excellent APM (Application Performance Management) tools have appeared.
These APM tools provide both metrics statistics and call chain tracing information.
Common APM tools
APM tools cover metrics collection and call chain collection. Metrics collection gathers, for a given period of time, figures such as the number of requests, exceptions, errors, response time (RT), resource usage at the IaaS layer (such as CPU, memory, IO, load, network), and JVM runtime parameters (such as memory regions and GC). Call chain collection records the applications, classes, and methods that a business request passes through, together with the time spent on each node or method.
Common APM tools are:
1. ARMS: an APM tool developed by Alibaba. Alibaba was among the enterprises that explored distributed microservice frameworks early on, and Alibaba Group has long had its own EagleEye system for application monitoring. To deliver that capability as a cloud product, Alibaba officially launched application monitoring as the ARMS product on 2016-08-04.
2. Open source APM tools
- Pinpoint: a Java-based open source APM tool developed in Korea. It is feature-complete, has developed rapidly, has influenced the implementation of many other APM tools, and is widely used in China and abroad.
- SkyWalking: an open source tool that supports the OpenTracing standard, developed by Wu Sheng of China. It is now an Apache open source project, is developing very rapidly, and is widely used in China.
- Zipkin: supports the OpenTracing standard, developed and contributed by Twitter, and open sourced in 2012. It is a relatively mature open source APM tool.
- Jaeger: supports the OpenTracing standard, developed and contributed by Uber. It is a relatively mature open source APM tool.
Principle of APM tool
Although these APM tools differ in features and implementation, the basic principle is the same. It is based on Google's Dapper paper on distributed tracing, which divides the implementation of an APM tool into two parts:
1. Instrument the application on each node where it runs, so that trace data is generated while business requests are processed.
In this call chain tracing technique, restoring the call chain depends mainly on two IDs.
The first ID is the TraceID, which represents one business call: for example, an order settlement in an e-commerce system, a course selection in an online education system, or a parcel pickup in a logistics system. Each of these, from the moment the customer triggers it until the response is returned, is one complete request, that is, one business call, and each business request is given a unique TraceID.
The second ID is the RpcID (also called SpanID). A single business request may pass through more than one application. Take an e-commerce order as an example: the order system creates the order, the payment system accepts payment, the inventory system deducts the product stock, the member system awards points to the buyer, and the shopping cart system clears the purchased items. Each application that the request flows through is assigned a hierarchical RpcID, which can be thought of as a directory-style path recording its position in the call hierarchy. Seen this way, however many times the same business flow is invoked, the RpcID of each entry in it is the same every time.
With the TraceID and the RpcID together, we can easily restore the entire call chain.
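As a rough illustration of how the two IDs restore a call chain, here is a minimal Python sketch. The span records, the dotted RpcID format ("0", "0.1", "0.1.1"), and the function names are assumptions made for this example, not the storage format of ARMS or of any other APM tool.

```python
# Hypothetical span records: (trace_id, rpc_id, application).
# The nesting shown here is made up for illustration.
SPANS = [
    ("t-1001", "0", "order"),
    ("t-1001", "0.2", "inventory"),
    ("t-1001", "0.1", "payment"),
    ("t-1001", "0.1.1", "member"),
    ("t-2002", "0", "cart"),
]


def rpc_key(rpc_id):
    """Turn a dotted RpcID such as '0.1.1' into a sortable tuple (0, 1, 1)."""
    return tuple(int(part) for part in rpc_id.split("."))


def rebuild_call_chain(spans, trace_id):
    """Select the spans of one TraceID and order them by hierarchical RpcID.

    Returns one text line per span, indented by its depth in the hierarchy.
    """
    own = sorted(
        (span for span in spans if span[0] == trace_id),
        key=lambda span: rpc_key(span[1]),
    )
    return [
        "  " * (len(rpc_key(rpc_id)) - 1) + f"{rpc_id} {app}"
        for _, rpc_id, app in own
    ]


if __name__ == "__main__":
    for line in rebuild_call_chain(SPANS, "t-1001"):
        print(line)
```

The TraceID selects the spans that belong to one business request, and the RpcID alone is enough to order and nest them, even when the spans arrive out of order from different hosts.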
6. Active diagnosis capability
ARMS provides an active diagnosis capability that is run by selecting a specific time range. ARMS analyzes how the application behaved during that period, automatically summarizes the problems it finds, and produces a concrete report based on Alibaba's experience. With this report we can locate and optimize problems faster.
7. Rich alerting capability
To round out the alerting system, ARMS provides a rich set of alarm rules that we can enable, disable, and edit, so an alerting system can be built quickly. Alarms can be sent directly through channels such as DingTalk, WebHook, email, and an SMS gateway.
Advantages in operation and maintenance ability
1. On-demand monitoring, start and stop management
Through the ARMS management console, we can start and stop monitoring of applications in batches, stop all ARMS monitoring with one click, or start monitoring of the relevant applications with one click. This fits well with the on-demand usage model of moving to the cloud.
2. Dynamic sampling rate change
At special points in time, or when an anomaly occurs only with low probability, we want to adjust the sampling rate dynamically, for example raising it to capture call chains that occur very rarely. With ARMS configuration management we can raise the sampling rate to collect more complete call chains, and lower it again afterwards to keep storage use reasonable. (Other APM tools must be reconfigured and restarted to change the sampling rate, which is troublesome and also disrupts the business. In practice, it is hard to justify interrupting a running business just to change the sampling rate.)
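The idea of changing the sampling rate without a restart can be sketched as follows. This is a minimal illustration under assumptions: the `DynamicSampler` class and its methods are hypothetical names, and the sketch does not describe how ARMS implements or distributes its configuration.

```python
import threading
import zlib


class DynamicSampler:
    """Sampler whose rate can be changed while the application keeps running."""

    def __init__(self, rate):
        self._lock = threading.Lock()
        self._rate = 0.0
        self.set_rate(rate)

    def set_rate(self, rate):
        """Apply a new sampling rate (0.0 to 1.0), e.g. pushed from a config center."""
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        with self._lock:
            self._rate = rate

    def should_sample(self, trace_id):
        """Decide per TraceID, so all spans of one trace get the same decision."""
        with self._lock:
            rate = self._rate
        # Map the TraceID to a stable bucket in [0, 10000).
        bucket = zlib.crc32(trace_id.encode("utf-8")) % 10000
        return bucket < rate * 10000


if __name__ == "__main__":
    sampler = DynamicSampler(0.1)
    print(sampler.should_sample("t-1001"))
    sampler.set_rate(1.0)  # raise the rate during an incident, no restart needed
    print(sampler.should_sample("t-1001"))
```

Hashing the TraceID instead of drawing a random number keeps the decision consistent across every application that sees the same trace, so a sampled call chain is collected in full.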
3. Switch of binding parameters
Many APM tools can capture bound parameters. But if the system handles sensitive business data, you often do not want the APM tool to collect the runtime parameters of SQL statements or API calls unless it is necessary. It is therefore very useful that ARMS exposes this as a switch in its configuration management: turn it on when the parameters are needed to locate and analyze a problem, and turn it off afterwards, so that business data does not leak.
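The effect of such a switch can be sketched in a few lines of Python. The `ParamCaptureSwitch` class, the `record_sql` function, and the masking regular expression are hypothetical and only illustrate the idea. They are not the ARMS agent's behavior.

```python
import re


class ParamCaptureSwitch:
    """Hypothetical runtime switch controlling whether parameter values are recorded."""

    def __init__(self, enabled=False):
        self.enabled = enabled


# Matches single-quoted string literals and standalone numeric literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def record_sql(sql, switch):
    """Return the SQL text as it would be stored in the trace.

    With the switch on, the statement is kept with its real values.
    With the switch off, every literal value is replaced by '?'.
    """
    if switch.enabled:
        return sql
    return _LITERAL.sub("?", sql)


if __name__ == "__main__":
    switch = ParamCaptureSwitch(enabled=False)
    sql = "SELECT * FROM orders WHERE user_id = 42 AND status = 'paid'"
    print(record_sql(sql, switch))  # values masked
    switch.enabled = True           # turned on only while diagnosing a problem
    print(record_sql(sql, switch))  # values visible
```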
4. Easy access
Access is available through several convenient methods. On Alibaba Cloud Container Service (ACK), EDAS, or SAE, for example, it can be completed with a simple YAML annotation or a button click.
5. Components are stable and maintenance-free.
Because ARMS is a commercial product, none of its components need to be operated or maintained by the user. With a self-built open source stack, we would have to operate and maintain log collection, the computing and cleaning services, and the storage product itself, including cluster sizing, data cleanup, and capacity expansion. If resources are not reclaimed after a peak, they are also wasted.
Advantages in cost use
1. ARMS is billed by the hours (duration) that each node is connected, which makes full use of the advantages of a cloud product: use it as needed and pay only for the application nodes you monitor. In addition, the ARMS fee depends only on the number of nodes and does not change with the sampling rate, which is an advantage for applications that need a high sampling rate.
2. ARMS offers resource packages, and purchasing one reduces the cost further.
3. As a product bundle, ARMS is automatically billed at a 50% discount when it is used with Alibaba Cloud ACK.
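The billing rules above amount to simple arithmetic, sketched below. The function name and the unit price are hypothetical placeholders, not actual ARMS prices, and resource packages are not modeled.

```python
def estimate_arms_cost(nodes, hours, price_per_node_hour, with_ack=False):
    """Estimate a fee from node-hours; the sampling rate plays no part in it.

    price_per_node_hour is a hypothetical unit price supplied by the caller.
    with_ack applies the 50% discount for use together with ACK.
    """
    cost = nodes * hours * price_per_node_hour
    if with_ack:
        cost *= 0.5
    return cost


if __name__ == "__main__":
    # 10 nodes monitored for 720 hours at a made-up price of 0.5 per node-hour.
    print(estimate_arms_cost(10, 720, 0.5))
    print(estimate_arms_cost(10, 720, 0.5, with_ack=True))
```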
Remarks
1. The open source setup stores aggregated statistics for 15 days and full detail data for 3 days. (ARMS is used 24 hours a day, data is stored for 60 days, and the fee is the monthly fee under the annual discount for non-container deployment.)
2. Labor cost is calculated on the basis of a monthly salary of 30,000 yuan for operations staff with development skills. It covers releases caused by changes to main parameters, efficiency loss caused by instability of the back-end system, and maintenance of the back-end system. Medium and large deployments also do some custom development (for example, making sampling configuration take effect dynamically).