Introduction to APM and Hands-On Practice

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Part 1: APM basics

1. What is APM?

APM stands for Application Performance Management. Most systems currently on the market are modeled on Google's Dapper (a tracing system for large-scale distributed systems). A Chinese translation is available under the title "Dapper Chinese translation of google".

Food for thought: is a system that does not follow this theory merely a pseudo-APM?

What is the core idea of APM? When application service nodes call one another, each call records and passes along an application-level tag, and these tags are used to correlate the service nodes. For example, if two application service nodes communicate over HTTP, the tags are added to the HTTP headers. How the tags are transmitted therefore depends on the communication protocol used between the nodes: adding them to commonly used protocols is relatively easy, while custom protocols may make it harder. This directly determines how difficult it is to implement a distributed tracing system.
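The tag passing described above can be sketched roughly as follows. This is a minimal illustration, not how any particular APM product implements it: the header names `X-Trace-Id` and `X-Span-Id` and the helper functions `inject` and `extract` are assumptions made up for this example.

```python
import uuid

# Hypothetical header names chosen for this sketch; real tracers define their own.
TRACE_HEADER = "X-Trace-Id"
SPAN_HEADER = "X-Span-Id"


def new_span_id() -> str:
    """Return a random identifier for one unit of work."""
    return uuid.uuid4().hex[:16]


def inject(context: dict, headers: dict) -> dict:
    """Caller side: copy the current trace context into outgoing HTTP headers."""
    out = dict(headers)
    out[TRACE_HEADER] = context["trace_id"]
    out[SPAN_HEADER] = context["span_id"]
    return out


def extract(headers: dict) -> dict:
    """Callee side: continue the caller's trace, or start a new one if untagged."""
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    return {
        "trace_id": trace_id,
        "span_id": new_span_id(),
        "parent_span_id": headers.get(SPAN_HEADER),  # None marks the first node
    }


# Node A receives an untagged request, then calls node B over HTTP.
ctx_a = extract({})
outgoing = inject(ctx_a, {"Accept": "application/json"})
ctx_b = extract(outgoing)
```

Here `ctx_b` shares the trace ID of `ctx_a` and records A's span as its parent, which is the correlation the paragraph describes. A custom binary protocol would need its own place to carry the same two values.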

2. Why use APM?

A solution is only worth seeking when there is a real business pain point. In my view, following a test-first principle, APM should first address two scenarios in the test environment:

Prioritizing macro-level data does not mean that testers can ignore micro-level problems. From a testing point of view, we first solve data sampling and collection in the performance test environment and then evaluate the production environment. Online link monitoring requires R&D and operations to work together, and testers are less concerned with the [R & D perspective scenario].

3. What APM tools are available on the market?

Pinpoint

Pinpoint is an open source APM (Application Performance Management) tool for large-scale distributed systems written in Java.

https://github.com/naver/pinpoint

SkyWalking

SkyWalking is a distributed tracing system and APM (Application Performance Monitoring) tool.

http://skywalking.org

Zipkin

Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in microservice architectures. It manages both the collection and lookup of this data. Zipkin's design is based on the Google Dapper paper.

http://zipkin.io/

CAT (Dianping)

CAT is a real-time application monitoring platform based on Java, including real-time application monitoring and business monitoring.

https://github.com/dianping/cat

4. Let's start with the conclusion

Of the tools that currently follow the Google Dapper design, Pinpoint is a better fit than Zipkin.

Pinpoint is non-invasive to application code: it uses Java agent bytecode enhancement, so enabling it only requires adding startup parameters.

It also matches the macro-level monitoring needed for performance testing and tuning in the [test perspective scenario].

Of course, it is early to draw a conclusion, and a question remains:

"What is the relationship between Spring Cloud Sleuth and Zipkin?"

For a detailed comparison, see the following figure:

5. Comparison

In essence, Spring Cloud Sleuth and Pinpoint are not directly comparable; the real comparison is with Zipkin. Spring Cloud Sleuth focuses on link tracing and analysis: it sends trace information to Zipkin and relies on Zipkin's storage to keep it. Zipkin can also be combined with ELK for logging and display, with scripts collecting server performance data and storing it in ELK so that server status information can be shown. Zipkin's own presentation is likewise centered on link analysis.

Part 2: Pinpoint in practice

1. Pinpoint architecture diagram

Pinpoint is an open source APM (Application Performance Management) tool for large-scale distributed systems written in Java.

Description of the architecture diagram:

Pinpoint-Collector: collects all kinds of performance data

Pinpoint-Agent: a probe attached to an application server (such as Tomcat) and deployed on the same server

Pinpoint-Web: presents the collected data in a web UI

HBase Storage: the collected data is stored in HBase

2. Pinpoint data structure

The data structure of Pinpoint messages consists mainly of three types: Span, Trace and TraceId.

Span is the most basic call tracking unit.

A Span represents the work done to handle a remote call when it arrives, and it carries the trace data. To provide code-level visibility, a Span also contains a layer of SpanEvent data structures. Each Span has a SpanId.

Trace is a collection of interrelated Spans

The Spans under the same Trace share a TransactionId and are arranged into a hierarchical tree according to SpanId and ParentSpanId.

TraceId is a combination of TransactionId, SpanId and ParentSpanId

TransactionId (TxId) is the ID of a transaction whose messages are sent and received across the entire distributed system, and it must be globally unique within the whole server group. In other words, the TransactionId identifies the entire call chain. SpanId is the ID of the job that handles a remote call, and it is generated when a call reaches a node. ParentSpanId (pSpanId), as the name implies, is the ID of the caller Span that produced the current Span. If a node is the original initiator of the transaction, its ParentSpanId is -1, which marks it as the root Span of the entire transaction. The following figure illustrates the relationship between these IDs.
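As a rough illustration of how SpanId and ParentSpanId arrange the Spans of one transaction into a tree, consider the sketch below. It is not Pinpoint's actual data model: the `Span` class, the `name` field, the sample IDs and the helper functions are assumptions for this example; only the three ID fields and the -1 root marker come from the text.

```python
from dataclasses import dataclass, field

ROOT_PARENT = -1  # per the text, the root Span has ParentSpanId -1


@dataclass
class Span:
    transaction_id: str
    span_id: int
    parent_span_id: int
    name: str  # hypothetical display label; not part of the TraceId
    children: list = field(default_factory=list)


def build_tree(spans):
    """Link the Spans of one transaction into a tree and return the root Span."""
    by_id = {s.span_id: s for s in spans}
    root = None
    for s in spans:
        if s.parent_span_id == ROOT_PARENT:
            root = s
        else:
            by_id[s.parent_span_id].children.append(s)
    return root


def render(span, depth=0):
    """Return the tree as indented lines, one line per Span."""
    lines = ["  " * depth + span.name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# Four Spans that share one TransactionId (made-up sample data).
spans = [
    Span("tx-1", 10, -1, "gateway"),
    Span("tx-1", 20, 10, "order-service"),
    Span("tx-1", 30, 20, "database"),
    Span("tx-1", 40, 10, "user-service"),
]
tree = build_tree(spans)
print("\n".join(render(tree)))
```

The Span whose ParentSpanId is -1 becomes the root, and every other Span hangs under the Span whose SpanId matches its ParentSpanId.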

3. Pinpoint deployment

Many deployment guides are available online, so this section does not go into detail and only notes the key points:

Note the version requirements:

Java version required to run Pinpoint:

HBase compatibility table:

Agent compatibility table:

There are two ways to start:

Method 1: modify bin/catalina.sh under the Tomcat directory (the file headed "Control Script for the CATALINA Server") and add the following three lines:

CATALINA_OPTS="$CATALINA_OPTS -javaagent:/home/webapps/service/pp-agent/pinpoint-bootstrap-1.6.2.jar"
CATALINA_OPTS="$CATALINA_OPTS -Dpinpoint.agentId=pp32tomcattest"
CATALINA_OPTS="$CATALINA_OPTS -Dpinpoint.applicationName=32tomcat"

First line: location of pinpoint-bootstrap-1.6.2.jar

Second line: agentId must be unique; it identifies a single JVM

Third line: applicationName identifies the application. Different instances of the same application should use different agentId values and the same applicationName

Method 2: Spring Boot startup

java -javaagent:/home/webapps/pp-agent/pinpoint-bootstrap-1.6.2.jar -Dpinpoint.agentId=pp32tomcattest -Dpinpoint.applicationName=32tomcat -jar 32tomcat-0.0.1-SNAPSHOT.jar

4. How code injection works

Pinpoint wraps code injection in a way that closely resembles AOP. When a class is loaded, an Interceptor injects "before" and "after" logic into the specified methods. That logic can read the running state of the system, and it creates a Trace message through TraceContext and sends it to the Pinpoint server. Unlike AOP, however, Pinpoint's encapsulation pays more attention to interacting with the target code, so writing injection code with the API provided by Pinpoint is easier and more specialized than with AOP.
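The before/after pattern described above can be sketched in a few lines. This is only a conceptual analogue: Pinpoint rewrites Java bytecode at class-load time, whereas the sketch below swaps a method on a Python class at runtime. The names `TimingInterceptor`, `instrument` and `OrderDao` are assumptions invented for the example and are not Pinpoint APIs.

```python
import time


class TimingInterceptor:
    """Hypothetical interceptor that records one event per intercepted call."""

    def __init__(self):
        self.events = []

    def before(self, target, args):
        # Runs before the target method; returns state handed to after().
        return time.perf_counter()

    def after(self, target, args, started, result, error):
        # Runs after the target method, whether it returned or raised.
        self.events.append({
            "method": target,
            "elapsed_ms": (time.perf_counter() - started) * 1000,
            "error": repr(error) if error is not None else None,
        })


def instrument(cls, method_name, interceptor):
    """Replace cls.method_name with a wrapper that calls before() and after()."""
    original = getattr(cls, method_name)
    label = f"{cls.__name__}.{method_name}"

    def wrapper(self, *args, **kwargs):
        started = interceptor.before(label, args)
        result, error = None, None
        try:
            result = original(self, *args, **kwargs)
            return result
        except Exception as exc:
            error = exc
            raise
        finally:
            interceptor.after(label, args, started, result, error)

    setattr(cls, method_name, wrapper)


class OrderDao:
    """Stand-in for application code that knows nothing about tracing."""

    def find(self, order_id):
        return {"id": order_id}


interceptor = TimingInterceptor()
instrument(OrderDao, "find", interceptor)
result = OrderDao().find(42)
```

The target class is left unmodified in source form, which mirrors the non-invasive property the article attributes to the agent approach; in a real agent the recorded event would be turned into trace data and sent to a collector.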

5. Pinpoint in action

Set up the open source Java project JForum, run it under Tomcat, and stress-test it with JMeter.

Server diagram (ServerMap)

Understand the topology of any distributed system by visualizing the interconnection of its components. Clicking the node displays detailed information about the component, such as its current state and transaction count.

Real-time active thread chart (Realtime Active Thread Chart)

Monitor active threads within the application in real time. (The official picture is used here because no screenshot was taken at the time.)

Request / response scatter diagram (Request/Response Scatter Chart)

Visualize request count and response patterns to identify potential problems. You can select a transaction by dragging on the chart for more details.

Call stack information (CallStack)

Enhance the code-level visibility of each transaction in a distributed environment to identify bottlenecks and points of failure in a single view.

Inspector (Inspector)

View other details of the application, such as CPU usage, memory / garbage collection, TPS, and JVM parameters.

6. Summary

First: from a macro point of view, Pinpoint shows the overall link and the overall service status (CPU, memory, and so on), which matches the macro-level monitoring needed for performance test tuning in the [test perspective scenario].

Second: from a micro point of view, Spring Cloud Sleuth must be combined with Zipkin. It cannot display data on its own; link problems are shown through Zipkin (which has no view of overall server status), and further server performance information has to be collected and displayed through ELK. This matches the micro-level monitoring needed for performance test tuning in the [R & D scenario].

Generally speaking, the two complement each other. If only one is used, then from the testing point of view Pinpoint satisfies the macro-level performance test tuning monitoring of the [test perspective scenario].

7. Project scenario

Why does a single call to a certain API by a backend application service generate a chain of links containing 23 database access requests? This is what needs troubleshooting. Take a closer look at CallTree to find out which SQL statements can be optimized.
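One way to narrow down such a case is to count how often each SQL statement appears in the call tree of a single request. The sketch below assumes the call tree has already been exported as simple (event type, detail) rows; that row format, the sample statements and the `repeated_queries` helper are all made up for illustration and are not a Pinpoint export format.

```python
from collections import Counter

# Hypothetical call-tree rows for one request: (event type, detail).
call_tree = [
    ("http", "GET /api/tags"),
    ("sql", "SELECT * FROM tag WHERE id = ?"),
    ("sql", "SELECT * FROM tag WHERE id = ?"),
    ("sql", "SELECT * FROM tag WHERE id = ?"),
    ("sql", "SELECT * FROM category"),
]


def repeated_queries(rows, threshold=2):
    """Return SQL statements that run at least `threshold` times, most frequent first."""
    counts = Counter(detail for kind, detail in rows if kind == "sql")
    return [(sql, n) for sql, n in counts.most_common() if n >= threshold]


hot = repeated_queries(call_tree)
for sql, n in hot:
    print(f"{n}x {sql}")
```

A statement that repeats many times within one request is a typical candidate for batching or caching, which is the kind of optimization the CallTree review is meant to surface.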

In addition, during performance testing, the server's concurrent I/O together with the continuous writes made by Pinpoint (PP) can itself become a bottleneck, which needs to be solved later.

8. Simple stress test of the tag library project

Run a simple stress test against the tag library with JMeter. The script is as follows:

The problem found through APM is as follows:

The response time of Pquery.do is as high as 6782 ms, so developers need to investigate further and locate the problematic code.

In another scenario, the tester cannot see the error information on the page (and in some cases has no server permissions). Such errors are exceptions raised at the lower layers of the service, and they can be viewed through CallTree.

9. Panoramic spider-web diagram of the links after the application services are connected to APM

References:

Pinpoint github

Analysis of Pinpoint Source Code (3)

Dapper, tracking system of large-scale distributed system

Pinpoint study notes

Pinpoint v1.5.0 APM video introduction
