Shulou (Shulou.com), SLTechnology News&Howtos > Development, 2025-02-28 Update
Many newcomers are not very clear about the technical principles behind JVM CPU Profilers. To help with that, this article explains them in detail, with source code analysis along the way; hopefully you will gain something from it.
Introduction
When developers face online alarms or need to optimize system performance, they often need to analyze a program's runtime behavior and performance bottlenecks. Profiling is a dynamic analysis technique that collects information about a program while it is running. Common JVM Profilers can analyze a program from many angles, such as CPU, Memory, Thread, Classes and GC, among which CPU Profiling is the most widely used.
CPU Profiling is typically used to analyze the execution hotspots of code, answering questions like "which method consumes the most CPU time" and "what share of CPU does each method occupy". With this information, developers can easily analyze and optimize the hotspots, break through performance bottlenecks, and greatly improve system throughput.
Introduction to CPU Profiler
The community has many JVM Profiler implementations: JProfiler is commercial and powerful, while free, open source products such as JVM-Profiler have their own strengths. The recent versions of IntelliJ IDEA that we use daily also integrate a simple, easy-to-use Profiler; see the official blog for details.
After opening the Java project to be diagnosed in IDEA, add a "CPU Profiler" under "Preferences -> Build, Execution, Deployment -> Java Profiler", then go back to the project and click "Run with Profiler" in the upper right corner to start the project with CPU Profiling enabled. After it has run for a while (5 minutes recommended), click "Stop Profiling and Show Results" in the Profiler pane to see the profiling results, including a flame graph and a call tree, as shown below:
IntelliJ IDEA: performance flame graph
IntelliJ IDEA: call stack tree
The flame graph is a visual performance-analysis chart generated from a set of call stack samples. For how to read one, see "How to read flame graphs?". In short, look for "wide plateaus": those flat tops are the CPU hotspots of our program. The call tree is another visualization; it is generated from the same sample set as the flame graph, and you can pick whichever suits your needs.
Note that we did not add any dependency to the project; a bare "Run with Profiler" was enough for the Profiler to obtain runtime information about our program. This is achieved through a JVM Agent. To build a systematic understanding, let's first briefly introduce JVM Agents.
Introduction to JVM Agent
A JVM Agent is a special library written according to certain rules. It can be passed to the JVM through command line parameters at startup and runs in the same process as the target JVM as a companion library. Inside the Agent, information about the JVM process can be obtained through fixed interfaces. An Agent can be either a JVMTI Agent written in C/C++/Rust or a Java Agent written in Java.
By executing the Java command, we can see the command line parameters related to Agent:
-agentlib:<libname>[=<options>]
          load native agent library <libname>, e.g. -agentlib:jdwp
          see also -agentlib:jdwp=help
-agentpath:<pathname>[=<options>]
          load native agent library by full pathname
-javaagent:<jarpath>[=<options>]
          load Java programming language agent, see java.lang.instrument
JVMTI Agent
JVMTI (JVM Tool Interface) is a set of standard programming interfaces provided by the JVM, and the unified foundation for implementing Debuggers, Profilers, Monitors, Thread Analysers and other tools. It is implemented by all mainstream Java virtual machines.
When we want to implement an Agent based on JVMTI, we need to implement the following entry function:
// $JAVA_HOME/include/jvmti.h
JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *vm, char *options, void *reserved);
Implement this function in C/C++, compile the code into a dynamic link library (.so on Linux), and pass the library's full path to the Java process with the -agentpath argument; the JVM will execute the function at the appropriate time during startup. Inside the function, we can obtain the JNI and JVMTI function pointer tables through the JavaVM pointer parameter, which gives us the ability to interact with the JVM in a variety of complex ways.
More details about JVMTI can be found in the official documentation.
Java Agent
In many scenarios it is not necessary to develop a JVMTI Agent in C/C++, because that is expensive to build and hard to maintain. On top of JVMTI, the JVM itself encapsulates a set of Instrument APIs for Java, which allows a Java Agent (just a jar package) to be developed in the Java language and greatly reduces the development cost of an Agent. Community open source products such as Greys, Arthas, JVM-Sandbox and JVM-Profiler are all written in pure Java and run as Java Agents.
For a Java Agent, we need to specify an entry class via Premain-Class in the jar's MANIFEST.MF and implement the following method in that entry class:
public static void premain(String args, Instrumentation ins) {
    // implement
}
The packaged jar is a Java Agent; pass it to the Java process at startup with the -javaagent parameter, and the JVM will likewise execute this method at the appropriate time during startup.
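For reference, a minimal MANIFEST.MF for such an agent jar might look like this (the entry class name is illustrative; Premain-Class, Can-Redefine-Classes and Can-Retransform-Classes are the standard java.lang.instrument manifest attributes):

```text
Manifest-Version: 1.0
Premain-Class: com.example.MyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
```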
In this method, the Instrumentation parameter provides the ability to Retransform Classes: with it we can modify the host process's Classes to implement features such as method timing statistics, fault injection and Trace. The capability of the Instrumentation interface itself is relatively narrow, covering only Class bytecode operations; but since we are now inside the host process, we can also use JMX to obtain the host process's memory, threads, locks and other information directly. Both the Instrument API and JMX are, internally, still implemented on top of JVMTI.
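As a quick illustration of the JMX side, a minimal sketch using only the standard java.lang.management APIs (no agent required when running inside the process itself):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class JmxInfoDemo {
    public static void main(String[] args) {
        // Memory: current heap usage of this (the "host") process
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long usedHeap = memory.getHeapMemoryUsage().getUsed();

        // Threads: live thread count of this process
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int liveThreads = threads.getThreadCount();

        System.out.println("usedHeap>0=" + (usedHeap > 0));
        System.out.println("liveThreads>=1=" + (liveThreads >= 1));
    }
}
```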
More details about Instrument API can be found in the official documentation.
Analysis of the principle of CPU Profiler
After understanding how Profiler is executed in the form of Agent, we can start trying to construct a simple CPU Profiler. But before that, it is also necessary to understand the two implementations of CPU Profiling technology and their differences.
Sampling vs Instrumentation
Students who have used JProfiler should know that the CPU Profiling function of JProfiler provides two options: Sampling and Instrumentation, which are also two means to implement CPU Profiler.
The Sampling method, as its name implies, is based on the "sampling" of StackTrace. The core principles are as follows:
Introduce Profiler dependency, or directly use Agent technology to inject the target JVM process and start Profiler.
Start a sampling timer that Dumps the call stacks of all threads at a fixed sampling interval (on the order of milliseconds).
Aggregate the Dumped call stacks of each round; after enough samples have been collected over a period of time, export the statistics: the number of times each method was sampled, and the method call relationships.
The Instrumentation method uses the Instrument API to enhance the bytecode of every necessary Class, inserting probes at method entry and exit; when a method finishes executing, its execution time is recorded and finally aggregated. Both approaches can get the desired results, so what is the difference between them? In other words, which is better?
The Instrumentation approach adds extra AOP logic to almost every method, which has a huge performance impact on online services; its advantage is absolutely accurate method invocation counts and execution time statistics.
The Sampling method, using a non-intrusive extra thread, samples call stack snapshots of all threads at a fixed frequency, so its performance overhead is very low compared with the former. However, because it is based on "sampling", and because of the JVM's inherent "defect" of only being able to sample at safepoints (Safe Point), the statistics deviate somewhat. For example, some methods execute for a very short time but at a very high frequency, and genuinely consume a lot of CPU Time; yet the sampling period of a Sampling Profiler cannot be shrunk indefinitely, since that would sharply increase the performance overhead. As a result, many samples may miss these "high-frequency mini methods" in their call stacks, and the final result fails to reflect the real CPU hotspots. For more Sampling-related problems, see "Why (Most) Sampling Java Profilers Are Fucking Terrible".
As for "which is better", there is no clear-cut verdict between the two implementation techniques; the question is only meaningful for specific scenarios. Because of its low overhead, Sampling is better suited to CPU-intensive applications and to online services that cannot tolerate large performance overhead. Instrumentation is better suited to I/O-intensive applications that are insensitive to performance overhead and genuinely need accurate statistics. Community Profilers are mostly based on Sampling, and this article explains the Sampling approach as well.
Implementation based on Java Agent + JMX
The simplest Sampling CPU Profiler can be implemented with Java Agent + JMX: enter the target JVM process through a Java Agent, start a ScheduledExecutorService, periodically export the StackTraces of all threads with JMX's threadMXBean.dumpAllThreads(), and finally aggregate and export them.
// com/uber/profiling/profilers/StacktraceCollectorProfiler.java

/*
 * StacktraceCollectorProfiler is equivalent to the CpuProfiler described in this article;
 * only the naming preference differs. jvm-profiler's own CpuProfiler refers to the
 * Profiler for CpuLoad metrics.
 */

// Implements the Profiler interface; all Profilers share one ScheduledExecutorService for timing
@Override
public void profile() {
    ThreadInfo[] threadInfos = threadMXBean.dumpAllThreads(false, false);
    // ...
    for (ThreadInfo threadInfo : threadInfos) {
        String threadName = threadInfo.getThreadName();
        // ...
        StackTraceElement[] stackTraceElements = threadInfo.getStackTrace();
        // ...
        for (int i = stackTraceElements.length - 1; i >= 0; i--) {
            StackTraceElement stackTraceElement = stackTraceElements[i];
            // ...
        }
        // ...
    }
}
The default interval of the timer Uber provides is 100ms, which is rather coarse for a CPU Profiler. However, since the cost of dumpAllThreads() is not to be underestimated, the interval should not be set too small either, so the CPU Profiling results of this method carry a considerable error.
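To make the approach concrete, here is a minimal, self-contained sketch of the Java Agent + JMX sampling idea (our own illustration, not jvm-profiler's actual code, and run as a plain main for simplicity): it dumps all thread stacks on a fixed schedule and counts how often each distinct stack is seen.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MiniCpuSampler {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();
    // key: rendered call stack, value: number of times it was sampled
    private static final Map<String, Long> SAMPLES = new ConcurrentHashMap<>();

    static void sampleOnce() {
        for (ThreadInfo info : THREADS.dumpAllThreads(false, false)) {
            StackTraceElement[] frames = info.getStackTrace();
            if (frames.length == 0) continue;
            StringBuilder sb = new StringBuilder();
            for (int i = frames.length - 1; i >= 0; i--) { // stack bottom first
                if (sb.length() > 0) sb.append(';');
                sb.append(frames[i].getClassName()).append('.').append(frames[i].getMethodName());
            }
            SAMPLES.merge(sb.toString(), 1L, Long::sum);
        }
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(MiniCpuSampler::sampleOnce, 0, 10, TimeUnit.MILLISECONDS);

        // Some busy work so the sampler has something to observe
        long x = 0;
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(200);
        while (System.nanoTime() < deadline) x += x * 31 + 7;

        timer.shutdown();
        timer.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println("distinct stacks sampled: " + SAMPLES.size());
        System.out.println("nonEmpty=" + !SAMPLES.isEmpty());
    }
}
```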
The advantage of JVM-Profiler is that it supports Profiling of multiple metrics (StackTrace, CPUBusy, Memory, I/O, Method) and supports reporting the Profiling results to a central Server via Kafka for analysis, i.e. cluster diagnosis.
Implementation based on JVMTI + GetStackTrace
Implementing a Profiler in Java is relatively simple, but there are problems. For example, the Java Agent code shares the AppClassLoader with the business code, so an Agent.jar loaded directly by the JVM may pollute business Classes if it brings in third-party dependencies. As of this writing, JVM-Profiler has this problem: it pulls in components such as Kafka-Client, http-Client and Jackson, and a version conflict with components in the business code may cause unpredictable errors. The solution adopted by Greys/Arthas/JVM-Sandbox is to separate the entry point from the core code, and load the core code with a custom ClassLoader so that the business code is unaffected.
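The isolation trick mentioned here can be sketched in a few lines: load the core code through a ClassLoader whose parent is null, so that only bootstrap classes are shared and nothing leaks between the agent's dependencies and the business classpath (class names below are illustrative; a real agent would point the URL array at its core jar):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class IsolationDemo {
    public static void main(String[] args) throws Exception {
        // Parent == null: only bootstrap classes (java.*) are visible to this loader
        URLClassLoader isolated = new URLClassLoader(new URL[0], null);

        // Bootstrap classes still resolve fine...
        Class<?> string = isolated.loadClass("java.lang.String");
        System.out.println("loadedString=" + (string == String.class));

        // ...but classes on the application classpath (like this demo class itself)
        // are invisible to it; symmetrically, classes the isolated loader defines
        // would not pollute the AppClassLoader
        boolean hidden;
        try {
            isolated.loadClass("IsolationDemo");
            hidden = false;
        } catch (ClassNotFoundException e) {
            hidden = true;
        }
        System.out.println("appClassHidden=" + hidden);
        isolated.close();
    }
}
```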
Going a level lower, with C/C++ we can program directly against the JVMTI interface and operate the JVM through its native C API: the features are richer and more powerful, but development efficiency is lower. To develop a CPU Profiler on the same principle as the previous section using JVMTI, the following steps are required:
1. Write Agent_OnLoad(), and in the entry obtain the jvmtiEnv pointer through the GetEnv() function of the JavaVM pointer:
// agent.c
JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *vm, char *options, void *reserved) {
    jvmtiEnv *jvmti;
    (*vm)->GetEnv(vm, (void **)&jvmti, JVMTI_VERSION_1_0);
    // ...
    return JNI_OK;
}
2. Start a thread that loops on a timer, using the jvmtiEnv pointer to call the following JVMTI functions:
// Get the jthreads of all threads
jvmtiError GetAllThreads(jvmtiEnv *env, jint *threads_count_ptr, jthread **threads_ptr);

// Get the thread information (name, daemon, priority...) from a jthread
jvmtiError GetThreadInfo(jvmtiEnv *env, jthread thread, jvmtiThreadInfo *info_ptr);

// Get the thread's call stack from a jthread
jvmtiError GetStackTrace(jvmtiEnv *env,
                         jthread thread,
                         jint start_depth,
                         jint max_frame_count,
                         jvmtiFrameInfo *frame_buffer,
                         jint *count_ptr);
The main logic is roughly: first call GetAllThreads() to obtain the jthread "handles" of all threads; then for each jthread call GetThreadInfo() to get the thread info and filter out unneeded threads by name; finally, for each remaining jthread call GetStackTrace() to get its call stack.
3. Save each round of sampling results in a Buffer and finally generate the necessary statistics.
Following the steps above, we can implement a JVMTI-based CPU Profiler. Note, however, that even fetching call stacks through the native JVMTI GetStackTrace() interface has the same problem as JMX: sampling can only happen at safepoints (Safe Point).
SafePoint Bias problem
A Sampling-based CPU Profiler approximates the hot methods from call stack samples collected at different points in time. Theoretically, a Sampling CPU Profiler must therefore obey the following two principles:
There must be enough samples.
All running code points in the program must be sampled by Profiler with the same probability.
Sampling only at safepoints violates the second principle. If call stack snapshots can be taken only at safepoints, some code may never get a chance to be sampled, even though it genuinely consumes a lot of CPU execution time. This phenomenon is called "SafePoint Bias".
As mentioned above, SafePoint Bias exists in both the JMX-based and the JVMTI-based Profiler implementations. One detail worth knowing: taken alone, JVMTI's GetStackTrace() does not need to be executed at a safepoint of the calling thread; but when GetStackTrace() is called to fetch another thread's call stack, it must wait until the target thread enters a safepoint. Moreover, GetStackTrace() can only be called synchronously from a separate thread; it cannot be called asynchronously inside a UNIX signal handler. In summary, GetStackTrace() suffers the same SafePoint Bias as JMX. For more about safepoints, see "Safepoints: Meaning, Side Effects and Overheads".
So, how to avoid SafePoint Bias? The community provides a Hack idea-AsyncGetCallTrace.
Implementation based on JVMTI + AsyncGetCallTrace
As the previous section suggests, if we had a function that could fetch the current thread's call stack without being affected by safepoints, and that supported being called asynchronously inside a UNIX signal handler, then we would only need to register a UNIX signal handler and call that function in the handler to obtain the current thread's call stack. Since a UNIX signal is delivered to a randomly chosen thread of the process, the signals end up evenly distributed across all threads, and the call stack samples of all threads are obtained evenly as well.
A function called AsyncGetCallTrace is provided internally in OracleJDK/OpenJDK, and its prototype is as follows:
// Stack frame
typedef struct {
    jint lineno;
    jmethodID method_id;
} AGCT_CallFrame;

// Call stack
typedef struct {
    JNIEnv *env;
    jint num_frames;
    AGCT_CallFrame *frames;
} AGCT_CallTrace;

// Fill the call stack into the trace pointer according to ucontext
void AsyncGetCallTrace(AGCT_CallTrace *trace, jint depth, void *ucontext);
As you can see from the prototype, the function is used in a very simple way, and the complete Java call stack can be obtained directly through ucontext.
As its name implies, AsyncGetCallTrace is "async": it is unaffected by safepoints, so sampling can happen at any moment, including while Native code is executing or during GC; at those moments, however, no Java call stack is available. The num_frames field of AGCT_CallTrace normally holds the depth of the obtained call stack, but in the abnormal cases just mentioned it is a negative error code; the most common, -2, means a GC is currently in progress.
Because AsyncGetCallTrace is a non-standard JVMTI function, its declaration cannot be found in jvmti.h; and since its object file has long been linked into the JVM binary, we cannot obtain its address with a simple declaration either. Some trickery is needed. In short, the Agent is eventually loaded into the address space of the target JVM process as a dynamic link library, so the dlsym() function provided by glibc can look up the symbol named "AsyncGetCallTrace" in the current address space (i.e. the target JVM process's address space). That yields a pointer to the function which, after a cast to the prototype above, can be called normally.
The general process of implementing CPU Profiler through AsyncGetCallTrace:
1. Write Agent_OnLoad(); in the entry obtain the jvmtiEnv pointer and the AsyncGetCallTrace function pointer. AsyncGetCallTrace is obtained as follows:
typedef void (*AsyncGetCallTrace)(AGCT_CallTrace *traces, jint depth, void *ucontext);
// ...
AsyncGetCallTrace agct_ptr = (AsyncGetCallTrace)dlsym(RTLD_DEFAULT, "AsyncGetCallTrace");
if (agct_ptr == NULL) {
    void *libjvm = dlopen("libjvm.so", RTLD_NOW);
    if (!libjvm) {
        // handle dlerror()...
    }
    agct_ptr = (AsyncGetCallTrace)dlsym(libjvm, "AsyncGetCallTrace");
}
2. One more thing to do in the OnLoad phase is to register the two hooks OnClassLoad and OnClassPrepare, because jmethodIDs are lazily allocated and using AGCT to get Traces depends on their being allocated beforehand. In the OnClassPrepare callback we fetch all of the Class's Methods, so that JVMTI allocates the jmethodIDs of all methods in advance, as shown below:
void JNICALL OnClassLoad(jvmtiEnv *jvmti, JNIEnv *jni, jthread thread, jclass klass) {}

void JNICALL OnClassPrepare(jvmtiEnv *jvmti, JNIEnv *jni, jthread thread, jclass klass) {
    jint method_count;
    jmethodID *methods;
    jvmti->GetClassMethods(klass, &method_count, &methods);
    // The methods array is JVMTI-allocated; release it with Deallocate
    jvmti->Deallocate((unsigned char *)methods);
}

// ...
jvmtiEventCallbacks callbacks = {0};
callbacks.ClassLoad = OnClassLoad;
callbacks.ClassPrepare = OnClassPrepare;
jvmti->SetEventCallbacks(&callbacks, sizeof(callbacks));
jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_CLASS_LOAD, NULL);
jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_CLASS_PREPARE, NULL);
3. Use the SIGPROF signal for timed sampling:
// The ucontext passed to the signal handler here is exactly the ucontext AsyncGetCallTrace needs
void signal_handler(int signo, siginfo_t *siginfo, void *ucontext) {
    // Sample with AsyncGetCallTrace, taking care to handle the negative num_frames cases
}

// ...

// Register the handler for the SIGPROF signal
struct sigaction sa;
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = signal_handler;
sa.sa_flags = SA_RESTART | SA_SIGINFO;
sigaction(SIGPROF, &sa, NULL);

// Generate SIGPROF at fixed intervals.
// interval is the sampling interval in nanoseconds; relative to synchronous sampling,
// AsyncGetCallTrace can afford a much higher frequency.
long sec = interval / 1000000000;
long usec = (interval % 1000000000) / 1000;
struct itimerval tv = {{sec, usec}, {sec, usec}};
setitimer(ITIMER_PROF, &tv, NULL);
4. Save each round of sampling results in a Buffer and finally generate the necessary statistics.
Following the steps above, an AsyncGetCallTrace-based CPU Profiler can be built; this is currently the lowest-overhead, most efficient CPU Profiler implementation in the community. On Linux, combined with perf_events, it can even sample the Java stack and the Native stack at the same time, analyzing performance hotspots in Native code as well. The typical open source implementations of this approach are Async-Profiler and Honest-Profiler, with Async-Profiler being particularly high quality; interested readers are encouraged to study its source code. Interestingly, IntelliJ IDEA's built-in Java Profiler is actually a wrapper around Async-Profiler. For more about AsyncGetCallTrace, see "The Pros and Cons of AsyncGetCallTrace Profilers".
Generate performance flame diagram
Now we have the ability to sample call stacks, but the sample set lives in memory as a two-dimensional array. How do we turn it into a visual flame graph?
A flame graph is usually an svg file, and some excellent projects can generate the flame graph file automatically from a text file, imposing only modest requirements on the text format. The core of the FlameGraph project is just a Perl script that generates the corresponding flame graph svg from the call stack text we provide. The call stack text format is fairly simple:
base_func;func1;func2;func3 10
base_func;funca;funcb 15
After aggregating our call stack samples, we need to emit text in the format shown above. Each line represents one "class" of call stack: to the left of the space are the stack's method names joined by semicolons, with the stack bottom on the left and the stack top on the right; to the right of the space is the number of times that sample appeared.
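This aggregation step can be sketched in Java as follows (the sample data and method names are synthetic, matching the example lines above):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FoldedStacks {
    // Render aggregated stack counts in FlameGraph's "folded" text format:
    // frames joined by ';' (stack bottom on the left), a space, then the sample count.
    static String toFolded(Map<List<String>, Integer> counts) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<List<String>, Integer> e : counts.entrySet()) {
            out.append(String.join(";", e.getKey()))
               .append(' ')
               .append(e.getValue())
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        counts.put(List.of("base_func", "func1", "func2", "func3"), 10);
        counts.put(List.of("base_func", "funca", "funcb"), 15);
        System.out.print(toFolded(counts));
    }
}
```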
Give the sample file to the flamegraph.pl script for execution, and you can output the corresponding flame diagram:
$ flamegraph.pl stacktraces.txt > stacktraces.svg
The effect is shown in the following figure:
Flame diagrams generated by flamegraph.pl
Analysis of Dynamic Attach Mechanism of HotSpot
So far, we have covered the complete working principle of a CPU Profiler. However, students who have used JProfiler/Arthas may wonder: in many cases, you can directly profile a running online service without adding any Agent parameter to the Java process's startup parameters. The answer lies in Dynamic Attach.
Since version 1.6, the JDK has provided the Attach API, which allows adding an Agent to a running JVM process; it is widely used in all kinds of Profilers and bytecode enhancement tools. The official introduction reads:
This is a Sun extension that allows a tool to 'attach' to another process running Java code and launch a JVM TI agent or a java.lang.instrument agent in that process.
In general, Dynamic Attach is a special capability provided by HotSpot that allows one process to send commands to another running JVM process and have them executed there. The commands are not limited to loading an Agent; they also include dumping memory, dumping threads, and so on.
Attach via sun.tools
Although Attach is a capability provided by HotSpot, JDK also encapsulates it at the Java level.
As mentioned earlier, for a Java Agent run via a startup parameter, the premain method is executed; but we can also implement an additional agentmain method and specify its class as Agent-Class in MANIFEST.MF:
public static void agentmain(String args, Instrumentation ins) {
    // implement
}
A jar packaged this way can either be started with the -javaagent parameter or be Attached to a running target JVM process. JDK encapsulates a simple API for this; let's Attach a Java Agent directly, as demonstrated by this code from Arthas:
// com/taobao/arthas/core/Arthas.java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
// ...
private void attachAgent(Configure configure) throws Exception {
    VirtualMachineDescriptor virtualMachineDescriptor = null;
    // List all JVM processes and find the target process
    for (VirtualMachineDescriptor descriptor : VirtualMachine.list()) {
        String pid = descriptor.id();
        if (pid.equals(Integer.toString(configure.getJavaPid()))) {
            virtualMachineDescriptor = descriptor;
        }
    }
    VirtualMachine virtualMachine = null;
    try {
        // Call VirtualMachine.attach() on the target JVM process to obtain a VirtualMachine instance
        if (null == virtualMachineDescriptor) {
            virtualMachine = VirtualMachine.attach("" + configure.getJavaPid());
        } else {
            virtualMachine = VirtualMachine.attach(virtualMachineDescriptor);
        }
        // ...
        // Call VirtualMachine#loadAgent() to attach the jar specified by arthasAgentPath
        // to the target JVM process; the second parameter is the attach argument,
        // i.e. the first String parameter args of agentmain
        virtualMachine.loadAgent(arthasAgentPath, configure.getArthasCore() + ";" + configure.toString());
    } finally {
        if (null != virtualMachine) {
            // Call VirtualMachine#detach() to release the connection
            virtualMachine.detach();
        }
    }
}
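A minimal standalone use of the same API is simply enumerating the attachable JVMs on the machine (this requires running on a JDK, where the jdk.attach module, or tools.jar on JDK 8, is available):

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.List;

public class ListJvms {
    public static void main(String[] args) {
        // Every attachable JVM on this machine, typically including ourselves
        List<VirtualMachineDescriptor> vms = VirtualMachine.list();
        System.out.println("gotList=" + (vms != null));
        for (VirtualMachineDescriptor vmd : vms) {
            System.out.println(vmd.id() + "\t" + vmd.displayName());
        }
    }
}
```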
Attach HotSpot directly
The sun.tools-encapsulated API is simple and easy to use, but it can only be called from Java and can only Attach Java Agents, so sometimes we have to Attach to a JVM process by hand. For a JVMTI Agent, besides Agent_OnLoad() we also need to implement an Agent_OnAttach() function, which is executed when the JVMTI Agent is delivered to the target process via Attach:
// $JAVA_HOME/include/jvmti.h
JNIEXPORT jint JNICALL Agent_OnAttach(JavaVM *vm, char *options, void *reserved);
Let's use the jattach source code in Async-Profiler as a guide to explore how the Attach mechanism sends commands to a running JVM process. jattach is a Driver provided by Async-Profiler, and its usage is intuitive:
Usage:
    jattach <pid> <cmd> [args...]
Args:
    <pid>    process ID of the target JVM process
    <cmd>    command to be executed
    [args]   command arguments
For example:
$ jattach 1234 load /absolute/path/to/agent/libagent.so true
Execute the command above, and libagent.so is loaded into the JVM process with ID 1234, which then starts executing the Agent_OnAttach function. Note that the euid and egid of the process executing the Attach must be the same as those of the target JVM process being Attached. Next, let's analyze the jattach source code.
The main function shown below describes the overall flow of an Attach:
// async-profiler/src/jattach/jattach.c
int main(int argc, char** argv) {
    // Parse command line arguments
    // Check euid and egid
    // ...
    if (!check_socket(nspid) && !start_attach_mechanism(pid, nspid)) {
        perror("Could not start attach mechanism");
        return 1;
    }
    int fd = connect_socket(nspid);
    if (fd == -1) {
        perror("Could not connect to socket");
        return 1;
    }
    printf("Connected to remote JVM\n");
    if (!write_command(fd, argc - 2, argv + 2)) {
        perror("Error writing to socket");
        close(fd);
        return 1;
    }
    printf("Response code = ");
    fflush(stdout);
    int result = read_response(fd);
    close(fd);
    return result;
}
Ignoring the parsing of command line arguments and the euid/egid check, jattach first calls the check_socket function to perform a "socket check". The check_socket source is as follows:
// async-profiler/src/jattach/jattach.c
// Check if remote JVM has already opened socket for Dynamic Attach
static int check_socket(int pid) {
    char path[MAX_PATH];
    snprintf(path, MAX_PATH, "%s/.java_pid%d", get_temp_directory(), pid); // get_temp_directory() always returns "/tmp" on Linux
    struct stat stats;
    return stat(path, &stats) == 0 && S_ISSOCK(stats.st_mode);
}
We know that UNIX systems provide a file-based Socket interface, "UNIX Socket" (a common means of interprocess communication). In this function, the S_ISSOCK macro checks whether the file is bound to a UNIX Socket; so presumably the "/tmp/.java_pid<pid>" file is a UNIX Socket that the target JVM opens for Dynamic Attach communication. Let's verify that conjecture.
Refer to the official documents and get the following description:
The attach listener thread then communicates with the source JVM in an OS dependent manner:
On Solaris, the Doors IPC mechanism is used. The door is attached to a file in the file system so that clients can access it.
On Linux, a Unix domain socket is used. This socket is bound to a file in the filesystem so that clients can access it.
On Windows, the created thread is given the name of a pipe which is served by the client. The result of the operations are written to this pipe by the target JVM.
This confirms our conjecture. The purpose of the check_socket function is now easy to understand: determine whether a UNIX Socket has already been established between the external process and the target JVM process.
Back in the main function: once check_socket determines that no connection has been established yet, start_attach_mechanism is called; the function name directly describes its job. Its source is as follows:
// async-profiler/src/jattach/jattach.c
// Force remote JVM to start Attach listener.
// HotSpot will start Attach listener in response to SIGQUIT if it sees .attach_pid file
static int start_attach_mechanism(int pid, int nspid) {
    char path[MAX_PATH];
    snprintf(path, MAX_PATH, "/proc/%d/cwd/.attach_pid%d", nspid, nspid);
    int fd = creat(path, 0660);
    if (fd == -1 || (close(fd) == 0 && !check_file_owner(path))) {
        // Failed to create attach trigger in current directory. Retry in /tmp
        snprintf(path, MAX_PATH, "%s/.attach_pid%d", get_temp_directory(), nspid);
        fd = creat(path, 0660);
        if (fd == -1) {
            return 0;
        }
        close(fd);
    }
    // We have to still use the host namespace pid here for the kill() call
    kill(pid, SIGQUIT);
    // Start with 20 ms sleep and increment delay each iteration
    struct timespec ts = {0, 20000000};
    int result;
    do {
        nanosleep(&ts, NULL);
        result = check_socket(nspid);
    } while (!result && (ts.tv_nsec += 20000000) < 300000000);
    unlink(path);
    return result;
}
The start_attach_mechanism function first creates an empty file named ".attach_pid<pid>" (preferably in the target process's working directory, falling back to /tmp), then sends a SIGQUIT signal to the target JVM process, and finally polls check_socket in a loop for up to about 300ms, waiting for the "/tmp/.java_pid<pid>" socket to appear, giving up if it never does.
From this point of view, HotSpot seems to provide a special mechanism: as long as an ".attach_pid<pid>" file has been prepared in advance and the process is sent a SIGQUIT signal, the Attach mechanism is triggered. Is that really so?
Refer to the documentation and get the following description:
Dynamic attach has an attach listener thread in the target JVM. This is a thread that is started when the first attach request occurs. On Linux and Solaris, the client creates a file named .attach_pid(pid) and sends a SIGQUIT to the target JVM process. The existence of this file causes the SIGQUIT handler in HotSpot to start the attach listener thread. On Windows, the client uses the Win32 CreateRemoteThread function to create a new thread in the target process.
This makes it clear: on Linux, we just need to create an ".attach_pid<pid>" file and send a SIGQUIT signal to the target JVM process, and HotSpot's SIGQUIT handler will start the Attach listener thread, which then serves commands over the UNIX Socket.
Continuing through the jattach source, sure enough, the connect_socket function is called next to connect to the "/tmp/.java_pid<pid>" socket:
// async-profiler/src/jattach/jattach.c
// Connect to UNIX domain socket created by JVM for Dynamic Attach
static int connect_socket(int pid) {
    int fd = socket(PF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }
    struct sockaddr_un addr;
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof(addr.sun_path), "%s/.java_pid%d", get_temp_directory(), pid);
    if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
This is a very ordinary Socket creation routine; it returns the connected Socket file descriptor.
Back in the main function, the flow then calls the write_command function to write the command line arguments into the Socket, and the read_response function to receive the data returned by the target JVM process. Both are ordinary Socket read/write functions; the source is as follows:
// async-profiler/src/jattach/jattach.c
// Send command with arguments to socket
static int write_command(int fd, int argc, char** argv) {
    // Protocol version
    if (write(fd, "1", 2) <= 0) {
        return 0;
    }
    // ... (followed by the command and its arguments, each NUL-terminated)
}
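For reference, the on-the-wire request that write_command produces can be sketched in Java. HotSpot's attach request is a sequence of NUL-terminated strings: the protocol version "1", the command name, then exactly three argument slots (unused slots are empty strings). This is a reconstruction from the code above and HotSpot's attach listener behavior, not the literal jattach code:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class AttachRequest {
    // Build the Dynamic Attach request: "1\0<cmd>\0<arg0>\0<arg1>\0<arg2>\0"
    static byte[] encode(String cmd, String... args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, "1");        // protocol version, matching write(fd, "1", 2)
        writeString(out, cmd);        // command, e.g. "load" or "threaddump"
        for (int i = 0; i < 3; i++) { // HotSpot expects exactly 3 argument slots
            writeString(out, i < args.length ? args[i] : "");
        }
        return out.toByteArray();
    }

    private static void writeString(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        out.write(0); // NUL terminator
    }

    public static void main(String[] args) {
        byte[] req = encode("load", "/path/to/libagent.so", "true");
        // Render NULs as '|' so the layout is visible
        System.out.println(new String(req, StandardCharsets.UTF_8).replace('\0', '|'));
    }
}
```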