How to understand and master JVM

2025-07-11 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article introduces how to understand and master the JVM. Many people run into these questions when working through real cases, so let's go through them step by step. Read carefully and you should come away with something useful.

1. Basic introduction of JVM

JVM is short for Java Virtual Machine. It is an abstract computer defined by a specification, realized by simulating the functions of a computer in software on a real machine.

Setting the formal definition aside, the JVM is like a small computer running on top of an operating system such as Windows or Linux. It interacts with the operating system rather than directly with the hardware; the operating system handles the hardware on its behalf.

1.1 how Java files are run

Suppose we have written HelloWorld.java. Setting everything else aside, it is just a text file, written with English keywords and some indentation.

The JVM cannot execute a text file, so a compiler (javac) is needed to turn it into HelloWorld.class, a binary bytecode file that the JVM can read.
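As a minimal sketch of this step, a source file and the commands that compile and run it might look like the following. The class body and the `greeting` helper are illustrative assumptions, not taken from the original article.

```java
// HelloWorld.java: plain text until the compiler turns it into bytecode.
// Compile: javac HelloWorld.java   (produces HelloWorld.class)
// Run:     java HelloWorld         (the JVM loads and executes HelloWorld.class)
class HelloWorld {
    // Hypothetical helper so the message can be reused or checked.
    static String greeting() {
        return "Hello, World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```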

① class loader

For the JVM to execute a .class file, the file must first be loaded by a class loader, which, like a porter, carries the .class files into the JVM.

② method area

The method area stores metadata-like information: class information, constants, static variables, compiled code, and so on.

When the class loader brings in a .class file, its contents land here first.

③ heap

The heap mainly stores data such as object instances and arrays. Like the method area, it is shared by all threads, which means data in both areas is not inherently thread-safe.

④ stack

The stack is where our code runs. Every method we write is executed on the stack.

You may also have heard of the native method stack and the native method interface. We will barely cover them here: both work at a lower level in C and are not written in Java.

⑤ program counter

The program counter works like a pointer to the next instruction to be executed. Like the stack, it is private to each thread: every thread has its own counter, so there are no concurrency or multithreading problems.

Small summary

The Java file is compiled into a .class bytecode file

The bytecode file is moved to the JVM virtual machine through the class loader

The virtual machine has five main areas. The method area and the heap are shared by all threads and therefore have thread-safety concerns; the stack, the native method stack, and the program counter are private to each thread and have none. JVM tuning mainly revolves around the heap and the stack.

1.2 simple code example

A simple student class

A main method
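The original listings for these two classes are not preserved here. A minimal sketch consistent with the steps that follow is given below; only the names App, Student, sayName() and the argument "tellUrDream" come from the text, while the field, the `getName` accessor, and the printed message are assumptions.

```java
// Student.java and App.java shown in one listing for brevity.
class Student {
    private final String name;

    Student(String name) {
        this.name = name;
    }

    // Hypothetical accessor, added only so the value can be inspected.
    String getName() {
        return name;
    }

    void sayName() {
        // The exact message is an assumption.
        System.out.println("student's name is: " + name);
    }
}

class App {
    public static void main(String[] args) {
        Student student = new Student("tellUrDream"); // first use triggers loading of Student
        student.sayName();                            // resolved through Student's method table
    }
}
```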

The steps to execute the main method are as follows:

After App.java is compiled into App.class and App is executed, the system starts a JVM process, finds the binary file App.class on the classpath, and loads App's class information into the method area of the runtime data area. This process is called loading the App class.

JVM finds the main program entry of App and executes the main method

The first statement in main is Student student = new Student("tellUrDream"), which asks the JVM to create a Student object. The method area does not yet hold any information about the Student class, so the JVM loads the Student class immediately and puts its class information into the method area.

After loading the Student class, the JVM allocates heap memory for a new Student instance and calls the constructor to initialize it. The instance holds a reference to the Student type information in the method area.

When student.sayName(); is executed, the JVM finds the Student object through the student reference, follows the reference held by the object to the method table in the Student type information in the method area, and obtains the bytecode address of sayName().

Execute sayName()

You do not need to memorize every detail. Just remember that creating an object requires the class information in the method area, that methods then run on the stack, and that methods are looked up through the method table.

2. Introduction of class loader

As mentioned earlier, the class loader is responsible for loading .class files, which carry a specific marker (a magic number) at the start of the file. It loads the bytecode of the class file into memory and converts it into runtime data structures in the method area. The ClassLoader is only responsible for loading the class file; whether the code can actually run is decided by the Execution Engine.
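The file marker mentioned above can be observed directly. The sketch below reads the first four bytes of a class file shipped with the JDK; the class and method names (`MagicNumber`, `readMagic`) are illustrative, and it assumes the resource can be found through the system class loader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class MagicNumber {
    // Reads the first four bytes of a class file visible to the system class loader.
    static int readMagic(String resourcePath) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IllegalArgumentException("class file not found: " + resourcePath);
            }
            // readInt() is big-endian, matching the class file format.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int magic = readMagic("java/lang/Object.class");
        System.out.println(Integer.toHexString(magic)); // cafebabe
    }
}
```

Every valid class file starts with these four bytes, which lets the loader reject files that are not class files before doing any further work.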

2.1 flow of class loaders

From the time a class is loaded into virtual machine memory until it is unloaded, its life cycle has seven stages: loading, verification, preparation, resolution, initialization, use, and unloading. Verification, preparation, and resolution are collectively called linking.

2.1.1 loading

Load the class file into memory

Convert static data structures to run-time data structures in the method area

Generate a java.lang.Class object representing this class in the heap as an entry for data access

2.1.2 Linking

Verification: ensures that the loaded class conforms to the JVM specification and is safe, so that its methods cannot do anything at run time that would harm the virtual machine. It is essentially a security check.

Preparation: allocates memory in the method area for static variables and sets them to their default initial values. For example, with static int a = 3, a is 0 after this phase. (Note: only static variables of the class, which live in the method area, are set in the preparation phase. Instance variables, which live in heap memory, are assigned when the object is initialized.)

Resolution: the process by which the virtual machine replaces symbolic references in the constant pool with direct references. (For example, the name java.util.ArrayList that I import is a symbolic reference, whereas a direct reference is a pointer or object address. Note that the referenced target must already be in memory.)

2.1.3 initialization

Initialization is essentially an assignment step that executes the class constructor method <clinit>(). The compiler automatically collects the assignments to all static variables in the class into this method. The static int a = 3 from the preparation phase is actually assigned 3 at this point.
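The difference between the default value set during preparation and the value assigned during initialization can be made visible with a small sketch. The class and member names here are illustrative; the example relies on static initializers running in textual order.

```java
class InitOrder {
    // Static initializers run in textual order inside <clinit>().
    static int seenBeforeAssignment = peek(); // runs while a still holds its default value
    static int a = 3;                         // assigned during initialization

    static int peek() {
        return a;
    }

    public static void main(String[] args) {
        System.out.println(seenBeforeAssignment); // 0
        System.out.println(a);                    // 3
    }
}
```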

2.1.4 Unloading

The GC removes classes that are no longer needed from memory

2.2 loading order of class loaders

Class loading is also prioritized. Starting from the lowest-level loader, the order of the class loaders is as follows:

Bootstrap ClassLoader: loads rt.jar

Extension ClassLoader: loads the extension jar packages

App ClassLoader: loads the jar packages on the specified classpath

Custom ClassLoader: user-defined class loaders
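The hierarchy listed above can be inspected at run time. This is a sketch with illustrative names (`LoaderChain`, `chainOf`); the concrete loader class names it prints depend on the JDK version (on JDK 9 and later the extension loader is replaced by a platform loader and rt.jar no longer exists).

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Walks from the given loader up through its parents.
    // The bootstrap loader is represented by null, so it is recorded as "bootstrap".
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // Application classes: app loader, then its parent(s), then bootstrap.
        System.out.println(chainOf(LoaderChain.class.getClassLoader()));
        // Core classes such as String are loaded by the bootstrap loader, reported as null.
        System.out.println(String.class.getClassLoader());
    }
}
```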

2.3 Parent delegation mechanism

When a class loader receives a load request, it does not try to load the class itself first; it delegates the request to its parent loader. For example, suppose we want to new a Person, where Person is our own class. The App ClassLoader first delegates the request upward, and only when the parent loaders report that they cannot complete it (that is, they did not find the class to be loaded) does the child loader try to load the class itself.

The advantage is that no matter which loader is asked for a class in rt.jar, the request is eventually delegated to the Bootstrap ClassLoader, which ensures that different class loaders end up with the same result.

In fact, this also has an isolation effect, preventing our code from affecting JDK code. For example, suppose I write:

package java.lang;

public class String {
    public static void main(String[] args) {
        System.out.println();
    }
}

Code like this is bound to fail with an error, because (assuming the class is declared in the java.lang package, as above) class loading actually finds the String.class in rt.jar, and that class has no main method.
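The delegation order described in this section can be observed with a small custom loader. This is a sketch, not a production loader: `TracingLoader`, `findClassCalls`, and `tryLoad` are hypothetical names, and the loader never defines any class itself. It relies only on the standard behavior of ClassLoader.loadClass(), which asks the parent first and calls findClass() only when the parent fails.

```java
class TracingLoader extends ClassLoader {
    int findClassCalls = 0; // how many times delegation fell through to this loader

    TracingLoader() {
        super(ClassLoader.getSystemClassLoader()); // parent = App ClassLoader
    }

    // The inherited loadClass() asks the parent chain first and calls findClass()
    // only when no parent could load the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        findClassCalls++;
        throw new ClassNotFoundException(name);
    }

    // Convenience wrapper that reports failure as null instead of a checked exception.
    Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        TracingLoader loader = new TracingLoader();
        System.out.println(loader.tryLoad("java.lang.String") == String.class); // true
        System.out.println(loader.findClassCalls);                              // 0
        System.out.println(loader.tryLoad("no.such.Type"));                     // null
        System.out.println(loader.findClassCalls);                              // 1
    }
}
```

The request for java.lang.String never reaches findClass(), which is the isolation effect: a child loader gets no chance to substitute its own version of a JDK class.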

3. Runtime data area

3.1 Native method stack and program counter

For example, if we open the source code of the Thread class, we see that its start0 method is marked with the native keyword and has no method body. A method marked native is a native method, implemented in C, and such methods run in an area called the native method stack.

The program counter is essentially a pointer to the next instruction the program needs to execute. It is the only memory area in which OutOfMemoryError cannot occur, and its memory footprint is small enough to be negligible. It acts as the line number indicator of the bytecode being executed by the current thread, and the bytecode interpreter selects the next bytecode instruction by changing the value of this counter.

If a native method is being executed, the value of the counter is undefined.

3.2 method area

The main job of the method area is to store class metadata, constants, and static variables. When it holds too much and can no longer satisfy a memory allocation, an error is reported.

3.3 Virtual machine stack and virtual machine heap

In a word: the stack handles execution, the heap handles storage. The virtual machine stack is responsible for running code, and the virtual machine heap is responsible for storing data.

3.3.1 the concept of virtual machine stack

It is the memory model for executing Java methods. It stores local variable tables, operand stacks (push and pop), dynamic links, and method exits, and it is private to each thread. When people mention the local variable table, they are also talking about the virtual machine stack.

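public class Person {
    int a = 1; // instance field: stored with the object on the heap

    public void doSomething() {
        int b = 2; // local variable: stored in the local variable table of the stack frame
    }
}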

3.3.2 exceptions in the virtual machine stack

If the stack depth requested by a thread exceeds the maximum depth of the virtual machine stack, a StackOverflowError is thrown (this often happens with recursion). The virtual machine stack can also be extended dynamically, but if extension keeps requesting memory and not enough can be obtained, an OutOfMemoryError is thrown.
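The recursion case is easy to reproduce. The sketch below (illustrative names) recurses without a base case and counts how many frames fit before the error; the number varies with the JVM, the platform, and the thread stack size set with -Xss.

```java
class StackDepth {
    private static int depth;

    // Recursion with no base case: every call pushes another stack frame.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before the stack overflowed.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Expected: the thread's stack is exhausted.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measure());
    }
}
```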

3.3.3 Life cycle of virtual machine stack

Stacks are not garbage collected. When a method finishes, its space on the stack is released naturally. The life cycle of a stack is the same as that of its thread.

A note: local variables of the 8 primitive types, local object reference variables, and the frames of instance method calls are all allocated on the stack.

3.3.4 execution of virtual machine stack

We often talk about stack frames. A stack frame is the JVM's term; in Java terms it corresponds to a method invocation, and it is stored on the stack.

Data in the stack exists in the form of stack frames, each of which is a data set for a method and its runtime data. For example, when method a is executed, a stack frame A1 is created and pushed onto the stack. If a calls method b, a frame B1 is pushed, and if b calls method c, a frame C1 is pushed. As the methods return, C1 is popped first, followed by B1 and A1. The stack follows the first-in-last-out, last-in-first-out principle.
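The frame order can be checked from inside the innermost method. In this sketch the methods a, b, and c mirror the example above, while the class name `FrameOrder` and the use of a Throwable to capture the stack trace are illustrative choices.

```java
class FrameOrder {
    static String[] a() { return b(); }
    static String[] b() { return c(); }

    // Captures the names of the three innermost frames, top of the stack first.
    static String[] c() {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        return new String[] {
            frames[0].getMethodName(),
            frames[1].getMethodName(),
            frames[2].getMethodName()
        };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" <- ", a())); // c <- b <- a
    }
}
```

The frame pushed last (c) is on top and is the first to be popped when it returns.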

3.3.5 reuse of local variables

The local variable table stores the method parameters and the local variables defined inside the method. Its capacity is measured in slots, the smallest unit; one slot can hold a value of up to 32 bits.

The virtual machine accesses the local variable table by index, in the range [0, number of slots in the local variable table). Method parameters are placed in the table in a fixed order, which we do not need to worry about. To save stack frame space, slots can be reused: once execution passes beyond a variable's scope, its slot can be reused by other variables. Slot reuse also affects garbage collection: as long as a slot has not been overwritten, the object it references is still treated as reachable and will not be reclaimed.

3.3.6 the concept of virtual machine heap

JVM memory is divided into heap memory and non-heap memory. Heap memory is divided into the young generation and the old generation, and non-heap memory is the permanent generation. The young generation is further divided into the Eden and Survivor regions. Survivor is divided into FromPlace and ToPlace, and the ToPlace survivor area is kept empty. The default ratio of Eden, FromPlace, and ToPlace is 8:1:1. This ratio can also be adjusted dynamically according to the object allocation rate with the -XX:+UsePSAdaptiveSurvivorSizePolicy parameter.

Objects are stored in heap memory, and garbage collection means collecting these objects with a GC algorithm. Non-heap memory, as already mentioned, is the method area. The permanent generation was removed in 1.8 and replaced by Metaspace. The biggest difference is that Metaspace is not located in JVM-managed memory; it uses native memory. It has two parameters:

MetaspaceSize: the initial Metaspace size, which controls when GC is triggered

MaxMetaspaceSize: the upper limit of the Metaspace size, which prevents excessive use of physical memory

A rough idea of why it was removed: the change was made as part of merging HotSpot JVM and JRockit VM, because JRockit had no permanent generation. It also indirectly solved the OOM problem of the permanent generation.
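The heap and non-heap split described above can be listed for a running JVM through the standard management API. This is a sketch with illustrative names (`MemoryPools`, `count`); the pool names it prints (for example an Eden, Survivor, or Metaspace pool) depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

class MemoryPools {
    // Counts the memory pools of the given type reported by the running JVM.
    static long count(MemoryType type) {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(pool -> pool.getType() == type)
                .count();
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Each pool is reported as either heap or non-heap memory.
            System.out.println(pool.getType() + ": " + pool.getName());
        }
    }
}
```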

3.3.7 introduction to Eden in the young generation

When we new an object, a piece of memory in Eden is first set aside as its storage. But heap memory is shared by threads, so two objects could end up claiming the same memory. The JVM handles this by having each thread reserve a contiguous block of memory in advance and place its objects there; if the block runs out, the thread requests another one. This mechanism is called TLAB (thread-local allocation buffer); look it up if you are interested.

When the Eden space is full, a Minor GC (a GC that occurs in the young generation) is triggered, and the surviving objects are moved to the Survivor0 area. When the Survivor0 area fills up and Minor GC is triggered again, the surviving objects are moved to the Survivor1 area and the from and to pointers are swapped, which guarantees that one survivor area, the one pointed to by to, is always empty. Objects that are still alive after multiple Minor GCs move to the old generation. (The survival threshold here is 15, and the corresponding JVM parameter is -XX:MaxTenuringThreshold. Why 15? Because HotSpot records the age in the object header, and only 4 bits are allocated for it, so the age can be recorded only up to 15.) The old generation stores long-lived objects, and when it is full it triggers the most commonly heard-of Full GC, during which all application threads stop and wait for the GC to complete. Therefore, for applications with strict response-time requirements, we should try to reduce Full GCs to avoid response timeouts.

Moreover, if the old generation still cannot hold the objects after a Full GC, an OOM occurs: the heap memory in the virtual machine is insufficient. The cause may be that the heap is configured too small, which can be adjusted with the -Xms and -Xmx parameters. It may also be that the code creates many large objects that stay referenced for a long time, so the garbage collector cannot reclaim them.
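Before changing heap settings it helps to check what the running JVM actually uses. A minimal sketch with illustrative method names follows; it relies only on java.lang.Runtime, and the values it prints depend on the machine and on the heap options passed at startup (the maximum corresponds to -Xmx).

```java
class HeapLimits {
    // Upper bound the heap may grow to.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the JVM.
    static long currentHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println("max heap (MB):     " + maxHeapBytes() / mb);
        System.out.println("current heap (MB): " + currentHeapBytes() / mb);
        System.out.println("free in heap (MB): " + Runtime.getRuntime().freeMemory() / mb);
    }
}
```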

3.3.8 how to determine that an object needs to be killed

In the figure, the program counter, the virtual machine stack, and the native method stack live and die with their thread. Their memory allocation and reclamation are deterministic: the memory is reclaimed naturally when the thread ends, so garbage collection need not be considered. The Java heap and the method area are different: they are shared by all threads, and their memory allocation and reclamation are dynamic. So the garbage collector focuses on the heap and the method area.

Before reclaiming memory, it is necessary to determine which objects are still alive and which are dead. Here are two basic methods:

1. Reference counting: add a reference counter to the object; increment it each time the object is referenced and decrement it when a reference expires. When the counter equals 0, the object can no longer be used. However, when objects reference each other in a cycle, their counters never reach 0 and the GC cannot reclaim them.

2. Reachability analysis: this works like a graph traversal. It takes a series of GC Roots as the starting set of live objects and searches downward from these nodes; the paths traversed are called reference chains, and every object reachable from the set is added to it. When an object has no reference chain connecting it to any GC Root, the object is unusable. Mainstream commercial programming languages such as Java and C# rely on this approach to determine whether an object is alive.

(For reference only) The objects that can serve as GC Roots in Java fall into the following categories:

Objects referenced in the virtual machine stack (the local variable table in stack frames), that is, by local variables

Objects referenced by static variables in the method area

Objects referenced by constants in the method area

Objects referenced by JNI in the native method stack (that is, by native methods). JNI is the way the Java virtual machine calls the corresponding C functions, and new Java objects can also be created through JNI functions. Both local and global JNI references mark the objects they point to as non-reclaimable.

Java threads that have been started and not terminated

The advantage of this method is that it solves the circular reference problem, but its implementation costs considerable resources and time, and it requires a GC pause (reference relationships must not change during the analysis, so all application threads have to be stopped).
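Why reachability analysis handles cycles can be shown on a toy object graph. This is a simplified sketch, not how HotSpot implements marking; `Node`, `mark`, and `cycleIsCollected` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Reachability {
    // A toy object: a name plus outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Marks everything reachable from the roots by following reference chains.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>(roots);
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            for (Node next : pending.pop().refs) {
                if (live.add(next)) { // true only the first time a node is seen
                    pending.push(next);
                }
            }
        }
        return live;
    }

    // Builds root -> a, plus a cycle x <-> y that no root reaches.
    static boolean cycleIsCollected() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node x = new Node("x");
        Node y = new Node("y");
        root.refs.add(a);
        x.refs.add(y);
        y.refs.add(x); // with reference counting, x and y would each keep a count of 1
        Set<Node> live = mark(Arrays.asList(root));
        return live.contains(a) && !live.contains(x) && !live.contains(y);
    }

    public static void main(String[] args) {
        System.out.println(cycleIsCollected()); // true
    }
}
```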

3.3.9 how to declare the true death of an object

The first thing to mention is a method called finalize()

finalize() is a method of the Object class. An object's finalize() method is called automatically by the system at most once; an object that escapes death through finalize() will not have it called a second time.

A note: calling finalize() to let an object save itself is not recommended; it is best to forget that this method exists in Java programs. Its execution time is uncertain, and it may not be executed at all (for example, when the Java program exits abnormally). It is expensive to run, and there is no guarantee of the order in which objects are finalized (or even of the thread that runs them). It has been marked as deprecated since Java 9 and is gradually being replaced by java.lang.ref.Cleaner (which is built on phantom references), a more lightweight and reliable mechanism than finalize.
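A minimal sketch of the Cleaner-based alternative follows, assuming Java 9 or later. The class names (`TempResource`, `Release`) and the boolean flag standing in for a real resource are illustrative. The cleanup action deliberately holds no reference to the object it cleans up after, otherwise the object could never become unreachable.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

class TempResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action: it only sees the state it must release.
    private static final class Release implements Runnable {
        private final AtomicBoolean released;
        Release(AtomicBoolean released) { this.released = released; }
        @Override public void run() { released.set(true); }
    }

    private final AtomicBoolean released = new AtomicBoolean(false);
    private final Cleaner.Cleanable cleanable;

    TempResource() {
        // If close() is never called, the Cleaner runs the action
        // some time after this object becomes unreachable.
        cleanable = CLEANER.register(this, new Release(released));
    }

    boolean isReleased() {
        return released.get();
    }

    // Deterministic path: clean() runs the action at most once.
    @Override
    public void close() {
        cleanable.clean();
    }

    public static void main(String[] args) {
        TempResource resource = new TempResource();
        System.out.println(resource.isReleased()); // false
        resource.close();
        System.out.println(resource.isReleased()); // true
    }
}
```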

It takes at least two marking passes to declare an object dead.

If reachability analysis finds no reference chain linking the object to the GC Roots, the object is marked for the first time and then filtered. The filter condition is whether the object needs to execute the finalize() method. If it does, the object is placed in the F-Queue.

The GC then marks the objects in the F-Queue a second time. If an object re-associates itself with any object on a reference chain in its finalize() method, it is removed from the "about to be reclaimed" set during the second marking. If the object has not escaped by then, it is reclaimed.

Once we are sure an object is dead, how is the garbage reclaimed?

3.4 garbage collection algorithm

We will not go into great detail. The commonly used algorithms are mark-sweep, copying, mark-compact, and generational collection.

3.4.1 Mark-sweep algorithm

The mark-sweep algorithm has two phases: "mark" and "sweep". It marks all the objects that need to be reclaimed and reclaims them together once marking is complete. The approach is very simple but has shortcomings, and the later algorithms are improvements built on it.

In practice, it marks the space of dead objects as free memory and records it in a free list. When we need to new an object, the memory management module finds free memory in the free list and allocates it to the new object.

The drawbacks are that both marking and sweeping are relatively inefficient, and that sweeping leaves a lot of fragmentation in memory. As a result, when a larger block of memory is needed, there may be no contiguous region big enough. For example, see the following picture

At this point the available memory blocks are scattered, which causes the large-object problem just mentioned.

3.4.2 Copying algorithm

To solve the efficiency problem, the copying algorithm appeared. It divides the available memory into two equal halves and uses only one at a time. As with the survivor areas, it works with from and to pointers: when fromPlace is full, the surviving objects are copied to toPlace and the two pointers are swapped. This eliminates fragmentation.

The cost of this algorithm is that usable memory is halved, so heap utilization becomes very poor.

In practice, however, the two parts are not split 1:1, for the same reason that Eden and Survivor are not equal in size.

3.4.3 Mark-compact algorithm

The copying algorithm becomes inefficient when the object survival rate is high. In mark-compact, the marking phase is the same as in the "mark-sweep" algorithm, but the next step is not to reclaim the dead objects in place. Instead, all surviving objects are moved toward one end, and the memory beyond the boundary is then cleared directly.
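The difference between sweeping in place and compacting can be illustrated on a toy heap of eight cells. This is a simplified model with hypothetical names (`ToyHeap`, `sweep`, `compact`, `largestFreeBlock`), not the collector's real data structures: each cell holds one object, and the marked array says which objects survived marking.

```java
import java.util.Arrays;

class ToyHeap {
    static final char FREE = '.';

    // Mark-sweep: free every unmarked cell in place; live objects stay where they are.
    static char[] sweep(char[] heap, boolean[] marked) {
        char[] out = heap.clone();
        for (int i = 0; i < out.length; i++) {
            if (!marked[i]) {
                out[i] = FREE;
            }
        }
        return out;
    }

    // Mark-compact: slide the marked cells toward the start; everything after them is free.
    static char[] compact(char[] heap, boolean[] marked) {
        char[] out = new char[heap.length];
        Arrays.fill(out, FREE);
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (marked[i]) {
                out[next++] = heap[i];
            }
        }
        return out;
    }

    // Length of the longest run of free cells, i.e. the largest block that can be allocated.
    static int largestFreeBlock(char[] heap) {
        int best = 0;
        int run = 0;
        for (char cell : heap) {
            run = (cell == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        char[] heap = "ABCDEFGH".toCharArray();
        boolean[] marked = {true, false, true, false, true, false, true, false};
        System.out.println(new String(sweep(heap, marked)));   // A.C.E.G.
        System.out.println(new String(compact(heap, marked))); // ACEG....
    }
}
```

Both results have four free cells, but after sweeping the largest free block is a single cell, while after compaction all four are contiguous.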

3.4.4 Generational collection algorithm

This algorithm has no new idea of its own; it simply divides memory into several regions according to object lifetimes. Typically the Java heap is divided into the young generation and the old generation, so that the most suitable collection algorithm can be used for each. In the young generation, every collection finds that most objects have died and only a few survive, so the copying algorithm is chosen: collection completes at the small cost of copying the few survivors. In the old generation, objects have a high survival rate and there is no extra space to act as an allocation guarantee, so the "mark-sweep" or "mark-compact" algorithm must be used.

To put it bluntly, each algorithm plays to its own strengths, and each specific problem calls for its own analysis.

3.5 (understand) various garbage collectors

Garbage collector in HotSpot VM and applicable scenarios

Up to and including JDK 8, the default garbage collectors are Parallel Scavenge and Parallel Old

Starting with JDK 9, G1 is the default garbage collector

Among these collectors, G1 has the shortest pause times and no obvious shortcomings, which makes it well suited to Web applications. In a test of a Web application on JDK 8 with a 6 GB heap and a 4.5 GB young generation, Parallel Scavenge paused for as long as 1.5 s to collect the young generation, whereas G1 paused for only 0.2 s to collect a young generation of the same size.
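To check which collectors a given JVM is actually running with, the standard management API can be queried. This is a sketch with illustrative names (`ActiveCollectors`, `names`); the reported names depend on the JVM and its options, so no particular output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Returns the names of the collectors registered in the running JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically one entry for the young generation and one for the old generation.
        System.out.println(names());
    }
}
```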

3.6 (understand) the common parameters of JVM

The JVM has a great many parameters; only a few of the more important ones are listed here. The rest can be found easily through any search engine.

The meaning of JVM parameter

There are also some logging and CMS parameters, which I won't list one by one here.

4. Some aspects of JVM tuning

Based on the JVM knowledge covered above, we can try tuning the JVM, mainly the heap memory.

Size of the data area shared by all threads = young generation size + old generation size + permanent generation size. The permanent generation usually has a fixed size of 64m. So enlarging the young generation within the Java heap shrinks the old generation (and because the old generation is cleaned by Full GC, an old generation that is too small leads to more Full GCs). This value has a great impact on system performance, and Sun officially recommends configuring it as 3/8 of the Java heap.
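As a worked example of the 3/8 guideline quoted above, the arithmetic can be sketched as follows. The 4096 MB heap and the names `YoungGenSizing` and `recommendedYoungMb` are hypothetical; -Xmn is the option that sets the young generation size.

```java
class YoungGenSizing {
    // Applies the 3/8 rule of thumb to a heap size given in megabytes.
    static long recommendedYoungMb(long heapMb) {
        return heapMb * 3 / 8;
    }

    public static void main(String[] args) {
        long heapMb = 4096; // hypothetical heap size, e.g. -Xmx4096m
        long youngMb = recommendedYoungMb(heapMb);
        System.out.println("-Xmn" + youngMb + "m"); // -Xmn1536m
    }
}
```

Treat the result as a starting point to validate against GC logs rather than a fixed rule.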

4.1 adjust maximum and minimum heap memory

-Xmx -Xms: specifies the maximum value of the java heap (the default is 1/4 of physical memory (
