This article explains the principles of the Java garbage collection mechanism. The material is meant to be simple, practical, and quick to read; interested readers may wish to follow along.
Java garbage collection mechanism
1. Garbage collection mainly focuses on Java heap
(Figure from the book "Code Out Efficiency".)
The program counter, virtual machine stack, and native method stack in the Java runtime data areas are created and destroyed together with their thread, and the stack frames on the stack are pushed and popped in an orderly way as methods are entered and exited. How much memory each stack frame needs is largely known once the class structure is determined (although the JIT compiler makes some optimizations at run time), so memory allocation and reclamation in these areas are deterministic; there is no need to think much about recycling, because when the method or the thread ends, the memory is reclaimed with it.
The Java heap is different: different implementation classes of an interface may need different amounts of memory, and different branches of a method may need different amounts of memory. Only at run time do we know which objects will be created. Allocation and reclamation of this part of memory are dynamic, and this is the memory the garbage collector focuses on.
2. Determining which objects need to be recycled
There are two ways:
The reference counting method adds a reference counter to each object; the counter is incremented by 1 whenever a reference to the object is added and decremented by 1 when a reference becomes invalid. When the counter drops to 0, the object can no longer be used. It is simple and efficient; the disadvantage is that it cannot handle circular references between objects (see the sketch at the end of this section).
The reachability analysis algorithm uses a set of objects called "GC Roots" as starting points and searches downward from these nodes; the path traversed is called a reference chain (Reference Chain). When an object is not connected to any GC Root by a reference chain, the object is proven unreachable. This algorithm solves the circular-reference problem mentioned above.
In the Java language, the objects that can be used as GC Roots include: a. objects referenced in the virtual machine stack (the local variable table in stack frames); b. objects referenced by class static properties in the method area; c. objects referenced by constants in the method area; d. objects referenced by JNI (native methods) in the native method stack.
GC Roots nodes are mainly found in global references and execution contexts. To be clear, a tracing GC must start from a set of objects that are certainly alive, so only reference-holding locations whose liveness is guaranteed can be chosen as roots.
The areas managed by the GC are the Java heap and the method area; the virtual machine stack and the native method stack are not managed by the GC, so objects referenced from these unmanaged areas are chosen as GC Roots precisely because those references themselves will not be reclaimed by the GC.
Both the virtual machine stack and the native method stack are thread-private memory areas; as long as the thread has not terminated, the objects they reference can be assumed to be alive. In the method area, objects referenced by class static properties are clearly alive, and objects referenced by constants may also be alive, so both can serve as part of the GC Roots.
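As a hedged illustration of the circular-reference problem mentioned above (the class and field names are invented for the example): after the two assignments below, objA and objB reference each other, so a pure reference-counting collector could never reclaim them, whereas reachability analysis sees that neither is connected to a GC Root and can reclaim both.

public class ReferenceCountingGC {
    public Object instance = null;

    public static void main(String[] args) {
        ReferenceCountingGC objA = new ReferenceCountingGC();
        ReferenceCountingGC objB = new ReferenceCountingGC();
        objA.instance = objB;     // A holds B
        objB.instance = objA;     // B holds A: a reference cycle

        objA = null;              // drop the only external references;
        objB = null;              // the cycle keeps both counters above zero

        System.gc();              // on HotSpot (reachability analysis) both objects can be reclaimed
    }
}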
3. Strong, soft, weak, and phantom references
Before JDK1.2, an object had only two states: referenced and unreferenced.
Later, Java expanded the concept of references and divided them into four categories: strong reference (Strong Reference), soft reference (Soft Reference), weak reference (Weak Reference), and phantom reference (Phantom Reference).
A strong reference is the kind of reference that is ubiquitous in program code, such as "Object obj = new Object()"; the garbage collector will never reclaim an object that is still reachable through a strong reference.
Soft references describe objects that are useful but not essential. Before the system is about to throw a memory-overflow exception, these objects are included in the collection scope for a second round of collection.
Weak references also describe non-essential objects, but objects associated only with weak references survive only until the next garbage collection. When the garbage collector runs, objects reachable only through weak references are reclaimed regardless of whether memory is sufficient.
A phantom reference is the weakest kind of reference relationship. An object instance cannot be obtained through a phantom reference; the only purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector.
(Figure from the book "Code Out Efficiency".)
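As a hedged illustration of the last three reference types (java.lang.ref.SoftReference, WeakReference, and PhantomReference are standard-library classes; the surrounding test code is only a sketch, and GC behavior is not guaranteed to be observable on every JVM):

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceDemo {
    public static void main(String[] args) {
        Object strong = new Object();                                        // strong reference: never reclaimed while reachable
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);    // reclaimed only when memory is tight
        WeakReference<Object> weak = new WeakReference<>(new Object());      // reclaimed at the next GC
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue); // get() always returns null

        System.gc();                                       // only a hint; the JVM may ignore it
        System.out.println("soft:    " + soft.get());      // usually still non-null
        System.out.println("weak:    " + weak.get());      // usually null after the GC
        System.out.println("phantom: " + phantom.get());   // always null by definition
    }
}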
4. Reachability analysis algorithm
An unreachable object will be temporarily in the "probation" stage, and it takes at least two marking processes to actually declare an object dead:
If an object has no reference chain connecting it to GC Roots after reachability analysis, it is marked for the first time and screened once; the screening criterion is whether it is necessary for this object to execute its finalize() method.
If the object does not override finalize(), or finalize() has already been called by the virtual machine once, both cases are treated as "not necessary to execute", and the object goes straight to the second marking (and will be reclaimed).
If the object is determined to be necessary to execute the finalize () method, the object will be placed in a queue called F-Queue and later executed by a low-priority Finalizer thread automatically created by the virtual machine.
The so-called "execution" here means that the virtual machine triggers this method, but does not promise to wait for it to finish running, because if an object executes slowly in the finalize () method, it is likely to block the F-Queue queue all the time, or even cause the entire memory collection system to crash, and test the program:
public class FinalizerTest {
    public static FinalizerTest object;

    public void isAlive() {
        System.out.println("I'm alive");
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("method finalize is running");
        object = this;
    }

    public static void main(String[] args) throws Exception {
        object = new FinalizerTest();

        // First time: the finalize() method saves the object
        object = null;
        System.gc();
        Thread.sleep(500);   // finalize() runs on a low-priority thread, so give it time
        if (object != null) {
            object.isAlive();
        } else {
            System.out.println("I'm dead");
        }

        // Second time: finalize() has already been executed once, so it will not run again
        object = null;
        System.gc();
        Thread.sleep(500);
        if (object != null) {
            object.isAlive();
        } else {
            System.out.println("I'm dead");
        }
    }
}
(The example above is quoted from "Java GC".)
The output is as follows:
method finalize is running
I'm alive
I'm dead
If you do not override finalize (), the output will be:
I'm dead
I'm dead
As can be seen from the execution results:
The first time GC occurs, the finalize() method does execute and the object successfully escapes before being reclaimed; the second time GC occurs, because finalize() is called at most once by the JVM for any object, object is reclaimed.
It is worth noting that "saving" objects with the finalize() method is not advocated: it is expensive, its execution time is uncertain, and there is no guarantee of the order in which each object's finalize() is called. Everything finalize() can do is done more appropriately and more promptly with try-finally or other means.
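For example, releasing a resource deterministically with try-finally, or the equivalent try-with-resources form available since JDK 7, is the usual replacement; this is only a minimal sketch, and "data.txt" is a placeholder path:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CleanupDemo {
    public static void main(String[] args) throws IOException {
        // try-with-resources: close() is called automatically and deterministically,
        // unlike finalize(), whose execution time and order are not guaranteed.
        try (InputStream in = new FileInputStream("data.txt")) {   // "data.txt" is a placeholder
            System.out.println("first byte: " + in.read());
        }

        // The equivalent explicit try-finally form:
        InputStream in = new FileInputStream("data.txt");
        try {
            System.out.println("first byte: " + in.read());
        } finally {
            in.close();   // always runs, even if read() throws
        }
    }
}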
5. Reclaiming the permanent generation (method area)
There are two main parts of garbage collection in permanent generation: obsolete constants and useless classes.
Recycling obsolete constants is very similar to recycling objects in the Java heap. Take reclaiming literals in the constant pool as an example: if the string "abc" has entered the constant pool, but no String object named "abc" exists anywhere in the current system and the literal is not referenced anywhere else, then if memory reclamation occurs at this point and it is deemed necessary, the "abc" constant will be cleaned out of the constant pool. Symbolic references to classes (interfaces), methods, and fields in the constant pool are handled similarly.
A class needs to meet the following three conditions to be considered a "useless class":
All instances of the class have been recycled, that is, no instances of the class exist in the Java heap.
The ClassLoader that loaded the class has been recycled.
The corresponding java.lang.Class object of this class is not referenced anywhere, and its methods cannot be accessed anywhere through reflection.
A virtual machine may reclaim a useless class that meets all three conditions above. Note that "may" is the operative word: reclamation is merely permitted, unlike objects, which are necessarily reclaimed once they are no longer used.
In scenarios that make heavy use of reflection, dynamic proxies, bytecode frameworks such as CGLib, dynamically generated JSPs, and frequently customized ClassLoaders such as OSGi, the virtual machine needs the ability to unload classes to ensure that the permanent generation does not overflow.
Garbage collection algorithm
There are four types:
Mark-sweep algorithm
Copying algorithm
Mark-compact algorithm
Generational collection algorithm
1. Mark-sweep algorithm
The most basic collection algorithm is the "mark-sweep" (Mark-Sweep) algorithm, which has two phases: first all the objects that need to be reclaimed are marked, and after marking is completed all marked objects are reclaimed in one pass.
It has two main shortcomings:
In terms of efficiency, neither the marking nor the sweeping process is particularly efficient.
In terms of space, a large number of discontinuous memory fragments are left behind after mark-sweep; too much fragmentation may mean that, when the program later needs to allocate a large object, it cannot find enough contiguous memory and has to trigger another garbage collection ahead of time.
The execution process of the mark-sweep algorithm is shown in the following figure (a toy code sketch of the two phases follows as well).
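To make the two phases concrete, here is a minimal, hedged sketch of mark-sweep over a toy object graph; the Node and ToyMarkSweep classes are invented for illustration and have nothing to do with the JVM's real data structures:

import java.util.ArrayList;
import java.util.List;

class Node {
    boolean marked;
    List<Node> refs = new ArrayList<>();   // outgoing references
}

public class ToyMarkSweep {
    // Mark phase: depth-first traversal starting from the roots.
    static void mark(Node node) {
        if (node == null || node.marked) return;
        node.marked = true;
        for (Node ref : node.refs) mark(ref);
    }

    // Sweep phase: drop every unmarked object, then reset marks on the survivors.
    static void sweep(List<Node> heap) {
        heap.removeIf(n -> !n.marked);
        for (Node n : heap) n.marked = false;
    }

    public static void main(String[] args) {
        List<Node> heap = new ArrayList<>();
        Node a = new Node(), b = new Node(), c = new Node();
        heap.add(a); heap.add(b); heap.add(c);
        a.refs.add(b);                        // a -> b is reachable, c is garbage

        List<Node> roots = List.of(a);        // pretend a is referenced from a GC Root
        for (Node root : roots) mark(root);
        sweep(heap);
        System.out.println("surviving objects: " + heap.size());   // prints 2
    }
}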
2. Copying algorithm
To solve the efficiency problem, a collection algorithm called "copying" (Copying) appeared. It divides the available memory into two equal-sized blocks and uses only one of them at a time. When that block is used up, the surviving objects are copied onto the other block, and then the used block is cleaned up in one pass.
In this way, half of the memory area is reclaimed each time, and complications such as memory fragmentation do not need to be considered during allocation: just move the pointer at the top of the used area and allocate sequentially, which is simple and efficient. The cost of this algorithm is that the usable memory is cut in half. The execution process of the copying algorithm is shown in the following figure:
Today's commercial virtual machines use this algorithm to collect the young generation. IBM research showed that 98% of the objects in the young generation "die young", so there is no need to divide the memory 1:1. Instead, the memory is divided into one larger Eden space and two smaller Survivor spaces, and each allocation uses Eden plus one of the Survivor spaces.
When collecting, the surviving objects in Eden and the in-use Survivor are copied to the other Survivor space in one pass, and then Eden and the Survivor space that was just used are cleaned up. The HotSpot virtual machine defaults to Eden:Survivor = 8:1, which means the usable memory of the young generation is 90% of its total capacity (Eden plus one Survivor), and only 10% is "wasted".
Of course, "98% of objects are collectable" is only statistics from typical scenarios; there is no guarantee that no more than 10% of objects will survive each collection. When Survivor space is not enough, other memory (here, the old generation) is needed for a handling promotion (Handle Promotion, the allocation guarantee).
The memory allocation guarantee is like borrowing money from a bank: if our credit is good and we repay on time in 98% of cases, the bank may assume by default that we will repay on time next time, too. As long as there is a guarantor who promises that, if I fail to repay, the money can be deducted from his account, the bank considers there to be no risk.
The same is true of the memory allocation guarantee: if the other Survivor space does not have enough room to hold the surviving objects of the last young-generation collection, these objects go directly into the old generation via the allocation guarantee mechanism.
The allocation guarantee for the young generation is explained in more detail later, when the garbage collectors' execution rules are covered.
3. Mark-compact algorithm
The copying algorithm performs more copy operations when the object survival rate is high, so its efficiency drops. More crucially, if you do not want to waste 50% of the space, you need extra space for allocation guarantees to handle the extreme case in which 100% of the objects in the used memory survive, so this algorithm generally cannot be used directly in the old generation.
Based on the characteristics of the old generation, another algorithm, "mark-compact" (Mark-Compact), was proposed. The marking process is still the same as in "mark-sweep", but the next step does not clean up the reclaimable objects directly; instead, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleaned up directly. A schematic of the "mark-compact" algorithm is as follows:
4. Generational collection algorithm
Current commercial virtual machines all adopt the "generational collection" (Generational Collection) algorithm for garbage collection: the memory is divided into several blocks, and different garbage collection algorithms are used according to the different lifetimes of objects.
Generally, the Java heap is divided into the young generation and the old generation, so that the most appropriate collection algorithm can be used according to the characteristics of each generation.
In the young generation, each garbage collection finds that a large number of objects die and only a few survive, so the copying algorithm is chosen, and collection is completed at the cost of copying only the small number of surviving objects. In the old generation, because objects have a high survival rate and there is no extra space to provide an allocation guarantee, the "mark-sweep" or "mark-compact" algorithm must be used.
Algorithm implementation of HotSpot
1. Enumerating root nodes
Take the operation of finding reference chains from GC Roots in reachability analysis as an example: the nodes that can serve as GC Roots are mainly in global references (such as constants or class static attributes) and execution contexts (such as local variables in stack frames). Many applications now have hundreds of megabytes in the method area alone, so checking the references one by one would inevitably consume a lot of time.
In addition, the sensitivity of reachability analysis to execution time shows up in GC pauses, because the analysis must run in a snapshot where object reference relationships do not change; otherwise the accuracy of the result cannot be guaranteed. This is one important reason why all running Java threads must be halted while GC is in progress (Sun calls this "Stop The World"); even the CMS collector, which claims to (almost) never pause, must stop the threads when enumerating root nodes.
As a result, today's mainstream Java virtual machines all use exact GC (that is, the virtual machine knows exactly what type of data sits at each memory location). When the execution system halts, there is no need to check all execution contexts and global reference locations one by one; the virtual machine has a way of knowing directly where object references are stored.
In the HotSpot implementation, a set of data structures called OopMap is used for this purpose. When a class is loaded, HotSpot calculates what type of data sits at what offset inside an object, and during JIT compilation it also records, for specific positions, which stack slots and registers hold references, so the GC can obtain this information directly while scanning.
2. Safepoints (Safepoint)
With the help of OopMap, HotSpot can complete a GC Roots enumeration quickly and accurately, but a real problem follows: the instructions that may change reference relationships, in other words the instructions that change the content of OopMap, are very numerous. If a corresponding OopMap were generated for every instruction, a great deal of extra space would be needed, and the space cost of GC would become very high.
In fact, HotSpot does not generate an OopMap for every instruction; as mentioned earlier, this information is only recorded at specific locations, called safepoints. That is, the program cannot pause for GC at arbitrary points during execution, but only when it reaches a safepoint.
Safepoints must be selected neither so sparsely that the GC waits too long, nor so frequently that the runtime load increases excessively.
For safepoints, another issue to consider is how to get all threads to "run" to the nearest safepoint and then stop when a GC occurs. There are two options: preemptive suspension (Preemptive Suspension) and voluntary suspension (Voluntary Suspension).
With preemptive suspension, the threads' executing code does not need to cooperate actively: when GC occurs, all threads are first interrupted, and if a thread is found to be stopped somewhere other than a safepoint, it is resumed and allowed to "run" to a safepoint. Few virtual machine implementations now use preemptive suspension to pause threads in response to GC events.
The idea of voluntary suspension is that when the GC needs to interrupt threads, it does not operate on the threads directly; it simply sets a flag, and each thread actively polls this flag as it executes and suspends itself when it finds the flag set. The polling points coincide with the safepoints, plus the places where memory needs to be allocated to create objects.
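Purely as an analogy in application-level Java, and not a description of how the JVM implements its safepoint polling internally (PollingFlagDemo and the flag name are invented), the voluntary-suspension idea can be sketched like this:

import java.util.concurrent.atomic.AtomicBoolean;

public class PollingFlagDemo {
    static final AtomicBoolean suspendRequested = new AtomicBoolean(false);   // the "flag"

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                doSomeWork();
                // "Safepoint": the thread polls the flag and parks itself cooperatively.
                while (suspendRequested.get()) {
                    Thread.onSpinWait();
                }
            }
        });
        worker.start();

        Thread.sleep(100);
        suspendRequested.set(true);     // the "GC" requests a stop
        Thread.sleep(100);              // the worker is now waiting at its "safepoint"
        suspendRequested.set(false);    // release it again
        worker.interrupt();
        worker.join();
    }

    static void doSomeWork() { /* placeholder for real work between safepoints */ }
}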
3. Safe Regions (Safe Region)
Using safepoints seems to solve the problem of how to enter GC perfectly, but the actual situation is not quite that simple.
The safepoint mechanism guarantees that, while a program is executing, it will encounter a safepoint at which it can enter GC within a short time; but what about when the program is "not executing"?
A program "not executing" means it is not allocated CPU time; a typical example is a thread in the Sleep or Blocked state. Such a thread cannot respond to the JVM's interrupt request and "walk" to a safe place to suspend, and the JVM obviously cannot wait for the thread to be rescheduled. For this situation, safe regions (Safe Region) are needed.
A safe region is a code segment in which reference relationships do not change.
It is safe to start GC anywhere in such a region; a Safe Region can be thought of as an extended safepoint. When a thread executes code inside a Safe Region, it first marks itself as having entered the Safe Region, so that if the JVM initiates a GC during that time it does not need to care about threads marked as being in a Safe Region. When the thread is about to leave the Safe Region, it checks whether the system has finished root node enumeration (or the whole GC process); if so, the thread simply continues, otherwise it must wait until it receives a signal that it may safely leave the Safe Region.
Garbage collector
If the collection algorithm is the methodology of memory collection, then the garbage collector is the concrete implementation of memory collection. The collector discussed here is based on the HotSpot virtual machine after JDK 1.7 Update 14, which contains all the collectors shown in the following figure
The figure above shows seven collectors that act on different generations; if there is a line between two collectors, they can be used together. The region a collector sits in indicates whether it is a young-generation or an old-generation collector. Next, these collectors' characteristics, basic principles, and usage scenarios are introduced one by one, with particular attention to the two relatively complex collectors, CMS and G1, and some details of their operation.
1. Serial collector
The Serial collector is the most basic and longest-standing collector, and was once the only choice for the young generation. It is a single-threaded collector, but "single-threaded" does not only mean that it uses one CPU or one collection thread to complete garbage collection; more importantly, while it collects, it must pause all other worker threads until collection finishes.
The name "Stop The World" may sound cool, but the work is actually initiated and done automatically by the virtual machine in the background, stopping all the user's working threads when the user is invisible, which is unacceptable for many applications. The following figure illustrates the operation of the Serial/Serial Old collector.
In fact, to this day, this collector is still the default young-generation collector for virtual machines running in Client mode. It also has an advantage over other collectors: it is simple and efficient (compared with the single thread of other collectors). For an environment limited to a single CPU, the Serial collector naturally achieves the highest single-thread collection efficiency because it has no overhead of thread interaction.
In the user's desktop application scenario, the memory assigned to the virtual machine is generally not very large, and collecting a young generation of tens of megabytes or even one or two hundred megabytes (only the memory used by the young generation; desktop applications will rarely be bigger) can keep the pause within tens of milliseconds, at most a little over a hundred milliseconds. As long as this does not happen frequently, the pause is acceptable. Therefore, the Serial collector is a good choice for virtual machines running in Client mode.
2. ParNew collector
The ParNew collector is essentially the multithreaded version of the Serial collector. Apart from using multiple threads for garbage collection, everything else, including all the control parameters available to the Serial collector (-XX:SurvivorRatio, -XX:PretenureSizeThreshold, -XX:HandlePromotionFailure, and so on), the collection algorithm, Stop The World, object allocation rules, and collection strategy, is exactly the same as in the Serial collector. The two collectors also share a considerable amount of code. The operation of the ParNew collector is shown in the following figure:
Apart from multithreaded collection, the ParNew collector does not offer much innovation over the Serial collector, yet it is the preferred young-generation collector for many virtual machines running in Server mode. One important reason, unrelated to performance, is that, apart from the Serial collector, it is currently the only one that can work together with the CMS collector (a concurrent collector, described later).
The ParNew collector is by no means better than the Serial collector in a single-CPU environment; indeed, because of the overhead of thread interaction, this collector cannot even be guaranteed to beat the Serial collector on two CPUs implemented with hyper-threading technology.
Of course, as the number of usable CPUs increases, it becomes very beneficial for efficient use of system resources during GC. By default, it starts the same number of collection threads as there are CPUs; in environments with very many CPUs (such as 32), the -XX:ParallelGCThreads parameter can be used to limit the number of garbage collection threads.
Note: starting from the ParNew collector, several concurrent and parallel collectors will come up, so it is necessary to explain two terms, concurrency and parallelism. Both are concepts from concurrent programming, and in the context of garbage collectors they can be explained as follows.
Parallel: multiple garbage collection threads work in parallel while the user threads remain paused.
Concurrent: user threads and garbage collection threads execute at the same time (not necessarily in parallel; they may execute alternately); the user program keeps running while the garbage collector runs on another CPU.
3. Parallel Scavenge collector
The Parallel Scavenge collector is a young-generation collector; it uses the copying algorithm and is also a parallel, multithreaded collector. It looks the same as ParNew, so what is special about it?
The characteristic of Parallel Scavenge collector is that its focus is different from other collectors. Collectors such as CMS focus on shortening the pause time of user threads during garbage collection as much as possible, while the goal of Parallel Scavenge collector is to achieve a controllable throughput (Throughput).
Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time). If the virtual machine runs for a total of 100 minutes, of which garbage collection takes 1 minute, the throughput is 99%.
The shorter the pause time is, the more suitable it is for programs that need to interact with users. Good response speed can improve user experience, while high throughput can efficiently use CPU time to complete program tasks as soon as possible, which is mainly suitable for tasks that operate in the background without too much interaction.
The Parallel Scavenge collector provides two parameters for precise throughput control: the -XX:MaxGCPauseMillis parameter, which controls the maximum garbage collection pause time, and the -XX:GCTimeRatio parameter, which directly sets the throughput.
The value allowed for the MaxGCPauseMillis parameter is a number of milliseconds greater than 0, and the collector will try its best to ensure that the time taken to reclaim memory does not exceed the set value.
However, do not assume that setting this value a little lower will make the system's garbage collection faster. The shorter GC pause time is bought at the cost of throughput and young-generation space: the system shrinks the young generation, and collecting a 300MB young generation is certainly faster than collecting 500MB, but this also makes garbage collection more frequent. Where it used to collect once every 10 seconds with a 100ms pause, it now collects once every 5 seconds with a 70ms pause. The pause time does go down, but so does the throughput.
The value of the GCTimeRatio parameter should be an integer between 0 and 100; it determines the ratio of garbage collection time to total time and is effectively the reciprocal of the throughput. If the parameter is set to 19, the maximum allowed GC time is 5% of the total time (that is, 1 / (1 + 19)); the default value is 99, which allows at most 1% (1 / (1 + 99)) of the time to be spent on garbage collection.
Because of its close relationship with throughput, the Parallel Scavenge collector is often called the "throughput-first" collector. Besides the two parameters above, the Parallel Scavenge collector has another parameter worth noting, -XX:+UseAdaptiveSizePolicy. It is a switch parameter: when it is turned on, there is no need to manually specify detailed parameters such as the young-generation size (-Xmn), the Eden-to-Survivor ratio (-XX:SurvivorRatio), or the promotion age threshold (-XX:MaxTenuringThreshold); the virtual machine collects performance monitoring information from the running system and dynamically adjusts these parameters to provide the most suitable pause time or maximum throughput. This kind of adjustment is called GC ergonomics (GC Ergonomics).
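As a hedged illustration (the flags below are real HotSpot options, but the class name ThroughputApp, the heap size, and the chosen values are only placeholders), a throughput-oriented launch line using the parameters just described might look like this:

# Throughput-oriented setup: Parallel Scavenge for the young generation,
# Parallel Old for the old generation, GC ergonomics sizing the generations.
# -XX:GCTimeRatio=19 allows at most 1/(1+19) = 5% of total time in GC.
java -XX:+UseParallelGC -XX:+UseParallelOldGC \
     -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=19 \
     -XX:+UseAdaptiveSizePolicy \
     -Xms512m -Xmx512m ThroughputApp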
4. Serial Old collector
Serial Old is the old-generation version of the Serial collector; it is also single-threaded and uses the mark-compact algorithm. The main significance of this collector is likewise to be used by virtual machines in Client mode. If it is used in Server mode, it has two main purposes: one is to be paired with the Parallel Scavenge collector in JDK 1.5 and earlier versions, and the other is to serve as the fallback plan of the CMS collector when Concurrent Mode Failure occurs during concurrent collection. Both points are explained in detail later. The operation of the Serial Old collector is shown in the following figure:
5. Parallel Old collector
Parallel Old is the old-generation version of the Parallel Scavenge collector, using multiple threads and the mark-compact algorithm. This collector only became available in JDK 1.6; before that, the young-generation Parallel Scavenge collector was in a somewhat awkward position.
The reason is that if the young generation chose the Parallel Scavenge collector, the old generation had no choice other than the Serial Old (PS MarkSweep) collector (the Parallel Scavenge collector cannot work together with the CMS collector).
Due to the "drag" of the old Serial Old collector on the server application performance, the use of the Parallel Scavenge collector may not be able to maximize the throughput of the overall application, because the old single-threaded collection can not make full use of the server's multi-CPU processing capacity, in the old era of large and advanced hardware environment, the throughput of this combination may not even have the combination of ParNew and CMS "give power".
Only with the appearance of the Parallel Old collector did the "throughput-first" collector finally get a combination truly worthy of the name. In situations that focus on throughput and are sensitive to CPU resources, the Parallel Scavenge plus Parallel Old collectors can be given priority. The operation of the Parallel Old collector is shown in the following figure:
6. CMS collector
The CMS (Concurrent Mark Sweep) collector is a collector whose goal is to obtain the shortest collection pause time.
At present, a large portion of Java applications are concentrated on the server side of Internet websites or B/S systems. Such applications particularly value the response speed of the service and hope the system pauses are as short as possible, in order to give users a better experience. The CMS collector fits the needs of such applications very well.
As the name (which includes "Mark Sweep") suggests, the CMS collector is based on the "mark-sweep" algorithm, and its operation is more complex than that of the collectors described earlier. The whole process is divided into four steps:
Initial mark (CMS initial mark)
Concurrent mark (CMS concurrent mark)
Remark (CMS remark)
Concurrent sweep (CMS concurrent sweep)
Of these, the initial mark and remark steps still require "Stop The World". The initial mark only marks the objects that GC Roots can reach directly, which is very fast; the concurrent mark stage is the GC Roots tracing process; and the remark stage corrects the mark records of the objects whose marks changed because the user program kept running during the concurrent mark. The pause of this stage is generally a bit longer than that of the initial mark, but far shorter than the concurrent mark time.
Because the collector threads for the two most time-consuming phases of the whole process, concurrent mark and concurrent sweep, can work together with the user threads, the memory collection process of the CMS collector is, in general, executed concurrently with the user threads.
CMS is an excellent collector whose main advantages are already reflected in its name: concurrent collection and low pauses. But CMS is far from perfect; it has the following three obvious disadvantages:
First, it leads to a decrease in throughput. The CMS collector is very sensitive to CPU resources; in fact, programs designed around concurrency are all sensitive to CPU resources. During the concurrent phases, although it does not cause user threads to stop, it does occupy some threads (or CPU resources), which slows the application down and lowers total throughput.
By default, the number of collection threads started by CMS is (number of CPUs + 3) / 4, that is, when there are 4 or more CPUs, the garbage collection threads take no less than 25% of the CPU resources during concurrent collection, with the share decreasing as the number of CPUs increases. But when there are fewer than 4 CPUs (say, 2), CMS can have a great impact on the user program: if the CPU load is already heavy and half of the computing power is then devoted to running collector threads, the user program's execution speed may suddenly drop by 50%, which is likewise unacceptable.
Second, the CMS collector cannot handle floating garbage (Floating Garbage), and a "Concurrent Mode Failure" may occur, leading to another Full GC (a collection of both the young and the old generations). Because the user threads are still running during the CMS concurrent sweep phase, new garbage keeps being generated as the program runs. This garbage appears after the marking process, so CMS cannot dispose of it in the current collection and has to leave it to the next GC. This portion of garbage is called "floating garbage".
Also because the user threads still need to run during the collection phase, enough memory must be reserved for them. The CMS collector therefore cannot wait until the old generation is almost completely full before collecting, as other collectors do; it needs to reserve part of the space for the program to use during the concurrent collection.
With the default settings of JDK 1.5, the CMS collector is activated when 68% of the old generation is used. This is a conservative setting; if the old generation does not grow too fast in your application, you can raise the trigger percentage by increasing the value of -XX:CMSInitiatingOccupancyFraction, reducing the number of collections and improving performance. In JDK 1.6, the CMS collector's start threshold was raised to 92%.
If the memory reserved while CMS runs cannot satisfy the program's needs, a "Concurrent Mode Failure" occurs, and the virtual machine starts its fallback plan: the Serial Old collector is temporarily enabled to redo the old-generation garbage collection, so the pause time becomes very long. Setting -XX:CMSInitiatingOccupancyFraction too high therefore easily causes many "Concurrent Mode Failure" failures, and performance gets worse instead.
Third, space fragmentation. CMS is a collector based on the "mark-sweep" algorithm, which means a large amount of space fragmentation is produced at the end of collection. When there is too much fragmentation, allocating large objects becomes troublesome: often there is plenty of space left in the old generation, but no contiguous block large enough for the current object can be found, and a Full GC has to be triggered ahead of time.
To solve this problem, the CMS collector provides the -XX:+UseCMSCompactAtFullCollection switch parameter (on by default), which turns on memory defragmentation when the CMS collector can no longer hold out and a Full GC is about to happen. The defragmentation process is not concurrent: the fragmentation problem disappears, but the pause becomes longer. The virtual machine designers also provide another parameter, -XX:CMSFullGCsBeforeCompaction, which sets how many uncompacted Full GCs are executed before one with compaction follows (the default value is 0, meaning defragmentation happens every time a Full GC is entered).
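As a hedged illustration (the flags are real HotSpot options for the JDK 6-8 era; LowLatencyApp, the heap size, and the chosen values are placeholders), a ParNew plus CMS launch line using the parameters above might look like this:

# Low-pause setup: ParNew for the young generation, CMS for the old generation.
# Start a CMS cycle once the old generation is 75% occupied, and compact after
# every 5 uncompacted Full GCs to keep fragmentation under control.
java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
     -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly \
     -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=5 \
     -Xms1g -Xmx1g LowLatencyApp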
7. G1 collector
The G1 (Garbage-First) collector is one of the most cutting-edge achievements in collector technology; since JDK 9, G1 has been the default garbage collector. It is a garbage collector aimed at server-side applications, and one of the missions the HotSpot development team gave it is to replace, in the longer term, the CMS collector released in JDK 1.5. Compared with other GC collectors, G1 has the following features.
Parallelism and concurrency: G1 can take full advantage of multi-CPU, multi-core hardware, using multiple CPUs (CPUs or CPU cores) to shorten Stop-The-World pauses. Where some other collectors would have to pause the Java threads for GC actions, the G1 collector can still let the Java program continue to execute concurrently.
Generational collection: as in other collectors, the concept of generations is preserved in G1. Although G1 can manage the entire GC heap on its own without the cooperation of other collectors, it handles newly created objects and old objects that have survived many GCs differently, to obtain better collection results.
Space integration: unlike CMS's "mark-sweep" algorithm, G1 is, viewed as a whole, a collector based on the "mark-compact" algorithm, and, viewed locally (between two Regions), based on the "copying" algorithm. Either way, this means G1 does not produce memory fragmentation while it runs and can provide regular, contiguous available memory after collection. This characteristic helps programs run for a long time: allocating a large object will not trigger the next GC prematurely just because no contiguous memory can be found.
Predictable pauses: this is another big advantage of G1 over CMS. Reducing pause time is a common goal of G1 and CMS, but beyond pursuing low pauses, G1 can also build a predictable pause-time model, letting the user specify that, within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection; this is almost a feature of a real-time Java (RTSJ) garbage collector.
Collectors before G1 acted on the entire young generation or the entire old generation; G1 no longer works that way. When the G1 collector is used, the memory layout of the Java heap is very different from that of other collectors: it divides the whole Java heap into multiple independent regions (Region) of equal size, and although the concepts of young generation and old generation are retained, they are no longer physically isolated; both are collections of Regions (which need not be contiguous).
The G1 collector can build a predictable pause-time model because it can avoid, in a planned way, doing full garbage collection across the entire Java heap. G1 maintains a priority list in the background and, within the allowed collection time, collects the most valuable Region first (hence the name Garbage-First), which ensures that the G1 collector achieves the highest possible collection efficiency within the limited time.
In the G1 collector, object references between Regions, like references between the young and old generations in other collectors, would otherwise force a full-heap scan during reachability analysis; virtual machines use Remembered Sets to avoid this.
Each Region in G1 has a corresponding Remembered Set. When the virtual machine detects that the program is writing to data of Reference type, a Write Barrier briefly interrupts the write operation to check whether the object referenced by the Reference lives in a different Region (in the generational case, to check whether an old-generation object is referencing a young-generation object). If so, the relevant reference information is recorded, via a CardTable, in the Remembered Set of the Region to which the referenced object belongs. When memory collection is performed, adding the Remembered Sets to the enumeration scope of the GC Roots ensures that no references are missed even though the whole heap is not scanned.
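Purely conceptually, and not as a description of HotSpot's real data structures (Obj, RememberedSetSketch, and writeBarrier below are invented names), the bookkeeping can be sketched like this:

import java.util.HashSet;
import java.util.Set;

class Obj {
    int region;          // which Region this object lives in
    Obj field;           // a single reference-typed field, for simplicity
    Obj(int region) { this.region = region; }
}

public class RememberedSetSketch {
    static final int REGIONS = 4;
    @SuppressWarnings("unchecked")
    static final Set<Integer>[] rememberedSets = new Set[REGIONS];   // per-Region remembered set
    static {
        for (int i = 0; i < REGIONS; i++) rememberedSets[i] = new HashSet<>();
    }

    // "Write barrier": every reference store goes through here.
    static void writeBarrier(Obj holder, Obj target) {
        holder.field = target;                               // the actual write
        if (target != null && holder.region != target.region) {
            // Cross-Region reference: record the referencing Region in the
            // remembered set of the Region that owns the referenced object.
            rememberedSets[target.region].add(holder.region);
        }
    }

    public static void main(String[] args) {
        Obj oldGenObj = new Obj(3);      // pretend Region 3 belongs to the old generation
        Obj youngObj = new Obj(0);       // pretend Region 0 belongs to the young generation
        writeBarrier(oldGenObj, youngObj);
        // When Region 0 is collected, its remembered set tells the GC that Region 3
        // must also be scanned for roots, without scanning the whole heap.
        System.out.println("regions referencing Region 0: " + rememberedSets[0]);
    }
}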
Without counting the operation of maintaining the Remembered Set, the operation of the G1 collector can be roughly divided into the following steps:
Initial marking (Initial Marking)
Concurrent marking (Concurrent Marking)
Final marking (Final Marking)
Screening and recovery (Live Data Counting and Evacuation)
The operation of the first few steps of G1 is similar to that of CMS.
The initial marking phase only marks the objects that GC Roots can reach directly and modifies the value of TAMS (Next Top at Mark Start), so that when the user program runs concurrently during the next stage, new objects can be created in correctly available Regions. This stage requires pausing the threads, but it is very short.
The concurrent marking phase starts from GC Roots and analyzes the reachability of objects in the heap to find the surviving objects. This phase takes a long time but can run concurrently with the user program.
The final marking phase corrects the part of the marking records that changed because the user program kept running during the concurrent marking. The virtual machine records the object changes during this period in thread-local Remembered Set Logs, and the final marking phase merges the Remembered Set Logs data into the Remembered Sets. This stage requires pausing the threads, but it can be executed in parallel.
Finally, in the screening and recovery stage, the recovery value and cost of each Region are sorted first, and a recovery plan is made according to the GC pause time the user expects. According to information disclosed by Sun, this stage could also be executed concurrently with the user program, but because only part of the Regions are collected and the time is controllable by the user, pausing the user threads greatly improves collection efficiency. The figure below shows more clearly which steps of the G1 collector run concurrently and which pause.
GC log
Reading GC logs is a basic skill for dealing with memory problems in Java virtual machines. The format is just a set of man-made conventions; there is nothing technically deep about it.
The log format of each collector is determined by its own implementation; in other words, each collector's log format can be different. However, to make it easier for users to read, the virtual machine designers keep the logs of the collectors consistent where possible, as in the following two typical GC logs:
33.125: [GC [DefNew: 3324K->152K(3712K), 0.0025925 secs] 3324K->152K(11904K), 0.0031680 secs]
100.667: [Full GC [Tenured: 0K->210K(10240K), 0.0149142 secs] 4603K->210K(19456K), [Perm: 2999K->2999K(21248K)], 0.0150007 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]
The first numbers 33.125: and 100.667: represent the time when the GC occurred, which means the number of seconds since the Java virtual machine started.
The "[GC" and "[Full GC" at the beginning of a GC log line indicate the type of pause in this garbage collection, not whether it is a young-generation or an old-generation GC.
If "Full" appears, it means Stop-The-World occurred in this GC. For example, the following log from the young-generation ParNew collector also shows "[Full GC" (this is usually caused by problems such as allocation guarantee failure, which lead to STW). If the collection was triggered by a call to the System.gc() method, "[Full GC (System)" is shown here.
[Full GC 283.736: [ParNew: 261599K->261599K(261952K), 0.0000288 secs]
The following "[DefNew", "[Tenured", and "[Perm" indicate the region in which the GC happened, and the region names shown here are closely tied to the GC collector in use. For example, the young generation of the Serial collector used in the example above is called "Default New Generation", so it is shown as "[DefNew". With the ParNew collector, the young generation's name becomes "[ParNew", meaning "Parallel New Generation". With the Parallel Scavenge collector, the young generation it works with is called "PSYoungGen"; the same goes for the old generation and the permanent generation, whose names are also determined by the collector.
The "3324K->152K(3712K)" inside the square brackets means "memory used in that region before GC -> memory used in that region after GC (total capacity of that region)". The "3324K->152K(11904K)" outside the square brackets means "Java heap used before GC -> Java heap used after GC (total Java heap capacity)".
After that, "0.0025925 secs" is the time, in seconds, that the GC of that memory region took. Some collectors give more detailed time data, such as "[Times: user=0.01 sys=0.00, real=0.02 secs]", where user, sys, and real mean the same as in the output of Linux's time command: the CPU time consumed in user mode, the CPU time consumed in kernel mode, and the wall clock time (Wall Clock Time) from the start to the end of the operation, respectively.
The difference between CPU time and wall clock time is that wall clock time includes all kinds of non-computational waiting, such as waiting for disk I/O or waiting on a blocked thread, while CPU time does not; however, when the system has multiple CPUs or cores, multiple threads stack up CPU time, so it is perfectly normal to see user or sys time exceed real time.
Summary of garbage collector parameters
At this point, the various garbage collectors of JDK 1.7 have all been introduced, and many non-stable virtual machine running parameters were mentioned along the way (the original book collects them in its Table 3-2 for reference).
Memory allocation and recovery strategy
Object memory allocation is, broadly speaking, allocation on the heap, and objects are mainly allocated in the Eden area of the young generation. In a few cases they may also be allocated directly in the old generation. The allocation rules are not 100% fixed; the details depend on which garbage collector combination is currently in use and on the virtual machine's memory-related parameter settings.
(Figure from the book "Code Out Efficiency".)
1. Objects are allocated in Eden first
In most cases, objects are allocated in the Eden area of the young generation. When there is not enough space in Eden for the allocation, the virtual machine initiates a Minor GC.
The virtual machine provides the collector log parameter -XX:+PrintGCDetails, which tells the virtual machine to print memory collection logs when garbage collection occurs and to print the current memory layout when the process exits.
private static final int _1MB = 1024 * 1024;

/**
 * VM args: -verbose:gc -Xms20M -Xmx20M -Xmn10M -XX:+PrintGCDetails -XX:SurvivorRatio=8
 */
public static void testAllocation() {
    byte[] allocation1, allocation2, allocation3, allocation4;
    allocation1 = new byte[2 * _1MB];
    allocation2 = new byte[2 * _1MB];
    allocation3 = new byte[2 * _1MB];
    allocation4 = new byte[4 * _1MB];   // triggers one Minor GC
}
Running result:
[GC [DefNew: 6651K->148K(9216K), 0.0070106 secs] 6651K->6292K(19456K), 0.0070426 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
Heap
 def new generation   total 9216K, used 4326K [0x029d0000, 0x033d0000, 0x033d0000)
  eden space 8192K,  51% used [0x029d0000, 0x02de4828, 0x031d0000)
  from space 1024K,  14% used [0x032d0000, 0x032f5370, 0x033d0000)
  to   space 1024K,   0% used [0x031d0000, 0x031d0000, 0x032d0000)
 tenured generation   total 10240K, used 6144K [0x033d0000, 0x03dd0000, 0x03dd0000)
   the space 10240K,  60% used [0x033d0000, 0x039d0030, 0x039d0200, 0x03dd0000)
 compacting perm gen  total 12288K, used 2114K [0x03dd0000, 0x049d0000, 0x07dd0000)
   the space 12288K,  17% used [0x03dd0000, 0x03fe0998, 0x03fe0a00, 0x049d0000)
No shared spaces configured.
In the testAllocation() method above, we try to allocate three 2MB objects and one 4MB object. At run time, the -Xms20M, -Xmx20M, and -Xmn10M parameters limit the Java heap to 20MB, of which 10MB is given to the young generation and the remaining 10MB to the old generation. -XX:SurvivorRatio=8 sets the ratio of the Eden area to one Survivor area in the young generation to 8:1. From the output we can also see the lines "eden space 8192K", "from space 1024K", and "to space 1024K": the total usable space of the young generation is 9216KB (the capacity of Eden plus one Survivor area).
A Minor GC happens while executing the statement that allocates the allocation4 object in testAllocation(). The result of this GC is that the young generation goes from 6651KB to 148KB, while the total memory usage hardly drops (because allocation1, allocation2, and allocation3 are all still alive, so the virtual machine finds almost nothing to reclaim).
The reason for this GC is that, when allocating memory for allocation4, Eden is found to already hold 6MB and the remaining space is not enough for allocation4's 4MB, so a Minor GC occurs. During the GC, the virtual machine finds that the three existing 2MB objects cannot fit into the Survivor space (the Survivor space is only 1MB), so they have to be moved into the old generation early through the allocation guarantee mechanism.
After this GC ends, the 4MB allocation4 object is successfully allocated in Eden, so the final result of the program is: Eden holds 4MB (occupied by allocation4), Survivor is empty, and the old generation holds 6MB (occupied by allocation1, allocation2, and allocation3). This can be confirmed from the GC log.
2. The difference between Minor GC and Full GC
Young-generation GC (Minor GC): garbage collection that happens in the young generation. Because most Java objects are short-lived, Minor GC is very frequent and generally fast.
Old-generation GC (Major GC / Full GC): garbage collection that happens in the old generation. A Major GC is often accompanied by at least one Minor GC (though not always; the Parallel Scavenge collector's strategy includes the option of triggering a Major GC directly). A Major GC is generally more than 10 times slower than a Minor GC.
3. Large objects go directly into the old generation
Large objects are Java objects that need large amounts of contiguous memory; the most typical large objects are very long strings and arrays (a byte[] array is a typical example). Large objects are bad news for a virtual machine's memory allocation (especially short-lived large objects, which should be avoided when writing programs): they often cause garbage collection to be triggered early, while there is still plenty of memory, just to obtain enough contiguous space to "place" them.
The virtual machine provides the -XX:PretenureSizeThreshold parameter so that objects larger than this value are allocated directly in the old generation. The purpose is to avoid a large amount of memory copying between the Eden area and the two Survivor areas.
private static final int _1MB = 1024 * 1024;

/**
 * VM args: -verbose:gc -Xms20M -Xmx20M -Xmn10M -XX:+PrintGCDetails -XX:SurvivorRatio=8
 *          -XX:PretenureSizeThreshold=3145728
 */
public static void testPretenureSizeThreshold() {
    byte[] allocation;
    allocation = new byte[4 * _1MB];   // allocated directly in the old generation
}
Running result:
Heap
 def new generation   total 9216K, used 671K [0x029d0000, 0x033d0000, 0x033d0000)
  eden space 8192K,   8% used [0x029d0000, 0x02a77e98, 0x031d0000)
  from space 1024K,   0% used [0x031d0000, 0x031d0000, 0x032d0000)
  to   space 1024K,   0% used [0x032d0000, 0x032d0000, 0x033d0000)
 tenured generation   total 10240K, used 4096K [0x033d0000, 0x03dd0000, 0x03dd0000)
   the space 10240K,  40% used [0x033d0000, 0x037d0010, 0x037d0200, 0x03dd0000)
 compacting perm gen  total 12288K, used 2107K [0x03dd0000, 0x049d0000, 0x07dd0000)
   the space 12288K,  17% used [0x03dd0000, 0x03fdefd0, 0x03fdf000, 0x049d0000)
No shared spaces configured.
After executing the testPretenureSizeThreshold() method above, we see that the Eden space is almost unused, while 40% of the old generation's 10MB is used; that is, the 4MB allocation object was allocated directly in the old generation. This is because PretenureSizeThreshold is set to 3MB (written as 3145728; this parameter cannot be written as "3MB" directly, unlike parameters such as -Xmx), so objects larger than 3MB are allocated directly in the old generation.
Note that the PretenureSizeThreshold parameter is only effective with the Serial and ParNew collectors; the Parallel Scavenge collector does not recognize it and generally does not need it. If you run into a situation where this parameter must be used, consider the ParNew plus CMS collector combination.
4. Long-surviving objects enter the old generation
The virtual machine defines an object age (Age) counter for each object.
If an object is born in Eden, survives the first Minor GC, and can be accommodated by a Survivor space, it is moved into Survivor space and its age is set to 1. Each time the object "survives" another Minor GC in the Survivor area, its age increases by 1, and when its age reaches a certain threshold (15 by default), it is promoted into the old generation.
The age threshold at which an object is promoted to the old generation can be set with the -XX:MaxTenuringThreshold parameter.
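Following the same pattern as the earlier listings, here is a hedged sketch of an experiment for observing this threshold (field and method names are chosen freely, and the exact promotion moment also depends on the dynamic age rule described next):

private static final int _1MB = 1024 * 1024;

/**
 * VM args: -verbose:gc -Xms20M -Xmx20M -Xmn10M -XX:+PrintGCDetails
 *          -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
 */
public static void testTenuringThreshold() {
    byte[] allocation1, allocation2, allocation3;
    allocation1 = new byte[_1MB / 4];     // small enough to stay in Survivor for a while
    allocation2 = new byte[4 * _1MB];
    allocation3 = new byte[4 * _1MB];     // first Minor GC: allocation1 moves to Survivor, age = 1
    allocation3 = null;
    allocation3 = new byte[4 * _1MB];     // second Minor GC: with threshold = 1, allocation1 is promoted
}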
5. Dynamic object age determination
To adapt better to the memory situations of different programs, the virtual machine does not always require objects to reach the age set in MaxTenuringThreshold: if the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age and older go directly into the old generation.
6. Space allocation guarantee
Before a Minor GC occurs, the virtual machine first checks whether the largest contiguous space available in the old generation is larger than the total space of all objects in the young generation. If this condition holds, the Minor GC is guaranteed to be safe.
Otherwise, as long as the contiguous space of the old generation is larger than the total size of the young generation's objects, or larger than the average size of previous promotions, a Minor GC is still carried out; if not, a Full GC is carried out instead.
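A hedged sketch of an allocation-guarantee experiment in the same style as the earlier listings (the -XX:HandlePromotionFailure switch applied to JDK 6 era HotSpot and was removed later; the names and sizes are illustrative, and whether the last allocation triggers a Full GC depends on the guarantee check described above):

private static final int _1MB = 1024 * 1024;

/**
 * VM args: -verbose:gc -Xms20M -Xmx20M -Xmn10M -XX:+PrintGCDetails
 *          -XX:SurvivorRatio=8 -XX:-HandlePromotionFailure
 */
public static void testHandlePromotion() {
    byte[] allocation1, allocation2, allocation3, allocation4,
           allocation5, allocation6, allocation7;
    allocation1 = new byte[2 * _1MB];
    allocation2 = new byte[2 * _1MB];
    allocation3 = new byte[2 * _1MB];
    allocation1 = null;
    allocation4 = new byte[2 * _1MB];
    allocation5 = new byte[2 * _1MB];
    allocation6 = new byte[2 * _1MB];
    allocation4 = null;
    allocation5 = null;
    allocation6 = null;
    allocation7 = new byte[2 * _1MB];   // needs the old generation to guarantee the promotion
}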
At this point, I believe you have a deeper understanding of the principles of the Java garbage collection mechanism; the best next step is to try these experiments in practice.