This article explains how to correctly understand GC in the JVM: the runtime data areas, how the JVM decides which objects to reclaim, when collection happens, the classic collection algorithms, and the mainstream HotSpot garbage collectors.
1. JVM Runtime data area
Before talking about GC, it helps to understand the JVM memory model: how the JVM lays out memory and which areas GC mainly works on. As shown in the figure, the JVM divides runtime memory into five areas. The method area and the heap are created when the JVM starts and are shared by all threads; the virtual machine stack, the native method stack and the program counter are created together with each thread and destroyed when the thread finishes.
1.1 Program counter
The program counter (Program Counter Register) is a very small, almost negligible, piece of memory. It can be thought of as the line-number indicator of the bytecode a thread is executing: it points to the next instruction the current thread should execute. Basic control flow such as conditional branches, loops, jumps and exception handling all depends on it.
At any moment a CPU core runs only one thread. If a thread uses up its CPU time slice and is suspended, waiting for the OS to assign it a new slice before continuing, how does it know where it left off? Through the program counter: each thread maintains its own private program counter.
If the thread is executing a Java method, the counter records the address of the current JVM bytecode instruction; if it is executing a Native method, the counter value is Undefined.
The program counter is the only area for which the JVM specification does not define any OutOfMemoryError condition, which means OOM cannot occur here, and GC does not reclaim this area.
1.2 Virtual Machine Stack
The virtual machine stack (Java Virtual Machine Stacks) is also thread-private, with the same life cycle as the thread.
The virtual machine stack describes the memory model of Java method execution. When the JVM executes a method, it first creates a stack frame (Stack Frame) to hold the local variable table, operand stack, dynamic linking, method return address and other information. The frame is pushed onto the stack when the method starts executing and popped when the method finishes.
The process of method execution is the process of stack frames being pushed and popped.
The local variable table stores the basic data types, object references and returnAddress types known at compile time. The memory the local variable table needs is determined at compile time, and its size does not change while the method runs.
In the JVM specification, the virtual machine stack specifies two exceptions:
StackOverflowError: the stack depth requested by a thread exceeds the depth the JVM allows. Stack capacity is limited, and if a thread pushes more frames than the limit permits, a StackOverflowError is thrown, for example with unbounded method recursion.
OutOfMemoryError: the virtual machine stack may be extended dynamically, and if enough memory cannot be obtained during extension, an OOM exception is thrown.
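A minimal sketch of the first case (the class and method names are illustrative): unbounded recursion keeps pushing stack frames until the JVM's limit is hit.

    public class StackDepthDemo {
        private static int depth = 0;

        private static void recurse() {
            depth++;        // each call pushes one more stack frame
            recurse();      // never returns, so frames keep accumulating
        }

        public static void main(String[] args) {
            try {
                recurse();
            } catch (StackOverflowError e) {
                // thrown once the requested stack depth exceeds the JVM limit (tunable with -Xss)
                System.out.println("StackOverflowError at depth " + depth);
            }
        }
    }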
1.3 Native method stack
The native method stack (Native Method Stack) is also thread-private and very similar to the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods, while the native method stack serves the execution of Native methods.
As with the virtual machine stack, the JVM specification defines StackOverflowError and OutOfMemoryError for the native method stack.
1.4 Java heap
The Java heap (Java Heap) is shared by all threads and is generally the largest memory area managed by the JVM, as well as the main area managed by the garbage collector.
The Java heap is created when the JVM starts and holds object instances. Almost all objects are created on the heap, but with the development of JIT compilers and the maturing of escape analysis, optimizations such as stack allocation and scalar replacement have made the statement "all objects are allocated on the heap" less absolute.
Because it is the main area GC manages, it is also known as the GC heap. To make collection more efficient, the Java heap is further divided internally, as shown in the figure.
In the JVM specification the heap may be physically discontiguous as long as it is logically contiguous. The minimum and maximum heap size can be set with the -Xms and -Xmx parameters.
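As a small sketch of how these limits show up at run time (assuming the program is started with flags such as -Xms256m -Xmx1g; the class name is illustrative):

    public class HeapLimits {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            // totalMemory() is the currently committed heap (initially close to -Xms),
            // maxMemory() is the upper bound configured with -Xmx.
            System.out.println("committed heap: " + rt.totalMemory() / (1024 * 1024) + " MB");
            System.out.println("max heap:       " + rt.maxMemory() / (1024 * 1024) + " MB");
        }
    }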
1.5 Method area
The method area (Method Area), like the Java heap, is a memory area shared by all threads. It mainly stores class information loaded by the JVM, constants, static variables, code generated by the just-in-time compiler, and other data. It is also called the non-heap (Non-Heap) to distinguish it from the Java heap.
The JVM specification's restrictions on the method area are relatively loose; a JVM may even choose not to garbage collect it. This is why, in older JDK versions, the method area was implemented as, and commonly called, the permanent generation (PermGen).
Implementing the method area with the permanent generation turned out not to be a good idea, because it easily leads to memory overflow. Starting with JDK 7 there was therefore a move to "remove the permanent generation": the string constant pool was moved out of it. In JDK 8 the permanent generation was removed entirely and replaced by the metaspace.
2. Overview of GC
Garbage collection (Garbage Collection), known as "GC" for short, is much older than the Java language itself. Lisp, which was born at the Massachusetts Institute of Technology in 1960, was the first language to use dynamic memory allocation and garbage collection technology.
To implement automatic garbage collection, three questions have to be answered first: which objects need to be reclaimed, when to reclaim them, and how to reclaim them. Of the five JVM memory areas introduced above, the program counter occupies so little memory that it is negligible, never overflows, and does not need to be collected. The virtual machine stack and the native method stack live and die with their thread; stack frames are pushed and popped in an orderly way as methods run, and the size of each frame is essentially determined at compile time, so allocation and reclamation in these two areas are deterministic and GC does not need to handle them.
The method area is different. How many implementation classes does an interface have? How much memory does each class consume? Classes can even be created dynamically at run time, so GC does need to collect the method area.
The same is true of the Java heap, where almost all object instances live. How many instances a class will create is only known while the program runs; allocation and reclamation of this memory are dynamic, and this is the area GC cares about most.
2.1 Which objects need to be reclaimed
The first step in implementing automatic garbage collection is to determine which objects can be reclaimed. Broadly there are two approaches: the reference counting algorithm and the reachability analysis algorithm; commercial JVMs almost always use the latter.
2.1.1 reference counting algorithm
Add a reference counter to each object: increment it whenever a new reference to the object is created and decrement it whenever a reference is removed. When the counter reaches 0 the object is no longer referenced and can be reclaimed.
Although reference counting (Reference Counting) costs some extra memory, it is simple and efficient and works well in most cases, but it has one serious drawback: it cannot handle circular references.
For example, a linked list should become reclaimable as soon as nothing references it any more, but because the reference counters of the elements inside the list are not 0 (the elements reference each other), it cannot be reclaimed, resulting in a memory leak.
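A minimal sketch of the problem (the Node class and sizes are illustrative): once the last external references are dropped, the two nodes still reference each other, so a pure reference-counting collector would never free them, while a reachability-based collector can.

    public class CyclicReferenceDemo {
        static class Node {
            Node next;                                // reference to another node
            byte[] payload = new byte[1024 * 1024];   // makes the would-be leak visible
        }

        public static void main(String[] args) {
            Node a = new Node();
            Node b = new Node();
            a.next = b;      // a -> b
            b.next = a;      // b -> a: a reference cycle
            a = null;
            b = null;        // no GC Root reaches the cycle any more
            System.gc();     // hint only: a reachability-based GC may reclaim both nodes,
                             // while a pure reference-counting GC never would (both counters stay at 1)
        }
    }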
2.1.2 Reachability Analysis algorithm
At present, mainstream commercial JVMs use reachability analysis to decide whether an object can be reclaimed. The basic idea of the algorithm is:
Starting from a set of root objects called "GC Roots", the collector searches downward along reference relationships; the path it follows is called a "reference chain". If no reference chain connects an object to the GC Roots, the object is unreachable and can be reclaimed.
Reachability between two objects means there is a direct or indirect reference relationship between them; "root reachable" or "GC Roots reachable" means there is a direct or indirect reference relationship between an object and the GC Roots.
Objects that can serve as GC Roots typically include: references in the local variable tables of stack frames on thread stacks, JNI references in native method stacks, static fields of classes in the method area, constant references such as those in the string constant pool, objects held as monitors by synchronization locks, and references kept internally by the JVM. Reachability analysis means the JVM first enumerates these root nodes, the objects that must stay alive for the program to run correctly, and then searches downward from them along reference relationships. Objects reachable through a direct or indirect reference chain survive; objects without such a chain are reclaimed.
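A minimal sketch of root reachability (the names are illustrative): an object referenced from a static field stays reachable from a GC Root, while an object whose only reference was a local variable becomes unreachable once that variable is cleared.

    public class ReachabilityDemo {
        static Object rootReachable = new Object();  // reachable via a class static field (a GC Root)

        public static void main(String[] args) {
            Object local = new Object();  // reachable via the local variable table (a GC Root)
            local = null;                 // no reference chain from any GC Root remains
            System.gc();                  // the object previously held by 'local' is now eligible
                                          // for collection; 'rootReachable' is not
        }
    }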
For a detailed description of reachability analysis, see the author's earlier article "Understanding the reachability analysis algorithm in plain language".
2.2 When to collect
The JVM divides memory into five areas, and different GC types collect different areas. Broadly, GC can be divided into the following categories:
Minor GC, also called Young GC or light GC: collects only the young generation.
Major GC, also called Old GC: collects only the old generation.
Mixed GC: collects the young generation plus part of the old generation; only some garbage collectors support it.
Full GC: whole-heap GC, also called heavy GC; collects the entire Java heap and the method area and takes the longest.
When will GC be triggered, and what type of GC will be triggered? Different garbage collectors have different implementations, and you can also influence JVM decisions by setting parameters.
Generally speaking, the young generation does not trigger GC until the Eden area is exhausted. The old generation cannot work that way: some concurrent collectors keep running while the program continues to create objects and allocate memory, so the old generation must provide a "space allocation guarantee", taking in objects the young generation cannot hold. If the old generation is collected more slowly than objects are created, the allocation guarantee fails, and the JVM has to trigger a Full GC to obtain more available memory.
2.3 How to collect
Once the objects to be reclaimed have been identified, collection can begin, which raises another question: how should objects be reclaimed? Which strategy is more efficient? Does memory need to be compacted afterwards to avoid fragmentation? To answer these questions, GC collection algorithms fall roughly into three categories:
Mark-sweep
Mark-copy
Mark-compact
The collection details of each algorithm are described below.
3. GC collection algorithms
The JVM divides the heap into generations, and the objects stored in different generations have different characteristics. Using a different collection algorithm for each generation improves GC efficiency.
3.1 generational collection theory
At present, most JVM garbage collectors follow the "generational collection" theory, which is based on three hypotheses.
3.1.1 weak generational hypothesis
The vast majority of objects are short-lived: they die soon after they are created.
Think about the programs we write. Most of the time an object is created just to do some business computation, and once the result is obtained the object is useless, that is, it can be reclaimed. Another example: a client requests a list of data, the server converts the database query result to JSON and returns it to the front end, and the list data can then be reclaimed. Objects like these are the "short-lived" objects.
3.1.2 strong generational hypothesis
The more GC cycles an object has survived, the harder it is to reclaim.
This hypothesis is based purely on probability. If an object has survived many GC cycles, it can be assumed that it will also survive the next one, so there is no need to try to reclaim it frequently; moving it to the old generation reduces how often it is examined and lets GC focus on the young generation, where collection pays off more.
3.1.3 Cross-generational reference hypothesis
Cross-generational references are rare compared with references within the same generation.
This is an implicit inference that follows from the first two hypotheses: two objects that reference each other tend to live and die together. For example, if an old-generation object holds a cross-generational reference to a young-generation object, the old object is hard to reclaim, so the reference keeps the young object alive during collection; as the young object ages it is eventually promoted to the old generation, and the cross-generational reference disappears.
3.2 Handling cross-generational references
Although cross-generational references are rare, they do exist. If the whole old generation had to be scanned just to find a few of them, the cost of every GC would be too high and the pause time would become unacceptable; but if cross-generational references were simply ignored, young-generation objects would be reclaimed by mistake and the program would misbehave.
3.2.1 Remembered Set
The JVM solves this with a remembered set (Remembered Set). By maintaining a remembered-set data structure in the young generation, it avoids adding the whole old generation to the GC Roots scanning range when collecting the young generation, which reduces the cost of GC.
A remembered set is an abstract data structure that records pointers from the "non-collected area" into the "collected area". Put plainly, it marks which objects in the young generation are referenced from the old generation. A remembered set can use one of three recording precisions:
Word precision: each record is precise to one machine word, i.e. the processor's addressing width.
Object precision: precise to an object, recording whether the object's fields contain cross-generational pointers.
Card precision: precise to a memory region, recording whether any object in that region contains cross-generational references.
Word precision and object precision are so fine-grained that maintaining the remembered set would cost a lot of memory, so many JVMs use card precision, also known as the card table (Card Table). The card table is one implementation of a remembered set, and currently the most common one: it defines the recording precision of the remembered set and the mapping between the table and memory.
HotSpot implements the card table as a byte array. It divides the heap into a series of memory regions whose size is a power of two, called card pages (Card Page); HotSpot uses 2 to the 9th power, i.e. 512 bytes, per card page. Each element of the byte array corresponds to one card page. If any object in a card page contains a cross-generational reference, the JVM marks that card page as dirty. GC then only needs to scan the memory corresponding to dirty cards instead of the whole heap.
The structure of the card table is shown in the following figure:
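A minimal sketch of the card-table mapping under the assumptions above (512-byte card pages, one byte per card; the heap base address, field addresses and the exact dirty/clean byte values are illustrative):

    public class CardTableSketch {
        static final int CARD_SHIFT = 9;           // 2^9 = 512 bytes per card page
        static final byte DIRTY = 0;               // illustrative dirty value
        static final byte CLEAN = (byte) 0xFF;     // illustrative clean value

        final long heapBase;                       // illustrative heap start address
        final byte[] cards;                        // one byte per card page

        CardTableSketch(long heapBase, long heapSize) {
            this.heapBase = heapBase;
            this.cards = new byte[(int) (heapSize >>> CARD_SHIFT)];
            java.util.Arrays.fill(cards, CLEAN);
        }

        // Conceptually called by the post-write barrier when a field at 'fieldAddress'
        // in an old-generation object is set to reference a young-generation object.
        void markCardDirty(long fieldAddress) {
            int cardIndex = (int) ((fieldAddress - heapBase) >>> CARD_SHIFT);
            cards[cardIndex] = DIRTY;              // a young GC later scans only the dirty cards
        }
    }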
3.2.2 write barrier
The card table is only a data structure that marks which memory regions contain cross-generational references. How does the JVM maintain it, and when does a card page become dirty?
HotSpot maintains the card table through a write barrier (Write Barrier). The JVM intercepts the action of assigning to an object field, similar to aspect-oriented programming (AOP): it can step in before and after the assignment. The processing before the assignment is called the pre-write barrier, and the processing after it the post-write barrier. The pseudocode is as follows:
    void setField(Object o) {
        before();        // pre-write barrier
        this.field = o;
        after();         // post-write barrier
    }
When the write barrier is turned on, the JVM generates barrier instructions for all reference assignments. Once a reference field of an old-generation object is set to point at a young-generation object, HotSpot marks the corresponding card-table element as dirty.
Note that the "write barrier" here is different from the memory barriers used in concurrent programming to prevent instruction reordering; do not confuse the two.
Besides the overhead of the write barrier itself, card tables also face a "false sharing" problem in highly concurrent scenarios. Modern CPU caches operate on cache lines (Cache Line); on Intel CPUs a cache line is typically 64 bytes. When multiple threads modify independent variables that happen to sit in the same cache line, they invalidate each other's cache lines for no good reason, and the threads have to reload the data frequently, degrading performance.
A cache line is 64 bytes and each card page is 512 bytes, so one cache line of card-table entries covers 64 × 512 bytes = 32 KB of heap. If objects updated by different threads fall within the same 32 KB, their card-table updates land in the same cache line, which hurts performance. To avoid this, HotSpot can mark a card dirty only if it is not already marked; this adds a check but avoids false sharing. Set -XX:+UseCondCardMark to enable the check.
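A minimal sketch of conditional card marking, continuing the illustrative card table above:

    public class CondCardMarkSketch {
        static final byte DIRTY = 0;   // illustrative dirty value

        // Unconditional marking: always writes, so threads updating objects in the same
        // 32 KB window keep invalidating each other's card-table cache line.
        static void markCard(byte[] cards, int cardIndex) {
            cards[cardIndex] = DIRTY;
        }

        // Conditional marking (-XX:+UseCondCardMark): the extra read skips the redundant
        // write when the card is already dirty, avoiding the false-sharing traffic.
        static void markCardConditionally(byte[] cards, int cardIndex) {
            if (cards[cardIndex] != DIRTY) {
                cards[cardIndex] = DIRTY;
            }
        }
    }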
3.3 Mark-sweep
The mark-sweep algorithm consists of two phases: marking and sweeping.
The collector first marks the objects that need to be reclaimed and sweeps them all at once after marking is complete. Alternatively, it can mark the live objects and then sweep the unmarked ones; which way is chosen depends on the proportion of live and dead objects in memory.
Disadvantages:
Unstable performance: the time spent marking and sweeping grows as the number of objects in the Java heap grows.
Memory fragmentation: after sweeping, memory is left with a large number of discontiguous fragments, which makes it harder to allocate memory for new objects later.
3.4 Mark-copy
The mark-copy algorithm was introduced to solve the memory fragmentation problem of mark-sweep.
Mark-copy divides memory into two areas and uses only one of them at a time. During garbage collection it first marks the live objects, copies them to the other area once marking is complete, and then clears the whole current area.
The drawback is that if many objects cannot be reclaimed, copying them is expensive; and since usable memory is cut in half, the waste is considerable. Because the vast majority of objects die at their first GC and only a few survivors actually get copied, the space does not have to be split 1:1. In HotSpot the default size ratio of Eden to each Survivor space is 8:1: Eden takes 80% of the young generation, the "from" Survivor space 10% and the "to" Survivor space 10%, so 90% of the young generation (Eden plus one Survivor space) is available for allocation and the other Survivor space (10%) is reserved for copying.
If a large number of objects are still alive after a Minor GC and exceed the capacity of a Survivor space, an allocation guarantee (Handle Promotion) is performed and the objects are allocated directly into the old generation.
3.5 Mark-compact
The mark-copy algorithm needs many copy operations when a large number of objects survive, plus extra memory space for the allocation guarantee, so it is generally not used for the old generation.
Objects that survive in the old generation are generally objects that have survived many rounds of GC; by the "strong generational hypothesis" they are hard to reclaim. The mark-compact algorithm was introduced to match this survival pattern.
Its marking phase is the same as mark-sweep's, but instead of sweeping the marked objects in place, mark-compact moves the live objects toward one end of the memory area and then directly clears the memory beyond the boundary. Compared with mark-sweep, its biggest difference is that live objects have to be moved, and moving live objects during GC has both advantages and disadvantages.
The disadvantage follows from the "strong generational hypothesis": in most cases a large number of objects survive an old-generation GC. Moving them means updating every reference to them, which is an expensive operation that requires pausing all user threads, i.e. the program blocks. The JVM calls this pause Stop The World (STW).
The advantage is that after the objects are moved and memory is compacted, there are no large numbers of discontiguous fragments, which makes it easier to allocate memory for objects later.
So there are trade-offs whether or not objects are moved: if you move them, collection becomes more complex but allocation becomes simple; if you do not, collection is simple but allocation becomes complex. In terms of overall program throughput, moving objects is usually the better deal, because memory is allocated far more often than it is collected.
Another compromise is not to move objects most of the time, using mark-sweep, and to run a mark-compact pass only when memory fragmentation starts to interfere with allocating large objects.
4. Garbage collector
There are numerous JVM implementations that conform to the Java Virtual Machine Specification, and each offers several garbage collectors to choose from, more than one article can cover. Developers do not need to know every one of them. Taking HotSpot as an example, the mainstream garbage collectors fall into the following categories:
Serial: single-threaded collection; user threads are paused.
Parallel: multi-threaded collection; user threads are paused.
Concurrent: user threads and GC threads run at the same time.
As mentioned earlier, most JVM garbage collectors follow the "generational collection" theory, and different collectors work on different areas of memory. In most cases the JVM needs two collectors working together; in the figure below, a dotted line between two collectors means they can be used together.
4.1 Young-generation collectors
4.1.1 Serial
Serial is the most basic and earliest garbage collector. It uses the mark-copy algorithm, starts only one thread to do the collection, and pauses all user threads (STW) while collecting. Use the -XX:+UseSerialGC parameter to enable it. Because it collects with a single thread, its applicable scope is very limited:
The application is lightweight, with less than 100 MB of heap space.
Server CPU resources are tight.
4.1.2 Parallel Scavenge
Parallel Scavenge is a multi-threaded young-generation collector that uses the mark-copy algorithm. Enable it with -XX:+UseParallelGC. ParallelGC cares a great deal about system throughput and provides two parameters to control it:
-XX:MaxGCPauseMillis: sets the maximum garbage collection pause time; it must be an integer greater than 0. ParallelGC works toward this goal, but if the value is set too small it may not be achievable. If the user asks for short GC pauses, ParallelGC tends to shrink the heap, because collecting a smaller heap necessarily takes less time than collecting a larger one; that, however, triggers GC more often and reduces overall system throughput.
-XX:GCTimeRatio: sets the throughput target as an integer between 0 and 100. If GCTimeRatio is n, ParallelGC will spend no more than 1 / (1 + n) of total time on garbage collection. The default value is 19, which means ParallelGC spends no more than 5% of its time on garbage collection.
ParallelGC is the default garbage collector in JDK 8. It is a throughput-first collector; users can set the maximum GC pause time and the throughput target with -XX:MaxGCPauseMillis and -XX:GCTimeRatio. The two goals contradict each other: a shorter pause time means GC has to run more often, which increases the total time spent in GC and lowers throughput.
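A small worked example of the GCTimeRatio formula above, as plain arithmetic (the class is illustrative, not a JVM API):

    public class GcTimeRatioExample {
        public static void main(String[] args) {
            int[] ratios = {19, 99, 4};   // illustrative -XX:GCTimeRatio values
            for (int n : ratios) {
                double maxGcShare = 1.0 / (1 + n);
                // n = 19 -> 5%, n = 99 -> 1%, n = 4 -> 20% of total time allowed for GC
                System.out.printf("GCTimeRatio=%d -> at most %.1f%% of time in GC%n",
                        n, maxGcShare * 100);
            }
        }
    }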
4.1.3 ParNew
ParNew is also a young-generation garbage collector that uses the mark-copy algorithm with multiple threads. Its collection strategy, algorithm and parameters are the same as Serial's; it simply replaces the single collection thread with several, and it was created to work with the CMS collector. CMS is an old-generation collector, but Parallel Scavenge cannot work with CMS, and Serial collects serially and is too inefficient, hence ParNew.
Enable it with -XX:+UseParNewGC. This parameter was removed after JDK 9, because JDK 9 made G1 the default collector and CMS was replaced; since ParNew exists only to pair with CMS, once CMS was abandoned ParNew lost its value.
4.2 Old-generation collectors
4.2.1 Serial Old
Serial Old uses the mark-compact algorithm and, like Serial, is a single-threaded, exclusive old-generation garbage collector. The old generation is usually larger than the young generation, and mark-compact has to move objects during collection to avoid fragmentation, so collecting the old generation takes more time than collecting the young generation.
As the earliest old-generation garbage collector, Serial Old has the advantage that it can be paired with most young-generation collectors, and it also serves as the fallback collector when CMS hits a concurrent mode failure.
With -XX:+UseSerialGC enabled, the serial collectors are used for both the young and the old generation. As with Serial, this collector is not recommended unless the application is very lightweight or CPU resources are tight.
4.2.2 Parallel Old
ParallelOldGC is a multi-threaded, exclusive old-generation garbage collector. Like Parallel Scavenge, it is a throughput-first collector; Parallel Old was created to work together with Parallel Scavenge.
ParallelOldGC uses the mark-compact algorithm. Enable it with -XX:+UseParallelOldGC; the parameter -XX:ParallelGCThreads=n sets the number of threads used during collection. It is also the default old-generation collector in JDK 8.
4.2.3 CMS
CMS (Concurrent Mark Sweep) is a landmark garbage collector. Why? Because before it, GC threads and user threads could not work at the same time. Even Parallel Scavenge only runs multiple GC threads in parallel; the whole GC still pauses the user threads, i.e. Stop The World. The consequence is that the Java program freezes for a while, which reduces the application's responsiveness and is unacceptable for programs running on servers.
Why pause user threads during GC at all? First, if user threads were not paused, garbage would keep being produced and could never all be cleaned up. Second, running user threads inevitably change the reference relationships between objects, which leads to two situations: missed marks and wrong marks.
A missed mark: an object was not garbage originally, but during GC the user thread changed its references so that it is no longer reachable from the GC Roots and becomes garbage. This case is relatively benign: it just produces some floating garbage that the next GC will clean up.
A wrong mark: an object that the collector takes for garbage even though, during GC, the user thread has re-pointed a reference at it; once GC collects it, the program runs incorrectly.
To achieve concurrent collection, CMS is implemented in a much more complex way than the garbage collectors before it. The whole GC process can be divided roughly into four phases:
1. Initial mark: only the objects directly reachable from GC Roots are marked, which is very fast. This phase triggers STW, but because it only touches the roots its duration stays short and controllable even on a large heap, so the brief pause it causes can usually be ignored.
2. Concurrent mark: starting from the objects found by the initial mark, the collector traverses the whole object graph with those objects as roots. This phase takes a long time, and the marking time grows with heap size. Fortunately it does not trigger STW: user threads keep running and the program stays responsive, though its performance drops a little, because the GC threads take up some CPU and system resources, which makes this phase processor-sensitive. By default CMS starts (number of CPU cores + 3) / 4 GC threads, so with more than 4 cores the GC threads use less than 25% of the CPU; with fewer than 4 cores, the GC threads have a large impact and program performance drops noticeably.
3. Remark: because the user threads kept running during concurrent marking, they may have changed the reference relationships between objects, producing two situations: objects that could not be reclaimed before can be reclaimed now, and objects that could have been reclaimed before can no longer be reclaimed. In both cases CMS needs to pause the user threads and do a remark.
4. Concurrent sweep: after remarking is complete, CMS sweeps concurrently. This phase also takes a long time, and its cost grows with heap size, but it does not require STW: user threads keep running and the program does not stall, although, as in the concurrent mark phase, the GC threads still occupy some CPU and system resources, which lowers program performance somewhat.
CMS pioneered concurrent collection, making it possible for user threads and GC threads to work at the same time, but its shortcomings are also obvious:
1. Processor-sensitive: in the concurrent mark and concurrent sweep phases CMS does not trigger STW, but marking and sweeping are done by GC threads that take up CPU resources, so program performance and response speed drop. With more CPU cores this is tolerable; when CPU resources are tight, the GC threads have a large impact on program performance.
2. Floating garbage: during the concurrent sweep phase the user threads are still running, and the garbage they produce is called "floating garbage". Floating garbage cannot be cleaned in the current GC and has to wait until the next one.
3. Concurrent mode failure: because of floating garbage, CMS must reserve some space to hold the newly produced garbage; it cannot wait until the old generation is full before cleaning, the way the Serial Old collector does. In JDK 5, CMS is activated when 68% of the old generation is used, leaving 32% of the space for floating garbage, which is a rather conservative setting; if the old generation does not grow too quickly in practice, this threshold can be raised with the -XX:CMSInitiatingOccupancyFraction parameter. In JDK 6 the trigger threshold was raised to 92%, leaving only 8% for floating garbage. If the space CMS reserved cannot hold the floating garbage, a "concurrent mode failure" occurs, and the JVM has to fall back to its backup plan: the Serial Old collector is used to collect the old generation, which leads to a much longer pause.
4. Memory fragmentation: CMS uses the mark-sweep algorithm, which means the heap is left with a large number of memory fragments after sweeping. Too much fragmentation causes a lot of trouble; one example is allocating memory for large objects: there may clearly be plenty of free heap space in total, yet no contiguous region large enough for the object, so a Full GC has to be triggered and the GC pause becomes longer. For this situation CMS provides a remedy: with the -XX:CMSFullGCsBeforeCompaction parameter, after CMS has triggered N Full GCs because of fragmentation, the memory is compacted before the next Full GC. This parameter was deprecated in JDK 9.
4.2.3.1 Tri-color marking algorithm
Having introduced the CMS garbage collector, it is worth understanding how the GC threads of CMS can work alongside the user threads.
The vast majority of JVMs use the reachability analysis algorithm to decide whether an object can be reclaimed; for more on that algorithm, see the author's earlier article "Understanding the reachability analysis algorithm in plain language".
Traversal starts from the GC Roots: whatever is reachable survives, and whatever is unreachable is reclaimed.
CMS marks objects with three colors (white, gray, black); the marking process, sketched in code after the list below, goes roughly as follows:
At the start, all objects are white, meaning not yet visited.
The objects directly reachable from the GC Roots are set to gray.
Traverse all references of a gray object, set the gray object itself to black, and set the objects it references to gray.
Repeat step 3 until there are no gray objects left.
At the end, black objects survive and white objects are reclaimed.
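A minimal, single-threaded sketch of this tri-color traversal over an explicit object graph (the Node and Color types are illustrative; a real collector works on object headers and mark data, not on a user-level graph):

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;

    public class TriColorMarking {
        enum Color { WHITE, GRAY, BLACK }

        static class Node {
            Color color = Color.WHITE;           // every object starts white (not visited)
            final List<Node> refs = new ArrayList<>();
        }

        static void mark(List<Node> gcRoots) {
            Deque<Node> grayStack = new ArrayDeque<>();
            for (Node root : gcRoots) {          // objects directly reachable from GC Roots turn gray
                root.color = Color.GRAY;
                grayStack.push(root);
            }
            while (!grayStack.isEmpty()) {       // repeat until no gray objects remain
                Node n = grayStack.pop();
                for (Node ref : n.refs) {
                    if (ref.color == Color.WHITE) {
                        ref.color = Color.GRAY;  // newly discovered objects turn gray
                        grayStack.push(ref);
                    }
                }
                n.color = Color.BLACK;           // a fully scanned object turns black
            }
            // after marking: black objects survive, white objects are reclaimed
        }
    }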
The premise for this process to execute correctly is that no other thread changes the reference relationships between objects while it runs. During concurrent marking, however, the user threads are still running, so missed marks and wrong marks can occur.
Missed mark: suppose GC is already traversing object B when the user thread executes A.B = null, cutting the reference from A to B. After A.B = null, B, D and E could all be reclaimed, but because B has already been turned gray it is still treated as a live object and traversal continues. The end result is that this round of GC does not reclaim B, D and E; they are left for the next GC and become part of the floating garbage.
This problem can still be solved through the write barrier: as long as a write barrier is added when A's reference to B is overwritten, the fact that B was cut off is recorded, and those objects can be marked white again during remarking.
Wrong mark: suppose the GC thread has traversed as far as B when the user thread performs the following actions:
    B.d = null;   // the reference from B to D is cut off
    A.xx = d;     // a reference from A to D is established
The reference from B to D is cut off and a reference from A to D is established. The GC thread then continues its work: because B no longer references D, and although A now references D again, A has already been marked black, so GC will not traverse A again; D therefore stays white and is finally collected as garbage. As you can see, the consequence of a wrong mark is much more serious than that of a missed mark: floating garbage is simply cleaned up by the next GC, but reclaiming an object that should not be reclaimed makes the program run incorrectly.
A wrong mark occurs only when both of the following conditions hold:
All references from gray objects to the white object are broken.
A reference from a black object to the white object is established.
Breaking either one of the conditions is enough to solve the wrong-mark problem.
Snapshot at the beginning (SATB) breaks the first condition: when a reference from a gray object to a white object is broken, the reference relationship is recorded, and after the scan the collector scans again with those gray objects as roots. This means that, regardless of whether reference relationships are deleted, marking proceeds according to the snapshot of the object graph taken when the scan started.
Incremental update breaks the second condition: when a reference from a black object to a white object is established, the new reference relationship is recorded, and after the scan the collector rescans with the black objects in those records as roots. It is equivalent to a black object turning gray again as soon as it establishes a reference to a white object.
The solution CMS adopts is: write barrier + incremental update, breaking the second condition.
When a reference from black to white is established, the reference relationship is recorded through the write barrier, and after the scan the black objects in those records are used as roots for rescanning.
The pseudo code is roughly as follows:
    class A {
        private D d;

        public void setD(D d) {
            writeBarrier(d);   // insert a write barrier
            this.d = d;
        }

        private void writeBarrier(D d) {
            // record the reference relationship between A and D, and rescan later
        }
    }
4.3 Mixed collectors
4.3.1 G1
G1 stands for "Garbage First". It became officially supported in JDK 7 and the default collector in JDK 9, and it was introduced to replace the CMS collector.
Since it is meant to replace CMS, G1 is naturally also a concurrent and parallel garbage collector: user threads and GC threads can work at the same time, and it focuses on the application's response time.
One of G1's biggest changes is that its generations are only logical; there is no generational division in the physical layout. It divides the whole Java heap into a number of equally sized Regions; each Region can serve as an Eden region, a Survivor region or old-generation space as needed, and G1 applies different strategies to Regions playing different roles.
For all garbage collectors before G1, the scope of a collection was either the entire young generation (Minor GC), the entire old generation (Major GC), or the entire Java heap (Full GC). G1 breaks out of this cage: it can select any set of Regions in the heap to form a Collection Set (CSet) for collection. The criterion is no longer which generation a Region belongs to, but which Regions contain the most garbage and offer the highest collection value; that is where the name "Garbage First" comes from.
Although G1 still keeps the concept of generations, the young and old generations are no longer fixed, contiguous memory areas; each consists of a set of Regions, and the sizes of the young and old generations are adjusted dynamically at every GC. The reason G1 can control GC pause time and build a predictable pause-time model is that it uses the Region as the smallest unit of a single collection: the memory collected each time is an integral number of Regions, which avoids collecting garbage across the entire Java heap in one go.
G1 tracks the amount of garbage in each Region, computes each Region's collection value, and maintains a priority list in the background. Within the pause time the user allows for GC, it then collects the Regions with the most garbage first, which guarantees that G1 reclaims as much memory as possible in the limited time.
The entire recycling cycle of G1 can be divided into the following phases:
1. Young-generation GC: when the Eden regions are exhausted, a young GC is triggered to collect the Eden and Survivor regions. After the young GC, Eden is empty, at least one Survivor region is retained, and the remaining objects are either cleaned up or promoted to the old generation. The size of the young generation may be adjusted in the process.
2. Concurrent marking cycle:
2.1 Initial mark: marks only the objects directly reachable from GC Roots; it piggybacks on a young GC and causes STW.
2.2 Root region scan: the young GC triggered during the initial mark empties Eden and moves the survivors to Survivor regions; the old-generation objects directly reachable from those Survivor regions then need to be scanned and marked. This step can run concurrently.
2.3 Concurrent mark: similar to CMS, scans the whole heap for live objects and marks them, without triggering STW.
2.4 Remark: triggers STW and fixes up the references between objects that were changed while the user threads kept running during concurrent marking.
2.5 Exclusive cleanup: triggers STW, computes the collection value of each Region, sorts the Regions, and identifies the regions for mixed collection.
2.6 Concurrent cleanup: identifies and cleans up completely free Regions without causing a pause.
3. Mixed collection: the concurrent cleanup step of the marking cycle reclaims some space, but the proportion is still quite low. By then, however, G1 knows the collection value of every Region. In the mixed collection phase G1 preferentially collects the Regions with the most garbage; these Regions include not only young-generation but also old-generation Regions, hence the name "mixed collection". Live objects in the collected Regions are moved to other Regions, which also avoids memory fragmentation.
As with CMS, the user threads are still running, i.e. still allocating memory, during concurrent collection; if the collection speed cannot keep up with the allocation speed, G1 triggers a Full GC to obtain more available memory.
Use the parameter -XX:+UseG1GC to enable the G1 collector and -XX:MaxGCPauseMillis to set the target maximum pause time; G1 works toward this goal. If GC pause times exceed the target, G1 tries to adjust a series of parameters, such as the ratio of the young to the old generation, the heap size and the promotion age, to meet the preset goal. -XX:ParallelGCThreads sets the number of GC threads used during parallel collection, and -XX:InitiatingHeapOccupancyPercent specifies at what occupancy of the entire Java heap a concurrent marking cycle is started; the default value is 45.
4.3.2 Future-oriented ZGC
ZGC is a low-latency garbage collector added to JDK 11 as an experimental feature. Its goal is to keep GC pause times within 10 milliseconds for any heap size, with as little impact on throughput as possible.
ZGC targets very large heaps and supports a maximum heap of 4 TB. Like G1, it uses a Region-based memory layout.
One of ZGC's most distinctive features is its use of colored pointers (Colored Pointer) to mark objects. Previously, if the JVM needed to store some extra data on an object that only GC or the JVM itself uses (such as GC age, biased-lock thread ID or hash code), it usually added fields to the object header. ZGC instead records the marking information directly in the object's reference pointer.
What is a colored pointer, and why can the pointer that references an object store data itself? On a 64-bit system the theoretically addressable memory is 2 to the 64th power bytes, i.e. 16 EB. In practice nowhere near that much memory is needed, so both CPUs and operating systems impose their own limits for performance and cost reasons: the AMD64 architecture supports only a 52-bit (4 PB) address bus, Linux supports only 46-bit (64 TB) physical addresses, and Windows supports only 44-bit (16 TB) physical addresses.
On Linux the top 18 bits of a pointer cannot be used for addressing, and the remaining 46 bits can address at most 64 TB of memory, which already far exceeds what servers need today. ZGC therefore takes the 46-bit pointer width and reserves its top 4 bits to store four flags. From these flag bits the JVM can tell, directly from the pointer, the tri-color marking state of the referenced object, whether it has entered the relocation set (i.e. has been moved), and whether it can only be reached through the finalize() method. This leaves the JVM only 42 usable address bits, so the maximum memory ZGC can manage is 2 to the 42nd power bytes, i.e. 4 TB.
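A rough sketch of the colored-pointer layout described above (four metadata bits directly above a 42-bit address, following the JDK 11 ZGC convention; treat the exact bit positions and names as illustrative rather than as ZGC's actual source):

    public class ColoredPointerSketch {
        // low 42 bits: the object's address (up to 4 TB of heap)
        static final long ADDRESS_MASK    = (1L << 42) - 1;
        // four metadata bits above the address bits (illustrative positions)
        static final long MARKED0_BIT     = 1L << 42;
        static final long MARKED1_BIT     = 1L << 43;
        static final long REMAPPED_BIT    = 1L << 44;
        static final long FINALIZABLE_BIT = 1L << 45;

        static long address(long coloredPointer) {
            return coloredPointer & ADDRESS_MASK;         // strip the color bits to get the real address
        }

        static boolean isRemapped(long coloredPointer) {
            return (coloredPointer & REMAPPED_BIT) != 0;  // pointer already points at the object's current location
        }

        public static void main(String[] args) {
            long p = (123456L & ADDRESS_MASK) | MARKED0_BIT;  // a pointer "colored" as marked in this cycle
            System.out.println("address = " + address(p) + ", remapped = " + isRemapped(p));
        }
    }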