How to understand GC knowledge point CMS

2025-07-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Many newcomers find the CMS garbage collector hard to understand. This article summarizes when CMS is triggered, how it works, and what its drawbacks are.

Today I'll fill in some parts that were not discussed in detail before, starting with the garbage collectors.

There are three basic collection algorithms: mark-sweep, mark-compact, and copying. The garbage collectors built on them are:

Serial collector

A young generation collector that uses the stop-and-copy algorithm. It performs GC serially on a single thread, and all other worker threads are paused while it runs.

Use the -XX:+UseSerialGC switch to collect in Serial + Serial Old mode (this is also the default for virtual machines running in Client mode).

ParNew collector

A young generation collector that uses the stop-and-copy algorithm. It is the multithreaded version of the Serial collector: several GC threads work in parallel while the other worker threads are paused, with the aim of shortening garbage collection time.

Use the -XX:+UseParNewGC switch to collect with the ParNew + Serial Old combination, and use -XX:ParallelGCThreads to set the number of threads that perform the collection.

Parallel Scavenge collector

A young generation collector that uses the stop-and-copy algorithm and focuses on CPU throughput, that is, time spent running user code / total time. For example, if the JVM runs for 100 minutes, of which 99 minutes are user code and 1 minute is garbage collection, the throughput is 99%. This kind of collector makes the most efficient use of the CPU and suits background workloads. Collectors that focus on shortening pause times, such as CMS, keep waiting time short, so they suit user interaction and improve the user experience.

Use the -XX:+UseParallelGC switch to collect with the Parallel Scavenge + Serial Old combination (this is also the default in Server mode). Use -XX:GCTimeRatio to set the ratio of application time to garbage collection time; it defaults to 99, which means 1 / (1 + 99) = 1% of the time is spent on garbage collection. Use -XX:MaxGCPauseMillis to set the maximum GC pause time (among the collectors listed here, this parameter is only valid for Parallel Scavenge). The switch -XX:+UseAdaptiveSizePolicy enables dynamic tuning, such as automatically adjusting the Eden / Survivor ratio, the age at which objects are promoted to the old generation, and the size of the young generation. This parameter is not available with ParNew.
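The throughput arithmetic above can be sketched in a few lines of Java. This is only an illustration of the two formulas mentioned in the text; the class and method names are made up for the example and are not part of the JVM.

```java
// Illustrative sketch only: ThroughputSketch and its methods are hypothetical names.
final class ThroughputSketch {

    // Throughput = time running user code / total time.
    static double throughput(double userMinutes, double gcMinutes) {
        return userMinutes / (userMinutes + gcMinutes);
    }

    // With -XX:GCTimeRatio=N, the goal for GC time is 1 / (1 + N) of total time.
    static double gcTimeGoal(int gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // The example from the text: 99 minutes of user code, 1 minute of GC.
        System.out.println("throughput   = " + throughput(99, 1));
        // The default GCTimeRatio of 99 corresponds to a 1% GC time goal.
        System.out.println("GC time goal = " + gcTimeGoal(99));
    }
}
```

Raising GCTimeRatio tightens the GC time goal: a value of 19, for instance, would allow 1 / 20 = 5% of the time for garbage collection.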

Serial Old collector

An old generation collector. It is single-threaded and serial, uses the mark-compact algorithm, and pauses all other worker threads while it runs (note: mark-compact collection in the old generation also requires pausing the other threads). Before JDK 1.5, the Serial Old collector was paired with Parallel Scavenge.

Note: here the compaction consists of Sweep and Compact. Sweep removes the dead objects and leaves only the survivors. Compact moves the surviving objects to fill the gaps, so that memory ends up divided into two parts, one full of objects and one free.

Parallel Old collector

An old generation collector. It is multithreaded and parallel, with a multithreading mechanism similar to that of Parallel Scavenge, and it uses the mark-compact algorithm. While Parallel Old runs, the other worker threads still need to be paused.

Note: compaction in the Parallel Old collector differs from Serial Old. Here it consists of Copy and Compact. Copy means copying the surviving objects to a prepared area, rather than clearing the dead objects as Sweep does.

Parallel Old is useful on multicore machines. Since its introduction in JDK 1.6 it has paired well with Parallel Scavenge, letting the throughput-first design of the Parallel Scavenge collector take full effect. Use the -XX:+UseParallelOldGC switch to collect with the Parallel Scavenge + Parallel Old combination.

CMS

Its full name is Concurrent Mark Sweep. It is an old generation collector that aims for the shortest possible collection pauses. It uses the mark-sweep algorithm and multiple threads. Its advantage is concurrent collection (user threads and GC threads can work at the same time), so pauses are short.

Use -XX:+UseConcMarkSweepGC to collect with the ParNew + CMS + Serial Old combination. ParNew + CMS is used by preference (see below); when the user threads run out of memory during a concurrent collection, the Serial Old collector is used as the fallback.
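To check which collector combination a switch actually selected, one option is to ask the running JVM through the standard java.lang.management API. The sketch below is a minimal example; the class name ActiveCollectors is made up for illustration, and the bean names it prints depend on the JVM vendor and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: ActiveCollectors is a hypothetical helper class.
final class ActiveCollectors {

    // Returns the names of the collectors registered in this JVM.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // One line per collector, usually one for the young and one for the old generation.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

On a HotSpot JVM that still supports the flag, running this with -XX:+UseConcMarkSweepGC typically reports ParNew and ConcurrentMarkSweep, which matches the ParNew + CMS pairing described above.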

How to start

First, let's look at the conditions under which CMS performs GC:

The JVM decides when to start a collection based on -XX:CMSInitiatingOccupancyFraction and -XX:+UseCMSInitiatingOccupancyOnly.

If -XX:+UseCMSInitiatingOccupancyOnly is set, CMS GC is triggered only when the old generation occupancy actually reaches the ratio set by -XX:CMSInitiatingOccupancyFraction.

If -XX:+UseCMSInitiatingOccupancyOnly is not set, the JVM also decides when to trigger CMS GC based on runtime statistics. This is why you sometimes see CMS GC configured for 80% but already triggered at 50%: -XX:+UseCMSInitiatingOccupancyOnly was not set.
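The interaction of the two switches can be modeled as a small decision function. This is a simplified sketch, not HotSpot's actual code: the class name is made up, and the parameter heuristicWantsCycle is a hypothetical stand-in for whatever the JVM's runtime statistics conclude.

```java
// Illustrative model only: this is not how HotSpot is implemented internally.
final class CmsTriggerSketch {

    /**
     * @param occupancyPercent    current old generation occupancy, 0-100
     * @param initiatingFraction  value of -XX:CMSInitiatingOccupancyFraction
     * @param occupancyOnly       whether -XX:+UseCMSInitiatingOccupancyOnly is set
     * @param heuristicWantsCycle hypothetical stand-in for the statistics-based decision
     */
    static boolean shouldStartCycle(int occupancyPercent, int initiatingFraction,
                                    boolean occupancyOnly, boolean heuristicWantsCycle) {
        boolean thresholdReached = occupancyPercent >= initiatingFraction;
        if (occupancyOnly) {
            // Only the configured occupancy threshold counts.
            return thresholdReached;
        }
        // Otherwise the statistics may start a cycle before the threshold is reached.
        return thresholdReached || heuristicWantsCycle;
    }

    public static void main(String[] args) {
        // Configured for 80%, currently at 50%, statistics suggest collecting now.
        System.out.println(shouldStartCycle(50, 80, false, true)); // early trigger possible
        System.out.println(shouldStartCycle(50, 80, true, true));  // threshold strictly enforced
    }
}
```

The first call models the "set to 80% but triggered at 50%" case from the text; the second shows how setting the switch suppresses it.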

Specific implementation

The implementation process of CMS GC is as follows:

Initial mark (CMS-initial-mark)

This phase is a stop-the-world phase, so it marks only the objects that are directly reachable from the root set.

This phase prints a log entry such as CMS-initial-mark:961330K (1572864K), which shows the used space and the total space of the old generation, together with the time taken.
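The two numbers in that log entry are enough to compute the old generation occupancy. Below is a minimal parsing sketch using the sample entry quoted above; the class name and the regular expression are assumptions made for illustration, and real GC logs wrap the entry in extra text such as timestamps, which the pattern simply skips.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: InitialMarkLog is a hypothetical helper class.
final class InitialMarkLog {

    // Matches "CMS-initial-mark:<used>K (<capacity>K)", tolerating optional spaces.
    private static final Pattern ENTRY =
            Pattern.compile("CMS-initial-mark:\\s*(\\d+)K\\s*\\((\\d+)K\\)");

    // Returns used / capacity of the old generation as a fraction between 0 and 1.
    static double occupancy(String line) {
        Matcher m = ENTRY.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not an initial-mark entry: " + line);
        }
        long usedKb = Long.parseLong(m.group(1));
        long capacityKb = Long.parseLong(m.group(2));
        return (double) usedKb / capacityKb;
    }

    public static void main(String[] args) {
        // The sample entry from the text: roughly 61% of the old generation is in use.
        System.out.println(occupancy("CMS-initial-mark:961330K (1572864K)"));
    }
}
```

Comparing this occupancy with the configured initiating fraction is a quick way to see why a particular CMS cycle started.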

Concurrent mark (CMS-concurrent-mark)

This phase runs concurrently with the application threads, and this is what the term "concurrent collector" refers to. Its main job is to mark reachable objects, and the user threads do not need to pause during this phase.

Two log entries are printed in this phase: CMS-concurrent-mark-start and CMS-concurrent-mark.

Pre-cleaning (CMS-concurrent-preclean)

The main purpose of this phase is to do some pre-cleaning. Because marking runs concurrently with the application threads, the state of some objects changes after they have been marked, and this phase handles those changes. The later CMS-remark phase is also stop-the-world, so doing part of that work in the preclean phase keeps the remark pause as short as possible.

Two log entries are printed in this phase: CMS-concurrent-preclean-start and CMS-concurrent-preclean.

Abortable pre-cleaning (CMS-concurrent-abortable-preclean)

The purpose of this phase is to make CMS GC more controllable and to do more pre-cleaning, which reduces the application pause caused by the CMS-remark phase.

Several parameters are involved in this phase:

-XX:CMSMaxAbortablePrecleanTime: the abortable-preclean phase ends when this time is reached.

-XX:CMSScheduleRemarkEdenSizeThreshold (default 2m): controls when the abortable-preclean phase starts, that is, the phase starts once Eden usage reaches this value.

-XX:CMSScheduleRemarkEdenPenetration (default 50%): controls when the abortable-preclean phase ends.

Three log entries are printed in this phase: CMS-concurrent-abortable-preclean-start, CMS-concurrent-abortable-preclean, and CMS:abort preclean due to time XXX.

Remark (CMS-remark)

The application threads are paused in this phase. The pause is much shorter than the concurrent mark phase but somewhat longer than the initial mark, because objects are rescanned and marked again.

The following log entries are printed in this phase:

YG occupancy:964861K (2403008K): the state of the young generation when the phase runs.

CMS remark:961330K (1572864K): the state of the old generation when the phase runs.

In addition, the time taken by steps such as weak reference processing and class unloading is printed.

Concurrent sweep (CMS-concurrent-sweep)

Garbage is swept concurrently in this phase.

Concurrent reset (CMS-concurrent-reset)

This phase resets the relevant data structures and then waits for the next CMS GC to be triggered.

Summary

The CMS collection process can be summarized as: two marking phases, two pre-cleaning phases, one remark, and one sweep.

During a CMS collection, only the initial mark and the remark need to pause the user threads briefly. Concurrent marking and concurrent sweeping do not pause the user threads, so CMS is efficient and well suited to highly interactive applications.
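The phases walked through above can be written down as a small Java enum that records which of them stop the world. This is only a summary aid; the enum and its field names are made up for illustration, while the log names are the ones quoted in the text.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative summary only: CmsPhase is a hypothetical type, not a JVM class.
enum CmsPhase {
    INITIAL_MARK("CMS-initial-mark", true),
    CONCURRENT_MARK("CMS-concurrent-mark", false),
    PRECLEAN("CMS-concurrent-preclean", false),
    ABORTABLE_PRECLEAN("CMS-concurrent-abortable-preclean", false),
    REMARK("CMS-remark", true),
    CONCURRENT_SWEEP("CMS-concurrent-sweep", false),
    CONCURRENT_RESET("CMS-concurrent-reset", false);

    final String logName;       // name as it appears in the GC log
    final boolean stopTheWorld; // true if user threads are paused

    CmsPhase(String logName, boolean stopTheWorld) {
        this.logName = logName;
        this.stopTheWorld = stopTheWorld;
    }

    // Returns the phases that pause the user threads, in execution order.
    static List<CmsPhase> pausingPhases() {
        List<CmsPhase> result = new ArrayList<>();
        for (CmsPhase phase : values()) {
            if (phase.stopTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (CmsPhase phase : values()) {
            System.out.println(phase.logName + (phase.stopTheWorld ? " (pauses user threads)" : " (concurrent)"));
        }
    }
}
```

Only two of the seven entries pause the application, which is the reason CMS pauses stay short even when the old generation is large.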

CMS also has drawbacks: it consumes additional CPU and memory. When CPU and memory are scarce, it adds to the load on the system (the default number of CMS threads is (number of CPUs + 3) / 4).

In addition, the user threads keep running during concurrent collection and keep producing garbage, so "floating garbage" may appear (it cannot be cleaned in this cycle and has to wait for the next GC). Therefore, enough memory must be reserved for the user threads while GC is running.
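The thread count formula quoted above uses integer division, which is easy to check in code. The sketch below simply evaluates the formula as stated in the text; the class and method names are made up, and the result is not read from the JVM itself.

```java
// Illustrative sketch only: evaluates the formula from the text, not a JVM setting.
final class CmsThreadsSketch {

    // (number of CPUs + 3) / 4, using integer division as in the text.
    static int defaultCmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(cpus + " CPUs -> " + defaultCmsThreads(cpus) + " CMS thread(s)");
    }
}
```

With this formula, machines with 1 to 4 CPUs get a single CMS thread, so on small machines that thread competes for a noticeable share of the CPU with the application.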

So a collector that uses CMS does not wait until the old generation is full before collecting. It starts collecting once most of it is in use (the default is 68%, roughly 2/3, set with -XX:CMSInitiatingOccupancyFraction). If the user threads do not consume much memory, you can raise -XX:CMSInitiatingOccupancyFraction appropriately to reduce the number of GCs and improve performance. If the memory reserved for the user threads is not enough, a Concurrent Mode Failure occurs and the fallback is triggered: the Serial Old collector performs the collection, but the pause is long. For this reason -XX:CMSInitiatingOccupancyFraction should not be set too high.
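To get a feel for the trade-off, the reserved headroom can be computed for a given old generation size. The sketch below uses the 1572864K old generation from the log samples earlier in the article and the 68% default mentioned above; the class and method names are made up for illustration.

```java
// Illustrative arithmetic only: OccupancyThresholdSketch is a hypothetical helper.
final class OccupancyThresholdSketch {

    // Old generation usage (in KB) at which a CMS cycle would start.
    static long triggerThresholdKb(long capacityKb, int fractionPercent) {
        return capacityKb * fractionPercent / 100;
    }

    // Memory (in KB) left for user threads while the concurrent cycle runs.
    static long headroomKb(long capacityKb, int fractionPercent) {
        return capacityKb - triggerThresholdKb(capacityKb, fractionPercent);
    }

    public static void main(String[] args) {
        long oldGenKb = 1572864L; // old generation size from the sample log entries
        for (int fraction : new int[] {68, 80, 92}) {
            System.out.println(fraction + "% -> start at " + triggerThresholdKb(oldGenKb, fraction)
                    + "K, headroom " + headroomKb(oldGenKb, fraction) + "K");
        }
    }
}
```

A higher fraction means fewer collections but less headroom, and when allocation outpaces the concurrent cycle within that headroom, the Concurrent Mode Failure described above is the result.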

Also, CMS uses the mark-sweep algorithm, which can lead to memory fragmentation. You can use -XX:+UseCMSCompactAtFullCollection to enable compaction after a Full GC, and -XX:CMSFullGCsBeforeCompaction to set how many Full GCs without compaction are performed before one Full GC with compaction.
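The counting behavior of -XX:CMSFullGCsBeforeCompaction can be modeled with a simple counter. This is a sketch of the rule as described in the text, assuming -XX:+UseCMSCompactAtFullCollection is enabled; the class is hypothetical and does not reflect HotSpot's internal bookkeeping.

```java
// Illustrative model only: assumes -XX:+UseCMSCompactAtFullCollection is enabled.
final class CompactionPolicySketch {

    private final int fullGcsBeforeCompaction; // value of -XX:CMSFullGCsBeforeCompaction
    private int uncompactedSinceLast = 0;      // Full GCs without compaction so far

    CompactionPolicySketch(int fullGcsBeforeCompaction) {
        this.fullGcsBeforeCompaction = fullGcsBeforeCompaction;
    }

    // Records one Full GC and reports whether that Full GC compacts the heap.
    boolean nextFullGcCompacts() {
        if (uncompactedSinceLast >= fullGcsBeforeCompaction) {
            uncompactedSinceLast = 0;
            return true;
        }
        uncompactedSinceLast++;
        return false;
    }

    public static void main(String[] args) {
        // With a value of 2: two Full GCs without compaction, then one with compaction.
        CompactionPolicySketch policy = new CompactionPolicySketch(2);
        for (int i = 1; i <= 6; i++) {
            System.out.println("Full GC " + i + " compacts: " + policy.nextFullGcCompacts());
        }
    }
}
```

In this model a value of 0 means every Full GC compacts, which trades longer Full GC pauses for less fragmentation.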

Concurrency and parallelism

Concurrent collection: the user threads and the GC threads execute during the same period (not necessarily in parallel; they may alternate, but overall they make progress together). The user threads do not need to be paused (in fact, in CMS the user threads still have to pause, but only very briefly, and the GC threads run on another CPU).

Parallel collection: multiple GC threads work in parallel, but the user threads are paused during that time.

So the Serial collector is serial, the Parallel collectors are parallel, and the CMS collector is concurrent.
