How to use the GC algorithm to implement the Garbage-First (G1) algorithm

2025-04-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how the G1 (Garbage-First) garbage collection algorithm works. It walks through the collector's design and its GC logs in detail and should serve as a useful reference. Let's take a look.

G1: Garbage-First (garbage-first algorithm)

The main design goal of G1 is to make the duration and distribution of stop-the-world (STW) pauses predictable and configurable. G1 is in fact a soft real-time garbage collector, which means you can set specific performance goals for it: you can request that STW pauses take no more than x milliseconds within any time window of y milliseconds, for example no more than 5 milliseconds in any given second. Garbage-First GC will do its best to meet this goal with high probability, but not with certainty, since that would make it hard real-time.
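To make the "x milliseconds in any y millisecond window" goal concrete, here is a minimal Java sketch that checks a list of recorded pauses against such a goal. The class and method names are made up for illustration, the pauses are assumed not to overlap, and this is not how G1 enforces the goal internally. In HotSpot the two numbers are usually supplied through the -XX:MaxGCPauseMillis and -XX:GCPauseIntervalMillis options; check the documentation of your JVM version before relying on them.

```java
import java.util.Arrays;
import java.util.List;

final class PauseGoalCheck {

    /** One stop-the-world pause: start time and duration, both in milliseconds. */
    static final class Pause {
        final double startMs;
        final double durationMs;

        Pause(double startMs, double durationMs) {
            this.startMs = startMs;
            this.durationMs = durationMs;
        }

        double endMs() {
            return startMs + durationMs;
        }
    }

    /** Total pause time that falls inside the window [windowStart, windowStart + windowMs). */
    static double pausedWithin(List<Pause> pauses, double windowStart, double windowMs) {
        double windowEnd = windowStart + windowMs;
        double total = 0.0;
        for (Pause p : pauses) {
            double overlap = Math.min(p.endMs(), windowEnd) - Math.max(p.startMs, windowStart);
            if (overlap > 0) {
                total += overlap;
            }
        }
        return total;
    }

    /**
     * Returns true if no window of windowMs contains more than maxPauseMs of pause time.
     * The paused time inside a sliding window peaks when the window either begins at a
     * pause start or ends at a pause end, so only those windows are checked.
     */
    static boolean meetsGoal(List<Pause> pauses, double maxPauseMs, double windowMs) {
        for (Pause p : pauses) {
            if (pausedWithin(pauses, p.startMs, windowMs) > maxPauseMs
                    || pausedWithin(pauses, p.endMs() - windowMs, windowMs) > maxPauseMs) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Goal from the text: at most 5 ms of pause in any 1000 ms window.
        List<Pause> spread = Arrays.asList(new Pause(0, 3), new Pause(1200, 3));
        List<Pause> clustered = Arrays.asList(new Pause(0, 3), new Pause(500, 3));
        System.out.println(meetsGoal(spread, 5, 1000));    // true: never more than 3 ms per window
        System.out.println(meetsGoal(clustered, 5, 1000)); // false: 6 ms inside the first second
    }
}
```

Two 3 ms pauses are fine when they are more than a second apart, but violate the goal when they fall into the same second, even though each pause on its own is shorter than 5 ms.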

To achieve this, G1 builds on several insights. First, the heap no longer has to be split into contiguous young and old generations. Instead, it is divided into many (typically about 2048) smaller heap regions that hold objects. Each region may be an Eden region, a Survivor region, or an Old region. Logically, all Eden and Survivor regions together form the young generation, and all Old regions together form the old generation.
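As a rough illustration of where a number like 2048 comes from, the following Java sketch derives a region size from the heap size. It is a simplified approximation, not the JVM's actual code: the target of 2048 regions comes from the text above, while the power-of-two rounding and the 1 MB and 32 MB bounds are assumptions that match commonly documented defaults but vary between JDK versions.

```java
final class RegionSizing {

    static final long MB = 1024L * 1024L;
    static final long TARGET_REGION_COUNT = 2048; // "typically about 2048" regions
    static final long MIN_REGION_SIZE = 1 * MB;   // assumed lower bound
    static final long MAX_REGION_SIZE = 32 * MB;  // assumed upper bound (differs by JDK version)

    /** Picks a power-of-two region size that yields roughly TARGET_REGION_COUNT regions. */
    static long regionSize(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION_SIZE);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of two
        return Math.min(powerOfTwo, MAX_REGION_SIZE);
    }

    /** Number of regions the heap is split into for the chosen region size. */
    static long regionCount(long heapBytes) {
        return heapBytes / regionSize(heapBytes);
    }

    public static void main(String[] args) {
        long[] heapsInMb = {256, 4096, 6144};
        for (long heapMb : heapsInMb) {
            long heapBytes = heapMb * MB;
            System.out.println(heapMb + " MB heap: region size "
                    + regionSize(heapBytes) / MB + " MB, "
                    + regionCount(heapBytes) + " regions");
        }
    }
}
```

Under these assumptions a 4 GB heap ends up with exactly 2048 regions of 2 MB, while a small 256 MB heap gets only 256 regions because the region size cannot drop below 1 MB.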

This partitioning means the GC does not have to collect the entire heap at once and can instead work incrementally: only a subset of the regions, called the collection set, is processed at a time. Every pause collects all young regions, but may include only some of the old ones.

Another novelty of G1 is that during the concurrent phase it estimates the amount of live data in each region. This is used to build the collection set: the regions that contain the most garbage are collected first. Hence the name: garbage-first.
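The "most garbage first" rule can be sketched in a few lines of Java. This is an illustrative model only: the OldRegion class, its fields, and the idea of using a budget of live bytes to copy as a stand-in for the pause-time goal are assumptions made for this example, not G1's real data structures or heuristics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CollectionSetSketch {

    /** Hypothetical per-region liveness data, as estimated during concurrent marking. */
    static final class OldRegion {
        final String name;
        final long liveBytes;
        final long capacityBytes;

        OldRegion(String name, long liveBytes, long capacityBytes) {
            this.name = name;
            this.liveBytes = liveBytes;
            this.capacityBytes = capacityBytes;
        }

        long garbageBytes() {
            return capacityBytes - liveBytes;
        }
    }

    /**
     * Adds old regions to the collection set, most garbage first, until copying
     * the next region's live data would exceed the given budget.
     */
    static List<OldRegion> chooseOldRegions(List<OldRegion> candidates, long maxLiveBytesToCopy) {
        List<OldRegion> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Long.compare(b.garbageBytes(), a.garbageBytes()));

        List<OldRegion> chosen = new ArrayList<>();
        long budget = maxLiveBytesToCopy;
        for (OldRegion region : sorted) {
            if (region.liveBytes > budget) {
                break; // the next-best region no longer fits into this pause
            }
            chosen.add(region);
            budget -= region.liveBytes;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<OldRegion> candidates = Arrays.asList(
                new OldRegion("A", 900, 1024),  // mostly live: expensive to evacuate, little gain
                new OldRegion("B", 100, 1024),  // mostly garbage: cheap to evacuate, big gain
                new OldRegion("C", 500, 1024));
        for (OldRegion region : chooseOldRegions(candidates, 700)) {
            System.out.println(region.name); // B, then C
        }
    }
}
```

Regions with the most garbage are also the cheapest to evacuate, because little live data has to be copied, which is why this ordering fits a pause-time budget well.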

To enable the G1 collector, the command line arguments used are:

java -XX:+UseG1GC com.mypackages.MyExecutableClass

Evacuation Pause: Fully Young (evacuation pause: fully-young mode)

When the application has just started, G1 has not yet run a concurrent phase, so it has no additional information and operates in the initial fully-young mode. When the young generation fills up, the application threads are stopped and the live objects in the young regions are copied to survivor regions; if there are no survivor regions yet, any free regions are chosen to become survivor regions.

This copying process is called evacuation, and it works in much the same way as in the other young-generation collectors. The full log of an evacuation pause is rather long, so parts that are not relevant here have been left out for simplicity; they are explained in detail after the concurrent phases. In addition, because of the sheer size of the log record, the parallel phase and the "Other" phase are extracted into separate parts and explained there:

0.134: [GC pause (G1 Evacuation Pause) (young), 0.0144119 secs]
   [Parallel Time: 13.9 ms, GC Workers: 8]
      ...
   [Code Root Fixup: 0.0 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 0.4 ms]
      ...
   [Eden: 24.0M(24.0M)->0.0B(13.0M) Survivors: 0.0B->3072.0K Heap: 24.0M(256.0M)->21.9M(256.0M)]
   [Times: user=0.04 sys=0.04, real=0.02 secs]

0.134: [GC pause (G1 Evacuation Pause) (young), 0.0144119 secs] - a G1 evacuation pause that cleans only young regions. The pause started 134 ms after the JVM started and lasted 0.0144 seconds of wall-clock time.

[Parallel Time: 13.9 ms, GC Workers: 8] - indicates that the activities listed next were carried out in parallel by 8 worker threads and took 13.9 ms of real time.

... - omitted here for readability; see the section on the parallel phase below.

[Code Root Fixup: 0.0 ms] - freeing the data structures used to manage the parallel activities. This should always be close to zero. It is done serially.

[Code Root Purge: 0.0 ms] - cleaning up more data structures. This should be very fast, though not necessarily close to zero. It is done serially.

[Other: 0.4 ms] - miscellaneous other activities, many of which are also parallelized.

... - see the section on the "Other" phase below.

[Eden: 24.0M(24.0M)->0.0B(13.0M) - the used space and total capacity of the Eden regions before and after the pause.

Survivors: 0.0B->3072.0K - the space used by the survivor regions before and after the pause.

Heap: 24.0M(256.0M)->21.9M(256.0M)] - the used space and total capacity of the entire heap before and after the pause.

[Times: user=0.04 sys=0.04, real=0.02 secs] - the duration of the GC event, measured in three categories:

user - the total CPU time consumed by the GC threads during this collection.

sys - the time spent in system calls and waiting for system events during the collection.

real - the wall-clock time during which the application was stopped. With a parallel GC, this number is ideally close to (user time + system time) / number of GC threads; 8 threads are used here. Note that some portion of the work can never be parallelized, so the actual value is always somewhat higher than that ratio.

Note: wall-clock time (elapsed time) is the time that passes on the system clock from the start of an operation to its end. For a single-threaded program it is generally greater than the CPU time; when several GC threads work in parallel, the total CPU time can exceed it, as in the log above.
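The relation between user, sys, and real can be checked against the sample log. The following Java sketch parses the Times entry with a regular expression written for the exact format shown above (an assumption: other JVM versions or logging options may format it differently) and computes the ideal parallel time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcTimes {

    // Matches the "[Times: user=... sys=..., real=... secs]" entry as it appears in the log above.
    private static final Pattern TIMES = Pattern.compile(
            "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    /** Returns {user, sys, real} in seconds. */
    static double[] parse(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no Times entry in: " + line);
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2)),
            Double.parseDouble(m.group(3))
        };
    }

    /** Lower bound for the pause if all CPU work were spread perfectly over the GC threads. */
    static double idealParallelSeconds(double user, double sys, int gcThreads) {
        return (user + sys) / gcThreads;
    }

    public static void main(String[] args) {
        double[] t = parse("[Times: user=0.04 sys=0.04, real=0.02 secs]");
        double ideal = idealParallelSeconds(t[0], t[1], 8); // 8 from "GC Workers: 8"
        System.out.println("ideal parallel time (s): " + ideal);
        System.out.println("measured real time (s):  " + t[2]);
    }
}
```

For the sample log this gives (0.04 + 0.04) / 8 = 0.01 seconds against a measured 0.02 seconds. The gap comes from the serial parts of the pause and from the coarse two-decimal rounding of the logged values.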

The most onerous GC tasks are performed by multiple dedicated worker threads. The following log describes their behavior:

[Parallel Time: 13.9 ms, GC Workers: 8]
   [GC Worker Start (ms): Min: 134.0, Avg: 134.1, Max: 134.1, Diff: 0.1]
   [Ext Root Scanning (ms): Min: 0.1, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.2]
   [Update RS (ms): Min: 0, Avg: 0, Max: 0, Diff: 0, Sum: 0]
      [Processed Buffers: Min: 0, Avg: 0, Max: 0, Diff: 0, Sum: 0]
   [Scan RS (ms): Min: 0, Avg: 0, Max: 0, Diff: 0, Sum: 0]
   [Code Root Scanning (ms): Min: 0, Avg: 0, Max: 0, Diff: 0.2, Sum: 0.2]
   [Object Copy (ms): Min: 10.8, Avg: 12.1, Max: 12.6, Diff: 1.9, Sum: 96.5]
   [Termination (ms): Min: 0.8, Avg: 1.5, Max: 2.8, Diff: 1.9, Sum: 12.2]
      [Termination Attempts: Min: 173, Avg: 293.2, Max: 362, Diff: 189, Sum: 2346]
   [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
   [GC Worker Total (ms): Min: 13.7, Avg: 13.8, Max: 13.8, Diff: 0.1, Sum: 110.2]
   [GC Worker End (ms): Min: 147.8, Avg: 147.8, Max: 147.8, Diff: 0.0]

[Parallel Time: 13.9 ms, GC Workers: 8] - indicates that the following activities were carried out in parallel by 8 threads and took 13.9 ms of real time.

[GC Worker Start (ms) - the moment at which the GC worker threads started their work, expressed on the same time scale as the timestamp at the start of the pause. If Min and Max differ a lot, it may indicate that too many threads are in use or that other processes on the machine are stealing CPU time from the GC.

[Ext Root Scanning (ms) - how long it took to scan the external (non-heap) roots, such as classloaders, JNI references, and JVM system roots. The values are elapsed times; "Sum" is the CPU time added up over all workers.

[Code Root Scanning (ms) - how long it took to scan the roots that come from the actual code, such as local variables.

[Object Copy (ms) - how long it took to copy the live objects out of the collected regions.

[Termination (ms) - how long it took the GC worker threads to make sure that they can stop safely because no work is left, after which they actually terminate.

[Termination Attempts - how many times the GC worker threads tried to terminate. An attempt fails if the worker discovers that there is still work left to do, in which case it is too early to terminate.

[GC Worker Other (ms) - other small activities that do not deserve a separate entry in the GC log.

[GC Worker Total (ms) - the total time the GC worker threads spent working.

[GC Worker End (ms) - the timestamp at which the GC worker threads finished their job. Normally these numbers should be roughly equal; otherwise it may indicate that too many threads are hanging around, or a noisy neighbor.
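Since each of these entries carries Min, Avg, Max, Diff, and Sum values, a quick sanity check is to compare Sum divided by the number of workers with Avg. The Java sketch below does that for one line; the regular expression is an assumption based on the log format shown in this article and is deliberately tolerant of missing commas.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WorkerStatLine {

    // Matches fields such as "Avg: 12.1" or "Sum: 2346".
    private static final Pattern FIELD =
            Pattern.compile("(Min|Avg|Max|Diff|Sum): ([0-9]+(?:\\.[0-9]+)?)");

    /** Extracts the Min/Avg/Max/Diff/Sum values that are present in one log line. */
    static Map<String, Double> parse(String line) {
        Map<String, Double> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), Double.parseDouble(m.group(2)));
        }
        return fields;
    }

    public static void main(String[] args) {
        int workers = 8; // from "GC Workers: 8"
        Map<String, Double> copy = parse(
                "[Object Copy (ms): Min: 10.8, Avg: 12.1, Max: 12.6, Diff: 1.9, Sum: 96.5]");
        // Sum is the time added up over all workers, so Sum / workers should be close to Avg.
        System.out.println("Sum / workers = " + copy.get("Sum") / workers);
        System.out.println("Avg           = " + copy.get("Avg"));
    }
}
```

For the Object Copy line, 96.5 / 8 is 12.0625 ms, which agrees with the logged average of 12.1 ms after rounding. A large Diff on such a line would point to unevenly distributed work among the workers.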

In addition, a number of miscellaneous activities are performed during the evacuation pause. Only some of them are covered here; the rest are discussed later.

[Other: 0.4 ms]
   [Choose CSet: 0.0 ms]
   [Ref Proc: 0.2 ms]
   [Ref Enq: 0.0 ms]
   [Redirty Cards: 0.1 ms]
   [Humongous Register: 0.0 ms]
   [Humongous Reclaim: 0.0 ms]
   [Free CSet: 0.0 ms]

[Other: 0.4 ms] - miscellaneous other activities, many of which are also parallelized.

[Ref Proc: 0.2 ms] - the time it took to process non-strong references: clearing them or determining that no clearing is needed.

[Ref Enq: 0.0 ms] - the time it took to enqueue the remaining non-strong references to the appropriate ReferenceQueue.

[Free CSet: 0.0 ms] - the time it took to return the freed regions of the collection set, so that they are available for new allocations.

Concurrent Marking (concurrent marking)

Many concepts in the G1 collector build on CMS, so it helps to have a good understanding of CMS before reading on. Although the two differ in many ways, the goals of concurrent marking are very similar. G1 concurrent marking uses the Snapshot-At-The-Beginning (SATB) approach, which marks all objects that were live at the beginning of the marking cycle, even if they have turned into garbage in the meantime. This liveness information makes it possible to build up liveness statistics for each region so that the collection set can be chosen efficiently.

This information is then used to perform garbage collection in the old regions. That can happen fully concurrently, if marking determines that a region contains only garbage, or during an STW evacuation pause for old regions that contain both garbage and live objects.

Concurrent marking starts when the overall occupancy of the heap reaches a certain threshold. The default is 45%, and it can be changed with the JVM option InitiatingHeapOccupancyPercent. As in CMS, concurrent marking in G1 consists of several phases, some of them fully concurrent and some requiring the application threads to be stopped.
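The threshold rule described above amounts to a simple comparison. The Java sketch below shows it with hypothetical method and class names; it follows the simplified description in this article, and newer JDK versions may adjust the threshold adaptively or measure occupancy differently.

```java
final class IhopCheck {

    static final int DEFAULT_IHOP_PERCENT = 45; // default value named in the text

    /** True once the used part of the heap reaches ihopPercent of its total size. */
    static boolean shouldStartConcurrentMarking(long usedBytes, long heapBytes, int ihopPercent) {
        // Integer arithmetic avoids rounding surprises right at the threshold.
        return usedBytes * 100 >= heapBytes * (long) ihopPercent;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 256 * mb; // heap size from the sample log
        System.out.println(shouldStartConcurrentMarking(115 * mb, heap, DEFAULT_IHOP_PERCENT)); // false (about 44.9%)
        System.out.println(shouldStartConcurrentMarking(116 * mb, heap, DEFAULT_IHOP_PERCENT)); // true  (about 45.3%)
    }
}
```

On the command line the threshold would be passed as -XX:InitiatingHeapOccupancyPercent=45, or another percentage.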

Phase 1: Initial Mark. This phase marks all objects directly reachable from the GC roots. In CMS this requires a separate STW pause, but in G1 it is usually piggybacked on an evacuation pause, so its overhead is minimal. You can recognize such a pause by the (initial-mark) addition in the first line of the evacuation pause log:

1.631: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.0062656 secs]

Phase 2: Root Region Scan. This phase marks all live objects reachable from the so-called root regions, that is, regions that are not empty and that may have to be collected in the middle of the marking cycle. Because moving objects around during concurrent marking would cause trouble, this phase has to finish before the next evacuation pause starts. If an evacuation pause has to start earlier, it requests an early abort of the root region scan and then waits for it to finish. In the current implementation, the root regions are the survivor regions: they are the part of the young generation that will definitely be collected in the next evacuation pause.

1.362: [GC concurrent-root-region-scan-start]
1.364: [GC concurrent-root-region-scan-end, 0.0028513 secs]

Phase 3: Concurrent Mark. This phase is very similar to its CMS counterpart: it simply walks the object graph and marks the visited objects in a special bitmap. To preserve the snapshot-at-the-beginning semantics while application threads keep updating the object graph concurrently, G1 requires that every such update leaves the previous reference known for marking purposes.

This is achieved with a pre-write barrier (not to be confused with the post-write barrier described later, nor with the memory barriers used in multithreaded programming). While G1 concurrent marking is active, whenever the program writes to a reference field of an object, the pre-write barrier stores the previous reference in so-called log buffers, which are then processed by the concurrent marking threads.

1.364: [GC concurrent-mark-start]
1.645: [GC concurrent-mark-end, 0.2803470 secs]
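The pre-write barrier of the concurrent mark phase can be modeled in plain Java as follows. This is a conceptual sketch only: in a real JVM the barrier is emitted by the JIT compiler and the buffers are per-thread native structures, whereas here Node, writeNext, and logBuffer are hypothetical names chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SatbBarrierSketch {

    /** A toy object with a single reference field. */
    static final class Node {
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    boolean markingActive;                             // set while concurrent marking runs
    final Deque<Node> logBuffer = new ArrayDeque<>();  // stand-in for the log buffers

    /** Every write to Node.next goes through here, as if the JVM had inserted the barrier. */
    void writeNext(Node holder, Node newValue) {
        if (markingActive && holder.next != null) {
            logBuffer.add(holder.next); // remember the reference that is about to be overwritten
        }
        holder.next = newValue;
    }

    public static void main(String[] args) {
        SatbBarrierSketch heap = new SatbBarrierSketch();
        Node a = new Node("a");
        Node b = new Node("b");
        heap.writeNext(a, b);        // marking not active: nothing is logged

        heap.markingActive = true;   // concurrent marking starts, the snapshot contains a -> b
        heap.writeNext(a, null);     // the application drops the reference to b

        // b was live in the snapshot, so the marker still gets to see it through the buffer.
        System.out.println(heap.logBuffer.peek().name); // b
    }
}
```

Without the barrier, an object whose only reference is removed before the marker reaches it could be missed even though it was live when marking began; logging the old value keeps the snapshot complete at the cost of retaining some floating garbage.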

Phase 4: Remark. As in CMS, this is an STW pause that completes the marking process. For G1, it briefly stops the application threads to stop the inflow of the concurrent update logs, processes the small amount that is left, and marks any still-unmarked objects that were live when the concurrent marking cycle started. This phase also performs some additional cleanup, such as reference processing (see the evacuation pause log) or class unloading.

1.645: [GC remark 1.645: [Finalize Marking, 0.0009461 secs] 1.646: [GC ref-proc, 0.0000417 secs] 1.646: [Unloading, 0.0011301 secs], 0.0074056 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]

Phase 5: Cleanup. This final phase prepares the ground for the upcoming evacuation phase by counting the live objects in each region and sorting the regions by expected GC efficiency. It also performs all the housekeeping needed to maintain the internal state for the next iteration of concurrent marking.

Last but not least, regions that contain no live objects at all are reclaimed in this phase. Some parts of this phase are concurrent, such as reclaiming the empty regions and most of the liveness calculation, but it also requires a short STW pause to finalize the picture while the application threads are not interfering. The log for this STW pause looks like this:

1.652: [GC cleanup 1213M->1213M(1885M), 0.0030492 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]

If some heap regions were found to contain only garbage, the log format can look slightly different, for example:

1.872: [GC cleanup 1357M->173M(1996M), 0.0015664 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1.874: [GC concurrent-cleanup-start]
1.876: [GC concurrent-cleanup-end, 0.0014846 secs]

Evacuation Pause: Mixed (evacuation pause: mixed mode)

It is ideal when concurrent cleanup can free entire regions of the old generation, but that is not always the case. After concurrent marking has completed, G1 schedules a mixed collection that not only cleans up the young regions but also adds a number of old regions to the collection set.

A mixed evacuation pause does not necessarily follow the concurrent marking phase immediately. A number of rules and heuristics affect when it happens. For example, if a large part of the old regions could be freed concurrently, there is no need to start a mixed collection.

Therefore, there may well be several fully-young evacuation pauses between the end of concurrent marking and a mixed evacuation pause.

The exact number of old regions added to the collection set, and the order in which they are added, is also determined by a number of rules. These include the soft real-time performance goals specified for the application, the liveness and GC efficiency data collected during concurrent marking, and several configurable JVM options. The mixed collection process is largely the same as the fully-young GC described earlier, but it brings in one more concept: remembered sets.

Remembered sets are what allow different heap regions to be collected independently. For example, when collecting regions A, B, and C, we have to know whether there are references into them from regions D or E in order to determine whether their objects are live. Traversing the whole heap would take a long time and defeat the purpose of incremental collection, so an optimization is needed. Much as other GC algorithms use a card table to support independent collection of the young generation, G1 uses remembered sets.

Each region has a remembered set that lists the references pointing into that region from outside. These references are treated as additional GC roots. Note that objects in old regions that were identified as garbage during concurrent marking are ignored even if there are outside references to them, because in that case the referrers are garbage as well.

What happens next is the same as in other collectors: multiple GC threads work in parallel to find out which objects are live and which are garbage.

Finally, the live objects are moved to survivor regions, with new ones created if necessary. The emptied regions are freed and can be used to store new objects again.

To maintain the remembered sets, a post-write barrier runs whenever a field is written while the application is running. If the resulting reference is cross-region, that is, it points from one region to another, a corresponding entry is added to the remembered set of the target region. To reduce the overhead of the write barrier, putting cards into the remembered set is asynchronous and heavily optimized. In essence, the write barrier puts the dirty card information into a local buffer, and a dedicated GC thread picks it up and propagates the information to the remembered set of the referenced region.
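The flow from post-write barrier to remembered set can be modeled with a small Java sketch. Everything in it is an assumption made for illustration: objects are represented by plain numeric addresses, the region and card sizes are arbitrary, and each buffer entry carries its target region directly, whereas a real collector rescans the dirty card to find the references it contains.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class RememberedSetSketch {

    static final long REGION_BYTES = 1L << 20; // assumed 1 MB regions
    static final long CARD_BYTES = 512;        // assumed card size

    /** A recorded cross-region write: the card holding the field and the region it points into. */
    static final class DirtyCard {
        final long cardIndex;
        final long targetRegion;

        DirtyCard(long cardIndex, long targetRegion) {
            this.cardIndex = cardIndex;
            this.targetRegion = targetRegion;
        }
    }

    private final Deque<DirtyCard> localBuffer = new ArrayDeque<>();
    private final Map<Long, Set<Long>> rememberedSets = new HashMap<>();

    static long regionOf(long address) {
        return address / REGION_BYTES;
    }

    static long cardOf(long address) {
        return address / CARD_BYTES;
    }

    /** Post-write barrier: runs after the field at fieldAddress was set to point at targetAddress. */
    void postWriteBarrier(long fieldAddress, long targetAddress) {
        if (regionOf(fieldAddress) == regionOf(targetAddress)) {
            return; // same region: nothing to remember
        }
        localBuffer.add(new DirtyCard(cardOf(fieldAddress), regionOf(targetAddress)));
    }

    /** Stand-in for the dedicated GC thread that moves buffered cards into remembered sets. */
    void drainBuffer() {
        DirtyCard card;
        while ((card = localBuffer.poll()) != null) {
            rememberedSets.computeIfAbsent(card.targetRegion, k -> new HashSet<>()).add(card.cardIndex);
        }
    }

    /** Cards outside the given region that are known to hold references into it. */
    Set<Long> rememberedSetOf(long region) {
        return rememberedSets.getOrDefault(region, Collections.<Long>emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch();
        long field = 100;                     // lives in region 0, card 0
        long target = 3 * REGION_BYTES + 8;   // lives in region 3
        heap.postWriteBarrier(field, target);
        System.out.println(heap.rememberedSetOf(3)); // [] : still waiting in the local buffer
        heap.drainBuffer();
        System.out.println(heap.rememberedSetOf(3)); // [0] : card 0 now acts as an extra root for region 3
    }
}
```

Keeping the barrier itself down to a comparison and a buffer append, and leaving the remembered set update to a background thread, is what keeps the cost for application threads low.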

Compared with fully-young mode, the log of a mixed-mode pause contains some new and interesting entries:

[Update RS (ms): Min: 0.7, Avg: 0.8, Max: 0.9, Diff: 0.2, Sum: 6.1]
[Processed Buffers: Min: 0, Avg: 2.2, Max: 5, Diff: 5, Sum: 18]
[Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.8]
[Clear CT: 0.2 ms]
[Redirty Cards: 0.1 ms]

[Update RS (ms) - because remembered sets are processed concurrently, the cards still sitting in buffers have to be processed before the actual collection begins. If this number is high, the concurrent GC threads are not keeping up with the load, for example because of an overwhelming number of field modifications or insufficient CPU resources.

[Processed Buffers - how many local buffers each worker thread processed.

[Scan RS (ms) - how long it took to scan the references coming in from the remembered sets.

[Clear CT: 0.2 ms] - the time it took to clean the cards in the card table. Cleaning simply removes the "dirty" status that was set to signal that a field had been updated, which is what the remembered sets rely on.

[Redirty Cards: 0.1 ms] - the time it took to mark the appropriate locations in the card table as dirty. The appropriate locations are determined by the changes to the heap made by the GC itself, for example while enqueuing references.
