What are the common garbage collectors in JVM

Updated 2025-01-28. From: SLTechnology News&Howtos (Shulou), Development.

This article introduces the garbage collectors commonly found in the JVM, specifically those of the HotSpot virtual machine, and compares their characteristics and typical application scenarios.

If the collection algorithm is the methodology of memory collection, then the garbage collector is the concrete implementation of memory collection.

The Java Virtual Machine Specification does not prescribe how a garbage collector should be implemented, so the collectors provided by different vendors and different versions of the virtual machine can differ considerably. They generally expose parameters that let users combine the collectors used for each generation according to the characteristics and requirements of their application.

Garbage collectors in the HotSpot virtual machine

The figure shows seven collectors that operate on different generations. A line connecting two collectors means that they can be used together. The region in which a collector sits indicates whether it is a new-generation or an old-generation collector.
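
To see which combination of collectors a running JVM has actually selected, the standard `java.lang.management` API can be queried. The following is a minimal sketch; the class name `ActiveCollectors` is an illustrative assumption, and the names reported depend on the JVM version and the flags it was started with, so no particular output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
class ActiveCollectors {
    // Returns the names of the collectors the current JVM reports,
    // typically one for the new generation and one for the old generation.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both counters may be -1 if the JVM does not define them.
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different collector flags is a quick way to confirm which pairing is in effect.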

Key concepts

Concurrency and parallelism

Both terms come from concurrent programming. In the context of garbage collectors, they can be explained as follows.

Parallel: multiple garbage collection threads work at the same time, while user threads remain in a waiting state.

Concurrent: user threads and garbage collection threads execute during the same period (not necessarily in parallel; they may alternate). The user program keeps running while the garbage collector runs on another CPU.

Minor GC and Full GC

New-generation GC (Minor GC): a garbage collection that takes place in the new generation. Because most Java objects are short-lived, Minor GC happens very frequently and usually completes quickly.

Old-generation GC (Major GC / Full GC): a garbage collection that takes place in the old generation. A Major GC is usually accompanied by at least one Minor GC, although not always: the collection strategy of the Parallel Scavenge collector, for example, can choose to perform a Major GC directly. A Major GC is generally more than 10 times slower than a Minor GC.

Throughput

Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time).

For example, if the virtual machine runs for 100 minutes in total and garbage collection takes 1 minute of that, the throughput is 99%.
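
The formula above can be written as a one-line function. This is a minimal sketch; the class and method names are illustrative assumptions.

```java
// Hypothetical helper class, not part of the original article.
class ThroughputExample {
    // throughput = user-code time / (user-code time + GC time)
    static double throughput(double userCodeTime, double gcTime) {
        return userCodeTime / (userCodeTime + gcTime);
    }

    public static void main(String[] args) {
        // 100 minutes in total, 1 of them spent in GC, so 99 minutes of user code.
        System.out.println(throughput(99.0, 1.0)); // prints 0.99
    }
}
```

Any time unit works as long as both arguments use the same one.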

1. Serial collector

The Serial collector is the most basic and the oldest collector. Before JDK 1.3.1 it was the only choice for new-generation collection in the virtual machine.

Properties:

This is a single-threaded collector. "Single-threaded" does not just mean that it uses only one CPU or one collection thread to do its work; more importantly, while it is collecting it must pause all other worker threads until the collection finishes (Stop The World).

Application scenarios:

The Serial collector is the default new generation collector for virtual machines running in Client mode.

Advantages:

Simple and efficient (compared with the single-threaded performance of other collectors). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so by devoting itself entirely to garbage collection it naturally achieves the highest single-threaded collection efficiency.

2. ParNew collector

Properties:

The ParNew collector is essentially a multithreaded version of the Serial collector. Apart from using multiple threads for garbage collection, its behavior is exactly the same as that of the Serial collector, including all the control parameters available to Serial, the collection algorithm, Stop The World, object allocation rules, and collection strategy. The two collectors also share a good deal of code in their implementation.

Application scenarios:

The ParNew collector is the preferred new generation collector for many virtual machines running in Server mode.

An important reason is that, apart from the Serial collector, it is currently the only new-generation collector that works with the CMS collector.

In the JDK 1.5 era, HotSpot released the CMS collector, which can almost be considered epoch-making for highly interactive applications. It was the first truly concurrent collector in the HotSpot virtual machine: for the first time, garbage collection threads and user threads could work at the same time.

Unfortunately, CMS, as an old-generation collector, does not work with Parallel Scavenge, the new-generation collector that already existed in JDK 1.4.0. So when CMS is used to collect the old generation in JDK 1.5, the new generation can only use either ParNew or Serial.

Serial collector VS ParNew collector:

In a single-CPU environment, the ParNew collector will never perform better than the Serial collector. Because of thread-interaction overhead, it is not even guaranteed to outperform Serial in a two-CPU environment implemented with hyper-threading.

However, as the number of available CPUs grows, ParNew makes much more effective use of system resources during GC.

3. Parallel Scavenge collector

Properties:

Parallel Scavenge is a new-generation collector. It uses the copying algorithm and is a parallel, multithreaded collector.

Application scenarios:

The shorter the pause time, the better suited a collector is to programs that interact with users, since good response times improve the user experience. High throughput, on the other hand, uses CPU time efficiently to finish the program's work as soon as possible, which mainly suits background tasks that need little interaction.

Comparative analysis:

Parallel Scavenge collector VS CMS and other collectors:

Parallel Scavenge differs from other collectors in what it focuses on. Collectors such as CMS try to shorten the pause time of user threads during garbage collection as much as possible, whereas the goal of Parallel Scavenge is to achieve a controllable throughput.

Because of this close relationship with throughput, Parallel Scavenge is often called the "throughput first" collector.

Parallel Scavenge collector VS ParNew collector:

An important difference between Parallel Scavenge and ParNew is that Parallel Scavenge has an adaptive adjustment strategy.

Adaptive adjustment strategy for GC:

The Parallel Scavenge collector has a parameter, -XX:+UseAdaptiveSizePolicy. When it is switched on, there is no need to manually specify details such as the size of the new generation, the ratio of Eden to Survivor space, or the age at which objects are promoted to the old generation. The virtual machine collects performance monitoring information as the system runs and dynamically adjusts these parameters to provide the most suitable pause time or the maximum throughput. This is called the GC adaptive adjustment strategy (GC Ergonomics).
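
Whether the flag is currently switched on can be checked from inside a running program. The sketch below assumes a HotSpot-based JVM that exposes `com.sun.management.HotSpotDiagnosticMXBean`; the class name `AdaptiveSizePolicyCheck` is an illustrative assumption, and on other JVMs the method simply reports "unknown".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the original article.
class AdaptiveSizePolicyCheck {
    // Returns "true" or "false" as reported by HotSpot, or "unknown" when
    // the flag cannot be read (for example on a non-HotSpot JVM).
    static String useAdaptiveSizePolicy() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseAdaptiveSizePolicy").getValue();
        } catch (RuntimeException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Try starting the JVM with, for example:
        //   java -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy AdaptiveSizePolicyCheck.java
        System.out.println("UseAdaptiveSizePolicy = " + useAdaptiveSizePolicy());
    }
}
```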

4. Serial Old collector

Properties:

Serial Old is the old-generation version of the Serial collector. It is also a single-threaded collector, and it uses the mark-compact algorithm.

Application scenarios:

Client mode

The main purpose of the Serial Old collector is likewise to be used by virtual machines in Client mode.

Server mode

In Server mode it has two main uses: one is to be paired with the Parallel Scavenge collector in JDK 1.5 and earlier; the other is to act as the fallback for the CMS collector when a Concurrent Mode Failure occurs during a concurrent collection.

5. Parallel Old collector

Properties:

Parallel Old is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm.

Application scenarios:

Where throughput matters most and CPU resources are a sensitive concern, the combination of Parallel Scavenge and Parallel Old can be given priority.

This collector only became available in JDK 1.6. Until then, the new-generation Parallel Scavenge collector was in an awkward position: if the new generation used Parallel Scavenge, the old generation had no choice other than Serial Old (Parallel Scavenge does not work with CMS). Because the single-threaded Serial Old collector drags down server application performance and cannot make full use of a server's multiple CPUs, using Parallel Scavenge did not necessarily maximize the overall throughput of the application. With a large old generation and fairly advanced hardware, the throughput of this combination could even be lower than that of ParNew plus CMS. Only with the arrival of Parallel Old did the "throughput first" collector finally get a combination worthy of the name.

6. CMS collector

Properties:

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause time. A large share of Java applications currently run on the server side of websites or B/S (browser/server) systems. Such applications care greatly about response time and want the shortest possible pauses in order to give users a better experience. The CMS collector fits these needs very well.

The CMS collector is based on the "mark-sweep" algorithm. Its operation is more complex than that of the collectors described above and is divided into four steps:

Initial mark (CMS initial mark)

The initial mark only marks the objects that GC Roots can reach directly. It is very fast but requires "Stop The World".

Concurrent mark (CMS concurrent mark)

The concurrent mark phase performs GC Roots tracing.

Remark (CMS remark)

The remark phase corrects the marking records of objects whose marks changed because the user program kept running during concurrent marking. Its pause is generally slightly longer than that of the initial mark but much shorter than the concurrent mark, and it still requires "Stop The World".

Concurrent sweep (CMS concurrent sweep)

The concurrent sweep phase reclaims the objects that were found to be garbage.

Because the collector threads can work alongside user threads during concurrent marking and concurrent sweeping, the two longest phases of the whole process, the memory collection process of the CMS collector as a whole runs concurrently with user threads.
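
The four phases and their pause behavior can be summarized in a small data structure. This is only an illustrative model of the description above, not HotSpot code; the class, enum, and field names are assumptions.

```java
// Hypothetical model of the CMS phases, not part of the original article.
class CmsPhases {
    enum Phase {
        INITIAL_MARK(true),      // marks objects directly reachable from GC Roots
        CONCURRENT_MARK(false),  // GC Roots tracing, runs alongside user threads
        REMARK(true),            // corrects marks changed during concurrent marking
        CONCURRENT_SWEEP(false); // reclaims garbage, runs alongside user threads

        final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }
    }

    // Counts the phases that pause user threads.
    static int countStopTheWorld() {
        int count = 0;
        for (Phase p : Phase.values()) {
            if (p.stopTheWorld) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (Phase p : Phase.values()) {
            System.out.println(p + ": " + (p.stopTheWorld ? "Stop The World" : "concurrent"));
        }
    }
}
```

Only two of the four phases pause user threads, and they are the two short ones, which is why CMS achieves low pauses overall.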

Advantages:

CMS is an excellent collector, and its main advantages are already reflected in the name: concurrent collection and low pause.

Disadvantages:

The CMS collector is very sensitive to CPU resources

In fact, any program designed for concurrency is sensitive to CPU resources. During the concurrent phases CMS does not pause user threads, but because it occupies some threads (or CPU resources), it slows the application down and lowers total throughput.

By default, CMS starts (number of CPUs + 3) / 4 collection threads. With four or more CPUs, the garbage collection threads take no less than 25% of the CPU resources during concurrent collection, and that share falls as the number of CPUs grows. With fewer than four CPUs (for example, two), the impact of CMS on the user program can become significant.
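
The arithmetic behind this can be worked through directly. The sketch below takes the formula exactly as stated in the text (with integer division); the class and method names are illustrative assumptions, and the actual default may differ between JDK versions.

```java
// Hypothetical helper class, not part of the original article.
class CmsThreadCount {
    // Default number of CMS collection threads according to the text:
    // (number of CPUs + 3) / 4, using integer division.
    static int defaultThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    // Fraction of the CPUs occupied by those threads during concurrent phases.
    static double cpuShare(int cpus) {
        return (double) defaultThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        int[] samples = {2, 4, 8, 16};
        for (int cpus : samples) {
            System.out.println(cpus + " CPUs -> " + defaultThreads(cpus)
                    + " thread(s), share " + cpuShare(cpus));
        }
    }
}
```

With 2 CPUs a single collection thread takes half of the machine, whereas with 4, 8, or 16 CPUs the share is one quarter, which matches the 25% lower bound mentioned above.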

The CMS collector cannot handle floating garbage

The CMS collector cannot handle floating garbage, and a "Concurrent Mode Failure" may occur and trigger another Full GC.

Because user threads are still running during the CMS concurrent sweep phase, new garbage keeps being produced as the program runs. This garbage appears after the marking process, so CMS cannot deal with it in the current collection and has to leave it for the next GC. This garbage is called "floating garbage".

Also, because user threads keep running during collection, enough memory must be set aside for them. CMS therefore cannot wait until the old generation is almost full before collecting, as other collectors do; it has to reserve part of the space for the program to use while the concurrent collection runs. If the memory reserved during a CMS run cannot satisfy the program, a "Concurrent Mode Failure" occurs and the virtual machine falls back to its backup plan: it temporarily enables the Serial Old collector to redo the old-generation collection, which results in a very long pause.

The CMS collector produces a lot of memory fragmentation

CMS is based on the mark-sweep algorithm, which means that a large amount of memory fragmentation is left at the end of a collection.

Too much fragmentation makes allocating large objects troublesome: the old generation often still has plenty of free space, yet no contiguous block large enough for the current object can be found, so a Full GC has to be triggered early.
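
The fragmentation problem can be illustrated with a toy heap made of equal-sized slots. This is a deliberately simplified sketch under that assumption, not a model of how HotSpot lays out memory; all names are hypothetical.

```java
// Hypothetical toy model of a fragmented heap, not part of the original article.
class FragmentationExample {
    // heap[i] == true means slot i is in use; false means it is free.
    static int totalFree(boolean[] heap) {
        int free = 0;
        for (boolean used : heap) {
            if (!used) {
                free++;
            }
        }
        return free;
    }

    // Length of the longest run of consecutive free slots.
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int current = 0;
        for (boolean used : heap) {
            current = used ? 0 : current + 1;
            best = Math.max(best, current);
        }
        return best;
    }

    // An object of the given size needs that many contiguous free slots.
    static boolean canAllocate(boolean[] heap, int size) {
        return largestFreeRun(heap) >= size;
    }

    public static void main(String[] args) {
        // After a sweep without compaction, free slots are scattered.
        boolean[] heap = {true, false, true, false, true, false, true, false};
        System.out.println("free slots: " + totalFree(heap));           // 4
        System.out.println("largest free run: " + largestFreeRun(heap)); // 1
        System.out.println("can allocate 2: " + canAllocate(heap, 2));   // false
    }
}
```

Half of the toy heap is free, yet an object needing two adjacent slots cannot be placed, which is the situation that forces an early Full GC.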

7. G1 collector

Properties:

G1 (Garbage-First) is a garbage collector aimed at server applications. The mission the HotSpot development team gave it is to eventually replace the CMS collector released in JDK 1.5. Compared with other collectors, G1 has several distinguishing features.

The collectors before G1 each cover the whole new generation or the whole old generation; G1 does not. With G1, the memory layout of the Java heap is very different from that of other collectors: the entire heap is divided into many independent regions (Region) of equal size. The concepts of new generation and old generation are kept, but the two are no longer physically separated; each is a set of Regions that need not be contiguous.

G1 can build a predictable pause-time model because it deliberately avoids collecting the whole Java heap at once. It tracks the value of the garbage accumulated in each Region (the amount of space that collecting it would free, together with an empirical estimate of the time the collection would take) and maintains a priority list in the background. On each collection, within the allowed collection time, it reclaims the most valuable Regions first, which is where the name Garbage-First comes from. Dividing memory into Regions and collecting them by priority lets G1 achieve the highest possible collection efficiency within a limited time.
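
The idea of collecting the most valuable Regions first within a time budget can be sketched as a simple greedy selection. This is an illustration of the principle only and not G1's actual algorithm; the `Region` class, its fields, and the definition of value as reclaimable bytes per estimated millisecond are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of priority-based Region selection.
class RegionSelection {
    static final class Region {
        final String name;
        final long reclaimableBytes;   // space freed if this Region is collected
        final double estimatedMillis;  // empirical estimate of the collection time

        Region(String name, long reclaimableBytes, double estimatedMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedMillis = estimatedMillis;
        }

        // Assumed notion of "value": bytes reclaimed per millisecond spent.
        double value() {
            return reclaimableBytes / estimatedMillis;
        }
    }

    // Picks Regions in descending order of value while the pause budget allows.
    static List<Region> select(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        Comparator<Region> byValue = Comparator.comparingDouble(Region::value);
        sorted.sort(byValue.reversed());

        List<Region> chosen = new ArrayList<>();
        double spent = 0.0;
        for (Region r : sorted) {
            if (spent + r.estimatedMillis <= pauseBudgetMillis) {
                chosen.add(r);
                spent += r.estimatedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 8 * mb, 4.0));
        regions.add(new Region("B", 6 * mb, 2.0));
        regions.add(new Region("C", 1 * mb, 5.0));
        regions.add(new Region("D", 3 * mb, 3.0));

        // With a 7 ms budget, B and A are chosen; D and C no longer fit.
        for (Region r : select(regions, 7.0)) {
            System.out.println(r.name);
        }
    }
}
```

The budget parameter plays the role of the allowed collection time: a larger budget lets more Regions be collected in one pause, a smaller one keeps pauses short at the cost of reclaiming less per collection.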

Parallelism and concurrency

G1 makes full use of multi-CPU, multi-core hardware, using multiple CPUs to shorten Stop-The-World pauses. Some GC work that would require other collectors to pause Java threads can be carried out by G1 concurrently while the Java program keeps running.

Generational collection

As with other collectors, the concept of generations is retained in G1. Although G1 can manage the entire GC heap on its own without cooperating with other collectors, it treats newly created objects differently from old objects that have already survived several GCs, in order to get better collection results.

Space compaction

Unlike CMS's "mark-sweep" algorithm, G1 as a whole is based on the "mark-compact" algorithm, while locally (between two Regions) it is based on the "copying" algorithm. Either way, G1 does not produce memory fragmentation while it runs and leaves regular, usable memory after a collection. This helps programs run for a long time: allocating a large object will not trigger the next GC early for lack of contiguous memory.

Predictable pauses

This is another advantage of G1 over CMS. Both aim to reduce pause time, but besides pursuing low pauses G1 can build a predictable pause-time model that lets users specify that, within any time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection.

Execution process:

The operation of the G1 collector can be roughly divided into the following steps:

Initial marking (Initial Marking)

The initial marking phase only marks the objects that GC Roots can reach directly and updates the value of TAMS (Next Top at Mark Start), so that when the user program runs concurrently in the next phase, new objects are created in the correct available Regions. This phase pauses threads, but only briefly.

Concurrent marking (Concurrent Marking)

The concurrent marking phase starts from GC Roots and performs reachability analysis on the objects in the heap to find the live ones. This phase takes longer but can run concurrently with the user program.

Final marking (Final Marking)

The final marking phase corrects the part of the marking records that changed because the user program kept running during concurrent marking. The virtual machine records these object changes in per-thread Remembered Set Logs, and the final marking phase merges the Remembered Set Logs into the Remembered Set. This phase pauses threads, but the work can be done in parallel.

Live data counting and evacuation (Live Data Counting and Evacuation)

This phase first sorts the Regions by collection value and cost, then draws up a collection plan based on the GC pause time the user expects. In principle it could also run concurrently with the user program, but because only some of the Regions are collected and the time is under the user's control, pausing user threads greatly improves collection efficiency.

8. Summary

We have compared the collectors, but not in order to pick out the best one. So far there is no best collector, let alone a universal one; we can only choose the collector that best fits a specific application. This hardly needs proof: if there were a perfect collector suited to every scenario, the HotSpot virtual machine would have no need to implement so many different ones.
