
How to understand memory layout and GC principle

2026-01-05 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article introduces how to understand memory layout and GC principles, using the ZGC collector as the running example.

ZGC (by far the best GC collector in history)

ZGC (The Z Garbage Collector) is a low-latency garbage collector introduced in JDK 11, with many improvements made on the basis of G1. Its design goals include:

Pause time does not exceed 10ms

The pause time does not increase with the size of the heap or the size of the live object set

Support for 8MB~4TB-level heaps (future support for 16TB).

From these design goals, we can see that ZGC is suited to memory management and collection for services with large memory and low latency requirements. This article mainly introduces the application and performance of ZGC in low-latency scenarios. The article is divided into four parts:

The pain of GC: introduces the GC pain points encountered in real business, and analyzes the pause time bottlenecks of the CMS and G1 collectors

ZGC principle: analyzes the essential reason why ZGC pause times are shorter than those of G1 or CMS, and the technical principles behind it

ZGC tuning practice: shares our understanding of ZGC tuning and analyzes several practical tuning cases

Effect of upgrading to ZGC: shows the effect of applying ZGC in a production environment

The pain of GC

The system availability of many low-latency and high-availability Java services is often plagued by GC pauses. GC pause refers to STW (Stop The World) during garbage collection. When STW occurs, all application threads stop activity and wait for the GC pause to end.

Take Meituan's risk control service as an example: some upstream businesses require the risk control service to return results within 65ms, with availability of 99.99%. However, because of GC pauses, we failed to meet this availability target. At that time the service used the CMS garbage collector; a single Young GC took 40ms and occurred 10 times a minute, and the average response time of the interface was 30ms. By calculation, (40ms + 30ms) * 10 / 60000ms ≈ 1.17% of requests will see their response time increase by somewhere between 0 and 40ms, and 30ms * 10 / 60000ms = 0.5% of requests will see it increase by the full 40ms.
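The estimate can be reproduced with a few lines of Java. This is only a sketch that mirrors the two formulas in the text; the class and method names are made up for illustration, and it assumes requests arrive uniformly over time, which the article implies but does not state. Note that (40 + 30) * 10 / 60000 evaluates to roughly 1.17%.

```java
// Illustrative sketch: share of requests delayed by Young GC pauses.
// Assumption (not stated in the article): requests arrive uniformly in time.
class GcPauseImpact {
    private static final double MS_PER_MINUTE = 60_000.0;

    // Share of requests whose response time grows by somewhere between 0 and pauseMs.
    static double affectedFraction(double pauseMs, double responseMs, int pausesPerMinute) {
        return (pauseMs + responseMs) * pausesPerMinute / MS_PER_MINUTE;
    }

    // Share of requests whose response time grows by the full pause, per the article's formula.
    static double fullyDelayedFraction(double responseMs, int pausesPerMinute) {
        return responseMs * pausesPerMinute / MS_PER_MINUTE;
    }

    public static void main(String[] args) {
        // Figures from the article: 40ms Young GC, 30ms average response, 10 GCs per minute.
        System.out.println("affected share (%):      " + 100 * affectedFraction(40, 30, 10));
        System.out.println("fully delayed share (%): " + 100 * fullyDelayedFraction(30, 10));
    }
}
```

With a 99.99% availability target, even a share of around 1% of slowed requests is far above the allowed error budget, which is why the pause itself has to shrink.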

It can be seen that GC pauses have a great impact on response time. To reduce the impact of GC pauses on system availability, we tuned the service to reduce both the time of a single GC and the GC frequency, and we also tested the G1 garbage collector, but none of these three measures reduced the impact of GC on service availability.

CMS and G1 pause time bottleneck

Before introducing ZGC, let's review the GC process of CMS and G1 and their pause time bottlenecks. The young generation collection (Young GC) of CMS, G1 and ZGC are all based on the mark-copy algorithm, but differences in how the algorithm is implemented lead to huge differences in performance.

The mark-copy algorithm is applied in the young generation of CMS (ParNew is the default young generation collector for CMS) and in the G1 garbage collector.

The tag-copy algorithm can be divided into four phases:

Marking phase: starting from the GC Roots set, mark the live objects

Cleanup phase: clean up all objects that are not live

Transfer phase: copy the live objects to new memory addresses

Relocation phase: because the transfer changes the addresses of objects, all pointers to an object's old address are adjusted to its new address
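The four phases can be sketched on a toy object graph. This is a simplified, single-threaded model for illustration only: `MarkCopyDemo`, `Obj` and `collect` are hypothetical names, the "heap" is just a Java list, and it assumes every reachable object is contained in the from-space. It is not how any JVM collector is implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the mark-copy phases; all names are hypothetical, not JVM internals.
class MarkCopyDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        Obj(String name) { this.name = name; }
    }

    // Runs one collection over fromSpace, updates roots in place, returns the to-space.
    static List<Obj> collect(List<Obj> fromSpace, List<Obj> roots) {
        // 1. Marking: traverse from the GC Roots and record every reachable object.
        Map<Obj, Boolean> marked = new IdentityHashMap<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (marked.put(o, Boolean.TRUE) == null) pending.addAll(o.refs);
        }
        // 2. Cleanup: unmarked objects are garbage, only marked ones are kept.
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : fromSpace) {
            if (marked.containsKey(o)) survivors.add(o);
        }
        // 3. Transfer: copy each survivor and remember old -> new in a forwarding table.
        Map<Obj, Obj> forwarding = new IdentityHashMap<>();
        List<Obj> toSpace = new ArrayList<>();
        for (Obj old : survivors) {
            Obj copy = new Obj(old.name);
            copy.refs.addAll(old.refs);             // still points at old addresses
            forwarding.put(old, copy);
            toSpace.add(copy);
        }
        // 4. Relocation: rewrite every pointer to an old address to the new address.
        for (Obj copy : toSpace) copy.refs.replaceAll(forwarding::get);
        roots.replaceAll(forwarding::get);
        return toSpace;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);                              // a -> b, c is unreachable
        List<Obj> roots = new ArrayList<>(Arrays.asList(a));
        List<Obj> toSpace = collect(Arrays.asList(a, b, c), roots);
        for (Obj o : toSpace) System.out.println("survived: " + o.name);
    }
}
```

In this sketch the transfer and relocation loops touch every live object, which is the part of the work that grows with the live set.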

Taking G1 as an example, we analyze the main bottleneck of G1 pause time by walking through the mark-copy algorithm in G1 (it is used by both the Young GC and the Mixed GC of G1). The G1 garbage collection cycle is shown in the following figure:

The mixed collection process of G1 can be divided into three stages: the marking stage, the cleanup stage and the copying stage.

Marking stage pause analysis

Initial marking phase: the initial marking phase marks all direct children of the GC Roots, and it is STW. Because the number of GC Roots is small, this phase usually takes very little time.

Concurrent marking phase: the concurrent marking phase performs reachability analysis on the objects in the heap, starting from the GC Roots, to find the live objects. This phase is concurrent, meaning that application threads and GC threads can be active at the same time. Concurrent marking takes a lot of time, but because it is not STW, we do not care much about how long it takes.

Remarking phase: re-marks the objects that changed during the concurrent marking phase. This phase is STW.

Cleanup stage pause analysis

The cleanup phase counts the partitions that contain live objects and those that do not. It neither cleans up garbage objects nor copies live objects. This phase is STW, so no floating garbage is produced during it.

Copying stage pause analysis

The transfer phase of the copying algorithm needs to allocate new memory and copy the member variables of each object. The transfer phase is STW. Memory allocation usually takes very little time, but copying object member variables may take a long time, because the copy time is proportional to the number of surviving objects and to their complexity: the more complex the object, the longer the copy takes. (Copying leaves no memory fragmentation, but it is not well suited to moving large objects.)

Among the four STW processes:

Initial marking takes little time because it only marks from the GC Roots. (STW)

Remarking takes little time because the number of objects involved is small. (STW)

The cleanup phase takes little time because the number of memory partitions is small. (STW)

The transfer phase takes a long time because it has to process all live objects. (STW)

Therefore, the bottleneck of G1 pause time is the STW transfer phase of mark-copy. Why can't the transfer phase be executed concurrently, like the marking phase? The main reason is that G1 does not solve the problem of accurately locating an object's address while the object is being transferred.

In the Young GC of both G1 and CMS, the entire mark-copy process is STW, so it is not described in detail here.

ZGC principle

Fully concurrent ZGC

Like ParNew (in CMS) and G1, ZGC also uses the mark-copy algorithm, but ZGC has made significant improvements to it: ZGC is almost entirely concurrent in the marking, transfer and relocation phases, which is the key reason why ZGC achieves its design goal of pause times under 10ms.

The ZGC garbage collection cycle is shown in the following figure:

ZGC has only three STW phases: initial marking, remarking, and initial transfer.

Both initial marking and initial transfer only need to scan the GC Roots, so their processing time is proportional to the number of GC Roots and is generally very short. The STW time of the remarking phase is also very short, at most 1ms; if it exceeds 1ms, ZGC enters the concurrent marking phase again. In other words, almost all pauses in ZGC depend only on the size of the GC Roots set, and the pause time does not increase with the size of the heap or of the live object set. By contrast, the transfer phase of G1 is entirely STW, and its pause time grows with the size of the live object set.


Key technologies of ZGC

ZGC solves the problem of accurately accessing objects while they are being transferred by using colored pointers and read barriers, and thereby achieves concurrent transfer.

The general principle is as follows. "Concurrent" in concurrent transfer means that application threads keep accessing objects while the GC threads are transferring them. Suppose an object has been transferred but the references to it have not yet been updated: an application thread may then access the old address, resulting in an error. In ZGC, the application thread's access triggers a "read barrier". If the barrier finds that the object has been moved, it updates the pointer that was read to the object's new address, so that the application thread always accesses the new address. So how does the JVM tell that an object has been moved? It uses the address in the object reference itself, that is, the colored pointer. The following describes the technical details of colored pointers and read barriers.

Colored pointers are a technique for storing information in the pointer itself.

ZGC supports only 64-bit systems and divides the 64-bit virtual address space into multiple subspaces, as shown in the following figure:

Among them, [0~4TB) corresponds to the Java heap, [4TB ~ 8TB) is called M0 address space, [8TB ~ 12TB) is called M1 address space, [12TB ~ 16TB) is reserved unused, and [16TB ~ 20TB) is called Remapped space.

When an application creates an object, it first requests a virtual address in the heap space, but that virtual address is not mapped to a real physical address. ZGC also requests a virtual address for the object in each of the M0, M1 and Remapped address spaces. These three virtual addresses correspond to the same physical address, but only one of the three spaces is valid at any given time. ZGC sets up three virtual address spaces because it trades space for time to reduce GC pause time. The "space" in this trade is virtual address space, not real physical memory. The process of switching between these three spaces is described in detail later in this article.

Corresponding to the address space partition above, ZGC actually uses only bits 0~41 of the 64-bit address for the object address, while bits 42~45 store metadata and bits 47~63 are fixed at 0. ZGC stores object liveness information in bits 42~45, which is completely different from traditional garbage collectors, which put liveness information in the object header.
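The layout can be made concrete with a few bit masks. The sketch below is an illustration, not ZGC source code: the class and constant names are hypothetical. It uses bits 0~41 for the address and bits 42~45 for metadata, and the bit chosen for each view is derived from the address ranges given earlier (4TB = 2^42 for M0, 8TB = 2^43 for M1, 16TB = 2^44 for Remapped). The article does not name a use for bit 45, so it is left unused here.

```java
// Sketch of the colored pointer layout described in the text; names are hypothetical.
class ColoredPointer {
    static final long ADDRESS_MASK  = (1L << 42) - 1;   // bits 0~41: object address, up to 4TB
    static final long M0            = 1L << 42;          // 4TB:  base of the M0 view
    static final long M1            = 1L << 43;          // 8TB:  base of the M1 view
    static final long REMAPPED      = 1L << 44;          // 16TB: base of the Remapped view
    static final long METADATA_MASK = 0xFL << 42;        // bits 42~45: metadata

    // Address part of the pointer, identical in all three views.
    static long address(long pointer) { return pointer & ADDRESS_MASK; }

    // Metadata ("color") part of the pointer.
    static long metadata(long pointer) { return pointer & METADATA_MASK; }

    // Same object, seen through another view: keep the address, swap the color.
    static long recolor(long pointer, long view) { return address(pointer) | view; }

    static boolean inView(long pointer, long view) { return metadata(pointer) == view; }

    public static void main(String[] args) {
        long p = recolor(0x1234L, M0);
        System.out.println("M0 pointer:       0x" + Long.toHexString(p));
        System.out.println("Remapped pointer: 0x" + Long.toHexString(recolor(p, REMAPPED)));
        System.out.println("same address:     " + (address(p) == 0x1234L));
    }
}
```

Because changing the color never touches the low 42 bits, the three colored values of one pointer differ only in which view they fall into, which matches the idea that the three virtual addresses map to the same physical address.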

Read barrier

A read barrier is a small piece of code that the JVM inserts into the application code. This code is executed when an application thread reads an object reference from the heap. It is important to note that only "reading an object reference from the heap" triggers this code.

Read barrier example:

    Object o = obj.FieldA;   // reads a reference from the heap, so a barrier is needed
    Object p = o;            // no barrier needed: not reading a reference from the heap
    o.dosomething();         // no barrier needed: not reading a reference from the heap
    int i = obj.FieldB;      // no barrier needed: not an object reference

What the read barrier code does in ZGC: during object marking and transfer, it checks whether the reference address of the object satisfies a condition and takes the corresponding action.
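A toy simulation can make the "check, then act" behaviour more tangible. Everything in the sketch below is an assumption made for illustration: the class name, the `long[]` standing in for heap fields, the forwarding table, and the bit positions (bits 0~41 for the address, bit 42 for M0, bit 44 for Remapped, following the address ranges given earlier). The real barrier is machine code emitted by the JIT and is not implemented this way.

```java
import java.util.HashMap;
import java.util.Map;

// Toy simulation of a read barrier on colored pointers; not HotSpot's implementation.
class ReadBarrierDemo {
    static final long ADDRESS_MASK = (1L << 42) - 1;    // bits 0~41: object address
    static final long M0 = 1L << 42;                    // M0 view bit
    static final long REMAPPED = 1L << 44;              // Remapped view bit

    long goodView = REMAPPED;                           // the view that is currently valid
    final Map<Long, Long> forwarding = new HashMap<>(); // old address -> new address
    final long[] heapFields = new long[8];              // each slot holds one colored pointer

    // Conceptually runs right after a load such as "Object o = obj.FieldA".
    long loadReference(int slot) {
        long ptr = heapFields[slot];
        if ((ptr & ~ADDRESS_MASK) == goodView) {
            return ptr;                                 // fast path: the condition is satisfied
        }
        long addr = ptr & ADDRESS_MASK;
        addr = forwarding.getOrDefault(addr, addr);     // object moved? use its new address
        long healed = addr | goodView;
        heapFields[slot] = healed;                      // update the field so later loads are fast
        return healed;
    }

    public static void main(String[] args) {
        ReadBarrierDemo heap = new ReadBarrierDemo();
        heap.heapFields[0] = 0x100L | M0;               // stale pointer left over from marking
        heap.forwarding.put(0x100L, 0x900L);            // the GC moved the object
        long ref = heap.loadReference(0);
        System.out.println("address seen by the application: 0x"
                + Long.toHexString(ref & ADDRESS_MASK));
    }
}
```

In this model the application thread never observes the old address, and the field is corrected on first access, so the extra work is paid at most once per stale reference.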

ZGC concurrent processing demonstration

Next, we describe in detail how the address view switches during one garbage collection cycle in ZGC:

Initialization: after ZGC is initialized, the address view of the entire memory space is set to Remapped. The program runs normally and allocates objects in memory. Once certain conditions are met, garbage collection starts and the marking phase is entered.

Concurrent marking phase: the view is M0 when the marking phase is entered for the first time. If an object is accessed by a GC marking thread or an application thread, the address view of the object is adjusted from Remapped to M0. So at the end of the marking phase, an object's address is in either the M0 view or the Remapped view. If it is in the M0 view, the object is live; if it is in the Remapped view, the object is not live.

Concurrent transfer phase: after marking ends, the transfer phase is entered and the address view is set to Remapped again. If an object is accessed by a GC transfer thread or an application thread, the address view of the object is adjusted from M0 to Remapped.

In fact, the marking phase has two address views, M0 and M1, whereas the process above uses only one of them. The two views exist to distinguish the previous marking from the current one: when the concurrent marking phase is entered for the second time, the address view is set to M1 instead of M0.
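The view switching can be walked through with a small state machine. This is a simplified model under stated assumptions: the names are hypothetical, each object is reduced to the view of a single pointer to it, and "access" stands for either a GC thread or an application thread touching the object.

```java
import java.util.Arrays;

// Toy walk-through of address view switching across GC cycles; names are hypothetical.
class AddressViewDemo {
    enum View { REMAPPED, M0, M1 }

    View current = View.REMAPPED;        // after initialization everything is Remapped
    private View nextMark = View.M0;     // the first marking phase uses M0
    final View[] pointerView;            // view of the pointer to each object

    AddressViewDemo(int objectCount) {
        pointerView = new View[objectCount];
        Arrays.fill(pointerView, View.REMAPPED);
    }

    // Entering concurrent marking: use M0 and M1 alternately.
    void startMark() {
        current = nextMark;
        nextMark = (nextMark == View.M0) ? View.M1 : View.M0;
    }

    // Entering concurrent transfer: the view is set back to Remapped.
    void startTransfer() { current = View.REMAPPED; }

    // A GC thread or application thread touches the object: adjust its pointer's view.
    void access(int obj) { pointerView[obj] = current; }

    // Meaningful while marking: the pointer carries the current mark view.
    boolean isMarkedLive(int obj) { return pointerView[obj] == current; }

    public static void main(String[] args) {
        AddressViewDemo gc = new AddressViewDemo(2);
        gc.startMark();                  // cycle 1 marks with M0
        gc.access(0);                    // object 0 is reached, object 1 is not
        System.out.println("cycle 1: " + gc.isMarkedLive(0) + " / " + gc.isMarkedLive(1));
        gc.startTransfer();              // object 0 is not touched, so its pointer stays M0
        gc.startMark();                  // cycle 2 marks with M1
        System.out.println("cycle 2, before access: " + gc.isMarkedLive(0));
    }
}
```

In the second cycle the leftover M0 pointer is not mistaken for a fresh mark, because the current mark view is M1. With a single mark view, a pointer that was never adjusted back to Remapped would be indistinguishable from one marked in the current cycle.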

Colored pointers and read barriers are applied not only in the concurrent transfer phase but also in the concurrent marking phase. To mark an object, a traditional garbage collector has to make a memory access and store the liveness information in the object header; ZGC only needs to set bits 42~45 of the pointer, and because this is a register access, it is faster than accessing memory.
