Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to implement a garbage Collector in Java

2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/02 Report--

In this issue, the editor will bring you about how to achieve a garbage collector in Java. The article is rich in content and analyzes and describes for you from a professional point of view. I hope you can get something after reading this article.

1. Object survival determination algorithm:

1) reference counting algorithm: add a reference counter to each object, add 1 whenever there is a local reference, and subtract 1 when the reference expires. An object with a counter of 0 at any time can no longer be referenced. However, the mainstream Java virtual machine does not choose the reference counter algorithm to manage memory, the main reason is that it is difficult to solve the problem of circular reference between objects.

2) accessibility analysis algorithm: through a series of objects called "GCRoots" as the starting point, the search path is called reference chain. When an object does not have any reference chain to GCRoots, it is proved that the object is unavailable. Mainstream implementation.

In Java language, the objects that can be used as GCRoots include the following: a. Objects referenced in the virtual machine stack; b. The object referenced by the class static property in the method area; c. The object referenced by the constant in the method area; d. The object referenced by JNI in the local method stack.

2. Live or die

To actually declare an object dead, you have to go through at least two marking processes:

1) after the reachability analysis, it is found that it is unreachable, mark it for the first time, and put the objects that do not need to execute the finalize method into a F-Queue queue

2) GC makes a second small-scale tag on the objects in the F-Queue queue, and if it is still unreachable, it will basically be recycled.

3. On finalize method

1) when the object does not override the finalize method, or the finalize method has been called by the virtual machine, the virtual machine regards both cases as "unnecessary execution". The so-called "execution" means that the virtual machine triggers the method, but does not promise to wait for it to finish running.

2) the finalize method will only be called once by the system

3) all the work that the finalize method can do, try-finally or other methods can do better and more timely, so it is recommended that you can completely forget the existence of this method in the Java language.

4. Garbage collection algorithm: tag-clean, copy, mark-organize, generation collection

1) tag-cleanup algorithm

a. It is divided into two stages: marking and cleaning, first marking the objects that need to be recycled, and uniformly recycling all the marked objects after the marking is completed.

b. Deficiency: one is the efficiency problem, the two processes of marking and cleaning are not efficient; the other is the space problem, which will produce a large number of discontinuous memory fragments after mark cleanup. Too much space debris may result in a garbage collection action that has to be triggered in advance when the program needs to allocate larger objects later, because it is unable to find enough continuous memory.

2) replication algorithm

a. Divide the available memory into two equal chunks of capacity, using only one of them at a time. When one piece of memory is used up, copy the surviving objects to another, and then clean up the used memory space at once. Efficiency is improved, and there is no memory fragmentation problem. The cost is that the memory has been reduced to half its original size.

b. 98% of the new generation of objects are "life and death", so instead of dividing the memory space according to the 1:1 ratio, the memory is divided into a larger Eden space and two smaller Survivor spaces, using Eden and one of the Survivor at a time. When recycling, copy the surviving objects in Eden and Survivor to another piece of Survivor space at once, and finally clean up the Eden and the Survivor space you just used. The default size ratio of Eden and Survivor for HotSpot virtual machines is 8:1, that is, 90% of the memory space available in each new generation is 90% of the entire new generation capacity, and only 10% of the memory is "wasted". If another piece of Survivor space does not have enough space to store the surviving objects collected by the last Cenozoic generation, these objects will enter the old age directly through the allocation guarantee mechanism.

3) marking-finishing algorithm

a. There are two stages: tagging and sorting, first marking the objects that need to be recycled, then moving the surviving objects to one end, and then directly cleaning up the memory beyond the boundary.

4) Generation collection algorithm

a. Generally, the Java heap is divided into the new generation and the old age, so that the most appropriate collection algorithm can be adopted according to the characteristics of each age. In the new generation, each garbage collection found that a large number of objects died, only a small number of survival, then choose the replication algorithm, only need to pay a small amount of replication cost of surviving objects to complete the collection. In the old days, because the object had a high survival rate and no extra space to allocate guarantee, it was necessary to use the "mark-clean" or "mark-organize" algorithm for recycling.

5. Algorithm implementation of HotSpot.

1) enumerating root nodes: as the current mainstream Java virtual machines use accurate GC, when the execution system pauses, there is no need to check all execution contexts and global reference locations. The virtual machine should have a way to directly know where the object references are stored. In the implementation of HotSpot, a set of data structures called OopMap are used to record object references.

2) safe point: the program can not be paused to start GC in all places during execution, but can only be paused when the safe point is reached. The selection of safety points should not be so small that the GC wait time is too frequent or too frequent to increase the runtime load too much. Therefore, the selection of security points is basically based on whether the program has the characteristics of allowing the program to execute for a long time. The most obvious feature of "long-time execution" is to reuse sequences, such as method calls, loop jumps, exception jumps, etc., so instructions with these functions will produce security points. Another issue to consider is how to get all threads to "run" to the nearest safe point and then stop when GC occurs. There are two options: preemptive interrupt and active interrupt. Few virtual machine implementations now use preemptive interrupts to pause threads in response to GC events. The idea of active interrupt when GC needs to interrupt the thread, it does not operate directly on the thread, but simply sets a flag, each thread takes the initiative to poll this flag when it is executed, and finds that when the interrupt flag is true, the interrupt is suspended, and the polling flag coincides with the safety point, in addition to the need to allocate memory to create objects.

3. Security zone: means that the reference relationship does not change in a code snippet. It is safe to start GC anywhere in this area. When a thread executes code in a safe zone, it first identifies that it has entered the safe zone, so that when JVM initiates a GC during that time, it doesn't care about the thread that identifies itself as the safe zone. When the thread leaves the safe zone, it checks to see if the system has completed the root node enumeration, and if so, the thread continues to execute, otherwise it must wait until it receives a signal that it can safely leave the safe zone.

6. Garbage collector: a. New generation collectors: Serial, Parnew, Parallel Scavenge,b. Old era collectors: Parallel Old, CMS, Serial Old,c.G1.

1) Serial collector: a single-threaded collector, when it performs garbage collection, must pause all other worker threads until its collection is finished.

2) ParNew collector: a multithreaded version of the Serial collector, except for the Serial collector, which currently works with the CMS collector

3) Parallel Scavenge collector: achieves a controllable throughput at target, also known as "throughput first" collector

4) Serial Old collector: an old version of the Serial collector, which is also a single-threaded collector, using the tag-collation algorithm

5) Parallel Old collector: an old version of the Parallel Scavenge collector that uses multithreading and mark-up algorithms to give priority to the Parallel Scavenge + Parallel Old collector in situations that focus on throughput and CPU resource sensitivity.

6) CMS collector: the collector, which aims to obtain the shortest recovery pause time, is implemented based on the "mark-clean" algorithm. The advantages are: concurrent collection and low pause.

7) G1 collector: it divides the whole Java heap into several independent regions of equal size, and can establish a predictable pause time model. G1 tracks the value of garbage accumulation in each Region and maintains a priority list in the background, giving priority to the collection of the most valuable Region each time according to the allowed collection time.

7. GC log: the daily solstice of each collector is different, but all have something in common. Here is a typical GC log.

1) 33.125: [GC [DefNew: 3324K-> 152K (3712K), 0.0025925 secs] 3324K-> 152K (11904K), 0.0031680 secs]

2) 100.667: [Full GC [Tenured: 0K-> 210K (10240K), 0.0149142 secs] 4603K-> 210K (19456K), [Perm: 2999K-> 2999K (21248K)], 0.015007 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]

a. The initial number represents the time when the GC occurred, which means the number of seconds since the Java virtual machine started

B. [GC and [Full GC] explain the type of standstill in this garbage collection, not to distinguish between the new generation GC and the old GC. Full, indicating that a Stop-The-World has occurred in GC this time.

C. [DefNew, [Tenured, [Perm] represent the area where GC occurs. Different garbage collectors display different names.

d. Inside the square brackets 3324-> 152K (3712) indicates "capacity used in this memory area before GC-> capacity used in this memory area after GC (total capacity of this memory area)"

e. 3324-> 152K (11904K) outside square brackets indicates "Java heap used capacity before GC-> Java heap used capacity after GC (total Java heap capacity)"

f. After that, 0.0025925 secs indicates the time taken by the GC in the memory area (in seconds). Some collectors will give a more specific time. User, sys and real represent the CPU time consumed in the user mode, the CPU time consumed in the kernel state, and the wall clock time (Wall Clock Time) of the operation from start to finish. The difference between CPU time and wall clock time is that wall clock time includes all kinds of non-operational waiting time.

Memory allocation rules: here are some of the most common memory allocation rules

1) objects are assigned first in Eden

2) large objects directly enter the old era: the so-called large objects are Java objects that require a lot of continuous memory space. The most typical large objects are very long strings and arrays to avoid short-lived large objects.

3) long-term surviving objects directly enter the old age: the virtual machine defines an object age counter for each object. If the object is still alive after each MinorGC and can be accommodated by Survivor, the age of the object is increased by 1. When its age reaches a certain level (default is 15), it will be promoted to the old age.

4) dynamic object age determination: the virtual machine does not require that the age of the object must reach MaxTenuringThreshold in order to promote the old age. If the total size of all objects of the same age in the Survivor space is more than half of that of the Survivor space, the objects whose age is greater than or equal to that age can directly enter the old age.

9. Space allocation guarantee

Before the occurrence of MinorGC, the virtual machine first checks whether the maximum available contiguous space in the old era is larger than the total space of all objects in the new generation. If this condition is true, then MinorGC can ensure that it is safe. If not, the virtual machine will check whether the HandlePromotionFailure setting value allows guarantee to fail. If so, it will continue to check whether the maximum available contiguous space in the old age is greater than the average size of objects promoted to the old age. If it is greater than, an MinorGC will be attempted, although this time the MinorGC is risky, and if it is less than or the HandlePromotionFailure setting does not allow risk, then do a FullGC instead.

This is how to achieve a garbage collector in the Java shared by the editor. If you happen to have similar doubts, you might as well refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report