2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article covers the Java garbage collector and how memory is allocated. The content is detailed and the logic is clear. Most people are probably not very familiar with this topic, so the article is shared here for reference. I hope you learn something from it. Let's take a look.
Among the runtime data areas of Java memory, the program counter, the virtual machine stack, and the native method stack are created and destroyed along with their thread. Stack frames are pushed onto and popped off the stack in an orderly way as methods are entered and exited. The amount of memory allocated in each stack frame is essentially known once the class structure is determined (the just-in-time compiler may apply some optimizations at run time, but in a discussion based on the conceptual model it can be regarded as known at compile time). Memory allocation and reclamation in these areas are therefore deterministic, and there is little need to think about how to reclaim them: when the method or the thread ends, the memory is reclaimed along with it.
Reference counting algorithm
Add a reference counter to the object. Whenever a new reference to the object is created, the counter is incremented by one; when a reference becomes invalid, the counter is decremented by one. An object whose counter is zero at any moment can no longer be used.
Simple reference counting has difficulty handling circular references between objects.
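The circular reference problem can be illustrated with a toy model. The sketch below is not how any real JVM manages memory; the Node class, its refCount field, and the setField helper are hypothetical names invented for this illustration, and the counters are maintained by hand.

```java
// Toy model of reference counting; all names are hypothetical.
final class RefCountDemo {
    /** A pretend heap object with a manually maintained reference counter. */
    static final class Node {
        final String name;
        int refCount;   // number of references currently pointing at this node
        Node field;     // a single outgoing reference

        Node(String name) { this.name = name; }

        /** Points this node's field at target and adjusts both counters. */
        void setField(Node target) {
            if (target != null) target.refCount++;
            if (field != null) field.refCount--;
            field = target;
        }
    }

    /** Builds a two-node cycle, drops the external references, returns the leftover counts. */
    static int[] countsAfterDroppingCycle() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++;    // the local variable "a" refers to the node
        b.refCount++;    // the local variable "b" refers to the node
        a.setField(b);   // a -> b
        b.setField(a);   // b -> a
        a.refCount--;    // simulate "a = null"
        b.refCount--;    // simulate "b = null"
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingCycle();
        // Neither counter reaches zero, although nothing outside the cycle refers to the nodes.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

Because each node keeps the other's counter at one, a collector that relied on the counters alone would never reclaim either of them.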
Reachability analysis algorithm
The basic idea of this algorithm is to take a set of root objects called "GC Roots" as the starting nodes and search downward from them along reference relationships. The path traversed during the search is called a "reference chain" (Reference Chain). If no reference chain connects an object to the GC Roots, or in graph-theory terms the object is unreachable from the GC Roots, the object can no longer be used.
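The search from the GC Roots is an ordinary graph traversal. The following is a minimal sketch under simplifying assumptions: objects are modelled by a hypothetical Obj class holding a list of outgoing references, and the roots are passed in explicitly. A real collector works on the actual heap layout and is far more involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy reachability analysis; Obj and its fields are hypothetical.
final class ReachabilityDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        Obj(String name) { this.name = name; }
    }

    /** Returns every object reachable from the given roots by following references. */
    static Set<Obj> reachable(List<Obj> gcRoots) {
        Set<Obj> marked = new HashSet<>();          // Obj uses identity equals/hashCode
        Deque<Obj> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {              // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    /** Builds root -> a -> b plus an isolated cycle c <-> d; returns the unreachable names. */
    static List<String> demo() {
        Obj root = new Obj("root"), a = new Obj("a"), b = new Obj("b");
        Obj c = new Obj("c"), d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        Set<Obj> live = reachable(Collections.singletonList(root));
        List<String> garbage = new ArrayList<>();
        for (Obj o : Arrays.asList(root, a, b, c, d)) {
            if (!live.contains(o)) garbage.add(o.name);
        }
        return garbage;
    }

    public static void main(String[] args) {
        System.out.println("unreachable: " + demo());
    }
}
```

Unlike the counting approach, the cycle between c and d causes no trouble here: neither object lies on a reference chain from the roots, so both are reported as unreachable.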
The only purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector. Since JDK 1.2, the PhantomReference class has been provided to implement phantom references.
Use cases for phantom references?
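As a small illustration of the notification mechanism, the sketch below pairs a PhantomReference with a ReferenceQueue from java.lang.ref. The class and method names (PhantomDemo, getReturnsNull) are made up for the example. Whether the notification arrives within the one-second wait depends on the collector, since System.gc() is only a request, so no output is claimed for that part.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

final class PhantomDemo {
    /** A phantom reference never hands its referent back, even while the object is alive. */
    static boolean getReturnsNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

        referent = null;   // drop the only strong reference
        System.gc();       // a request only; the collector may ignore it

        // Wait up to one second for the reference to be enqueued.
        Reference<?> notified = queue.remove(1000);
        System.out.println("get() is null: " + (phantom.get() == null));
        System.out.println("notification received: " + (notified == phantom));
    }
}
```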
Even if an object is found to be unreachable by the reachability analysis algorithm, it is not necessarily dead. At this point it is only "on probation". To truly declare an object dead, it must go through at least two marking passes. If the reachability analysis finds no reference chain connecting the object to the GC Roots, the object is marked for the first time and then filtered. The filter condition is whether the object needs to have its finalize() method executed. If the object does not override finalize(), or if finalize() has already been called by the virtual machine, the virtual machine treats both cases as "no need to execute".
If the object is determined to need its finalize() method executed, it is placed in a queue called F-Queue, and its finalize() method is executed later by a low-priority Finalizer thread that the virtual machine creates automatically.
Here "executed" means that the virtual machine triggers the method but does not promise to wait for it to finish. The reason is that if an object's finalize() method runs slowly or, in the extreme case, enters an infinite loop, other objects in the F-Queue could wait forever, and the entire memory reclamation subsystem could even crash. The finalize() method is the object's last chance to escape death. Later, the collector performs a second, small-scale marking of the objects in F-Queue. If the object manages to save itself in finalize(), which only requires re-associating itself with any object on a reference chain, for example by assigning itself (the this keyword) to a class variable or to a member variable of some object, it is removed from the "about to be reclaimed" set during the second marking. If the object has not escaped by then, it is essentially certain to be reclaimed.
An object can save itself this way only once, because the system automatically calls any object's finalize() method at most once. If the object faces collection again, its finalize() method is not executed a second time, so a second self-rescue attempt fails.
The finalize() method is expensive to run, highly nondeterministic, and gives no guarantee about the order in which objects are finalized. It has now been officially declared deprecated. Some textbooks describe it as suitable for cleanup work such as "closing external resources", but that is merely self-consolation about the method's purpose. Everything finalize() can do can be done better and in a more timely manner with try-finally or other approaches, so the author suggests simply forgetting that this method exists in the Java language.
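To make the alternative concrete, here is a minimal sketch of deterministic cleanup with try-with-resources, a compact form of try-finally for objects that implement AutoCloseable. The Resource class and its log are hypothetical and exist only to make the order of events visible.

```java
// Deterministic cleanup without finalize(); Resource is a made-up example class.
final class CleanupDemo {
    static final class Resource implements AutoCloseable {
        private final StringBuilder log;

        Resource(StringBuilder log) {
            this.log = log;
            log.append("open;");
        }

        void use() {
            log.append("use;");
        }

        @Override
        public void close() {
            // Runs as soon as the try block exits, normally or through an exception.
            log.append("close;");
        }
    }

    /** Opens, uses, and closes a resource; returns the recorded order of events. */
    static String run() {
        StringBuilder log = new StringBuilder();
        try (Resource r = new Resource(log)) {
            r.use();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints "open;use;close;"
    }
}
```

The close() call happens at a known point in the program, which is exactly what finalize() cannot promise.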
Reclaiming the method area
The Java Virtual Machine Specification states that a virtual machine is not required to implement garbage collection in the method area. In fact, some collectors do not implement, or do not fully implement, type unloading in the method area (for example, the ZGC collector in JDK 11 does not support class unloading).
Garbage collection in the method area mainly reclaims two kinds of content: obsolete constants and types that are no longer used. Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Suppose the string "java" has entered the constant pool, but the system currently has no string object whose value is "java". In other words, no string object references the "java" constant in the constant pool, and nothing else in the virtual machine references this literal. If memory reclamation occurs at this point and the garbage collector judges it necessary, the "java" constant is cleaned out of the constant pool. Symbolic references to other classes (interfaces), methods, and fields in the constant pool are handled similarly.
Determining whether a constant is "obsolete" is relatively simple, but the conditions for deciding that a type is "no longer used" are more demanding. The following three conditions must all be met:
All instances of the class have been reclaimed, that is, no instances of the class or of any of its subclasses exist in the Java heap.
The class loader that loaded the class has been reclaimed. This condition is usually hard to satisfy except in carefully designed scenarios with replaceable class loaders, such as OSGi or JSP reloading.
The java.lang.Class object corresponding to the class is not referenced anywhere, so the methods of the class cannot be accessed anywhere through reflection.
The Java virtual machine is allowed to reclaim useless classes that meet the above three conditions, but it is only "allowed" to; unlike objects, classes are not necessarily reclaimed once they are no longer referenced. The HotSpot virtual machine provides the -Xnoclassgc parameter to control whether types are reclaimed. You can also use -verbose:class, -XX:+TraceClassLoading, and -XX:+TraceClassUnloading to view class loading and unloading information. Of these, -verbose:class and -XX:+TraceClassLoading can be used in the Product build of the virtual machine, while -XX:+TraceClassUnloading requires a FastDebug build [1].
In scenarios that make heavy use of reflection, dynamic proxies, and bytecode frameworks such as CGLib, or that dynamically generate JSPs, or that frequently use custom class loaders as OSGi does, the Java virtual machine usually needs the ability to unload types so that the method area does not come under excessive memory pressure.
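Besides the command-line flags, class loading and unloading counts can also be read from inside a running program through the standard java.lang.management API. This is not something the text above describes; it is a small sketch offered as a complementary way to observe the same information, and the ClassLoadingStats class name is made up.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

final class ClassLoadingStats {
    /** Returns {currently loaded, total loaded since JVM start, unloaded since JVM start}. */
    static long[] snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return new long[] {
            bean.getLoadedClassCount(),
            bean.getTotalLoadedClassCount(),
            bean.getUnloadedClassCount()
        };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("loaded now:   " + s[0]);
        System.out.println("loaded total: " + s[1]);
        System.out.println("unloaded:     " + s[2]);
    }
}
```

An unloaded count that stays at zero in an application that keeps generating classes may be a hint that type unloading is not taking place.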
Garbage collection algorithm
In terms of how object death is determined, garbage collection algorithms fall into two categories: "reference counting garbage collection" (Reference Counting GC) and "tracing garbage collection" (Tracing GC). These are also often called "direct garbage collection" and "indirect garbage collection" respectively.
Generational collection theory
1) Weak Generational Hypothesis: the vast majority of objects die young.
2) Strong Generational Hypothesis: the more garbage collections an object has survived, the less likely it is to die.
3) Intergenerational Reference Hypothesis: intergenerational references are only a small minority compared with references within the same generation.
This is in fact an implicit corollary of the first two hypotheses: two objects that reference each other tend to live or die together. For example, if a young generation object is referenced from the old generation, the old generation object is hard to kill, so the reference keeps the young generation object alive through collections. As the young object ages it is promoted to the old generation, at which point the intergenerational reference disappears as well.
Generational collection is not as simple as dividing memory into areas. It has at least one obvious difficulty: objects are not isolated, and objects in different generations may reference each other.
Suppose you want to perform a collection limited to the young generation (Minor GC), but objects in the young generation may be referenced from the old generation. To find the live objects in that area, you would have to traverse every object in the old generation in addition to the fixed GC Roots to guarantee a correct reachability analysis, and the reverse is also true. Traversing every object in the old generation is feasible in theory, but it would place a heavy performance burden on memory reclamation. To solve this problem, a third rule of thumb has to be added to generational collection theory: the intergenerational reference hypothesis.
Based on this hypothesis, there is no need to scan the entire old generation for the sake of a few intergenerational references, nor to waste space recording, for every object, whether it has intergenerational references and which ones. It is enough to set up a global data structure for the young generation (called the "remembered set", Remembered Set) that divides the old generation into small blocks and identifies which blocks contain intergenerational references. When a Minor GC then occurs, only the objects in the blocks that contain intergenerational references are added to the GC Roots for scanning. This approach must keep the recorded data correct whenever an object changes its references (for example, when it assigns itself or one of its fields somewhere), which adds some runtime overhead, but it is still cheaper than scanning the entire old generation at collection time.
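The idea of dividing the old generation into small blocks and flagging those that hold intergenerational references can be sketched as a toy "card table". Everything here is an assumption made for illustration: the 512-byte block size, the use of plain integer offsets as addresses, and the names CardTableDemo, recordCrossGenWrite, and cardsToScan. Real collectors implement the bookkeeping with write barriers inside the virtual machine.

```java
import java.util.ArrayList;
import java.util.List;

// Toy remembered set in card-table form; sizes and names are hypothetical.
final class CardTableDemo {
    static final int CARD_SIZE = 512;   // bytes per block, assumed for illustration
    private final boolean[] dirty;      // one flag per block of the old generation

    CardTableDemo(int oldGenBytes) {
        dirty = new boolean[(oldGenBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    /** Bookkeeping run when a field at oldGenOffset is set to point into the young generation. */
    void recordCrossGenWrite(int oldGenOffset) {
        dirty[oldGenOffset / CARD_SIZE] = true;
    }

    /** Blocks a Minor GC must scan in addition to the regular GC Roots. */
    List<Integer> cardsToScan() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) cards.add(i);
        }
        return cards;
    }

    /** An old generation of 4096 bytes (8 blocks) with three recorded writes. */
    static List<Integer> demo() {
        CardTableDemo table = new CardTableDemo(4096);
        table.recordCrossGenWrite(100);    // falls in block 0
        table.recordCrossGenWrite(130);    // block 0 again
        table.recordCrossGenWrite(3000);   // falls in block 5
        return table.cardsToScan();
    }

    public static void main(String[] args) {
        System.out.println("blocks to scan: " + demo());   // prints "blocks to scan: [0, 5]"
    }
}
```

Only two of the eight blocks need scanning during the Minor GC in this example, at the cost of one extra array write per recorded reference update.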
Partial collection (Partial GC): garbage collection that does not target the entire Java heap. It is divided into:
■ Young generation collection (Minor GC/Young GC): garbage collection that targets only the young generation.
■ Old generation collection (Major GC/Old GC): garbage collection that targets only the old generation. Currently, only the CMS collector collects the old generation on its own.
■ Mixed collection (Mixed GC): garbage collection that targets the entire young generation and part of the old generation. Currently, only the G1 collector behaves this way.
■ Full collection (Full GC): garbage collection that covers the entire Java heap and the method area.
Mark-sweep algorithm
It has two main drawbacks. The first is unstable execution efficiency: if the Java heap contains a large number of objects and most of them need to be reclaimed, a large number of mark and sweep operations must be performed, so the efficiency of both phases drops as the number of objects grows. The second is fragmentation of the memory space: marking and sweeping leave behind a large number of discontiguous memory fragments. With too much fragmentation, the program may later fail to find enough contiguous memory when it needs to allocate a large object, and it has to trigger another garbage collection early.
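The fragmentation problem is easy to see in a toy heap. In the sketch below the heap is an int array in which each slot holds the id of the object occupying it (0 means free), and the set of live ids is assumed to have been produced by an earlier marking phase. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy sweep phase; heap slots hold object ids, 0 means free. Names are hypothetical.
final class MarkSweepDemo {
    /** Frees every slot whose owner was not marked live. Objects are not moved. */
    static int[] sweep(int[] heap, Set<Integer> live) {
        int[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != 0 && !live.contains(after[i])) after[i] = 0;
        }
        return after;
    }

    static int totalFree(int[] heap) {
        int free = 0;
        for (int slot : heap) {
            if (slot == 0) free++;
        }
        return free;
    }

    static int largestFreeRun(int[] heap) {
        int best = 0, run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Four two-slot objects; objects 2 and 4 are dead. Returns {total free, largest free run}. */
    static int[] demo() {
        int[] heap = { 1, 1, 2, 2, 3, 3, 4, 4 };
        int[] after = sweep(heap, new HashSet<>(Arrays.asList(1, 3)));
        return new int[] { totalFree(after), largestFreeRun(after) };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("free slots: " + r[0] + ", largest contiguous run: " + r[1]);
    }
}
```

After the sweep four slots are free, but the largest contiguous run is only two slots, so a request for a three-slot object would fail even though enough memory is free in total.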
Mark-copy algorithm
The mark-copy algorithm was proposed to address the low efficiency of mark-sweep when a large number of objects are reclaimable. At present, most commercial Java virtual machines prefer this algorithm for collecting the young generation.
It divides the available memory into two halves of equal capacity and uses only one of them at a time. When that half is used up, the surviving objects are copied to the other half, and the used half is then cleaned up in one go. If most objects in memory are alive, the algorithm incurs a lot of copying overhead. When most objects are reclaimable, however, only a few survivors need to be copied. And because a whole half is reclaimed each time, allocation does not have to deal with fragmentation: it only needs to bump the top-of-heap pointer and allocate sequentially.
The cost of this copying algorithm is that the available memory is cut in half, which wastes rather a lot of space.
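A toy version of the two-halves scheme is sketched below, assuming a heap of fixed-size slots and a set of live object ids handed in by the caller. The names SemispaceDemo, allocate, and collect are invented for the example; a real collector would also have to update every reference to the copied objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy semispace copying with bump-pointer allocation. Names are hypothetical.
final class SemispaceDemo {
    private int[] fromSpace;   // the half currently used for allocation
    private int[] toSpace;     // the idle half, target of the next copy
    private int top;           // bump pointer: next free slot in fromSpace

    SemispaceDemo(int slotsPerHalf) {
        fromSpace = new int[slotsPerHalf];
        toSpace = new int[slotsPerHalf];
    }

    /** Allocates one slot for the object id; returns its index, or -1 if the half is full. */
    int allocate(int id) {
        if (top == fromSpace.length) return -1;
        fromSpace[top] = id;
        return top++;
    }

    /** Copies the live objects to the idle half, clears the used half, and swaps the roles. */
    void collect(Set<Integer> live) {
        int next = 0;
        for (int i = 0; i < top; i++) {
            if (live.contains(fromSpace[i])) toSpace[next++] = fromSpace[i];
        }
        Arrays.fill(fromSpace, 0);   // the whole used half is reclaimed at once
        int[] tmp = fromSpace;
        fromSpace = toSpace;
        toSpace = tmp;
        top = next;                  // survivors are packed, so allocation resumes right after them
    }

    /** Fills one half, fails to allocate, collects, then retries. Returns both allocation results. */
    static int[] demo() {
        SemispaceDemo heap = new SemispaceDemo(4);
        for (int id = 1; id <= 4; id++) heap.allocate(id);
        int failed = heap.allocate(5);                       // half is full
        heap.collect(new HashSet<>(Arrays.asList(2, 4)));    // only objects 2 and 4 survive
        int retried = heap.allocate(5);
        return new int[] { failed, retried };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("before GC: " + r[0] + ", after GC: " + r[1]);
    }
}
```

The heap in the demo holds eight slots in total, yet at most four are ever usable at once, which is the halving cost the text describes.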
Appel-style collection divides the young generation into one larger Eden space and two smaller Survivor spaces, and each round of allocation uses only Eden and one of the Survivor spaces. When garbage collection occurs, the objects still alive in Eden and in the Survivor space in use are copied to the other Survivor space in one pass, and Eden and the used Survivor space are then cleaned up directly. In the HotSpot virtual machine the default size ratio of Eden to one Survivor space is 8:1, which means the usable memory in the young generation is 90% of its total capacity (80% for Eden plus 10% for one Survivor space), and only one Survivor space, that is 10% of the young generation, is "wasted". When the Survivor space is not large enough to hold the objects that survive a Minor GC, other memory areas (in practice mostly the old generation) must be relied on to provide an allocation guarantee (Handle Promotion).
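The percentages follow directly from the ratio. As a worked example, the sketch below splits a young generation of a given size into Eden and two Survivor spaces for a ratio of Eden to one Survivor space of `ratio:1`. The YoungGenLayout class is hypothetical, and the real virtual machine also aligns region sizes, so actual values can differ slightly.

```java
// Arithmetic behind the 8:1 Eden-to-Survivor split. Names are hypothetical.
final class YoungGenLayout {
    /** Returns {eden, one survivor, usable} in bytes for a split of ratio:1:1. */
    static long[] layout(long youngBytes, int ratio) {
        long survivor = youngBytes / (ratio + 2);   // ratio parts Eden plus 2 Survivor parts
        long eden = youngBytes - 2 * survivor;
        return new long[] { eden, survivor, eden + survivor };
    }

    public static void main(String[] args) {
        long young = 100L * 1024 * 1024;   // a 100 MB young generation, chosen for the example
        long[] l = layout(young, 8);
        System.out.println("eden=" + l[0] + " survivor=" + l[1] + " usable=" + l[2]);
        System.out.println("usable percent: " + (100 * l[2] / young));   // prints 90
    }
}
```

With 100 MB and a ratio of 8, Eden gets 80 MB and each Survivor space 10 MB, so 90 MB can hold objects at any time and 10 MB is held in reserve.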
Mark-compact algorithm
Based on the survival characteristics of old generation objects, Edward Lueders proposed a targeted "mark-compact" (Mark-Compact) algorithm in 1974. Its marking phase is the same as in the mark-sweep algorithm, but the next step does not clean up reclaimable objects in place. Instead, all surviving objects are moved toward one end of the memory space, and the memory beyond the boundary is then cleaned up directly.
The essential difference between the mark-sweep algorithm and the mark-compact algorithm is that the former is a non-moving collection algorithm while the latter is a moving one. Whether to move surviving objects after collection is a risky decision with both advantages and disadvantages:
If surviving objects are moved, then especially in the old generation, where large numbers of objects survive each collection, moving them and updating every place that references them is an extremely heavy operation. Moreover, this kind of object movement requires pausing the user application for its entire duration, which forces users to weigh the drawbacks carefully. Pauses like this were vividly described as "Stop The World" by the original virtual machine designers.
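The compaction step can be sketched on a toy heap in which each slot holds the id of the object occupying it (0 means free) and the set of live ids comes from an earlier marking phase. This is a simplified illustration with hypothetical names; it leaves out the expensive part of a real implementation, which is updating every reference to the moved objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy compaction phase; heap slots hold object ids, 0 means free. Names are hypothetical.
final class MarkCompactDemo {
    /** Slides all live slots toward the start of the heap; everything past them is free. */
    static int[] compact(int[] heap, Set<Integer> live) {
        int[] after = new int[heap.length];   // zero-filled, so the tail is already free
        int next = 0;                         // boundary between live data and free space
        for (int slot : heap) {
            if (slot != 0 && live.contains(slot)) after[next++] = slot;
        }
        return after;
    }

    /** Four two-slot objects; objects 2 and 4 are dead. */
    static int[] demo() {
        int[] heap = { 1, 1, 2, 2, 3, 3, 4, 4 };
        return compact(heap, new HashSet<>(Arrays.asList(1, 3)));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));   // prints [1, 1, 3, 3, 0, 0, 0, 0]
    }
}
```

The four free slots now form one contiguous run at the end of the heap, so subsequent allocation can again proceed by simply advancing a pointer.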
[1] The newer ZGC and Shenandoah collectors use read barrier (Read Barrier) technology to run the compaction process concurrently with user threads. How these collectors work is described later.
[2] The mark-sweep algorithm usually also needs to pause user threads to mark and clean up reclaimable objects, but the pause is relatively short.
However, if surviving objects are never moved or compacted, as in the mark-sweep algorithm, the fragmentation caused by live objects scattered across the heap can only be handled by more complex memory allocators and memory accessors. One example is allocating memory through a "partitioned free list" (large files on a computer hard disk do not need physically contiguous disk space either; the ability to store and access data on a fragmented disk is provided by the disk's partition table). Memory access is the most frequent operation a user program performs, bar none. Adding extra burden at this point directly affects application throughput.
Given these two points, either choice has drawbacks: memory reclamation is more complex if objects are moved, and memory allocation is more complex if they are not. In terms of garbage collection pause time, not moving objects gives shorter pauses, or even no pauses at all, but in terms of the throughput of the whole program, moving objects is the better deal. Here, throughput essentially means the combined efficiency of the Mutator and the collector (the Mutator is the user program that uses garbage collection; for ease of understanding, most of this text says "user program" or "user thread" instead).
In the HotSpot virtual machine, the throughput-oriented Parallel Scavenge collector is based on the mark-compact algorithm, while the latency-oriented CMS collector is based on the mark-sweep algorithm, which supports this point indirectly.
There is also a compromise approach that avoids adding too much extra burden to memory allocation and access. The virtual machine uses the mark-sweep algorithm most of the time and tolerates fragmentation temporarily. Only when fragmentation becomes severe enough to affect object allocation does it run the mark-compact algorithm once to obtain a regular memory space again. The CMS collector mentioned above, which is based on the mark-sweep algorithm, takes this approach when it faces too much fragmentation.