2025-04-01 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces Python's garbage collection mechanism: what it is, why it is needed, and how it works.
First, introduction
When the interpreter executes a statement that defines a variable, it requests memory to store the variable's value. Memory capacity is limited, so the memory occupied by variable values must eventually be reclaimed. When a variable value is no longer useful (garbage, for short), the memory it occupies should be reclaimed. So what kind of variable value counts as useless?
Because the variable name is the only way to access a variable value, a value that is no longer associated with any variable name can no longer be accessed; it is useless and should be collected as garbage. Requesting and reclaiming memory by hand is laborious and risky: a small mistake can cause a memory leak or exhaust memory. Fortunately, the CPython interpreter provides an automatic garbage collection mechanism that handles this for us.
Second, what is the garbage collection mechanism?
The garbage collection mechanism (GC for short) is a mechanism built into the Python interpreter that reclaims the memory occupied by variable values that can no longer be accessed.
Third, why use the garbage collection mechanism?
A program requests a large amount of memory while it runs. If memory that is no longer needed is not cleaned up in time, memory is eventually exhausted and the program crashes. Managing memory is therefore an important and complicated task, and the Python interpreter's garbage collection mechanism frees programmers from it.
Fourth, how the garbage collection mechanism works
Python's GC mainly uses reference counting to track and collect garbage. On top of reference counting, "mark and sweep" solves the problem of circular references that container objects can create, and "generational collection" trades space for time to make garbage collection more efficient.
4.1. What is reference counting?
The reference count is the number of variable names that a variable value is associated with.
For example: age = 18
The variable value 18 is associated with one variable name, age, so its reference count is 1.
Reference count increases:
age = 18 (at this point, the reference count of the variable value 18 is 1)
m = age (the memory address held by age is also given to m, so m is now associated with 18 as well, and the reference count of the variable value 18 is 2)
Reference count decreases:
age = 10 (the name age is disassociated from the value 18 and associated with 10, so the reference count of the variable value 18 is 1)
del m (del disassociates the variable name m from the variable value 18, so the reference count of the variable value 18 is 0)
Once the reference count of 18 drops to 0, the memory it occupies should be reclaimed by the interpreter's garbage collection mechanism.
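The counting steps above can be observed with a minimal sketch, assuming the CPython interpreter. It uses sys.getrefcount and weakref.ref from the standard library. The Value class and the name probe are hypothetical stand-ins introduced here, because CPython caches small integers such as 18 and their counts are not representative.

```python
import sys
import weakref


class Value:
    """Hypothetical stand-in for a variable value such as 18."""


age = Value()                      # one name is bound to the value
base = sys.getrefcount(age)        # the reported number may include the call's own temporary reference

m = age                            # a second name is bound to the same value
after_bind = sys.getrefcount(age)  # one higher than base

probe = weakref.ref(age)           # a weak reference does not raise the count
age = Value()                      # rebinding age removes one reference to the old value
del m                              # deleting m removes the last reference to the old value

reclaimed = probe() is None        # True once the old value has been freed
print(after_bind - base, reclaimed)
```

Only the difference between the two counts is meaningful here; the absolute numbers depend on interpreter internals.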
4.2. Reference counting: further reading
Every time the number of associations of a variable value goes up or down, the reference counting mechanism has to run (to increment or decrement the value's reference count), which has an obvious cost in execution efficiency.
If execution efficiency were its only weakness, that would be tolerable. Unfortunately, reference counting also has a fatal weakness: circular references (also known as cross-references).
# We define two lists, called list 1 and list 2 for short; the name L1 points to list 1 and the name L2 points to list 2
>>> L1 = ['xxx']  # list 1 is referenced once, so its reference count becomes 1
>>> L2 = ['yyy']  # list 2 is referenced once, so its reference count becomes 1
>>> L1.append(L2)  # appends list 2 to L1 as its second element; the reference count of list 2 becomes 2
>>> L2.append(L1)  # appends list 1 to L2 as its second element; the reference count of list 1 becomes 2
# L1 and L2 now reference each other:
# L1 = ['xxx', memory address of list 2]
# L2 = ['yyy', memory address of list 1]
>>> L1
['xxx', ['yyy', [...]]]
>>> L2
['yyy', ['xxx', [...]]]
>>> L1[1][1][0]
'xxx'
A circular reference can leave a value that is no longer associated with any name but whose reference count is not zero, so it should be reclaimed but is not. What does that mean? Consider the following:
>>> del L1  # the reference count of list 1 decreases by 1 and becomes 1
>>> del L2  # the reference count of list 2 decreases by 1 and becomes 1
At this point, only the mutual references between list 1 and list 2 remain. The reference count of neither list is 0, yet neither list is associated with any name any more, so nothing can reach them. The memory they occupy should be reclaimed, but because of the mutual references the reference count of each object never reaches 0, and with reference counting alone that memory would never be released. Circular references are therefore fatal: the result is no different from a memory leak under manual memory management.
Therefore, Python introduces "mark and sweep" and "generational collection" to solve the circular reference problem and the efficiency problem of reference counting, respectively.
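The leak described above can be reproduced with a small sketch, assuming CPython. The Node class is a hypothetical container used in place of the two lists, because plain lists cannot be weakly referenced. gc.disable switches off only the automatic cycle collector; reference counting keeps running.

```python
import gc
import weakref


class Node:
    """Hypothetical container object standing in for list 1 and list 2."""

    def __init__(self, name):
        self.name = name
        self.other = None


gc.disable()                      # leave only reference counting active

a = Node("list 1")
b = Node("list 2")
a.other = b                       # list 1 references list 2
b.other = a                       # list 2 references list 1

probe = weakref.ref(a)            # lets us check whether list 1 still exists
del a                             # count of list 1 drops to 1, not 0
del b                             # count of list 2 drops to 1, not 0

still_alive = probe() is not None  # the cycle keeps both objects in memory
gc.enable()                       # restore the default behaviour
print(still_alive)
```

No name refers to either object after the two del statements, yet the weak reference still resolves, which is the situation reference counting alone cannot clean up.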
4.2.1 Mark and sweep
Container objects (such as list, set, dict, class, and instance objects) can hold references to other objects, so they can form circular references. The mark-and-sweep algorithm exists to solve the circular reference problem.
Before looking at the mark-and-sweep algorithm, note that memory holds variables in two areas: the stack area and the heap area. When a variable is defined, the association between the variable name and the memory address of its value is stored in the stack area, while the value itself is stored in the heap area. Memory management reclaims the contents of the heap area. In detail:
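Two variables are defined: x = 10 and y = 20.
When we execute x = y, the stack area and heap area change as follows: the name x in the stack area now holds the memory address of 20, and the value 10 in the heap area is no longer associated with any name.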
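The effect of the assignment can be checked with a short sketch. It relies on the built-in id function, which in CPython returns the memory address of an object; the name address_of_20 is introduced here for illustration only.

```python
x = 10
y = 20
address_of_20 = id(y)      # in CPython, id() is the memory address of the value

x = y                      # x now holds the same address as y

same_object = x is y       # both names refer to one value in the heap area
print(same_object, id(x) == address_of_20)
```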
The mark-and-sweep algorithm pauses the entire program when the application's available memory is exhausted, and then performs two tasks: first marking, then clearing.
1. Mark
Marking traverses all GC Roots objects (everything in the stack area, as well as threads, can serve as GC Roots) and marks every object that can be reached from them, directly or indirectly, as alive. All remaining objects are not alive and should be cleared.
2. Clear
The cleanup process traverses all objects in the heap and removes all unmarked objects.
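The two phases can be sketched as a toy model. This is an illustration of the algorithm as described here, not CPython's actual implementation. The dictionaries stack and heap, the object labels, and the functions mark and sweep are all hypothetical names chosen for the example.

```python
def mark(heap, roots):
    """Return the labels of every object reachable from the roots."""
    alive = set()
    pending = list(roots)
    while pending:
        label = pending.pop()
        if label in alive:
            continue               # already visited, which also stops cycles from looping
        alive.add(label)
        pending.extend(heap[label])  # follow the references this object holds
    return alive


def sweep(heap, alive):
    """Return a new heap that keeps only the marked objects."""
    return {label: refs for label, refs in heap.items() if label in alive}


# Stack area: variable name -> object label. Heap area: object label -> labels it references.
stack = {"L1": "list 1", "L2": "list 2", "x": "int 10"}
heap = {
    "list 1": ["str xxx", "list 2"],
    "list 2": ["str yyy", "list 1"],
    "str xxx": [],
    "str yyy": [],
    "int 10": [],
}

alive_before = mark(heap, stack.values())  # everything is reachable

del stack["L1"]
del stack["L2"]

alive_after = mark(heap, stack.values())   # the two lists are no longer reachable
heap = sweep(heap, alive_after)
print(sorted(alive_after), heap)
```

Because marking starts from the stack area rather than from reference counts, the mutual references between the two lists do not keep them alive.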
A direct reference is a memory address referenced directly from the stack area. An indirect reference is a memory address reached by going from the stack area into the heap area and following a further reference from there. Take our previous two lists, L1 and L2, as an example.
When we delete both L1 and L2, the names L1 and L2 are removed from the stack area.
When the mark-and-sweep algorithm then runs, it finds no L1 or L2 in the stack area (only the two lists referencing each other in the heap area), so neither list 1 nor list 2 is marked as alive and both are cleared. This solves the memory leak caused by circular references.
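In CPython this cleanup can be triggered by hand with gc.collect, which runs the cycle collector and returns the number of unreachable objects it found. The sketch below assumes CPython; the Node class is a hypothetical stand-in for the two lists, since plain lists cannot be weakly referenced.

```python
import gc
import weakref


class Node:
    """Hypothetical container object standing in for list 1 and list 2."""

    def __init__(self):
        self.other = None


a = Node()
b = Node()
a.other = b                     # build the circular reference
b.other = a

probe_a = weakref.ref(a)
probe_b = weakref.ref(b)

del a
del b                           # only the mutual references remain

found = gc.collect()            # run a full collection of unreachable cycles

both_cleared = probe_a() is None and probe_b() is None
print(found >= 2, both_cleared)
```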
4.2.2 Generational collection
Background:
With a collection mechanism based on reference counting, every collection pass has to traverse the reference counts of all objects, which is very time-consuming. Generational collection was introduced to make collection more efficient, using a "space for time" strategy.
Generation:
The core idea of generational collection is this: if a variable has survived many scans without being collected, the GC mechanism treats it as a commonly used variable and scans it less often. The implementation works as follows:
Generations classify variables by how long they have survived (different survival times, different generations).
A newly defined variable is placed in the new generation. Suppose the new generation is scanned every 1 minute. Each time a variable is found to be still referenced, its weight (essentially an integer) is increased by one. When the weight exceeds a set value (say 3), the variable is moved up to the youth generation, which the GC scans less often than the new generation (its scan interval is longer); suppose the youth generation is scanned every 5 minutes. This reduces the total number of variables each GC pass has to scan and saves scanning time overall. Objects in the youth generation are later moved to the old generation in the same way. In short, the higher the generation, the less often the garbage collection mechanism scans it.
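The promotion rule can be sketched as a toy simulation. It models only the weight-and-promote idea described above, not CPython's internals. The threshold of 3 is the assumed value from the text, and scan, generations, live, and the variable names "config" and "temp" are hypothetical.

```python
PROMOTE_ABOVE = 3   # assumed weight limit, taken from the example in the text


def scan(generations, level, is_referenced):
    """Scan one generation: drop garbage, add weight to survivors, promote heavy ones."""
    survivors = {}
    for name, weight in generations[level].items():
        if not is_referenced(name):
            continue                           # no longer referenced: reclaimed
        weight += 1                            # survived one more scan
        can_promote = level + 1 < len(generations)
        if weight > PROMOTE_ABOVE and can_promote:
            generations[level + 1][name] = 0   # move up and start counting again
        else:
            survivors[name] = weight
    generations[level] = survivors


# Index 0 is the new generation, 1 the youth generation, 2 the old generation.
generations = [{"config": 0, "temp": 0}, {}, {}]
live = {"config"}                              # "temp" is already unreferenced

for _ in range(4):                             # four scans of the new generation
    scan(generations, 0, lambda name: name in live)

print(generations)
```

After the fourth scan the weight of "config" exceeds the limit, so it leaves the new generation and later scans of that generation no longer have to visit it.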
Collection:
Collection still uses the reference count as its basis.
Although generational collection improves efficiency, it also has a drawback:
For example, suppose that just after a variable is moved from the new generation to the youth generation, its name binding is removed. The variable should now be collected, but because the youth generation is scanned less often than the new generation, it will not be cleaned up promptly.
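CPython exposes its three generations through the standard gc module, which gives a way to inspect the behaviour discussed above. The sketch below uses only gc.get_threshold, gc.get_count, and gc.collect. Note that CPython's thresholds are based on counts of allocations and collections rather than on time intervals such as the minutes assumed in the example above, and the concrete values differ between versions, so none are shown here.

```python
import gc

thresholds = gc.get_threshold()   # one collection threshold per generation
counts = gc.get_count()           # current counters for generations 0, 1 and 2
found = gc.collect(0)             # collect only the youngest generation

print(thresholds, counts, found)
```

Calling gc.collect with generation 0 examines only the youngest generation, so garbage that has already been promoted waits for a collection of a higher generation.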