What is the garbage collection mechanism in Python?

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains what the garbage collection mechanism in Python is. The material is practical, so it is shared here in the hope that you will take something away from it.

As the automatic memory management mechanism of modern programming languages, garbage collection (GC) does two things: 1. it finds objects in memory that are no longer used (garbage); 2. it cleans up that garbage and frees the memory for other objects. GC relieves programmers of most of the burden of manual memory management, giving them more time to focus on business logic. That does not mean programmers can ignore GC: understanding how it works helps us write more robust code.

Reference counting

The default garbage collection mechanism in Python is reference counting. The algorithm was first proposed by George E. Collins in 1960 and, decades later, is still used by many programming languages. The principle is that each object maintains a counter field (ob_refcnt in CPython) that records how many references currently point to the object. Whenever a new reference points to the object, the count is incremented by 1; whenever a reference to the object goes away, the count is decremented by 1. Once the count reaches 0, the object is reclaimed immediately and the memory it occupies is released. One drawback is the extra space needed to store the count, but that is secondary; the main problem is that reference counting cannot handle circular references between objects. For this reason many languages, such as Java, do not use it as their garbage collection mechanism.
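The counting rule described above can be observed from Python itself. The following is a minimal sketch, assuming CPython; the Node class is a hypothetical placeholder, and the absolute numbers printed are an implementation detail (sys.getrefcount counts its own temporary argument as one extra reference), so only the differences between them are meaningful.

```python
import sys


class Node:
    """Hypothetical placeholder class, used only for this illustration."""


obj = Node()

# Baseline count. It includes the temporary reference held by the
# getrefcount call itself, so compare values rather than read them literally.
base = sys.getrefcount(obj)

alias = obj                          # a new reference points to the object
after_alias = sys.getrefcount(obj)   # expected to be base + 1

del alias                            # that reference goes away
after_del = sys.getrefcount(obj)     # expected to be back at base

print(base, after_alias, after_del)
```

Binding a second name raises the count by one and deleting that name lowers it again, which is the increment and decrement behaviour the paragraph describes.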

What is a circular reference? A and B refer to each other, and nothing else refers to either of them. Both reference counts are 1, yet both objects should obviously be reclaimed. For example:

    a = {}        # object A's reference count is 1
    b = {}        # object B's reference count is 1
    a['b'] = b    # B's reference count increases to 2
    b['a'] = a    # A's reference count increases to 2
    del a         # A's reference count drops to 1
    del b         # B's reference count drops to 1

In this example, after the del statements run, no variable refers to objects A and B any more, but each object still holds a reference to the other. The two objects can no longer be reached from any variable, so as far as GC is concerned they are inactive (garbage) objects, yet their reference counts never drop to zero. If reference counting alone managed them, they would never be reclaimed and would stay in memory for the life of the process, which is a memory leak (memory that is not freed after use). To solve the circular reference problem, Python adds two further GC mechanisms: mark-and-sweep and generational collection.
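The leak and its cure can be reproduced with the standard gc and weakref modules. This is a sketch under stated assumptions: it assumes CPython, and it uses a hypothetical Node class instead of the dicts from the example because plain dict objects cannot be the target of a weak reference. The weak reference acts only as a probe and does not raise the reference count.

```python
import gc
import weakref


class Node:
    """Hypothetical container class, used only for this illustration."""

    def __init__(self):
        self.other = None


gc.disable()                  # keep the automatic cycle collector out of the way

a = Node()
b = Node()
a.other = b                   # A refers to B
b.other = a                   # B refers to A
probe = weakref.ref(a)        # weak reference: does not keep A alive

del a
del b

# Reference counting alone cannot free the pair: each still counts the other.
alive_after_del = probe() is not None

gc.collect()                  # run the cycle collector explicitly

# The collector found the unreachable cycle and reclaimed it.
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect)
```

With automatic collection left enabled, the same cleanup would happen on its own at some later point; disabling it here only makes the two states easy to observe.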

Mark and sweep

Mark-and-sweep (Mark-Sweep) is a garbage collection algorithm based on tracing GC. It has two phases: the first is the mark phase, in which GC marks all active objects; the second is the sweep phase, in which it reclaims the inactive objects that were not marked. So how does GC decide which objects are active and which are inactive?

Objects are connected by references (pointers) into a directed graph: objects are the nodes and references are the edges. Starting from the root objects, GC traverses the graph along its directed edges. Every reachable object is marked as active, and every unreachable object is inactive and will be cleared. The root objects are global variables, the call stack, and registers.

In the figure above, treat the small black circle as a global variable, that is, as a root object. Starting from it, object 1 is directly reachable and is marked; objects 2 and 3 are indirectly reachable and are marked too; objects 4 and 5 are unreachable. Objects 1, 2, and 3 are therefore active, while 4 and 5 are inactive and will be reclaimed by GC.
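The two phases can be sketched as a toy traversal over the five objects just described. This illustrates the general technique, not CPython's actual collector. The edges are an assumption: the text only says that 2 and 3 are reached indirectly, so the sketch assumes root to 1, 1 to 2, 2 to 3, and a cycle between 4 and 5. The function names mark and sweep are likewise chosen for illustration.

```python
def mark(roots, edges):
    """Mark phase: return the set of objects reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue                        # already visited
        marked.add(obj)
        stack.extend(edges.get(obj, ()))    # follow outgoing references
    return marked


def sweep(heap, marked):
    """Sweep phase: return the objects that were never marked."""
    return [obj for obj in heap if obj not in marked]


# Assumed reference graph (not specified in full by the text).
edges = {"root": [1], 1: [2], 2: [3], 4: [5], 5: [4]}
heap = [1, 2, 3, 4, 5]

marked = mark(["root"], edges)
live = sorted(obj for obj in marked if obj != "root")
garbage = sweep(heap, marked)

print(live)     # [1, 2, 3]
print(garbage)  # [4, 5]
```

Objects 4 and 5 refer to each other, so reference counting would keep them, but neither is reachable from the root and the sweep reports both as garbage.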

As an auxiliary garbage collection technique in Python, mark-and-sweep mainly deals with container objects such as list, dict, tuple, and class instances, because string and numeric objects cannot take part in circular references. Python organizes these container objects in a doubly linked list. This simple mark-and-sweep algorithm has an obvious drawback, however: before it can clear the inactive objects it must scan the entire heap sequentially, and it scans every object even when only a small number of active objects remain.
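The split between container objects and atomic objects can be checked with gc.is_tracked from the standard library. This is a small sketch assuming CPython; the Box class and the sample values are hypothetical, and the exact tracking rules for some containers (for example empty dicts or tuples of atomic values) vary between versions, so only clear-cut cases are shown.

```python
import gc


class Box:
    """Hypothetical class; its instances can hold references to other objects."""


tracked = {
    "list": gc.is_tracked([]),
    "dict holding a list": gc.is_tracked({"k": []}),
    "instance": gc.is_tracked(Box()),
    "str": gc.is_tracked("text"),
    "int": gc.is_tracked(42),
}

for name, is_tracked in tracked.items():
    print(f"{name}: {is_tracked}")
```

The containers report True and the string and integer report False, which matches the point that only objects able to hold references need to be examined for cycles.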

Generational collection

Generational collection trades space for time. Python groups objects into sets according to how long they have survived, and each set is called a generation. There are three generations: the young generation (generation 0), the middle generation (generation 1), and the old generation (generation 2). Each corresponds to a linked list, and the older the generation, the less often it is collected. Newly created objects are placed in the young generation. When the number of objects in the young generation's list reaches its threshold, garbage collection is triggered: objects that can be reclaimed are reclaimed, and the survivors are moved to the middle generation, and so on. Objects in the old generation are the longest-lived and may last for the whole lifetime of the program. Generational collection is built on top of mark-and-sweep and, like it, serves as an auxiliary garbage collection technique for container objects.
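The per-generation thresholds and counters are exposed by the standard gc module. The sketch below assumes CPython and only reads the values, since the defaults and the exact meaning of each counter differ between releases; it also requests a collection of generation 0 alone to show that a single generation can be targeted.

```python
import gc

# One threshold per generation, returned as a tuple of three integers.
thresholds = gc.get_threshold()

# Current collection counts for generations 0, 1, and 2.
counts = gc.get_count()

print("thresholds:", thresholds)
print("counts:", counts)

# Collect only the youngest generation. The return value is the number of
# unreachable objects the collector found.
collected = gc.collect(0)
print("unreachable objects found in generation 0:", collected)
```

Calling gc.set_threshold with new values changes how often each generation is collected, which is one way to tune the trade-off described above for allocation-heavy programs.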

That is what the garbage collection mechanism in Python is. These are points you may well run into in day-to-day work, and I hope this article helps you learn more about them.
