How to use weak references in Python

2025-04-04 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces how to use weak references in Python. It works through a practical case, first showing how weak references are used and then how they are implemented.

Background

Before we start talking about weak references (weakref), let's look at the kind of problem they solve.

Suppose we have a multithreaded program that processes application data concurrently:

    # takes up a lot of resources; creating and destroying it is very expensive
    class Data:
        def __init__(self, key):
            pass

The application data Data is uniquely identified by a key, and the same data may be accessed by multiple threads at the same time. Because Data takes up a lot of system resources, creating and destroying it is very expensive. We want the program to keep only one copy of each Data, and we don't want to create it repeatedly even when multiple threads access it at the same time.

To this end, we try to design a caching middleware Cacher:

    import threading

    # data cache
    class Cacher:
        def __init__(self):
            self.pool = {}
            self.lock = threading.Lock()

        def get(self, key):
            with self.lock:
                data = self.pool.get(key)
                if data:
                    return data

                self.pool[key] = data = Data(key)
                return data

Cacher internally uses a dict object to cache the created copy of Data and provides a get method to get the application data Data. The get method looks up the cache dictionary when getting the data, and returns it directly if the data already exists; if the data does not exist, create one and save it to the dictionary. Therefore, after the data is created for the first time, it goes into the cache dictionary, and if other threads access it at the same time, they all use the same copy in the cache.

It feels very good! But the only drawback is: Cacher has the risk of resource leakage!

Once a Data object is created, it is saved in the cache dictionary and never released. In other words, the program's resources, such as memory, keep growing and may eventually be exhausted. We therefore want a data object to be released automatically once no thread is using it any longer.

We could maintain a reference count for each data object inside Cacher, with the get method incrementing the count automatically. A new remove method would then release the data: it first decrements the count, and deletes the data from the cache dictionary when the count drops to zero.

A thread calls the get method to obtain the data and, once it has finished with the data, must call the remove method to release it. Cacher would effectively be implementing reference counting by itself, which is too much trouble. Doesn't Python have a garbage collection mechanism built in? Why should applications have to implement their own?
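To make the cost of this approach concrete, here is a minimal sketch of what such a hand-rolled scheme might look like. The class name RefCountingCacher and the [data, count] entry layout are illustrative assumptions, not part of the original design.

```python
import threading


class Data:
    """Stand-in for an expensive application object."""

    def __init__(self, key):
        self.key = key


class RefCountingCacher:
    """Hypothetical cache that counts the users of each Data object by hand."""

    def __init__(self):
        self.pool = {}  # key -> [data, user_count]
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            entry = self.pool.get(key)
            if entry is None:
                entry = self.pool[key] = [Data(key), 0]
            entry[1] += 1  # one more user of this data
            return entry[0]

    def remove(self, key):
        with self.lock:
            entry = self.pool[key]
            entry[1] -= 1  # one user fewer
            if entry[1] == 0:
                del self.pool[key]  # last user gone: drop the cached copy


cacher = RefCountingCacher()
a = cacher.get("k")
b = cacher.get("k")
print(a is b)              # True: both callers share one copy
cacher.remove("k")
print("k" in cacher.pool)  # True: one user is still registered
cacher.remove("k")
print("k" in cacher.pool)  # False: removed after the last release
```

Every caller has to pair each get with exactly one remove; a forgotten remove leaks the entry, and an extra one evicts data that is still in use.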

The crux of the problem is Cacher's cache dictionary: as middleware, it does not use the data objects itself, so in principle it should not hold references to them. Is there a technique that can locate the target object without creating a reference to it? After all, an ordinary assignment always creates a reference.

Typical usage

This is where weak references (weakref) come in. A weak reference is a special object that can be associated with a target object without increasing the target's reference count.

    # create a data object
    >>> d = Data('fasionchan.com')
    >>> d
    <__main__.Data object at 0x...>

    # create a weak reference to the data
    >>> import weakref
    >>> r = weakref.ref(d)

    # call the weak reference object to find the object it points to
    >>> r()
    <__main__.Data object at 0x...>
    >>> r() is d
    True

    # delete the temporary variable d; the Data object has no other references, so it is reclaimed
    >>> del d

    # call the weak reference object again: the target Data object is gone (None is returned)
    >>> r()

In this way, we just need to change the Cacher cache dictionary to save weak references, and the problem will be solved!

    import threading
    import weakref

    # data cache
    class Cacher:
        def __init__(self):
            self.pool = {}
            self.lock = threading.Lock()

        def get(self, key):
            with self.lock:
                r = self.pool.get(key)
                if r:
                    data = r()
                    if data:
                        return data

                data = Data(key)
                self.pool[key] = weakref.ref(data)
                return data

Because the cache dictionary holds only weak references to Data objects, Cacher does not affect their reference counts. When all threads have finished with a data object, its reference count drops to zero and it is released.

In fact, it is common to cache data objects with dictionaries, so the weakref module also provides two dictionary objects that hold only weak references:

weakref.WeakKeyDictionary, a mapping class that holds only weak references to its keys (once a key no longer has a strong reference, its key-value entry disappears automatically)

weakref.WeakValueDictionary, a mapping class that holds only weak references to its values (once a value no longer has a strong reference, its key-value entry disappears automatically)
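As a small illustration of the first class, the sketch below attaches extra information to an object through a WeakKeyDictionary and shows the entry vanishing with the key. The Data class body and the variable name notes are assumptions made for the example.

```python
import gc
import weakref


class Data:
    def __init__(self, key):
        self.key = key


d = Data("fasionchan.com")

# Attach extra information to d without keeping d alive.
notes = weakref.WeakKeyDictionary()
notes[d] = "metadata about d"
print(len(notes))  # 1: the key still has a strong reference (d)

del d              # drop the only strong reference to the key
gc.collect()       # not required on CPython, harmless elsewhere
print(len(notes))  # 0: the entry disappeared together with the key
```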

Therefore, our data cache dictionary can be implemented with weakref.WeakValueDictionary, which is used in the same way as an ordinary dictionary. We no longer have to maintain weak reference objects ourselves, and the code logic becomes simpler and clearer:

    import threading
    import weakref

    # data cache
    class Cacher:
        def __init__(self):
            self.pool = weakref.WeakValueDictionary()
            self.lock = threading.Lock()

        def get(self, key):
            with self.lock:
                data = self.pool.get(key)
                if data:
                    return data

                self.pool[key] = data = Data(key)
                return data
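The following sketch exercises this version of Cacher from several threads. The worker function, the results list and the thread count are assumptions made for the demonstration; the Cacher logic is the one shown above, with an explicit None check.

```python
import gc
import threading
import weakref


class Data:
    def __init__(self, key):
        self.key = key


class Cacher:
    def __init__(self):
        self.pool = weakref.WeakValueDictionary()
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data is not None:
                return data
            self.pool[key] = data = Data(key)
            return data


cacher = Cacher()
results = []
results_lock = threading.Lock()


def worker():
    data = cacher.get("k")  # every thread asks for the same key
    with results_lock:
        results.append(data)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

shared = all(item is results[0] for item in results)
print(shared)            # True: all threads received the same Data object

results.clear()          # the last strong references go away
gc.collect()             # not required on CPython, harmless elsewhere
print(len(cacher.pool))  # 0: the cache entry was dropped automatically
```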

The weakref module also has many useful utility classes and utility functions. For details, please refer to the official documentation. I will not repeat them here.

Working principle

So what exactly is a weak reference, and why does it seem so magical? Let's lift the veil and take a look.

    >>> d = Data('fasionchan.com')

    # weakref.ref is a built-in type object
    >>> from weakref import ref
    >>> ref

    # calling the weakref.ref type object creates a weak reference instance object
    >>> r = ref(d)
    >>> r

After the previous chapter, we are familiar with reading the built-in object source code, and the relevant source code files are as follows:

The Include/weakrefobject.h header file contains the object structure and some macro definitions

The Objects/weakrefobject.c source file contains weak reference type objects and their method definitions

Let's first look at the field structure of the weak reference object, which is defined on lines 10-41 of the Include/weakrefobject.h header file:

    typedef struct _PyWeakReference PyWeakReference;

    /* PyWeakReference is the base struct for the Python ReferenceType, ProxyType,
     * and CallableProxyType.
     */
    #ifndef Py_LIMITED_API
    struct _PyWeakReference {
        PyObject_HEAD

        /* The object to which this is a weak reference, or Py_None if none.
         * Note that this is a stealth reference:  wr_object's refcount is
         * not incremented to reflect this pointer.
         */
        PyObject *wr_object;

        /* A callable to invoke when wr_object dies, or NULL if none. */
        PyObject *wr_callback;

        /* A cache for wr_object's hash code.  As usual for hashes, this is -1
         * if the hash code isn't known yet.
         */
        Py_hash_t hash;

        /* If wr_object is weakly referenced, wr_object has a doubly-linked NULL-
         * terminated list of weak references to it.  These are the list pointers.
         * If wr_object goes away, wr_object is set to Py_None, and these pointers
         * have no meaning then.
         */
        PyWeakReference *wr_prev;
        PyWeakReference *wr_next;
    };
    #endif

Thus, the PyWeakReference structure is the underlying representation of a weak reference object. It is a fixed-length object with five fields in addition to the fixed header:

wr_object, the object pointer, points to the referenced object; the weak reference finds the referenced object through this field without increasing its reference count

wr_callback, which points to a callable object that is called when the referenced object is destroyed

hash, which caches the hash value of the referenced object

wr_prev and wr_next, the previous and next pointers, respectively, which link weak reference objects into a doubly linked list
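Most of these fields can be observed indirectly from Python. The sketch below is illustrative only: the Data class body and the on_destroy function are assumptions, and the mapping from each Python-level observation to a C field is given in the comments.

```python
import gc
import weakref


class Data:
    def __init__(self, key):
        self.key = key


def on_destroy(ref):
    # The callback receives the weak reference object, not the dead target.
    print("target destroyed, ref() is now", ref())


d = Data("fasionchan.com")
r = weakref.ref(d, on_destroy)

print(r() is d)                      # True: wr_object leads to the target
print(r.__callback__ is on_destroy)  # True: wr_callback
print(hash(r) == hash(d))            # True: hash is computed from the target

cached = hash(r)  # the hash is now cached inside the weak reference
del d
gc.collect()      # not required on CPython, harmless elsewhere

print(r())                # None: the target is gone
print(hash(r) == cached)  # True: the cached hash outlives the target
print(r.__callback__)     # None once the target is no longer alive
```

If hash(r) had never been called while the target was alive, calling it afterwards would raise TypeError, because there is no cached value to fall back on.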

Combined with the comments in the code, we know:

A weak reference object associates the referenced object through the wr_object field, as shown in the dotted arrow above

An object can be associated by multiple weakly referenced objects at the same time, and the Data instance object in the figure is associated by two weakly referenced objects

All weak references associated with the same object are organized into a doubly linked list, and the head of the list is stored in the referenced object, as shown by the solid line arrow above.
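These relationships can be inspected from Python with weakref.getweakrefcount, weakref.getweakrefs and the __weakref__ attribute of the instance. The following is a minimal sketch on CPython; the Data class body and the do-nothing callback are assumptions.

```python
import weakref


class Data:
    def __init__(self, key):
        self.key = key


def callback(ref):
    pass


d = Data("fasionchan.com")
print(d.__weakref__)               # None: no weak references yet

r1 = weakref.ref(d)                # plain weak reference
r2 = weakref.ref(d, callback)      # weak reference with a callback

print(weakref.getweakrefcount(d))  # 2: two weak references point at d
print(weakref.getweakrefs(d))      # the weak reference objects, in list order
print(d.__weakref__ is r1)         # True: the object stores the list head
```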

When an object is destroyed, Python traverses its weak reference list, processing it one by one:

Set the wr_object field to None, and if the weak reference object is called, it will return None, and the caller will know that the object has been destroyed.

Execute the callback function wr_callback (if any)

From this we can see that weak references are essentially an application of the Observer design pattern. When an object is destroyed, all of its weak reference objects are notified and handled accordingly.
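The notification can be observed by registering callbacks, as in the sketch below. The make_observer helper and the events list are assumptions made for the example. Note that a callback only fires if its weak reference object is itself still alive when the target is destroyed.

```python
import gc
import weakref


class Data:
    def __init__(self, key):
        self.key = key


events = []


def make_observer(name):
    def observer(ref):
        # By the time the callback runs, the target can no longer be reached.
        events.append((name, ref()))
    return observer


d = Data("fasionchan.com")

# Keep r1 and r2 alive: a weak reference that is destroyed before its
# target never gets its callback invoked.
r1 = weakref.ref(d, make_observer("first"))
r2 = weakref.ref(d, make_observer("second"))

del d         # destroying the target notifies every observer
gc.collect()  # not required on CPython, harmless elsewhere

print(events)  # one (name, None) entry per registered callback
```

According to the weakref documentation, callbacks are invoked from the most recently registered to the oldest.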

Implementation details

Understanding the basic principle of weak references is enough to use them well. If you are interested in the source code, you can dig deeper into some of the implementation details.

As we mentioned earlier, all weak references to the same object are organized into a doubly linked list, and the head of the list is stored in the object. Because so many different types of objects can be weakly referenced, the head cannot be placed at one fixed position in a common structure. Therefore, Python provides a tp_weaklistoffset field in the type object that records the offset of the weak reference list head pointer within the instance object.

Therefore, for any object o, we only need to find its type object t through the ob_type field, and then find the weak reference chain header of the object o according to the tp_weaklistoffset field in t.

Python provides two macro definitions in the Include/objimpl.h header file:

    /* Test if a type supports weak references */
    #define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)

    #define PyObject_GET_WEAKREFS_LISTPTR(o) \
        ((PyObject **) (((char *) (o)) + Py_TYPE(o)->tp_weaklistoffset))

PyType_SUPPORTS_WEAKREFS is used to determine whether a type object supports weak references. Weak references are supported only if tp_weaklistoffset is greater than zero, and weak references are not supported by built-in objects such as list.

PyObject_GET_WEAKREFS_LISTPTR is used to retrieve the weak reference chain header of an object. It first finds the type object t through the Py_TYPE macro, then finds the offset through the tp_weaklistoffset field, and finally adds it with the object address to get the address of the chain header field.
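The effect of this check is visible from Python: creating a weak reference to an instance of an unsupported type raises TypeError. The sketch below is illustrative; the class names and the supports_weakref helper are assumptions. The type attribute __weakrefoffset__ exposes tp_weaklistoffset, and its exact non-zero values differ between CPython versions.

```python
import weakref


class Data:
    def __init__(self, key):
        self.key = key


class Slotted:
    __slots__ = ("key",)                # no room reserved for the list head


class SlottedWeak:
    __slots__ = ("key", "__weakref__")  # explicitly reserves the list head


class MyList(list):
    pass                                # subclasses of list gain weakref support


def supports_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
        return True
    except TypeError:
        return False


print(supports_weakref([1, 2, 3]))      # False: built-in list
print(supports_weakref(MyList()))       # True
print(supports_weakref(Data("k")))      # True
print(supports_weakref(Slotted()))      # False
print(supports_weakref(SlottedWeak()))  # True

print(list.__weakrefoffset__)           # 0: no list head in list instances
print(Data.__weakrefoffset__ != 0)      # True: instances carry a list head
```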

When we create a weak reference, we call the weak reference type object weakref and pass the referenced object d as an argument. The weak reference type object weakref is the type of all weak reference instance objects; it is a globally unique type object defined in Objects/weakrefobject.c as _PyWeakref_RefType (line 350).

As we learned from the object model, when Python calls an object, it executes the tp_call function of that object's type. Therefore, calling the weak reference type object weakref executes the tp_call function of its type object, that is, of type. That tp_call function in turn calls the tp_new and tp_init functions of weakref, where tp_new allocates memory for the instance object and tp_init is responsible for initializing it.
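The same call chain can be spelled out by hand at the Python level. This is only a rough analogue of the C-level slots, written as a sketch: type.__call__ corresponds to tp_call, and __new__ and __init__ correspond to tp_new and tp_init. The Data class body is an assumption.

```python
import weakref


class Data:
    def __init__(self, key):
        self.key = key


d = Data("fasionchan.com")

print(type(weakref.ref) is type)  # True: the type of weakref.ref is type

# The usual spelling.
r1 = weakref.ref(d)

# The same call made explicitly through the type of weakref.ref.
r2 = type.__call__(weakref.ref, d)

# The two steps that type.__call__ performs, done by hand.
r3 = weakref.ref.__new__(weakref.ref, d)
r3.__init__(d)

print(r1() is d, r2() is d, r3() is d)  # True True True
```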

Back in the Objects/weakrefobject.c source file, you can see that the tp_new field of _PyWeakref_RefType is initialized to weakref___new__ (line 276). The main processing logic of this function is as follows:

1. Parse the arguments to get the referenced object (line 282)

2. Call the PyType_SUPPORTS_WEAKREFS macro to check whether the referenced object supports weak references, and raise an exception if not (line 286)

3. Call the GET_WEAKREFS_LISTPTR macro to fetch the weak reference list head field of the object; it returns a pointer to a pointer, which makes insertion easy (line 294)

4. Call get_basic_refs to fetch the basic weak reference object at the front of the list, that is, one whose callback is empty (if any; line 295)

5. If callback is empty and the object already has a basic weak reference with an empty callback, reuse that instance and return it directly (line 296)

6. If nothing can be reused, call the tp_alloc function to allocate memory, initialize the fields, and insert the new object into the object's weak reference list (line 309)

   If the callback is empty, insert it directly at the front of the list for subsequent reuse (see point 4)

   If the callback is not empty, insert it after the basic weak reference object (if any), so that the basic weak reference stays at the head of the list for easy access
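The reuse and insertion rules above can be checked from Python. The sketch below assumes CPython; the Data class body, the do-nothing callback and the variable names are assumptions made for the example.

```python
import weakref


class Data:
    def __init__(self, key):
        self.key = key


def callback(ref):
    pass


d = Data("fasionchan.com")

# Weak references with a callback are never shared.
with_cb_1 = weakref.ref(d, callback)
with_cb_2 = weakref.ref(d, callback)

# Weak references without a callback reuse the existing basic one.
basic_1 = weakref.ref(d)
basic_2 = weakref.ref(d)

print(basic_1 is basic_2)          # True: the basic weak reference is reused
print(with_cb_1 is with_cb_2)      # False: each call creates a new object
print(weakref.getweakrefcount(d))  # 3: one basic plus two with callbacks

# The basic weak reference sits at the head of the list even though it
# was created last.
print(weakref.getweakrefs(d)[0] is basic_1)
```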

When an object is reclaimed, its tp_dealloc function calls PyObject_ClearWeakRefs to clean up its weak references. This function takes the object's weak reference list and walks it, clearing each wr_object field and invoking the wr_callback callback function, if any. We will not go into the details here; if you are interested, see the source in Objects/weakrefobject.c, around line 880.

In this section we have covered weak references in depth. A weak reference can track a target object without increasing its reference count, and is often used in frameworks and middleware. Weak references may seem magical, but the design principle is the simple observer pattern: after a weak reference object is created, it is inserted into a linked list maintained by the target object so that it can observe (subscribe to) the object's destruction event.
