2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article shows how a single line of code can cut the memory footprint of a Python program roughly in half. It is a problem many people run into in practice, so read on to see how to deal with it.
[Screenshot: results]
Let me explain how it works.
First, let's consider a simple "learning" example: a DataItem class that holds a person's details such as name, age, and address.
class DataItem(object):
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address
A beginner's question: how much memory does one such object occupy?
Let's try the obvious approach first:
import sys

d1 = DataItem("Alex", 42, "-")
print("sys.getsizeof(d1):", sys.getsizeof(d1))
The answer is 56 bytes, which seems very small and quite satisfactory. So let's try another object with more data in it:
d2 = DataItem("Boris", 24, "In the middle of nowhere")
print("sys.getsizeof(d2):", sys.getsizeof(d2))
The answer is still 56 bytes, and at this point we start to suspect that something is wrong. Not everything is what it seems at first sight.
Our intuition is right: nothing is that simple. Python is a very flexible, dynamically typed language, and it stores a lot of additional data to do its work. That data takes up a lot of space by itself.
For example, sys.getsizeof("") returns 33 bytes: yes, a whole 33 bytes for an empty string! And sys.getsizeof(1) returns 24 bytes, a whole 24 bytes for an integer (C programmers are asked to step away from the screen and read no further, so as not to lose their faith in beauty). For more complex objects such as dictionaries, sys.getsizeof(dict()) returns 272 bytes, and that is for an empty dictionary. I won't go any further; I hope the principle is clear, and RAM manufacturers do need to sell their chips. The exact figures vary with the Python version and platform.
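To reproduce these measurements on your own interpreter, a minimal sketch along the following lines may help. The DataItem class mirrors the one above; the names builtin_sizes and same_shallow_size are introduced here purely for illustration, and the numbers printed will likely differ from the ones quoted in the text.

```python
import sys


class DataItem:
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address


# Shallow sizes of a few empty or tiny built-in objects.
# The values depend on the Python version and platform.
builtin_sizes = {
    "empty str": sys.getsizeof(""),
    "int 1": sys.getsizeof(1),
    "empty dict": sys.getsizeof(dict()),
}
for label, size in builtin_sizes.items():
    print(f"{label}: {size} bytes")

# sys.getsizeof() is shallow: it measures the instance itself, not the
# strings and numbers its attributes refer to, so two instances of the
# same class report the same size regardless of their contents.
short_item = DataItem("Alex", 42, "-")
long_item = DataItem("Boris", 24, "In the middle of nowhere")
same_shallow_size = sys.getsizeof(short_item) == sys.getsizeof(long_item)
print("same shallow size:", same_shallow_size)
```

This is why both DataItem objects above reported the same number: the attribute values live in separate objects that sys.getsizeof() does not follow.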
But let's go back to our DataItem class and the original beginner's question.
How much memory does this class take up?
First, let's print the full contents of the class at a lower level:
def dump(obj):
    for attr in dir(obj):
        print("  obj.%s = %r" % (attr, getattr(obj, attr)))
This function reveals what is hidden "behind the scenes" so that all of Python's features (typing, inheritance, and so on) can work.
The result is impressive:
How much memory does all this take up?
Here is a function that recursively calls the getsizeof function to calculate the actual data size of an object.
import sys

def get_size(obj, seen=None):
    # From
    # Recursively finds size of objects
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important: mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum([get_size(v, seen) for v in obj.values()])
        size += sum([get_size(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum([get_size(i, seen) for i in obj])
    return size
Let's try:
d1 = DataItem("Alex", 42, "-")
print("get_size(d1):", get_size(d1))
d2 = DataItem("Boris", 24, "In the middle of nowhere")
print("get_size(d2):", get_size(d2))
The answers are 460 bytes and 484 bytes, which looks much closer to the truth.
Using this function, you can run a series of experiments. For example, I wondered how much space the data would take if DataItem objects were placed in a list. get_size([d1]) returns 532 bytes, which is evidently the 460 bytes from above plus some overhead. But get_size([d1, d2]) returns 863 bytes, less than 460 + 484 taken separately. The result of get_size([d1, d2, d1]) is even more interesting: we get 871 bytes, only slightly more, which means Python is smart enough not to allocate memory for the same object a second time.
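The reason the third element is almost free is that a list stores references to objects, not copies of them. A small sketch of that behavior, with the helper names same_object and distinct_objects introduced here for illustration:

```python
class DataItem:
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address


d1 = DataItem("Alex", 42, "-")
d2 = DataItem("Boris", 24, "In the middle of nowhere")
items = [d1, d2, d1]

# The first and third slots of the list point at the very same object,
# so only one extra reference is stored, not a second copy of d1.
same_object = items[0] is items[2]

# Counting distinct identities shows that the list of three entries
# refers to only two objects.
distinct_objects = len({id(item) for item in items})
print(same_object, distinct_objects)
```

A recursive size function that tracks already visited objects, like get_size above, therefore counts d1 only once.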
Now let us look at the second part of the question.
Is it possible to reduce memory overhead?
Yes, it is. Python is an interpreted language, and we can extend an object at any time, for example by adding a new field:
d1 = DataItem("Alex", 42, "-")
print("get_size(d1):", get_size(d1))
d1.weight = 66
print("get_size(d1):", get_size(d1))
Great, but what if we don't need this ability? We can tell the interpreter exactly which attributes a class may have by declaring __slots__:
class DataItem(object):
    __slots__ = ['name', 'age', 'address']

    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address
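More information can be found in the documentation (RTFM), which explains that __slots__ lets you declare data members explicitly and prevents the creation of __dict__ and __weakref__, and that the space saved compared with using __dict__ can be significant.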
Let's check: yes, it really matters. get_size(d1) now returns... 64 bytes instead of 460, about 7 times less. As a bonus, objects are created about 20% faster (see the first screenshot in this article).
Alas, in real use the gain is not that large, because of other overhead. Let's create a list of 100,000 items by simply appending elements and look at the memory consumption:
import tracemalloc

tracemalloc.start()  # tracing must be on before the objects are created

data = []
for p in range(100000):
    data.append(DataItem("Alex", 42, "middle of nowhere"))

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
total = sum(stat.size for stat in top_stats)
print("Total allocated size: %.1f MB" % (total / (1024*1024)))
Without __slots__ this takes 16.8 MB of memory; with __slots__, 6.9 MB. Not 7 times, of course, but not bad at all, considering that the code change was minimal.
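To repeat the comparison in a single run, something like the sketch below could be used. PlainItem, SlottedItem, and allocated_bytes are names made up for this example, and it reads tracemalloc.get_traced_memory() rather than a snapshot, so the absolute numbers will not match the figures above and will depend on the Python version.

```python
import tracemalloc


class PlainItem:
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address


class SlottedItem:
    __slots__ = ("name", "age", "address")

    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address


def allocated_bytes(cls, count=100_000):
    """Return the bytes still allocated after building `count` instances."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    # Keep the instances alive in a list while we take the second reading.
    data = [cls("Alex", 42, "middle of nowhere") for _ in range(count)]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del data
    return after - before


plain_bytes = allocated_bytes(PlainItem)
slotted_bytes = allocated_bytes(SlottedItem)
print("without __slots__: %.1f MB" % (plain_bytes / (1024 * 1024)))
print("with __slots__:    %.1f MB" % (slotted_bytes / (1024 * 1024)))
```

Because every instance here shares the same string and integer objects, the difference comes almost entirely from the per-instance overhead, which is exactly what __slots__ reduces.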
Now for the drawbacks. Enabling __slots__ prevents the creation of any other attributes, including __dict__, which means, for example, that the following code for converting the structure to JSON will not work:
def toJSON(self):
    return json.dumps(self.__dict__)
This is easy to fix: it is enough to build the dict programmatically by looping over all the slots:
def toJSON(self):
    data = dict()
    for var in self.__slots__:
        data[var] = getattr(self, var)
    return json.dumps(data)
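Put together as a complete class, the slot-based serialization might look like this. It is a sketch of the same idea written with a dict comprehension; the variable names item and encoded are only for the example.

```python
import json


class DataItem:
    __slots__ = ("name", "age", "address")

    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address

    def toJSON(self):
        # There is no __dict__ on a slotted instance, so the mapping is
        # built from the slot names declared on the class.
        return json.dumps({var: getattr(self, var) for var in self.__slots__})


item = DataItem("Alex", 42, "-")
encoded = item.toJSON()
print(encoded)
```

Note that self.__slots__ only lists the slots declared on the most derived class that defines it, so a subclass that adds its own __slots__ would need to walk its base classes as well.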
It also becomes impossible to add new attributes to instances of this class dynamically, but in this case that is not required.
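A quick way to see this restriction in action is the sketch below, where has_dict and error_message are illustrative names. The exact wording of the error message differs between Python versions, so it is printed rather than compared.

```python
class DataItem:
    __slots__ = ("name", "age", "address")

    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address


d1 = DataItem("Alex", 42, "-")

# A slotted instance has no per-object dictionary at all.
has_dict = hasattr(d1, "__dict__")

# Assigning an attribute that is not listed in __slots__ fails.
try:
    d1.weight = 66
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print("has __dict__:", has_dict)
print("error:", error_message)
```

If dynamic attributes are needed after all, adding "__dict__" to the __slots__ list brings them back, at the cost of most of the memory saving.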
One last test for today. It is interesting to see how much memory the whole program needs. Let's add an infinite loop at the end of the program so that it does not exit, and look at its memory consumption in Windows Task Manager.
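Without __slots__:
[Screenshot: Task Manager memory usage without __slots__]
With __slots__:
[Screenshot: Task Manager memory usage with __slots__]
6.9 MB becomes 27 MB... Well, we did save memory after all: 27 MB instead of 70 MB is not a bad result for adding one line of code.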
Note: the tracemalloc debugging library uses a lot of additional memory. Apparently it adds extra data to every object that is created. If you turn it off, total memory consumption is much lower; the screenshot shows both variants:
[Screenshot: memory usage with and without tracemalloc]
What if you want to save more memory?
This can be done with the numpy library, which lets you create C-style structures, but in my case that would require a deeper rework of the code, and the first approach turned out to be enough.
Strangely, the use of __slots__ has never been covered in detail on Habr, and I hope this article fills that gap.
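For readers curious about the numpy route, a rough sketch with a structured array is shown below. The field widths (16 and 32 bytes) and the use of byte strings are assumptions made for this example, not something the article specifies, and numpy must be installed separately.

```python
import numpy as np

# Hypothetical fixed-width record layout: a 16-byte name, a 32-bit
# integer age, and a 32-byte address. The widths are assumptions.
item_dtype = np.dtype([("name", "S16"), ("age", "i4"), ("address", "S32")])

# One contiguous block of 100,000 records, with no per-object overhead.
data = np.zeros(100_000, dtype=item_dtype)
data["name"] = b"Alex"
data["age"] = 42
data["address"] = b"middle of nowhere"

print("bytes per record:", data.itemsize)
print("total: %.1f MB" % (data.nbytes / (1024 * 1024)))
```

The trade-off is that fields have fixed widths, so longer strings are truncated, and records no longer behave like ordinary Python objects.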