In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/02 Report--
This article mainly introduces the relevant knowledge of what the python set () de-weight method is, the content is detailed and easy to understand, the operation is simple and fast, and has a certain reference value. I believe you will gain something after reading this python set () de-weight method. Let's take a look at it.
What is set?
Mathematically, set is called a collection of different elements, and the members of a set (set) are usually called set elements (set elements). Python introduces this concept into its collection type objects. A collection object is a set of unordered hashable values. Set relational testing and operators such as union, intersection, and so on also work as expected in Python.
Characteristics of set
The elements of a collection have three characteristics:
1. Certainty: the elements in the collection must be determined
two。 Heterogeneity: the elements in the set are different from each other. For example, if the set A = {1) a}, then a cannot be equal to 1)
3. Disorder: there is no order between the elements in the set, for example, {3Magne 4je 5} and {3pm 5pm 4} are counted as the same set.
Set (set) in python is a set of unordered and non-repeating elements, the basic functions include relational testing and elimination of repetitive elements, and can also calculate intersection, difference, union, etc., it is similar to the behavior of list (list), the difference is that set includes duplicate values, and set elements are unordered.
In python, you can create a collection with curly braces {}. Note: if you want to create or initialize an empty collection, you must use set () instead of {}. Because the latter {} is used to create an empty dictionary, we will introduce the dictionary as a data structure later.
1. Set deduplicates a simple example ls = [1 print (set (ls))
We know that the easiest way to remove duplicates for a list is to call the set function directly, which can be done by using the uniqueness of the set elements. However, what exactly this underlying principle is has been half-understood.
Take a look at the following analysis
Second, re-set implementation mechanism class Foo: def _ init__ (self,name,count): self.name = name self.count = count def _ hash__ (self): print ("% s called hash method"% self.name ") return hash (id (self)) def _ eq__ (self) Other): print ("% s called the eq method") if self.__dict__ = = other.__dict__: return True else:return Falsef1 = Foo ('f1fujin1) f2 = Foo (' f2fujin2) f3 = Foo ('f3fujin3) ls = [f1magentic f3] print (set (ls))
As you can see from the above, the set method is to call the hash method, and then rejudge based on whether the hash value is the same, but is that what it is? Take a look at the following procedure.
Class Foo: def _ _ init__ (self,name,count): self.name = name self.count = count def _ _ hash__ (self): print ("% s called hash method"% self.name) return hash (self.count) def _ _ eq__ (self Other): print ("% s called the eq method"% self.name) return self.__dict__ = = other.__dict__f1 = Foo ('f1fujin1) f2 = Foo (' f2fujin1) f3 = Foo ('f3fujin3) ls = [f1memf2fuf3] print (set (ls))
I think we can see that, in fact, the hash values of F1 force f3 are equal, but set does not simply judge that F1 force f3 is repeated, but further determines whether the two values are equal by the eq method. Only when they are equal will they be considered to be the same in fact. To verify the above, let's take a look at the following code.
F1 = Foo ('f1century Magazine 1) f2 = Foo (' f1Phone1) f3 = Foo ('f3fujin3) ls = [f1Magne f2menf3] print (set (ls))
It can be seen that after the weight, there are only two elements, so the above statement is proved.
III. Conclusion
The deduplication of set is achieved by combining two functions, _ _ hash__ and _ _ eq__.
1. When the hash values of two variables are different, the two variables are considered to be different.
2. When the hash value of two variables is the same, the _ _ eq__ method is called. When the return value is True, the two variables are considered to be the same, and one should be removed. Do not remove the weight when returning to FALSE.
IV. Application scenario requirements
There is a company with 100 employees, because the database is not perfect and it takes a long time to use, there is a lot of duplicate data that needs to be cleared. The specific requirements are as follows:
The attributes of each employee are: name, gender, age, department. Due to changes in age and department, it is now believed that as long as two employees have the same name and gender, they are considered to be the same person.
Please realize that the employees are heavy:
Class Staff: def _ _ init__ (self,name,gender,age,department): self.name = name self.gender = gender self.age = age self.department = department def _ _ hash__ (self): return hash (self.name+self.gender) def _ eq__ (self, other): return Truels = ['zs','ls','ww','zq'] gender_list = [' man' 'femal'] staff_list = [] for i in range (100): staff_list.append (Staff (ls [I% 4], gender_ list [I% 2]) print (set (staff_list)) print ([(i.namethei.gender) for i in set (staff_list)]) about "what is the method of removing duplicates in python set ()". Thank you for reading! I believe you all have a certain understanding of the knowledge of "what is the way to remove weight from python set ()". If you still want to learn more knowledge, you are welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.