In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly introduces "how to customize the Set collection in the Go language". In the daily operation, I believe that many people have doubts about how to customize the Set collection in the GE language. The editor has consulted all kinds of materials and sorted out simple and easy-to-use operation methods. I hope it will be helpful to answer the doubts about "how to customize the Set collection in the Go language". Next, please follow the editor to study!
1. Go language practice-- Custom set Set
There is a dictionary (Map) type implemented as Hash Table in the go language, but there is no such data type as Set in the standard data type. Comparing the main features of Set and Map, the similar features are as follows:
The elements in them are unrepeatable.
All of them can only extract all the elements iteratively.
The order in which the elements in them are iterated is independent of the insertion order of the elements, and there is no guarantee of any order.
However, there are some differences between them, as follows:
The element of Set is a single value, while the element of Map is a key-value pair.
The non-repeatability of the element of Set means that there cannot be any two single values that are equal. The element non-repeatability of Map means that the values of keys in any two key-value pairs cannot be equal.
As you can see from the above features, the collection type (Set) can be used as a simplified version of the dictionary type (Map). That is, you can write an implementation of type Set in Map. In fact, in the Java language, the java.util.HashSet class uses the java.util.HashMap class as the underlying support. So here we start from HashSet and gradually abstract the collection Set.
1. Define HashSet
First, create a source file called hash_set.go in the code package basic/set of the workspace's src directory (you can define it yourself, but keep it consistent later).
According to the code package basic/set, the package declaration statement of the source file hash_set.go (for some rules, see the previous series of blog posts) is as follows:
Package set
It was mentioned above that you can use a collection type as a simplified version of a dictionary type. Now our HashSet takes the dictionary type as its underlying implementation. HashSet declares as follows:
Type HashSet struct {m map [interface {}] bool}
As stated above, the only field type in the HashSet type is map [interface {}] bool. The reason for choosing such a dictionary type is that by setting the key type of dictionary m to interface {}, the element of HashSet can be of any type, because here you need to use the key in the value of m to store the element value of type HashSet. The benefits of element types that use the bool type as the value of m are as follows:
From the point of view of the stored form of the value, the bool type value takes up only one byte.
From the point of view of the representation of values, there are only two values of type bool-true and false. Also, both values are predefined constants.
Using the bool type as the value type makes it easier to determine whether a key exists in the dictionary type value. For example, if true is always used as the value of an element when adding a key-value pair to the value of m, the resulting value of the index expression m ["a"] always reflects whether the key-value pair with the key "a" is included in the value of m. For values of type map [interface {}] bool, the following:
If m ["a"] {/ / determine whether m contains a key-value pair with the key "a" / / omitting other statements}
Now that the basic structure of the HashSet type is determined, consider how to initialize the HashSet type value. Because the zero value of the dictionary type is nil, and using the new function to create a HashSet type value, that is, new (HashSet) .m, the result will be a nil (about the new function, please refer to my other blog Go language Learning Notes 5). Therefore, you need to write a function specifically for creating and initializing values of type HashSet, which is declared as follows:
Func NewHashSet () * HashSet {return & HashSet {m: make (map [interface {}] bool)}}
As you can see above, field m is initialized using the make function. Also notice that the result declaration of the function NewHashSet is of type * HashSet rather than HashSet, so that the method collection of this result value contains all methods that call the recipient type HashSet or * HashSet. The benefits of this will be explained later when you write the Set interface type.
two。 Implement the basic functions of HashSet
According to the HashSet types in other programming languages, most of them should provide the following basic functions:
Add element values.
Delete the element value.
Clears all element values.
Determines whether an element value is included.
Gets the number of element values.
Determines whether the value is the same as other HashSet types.
Get all element values, that is, generate an iterable snapshot.
Gets its own string representation.
Now these functions are realized one by one, and readers can realize them by themselves. The following is for reference only.
(1)。 Add element value
/ / the method Add returns a result value of type bool to indicate whether the operation to add the element value was successful. / / the recipient type in the declaration of the method Add is * HashSet. Func (set * HashSet) Add (e interface {}) bool {if! set.m [e] {/ / the current m value does not contain the key value with the value of e. Set.m [e] = true// adds the key value pair with the key e (the value represented) and the element true to the value of m return true// added successfully} return false / / add failed}
The use of * HashSet instead of HashSet is mainly from the point of view of saving memory space. The analysis is as follows:
When the recipient type of the Add method is HashSet, each call to it requires a copy of the current HashSet type value. Although there is only one field of reference type in the HashSet type, this is also an overhead. And it doesn't take into account the fact that fields in the HashSet type may become more.
When the recipient type of the Add method is * HashSet, the current * HashSet type value copied when it is called is only a pointer value. In most cases, the memory space occupied by a pointer value is always small by the other type of value it points to. No matter how much memory space is required for the other type of value pointed to by a pointer value, the memory space it occupies is always the same.
(2)。 Delete element value
/ / call the built-in delete function to delete the dictionary value func (set * HashSet) Remove (e interface {}) {delete (set.m, e) / / the first parameter is the target dictionary type, and the second parameter is the key of the key-value pair to be deleted.
(3)。 Clear all elements
/ / reassign the field m in HashSet to func (set * HashSet) Clear () {set.m = make (map [interface {}] bool)}
If the recipient type is HashSet, the assignment statement in this method only assigns a value to the field m in a copy of the current value, while the field m in the current value is not reassigned. After this assignment statement in the method Clear is executed, the element in the current HashSet type value is equivalent to being emptied. The old dictionary value that has been unbound from field m becomes useless data because it is no longer bound to any program entity. It will be discovered and reclaimed by the garbage collector of the Go language at some point later.
(4)。 Determines whether an element value is included.
/ / the method Contains is used to determine whether its value contains an element value. / / the judgment here benefits from the field mfunc (set * HashSet) Contains (e interface {}) bool {return set.m [e]} whose element type is bool.
When an interface {} type value is added to a dictionary value as a key, the Go language first gets the actual type (that is, the dynamic type) of the interface {} type value, and then uses the corresponding hash function to hash the value, that is, the interface {} type value can always be calculated correctly. However, the key of a dictionary type cannot be a function type, a dictionary type, or a slice type, otherwise it will cause a run-time panic and prompt as follows:
Panic: runtime error: hash of unhashable type
(5)。 Gets the number of element values.
/ / method Len is used to obtain the number of HashSet element values func (set * HashSet) Len () int {return len (set.m)}
(6)。 Determines whether the value is the same as other HashSet types.
/ method Same is used to determine whether two HashSet type values are the same func (set * HashSet) Same (other * HashSet) bool {if other = = nil {return false} if set.Len ()! = other.Len () {return false} for key: = range set.m {if! other.Contains (key) {return false} return true}
The necessary condition for two HashSet types to have the same value is that they should contain exactly the same elements. Since the iterative order of the elements in the HashSet type value is always uncertain, you don't have to care whether the two values are consistent in this respect. If you want to determine whether two HashSet type values are the same value, you need to use the pointer operation to compare the memory address.
(7)。 Get all element values, that is, generate an iterable snapshot.
The so-called snapshot is the image of the target value at a certain time. For a HashSet type value, the iteration order of the elements in its snapshot can always be determined, and the snapshot only reflects the state of the HashSet type value at a certain time. In addition, you need to select one of the data types that can be iterated over and sequentially determined as a snapshot. This type must have a single value as an element, so the dictionary type should not be excluded first. And because the number of elements in the value of the HashSet type is not always fixed, a snapshot of it cannot be represented by a value of an array type. As shown in the above analysis, the type of snapshot that can be used in the Go language should be a slice type or channel type.
/ / method Elements is used to generate snapshot func (set * HashSet) Elements () [] interface {} {initialLen: = len (set.m) / / get the length of field m in HashSet That is, m contains the number of elements / / initializes a variable snapshot of type [] interface {} to store the element value in the value of m snapshot: = make ([] interface {}, initialLen) actualLen: = 0 / / sets the iterative value to the specified element location of the snapshot value (the value of the variable snapshot) in the given order, and this process does not create any new values. For key: = range set.m {if actualLen
< initialLen { snapshot[actualLen] = key } else {//m的值中的元素数量有所增加,使得实际迭代的次数大于先前初始化的快照值的长度 snapshot = append(snapshot, key)//使用append函数向快照值追加元素值。 } actualLen++//实际迭代的次数 } //对于已被初始化的[]interface{}类型的切片值来说,未被显示初始化的元素位置上的值均为nil。 //m的值中的元素数量有所减少,使得实际迭代的次数小于先前初始化的快照值的长度。 //这样快照值的尾部存在若干个没有任何意义的值为nil的元素, //可以通过snapshot = snapshot[:actualLen]将无用的元素值从快照值中去掉。 if actualLen < initialLen { snapshot = snapshot[:actualLen] } return snapshot} 注意:在 Elements 方法中针对并发访问和修改 m 的值的情况采取了一些措施。但是由于m的值本身并不是并发安全的,所以并不能保证 Elements 方法的执行总会准确无误。要做到真正的并发安全,还需要一些辅助的手段,比如读写互斥量。 (8).获取自身的字符串表示形式。 //这个String方法的签名算是一个惯用法。 //代码包fmt中的打印函数总会使用参数值附带的具有如此签名的String方法的结果值作为该参数值的字符串表示形式。func (set *HashSet) String() string { var buf bytes.Buffer//作为结果值的缓冲区 buf.WriteString("HashSet{") first := true for key := range set.m { if first { first = false } else { buf.WriteString(",") } buf.WriteString(fmt.Sprintf("%v", key)) } //n := 1 //for key := range set.m { // buf.WriteString(fmt.Sprintf("%v", key)) // if n == len(set.m) {//最后一个元素的后面不添加逗号 // break; // } else { // buf.WriteString(",") // } // n++; //} buf.WriteString("}") return buf.String() } 如上已经完整地编写了一个具备常用功能的Set的实现类型,后面将讲解更多的高级功能来完善它。 3.高级功能 集合 Set 的真包含的判断功能。根据集合代数中的描述,如果集合 A 真包含了集合 B,那么就可以说集合 A 是集合 B 的一个超集。 // 判断集合 set 是否是集合 other 的超集 func (set *HashSet) IsSuperset(other *HashSet) bool { if other == nil {//如果other为nil,则other不是set的子集 return false } setLen := set.Len()//获取set的元素值数量 otherLen := other.Len()//获取other的元素值数量 if setLen == 0 || setLen == otherLen {//set的元素值数量等于0或者等于other的元素数量 return false } if setLen >0 & & otherLen = = 0 {/ / other is the number of elements 0 the number of elements is greater than 0, then set is also a superset of other return true} for _, v: = range other.Elements () {if! set.Contains (v) {/ / as long as one of the set contains data in other, false return false}} return true} is returned
The operation of a set includes union, intersection, difference and symmetric difference.
Union operation means that all the elements in two sets are merged and combined into one set.
Intersection operation refers to finding the common elements in two sets and forming them into a set.
The meaning of subtraction operation of set An on set B is to find the elements that only exist in set A but not in set B and form them into a set.
The symmetric difference operation is similar to but different from the difference operation. Symmetric difference operation means to find the elements that only exist in set A but not in set B, then find the elements that only exist in set B but not in set A, and finally combine them to form a set.
Realize union operation
/ / generate union of collection set and collection other func (set * HashSet) Union (other * HashSet) * HashSet {if set = = nil | | other = = nil {/ / set and other are nil, then their union is nil return nil} unionedSet: = NewHashSet () / / create a new HashSet type value with a length of 0 That is, the number of elements is 0 for _, v: = range set.Elements () {/ / add elements in set to unionedSet unionedSet.Add (v)} if other.Len () = = 0 {return unionedSet} for _, v: = range other.Elements () {/ / add elements in other to unionedSet, if the same is encountered, unionedSet.Add (v)} return unionedSet} is not added (reflected in Add method logic)
Realize intersection operation
/ / generate the intersection of set set and set other func (set * HashSet) Intersect (other * HashSet) * HashSet {if set = = nil | | other = = nil {/ / set and other are nil, then their intersection is nil return nil} intersectedSet: = NewHashSet () / / create a new HashSet type value with a length of 0, that is, 0 if other.Len () = 0 {/ / other Return intersectedSet return intersectedSet} if set.Len () directly
< other.Len() {//set的元素数量少于other的元素数量 for _, v := range set.Elements() {//遍历set if other.Contains(v) {//只要将set和other共有的添加到intersectedSet intersectedSet.Add(v) } } } else {//set的元素数量多于other的元素数量 for _, v := range other.Elements() {//遍历other if set.Contains(v) {//只要将set和other共有的添加到intersectedSet intersectedSet.Add(v) } } } return intersectedSet} 差集 // 生成集合 set 对集合 other 的差集func (set *HashSet) Difference(other *HashSet) *HashSet { if set == nil || other == nil {// set和other都为nil,则它们的差集为nil return nil } differencedSet := NewHashSet()//新创建一个HashSet类型值,它的长度为0,即元素数量为0 if other.Len() == 0 { // 如果other的元素数量为0 for _, v := range set.Elements() {//遍历set,并将set中的元素v添加到differencedSet differencedSet.Add(v) } return differencedSet//直接返回differencedSet } for _, v := range set.Elements() {//other的元素数量不为0,遍历set if !other.Contains(v) {//如果other中不包含v,就将v添加到differencedSet中 differencedSet.Add(v) } } return differencedSet} 对称差集 // 生成集合 one 和集合 other 的对称差集func (set *HashSet) SymmetricDifference(other *HashSet) *HashSet { if set == nil || other == nil {// set和other都为nil,则它们的对称差集为nil return nil } diffA := set.Difference(other)//生成集合 set 对集合 other 的差集 if other.Len() == 0 {//如果other的元素数量等于0,那么other对集合set的差集为空,则直接返回diffA return diffA } diffB := other.Difference(set)//生成集合 other 对集合 set 的差集 return diffA.Union(diffB)//返回集合 diffA 和集合 diffB 的并集} 4.进一步重构 目前所实现的 HashSet 类型提供了一些必要的集合操作功能,但是不同应用场景下可能会需要使用功能更加丰富的集合类型。当有多个集合类型的时候,应该在它们之上抽取出一个接口类型以标识它们共有的行为方式。依据 HashSet 类型的声明,可以如下声明 Set 接口类型: type Set interface { Add(e interface{}) bool Remove(e interface{}) Clear() Contains(e interface{}) bool Len() int Same(other Set) bool Elements() []interface{} String() string} 注意: Set 中的 Same 方法的签名与附属于 HashSet类型的 Same 方法有所不同。这里不能再接口类型的方法的签名中包含它的实现类型。因此这里的改动如下: func (set *HashSet) Same(other Set) bool { //省略若干语句} 修改了 Same 方法的签名,目的是让 *HashSet 类型成为 Set 接口类型的一个实现类型。 高级功能的方法应该适用于所有的实现类型,完全可以抽离出成为独立的函数。并且,也不应该在每个实现类型中重复地实现这些高级方法。如下为改造后的 IsSuperset 方法的声明: // 判断集合 one 是否是集合 other 的超集// 读者应重点关注IsSuperset与附属于HashSet类型的IsSuperset方法的区别func IsSuperset(one Set, other Set) bool { if one == nil || other == nil { return false } oneLen := one.Len() otherLen := other.Len() if oneLen == 0 || oneLen == otherLen { return false } if oneLen >0 & & otherLen = 0 {return true} for _, v: = range other.Elements () {if! one.Contains (v) {return false}} return true} this is the end of the study on "how to customize Set collections in the Go language", hoping to solve everyone's doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 268
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.