In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-05 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/03 Report--
This article mainly explains "what are the common solutions to Hash conflicts?" the content of the explanation is simple and clear, and it is easy to learn and understand. Please follow the editor's ideas to study and learn what are the common solutions to Hash conflicts.
Introduction of HASH algorithm
Hash function (English: Hash function), also known as hash algorithm, hash function, is a method to create small digital "fingerprints" from any kind of data. The hash function compresses the message or data into a summary, reducing the amount of data and fixing the format of the data. This function scrambles and mixes the data and recreates a fingerprint called a hash value (hash values,hash codes,hash sums, or hashes). Hash values are usually represented by a short string of random letters and numbers.
A hash table is a structure that stores data by key-value (key-indexed). We only need to enter the value to be looked up, namely key, to find its corresponding value.
The idea of hashing is simple: if all keys are integers, you can use a simple unordered array: use the key as an index, and the value is its corresponding value, so that you can quickly access the value of any key. This is the case for simple keys, which we extend to handle more complex types of keys.
The Hash algorithm can convert a data into a flag, which is closely related to every byte of the source data. Another feature of Hash algorithm is that it is difficult to find the inverse rule.
Hash algorithm is also called hash algorithm. Although Hash algorithm is called algorithm, it is actually more like an idea. Hash algorithm does not have a fixed formula, as long as it conforms to the hash idea, it can be called Hash algorithm.
Common HASH algorithm
Introduction to common Hash algorithms:
1) MD4
MD4 (RFC 1320) is designed by Ronald L. Rivest of MIT in 1990, and MD is the abbreviation of Message Digest (message Summary). It is suitable for high-speed software implementation on 32-bit word-length processors-it is based on 32-bit Operand status operations.
2) MD5
MD5 (RFC 1321) is an improved version of MD4 made by Rivest in 1991. It still groups the input in 512 bits, and its output is a cascade of four 32-bit words, the same as MD4. MD5 is more complex and slower than MD4, but safer, and performs better at anti-analysis and anti-differential.
3) SHA-1 and others
SHA1 is designed by NIST NSA to be used with DSA. It produces a hash of length 160bit for inputs with a length less than 264, so it is more resistant to exhaustion (brute-force). The SHA-1 designer is based on the same principle as MD4 and imitates the algorithm.
Properties of HASH algorithm
All hash functions have the following basic characteristic: if two hash values are different (according to the same function), then the original input of the two hash values is also different. This property is the result of the certainty of the hash function, and the hash function with this property is called an one-way hash function.
Hash table, which is designed from the point of view of fast access, is also a typical practice of "space for time". As the name implies, the data structure can be understood as a linear table, but the elements in it are not closely arranged, but there may be gaps.
For example, we store 70 elements, but we may have applied for 100 elements for these 70 elements. 70max 100mm 0.7, this number is called load factor.
We do this for the purpose of "quick access".
We arrange the storage location for each element based on a fixed function H whose results are randomly and evenly distributed as far as possible, so that the linear search of ergodic property can be avoided and fast access can be achieved.
This is similar to 70 people going to a restaurant with 100 chairs for dinner. The hash function evaluates to a storage unit address, each of which is called a "bucket". If there are m buckets in a hash table, then the range of the hash function should be [05m MMI 1].
What is a hash collision?
If different inputs are mapped to the same hash value, a "hash collision" (collision) occurs.
Assuming that the size of the hash table is 11 (that is, there are 11 slots), now you want to store a string of data in the table: 1.
To do a simple calculation: hash (1) = 5, that is, data 1 should be placed in the fifth slot of the hash table; hash (2) = 1, so data 2 should be placed in the first slot of the hash table; hash (3) = 1, that is, data 3 should also be placed in the first slot of the hash table-thus causing a collision (also known as a collision).
Common solutions to Hash conflicts
There are several common ways to resolve Hash conflicts:
1. Open addressing method
This method is also called rehashing method. Its basic idea is that when the hash address of the keyword key is conflicted, another hash address p1 is generated based on p, and then another hash address p2, based on p, is generated based on p. Until a non-conflicting hash address pi is found and the corresponding element is stored in it This method has a general form of rehash function:
Hi= (H (key) + di)% Mirage 1, 2, … , n
Where H (key) is a hash function, m is a table length, and di is called an incremental sequence. If the value of the incremental sequence is different, the corresponding re-hashing is also different. There are three main types:
Linear detection rehash
Dii=1,2,3,... , mmur1
The characteristic of this method is that when a conflict occurs, the next unit in the table is checked sequentially until an empty unit is found or the whole table is searched.
Secondary detection and rehashing
Di=12,-12,22,-22,... , k 2m m m k 2
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.