The difference between a red-black tree and a hash table

2025-01-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

I. The basic principles of hash tables and red-black trees

Hashing maps a key, through a hash function, to a fixed-size output that is used as an index into an array. Among all the linear data structures, the array has the fastest positioning speed, because an element can be located directly through its array subscript and does not need to be searched for one by one.
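
As a minimal sketch of that idea in Java, the class below reduces a key's hash code to an array subscript and keeps colliding entries in a per-slot list. The class name BucketTable, the fixed capacity, and the chaining strategy are illustrative assumptions, not something the article prescribes.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative fixed-capacity hash table with chaining (hypothetical, for explanation only). */
public class BucketTable {
    // One chain of {key, value} pairs per array slot.
    private final List<List<String[]>> slots;

    public BucketTable(int capacity) {
        slots = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) {
            slots.add(new ArrayList<>());
        }
    }

    /** Reduce the key's hash code to an array subscript in [0, capacity). */
    public int indexOf(String key) {
        return Math.floorMod(key.hashCode(), slots.size());
    }

    /** Insert the pair, or overwrite the value if the key is already present. */
    public void put(String key, String value) {
        List<String[]> chain = slots.get(indexOf(key));
        for (String[] entry : chain) {
            if (entry[0].equals(key)) {
                entry[1] = value;
                return;
            }
        }
        chain.add(new String[] {key, value});
    }

    /** Jump straight to the slot, then scan only that slot's chain. */
    public String get(String key) {
        for (String[] entry : slots.get(indexOf(key))) {
            if (entry[0].equals(key)) {
                return entry[1];
            }
        }
        return null;
    }
}
```

The lookup touches one slot regardless of how many keys the table holds, which is where the constant-time behavior comes from as long as the chains stay short.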

The rotation operation of the red-black tree is an ingenious design. A red-black tree is a special self-balancing binary search tree; its characteristic is that an entry can be found among hundreds of thousands of items in at most a few dozen steps, which is fast.
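
To put a number on "a few dozen steps": a red-black tree holding n keys has height at most 2 * log2(n + 1), a standard textbook bound. The small Java sketch below simply evaluates that bound; the class and method names are made up for illustration.

```java
public class RedBlackHeight {
    /** Upper bound on the height of a red-black tree holding n keys: 2 * log2(n + 1). */
    public static int maxHeight(long n) {
        double log2 = Math.log(n + 1.0) / Math.log(2.0);
        return (int) Math.floor(2.0 * log2);
    }

    public static void main(String[] args) {
        // Print the worst-case number of levels for a few data sizes.
        for (long n : new long[] {1_000L, 500_000L, 10_000_000L}) {
            System.out.println(n + " keys -> at most " + maxHeight(n) + " levels");
        }
    }
}
```

For 500,000 keys the bound works out to 37 levels, so even the worst-case lookup needs only a few dozen comparisons.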

II. Usage scenarios

1. Speed comparison

The Internet of Things may involve millions of connected devices or users, which demands high concurrency, so the first requirement for network security products is performance and speed. Generally speaking, hash lookup is faster than a red-black tree: its lookup time is, on average, basically independent of the amount of data (constant level), while red-black tree lookup time is at the log(n) level.

The time complexity of search and deletion is O(log n) for a red-black tree and O(1) on average for a hash table (degrading toward O(n) when many keys collide). When the red-black tree is about 8 levels high and lookups compare numeric keys, there is not much difference in performance between the two.

That is, hashes are not faster than red-black trees in every scenario; it depends on how well the code is optimized. The event management of the Linux high-concurrency epoll mode, which hihttps uses, is built on a red-black tree.

2. Data prediction

For static data whose size can basically be predicted, use a hash. For example, there are several hundred rules for t initialization, a number that can be controlled. In addition, there will not be too many TOPIC blacklist and whitelist entries, URL addresses and so on, so hashes are used for them as well.

For dynamic data, such as IP address statistics, task scheduling, and epoll high-concurrency event management, the size cannot be judged in advance: it may be very small or very large, so a red-black tree is the better choice. Of course, if you roughly know that the number of device IP addresses falls within a certain range, such as only a few thousand, you can also use a hash.

3. Memory consumption

Where memory requirements are strict, such as in embedded systems, use a red-black tree. The red-black tree takes up less memory (memory is allocated only for the nodes that actually exist), while a hash table has to allocate enough memory for the table in advance, which can waste memory.

Where memory consumption does not matter, such as on a server with plenty of memory, use a hash. The biggest disadvantage of hashing is that if the table is allocated too small, elements may collide, and once a linked list of colliding elements grows past 8 entries, it is less efficient than a red-black tree. Java's HashMap combines a hash table with red-black trees: since Java 8, when a single bucket holds no fewer than 8 colliding nodes (and the table has at least 64 buckets), they are no longer stored as a singly linked list but as a red-black tree.
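
The following Java sketch forces that collision case on purpose. BadKey is a hypothetical key type whose hashCode is deliberately constant, so every entry lands in the same bucket; it also implements Comparable, which the HashMap documentation says the class may use to order keys with equal hash codes. The conversion to a tree is internal and not visible through the public API, so the example only shows that lookups stay correct under heavy collisions.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    /** Hypothetical key with a deliberately constant hashCode: every instance collides. */
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // worst case on purpose
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    /** Insert n fully colliding keys, then look one of them up; returns null if absent. */
    public static Integer lookupAmongCollisions(int n, int id) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map.get(new BadKey(id));
    }
}
```

With a plain linked list per bucket, each such lookup would walk up to n entries; reorganizing the crowded bucket as a tree is what keeps it logarithmic.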

4. Complexity

Hashing is simpler and the red-black tree algorithm is a little more complex, but that hardly matters: experts have already open-sourced many stable implementations.

III. Summary of application scenarios

The red-black tree is ordered, and the hash is unordered; choose according to the needs of the project. Many of Alibaba's projects lean toward the red-black tree. The author thinks that the choice is mainly related to memory: if memory is tight, use the red-black tree; if memory is large enough that some of it can be sacrificed for faster speed, a hash is entirely suitable.
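
The ordered-versus-unordered difference can be illustrated in Java with java.util.TreeMap, which is documented as a red-black tree based map. The class name, method names, and the numeric device IDs below are hypothetical and chosen only for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class OrderedVsUnordered {
    /** Build a TreeMap keyed by the given (hypothetical) device IDs. */
    private static TreeMap<Integer, String> index(int... ids) {
        TreeMap<Integer, String> tree = new TreeMap<>();
        for (int id : ids) {
            tree.put(id, "seen");
        }
        return tree;
    }

    /** Keys of a TreeMap come back in ascending order, whatever the insertion order. */
    public static List<Integer> sortedIds(int... ids) {
        return new ArrayList<>(index(ids).keySet());
    }

    /** Range query over [low, high]; a hash table would need a full scan to answer this. */
    public static List<Integer> idsInRange(int low, int high, int... ids) {
        return new ArrayList<>(index(ids).subMap(low, true, high, true).keySet());
    }
}
```

If a project needs sorted traversal or range queries such as "all IDs between 80 and 443", the tree provides them directly; if it only needs exact-match lookups, that ordering is overhead a hash avoids.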

The hihttps open-source WAF uses hash algorithms extensively, which may be related to its speed and concurrency requirements. In a word, data structures are the most basic subject of network security.
