2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains why Redis 6.0 introduced multithreading. We hope you find it a useful reference.
Why did Redis use a single-threaded model before 6.0?
Strictly speaking, Redis has not been purely single-threaded since version 4.0. Besides the main thread, there are background threads that handle slower operations, such as releasing useless connections and deleting large keys.
Why is the single-threaded model so fast?
The author of Redis considered many aspects from the very start of the design and ultimately chose a single-threaded model for processing commands. There were several important reasons for this choice:
Redis operations are memory-based, and for most operations the performance bottleneck is not the CPU.
The single-threaded model avoids the performance overhead of switching between threads.
Client requests can still be processed concurrently under a single-threaded model (via I/O multiplexing).
A single-threaded model is easier to maintain, and the cost of development, debugging, and maintenance is lower.
Of the reasons above, the third was the decisive factor in Redis's adoption of the single-threaded model; the others are additional benefits it brings. Let's go through these reasons in order.
Performance bottleneck is not in CPU
The following figure (image omitted here) is the illustration of the single-threaded model from Redis's official website. Redis's bottleneck is not the CPU; its main bottlenecks are memory and network. In a Linux environment, Redis can handle up to a million requests per second.
Why is it that CPU is not the bottleneck of Redis?
First, the vast majority of Redis operations are memory-based, pure key-value (KV) operations, so commands execute very quickly. Roughly speaking, the data in Redis is stored in one large hash table (think of a HashMap), whose advantage is that lookups and writes have O(1) average time complexity. Redis uses this structure to store its data, which lays the foundation for its high performance. According to the official website, Redis can ideally serve a million requests per second, each command completing in a tiny fraction of a millisecond. Since every Redis operation is so fast that a single thread can easily keep up, why use multithreading at all?
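The O(1) idea above can be sketched in a few lines. This is an illustrative toy, not Redis's actual C implementation; the class and method names (TinyKv, set, get) are ours:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal sketch of a key-value store backed by a hash table,
// showing why reads and writes are O(1) on average.
class TinyKv {
    private final Map<String, String> store = new HashMap<>();

    public void set(String key, String value) {
        store.put(key, value);   // O(1) average-case insert/update
    }

    public String get(String key) {
        return store.get(key);   // O(1) average-case lookup
    }

    public static void main(String[] args) {
        TinyKv kv = new TinyKv();
        kv.set("user:1", "alice");
        System.out.println(kv.get("user:1")); // alice
    }
}
```

Because each command is a cheap hash-table operation like this, command execution itself is rarely the bottleneck.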
Thread context switching problem
In addition, multithreaded scenarios incur thread context switches. Threads are scheduled by the CPU, and one CPU core can execute only one thread at a time. When the CPU switches from thread A to thread B, a series of operations takes place: mainly saving thread A's execution context and then loading thread B's. This process is called a "thread context switch", and it involves saving and restoring thread-related state.
Frequent context switching causes a sharp drop in performance: instead of speeding up request processing, it slows it down. This is one of the reasons Redis is cautious about multithreading.
On Linux, you can use the vmstat command to check the number of context switches (sample output omitted here). Running vmstat 1 reports statistics every second; the cs column shows the number of context switches per second. On an idle system, context switches are generally fewer than 1,500 per second.
Parallel processing of client requests (I/O multiplexing)
As mentioned above, Redis's bottleneck is not the CPU; its main bottlenecks are memory and network. The memory bottleneck is easy to understand: when Redis is used as a cache, many scenarios require caching large amounts of data, and the needed memory can be provided through cluster sharding, such as Redis Cluster's decentralized sharding scheme or Codis's proxy-based sharding scheme.
As for the network bottleneck, Redis uses I/O multiplexing in its network model to reduce its impact. Using a single-threaded model does not mean a program cannot process tasks concurrently: although Redis processes user requests on a single thread, it uses I/O multiplexing to wait on many client connections "in parallel". This greatly reduces system overhead, since the system no longer needs a dedicated listening thread per connection, avoiding the huge performance cost of creating large numbers of threads.
Let's explain the I/O multiplexing model in detail below. To make the picture complete, first a few basic concepts.
Socket: a socket can be understood as a communication endpoint between two applications communicating over a network. One application writes data to its socket, and the data is sent through the network card to the other application's socket. What we usually call remote communication over HTTP and TCP is based on sockets at the bottom layer, and the five classic network I/O models are all built on top of sockets as well.
Blocking and non-blocking: blocking means a request does not return immediately; the response comes back only after all the logic has been processed. Non-blocking is the opposite: the call returns immediately after the request is issued, without waiting for all the logic to complete.
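The non-blocking behavior can be demonstrated in a couple of lines. This sketch uses Java NIO's Pipe as a stand-in for a network connection (the class and method names are ours, purely for illustration): a read on a non-blocking channel with no data available returns 0 immediately instead of waiting.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

// Illustrative sketch: a non-blocking read returns at once even when
// no data is available, rather than blocking the calling thread.
class NonBlockingRead {
    static int tryRead() throws Exception {
        Pipe pipe = Pipe.open();                  // stands in for a connection
        pipe.source().configureBlocking(false);   // switch to non-blocking mode
        ByteBuffer buf = ByteBuffer.allocate(8);
        int n = pipe.source().read(buf);          // no data yet: returns 0 immediately
        pipe.source().close();
        pipe.sink().close();
        return n;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(tryRead()); // 0
    }
}
```

A blocking channel in the same situation would simply park the thread until data arrived.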
Kernel space and user space: in Linux, application programs are far less stable than operating-system code. To protect the stability of the OS, Linux separates kernel space from user space: kernel space runs the operating system and drivers, while user space runs applications. Isolating the two in this way keeps applications from compromising the stability of the operating system itself, and is a major reason Linux is so stable. All system-resource operations happen in kernel space, such as reading and writing disk files, allocating and reclaiming memory, and network-interface calls. So in a network read, data is not copied directly from the network card into the application buffer in user space: it is first copied from the network card into a kernel-space buffer, and then from the kernel into the application buffer in user space. A network write goes the other way: data is copied from the application buffer in user space into the kernel buffer, and then sent out through the network card.
The I/O multiplexing model is built on event-separation (readiness-notification) system calls such as select, poll, and epoll. Take epoll, which Redis uses, as an example. Before initiating a read, the program updates epoll's socket watch list and then waits for the epoll call to return (this wait blocks, which is why multiplexed I/O is still essentially a blocking I/O model). When data arrives on some socket, the epoll call returns, and only then does the user thread actually issue the read and process the data. In this mode, one dedicated monitoring thread watches many sockets, and whenever a socket has data, processing is handed to a worker thread. Since waiting for socket data is what takes the time, this solves the blocking-I/O-model problem of needing one thread per socket connection, without the CPU cost of busy polling found in the non-blocking I/O model. The multiplexing I/O model is widely used in practice: Redis, Java NIO, and Netty, the communication framework used by Dubbo, are all familiar examples.
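As an illustrative sketch of this readiness model, the following uses Java NIO's Selector (which is backed by epoll on Linux). A Pipe stands in for a client socket connection, and all class and method names here are ours, not Redis's:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Sketch of the readiness-notification flow: register a channel on a
// selector, block in select() until data arrives, then read immediately.
class MultiplexDemo {
    static String waitAndRead() throws Exception {
        Selector selector = Selector.open();          // epoll under the hood on Linux
        Pipe pipe = Pipe.open();                      // stands in for a client connection
        pipe.source().configureBlocking(false);
        pipe.source().register(selector, SelectionKey.OP_READ); // add to watch list

        pipe.sink().write(ByteBuffer.wrap("PING".getBytes())); // "client" sends data

        selector.select(1000);                        // blocks until a channel is readable
        ByteBuffer buf = ByteBuffer.allocate(16);
        pipe.source().read(buf);                      // data has arrived: returns at once
        selector.close();
        pipe.source().close();
        pipe.sink().close();
        buf.flip();
        return new String(buf.array(), 0, buf.limit());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(waitAndRead()); // PING
    }
}
```

A single thread blocked in select() like this can watch thousands of connections at once, which is exactly what lets single-threaded Redis serve many clients concurrently.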
The following figure (image omitted here) shows the detailed flow of socket programming based on the epoll function.
Maintainability
We know that multithreading can make full use of multi-core CPUs; in high-concurrency scenarios it reduces the CPU time lost to I/O waits and brings good performance. But multithreading is a double-edged sword: along with the benefits come harder-to-maintain code, online problems that are difficult to locate and debug, deadlocks, and more. In a multithreaded model, code no longer executes serially, and shared variables accessed by multiple threads at the same time can lead to strange problems if not handled properly.
Let's look at the strange behavior that can arise in a multithreaded scenario through an example. Consider the following code:
class MemoryReordering {
    int num = 0;
    boolean flag = false;

    public void set() {
        num = 1;              // statement 1
        flag = true;          // statement 2
    }

    public int cal() {
        if (flag == true) {   // statement 3
            return num + num; // statement 4
        }
        return -1;
    }
}
What does cal() return when flag is true? Many people would say: is that even a question? Obviously 2.
The result may surprise you. In the code above, statements 1 and 2 have no data dependency, so instruction reordering can occur: the compiler may move flag = true ahead of num = 1. Now suppose set and cal execute in different threads, with no ordering guarantee between them; cal enters the if block and performs the addition as soon as flag is true. The possible orders are:
Statement 1 executes before statement 2. The order might be: statement 1 -> statement 2 -> statement 3 -> statement 4. When statement 4 runs, num = 1, so cal returns 2.
Statement 2 executes before statement 1. The order might be: statement 2 -> statement 3 -> statement 4 -> statement 1. When statement 4 runs, num = 0, so cal returns 0.
We can see that if instruction reordering occurs in a multithreaded environment, it will have a serious impact on the results.
Of course, you can avoid the reordering by declaring flag on the third line as volatile (volatile boolean flag = false;). This in effect places a memory fence at flag, preventing reordering of the code before and after it. Beyond reordering, multithreading also brings problems such as visibility, deadlock, and shared-resource safety.
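Here is the corrected class with the volatile fix applied (a sketch for illustration; the class name MemoryReorderingFixed is ours). The volatile write to flag guarantees that the earlier write to num is visible to any thread that subsequently reads flag as true:

```java
// Same example as above, with flag declared volatile: the write to num
// can no longer be reordered past (or hidden from) the write to flag.
class MemoryReorderingFixed {
    int num = 0;
    volatile boolean flag = false;   // volatile acts as a memory fence

    public void set() {
        num = 1;                     // guaranteed visible before flag becomes true
        flag = true;
    }

    public int cal() {
        if (flag) {
            return num + num;        // once flag is true, this is always 2
        }
        return -1;
    }
}
```

With this change, any thread that observes flag == true is guaranteed to also observe num == 1, so cal() can no longer return 0.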
Why did Redis 6.0 introduce multithreading?
The multithreading introduced in Redis 6.0 is actually used only for reading and writing network data and parsing the protocol; commands are still executed by a single worker thread.
As the flow above shows, when Redis processes network data, the epoll call blocks the thread. If concurrency is very high, reaching tens of thousands of QPS, this can become a bottleneck. Network I/O bottlenecks like this are generally solved by adding threads. Enabling multithreading not only reduces the impact of waiting on network I/O but also takes full advantage of multi-core CPUs. Redis 6.0 is no exception: it adds multithreading for processing network data in order to improve throughput. Command processing, of course, remains single-threaded, so there are none of the problems caused by concurrent access under multithreading.
Performance comparison
Stress test configuration:
Redis Server: Aliyun Ubuntu 18.04, 8-core 2.5 GHz CPU, 8 GB memory, host model ecs.ic5.2xlarge
Redis Benchmark Client: Aliyun Ubuntu 18.04, 8-core 2.5 GHz CPU, 8 GB memory, host model ecs.ic5.2xlarge
The multithreaded version is Redis 6.0, and the single-threaded version is Redis 5.0.5. The multithreaded version requires the following new configuration:
io-threads 4              # start 4 I/O threads
io-threads-do-reads yes   # use the I/O threads for request parsing as well
Stress test command: redis-benchmark -h 192.168.0.49 -a foobared -t set,get -n 1000000 -r 100000000 --threads 4 -d ${datasize} -c 256
(Benchmark result charts omitted; the images come from the Internet.)
As the results show, GET/SET performance in the multithreaded version is almost double that of the single-threaded version. Note that these numbers only verify whether multithreaded I/O really brings a performance gain; no scenario-specific stress testing was done, so the data is for reference only. The test was also run against the unstable branch, and later official releases may well perform better.
Single-threading and multithreading each have their own advantages; only by fully understanding the underlying principles can you apply them flexibly in production.
Thank you for reading! That concludes this look at why Redis 6.0 introduced multithreading; we hope it has been helpful.