How to solve the thundering herd problem in nginx

2025-01-21 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/01 Report

This article shares an approach to understanding and solving nginx's thundering herd problem. The editor finds it very practical, so it is shared here as a reference; follow along for a closer look.

To understand nginx's thundering herd problem, start with how nginx starts up. The master process binds and listens on the ports specified in the configuration file, and then calls fork() to create each worker process. By the way processes work, a child inherits all of the parent's memory and its open file descriptors, including the listening sockets; in other words, after startup every worker process is listening on every configured port. The thundering herd problem is this: when a client initiates a new connection, the connection-establishment event wakes every worker process, but only one worker can actually handle it; the others discover that the event has already been consumed and loop back into their wait state. This phenomenon, in which a single event "startles" every worker process, is the thundering herd (sometimes rendered "shock group") problem. Waking all workers for every connection clearly wastes resources, so this article focuses on how nginx avoids it.

1. Solution method

As mentioned in the previous article, when each worker process is created it calls the ngx_worker_process_init() method to initialize itself, and a very important step in that initialization is that each worker calls epoll_create() to create its own private epoll handle. Each port to be listened on has a corresponding file descriptor, and a worker will only be triggered by client connection-establishment events on that port if it adds the file descriptor to its epoll handle via epoll_ctl() and listens for the accept event. Conversely, a worker that has not added the listening file descriptor to its epoll handle can never be triggered by the corresponding event. Based on this principle, nginx uses a shared lock to control whether the current process has permission to add the listening file descriptors to its epoll handle; that is, only the process that acquires the lock listens on the target ports. This guarantees that each connection event triggers exactly one worker process. The following figure shows a schematic diagram of the worker process work cycle:

Regarding the cycle in the figure, note that on every iteration of its loop, each worker attempts to acquire the shared lock. If it fails to acquire the lock, it removes the listening file descriptors from its epoll handle (even if they are not currently registered, in which case the removal is a no-op). The main purpose of this design is to prevent the loss of client connection events; it can cause a small amount of thundering-herd behavior, but the effect is not serious.

Consider the alternative: if a process removed the listening file descriptors from its epoll handle at the moment it released the lock, then, in the window before the next worker acquires the lock, no epoll handle would be listening on any port, and connection events arriving during that window would be lost. Removing the descriptors on lock-acquisition failure instead, as in the figure, is safe: a failed acquisition means some other process must currently hold the lock and be listening on those descriptors.

There is still one wrinkle. As the figure shows, when a loop iteration finishes, the current process releases the lock and moves on to handle its other events without deregistering the listening file descriptors. If another process acquires the lock and registers those descriptors during that window, two processes are briefly listening on the same descriptors, and a client connection event arriving then will wake two worker processes. This is tolerated for two main reasons:

The herd in this case wakes only a couple of worker processes, which is far better than waking all worker processes on every event.

This situation only arises because a process releases the lock without deregistering the listening file descriptors. But after releasing the lock, a worker mainly handles the read and write events of its connected clients and checks its flag bits, which takes very little time; it then attempts to acquire the lock again, and if that fails it deregisters the listening descriptors at that point. The worker that holds the lock, by contrast, spends comparatively long waiting for and handling client connection-establishment events. The window in which two workers listen at once, and hence the probability of a thundering herd, is therefore small.

2. Source code explanation

The worker process's event loop is driven mainly by the ngx_process_events_and_timers() method. Let's look at how this method handles the whole process. Here is the (abridged) source code of the method:

```c
void
ngx_process_events_and_timers(ngx_cycle_t *cycle)
{
    ngx_uint_t  flags;
    ngx_msec_t  timer, delta;

    if (ngx_trylock_accept_mutex(cycle) == NGX_ERROR) {
        return;
    }

    /* Events are processed here. For the kqueue model this points to the
     * ngx_kqueue_process_events() method; for the epoll model it points to
     * the ngx_epoll_process_events() method. The main job of this method is
     * to fetch the event list from the corresponding event model and append
     * each event to the ngx_posted_accept_events queue or the
     * ngx_posted_events queue. */
    (void) ngx_process_events(cycle, timer, flags);

    /* Accept events are processed here, delegated to the ngx_event_accept()
     * method of ngx_event_accept.c. */
    ngx_event_process_posted(cycle, &ngx_posted_accept_events);

    /* Release the lock. */
    if (ngx_accept_mutex_held) {
        ngx_shmtx_unlock(&ngx_accept_mutex);
    }

    /* An event that does not need to be queued is handled directly: an
     * accept event goes to ngx_event_accept() in ngx_event_accept.c, a read
     * event goes to ngx_http_wait_request_handler() in ngx_http_request.c,
     * and keepalive events are ultimately handled by
     * ngx_http_keepalive_handler() in ngx_http_request.c. */

    /* Process all events other than accept events. */
    ngx_event_process_posted(cycle, &ngx_posted_events);
}
```

In the above code, most of the checking work is omitted, leaving only the skeleton. First, the worker calls ngx_trylock_accept_mutex() to try to acquire the lock; if it succeeds, it registers the listening file descriptor for each port. It then calls ngx_process_events() to handle the events reported through the epoll handle, releases the shared lock, and finally handles the read and write events of already-connected clients. Let's look at how the ngx_trylock_accept_mutex() method acquires the shared lock:

```c
ngx_int_t
ngx_trylock_accept_mutex(ngx_cycle_t *cycle)
{
    /* Attempt to acquire the shared lock. */
    if (ngx_shmtx_trylock(&ngx_accept_mutex)) {

        /* ngx_accept_mutex_held == 1 means the current process already
         * holds the lock. */
        if (ngx_accept_mutex_held && ngx_accept_events == 0) {
            return NGX_OK;
        }

        /* Register the listening file descriptors with the current
         * process's event mechanism (for example, the change_list array of
         * the kqueue model, or the epoll handle for epoll). By default,
         * when nginx starts its workers, each worker inherits the listening
         * socket handles from the master. This causes the problem that a
         * client event on a port wakes every process listening on that
         * port, yet only one worker can successfully handle it; the others
         * wake only to find the event has expired and go back to waiting --
         * the "thundering herd". nginx solves this with the shared lock
         * here: only the worker that acquires the lock handles client
         * events, because acquiring the lock is coupled with re-adding the
         * listening events for each port to the current worker, while the
         * other workers do not listen. In other words, only one worker
         * listens on each port at a time, which avoids the thundering herd.
         * The ngx_enable_accept_events() method adds the listening events
         * for each port to the current process. */
        if (ngx_enable_accept_events(cycle) == NGX_ERROR) {
            ngx_shmtx_unlock(&ngx_accept_mutex);
            return NGX_ERROR;
        }

        /* Flag that the lock has been acquired. */
        ngx_accept_events = 0;
        ngx_accept_mutex_held = 1;

        return NGX_OK;
    }

    /* The lock was not acquired; if this process held it previously
     * (ngx_accept_mutex_held is 1), reset the flag to 0 and remove this
     * process's listening events on each port. */
    if (ngx_accept_mutex_held) {
        if (ngx_disable_accept_events(cycle, 0) == NGX_ERROR) {
            return NGX_ERROR;
        }

        ngx_accept_mutex_held = 0;
    }

    return NGX_OK;
}
```

This method essentially does three things:

1. Attempt to acquire the shared lock with a CAS operation, via the ngx_shmtx_trylock() method.

2. If the lock is acquired, call ngx_enable_accept_events() to register listening events for the file descriptors of the target ports.

3. If the lock is not acquired (but was held previously), call ngx_disable_accept_events() to deregister the listening file descriptors.

Thank you for reading! This concludes the article on how to solve the thundering herd problem in nginx. I hope the content above has been helpful and has taught you something new. If you found the article useful, please share it so more people can see it!
