
What is the optimization scheme for read-heavy, write-light C++ high-concurrency scenarios?

2025-02-24 Update From: SLTechnology News&Howtos

In this issue, the editor brings you an optimization scheme for read-heavy, write-light workloads in high-concurrency C++ scenarios. The article is rich in content and analyzes the topic from a professional point of view; I hope you get something out of it.

Overview

When it comes to high-concurrency optimization, what usually comes to mind are architecture-level measures: horizontal module splitting, database read-write separation, database and table sharding, adding caches, adding message queues (MQ), and so on. But a single module is a building block of the system, and its performance can also greatly affect overall performance. This article discusses solutions within a single module for the read-heavy, write-light scenario, in order to better achieve high concurrency.

Different business scenarios have different read and write frequencies. Two cases are common:

Read-heavy, write-light: typical scenarios include ad retrieval, whitelist update and maintenance, and load balancers

Read-light, write-heavy: typical scenarios include QPS statistics

This article analyzes the problems encountered in the read-heavy, write-light scenario (also called write-light, read-heavy) and discusses an appropriate solution.

Analysis

In a read-heavy, write-light scenario, the service spends most of its time reading, and reads must not take too long, typically a millisecond or less; updates are infrequent, for example once every few seconds. Simply adding a mutex to carve out a critical section often achieves the desired result and ensures the data is updated correctly.
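
For concreteness, here is a minimal sketch of this naive approach, using a read-write lock (std::shared_mutex); the Whitelist type and its contents are illustrative assumptions, not from the original article:

    #include <shared_mutex>
    #include <string>
    #include <unordered_set>

    // Hypothetical shared data: a whitelist read on every request
    // and rebuilt every few seconds.
    struct Whitelist {
        std::unordered_set<std::string> ids;
    };

    class NaiveGuardedWhitelist {
    public:
        bool Contains(const std::string& id) const {
            std::shared_lock<std::shared_mutex> lock(mu_);  // shared (read) lock
            return data_.ids.count(id) > 0;
        }
        void Replace(Whitelist fresh) {
            std::unique_lock<std::shared_mutex> lock(mu_);  // exclusive (write) lock
            data_ = std::move(fresh);  // all readers block while this runs
        }
    private:
        mutable std::shared_mutex mu_;
        Whitelist data_;
    };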

However, any lock introduces contention. Even with a read-write lock, where reads do not exclude each other, writing still affects reading: when reads and writes compete for the lock, the write is typically given priority (a characteristic of read-write locks), so while a write is pending, all read requests block until the writer thread or coroutine releases the lock. If the write critical section takes a long time, every read request is affected: the monitoring chart shows a sharp latency spike, and read requests queue up waiting to be processed. If the service can drain these reads before the next update cycle, the situation may not be so bad. But in the extreme case where the next update cycle arrives before the backlog is drained, a vicious circle forms: reads keep queuing, the queue stays congested, and the service appears dead. Worse still, once an upstream service detects an unresponsive node, its load-balancing strategy will generally retry the request on other nodes, increasing their pressure, and eventually the whole system can avalanche.

Therefore, locking should be avoided as far as possible in high-concurrency scenarios. If it cannot be avoided, the lock granularity must be kept as small as possible, ideally close to lock-free. Simply locking a large critical section is not a suitable solution for high concurrency.

Double buffering

There is a very common data structure called double buffering. A classic example is how a display works: the current frame is shown on screen while the next frame is prepared in a background buffer; when the refresh interval arrives, the buffers are swapped directly, achieving stutter-free refreshes. Its guiding idea is trading space for time. The data structure works as follows:

The data is divided into foreground and background.

All reader threads read the foreground data without taking a global lock; a pointer indicates which copy is currently the foreground.

Only one thread is responsible for updating. It prepares the background data first, then switches the pointer directly; after that, all new read requests see the new foreground data.

Some reads may still be in progress on the old foreground, so the writer thread cannot exit immediately: it must wait until every thread that landed on the old foreground has finished reading. It then applies the same update to the old foreground data (now the new background), which keeps the foreground and background consistent; this is useful when doing incremental updates.

Difficulties in the engineering implementation

How does the writer thread know that all readers have finished reading the old foreground?

One way is to give each reader thread its own lock, taken for the duration of the read. Locking it does not affect other reader threads, only the writer; the lock is released when the read finishes (occasionally with the small overhead of notifying the writer thread, but writes themselves are rare). The writer thread only has to confirm, lock by lock, that each reader's lock has been released, releasing it again immediately after confirmation. This confirmation is very fast (less than 25 ns; recall 1 s = 10^3 ms = 10^6 us = 10^9 ns), so reader threads barely perceive the lock's existence.

Each thread has its own lock; do we need a global map from thread id to lock?

No. Doing so would put a global lock around the global map, which brings us right back to the problem identified at the start of the analysis. Instead, each thread can use its private storage (thread-local storage, TLS); for coroutines, the coroutine-local equivalent. (Go officially does not support TLS: to achieve something similar you must either find a way to obtain TLS, or drop per-coroutine locks in favor of one global lock while keeping its granularity as small as possible. This article focuses on C++ and does not discuss other languages' implementations in depth.) This way, each reader thread locks only its own lock, which does not affect other reader threads; the lock's sole purpose is to ensure that reads take priority.

For thread-private storage, you can use the pthread_key_create, pthread_setspecific, and pthread_getspecific family of functions.
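
A minimal sketch of how these functions fit together; the per-thread value here is just a placeholder for the per-thread lock object and is illustrative only:

    #include <pthread.h>
    #include <cstdlib>

    static pthread_key_t g_key;
    static pthread_once_t g_once = PTHREAD_ONCE_INIT;

    // Destructor runs when a thread exits, freeing that thread's private value.
    static void DestroyValue(void* value) { free(value); }

    static void CreateKey() { pthread_key_create(&g_key, DestroyValue); }

    // Returns this thread's private slot, creating it on first use.
    void* GetThreadLocal() {
        pthread_once(&g_once, CreateKey);
        void* value = pthread_getspecific(g_key);  // NULL on first call in this thread
        if (value == NULL) {
            value = calloc(1, sizeof(int));        // stand-in for a per-thread lock object
            pthread_setspecific(g_key, value);     // registered for this thread only
        }
        return value;
    }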

Core code implementation

Read

    template <typename T, typename TLS>
    int DoublyBufferedData<T, TLS>::Read(
            typename DoublyBufferedData<T, TLS>::ScopedPtr* ptr) {
        // The lock is released when ScopedPtr is destructed.
        Wrapper* w = static_cast<Wrapper*>(pthread_getspecific(_wrapper_key));
        if (BAIDU_LIKELY(w != NULL)) {  // not the first read: get the pthread-local lock
            w->BeginRead();             // take this thread's own lock
            ptr->_data = UnsafeRead();
            ptr->_w = w;
            return 0;
        }
        w = AddWrapper();
        if (BAIDU_LIKELY(w != NULL)) {
            // First read in this thread: install the pthread-local lock.
            const int rc = pthread_setspecific(_wrapper_key, w);
            if (rc == 0) {
                w->BeginRead();
                ptr->_data = UnsafeRead();
                ptr->_w = w;
                return 0;
            }
        }
        return -1;
    }

Write

    template <typename T, typename TLS>
    template <typename Fn>
    size_t DoublyBufferedData<T, TLS>::Modify(Fn& fn) {
        BAIDU_SCOPED_LOCK(_modify_mutex);  // lock to ensure only one writer at a time
        int bg_index = !_index.load(butil::memory_order_relaxed);  // points to the background buffer
        const size_t ret = fn(_data[bg_index]);  // modify the background buffer
        if (!ret) {
            return 0;
        }
        // Switch the pointer: new read requests now see the new foreground.
        _index.store(bg_index, butil::memory_order_release);
        bg_index = !bg_index;
        // Wait for threads still reading the old foreground to finish.
        {
            BAIDU_SCOPED_LOCK(_wrappers_mutex);
            for (size_t i = 0; i < _wrappers.size(); ++i) {
                _wrappers[i]->WaitReadDone();
            }
        }
        // Confirmed no one is still reading: modify the old foreground (the new
        // background) directly so that it matches the new foreground.
        const size_t ret2 = fn(_data[bg_index]);
        return ret2;
    }

For a complete implementation, please refer to brpc's DoublyBufferedData.
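
For reference, a minimal usage sketch in the style of brpc's API; the Whitelist type and AddId modifier are illustrative assumptions, so consult butil/containers/doubly_buffered_data.h for the exact interface:

    #include <string>
    #include <unordered_set>
    #include "butil/containers/doubly_buffered_data.h"

    typedef std::unordered_set<std::string> Whitelist;

    // The modifier runs against the background copy (twice, once per copy);
    // returning nonzero tells Modify that a change was made.
    static size_t AddId(Whitelist& bg, const std::string& id) {
        bg.insert(id);
        return 1;
    }

    bool Contains(butil::DoublyBufferedData<Whitelist>& data, const std::string& id) {
        butil::DoublyBufferedData<Whitelist>::ScopedPtr ptr;
        if (data.Read(&ptr) != 0) {  // takes only this thread's private lock
            return false;
        }
        return ptr->count(id) > 0;   // the lock is released when ptr is destructed
    }

    void AddToWhitelist(butil::DoublyBufferedData<Whitelist>& data, const std::string& id) {
        data.Modify(AddId, id);      // serialized with other writers by the modify mutex
    }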

A brief look at double-buffering implementations in Go

A common double-buffer reload implementation

Based on a counter, using atomics for atomicity: on entering the read critical section the counter is incremented, on exit decremented; the writer checks that the counter is 0 and then switches. The counter, however, is effectively a global lock. C++ can adopt this scheme too, but since the counter remains a point of global contention, performance suffers accordingly; even a smart pointer such as shared_ptr faces the same problem of contention on its reference count. The reason for using a counter instead of TLS is that Go does not support TLS. Comparing the TLS version with the counter version, TLS should perform better, since there is no contention on a shared counter; but the counter itself is very fast, and the difference has not been benchmarked here, so it is worth trying.
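
A minimal C++ sketch of this counter-based scheme, assuming a single writer and using sequentially consistent atomics for simplicity; illustrative and not benchmarked:

    #include <atomic>
    #include <thread>

    template <typename T>
    class CounterDoubleBuffer {
    public:
        // Reader: enter the critical section (counter + 1), read, leave (counter - 1).
        T Read() {
            readers_.fetch_add(1);          // the shared counter: a global contention point
            int fg = index_.load();
            T value = data_[fg];
            readers_.fetch_sub(1);
            return value;
        }

        // Writer: prepare the background copy, switch, wait for readers to drain.
        // Assumes a single writer; serialize writers externally otherwise.
        void Write(const T& value) {
            int bg = 1 - index_.load();
            data_[bg] = value;              // prepare the background buffer
            index_.store(bg);               // switch: new readers see the new foreground
            while (readers_.load() != 0) {  // conservatively wait for all active readers
                std::this_thread::yield();
            }
            data_[1 - bg] = value;          // update the old copy to keep both consistent
        }

    private:
        T data_[2];
        std::atomic<int> index_{0};
        std::atomic<int> readers_{0};
    };

Note that the wait loop is conservative: it waits for all active readers, not just those still on the old foreground, so under a continuous stream of reads it may spin for a while before the counter touches zero.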

The sync.Map implementation

sync.Map is also counter-based, but its counter exists to keep the probability of missing the foreground cache from growing too high: it has a damping, converging effect, and reads are lock-free in the common case. In the rare case where the foreground cache does not contain the data, the background cache is read instead; that path does take a lock, and the counter is incremented. Once the counter grows large enough (exceeding the number of elements in the background cache), the switch is performed.

Is double buffering suitable for read-light, write-heavy scenarios?

No. Double buffering prioritizes read performance, while read-light, write-heavy scenarios need to prioritize write performance.

The above is the optimization scheme for read-heavy, write-light C++ high-concurrency scenarios shared by the editor. If you happen to have similar doubts, the analysis above may help you understand them. If you want to know more, you are welcome to follow the industry information channel.
