This article introduces ways to deal with high concurrency. Many developers run into exactly this dilemma in real projects, so let's walk through how to handle these situations. I hope you read it carefully and come away with something useful!
/ 01 Definition of High Concurrency /
First, what is the definition of high concurrency?
High concurrency means heavy traffic, so we need technical means to withstand its impact, channeling the traffic so that the system handles it smoothly, as we intend, and gives users a good experience. Yes, much like Dayu taming the flood.
I think almost every programmer has wondered: how much QPS or TPS counts as high concurrency?
In fact, this cannot be judged by a single standard number, because businesses differ too much in complexity to generalize.
My own criterion: when a working system starts to show performance problems even though no one has deliberately optimized it, it is entering high-concurrency territory.
Yes, it's that simple and crude. No specific numbers.
That said, in scenarios where high concurrency is the norm, there is usually a monitoring system that continuously checks whether at least the following metrics are healthy.
TPS. Transactions per second, where a transaction is the full round trip of a client sending a request and the server responding. (Measured from the client's perspective.)
QPS. Queries per second; it can be estimated as concurrency / average response time, as in the sketch after this list. (Measured from the server's perspective; one TPS may correspond to multiple QPS.)
Number of concurrent users. How many users the system can support using its functions normally at the same time.
Response time. Often called RT; the time the system takes to respond to a request.
Bandwidth. For businesses that move a lot of data, such as video and audio applications, bandwidth also has to be considered.
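As a rough illustration of the QPS estimate above, here is a minimal sketch with made-up numbers (my example, not from any real monitoring system):

// Sketch: the relationship between concurrency, average response time, and QPS.
// All figures are hypothetical, purely to illustrate QPS = concurrency / avg RT.
public class QpsEstimate {
    public static void main(String[] args) {
        int concurrency = 50;        // requests in flight at the same time
        double avgRtSeconds = 0.1;   // average response time: 100 ms
        double qps = concurrency / avgRtSeconds;
        System.out.printf("Estimated QPS = %.0f%n", qps); // 50 / 0.1 = 500
    }
}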
If you don't yet know what these metrics mean, reread them until you do; otherwise you are doing high concurrency with your eyes closed.
/ 02 Ideas for Dealing with High Concurrency /
Many inexperienced developers, as soon as they hit a high-concurrency problem, reach for a cache regardless of the situation.
Using a cache is often right, but a cache is not a panacea that works everywhere. After all, a cache is "stateful", and in software development dealing with "stateful" things is always much more troublesome than with "stateless" ones, because data consistency has to be considered.
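To make that "stateful" pain concrete, here is a minimal cache-aside sketch (my illustration, not code from this article; loadFromDb and saveToDb are hypothetical stand-ins for real data access). Reads go through the cache, and every write must remember to invalidate it, which is exactly where consistency bugs creep in.

import java.util.concurrent.ConcurrentHashMap;

// Minimal cache-aside sketch. The consistency burden: every write path must
// invalidate (or update) the cache, or readers keep seeing stale data.
public class ProductCache {
    private final ConcurrentHashMap<Long, String> cache = new ConcurrentHashMap<>();

    public String get(long id) {
        // Read path: try the cache first, fall back to the database on a miss.
        return cache.computeIfAbsent(id, this::loadFromDb);
    }

    public void update(long id, String value) {
        saveToDb(id, value);
        cache.remove(id); // forget this line and the cache serves stale data
    }

    private String loadFromDb(long id) { return "row-" + id; }   // hypothetical DB read
    private void saveToDb(long id, String v) { /* hypothetical DB write */ }
}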
I suggest thinking through the problem in the following three steps.
01 Map the request flow
Once you have mapped out how requests flow through the system, it is as if you hold a "battle map" and can deploy your forces more intuitively and precisely.
After all, software development is an engineering discipline; we cannot work by gut feeling, and solving high-concurrency problems is no exception.
02 Set the goal
As the saying goes, "there is no best, only better." If you don't set a goal first, optimization becomes an endless pursuit of performance. Excess performance yields no additional benefit and simply wastes money; after all, resources are costly and limited.
I suggest breaking the numerical goals down to each API. I usually start by deciding what TPS the whole business line should reach; for example, placing an order should reach 100 TPS.
Next, using the link map from the previous step, I identify which services and APIs are involved, and which APIs are called multiple times by different services.
Then, based on each API's position in the call chain (the value set for an upstream API should usually be enlarged a little) and how many times it is called, I derive a QPS target for each API. For example:
Get shopping cart list: called twice, QPS = 200.
Batch-get product stock: called three times, QPS = 300.
Get membership info: called once, QPS = 100.
……
Following this approach, determine the QPS of every API involved in every business line, then add up the values for the same API to get each API's target QPS at the whole-system level (equivalent to summing the estimated peaks of the business lines). For example:
Get shopping cart list: appears in three business lines, QPS = 200 + 500 + 100 = 800.
Batch-get product stock: appears in two business lines, QPS = 300 + 200 = 500.
Get membership info: appears in two business lines, QPS = 100 + 200 = 300.
……
Of course, real targets involve indicators beyond QPS; the above is only an example.
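To make the aggregation step concrete, here is a minimal sketch that sums the example numbers above (the API names and figures are illustrative only):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sum each API's per-business-line QPS targets into a system-level target,
// mirroring the worked example above.
public class QpsTargets {
    record LineTarget(String api, int qps) {}

    public static void main(String[] args) {
        List<LineTarget> perLine = List.of(
                new LineTarget("getCartList", 200),
                new LineTarget("getCartList", 500),
                new LineTarget("getCartList", 100),
                new LineTarget("batchGetStock", 300),
                new LineTarget("batchGetStock", 200),
                new LineTarget("getMemberInfo", 100),
                new LineTarget("getMemberInfo", 200));

        Map<String, Integer> systemLevel = new HashMap<>();
        for (LineTarget t : perLine) {
            systemLevel.merge(t.api(), t.qps(), Integer::sum);
        }
        // e.g. {getCartList=800, batchGetStock=500, getMemberInfo=300} (order not guaranteed)
        System.out.println(systemLevel);
    }
}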
In addition, pay extra attention to the TP90 and TP99 response times. Even when the average response time meets the target, the system may not be stable: 80% of requests may respond especially fast and pull the average down while a large number of requests are extremely slow. My rule of thumb is that a TP99 more than twice the average deserves attention, because it means there is a significant performance bottleneck somewhere.
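A minimal sketch of computing TP99 from a batch of collected latencies and applying that rule of thumb (illustrative only; real systems usually get percentiles from a metrics library, and the nearest-rank method below is just one way to define a percentile):

import java.util.Arrays;

// Sketch: compute the average and TP99 of response times (ms), then apply the
// rule of thumb "a TP99 above 2x the average deserves attention".
public class LatencyCheck {
    public static void main(String[] args) {
        long[] rtMillis = {12, 15, 11, 14, 13, 900, 12, 16, 10, 13}; // hypothetical samples
        Arrays.sort(rtMillis);

        double avg = Arrays.stream(rtMillis).average().orElse(0);
        // Nearest-rank index of the 99th percentile in the sorted sample.
        int idx = (int) Math.ceil(0.99 * rtMillis.length) - 1;
        long tp99 = rtMillis[idx];

        System.out.printf("avg=%.1f ms, TP99=%d ms%n", avg, tp99);
        if (tp99 > 2 * avg) {
            System.out.println("TP99 exceeds 2x average: look for a bottleneck");
        }
    }
}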
03 Formulate a concrete optimization plan
There are many concrete optimization schemes, and the choice depends on the actual situation. Choosing well, however, requires a global view: the system works as an integrated whole, and when its parts cooperate, the difficulty of optimization drops sharply. For example, if the chosen scheme lets an upstream API absorb 90% of the traffic so it never reaches downstream, the QPS targets of the downstream APIs can be lowered accordingly.
The following schemes are ordered by my overall assessment of their complexity and cost, from simple to complex.
First optimize at the code level: code performance, multithreading, request merging, pooling (connection pools, thread pools, object pools, and so on; see the pooling sketch after this list).
Upgrade the hardware if you can.
If the problem can be solved with a cache, resolutely avoid splitting the system. Also, push the cache as far upstream as possible, e.g. CDN > page > API > service.
If data processing is the bottleneck, first check whether the business can tolerate asynchrony; if so, use MQ to shave peaks and fill valleys. Otherwise, apply rate limiting and degradation (see the rate-limiter sketch after this list).
If MQ does not help and overall throughput is the bottleneck, then as long as writing data is not the bottleneck, try to solve it by splitting the application rather than the database. (At this point service governance has to be introduced, and consistency and data-aggregation problems will need solving.)
If you do have to go down to the data level, prefer database read/write separation over splitting the data directly.
Only when there is truly no other choice, split the data, and prefer vertical (by-business) splits to horizontal ones. (Merging horizontally split data is more expensive than merging vertically split data.)
Finally, split the database horizontally to support effectively unlimited expansion, settling the problem once and for all.
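For the code-level item above, here is a minimal thread-pool sketch (a sketch only; the pool and queue sizes are hypothetical and should come from load testing). A bounded pool reuses threads instead of spawning one per request, and the bounded queue plus caller-runs policy gives back-pressure instead of unbounded buffering.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch of pooling at the code level: a bounded thread pool with back-pressure
// instead of creating a new thread for every request.
public class WorkerPool {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                8, 16,                          // core and max pool size (tune via load tests)
                60, TimeUnit.SECONDS,           // idle threads above core die after 60 s
                new ArrayBlockingQueue<>(1000), // bounded queue: never buffer unbounded work
                new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure when saturated

        for (int i = 0; i < 100; i++) {
            final int reqId = i;
            pool.execute(() -> handle(reqId));
        }
        pool.shutdown();
    }

    private static void handle(int reqId) {
        System.out.println("handled request " + reqId); // hypothetical request handling
    }
}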
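And for the rate-limiting-and-degradation item, a minimal token-bucket sketch (my illustration; production systems typically reach for a library such as Guava's RateLimiter or Sentinel). Requests arriving when the bucket is empty are rejected immediately so the caller can degrade, rather than queueing up and dragging the system down.

// Minimal token-bucket rate limiter sketch: refill tokens over time and reject
// immediately when empty, so overload is shed instead of queued.
public class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefillNanos;

    public TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;  // admit the request
        }
        return false;     // fail fast: caller should reject or degrade
    }
}

A caller simply wraps the protected operation: if tryAcquire() returns false, return a degraded or cached response immediately instead of waiting.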
You can see that although there are many options, a pattern emerges: the further upstream a problem is solved, the lower the cost. So it is best to take in the whole picture with a funnel mindset: from the client request to the access layer, the logic layer, and the DB layer, filter requests out layer by layer, and fail fast when something goes wrong (if a request is going to fail, return as early as possible).
/ 03 Putting It into Practice /
When it comes to actual implementation, many concrete techniques are involved, such as load balancing, caching, message queues, and database and table sharding. I won't expand on them here; each point deserves a long write-up of its own. Interested readers can follow my earlier series of articles on distributed systems, "8 months of polishing, a collection of distributed systems for programmers."
In fact, to treat high concurrency as a systematic undertaking, your field of vision should not be limited to the "performance" dimension; at a minimum, "availability" and "scalability" must also be considered.
Availability is the proportion of time the system can serve normally. Between a system that is not especially fast but runs all year without downtime or faults, and one that is very fast but suffers incidents and outages every few days, users will certainly choose the former.
Scalability is the system's ability to expand quickly: under a sudden traffic surge, such as a Double 11 sale or a celebrity trending on search, can extra capacity be brought online in a short time to absorb it? After all, future traffic cannot be predicted precisely, and it is even less feasible to keep a large pool of redundant resources standing by at all times.
Therefore these three goals need to be considered together, because they are interrelated and even influence one another.
For example, when you design for scalability you make services stateless, which not only improves scalability but also indirectly improves performance and availability, because you can scale out at any time.
For another example, to improve availability, service interfaces are usually given timeouts, to prevent slow requests from blocking large numbers of threads and causing a system avalanche. How large a timeout to set is determined by the API's measured performance; a sketch follows below.
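As a sketch of such a timeout (illustrative only; the 200 ms figure is hypothetical and should be derived from the API's measured response times, e.g. its TP99):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Sketch: cap a downstream call with a timeout so slow requests fail fast
// instead of tying up threads and triggering an avalanche.
public class TimeoutCall {
    public static void main(String[] args) {
        String result = CompletableFuture
                .supplyAsync(TimeoutCall::callDownstream)
                .orTimeout(200, TimeUnit.MILLISECONDS) // budget from measured API performance
                .exceptionally(e -> "fallback")        // timed out or failed: degrade, don't block
                .join();
        System.out.println(result); // prints "fallback" because the call takes 500 ms
    }

    private static String callDownstream() {
        try { Thread.sleep(500); } catch (InterruptedException ignored) {} // hypothetical slow remote call
        return "real result";
    }
}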
This concludes "what are the ways to deal with high concurrency". Thank you for reading!