Architecture Design | What Is "High Concurrency" in Internet Architecture?
I. What Is High Concurrency?
High concurrency is one of the factors that must be considered when designing an Internet distributed system architecture. It usually means designing the system so that it can handle many requests in parallel at the same time.
Some commonly used indicators of high concurrency:
Response time: the time it takes the system to respond to a request. For example, if the system takes 200ms to process an HTTP request, those 200ms are the system's response time.
Throughput: the number of requests processed per unit of time.
QPS (Queries Per Second): the number of requests handled per second. On the Internet, the distinction between this metric and throughput is not very sharp.
Number of concurrent users: the number of users simultaneously using the system's normal functions. For example, in an instant messaging system, the number of simultaneously online users reflects, to some extent, the system's number of concurrent users.
II. How to Improve the System's Concurrency Capability
In Internet distributed architecture design, there are two main ways to improve a system's concurrency: vertical scaling (Scale Up) and horizontal scaling (Scale Out).
Vertical scaling:
Improve the processing capacity of a single machine. There are two ways to scale vertically:
(1) Enhance single-machine hardware performance, for example: increase the number of CPU cores (e.g. 32 cores), upgrade to better network cards (e.g. 10G), upgrade to better disks (e.g. SSD), expand disk capacity (e.g. 2TB), and expand system memory (e.g. 128GB);
(2) Improve single-machine architecture performance, for example: use a cache to reduce the number of I/O operations, use asynchronous processing to increase single-service throughput, and use lock-free data structures to reduce response time (a small caching sketch follows the next paragraph);
In the early days of rapid Internet business growth, if budget is not an issue, it is strongly recommended to improve system concurrency by "enhancing single-machine hardware performance": at this stage the company's strategy is usually to grow the business and save time, and upgrading hardware is often the fastest way. However, whether you improve single-machine hardware performance or single-machine architecture performance, there is a fatal shortcoming: single-machine performance always has a limit. Therefore, the ultimate solution for high concurrency in Internet distributed architecture design is horizontal scaling.
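To make approach (2) concrete, here is a minimal sketch of the "use a cache to reduce I/O" idea in Python; the in-process dict, the query_db() stand-in, and the key names are illustrative placeholders, not part of the original article.

```python
# A minimal sketch of caching in front of the database to cut I/O on a single
# machine. cache, query_db() and get_user() are hypothetical names.
cache = {}

def query_db(uid: int) -> dict:
    # Stand-in for a real database read (the expensive I/O we want to avoid).
    return {"uid": uid, "name": f"user-{uid}"}

def get_user(uid: int) -> dict:
    # Serve repeated reads from memory; only cache misses reach the database.
    if uid not in cache:
        cache[uid] = query_db(uid)
    return cache[uid]

get_user(1)   # first call hits the "database"
get_user(1)   # second call is served from the cache, no extra I/O
```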
Horizontal scaling:
Increase the number of servers to scale system performance linearly. Horizontal scalability is a requirement placed on the system's architecture design. How to design for horizontal scaling at each layer of the architecture, and the common horizontal scaling practices at each layer of an Internet company's architecture, are the focus of this article.
III. Common Internet Layered Architecture
A common Internet distributed architecture is layered as follows:
(1) Client layer: the typical caller is a browser or a mobile APP
(2) Reverse proxy layer: the system entry point, acting as a reverse proxy
(3) Site application layer: implements the core application logic and returns html or json
(4) Service layer: this layer exists if services have been split out
(5) Data layer (cache): the cache accelerates access to storage
(6) Data layer (database): the database provides durable data storage
How is horizontal scaling of the whole system implemented at each layer?
IV. Layered Horizontal Scaling Architecture Practice
Horizontal scaling of the reverse proxy layer
Horizontal scaling of the reverse proxy layer is achieved through "DNS round-robin": the dns-server is configured with multiple resolution IPs for a single domain name, and each time a DNS resolution request reaches the dns-server, it returns these IPs in rotation.
When nginx becomes a bottleneck, simply add servers, deploy additional nginx services, and add the new external IPs to the DNS records; the capacity of the reverse proxy layer is then expanded, achieving theoretically unlimited concurrency.
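As a rough illustration of DNS round-robin from the client's point of view, the sketch below resolves a domain that has several A records and rotates requests across the returned IPs; the domain name is a placeholder, not something from the article.

```python
import itertools
import socket

# A minimal sketch: resolve a domain configured with several A records
# (DNS round-robin) and rotate requests across the returned IPs.
# "www.example.com" is a placeholder domain.

def resolve_all_ips(domain: str) -> list:
    """Return every IPv4 address the DNS server hands back for the domain."""
    infos = socket.getaddrinfo(domain, 80, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

ips = resolve_all_ips("www.example.com")
rotation = itertools.cycle(ips)          # round-robin over the resolved IPs
for _ in range(4):
    print("next request goes to:", next(rotation))
```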
Horizontal scaling of the site layer
Horizontal scaling of the site layer is achieved through nginx: by modifying nginx.conf, multiple web backends can be configured.
When a web backend becomes a bottleneck, simply add servers, deploy new web services, and configure the new web backends in nginx.conf; the capacity of the site layer is then expanded, achieving theoretically unlimited concurrency.
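The article does not show the nginx configuration itself, but the effect of listing several web backends in nginx.conf is round-robin dispatch across them. The following Python sketch mimics that idea with made-up backend addresses; it is not nginx's actual implementation.

```python
# A minimal sketch of round-robin dispatch across web backends, the effect of
# an nginx upstream with several servers. Addresses below are illustrative.

web_backends = [
    "192.168.0.1:8080",
    "192.168.0.2:8080",
    "192.168.0.3:8080",
]
_counter = 0

def pick_backend() -> str:
    """Return the backend that should serve the next request."""
    global _counter
    backend = web_backends[_counter % len(web_backends)]
    _counter += 1
    return backend

# Scaling out the site layer: deploy one more web service and register it.
web_backends.append("192.168.0.4:8080")

for _ in range(5):
    print(pick_backend())
```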
Horizontal scaling of the service layer
Horizontal scaling of the service layer is achieved through a "service connection pool".
When the site layer calls a downstream RPC-server through an RPC-client, the connection pool inside the RPC-client establishes multiple connections to the downstream service. When the service becomes a bottleneck, simply add servers, deploy new service instances, and let the RPC-client establish connections to the newly deployed downstream services; the capacity of the service layer is then expanded, achieving theoretically unlimited concurrency. If the service layer needs to scale out gracefully and automatically, automatic service discovery support in the configuration center may be required.
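Below is a toy sketch of the service connection pool idea (not any real RPC framework): the RPC-client keeps a list of downstream RPC-server endpoints, spreads calls across them, and scaling out just means registering a new endpoint. Class name and endpoints are illustrative.

```python
import socket

# A toy sketch of the "service connection pool": the RPC-client knows every
# downstream RPC-server instance and spreads calls across them round-robin.
# Endpoints below are placeholders.

class RpcConnectionPool:
    def __init__(self, endpoints):
        self._endpoints = list(endpoints)   # (host, port) of each RPC-server
        self._index = 0

    def add_endpoint(self, host, port):
        """Scale out: register a newly deployed downstream service instance."""
        self._endpoints.append((host, port))

    def get_connection(self):
        """Pick the next downstream instance round-robin and open a TCP connection."""
        host, port = self._endpoints[self._index % len(self._endpoints)]
        self._index += 1
        return socket.create_connection((host, port), timeout=1.0)

pool = RpcConnectionPool([("10.0.0.1", 9000), ("10.0.0.2", 9000)])
pool.add_endpoint("10.0.0.3", 9000)   # new instance added, callers unchanged
```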
Horizontal scaling of the data layer
When the amount of data is large, the data layer (cache, database) involves horizontal scaling of the data itself: data originally stored on one server (cache or database) is split horizontally across different servers in order to expand system capacity.
There are several common horizontal splitting methods for the Internet data layer:
Splitting by range
Each data service stores a certain range of data, for example:
the user0 database stores the uid range 1 to 10 million
the user1 database stores the uid range 10 million to 20 million
The benefits are:
(1) The rule is simple: the service only needs to check which uid range a request falls into to route it to the corresponding storage service (a routing sketch follows this subsection);
(2) Data balance is good;
(3) Expansion is easy: a data service for the uid range 20 million to 30 million can be added at any time;
The deficiency is:
Request load is not necessarily balanced. Generally, newly registered users are more active than older users, so the service holding the higher uid ranges comes under greater pressure.
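As a small illustration of the range rule above, here is a routing sketch assuming the two uid ranges from the example; the shard names and half-open range boundaries are illustrative choices.

```python
# A minimal sketch of range-based routing: uid 1 to 10 million -> user0,
# 10 million to 20 million -> user1. Boundaries and names are illustrative.

RANGE_SHARDS = [
    (1, 10_000_000, "user0"),
    (10_000_000, 20_000_000, "user1"),
    # expansion is easy: just append (20_000_000, 30_000_000, "user2")
]

def route_by_range(uid: int) -> str:
    """Return the database that stores this uid, by checking range boundaries."""
    for low, high, shard in RANGE_SHARDS:
        if low <= uid < high:
            return shard
    raise ValueError(f"uid {uid} is outside all configured ranges")

assert route_by_range(42) == "user0"
assert route_by_range(15_000_000) == "user1"
```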
Splitting by hash
Each database stores part of the data according to the hash of some key value, for example:
the user0 database stores data for even uids
the user1 database stores data for odd uids
The benefits are:
(1) The rule is simple: the service only needs to hash the uid to route a request to the corresponding storage service (a routing sketch follows this subsection);
(2) Data balance is good;
(3) Request load is spread evenly;
The deficiency is:
Expansion is not easy: adding a data service changes the hash function, which may require data migration.
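Similarly, here is a sketch of the hash rule, assuming the even/odd uid split from the example; it also shows why adding a shard changes where existing uids map, which is why data migration is usually needed.

```python
# A minimal sketch of hash-based routing with two shards: even uids go to
# user0, odd uids to user1. Names are illustrative.

HASH_SHARDS = ["user0", "user1"]

def route_by_hash(uid: int) -> str:
    """Route a uid to a shard with a simple modulo hash."""
    return HASH_SHARDS[uid % len(HASH_SHARDS)]

assert route_by_hash(8) == "user0"    # even uid
assert route_by_hash(7) == "user1"    # odd uid

# If a third shard were added, uid 8 would map to index 8 % 3 == 2, a
# different shard than before, so its data would have to be migrated.
```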
It should be noted here that scaling system performance through horizontal splitting is fundamentally different from scaling database performance through master-slave synchronous read-write separation.
Scaling database performance through horizontal splitting:
(1) Each server stores 1/n of the total data, so single-machine performance also improves;
(2) The data on the n servers does not overlap, and the union of the data on all servers is the full data set;
(3) With the data split horizontally across n servers, read performance theoretically expands n times and write performance also expands n times (in practice more than n times, because each machine's data volume drops to 1/n of the original);
Scaling database performance through master-slave synchronous read-write separation:
(1) Each server stores the same amount of data as the full data set;
(2) The data on the n servers is identical; each holds the complete set;
(3) Read performance theoretically expands n times, but writes still go through a single point, so write performance is unchanged;
Horizontal splitting of the cache layer is similar to that of the database layer; it also mostly takes the form of range splitting or hash splitting, so it is not elaborated on further here.
V. Summary
High concurrency is one of the factors that must be considered when designing an Internet distributed system architecture. It usually means designing the system so that it can handle many requests in parallel at the same time.
There are two ways to improve a system's concurrency capability: vertical scaling (Scale Up) and horizontal scaling (Scale Out). The former can improve concurrency by enhancing single-machine hardware performance or single-machine architecture performance, but single-machine performance always has a limit. The ultimate solution for high concurrency in Internet distributed architecture design is the latter: horizontal scaling.
In the layered architecture of the Internet, the horizontal scaling practice at each layer is different:
(1) The reverse proxy layer can be scaled horizontally through "DNS round-robin";
(2) The site layer can be scaled horizontally through nginx;
(3) The service layer can be scaled horizontally through a service connection pool;
(4) The database can be scaled horizontally by splitting data by range or by hash;
Once each layer can scale horizontally, system performance can be improved simply by adding servers, making performance theoretically unlimited.