2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
How does a PHP system support high concurrency? Many newcomers are not clear on this. To help answer the question, the following explains it in detail; anyone who needs it is welcome to read along, and I hope you come away with something.
High concurrency systems are not all alike. A middleware system handling millions of concurrent requests per second, a gateway system handling tens of billions of requests per day, and a flash-sale system absorbing hundreds of thousands of requests in an instant each have different characteristics, so the architectures they use to cope with high concurrency also differ.
Likewise, the order system, product system, and inventory system of an e-commerce platform are designed differently under high-concurrency scenarios, because the business scenarios behind them are different.
The simplest system architecture
Suppose your system is initially deployed on a single machine, connected to a database deployed on its own server.
To make it concrete, say the application machine is 4-core/8 GB and the database server is 16-core/32 GB.
Now assume the system has 100,000 registered users in total, with a fairly small daily active count. The daily active ratio varies by type of system; let's take a reasonably typical 10%, giving 10,000 daily active users.
Following the 80/20 rule, the daily peak lasts about 4 hours, and 80% of active users show up during that peak: 8,000 people active within 4 hours.
Say each of them makes about 20 requests to your system per day. At the peak, 8,000 people generate only 160,000 requests, an average of roughly 11 requests per second over four hours (14,400 seconds).
See? That has nothing to do with high concurrency, does it?
At the system level that is about 11 requests per second, and each request triggers a few database operations (various CRUD statements and so on).
Say each request issues 3 database operations; the database layer then sees only about 33 requests per second.
With the database server configuration above, that load is absolutely no problem. The system described so far, shown as a diagram, looks like this:
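The back-of-envelope arithmetic above can be scripted as a sanity check. This is a minimal sketch; the function name and parameters are illustrative, and the numbers are the ones assumed in the text.

```python
# Back-of-envelope peak-load estimate for the example system:
# 100,000 total users, 10% daily active, 80% of actives concentrated
# in a 4-hour peak, 20 requests per user per day, and roughly
# 3 database operations per application request.

def peak_qps(total_users, dau_ratio, peak_ratio, reqs_per_user, peak_hours):
    daily_active = total_users * dau_ratio       # 10,000 daily actives
    peak_users = daily_active * peak_ratio       # 8,000 users in the peak window
    peak_requests = peak_users * reqs_per_user   # 160,000 requests
    return peak_requests / (peak_hours * 3600)   # spread over 14,400 seconds

app_qps = peak_qps(100_000, 0.10, 0.80, 20, 4)
db_qps = app_qps * 3  # ~3 DB operations per application request

print(f"app: {app_qps:.1f}/s, db: {db_qps:.1f}/s")  # app: 11.1/s, db: 33.3/s
```

Running the same function with the later figures (10 million users, 1 million daily actives) reproduces the roughly 1,000/s and 3,000/s estimates used in the next section.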
Database sharding (sub-database, sub-table) + read-write separation
Now suppose the user base keeps growing: 10 million registered users and 1 million daily active users.
The request rate at the system level then reaches about 1,000 requests per second. At the system level you can keep scaling out with a cluster; the load balancer in front will spread the traffic evenly.
At the database level, however, the request rate reaches about 3,000 per second, and that is a problem.
Concurrency at the database level has grown a hundredfold, and you will inevitably see the online database load climbing.
At every peak, disk I/O, network I/O, memory consumption, and CPU load all run very high, and everyone worries whether the database server will hold up.
Generally speaking, for a commonly configured online database (as in our example above), combined read and write concurrency should not exceed roughly 3,000 requests per second.
With the database under this much pressure, the first problem is that system performance may degrade at peak times, because a heavily loaded database performs worse.
The second is worse: what if the overloaded database simply goes down?
So at this point you have to apply sharding plus read-write separation: split one database into multiple databases deployed across multiple database servers, which act as masters and carry the write requests.
Each master then mounts at least one slave, and the slaves carry the read requests.
Assume database-level concurrency of 3,000 requests per second, of which writes account for 1,000/s and reads for 2,000/s.
Once the database is sharded, with the masters deployed on two database servers to carry the writes, each server carries 500 writes per second.
With one slave deployed per master, the two slaves serve the reads, each supporting 1,000 reads per second.
To sum up: as concurrency keeps growing, the focus at the database level is sharding plus read-write separation.
The architecture at this point looks like this:
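The split just described (two masters for writes, one replica each for reads) ultimately shows up as routing logic on the application side. A minimal sketch; the connection names, the user-id shard key, and the modulo sharding rule are illustrative assumptions, not part of the article's concrete system.

```python
import random

# Route queries after sharding + read-write separation:
# writes go to the owning shard's master, reads to one of its replicas.
MASTERS = ["db-master-0", "db-master-1"]        # one master per shard
REPLICAS = {
    "db-master-0": ["db-replica-0"],            # >= 1 replica per master
    "db-master-1": ["db-replica-1"],
}

def route(user_id: int, is_write: bool) -> str:
    master = MASTERS[user_id % len(MASTERS)]    # shard by user id
    if is_write:
        return master                           # 1,000 writes/s split 500/500
    return random.choice(REPLICAS[master])      # 2,000 reads/s split 1,000/1,000

print(route(7, is_write=True))    # db-master-1
print(route(8, is_write=False))   # db-replica-0
```

In practice this routing usually lives in a sharding middleware or database proxy rather than hand-written application code, but the decision it makes is the same.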
Introducing a cache cluster
From here it is straightforward: as registered users keep growing, you can keep adding machines, for example at the system level, to carry higher concurrent request volumes.
If write concurrency at the database level keeps rising, scale out the database servers through sharding; if read concurrency keeps rising, scale out by adding more slaves.
But there is a big problem here: a database is not really meant to carry high concurrency. A single database machine generally carries on the order of a few thousand requests per second, and database machines are relatively high-spec, expensive servers, so the cost is very high.
Simply adding database machines forever is not the right approach.
That is why a high-concurrency architecture usually includes a cache: cache systems are designed precisely to carry high concurrency.
A single cache machine handles tens of thousands, even hundreds of thousands, of requests per second; its capacity for high concurrency is one to two orders of magnitude beyond a database's.
So, for data that by the system's business characteristics is written rarely and read often, you can introduce a cache cluster.
Specifically, when writing to the database, you write a copy of the data into the cache cluster at the same time, and then let the cache cluster carry most of the read requests.
In this way the cache cluster lets you carry higher concurrency with fewer machine resources.
For example, in the figure above, reads currently run at 2,000 per second, and the two slaves each absorb 1,000 reads per second; but perhaps 1,800 of those reads per second target data that rarely changes and could be served straight from a cache.
Once you introduce a cache cluster, it absorbs those 1,800 reads per second, and only 200 reads per second fall through to the database level.
Again, here is the architecture as a diagram:
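The read path just described (most reads served by the cache, a small remainder falling through to the database) is essentially the cache-aside pattern. A minimal sketch using plain dicts to stand in for the cache cluster and the database; all names and values here are illustrative.

```python
# Cache-aside: write to the database and refresh the cache together;
# serve reads from the cache and hit the database only on a miss.
cache = {}
database = {}
db_reads = 0

def write(key, value):
    database[key] = value      # write goes to the primary store...
    cache[key] = value         # ...and the cache is refreshed alongside it

def read(key):
    global db_reads
    if key in cache:           # cache hit: the database sees nothing
        return cache[key]
    db_reads += 1              # cache miss: one read falls through
    value = database.get(key)
    cache[key] = value         # populate for subsequent readers
    return value

write("product:1", {"name": "widget", "price": 9})
cache.clear()                  # simulate a cold cache after a restart
for _ in range(10):
    read("product:1")
print(db_reads)                # 1 -- nine of ten reads absorbed by the cache
```

A production cache (Redis, for example) would add expiry and eviction policies on top of this, but the division of labor between cache and database is the same.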
What are the benefits of this architecture?
In the future your system may face tens of thousands of read requests per second, but 80-90% of them may be served from the cache cluster, and each cache machine can support tens of thousands of reads per second, so it costs very little in machine resources; two or three machines may well be enough.
Try the same with a database and you might have to keep adding slaves until you had 10 or 20 machines to withstand tens of thousands of reads per second, which is extremely expensive.
To briefly summarize the third point for carrying high concurrency:
Do not blindly scale out the database; database servers are expensive, and databases themselves are not meant to carry high concurrency.
For requests that write little and read a lot, introduce a cache cluster to absorb the bulk of the reads.
Introducing a message-middleware cluster
Now look at the write pressure on the database; the situation is similar to reads.
If all your write requests land on the database's master layer, that is fine for a while, but what if write pressure keeps growing?
For example, if you have to write tens of thousands of rows per second, do you keep adding master machines?
You could, of course, but by the same logic you would burn a great deal of machine resources, and that is dictated by the nature of a database system.
Under the same resources, a database is too heavyweight and complex, so its carrying capacity sits in the thousands of requests per second; at this point you need to bring in other technologies.
For example, message middleware, i.e., an MQ cluster. It can asynchronize write requests very effectively and achieve peak shaving (cutting peaks and filling valleys).
Say you now have 1,000 write requests per second, of which 500 must hit the database immediately, while the other 500 can tolerate being asynchronized, landing in the database tens of seconds or even a few minutes later.
You can then introduce a message-middleware cluster: each second, the 500 requests that tolerate asynchrony are written to MQ, and peak shaving is done on top of MQ.
For instance, consume them at a steady 100 per second and write them into the database; the write pressure on the database drops dramatically.
At this point the architecture diagram looks like this:
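The peak-shaving effect can be simulated with a simple in-memory queue standing in for the MQ cluster; the tick loop and rate constant are illustrative assumptions, not a real MQ API.

```python
from collections import deque

# Peak shaving: a one-second burst of 500 asynchronizable write requests
# is buffered in a queue, and a consumer drains it at a steady 100/s,
# so the database never sees more than the consumer rate.
mq = deque(range(500))        # the burst: 500 queued write requests
CONSUME_RATE = 100            # messages drained per simulated second

db_writes_per_second = []
while mq:
    batch = [mq.popleft() for _ in range(min(CONSUME_RATE, len(mq)))]
    db_writes_per_second.append(len(batch))  # the load the database actually sees

print(db_writes_per_second)   # [100, 100, 100, 100, 100]
```

The burst is fully absorbed, just spread over five seconds instead of one; the trade-off is exactly the tolerated write latency described above.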
Looking at the architecture above: first, a message-middleware system is itself built for high concurrency, so a single machine usually supports tens of thousands or even hundreds of thousands of concurrent requests.
Therefore, like a cache system, it can support very high concurrency with very few resources; using it to absorb the asynchronizable portion of high-concurrency writes takes far fewer machines than pointing those writes directly at the database.
And after the middleware's peak shaving, if the database is written at a steady 100 per second, the write pressure the database receives becomes 500/s + 100/s = 600/s.
See how that relieves the database? So far, we have made the system architecture absorb the peak request pressure with as few machine resources as possible, reducing the burden on the database through the following means:
System-level clustering.
Database-level sharding + read-write separation.
A cache cluster for requests that read a lot and write little.
A message-middleware cluster for high write pressure.
That concludes this initial, simplified exposition of a high-concurrency system. But the story is far from over.
First, high concurrency is itself a very complex topic, far beyond what a few articles can cover; the essence is that real high-concurrency architectures supporting complex business scenarios are genuinely intricate.
For example: middleware systems with millions of concurrent requests per second, gateway systems with tens of billions of requests per day, flash-sale systems with hundreds of thousands of requests per second, and the architectures of large e-commerce platforms serving hundreds of millions of users.
To support such request volumes, system architects combine the specific business scenario and its characteristics with a variety of complex architectural designs, which demands deep underlying technical skill and refined architecture and mechanism design.
In the end, the architectural complexity of these systems far exceeds the imagination of most engineers who have never touched them.
And with architectures that complex, it is hard to make every detail and the evolution process clear in a few articles.
Second, high concurrency covers far more than the few topics mentioned in this article: sharding, caching, and messaging.
A complete, complex high-concurrency architecture must also include:
Various complex in-house infrastructure systems.
Various refined architectural designs (hot-key cache architecture, multi-priority high-throughput MQ architecture, full-link concurrency performance optimization, and so on).
The overall technical solution formed by composing many such complex systems.
Related technologies such as NoSQL (Elasticsearch, etc.), load balancing, and web servers.
So stay in awe of the technology; it is difficult to convey through a few articles.
Finally, when a system really lands in production, many technical problems surface under high-concurrency scenarios: message-middleware throughput falling short and needing optimization, excessive disk-write pressure hurting performance, memory consumption ballooning until it blows up, sharding middleware losing data for reasons unknown, and so on.
© 2024 shulou.com SLNews company. All rights reserved.