2025-01-19 Update From: SLTechnology News & Howtos (shulou.com), Servers
1. Introduction
Large-scale website architectures evolve gradually from small to large; a good architecture is not designed up front but iterated through business growth. I agree with this view. Being very interested in website architecture technology, I have followed and studied it closely. Using the development of large websites as a guide, this article analyzes the technical architecture patterns of large sites and the design of large-scale Internet architectures. Here we focus only on the evolution of the architecture.
Taking an e-commerce business as the example, the system's functions include a user module (registration and management), a commodity module (display and management), and a transaction module (creation and management). With illustrations, this article analyzes how a huge distributed system can grow out of a single LAMP stack.
Availability for online systems is commonly measured as reference metrics in "nines" of uptime, for example 99.9% or 99.99%.
2. The evolution of architecture
2.1. The first stage [building the website on a single machine]
In the early days, website programs were built with open-source frameworks such as maven+spring+struts+hibernate or maven+spring+springmvc+mybatis, and we often ran all programs and software on a single machine, typically an app server plus a DB server, such as Tomcat and MySQL, with the application connecting to the database through JDBC.
Generally, with 50,000 to 300,000 PV, combined with kernel parameter tuning, web application performance tuning, and database tuning, such a setup can run stably.
2.2. The second stage [separation of application server and database]
As the website develops and the number of visits gradually increases, server load rises, system pressure grows, and response times slow down. At this point it becomes obvious that the database and the application interfere with each other: when the database has problems, the application is also prone to problems, and vice versa. Before the server becomes overloaded, we should prepare in advance to improve the site's load capacity. If the code is already hard to optimize further, physically separating the application from the database, without upgrading a single machine's performance, is a good approach: it effectively improves the system's load capacity and is cost-effective. Splitting the database and the web server onto separate machines improves both single-machine load capacity and disaster-recovery capability.
As users, bandwidth demand, and CPU load all grow, a single machine can no longer satisfy all users or the site's business needs. At this stage we rarely (and should not) scale out directly in a load-balanced manner, because data synchronization would be very difficult. So rather than load-balancing the whole system as one unit, we split it by function. No new technology is required here, but you will find that it works: the system returns to its previous response speed, supports more traffic, and the database and the application no longer affect each other.
2.3. The third phase [application server load balancing]
As traffic continues to grow, a single application server can no longer meet demand. Assuming the database server is not yet under pressure, we can go from one application server to two or more, spreading user requests across different servers and thus increasing load capacity. At this point we need to choose an appropriate load-balancing product; generally, keepalived combined with ipvsadm is an excellent load balancer. This stage is where more foundational knowledge is needed. DNS resolution can also be used to balance user requests across servers.
We should pay attention to: the choice of load-balancing product, the choice of load-balancing algorithm, user session persistence, and sizing servers sensibly according to the resources each application consumes.
Solution: select products and technologies suited to the business according to the characteristics below. For example, one server running nginx+keepalived, one running apache+tomcat, and one running mysql.
The advantages and disadvantages of the three common load balancers are as follows (excerpted from the Internet):
Advantages of LVS:
1. Strong load resistance. Working at layer 4, it only distributes traffic and generates none itself; this characteristic gives it the strongest performance among load-balancing software, and handling no payload also keeps the balancer's I/O performance unaffected by heavy traffic.
2. Works stably and has mature active/standby HA schemes, such as LVS+Keepalived and LVS+Heartbeat.
3. Broad applicability: it can load-balance virtually any application.
4. Its configurability is low, which is both a disadvantage and an advantage: since there is little to configure, it needs little hands-on attention, which greatly reduces the chance of human error.
Disadvantages of LVS:
1. The software itself does not support regular-expression processing and cannot separate dynamic from static content, which is where Nginx/HAProxy+Keepalived shine.
2. If the website application is large, LVS/DR+Keepalived becomes complex to implement, configure, and maintain, especially with Windows Server machines behind it; Nginx/HAProxy+Keepalived is much simpler by comparison.
Advantages of Nginx:
1. Working at layer 7 of the OSI model, it can apply routing policies to HTTP traffic, for example by domain name or directory structure. Its regular-expression support is more powerful and flexible than HAProxy's.
2. Nginx depends very little on the network; in theory, it can balance load as long as it can ping the backends, which is an advantage.
3. Nginx is easy to install and configure, and easy to test.
4. It withstands high load, is stable, and can generally support tens of thousands of concurrent connections.
5. Nginx can detect server failures through signals such as the status codes and timeouts of pages the server returns, and will resubmit a request that returned an error to another node.
6. Nginx is not only an excellent load balancer / reverse proxy but also a powerful web application server. LNMP is a very popular web environment, comparable to LAMP, and Nginx has an advantage over Apache in serving static pages, especially under high concurrency.
7. Nginx is increasingly mature as a reverse-proxy web accelerator cache and is faster than the traditional Squid server; it is worth considering as a reverse-proxy accelerator.
Disadvantages of Nginx:
1. Nginx does not support URL-based health checks of backends.
2. Nginx supports only HTTP and email (mail proxy) protocols, which is a weakness.
3. Nginx's support for session persistence and cookie-based routing is relatively weak.
Advantages of HAProxy:
1. HAProxy supports virtual hosts and can work at layer 4 or layer 7 (supporting multiple network segments).
2. It makes up for some of Nginx's shortcomings, such as session persistence and cookie-based routing.
3. It supports URL-based health checks of backend servers.
4. Like LVS, HAProxy itself is only load-balancing software; in efficiency, HAProxy balances faster than Nginx and handles concurrency better.
5. HAProxy can load-balance MySQL reads, detecting and balancing backend MySQL nodes, though its performance falls behind LVS once there are more than about 10 MySQL slaves behind it.
6. HAProxy offers many scheduling algorithms, as many as 8.
Basic techniques of load balancing:
1. HTTP redirection. HTTP redirection is request forwarding at the application layer. The user's request first reaches the HTTP-redirect load-balancing server, which picks a real server according to its algorithm and tells the client to redirect; the browser then automatically re-requests the actual server's IP address.
Pros: simplicity
Cons: poor performance (every request costs two round trips)
2. DNS load balancing. When a user asks the DNS server for the IP address corresponding to the domain name, the DNS server resolves directly to one of the servers in the balanced pool.
Advantages: DNS is easy to use, and we do not need to maintain the balancing logic ourselves.
Disadvantages: DNS has no fault-detection capability and resolution results are cached, so the effect is mediocre. Most DNS load balancing is provided commercially, so functionality and management control are limited.
3. Reverse proxy server. A reverse proxy forwards requests at the HTTP protocol level, so this is also called application-layer load balancing. The user's request arrives at the reverse proxy, which forwards it to a specific server according to its algorithm. Apache and nginx are commonly used reverse-proxy software.
Advantages: easy to deploy.
Cons: the reverse proxy is the funnel for all requests and responses, so its performance can become a bottleneck, especially when large files are uploaded.
4. IP-layer load balancing. After the user's request reaches the load balancer, balancing is achieved by rewriting the request's destination IP address. A representative implementation is LVS NAT mode.
Advantages: better performance.
Disadvantages: the load balancer's bandwidth becomes the bottleneck, because responses pass back through it.
5. Data-link-layer load balancing. After the user's request reaches the load balancer, balancing is achieved by rewriting the request's MAC address. Unlike IP-layer balancing, the real server's response returns directly to the client without passing back through the balancer. A representative implementation is LVS DR mode.
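As a toy illustration of technique 1 (HTTP redirection), the Python sketch below picks a backend round-robin and builds the 302 response a redirect balancer would send; the backend addresses are hypothetical and a real balancer would of course run inside an HTTP server.

```python
from itertools import cycle

# Hypothetical pool of real servers sitting behind the redirect balancer.
BACKENDS = cycle(["http://10.0.0.11", "http://10.0.0.12"])

def redirect_response(path: str) -> tuple[int, dict]:
    """Build the answer an HTTP-redirect balancer sends: a 302 whose Location
    header points the browser at the chosen real server, which the browser
    then re-requests directly (hence the two round trips per request and
    the poor performance noted above)."""
    backend = next(BACKENDS)
    return 302, {"Location": backend + path}
```

Each call advances the round-robin cycle, so successive clients are spread across the pool.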
Common scheduling algorithms (excerpted from the Internet):
1. RR, round robin. As the name implies, requests are distributed in turn.
Pros: easy to implement
Disadvantages: does not consider each server's processing capacity
2. WRR, weighted round robin. Each server is assigned a weight, the balancer schedules servers according to their weights, and the number of times a server is chosen is proportional to its weight.
Advantages: takes the servers' differing processing capacities into account
3. SH, source-address hashing: take the user's IP, compute a key with a hash function, then look up the corresponding value, i.e. the target server's IP, in a static mapping table. If the target machine is overloaded, null is returned.
4. DH, destination-address hashing: the same, except that the destination IP address is hashed.
Advantages: both algorithms above ensure the same user reaches the same server.
5. LC, least connections. Requests are forwarded preferentially to the server with the fewest connections.
Advantages: makes the load across the cluster more even.
6. WLC, weighted least connections. LC with a weight added to each server. The formula is (active connections × 256 + inactive connections) / weight; the server with the smallest value is chosen first.
Pros: requests can be allocated according to each server's capability.
7. SED, shortest expected delay. SED is similar to WLC except that inactive connections are ignored. The formula is (active connections + 1) × 256 / weight; again, the server with the smallest value is chosen first.
8. NQ, never queue. An improvement on SED. Consider when we could "never queue": when a server's connection count is 0. So if any server has 0 connections, the balancer forwards the request to it directly, without the SED calculation.
9. LBLC, locality-based least connections. Based on the request's destination IP address, the balancer finds the server that IP address most recently used and forwards the request there; if that server is overloaded, the least-connections algorithm is used instead.
10. LBLCR, locality-based least connections with replication. Based on the request's destination IP address, the balancer finds the "server group" (note: not a single server) that the IP address recently used, then uses least connections to pick a specific server from the group and forwards the request. If that server is overloaded, it uses least connections to pick a server from the cluster that is not in the group, adds it to the group, and forwards the request.
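A few of the algorithms above can be sketched in Python. The server names, weights, and connection counts are invented for illustration; real balancers such as LVS implement these decisions in the kernel.

```python
import hashlib

def wrr_pick(weights: dict, i: int) -> str:
    """2. WRR: expand the server list by weight; request i takes slot
    i mod total-weight, so each server is chosen in proportion to its weight.
    (Production balancers interleave the slots more smoothly.)"""
    slots = [s for s, w in sorted(weights.items()) for _ in range(w)]
    return slots[i % len(slots)]

def sh_pick(client_ip: str, servers: list) -> str:
    """3. SH: hash the source IP so the same client always lands on
    the same server (this is also the basis of ip_hash stickiness)."""
    key = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[key % len(servers)]

def wlc_pick(stats: list) -> str:
    """6. WLC: score = (active*256 + inactive) / weight; lowest score wins."""
    return min(stats, key=lambda s: (s["active"] * 256 + s["inactive"]) / s["weight"])["name"]

def sed_pick(stats: list) -> str:
    """7. SED: score = (active+1)*256 / weight; inactive connections ignored."""
    return min(stats, key=lambda s: (s["active"] + 1) * 256 / s["weight"])["name"]
```

For example, `wrr_pick({"a": 2, "b": 1}, i)` yields "a" twice for every "b", and `sh_pick` is deterministic per client IP.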
Session persistence:
Session sticky
IP based
Cookie based
Session replication
Session server
1. Session Sticky. Always direct the same requester's connections to the same RS (which is still chosen by the scheduling algorithm on the first request). There is no fault tolerance, and it works against the balancing effect. A common implementation is the ip_hash method, i.e. the two hashing algorithms mentioned above.
Advantages: easy to implement.
Disadvantages: no high availability, and it hurts load balancing.
2. Session Replication. Sessions are synchronized among the RS, so every RS holds all the sessions in the cluster; this is not suitable for large cluster environments.
Advantages: reduces the pressure on the load-balancing server, which can then dispatch freely.
Disadvantages: heavy bandwidth and memory consumption; limited concurrency.
3. Session Server: a separately deployed server manages sessions centrally, decoupling sessions from the application servers.
Pros: compared with session replication, far less pressure on bandwidth and memory inside the cluster.
Cons: the store holding the sessions must itself be maintained (and kept highly available).
4. Cookie Based: the session is stored in a cookie, and the browser tells the application server what its session is; this also decouples the session from the application server.
Advantages: easy to implement, basically maintenance-free.
Disadvantages: cookie length limits, low security, and extra bandwidth consumption.
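A minimal sketch of option 3, the session server: the app servers keep no session state and consult a shared store by session ID, so the balancer may dispatch requests freely. The in-memory dict here stands in for a real store such as redis or memcached, and all names are illustrative.

```python
import uuid

class SessionServer:
    """Stand-in for a dedicated session store: app servers hold no state."""
    def __init__(self):
        self._store = {}

    def create(self, data: dict) -> str:
        sid = uuid.uuid4().hex          # session ID handed back to the browser
        self._store[sid] = dict(data)
        return sid

    def get(self, sid: str) -> dict:
        return self._store.get(sid)

def handle_request(app_server_name: str, sessions: SessionServer, sid: str) -> dict:
    """Any app server can serve the request: the session lives in the shared
    store, so no stickiness is required at the balancer."""
    session = sessions.get(sid)
    return {"served_by": app_server_name, "user": session["user"]}
```

Two different app servers serving the same session ID see the same user, which is exactly the decoupling the text describes.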
Applications classified by the resources they occupy:
1. CPU-bound [CPU-intensive]: generally the app server.
2. IO-bound [IO-intensive]: generally the DB server.
With the theory above in hand, and according to the business, we can evolve the architecture into the following solution:
2.4. The fourth stage [database read-write separation]
In the previous stages we assumed the back-end database could bear the load, but as visits increase, the database load gradually rises. Someone may immediately think of splitting the database into two and load balancing them like the application servers, but for a database it is not that simple: naively splitting it in two and distributing database requests between machine A and machine B would obviously leave the two databases inconsistent. In this situation we can first consider read-write separation.
Note the problems a MySQL master/slave setup inevitably faces: data replication (and its lag), and the application's choice of data source.
Solution: use MySQL's built-in master/slave replication, and a third-party database middleware such as Mycat. Mycat grew out of Cobar and is currently a well-regarded open-source MySQL sharding (sub-database/sub-table) middleware in China.
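A rough sketch of the data-source choice in Python: route writes to the master and spread reads over the slaves. Real middleware such as Mycat parses SQL properly; this naive verb check and the server names are only illustrative.

```python
import itertools

class ReadWriteRouter:
    """Send writes to the master, round-robin reads over the slaves."""
    def __init__(self, master: str, slaves: list):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def route(self, sql: str) -> str:
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in ("SELECT", "SHOW"):
            return next(self._slaves)   # reads: spread over replicas
        return self.master              # writes and DDL: master only
```

Note that because of replication lag, reads that must see a just-completed write ("read your own writes") may still need to be routed to the master.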
2.5. The fifth stage [introducing search engine to realize full-text search]
MySQL's support for full-text search is weak: historically it supported full-text indexes only on the MyISAM engine, which lacks transactions, while the InnoDB engine had no full-text indexing (it gained it only in MySQL 5.6). By the time a site reaches this stage, search volume is very large; for an e-commerce site, some 70% of transactions start from search, so this growing demand is more than the database can cope with, since every search is a comprehensive query. Fuzzy search is often inadequate, and even read-write separation does not solve the problem. To meet this demand we generally introduce a search engine to relieve read pressure on the database.
A search engine cannot replace the database; it solves the "read" problem in certain scenarios. Whether to introduce one must be weighed against the needs of the whole system. The system structure after introducing a search engine:
2.6. Phase 6 [introduction of caching]
1. Page caching
When the web server is under heavy pressure and only part of the site's content can be cached, we can use ESI (Edge Side Includes) dynamic caching. A web cache server can also cache pages and assets in jpg, jpeg, gif, png, html, css, and js formats. Examples: varnish, squid.
2. Data caching
As visits grow, many users come to access the same content. For such popular content there is no need to hit the database every time; we can use caching. Examples: memcached, redis.
Advantages: reduce the pressure on the database and greatly improve the access speed
Disadvantages: need to maintain cache server and increase coding complexity
Scheduling for a cache server cluster differs from the application servers and databases discussed above: it is best to use a consistent hashing algorithm, so that adding or removing a cache node remaps only a small fraction of the keys.
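A compact consistent-hash ring with virtual nodes, as a sketch of the algorithm recommended for cache clusters; the node names are hypothetical. The point is that adding a cache node moves only the keys that now belong to it, leaving everything else in place.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hashing with virtual nodes: each server owns many points
    on the ring, and a key maps to the first point clockwise from its hash."""
    def __init__(self, nodes: list, vnodes: int = 100):
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            self.add(node, vnodes)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node: str, vnodes: int = 100) -> None:
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get(self, key: str) -> str:
        h = self._hash(key)
        i = bisect.bisect(self._ring, (h, ""))   # first ring point >= h
        return self._ring[i % len(self._ring)][1]
```

Adding a fourth node reassigns only the keys whose ring segment the new node takes over; with plain `hash(key) % n` nearly every key would move.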
2.7. Phase 7 [database split]
When visits keep increasing and adding machines to a single application no longer helps much, we must consider splitting the database along business lines to improve system efficiency. Up to this point, transaction, commodity, and user data still live in the same database. Even with caching and read-write separation, the pressure on the database keeps growing and its bottleneck becomes more and more prominent; once write pressure on the master is the constraint, the only option is to split the database. There are two choices: vertical split and horizontal split.
1. Vertical split: split the data of different businesses in the database into different database servers.
2. Horizontal split: split the data in a single table to several different database servers.
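The two split styles can be sketched as follows; the database names and the simple modulo placement rule are illustrative only (range- and hash-based placement are also common).

```python
# Vertical split: each business module of the e-commerce example gets its
# own database server (names hypothetical).
VERTICAL_SPLIT = {
    "user": "db_user",
    "commodity": "db_commodity",
    "trade": "db_trade",
}

def shard_for(user_id: int, n_shards: int = 4) -> str:
    """Horizontal split: rows of one huge logical table (e.g. users) are
    spread over n_shards physical databases by a placement rule."""
    return f"user_db_{user_id % n_shards}"
```

Vertical splitting routes whole modules to different servers; horizontal splitting routes rows of one table, which is why cross-shard queries and joins become the new difficulty.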
2.8. Phase 8 [apply split]
As the business develops there are more and more services and applications, and we need to think about how to keep the application from becoming bloated. This means taking the application apart, turning one application into two or more. Continuing our example, we can separate users, goods, and transactions into two subsystems: "user + commodity" and "user + transaction".
Note that after this split the two systems may share some identical code: for example, both goods and transactions need user information, so the same user-handling code is kept in both systems. How to make that code reusable is a problem that needs solving.
Solution: take the service-oriented route. Extract the shared, public services into a service layer, forming a service-oriented architecture, or SOA.
The system structure after service:
Needs attention: how to make remote service invocations.
Solution: we can solve this by introducing message middleware.
2.9. Phase 9 [introduction of message middleware]
As the website keeps growing, our system may contain submodules written in different languages and subsystems deployed on different platforms. We then need a platform for transmitting reliable, platform- and language-independent data, for making load balancing transparent, and for collecting and analyzing call data during invocation, from which we can infer things such as the site's traffic growth and predict how it should evolve. On the open-source side, Alibaba's Dubbo (strictly a distributed service/RPC framework rather than message middleware) can be used together with ZooKeeper, the Apache open-source distributed coordination service, to implement service registration and discovery.
The structure after introducing message middleware:
3. High performance design
Generally speaking, a three-tier architecture divides the whole business application into three layers: the presentation layer (user interface), the business logic layer, and the data access layer. The purpose of layering is the idea of "high cohesion, low coupling". Layered structure is the most common and most important structure in software architecture design; the layering Microsoft recommends is, from bottom to top: data access layer, business logic layer (or domain layer), presentation layer. (from Baidu Encyclopedia)
Features of microservices:
Single responsibility: each microservice corresponds to a unique business capability, achieving single responsibility.
Micro: the granularity of service splitting is very small; for example, user management alone can be a service. Each service is small but complete.
Service orientation: each service exposes its capability through a service API. Callers need not care about the service's technical implementation, which makes it platform- and language-agnostic and not tied to any technology, as long as a REST interface is provided.
Autonomy: services are independent of each other and do not interfere with one another.
Team independence: each service has its own development team, which should not be too large.
Technology independence: because interaction is service-oriented through REST interfaces, nobody dictates which technology each team uses.
Separation of front and back ends: the front end and back end are developed separately against a unified REST interface; the back end no longer develops different interfaces for PC and mobile.
Database separation: each service uses its own data source.
Independent deployment: although services call one another, restarting one service must not affect the others. This is conducive to continuous integration and continuous delivery. Each service is an independent component: reusable, replaceable, loosely coupled, and easy to maintain.
Architecture optimization:
User-centric, providing a fast web experience. The main indicators are short response time, high concurrent processing capacity, high throughput, and stable performance.
It can be divided into front-end optimization, application layer optimization, code layer optimization and storage layer optimization.
Front-end optimization: the part before the business logic of the website
Browser optimization: reduce the number of HTTP requests, use browser caching, enable compression, place CSS at the top and JS at the bottom of the page, load JS asynchronously, and reduce cookie transfers.
CDN acceleration, reverse proxy
Application-layer optimization: the servers that handle the site's business. Use caching, asynchrony, and clustering.
Code optimization: sound architecture, multithreading, resource reuse (object pools, thread pools, etc.), good data structures, JVM tuning, singletons, caches, etc.
Storage optimization: caching, solid-state disks, fiber transmission, optimized reads and writes, disk redundancy, distributed storage (HDFS), NoSQL, etc.
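As a sketch of the "resource reuse" item above, a minimal object pool: expensive objects (connections, buffers, parsers) are created once up front and borrowed and returned instead of being re-created per request. The `factory` and sizes are illustrative.

```python
import queue

class ObjectPool:
    """Pre-create `size` objects and hand them out on demand;
    the thread-safe queue doubles as the free list."""
    def __init__(self, factory, size: int):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(factory())   # pay the creation cost once, up front

    def acquire(self):
        return self._q.get()         # blocks if the pool is exhausted

    def release(self, obj) -> None:
        self._q.put(obj)             # return the object for reuse
```

A released object is handed back out on the next acquire, so the creation cost is amortized over many requests; the same idea underlies JDBC connection pools and thread pools.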