
A case study on the Evolution of web Architecture

2025-02-14 Update From: SLTechnology News&Howtos



1. Overview

Taking Taobao as an example, this article walks through the evolution of a server architecture from roughly 100 concurrent users to tens of millions, and lists the technologies encountered at each stage, so that readers can form an overall picture of how architectures evolve. The article closes with some principles of architecture design.

2. Basic concepts

Before introducing the architecture, here are a few of the most basic concepts, in case some readers are unfamiliar with them:

Distributed: multiple modules of a system are deployed on different servers. For example, Tomcat and the database deployed on different servers, or two Tomcat instances with the same function deployed on different servers, both constitute a distributed system.

High availability: if some nodes in the system fail and the remaining nodes can take over and continue to provide service, the system is considered highly available.

Cluster: a specific piece of software deployed on multiple servers that provides one class of service as a whole. For example, the Master and Slave of Zookeeper are deployed on multiple servers and together provide a centralized configuration service. In a typical cluster, clients can connect to any node to obtain the service, and when one node goes offline, the other nodes automatically take over and continue serving, which is what makes the cluster highly available.

Load balancing: when requests sent to the system are evenly distributed to multiple nodes in some way, so that each node handles an even share of the request load, the system is considered load balanced.

Forward proxy and reverse proxy: when the system accesses the external network, requests are forwarded through a proxy server, so that from the outside they appear to be initiated by the proxy server; the proxy server is then acting as a forward proxy. When an external request enters the system and the proxy server forwards it to some server inside, so that the outside only ever interacts with the proxy server, the proxy server is acting as a reverse proxy. In short, a forward proxy accesses the external network on behalf of the internal system, and a reverse proxy forwards external requests to internal servers.

3. Architecture evolution

3.1. Stand-alone architecture

Take Taobao as an example. At the beginning of the site, the number of applications and users is small, so Tomcat and the database can be deployed on the same server. When a browser sends a request to www.taobao.com, the DNS (Domain Name System) server first resolves the domain name to the actual IP address 10.102.4.1, and the browser then accesses the Tomcat at that IP.

As the number of users grows, Tomcat and the database compete for resources, and single-machine performance can no longer support the business.

3.2. The first evolution: Tomcat and the database are deployed separately

Tomcat and the database each monopolize the resources of their own server, which significantly improves their respective performance.

As the number of users grows, concurrent reads and writes to the database become the bottleneck.

3.3. The second evolution: introduce a local cache and a distributed cache

Add a local cache on the Tomcat server (within the same JVM) and add a distributed cache externally, caching popular product information, the HTML pages of popular products, and so on. Through caching, most requests can be intercepted before they reach the database, greatly reducing database pressure. The technologies involved include memcached as the local cache and Redis as the distributed cache, as well as issues such as cache consistency, cache penetration/breakdown, cache avalanche, and the expiration of hot data sets.
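The interception the text describes can be sketched with a minimal cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache with a TTL. The in-memory dict stands in for Redis and the `db` dict for the product database; all names here are hypothetical.

```python
import time

class CacheAside:
    """Minimal cache-aside sketch: cache hit -> serve from cache;
    miss -> read the database and populate the cache with a TTL."""

    def __init__(self, db, ttl_seconds=60):
        self.db = db                # stand-in for the real database
        self.ttl = ttl_seconds
        self.cache = {}             # stand-in for Redis: key -> (value, expires_at)
        self.db_reads = 0           # how many requests actually reached the DB

    def get_product(self, product_id):
        entry = self.cache.get(product_id)
        if entry is not None and entry[1] > time.time():
            return entry[0]         # cache hit: the database is not touched
        self.db_reads += 1
        value = self.db[product_id]  # cache miss: read from the database
        self.cache[product_id] = (value, time.time() + self.ttl)
        return value

db = {"p1": "hot product page html"}
store = CacheAside(db)
store.get_product("p1")   # miss -> one DB read
store.get_product("p1")   # hit  -> served from cache
print(store.db_reads)     # prints 1: the repeat request never hit the DB
```

The same shape applies whether the cache is an in-process map (local cache) or a Redis instance shared by many Tomcat servers (distributed cache); only the storage behind `self.cache` changes.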

The cache absorbs most access requests. As the number of users grows, the concurrency pressure falls mainly on the single Tomcat instance, and responses gradually slow down.

3.4. The third evolution: introduce a reverse proxy to achieve load balancing

Deploy Tomcat on multiple servers and use reverse proxy software (Nginx) to distribute requests evenly across the Tomcat instances. Assume here that a single Tomcat supports up to 100 concurrent requests and Nginx supports up to 50,000; in theory, Nginx can then handle 50,000 concurrent requests by distributing them to 500 Tomcat instances. The technologies involved include Nginx and HAProxy, both reverse proxies working at layer 7 of the network stack and mainly supporting the HTTP protocol, as well as session sharing and file upload/download.
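The even distribution Nginx performs can be illustrated with a round-robin sketch: one entry point spreads incoming requests over many backends, one per request in turn. The backend names are hypothetical; real Nginx upstream selection also supports weights and health checks that this toy omits.

```python
import itertools

class RoundRobinProxy:
    """Sketch of the reverse proxy's load balancing: requests arriving at one
    entry point are spread evenly over a pool of Tomcat backends."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)  # endless round-robin iterator

    def route(self, request):
        # Each request is handed to the next backend in turn.
        return next(self._cycle)

proxy = RoundRobinProxy(["tomcat-1", "tomcat-2", "tomcat-3"])
targets = [proxy.route(f"req-{i}") for i in range(6)]
print(targets)  # each of the 3 backends receives exactly 2 of the 6 requests
```

With 500 backends behind the proxy, each one sees roughly 1/500 of the traffic, which is how 100-concurrency Tomcat instances add up to the 50,000 the proxy can accept.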

The reverse proxy greatly increases the concurrency the application servers can support, but growing concurrency also means more requests penetrating through to the database, and the single-machine database eventually becomes the bottleneck.

3.5. The fourth evolution: database read-write separation

Split the database into a write library and read libraries; there can be multiple read libraries, and data written to the write library is synchronized to the read libraries through a replication mechanism. For scenarios that must query data immediately after it is written, an extra copy can be written to the cache so that the latest data is obtained from there. The technologies involved include Mycat, a database middleware that can organize read-write separation as well as sharding (sub-databases and sub-tables); clients access the underlying databases through it. Data synchronization and data consistency issues are also involved.
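The routing decision a middleware like Mycat makes can be sketched as: writes go to the single write (master) library, reads are spread over the read replicas. The statement classification below is deliberately naive and all connection names are hypothetical.

```python
import random

class ReadWriteRouter:
    """Sketch of read-write separation: route each SQL statement either to
    the write library or to one of the read replicas."""

    def __init__(self, write_db, read_dbs):
        self.write_db = write_db    # the single master that accepts writes
        self.read_dbs = read_dbs    # replicas kept in sync from the master

    def route(self, sql):
        # Naive classification: anything that is not a SELECT is a write.
        if sql.strip().lower().startswith("select"):
            return random.choice(self.read_dbs)  # spread reads over replicas
        return self.write_db

router = ReadWriteRouter("master", ["replica-1", "replica-2"])
print(router.route("INSERT INTO orders VALUES (1)"))  # -> master
print(router.route("SELECT * FROM orders"))           # -> one of the replicas
```

The replication lag between master and replicas is exactly why the text suggests writing an extra copy to the cache for read-your-own-writes scenarios.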

The number of businesses keeps growing, and the number of visits varies widely between them. Different businesses compete directly for the same database and affect each other's performance.

3.6. The fifth evolution: split the database by business

Store the data of different businesses in different databases, reducing resource competition between businesses. For businesses with heavy traffic, more servers can be deployed to support them. As a side effect, cross-business tables can no longer be joined directly for correlation analysis, which has to be solved by other means; that is not the focus of this article, and interested readers can search for solutions themselves.

As the number of users grows, writes to a single library gradually reach its performance bottleneck.

3.7. The sixth evolution: splitting a large table into smaller tables

For example, comment data can be hashed by commodity ID and routed to the corresponding table for storage; payment records can be split into tables by hour, with each hourly table further divided into small tables, using the user ID or record number to route the data. As long as the amount of data in each table stays small enough and requests can be distributed evenly across the small tables on multiple servers, the database can scale horizontally to improve performance. The Mycat mentioned earlier also supports access control when a large table is split into small tables.
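The hash routing for comment data can be sketched as follows. A stable hash (here `zlib.crc32`, chosen because Python's built-in `hash` is not stable across runs) maps each commodity ID to one of a fixed number of sub-tables; the table names and shard count are hypothetical.

```python
import zlib

NUM_SHARDS = 4  # hypothetical number of comment sub-tables

def comment_table_for(product_id: str) -> str:
    """Route a comment row to a sub-table by hashing the commodity ID.
    The same product always maps to the same table, so all of its
    comments can be read back from a single small table."""
    shard = zlib.crc32(product_id.encode("utf-8")) % NUM_SHARDS
    return f"comment_{shard}"

# All comments for one product land in the same small table:
print(comment_table_for("product-42") == comment_table_for("product-42"))  # True
```

The same idea applies to the hourly payment tables: the coarse split is by time, and within each hour the user ID or record number plays the role of `product_id` here. Note that changing `NUM_SHARDS` remaps existing keys, which is why shard counts are usually fixed up front or handled with consistent hashing.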

This practice significantly increases the difficulty of database operation and maintenance and places higher demands on DBAs. When the database reaches this structure, it can already be called a distributed database, but it is only a logical database as a whole; its parts are implemented by different components. For example, the management of sub-databases and sub-tables and the distribution of requests are handled by Mycat, SQL parsing is done by the stand-alone databases, read-write separation may be implemented by gateways and message queues, the aggregation of query results may be implemented by the database interface layer, and so on. This architecture is in fact one implementation of the MPP (massively parallel processing) architecture.

At present there are many MPP databases, both open source and commercial. Open-source examples include Greenplum, TiDB, PostgreSQL XC, and HAWQ; commercial ones include GBase from Nanda General Data, Snowball DB from Ruifan Technology, Huawei's LibrA, and so on. Different MPP databases have different emphases: TiDB, for example, focuses on distributed OLTP scenarios, while Greenplum focuses on distributed OLAP. These MPP databases generally provide SQL-standard support comparable to PostgreSQL, Oracle, or MySQL: they can parse a query into a distributed execution plan, distribute it to the machines for parallel execution, and have the database itself aggregate and return the results. They also provide capabilities such as permission management, sharding, transactions, and data replicas, and most support clusters of more than 100 nodes, greatly reducing database operation and maintenance costs and allowing the database to scale horizontally.

Both the database and Tomcat can now scale horizontally, and the supported concurrency increases greatly. As the number of users grows, the single-machine Nginx eventually becomes the bottleneck.

3.8. The seventh evolution: use LVS or F5 to load balance multiple Nginx

Because the bottleneck is Nginx itself, multiple Nginx instances cannot be load balanced by yet another layer of Nginx. The LVS and F5 in the figure are load-balancing solutions working at layer 4 of the network stack. LVS is software that runs in the operating system's kernel state and can forward TCP requests and higher-level protocols; it supports a richer set of protocols than Nginx and has much higher performance. A single LVS machine can be assumed to support several hundred thousand concurrent request forwardings. F5 is load-balancing hardware with capabilities similar to LVS but higher performance, and it is expensive. Because LVS runs on a single machine, if the server hosting it goes down, the entire back-end system becomes inaccessible, so a backup node is required. The keepalived software can simulate a virtual IP and bind it to multiple LVS servers: when a browser accesses the virtual IP, the router redirects it to the real LVS server, and when the main LVS server goes down, keepalived automatically updates the router's routing table and redirects the virtual IP to another healthy LVS server, achieving high availability for the LVS layer.

Note that in the figure above, the connections from the Nginx layer to the Tomcat layer do not mean that every Nginx forwards requests to every Tomcat. In practice, a few Nginx instances (made highly available through keepalived) may sit in front of one group of Tomcat servers while other Nginx instances front another group, multiplying the number of Tomcat servers that can be reached.

Because LVS is also a single machine, as concurrency grows to the hundreds of thousands the LVS server eventually reaches its own bottleneck. By this point the number of users has reached tens of millions or even hundreds of millions. Users are distributed across different regions at different distances from the server room, so access latency varies significantly.

3.9. The eighth evolution: load balancing between computer rooms through DNS polling

A domain name can be configured in the DNS server to map to multiple IP addresses, each corresponding to a virtual IP in a different computer room. When a user accesses www.taobao.com, the DNS server uses a round-robin policy or another policy to pick an IP for that user, achieving load balancing between computer rooms. The system can now scale horizontally at the computer-room level: concurrency on the order of tens of millions to hundreds of millions can be handled by adding computer rooms, and request concurrency at the entrance of the system is no longer a problem.
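DNS polling can be sketched the same way: one name maps to several virtual IPs (one per computer room), and successive lookups rotate through them. The domain and IPs below are hypothetical; real DNS also involves TTLs, caching resolvers, and geo-aware policies that this toy ignores.

```python
import itertools

class ToyDNS:
    """Sketch of DNS round-robin: each lookup of a name returns the next
    virtual IP from its list, spreading users across computer rooms."""

    def __init__(self, records):
        # records: domain name -> list of virtual IPs, one per computer room
        self._cycles = {name: itertools.cycle(ips) for name, ips in records.items()}

    def resolve(self, name):
        return next(self._cycles[name])

dns = ToyDNS({"www.example.com": ["10.0.1.1", "10.0.2.1"]})
print(dns.resolve("www.example.com"))  # 10.0.1.1 (room 1)
print(dns.resolve("www.example.com"))  # 10.0.2.1 (room 2)
```

Each returned virtual IP is the keepalived-managed entry point of one room's LVS layer from the previous section, so adding a room is just adding one more IP to the record.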

As data becomes richer and the business develops, retrieval and analysis requirements become more and more diverse, and the database alone cannot satisfy them.

3.10. The ninth evolution: introduce NoSQL databases and search engines

When there is a large amount of data in the database, the database is not well suited to complex queries and often handles only ordinary ones. For statistical report scenarios, results may not come back at all when the data volume is large, and running complex queries slows down other queries. For scenarios such as full-text retrieval and variable data structures, the database is inherently unsuitable, so an appropriate solution should be introduced for each specific scenario: for example, the distributed file system HDFS for massive file storage, HBase and Redis for key-value data, search engines such as Elasticsearch for full-text retrieval, and Kylin or Druid for multidimensional analysis.

Of course, introducing more components also increases the complexity of the system: data stored in different components must be synchronized, consistency must be considered, and more operation and maintenance means are needed to manage them.

With more components introduced to meet the rich requirements, the business dimensions expand greatly. As a result, a single application contains too much business code, and business upgrades and iteration become difficult.

3.11. The tenth evolution: split the large application into smaller applications

Divide the application code by business block, so that each application has clearer responsibilities and can be upgraded and iterated independently. Applications may share some common configuration, which can be handled through the distributed configuration center Zookeeper.

Different applications share common modules, and managing them separately within each application means the same code exists in multiple copies, so upgrading a common function forces all applications to upgrade their code.

3.12. The eleventh evolution: extract reusable functions into microservices

For example, if user management, orders, payment, authentication, and other functions exist in multiple applications, the code for these functions can be extracted into separate services to manage; these are the so-called microservices. Applications and services access the public services through HTTP, TCP, or RPC requests, and each individual service can be managed by its own team. In addition, frameworks such as Dubbo and Spring Cloud can provide service governance, rate limiting, circuit breaking, degradation, and other functions to improve the stability and availability of the services.
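One of the protections mentioned above, circuit breaking, can be sketched in a few lines: after a run of consecutive failures the breaker "opens" and callers fail fast instead of piling requests onto an ailing service. This is a minimal illustration of the idea, not the actual Dubbo or Spring Cloud mechanism; names and thresholds are illustrative.

```python
class CircuitBreaker:
    """Minimal circuit-breaker sketch: after `max_failures` consecutive
    failures the breaker opens and calls fail fast; any success closes it."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.max_failures

    def call(self, fn):
        if self.open:
            # Fail fast without touching the downstream service.
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1      # another consecutive failure
            raise
        self.failures = 0           # a success resets the count
        return result

breaker = CircuitBreaker(max_failures=2)

def flaky():
    raise IOError("payment service down")

for _ in range(2):
    try:
        breaker.call(flaky)
    except IOError:
        pass
print(breaker.open)  # True: further calls now fail fast
```

Real implementations add a half-open state that periodically lets one probe request through so the breaker can close again once the service recovers.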

Different services expose different interface access methods, and the application code has to adapt to all of them in order to use the services. In addition, applications access services and services access each other, so the call chain becomes very complex and the logic becomes confusing.

3.13. The twelfth evolution: introduce an enterprise service bus (ESB) to hide differences between service interfaces

The ESB performs unified access-protocol conversion: applications access back-end services uniformly through the ESB, and services call each other through it as well, reducing the coupling of the system. This architecture, in which a single application is split into multiple applications, public services are extracted for separate management, and an enterprise message bus decouples the services, is the so-called SOA (service-oriented architecture). It is easily confused with the microservice architecture because the two look very similar. In my personal understanding, microservice architecture refers more to the idea of extracting public services from the system for separate operation and management, while SOA refers to an architectural idea of splitting services and unifying access to service interfaces; SOA thus subsumes the idea of microservices.

As the business continues to develop, the number of applications and services keeps growing, and their deployment becomes more complex. Moreover, for scenarios that require dynamic scaling, such as big promotions, where service capacity must be expanded horizontally, operations staff must prepare the runtime environment and deploy the services on the new machines, which becomes very difficult.

3.14. The thirteenth evolution: introduce containerization to isolate the runtime environment and manage services dynamically

At present, the most popular containerization technology is Docker, and the most popular container management service is Kubernetes (K8S). Applications/services can be packaged as Docker images, which K8S can distribute and deploy dynamically. A Docker image can be understood as a minimal operating system that can run your application/service: it contains the service's running code, with the runtime environment set up according to actual needs. After the whole "operating system" is packaged as an image, it can be distributed to whichever machines need to deploy the service, and starting the image directly brings the service up, making deployment and operation much easier.

Before a big promotion, servers in the existing machine cluster can be set aside to start Docker images and boost service capacity; after the promotion, the images can be shut down without affecting the other services on those machines. (Before section 3.14, running a service on a new machine required modifying the machine's system configuration to suit the service, which would damage the runtime environment needed by other services on that machine.)

Containerization solves the problem of dynamically expanding and shrinking service capacity, but the machines themselves still have to be managed by the company. Outside of big promotions, a large number of idle machines is still needed to cope with them; the cost of the machines and of operations is extremely high, and resource utilization is low.

3.15. The fourteenth evolution: host the system on a cloud platform

The system can be deployed on a public cloud, using its massive machine resources to solve the problem of dynamic hardware resources. During a big promotion, temporarily apply for more resources on the cloud platform and combine Docker and K8S to deploy services quickly; release the resources after the promotion ends. This is true pay-as-you-go, greatly improving resource utilization and greatly reducing operation and maintenance costs.

A so-called cloud platform abstracts massive machine resources into a single resource pool through unified resource management. Hardware resources (such as CPU, memory, and network) can be dynamically applied for on demand on top of it, and the platform provides a general-purpose operating system, common technology components (such as the Hadoop stack and MPP databases), and even fully developed applications. Users can meet their needs (such as audio/video transcoding, e-mail services, personal blogs) without caring what technologies are used inside the application. Cloud platforms involve the following concepts:

IaaS: Infrastructure as a Service. Corresponds to the machine resources above: machine resources are unified into a whole, and hardware resources can be applied for dynamically.

PaaS: Platform as a Service. Corresponds to providing the common technology components above, facilitating the development and maintenance of systems.

SaaS: Software as a Service. Corresponds to providing the fully developed applications or services above, paid for according to functional or performance requirements.
