Example Analysis of Building a High-Performance Internet Web System

2025-07-04 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article presents an example analysis of building a high-performance Internet web system. The content is simple, clearly organized, and intended to help answer common questions on the topic.

Since the Internet took off, applications of every kind have emerged, some serving hundreds of millions of users, so knowing how to build a high-performance, highly reliable application system matters to every developer. This article summarizes methods I have learned and used at work, in the hope that they help others find solutions quickly when they run into similar problems. I mainly use Java, so unless stated otherwise, everything below assumes Java.

Key to high performance

To achieve high performance, I summarized three points:

Caching: DNS cache, database cache, distributed cache

Splitting: business splitting, database splitting

Asynchrony: network asynchrony, disk asynchrony, messaging

These three techniques apply widely: wherever you hit a performance bottleneck, keep them in mind and most of the time you will find a solution. The following sections describe how they apply across the overall architecture.

Stateless services

To understand stateless services, start with stateless objects. A stateless object can be understood simply as an object without fields. Model/entity objects are not stateless, because they contain fields. In a typical MVC application, the **Controller and **Service classes are stateless: they contain only methods. Some framework objects are stateful, such as the Action classes of the Struts 2 framework, which is one reason Struts 2 is used less now. Stateless objects make stateless services possible: with no stateful objects in the request chain, each request is independent, and this architecture makes it easier to scale services.
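The distinction can be sketched in a few lines of Java. All class names here (Order, PriceService, StatefulPriceAction) are hypothetical and exist only for illustration; they are not taken from any framework.

```java
// Hypothetical entity: it carries fields, so it is stateful by the definition above.
final class Order {
    final long id;
    final long amountInCents;

    Order(long id, long amountInCents) {
        this.id = id;
        this.amountInCents = amountInCents;
    }
}

// Stateless service: no instance fields, only methods. Every result depends
// solely on the arguments, so any instance on any machine can serve any request.
final class PriceService {
    long totalWithTax(Order order, int taxPercent) {
        return order.amountInCents + order.amountInCents * taxPercent / 100;
    }
}

// Stateful counter-example: the field carries data from one call to the next,
// so the result depends on which instance handled the earlier calls.
final class StatefulPriceAction {
    private long runningTotal; // per-instance state

    long add(Order order) {
        runningTotal += order.amountInCents;
        return runningTotal;
    }
}
```

Calling PriceService twice with the same order gives the same answer both times, while StatefulPriceAction returns a different value on each call. That difference is what makes the first kind of object safe to replicate across machines.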

Stateless services still inevitably run into state, the most common case being the session. Because HTTP requests are themselves stateless, cookies and sessions are used together to tie multiple HTTP requests to the same user. There are usually two ways to handle this:

Store the session data in cookies

Use a distributed session service

The first approach stores all the session information in the cookie, and the server reads it back using the corresponding algorithm. This information is usually encrypted.
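As a rough sketch of the cookie approach, the class below encrypts session data into a cookie-safe string and decrypts it on the next request. The article does not say which algorithm to use, so AES-GCM from the standard javax.crypto API is an assumption here, and the class name CookieSessionCodec is hypothetical. Key management is out of scope.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: turns session data into an encrypted cookie value and back.
final class CookieSessionCodec {
    private static final int IV_BYTES = 12;  // nonce size commonly used with GCM
    private static final int TAG_BITS = 128; // authentication tag length
    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();

    CookieSessionCodec(SecretKey key) {
        this.key = key;
    }

    String encode(String sessionData) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv); // fresh nonce for every cookie
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(sessionData.getBytes(StandardCharsets.UTF_8));
            // Pack nonce + ciphertext so the server needs no other state to decode.
            byte[] packed = ByteBuffer.allocate(iv.length + ciphertext.length)
                    .put(iv).put(ciphertext).array();
            return Base64.getUrlEncoder().withoutPadding().encodeToString(packed);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cookie encryption failed", e);
        }
    }

    String decode(String cookieValue) {
        try {
            byte[] packed = Base64.getUrlDecoder().decode(cookieValue);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, packed, 0, IV_BYTES));
            // doFinal fails if the cookie was modified, because GCM authenticates the data.
            byte[] plain = cipher.doFinal(packed, IV_BYTES, packed.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cookie rejected", e);
        }
    }
}
```

Every server that shares the key can decode the cookie, so no server has to remember the session. The trade-off is that cookies are size-limited and travel with every request, so only small session data fits.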

The second approach stores the session in a distributed database or distributed cache, usually Redis or Memcached. How far this scales depends on the capacity of the third-party database or cache. Taobao has similar components, and the open source world also has distributed session implementations based on Memcached and Redis.
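A minimal sketch of the second approach follows. SessionStore, InMemorySessionStore, and AppServer are hypothetical names. The in-memory map only stands in for Redis or Memcached so that the example runs by itself; a real deployment would implement the same interface with a client for the shared store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over the shared store (Redis, Memcached, ...).
interface SessionStore {
    void save(String sessionId, Map<String, String> attributes);

    Map<String, String> load(String sessionId); // null when the session is unknown
}

// Stand-in for the remote store, used only to keep the sketch self-contained.
final class InMemorySessionStore implements SessionStore {
    private final Map<String, Map<String, String>> data = new ConcurrentHashMap<>();

    public void save(String sessionId, Map<String, String> attributes) {
        data.put(sessionId, new HashMap<>(attributes));
    }

    public Map<String, String> load(String sessionId) {
        Map<String, String> found = data.get(sessionId);
        return found == null ? null : new HashMap<>(found);
    }
}

// An application server keeps no session data of its own; it only talks to the store.
final class AppServer {
    private final SessionStore store;

    AppServer(SessionStore store) {
        this.store = store;
    }

    String login(String userName) {
        String sessionId = UUID.randomUUID().toString();
        Map<String, String> attributes = new HashMap<>();
        attributes.put("user", userName);
        store.save(sessionId, attributes);
        return sessionId; // sent back to the browser in a cookie
    }

    String currentUser(String sessionId) {
        Map<String, String> attributes = store.load(sessionId);
        return attributes == null ? null : attributes.get("user");
    }
}
```

Because both servers read from the same store, a user who logs in through one AppServer instance is recognized by any other instance, which is what lets the load balancer send requests anywhere.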

Stateless services use splitting and caching

Business splitting

Statelessness lets application services scale horizontally, but when a single application grows too large and bloated, it needs to be split. Vertical splitting means splitting by business: in an e-commerce system, for example, into an order system, a points system, and so on. Splitting makes each part easier to develop and easier to scale. Once the system is large, each business sees a different volume of traffic. The buyer system, for example, typically gets far more visits than the seller system, so you can add machines to the buyer system alone.

Besides splitting into different systems by business, the application can also be split into layers, generally an application layer, a logic layer, and an atomic layer. The application layer assembles data and logic services. The logic layer contains a large amount of reusable logic. The atomic layer operates on the database directly and contains the basic data operations.

Whatever form the split takes, the resulting systems are physically separate, so communication between them is the most important issue in splitting.

RPC

Before RPC frameworks became common, systems communicated in many ways, such as RMI and Web Services, but RPC is now the mainstream approach because it is more convenient, more efficient, and cross-platform. Almost every big company has its own RPC framework, such as Taobao's HSF and 58's SCF, and there are many excellent open source frameworks: Dubbo, gRPC, Thrift, and others. Many large companies in China also use Dubbo, including Jingdong and Dangdang.

RPC calls are generally used where systems are tightly coupled and calls are synchronous. MQ, an asynchronous means of communication, is also widely used across services. Common choices are ActiveMQ, RabbitMQ, Kafka, and RocketMQ. The first two are typical enterprise-class products that support a wide range of features and specifications. The latter two are Internet-grade, with higher throughput and performance at the expense of many MQ features. MQ is generally used where eventual consistency is acceptable, such as awarding points on user registration. After registration, success is returned to the front end immediately and a registration-success message is sent to the MQ system. The points service subscribes to the registration event and consumes the message from the MQ.

MQ's biggest benefits are peak shaving and decoupling. With RPC-style synchronous calls, if A and B are invoked in the same logic, then A and B must be scaled together. With messaging, A sends a message to B; if B cannot process it for the moment, the message can wait, and B catches up after A's peak has passed, even though B cannot match A's sending rate in the short term.
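The registration and points example can be sketched with an in-process queue standing in for the message broker. This is an assumption made to keep the example runnable: a real system would publish to ActiveMQ, RabbitMQ, Kafka, or RocketMQ through that product's client. RegistrationService and PointsConsumer are hypothetical names, and the figure of 100 points per registration is invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Producer side: registration returns as soon as the message is published.
final class RegistrationService {
    private final Queue<String> registrationTopic; // stand-in for an MQ topic

    RegistrationService(Queue<String> registrationTopic) {
        this.registrationTopic = registrationTopic;
    }

    boolean register(String userName) {
        // ... persist the new user here ...
        registrationTopic.add(userName); // publish "registration succeeded"
        return true;                     // do not wait for points to be granted
    }
}

// Consumer side: processes messages at its own pace, independent of the producer.
final class PointsConsumer {
    private final Queue<String> registrationTopic;
    private final Map<String, Integer> points = new HashMap<>();

    PointsConsumer(Queue<String> registrationTopic) {
        this.registrationTopic = registrationTopic;
    }

    // Handles at most maxMessages per call, modelling a consumer with limited capacity.
    int consume(int maxMessages) {
        int handled = 0;
        String user;
        while (handled < maxMessages && (user = registrationTopic.poll()) != null) {
            points.merge(user, 100, Integer::sum); // hypothetical sign-up bonus
            handled++;
        }
        return handled;
    }

    int pointsOf(String user) {
        return points.getOrDefault(user, 0);
    }
}
```

If three users register during a peak and the consumer handles only two per round, the third message simply waits in the queue and is processed in the next round. The producer never slows down, and the data becomes consistent once the backlog is drained.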

Database splitting

A project's data volume usually grows from small to large, so database splitting is handled differently at each stage of data volume.

Read-write separation is the first thing most applications do when they hit a performance bottleneck. In most Internet applications, more than 90% of operations are reads. One master handles writes and the slaves handle reads. This master-slave mode has its own problems, however: some data must be fresh, that is, readable immediately after it is written. Because master-slave synchronization replicates asynchronously through logs, there is a window of inconsistency. In those cases, reads must be forced to the master to get correct data, and this needs attention during development.
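The routing rule described above can be sketched as a small class. ReadWriteRouter is a hypothetical name, and the type parameter T stands for whatever handle the application uses for a database (for example a javax.sql.DataSource). Round-robin selection among slaves is an assumption; the article does not prescribe a balancing strategy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router: writes go to the master, reads are spread over the slaves.
final class ReadWriteRouter<T> {
    private final T master;
    private final List<T> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(T master, List<T> slaves) {
        this.master = master;
        this.slaves = new ArrayList<>(slaves);
    }

    T forWrite() {
        return master;
    }

    // Pass forceMaster = true for read-after-write paths, where replication lag
    // on a slave could return stale data.
    T forRead(boolean forceMaster) {
        if (forceMaster || slaves.isEmpty()) {
            return master;
        }
        int index = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(index);
    }
}
```

A typical use is to call forRead(false) for list and detail pages, and forRead(true) right after a write, such as showing an order the user has just placed.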

Vertical segmentation puts different businesses into different databases, which reduces the pressure on any single database and improves overall performance. It requires attention to business boundaries: some tables seem to fit equally well in database A or database B. Deciding takes experience, and it is not worth over-thinking, because however well you divide things up front, as the application iterates there will always be more tables with no clear boundary. The same problem arises when partitioning business modules.

Horizontal segmentation usually means sharding: splitting different fields of one table into different tables, or dividing one table into shards by hash or by a business field. This generally requires support from a DAL framework such as TDDL, Cobar, or Mycat. The main goal is for the framework to make the split invisible to the programmer, so that it feels like working with a single database. Current DAL frameworks do not fully achieve this goal, however, especially for cross-database transactions, which generally need to be handled in other ways.
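Dividing a table into shards by a business field can be sketched as a routing function. ShardRouter is a hypothetical name, and routing by user ID modulo the shard count is one simple assumption. The DAL frameworks named above implement their own, more elaborate rules.

```java
// Hypothetical shard router: maps a business key to one of N shards.
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // floorMod keeps the result in [0, shardCount) even for negative keys.
    int shardFor(long userId) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    // Example naming scheme: orders_0, orders_1, ...
    String tableFor(String baseTable, long userId) {
        return baseTable + "_" + shardFor(userId);
    }
}
```

With four shards, user 10 maps to shard 2 and the table orders_2. A query that filters on the sharding key touches one shard, while a query without it has to visit every shard, which is one reason the split cannot stay fully invisible to the programmer.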

Cross-database transactions/distributed transactions

Cross-database transactions are generally resolved through eventual consistency: full ACID guarantees are not required and a window of inconsistency is tolerated, but the data always reaches a consistent state at some later point. There are many solutions, but the core principle is the same: they are all completed through compensation.
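The compensation principle can be sketched as follows: each step carries an action and an undo operation, and when a later step fails, the completed steps are undone in reverse order. Step and CompensatingTransaction are hypothetical names. This is a simplified in-process sketch that leaves out persistence, retries, and idempotency, all of which a production implementation needs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// One unit of work against one database or service, plus the way to undo it.
final class Step {
    final Runnable action;
    final Runnable compensation;

    Step(Runnable action, Runnable compensation) {
        this.action = action;
        this.compensation = compensation;
    }
}

final class CompensatingTransaction {
    // Runs the steps in order. If one fails, compensates the steps that already
    // succeeded, most recent first, and reports failure.
    static boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step);
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

Between the failed step and the end of compensation, the two databases disagree. That interval is the inconsistency window the paragraph above describes.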

Cache usage

There is a famous saying in the computer world: "Any problem in computer science can be solved by adding a layer of indirection." A cache is one such intermediate layer.

Caching applies in a great many scenarios, almost anywhere you can think of. Here we discuss the usual case, caching database data.

There are generally two types of cache, local and remote. Generally, using one of the two is enough, because however useful a cache is, keeping it updated and invalidated is troublesome. Caches can also be divided into read caches (most scenarios) and write caches (generally for scenarios with low data-safety requirements).

With a read cache, for example, data read from the database is written into the cache at the same time. The next read can be served directly from the cache, which greatly reduces pressure on the database. It sounds simple, but there are many possible architectures, each with advantages and disadvantages that are worth studying in detail.
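One of those architectures, loading on a cache miss and serving later reads from memory, can be sketched as below. ReadCache is a hypothetical name, the loader function stands in for the database query, and the local ConcurrentHashMap stands in for whichever cache is actually used.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read cache: loads from the database on a miss, then serves from memory.
final class ReadCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stand-in for the database query

    ReadCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent calls the loader only when the key is not cached yet.
        return cache.computeIfAbsent(key, loader);
    }

    // Must be called when the underlying row changes, or readers see stale data.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

The invalidate call is where the maintenance cost mentioned earlier shows up: every code path that updates the database also has to remember to drop or refresh the cached copy.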

A write cache writes data to the cache first and persists it after a period of time, which also improves efficiency. The problem is that if the cache goes down before the data is persisted, some data is lost, so this approach suits scenarios with low data-safety requirements.
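A write cache of this kind can be sketched as follows. WriteBehindCache is a hypothetical name, and the persister callback stands in for the database write. In practice flush() would be driven by a timer, which is omitted here to keep the behavior easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical write cache: writes land in memory first and reach storage on flush().
final class WriteBehindCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Map<K, V> dirty = new HashMap<>(); // written but not yet persisted
    private final BiConsumer<K, V> persister;        // stand-in for the database write

    WriteBehindCache(BiConsumer<K, V> persister) {
        this.persister = persister;
    }

    synchronized void put(K key, V value) {
        cache.put(key, value);
        dirty.put(key, value); // repeated writes to one key collapse into one entry
    }

    synchronized V get(K key) {
        return cache.get(key);
    }

    // Persists all pending writes and returns how many entries were written.
    // Anything still in 'dirty' when the process crashes is lost.
    synchronized int flush() {
        int written = dirty.size();
        dirty.forEach(persister);
        dirty.clear();
        return written;
    }
}
```

Two writes to the same key before a flush cost only one database write, which is where the efficiency gain comes from. The window between put and flush is exactly the data that can be lost.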

Although caches are fast, they are troublesome to keep up to date, and memory is relatively expensive hardware. So besides holding hot data, a cache generally holds only indexes or the main fields needed for list display; the full, large records need to be handled by other means.
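One common way to keep only hot data within a fixed memory budget is least-recently-used eviction. The article does not name an eviction policy, so LRU is an assumption here. The sketch uses the access-order mode of java.util.LinkedHashMap, and HotDataCache is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: keeps the most recently used entries, evicts the rest.
// Not thread-safe; wrap it or synchronize externally for concurrent use.
final class HotDataCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    HotDataCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true drops the least recently used entry.
        return size() > capacity;
    }
}
```

With a capacity of two, inserting a and b, reading a, and then inserting c evicts b, because a was touched more recently.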

Static pages

In most scenarios, data does not change for a certain period, or when it does change only a small part of the page changes, so the unchanging part can be extracted and made static. Jingdong Mall's pages, for example, are static. After staticization, a request no longer has to fetch data from the cache or database and assemble a page; it returns the static page directly. Performance improves greatly.

Besides the commonly used methods above, there are many other important ones:
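Staticization can be sketched as rendering a page once and writing the result to a file that is then served as-is. StaticPagePublisher is a hypothetical name, the renderer stands in for the code that queries data and fills a template, and in a real deployment a web server or CDN would serve the files rather than the read method shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

// Hypothetical publisher: renders a page once and stores it as a static HTML file.
final class StaticPagePublisher {
    private final Path outputDir;

    StaticPagePublisher(Path outputDir) {
        this.outputDir = outputDir;
    }

    // Call when the underlying data changes, not on every request.
    Path publish(String pageName, Supplier<String> renderer) {
        try {
            Files.createDirectories(outputDir);
            Path file = outputDir.resolve(pageName + ".html");
            Files.write(file, renderer.get().getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serving a request is a plain file read: no cache lookup, no database, no templating.
    String read(String pageName) {
        try {
            byte[] bytes = Files.readAllBytes(outputDir.resolve(pageName + ".html"));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The renderer runs once per publish, however many times the page is read afterwards. The part of the page that does change frequently still has to be loaded separately.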

CDN acceleration

DNS cache

Page caching

Distributed storage

Multithreaded programming
