
Design of a Flash Sale (Seckill) System

I. Key Points of the System Architecture Design

1. Two core problems, plus a fallback plan

(1) A flash sale mainly has to solve two problems.

The first is concurrent reads. The core idea is to minimize how often users hit the server to read data, and to have them read as little data as possible.

The second is concurrent writes. At the database level, we split out a dedicated database so that flash sale writes can be handled specially.

(2) The flash sale system also needs some protection: design fallback plans for unexpected situations, so that the worst case cannot take the whole system down.

2. From an architect's point of view, to build a system that serves extremely high read and write traffic with high performance and high availability, we need to follow several principles:

(1) Keep the requested data as small as possible.

(2) Keep the number of requests as small as possible.

(3) Avoid single points of failure.

3. From a technical point of view: "stable, accurate, and fast".

(1) High performance: handling highly concurrent access is critical. There are four main ways to deal with it:

Static/dynamic separation of the data;

Discovery and isolation of hot spots;

Peak shaving and layered filtering of requests;

Extreme optimization on the server side.

(2) Consistency: correctly decrementing product inventory is also critical. There are mainly the following two choices of when to deduct:

Deduct inventory when the order is placed.

Deduct inventory when payment completes.

(3) High availability: a fallback plan must be designed.

II. Five Principles a Flash Sale System Should Follow (balanced dynamically against the needs of the business)

1. Keep the data as small as possible

(1) The data a user requests should be as small as possible.

The data requested by a user includes both the data uploaded to the system and the data the system returns. It should be kept small because transmitting data over the network takes time, both the request and the response have to be processed by the server, and the compression and character encoding done during network communication are very CPU-intensive, so reducing the amount of data transmitted reduces the load.

(2) The data the system depends on should be as small as possible.

To complete a piece of business, the system has to read and write data, which usually means talking to backend services and databases. Calling other services involves serializing and deserializing data, which is a major CPU cost and also adds latency. The database itself easily becomes a bottleneck, so the fewer the interactions with the database, and the simpler and smaller the data, the better.

2. Keep the number of requests as small as possible

(1) Minimize additional requests.

Every request the browser makes has a cost, such as a TCP three-way handshake; there may also be limits on the number of connections per page or per domain, and some requests have to load serially. If resources live under different domain names, DNS resolution for each of those domains adds further delay.

(2) Merge CSS and JavaScript files.

The individual file names are joined in a single URL (separated by commas), the files are still stored separately on the server, and the server parses the URL, merges the files, and returns them in one response, as in the sketch below.
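A minimal sketch of such a combo-style handler, where the "??" URL convention, the file root, and the class name are illustrative assumptions: the request path is split on commas, each file is read from disk, and the concatenated content is returned as a single response body.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: the "??" combo URL convention and ComboResolver are illustrative names.
class ComboResolver {
    private final Path staticRoot;

    ComboResolver(Path staticRoot) {
        this.staticRoot = staticRoot;
    }

    /** For a request path like "/??common.css,seckill.css", return the merged file content. */
    String resolve(String requestPath) throws IOException {
        String list = requestPath.substring(requestPath.indexOf("??") + 2);
        StringBuilder merged = new StringBuilder();
        for (String name : list.split(",")) {
            // Each file still lives separately on the server; only the response is merged.
            // A real handler would also validate names against path traversal.
            merged.append(Files.readString(staticRoot.resolve(name.trim()))).append('\n');
        }
        return merged.toString();
    }
}
```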

3. Keep the request path as short as possible

(1) Every node the request passes through usually adds another socket connection.

Shortening the request path not only improves availability, it also improves performance (less serialization and deserialization) and reduces latency (less network transmission time).

(2) Replace RPC calls with in-JVM calls.

Where appropriate, replace a remote RPC call with a local method call inside the same JVM, as sketched below.
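A minimal sketch of this idea, assuming a hypothetical ProductService interface (none of these names come from a specific framework): behind the same interface, the caller can be wired to an RPC stub when the services are deployed separately, or to a local implementation when the two applications run in the same JVM.

```java
// Hypothetical example: ProductService, RpcProductClient and LocalProductService are illustrative.
public interface ProductService {
    String findTitle(long itemId);
}

/** Used when the product system runs in another process: one network hop per call. */
class RpcProductClient implements ProductService {
    @Override
    public String findTitle(long itemId) {
        // Serialize the request, send it over the wire, deserialize the response.
        return remoteCall("product.findTitle", itemId);
    }
    private String remoteCall(String method, long itemId) { /* RPC plumbing */ return ""; }
}

/** Used when both applications are deployed in the same JVM: a plain method call, no network. */
class LocalProductService implements ProductService {
    @Override
    public String findTitle(long itemId) {
        return "item-" + itemId; // read from the in-process data source directly
    }
}
```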

4. Depend on as little as possible

(1) Minimize the number of systems or services that a user request must depend on.

Remove any dependency that can be dropped in an emergency. The flash sale page, for example, strongly depends on product information and user information, while coupons and the order list are not essential to the flash sale itself and can be dropped in an emergency.

(2) Establish system tiers.

Grade systems into tiers, for example tier 0, tier 1, tier 2 and tier 3, with tier 0 holding the most important systems. Tier-0 systems should minimize strong dependencies on tier-1 systems, so that an important system is never dragged down by a less important one.

5. Avoid single points of failure

(1) The most important thing in system design is to eliminate single points of failure.

A single point means there is no backup, so the risk is uncontrollable.

(2) How to avoid single points

Decouple services from their state so that the services themselves are stateless. For example, make the machines and their configuration dynamic, and push configuration changes through a configuration center, as in the sketch below.
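A minimal sketch of a stateless service fed by pushed configuration, where ConfigCenter, SeckillConfig and SeckillGateway are hypothetical names: the instance keeps no per-user state, and all shared settings live in an AtomicReference that the push callback swaps atomically, so any machine can serve any request.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch: ConfigCenter and SeckillConfig are illustrative names.
record SeckillConfig(boolean saleOpen, int maxPerUser) {}

interface ConfigCenter {
    void subscribe(String key, Consumer<SeckillConfig> onChange);
}

class SeckillGateway {
    // Shared, mutable settings are pushed in from outside; the instance holds no session state.
    private final AtomicReference<SeckillConfig> config =
            new AtomicReference<>(new SeckillConfig(false, 1));

    SeckillGateway(ConfigCenter center) {
        center.subscribe("seckill", config::set); // dynamic push from the configuration center
    }

    boolean allow(long userId) {
        SeckillConfig c = config.get();
        return c.saleOpen(); // decision depends only on the pushed config and the request itself
    }
}
```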

III. Architecture Cases for Different Traffic Levels

1. From roughly 10,000 to 100,000 requests per second (1w/s to 10w/s)

Develop and optimize the flash sale system independently. Deploy it on its own machine cluster, so that flash sale traffic cannot affect the purchase of normal goods. Store hot data separately in its own cache system to improve performance. Add a flash sale question step to stop seckill bots from grabbing orders.

2. On the order of 1,000,000 requests per second (100w/s)

Fully separate the static and dynamic parts of the page, so that the flash sale does not refresh the whole page: clicking the buy button refreshes only the minimum amount of data. The web servers cache the flash sale items locally, so they do not need to call dependent backend services to fetch data, or even query the shared cache cluster, which both reduces system calls and relieves pressure on the shared cache cluster (see the local-cache sketch below). Add rate-limiting protection to the system to prevent the worst from happening.
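A minimal sketch of caching the flash sale items locally on each web server, where the Product record and preload() are illustrative placeholders: the item list is loaded once before the sale opens, so the read path during the sale never calls a backend service or the shared cache cluster.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: Product and the preload source are illustrative placeholders.
class LocalProductCache {
    record Product(long id, String title, long priceInCents) {}

    private final Map<Long, Product> cache = new ConcurrentHashMap<>();

    /** Called once before the sale opens, e.g. from a warm-up job. */
    void preload(Iterable<Product> seckillItems) {
        for (Product p : seckillItems) {
            cache.put(p.id(), p);
        }
    }

    /** Read path during the sale: pure in-memory lookup, no RPC, no shared cache query. */
    Product get(long itemId) {
        return cache.get(itemId);
    }
}
```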

IV. Options for Static/Dynamic Separation

1. Definitions of dynamic and static data

Dynamic data is personalized data tied to the visitor. Static data includes HTML pages stored on disk, as well as any output of business processing that does not depend on the visitor.

2. Key points of caching static data

Cache static data as close to the user as possible: in the browser, on a CDN, or in the server-side cache.

Apply static transformation: cache whole HTTP responses by URL directly, not just the underlying data.

Let the web server (Nginx, Apache) serve static data directly from its cache.

3. Five aspects of static/dynamic separation

Make URLs unique, so that each page can be cached and looked up by its URL.

Separate out visitor-related factors (whether the user is logged in, the login identity, etc.; fetch them through dynamic requests).

Separate out time factors (server-side time is fetched through a dynamic request).

Make region-related factors asynchronous (fetch region-related information asynchronously).

Remove cookies: static pages should not contain cookies (strip them in code).

4. Assembling static and dynamic data

Option one: make the dynamic data request on the proxy server and insert the dynamic data into the static page there (a sketch of this option follows the list).

Option two: the browser initiates the dynamic request and assembles the page on the client side.
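A minimal sketch of the first option, server-side assembly, where the placeholder token and fetchDynamicFragment() are illustrative: the proxy keeps the static page cached with a marker in it and splices the small per-request dynamic fragment in before responding.

```java
// Hypothetical sketch: the placeholder token and fetchDynamicFragment() are illustrative.
class PageAssembler {
    private static final String PLACEHOLDER = "<!--DYNAMIC_PART-->";

    /** Static HTML cached once (browser/CDN/server cache); it never varies per visitor. */
    private final String staticTemplate;

    PageAssembler(String staticTemplate) {
        this.staticTemplate = staticTemplate;
    }

    /** Per request, only the small dynamic fragment is produced and spliced into the page. */
    String render(long userId) {
        String dynamic = fetchDynamicFragment(userId); // e.g. price, user-specific status
        return staticTemplate.replace(PLACEHOLDER, dynamic);
    }

    private String fetchDynamicFragment(long userId) {
        return "<span>user " + userId + "</span>"; // placeholder for a real dynamic call
    }
}
```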

5. Architecture options for static/dynamic separation

Single-machine deployment on physical machines

A unified cache layer

CDN (with several problems to solve: 1. cache invalidation; 2. hit rate; 3. rolling out updates)

V. Targeted Handling of "Hot Spot Data" (the 80/20 rule)

1. Finding hot spot data

(1) How to find static hot spot data

Screen hot items through an advance sign-up process, and let the backend system pre-process that hot spot data.

Have the system predict and pick out the top-N items for each day, and let the backend system pre-process them as hot spot data.

(2) How to find dynamic hot spot data

Build an asynchronous system that collects hot keys from the middleware along the whole transaction path, such as Nginx, the caches, and the RPC service framework.

Establish a specification for reporting hot spots and distributing them to subscribing systems on demand. The main purpose is to exploit the time difference between when each system on the transaction path is reached, so that hot spots discovered by upstream systems are pushed to downstream systems early enough for them to protect themselves in advance.

Upstream systems send the hot spot data they collect to a hot spot server, and downstream systems then apply hot spot protection, as sketched below.
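A minimal sketch of the asynchronous collection side, where HotKeyDetector, the threshold, the window length, and reportToHotspotServer() are all illustrative assumptions: access counts per key are accumulated off the request path, and keys that cross the threshold within a window are reported to the hot spot service for downstream subscribers.

```java
import java.util.Map;
import java.util.concurrent.*;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: HotKeyDetector and reportToHotspotServer() are illustrative names.
class HotKeyDetector {
    private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final long threshold;

    HotKeyDetector(long threshold, long windowSeconds) {
        this.threshold = threshold;
        // Periodically close the window: report hot keys, then start counting again.
        Executors.newSingleThreadScheduledExecutor()
                 .scheduleAtFixedRate(this::flush, windowSeconds, windowSeconds, TimeUnit.SECONDS);
    }

    /** Called asynchronously (e.g. while tailing access logs), never on the request path. */
    void record(String key) {
        counts.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    private void flush() {
        for (Map.Entry<String, LongAdder> e : counts.entrySet()) {
            if (e.getValue().sum() >= threshold) {
                reportToHotspotServer(e.getKey()); // downstream systems subscribe to these keys
            }
        }
        counts.clear();
    }

    private void reportToHotspotServer(String key) { /* push to the hot spot service */ }
}
```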

2. Things to note when building a hot spot discovery system

The hot spot service backend should pull data logs asynchronously.

The hot spot discovery service should coexist with the middleware's own hot spot protection modules.

Hot spots should be reported in near real time.

3. Handling hot spot data

(1) Optimization

The most effective optimization for hot spot data is to cache it, using an LRU eviction policy to bound the cache; a sketch follows.
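A minimal LRU cache sketch built on java.util.LinkedHashMap in access order; the capacity value in the usage note is an illustrative assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simple LRU cache: the least recently used entry is evicted once capacity is exceeded. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: iteration order follows access recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // drop the least recently used entry
    }
}

// Usage: an LruCache<Long, Product> holding only hot items, e.g. new LruCache<>(1000).
// Note: LinkedHashMap is not thread-safe; wrap it with Collections.synchronizedMap
// or use a concurrent cache library on a real flash sale path.
```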

(2) Restriction

Apply a consistent hash on the product id and route requests into different queues, so that a few hot items cannot occupy too large a share of the service's capacity (see the sketch below).
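A minimal consistent-hash sketch, where the queue names, the virtual-node count, and the hash function are illustrative: product ids are mapped onto a hash ring whose nodes are the queues, so the same item always lands in the same queue and one hot item is confined to a bounded share of capacity.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: queue names and the virtual-node count are illustrative.
class ConsistentQueueRouter {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentQueueRouter(List<String> queueNames, int virtualNodesPerQueue) {
        for (String queue : queueNames) {
            for (int i = 0; i < virtualNodesPerQueue; i++) {
                ring.put(hash(queue + "#" + i), queue); // virtual nodes smooth the distribution
            }
        }
    }

    /** The same itemId always maps to the same queue on the ring. */
    String queueFor(long itemId) {
        SortedMap<Long, String> tail = ring.tailMap(hash(Long.toString(itemId)));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private long hash(String s) {
        long h = 1125899906842597L; // simple hash for the sketch; a real system might use Murmur
        for (int i = 0; i < s.length(); i++) h = 31 * h + s.charAt(i);
        return h;
    }
}
```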

(3) Isolation

Isolate the hot spot data so that the 1% of requests that hit it cannot affect the other 99%.

Isolation can be applied at several levels: business isolation, system isolation, and data isolation.

VI. Traffic Peak Shaving

1. Queuing

Turn a one-step operation into a two-step operation, where the added step acts as a buffer, as in the sketch below.
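A minimal sketch of the queue-as-buffer idea using a bounded in-process queue; in production this would usually be a message-queue middleware, and OrderRequest and createOrder() are illustrative placeholders.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: OrderRequest and createOrder() are illustrative placeholders.
class OrderBuffer {
    record OrderRequest(long userId, long itemId) {}

    private final BlockingQueue<OrderRequest> queue = new ArrayBlockingQueue<>(10_000);

    /** Step 1 (fast path): accept the request if the buffer has room, otherwise reject early. */
    boolean submit(OrderRequest req) {
        return queue.offer(req); // non-blocking; a full buffer sheds the excess peak traffic
    }

    /** Step 2 (background): drain the buffer at the pace the database can sustain. */
    void startConsumer() {
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    createOrder(queue.take());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    private void createOrder(OrderRequest req) { /* write to the order database */ }
}
```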

2. Answering a question before ordering

This prevents seckill bots from cheating.

It also delays requests and flattens the request peak.

3. Layered filtering

Cache the data read by dynamic requests at the web tier and filter out invalid requests there.

Do not apply strong consistency checks to the read data.

Slice the write data sensibly by time and filter out expired or invalid requests.

Apply rate limiting to write requests and filter out requests beyond what the system can carry (see the sketch after this list).

Apply strong consistency checks only to the write data, keeping just the final valid data.
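A minimal sketch of the write-side filters, where the class name, the 500-permit limit, and tryPlaceOrder() are illustrative: a cheap check against a loosely consistent local stock count rejects hopeless requests first, and a semaphore-based limiter caps how many requests may proceed to the strongly consistent deduction in the database.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the permit count and method names are illustrative.
class LayeredFilter {
    private final AtomicInteger cachedStock;                    // loosely consistent local view,
                                                                // refreshed from the cache cluster
    private final Semaphore writePermits = new Semaphore(500);  // caps in-flight write requests

    LayeredFilter(int initialStock) {
        this.cachedStock = new AtomicInteger(initialStock);
    }

    boolean tryPlaceOrder(long userId, long itemId) {
        // Layer 1: no strong consistency needed; if the cached stock already looks empty, reject.
        if (cachedStock.get() <= 0) {
            return false;
        }
        // Layer 2: rate/concurrency limiting so the database only sees what it can carry.
        if (!writePermits.tryAcquire()) {
            return false;
        }
        try {
            // Layer 3: the strongly consistent check happens only here, in the database.
            return deductStockInDatabase(itemId);
        } finally {
            writePermits.release();
        }
    }

    private boolean deductStockInDatabase(long itemId) { /* conditional UPDATE, see section VIII */ return true; }
}
```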

VII. Improving System Performance

1. Factors that affect performance (server-side performance is usually measured in QPS)

(1) The server-side response time of a single request; what mainly affects performance is the CPU execution time.

(2) The number of threads processing requests; the number of concurrent threads should be set reasonably. A rough worked example follows.
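A rough back-of-the-envelope estimate, under the common approximation that total QPS ≈ (1000 ms / response time in ms) × number of threads; the concrete numbers below are illustrative only.

```java
// Illustrative numbers only: 20 ms of server-side time per request, 8 worker threads.
public class QpsEstimate {
    public static void main(String[] args) {
        double responseTimeMs = 20.0;   // CPU/response time per request on the server
        int threads = 8;                // threads processing requests concurrently
        double qps = (1000.0 / responseTimeMs) * threads;
        System.out.printf("Estimated QPS ~ %.0f%n", qps); // 50 requests/s per thread * 8 = 400
    }
}
```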

VIII. Core Logic of Inventory Deduction

1. Three ways to deduct inventory and their possible problems

(1) Deduct inventory when the order is placed; the risk is orders placed without payment (large flash sale systems generally deduct inventory at order time).

(2) Deduct inventory when payment completes; the risk is overselling.

(3) Withhold inventory: hold the stock for a limited period after the order is placed, and release it if payment does not arrive in time.

2. Extreme optimization of inventory deduction in a flash sale

(1) The deduction logic for flash sale items is very simple, so the deduction can be completed entirely in the cache system.

(2) More complicated inventory deduction logic should be completed in the database, as in the sketch below.
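A minimal sketch of the database-side deduction using an atomic conditional UPDATE, assuming a hypothetical seckill_item table with item_id and stock columns: the WHERE stock > 0 condition ensures the stock can never go negative even under concurrent requests.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch: the table and column names are illustrative.
class InventoryDao {
    private static final String DEDUCT_SQL =
            "UPDATE seckill_item SET stock = stock - 1 WHERE item_id = ? AND stock > 0";

    /** Returns true if one unit was deducted; false means the item is sold out. */
    boolean deductOne(Connection conn, long itemId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(DEDUCT_SQL)) {
            ps.setLong(1, itemId);
            // The database applies the row lock and the stock > 0 check atomically,
            // so concurrent requests cannot drive the stock below zero.
            return ps.executeUpdate() == 1;
        }
    }
}
```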

IX. Designing the Fallback Plan

1. Where should high-availability work start?

(1) Architecture phase

Mainly consider scalability and fault tolerance, and avoid depending on any single machine.

(2) Coding phase

Mainly consider the robustness of the code, which includes setting reasonable timeouts for remote calls.

Also validate the result sets returned by remote calls, so that a response cannot exceed what the program is prepared to handle; a timeout sketch follows.
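A minimal sketch of a remote call with explicit timeouts using the JDK's built-in HttpClient (Java 11+); the URL, the timeout values, and the coupon example are illustrative assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

public class TimeoutCall {
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofMillis(200))   // fail fast if the peer is unreachable
            .build();

    /** Illustrative call: a slow weak dependency is cut off instead of blocking the sale path. */
    static String fetchCouponInfo(long userId) {
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("https://coupon.example.internal/user/" + userId)) // hypothetical URL
                .timeout(Duration.ofMillis(500))      // overall per-request timeout
                .GET()
                .build();
        try {
            return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (HttpTimeoutException e) {
            return "";                                // degrade gracefully: coupons are not essential
        } catch (IOException e) {
            return "";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "";
        }
    }
}
```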

(3) Test phase

Ensure the coverage of test cases

Ensure that when the worst happens, there is a corresponding process.

(4) Release phase

There should be an emergency rollback mechanism.

(5) Operation phase

Monitoring of the system should be accurate and timely.

When a problem is found, raise an accurate alarm, and keep the monitoring data accurate and detailed so that the problem can be diagnosed.

(6) When a failure occurs

Stop loss in time

Recover in time, then locate the root cause and fix the problem.
