How to thoroughly solve the problem of distributed system consistency in big data

2025-01-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article analyzes how to thoroughly solve the distributed system consistency problem in big data and introduces the corresponding solutions in detail, in the hope of helping readers who face this problem find a simpler and more feasible approach.

To solve the consistency problem, we first need to understand what it is. Consistency is a common problem in distributed systems. It can be subdivided into eventual consistency and strong consistency, but the term usually refers to strong consistency. As the saying goes, there is strength in numbers, and that leads to the idea and logic of divide and conquer.

Horizontal split: I understand "horizontal" here as a split along the horizontal, spatial dimension. It refers not only to splitting database tables and caches but also to pooling, which is comparable to a cluster: multiple identical instances provide the same service. Distributed: one service is split into different services.

Vertical split: "let professionals do professional work", which keeps each part simple and easy to maintain. This is analogous to the single responsibility principle in software design, which says that a class should have only one reason to change.

Consistency here refers to the weak consistency between distributed, service-oriented systems. It covers both application-system consistency and data consistency.

Following this idea, we can also subdivide high concurrency into high concurrency of data and high concurrency of requests, and propose a solution for each.

Consistency problem

Case 1: placing orders and deducting inventory

In an e-commerce system, how do we keep placing an order and deducting inventory consistent? If the order is placed but the inventory deduction fails, the item is oversold; if the order fails but the inventory deduction succeeds, the item is undersold.

Case 2: synchronous call timeout

When system A's call to system B times out, A receives a timeout response, but there is no guarantee that B completed the intended operation, so A cannot give its own caller a definite result.
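One common way to make a timed-out call safe to repeat is an idempotent retry. The following is a minimal sketch under stated assumptions, not a method from the article: `ServiceB`, `debit`, `call_with_retry`, and the `drop_next_reply` switch are hypothetical names used only to simulate a reply that is lost after B has already done the work.

```python
import uuid


class ServiceB:
    """Hypothetical downstream system B that deduplicates by request ID."""

    def __init__(self):
        self.results = {}            # request_id -> stored result
        self.balance = 100
        self.drop_next_reply = False  # test switch: simulate a lost reply

    def debit(self, request_id, amount):
        # Apply the change only the first time this request ID is seen.
        if request_id not in self.results:
            self.balance -= amount
            self.results[request_id] = self.balance
        if self.drop_next_reply:
            self.drop_next_reply = False
            # The work is done, but the caller never hears about it.
            raise TimeoutError("reply lost")
        return self.results[request_id]


def call_with_retry(service, amount, attempts=3):
    """System A: retry with the same request ID so B never debits twice."""
    request_id = str(uuid.uuid4())
    for _ in range(attempts):
        try:
            return service.debit(request_id, amount)
        except TimeoutError:
            continue
    raise TimeoutError("gave up; outcome still unknown")
```

Because the request ID stays the same across retries, A can repeat the call until it gets an answer without risking a double deduction on B.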

Case 3: asynchronous callback timeout

Most payment flows use asynchronous callbacks.

A calls B, and B notifies A asynchronously. If A fails to receive the success message, it cannot redirect the user to the order page.
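A frequently used fallback for a lost callback is for A to query B's recorded status. The sketch below assumes B keeps its own record of each payment; `PaymentService`, `OrderService`, `query`, and `reconcile` are hypothetical names chosen for illustration.

```python
class PaymentService:
    """Hypothetical system B: processes a payment, then notifies A."""

    def __init__(self):
        self.status = {}  # order_id -> "SUCCESS"

    def pay(self, order_id, notify):
        self.status[order_id] = "SUCCESS"
        try:
            notify(order_id, "SUCCESS")
        except ConnectionError:
            pass  # callback lost; B's own record is still correct

    def query(self, order_id):
        return self.status.get(order_id, "UNKNOWN")


class OrderService:
    """Hypothetical system A: waits for the callback, polls as a fallback."""

    def __init__(self, payments):
        self.payments = payments
        self.orders = {}  # order_id -> last known payment state

    def on_callback(self, order_id, result):
        self.orders[order_id] = result

    def reconcile(self, order_id):
        # If no success callback has arrived, ask B for the recorded status.
        if self.orders.get(order_id) != "SUCCESS":
            self.orders[order_id] = self.payments.query(order_id)
        return self.orders[order_id]
```

In practice `reconcile` would run on a timer or when the user returns to the page; here it is called directly to keep the sketch small.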

Case 4: inconsistency between cache and database

To cope with highly concurrent reads, a cache layer is added in front of the database, for example for the product details page of an e-commerce site. How, then, do we keep the data in the cache and the database consistent? If the data has a strong consistency requirement, it should not be cached.

There are 8 such cases in total. These are the 4 that interest me most, especially the last one.

Models and ideas for solving the problem of consistency

We analyze the problems raised above and propose solutions for them.

ACID principle

Atomicity: an atom is the smallest particle, something that cannot be divided further. Atomicity is the principle that a database transaction is indivisible: the statements that make up the transaction are either all executed or all cancelled.

Consistency: the rules the data must satisfy hold both before and after a transaction.

Isolation: put simply, the operations of one in-progress transaction are invisible to other transactions.

Durability: once a transaction is committed, its effects are preserved and cannot be undone.
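When the order table and the stock table live in the same database, atomicity alone resolves Case 1. The sketch below uses Python's built-in `sqlite3` module; the table names `orders` and `stock` and the function `place_order` are assumptions made for illustration.

```python
import sqlite3


def place_order(conn, item, qty):
    """Insert an order and deduct stock in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders(item, qty) VALUES (?, ?)", (item, qty)
            )
            cur = conn.execute(
                "UPDATE stock SET qty = qty - ? WHERE item = ? AND qty >= ?",
                (qty, item, qty),
            )
            if cur.rowcount != 1:
                # Not enough stock: abort so the order row is undone too.
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock(item TEXT PRIMARY KEY, qty INTEGER)")
conn.execute(
    "CREATE TABLE orders(id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)"
)
conn.execute("INSERT INTO stock VALUES ('book', 5)")
conn.commit()
```

The `qty >= ?` condition in the `UPDATE` guards against overselling, and the rollback guards against an order without a matching deduction. Once the two tables sit in different services, a single local transaction is no longer available, which is where the distributed approaches discussed in this article come in.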

CAP principle

C: consistency

A: availability

P: partition tolerance

No distributed system can satisfy all three at the same time. Taobao's Double 11 follows the AP principle, and the difference between ZooKeeper and Eureka is that ZooKeeper satisfies CP while Eureka satisfies AP.
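The trade-off can be illustrated with a toy model. This is a deliberately simplified sketch, not how ZooKeeper or Eureka are implemented: `Register`, `Node`, and the `partitioned` flag are hypothetical, and a real system would use quorums and failure detection.

```python
class Node:
    def __init__(self):
        self.value = None


class Register:
    """Toy two-node register; mode is 'CP' or 'AP'."""

    def __init__(self, mode):
        self.mode = mode
        self.nodes = [Node(), Node()]
        self.partitioned = False  # True: the two nodes cannot communicate

    def write(self, node_index, value):
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability so the nodes never diverge.
                raise RuntimeError("unavailable during partition")
            # AP: stay available; the other node is now stale.
            self.nodes[node_index].value = value
            return
        for node in self.nodes:
            node.value = value

    def read(self, node_index):
        return self.nodes[node_index].value
```

During a partition the CP register rejects writes and both nodes keep returning the same value, while the AP register keeps accepting writes and the two nodes temporarily disagree.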

BASE model

BA: basically available

S: soft state, meaning the data may be out of sync for a period of time

E: eventual consistency

The BASE approach can address the consistency problems above one by one.
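Soft state and eventual consistency can be sketched with a primary that queues its changes and a follower that applies them later. `Primary`, `Follower`, and `sync` are hypothetical names; a real system would use a message queue or replication log instead of an in-memory `deque`.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.data = {}
        self.log = deque()  # changes not yet applied to the follower

    def write(self, key, value):
        # The write succeeds immediately; replication happens later.
        self.data[key] = value
        self.log.append((key, value))


class Follower:
    def __init__(self):
        self.data = {}

    def sync(self, primary):
        # Apply queued changes in order; afterwards the follower has caught up.
        while primary.log:
            key, value = primary.log.popleft()
            self.data[key] = value
```

Between `write` and `sync` the follower is in the soft state: it is still readable but may return old data. After `sync` runs, both copies agree.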

So how do we solve the problem of consistency between cache and database in case 4?

At present, most solutions clear the cache first, then write the database, and then update the cache. This does not truly solve the problems caused by high concurrency, although the conditions needed to trigger an inconsistency are relatively strict.
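The race can be shown by replaying one unlucky interleaving of a writer and a reader step by step. This is a single-threaded sketch using plain dictionaries as a stand-in for the database and the cache; the second invalidation at the end is one commonly used mitigation and is an assumption here, not something the article prescribes.

```python
db = {"price": 100}
cache = {"price": 100}


def read_db_on_miss(key):
    """First half of a cache-miss read: fetch the value to be cached."""
    return db[key]


# One writer (changing price to 80) interleaved with one reader:
cache.pop("price", None)           # writer: clear the cache first
stale = read_db_on_miss("price")   # reader: cache miss, reads the old value
db["price"] = 80                   # writer: write the database
cache["price"] = stale             # reader: fills the cache with the old value

# The cache and the database now disagree.
inconsistent = cache["price"] != db["price"]

# Mitigation sketch (assumption): invalidate once more after the write,
# so the next read reloads the current value from the database.
cache.pop("price", None)
reloaded = cache.setdefault("price", db["price"])
```

The window is narrow, since the reader must miss the cache between the writer's two steps and finish after the database write, which matches the observation that the triggering conditions are strict.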

Two-phase commit, three-phase commit, and TCC

Two-phase commit protocol: a prepare phase, then a commit phase.
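The two phases can be sketched as a coordinator function that first collects votes and then commits or rolls back everyone. `Participant` and `two_phase_commit` are hypothetical names, and the sketch leaves out timeouts, logging, and coordinator failure, which a real implementation must handle.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether local work succeeds
        self.state = "INIT"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be completed.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def rollback(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit): commit only on a unanimous yes, else roll back all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

A single "no" vote in the prepare phase is enough to roll back every participant, which is what keeps the order service and the stock service from ending up in different states.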

Three-phase commit protocol: an inquiry phase, a prepare phase, and a commit phase.

TCC protocol: Try, Confirm, Cancel. Execute Try first; if it succeeds, execute Confirm, and if there is a problem, execute Cancel.
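A minimal TCC sketch, assuming each resource can reserve stock in Try, consume the reservation in Confirm, and release it in Cancel. `InventoryTCC` and `run_tcc` are hypothetical names, and retries of Confirm and Cancel are omitted.

```python
class InventoryTCC:
    """Hypothetical inventory resource exposing Try / Confirm / Cancel."""

    def __init__(self, available):
        self.available = available
        self.reserved = 0

    def try_(self, qty):
        if qty > self.available:
            raise ValueError("insufficient stock")
        self.available -= qty
        self.reserved += qty   # hold the stock without consuming it

    def confirm(self, qty):
        self.reserved -= qty   # consume the held stock

    def cancel(self, qty):
        self.reserved -= qty
        self.available += qty  # release the hold


def run_tcc(resources, qty):
    tried = []
    try:
        for r in resources:
            r.try_(qty)
            tried.append(r)
    except ValueError:
        for r in tried:        # cancel only what was successfully tried
            r.cancel(qty)
        return False
    for r in resources:
        r.confirm(qty)
    return True
```

Unlike two-phase commit, the reservation is an ordinary business operation, so no database lock is held between Try and Confirm.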

That is the answer to how to thoroughly solve the problem of distributed system consistency in big data. I hope the above is of some help. If you still have questions, you can follow the industry information channel to learn more.

© 2024 shulou.com SLNews company. All rights reserved.