2025-02-20 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces the principles behind distributed consistency in Java systems. The problems described here come up often in real projects, so the sections below walk through the core concepts and the trade-offs they imply.
I. Overview
A distributed system is a system in which hardware or software components are distributed on different network computers and communicate and coordinate with each other only through message transmission.
II. Characteristics of distributed systems
Distribution: The computers in a distributed system can be located anywhere in space, and their placement can change at any time.
Peer-to-peer: There is no master/slave distinction between the computers in a distributed system; all the nodes that make it up are peers.
Replicas: A replica is one of the most common concepts in distributed systems. It refers to the redundancy a distributed system uses to provide data and services.
Concurrency: Concurrent operations are common in networked programs, and shared resources such as databases or distributed storage may be accessed concurrently. Coordinating distributed concurrent operations efficiently is one of the biggest challenges in distributed system architecture and design.
Lack of a global clock: A distributed system is made up of processes spread arbitrarily across space that communicate only by exchanging messages. Because there is no global clock, it is difficult to determine which of two events happened first.
Failures always occur: Every failure considered at design time will eventually occur in production, and many failures that were never considered will occur as well, so no abnormal situation should be ignored during system design.
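The article does not say how events can be ordered without a global clock. One classic technique, named here only as an illustration, is a Lamport logical clock: each process keeps a counter, and a receiver moves its counter past the sender's timestamp. The sketch below is minimal; the class and method names are assumptions made for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Lamport logical clock: orders events without relying on wall-clock time.
// Names are hypothetical; this is an illustration, not a library API.
final class LamportClock {
    private final AtomicLong time = new AtomicLong(0);

    // Local event or message send: advance the counter by one.
    long tick() {
        return time.incrementAndGet();
    }

    // Message receipt: jump past the sender's timestamp, then advance.
    long receive(long senderTime) {
        return time.updateAndGet(local -> Math.max(local, senderTime) + 1);
    }

    long now() {
        return time.get();
    }
}

final class LamportClockDemo {
    public static void main(String[] args) {
        LamportClock a = new LamportClock();
        LamportClock b = new LamportClock();
        long sent = a.tick();            // process A stamps an outgoing message
        long received = b.receive(sent); // process B is ordered after the send
        System.out.println("sent=" + sent + " received=" + received);
    }
}
```

The only guarantee is that a receive is always stamped later than the matching send. Two unrelated events may still get timestamps that say nothing about their real-time order.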
III. Common problems in distributed systems
Communication anomalies: Nodes in a distributed system must communicate over the network, so every network call carries the risk that the network is unavailable (for example, because optical fiber, routers, DNS or other infrastructure has failed).
Network partition: When the network misbehaves, the latency between some nodes keeps growing until only a subset of the nodes can still communicate normally. This situation is called a network partition, commonly known as "split-brain". During a partition, small local clusters form inside the distributed system. In extreme cases these small clusters independently carry out work that should be done by the system as a whole, including transaction processing on data, which poses a great challenge to distributed consistency.
Three states: Because the network is unreliable, every request and response in a distributed system has three possible outcomes: success, failure and timeout. In most cases a call returns an explicit success or failure response, but when the network is abnormal a timeout can occur, usually in one of the following two cases:
Because of a network problem, the request never reached the receiver and was lost in transit.
The receiver processed the request successfully, but the response was lost on its way back to the sender.
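The three outcomes described above can be sketched in Java. This is a minimal illustration: `CallOutcome`, `RemoteCall.invoke` and `slowCall` are hypothetical names, and the remote call is simulated by a local `Supplier` instead of a real network client.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// The "three states" of a distributed call.
enum CallOutcome { SUCCESS, FAILURE, TIMEOUT }

final class RemoteCall {
    // Runs the (simulated) remote call and classifies the result.
    static CallOutcome invoke(Supplier<Boolean> remoteCall, long timeoutMillis) {
        CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            boolean ok = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return ok ? CallOutcome.SUCCESS : CallOutcome.FAILURE;
        } catch (TimeoutException e) {
            // Unknown outcome: the request may have been lost, or it may have
            // been processed and only the response was lost.
            future.cancel(true);
            return CallOutcome.TIMEOUT;
        } catch (ExecutionException e) {
            return CallOutcome.FAILURE; // the call itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return CallOutcome.FAILURE;
        }
    }

    // Simulated slow endpoint used only for the demonstration.
    static boolean slowCall(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return true;
    }
}

final class RemoteCallDemo {
    public static void main(String[] args) {
        System.out.println(RemoteCall.invoke(() -> true, 1000));
        System.out.println(RemoteCall.invoke(() -> false, 1000));
        System.out.println(RemoteCall.invoke(() -> RemoteCall.slowCall(500), 50));
    }
}
```

A caller that sees `TIMEOUT` cannot tell the two cases apart, so a retry is only safe when the operation is idempotent.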
IV. Distributed transactions
Distributed transactions will be familiar to most readers, since nearly every distributed system has to deal with them. In general, a distributed transaction can be regarded as a sequence of distributed operations, each of which is usually treated as a sub-transaction. A distributed transaction can therefore also be defined as a kind of nested transaction, and it has the ACID properties as well. However, because each sub-transaction executes on a different node, implementing a distributed transaction processing system that guarantees ACID is particularly complex. This is why classical distributed theories such as CAP and BASE emerged.
CAP theory
CAP theory states that a distributed system cannot satisfy all three of the following basic requirements at the same time: consistency (C: Consistency), availability (A: Availability) and partition tolerance (P: Partition tolerance).
Consistency: In a distributed environment, consistency refers to whether multiple replicas of the data agree with each other. Consider a system whose data replicas live on different nodes. If the data on one node is updated but the data on a second node is not updated accordingly, a read from the second node returns the old data. This is a typical data inconsistency. If, once an update completes, every user reading from any node sees the new data, the system is considered strongly consistent.
Availability: The service must always be available, and every user request must return a result within a limited time. The key points are "limited time" and "return a result". What counts as a limited time differs between business scenarios. The returned result must be explicit, that is, either success or failure.
Partition tolerance: When a distributed system encounters any network partition failure, it must still be able to provide a service that satisfies consistency and availability, unless the whole system is down.
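The inconsistency described under Consistency can be reproduced with two in-memory replicas. This is a minimal sketch, assuming a hypothetical `ReplicatedStore` in which the primary forwards writes to the secondary lazily; a real system would replicate over the network.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Two in-memory replicas; writes reach the secondary only when replicate() runs.
// All names are hypothetical and exist only for this illustration.
final class ReplicatedStore {
    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> secondary = new HashMap<>();
    private final Queue<String[]> replicationLog = new ArrayDeque<>();

    void write(String key, String value) {
        primary.put(key, value);
        replicationLog.add(new String[] {key, value}); // not yet applied to the secondary
    }

    String readPrimary(String key) {
        return primary.get(key);
    }

    String readSecondary(String key) {
        return secondary.get(key);
    }

    // Applies all pending updates to the secondary, in write order.
    void replicate() {
        String[] entry;
        while ((entry = replicationLog.poll()) != null) {
            secondary.put(entry[0], entry[1]);
        }
    }
}

final class ReplicatedStoreDemo {
    public static void main(String[] args) {
        ReplicatedStore store = new ReplicatedStore();
        store.write("x", "1");
        store.replicate();
        store.write("x", "2");
        // The secondary still holds the old value until replication catches up.
        System.out.println("primary=" + store.readPrimary("x")
                + " secondary=" + store.readSecondary("x"));
        store.replicate();
        System.out.println("secondary after replication=" + store.readSecondary("x"));
    }
}
```

A strongly consistent design would not acknowledge the write until the secondary had applied it, at the cost of a slower write path.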
Application of CAP Theorem
Give up P: To avoid partition-tolerance problems altogether, the simple approach is to put all the data (or at least all the data involved in a transaction) on a single node. Note, however, that giving up P means giving up the scalability of the system.
Give up A: When a network partition or other failure occurs, the affected services must wait for recovery and cannot serve requests in the meantime.
Give up C: Giving up consistency here does not mean abandoning it completely. It means giving up strong consistency and keeping only eventual consistency. How long it takes for the data to become consistent depends on the system design, including how long it takes to replicate data between nodes.
From a design point of view, a distributed system cannot satisfy strong consistency, availability and partition tolerance all at once. Partition tolerance is a basic requirement of any distributed system, so system architects usually concentrate on finding the right balance between C (consistency) and A (availability) according to the characteristics of the business.
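The choice between C and A during a partition can be illustrated with a single replica that follows one of two policies. This is a minimal sketch under stated assumptions: `PartitionPolicy` and `ReplicaNode` are hypothetical names, and the partition is simulated with a boolean flag instead of being detected.

```java
import java.util.Optional;

// How a replica behaves while it is cut off from the rest of the cluster.
enum PartitionPolicy { PREFER_CONSISTENCY, PREFER_AVAILABILITY }

// Hypothetical replica holding one value; the partition state is set by hand.
final class ReplicaNode {
    private final PartitionPolicy policy;
    private String localValue;
    private boolean partitioned;

    ReplicaNode(PartitionPolicy policy, String initialValue) {
        this.policy = policy;
        this.localValue = initialValue;
    }

    void setPartitioned(boolean partitioned) {
        this.partitioned = partitioned;
    }

    // Called when an update from the rest of the cluster reaches this node.
    void applyReplicatedValue(String value) {
        this.localValue = value;
    }

    // Returns empty when the node refuses to answer rather than risk stale data.
    Optional<String> read() {
        if (partitioned && policy == PartitionPolicy.PREFER_CONSISTENCY) {
            return Optional.empty(); // gives up availability
        }
        return Optional.of(localValue); // may be stale: gives up strong consistency
    }
}

final class ReplicaNodeDemo {
    public static void main(String[] args) {
        ReplicaNode cp = new ReplicaNode(PartitionPolicy.PREFER_CONSISTENCY, "v1");
        ReplicaNode ap = new ReplicaNode(PartitionPolicy.PREFER_AVAILABILITY, "v1");
        cp.setPartitioned(true);
        ap.setPartitioned(true);
        System.out.println("CP read: " + cp.read());
        System.out.println("AP read: " + ap.read());
    }
}
```

Neither policy is better in general. Which one fits depends on whether the business can tolerate a refused request or a stale answer.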
BASE theory
BASE was proposed by an eBay architect. The name stands for Basically Available, Soft state and Eventually consistent. BASE is the result of trading off consistency against availability in CAP. It summarizes the practice of large-scale Internet distributed systems and evolved gradually from the CAP theorem. Its core idea is that even when strong consistency cannot be achieved, each application can reach eventual consistency in a way that suits its own business characteristics. The three elements of BASE are described below.
Basically available: When a distributed system suffers an unpredictable failure, it is allowed to lose part of its availability. This is by no means the same as the system being unavailable. Two typical examples follow:
Loss of response time: Normally a search engine returns query results within 0.5 seconds, but because of a failure (such as a power or network outage in a data center) the response time increases to 2 seconds.
Service degradation: During flash sales and rush purchases, only a portion of user requests is served in full so that the system stays stable.
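The service degradation example can be sketched with a `java.util.concurrent.Semaphore` that caps the number of requests served in full and answers the rest with a fallback. This is a minimal sketch; `DegradingGateway` and its fallback message are assumptions made for the illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Caps concurrent full-service requests; excess requests get a degraded reply.
// Hypothetical class for illustration only.
final class DegradingGateway {
    static final String DEGRADED_RESPONSE = "busy, please retry later";

    private final Semaphore permits;

    DegradingGateway(int maxConcurrentRequests) {
        this.permits = new Semaphore(maxConcurrentRequests);
    }

    String handle(Supplier<String> fullService) {
        if (!permits.tryAcquire()) {
            return DEGRADED_RESPONSE; // shed load instead of queueing
        }
        try {
            return fullService.get();
        } finally {
            permits.release();
        }
    }
}

final class DegradingGatewayDemo {
    public static void main(String[] args) {
        DegradingGateway gateway = new DegradingGateway(1);
        // The nested call arrives while the only permit is taken, so it is degraded.
        String result = gateway.handle(() -> gateway.handle(() -> "order placed"));
        System.out.println(result);
        // Once the permit is released, requests are served in full again.
        System.out.println(gateway.handle(() -> "order placed"));
    }
}
```

`tryAcquire()` never blocks, so an overloaded gateway answers at once with the fallback. That keeps response time bounded, which is the point of staying basically available.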
Soft state: Also called weak state, in contrast to hard state. It means the data in the system is allowed to be in an intermediate state, and that this intermediate state is considered not to affect the overall availability of the system. In other words, the system is allowed to lag while it synchronizes data between replicas on different nodes.
Eventually consistent: All replicas of the data in the system reach a consistent state after a period of synchronization. Strong consistency does not have to be guaranteed in real time.
Amazon's chief technology officer has published an article describing eventual consistency in detail. In his view, eventual consistency is a special form of weak consistency: the system guarantees that, if no new updates are made, the data will eventually reach a consistent state, so that every client access eventually returns the latest value. When no failures occur, the delay before the data becomes consistent depends on factors such as network latency, system load and the design of the data replication scheme.
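The convergence property described above can be sketched with replicas that exchange versioned values and keep the one with the highest version. This is a minimal sketch under a loud assumption: version numbers are unique and totally ordered, which a real system must arrange separately. `VersionedValue` and `GossipReplica` are hypothetical names.

```java
// A value tagged with a version; assumes versions are unique and totally ordered.
final class VersionedValue {
    final long version;
    final String value;

    VersionedValue(long version, String value) {
        this.version = version;
        this.value = value;
    }
}

// Hypothetical replica that converges by keeping the highest version it has seen.
final class GossipReplica {
    private VersionedValue current = new VersionedValue(0, "");

    void localWrite(long version, String value) {
        merge(new VersionedValue(version, value));
    }

    // Last-writer-wins merge: keep whichever value has the higher version.
    void merge(VersionedValue other) {
        if (other.version > current.version) {
            current = other;
        }
    }

    // Two-way exchange: afterwards both replicas hold the newer value.
    void syncWith(GossipReplica peer) {
        peer.merge(current);
        merge(peer.current);
    }

    VersionedValue state() {
        return current;
    }
}

final class GossipReplicaDemo {
    public static void main(String[] args) {
        GossipReplica r1 = new GossipReplica();
        GossipReplica r2 = new GossipReplica();
        GossipReplica r3 = new GossipReplica();
        r1.localWrite(1, "a");
        r2.localWrite(2, "b");
        // With no further writes, a few sync rounds bring every replica to the same state.
        r1.syncWith(r2);
        r2.syncWith(r3);
        System.out.println(r1.state().value + " " + r2.state().value + " " + r3.state().value);
    }
}
```

Between the write and the last sync, the replicas disagree. That window is the soft state BASE allows.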
In practice, there are five variants of eventual consistency:
Causal consistency: If process A notifies process B after updating a data item, then process B's subsequent reads of that item must return the value written by process A. If process B then updates the item, the update must be based on process A's latest value so that no update is lost. Reads by a process C that has no causal relationship with process A carry no such restriction.
Read-your-writes consistency: After process A updates a data item, it always sees the updated value. In other words, the data a process reads is never older than the data it last wrote.
Session consistency: The system guarantees read-your-writes consistency within a single valid session. After an update, the same session always reads the latest value.
Monotonic read consistency: Once a process has read a value of a data item, the system never returns an older value to that process on any later read.
Monotonic write consistency: The system guarantees that writes from the same process are applied in the order they were issued.
These five variants are commonly used in practice and can be combined with each other to design a distributed system with eventual consistency. In general, BASE theory targets large-scale, highly available and scalable distributed systems. It is the opposite of traditional ACID and differs from ACID's strong consistency model: it gains availability by sacrificing strong consistency, allowing data to be inconsistent for a period of time as long as it eventually reaches a consistent state. In real distributed scenarios, different parts of the business have different consistency requirements, so ACID and BASE are often used together in one design.
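Two of the variants listed above, read-your-writes and monotonic reads, can be sketched with a client session that remembers the highest version it has written or read and rejects replicas that lag behind it. This is a minimal sketch: `ItemReplica` and `ClientSession` are hypothetical names, and per-item version numbers are an assumption made for the illustration.

```java
import java.util.Optional;

// A replica of one data item, exposing the version it has applied so far.
final class ItemReplica {
    private long version;
    private String value;

    ItemReplica(String initialValue) {
        this.version = 0;
        this.value = initialValue;
    }

    // Applies an update only if it is newer than what the replica already holds.
    void apply(long newVersion, String newValue) {
        if (newVersion > version) {
            version = newVersion;
            value = newValue;
        }
    }

    long version() {
        return version;
    }

    String value() {
        return value;
    }
}

// Hypothetical client session enforcing read-your-writes and monotonic reads.
final class ClientSession {
    private long minVersion = 0; // highest version this session has written or read

    void recordWrite(long version) {
        minVersion = Math.max(minVersion, version);
    }

    // Returns empty when the replica is too stale for this session.
    Optional<String> read(ItemReplica replica) {
        if (replica.version() < minVersion) {
            return Optional.empty(); // caller should retry on another replica
        }
        minVersion = replica.version();
        return Optional.of(replica.value());
    }
}

final class ClientSessionDemo {
    public static void main(String[] args) {
        ItemReplica fresh = new ItemReplica("v0");
        ItemReplica stale = new ItemReplica("v0");
        fresh.apply(1, "v1"); // the write has reached only one replica

        ClientSession session = new ClientSession();
        session.recordWrite(1);
        System.out.println("stale replica: " + session.read(stale));
        System.out.println("fresh replica: " + session.read(fresh));
    }
}
```

Rejecting a stale replica is only one possible design. A session could also be pinned to a single replica, trading some availability for simplicity.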