Updated 2025-02-24. Source: SLTechnology News&Howtos (shulou.com), Development
This article introduces time, clocks, and event ordering in distributed systems: why physical clocks are not enough to order events, and how logical clocks (Lamport timestamps, vector clocks, and version vectors) record event order and detect conflicting updates.
Physical clock vs logical clock
One might ask: why do distributed systems not use physical clocks to record events? Each event could be stamped with a timestamp, and to compare the order of two events you would just compare their timestamps.
The reason is that, although real life has a unified standard of physical time, the nodes of a distributed system do not all record the same time. Even when the nodes are synchronized with NTP, deviations on the order of milliseconds remain between them [1] [2]. A distributed system therefore needs another way to record the order of events, and that is the logical clock.
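To make the problem concrete, here is a minimal sketch of how a few milliseconds of clock skew can reverse the apparent order of a send and its matching receive. All numbers (the skews, the network delay) are made up for illustration.

```python
# Hypothetical values: two nodes whose wall clocks disagree by a few milliseconds.
TRUE_SEND_MS = 1000      # real time at which node P sends a message
NETWORK_DELAY_MS = 1     # real time the message needs to reach node Q
SKEW_P_MS = 3            # assumed: P's clock runs 3 ms ahead of real time
SKEW_Q_MS = -2           # assumed: Q's clock runs 2 ms behind real time

# Each node stamps its event with its own (skewed) wall clock.
send_stamp = TRUE_SEND_MS + SKEW_P_MS
recv_stamp = TRUE_SEND_MS + NETWORK_DELAY_MS + SKEW_Q_MS

# Comparing the raw timestamps puts the receive before the send.
receive_looks_earlier = recv_stamp < send_stamp
print(send_stamp, recv_stamp, receive_looks_earlier)
```

The receive necessarily happened after the send, yet its physical timestamp is smaller, which is exactly the kind of inversion a logical clock is meant to rule out.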
Lamport timestamps
Leslie Lamport proposed the concept of the logical clock in 1978 and described one representation of it, known as Lamport timestamps [3].
Depending on whether nodes interact, events in a distributed system fall into three types: events that occur within a node, send events, and receive events. Lamport timestamps work as follows:
Figure 1: Lamport timestamps space-time diagram (image source: Wikipedia)
Each event is assigned a Lamport timestamp; the initial value is 0
If the event occurs within the node, the timestamp is incremented by 1
If the event is a send event, the timestamp is incremented by 1 and the new timestamp is put in the message
If the event is a receive event, timestamp = max(local timestamp, timestamp in the message) + 1
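The four rules above can be sketched as a small class. This is an illustrative sketch only; the class and method names are hypothetical and do not come from any particular library.

```python
class LamportClock:
    """Illustrative Lamport clock for a single node (names are hypothetical)."""

    def __init__(self):
        self.time = 0  # initial timestamp is 0

    def local_event(self):
        # Event inside the node: timestamp + 1.
        self.time += 1
        return self.time

    def send(self):
        # Send event: timestamp + 1; the returned value travels in the message.
        self.time += 1
        return self.time

    def receive(self, message_time):
        # Receive event: max(local timestamp, timestamp in the message) + 1.
        self.time = max(self.time, message_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
p.local_event()                 # P's clock becomes 1
msg = p.send()                  # P's clock becomes 2 and is sent with the message
q.local_event()                 # Q's clock becomes 1
received_at = q.receive(msg)    # Q's clock becomes max(1, 2) + 1
print(msg, received_at)
```

Because the receiver always jumps past the timestamp carried in the message, a receive event never gets a timestamp smaller than or equal to that of its send event.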
Suppose there are events a and b, and let C(a) and C(b) denote their Lamport timestamps. If C(a) < C(b), then a is said to have happened before b, written a -> b; for example, C1 -> B1 in Figure 1. With this definition, events with different Lamport timestamps can be compared, which gives a partial order over the events. This ordering is consistent with causality, but a smaller timestamp alone does not prove that one event caused the other.
If C(a) = C(b), how are a and b ordered? Suppose a and b occur on nodes P and Q respectively, and let Pi and Qj denote the numbers we assign to P and Q. If C(a) = C(b) and Pi < Qj, a is likewise defined as occurring before b, written a => b. If we number nodes A, B, and C in Figure 1 as Ai = 1, Bj = 2, Ck = 3, then because C(B4) = C(C3) and Bj < Ck, we have B4 => C3.
With these two definitions we can order every pair of events and obtain a total order. For the example above, the events can be sorted from C1 to A4.
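The tie-breaking rule amounts to sorting by the pair (timestamp, node number). A minimal sketch, using made-up events and timestamps rather than the values in Figure 1:

```python
# Hypothetical events: "ts" is the Lamport timestamp, "node" is the number
# assigned to the node on which the event occurred.
events = [
    {"name": "c", "node": 3, "ts": 2},
    {"name": "a", "node": 1, "ts": 2},
    {"name": "b", "node": 2, "ts": 1},
    {"name": "d", "node": 2, "ts": 2},
]


def total_order_key(event):
    # Compare timestamps first; break ties with the node number.
    return (event["ts"], event["node"])


ordered = [e["name"] for e in sorted(events, key=total_order_key)]
print(ordered)
```

Events a, d, and c share the same timestamp, so their relative order is decided purely by the node numbers. That order is arbitrary with respect to causality, which is the limitation the next section addresses.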
Vector clock
Lamport timestamps give us an ordering of events, but there is one relationship they cannot represent well: concurrency [4]. For example, in Figure 1 events B4 and C3 have no causal relationship and are concurrent, yet the Lamport timestamps define an order between them.
The vector clock is another logical clock, built on the idea of Lamport timestamps. Using a vector, each node records not only its own Lamport timestamp but also the Lamport timestamps of the other nodes. Vector clocks work much like Lamport timestamps, as the following figure illustrates:
Figure 2: Vector clock space-time diagram (image source: Wikipedia)
Suppose events a and b occur on nodes P and Q respectively, with vector clocks Ta and Tb. If Tb[Q] > Ta[Q] and Tb[P] >= Ta[P], then a happened before b, written a -> b. So far this is not much different from Lamport timestamps, so how does a vector clock tell when two events are concurrent?
If Tb[Q] > Ta[Q] and Tb[P] < Ta[P], then a and b are considered concurrent, written a <-> b. For example, in Figure 2 the fourth event on node B and the second event on node C have no causal relationship and are concurrent.
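The comparison can be sketched as follows. Note that this sketch uses the standard component-wise comparison over all entries, which generalizes the two-entry rule stated above; the update functions follow the usual vector clock convention (increment your own entry, take the element-wise maximum on receive). All function names are hypothetical.

```python
def tick(clock, node):
    """Return a copy of `clock` with `node`'s own entry incremented."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def receive(local, received, node):
    """Element-wise maximum of both clocks, then increment the receiver's entry."""
    nodes = set(local) | set(received)
    merged = {n: max(local.get(n, 0), received.get(n, 0)) for n in nodes}
    return tick(merged, node)


def compare(ta, tb):
    """Classify the relation between two vector clocks (missing entries count as 0)."""
    nodes = set(ta) | set(tb)
    a_le_b = all(ta.get(n, 0) <= tb.get(n, 0) for n in nodes)
    b_le_a = all(tb.get(n, 0) <= ta.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a -> b
    if b_le_a:
        return "after"       # b -> a
    return "concurrent"      # a <-> b


p1 = tick({}, "P")           # P sends a message carrying {"P": 1}
q1 = tick({}, "Q")           # local event on Q
q2 = receive(q1, p1, "Q")    # Q receives P's message
p2 = tick(p1, "P")           # later local event on P, unknown to Q

print(compare(p1, q2))       # send vs. receive
print(compare(p2, q2))       # two events that never learned of each other
```

Here `q2` is `{"P": 1, "Q": 2}` and `p2` is `{"P": 2}`: Tb[Q] > Ta[Q] but Tb[P] < Ta[P], so the two events are concurrent under the rule above as well.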
Version vector
With vector clocks we can determine the relationship between any two events: either one precedes the other, or they are concurrent. Identifying the order of events has important applications in engineering practice, the most common being the detection of data conflicts.
Data in a distributed system generally has multiple replicas, and several replicas may be updated at the same time, which causes inconsistency between them [7]. The version vector is implemented much like the vector clock [8], and its purpose is to detect data conflicts [9]. The following example illustrates how version vectors are used [10]:
Figure 3: Version vector
The client writes data; the request is handled by Sx, which creates the vector ([Sx, 1]). This version of the data is D1.
The second request is also handled by Sx; the data becomes D2 and the vector becomes ([Sx, 2]).
The third and fourth requests are handled by Sy and Sz respectively. In each case the client first reads D2, and then D3 and D4 are written to Sy and Sz.
On the fifth update, the client reads three versions of the data (D2, D3, and D4), determines by a vector-clock-style comparison that D3 and D4 conflict, resolves the conflict, and writes the result as D5.
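The walkthrough above can be sketched with dictionaries mapping server id to counter. The helper names are hypothetical, the vectors for D3 and D4 are inferred from the steps (each extends D2's vector with the handling server), and the choice of Sx as the server that handles the fifth write is an assumption, since the text does not say which server does so.

```python
def write(version, server):
    """Return a new version vector after `server` handles a write."""
    updated = dict(version)
    updated[server] = updated.get(server, 0) + 1
    return updated


def descends(newer, older):
    """True if `newer` has seen every update recorded in `older`."""
    return all(newer.get(server, 0) >= count for server, count in older.items())


def in_conflict(v1, v2):
    """Two versions conflict when neither descends from the other."""
    return not descends(v1, v2) and not descends(v2, v1)


def merge(*versions):
    """Element-wise maximum of several version vectors."""
    merged = {}
    for version in versions:
        for server, count in version.items():
            merged[server] = max(merged.get(server, 0), count)
    return merged


d1 = write({}, "Sx")         # {"Sx": 1}
d2 = write(d1, "Sx")         # {"Sx": 2}
d3 = write(d2, "Sy")         # client read D2, write handled by Sy
d4 = write(d2, "Sz")         # client read D2, write handled by Sz

print(in_conflict(d2, d3))   # D3 descends from D2, so no conflict
print(in_conflict(d3, d4))   # neither descends from the other

# Assumption: the reconciled value is written through Sx.
d5 = write(merge(d2, d3, d4), "Sx")
print(d5)
```

After the merge, D5's vector dominates those of D2, D3, and D4, so any replica holding one of the older versions can safely replace it with D5.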
Vector clocks only detect data conflicts; they do not resolve them. How a conflict is resolved varies from scenario to scenario. Common approaches are to keep the most recent update (last write wins), to hand the conflicting versions to the client and let the client decide how to handle them, or to avoid conflicts in advance by using a quorum [11].
Because every node records logical clock information for every piece of data, one problem vector clocks and version vectors can face in practice is that the vector grows too large; the metadata used to manage the data can even become larger than the data itself [12].
One solution is to build the vector from server ids rather than client ids, because the number of servers is stable compared with the number of clients. Another is to set a maximum size and, once it is exceeded, discard the oldest entries in the vector [10] [13].
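The size-limit approach can be sketched as below. To know which entry is oldest, this sketch assumes each entry also stores the time of its last update; that extra field, the function name, and the sample values are all illustrative assumptions.

```python
def prune(vector, max_entries):
    """Keep only the `max_entries` most recently updated entries.

    `vector` maps server id -> (counter, last_update_time); the time field is
    an assumption of this sketch, needed to decide which entry is oldest.
    """
    if len(vector) <= max_entries:
        return dict(vector)
    newest_first = sorted(vector.items(), key=lambda item: item[1][1], reverse=True)
    return dict(newest_first[:max_entries])


vector = {"Sx": (3, 100), "Sy": (1, 40), "Sz": (1, 70)}
pruned = prune(vector, max_entries=2)
print(pruned)
```

Dropping entries discards causal information, so after pruning two versions that were actually ordered may be reported as conflicting; the size limit trades some precision in conflict detection for bounded metadata.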