2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
The Gossip protocol is often confusing to people meeting it for the first time. This article summarizes where it comes from, how it works, its message types and communication modes, and its advantages and shortcomings.
The Gossip protocol is also called the epidemic protocol.
It was originally used to synchronize data between the nodes of a distributed database, and is now widely used for database replication, information dissemination, cluster membership, failure detection and so on.
The Gossip protocol is based on the theory of six degrees of separation (Six Degrees of Separation). Put simply, a person can reach anyone else in the world through six intermediaries. The mathematical formula is:
$$n = \frac{\log N}{\log W}$$
Here n represents the complexity (the number of degrees), N represents the total number of people, and W represents the contact width of each person. Taking Dunbar's number, that is, each person knows 150 people, six degrees reach 150^6 = 11,390,625,000,000 people (about 11.4 trillion).
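The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the relation described in the paragraph is n = log(N) / log(W), and the function name `degrees_needed` is made up for illustration.

```python
import math


def degrees_needed(total_people: int, contacts_per_person: int) -> float:
    """n = log(N) / log(W): how many degrees W contacts per person need to reach N people."""
    return math.log(total_people) / math.log(contacts_per_person)


# Reach of six degrees when everybody knows 150 people (Dunbar's number).
reach = 150 ** 6
print(reach)                       # 11390625000000
print(degrees_needed(reach, 150))  # approximately 6 (floating-point result)
```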
According to the six degrees of separation theory, information spreads very quickly and needs relatively few network interactions. For example, Facebook published a study on February 4, 2016: it analyzed the data of the 1.59 billion users registered at that time and found that the "network diameter" was 4.57. In plain terms, any two people were about 4.57 hops apart.
-Principle-
The Gossip protocol works as follows:
The seed node spreads messages periodically [assume the period is 1 second].
An infected node randomly selects N neighboring nodes and spreads the message to them [assume the fan-out is set to 6, so at most 6 nodes are contacted at a time].
A node only receives messages; it does not report a result back to the sender.
Each time a node spreads a message, it selects nodes it has not yet sent to.
A node that receives a message does not spread it back to the sender: if A -> B, then B does not send to A when it spreads.
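The steps above can be sketched as a small simulation. This is a minimal illustration under simplifying assumptions, not a reference implementation: every node can reach every other node, rounds are synchronous (one round stands for one period), and the names `simulate_gossip` and `skip` are hypothetical.

```python
import random
from collections import defaultdict


def simulate_gossip(num_nodes: int, fanout: int = 6, seed: int = 0) -> int:
    """Return how many rounds pass until every node holds the message."""
    if num_nodes < 1 or fanout < 1:
        raise ValueError("num_nodes and fanout must be at least 1")
    rng = random.Random(seed)
    informed = {0}            # node 0 plays the seed node
    skip = defaultdict(set)   # skip[n]: peers that n already sent to or received from
    rounds = 0
    while len(informed) < num_nodes:
        rounds += 1
        newly_informed = set()
        for node in sorted(informed):
            candidates = [p for p in range(num_nodes)
                          if p != node and p not in skip[node]]
            for peer in rng.sample(candidates, min(fanout, len(candidates))):
                skip[node].add(peer)   # never pick the same target twice
                skip[peer].add(node)   # the receiver never sends back to the sender
                if peer not in informed:
                    newly_informed.add(peer)
        informed |= newly_informed     # receivers start spreading in the next round
    return rounds


print(simulate_gossip(1000, fanout=6, seed=1))
```

Receivers never acknowledge anything in this sketch; the only bookkeeping is the per-node set of peers to skip, which covers both "do not resend to the same node" and "do not send back to the sender".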
Information dissemination in the Gossip protocol is usually initiated by seed nodes. The whole propagation process takes some time, because there is no guarantee that all nodes have received the message at any given moment. In theory, however, all nodes will eventually receive it, so Gossip is an eventually consistent protocol.
The Gossip protocol is a multi-master protocol: writes can be initiated on different nodes and are then synchronized to the other replicas. The nodes in a Gossip network are all peers, and the network is unstructured.
-Message types-
There are two ways of message transmission in Gossip protocol: Anti-Entropy (anti-entropy propagation) and Rumor-Mongering (rumor spread).
Anti-entropy propagates all data with a fixed probability. Participating nodes have only two states: Susceptible (has not yet received the data) and Infective (has the data and spreads it). This model is also called simple epidemics (the SI model). The seed node shares all of its data with other nodes in order to eliminate any inconsistency between them, which guarantees eventual, complete consistency. The disadvantage is that the message volume is very large and unbounded, so it is usually only used to initialize the data of newly added nodes.
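A rough sketch of one anti-entropy exchange is shown below. The data layout (a dict mapping each key to a `(value, version)` pair) and the name `anti_entropy_sync` are assumptions made for illustration; the point is only that the whole store is shipped every time, regardless of how much the target already has.

```python
from typing import Dict, Tuple

Store = Dict[str, Tuple[str, int]]  # key -> (value, version); layout assumed for this sketch


def anti_entropy_sync(source: Store, target: Store) -> int:
    """Ship every entry of source to target; target keeps the newer version of each key.

    Returns the number of entries shipped, which is always the full store size.
    """
    for key, (value, version) in source.items():
        if key not in target or target[key][1] < version:
            target[key] = (value, version)
    return len(source)


seed_node = {"a": ("1", 3), "b": ("2", 1)}
new_node: Store = {}
shipped = anti_entropy_sync(seed_node, new_node)
print(shipped, new_node == seed_node)  # 2 True
```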
Rumor mongering propagates only newly arrived data with a fixed probability. Participating nodes have three states: Susceptible (has not yet received the update), Infective (has the update and spreads it), and Removed (has the update but no longer spreads it). This model is also called complex epidemics (the SIR model). Messages contain only the latest updates, and a rumor is marked as removed after a certain point in time and is no longer propagated. The disadvantage is that there is some probability that the system stays inconsistent. It is usually used for incremental data synchronization between nodes.
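The three states can be illustrated with a small simulation. This is a sketch under assumed rules, not the definitive algorithm: each infective node spreads the rumor for a fixed number of rounds (`active_rounds`, a hypothetical parameter) and is then marked removed. Real systems may use other stopping rules, such as losing interest with some probability after contacting a node that already knows the rumor.

```python
import random
from enum import Enum
from typing import List


class State(Enum):
    SUSCEPTIBLE = "susceptible"  # has not received the rumor
    INFECTIVE = "infective"      # has the rumor and is spreading it
    REMOVED = "removed"          # has the rumor but stopped spreading


def spread_rumor(num_nodes: int, fanout: int = 2,
                 active_rounds: int = 2, seed: int = 0) -> List[State]:
    """Run rounds until nobody is infective; return the final state of every node."""
    rng = random.Random(seed)
    state = [State.SUSCEPTIBLE] * num_nodes
    rounds_left = [0] * num_nodes
    state[0], rounds_left[0] = State.INFECTIVE, active_rounds
    while State.INFECTIVE in state:
        spreaders = [n for n in range(num_nodes) if state[n] is State.INFECTIVE]
        for node in spreaders:
            peers = [p for p in range(num_nodes) if p != node]
            for peer in rng.sample(peers, min(fanout, len(peers))):
                if state[peer] is State.SUSCEPTIBLE:
                    state[peer] = State.INFECTIVE
                    rounds_left[peer] = active_rounds
            rounds_left[node] -= 1
            if rounds_left[node] <= 0:
                state[node] = State.REMOVED  # the rumor has gone stale for this node
    return state


final = spread_rumor(50, seed=3)
missed = sum(1 for s in final if s is State.SUSCEPTIBLE)
print(f"{missed} of {len(final)} nodes never heard the rumor")
```

Any node still susceptible at the end never received the update, which is the residual inconsistency the paragraph describes.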
-Communication modes-
The ultimate goal of the Gossip protocol is to distribute data to every node in the network. Depending on the application scenario, two nodes in the network can communicate in one of three modes: Push, Pull and Push/Pull.
Push: node A pushes its data (key, value, version) to node B, and B updates the entries for which A's version is newer than its own.
Pull: A sends only (key, version) to B; B sends back its local data (key, value, version) that is newer than A's, and A updates its local data.
Push/Pull: similar to Pull, but with one more step: A then pushes its local data that is newer than B's to B, and B updates its local data.
If one data synchronization between two nodes is defined as a cycle, then Push needs 1 message per cycle, Pull needs 2, and Push/Pull needs 3. Although Push/Pull uses the most messages, it is the most effective: in theory it makes the two nodes exactly the same within one cycle. Intuitively, Push/Pull also converges the fastest.
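The three modes and their message counts can be sketched as follows. The store layout (key mapped to a `(value, version)` pair) and all function names are assumptions for illustration. In `push_pull`, B's reply is assumed to also carry B's (key, version) digest, since A needs it to decide what to push back.

```python
from typing import Dict, Tuple

Store = Dict[str, Tuple[str, int]]  # key -> (value, version); layout assumed for this sketch


def digest(store: Store) -> Dict[str, int]:
    """The (key, version) summary a node sends instead of full values."""
    return {key: version for key, (_, version) in store.items()}


def newer_entries(store: Store, other_digest: Dict[str, int]) -> Store:
    """Entries of store that are missing from, or newer than, the other side."""
    return {k: (v, ver) for k, (v, ver) in store.items()
            if ver > other_digest.get(k, -1)}


def apply_updates(store: Store, updates: Store) -> None:
    """Keep whichever version of each key is newer."""
    for key, (value, version) in updates.items():
        if key not in store or store[key][1] < version:
            store[key] = (value, version)


def push(a: Store, b: Store) -> int:
    apply_updates(b, a)                   # message 1: A -> B, full (key, value, version)
    return 1


def pull(a: Store, b: Store) -> int:
    a_digest = digest(a)                  # message 1: A -> B, (key, version) only
    apply_updates(a, newer_entries(b, a_digest))  # message 2: B -> A, newer entries
    return 2


def push_pull(a: Store, b: Store) -> int:
    a_digest, b_digest = digest(a), digest(b)
    apply_updates(a, newer_entries(b, a_digest))  # messages 1 and 2, as in pull
    apply_updates(b, newer_entries(a, b_digest))  # message 3: A -> B, entries newer than B's
    return 3


a = {"x": ("a1", 2), "y": ("a2", 1)}
b = {"y": ("b2", 5), "z": ("b3", 1)}
messages = push_pull(a, b)
print(messages, a == b)  # 3 True
```

Only `push_pull` leaves both stores identical after a single cycle; `push` updates B alone and `pull` updates A alone.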
-Summary-
To sum up, Gossip is a decentralized distributed protocol in which data spreads from node to node like a virus. Because the spread is exponential, overall propagation is very fast, much like the spread of 2019-nCoV (COVID-19). It has the following advantages:
Scalability: nodes can be added and removed at will, and the state of a new node will eventually match that of the other nodes.
Fault tolerance: the failure or restart of any single node does not stop the propagation of Gossip messages, which gives the protocol the natural fault tolerance of a distributed system.
Decentralization: there is no central node and all nodes are peers. No node needs to know the state of the whole network; as long as the network is connected, any node can spread a message to the whole network.
Consistent convergence: messages propagate through the network exponentially, so inconsistencies in the system state converge quickly. A message reaches all N nodes in about log N rounds.
Simplicity: the protocol is simple.
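The logarithmic convergence mentioned in the list above can be made concrete with an idealized estimate. The sketch assumes that every informed node reaches `fanout` previously uninformed nodes in every round, which real gossip does not achieve because of duplicate deliveries, so the result is a lower bound rather than a prediction. The name `min_rounds` is hypothetical.

```python
def min_rounds(num_nodes: int, fanout: int) -> int:
    """Smallest r such that (1 + fanout) ** r >= num_nodes (idealized, no duplicates)."""
    rounds, informed = 0, 1
    while informed < num_nodes:
        informed *= 1 + fanout   # every informed node informs `fanout` new nodes
        rounds += 1
    return rounds


print(min_rounds(1_000, 6))      # 4
print(min_rounds(1_000_000, 6))  # 8
```

Growing the cluster a thousandfold only doubles the estimate, which is what logarithmic growth means in practice.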
It also has the following shortcomings:
Message delay: each node sends a message to only a few randomly chosen nodes, so the message reaches the whole network only after several rounds of dissemination, which inevitably causes delay.
Message redundancy: nodes periodically pick random neighbors to send messages to, and the receivers repeat the same step. The same node therefore inevitably receives the same message several times, which increases the message-processing load.
Given these advantages and disadvantages, the Gossip protocol is suited to data consistency in AP scenarios (availability and partition tolerance, in CAP terms). Common applications include P2P network communication, Apache Cassandra, Redis Cluster and Consul.