2025-04-06 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/01 Report --
This article explains how Kafka achieves high reliability. Since many readers have questions about this topic, we have put together a simple, practical walkthrough. We hope it answers your doubts; let's get started!
Topic partition replicas
Before version 0.8.0, Kafka had no concept of replicas. Back then, people used Kafka only to store unimportant data, because without replicas data could easily be lost. As the business use of Kafka grew, so did the need for durability, so to guarantee data reliability Kafka introduced partition replicas in version 0.8.0 (see KAFKA-50 for details). In other words, each partition can be explicitly configured with a number of replicas (for example, by specifying replication-factor when creating a topic, or by setting default.replication.factor at the broker level), which is usually set to 3.
Kafka guarantees that events within a single partition are ordered, and a partition can be either online (available) or offline (unavailable). Among a partition's replicas, one is the leader and the rest are followers. All reads and writes go through the leader, while followers continually replicate the leader's data. When the leader fails, one of the followers becomes the new leader. Partition replicas thus introduce data redundancy, which is what provides Kafka's data reliability.
Kafka's partitioned, multi-replica architecture is the core of its reliability guarantee. Writing messages to multiple replicas is what allows Kafka to preserve messages even when a broker crashes.
Producer sends messages to Broker
To send a message to a Kafka topic, we go through a producer. As mentioned above, a Kafka topic consists of multiple partitions, and each partition has multiple replicas. To let users tune data reliability, Kafka provides a message acknowledgement mechanism in the producer: we can configure how many replicas must receive a message before it is considered successfully sent. This is specified with the acks parameter when creating the producer (before version 0.8.2.x it was set with the request.required.acks parameter).
This parameter supports the following three values:
acks = 0: the producer considers the message successfully written to Kafka as soon as it has been sent over the network. Errors can still occur in this case, for instance if the message object cannot be serialized or the network card fails, but if the partition is offline, or the entire cluster is unavailable for a long time, the producer will receive no error. acks=0 is very fast (which is why many benchmarks use this mode) and yields impressive throughput and bandwidth utilization, but with this mode you will certainly lose some messages.
acks = 1: the leader returns an acknowledgement or an error response once it has received the message and written it to the partition's data file (not necessarily flushed to disk). In this mode, if a normal leader election occurs, the producer receives a LeaderNotAvailableException during the election; if the producer handles the error properly, it retries sending the message, and the message eventually reaches the new leader safely. However, data can still be lost in this mode, for example when the message has been written to the leader but the leader crashes before the message is replicated to the follower replicas.
acks = all (equivalent to request.required.acks = -1): the leader waits until all in-sync replicas have received the message before returning an acknowledgement or an error response. Combined with the min.insync.replicas parameter, this lets you require a minimum number of replicas to receive the message before the write is acknowledged; the producer retries until the message is successfully committed. This is also the slowest mode, because the producer must wait for all in-sync replicas to receive the current message before it can send other messages.
Depending on the actual application scenario, we choose the acks setting that gives the required data reliability.
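The acks semantics described above can be sketched as a small decision function. This is a toy model for illustration, not Kafka's broker code; the function name and parameters are hypothetical stand-ins for what the broker tracks internally.

```python
# Hypothetical model of Kafka's acks semantics (illustration only, not real
# broker code): decide whether a produce request would be acknowledged.

def produce_ack(acks, leader_alive, num_acked_replicas, min_insync_replicas):
    """Return True if the write would be acknowledged as successful.

    acks=0      -> the producer never waits, so sending always "succeeds".
    acks=1      -> succeeds once the leader has written the message.
    acks="all"  -> succeeds only when at least min.insync.replicas in-sync
                   replicas (leader included) have received the message.
    """
    if acks == 0:
        return True
    if acks == 1:
        return leader_alive
    if acks == "all":
        return leader_alive and num_acked_replicas >= min_insync_replicas
    raise ValueError(f"unsupported acks value: {acks!r}")

# acks=all with min.insync.replicas=2: one replica having the message
# is not enough, so the write is rejected (NotEnoughReplicas-style error).
print(produce_ack("all", leader_alive=True, num_acked_replicas=1, min_insync_replicas=2))  # False
print(produce_ack("all", leader_alive=True, num_acked_replicas=2, min_insync_replicas=2))  # True
print(produce_ack(1, leader_alive=True, num_acked_replicas=1, min_insync_replicas=2))      # True
```

The model makes the trade-off visible: raising min.insync.replicas increases durability at the cost of availability, since writes fail when too few replicas are in sync.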
In addition, the (legacy Scala) producer can send messages in synchronous mode (the default, configured with producer.type=sync) or asynchronous mode (producer.type=async). Asynchronous mode greatly improves delivery performance, but it increases the risk of data loss. If you need to guarantee message reliability, producer.type must be set to sync.
Leader election
Before introducing leader election, let's look at the ISR (in-sync replicas) list. The leader of each partition maintains an ISR list containing the broker IDs of the follower replicas that are keeping up with it. Only followers that stay caught up with the leader, within the bound configured by the replica.lag.time.max.ms parameter, remain in the ISR, and only ISR members are eligible to be elected leader.
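ISR membership can be sketched as a simple lag check. This is a toy model of the rule, not Kafka's internal bookkeeping; the function name and the caught-up timestamp map are illustrative assumptions.

```python
# Toy model (not broker internals): a follower stays in the ISR only if it
# has been fully caught up with the leader within replica.lag.time.max.ms.

REPLICA_LAG_TIME_MAX_MS = 10_000  # Kafka's default replica.lag.time.max.ms

def compute_isr(leader_id, last_caught_up_ms, now_ms,
                max_lag_ms=REPLICA_LAG_TIME_MAX_MS):
    """last_caught_up_ms maps follower broker id -> last time (ms) that
    follower was fully caught up with the leader. The leader itself is
    always a member of the ISR."""
    isr = {leader_id}
    for broker_id, ts in last_caught_up_ms.items():
        if now_ms - ts <= max_lag_ms:
            isr.add(broker_id)
    return sorted(isr)

# Follower 2 caught up 3s ago (stays in the ISR); follower 3 has lagged
# for 15s, beyond the 10s bound, so it is dropped from the ISR.
print(compute_isr(1, {2: 97_000, 3: 85_000}, now_ms=100_000))  # [1, 2]
```

Since only ISR members can become leader, this lag bound is what guarantees that a newly elected leader is missing at most a bounded window of recent data.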
Data consistency
By data consistency we mean that a consumer reads the same data whether it reads from the old leader or from a newly elected one. So how does Kafka achieve this?
Suppose a partition has 3 replicas, where replica 0 is the leader and replicas 1 and 2 are followers, all in the ISR list. Although replica 0 has already written Message4, consumers can only read up to Message2, because Message2 is the latest message that all ISR replicas have synchronized. Only messages below the High Water Mark can be read by consumers, and the High Water Mark is determined by the replica with the smallest offset in the ISR list, which corresponds to replica 2 in the figure above. This is much like the bucket principle: the shortest stave decides how much the bucket holds.
The reason is that messages not yet replicated to enough replicas are considered "unsafe": if the leader crashes and another replica becomes the new leader, these messages are likely to be lost, and if we allowed consumers to read them, consistency would be broken. Imagine a consumer reads and processes Message4 from the current leader (replica 0); the leader then crashes and replica 1 is elected as the new leader. Another consumer now reads from the new leader and finds that the message does not exist, which is a data inconsistency.
Of course, the High Water Mark mechanism means that messages take longer to reach consumers, because replication between brokers must complete first. How far a follower may fall behind can be bounded with the replica.lag.time.max.ms parameter, which specifies the maximum time a replica is allowed to lag while replicating messages before it is removed from the ISR.
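The High Water Mark rule described above can be sketched in a few lines. This is a simplified model for illustration (function names are hypothetical), assuming the HWM is simply the minimum log end offset across the ISR.

```python
# Simplified model: the High Water Mark (HWM) is the minimum log end offset
# across the ISR replicas, and consumers may only read messages below it.

def high_water_mark(isr_log_end_offsets):
    """isr_log_end_offsets: the log end offset of each ISR replica."""
    return min(isr_log_end_offsets)

def readable_messages(log, isr_log_end_offsets):
    """Messages a consumer is allowed to see: those below the HWM."""
    return log[:high_water_mark(isr_log_end_offsets)]

# The leader has written 4 messages, but the slowest ISR replica has only
# replicated 2, so consumers can read no further than Message2 -- the
# "bucket principle" from the example above.
log = ["Message1", "Message2", "Message3", "Message4"]
print(readable_messages(log, [4, 3, 2]))  # ['Message1', 'Message2']
```

Run through the earlier scenario with this model: even if the leader (offset 4) crashes, any new leader elected from the ISR has at least offset 2, so every message a consumer was ever allowed to read survives the failover.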
This concludes our study of how Kafka achieves high reliability. We hope it has answered your questions. Theory works best when paired with practice, so go and try it out! To keep learning, stay tuned to this site for more practical articles.