What are the architectures available for Kafka

Shulou (Shulou.com), 06/03 report. Updated 2025-07-19. From: SLTechnology News&Howtos

This article introduces the high-availability architecture of Kafka: how the Controller is elected, how a partition's leader replica is chosen, how followers synchronize with the leader, and how replica failures are handled.

Controller election

When a partition is created or replicas are added, a Leader has to be elected from among the replicas.

How is the Leader elected? How is the vote run? Do all replicas of the partition simply start a vote and campaign directly, for example through ZooKeeper (ZK)?

How can ZK be used to run an election? Which ZK feature detects nodes being added or removed? And why can ZK be used to acquire and release locks?

Three features are used: the watch mechanism; a node path cannot be created twice; ephemeral (temporary) nodes.

This implementation is relatively simple, but it has drawbacks. If there are many partitions and replicas and every replica takes part in the election directly, then each time a node is added or removed a large number of watch events fire and the load on ZK becomes too heavy.

This is what early versions of Kafka did; later versions use a different approach.

Not every replica takes part in the leader election. Instead, the election is directed by one of the Brokers, and this Broker's role is called the Controller.

This is similar to the Redis Sentinel architecture: before a failover, the sentinels must first choose one node among themselves to carry it out. Likewise, Kafka must first choose a single Controller from among all Brokers.

Every Broker tries to create the ephemeral node /controller in ZooKeeper, and only one of them can succeed (first come, first served).

If the Controller goes down or loses its network connection, its ephemeral node in ZK disappears. The other Brokers, which learn through a watch that the Controller is offline, then start campaigning to become the new Controller. The method is the same as before: whoever first writes the /controller node in ZK becomes the new Controller.
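The campaign described above can be sketched with a small in-memory stand-in for ZooKeeper. This is only an illustration: FakeZooKeeper, Broker, and their methods are hypothetical names invented here, not the ZooKeeper or Kafka API. It models just the three features the text relies on: a path can be created only once, ephemeral nodes vanish with their owner's session, and watchers are notified when the node is deleted.

```python
class FakeZooKeeper:
    """Hypothetical in-memory stand-in for the ZK features used by the election."""

    def __init__(self):
        self.nodes = {}     # path -> id of the session (broker) that owns the node
        self.watchers = {}  # path -> one-shot callbacks fired when the node is deleted

    def create_ephemeral(self, path, owner):
        # A path can be created only once: every later writer fails.
        if path in self.nodes:
            return False
        self.nodes[path] = owner
        return True

    def watch_delete(self, path, callback):
        self.watchers.setdefault(path, []).append(callback)

    def close_session(self, owner):
        # Ephemeral nodes disappear together with the session that created them.
        for path in [p for p, o in self.nodes.items() if o == owner]:
            del self.nodes[path]
            for callback in self.watchers.pop(path, []):
                callback()


class Broker:
    def __init__(self, broker_id, zk):
        self.broker_id = broker_id
        self.zk = zk
        self.alive = True

    def campaign(self):
        if not self.alive:
            return
        if not self.zk.create_ephemeral("/controller", self.broker_id):
            # Lost the race: watch the node and campaign again when it goes away.
            self.zk.watch_delete("/controller", self.campaign)

    def crash(self):
        self.alive = False
        self.zk.close_session(self.broker_id)


zk = FakeZooKeeper()
brokers = [Broker(i, zk) for i in range(3)]
for broker in brokers:
    broker.campaign()
first_controller = zk.nodes["/controller"]   # broker 0 wrote the node first

brokers[first_controller].crash()
second_controller = zk.nodes["/controller"]  # broker 1 wins the re-election
```

In this sketch the watchers fire in registration order, so broker 1 wins the second round; in a real cluster the winner is simply whichever broker's create request reaches ZooKeeper first.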

After becoming the Controller, a Broker takes on more responsibilities than the other nodes:

Listen for Broker changes

Listen for Topic changes

Listen for Partition changes

Obtain and manage information about Broker, Topic, and Partition

Manage the leader and follower information of each Partition

Partition replica Leader election

Once the Controller is determined, the partition leader election can begin. The next step is to find candidates. Every replica would like to nominate itself, but is every replica eligible to run? No. A few concepts are needed here.

Assigned Replicas (AR): all replicas of a partition.

In-Sync Replicas (ISR): the replicas among the AR that stay synchronized with the leader to within a certain lag.

Out-of-Sync Replicas (OSR): the replicas that lag too far behind the leader.

AR = ISR + OSR. Normally the OSR is empty: every replica is synchronizing normally, so AR = ISR.

Who can take part in the election? Not the whole AR, and not the OSR, but the ISR. The ISR is not fixed; it is a dynamic list.

As mentioned earlier, if a replica's synchronization delay exceeds 30 seconds, it is removed from the ISR and placed in the OSR; once it catches up, it rejoins the ISR.
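The dynamic ISR list can be illustrated with a short sketch. The function split_isr and its arguments are hypothetical names used only for this example; the 30-second limit is the value given in the text (Kafka exposes this limit as the broker setting replica.lag.time.max.ms). The sketch assumes we know, for each follower, the time at which it was last fully caught up with the leader.

```python
REPLICA_LAG_TIME_MAX_S = 30  # lag limit from the text, in seconds


def split_isr(leader_id, last_caught_up_s, now_s, max_lag_s=REPLICA_LAG_TIME_MAX_S):
    """Return (isr, osr) from the time each follower was last caught up with the leader."""
    isr, osr = [leader_id], []  # the leader is always in its own ISR
    for replica_id, caught_up_at in sorted(last_caught_up_s.items()):
        if now_s - caught_up_at <= max_lag_s:
            isr.append(replica_id)
        else:
            osr.append(replica_id)
    return isr, osr


# Follower 2 has not caught up for 40 seconds, so it drops out of the ISR.
isr, osr = split_isr(leader_id=0, last_caught_up_s={1: 95, 2: 60}, now_s=100)
# isr == [0, 1], osr == [2]

# Later follower 2 catches up again and rejoins the ISR.
isr_later, osr_later = split_isr(leader_id=0, last_caught_up_s={1: 119, 2: 118}, now_s=120)
# isr_later == [0, 1, 2], osr_later == []
```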

By default, when the leader replica fails, only replicas in the ISR are eligible to be elected as the new leader.

What if the ISR is empty? A partition cannot stay leaderless, so in this case replicas outside the ISR can be allowed to take part in the election. Allowing this is called unclean leader election.

unclean.leader.election.enable=false

To allow it, change this parameter to true (generally not recommended, because it can cause data loss).

With a Controller in place and the ISR as the pool of candidates, what rule decides the leader?

First of all, what are the common election protocols (or consensus algorithms) in distributed systems?

ZAB (used by ZK) and Raft (used by Redis Sentinel) are both variants of the Paxos algorithm. The core idea is: first come, first served, and the minority yields to the majority.

Kafka, however, does not use these protocols; it uses an algorithm of its own.

Why? For example, protocols such as ZAB may lead to split-brain (multiple leaders when nodes cannot communicate with each other) and the herd effect (a large number of watch events being triggered).

The documentation explains this:

https://kafka.apachecn.org/documentation.html#design_replicatedlog

The closest match to Kafka's election implementation is Microsoft's PacificA algorithm.

In this algorithm, the default is for the first replica in the ISR to become the leader, much like succession to the throne of a Chinese emperor, where the eldest son takes priority.
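The selection rule can be written down in a few lines. This is a minimal sketch rather than Kafka's actual code: elect_leader and its parameters are hypothetical, and it assumes that "first" means first in the order of the assigned-replica list. The unclean_enabled flag plays the role of unclean.leader.election.enable.

```python
def elect_leader(assigned_replicas, isr, live_replicas, unclean_enabled=False):
    """Pick the first live assigned replica that is in the ISR.

    If no such replica exists and unclean election is enabled, fall back to
    the first live replica outside the ISR. Return None if nobody qualifies.
    """
    for replica in assigned_replicas:
        if replica in isr and replica in live_replicas:
            return replica
    if unclean_enabled:
        for replica in assigned_replicas:
            if replica in live_replicas:
                return replica
    return None


# Replica 1 (the old leader) is down; replica 2 is first in line within the ISR.
clean_choice = elect_leader([1, 2, 3], isr={2, 3}, live_replicas={2, 3})

# The ISR is empty and only out-of-sync replica 3 is alive.
no_leader = elect_leader([1, 2, 3], isr=set(), live_replicas={3})
unclean_choice = elect_leader([1, 2, 3], isr=set(), live_replicas={3},
                              unclean_enabled=True)
```

With the default setting the partition stays without a leader in the second case; enabling the fallback restores availability at the cost of whatever messages replica 3 had not yet copied.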

Master-slave synchronization

Once the leader is determined, clients read and write only through the leader node. Followers synchronize data from the leader.

The offsets of different replicas differ, so how exactly are they synchronized?

To follow the rest, a few concepts are needed first.

LEO (Log End Offset): the offset of the next message waiting to be written (the latest offset + 1).

HW (High Watermark): the smallest LEO among the replicas in the ISR. The leader maintains the HW as the smallest LEO of all ISR replicas.

A consumer can consume only up to the position just before the HW. In other words, messages that the other replicas have not yet synchronized cannot be consumed.
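A small sketch of the two definitions may help. The function names and the LEO values below are illustrative assumptions, not taken from Kafka's code: the HW is the minimum LEO over the ISR, and a consumer can see only the offsets below it.

```python
def high_watermark(leo_by_replica, isr):
    """HW is the smallest LEO among the replicas currently in the ISR."""
    return min(leo_by_replica[replica] for replica in isr)


def visible_offsets(leo_by_replica, isr):
    """Offsets a consumer may read: everything strictly below the HW."""
    return range(0, high_watermark(leo_by_replica, isr))


# Illustrative state: the leader holds offsets 0..8, so its LEO is 9.
leo = {"leader": 9, "replica1": 8, "replica2": 6}

hw_all = high_watermark(leo, isr=["leader", "replica1", "replica2"])         # 6
readable = list(visible_offsets(leo, isr=["leader", "replica1", "replica2"]))
# readable == [0, 1, 2, 3, 4, 5]

# If replica2 fell out of the ISR, the HW would be bounded by replica1 instead.
hw_without_replica2 = high_watermark(leo, isr=["leader", "replica1"])        # 8
```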

Why is Kafka designed this way?

If a message were consumed before it had been synchronized, the consumer group's offset would run ahead of the followers; if the leader then crashed, the messages in between would be lost.

Next, look at how messages are synchronized.

Replica1 and Replica2 each synchronize one message, and the HW advances by 1 to 7, because Replica2's LEO advanced by 1 to 7.

Replica1 and Replica2 each synchronize 2 more messages; the HW and the LEO now coincide, both at 9.

To understand this, you need to know how a follower synchronizes with the leader.

The follower sends a fetch request to the leader; after sending data to the follower, the leader updates its record of that follower's LEO.

After the follower receives the response, it writes the messages and updates its own LEO in turn.

The leader updates the HW (the smallest LEO in the ISR).
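The three steps can be simulated in a few lines. This is a simplified sketch under stated assumptions: the Leader and Follower classes and fetch_once are hypothetical, every replica is assumed to be in the ISR, and both followers are assumed to start at LEO 6 while the leader holds offsets 0 to 8 (the original figure is not available here). The leader's record of a follower's LEO is advanced as soon as the data is sent, following the steps as described; a real broker differs in detail.

```python
class Follower:
    def __init__(self, name, log):
        self.name = name
        self.log = list(log)

    @property
    def leo(self):
        return len(self.log)  # offset of the next message to be written


class Leader:
    def __init__(self, log, followers):
        self.log = list(log)
        self.follower_leo = {f.name: f.leo for f in followers}
        self.hw = self._min_leo()

    @property
    def leo(self):
        return len(self.log)

    def _min_leo(self):
        return min([self.leo, *self.follower_leo.values()])

    def handle_fetch(self, follower_name, fetch_offset, max_messages):
        batch = self.log[fetch_offset:fetch_offset + max_messages]
        # Step 1: after sending data, record the follower's new LEO.
        self.follower_leo[follower_name] = fetch_offset + len(batch)
        # Step 3: the HW becomes the smallest LEO in the ISR.
        self.hw = self._min_leo()
        return batch


def fetch_once(leader, follower, max_messages):
    batch = leader.handle_fetch(follower.name, follower.leo, max_messages)
    # Step 2: the follower appends the messages, which advances its own LEO.
    follower.log.extend(batch)


r1 = Follower("replica1", range(6))
r2 = Follower("replica2", range(6))
leader = Leader(range(9), [r1, r2])  # HW starts at 6

for follower in (r1, r2):
    fetch_once(leader, follower, max_messages=1)
hw_after_first_round = leader.hw     # 7

for follower in (r1, r2):
    fetch_once(leader, follower, max_messages=2)
hw_after_second_round = leader.hw    # 9, equal to the leader's LEO
```

Note that the HW only moves once the slowest follower has advanced: after replica1's fetch in each round it is still held back by replica2.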

Kafka's ISR-based replication design provides high throughput while ensuring data consistency.

Replica fault handling

Follower failure

First, if a follower goes down, it is removed from the ISR.

After the follower recovers, where does it start synchronizing data from?

Suppose Replica1 goes down.

After recovery, it first truncates the messages from the previously recorded HW (6) onward, here messages 6 and 7.

It then synchronizes messages from the Leader. Once it has caught up with the Leader (within the 30-second lag limit), it rejoins the ISR.
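The recovery steps can be sketched as follows. The function recover_follower is a hypothetical helper, each message is represented simply by its offset, and the leader's log at recovery time (offsets 0 to 8) is an assumption made for the example.

```python
def recover_follower(follower_log, recorded_hw, leader_log):
    """Truncate the follower's log at its recorded HW, then refill it from the leader.

    Returns (recovered_log, truncated_messages).
    """
    truncated = follower_log[recorded_hw:]      # messages at or above the HW are dropped
    log = follower_log[:recorded_hw]
    log.extend(leader_log[len(log):])           # fetch everything from the HW onward
    return log, truncated


replica1_log = list(range(8))   # Replica1 held messages 0..7 when it went down
leader_log = list(range(9))     # assumed leader state at recovery time: 0..8

recovered, truncated = recover_follower(replica1_log, recorded_hw=6,
                                        leader_log=leader_log)
# truncated == [6, 7]; recovered matches the leader's log again
```

Truncating first matters because the messages above the HW were never confirmed by the whole ISR, so the leader's current log may no longer contain the same data at those offsets.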

Leader failure

Take the same example again, and suppose the Leader fails.

A new Leader is elected first; because Replica1 takes precedence, it becomes the Leader.

To ensure data consistency, the other followers must truncate any messages above the HW (here there is nothing to truncate).

Then Replica2 synchronizes data from the new Leader.

At this point, message 8 in the original Leader is lost.
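The failover above can be replayed with a short sketch. fail_over is a hypothetical helper, messages are again represented by their offsets, and the starting logs (Leader 0..8, Replica1 0..7, Replica2 0..5, HW 6) are inferred from the numbers in the text, so treat them as assumptions.

```python
def fail_over(logs, failed_leader, new_leader, hw):
    """Replay a leader failure: followers truncate to the HW, then copy the new leader.

    Returns (surviving_logs, messages_lost_with_the_old_leader).
    """
    survivors = {name: list(log) for name, log in logs.items()
                 if name != failed_leader}
    leader_log = survivors[new_leader]
    for name in survivors:
        if name != new_leader:
            kept = survivors[name][:hw]                        # truncate above the HW
            survivors[name] = kept + leader_log[len(kept):]    # resync from new leader
    lost = [m for m in logs[failed_leader] if m not in leader_log]
    return survivors, lost


logs = {
    "leader": list(range(9)),     # messages 0..8
    "replica1": list(range(8)),   # messages 0..7
    "replica2": list(range(6)),   # messages 0..5
}

survivors, lost = fail_over(logs, failed_leader="leader",
                            new_leader="replica1", hw=6)
# lost == [8]; replica2 now holds messages 0..7, the same as replica1
```

The surviving replicas end up identical, which is the consistency the mechanism provides, while message 8 exists only on the failed broker.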

Note: this mechanism only guarantees data consistency between replicas; it does not guarantee that data is never lost or duplicated.
