
How to Analyze the Kafka Architecture

2025-01-28 Update From: SLTechnology News&Howtos


This article explains how to analyze the Kafka architecture: the role of each component and the guarantees Kafka provides.

Kafka is an open source, distributed, partitioned, replicated publish/subscribe messaging system built on a commit log. It has the following characteristics:

1. Message persistence: when extracting valuable information from big data, no loss of information is acceptable. Kafka uses disk structures designed for O(1) access, so performance stays stable even when storing large volumes of data. Messages are persisted to disk and replicated to prevent data loss.

2. High throughput: Kafka is designed to handle hundreds of megabytes of data per second from many clients while running on ordinary hardware.

3. Distributed: messages are partitioned across the brokers of a Kafka cluster, and consumption is distributed across consumers.

4. Multiple client support: Kafka integrates easily with clients on other platforms, such as Java, .NET, PHP, Ruby, and Python.

5. Real-time: messages become visible to consumers as soon as producers generate them. This property is critical for event-based systems.

Here is a brief description of the Kafka architecture, covering:

Description of each component of Kafka

Broker

Topic

Topic and Broker

Partition log

Partition distribution

Producer

Consumer

Guarantees provided by Kafka

Architecture diagram

Description of each component of Kafka

Broker

Each Kafka server is called a broker, and multiple brokers form a Kafka cluster.

One or more brokers can be deployed on a single machine. Brokers connected to the same ZooKeeper ensemble form one Kafka cluster.

Topic

Kafka is a publish/subscribe messaging system, and its logical structure is as follows:

A topic is a named category of messages, and a topic usually holds one class of messages. Each topic has one or more subscribers, that is, the consumers of its messages.

A producer pushes messages to a topic, and the consumers subscribed to that topic pull messages from it.

Topic and Broker

One or more topics can be created on a broker. The same topic can be spread across multiple brokers in the same cluster.

Partition log

Kafka maintains multiple partitions for each topic, and each partition maps to a logical log file:

Whenever a message is published to a partition of a topic, the broker appends the message to the last segment of that partition's logical log file. Segments are flushed to disk, and a flush can be triggered either by elapsed time or by the number of messages.

Each partition is an ordered, immutable, structured sequence of committed log records. Within a partition, each log record is assigned a sequence number, called the offset, which is unique within that partition. The logical log file is divided into multiple segment files (each segment has the same size).
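The append-and-offset behavior described above can be sketched in a few lines of Python. This is a toy in-memory model, not Kafka's storage code: the names PartitionLog, Segment, and max_records_per_segment are hypothetical, and segments roll over by record count here purely to keep the example small.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    base_offset: int  # offset of the first record stored in this segment
    records: List[bytes] = field(default_factory=list)


class PartitionLog:
    """Toy append-only log: one logical log split into fixed-size segments."""

    def __init__(self, max_records_per_segment: int = 3) -> None:
        self.max_records_per_segment = max_records_per_segment
        self.segments: List[Segment] = [Segment(base_offset=0)]
        self.next_offset = 0

    def append(self, record: bytes) -> int:
        """Append to the last segment and return the assigned offset."""
        active = self.segments[-1]
        if len(active.records) >= self.max_records_per_segment:
            # Roll a new segment once the active one is full.
            active = Segment(base_offset=self.next_offset)
            self.segments.append(active)
        active.records.append(record)
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def read(self, offset: int) -> bytes:
        """Find the segment that contains `offset` and return its record."""
        for segment in reversed(self.segments):
            if offset >= segment.base_offset:
                return segment.records[offset - segment.base_offset]
        raise IndexError(offset)


log = PartitionLog(max_records_per_segment=3)
offsets = [log.append(f"message{i}".encode()) for i in range(1, 6)]
```

Records are only ever added at the tail, so offsets grow monotonically, and a reader can locate any record from its offset and the base offset of each segment.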

The broker cluster retains all published messages, whether or not they have been consumed. How long they are kept depends on a configurable retention period. For example, if the retention policy is set to 2 days, each message is kept for two days after publication and can be consumed at any time during that window. After it expires, it is no longer retained.
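The two-day retention example can be illustrated with a small Python sketch. The record layout and the function name apply_retention are assumptions made for this illustration. Real Kafka applies retention to whole log segments rather than to individual records, so treat this only as a model of the rule "age decides, consumption does not".

```python
from typing import List, Tuple

Record = Tuple[int, float, str]  # (offset, publish_time_seconds, payload)

DAY_SECONDS = 24 * 60 * 60
TWO_DAYS_SECONDS = 2 * DAY_SECONDS


def apply_retention(records: List[Record], now: float,
                    retention_seconds: float = TWO_DAYS_SECONDS) -> List[Record]:
    """Keep records younger than the retention period, consumed or not."""
    return [r for r in records if now - r[1] < retention_seconds]


now = 10.0 * DAY_SECONDS  # hypothetical current time: day 10
records = [
    (0, now - 3 * DAY_SECONDS, "old"),     # published 3 days ago: expired
    (1, now - 1 * DAY_SECONDS, "recent"),  # published 1 day ago: retained
    (2, now - 60, "fresh"),                # published 1 minute ago: retained
]
kept = apply_retention(records, now)
```

Note that the function never looks at whether a record has been read, which mirrors the statement that retention is independent of consumption.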

Partition distribution

The partitions of the log are distributed across multiple brokers in a Kafka cluster. Each partition is replicated, and its replicas live on different brokers, which provides fault tolerance. The number of replicas and the brokers they are placed on are configurable. Once the replication policy is applied, each broker hosts one or more partitions of each topic. As shown in the figure:

For a given partition, each broker that holds a replica of it plays one of two roles: leader or follower.

Look at the example above. Red marks a leader.

For the 4 partitions of topic1:

The leader of Part1 is broker1; the follower is broker2.

The leader of Part2 is broker2; the follower is broker1.

The leader of Part3 is broker3; the follower is broker1.

The leader of Part4 is broker4; the follower is broker2.

For the 3 partitions of topic2:

The leader of Part1 is broker1; the follower is broker2.

The leader of Part2 is broker2; the follower is broker3.

The leader of Part3 is broker3; the follower is broker4.

For the 4 partitions of topic3:

The leader of Part1 is broker4; the follower is broker1.

The leader of Part2 is broker2; the follower is broker1.

The leader of Part3 is broker3; the follower is broker1.

The leader of Part4 is broker1; the follower is broker2.

Here is a real example:

The leader of partition 0 in the figure is broker 2, and the partition has three replicas: 2, 1, 3.

In-Sync Replicas (ISR): the replicas that are currently in sync with the leader. The ISR of partition 0 is 2, 1, 3, indicating that all three replicas are healthy. If a broker goes down, it no longer appears in the ISR.

After stopping broker 1:

The leader of each partition handles the read and write requests for that partition.

The followers of each partition replicate data asynchronously from the leader.

Kafka dynamically maintains the set of in-sync replicas (ISR) that are consistent with the leader, and persists the latest ISR to ZooKeeper. If the leader fails, one of the partition's followers is elected as the new leader.

So, in a Kafka cluster, each broker usually plays both roles: leader for some partitions and follower for others. The leader is the busiest replica because it handles the read and write requests. Distributing leaders evenly across brokers therefore balances the load.
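As a rough illustration of ISR maintenance and leader failover, the Python sketch below uses the partition 0 example (leader broker 2, replicas 2, 1, 3) and assumes all three replicas start in sync. PartitionState and handle_broker_failure are hypothetical names, and picking the first surviving in-sync replica is a simplification, not a description of Kafka's actual election procedure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionState:
    replicas: List[int]  # broker ids holding a copy of the partition
    isr: List[int]       # subset of replicas currently in sync with the leader
    leader: int


def handle_broker_failure(state: PartitionState, failed_broker: int) -> PartitionState:
    """Drop the failed broker from the ISR; elect a new leader if it was leading."""
    new_isr = [b for b in state.isr if b != failed_broker]
    leader = state.leader
    if leader == failed_broker:
        if not new_isr:
            raise RuntimeError("no in-sync replica available to become leader")
        # Simplification: choose the first surviving in-sync replica.
        leader = new_isr[0]
    return PartitionState(replicas=state.replicas, isr=new_isr, leader=leader)


partition0 = PartitionState(replicas=[2, 1, 3], isr=[2, 1, 3], leader=2)
after_follower_down = handle_broker_failure(partition0, failed_broker=1)
after_leader_down = handle_broker_failure(partition0, failed_broker=2)
```

When a follower fails, only the ISR shrinks. When the leader fails, leadership moves to a replica that was already in sync, so no committed record is lost in this model.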

Producer

A producer, as the source of messages, must deliver each message it produces to a specific destination: one partition of a topic. The producer can choose the partition using a specified algorithm, or pick one at random.
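A minimal sketch of the two partition-selection strategies mentioned above, in Python. The function name choose_partition is hypothetical, and CRC32 is used only as a convenient stand-in for a stable hash. The Java client's default partitioner uses a different hash function, so partition numbers computed here will not match a real cluster.

```python
import random
import zlib
from typing import Optional


def choose_partition(key: Optional[bytes], num_partitions: int) -> int:
    """Pick a partition: hash the key if present, otherwise pick at random."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    if key is None:
        # No key: spread messages over the partitions at random.
        return random.randrange(num_partitions)
    # Keyed message: the same key always maps to the same partition,
    # which keeps all messages for that key in one ordered log.
    return zlib.crc32(key) % num_partitions


p1 = choose_partition(b"user-42", num_partitions=4)
p2 = choose_partition(b"user-42", num_partitions=4)
p_random = choose_partition(None, num_partitions=4)
```

Hashing a key is the usual choice when per-key ordering matters, while random selection simply spreads load.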

Consumer

Kafka also has the concept of a consumer group, which logically groups a set of consumers. Because each Kafka consumer is a process, the consumers in a group are often separate processes running on different machines. Each message in a topic can be consumed by multiple consumer groups, but within each group only one consumer consumes it. So, if you want a message to be consumed by several consumers, those consumers must belong to different consumer groups. In other words, the consumer group is the logical subscriber of a topic.

Each consumer can subscribe to multiple topics.
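The "one consumer per group" rule follows from each partition being owned by exactly one consumer within a group. The Python sketch below shows this with a simple round-robin assignment. The function name and the consumer names (a1, a2, b1) are hypothetical, and round-robin is only one possible assignment strategy.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Round-robin the partitions of a topic over the consumers of one group."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3]
group_a = assign_partitions(partitions, ["a1", "a2"])  # two consumers share the topic
group_b = assign_partitions(partitions, ["b1"])        # a second group sees everything too
```

Every partition appears exactly once inside each group, so a message is delivered once per group, while both groups still receive all messages.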

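Each consumer keeps track of the offset it has read up to in each partition. In the design described here, consumers store their offsets in ZooKeeper; newer Kafka clients store committed offsets in the internal __consumer_offsets topic instead.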
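Offset tracking can be pictured as a small key-value store consulted on every read. The sketch below is a toy model in Python: OffsetStore, poll, and the group and topic names are hypothetical. An in-memory dict stands in for the external store, and keying offsets by consumer group, topic, and partition is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

OffsetKey = Tuple[str, str, int]  # (consumer group, topic, partition)


class OffsetStore:
    """In-memory stand-in for wherever committed offsets are kept."""

    def __init__(self) -> None:
        self._committed: Dict[OffsetKey, int] = {}

    def position(self, key: OffsetKey) -> int:
        # A group that has never committed starts from offset 0 in this sketch.
        return self._committed.get(key, 0)

    def commit(self, key: OffsetKey, next_offset: int) -> None:
        self._committed[key] = next_offset


def poll(log: List[str], store: OffsetStore, key: OffsetKey, max_records: int) -> List[str]:
    """Read up to max_records starting at the committed offset, then commit."""
    start = store.position(key)
    batch = log[start:start + max_records]
    store.commit(key, start + len(batch))
    return batch


partition_log = ["message1", "message2", "message3"]
store = OffsetStore()
key_a = ("group-a", "topic1", 0)
key_b = ("group-b", "topic1", 0)

first = poll(partition_log, store, key_a, max_records=2)   # group-a reads offsets 0 and 1
second = poll(partition_log, store, key_a, max_records=2)  # group-a resumes at offset 2
other = poll(partition_log, store, key_b, max_records=5)   # group-b starts from offset 0
```

Because the log itself is never modified by reads, each group's progress is just a number, and different groups can move through the same partition independently.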

Guarantees provided by Kafka

1. If a producer sends messages to a specific partition, they are stored in the order sent. That is, if the sending order is message1, message2, message3, then the offsets recorded in the partition log satisfy message1_offset < message2_offset < message3_offset.

2. A consumer also reads the records in the log in order.

3. If a topic has a replication factor of N, then up to N-1 broker failures can be tolerated.
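The third guarantee is simple arithmetic: a partition stays available as long as at least one of its N replicas survives. A tiny Python check, using a hypothetical helper name and the replica set 2, 1, 3 as an example with N = 3:

```python
from typing import Iterable, Set


def is_partition_available(replica_brokers: Iterable[int], failed_brokers: Set[int]) -> bool:
    """A partition is available while at least one replica is on a live broker."""
    return any(broker not in failed_brokers for broker in replica_brokers)


replicas = [2, 1, 3]  # replication factor N = 3

two_failures = is_partition_available(replicas, failed_brokers={1, 2})       # N-1 failures
three_failures = is_partition_available(replicas, failed_brokers={1, 2, 3})  # N failures
```

With two of the three brokers down the partition is still served by the remaining replica. Only the loss of all N replicas makes it unavailable.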

Architecture diagram

With the components above in mind, the Kafka architecture diagram should now be easy to read:

That is how to analyze the Kafka architecture. These are concepts you are likely to encounter or use in daily work.
