I. Kafka Architecture
1.1 Topology
As shown below:
Figure 1
1.2 Related Concepts
As shown in Figure 1, the Kafka-related terms are explained as follows:
1. producer: a message producer; a terminal or service that publishes messages to the Kafka cluster.
2. broker: a server in the Kafka cluster.
3. topic: the category to which each message published to the Kafka cluster belongs; that is, Kafka is topic-oriented.
4. partition: a physical concept; each topic contains one or more partitions. The partition is Kafka's unit of allocation.
5. consumer: a terminal or service that consumes messages from the Kafka cluster.
6. consumer group: in the high-level consumer API, each consumer belongs to a consumer group. Each message can be consumed by only one consumer within a given consumer group, but it can be consumed by multiple consumer groups (see the sketch after this list).
7. replica: a copy of a partition, used to ensure the partition's high availability.
8. leader: a role within the replicas; producers and consumers interact only with the leader.
9. follower: a role within the replicas; it replicates data from the leader.
10. controller: one of the servers in the Kafka cluster, responsible for leader election and various failovers.
11. zookeeper: Kafka stores the cluster's metadata in ZooKeeper.
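To make the consumer-group semantics concrete, here is a minimal sketch using the classic Kafka Java client; the broker address, topic, and group name are placeholders, not values from this article. If two copies of this program run with the same group.id, each message is delivered to only one of them; a program using a different group.id still receives every message.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GroupConsumerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "demo-group");              // consumers sharing this id form one group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                // Each record is delivered to exactly one consumer within this group
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}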
1.3 ZooKeeper Nodes
The storage structure Kafka uses in ZooKeeper is shown in the following figure:
Figure 2
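The figure itself is not reproduced here. As a rough sketch of the layout in a classic ZooKeeper-based Kafka deployment (exact paths vary by version, so treat these as illustrative), the important znodes include:

/brokers/ids/[brokerId]                                  -- ephemeral registration of each live broker
/brokers/topics/[topic]/partitions/[partitionId]/state  -- current leader and ISR of each partition
/controller                                              -- the broker currently acting as controller
/controller_epoch                                        -- generation counter for controller changes
/consumers/[group]/offsets/[topic]/[partitionId]         -- offsets committed by the old high-level consumer
/admin/delete_topics                                     -- topics pending deletion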
II. Producer Publishing Messages
2.1 Write Mode
The producer uses push mode to publish messages to the broker, and each message is appended to a partition. This is a sequential disk write (sequential disk writes are more efficient than random memory writes, which helps guarantee Kafka's throughput).
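A minimal producer sketch using the Kafka Java client is shown below; the broker address and topic name are placeholders, not values from this article.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each send() appends the message to one partition of the topic
            producer.send(new ProducerRecord<>("demo-topic", "key-1", "hello kafka"));
        }
    }
}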
2.2 Message Routing
When the producer sends a message to the broker, it chooses which partition to store it in according to a partitioning algorithm. The routing mechanism is as follows:
1. If a partition is specified, it is used directly.
2. If no partition is specified but a key is, a partition is selected by hashing the key.
3. If neither a partition nor a key is specified, a partition is selected by round-robin.
The Java client's partitioning source code is attached below and makes this clear at a glance; a short usage sketch follows it.
// Create a message instance
public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value) {
    if (topic == null)
        throw new IllegalArgumentException("Topic cannot be null");
    if (timestamp != null && timestamp < 0)
        throw new IllegalArgumentException("Invalid timestamp " + timestamp);
    this.topic = topic;
    this.partition = partition;
    this.key = key;
    this.value = value;
    this.timestamp = timestamp;
}

// Compute the partition: if a partition is specified, use it directly;
// otherwise delegate to the partitioner, which uses the key
private int partition(ProducerRecord<K, V> record, byte[] serializedKey, byte[] serializedValue, Cluster cluster) {
    Integer partition = record.partition();
    if (partition != null) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(record.topic());
        int lastPartition = partitions.size() - 1;
        // An explicitly given partition must lie within [0, numPartitions - 1]
        if (partition < 0 || partition > lastPartition) {
            throw new IllegalArgumentException(String.format(
                    "Invalid partition given with record: %d is not in the range [0...%d].",
                    partition, lastPartition));
        }
        return partition;
    }
    return this.partitioner.partition(record.topic(), record.key(), serializedKey,
            record.value(), serializedValue, cluster);
}

// Select a partition (DefaultPartitioner)
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    int numPartitions = partitions.size();
    if (keyBytes == null) {
        // No key: round-robin over the available partitions
        int nextValue = counter.getAndIncrement();
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (availablePartitions.size() > 0) {
            int part = DefaultPartitioner.toPositive(nextValue) % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            // No partitions are currently available, fall back to all partitions
            return DefaultPartitioner.toPositive(nextValue) % numPartitions;
        }
    } else {
        // Hash the keyBytes (murmur2) to choose a partition
        return DefaultPartitioner.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }
}
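As a usage sketch of the three routing cases (the topic name and values are placeholders, and producer is a KafkaProducer<String, String> like the one in section 2.1), the cases map onto the ProducerRecord constructors like this:

// Case 1: partition specified explicitly -- used as-is (here partition 0)
producer.send(new ProducerRecord<>("demo-topic", 0, "key-1", "value-1"));
// Case 2: key specified but no partition -- partition = murmur2(key) % numPartitions
producer.send(new ProducerRecord<>("demo-topic", "key-1", "value-1"));
// Case 3: neither partition nor key -- round-robin over available partitions
producer.send(new ProducerRecord<>("demo-topic", "value-1"));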
2.3 Write Process
The sequence diagram for a producer write is as follows:
Figure 3
Process description:
1. The producer first finds the leader of the partition from the "/brokers/.../state" node in ZooKeeper.
2. The producer sends the message to that leader.
3. The leader writes the message to its local log.
4. The followers pull the message from the leader, write it to their local logs, and send an ACK to the leader.
5. After the leader has received ACKs from all replicas in the ISR, it advances the HW (high watermark, the offset of the last committed message) and sends an ACK to the producer; a sketch of how to observe the leader and ISR follows.
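The leader and ISR referred to in steps 4 and 5 can be observed from a client. Here is a minimal sketch, assuming a modern Java client where AdminClient is available (this API postdates the ZooKeeper-lookup approach described above; the broker address and topic name are placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class LeaderIsrDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(Collections.singletonList("demo-topic"))
                                         .all().get().get("demo-topic");
            for (TopicPartitionInfo p : desc.partitions()) {
                // leader: the replica clients talk to; isr: the in-sync replica set
                System.out.printf("partition=%d leader=%s isr=%s%n",
                        p.partition(), p.leader(), p.isr());
            }
        }
    }
}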
2.4 Producer Delivery Guarantees
In general, there are three situations:
1. At most once: messages may be lost, but are never redelivered.
2. At least once: messages are never lost, but may be redelivered.
3. Exactly once: each message is delivered once and only once.
When the producer sends a message to the broker, once the message has been committed it will not be lost, because of replication. However, if a network problem interrupts communication after the producer has sent the data, the producer cannot tell whether the message was committed. Kafka cannot determine what happened during a network failure either, but the producer could generate something like a primary key and retry idempotently on failure, which would achieve Exactly once; this has not been implemented yet. So by default Kafka guarantees At least once from producer to broker, and you can get At most once by configuring the producer to send asynchronously; a configuration sketch follows.
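As a rough sketch of how these semantics map onto producer configuration in the Java client: the property names (acks, retries) are standard, but the commentary on the mapping is illustrative rather than part of the original article.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DeliveryGuaranteeDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Leaning towards At least once: wait for the full ISR to acknowledge,
        // and retry on failure (retries can redeliver, hence "at least" once).
        props.put("acks", "all");
        props.put("retries", "3");

        // Leaning towards At most once instead: fire-and-forget,
        // no acknowledgement and no retries.
        // props.put("acks", "0");
        // props.put("retries", "0");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("demo-topic", "key-1", "value-1"));
        }
    }
}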