
Why is Kafka so good?


This article explains why Kafka is so powerful. The content is straightforward and easy to follow.

1 Introduction to Kafka

1.1 Overview of Kafka

(Figure: Kafka architecture)

Kafka is a distributed message queue based on the publish/subscribe model. Thanks to its high throughput, Kafka is mainly used in the field of big data real-time processing, where it plays an important role in data acquisition, transmission, and storage.


Apache Kafka, written in Scala, is an open-source messaging system developed by the Apache Software Foundation. The project's goal is to provide a unified, high-throughput, low-latency platform for processing real-time data.

Kafka is a distributed message queue that classifies messages by Topic as they are stored. A Kafka cluster is composed of multiple Kafka instances, and each instance (Server) is called a broker.

Both the Kafka cluster and the Consumers rely on a ZooKeeper cluster to store metadata and ensure system availability.

1.2 Kafka benefits

Supports multiple producers and consumers.

Supports horizontal scaling of brokers.

A replica mechanism provides data redundancy and ensures data is not lost.

Data is organized into categories by topic.

Sending compressed data in batches reduces transmission overhead and increases throughput.

Supports multiple message modes; messages are persisted on disk.

Processes messages with high performance, guaranteeing sub-second message latency even at big-data volumes.

A single consumer can subscribe to messages from multiple topics.

CPU, memory, and network consumption is relatively low.

Supports cross-datacenter data replication and mirrored clusters.

1.3 Shortcomings of Kafka


Because messages are sent in batches, delivery is not truly real-time.

Ordering is only guaranteed within a single partition; global ordering across a topic is not supported.

Monitoring is incomplete and requires installing plug-ins.

Data may be lost, and (in earlier versions) transactions are not supported.

Data may be consumed repeatedly, and messages may arrive out of order: messages within a given partition are guaranteed to be ordered, but a topic with multiple partitions cannot guarantee global order. Kafka requires ZooKeeper support, topics generally have to be created manually, and its deployment and maintenance costs are generally higher than those of other message queues.

1.4 Kafka architecture

Broker: a Kafka server is a broker. A cluster consists of multiple brokers, and one broker can hold multiple topics.

Producer: the message producer, the client that sends messages to a Kafka broker.

Consumer: the message consumer, which pulls messages from a Kafka broker. Consumers can consume at a rate that matches their own processing capacity.

Topic: can be understood as a queue; producers and consumers both work against a topic.

Partition: for scalability, a very large topic can be spread across multiple brokers. A topic can be divided into multiple partitions, each of which is an ordered queue; partitions also help balance load.

Replication: to ensure that when a node in the cluster fails, the partition data on that node is not lost and Kafka can keep working, Kafka provides a replica mechanism. Each partition of a topic has several replicas: one leader and several followers.

Leader: each partition has one leader. Producers send data to the leader, and consumers consume data from the leader.

Follower: each partition's followers synchronize data from the leader in real time, keeping in step with the leader's data. When a leader fails, one follower becomes the new leader. Note that the number of replicas in Kafka cannot exceed the number of brokers!

Consumer Group: a consumer group consists of multiple consumers. Each consumer in the group is responsible for consuming data from different partitions, and one partition can only be consumed by one consumer in a group. Consumer groups do not affect each other. Every consumer belongs to a consumer group, so the consumer group is the logical subscriber.

Offset: consumers can specify the starting offset when consuming messages in a topic.
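For concreteness, here is a minimal sketch of creating such a topic with Kafka's Java AdminClient. The broker address and the topic name first are assumptions for the example, and the replication factor of 2 presumes a cluster of at least two brokers:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // Topic "first" with 3 partitions and 2 replicas per partition.
            // The replication factor must not exceed the number of brokers.
            NewTopic topic = new NewTopic("first", 3, (short) 2);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```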

1.5 The role of ZooKeeper

ZooKeeper plays an important role in Kafka and generally provides the following functions:

1.5.1 Broker Registration

Brokers are distributed and independent of each other, so a registration system is needed to manage all the brokers in the cluster; ZooKeeper serves this role.

1.5.2 Topic registration

Messages of the same Topic in Kafka are divided into multiple Partitions distributed across multiple Brokers. The partition information and its mapping to Brokers are also maintained by ZooKeeper, recorded in dedicated nodes.

1.5.3 Producer load balancing

Messages of the same Topic are partitioned and distributed across multiple Brokers, so the producer needs to distribute messages across these Brokers sensibly.

An old-fashioned layer-4 load balancer determines an associated Broker for a producer based on its IP address and port. Generally each producer then corresponds to a single Broker, but in a real system the message volume of each producer and the storage load of each Broker differ.

With ZooKeeper-based load balancing, every Broker completes registration as it starts, and producers dynamically perceive changes in the Broker list by watching those nodes, so a dynamic load-balancing mechanism can be implemented.

1.5.4 Consumer load balancing

Consumers in Kafka also need load balancing, so that multiple consumers receive messages from the corresponding Brokers in a reasonable way. Each consumer group contains several consumers, and each message is delivered to only one consumer in the group. Different consumer groups consume messages of their own subscribed Topics without interfering with each other.

1.5.5 Relationship between partitions and consumers

Kafka assigns each Consumer Group a globally unique Group ID, shared by all consumers within the group. Kafka stipulates that each partition can be consumed by only one Consumer in the same group. The mapping between partitions and Consumers is recorded in ZooKeeper: once a consumer obtains the consumption rights for a message partition, it writes its Consumer ID to the temporary ZooKeeper node for that partition.

1.5.6 Recording message consumption progress (Offset)

When a Consumer consumes its assigned message partition, it needs to periodically record the partition's consumption progress (Offset) in ZooKeeper, so that after the Consumer restarts, or another Consumer takes over the partition, consumption can resume from the previous position.

1.5.7 Consumer Registration

Consumers and message partitions are assigned so that messages from different partitions under the same Topic are consumed as evenly as possible by multiple Consumers.

After a Consumer starts, it creates a node under ZooKeeper, and each Consumer registers a watch on changes to the Consumers in its Consumer Group, to keep the load balanced across consumers.

Consumers also watch the Broker list, and consumer load balancing is triggered when it changes.

2 Kafka production process

2.1 Write mode

The producer publishes messages to the broker in push mode; each message is appended to a partition. This is a sequential disk write, which is at least three orders of magnitude faster than random writes!
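As a minimal illustration of push-mode publishing with Kafka's Java producer client (the broker address and topic name are assumptions for the example):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PushExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each send is appended to the end of a partition log (a sequential write).
            producer.send(new ProducerRecord<>("first", "key-1", "hello kafka"));
        }
    }
}
```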

2.2 Partitions

2.2.1 Introduction to partitions

Messages are sent to a topic, which is essentially a directory. A topic is composed of several Partition Logs, organized as shown in the figure below:

(Figure: Partition logs)

As the figure shows, the messages within each Partition are ordered: produced messages are continually appended to the Partition log, and each one is assigned a unique offset value.

(Figure: Consumer)

Partitioning makes it easy to scale out within the cluster and improves concurrency.

An intuitive analogy:

Kafka's design comes from life. Think of road transport: different origins and destinations need different highways (topics), and a highway can provide multiple lanes (partitions). High-traffic highways (topics) are built with more lanes (partitions) to keep traffic flowing, while low-traffic roads get fewer lanes to avoid waste. Toll booths are like consumers: when there are many cars, more booths open together to avoid congestion; when there are few cars, fewer booths are enough.

2.2.2 Partitioning rules

The data a producer sends is encapsulated into a ProducerRecord object.

(Figure: Data encapsulation)

If a partition is specified, that value is used directly as the partition.

If no partition is specified but a key is present, the partition is obtained by taking the hash of the key modulo the topic's partition count.

If neither a partition nor a key is given, a random integer is generated on the first call (and incremented on each subsequent call); this value modulo the number of available partitions gives the partition. This is commonly called the round-robin algorithm.
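A simplified sketch of these three rules follows. It is an illustration, not Kafka's actual DefaultPartitioner (which, for instance, hashes keys with murmur2 rather than hashCode):

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Toy version of the three partition-selection rules described above.
public class PartitionRules {
    private final AtomicInteger counter =
            new AtomicInteger(ThreadLocalRandom.current().nextInt());

    int choosePartition(Integer explicitPartition, String key, int numPartitions) {
        if (explicitPartition != null) {
            return explicitPartition;                  // rule 1: use the given partition
        }
        if (key != null) {
            // rule 2: hash(key) mod partition count (mask keeps the value non-negative)
            return (key.hashCode() & 0x7fffffff) % numPartitions;
        }
        // rule 3: start from a random integer, increment per call (round-robin)
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }
}
```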

2.3 Kafka file storage mechanism

(Figure: Kafka storage structure)

Messages in Kafka are classified by topic, and producers and consumers both work against topics. A topic is only a logical concept, while a Partition is physical. Each Partition corresponds to a log: `.index` files store the data index and `.log` files store the data. The metadata in an index file points to the physical offset of the corresponding Message within the log file.

To keep data lookup efficient as log files grow large, Kafka uses a sharding-and-indexing mechanism that divides each partition into multiple segments, each with its own `.index` and `.log` files. These files live in a folder named after the topic plus the partition number. For example, if the topic first has three partitions, the corresponding folders are first-0, first-1, and first-2.

00000000000000000000.index
00000000000000000000.log
00000000000000170410.index
00000000000000170410.log
00000000000000239430.index
00000000000000239430.log

Note: each index and log file is named after the offset of the first message in its segment.

(Figure: Data lookup process)
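As a concrete illustration (with a made-up target offset): to locate the message at offset 170417, Kafka first binary-searches the segment base offsets and selects segment 00000000000000170410; the relative offset is 170417 - 170410 = 7; looking up entry 7 in 00000000000000170410.index yields the physical position of the message inside 00000000000000170410.log, from which the message is then read.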

2.4 How to keep messages in order

2.4.1 How messages get out of order

Even with one topic, one partition, and one Consumer, if the Consumer consumes with multiple threads internally, the data can still end up out of order.

(Figure: Multithreaded consumption)

If sequential data is written to different partitions and consumed by different consumers, then because each Consumer's execution time is not fixed, there is no guarantee that the Consumer that read a message first will finish processing it first. The messages are then not executed in order, and the data order is corrupted.

(Figure: Multiple consumers)

2.4.2 Solution

Ensure that related messages are sent to the same partition: one topic, one partition, one consumer, with single-threaded consumption inside the consumer.

Messages written to the same Partition are guaranteed to be ordered.

If messages are given the same key, they are guaranteed to be written to the same partition.

Messages read from the same Partition are guaranteed to be ordered.

(Figure: Single-threaded consumption)

Building on the above, a single Consumer can map messages by ID into different in-memory queues, each processed by its own thread, to speed up consumption while preserving per-ID order; see the sketch below.

(Figure: Memory queues)
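A minimal sketch of this memory-queue idea in Java; the class and method names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// One consumer thread receives messages, routes them by ID to a fixed queue,
// and one worker per queue processes them, so same-ID messages stay in order.
public class KeyedQueues {
    private final List<BlockingQueue<String>> queues = new ArrayList<>();

    KeyedQueues(int workers) {
        for (int i = 0; i < workers; i++) {
            BlockingQueue<String> q = new ArrayBlockingQueue<>(1024);
            queues.add(q);
            new Thread(() -> {
                try {
                    while (true) {
                        process(q.take()); // same-ID messages handled sequentially
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "worker-" + i).start();
        }
    }

    void dispatch(String id, String msg) throws InterruptedException {
        // The same ID always maps to the same queue, preserving per-ID order.
        queues.get((id.hashCode() & 0x7fffffff) % queues.size()).put(msg);
    }

    private void process(String msg) { /* business logic */ }
}
```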

4 Data reliability

4.1 Message delivery semantics

Message delivery semantics describe how Kafka guarantees messages as they travel between producer and consumer. There are three kinds of delivery guarantee:

At most once: the message may be lost, but it is never retransmitted.

At least once: the message is never lost, but it may be transmitted repeatedly.

Exactly once: the message is delivered and processed exactly once; no loss, no duplication.

Ideally you would want strict exactly-once delivery, but that is hard to achieve. Below we follow the flow of a message through the system.

4.2 From producer to Broker

4.2.1 How the producer sends messages to the Broker

The general steps are as follows:

The producer obtains the Leader metadata of the target Partition from ZooKeeper.

The producer sends the message to the Leader.

The Leader receives and persists the message, then synchronizes with the Followers according to the acks configuration.

The Followers reply ack to the Leader after synchronizing the data.

After synchronizing with the Followers, the Leader replies ack to the producer.

For when the Leader replies ack, Kafka offers users three reliability levels, to be weighed against requirements for reliability and latency.

request.required.acks = 0

The producer does not wait for the broker's ack, which gives the lowest latency. The broker returns before the data is written to disk, so data may be lost when a broker fails. This corresponds to At Most Once.

Any send that does not succeed is simply lost, so this setting is generally not used in production.

request.required.acks = 1

This is the default value. The producer waits for the ack from the partition leader, which replies after the message is written to its local log. If the leader fails before the followers finish synchronizing, data is lost: the send is considered successful as soon as the leader replies.

request.required.acks = -1 (all)

The producer waits until the partition leader and the followers (in the ISR) have all persisted the message before the ack is returned.

However, if the leader fails after the followers have synchronized the message but before the ack reaches the producer, the producer will resend the message, producing duplicates.

This corresponds to At Least Once.

4.2.2 How to ensure idempotence

If a business needs Exactly Once semantics, earlier versions of Kafka could only de-duplicate downstream. Kafka now provides idempotence, meaning that no matter how many duplicates of a message the producer sends, the server persists only one copy.

At Least Once + idempotence = Exactly Once

Idempotence is enabled by setting enable.idempotence = true in the producer configuration. An idempotent producer is assigned a PID at initialization, and messages to a given Partition carry a Sequence Number; the Broker uses <PID, Partition, SeqNumber> to determine uniqueness. However, the PID changes when the producer restarts, and each partition has its own sequence, so idempotence cannot guarantee Exactly Once across partitions or across sessions.
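As a sketch, the producer settings for this At Least Once plus idempotence combination might look like the following, using the standard Java client configuration keys (the serializer choices are assumptions for the example):

```java
import java.util.Properties;

public class IdempotentProducerConfig {
    static Properties config() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        // acks=all (equivalent to -1): the leader waits for the ISR before replying.
        props.put("acks", "all");
        // Idempotence: the broker de-duplicates by <PID, Partition, SequenceNumber>.
        props.put("enable.idempotence", "true");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }
}
```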

4.3 How Kafka Broker writes messages to disk

(Figure: Flush-to-disk process)

After the Kafka Broker receives a message, when it is considered written is determined by producer.type, which usually takes one of two values:

sync, the default mode: the data must actually be flushed to disk before the send counts as successful.

async, asynchronous mode: the call returns once the data reaches the OS Page Cache; if the machine suddenly fails, that data is lost.

4.4 How Consumers consume data from the Kafka Broker

(Figure: Consuming data)

Consumers work as Consumer Groups: one or more consumers form a group that consumes a topic. Each partition can be read by only one consumer in the group at a time, but multiple groups can consume the same partition simultaneously. If a consumer fails, the other group members automatically rebalance and take over the partitions the failed consumer was reading. A Consumer Group pulls messages from the Broker and consumes them in two stages:

Fetch the data and commit the Offset.

Process the data.

If the offset is committed first and the data processed afterwards, an exception during processing can lead to data loss. If the data is processed first and the offset committed afterwards, a failed commit can lead to messages being consumed repeatedly.

PS:

The drawback of the pull model is that when Kafka has no data, consumers may spin in a loop, receiving empty responses over and over. To address this, Kafka consumers pass a timeout duration when pulling data: if nothing is available to consume, the consumer waits up to that long before returning. A sketch of such a loop, committing offsets only after processing, follows.
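A minimal process-then-commit loop with the Java consumer client might look like this (broker address, group id, and topic name are illustrative):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ProcessThenCommit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "demo-group");
        props.put("enable.auto.commit", "false"); // commit manually, after processing
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("first"));
            while (true) {
                // The poll timeout is the "wait if no data" duration from the text.
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    handle(record); // process first...
                }
                consumer.commitSync(); // ...then commit: failures cause re-reads, not loss
            }
        }
    }

    private static void handle(ConsumerRecord<String, String> record) { /* ... */ }
}
```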

5 Kafka partition assignment strategies

For consumers sharing the same group.id, a partition assignment strategy determines how messages in a topic's multiple partitions are divided among them.

Kafka ships two partition assignment strategies, selected via partition.assignment.strategy:

RangeAssignor, the range assignment strategy, which is the default.

RoundRobinAssignor, the round-robin assignment strategy.
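Switching strategies is a single consumer configuration entry; a minimal sketch using the built-in RoundRobinAssignor class name:

```java
import java.util.Properties;

public class AssignorConfig {
    static Properties config() {
        Properties props = new Properties();
        // RangeAssignor is the default; this switches to round-robin assignment.
        props.put("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.RoundRobinAssignor");
        return props;
    }
}
```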

5.1 RangeAssignor range assignment strategy

The Range assignment strategy works per topic. First the partitions of the topic are sorted by number and the consumers sorted alphabetically. With 10 partitions and 3 consumers, the sorted partitions are p0 through p9 and the sorted consumers are C1-0, C2-0, C3-0. Partitions divided by consumers determines how many partitions each consumer gets; if the division is not even, the first few consumers each consume one extra partition, as computed in the sketch after the table.

C1-0 consumes p0, p1, p2, p3
C2-0 consumes p4, p5, p6
C3-0 consumes p7, p8, p9
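A small sketch of the Range arithmetic behind this table (10 partitions, 3 consumers; variable names are illustrative):

```java
// Each consumer gets partitions/consumers partitions, and the first
// (partitions % consumers) consumers each get one extra.
public class RangeMath {
    public static void main(String[] args) {
        int partitions = 10, consumers = 3;
        int base = partitions / consumers;   // 3
        int extra = partitions % consumers;  // 1
        int start = 0;
        for (int c = 0; c < consumers; c++) {
            int count = base + (c < extra ? 1 : 0);
            System.out.printf("C%d-0 gets p%d..p%d%n", c + 1, start, start + count - 1);
            start += count;
        }
        // Output: C1-0 gets p0..p3, C2-0 gets p4..p6, C3-0 gets p7..p9
    }
}
```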

Disadvantages of Range assignment:

The example above involves only one topic, so C1-0 consuming one extra partition has little impact. But with N topics, C1-0 consumes one extra partition for every topic, so it ends up consuming N more partitions than the other consumers. This is the obvious drawback of Range assignment.

5.2 RoundRobinAssignor round-robin assignment strategy

The RoundRobin assignment strategy lists all partitions and all consumers, sorts them by hashcode, and then assigns partitions to consumers using a round-robin algorithm. Round-robin assignment falls into two cases:

Consumers within the same Consumer Group have identical subscriptions.

Consumers within the same Consumer Group have different subscriptions.

5.2.1 Identical subscriptions within the Consumer Group

If all consumers within the same group subscribe to the same topics, the RoundRobin strategy distributes partitions evenly.

For example, if the three consumers C0, C1, and C2 in a group all subscribe to two topics t0 and t1, each with three partitions (p0, p1, p2), the subscribed partitions can be written t0p0, t0p1, t0p2, t1p0, t1p1, t1p2, and the final assignment is:

C0 consumes t0p0, t1p0
C1 consumes t0p1, t1p1
C2 consumes t0p2, t1p2

5.2.2 Different subscriptions within the Consumer Group

When subscriptions within the same group differ, partition assignment is not a pure round-robin, and the distribution may be uneven. If a consumer has not subscribed to a given topic, it is never assigned any partition of that topic.

For example, suppose a group has three consumers C0, C1, and C2, which between them subscribe to three topics t0, t1, and t2, having 1, 2, and 3 partitions respectively (t0 has p0; t1 has p0 and p1; t2 has p0, p1, and p2). All subscribed partitions can be written t0p0, t1p0, t1p1, t2p0, t2p1, t2p2. Consumer C0 subscribes to t0, consumer C1 subscribes to t0 and t1, and consumer C2 subscribes to t0, t1, and t2. The final assignment is:

C0 consumes t0p0
C1 consumes t1p0
C2 consumes t1p1, t2p0, t2p1, t2p2

6 Why Kafka reads and writes so fast

Kafka can sustain millions of TPS thanks to the following features.

6.1 Sequential reads and writes

Data is stored on the hard disk, which consists of several platters. Under a microscope the platter surface is uneven: the raised, magnetized spots represent the digit 1 and the recessed, unmagnetized spots represent 0. This is how a hard disk stores text, pictures, and other information in binary.

(Figure: Hard disk exterior)

The image above shows the outside of a hard disk, which does not reveal the internal structure. Let's look at a diagram:

(Figure: Hard disk internals)

The system reads data from the platter surface through a head that flies above it at a height of only 1/1000 of the diameter of a human hair.

The platters inside a hard disk look similar to CDs. Each platter has two sides, and both sides can store data.

Each platter is divided into a great many concentric tracks, with different radii.

The same track on all platters forms a cylinder, and the same sector of the same track is called a cluster. Data is read and written cylinder by cylinder, from top to bottom; only after one cylinder is full does writing move on to the next.

A track is divided into arc segments called sectors, each storing 512 bytes of data plus housekeeping information. Sectors on different tracks span the same angle but have different radii, so the linear speed of the outer tracks is higher than that of the inner ones.

Reading one sector at a time would be far too inefficient, so the operating system reads data in blocks. A block is generally composed of multiple sectors, each block being 4 to 64 KB.

A page, 4 KB by default, plays the analogous role for memory. The operating system constantly exchanges data with two storage devices, memory and the hard disk, and needs a basic unit for each: it uses the page as the smallest unit when operating on memory, and the block as the smallest unit when dealing with the hard disk.

Sector: the minimum read and write unit of a hard disk

Block / cluster: the smallest unit of the operating system for reading and writing to the hard disk

Page: the smallest unit of operation between memory and the operating system.

A disk access (read or write request) is completed in three actions:

Seek: the average time for the head to move to the track holding the data, about 10 ms.

Rotational delay: the time for the platter to rotate the requested sector under the read-write head, which depends on the rotation speed. At 5400 revolutions per minute, the average is about 5 ms.

Data transfer: the time for the head to read the data off the target sector. If a track at 5400 rpm holds 400 sectors, accessing a single sector takes only about 0.0278 ms.

Clearly the access time is dominated by the first two actions. With sequential reads, the seek and rotational delay are paid only once; with random reads they may be paid again and again, and the difference approaches three orders of magnitude.
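A rough worked example using the figures above: reading 400 sectors sequentially costs one seek plus one rotational delay plus the transfer time, roughly 10 + 5 + 400 × 0.0278 ≈ 26 ms; reading the same 400 sectors from random locations costs roughly 400 × (10 + 5 + 0.0278) ≈ 6000 ms, i.e. hundreds of times slower.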

(Figure: Random vs. sequential reads and writes on disk and in memory)

6.2 Memory-Mapped Files

Virtual memory systems divide virtual memory into fixed-size blocks called virtual pages (Virtual Page, VP), generally 4 KB by default. Likewise, physical memory is split into physical pages (Physical Page, PP), also 4 KB.

The server can directly use the operating system's pages to map physical memory onto a file. The user reads and writes data against the mapped pages, and the operating system automatically synchronizes those memory operations to the disk according to the mapping, achieving the effect of sequential memory reads and writes.

The drawback was already mentioned when discussing how the Broker flushes messages: until the data actually reaches the disk, it can be lost.

(Figure: Memory mapping)

6.3 Zero Copy

6.3.1 Direct Memory Access (DMA)

The CPU issues instructions to IO devices for read and write operations. In most cases it merely moves data into memory and then from memory out to the device, so the data actually passes through the CPU.

Direct Memory Access emerged to speed up the input and output of bulk data. DMA is an interface technique in which an external device exchanges data directly with system memory, without going through the CPU; the transfer speed then depends only on the speed of memory and the peripheral.

If data is transferred using only DMA, without being copied through the CPU, we call it Zero Copy. With Zero Copy, the time spent is at least halved.

6.3.2 Kafka reads with and without Zero Copy

(Figure: Zero copy)

In the figure above, the black path is the process without Zero Copy:

DMA transfer: the disk reads data into the operating system's Page Cache.

CPU copy: the data is copied from the Page Cache into the user memory area.

CPU copy: the data is copied from the user memory area into the Socket Cache.

DMA transfer: the data moves from the Socket Cache to the NIC buffer.

The red path is the process with Zero Copy:

DMA transfer: the disk reads data into the operating system's Page Cache.

DMA transfer: the data moves from the Page Cache directly to the NIC buffer.
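In Java this red path corresponds to FileChannel.transferTo, which delegates the copy to the kernel (sendfile); Kafka's network layer serves log segments this way. A minimal sketch, with paths and addresses as illustrative assumptions:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// transferTo hands the copy to the kernel, so the data never enters user space.
public class ZeroCopySend {
    static void send(Path logSegment, InetSocketAddress peer) throws IOException {
        try (FileChannel file = FileChannel.open(logSegment, StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(peer)) {
            long position = 0, remaining = file.size();
            while (remaining > 0) {
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```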

6.4 Batch processing

When consumers pull data, Kafka does not hand over messages one at a time but in batches, which saves network transfers and raises the system's TPS. The trade-off is that the data is no longer processed in true real time; for genuine real-time processing you still need something like Flink.
