What are the features of Kafka

2025-07-19 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article covers the main features of Kafka. The editor finds the material practical and shares it here as a reference.

The birth of Kafka: open-sourced by LinkedIn

Kafka is a framework that addresses the problem of connecting message producers with message consumers, providing a seamless link between the two.

Kafka is a high-throughput distributed messaging system.

Kafka features: the project describes its design as unique, so let's look at where it excels:

Fast: a single Kafka server can handle hundreds of megabytes of data per second sent by thousands of clients.

Scalable: a single cluster can serve as the central data processing hub for many different kinds of business workloads.

Persistent: messages are persisted to disk (Kafka can hold terabytes of data while still processing it very efficiently) and are replicated for fault tolerance.

Distributed: Kafka targets the big data domain and is distributed by design; a cluster can handle millions of messages per second.

Real-time: messages can be consumed by consumers as soon as they are produced.

The messages in each Partition are ordered. Newly produced messages are continually appended to the end of the Partition's log, and each message is assigned a unique offset value.
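The append-only behavior can be illustrated with a small toy model. This is only a sketch in plain Python, not Kafka's actual storage code; the class name `PartitionLog` and its methods are made up for illustration.

```python
class PartitionLog:
    """Toy in-memory model of one Partition's append-only log (hypothetical)."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message to the end of the log and return its offset."""
        offset = len(self._messages)  # the next free position is the new offset
        self._messages.append(message)
        return offset

    def read(self, offset):
        """Return the message stored at the given offset."""
        return self._messages[offset]


log = PartitionLog()
offsets = [log.append(m) for m in ["a", "b", "c"]]
print(offsets)      # [0, 1, 2]
print(log.read(1))  # b
```

Because messages are only ever added at the end, the offset of a message never changes once it has been assigned.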

The Kafka cluster retains every message whether or not it has been consumed. We can configure a retention period, and only data older than that period is automatically deleted to free disk space. For example, if the retention period is set to 2 days, every message is kept in the cluster for 2 days after it is produced and becomes eligible for deletion only after that.
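As a rough sketch of time-based retention, the following plain-Python snippet keeps or drops records by age alone, ignoring whether anyone has consumed them. The names `Record` and `purge_expired` are hypothetical, and the model is simplified: a real broker is configured through retention settings such as `log.retention.hours` and deletes whole log segments rather than individual messages.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60
RETENTION_SECONDS = 2 * DAY  # the two-day retention period from the example


@dataclass
class Record:
    timestamp: float  # time the record was produced, in seconds
    value: str


def purge_expired(records, now, retention=RETENTION_SECONDS):
    """Keep records no older than `retention` seconds, consumed or not."""
    return [r for r in records if now - r.timestamp <= retention]


records = [Record(0, "old"), Record(1.5 * DAY, "recent"), Record(2.9 * DAY, "new")]
kept = purge_expired(records, now=3 * DAY)
print([r.value for r in kept])  # ['recent', 'new']
```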

Kafka needs to maintain only one piece of metadata per Consumer: the offset of the Consumer's position in the Partition. As the Consumer reads, the offset advances by 1 for each message consumed. In practice this position is controlled entirely by the Consumer, which can track the offset and reset it, so a Consumer can read messages from any position in the log.
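A minimal sketch of consumer-controlled offsets, written in plain Python rather than against a real Kafka client. `SimpleConsumer`, `poll`, and `seek` are illustrative names chosen here, and a Python list stands in for the Partition log.

```python
class SimpleConsumer:
    """Toy consumer that tracks its own position in one Partition (hypothetical)."""

    def __init__(self, log):
        self._log = log  # list of messages standing in for a Partition
        self.offset = 0  # offset of the next message to read

    def poll(self):
        """Return the next message and advance the offset by 1, or None at the end."""
        if self.offset >= len(self._log):
            return None
        message = self._log[self.offset]
        self.offset += 1
        return message

    def seek(self, offset):
        """Reset the position so reading resumes from `offset`."""
        self.offset = offset


consumer = SimpleConsumer(["m0", "m1", "m2"])
first = [consumer.poll(), consumer.poll()]
consumer.seek(0)            # rewind and replay from the beginning
replayed = consumer.poll()
print(first, replayed)      # ['m0', 'm1'] m0
```

Since the log itself is not modified by reading, rewinding the offset is enough to replay earlier messages.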

Storing the message log as Partitions serves several purposes. First, it makes it easy to scale within the cluster: each Partition has to fit on the server that hosts it, but a topic can consist of many Partitions, so the cluster as a whole can handle data of any size. Second, it improves concurrency, because reads and writes are performed per Partition and can proceed in parallel.

Distributed:

The Partitions are distributed across the servers in the cluster. Each Partition can be replicated to multiple servers, and the number of replicas is configurable.

Each Partition has one leader server, and the servers holding its other replicas are called followers. The leader handles all read and write requests for the Partition, while the followers passively replicate the leader's data. If the leader dies, one of the followers is automatically promoted to leader. As a result, each server in the cluster acts as the leader for some Partitions and as a follower for others.
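The leader and follower roles can be sketched with a toy model. This is an assumption-laden simplification in plain Python: the class `ReplicatedPartition` and the broker names are invented, followers copy the leader synchronously, and the new leader is simply the first surviving replica, whereas a real cluster uses its own election mechanism.

```python
class ReplicatedPartition:
    """Toy model of one Partition with a leader and followers (hypothetical)."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.leader = self.replicas[0]
        self.logs = {r: [] for r in self.replicas}

    def write(self, message):
        """Only the leader accepts the write; followers then copy its log."""
        self.logs[self.leader].append(message)
        for replica in self.replicas:
            if replica != self.leader:
                self.logs[replica] = list(self.logs[self.leader])

    def fail(self, server):
        """Remove a dead server and promote a follower if it was the leader."""
        self.replicas.remove(server)
        del self.logs[server]
        if server == self.leader:
            if not self.replicas:
                raise RuntimeError("no replica left to take over")
            self.leader = self.replicas[0]


partition = ReplicatedPartition(["broker-1", "broker-2", "broker-3"])
partition.write("x")
partition.fail("broker-1")               # the leader dies
print(partition.leader)                  # broker-2
print(partition.logs[partition.leader])  # ['x']
```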

Producers:

A Producer publishes messages to a topic of its choice, and it can also decide which Partition of that topic each message goes to. We can use the simple partition selection algorithm provided by the API, or implement a partition selection algorithm of our own.
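Two common partition selection strategies are sketched below in plain Python: hashing a message key, and round-robin. The function and class names are hypothetical, and CRC32 is used here only because it is in the standard library; it is not meant to reproduce the hash used by Kafka's own partitioner.

```python
import zlib
from itertools import count


def key_partitioner(key: bytes, num_partitions: int) -> int:
    """Map a key to a Partition so that equal keys always land together."""
    return zlib.crc32(key) % num_partitions


class RoundRobinPartitioner:
    """Spread messages without a key evenly across Partitions."""

    def __init__(self, num_partitions: int):
        self._num_partitions = num_partitions
        self._counter = count()

    def __call__(self) -> int:
        return next(self._counter) % self._num_partitions


# The same key always maps to the same Partition.
print(key_partitioner(b"user-42", 4) == key_partitioner(b"user-42", 4))  # True

round_robin = RoundRobinPartitioner(3)
sequence = [round_robin() for _ in range(5)]
print(sequence)  # [0, 1, 2, 0, 1]
```

Key-based selection keeps all messages for one key in a single Partition, while round-robin favors an even load.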

Consumers:

There are usually two modes of message delivery: queuing and publish-subscribe.

Queuing: each message is taken from the queue by exactly one Consumer.

Publish-subscribe: each message is broadcast to every Consumer.

Kafka supports both patterns through a single Consumer abstraction, the ConsumerGroup. Each Consumer instance labels itself with a ConsumerGroup name. If all instances use the same ConsumerGroup name, they work in queuing mode; if every instance uses a different ConsumerGroup name, they work in publish-subscribe mode.

In the accompanying figure, a cluster of two servers holds four Partitions (p0 to p3) and serves two Consumer Groups. Within each Group the Partitions are consumed in queuing mode; across Groups delivery is publish-subscribe.
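The effect of ConsumerGroup names can be sketched with a toy dispatcher in plain Python. The function `deliver` is hypothetical, and it rotates individual messages among the members of a group purely for illustration; Kafka itself divides whole Partitions among the members of a group.

```python
from collections import defaultdict


def deliver(messages, consumers):
    """Dispatch messages to consumers.

    `consumers` maps a consumer name to its group name. Every group sees
    every message, but only one member of each group receives a given message.
    """
    groups = defaultdict(list)
    for consumer, group in consumers.items():
        groups[group].append(consumer)

    received = {consumer: [] for consumer in consumers}
    for i, message in enumerate(messages):
        for members in groups.values():
            received[members[i % len(members)]].append(message)
    return received


messages = ["m0", "m1", "m2", "m3"]

# Same group name: the consumers share the messages (queuing).
queue_mode = deliver(messages, {"c1": "g", "c2": "g"})
print(queue_mode)  # {'c1': ['m0', 'm2'], 'c2': ['m1', 'm3']}

# Different group names: every consumer gets every message (publish-subscribe).
pubsub_mode = deliver(messages, {"c1": "g1", "c2": "g2"})
print(pubsub_mode)  # {'c1': ['m0', 'm1', 'm2', 'm3'], 'c2': ['m0', 'm1', 'm2', 'm3']}
```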

Message ordering:

How does Kafka guarantee the order in which messages are consumed? As mentioned earlier, the messages within a Partition are ordered, but Kafka guarantees ordering only within a single Partition. If you need ordering across an entire topic, that topic can have only one Partition.
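The difference between per-Partition order and topic-wide order can be seen in a small plain-Python sketch. The helper `produce` is hypothetical and distributes messages round-robin; nested lists stand in for the Partitions of a topic.

```python
def produce(messages, num_partitions):
    """Distribute messages round-robin over Partitions, modelled as lists."""
    partitions = [[] for _ in range(num_partitions)]
    for seq, message in enumerate(messages):
        partitions[seq % num_partitions].append(message)
    return partitions


messages = ["e0", "e1", "e2", "e3", "e4"]

two = produce(messages, 2)
print(two)  # [['e0', 'e2', 'e4'], ['e1', 'e3']]

# Reading one Partition after the other keeps the order inside each
# Partition but not the order in which the messages were produced.
consumed = [m for partition in two for m in partition]
print(consumed)  # ['e0', 'e2', 'e4', 'e1', 'e3']

# With a single Partition the whole topic is read in production order.
one = produce(messages, 1)
print(one[0] == messages)  # True
```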

Thank you for reading! This concludes the article on the features of Kafka. I hope it has been of some help.
