2025-01-31 Update From: SLTechnology News&Howtos
Today we'll talk about how messages in Kafka are consumed, which many people find hard to understand. To make it clearer, the following explanation has been put together for you; I hope you gain something from this article.
Kafka is easy to use, but building, maintaining, and tuning a Kafka cluster is troublesome. A Kafka cluster needs someone to look after it, so don't assume the job is trivial. Some inaccurate but illustrative analogies will be used for Kafka terminology below.
Today's topic is how Kafka achieves continuous consumption, resumption from where a program left off, parallel consumption by multiple processes of a single program, and mutual independence between different programs.
A Kafka cluster can hold multiple distinct queues. Let's call such a queue a Topic, and suppose one of these queues looks like the figure below:
Messages go in on the right and come out on the left. If this were a Redis list, then after one message is popped, the queue would look like this:
The leftmost message, message 1, is gone. So even if the program closes right after consuming message 1 and immediately reopens, it will continue from message 2 and never consume the same message twice.
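The destructive read described above can be sketched in a few lines, using a Python deque to stand in for a Redis list (no Redis server required; the message names are purely illustrative):

```python
from collections import deque

# Model the queue as a deque: messages enter on the right, leave on the left.
queue = deque(["message 1", "message 2", "message 3"])

# Like Redis LPOP: popleft() removes the head AND returns it, so the
# message no longer exists in the queue afterwards.
first = queue.popleft()
print(first)        # message 1
print(list(queue))  # ['message 2', 'message 3'] -- message 1 is gone
```

Because the pop removes the item, a restarted program naturally resumes at message 2, but a second program can never see message 1 again.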
But what if there are two programs? Program 1 reads each piece of data and writes it to a database; program 2 reads each piece of data and checks whether it contains certain keywords. In this case, message 1 should be consumed by program 1 and also by program 2. Clearly the scheme above cannot do this: once program 1 has consumed message 1, program 2 can no longer get it.
So in Kafka, messages stay in the queue, but each program keeps a separate marker recording which piece of data it is currently consuming, as shown in the figure below.
When program 1 wants to read the next piece of data in Kafka, Kafka first moves that program's marker one position to the right and then returns the new value. Moving the marker and returning the value together form a single atomic operation, so there is no problem of repeated reads.
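The "move the marker and return" step can be sketched as a tiny class. This is a hypothetical simulation of the idea, not Kafka's actual implementation; the lock makes the advance-and-return pair atomic, as the text describes:

```python
import threading

class OffsetMarker:
    """One marker per consumer; advancing and reading it is one atomic step."""

    def __init__(self):
        self._offset = -1                  # nothing consumed yet
        self._lock = threading.Lock()

    def next_offset(self):
        # Move the marker one position to the right and return the new
        # value inside one lock-protected (atomic) step, so two readers
        # can never be handed the same offset.
        with self._lock:
            self._offset += 1
            return self._offset

marker = OffsetMarker()
print(marker.next_offset())  # 0
print(marker.next_offset())  # 1
```

Each call hands out the next position exactly once, which is why repeated reads cannot happen even with concurrent callers.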
Program 1 and program 2 use different markers, so the positions their markers point to do not affect each other.
When a program 3 is added, it just needs one more marker; the new marker is likewise unaffected by the first two.
This allows multiple different programs to read from Kafka without interfering with one another.
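The independence of the markers can be sketched as follows: the log itself is never mutated, and each "program" only moves its own entry in a marker table. The names `program-1` and `program-2` are illustrative only:

```python
# The log is append-only and never shrinks; consuming never removes data.
log = ["message 1", "message 2", "message 3"]

# One marker per program, each starting at position 0.
markers = {"program-1": 0, "program-2": 0}

def consume(group):
    """Return the next message for this program and advance only its marker."""
    pos = markers[group]
    if pos >= len(log):
        return None          # nothing new for this program yet
    markers[group] = pos + 1
    return log[pos]

print(consume("program-1"))  # message 1
print(consume("program-1"))  # message 2
print(consume("program-2"))  # message 1 -- program 2 is unaffected
```

Adding a program 3 would just mean adding `"program-3": 0` to the table; no existing marker moves.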
Now suppose program 1 consumes too slowly and you run three copies of it at the same time. Because marking and moving are one atomic operation, even though the copies appear to read Kafka simultaneously, Kafka internally "queues" them, so the results they get back contain no repeats and no omissions.
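A quick simulation of three copies of the same program sharing one marker, again using a lock to stand in for Kafka's internal serialization (this is an illustrative model, not Kafka itself):

```python
import threading

log = [f"message {i}" for i in range(100)]
offset = 0                    # one shared marker for the whole "program"
lock = threading.Lock()
seen = []                     # everything any copy consumed

def worker():
    global offset
    while True:
        # Advance the shared marker and take the message as one atomic step.
        with lock:
            if offset >= len(log):
                return
            msg = log[offset]
            offset += 1
            seen.append(msg)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No repeats and no omissions across the three concurrent copies:
print(len(seen) == len(set(seen)) == len(log))  # True
```

All 100 messages come out exactly once in total, regardless of how the three threads interleave.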
If you read Kafka tutorials online, you will find they mention something called an Offset, which is exactly the per-program marker pointing at the current data described in this article.
You will also see a keyword called Group, which corresponds to programs 1, 2, and 3 here.
For the same queue, programs consuming with different Groups read the data without interfering with each other.
For the same queue, multiple processes consuming with the same Group behave like clients doing LPOP on a Redis list.
Finally, in articles about Kafka you will definitely see a word called Partition (sometimes translated as "slice" or "shard"), and you may find that you don't understand it.
That's all right; you can set it aside for now. You only need to know that the number of Partitions in a Topic is the maximum number of processes that can read it under the same Group.
If a Topic has three Partitions, then at most three processes can read it simultaneously under the same Group; if a Topic has five Partitions, then at most five.
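This counting rule follows from the fact that each Partition is assigned to at most one consumer in a Group. The sketch below uses a simplified round-robin assignment purely to illustrate the rule; real Kafka delegates this to a group coordinator with configurable assignment strategies:

```python
def assign_partitions(num_partitions, consumers):
    """Hypothetical round-robin assignment of partitions to consumers.

    Each partition goes to exactly one consumer; consumers beyond the
    partition count end up with nothing to read.
    """
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# 3 partitions, 5 processes in the same Group: two processes sit idle.
print(assign_partitions(3, ["c1", "c2", "c3", "c4", "c5"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': [], 'c5': []}
```

With five Partitions and three processes, the same function would instead spread the five partitions across all three processes, so every process gets work but parallelism is still capped by the Partition count.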
After reading the above, do you have a better understanding of how messages in Kafka are consumed? If you want to learn more, please follow the industry information channel. Thank you for your support.
© 2024 shulou.com SLNews company. All rights reserved.