How to analyze the Kafka messaging system

2025-01-16 Update From: SLTechnology News&Howtos (shulou)

Shulou(Shulou.com)05/31 Report--

Many people who are new to Kafka are unsure how to analyze it as a messaging system. This article therefore summarizes Kafka's design goals, deployment structure, and key techniques. I hope it helps you work through the problem.

Kafka is an open source messaging system developed by LinkedIn in December 2010, mainly used to handle active streaming data. Active streaming data is very common in web applications: it includes a site's page views (PV), which pages users visit, what they search for, and so on. This data is usually recorded in the form of logs and then processed statistically at regular intervals.

Traditional log analysis systems provide a scalable way to process log data offline, but they usually incur a large delay when the data must be processed in real time. Existing queuing systems handle real-time or near-real-time applications well, but unprocessed data is usually not written to disk, which can be a problem for offline applications such as Hadoop, which processes only part of the data every hour or every day. Kafka is designed to solve both problems, so it serves offline and online applications equally well.

2. Design goals

(1) the cost of accessing data on disk is O(1). Data on disk is commonly stored in a B-tree, whose access cost is O(log n).

(2) high throughput. Even an ordinary node can process hundreds of messages per second.

(3) explicit distribution, that is, there are multiple producers, brokers and consumers, and all of them are distributed.

(4) data can be loaded into Hadoop in parallel.
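To make goal (1) concrete, here is a minimal sketch of an append-only log in which a message is addressed by its byte offset, so a read is one seek plus one read no matter how large the log has grown. The class name `AppendOnlyLog` and the 4-byte length-prefix record format are assumptions made for illustration; they are not Kafka's actual on-disk format.

```python
import os
import struct
import tempfile

# Illustrative record format (an assumption): 4-byte big-endian length + payload.
HEADER = struct.Struct(">I")


class AppendOnlyLog:
    """Toy append-only log; a message is addressed by its byte offset."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist

    def append(self, payload: bytes) -> int:
        """Write one record at the end of the file and return its offset."""
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            f.write(HEADER.pack(len(payload)) + payload)
        return offset

    def read(self, offset: int):
        """Seek straight to `offset`; return (payload, offset of next record)."""
        with open(self.path, "rb") as f:
            f.seek(offset)
            (length,) = HEADER.unpack(f.read(HEADER.size))
            payload = f.read(length)
        return payload, offset + HEADER.size + length


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.log")
    log = AppendOnlyLog(path)
    offsets = [log.append(m) for m in (b"a", b"bb", b"ccc")]
    print(offsets, log.read(offsets[1]))
```

No index structure is consulted on the read path, which is the contrast with a B-tree lookup that the goal describes.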

3. Kafka deployment structure

Kafka has an explicitly distributed architecture: there can be multiple producers, brokers (Kafka itself), and consumers. Kafka plays a role similar to a cache, sitting between the active data and the offline processing system. Several basic concepts:

(1) the message is the basic unit of communication. Each producer can publish messages to a topic. If consumers subscribe to that topic, newly published messages are broadcast to those consumers.

(2) Kafka is explicitly distributed: multiple producers, consumers and brokers can run on a large cluster and provide service as a single logical whole. Multiple consumers can form a group, and each message is delivered to only one consumer within a group.
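The delivery rule in the two concepts above (every subscribed group sees a message, but only one consumer inside each group receives it) can be sketched in memory as follows. `GroupBroker` is a hypothetical name, and picking the group member by round-robin is an assumption made for this sketch; the article does not say how the member is chosen.

```python
from collections import defaultdict


class GroupBroker:
    """Toy in-memory broker illustrating topic and consumer-group delivery."""

    def __init__(self):
        # topic -> group name -> list of consumer callbacks
        self._groups = defaultdict(lambda: defaultdict(list))
        # (topic, group) -> round-robin cursor (an assumption of this sketch)
        self._cursor = defaultdict(int)

    def subscribe(self, topic, group, consumer):
        self._groups[topic][group].append(consumer)

    def publish(self, topic, message):
        # Every group gets the message; exactly one member per group handles it.
        for group, members in self._groups[topic].items():
            index = self._cursor[(topic, group)] % len(members)
            members[index](message)
            self._cursor[(topic, group)] += 1


if __name__ == "__main__":
    a1, a2, b1 = [], [], []
    broker = GroupBroker()
    broker.subscribe("clicks", "group-a", a1.append)
    broker.subscribe("clicks", "group-a", a2.append)
    broker.subscribe("clicks", "group-b", b1.append)
    for i in range(4):
        broker.publish("clicks", f"m{i}")
    print(a1, a2, b1)
```

A group with a single member therefore behaves like a plain subscriber, while a group with several members shares the stream among them.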

4. Key technical points of Kafka

(1) zero-copy

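In Kafka, two things can cause inefficiency: 1) too many network requests and 2) too many byte copies. To reduce the number of requests, Kafka groups messages together, and each request sends a set of messages to the corresponding consumer. To reduce byte copies, it uses the sendfile system call. To understand how sendfile helps, first consider the copies made when a file is sent over a socket in the traditional way: 1) the operating system reads the data from disk into the page cache in kernel space; 2) the application copies the data from kernel space into a user-space buffer; 3) the application writes the data back into kernel space, into a socket buffer; 4) the operating system copies the data from the socket buffer to the NIC buffer, from which it is sent over the network.

Sendfile system call: the data is copied from disk into the page cache once and then sent from the page cache to the network directly, without passing through a user-space buffer.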
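The two ways of sending a file can be compared with a short sketch. It uses Python's standard `socket.sendfile`, which relies on the operating system's sendfile call where that is available and falls back to ordinary sends elsewhere. The helper names (`send_traditional`, `send_zero_copy`, `recv_exact`, `demo`) are hypothetical, and a local socket pair stands in for a real broker-to-consumer connection.

```python
import os
import socket
import tempfile


def send_traditional(sock, path, chunk_size=64 * 1024):
    """read()/send() loop: every chunk passes through a user-space buffer."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # kernel -> user-space copy
            if not chunk:
                break
            sock.sendall(chunk)         # user-space -> kernel socket buffer copy
            sent += len(chunk)
    return sent


def send_zero_copy(sock, path):
    """Let the kernel move the file to the socket (os.sendfile where supported)."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def recv_exact(sock, n):
    """Read exactly n bytes, or fewer if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf.extend(chunk)
    return bytes(buf)


def demo(payload=b"kafka" * 1024):
    """Send the same small file both ways and check that the bytes arrive intact."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    a, b = socket.socketpair()
    try:
        n1 = send_traditional(a, tmp.name)
        ok1 = recv_exact(b, n1) == payload
        n2 = send_zero_copy(a, tmp.name)
        ok2 = recv_exact(b, n2) == payload
    finally:
        a.close()
        b.close()
        os.unlink(tmp.name)
    return ok1, ok2


if __name__ == "__main__":
    print(demo())
```

Both functions deliver the same bytes; the difference is only in how many times the data is copied on the way, which is why the sketch checks correctness rather than trying to measure speed on a tiny payload.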

(2) Exactly once message transfer

How is the state of what each consumer has processed recorded? Kafka stores only the offset up to which each consumer has processed data. This has two advantages: 1) very little state needs to be saved; 2) when a consumer fails and is restarted, it only needs to resume processing from the most recently saved offset.
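A minimal sketch of this bookkeeping is shown below. `OffsetStore` and `consume` are hypothetical names, the log is a plain Python list, and offsets are list indexes rather than byte positions. The offset is saved right after each message is handled, so a restart resumes at the first unhandled message; in a real system a crash between handling a message and saving its offset would cause that message to be handled again, which this toy does not model.

```python
class OffsetStore:
    """Hypothetical store holding one saved offset per consumer."""

    def __init__(self):
        self._committed = {}

    def committed(self, consumer_id):
        return self._committed.get(consumer_id, 0)

    def commit(self, consumer_id, offset):
        self._committed[consumer_id] = offset


def consume(log, store, consumer_id, handle, fail_at=None):
    """Process messages starting at the saved offset; optionally simulate a crash."""
    offset = store.committed(consumer_id)
    while offset < len(log):
        if fail_at is not None and offset == fail_at:
            raise RuntimeError("simulated consumer crash")
        handle(log[offset])
        offset += 1
        store.commit(consumer_id, offset)  # the only state that is kept
    return offset


if __name__ == "__main__":
    log = ["m0", "m1", "m2", "m3", "m4"]
    store, processed = OffsetStore(), []
    try:
        consume(log, store, "c1", processed.append, fail_at=3)
    except RuntimeError:
        pass
    consume(log, store, "c1", processed.append)  # restart resumes at offset 3
    print(processed)
```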

(3) Push/pull

Producers push data to Kafka, and consumers pull data from Kafka.
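The split between push and pull can be sketched with a toy broker. `SimpleBroker` and its `push` and `pull` methods are hypothetical names for illustration only. The point of the sketch is that the consumer decides when to ask and how much to take, so a slow consumer is never flooded.

```python
class SimpleBroker:
    """Toy broker: producers push messages in, consumers pull them out."""

    def __init__(self):
        self._log = []

    def push(self, message):
        """Producer side: append a message and return its position in the log."""
        self._log.append(message)
        return len(self._log) - 1

    def pull(self, offset, max_messages):
        """Consumer side: fetch up to max_messages starting at offset."""
        return self._log[offset:offset + max_messages]


if __name__ == "__main__":
    broker = SimpleBroker()
    for i in range(5):
        broker.push(f"m{i}")
    offset = 0
    while True:
        batch = broker.pull(offset, max_messages=2)  # consumer sets the pace
        if not batch:
            break
        offset += len(batch)
        print(batch)
```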

(4) load balancing and fault tolerance

There is no load balancing mechanism between producers and brokers.

ZooKeeper is used for load balancing between brokers and consumers. All brokers and consumers register with ZooKeeper, which stores some of their metadata. If a broker or consumer changes, all other brokers and consumers are notified.
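The register-and-notify pattern described here can be sketched in memory. `Registry` is a hypothetical stand-in and does not use ZooKeeper's real API. The `assign` helper, which spreads partitions over the current consumers, is also an assumption: the article does not describe how work is redistributed after a membership change.

```python
class Registry:
    """In-memory stand-in for membership tracking (not ZooKeeper's API)."""

    def __init__(self):
        self._watchers = {}  # member name -> callback taking the member list

    def register(self, name, on_change):
        self._watchers[name] = on_change
        self._notify(changed=name)

    def unregister(self, name):
        self._watchers.pop(name, None)
        self._notify(changed=name)

    def _notify(self, changed):
        # Tell every other member what the membership looks like now.
        snapshot = sorted(self._watchers)
        for name, callback in list(self._watchers.items()):
            if name != changed:
                callback(snapshot)


def assign(partitions, consumers):
    """Spread partitions over consumers round-robin (illustrative policy)."""
    ordered = sorted(consumers)
    if not ordered:
        return {}
    result = {c: [] for c in ordered}
    for i, partition in enumerate(partitions):
        result[ordered[i % len(ordered)]].append(partition)
    return result


if __name__ == "__main__":
    registry = Registry()
    seen_by_c1 = []
    registry.register("c1", seen_by_c1.append)
    registry.register("c2", lambda members: None)
    registry.unregister("c2")
    print(seen_by_c1)
    print(assign([0, 1, 2, 3], ["c1", "c2"]))
```

Each member can recompute its share from the notified member list, so no central scheduler is needed in this sketch.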

After reading the above, you should have a clearer picture of how to analyze the Kafka messaging system. Thank you for reading!

© 2024 shulou.com SLNews company. All rights reserved.
