Apache top-level project introduction 2-Kafka

2025-01-18 Update From: SLTechnology News&Howtos

In this series introducing Apache top-level projects, let's start with Kafka. Why? It is popular, and the name is cool.

The official Kafka website is simple and direct: "kafka is a high throughput distributed messaging system." Kafka originated at LinkedIn as the foundation of the pipeline that LinkedIn used to manage activity streams (page views, user behavior analysis, searches) and operational data processing.

Because it is distributed and delivers high throughput, it is widely used and is commonly paired with Cloudera, Hadoop, Storm, Spark, and others.

First of all, as a messaging system, Kafka provides the basic functions, such as decoupling, ordering, and asynchronous processing. Beyond that, its design supports high throughput, offers persistence with O(1) time complexity, handles data volumes at the TB/PB scale, supports both offline and real-time processing (that is, integration with Hadoop and Storm), and scales out horizontally.

Architecture diagram:

As you can see, Kafka has a distributed architecture (of course, in the DT era, a system that cannot scale out horizontally cannot survive). At the front, producers push messages concurrently (batching is supported) to the brokers of the Kafka cluster for a specific topic, and each topic contains multiple partitions for horizontal scaling. Consumers pull messages from the brokers through consumer groups. Kafka uses ZooKeeper (zk) to manage cluster configuration, leader election, and rebalancing. The messaging model is push/pull: producers push, consumers pull.
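The topic, partition, and consumer group relationship can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not Kafka's client code: `choose_partition` and `assign_partitions` are hypothetical helpers, CRC32 stands in for whatever hash a real client uses, and round-robin is just one simple assignment strategy.

```python
import zlib
from typing import Dict, List


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition so the same key always lands together.

    CRC32 is an assumption made for this sketch; real clients use their own hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Spread the partitions of a topic round-robin over one consumer group."""
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    return assignment


if __name__ == "__main__":
    print(choose_partition("user-42", 4))
    print(assign_partitions([0, 1, 2, 3], ["c1", "c2"]))
    # A "rebalance" is simply recomputing the assignment when membership changes.
    print(assign_partitions([0, 1, 2, 3], ["c1", "c2", "c3"]))
```

Adding partitions raises the number of consumers that can work in parallel within one group, which is what makes the topic scale horizontally.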

Let's set up a Kafka cluster:

Send and consume messages, connecting through ZooKeeper (zk):

Use Java to produce and consume messages:

This is fairly straightforward. Note that messages can be sent in batches, which not every message middleware supports. Batch sending is one of the reasons for Kafka's high throughput.
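To see why batching helps, here is a minimal in-memory sketch in Python. It is not the Kafka producer API: `BatchingProducer`, `send_batch`, and `batch_size` are hypothetical names, and `send_batch` stands in for one network request to a broker.

```python
from typing import Callable, List


class BatchingProducer:
    """Buffer messages and hand them over in groups instead of one by one."""

    def __init__(self, send_batch: Callable[[List[str]], None], batch_size: int = 3):
        self._send_batch = send_batch  # stand-in for one request to the broker
        self._batch_size = batch_size
        self._buffer: List[str] = []

    def send(self, message: str) -> None:
        """Queue a message; flush automatically once the batch is full."""
        self._buffer.append(message)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as a single batch."""
        if self._buffer:
            self._send_batch(list(self._buffer))
            self._buffer.clear()


if __name__ == "__main__":
    requests: List[List[str]] = []
    producer = BatchingProducer(requests.append, batch_size=3)
    for i in range(7):
        producer.send("m%d" % i)
    producer.flush()  # push out the final partial batch
    print(len(requests), "requests for 7 messages")
```

Each request carries a fixed overhead, so fewer and larger requests mean more messages per second for the same overhead.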

Here the consumer reads message payloads from streams, iterating over each message stream without stopping, much like listening for messages.
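The "iterate without stopping" style of consumption can be illustrated with a Python generator. This is only a sketch of the idea, assuming an in-process `queue.Queue` as the message source; `message_stream`, `stop`, and `poll_timeout` are hypothetical names and not part of any Kafka client.

```python
import queue
import threading


def message_stream(source: "queue.Queue", stop: threading.Event, poll_timeout: float = 0.1):
    """Yield messages as they arrive and keep waiting until `stop` is set."""
    while not stop.is_set():
        try:
            yield source.get(timeout=poll_timeout)
        except queue.Empty:
            # Nothing arrived within the timeout: keep listening.
            continue


if __name__ == "__main__":
    inbox: "queue.Queue" = queue.Queue()
    for text in ("first", "second", "third"):
        inbox.put(text)

    stop = threading.Event()
    handled = 0
    for payload in message_stream(inbox, stop):
        print("got", payload)
        handled += 1
        if handled == 3:
            stop.set()  # a real listener would normally run until shutdown
```

The loop body looks like ordinary iteration, but the iterator blocks while the source is empty, so the consumer behaves like a listener.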

Why Kafka is efficient and innovative:

1. Message retention. Most message middleware deletes each message as soon as it has been consumed, which makes message handling expensive. Kafka instead keeps the broker stateless with respect to consumption: it introduces message offsets and applies a time-based SLA retention policy, so a message is deleted once it is older than a configured period. According to the official website, consuming Kafka messages is very lightweight: consumers come and go. It sounds like takeout: grab it and go. Moreover, thanks to offsets, consumers can read from any position, including re-reading messages that have already been consumed.

2. Kafka uses the Linux sendfile system call, which copies file data to the network socket inside the Linux kernel and avoids extra copies through user space.

3. Kafka uses ZooKeeper (zk) for distributed coordination, high availability (HA), and fault tolerance. ZooKeeper manages the Kafka brokers: when a broker is added or fails, the ZooKeeper service notifies producers and consumers.

4. Producer performance: a compact, optimized message structure and batch delivery.

5. Consumer performance: an optimized message structure and stateless brokers that rely on offsets, with no need for B+ tree indexing.
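The offset and retention ideas from the list above can be sketched as a small append-only log in Python. This is an in-memory illustration under stated assumptions, not Kafka's storage code: `RetentionLog` and its methods are hypothetical names, and the `now` parameter exists only to make the example deterministic.

```python
import time
from typing import List, Optional, Tuple


class RetentionLog:
    """Append-only log: the broker keeps messages for a time window,
    and each consumer only remembers the next offset it wants to read."""

    def __init__(self, retention_seconds: float):
        self._retention = retention_seconds
        self._base_offset = 0  # offset of the oldest retained message
        self._entries: List[Tuple[float, str]] = []  # (timestamp, payload)

    def append(self, payload: str, now: Optional[float] = None) -> int:
        """Add a message at the end and return its offset."""
        timestamp = time.time() if now is None else now
        self._entries.append((timestamp, payload))
        return self._base_offset + len(self._entries) - 1

    def expire(self, now: Optional[float] = None) -> None:
        """Drop messages older than the retention window, consumed or not."""
        cutoff = (time.time() if now is None else now) - self._retention
        drop = 0
        while drop < len(self._entries) and self._entries[drop][0] < cutoff:
            drop += 1
        del self._entries[:drop]
        self._base_offset += drop

    def read(self, offset: int, max_messages: int = 10) -> List[Tuple[int, str]]:
        """Return (offset, payload) pairs starting at `offset`.

        Locating a message is plain index arithmetic, so no tree index is needed.
        """
        start = max(offset, self._base_offset) - self._base_offset
        chunk = self._entries[start:start + max_messages]
        return [(self._base_offset + start + i, payload)
                for i, (_, payload) in enumerate(chunk)]


if __name__ == "__main__":
    log = RetentionLog(retention_seconds=60)
    for second, payload in enumerate(["a", "b", "c"]):
        log.append(payload, now=float(second))
    print(log.read(0))  # first pass
    print(log.read(0))  # replay: reading does not delete anything
    log.expire(now=61.5)
    print(log.read(0))  # the oldest messages have aged out
```

Because reading never mutates the log, any number of consumers can read at their own pace, and a consumer can rewind simply by asking for a smaller offset.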

Overall, Kafka performs very well, so it can serve as a replacement for common message middleware, especially when feeding Hadoop or stream processing. It is also an excellent choice for handling site logs, user behavior analysis, or offline log processing.

Well, that's it for now. Writing first thing in the morning is not very reliable, and with time tight and the workload heavy, I hope you will forgive any mistakes. Some of the pictures are borrowed from the Internet.

Official account: technical geek TechBooster
