Good Programmer Big Data Training: Several Important Questions About Kafka

Updated 2025-04-02. From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/03

Good Programmer Big Data Training shares several important questions about Kafka:

1. The concept of a segment

A topic has one or more partitions, and each partition is made up of multiple segments. The segment size is set in the Kafka configuration file (log.segment.bytes on the broker), so full segments are all roughly the same size. Each segment consists of a data file (.log) and its corresponding index files (.index and .timeindex).
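To make the segment layout concrete, here is a minimal sketch of how an offset can be mapped to the segment that holds it. It relies on the fact that Kafka names each segment file after its base offset, zero-padded to 20 digits; the function names and the sample base offsets are hypothetical and are not part of Kafka's API.

```python
from bisect import bisect_right


def segment_file_name(base_offset):
    """Build the data file name for a segment from its base offset (20 digits, zero-padded)."""
    return f"{base_offset:020d}.log"


def find_segment(base_offsets, offset):
    """Return the base offset of the segment containing `offset`.

    `base_offsets` must be sorted in ascending order.
    """
    i = bisect_right(base_offsets, offset) - 1
    if i < 0:
        raise ValueError("offset precedes the first segment")
    return base_offsets[i]


# Hypothetical base offsets of three segments in one partition.
bases = [0, 368769, 737337]
print(segment_file_name(find_segment(bases, 400000)))
```

The binary search over base offsets is only an illustration of why fixed-size, offset-named segments make lookups cheap; the broker's real lookup also consults the per-segment index file.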

2. Data storage mechanism (why data writes are fast)

First, the broker receives the data and puts it into the operating system's (Linux) page cache rather than managing its own in-process cache.

The page cache uses as much free memory as it can. Kafka also uses sendfile (zero-copy) to minimize copying between the operating system and the application, and it writes data to disk sequentially. According to the source, write speeds can reach about 600 MB/s.
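The sendfile idea can be sketched with Python's standard library. This is only an illustration of the system call, not Kafka's own code: `socket.sendfile` uses `os.sendfile` where the platform supports it and falls back to ordinary `send()` otherwise. The helper names, the payload, and the use of a local socket pair in place of a real consumer connection are assumptions made for the demo.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path, sock):
    """Send a whole file over a stream socket and return the number of bytes sent."""
    with open(path, "rb") as f:
        # Where os.sendfile is available, the kernel moves the bytes from the
        # page cache to the socket without copying them into this process.
        return sock.sendfile(f)


def demo():
    """Append records to a temp file sequentially, then ship it over a local socket pair."""
    payload = b"kafka-log-record\n" * 8
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)  # sequential append, like writing to a log segment
        path = tmp.name
    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(path, sender)
        sender.close()
        received = b""
        while len(received) < sent:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            received += chunk
        return received
    finally:
        sender.close()
        receiver.close()
        os.remove(path)


print(len(demo()))
```

The payload is kept small so that it fits in the socket buffer and the single-threaded demo does not block.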

3. How do consumers handle load balancing?

When the number of consumers in a group changes, Kafka triggers a rebalance. For each consumer, it first determines the starting partition number that the consumer will read from, then calculates how many partitions the consumer should read, and finally takes the hash code of the starting partition number modulo the number of partitions.
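The first two steps above (a starting partition and a partition count per consumer) can be sketched as a range-style assignment. This is a simplified illustration under stated assumptions, not Kafka's implementation: the function name `range_assign` is hypothetical, consumers are ordered by sorting their ids, and the hash-modulo step mentioned in the text is not reproduced.

```python
def range_assign(consumers, num_partitions):
    """Split partitions 0..num_partitions-1 into contiguous ranges, one per consumer."""
    members = sorted(consumers)
    per_consumer, extra = divmod(num_partitions, len(members))
    assignment = {}
    for i, member in enumerate(members):
        # The first `extra` consumers each take one additional partition.
        start = per_consumer * i + min(i, extra)
        count = per_consumer + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
    return assignment


# Two consumers, then a third one joins: the ranges are recomputed.
print(range_assign(["c1", "c2"], 5))
print(range_assign(["c1", "c2", "c3"], 5))
```

Calling the function again with the new member list mimics what a rebalance does when a consumer joins or leaves the group.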

1. Data distribution strategy

By default, Kafka uses its own partitioner (DefaultPartitioner) to decide which partition a message goes to. You can also supply a custom partitioner by implementing the Partitioner interface (a trait in the older Scala client) and its partition method.
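The idea behind key-based partitioning can be sketched as "hash the key, then take it modulo the partition count". The sketch below is written in Python for illustration only. Kafka's own default partitioner hashes keys with murmur2; `zlib.crc32` is swapped in here as a stand-in, and the class names and the `vip-` routing rule are hypothetical.

```python
import zlib


class HashPartitioner:
    """Route each key to hash(key) % num_partitions (CRC32 stands in for murmur2)."""

    def partition(self, key, num_partitions):
        return zlib.crc32(key) % num_partitions


class VipPartitioner(HashPartitioner):
    """Hypothetical custom rule: keys starting with b'vip-' always go to partition 0."""

    def partition(self, key, num_partitions):
        if key.startswith(b"vip-"):
            return 0
        return super().partition(key, num_partitions)


p = VipPartitioner()
print(p.partition(b"vip-42", 6))
print(p.partition(b"user-1", 6) == p.partition(b"user-1", 6))  # same key, same partition
```

Because the mapping is deterministic, all messages with the same key land in the same partition, which is what makes per-key ordering possible.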

2. How does Kafka ensure that data is not lost?

After Kafka receives data, it stores it according to the number of replicas specified when the topic was created. Kafka synchronizes the replica data itself, and this multi-replica mechanism protects the data against loss.

3. Can Kafka guarantee the global order of data in a topic?

Kafka guarantees ordering within a partition but not across partitions.

How can global order be achieved? The simplest way is to create the topic with a single partition.
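A small simulation shows the difference between per-partition order and global order. It is a toy model, not Kafka client code: the function names are hypothetical, messages are assumed to be keyless and spread round-robin, and the consumer is assumed to drain one partition after another.

```python
def produce(messages, num_partitions):
    """Append messages to partitions round-robin and return the partition logs."""
    partitions = [[] for _ in range(num_partitions)]
    for i, msg in enumerate(messages):
        partitions[i % num_partitions].append(msg)
    return partitions


def consume(partitions):
    """Read each partition from start to end, one partition after another."""
    return [msg for log in partitions for msg in log]


messages = list(range(6))
multi = produce(messages, 3)
single = produce(messages, 1)

# Each partition log is ordered on its own.
print(all(log == sorted(log) for log in multi))
# The combined stream matches the send order only with a single partition.
print(consume(multi) == messages, consume(single) == messages)
```

The trade-off of the single-partition approach is that the topic can then be read by only one consumer per group, so ordering is bought at the cost of parallelism.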

If you want to consume data that has already been consumed:

1. Use a different consumer group.

2. With some configuration, data produced in the online cluster can be mirrored to another cluster, and that dedicated cluster can then process the data in bulk.
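Option 1 works because consumed positions are tracked per consumer group, so a new group has no recorded position for the topic. The toy model below sketches that bookkeeping. `TinyLog` and its methods are hypothetical, and a new group is assumed to start from the earliest record, which in a real consumer corresponds to setting `auto.offset.reset` to `earliest` (the client default is `latest`).

```python
class TinyLog:
    """A single-partition log that remembers a read position per consumer group."""

    def __init__(self, records):
        self.records = list(records)
        self.committed = {}  # group id -> next offset to read

    def poll(self, group, max_records=10):
        """Return the next batch for `group` and advance that group's position."""
        start = self.committed.get(group, 0)  # assumption: new groups start at offset 0
        batch = self.records[start:start + max_records]
        self.committed[group] = start + len(batch)
        return batch


log = TinyLog(["a", "b", "c"])
print(log.poll("group-1"))  # first read by group-1
print(log.poll("group-1"))  # nothing new for group-1
print(log.poll("group-2"))  # a different group reads the same data again
```

Reading as a different group does not remove or alter the data; it only adds a second, independent read position.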


© 2024 shulou.com SLNews company. All rights reserved.
