How to achieve message persistence in Kafka

2025-04-07 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how Kafka achieves message persistence: why it stores messages on the file system, and how partitions, log files, and index files fit together.

1. Overview of Kafka message persistence

Kafka relies on the file system to store and cache messages. Conventional wisdom holds that hard drives are slow, so can a file system-based architecture deliver excellent performance? In practice, the speed of a hard drive depends heavily on how it is used. Meanwhile, caching messages in JVM memory has the following disadvantages:

The memory overhead of objects is very high, often twice the size of the data being stored or more

As the amount of data in the heap grows, garbage collection (GC) becomes slower and slower

In fact, linear (sequential) disk writes perform far better than writes to arbitrary locations. The operating system heavily optimizes linear reads and writes (read-ahead, write-behind, and so on), and in some cases they can even be faster than random memory access. So, unlike the common design in which data is cached in memory and later flushed to the hard disk, Kafka writes data directly to a log on the file system:

Write operations: append the data sequentially to the end of the file

Read operations: read from the file

This design has the following benefits:

Read operations do not block write operations or each other, and the amount of stored data does not affect performance

Disk space is far less constrained than memory, so capacity is not bounded by the memory limit

Disk access is linear and therefore fast, and data can be retained longer and more reliably
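The write and read pattern described above can be sketched in a few lines of Python. This is only an illustration, not Kafka's actual on-disk format: the AppendOnlyLog class, its 4-byte length prefix, and the file name are assumptions made for the example.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Toy append-only log (hypothetical): each record is a 4-byte length prefix plus payload."""

    def __init__(self, path):
        self.path = path
        # Create the file if it does not exist yet.
        open(path, "ab").close()

    def append(self, payload: bytes) -> int:
        """Append one record at the end of the file and return its starting byte position."""
        with open(self.path, "ab") as f:
            position = f.seek(0, os.SEEK_END)
            f.write(struct.pack(">I", len(payload)) + payload)
        return position

    def read_all(self):
        """Read every record sequentially from the start of the file."""
        records = []
        with open(self.path, "rb") as f:
            while True:
                header = f.read(4)
                if len(header) < 4:
                    break  # reached the end of the log
                (length,) = struct.unpack(">I", header)
                records.append(f.read(length))
        return records


# The temporary directory and file name are stand-ins for a real log location.
log = AppendOnlyLog(os.path.join(tempfile.mkdtemp(), "example.log"))
first_position = log.append(b"hello")
second_position = log.append(b"kafka")
all_records = log.read_all()
```

Writes only ever touch the end of the file and reads walk forward through it, so both stay sequential. The sketch never calls fsync, so the written bytes may sit in the operating system's page cache for a while before reaching the disk.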

2. Analysis of the Kafka persistence principle

A Topic is divided into multiple Partitions, and at the storage level each Partition is an append-only log file. Messages belonging to a Partition are appended directly to the end of its log file, and the position of each message in the file is called its offset.
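As a rough illustration of how topics, partitions, and offsets relate, here is a minimal in-memory sketch. The Partition and Topic classes are hypothetical and are not part of any Kafka API; a real broker keeps each partition in files on disk rather than in a list.

```python
class Partition:
    """Hypothetical append-only partition: messages are only ever added at the end."""

    def __init__(self):
        self._messages = []

    def append(self, message: bytes) -> int:
        """Append a message and return the offset assigned to it."""
        offset = len(self._messages)
        self._messages.append(message)
        return offset

    def read(self, offset: int) -> bytes:
        """Return the message stored at the given offset."""
        return self._messages[offset]


class Topic:
    """Hypothetical topic: a name plus a fixed number of independent partitions."""

    def __init__(self, name: str, num_partitions: int):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]


topic = Topic("mytopic1", 3)
offset_a = topic.partitions[0].append(b"a")
offset_b = topic.partitions[0].append(b"b")
offset_c = topic.partitions[1].append(b"c")
```

Offsets are assigned per partition, so the first message written to partition 1 gets offset 0 even though partition 0 already holds two messages.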

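Earlier we created mytopic1 with three partitions. Each partition has its own directory under the broker's log directory, named after the topic and the partition number (mytopic1-0, mytopic1-1, and mytopic1-2), and we can inspect them there.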
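The directory layout for a three-partition topic can be mimicked with a short sketch. The create_partition_dirs helper is hypothetical, and the temporary directory is only a stand-in for the directory the broker is configured to store its logs in; the topic-partition naming pattern is the part that reflects Kafka's convention.

```python
import os
import tempfile


def create_partition_dirs(log_dir: str, topic: str, num_partitions: int):
    """Create one directory per partition, named <topic>-<partition number>."""
    names = []
    for partition in range(num_partitions):
        name = f"{topic}-{partition}"
        os.makedirs(os.path.join(log_dir, name), exist_ok=True)
        names.append(name)
    return names


# Assumption: a temporary directory stands in for the broker's log directory.
log_dir = tempfile.mkdtemp()
created = create_partition_dirs(log_dir, "mytopic1", 3)
listing = sorted(os.listdir(log_dir))
```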

Within a partition directory, Kafka's files come in pairs: an index file and a log file. The log file stores the messages, and the index file stores metadata that points to the offset address of a message in the corresponding log file. For example, an index entry of 2,128 refers to the second message in the log file, at offset address 128. With the physical address specified in the index file, the message can be located in the log file.
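The lookup described above can be sketched as follows. This is a simplified model under stated assumptions, not Kafka's real file format: build_segment and lookup are hypothetical helpers, records use a made-up 4-byte length prefix, and the index holds (message number, byte position) pairs. The first payload is sized so that the second message starts at byte 128, matching the 2,128 example.

```python
import bisect
import struct


def build_segment(messages, index_every=2):
    """Return (log_bytes, index); index holds (message_number, byte_position) pairs."""
    log = bytearray()
    index = []
    for number, payload in enumerate(messages, start=1):
        if (number - 1) % index_every == 0:
            # Only every index_every-th message gets an index entry (sparse index).
            index.append((number, len(log)))
        log += struct.pack(">I", len(payload)) + payload
    return bytes(log), index


def lookup(log, index, target):
    """Find the nearest index entry at or before target, then scan forward in the log."""
    numbers = [number for number, _ in index]
    i = bisect.bisect_right(numbers, target) - 1
    if i < 0:
        raise KeyError(target)
    number, position = index[i]
    while position < len(log):
        (length,) = struct.unpack_from(">I", log, position)
        if number == target:
            return log[position + 4 : position + 4 + length]
        position += 4 + length
        number += 1
    raise KeyError(target)


# 124 payload bytes + 4 header bytes = 128, so message 2 starts at byte 128.
messages = [b"a" * 124, b"second", b"third", b"fourth"]
_, dense_index = build_segment(messages, index_every=1)
log_bytes, sparse_index = build_segment(messages, index_every=2)
found = lookup(log_bytes, sparse_index, 2)
```

With a sparse index, not every message has its own entry: the lookup jumps to the closest preceding entry and reads forward from there, which keeps the index small at the cost of a short sequential scan.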

We can use the tools that ship with Kafka, such as the DumpLogSegments tool, to view the data stored in a log file.
