Linux read and write Mechanism and how to optimize it

2025-03-31 Update From: SLTechnology News&Howtos



This article introduces the Linux file read and write mechanism and how to optimize around it, starting from the page cache and dirty pages and moving on to readahead and mmap.

Caching

A cache is a component that reduces the average time a fast device needs to access a slow one. File IO involves both memory and disk, and memory operations are far faster than disk operations. If every read or write call operated on the disk directly, throughput would be limited and disk wear would increase, so the operating system caches data for both read and write operations.

Page Cache

The page cache (Page Cache) is a buffer between memory and files: an area of memory through which all file IO (including network filesystems) passes. The operating system maps a file down to the page level through a series of data structures such as inode, address_space, and struct page. We will not discuss these structures and their relationships here; it is enough to know that the page cache exists and plays a central role in file IO. To a large extent, optimizing file reads and writes means optimizing the use of the page cache.

Dirty Page

A page in the page cache corresponds to a region of a file. If the cached page and the corresponding file region are inconsistent, the page is called a dirty page (Dirty Page). Modifying a cached page or creating a new one produces dirty pages, which remain dirty until they are written back to disk.

View page cache size

There are two ways to view the page cache size on Linux. One is the free command.

$ free
             total       used       free     shared    buffers     cached
Mem:      20470840    1973416   18497424          0     270208    1202864
-/+ buffers/cache:     500344   19970496
Swap:            0          0          0

The cached column is the page cache size, in kB.

The other is to look at /proc/meminfo directly. Here we only focus on two fields.

Cached:          1202872 kB
Dirty:                52 kB

Cached is the page cache size; Dirty is the dirty page size.
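
These fields can also be read programmatically. The sketch below parses /proc/meminfo in C; the function names (`meminfo_kb`, `field_kb`) are our own, not a system API.

```c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Parse one /proc/meminfo line such as "Cached:  1202872 kB".
 * Returns the value in kB, or -1 if the line does not start with key. */
long meminfo_kb(const char *line, const char *key)
{
    size_t klen = strlen(key);
    if (strncmp(line, key, klen) != 0 || line[klen] != ':')
        return -1;
    return strtol(line + klen + 1, NULL, 10);
}

/* Scan /proc/meminfo for a field, e.g. field_kb("Cached") or field_kb("Dirty"). */
long field_kb(const char *key)
{
    FILE *fp = fopen("/proc/meminfo", "r");
    if (!fp)
        return -1;
    char line[256];
    long v = -1;
    while (v < 0 && fgets(line, sizeof(line), fp))
        v = meminfo_kb(line, key);
    fclose(fp);
    return v;
}
```

For example, `field_kb("Dirty")` returns the current dirty-page size in kB, which is handy for watching writeback happen during the experiments below.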

Dirty page writeback parameters

Linux has several parameters that control how the operating system writes dirty pages back to disk.

$ sysctl -a 2>/dev/null | grep dirty
vm.dirty_background_ratio = 10
vm.dirty_background_bytes = 0
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000

vm.dirty_background_ratio is the percentage of memory that may be filled with dirty pages before the background writeback threads start flushing them to disk (vm.dirty_background_bytes is the same limit expressed in bytes). vm.dirty_ratio is the hard limit: the percentage of memory that dirty data may occupy; once it is exceeded, new IO requests block until dirty data has been written to disk (vm.dirty_bytes is the byte-based equivalent). vm.dirty_writeback_centisecs sets how often writeback runs, in hundredths of a second, so 500 means every 5 seconds. vm.dirty_expire_centisecs sets how long dirty data may live, also in hundredths of a second; here it is 3000, i.e. 30 seconds, so during writeback any dirty data that has been in memory for more than 30 seconds is written to disk.

These parameters can be modified with root permission via commands such as sudo sysctl -w vm.dirty_background_ratio=5, or, as the root user, echo 5 > /proc/sys/vm/dirty_background_ratio.

File read and write process

With the concepts of page cache and dirty pages in place, let's look at how files are read and written.

Reading a file:

1. The user initiates a read operation.
2. The operating system looks up the page cache:
a. On a miss, a page fault is triggered; a page cache entry is created and the corresponding page is filled from disk.
b. On a hit, the requested content is returned directly from the page cache.
3. The user's read call completes.

Writing a file:

1. The user initiates a write operation.
2. The operating system looks up the page cache:
a. On a miss, a page fault is triggered; a page cache entry is created and the user's data is written into it.
b. On a hit, the user's data is written directly into the page cache.
3. The user's write call completes.
4. The modified page becomes a dirty page. The operating system then has two mechanisms for writing it back to disk: the user manually calls fsync(), or the kernel writeback threads (historically the pdflush process) periodically flush dirty pages to disk.
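
The write path above can be exercised in a few lines of C. This is a minimal sketch; the function name `write_and_sync` is ours, and the path passed to it in the example is illustrative.

```c
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

/* Write `data` to `path` and force the resulting dirty pages to disk.
 * write() returns as soon as the data is in the page cache (step 3);
 * fsync() is the manual flush from step 4. Returns 0 on success. */
int write_and_sync(const char *path, const char *data)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    /* The data lands in the page cache; the page is now dirty. */
    if (write(fd, data, strlen(data)) < 0) {
        close(fd);
        return -1;
    }
    /* Force the dirty page to disk instead of waiting for periodic writeback. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Without the fsync() call, the function would return just as quickly, but the data would sit in the page cache until the writeback parameters described earlier triggered a flush.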

There is a correspondence between page cache pages and disk files, maintained by the operating system. Reads and writes to the page cache are completed in kernel mode and are transparent to the user.

Optimizing file reads and writes

Different optimizations suit different usage scenarios: file sizes, read/write frequency, and so on. Here we do not consider schemes that modify system parameters; changing them always trades one property for another, and picking the balance point depends heavily on the business, e.g. whether strong data consistency is required or data loss can be tolerated. The optimization ideas come down to two points:

1. Maximize the use of page cache

2. Reduce the number of system API calls

The first point is easy to understand: making every IO operation hit the page cache is much faster than going to disk. The system APIs in the second point are mainly read and write. Every system call crosses from user mode to kernel mode, and some are accompanied by copies of data in memory, so reducing system calls also improves performance in some scenarios.

Readahead

readahead is a non-blocking system call that asks the operating system to pre-read a file's contents into the page cache, then returns immediately. The function prototype is as follows:

ssize_t readahead(int fd, off64_t offset, size_t count);

Under normal circumstances, calling read immediately after readahead does not improve read speed; readahead is usually issued for a batch of files, or some time before the reads. Suppose we need to read 1000 files of 1 MB each in a row. The two schemes below illustrate this in pseudocode.

Scheme 1: call read directly

char *buf = (char *)malloc(10 * 1024 * 1024);
for (int i = 0; i < 1000; i++) {
    int fd = open(...);            /* open the i-th file */
    read(fd, buf, size);
    /* do something with buf */
    close(fd);
}

Scheme 2: call readahead first, then read

int *fds = (int *)malloc(sizeof(int) * 1000);
int *fd_size = (int *)malloc(sizeof(int) * 1000);
for (int i = 0; i < 1000; i++) {
    fds[i] = open(...);            /* open the i-th file, record its size in fd_size[i] */
    readahead(fds[i], 0, fd_size[i]);
}
for (int i = 0; i < 1000; i++) {
    read(fds[i], buf, fd_size[i]);
    /* do something with buf */
    close(fds[i]);
}

If you are interested, write the code and test it yourself. Note that before each test you must write back the dirty pages and drop the page cache by executing the following command:

$ sync && sudo sysctl -w vm.drop_caches=3

Then check the Cached and Dirty entries in /proc/meminfo to confirm the caches were actually dropped.

In tests, the second scheme reads about 10% faster than the first. In this scenario readahead is issued immediately before the batch of reads, so the headroom is limited; in a scenario where readahead can be called some time before the reads, the reads themselves become dramatically faster.

This scheme simply leverages the operating system's page cache: it triggers the OS to read files into the page cache ahead of time, and the OS already has a mature set of mechanisms for page fault handling, cache hits, and cache eviction. Users could manage a cache of their own data instead, but it would not differ much from using the page cache directly, and it would add maintenance cost.

Mmap

mmap is a method of memory-mapping files: a file (or other object) is mapped into a process's address space, establishing a one-to-one correspondence between the file's disk address and a virtual address in the process's virtual address space. The function prototype is as follows:

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

Once such a mapping is established, the process can read and write that memory region through an ordinary pointer, and the system automatically writes the dirty pages back to the corresponding file on disk. File operations are thus completed without calling read, write, or other system call functions.

mmap not only reduces read/write system calls but also reduces memory copies. When read is called, the complete path is: the operating system reads the disk file into the page cache, then copies the data from the page cache into the buffer passed to read. With mmap, the operating system only needs to read the disk into the page cache; the user then manipulates the mapped memory directly through a pointer, eliminating the data copy from kernel space to user space.

mmap suits frequent reads and writes to the same region. For example, suppose a 64 MB file stores index information that we need to modify frequently and persist to disk. We can map the file into the user's virtual memory with mmap and modify the region through a pointer; the modified parts are flushed back to disk automatically by the operating system, or we can call msync to flush them manually.
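
The index-file pattern above can be sketched in a few lines. This is a minimal illustration, not production code; the function name `mmap_patch` and the path used in the example are our own.

```c
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

/* Map `path`, write `len` bytes of `data` at offset `off` through the
 * mapping, and flush with msync. Returns 0 on success, -1 on error. */
int mmap_patch(const char *path, size_t size, off_t off,
               const char *data, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    /* Make sure the file is large enough to back the whole mapping. */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(p + off, data, len);       /* a plain memory write: no write() syscall */
    int rc = msync(p, size, MS_SYNC); /* force the dirty pages back to the file */

    munmap(p, size);
    close(fd);
    return rc;
}
```

MAP_SHARED is what makes the pointer writes visible to the file; with MAP_PRIVATE the modifications would stay in a private copy and never reach disk.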

This is the end of the article on the Linux read and write mechanism and how to optimize it. Thank you for reading!
