
What is a MySQL cache pool

2025-04-06 Update From: SLTechnology News&Howtos (Shulou)


This article explains what the MySQL cache pool (the InnoDB Buffer Pool) is: why it exists, how it is structured, and how it handles reads and writes.

Preface

Interviewer: Classmate, can you tell me about the MySQL cache pool?

Crazy chat: Ah, is it that difficult? Let me organize my thoughts. (Inner monologue: isn't this dead simple? I could talk about it for half an hour!)

Interviewer: Sure, I'll give you a minute to think about it.

.... A minute later.

Crazy chat: I'm ready. Listen up. I'm going to start the show.

Why is there a cache pool?

MySQL's InnoDB storage engine stores its data on disk and manages it in units of pages.

In a database system, the gap between CPU speed and disk speed is very large. The cache pool was introduced to narrow that gap as much as possible.

So the cache pool is simply a "memory area": it uses the speed of memory to compensate for the slowness of the disk and so reduces the impact of disk access on database performance.

The basic principle of cache pool

Read operation:

When a page is read for the first time, it is read from disk and stored in the cache pool. The next time the same page is read, the engine first checks whether the page is already in the cache pool.

If it is, the page is said to be hit in the cache pool and is read directly from memory; otherwise, the page is read from disk.

Write operation:

When a page is modified, the copy in the cache pool is modified first and is then flushed to disk at a certain frequency. The page is not flushed back to disk on every change; instead, it is flushed through the checkpoint mechanism.

As you can see, both read and write operations work on the cache pool rather than directly on the disk.
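The read and write paths described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name SimpleCachePool, its methods, and the dict standing in for the disk are all made up for illustration, and checkpoint() here simply writes every dirty page, which is far simpler than what InnoDB really does.

```python
class SimpleCachePool:
    """Toy cache pool: pages live in memory, a plain dict stands in for the disk."""

    def __init__(self, disk):
        self.disk = disk        # hypothetical "disk": page_no -> content
        self.pages = {}         # cached pages: page_no -> content
        self.dirty = set()      # pages changed in memory but not yet on disk
        self.disk_reads = 0     # how many times we had to go to disk

    def read(self, page_no):
        if page_no in self.pages:       # hit: serve the page from memory
            return self.pages[page_no]
        self.disk_reads += 1            # miss: read from disk and cache the page
        self.pages[page_no] = self.disk[page_no]
        return self.pages[page_no]

    def write(self, page_no, content):
        self.read(page_no)              # make sure the page is in the pool
        self.pages[page_no] = content   # modify only the in-memory copy
        self.dirty.add(page_no)

    def checkpoint(self):
        """Write all dirty pages back to disk (stand-in for the checkpoint mechanism)."""
        for page_no in sorted(self.dirty):
            self.disk[page_no] = self.pages[page_no]
        self.dirty.clear()


disk = {1: "a", 2: "b"}
pool = SimpleCachePool(disk)
pool.read(1)
pool.read(1)                # second read is a hit, no disk access
pool.write(2, "B")          # the disk copy is untouched for now
before = disk[2]
pool.checkpoint()           # now the change lands on disk
after = disk[2]
```

Note that neither read() nor write() touches the disk once the page is cached; only the miss path and checkpoint() do.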

Cache pool structure

The Buffer Pool is a contiguous memory space that the InnoDB storage engine manages in the form of pages.

The structure of the cache pool is shown below:

The cache pool holds data pages, index pages, the insert buffer, the adaptive hash index, lock information, and data dictionary information.

Data pages and index pages consume most of the memory.

"But how does InnoDB manage so many pages in the cache pool?"

To manage these cached pages, InnoDB creates some control information for each cache page, including:

Tablespace number (space id)

Page number (page number)

The address of the page in the Buffer Pool

Some lock information and LSN (log sequence number) information

Other control information

The control information for every cache page occupies the same amount of memory. The piece of memory holding the control information for one page is called a "control block".

Control blocks and cache pages correspond one to one, and both are stored in the Buffer Pool: the control blocks are stored at the front of the Buffer Pool and the cache pages behind them.
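To make the one-to-one pairing concrete, here is a rough Python sketch. The names ControlBlock, build_pool and frame are hypothetical, and the control blocks are kept as ordinary Python objects next to one contiguous buffer rather than at the front of that same memory, so this only illustrates the bookkeeping, not the real layout. The 16 KB page size is InnoDB's default.

```python
from dataclasses import dataclass

PAGE_SIZE = 16 * 1024  # InnoDB's default page size is 16 KB


@dataclass
class ControlBlock:
    space_id: int = -1      # tablespace number (-1: no page loaded yet)
    page_no: int = -1       # page number
    frame_offset: int = 0   # where the cache page sits inside the pool memory
    lsn: int = 0            # simplified LSN bookkeeping


def build_pool(n_pages):
    """Allocate one contiguous buffer and pair each cache page with a control block."""
    memory = bytearray(n_pages * PAGE_SIZE)
    blocks = [ControlBlock(frame_offset=i * PAGE_SIZE) for i in range(n_pages)]
    return blocks, memory


def frame(memory, block):
    """Return a writable view of the cache page that a control block describes."""
    start = block.frame_offset
    return memoryview(memory)[start:start + PAGE_SIZE]


blocks, memory = build_pool(4)
page1 = frame(memory, blocks[1])
page1[0] = 7    # writing through the view changes the pool memory itself
```

Because every control block has the same size and every page has the same size, the address of a page can always be derived from its position, which is what frame_offset stands for here.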

Schematic diagram of memory space corresponding to Buffer Pool:

Cache pool parameter settings

innodb_buffer_pool_size: the size of the cache pool. A common recommendation for a dedicated database server is to set it to at most about 80% of physical memory.

innodb_buffer_pool_instances: the number of cache pool instances. It is generally recommended to set it to the number of CPU cores. Multiple cache pool instances reduce contention for resources inside the database and improve its ability to handle concurrent access.

innodb_old_blocks_pct: the percentage of the whole LRU list length taken up by the old generation. The default is 37, that is, the old generation takes roughly 3/8 of the list.

innodb_old_blocks_time: the residence time window of the old generation, in milliseconds. The default is 1000, that is, a page is moved to the head of the new generation only if it is both accessed and has stayed in the old generation for more than 1 second.
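How the old-generation percentage and the residence time window work together can be sketched as follows. This is a simplified Python model, not InnoDB's actual code: the class name MidpointLRU, its fields, and the explicit now_ms argument are assumptions made for illustration, the clock starts when a page enters the old generation, and old_pct=37 and old_time_ms=1000 mirror the MySQL defaults.

```python
class MidpointLRU:
    """Toy LRU list split into a young (new) and an old generation."""

    def __init__(self, capacity, old_pct=37, old_time_ms=1000):
        self.capacity = capacity
        self.old_pct = old_pct
        self.old_time_ms = old_time_ms
        self.young = []         # index 0 is the head (most recently used)
        self.old = []           # index 0 is the midpoint, the last item is evicted first
        self.entered_old = {}   # page -> time it entered the old generation

    def _rebalance(self, now_ms):
        # Evict from the tail of the old generation when over capacity.
        while len(self.young) + len(self.old) > self.capacity:
            victim = self.old.pop() if self.old else self.young.pop()
            self.entered_old.pop(victim, None)
        # Keep roughly old_pct percent of the pages in the old generation.
        target_old = (len(self.young) + len(self.old)) * self.old_pct // 100
        while len(self.old) < target_old and self.young:
            page = self.young.pop()
            self.old.insert(0, page)
            self.entered_old[page] = now_ms

    def access(self, page, now_ms):
        if page in self.young:
            self.young.remove(page)
            self.young.insert(0, page)      # move to the head of the young generation
        elif page in self.old:
            # Promote only if the page has stayed in the old generation long enough.
            if now_ms - self.entered_old[page] >= self.old_time_ms:
                self.old.remove(page)
                del self.entered_old[page]
                self.young.insert(0, page)
        else:
            self.old.insert(0, page)        # new pages enter at the midpoint
            self.entered_old[page] = now_ms
        self._rebalance(now_ms)


lru = MidpointLRU(capacity=4)
lru.access(1, now_ms=0)
lru.access(2, now_ms=0)
lru.access(1, now_ms=2000)      # stayed more than 1 s, so it is promoted
lru.access(2, now_ms=2000)
for page in range(10, 20):      # a one-off scan of ten cold pages
    lru.access(page, now_ms=2001)
lru.access(19, now_ms=2002)     # accessed again too soon, so it stays old
```

In this run the scan only churns the old generation, so the two hot pages 1 and 2 stay in the young generation.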

Cache pool management

The linked lists that cache pool management relies on

1. Free linked list

When the MySQL server starts, it initializes the Buffer Pool: it allocates the Buffer Pool's memory and divides it into pairs of control blocks and cache pages. At this point no real disk pages are cached in the Buffer Pool; as the program runs, pages on disk are continually cached into it.

To record which cache pages are available, every free page is wrapped in a node, and the nodes form a linked list called the Free linked list. Because all the cache pages in a newly initialized Buffer Pool are free, every cache page is added to the Free list.

To make the Free linked list easier to manage, some "control information" is defined for the list itself: the addresses of its head and tail nodes and the number of nodes currently in the list.

In addition, each node of the Free linked list records the address of a "cache page control block", and each control block records the address of its cache page, so every Free list node corresponds to one free cache page.
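The structure just described (list control information, nodes, control blocks, cache page addresses) might look like this in Python. All class and field names here are hypothetical, and integer offsets stand in for real memory addresses.

```python
class ControlBlock:
    def __init__(self, page_addr):
        self.page_addr = page_addr  # address (offset) of the cache page it describes


class FreeNode:
    def __init__(self, block):
        self.block = block          # "address" of the cache page control block
        self.prev = None
        self.next = None


class FreeList:
    """Doubly linked list with head, tail and node count as control information."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.count = 0

    def push(self, block):
        """Add a free cache page (through its control block) at the tail."""
        node = FreeNode(block)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        self.count += 1

    def pop(self):
        """Take a free cache page from the head; return None if the list is empty."""
        node = self.head
        if node is None:
            return None
        self.head = node.next
        if self.head is None:
            self.tail = None
        else:
            self.head.prev = None
        self.count -= 1
        return node.block


free_list = FreeList()
for i in range(3):                      # a freshly initialized pool: every page is free
    free_list.push(ControlBlock(page_addr=i * 16384))
first = free_list.pop()                 # a page is about to be loaded from disk
```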

Structure diagram of the Free linked list:

2. LRU linked list

The LRU linked list manages the pages that have been read. When the database starts, the LRU linked list is empty and all pages are in the Free list. When data needs to be read, a page is requested from the Free linked list, the data read from disk is placed into that page, and the page is added to the LRU linked list.

3. Flush linked list

The Flush linked list manages modified pages. Modified pages in the Buffer Pool are also called "dirty pages". A dirty page exists in both the LRU linked list and the Flush linked list: the Flush linked list stores only a pointer to the page's data in the LRU linked list.

Therefore, the pointer is added to the Flush linked list only when a page in the LRU linked list is modified for the first time. If the page is modified again later, only its data in the LRU linked list is updated.

The relationship between the three is as follows:

Read operation

One of the most important functions of the Buffer Pool is to "accelerate reads": when a data page needs to be accessed and it is already in the cache pool, there is no need to go to disk, and the contents of the page can be obtained directly from the cache pool. If the page is not yet in the Buffer Pool, it is loaded into the Buffer Pool first and then used.

Question: so how do you quickly find a page in the Buffer Pool?

To avoid scanning the LRU list when looking up a data page, InnoDB locates a page by tablespace number + page number: the tablespace number + page number is the key and the cache page is the value. A hash table is built on this mapping. When the data of a page needs to be accessed, the hash table is checked first for a cache page matching the tablespace number + page number.

If there is one, the cache page is used directly.

If not, a free cache page is taken from the Free list and the corresponding page on disk is loaded into it. That is, whenever a page needs to be loaded from disk into the Buffer Pool, a free cache page is taken from the Free linked list, the control block for that cache page is filled in, the corresponding node is removed from the Free linked list to show that the cache page is now in use, and the page is added to the LRU linked list.
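The lookup path above can be sketched like this. It is a minimal Python model with made-up names (BufferPool, get_page, a dict standing in for the disk); frame ids replace real control blocks, and eviction is deliberately left out, so the sketch raises an error once the Free list is empty.

```python
from collections import deque


class BufferPool:
    def __init__(self, n_pages, disk):
        self.disk = disk                    # hypothetical disk: (space_id, page_no) -> content
        self.frames = [None] * n_pages      # the cache pages themselves
        self.free = deque(range(n_pages))   # Free list: ids of unused cache pages
        self.page_hash = {}                 # (space_id, page_no) -> frame id
        self.lru = []                       # frame ids in use, index 0 is the head
        self.disk_reads = 0

    def get_page(self, space_id, page_no):
        key = (space_id, page_no)
        frame_id = self.page_hash.get(key)
        if frame_id is not None:            # found in the hash table: use it directly
            return self.frames[frame_id]
        if not self.free:
            raise RuntimeError("Free list is empty; eviction is not covered here")
        frame_id = self.free.popleft()      # take a free cache page off the Free list
        self.frames[frame_id] = self.disk[key]  # load the page from disk
        self.disk_reads += 1
        self.page_hash[key] = frame_id      # fill in the bookkeeping
        self.lru.insert(0, frame_id)        # the page now belongs to the LRU list
        return self.frames[frame_id]


disk = {(0, 1): "x", (0, 2): "y", (0, 3): "z"}
pool = BufferPool(2, disk)
pool.get_page(0, 1)     # miss: loaded from disk
pool.get_page(0, 1)     # hit: found through the hash table
pool.get_page(0, 2)     # miss: uses the last free page
```

The hash lookup costs the same no matter how long the LRU list is, which is the point of keying on tablespace number + page number.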

At initialization, all the pages in the Buffer Pool are free, and pages are requested from the Free linked list whenever data needs to be read. Physical memory cannot grow without limit, however, while the data in the database keeps growing, so the pages of the Free linked list will eventually be used up.

Therefore, some cached pages have to be removed from the Buffer Pool, which raises the question of how to evict them and which ones to evict. Suppose a total of n page accesses have been made; the number of those accesses that found the page in the cache, divided by n, is the cache hit ratio. The higher the cache hit ratio, the less I/O with the disk.
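As a worked example of the hit ratio, the following Python sketch replays a page access trace against a traditional LRU cache and counts the hits. The function name lru_hit_ratio and the trace are made up for illustration, and this is the plain textbook LRU policy, not InnoDB's own variant.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay a page access trace on a plain LRU cache and return hits / n."""
    cache = OrderedDict()       # least recently used page first
    hits = 0
    for page in trace:
        if page in cache:
            hits += 1
            cache.move_to_end(page)         # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict the least recently used page
            cache[page] = True
    return hits / len(trace) if trace else 0.0


trace = [1, 2, 1, 3, 1, 2, 4, 1]
small = lru_hit_ratio(trace, capacity=2)    # 2 hits out of 8 accesses -> 0.25
large = lru_hit_ratio(trace, capacity=3)    # 4 hits out of 8 accesses -> 0.5
```

With the same trace, a larger cache (or a better choice of which page to evict) turns more accesses into hits and so saves disk I/O.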

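To improve the cache hit ratio, InnoDB optimizes the traditional LRU algorithm to solve two problems: (1) read-ahead failure, where pages are prefetched into the cache pool but never actually read, and (2) cache pool pollution, where a scan over a large amount of data pushes the hot pages out of the cache pool.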

Write operation

Another major function of the Buffer Pool is to "accelerate writes": when a page needs to be modified, the page in the cache pool is changed first and the relevant redo log is written, and at that point the modification is considered complete.

Actually flushing the modified page to disk is done later by a background flush thread. Because the update is made in the cache pool first, the cached page becomes inconsistent with the page on disk; such pages are called dirty pages (dirty page).

Question: when are these modified pages flushed to disk, and in what order?

The simplest approach is to synchronize each change to the corresponding page on disk as soon as it happens, but writing to disk that frequently would seriously hurt performance. So after a cache page is modified, the change is not synchronized to disk immediately but at some later point, when the background flush thread flushes the pages to disk in turn so that the modifications land on disk.

But if changes are not synchronized to disk immediately, how can InnoDB later tell which pages in the Buffer Pool are dirty and which have never been modified?

InnoDB does not synchronize all cache pages to disk in one go. Instead, it maintains a linked list of dirty pages, and every page in the LRU linked list that has been modified is added to it. Because the pages in this list need to be flushed to disk, it is called the Flush linked list. Its structure is the same as that of the Free linked list.

"Modified" here means modified for the first time after being loaded into the Buffer Pool: a page is added to the Flush linked list only on its first modification. If a page that is already in the Flush linked list is modified again, it is not added a second time.

Note that the dirty page data itself still lives in the LRU linked list; the entries in the Flush linked list only point to the dirty pages in the LRU linked list. The dirty pages in the Flush linked list are ordered by oldest_lsn and flushed to disk in that order (this value is the LSN at the time the page was first changed; it corresponds to the oldest_modification value recorded for each page). A smaller value means the page should be flushed earlier, to avoid data inconsistency.
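The two rules above (add a page only on its first modification, flush in ascending order of the first-modification LSN) can be sketched as follows. The class FlushList and its methods are hypothetical names, and a dict keyed by page id stands in for the real linked list of pointers.

```python
class FlushList:
    """Toy Flush list: remembers the LSN of each dirty page's first modification."""

    def __init__(self):
        self.oldest_lsn = {}    # page id -> LSN when the page first became dirty

    def mark_modified(self, page_id, lsn):
        # Only the first modification adds the page; later ones leave it as is.
        self.oldest_lsn.setdefault(page_id, lsn)

    def flush_order(self):
        """Page ids sorted by their first-modification LSN, smallest first."""
        return sorted(self.oldest_lsn, key=self.oldest_lsn.get)

    def flush(self, write_page):
        """Call write_page(page_id) for every dirty page in flush order, then clear the list."""
        for page_id in self.flush_order():
            write_page(page_id)
        self.oldest_lsn.clear()


flush_list = FlushList()
flush_list.mark_modified(7, lsn=100)
flush_list.mark_modified(3, lsn=120)
flush_list.mark_modified(7, lsn=150)    # page 7 is already dirty, so it keeps LSN 100
order = flush_list.flush_order()

written = []
flush_list.flush(written.append)        # a real flush would write the pages to disk
```

Page 7 is flushed before page 3 even though it was modified again later, because its first change is the older one.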

At this point, you should have a deeper understanding of what the MySQL cache pool is. You might as well try it out in practice.
