2025-03-30 Update From: SLTechnology News&Howtos > Database
This article explains how MySQL's doublewrite mechanism works. The content is concise and easy to follow, and I hope you gain something from the detailed introduction below.
Doublewrite is one of three standout InnoDB features, the other two being the insert buffer and the adaptive hash index. There are in fact several more, such as asynchronous I/O and flush neighbor page (flushing adjacent pages), but those are closely tied to the operating-system level, so the three above are the usual highlights.
Doublewrite exists mainly to deal with one very natural problem: the partial page write.
Classic partial write problem
This problem is a classic one, and many database designs have to consider this edge case. An InnoDB page is 16KB, and data validation is performed on that unit, but the operating system writes in smaller units, typically 4KB. If a power failure occurs mid-write, only part of the page may reach disk. An Oracle DBA would generally stay calm here and say "apply redo to recover", but some details are hidden from us: one baseline step in MySQL's recovery is to validate each page's checksum. When a partial page write has occurred, the page is damaged, so the transaction number stored in the page cannot be located, and redo cannot be applied directly to repair it.
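To see why a torn page defeats checksum validation, here is a small simulation. It assumes a toy 16KB page whose last four bytes store a CRC32 checksum; this is a simplification for illustration, not InnoDB's real page format.

```python
import zlib

PAGE_SIZE = 16 * 1024      # InnoDB page size
OS_BLOCK = 4 * 1024        # typical OS/file-system write unit

def make_page(payload: bytes) -> bytes:
    """Toy page: payload padded to 16KB, with a CRC32 checksum
    stored in the last 4 bytes (a simplification of InnoDB's layout)."""
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    return body + zlib.crc32(body).to_bytes(4, "big")

def is_page_consistent(page: bytes) -> bool:
    return zlib.crc32(page[:-4]).to_bytes(4, "big") == page[-4:]

old_page = make_page(b"old row data")
new_page = make_page(b"new row data")

# Simulate a power failure after only two of the four 4KB blocks
# of the new page reached disk: a torn (partially written) page.
torn = new_page[:2 * OS_BLOCK] + old_page[2 * OS_BLOCK:]

print(is_page_consistent(new_page))  # True  - a complete write verifies
print(is_page_consistent(torn))      # False - the torn page fails its checksum
```

Because the torn page mixes bytes from two versions, the stored checksum no longer matches the body, which is exactly the situation in which redo alone cannot help.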
From this point of view, the partial write problem must exist in Oracle as well; Oracle simply handles the process transparently for us. The designs differ and so do the recovery techniques, but either way the problem has to be solved.
So this topic could be discussed at length, with comparisons across every aspect of the architectures involved.
Simple analysis of the double write problem
I have drawn a relatively simple picture of this, and I welcome suggestions for improvement.
In general, the doublewrite buffer is a buffering technique designed mainly to prevent data loss on power failure or other abnormal shutdown. A few points are worth noting. Pages modified in the buffer pool become dirty pages, and this process produces binlog and redo records; writing the data to the data files is asynchronous work. Looking closely, the shared (system) tablespace contains a 2MB area divided into two units, 128 pages in total, of which 120 are used for batch flushing of dirty pages and the other 8 for single page flush. According to an analysis from Alibaba, this is mainly because batch flushing is done by background threads and does not affect foreground threads, while a single page flush is initiated by a user thread that needs a free page as soon as possible. So the split is not exactly 64+64.
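The layout described above can be checked with simple arithmetic; the 8 single-page-flush slots are the figure cited from Alibaba's analysis, and the rest follows from the 16KB page size.

```python
PAGE_SIZE = 16 * 1024
DOUBLEWRITE_BYTES = 2 * 1024 * 1024           # 2MB area in the system tablespace

total_pages = DOUBLEWRITE_BYTES // PAGE_SIZE  # two 1MB units, 128 pages in total
single_page_slots = 8                         # reserved for single page flush
batch_slots = total_pages - single_page_slots # 120 left for batch flushing

print(total_pages, batch_slots, single_page_slots)  # 128 120 8
```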
The flush process uses memcpy to copy dirty pages into the in-memory doublewrite buffer, then writes them to the shared tablespace in two passes of 1MB each, calling fsync to sync them to disk. One thing to note is that although this write to the shared tablespace happens twice, it is sequential, so the overhead is not as large as people often assume doublewrite must be; according to Percona's tests the difference is about 5%. Whether data or performance matters more is a basic trade-off. Of course, the pages are later written to their own tablespace files as well; that part is random I/O and its overhead is larger, which is why, in the early days of SSDs, many people debated whether to optimize for sequential or random writes here.
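A toy model of this write ordering can be sketched in Python, with plain files standing in for the doublewrite area and the tablespace. The names and file layout here are illustrative assumptions, not InnoDB internals; the point is the order of operations.

```python
import os
import tempfile

PAGE_SIZE = 16 * 1024
CHUNK = 1024 * 1024  # the doublewrite area is flushed in 1MB sequential chunks

def flush_dirty_pages(dirty_pages, dblwr_file, data_file):
    """Toy model of the doublewrite ordering:
    1) copy the dirty pages into an in-memory buffer (the memcpy step),
    2) write that buffer sequentially to the doublewrite area and fsync,
    3) only then write each page to its real (random) location and fsync."""
    buf = b"".join(dirty_pages)                      # step 1

    with open(dblwr_file, "wb") as f:                # step 2: sequential writes
        for off in range(0, len(buf), CHUNK):
            f.write(buf[off:off + CHUNK])
        f.flush()
        os.fsync(f.fileno())                         # pages now survive a crash

    with open(data_file, "wb") as f:                 # step 3: random writes
        for i, page in enumerate(dirty_pages):
            f.seek(i * PAGE_SIZE)                    # stand-in for each page's own offset
            f.write(page)
        f.flush()
        os.fsync(f.fileno())

# Demo: flush four dirty pages through the toy pipeline.
workdir = tempfile.mkdtemp()
dblwr_path = os.path.join(workdir, "dblwr")
data_path = os.path.join(workdir, "data")
flush_dirty_pages([bytes([i]) * PAGE_SIZE for i in range(4)], dblwr_path, data_path)
```

The key design point is that the fsync of the sequential copy completes before any random write begins, so a crash during step 3 always leaves an intact copy behind.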
Of course, doublewrite was designed for recovery; otherwise all of this machinery would not be worth the trouble. The following graph comes from http://blog.csdn.net/renfengjun/article/details/41541809; I thought it was clear enough, so I quote it directly.
You can see that one of the central words in it is checksum. If a partial write occurs, for example after a power failure, the page is likely to be left inconsistent mid-write, so the checksum check fails. When that happens, because an intact copy of the page was written to the shared tablespace earlier, the page can be rebuilt from that copy and rewritten.
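A minimal sketch of this recovery rule, again using a toy page format with a CRC32 checksum in the last four bytes (a simplification; the function names here are illustrative, not InnoDB's):

```python
import zlib

PAGE_SIZE = 16 * 1024

def make_page(payload: bytes) -> bytes:
    # Pad the payload to 16KB and store a CRC32 checksum in the last 4 bytes.
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    return body + zlib.crc32(body).to_bytes(4, "big")

def checksum_ok(page: bytes) -> bool:
    return zlib.crc32(page[:-4]).to_bytes(4, "big") == page[-4:]

def recover_page(tablespace_page: bytes, doublewrite_copy: bytes) -> bytes:
    """If the tablespace copy fails its checksum (a torn write), restore it
    from the intact doublewrite copy; redo can then be applied on top."""
    if checksum_ok(tablespace_page):
        return tablespace_page
    if checksum_ok(doublewrite_copy):
        return doublewrite_copy
    raise RuntimeError("both copies damaged: page is unrecoverable")

# A torn page (front half of v2, back half of v1) is repaired from the
# doublewrite copy that was flushed, intact, before the random write began.
good = make_page(b"row v2")
torn = good[:PAGE_SIZE // 2] + make_page(b"row v1")[PAGE_SIZE // 2:]
restored = recover_page(torn, good)
```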
Another function of double write
Doublewrite actually has another effect: when writing pages from the doublewrite buffer to their real segments, the system automatically merges flushes of contiguous space, so multiple pages can be flushed in a single I/O, improving efficiency.
For example, in the following environment, we can estimate how well pages are being merged from the output of show status.
> show status like '%dblwr%';
| Variable_name              | Value    |
| Innodb_dblwr_pages_written | 23196544 |
| Innodb_dblwr_writes        | 4639373  |
Dividing Innodb_dblwr_pages_written by Innodb_dblwr_writes gives the average number of pages written per doublewrite operation, so these two indicators show the merge rate directly.
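As a quick check, the two counters from the status output above can be divided directly; with these sample values the server merged roughly five pages per doublewrite I/O:

```python
# Status counters from the SHOW STATUS output above
pages_written = 23196544   # Innodb_dblwr_pages_written
writes = 4639373           # Innodb_dblwr_writes

pages_per_write = pages_written / writes
print(round(pages_per_write, 2))  # 5.0 - about five pages merged per doublewrite I/O
```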
Double write improvements in Percona
Of course, Percona has kept improving doublewrite. Percona Server 5.7 introduced an improvement visible through a new parameter, innodb_parallel_doublewrite_path.
| innodb_parallel_doublewrite_path | xb_doublewrite |
At the file-system level there is a corresponding 30MB file:
-rw-r----- 1 mysql mysql 31457280 Mar 28 17:54 xb_doublewrite
This is the parallel doublewrite file. For a detailed description and tests of this feature, see https://www.percona.com/blog/2016/05/09/percona-server-5-7-parallel-doublewrite/
That post provides a great deal of detailed comparative testing and analysis. Of course, MariaDB, Facebook's MySQL, and Aurora also have their own implementations and considerations; limited by time and energy, I have not tested and analyzed them carefully.
That is my understanding of doublewrite in MySQL. I hope you have picked up some useful knowledge from it.