2025-01-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how MySQL achieves ACID, focusing on how the redo log makes committed transactions durable and how the redo log works together with the binlog.
Suppose you execute a SQL statement:
UPDATE user SET age = 18 WHERE user_id = 345981;
Will MySQL go directly to disk to modify the data?
Obviously not. Disk I/O is too slow; if MySQL had to write to disk on every request, the disk could not keep up.
Then should it write to memory only? That is, load the data from disk into memory and modify it there?
No. If the power goes out, everything in memory is lost and the change can never be recovered.
In fact, many kinds of middleware face this problem. No matter how distributed or reliable a piece of middleware is, it has to answer the same question:
When data arrives, should it be written to disk or to memory?
Writing to disk is too slow, and writing to memory is not safe.
MySQL's solution is to write to both memory and disk:
it modifies the data in memory, and then writes a redo log record to disk.
What is the redo log?
When executing the statement above, MySQL first checks whether the row with user_id = 345981 is already in memory. If not, it finds the "page" on disk that contains the row, loads the whole page into memory, locates the row with user_id = 345981, and sets its age to 18 in memory.
At this point the page in memory is new and correct, while its copy on disk is old and outdated. The modified in-memory page, which no longer matches the copy on disk, is called a "dirty page".
One extra point: MySQL reads data from disk one page at a time, and a page holds many rows. Writing from memory back to disk is also done page by page.
If the power fails now, the modification is lost. So MySQL also records what was changed on the page and saves that record to disk. This record is the redo log.
Once the redo log has been written, MySQL considers the transaction committed and the data persisted (the D in ACID). The modified pages in memory are flushed to disk later, when the server is less busy.
What if MySQL loses power before the in-memory data has been flushed to disk?
In that case, after the restart MySQL only needs to load the stale page from disk into memory and apply the redo log to it, and the page is up to date again.
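The write path and the recovery path described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class MiniEngine, its method names, page number 7, and the starting age of 30 are all made up for illustration, and real InnoDB pages and redo records are binary structures rather than Python dictionaries.

```python
# Toy write-ahead logging model. Every name here is hypothetical.

class MiniEngine:
    def __init__(self, disk_pages):
        self.disk = disk_pages   # page_id -> {user_id: age}; survives a crash
        self.redo_log = []       # append-only change records; survives a crash
        self.buffer_pool = {}    # page_id -> in-memory page copy; lost on crash
        self.dirty = set()       # ids of pages changed in memory but not on disk

    def _load(self, page_id):
        # Read the whole page from disk into memory if it is not cached yet.
        if page_id not in self.buffer_pool:
            self.buffer_pool[page_id] = dict(self.disk[page_id])
        return self.buffer_pool[page_id]

    def update_age(self, page_id, user_id, age):
        page = self._load(page_id)
        page[user_id] = age                            # 1. change the page in memory
        self.dirty.add(page_id)
        self.redo_log.append((page_id, user_id, age))  # 2. persist the change record
        # The transaction counts as committed here, before the page is flushed.

    def flush(self):
        # Background work: write dirty pages back to disk.
        for page_id in self.dirty:
            self.disk[page_id] = dict(self.buffer_pool[page_id])
        self.dirty.clear()
        self.redo_log.clear()    # simplification: flushed changes need no redo

    def crash(self):
        # Power loss: everything held in memory disappears.
        self.buffer_pool = {}
        self.dirty = set()

    def recover(self):
        # Replay the redo log against the stale pages read from disk.
        for page_id, user_id, age in self.redo_log:
            self._load(page_id)[user_id] = age
            self.dirty.add(page_id)


engine = MiniEngine({7: {345981: 30}})
engine.update_age(7, 345981, 18)
engine.crash()                        # the dirty page was never flushed
stale_age = engine.disk[7][345981]    # disk still holds the old value
engine.recover()
recovered_age = engine.buffer_pool[7][345981]
```

After the simulated crash the disk still holds the old age, and replaying the redo log brings the in-memory page back to the committed value.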
You might ask: what if the write to memory succeeds, but writing the redo log to disk fails? We will come back to this when we discuss "two-phase commit".
You might also ask: the redo log still has to be written to disk, so isn't that just as slow?
No. Writing the redo log is faster than writing data pages to disk, for several reasons:
Writes to data pages are generally "random", while writes to the redo log are "sequential".
MySQL optimizes redo log writes, for example with "group commit".
We will go through these points one by one.
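The batching idea behind the second point can be illustrated with a small Python sketch. It is an assumption-laden simplification: RedoLogFile, fsync_count, and the fixed group_size are invented for this example, and a real server groups whichever transactions happen to be waiting at the same moment rather than using a fixed batch size.

```python
class RedoLogFile:
    """Append-only log that counts how often it forces data to disk."""

    def __init__(self):
        self.records = []
        self.fsync_count = 0

    def write_and_sync(self, batch):
        self.records.extend(batch)  # one sequential append at the end of the log
        self.fsync_count += 1       # one expensive flush to disk


def commit_one_by_one(log, transactions):
    # Every transaction pays for its own flush.
    for record in transactions:
        log.write_and_sync([record])


def group_commit(log, transactions, group_size):
    # Several transactions share a single flush.
    for start in range(0, len(transactions), group_size):
        log.write_and_sync(transactions[start:start + group_size])


transactions = [f"txn-{i}" for i in range(10)]

naive_log = RedoLogFile()
commit_one_by_one(naive_log, transactions)

grouped_log = RedoLogFile()
group_commit(grouped_log, transactions, group_size=4)
```

Both logs end up with the same ten records in the same order, but the grouped version needs 3 flushes instead of 10.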
How the redo log is stored
The official MySQL documentation says:
The redo log encodes requests to change table data that result from SQL statements or low-level API calls.
The redo log is a disk-based data structure used during crash recovery to correct data written by incomplete transactions.
By default, the redo log is physically represented on disk by two files named ib_logfile0 and ib_logfile1.
MySQL writes to the redo log files in a circular fashion.
From these statements, we can roughly conclude:
The redo log records the changes that SQL statements and other APIs make to table data. In other words, the redo log records physical changes to the data, which is the biggest difference between the redo log and the binlog. The binlog records logical changes to the data, and this is one of the reasons why the redo log can be used for crash recovery while the binlog cannot.
The redo log is stored on disk and is used to correct data during crash recovery, that is, recovery from failures such as a power outage or a server going down.
By default, the redo log consists of two files.
The redo log is written in a circular fashion.
There are two related parameters (they apply to MySQL versions before 8.0.30, which introduced innodb_redo_log_capacity as their replacement):
innodb_log_file_size: sets the size of each redo log file. The default is 50331648 bytes, or 48 MB.
innodb_log_files_in_group: sets the number of redo log files. The default is 2, and the maximum is 100.
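To make the circular writing and the two parameters concrete, here is a rough Python sketch. The class CircularRedoLog and the names write_pos and checkpoint are assumptions for illustration; the model ignores file headers and other on-disk details, so the usable capacity of a real redo log is somewhat smaller than the product computed here.

```python
class CircularRedoLog:
    def __init__(self, file_size, files_in_group):
        # Total space is the file size times the number of files.
        self.capacity = file_size * files_in_group
        self.write_pos = 0    # total bytes of redo ever written
        self.checkpoint = 0   # bytes whose pages have already been flushed to disk

    def free_space(self):
        # Space between checkpoint and write_pos is still needed for recovery.
        return self.capacity - (self.write_pos - self.checkpoint)

    def append(self, nbytes):
        if nbytes > self.free_space():
            # The writer would overwrite records still needed for recovery;
            # dirty pages must be flushed and the checkpoint advanced first.
            return False
        self.write_pos += nbytes
        return True

    def advance_checkpoint(self, nbytes):
        self.checkpoint = min(self.write_pos, self.checkpoint + nbytes)

    def offset_in_files(self):
        # The physical position wraps around to the start of the first file.
        return self.write_pos % self.capacity


# Defaults quoted above: 2 files of 50331648 bytes (48 MB) each.
log = CircularRedoLog(file_size=50331648, files_in_group=2)
total_mb = log.capacity // (1024 * 1024)
```

With the default values the two files give 96 MB in total. Once that space is full, new records can only be written after the checkpoint has moved forward.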
We often say that a transaction has the four ACID properties. The D (durability) means that once a transaction is committed, it must stay committed and cannot be rolled back. Even if the system crashes, the database must find a way to keep the transaction committed and its data saved:
Durability guarantees that once a transaction has been committed, it will remain committed even in the case of a system failure (e.g. power outage or crash).
This is a strict requirement, but it seems easy to implement: write the data to disk every time, and only then tell the client that the transaction committed successfully.
But what if we also want high performance, which means the data has to be written to memory?
This is why we say that InnoDB uses the redo log to achieve transaction durability while still providing high-performance writes.
Binlog
Having covered the redo log, let's turn to the binlog.
Take the same update statement:
UPDATE user SET age = 18 WHERE user_id = 345981;
When this update statement is executed, a binlog entry is generated in addition to the redo log.
There are many differences between the binlog and the redo log. The most important one is that the redo log is a feature of the InnoDB storage engine only, while the binlog is a feature of the MySQL server layer. In other words, the redo log exists only when InnoDB is the storage engine, whereas the binlog is available in MySQL regardless of the storage engine.
(Figure: MySQL architecture)
What does the binlog record? As mentioned above, whereas the redo log records physical changes to data, the binlog records logical changes. For the update statement above, you can simply think of the binlog as recording the SQL statement itself, along with context such as which database it was executed in.
The binlog has three formats: statement, row, and mixed. If you are interested, see "Binary Logging Formats" in the MySQL documentation.
What is the binlog used for?
MySQL places the binlog in the server layer because it provides general-purpose capabilities, such as data recovery.
DBAs often say they can restore MySQL data to any point in time. How do they do it?
Suppose you accidentally deleted all the data in a table at 8 p.m. on Wednesday.
The DBA finds the most recent "full backup" and replays the binlog from the time of that backup up to just before 8 p.m. on Wednesday, and your data is restored.
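The restore procedure can be sketched as "backup plus replay". The following Python model is illustrative only: the event tuples, the function names apply_event and restore_to, the second user row, and the concrete dates are all assumptions, and a real binlog is a binary file rather than a Python list.

```python
from datetime import datetime


def apply_event(tables, event):
    # Apply one logical change to the in-memory copy of the tables.
    kind, table, key, value = event
    if kind == "set":
        tables.setdefault(table, {})[key] = value
    elif kind == "delete_all":
        tables[table] = {}


def restore_to(full_backup, binlog, stop_before):
    # Start from a copy of the backup, then replay events in order
    # until the first event at or after the stop time.
    tables = {name: dict(rows) for name, rows in full_backup.items()}
    for timestamp, event in binlog:
        if timestamp >= stop_before:
            break
        apply_event(tables, event)
    return tables


# Hypothetical full backup taken before Wednesday.
full_backup = {"user": {345981: 30}}

# Hypothetical binlog events written on Wednesday, 2024-01-03.
binlog = [
    (datetime(2024, 1, 3, 9, 0), ("set", "user", 345981, 18)),
    (datetime(2024, 1, 3, 15, 30), ("set", "user", 100001, 25)),
    (datetime(2024, 1, 3, 20, 0), ("delete_all", "user", None, None)),  # the accident
]

restored = restore_to(full_backup, binlog, stop_before=datetime(2024, 1, 3, 20, 0))
```

The replay stops just before the accidental delete, so both earlier changes are kept. In practice this kind of replay is usually done with the mysqlbinlog tool and its --stop-datetime option rather than with hand-written code.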
The binlog has another use: master-slave replication. The master sends its binlog to the slave, and the slave saves the binlog and then executes it, which keeps the two in sync.
Business applications can also listen to the master's binlog and react when data in the database changes, for example to compute real-time statistics.
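Both uses come down to reading the binlog from a remembered position. Here is a minimal Python sketch of that idea; the classes Master, Replica, and ChangeCounter are hypothetical, and the in-process list stands in for the network protocol a real slave or listener would use.

```python
class Master:
    def __init__(self):
        self.rows = {}
        self.binlog = []   # ordered list of logical change events

    def set_age(self, user_id, age):
        self.rows[user_id] = age
        self.binlog.append(("set_age", user_id, age))


class Replica:
    def __init__(self):
        self.rows = {}
        self.position = 0   # index of the next binlog event to apply

    def sync_from(self, master):
        # Apply every event not seen yet, then remember where we stopped.
        for _, user_id, age in master.binlog[self.position:]:
            self.rows[user_id] = age
        self.position = len(master.binlog)


class ChangeCounter:
    """A business-side listener that only keeps statistics."""

    def __init__(self):
        self.position = 0
        self.changes_seen = 0

    def poll(self, master):
        self.changes_seen += len(master.binlog) - self.position
        self.position = len(master.binlog)


master = Master()
replica = Replica()
counter = ChangeCounter()

master.set_age(345981, 18)
replica.sync_from(master)
master.set_age(100001, 25)   # hypothetical second user
replica.sync_from(master)    # only the new event is applied
counter.poll(master)
```

Because each consumer tracks its own position, the replica and the statistics listener read the same binlog independently of each other.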
Two-phase commit
Finally, when an update statement is executed, when exactly are the redo log and the binlog written? This brings us to what is commonly called "two-phase commit":
Write: redo log (prepare)
Write: binlog
Write: redo log (commit)
Why is the redo log write split into two phases, prepare and commit? Can't the redo log be written in one step?
Let's consider two cases:
Write redo log first, then binlog
Write binlog first, then redo log
1. Write redo log first, then binlog
In this case the redo log has been written to disk but the binlog has not. If a crash occurs at that moment, the master applies the redo log during recovery and restores the change, but because the binlog has no record of it, the slave never receives the change. The master is then "newer" than the slave, and the two are inconsistent.
2. Write binlog first, then redo log
This is the mirror image of the previous case. The binlog contains the change, so the slave applies it, but the master has no redo log record to recover from. The slave ends up "newer" than the master, and the two are again inconsistent.
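The two failure cases can be reproduced with a tiny simulation. This is a sketch under simplifying assumptions: the function run_with_crash and the starting age of 30 are invented for the example, each log is a plain list, and the crash always happens right after the first of the two writes.

```python
def run_with_crash(order):
    """Write the two logs in the given order, crashing after the first write.

    Returns (age on the master, age on the slave) once both have recovered.
    """
    redo_log, binlog = [], []
    logs = {"redo": redo_log, "binlog": binlog}
    change = ("user", 345981, 18)

    logs[order[0]].append(change)   # the first write reaches disk
    # The crash happens here, so the write to logs[order[1]] never occurs.

    master_rows = {345981: 30}      # state on disk before the update
    slave_rows = {345981: 30}

    for _, user_id, age in redo_log:   # the master recovers from its redo log
        master_rows[user_id] = age
    for _, user_id, age in binlog:     # the slave replays whatever binlog exists
        slave_rows[user_id] = age

    return master_rows[345981], slave_rows[345981]


redo_first = run_with_crash(("redo", "binlog"))
binlog_first = run_with_crash(("binlog", "redo"))
```

In the first case the master has the new value and the slave the old one; in the second case it is the other way around. Either ordering leaves the two sides disagreeing.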
Two-phase commit solves this problem. During crash recovery:
If the redo log record is already in the commit state, commit the transaction directly.
If the redo log record is in the prepare state, check whether the transaction's binlog entry is complete:
Yes: commit the transaction.
No: roll back the transaction.
The purpose of two-phase commit is to keep the redo log and the binlog logically consistent.
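The recovery rules above fit in one small function. The sketch below is a hedged illustration: the names recover_transaction and crash_at are hypothetical, and treating "no redo record at all" as a rollback is an assumption added here, since in that case nothing was persisted for the transaction.

```python
def recover_transaction(redo_state, binlog_complete):
    """Decide what crash recovery does with one transaction."""
    if redo_state == "commit":
        return "commit"
    if redo_state == "prepare":
        # The binlog decides: complete means commit, otherwise roll back.
        return "commit" if binlog_complete else "rollback"
    # Assumption: no redo record exists, so there is nothing to keep.
    return "rollback"


def crash_at(step):
    """State left on disk if the server dies after `step` of the three writes."""
    redo_state, binlog_complete = None, False
    if step >= 1:
        redo_state = "prepare"    # write: redo log (prepare)
    if step >= 2:
        binlog_complete = True    # write: binlog
    if step >= 3:
        redo_state = "commit"     # write: redo log (commit)
    return redo_state, binlog_complete


outcomes = [recover_transaction(*crash_at(step)) for step in range(4)]
```

Whatever the crash point, the redo log and the binlog end up agreeing: the transaction is kept only when the binlog has it too, so the master and the slave stay consistent.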
That is how MySQL achieves ACID. These are points you are likely to run into in day-to-day work, and I hope this article helps you understand them better.