
How can MySQL be restored to any second?


This article introduces how to restore MySQL to any second. Many people run into this problem in practice, so let's walk through the reasoning and the concrete steps. I hope you read it carefully and get something out of it!

Seeing this topic, do you feel that you no longer need to worry about the database server crashing?

Then we need to learn why this is possible, and how to do it.

That is: why can the database be restored to any point in time, and how do we actually recover it to that point?

Why do you need redo log when you have binlog?

How is a transaction committed? Is binlog or redo log written first during commit? How do we ensure that these two logs stay consistent with each other?

In the last lesson, we walked through the full execution of a select statement, so today we will start with an update statement.

mysql> update T set c = c + 1 where ID = 2;

In fact, the execution flow is the same as for a query, except that at the end the executor finds this row and updates it.
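For reference, the examples in this article assume a small table along the following lines (a hypothetical definition, chosen only to be consistent with the statement above):

mysql> create table T (ID int primary key, c int);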

In addition, the update flow involves two important log modules: the redo log and the binlog (archive log).

Personally, I've only heard of binlog.

1 redo log

Like most relational databases, InnoDB records physical changes to its data files and guarantees that the log is always written first.

This is called WAL (Write-Ahead Logging): before a data file is persisted, the corresponding redo log must already have been written to disk.

Not every update in MySQL is written straight to disk. The InnoDB engine first writes the record to the redo log and updates the page in memory, and then writes the record to disk at an appropriate later time.

Since two important logs are involved, it is worth showing the InnoDB storage structure diagram to get an overall picture:

When the various spaces are mentioned below, it is worth coming back to this diagram.

1.1 What is redo log?

When the database modifies data, it needs to read the data pages from disk into the buffer pool and then modify them there.

At this point the data pages in the buffer pool differ from the data pages on disk; the pages in the buffer pool are called dirty pages, and the data on them dirty data.

It is like making a copy of the data first, modifying the copy, and then overwriting the original data once the modification is done.

You can also see from this that, at this point, all update operations happen on dirty pages.

If the DB service restarts abnormally at this moment, the data that exists only in memory and has not yet been synchronized to the disk files (note that syncing to the disk files is random IO) is lost.

If, however, the corresponding change is written to a log file as soon as the dirty-page modification in the buffer pool is finished, the picture changes (note that this logging is sequential IO).

When the DB service crashes and is later recovered, the contents recorded in this file can be reapplied to the disk files, and the data remains consistent.

This file is the redo log: it records the modified data and is written sequentially.

As I understand it, the redo log is the physical record of the changes made to dirty pages.

1.2 When is it generated and released?

Redo log is generated after the transaction starts. It is not written to disk only when the transaction commits; it starts being written into the redo log file while the transaction is executing.

Once the dirty pages of the corresponding transaction have been written to disk, the redo log's mission is complete, and the space it occupies can be reused (overwritten).

1.3 How is it written?

The redo log files are named ib_logfile* and are written sequentially; when the last file is full, writing wraps around and overwrites the first file.

As shown in the figure:

write pos is the current write position; it moves backward as writing proceeds, and wraps back to the beginning of file 0 after reaching the end of the last file.

checkpoint is the current erase position; it also moves backward and wraps around. Before a record can be erased, it must be applied to the data file.

The space between write pos and checkpoint can be used to record new operations.

If write pos catches up with checkpoint, the log is full and no new updates can be performed; MySQL has to pause, erase some records, and push checkpoint forward.

Redo log files are written in a loop, and before overwriting, InnoDB always makes sure that the corresponding dirty pages have been flushed to disk.

Under very heavy load, redo log can be generated very quickly, causing frequent flushing and, in turn, performance degradation.

If you anticipate such a scenario, it is recommended to enlarge the redo log files: do a clean shutdown, modify the redo log configuration, and restart the instance (a configuration sketch follows the reference below).

Reference:

http://mysql.taobao.org/monthly/2015/05/01/
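As a minimal sketch of that resize (the values are illustrative only, and this applies to versions before MySQL 8.0.30, which instead provide the dynamic innodb_redo_log_capacity variable), the relevant my.cnf settings look roughly like this:

# my.cnf, illustrative values only
[mysqld]
innodb_log_files_in_group = 2
innodb_log_file_size = 1G    # total redo capacity = 2 x 1G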

1.4 Related configuration

By default, the corresponding physical files are ib_logfile0 and ib_logfile1 under the data directory of the database.

innodb_log_group_home_dir specifies the path where the log file group is located; the default is ./, that is, the data directory of the database.

innodb_log_files_in_group specifies the number of files in the redo log file group; the default is 2.

# The number and size of the files are configured by innodb_log_files_in_group and innodb_log_file_size.

innodb_log_file_size specifies the size of each redo log file.

innodb_mirrored_log_groups specifies the number of mirrored log file groups; the default is 1.
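To inspect these settings on a running instance, a couple of standard queries will do (the exact variable list depends on the MySQL version):

mysql> SHOW VARIABLES LIKE 'innodb_log%';
mysql> SHOW VARIABLES LIKE 'datadir';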

1.5 Other

Redo log also has an in-memory cache, the innodb_log_buffer, whose default size is 8MB. The InnoDB storage engine first writes redo log records into the innodb_log_buffer.

The contents of the innodb log buffer are then flushed to the redo log files on disk in the following three ways:

1. The master thread flushes the innodb_log_buffer to the redo log file once per second.

2. When each transaction commits, the redo log is flushed to the redo log file.

3. When the free space of the redo log cache is less than half, the redo log cache is flushed to the redo log file
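Whether case 2 really forces an fsync on every commit is governed by innodb_flush_log_at_trx_commit (1 means write and fsync at each commit; 0 and 2 relax this and may lose roughly the last second of transactions on a crash). Two quick checks:

mysql> SHOW VARIABLES LIKE 'innodb_log_buffer_size';
mysql> SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';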

With the redo log, InnoDB can guarantee that even if the database restarts abnormally, previously committed records are not lost; this capability is called crash-safe.

Crash-safe guarantees that after the MySQL server goes down and restarts:

Data for all committed transactions still exist

The data of all uncommitted transactions are automatically rolled back

2 binlog

As mentioned earlier, MySQL as a whole can be divided into Server layer and engine layer.

In fact, redo log is a log unique to InnoDB that belongs to the engine layer, while the Server layer also has its own log, that is, binlog (Archive Log).

2.1 What is binlog?

Binlog is a logical-format log; it can loosely be thought of as the SQL statements of the executed transactions.

However, it is not just the SQL statements themselves; it also includes the reverse information of the executed statements (for inserts, deletes and updates).

That is, a delete corresponds to the delete itself and its reverse insert; an update corresponds to the row versions before and after the update; an insert corresponds to the insert itself and its reverse delete.
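Whether the binlog actually carries these before-and-after row images depends on the binlog format: ROW format records row images, while STATEMENT format records only the SQL text. A quick way to check the format and list the recorded events (the file name below is hypothetical; pick one returned by SHOW BINARY LOGS):

mysql> SHOW VARIABLES LIKE 'binlog_format';
mysql> SHOW BINARY LOGS;
mysql> SHOW BINLOG EVENTS IN 'binlog.000001' LIMIT 10;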

2.2 When is it generated and released?

Binlog is generated when the transaction commits: the transaction's changes are written into the binlog in one go, in a fixed format. As a result, committing a large transaction can be slow.

By default, the retention of binlog is controlled by the parameter expire_logs_days: inactive log files older than the configured number of days are deleted automatically.
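For example, checking and adjusting the retention period might look like this (expire_logs_days applies to MySQL 5.x; MySQL 8.0 replaces it with binlog_expire_logs_seconds):

mysql> SHOW VARIABLES LIKE 'expire_logs_days';
mysql> SET GLOBAL expire_logs_days = 7;    -- keep binlog for 7 days, illustrative value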

2.3 Differences from redo log

1. Redo log is unique to the InnoDB engine; binlog is implemented in MySQL's Server layer, and all engines can use it.

2. The content is different: redo log is a physical log that records what changes were made on which data pages; it records the DML and DDL statements that are currently executing.

Binlog, on the other hand, is a logical log that records the original logic of the statements, that is, the DML and DDL statements after they have been committed, for example "add 1 to the c field of the row with ID=2".

3. The way of writing is different: redo log is written in a loop within a fixed amount of space; binlog is appended to continuously: when a file reaches a certain size, writing continues in the next file, and previously written files are never overwritten.

4. The functions are different: redo log is mainly used to guarantee transaction durability and to recover data after an abnormal crash or media failure; binlog is mainly used for master-slave replication and point-in-time recovery.

5. In addition, the time at which each log is generated, the time at which it can be released, and the cleanup mechanism are completely different.

Reference:

http://www.importnew.com/28039.html

3 Data update transaction flow

With a conceptual understanding of these two logs, let's look at the internal flow between the executor and the InnoDB engine when this simple update statement is executed; a minimal runnable version of the statement follows the step list below.

1. The executor first asks the engine for the row with ID=2. ID is the primary key, so the engine finds the row with a tree search. If the data page containing the ID=2 row is already in memory, it is returned to the executor directly; otherwise, it has to be read from disk into memory first and then returned.

Corresponding to the description above, this is where the data is loaded into the buffer pool and will become dirty data once modified.

2. The executor takes the row data returned by the engine and adds 1 to the value: if it was N, it is now N+1. It then calls the engine interface to write the new row.

3. The engine updates the new row in memory and records the update in the redo log; at this point the redo log is in the prepare state. The engine then informs the executor that execution is complete and the transaction can be committed at any time.

4. The executor generates the binlog for this operation and writes the binlog to disk.

5. The executor calls the engine's commit-transaction interface, the engine changes the redo log it just wrote to the commit state, and the update is complete.
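For reference, here is the same statement issued in an explicit transaction, which makes the commit step explicit; the two-phase commit described next happens inside that commit (this assumes the hypothetical table T defined earlier):

mysql> begin;
mysql> update T set c = c + 1 where ID = 2;
mysql> commit;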

Two-phase commit

Does the interleaving of redo log and binlog look a little confusing?

In fact, this is the so-called two-phase commit: the COMMIT is automatically split into two phases, prepare and commit.

MySQL generates an xid in the prepare phase and writes it to the binlog in the commit phase. During recovery, the binlog decides whether a transaction should be committed or rolled back.

As you can see from the process above, two-phase commit ensures that in any case a transaction either exists in both the redo log and the binlog, or in neither.

This guarantees that the order of transactions in the binlog and the redo log is consistent. Once the binlog has been persisted in phase 2, the transaction counts as committed.

In addition, note that each phase requires an fsync operation to keep the two layers consistent.

PS: it is important that the binlog is recorded only after the InnoDB engine has prepared (that is, after the redo log has been written to disk).

Another thing to note is that the redo log generated by a SQL statement is continuously flushed to disk while the statement runs (the master thread fsyncs the redo log every second), whereas the binlog is flushed to disk only at transaction commit; if the binlog is large, the commit is correspondingly slow.

Reference:

http://www.ywnds.com/?p=7892
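The "double 1" settings mentioned in the comments later refer to fsyncing both logs at every commit, which is what makes the crash-recovery reasoning in this article hold without a loss window. Roughly, they are set like this (both are dynamic global variables):

mysql> SET GLOBAL sync_binlog = 1;                     -- fsync the binlog at every commit
mysql> SET GLOBAL innodb_flush_log_at_trx_commit = 1;  -- fsync the redo log at every commit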

For instance

Binlog records complete logical changes, and every logical change can be found in the binlog, so backup and restore are based on the binlog: replaying the complete logical operations it records against a full backup reproduces the original database.

For example, suppose the redo log applied update t set status = 1, so the original database's status has been updated to 1, but the binlog write failed and the operation was not recorded. A later restore from backup would end up with status = 0, resulting in inconsistent data.

The core point is that what the redo log records will be flushed to disk even after an abnormal restart, while the binlog records the logical operations on the primary database and is mainly used for backup and recovery.

A complete transaction process:

The shopkeeper sells a bottle of cola and writes it in the ledger (redo log in prepare), then takes the money and puts it in the cash box (binlog record), and finally turns around and ticks off the ledger entry (redo log set to commit), which marks the end of the transaction.

If the transaction is interrupted while the money is being collected, then on later review there is only the ledger entry and no payment, so the transaction failed and the ledger entry is deleted (rollback).

If the process is interrupted after the money has been received, then on later review the ledger has the entry (prepare) and the cash box has the income (binlog), so the ledger entry is completed (commit) and the transaction is valid.

4 How to recover the data to any point in time?

When you need to recover to a specific second, for example because of a database mis-operation at 14:23:45 on November 23, 2018, and you need to get the data back, you can do this:

1. First, find the most recent full backup. If you are lucky, that may be last night's backup from November 22; restore it to a temporary instance.

2. Then, starting from the time of that backup, take the backed-up binlog files in order and replay them up to the moment just before the table was mistakenly dropped.

In this way, the temporary instance is identical to the online database as it was just before the mistake, and you can then take the table data from the temporary instance and restore it to the online database as needed.
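As a hedged sketch of the replay step (the file names, times and connection details are hypothetical; the mysqlbinlog options are standard): restore the full backup into the temporary instance, then pipe the binlog up to the second before the mistake into it:

# replay binlog from after the backup up to just before the mis-operation
mysqlbinlog --start-datetime="2018-11-22 23:00:00" --stop-datetime="2018-11-23 14:23:44" binlog.000137 binlog.000138 | mysql -h temp_host -u root -p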

When a crash is encountered, the recovery process is also very simple:

1. During recovery, the last binlog file is scanned and the xids in it are extracted.

2. The redo log after the checkpoint is redone, the list of transactions still in the prepare phase is collected, and each transaction's xid is compared with the xids found in the binlog.

If the xid is found in the binlog, the transaction was written to the binlog successfully but never finished committing, so it is committed; otherwise it is rolled back.

To understand this thoroughly, it helps to reread the two sections above.

To sum up, basically only the following situations can occur:

If the crash happens while the transaction is still in the prepare phase, then on recovery the transaction has not been written to the binlog and the redo log has not been committed, so the transaction is rolled back.

If the crash happens while the binlog is being written and the log has not yet reached disk, the transaction is likewise rolled back at startup.

If the crash happens after the binlog has been fsynced to disk but before InnoDB has had time to commit, then on recovery MySQL reads the xids out of the binlog and tells InnoDB to commit those transactions; after committing them, InnoDB rolls back the remaining ones, so the redo log and the binlog always stay consistent.

Let me talk about my understanding again.

1. Prepare stage; 2. Writing binlog stage; 3. Commit stage

If it crashes before step 2:

Restart recovery: no commit record is found, so the transaction is rolled back.

Backup restore: there is no binlog, so the data is consistent.

If it crashes before step 3:

Restart recovery: although the commit did not finish, the prepare record and the binlog are both complete, so the transaction is automatically committed after restart.

Backup restore: the binlog exists, so the data is consistent.

To sum up: if a transaction succeeds in the prepare phase and its binlog is written successfully in the MySQL Server layer, then the transaction will definitely be committed successfully.

Summary

This article has introduced the two most important logs in MySQL: the physical log, redo log, and the logical log, binlog.

Ideally you should understand the role each of these logs plays and be able to walk through the commit process of a transaction.

After-class question & highlights from the comment area

The comment area is a little more exciting this time!

Ask

After-class question

As mentioned earlier, the cycle of regular full backups "depends on the importance of the system: some back up once a day, some once a week." So in what scenarios does a daily backup have an advantage over a weekly backup? In other words, which indicator of the database system does it affect?

Answer 1

The length of the backup cycle matters in two respects.

First is the time it takes to recover lost data (since recovery implies data was lost). With a daily backup, you just need that day's full backup plus the binlog up to the target time. With a weekly backup taken on, say, Monday, if the data you want to recover is from some time on Sunday, you need the full backup plus all of the binlog from Monday through that time on Sunday, which takes much longer than the former. How much longer is acceptable depends on what the business can tolerate.

Second is the risk of being unable to recover at all. With a weekly backup taken on Monday, you must ensure the binlog for the whole week is intact, otherwise you cannot recover; with a daily backup, you only need to ensure that day's binlog is intact. Of course, binlog integrity can be strengthened with techniques such as checksums or redundancy.

Answer 2

The backup cycle directly affects recovery speed. With a daily backup, you only need to replay at most the most recent day of changes; with a weekly backup, there is far more to replay. So for systems that are very sensitive to recovery speed, it is best to back up once a day, or even once an hour, and so on.

Answer 3

I see backups as both life-saving medicine and regret medicine: a backup saves your life when disaster strikes, and lets you undo things when something goes wrong. Everything has two sides; neither option is strictly better, only more suitable, depending on the business situation and needs. A daily backup means less binlog to replay and faster recovery (faster life-saving), but a shorter window for regret; a weekly backup is the opposite. My own strategy is a replica with a 16-hour replication delay, which serves as the regret medicine and recovers quickly, plus a full backup and binlog every two days as the life-saving medicine for the last resort. This feels like a good balance.

Answer 4

1. If there is no master-slave replication and no Binlog Server, it is recommended to back up at least once a day. If the database is small and concurrency is low, the backup cycle can be shortened, for example to one backup per hour.

2. If there is master-slave replication and a Binlog Server, it is recommended to back up at least once a week. If the database is small and concurrency is low, the backup cycle can be shortened, for example to once a day.

The case of master-slave replication without a Binlog Server is not listed separately, because for important systems it is recommended to build at least master-slave replication and, wherever possible, a Binlog Server (this is particularly important in financial environments).

Ask

1. First, the client sends a sql statement to the SQL interface of the server layer through tcp/ip

2. After receiving the request, the SQL interface parses the statement and verifies whether the permissions match.

3. After verification passes, the parser parses the statement and checks for syntax errors.

4. Next, the optimizer generates candidate execution plans and selects the optimal one.

5. Then the executor executes the statement according to the plan. Opening the table happens at this step, and if there is an MDL conflict on the table, it waits.

If not, an MDL (S) lock is added to the table.

(If Opened_tables keeps growing too large, the table_open_cache is too small and the .frm files have to be opened over and over.)

6. On entering the engine layer, it first goes to the data dictionary (metadata information) in the innodb_buffer_pool to get the table information.

7. Using the metadata, it checks lock info to see whether related locks exist, and writes the lock information required by this update statement into lock info (the necessary locks are taken here).

8. The old values of the affected rows are then stored as a snapshot in undo pages in the innodb_buffer_pool, and the redo generated by modifying the undo log is also recorded.

(If the data is already in a data page, it is loaded directly into the undo page; if not, the corresponding page has to be fetched from disk and then loaded into the undo page.)

9. The update is performed on the data page in the innodb_buffer_pool, and the physical modification of the page is recorded in the redo log buffer.

Since an update transaction involves changes to multiple pages, multiple page changes are recorded in the redo log buffer.

Because of group commit, the redo log generated by this transaction may be flushed and synced to disk together with that of other transactions.

10. At the same time, the modification is recorded into the binlog_cache in event format. Note that binlog_cache_size is a transaction-level parameter, not a session-level one.

(Once the commit is completed, the dump thread will actively send the events from the binlog_cache to the slave I/O thread.)

11. After that, changes that need to be made to secondary indexes are written to change buffer pages, and they are merged with the secondary index the next time another SQL statement needs to read that index.

(This turns random I/O into sequential I/O, but since most disks are SSDs now, there is little difference in addressing cost between random I/O and sequential I/O.)

12. At this point the update statement is complete, and it must either commit or roll back. The commit case, with "double 1" settings, is discussed here.

13. The commit operation: internal XA is used between the storage engine layer and the server layer (it guarantees the consistency of the two logs, here mainly the atomicity of redo log and binlog),

so the commit is split into a prepare phase and a commit phase.

14. In the prepare phase, the transaction's xid is written, and the binlog_cache is flushed and synced to the binlog file (for large transactions this step is very time-consuming).

15. In the commit phase, the redo log generated by the transaction has already been synced to disk, so this step only marks the commit in the redo log.

16. Once both the binlog and the redo log are on disk, if dirty-page flushing is triggered, the dirty pages are first copied to the doublewrite buffer, the doublewrite buffer is flushed to the shared tablespace, and then the dirty pages are written to disk by the page cleaner thread.

Teacher, is there anything wrong with my steps? I think there may be a problem in step 6: since step 5 already opened the table, is it still necessary to look up metadata in the buffer in step 6? Does this metadata refer to the system's metadata, or to that of all tables? Thank you for your correction.

Answer

In fact, the implementation of step 5 calls into the process described in step 6, so they are the same thing. Both the MySQL server layer and the InnoDB layer hold the table structure, which is why the book describes them separately.

This description is very detailed, and it also touches on some points that we will cover later.

Ask

Hello, I don't quite understand the term "data page" mentioned here. Is it a storage format?

Answer

MySQL accesses records in units of "pages", 16K by default. In other words, to access one record on disk, MySQL does not read just that record; it reads the whole 16K page it belongs to into memory.
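The page size can be confirmed on a running instance (16384 bytes by default for InnoDB):

mysql> SHOW VARIABLES LIKE 'innodb_page_size';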

Ask

When redo log is used for recovery, is binlog still written? And vice versa?

Answer

The crash recovery process does not write binlog; when binlog is used to restore an instance (or to build a standby database), redo log is written.

This is the end of "How to restore MySQL to any second". Thank you for reading.
