Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What is the principle of InnoDB second snapshot?

2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly introduces "what is the principle of InnoDB second snapshot". In daily operation, I believe many people have doubts about what the principle of InnoDB second snapshot is. The editor consulted all kinds of data and sorted out a simple and easy-to-use method of operation. I hope it will be helpful to answer the doubt of "what is the principle of InnoDB second snapshot?" Next, please follow the editor to study!

Two ways to start a transaction

It is important to note that the begin/start transaction command is not the starting point of a transaction, and the transaction does not really start until the first statement that operates on the InnoDB table after them is executed. If you want to start a transaction right away, you can use the transaction with consistent snapshot command!

In the first startup mode, the consistency view is created when the line holds the first snapshot read statement.

In the second startup mode, the consistency view is created when the start transaction with consistent snapshot is executed

Transaction C does not explicitly use begin/commit, indicating that the update statement itself is a transaction and is automatically committed when the statement is completed. Transaction B is queried after updating the row; transaction An is queried in a read-only transaction and in chronological order after the query of transaction B. But the result is that the value of k found by transaction B is 3, while the value of k found by transaction An is 1

According to our imagination, the situation should be as follows:

The final result is that the result of transaction A query is 1, and that of transaction B query is 2, but why is it 3 after transaction B query?

It is important to note that there are two concepts of views in MySQL:

1. One is view, which is a virtual table defined by a query statement, which executes the query statement and generates the result when called. The syntax for creating a view is create view... And its query method is the same as a table

2. The other is the consistent read view that InnoDB uses when implementing MVCC (Multi-Version Concurrency Control, multi-version concurrency control), namely consistent read view, which is used to support the implementation of RC (Read Committed, read commit) and RR (Repeatable Read, repeatable read) isolation levels.

The principle of InnoDB creating second-level Snapshot

The first picture of this article is the implementation principle of MVCC. Under the repeatable readable isolation level, the transaction "takes a snapshot" when it starts. Note that this snapshot is based on the entire library. If a library has 100 gigabytes, then I start a transaction and MySQL has to copy 100g of data, which is a slow process. But the usual transactions are executed very quickly. In fact, we don't need to copy out the 100G data. Let's take a look at how this snapshot is implemented:

Each transaction in InnoDB has a unique transaction ID, called transaction id. It is applied to the transaction system of InnoDB at the beginning of the transaction and is strictly incremented in the order of application. There are multiple versions of each row of data. Each time a transaction updates data, a new data version is generated and transaction id is assigned to the transaction ID of that data version, marked as row trx_id. At the same time, the old data version should be retained, and in the new data version, information can be obtained directly. That is, a row of records in a data table may actually have multiple versions (that is, each data in row), each with its own row trx_id.

As shown in, this is the state of a record that has been continuously updated by multiple transactions.

The three dotted arrows in the figure are undo log (rollback log), while V1, V2, and V3 are not physically real, but are calculated based on the current version and undo log each time they are needed. For example, when you need V2, you use V4 to calculate U3 and U2 in turn.

According to the definition of repeatable readability, when a transaction starts, you can see the results of all committed transactions. But then, during the execution of this transaction, updates from other transactions are not visible to it.

Therefore, a transaction only needs to declare at startup time, "at the time of my startup, if a data version is generated before I start, I will recognize it; if it is generated after I start, I will not recognize it. I have to find the last version of it. If it is the data updated by the transaction itself, it still has to recognize it.

In implementation, InnoDB constructs an array for each transaction to hold the transaction startup moment, which is currently in the ID of all transactions that have been started but have not been committed. The minimum value of transaction ID in the array is marked as low water level, and the maximum value of transaction ID that has been created in the current system plus 1 is recorded as high water level. This array of views and the high water level form a consistent view (read-view) of the current transaction, while the visibility rules of the data version are based on the comparison between the row trx_id of the data and this consistent view. This view array divides all row trx_id into several different situations:

In this way, there are several possibilities for a data version of row trx_id at the start of the current transaction:

1. If it falls in the green section, it indicates that this version is a committed transaction or the current transaction generated by itself, and this data is visible.

2. If it falls on the red part, it means that this version is generated by a transaction started in the future and is definitely invisible.

3. If it falls in the yellow part, there are two situations.

If row trx_id is in the array, it means that this version is generated by a transaction that has not been committed and is not visible.

If the row trx_id is not in the array, this version is generated by a committed transaction, visible.

The key point is to understand the two situations that fall in the yellow section. If a piece of data is updated by multiple things (transactions have not been committed), then there must be row trx_id in the array, that is, this version is generated by transactions that have not yet committed.

So how does InnoDB create snapshots in seconds? My summary is as follows:

1. When a transaction is opened, InnoDB assigns an ID to the transaction, which is called transaction ID directly for convenience.

2. There are multiple versions of each row of data in the table. When A transaction updates the data, a new data version is generated, and the transaction ID of A transaction is assigned to the transaction ID of this data version.

3. Calculate the data version corresponding to the transaction ID by rolling back the log. For example, if the value of the data of the A transaction updated id=1 is 1, then there is a new version of the data of the id=1. (id=1,value=1) the transaction ID corresponding to this data is the transaction ID of the A transaction.

4. If a data version of the transaction ID falls in the yellow part and is still in the transaction array, it is determined that the data version is a new version of the data generated by other uncommitted transactions to modify the data; if a data version of the transaction ID falls in the yellow part and is not in its own transaction array, then the transaction has been committed and is legitimate data

So the principle of InnoDB's second snapshot creation ability is the same as Linux's AUFS file system, that is, as long as you change a certain piece of data, I will copy that one for you to change, and will not affect other people's data:

It's all caused by "current Reading".

After we understand how InnoDB implements the principle of creating second snapshots, the opening questions are actually easy to answer:

We might as well assume that before transaction A starts, there is only one active transaction in the system whose ID is 99; the version numbers of transaction A, B and C are 100,101,102 respectively, and there are only these four transactions in the current system; before the start of the three transactions, the row trx_id of this row of data is 90.

Thus, the view array of transaction An is [99100], the view array of transaction B is [99100101], and the view array of transaction C is [99100101102].

This is actually very easy to understand. Transaction C found that there were four transactions opened but not committed at this time (99100101102). Because other transactions had not modified the data of id=1, there was only one version at this time (id=1, Know1, transaction ID=90), and transaction ID 90 was at a low water level, indicating that the transaction with transaction ID 90 had been committed and the data was valid, and then transaction C set k to 2. And the transaction is committed, so a new version is generated at this time (id=1,k=2, transaction ID is 102)

Finally, transaction A went to query the value of k and found that there were already three versions. When transaction A saw the data of the latest version (id=1,k=3, transaction ID 101), it found that 101was at high water level and could not be read. Then, when the version was (id=1,k=2, transaction ID was 102), the same transaction 102was also high water level, could not be read, continued to read, read the data of (id=1,k=1, transaction ID is 90). A transaction with a transaction ID of 90 is at a low water level and is visible when the data is available, so the resulting data is still the data version of (id=1, kryp1, transaction ID 90).

What's the problem? How does transaction B see the modification result of transaction C? For transaction B, isn't transaction C a high water level? According to reason, the modification of high water level transactions should not be seen!

The view array of transaction B is made by Mr., and then transaction C commits. Shouldn't it be invisible (1Magne2)? how can it be calculated (1Magne3)?

Yes, if transaction B queries the data once before the update, the value of k returned by this query is indeed 1. However, when it is going to update the data, it can no longer be updated in the historical version, otherwise the update of transaction C will be lost. Therefore, the set k=k+1 of transaction B at this time is based on (1mem2). So, here's a rule:

Update data is read before writing, and this read can only read the current value, called "current read" (current read).

Therefore, at the time of the update, the data currently read is (1mem2). After the update, a new version of the data is generated (1Jing 3), and this new version of row trx_id is 101. Therefore, when executing the transaction B query statement, I see that my version number is 101, and the latest data version number is also 101, which is my own update and can be used directly, so the value of k obtained by the query is 3.

However, the result of select is 1, so the example at the beginning of our team is slightly modified so that the value of the update is no longer k=k+1, which avoids the current read, so the following example is the result of the avoided current read:

Because in MYSQL: data that cannot be queried in the same table as updated data in the same table, I use temporary tables here

In fact, in addition to the update statement, the select statement, if locked, is also currently read.

Now suppose transaction C is not committed immediately, but becomes the following transaction C, what happens?

The difference of transaction C'is that it is not committed immediately after the update, and the update statement of transaction B is initiated before it commits. As mentioned earlier, although transaction C 'has not yet been committed, this version has been generated and is currently the latest version. So, what about the update statement of transaction B?

Transaction C'is not committed, which means that the write lock on this version has not been released. Transaction B is the current read, must read the latest version, and must be locked, so it is locked, and must wait until transaction C 'releases the lock before continuing its current read.

At this point, we string consistent reads, current reads, and row locks together.

Repeatable implementation

The core of repeatable readability is consistent read; when a transaction updates data, it can only use the current read. If the row lock of the current record is occupied by another transaction, you need to enter the lock and wait.

The logic of read commit is similar to that of repeatable read. the main difference is that under the repeatable read isolation level, you only need to create a consistency view at the beginning of the transaction, and then other queries in the transaction share this consistency view; under the read commit isolation level, a new view is recalculated before each statement is executed.

There are multiple versions of row data in InnoDB, each with its own row trx_id, and each transaction or statement has its own consistent view. Normal query statements are consistent reads, which determine the visibility of the data version based on the row trx_id and the consistency view. For repeatable reads, the query only recognizes data that has been committed before the transaction starts; for read commits, the query only recognizes data that has been committed before the statement starts; and the current read, always reads the latest version that has been committed.

At this point, the study of "what is the principle of InnoDB second snapshot" is over. I hope to be able to solve your doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report