Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What are MySQL transactions, isolation levels, and MVCC

2025-01-14 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly introduces the MySQL transaction, isolation level and what MVCC is, has a certain reference value, friends in need can refer to. I hope you will learn a lot after reading this article. Next, let the editor take you to learn about it.

The mysql tutorial column introduces MySQL-related transactions, isolation levels, and MVCC.

The fourth article in the MySQL series focuses on transactions, including transaction ACID features, isolation levels, dirty reading, non-repeatable reading, phantom reading understanding, and multi-version concurrency control (MVCC).

A transaction (Transaction) ensures that an indivisible set of atomic operations is either executed or not executed. Among the commonly used storage engines of MySQL, InnoDB supports transactions, while the native MyISAM engine does not.

In this article, unless otherwise specified, the data tables and data used are as follows:

CREATE TABLE `user` (`id` int (11) DEFAULT NULL, `name` varchar (12) DEFAULT NULL) ENGINE = InnoDB;insert into user values (1, 'hedgehog'); copy code 1. ACID

First of all, we need to understand the four characteristics of transaction ACID, namely, Atomicity, Consistency, Isolation and Durability, which are also the four basic elements of transaction.

In order to explain the features of ACID in detail, imagine a scenario here: I transfer 100 yuan to you.

Suppose this operation can be divided into the following steps (suppose your account balance and I both have a balance of 100 yuan):

Check my account balance my account debit 100 yuan start to transfer query your account balance your account to 100 yuan 1.1 atomicity (Atomicity)

The atomicity of a transaction means that a transaction must be the smallest unit of work that can no longer be divided, and all the operations in a transaction either succeed or fail, and it is impossible to perform only part of the operation in a transaction.

In the above transfer scenario, atomicity requires that these five steps are either performed or not performed, and it is impossible for my account to deduct 100 yuan while your account fails to receive 100 yuan.

1.2 consistency (Consistency)

The consistency of transactions means that the database always changes from one consistency state to another, the consistency focuses on the visibility of the data, and the intermediate state of the data is invisible.

At the same time, the consistency requirements of transactions meet the constraints defined by developers, such as amount greater than 0, height greater than 0, and so on.

In the above transfer scenario, consistency ensures that after the final execution of the entire transfer operation, the deduction amount of my account is the same as that of your account, and if the balance of your account and I do not meet the amount constraint (for example, less than 0), the whole transaction will be rolled back.

1.3 isolation (Isolation)

The isolation of transactions means that one state transition will not be affected by other state transitions.

Suppose you and I both have 100 yuan, and I initiate two transfers, both of which are 50 yuan. The following steps are expressed in pseudo code:

Check my account balance read my my account debit 50 yuan my=my-5050 start to transfer query your account balance read yours your account to 50 yuan yours=yours+50

If isolation is not guaranteed, the following may occur:

The first transfer and the second transfer my account balance your account balance 1read my

My=100yours=1002

Read my-50 my=100yours=1003my=my-50=100-50

My=50yours=1004read yours (100) my=my-50=100-50=50my=50yours=1005yours=yours+50=100+50=150

My=50yours=1506

Read yours (150) my=50yours=1507

Yours=yours+50=150+50=200my=50yours=2007endendmy=50yours=200

After two transfers, the final result is that my account balance is 50 yuan and your account balance is 200 yuan, which is obviously wrong.

If the transaction isolation is guaranteed, the above situation will not happen, and only a certain degree of consistency will be lost.

1.4 persistence (Durability)

The persistence of a transaction means that after a transaction is committed, its changes are permanently saved to the database.

In the above transfer scenario, persistence ensures that after a successful transfer, my account balance is 0 and your account balance is 200.

two。 Auto commit and implicit submission 2.1 Auto commit

In MySQL, we can open the transaction through begin or start transaction and close the transaction through commit. If these two commands are not in the SQL statement, by default, each SQL is a separate transaction, which is automatically committed after execution is completed.

For example:

Update user set name=' reshape 'where id=1; copy code

Suppose I only execute this update statement, and after I close the MySQL client and reopen a new client, I can see that all the name field values in the user table have become "reshaped", which confirms that this update statement has been automatically submitted after execution.

Autocommit is a default property of MySQL, which can be seen through the SHOW VARIABLES LIKE 'autocommit' statement. When its value is ON, autocommit of the transaction is turned on.

Mysql > SHOW VARIABLES LIKE 'autocommit';+-+-+ | Variable_name | Value | +-+-+ | autocommit | ON | +-+-+ 1 row in set (0.00 sec) copy code

We can turn off the automatic commit of the transaction by SET autocommit = OFF.

2.2 implicit submission

However, even if we have changed the value of the autocommit variable to OFF to turn off transaction auto-commit, MySQL will automatically commit the transaction when some SQL statements are executed, which is called implicit commit.

The SQL statements that trigger implicit commit are:

DDL (Data definition language, data definition language), such as create, drop, alter, truncate, statements that modify MySQL's own table data, such as create/drop user, grant, set password. If a new transaction is started in a transaction, the last transaction will be implicitly committed, such as: moment transaction A transaction B1begin

2update user set name=' reshaping 'where id=1

three

Select name from user where id=1; (N1) 4begin

five

Select name from user where id=1; (N2)

It can be proved that there are two query statements N1 and N2 in transaction B, the result of which is N1 = hedgehog and N2 = reshaping.

Other management statements are not one by one examples, can be their own Baidu. 3. Isolation level

The isolation level of a transaction specifies the visibility of changes made in a transaction within and between transactions. Lower levels of isolation usually perform higher concurrency and lower system overhead.

Four kinds of transaction isolation levels are defined in the SQL standard, namely, read uncommitted (Read Uncommitted), read committed (Read Committed), repeatable read (Repeatable Read), serializable (Serializable).

To explain in detail these four isolation levels and their respective phenomena, assume that two transactions are about to be executed, as shown in the table below:

Moment transaction A transaction B1begin

two

Begin;3

Update user set name=' reshapes' where id=1;4select name from user where id=1; (N1)

five

Commit;6select name from user where id=1; (N2)

7commit

8select name from user where id=1; (N3)

During the execution of transaction An and transaction B, there are three queries for N1Magic N2MaginN3, and their values are different at each isolation level, which are discussed below.

3.1 read unsubmitted (Read Uncommitted)

At the read uncommitted isolation level, changes in a transaction are visible to other transactions, even if they are not committed.

In the above scenario, if the isolation level of the database is read uncommitted, because transaction A can read the modified data of uncommitted transaction B, that is, the modification of transaction B in time 3 is visible to transaction A, so N1 = reshape, N2 = reshape, N3 = reshape.

3.2 read submitted (Read Committed)

Under the read committed isolation level, changes in a transaction are not visible to other transactions until they have been committed.

In the above scenario, if the isolation level of the database is read committed, transaction A can only read the data after transaction B commits, that is, the modification of transaction B in time 3 is not visible to transaction A, and the query at N2 is visible to transaction An after transaction B commits. So N1 = hedgehog, N2 = remodeling, N3 = remodeling.

3.3Readable (Repeatable Read)

Repeatable readability is the default transaction isolation level for MySQL. At a repeatable isolation level, the same records are queried multiple times in a transaction, and the results are always consistent.

In the above scenario, if the isolation level of the database is repeatable, the values of query N1 and N2 are "hedgehogs" because they are in the same transaction, while N3 is a query made after transaction An is committed, and changes to transaction B are visible, so N3 = reshape.

3.4Serializable (Serializable)

Under the serializable isolation level, transactions are executed serially, read locks are added, write locks are added, and transactions are not executed concurrently, so there is no exception.

In the above scenario, if the isolation level of the database is serializable, transaction An is opened first, transaction B is blocked when transaction B is opened, and transaction B is not opened until transaction A commits, so N1 = hedgehog, N2 = hedgehog. The query at N3 is executed after transaction B commits (transaction B is blocked first and executes before the N3 query statement), so N3 = reshape.

4. Problems caused by isolation level

In different transaction isolation levels, if you encounter transaction execution concurrently, there will be many problems, such as Dirty Read, Non-Repeatable Read, Phantom Read and so on. Here are different examples to illustrate these problems in detail.

4.1 dirty reading (Dirty Read)

Dirty Read means that one transaction can read data modified by another uncommitted transaction.

Take a look at the following example, assuming that the isolation level is read unsubmitted:

Moment transaction A transaction B1begin

two

Begin;3

Update user set name=' reshapes' where id=1;4select name from user where id=1; (N1)

five

Rollback;6select name from user where id=1; (N2)

7commit

Under the read uncommitted isolation level, the value of N1 is "reshaping" and the value of N2 is "hedgehog" due to the rollback of transaction B. Here there is a dirty read at N1, and it is obvious that the query result at N1 is a dirty data, which will have an impact on normal business.

Dirty reading occurs when reading an uncommitted isolation level.

4.2 non-repeatable (Non-Repeatable Read)

Unrepeatable Non-Repeatable Read means that executing the same query twice may yield different results.

Continue to use the AB transaction case when introducing the isolation level, assuming that the isolation level is read committed:

Moment transaction A transaction B1begin

two

Begin;3

Update user set name=' reshapes' where id=1;4select name from user where id=1; (N1)

five

Commit;6select name from user where id=1; (N2)

7commit

8select name from user where id=1; (N3)

Under the read committed isolation level, transactions can read data committed by other transactions. In the above case, the result is N1 = hedgehog, N2 = reshape, N3 = reshape, and in transaction A, there are two identical queries N1 and N2, but the results of these two queries are not the same, which leads to non-repeatable reading.

Non-repeatable reads occur at the isolation level of read uncommitted and read committed.

4.3 Phantom reading (Phantom Read)

Phantom Read means that when a transaction reads a record in a range, another transaction inserts a new record in that range, and when the previous transaction reads the record in that range again, the new record will be read.

Take a look at the following example, assuming that the isolation level is repeatable at this time:

Moment transaction A transaction B1begin

2select name from user; (N1)

three

Begin;4

Insert into user values (2, 'five people'); 5

Commit;6select name from user; (N2)

7select name from user for update; (N3)

8commit

Transaction A has three queries, between N1 and N2, transaction B executes an insert statement and commits, and the query at N3 uses for update.

The result at N1 is obviously only a "hedgehog", and the result at N2 is also a "hedgehog" because transaction A starts before transaction B, while the result at N3 should theoretically be only a "hedgehog" in a repeatable isolation level. but in fact, the results of N2 are "hedgehogs" and "five people", which leads to illusion.

This is strange. Doesn't the repeatable isolation level ensure that the same records are queried multiple times in a transaction, and the results are always the same? This result does not satisfy the definition of repeatability.

In fact, at a repeatable isolation level, phantom reading may occur if the current read is used.

The current read and snapshot read will be discussed in the following section when introducing the principles of transaction implementation and MVCC. Here is a conclusion.

Phantom reading occurs at the isolation level of read uncommitted, read committed, and repeatable.

What needs to be noted here is that both phantom reading and unrepeatable reading mean that the results of the same query statement in a transaction are different, but phantom reading focuses more on querying newly inserted data from other transactions (insert) or data deleted by other transactions (delete), while the scope of unrepeatable reading is wider, as long as the results are different, it can be regarded as unrepeatable reading. However, in general, we think that unrepeatable reads focus more on data updates (update) from other transactions.

4.4 Summary

From the above description, we already know the concepts of the four isolation levels and the problems they will encounter. The higher the isolation level of the transaction, the stronger the isolation, and the fewer problems will be encountered. But at the same time, the higher the isolation level, the weaker the concurrency.

The following table summarizes the concept of isolation levels and the problems that can occur at different isolation levels:

Isolation level dirty read can not be repeated unreal read concept read changes in committed √√√ transactions, even if not committed, read uncommitted is visible to other transactions

Changes in √√ transactions are visible and repeatable to other transactions only after they have been committed

√ queries the same records multiple times in a transaction, and the results are always consistent serializable.

Transactions are executed serially, with read locks added to read and write locks added to 5. MVCC

MVCC (Multi-Version Concurrency Control) is multi-version concurrency control, which is implemented by MySQL to improve database concurrency performance. It can ensure the concurrent execution of read-write operations of different transactions when reading and writing databases concurrently. At the same time, it can also solve the transaction isolation problems such as dirty read, unrepeatable read, phantom read and so on.

The concept of current reading was mentioned in the previous discussion of phantom reading, and it is precisely because of the current reading that phantom reading also occurs at a repeatable isolation level.

Before explaining the cause of phantom reading at the repeatable isolation level, the implementation principle of MVCC is introduced.

The implementation principle of 5.1 MVCC

First of all, we need to know that there are hidden fields for each row of data in InnoDB's data page:

DB_ROW_ID: implicit primary key. If no primary key is defined in the table structure, InnoDB will automatically generate this field as the primary key of the table DB_TRX_ID: transaction ID, representing the last transaction IDDB_ROLL_PTR: rollback pointer that modified this row record, pointing to the previous version of this row record (the record corresponding to the previous transaction ID)

Each modification statement records a rollback statement (undo log) accordingly. If each rollback statement is regarded as a record in a data table, then the change record to the same row can be regarded as a linked list through the transaction ID and the rollback pointer, and each node on the linked list is a snapshot version, which is the meaning of multiple versions in MVCC.

For example, suppose you make multiple changes to the only row "hedgehog" in the user table.

Update user set name=' reshapes the copy code of 'where id=1;update user set name=' Trojan' where id=1;update user set name=' Dada 'where id=1;

So the version chain of this record is:

In this version chain, the header node is the latest version of the current record. The DB_TRX_ID transaction ID field is a very important attribute, so Mark it first.

In addition, in the isolation level of read committed (RC,Read Committed) and repeatable read (RR,Repeatable Read), a Read View is created when the transaction is started, which is used to record the active transaction information of the current system, and to judge the visibility between the transactions by reading the view.

There are two important properties in the read view:

Current transaction ID: represents the transaction of the transaction generating the read view ID transaction ID list: indicates the transaction ID list active in the current system when the read view is generated, the minimum transaction ID: indicates the smallest transaction active in the current system when the read view is generated, the next transaction ID: indicates that the system should be assigned to the transaction ID of the next transaction when the read view is generated

Note that the value of the next transaction I is not the maximum value + 1 in the transaction ID list, but the maximum value + 1 of the transactions that have existed in the current system. For example, there are active transactions in the current database (1minute 2), when transaction 2 commits and a new transaction is opened. In the generated read view, the value of the next transaction ID is 3.

We judge the visibility between concurrent transactions by combining the version chain and the read view, and the judgment rules are as follows (suppose we now want to determine whether transaction A can access the modification record of transaction B):

If the current transaction ID of transaction B is less than the minimum transaction ID of transaction A, it means that transaction B has been committed before transaction A generates a read view, so transaction B is visible to transaction A. If the current transaction ID of transaction B is greater than or equal to the value of the next transaction ID of transaction A, it means that transaction B is opened after transaction A generates a read view, so transaction B is not visible to transaction A. If the current transaction ID of transaction B is between the minimum transaction ID of transaction An and the next transaction ID (left closed and right open, [minimum transaction ID, next transaction ID), it needs to be discussed in two cases: if the current transaction ID of transaction B is in the transaction ID list of transaction A, transaction B is still active and uncommitted when transaction An is created, so transaction B is not visible to transaction A. If the current transaction ID of transaction B is not in the transaction ID list of transaction A, transaction B is committed when transaction An is created, so transaction B is visible to transaction A.

If transaction B is invisible to transaction A, you need to modify the version chain of the record, traversing forward from the rollback pointer until you find the first transaction ID that is visible to transaction A, or the traversal version chain is not found (indicating that the record is not visible to transaction A).

This is how MVCC is implemented.

5.2 when to read the creation of the view

What we need to pay attention to here is the opportunity to create a Read View. In the above discussion, we already know that a transaction will create a read view when it starts, and there are two ways to start a transaction, one is begin/start transaction, and the other is start transaction with consistent snapshot. The timing of creating a read view is also different by opening a transaction in these two ways:

If the transaction is opened in begin/start transaction mode, the read view will create a run process to read view 5.3 MVCC if the transaction is opened in start transaction with consistent snapshot mode when the first snapshot read statement is executed

To illustrate in detail the running process of MVCC, the following example assumes that there are currently two transactions (transaction isolation level is MySQL default repeatable readable):

What we need to pay attention to here is the timing of transaction startup. In the above discussion, we already know that a transaction will create a Read View when it starts, and there are two ways to start a transaction, one is begin/start transaction, and the other is start transaction with consistent snapshot. The timing of creating a read view is also different when a transaction is started in these two ways:

If the transaction is opened in begin/start transaction mode, the read view will create if the transaction is opened in start transaction with consistent snapshot mode when the first snapshot read statement is executed, and the transaction A transaction B1start transaction with consistent snapshot will be created at read view time

two

Start transaction with consistent snapshot;3

Update user set name=' reshapes' where id=1;4select name from user where id=1; (N1)

five

Commit;6select name from user where id=1; (N2)

7commit

Then the running process of MVCC is analyzed according to the version chain described above and the read view when the two transactions are opened.

The above figure shows the read view with two transactions open, and after the update statement of transaction B is executed, the version chain of the id=1 line is shown below.

First, let's take a look at the query statement at N1. The current transaction ID=2 of transaction B is equal to the next transaction ID of transaction A, so according to the visibility judgment discussed above, transaction B is invisible to transaction An and needs to be searched online along the version chain of the current row.

So follow the version chain to the historical version of DB_TRX_ID=1 transaction ID=1, which happens to be equal to the transaction ID value of transaction A, that is, the version of the row when transaction An is opened, this version is of course visible to transaction A, so it reads the name=' hedgehog of line id= 1, that is, the final N1 = hedgehog.

Looking at the query statement at N2, transaction B has been committed and the version chain is still shown above. Because the current version of transaction ID is equal to the next transaction ID in transaction A's read view, the current version of the record is not visible to transaction A, so it is also N2 = hedgehog.

It should be noted that if the time 4 statement of transaction An in the example is changed to an update statement for the row, transaction A will wait for transaction B to commit before executing the update statement. This is because transaction B has not committed, that is, transaction B has not released the write lock on the id=1 row, and transaction A has to update the row, which must update the current latest version (current read), so transaction An is blocked. You must wait for transaction B to release the write lock on the row before continuing to execute the update statement.

5.4 comparison of the timing of generating Reading View between RC and RR

The MVCC processes discussed above are all for RR (Repeatable Read) isolation levels, but what about read committed (RC, Read Committed) levels?

Since we have already discussed the situation of reading non-repeatable readings in the submitted isolation level, we will not give an example here, but just give a conclusion.

The time to generate a Read View under the RR, Repeatable Read isolation level is when the transaction is started, and the time to read the committed (RC, Read Committed) isolation level is to generate the Read View before each statement is executed.

For the example described above during MVCC execution, if the isolation level is read committed (RC, Read Committed):

For the query statement at N1, because transaction B has not been committed, the visible version of transaction An is still the version of transaction ID=1, so N1 = the query statement at hedgehog N2, transaction B has been committed, and the read view is also generated when the query statement at N2 is executed, and its current transaction ID=3, while in the version chain of the record, the current version of transaction ID DB_TRX_ID=2 is visible before the N2 query statement transaction ID. So N2 = reshape 5.5 current read and snapshot read current read: read the latest version of the record snapshot read: when reading the record will read the transaction visible version of the record 5.6 repeatable reasons for phantom reading

After understanding MVCC, let's look at the reasons for phantom reading at the repeatable isolation level. As mentioned above, it is because of the current reading that phantom reading occurs at a repeatable isolation level. first, let's review the example.

Moment transaction A transaction B1begin

2select name from user; (N1)

three

Begin;4

Insert into user values (2, 'five people'); 5

Commit;6select name from user; (N2)

7select name from user for update; (N3)

8commit

The query at N2 must have been very clear that they are "hedgehogs". The query statement used in N3 is for update, which adds a "row-level lock" to the target record. The meaning of the row-level lock will be discussed later. Now you only need to know that for update can lock the target record.

Locking is naturally to prevent others from modifying it, so of course, what is locked is the latest version of the record. Therefore, when querying with for update, the current read will be used to read the latest version of the target record, so the query statement at N3 will query out the records in transaction B that are not visible to transaction A, and a phantom reading will occur.

The statements currently read are:

Select... For updateselect... Lock in share mode (shared read Lock) update... insert... delete... Thank you for reading this article carefully. I hope it will be helpful for everyone to share MySQL affairs, isolation level and MVCC. At the same time, I also hope that you will support us, pay attention to the industry information channel, and find out if you encounter problems. Detailed solutions are waiting for you to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report