What is MySQL Lock, transaction, MVCC 04/16 Update SLTechnology News&Howtos

What is MySQL Lock, transaction, MVCC

2025-04-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

This article will explain in detail what MySQL locks, transactions and MVCC are for you. The editor thinks it is very practical, so I share it with you as a reference. I hope you can get something after reading this article.

When a single SQL statement is executed, is it committed as a transaction?

The following is an excerpt from High performance MySQL (3rd Edition)

MySQL defaults to autocommit (AUTOCOMMIT) mode. That is, if you do not start a transaction explicitly, each query is committed as a transaction. In the current connection, you can enable or disable autocommit mode by setting the AUTOCOMMIT variable

"how does MySQL implement the ACID of transactions?

Transaction has four characteristics of ACID, so how does MySQL implement these four attributes of transaction?

Atomicity is either all successful or all failed. MySQL is atomicity achieved by recording undo_log. Undo_log rolls back the log, writing the undo_log to disk before the real SQL is executed, and then manipulating the data in the database. If an exception or rollback occurs, you can reverse the operation according to undo_log to restore the data as it was before the transaction was executed.

Once a persistent transaction is normally committed, its impact on the database should be permanent. Even if the system crashes, the modified data will not be lost. As the storage engine of MySQ L, InnoDB stores data on disk, but if disk IO is needed for every read and write of data, the efficiency will be very low. For this reason, InnoDB provides Buffer Pool as a buffer to access the database: when data is read from the database, it is first read from Buffer Pool, and if not in Buffer Pool, it is read from disk and put into Buffer Pool; when data is written to the database, the modified data in Buffer Pool,Buffer Pool is written first and flushed to disk periodically.

This design also brings corresponding problems: what if the data is submitted and the data is still in the buffer pool (not yet flushed), and MySQL goes down and power goes out? Will the data be lost?

The answer is no, MySQL ensures persistence through the mechanism of redo_log. Redo_log is the redo log. To put it simply, when the data is modified, the operation will be recorded in redo_log in addition to modifying the data in Buffer Pool. When the transaction is committed, the fsync API will be called to flush the redo_log. If the MySQL is down, you can read the data in the redo_log and restore the database when you restart.

Isolation

Isolation is the most complex one in ACID, which involves the concept of isolation level, there are a total of four

Read uncommittedRead committedRepeatable readSerializable

To put it simply, the isolation level defines the changes to the data in a transaction, which transactions are visible and which are not visible. Isolation is to manage the access order of multiple concurrent read and write requests.

MySQL will talk about the specific implementation of isolation later.

Consistency

Consistency is achieved through rollback, recovery, and isolation in a concurrent environment.

Problems that may be caused by transaction concurrency

From the previous question, I know that a single DDL execution will also be treated as a transaction auto-commit, so whether it is multiple SQL concurrency or multiple transactions with multiple SQL manually organized by yourself, it will lead to transaction concurrency problems.

Specifically, there are:

Dirty write (data committed by one transaction overwrites data uncommitted by another transaction) dirty read (data read from one transaction to uncommitted data from another transaction) cannot be repeated (the key point is that the data read multiple times in update and delete is different from that in delete transaction) Phantom read (the key point is that the number of records read multiple times in a transaction in insert is different)

We mentioned the isolation level of transactions above, and all isolation levels of MySQL guarantee no dirty writing, so there are only dirty, unrepeatable and phantom reads left.

Let's take a look at how each isolation level solves or does not solve the above problems:

Read uncommitted

Read is not submitted. At this level, no lock is added during the read process, only when the write request is locked, so the write operation modifies the data during the read process, which will result in dirty reading. It will naturally lead to unrepeatable reading and illusory reading.

Read committed

Committed reads, like uncommitted reads, are read unlocked and write locked. The difference is that the MVCC mechanism is used to avoid the problem of dirty reading, and there will also be the problem of unrepeatable reading and phantom reading. We'll talk more about MVCC later.

Repeatable read

MySQL's default isolation level, at which MySQL solves problems in two ways

Read-write lock adds a read lock when reading in parallel, and reading is a shared lock. Add a write lock whenever there is a write request, so that the read and write are serial. The data is locked when it is read and cannot be modified by other transactions. So there is no unrepeatable reading. When modifying and deleting data, it is also locked, and other transactions cannot read the data, so there is no dirty read. The first way is what we often call "pessimistic lock". The data is locked in the whole transaction process, which is more conservative and expensive in performance. MVCC (later)

In addition, Next-Key lock is used to solve the problem of phantom reading to some extent. We'll talk about that later.

Serializable

At this isolation level, transactions are executed sequentially. If autocommit is disabled, InnoDB implicitly converts all ordinary SELECT statements to SELECT. LOCK IN SHARE MODE . That is, a read sharing lock is implicitly added to the read operation, so as to avoid dirty reading, unrepeatable reading and phantom reading.

MVCC "

Multiversion concurrency control (MCC or MVCC), is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory

Multi-version concurrency control (MCC or MVCC) is a concurrency control method, which is usually used by database management systems to provide concurrent access to the database and to implement transaction storage in programming languages.

To put it simply, it is a method used by the database to control concurrency. Each database may have a different implementation of MVCC.

In the case of our commonly used MySQL, MySQL's InnoDB engine implements MVCC.

What problem can MVCC solve?

From the above definition, we can see that MVCC mainly solves the problem of data consistency when transactions are concurrent.

How does InnoDB implement MVCC

The following figure is from High performance MySQL (3rd Edition)

This book is well written and translated well, and my initial systematic understanding of MySQL is also due to reading this book. However, I think there are some problems in the account of how MVCC is implemented.

Let's see what's wrong.

First of all, take a look at the official documentation of MySQL. I compared the documents [1] of versions 5.1,5.6,5.7, and the description of this part of MVCC is almost the same.

According to the document, it is obvious to add three hidden columns to each piece of data:

A 6-byte DB_TRX_ID field that represents the transaction ID that most recently inserted or updated the record. A 7-byte DB_ROLL_PTR field that points to the undo log record of the record's rollback segment. The 6-byte DB_ROW_ID is incremented automatically when new data is inserted. When there is no user primary key on the table, InnoDB automatically generates a clustered index containing the DB_ROW_ID field.

Here I would like to add a diagram of the internal structure of MySQL including rollback segment

Edition chain

We talked about the concept of undo_log before. Each undo log has a roll_pointer attribute, so all versions are linked into a linked list by the roll_pointer attribute. We call this linked list as the version chain, and the header node of the version chain is the latest value of the current record.

ReadView

By hiding columns and version chains, MySQL can restore data to a specified version; but which version to restore to needs to be determined based on ReadView. The so-called ReadView means that the transaction (remember transaction A) takes a snapshot of the whole transaction system (trx_sys) at a certain time, and then carries on the read operation, it will compare the transaction id in the read data with the trx_sys snapshot, so as to judge whether the data is visible to the ReadView, that is, whether the data is visible to transaction A.

So far, we find that MVCC is based on hidden fields, undo_log chains, and ReadView.

MVCC in Read committed

We talked earlier about the use of MVCC to solve dirty reading problems in the Read committed isolation level. Here I refer to two articles:

Https://cloud.tencent.com/developer/article/1150633https://cloud.tencent.com/developer/article/1150630

InnoDB only looks for rows whose version is earlier than the current transaction version (that is, the row's version number is less than or equal to the transaction's system version number), which ensures that the row read is either existing before the transaction starts, or the transaction itself has been inserted or modified. So there is no dirty reading.

The occurrence of unrepeatable reads under the Read committed isolation level is due to the generation mechanism of read view. At the Read committed level, data that has been committed before the execution of the current statement is visible. During each statement execution, the read view is closed and the current read view is recreated. This allows you to create a transaction interval for read view based on the current global transaction linked list. To put it simply, under the Read committed isolation level, MVCC generates a snapshot version each time select, so each select will read different version data, so it will be non-repeatable.

MVCC in Repeatable read

The Repeatable read isolation level solves the problem of non-repeatable readings, and multiple reads in a transaction do not produce different results, ensuring repeatability. In the previous article, we said that there are two ways to implement Repeatable read, one is pessimistic locking, and the opposite MVCC is optimistic locking.

The root cause that the Repeatable read isolation level can solve the problem of non-repeatable reading is that the generation mechanism of read view is different from that of Read committed.

Read committed: data that has been submitted before the execution of the current statement is visible. Repeatable read: data that has been committed before the execution of the current transaction is visible.

Unlike Read committed, under the isolation level of Repeatable read, when a transaction is created, the current global read view is generated until the end of the transaction. In this way, repeatability can be achieved.

Phantom Reading and Next-Key Lock current Reading and Snapshot Reading

Through the MVCC mechanism, although the data becomes readable, but the data we read may be historical data, not timely data, not the current data of the database! For this way of reading historical data, we call it snapshot read, while the way of reading data from the current version of the database is called current read reference [3].

Snapshot read: it is selectselect * from table … Current read: special read operation, insert / update / delete operation, belongs to the current read, deals with the current data and needs to be locked. Select * from table where? Lock in share mode;select * from table where? For update;insert;update; delete; solve phantom reading

In order to solve the problem of phantom reading in current reading, the MySQL transaction uses next-key lock.

Repeatable read avoids phantom reading through the next-key lock mechanism.

The InnoDB storage engine has three algorithms for row locking, which are:

Record Lock: lock on a single record Gap Lock: gap lock, locking a range, but excluding Next-Key Lock: Gap Lock + Record Lock on the log book

Next-key lock is a kind of row lock that implements the equivalent of record lock (record lock) + gap lock (gap lock); it is characterized by locking not only the record itself (the function of record lock), but also a range (the function of gap lock).

When InnoDB scans an index record, it first adds a row lock (Record Lock) to the index record, and then a gap lock (Gap Lock) to the gap between the two sides of the index record. After the gap lock is added, other transactions cannot modify or insert records in this gap.

When the index of the query contains unique attributes, Next-Key Lock optimizes it and downgrades it to Record Lock, that is, only the index itself is locked, not the range.

About MySQL locks, transactions, MVCC what is shared here, I hope that the above content can be of some help to you, can learn more knowledge. If you think the article is good, you can share it for more people to see.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.