Example Analysis of MySQL Database transaction 07/15 Update SLTechnology News&Howtos

Example Analysis of MySQL Database transaction

2025-07-15 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Editor to share with you a sample analysis of MySQL database transactions, I believe that most people do not know much about it, so share this article for your reference, I hope you can learn a lot after reading this article, let's go to know it!

Transaction characteristics: ACID

From a business point of view, a set of operations on the database requires that four characteristics be maintained:

Atomicity (atomicity): a transaction must be regarded as an indivisible minimum unit of work. All operations in the whole transaction either commit successfully or fail to roll back. For a transaction, it is not possible to perform only part of the operations.

Consistency (consistency): a database always transitions from one consistent state to another. The following bank column will say.

Isolation (isolation): in general, changes made by one transaction are not visible to other transactions until they are finally committed. Note the "generally speaking" here, which will be mentioned later at the transaction isolation level.

Durability (persistence): once a transaction commits, its changes are permanently saved to the database. Even if the system crashes, the modified data will not be lost. There is also a relationship between persistent security and refresh log levels, with different levels corresponding to different levels of data security. )

To better understand ACID, take bank account transfers as an example:

START TRANSACTION;SELECT balance FROM checking WHERE customer_id = 10233276 / update checking SET balance = balance-200.00 WHERE customer_id = 10233276 / update savings SET balance = balance + 200.00 WHERE customer_id = 10233276 / commit

Atomicity: either commit completely (10233276 checking balance minus 200 million) or roll back completely (the balance of both tables does not change)

Consistency: the consistency of this example is that 200 yuan will not go missing because the database system crashes after line 3 and before line 4, because the transaction has not yet been committed.

Isolation: allows operation statements in one transaction to be isolated from statements of other transactions, such as transaction A running after line 3 and before line 4, when transaction B goes to query the checking balance, it can still see the 200 yuan subtracted from transaction A (account money unchanged), because transaction An and B are isolated from each other. Transaction B does not observe a change in the data until transaction A commits.

Persistence: this is easy to understand.

Transaction isolation is achieved through locks, MVCC, etc. (MySQL lock summary)

The atomicity, consistency, and persistence of transactions are achieved through transaction logs (see below)

Isolation level of the transaction

Problems caused by concurrent transactions

Lost update (Lost Update): when two or more transactions select the same row and then update the row based on the initially selected value, the problem of missing updates occurs because each transaction is unaware of the existence of other transactions-the last update overwrites updates made by other transactions. For example, two editors made an electronic copy of the same document. Each editor changes its copy independently, and then saves the changed copy, thus overwriting the original document. The editor who finally saved a copy of his changes overrides the changes made by another editor. This problem can be avoided if another editor cannot access the same file before one editor completes and commits the transaction.

Dirty Reads: a transaction is modifying a record, and the data of that record is in an inconsistent state before the transaction is completed and committed; at this time, another transaction also reads the same record. If uncontrolled, the second transaction reads the "dirty" data and makes further processing accordingly, resulting in uncommitted data dependencies. This phenomenon is vividly called "dirty reading".

Non-Repeatable Reads: at some point after reading some data, a transaction reads the previously read data again, only to find that the data it reads has changed or that some records have been deleted! This phenomenon is called "unrepeatable reading".

Phantom Reads: a transaction re-reads previously retrieved data according to the same query conditions, only to find that other transactions insert new data that meets its query criteria. This phenomenon is called "phantom reading".

The difference between phantom reading and unrepeatable reading:

The focus of unrepeatable reading is modification: in the same transaction, the data read for the first time is different from the data read for the second time under the same conditions. (because other transactions have committed changes in the middle)

The focus of phantom reading is on adding or deleting: in the same transaction, the number of records read for the first time and the second time is not the same under the same condition. (because there are other transactions committed to insert / delete)

Solutions to the problems caused by concurrent transactions:

"lost updates" is usually something that should be completely avoided. However, preventing the loss of updates can not be solved by the database transaction controller alone, and the application needs to add the necessary locks to the data to be updated. Therefore, it should be the responsibility of the application to prevent the loss of updates.

"dirty reading", "non-repeatable reading" and "phantom reading" are all problems of database read consistency, which must be solved by the database to provide a certain transaction isolation mechanism.

One is locking: locking the data before it is read to prevent other transactions from modifying the data.

The other is data multi-version concurrency control (MultiVersion Concurrency Control, referred to as MVCC or MCC), also known as multi-version database: without any lock, a consistent data snapshot (Snapshot) of a data request point in time is generated through a certain mechanism, and this snapshot is used to provide a certain level of consistent reading (statement level or transaction level). From the user's point of view, it seems that the database can provide multiple versions of the same data.

The SQL standard defines four types of isolation levels, each of which specifies the changes made in a transaction, which are visible and invisible within and between transactions. Low-level isolation levels generally support higher concurrent processing and have lower system overhead.

Level 1: Read Uncommitted (read uncommitted)

All transactions can see the execution results of other uncommitted transactions

This isolation level is rarely used in practical applications because its performance is not much better than other levels.

The problem caused by this level is dirty reading (Dirty Read): uncommitted data was read.

Level 2: Read Committed (read submission)

This is the default isolation level for most database systems (but not the default for MySQL)

It satisfies the simple definition of isolation: a transaction can only see the changes made by the committed transaction.

The problem with this isolation level is unrepeatable reading (Nonrepeatable Read): unrepeatable reading means that we may see different results when we execute identical select statements in the same transaction. The reasons for this may be:

There is a cross transaction with a new commit, resulting in a change in the data

When a database is operated by multiple instances, other instances of the same transaction may have a new commit during the processing of the instance

Level 3: Repeatable Read (rereadable)

This is the default transaction isolation level for MySQL

It ensures that when multiple instances of the same transaction read data concurrently, they will see the same data rows.

A problem that may arise at this level-Phantom Read: when the user reads a range of data rows, another transaction inserts a new row in that range, and when the user reads the range of data rows, they will find a new "phantom" row

InnoDB and Falcon storage engines solve phantom reading problems through multi-version concurrency control (MVCC,Multiversion Concurrency Control) mechanisms; InnoDB also solves phantom reading problems through gap locks

Multi-version concurrency control:

Most transactional storage engine implementations of Mysql are not simple row-level locks. Based on the consideration of promoting concurrency, multi-version concurrency control (MVCC) is generally implemented at the same time, including Oracle and PostgreSQL. However, the implementation is different.

MVCC is implemented by saving snapshots of data at a certain point in time. That is to say, no matter how long it takes to implement, everything sees the same data.

It is divided into optimistic concurrency control and pressimistic concurrency control.

How MVCC works:

InnoDB's MVCC is achieved by saving two hidden columns after each row of records. One of these two columns saves the creation time of the row, and the other saves the expiration time of the row (delete time). Of course, what is stored is not the real time, but the system version number (system version number). Each time you start a new transaction, the system version number is automatically added. The system version number at the beginning of the transaction is used as the version number of the transaction, which is used to query the version number of each row for comparison.

How does MVCC work at the REPEATABLE READ (rereadable) isolation level:

SELECT

InnoDB checks each row of records against the following criteria:

InnoDB only looks for data rows whose version is earlier than the current transaction version, which ensures that the rows read by the transaction either exist before the transaction starts or are inserted or modified by the transaction itself.

The delete version number of the row is either undefined or greater than the current transaction version number, which ensures that the row read by the transaction is not deleted before the transaction starts

Only those who meet the above two conditions will be queried.

INSERT

InnoDB saves the current system version number as the line version number for each newly inserted row

DELETE

InnoDB saves the current system version number as the line deletion ID for each line deleted.

UPDATE

InnoDB saves the current system version number as the row version number for the inserted row of new records, and saves the current system version number to the original line as the deletion ID

Save these two version numbers so that most operations are unlocked. It makes the data operation simple, the performance is good, and can guarantee that only the rows required by the compound will be read. The drawback is that each row of records requires additional storage space, more line checking and some additional maintenance work.

MVCC only works under two isolation levels: COMMITTED READ (read commit) and REPEATABLE READ (repeatable readable).

MVCC can be thought of as a variant of row-level locks, but it avoids locking operations in many cases and costs less. Although the implementation mechanisms of different databases are different, most of them implement non-blocking read operations (reads do not need to be locked, and can avoid unrepeatable reads and phantom reads), and write operations lock only the necessary rows (writes must be locked. otherwise, concurrent writing in different transactions will lead to data inconsistency.

Level 4: Serializable (serializable)

This is the highest level of isolation.

It solves the problem of phantom reading by forcing the ordering of transactions so that they can not conflict with each other. In short, it adds a shared lock to each read row of data. MySQL lock summary

At this level, it can lead to a lot of timeouts and lock competition.

Isolation level comparison

Each specific database does not necessarily fully implement the above four isolation levels, for example:

Oracle only provides two standard isolation levels, Read committed and Serializable, as well as self-defined Read only isolation levels.

In addition to the four isolation levels defined by ISO/ANSI SQL92 above, SQL Server supports an isolation level called "snapshot", but strictly speaking it is an Serializable isolation level implemented in MVCC.

MySQL supports all four isolation levels, but when implemented, there are some features, such as using MVCC consistent read under some isolation levels, but not in some cases.

Mysql can set the isolation level by executing the set transaction isolation level command, and the new isolation level takes effect at the beginning of the next transaction. For example: set session transaction isolation level read committed

Transaction log

Transaction logs can help improve transaction efficiency:

Using the transaction log, the storage engine only needs to modify the memory copy of the table when it modifies the data, and then records the modification in the transaction log on the hard disk, instead of persisting the modified data to disk every time.

The transaction log is appended, so the operation of writing the log is the sequential I _ hand O in a small area of the disk, unlike the random I _ big O, which needs to move the head in multiple places on the disk, so using the transaction log method is relatively much faster.

After the transaction log is persisted, the modified data in memory can be slowly brushed back to disk in the background.

If the modification of the data has been recorded in the transaction log and persisted, but the data itself is not written back to disk, the system crashes and the storage engine can automatically recover this part of the modified data when it is restarted.

At present, most storage engines are implemented in this way, which we usually call pre-written logs (Write-Ahead Logging), and you need to write to disk twice to modify data.

The principle of transaction implementation in Mysql

The implementation of transaction is based on the storage engine of database. Different storage engines have different levels of transaction support. The storage engines that support transactions in mysql are innoDB and NDB.

InnoDB is the default storage engine for mysql, and the default isolation level is RR (Repeatable Read). Under the isolation level of RR, the unrepeatable read problem is solved through multi-version concurrency control (MVCC,Multiversion Concurrency Control), and the phantom read problem is solved by adding gap locks (that is, concurrency control). Therefore, the RR isolation level of innoDB actually achieves the effect of serialization level and retains good concurrency performance.

The isolation of transactions is achieved through locks, while the atomicity, consistency, and persistence of transactions are achieved through transaction logs. When it comes to transaction logs, we have to say redo and undo.

1.redo log

In innoDB's storage engine, transaction logs are implemented through redo logs and innoDB storage engine log buffering (InnoDB Log Buffer). When a transaction is started, the operations in the transaction are first written to the log buffer of the storage engine. Before the transaction is committed, these buffered logs need to be flushed to disk in advance for persistence. This is what DBA often call "log first" (Write-Ahead Logging). When the transaction commits, the data files mapped in Buffer Pool are slowly flushed to disk. At this point, if the database crashes or goes down, when the system is rebooted for recovery, the database can be restored to the state it was before the crash according to the logs recorded in redo log. Unfinished transactions can continue to commit, or you can choose to roll back, depending on the recovery strategy.

When the system starts, a contiguous piece of storage space has been allocated to redo log, Redo Log is recorded in a sequential append way, and performance is improved by sequential IO. All transactions share the storage space of the redo log, and their Redo Log is recorded alternately in the order in which the statements are executed. The following is a simple example:

Record 1:

Record 2:

Record 3:

Record 4:

Record 5:

2.undo log

Undo log mainly serves for the rollback of transactions. During the execution of a transaction, in addition to recording redo log, a certain amount of undo log is recorded. Undo log records the state of the data before each operation, and if a rollback is needed during the execution of the transaction, it can be rolled back according to undo log. The rollback of a single transaction only rolls back the operations done by the current transaction and does not affect the operations done by other transactions.

The following is a simplified process for undo+redo transactions

Suppose there are two values, An and B, and the value is 1pc2.

1. Start transaction

two。 Record Agg1 to undo log

3. Update A = 3

4. Record Agg3 to redo log

5. Record bread2 to undo log

6. Update B = 4

7. Record B = 4 to redo log

8. Flush redo log to disk

9. Commit

At any step of 1-8, the system goes down and the transaction is not committed, and the transaction will not have any impact on the data on disk. If there is a downtime between 8 and 9, you can choose to roll back after recovery, or you can choose to continue to complete the transaction commit, because the redo log has been persisted at this time. If the system goes down after 9 and the data changed in the memory map is not ready to be flushed back to disk, then after the system is restored, the data can be flushed back to disk according to redo log.

Therefore, redo log actually guarantees the persistence and consistency of transactions, while undo log guarantees the atomicity of transactions.

Transaction usage in Mysql

The service layer of MySQL does not manage transactions, but is implemented by a lower-level storage engine. Like InnoDB.

MySQL supports statements for local transactions:

START TRANSACTION | BEGIN [WORK] COMMIT [WORK] [AND [NO] CHAIN] [[NO] RELEASE] ROLLBACK [WORK] [AND [NO] CHAIN] [[NO] RELEASE] SET AUTOCOMMIT = {0 | 1}

START TRANSACTION or BEGIN statement: start a new transaction.

COMMIT and ROLLBACK: used to commit or roll back transactions.

CHAIN and RELEASE clauses: used to define the operation after the transaction commits or rollback, respectively, CHAIN starts a new thing immediately and has the same isolation level as the previous transaction, and RELEASE disconnects from the client.

SET AUTOCOMMIT can modify the commit mode of the current connection. If SET AUTOCOMMIT=0 is set, all transactions after setting up need to be committed or rolled back through explicit commands.

Note the use of transactions:

If you start a new transaction with the start transaction command while the table is locked, it will result in an implicit unlock

Tables is executed.

In the same transaction, it is best not to use tables of different storage engines, otherwise ROLLBACK needs to be right and wrong.

Transaction-type tables are specially processed because COMMIT and ROLLBACK can only commit and roll back transaction-type tables.

Like Oracle's transaction management, all DDL statements cannot be rolled back, and some DDL statements cause implicit commits.

In a transaction, you can specify a portion of a rollback transaction by defining a SAVEPOINT (for example: mysql > savepoint test; defines a savepoint named test), but you cannot specify a portion of a committed transaction. For complex applications, multiple different SAVEPOINT can be defined and rolled back when different conditions are met.

Different SAVEPOINT. It is important to note that if a SAVEPOINT with the same name is defined, the SAVEPOINT defined later will overwrite the previous definition. For SAVEPOINT that no longer needs to be used, you can delete the SAVEPOINT through the RELEASE SAVEPOINT command, and the deleted SAVEPOINT can no longer execute the ROLLBACK TO SAVEPOINT command.

Auto submit (autocommit):

Mysql defaults to autocommit mode, which can be enabled or disabled by setting the autocommit variable.

Implicit locking

InnoDB uses a two-phase locking protocol during transaction execution:

Locking can be performed at any time, and InnoDB will automatically add locks when needed according to the isolation level

Locks are released only when commit or rollback is executed, and all locks are released at the same time.

Explicit locking

InnoDB also supports display locking through specific statements (storage engine layer):

Select... Lock in share mode / / shared lock select... For update / / exclusive lock

Display locking for MySQL Server layers:

Lock table and unlock table

(read more: summary of MySQL locks)

MySQL support for distributed transactions

There are many ways to implement distributed transactions, not only using the native transaction support provided by innoDB, but also using message queue to achieve the final consistency of distributed transactions. Here we focus on innoDB's support for distributed transactions.

MySQL has supported distributed transactions since 5.0.3. Currently, distributed transactions only support the InnoDB storage engine. A distributed transaction involves multiple actions, which themselves are transactional. All actions must be completed successfully together, or rolled back together.

As shown in the figure, mysql's distributed transaction model. The model is divided into three parts: application (AP), resource manager (RM), and transaction manager (TM):

Application: defines the boundaries of transactions and specifies which transactions need to be done

Resource manager: provides a way to access transactions, usually a database is a resource manager

Transaction manager: coordinates and participates in each transaction in a global transaction.

Distributed transactions adopt a two-stage commit (two-phase commit) approach:

In the first phase, all transaction nodes begin to prepare and tell the transaction manager ready.

The second phase of the transaction manager tells each node whether it is commit or rollback. If one node fails, all the rollback of the global node is required to ensure the atomicity of the transaction.

The SQL syntax for distributed transactions (XA transactions) mainly includes:

XA {START | BEGIN} xid [JOIN | RESUME]

Although MySQL supports distributed transactions, some problems were found during testing:

If the database restarts abnormally when the branch transaction reaches the prepare state, after the server restarts, you can continue to commit or rollback the branch transaction, but the committed transaction does not write binlog, which may lead to the loss of some data using binlog recovery. If there is a replicated database, it is possible to cause data inconsistency between the master and slave databases.

If the branch transaction executes to the prepare state, the database is abnormal, and it can no longer start normally, and backup and binlog are needed to recover the data, then the branch transactions in the prepare state cannot be recovered through binlog because they are not recorded in binlog, and this part of the data will be lost after the database is restored.

If the client connection of the branch transaction is aborted abnormally, the database will automatically roll back the incomplete branch transaction. If the branch transaction has been executed to the prepare state, then other branches of the distributed transaction may have been successfully committed. If the branch rolls back, it may lead to the incompleteness of the distributed transaction and lose some of the contents of the branch transaction.

In short, the distributed transaction of MySQL still has serious defects, in the case of database or application exception.

May result in incomplete distributed transactions. If the application does not have very high requirements for data integrity, you can consider making

Use it. If the application has high requirements for transaction integrity, then distributed is not recommended for the current version.

Business.

These are all the contents of the article "sample Analysis of MySQL Database transactions". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.