In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
In this article, the editor introduces in detail what "mysql transaction refers to". The content is detailed, the steps are clear, and the details are handled properly. I hope this article "what mysql transaction refers to" can help you solve your doubts.
In mysql, a transaction is a mechanism, an operation sequence, and a program execution unit that accesses and updates the database. The transaction contains one or more database operation commands, and all the commands are submitted or revoked to the system as a whole, that is, this set of database commands are either executed or not executed.
The operating environment of this tutorial: windows7 system, mysql5.6 version, Dell G3 computer.
Database transaction (Transaction) is a mechanism, an operation sequence, a program execution unit that accesses and updates the database, and contains a set of database operation commands.
The transaction submits or cancels the operation request to the system as a whole, that is, this set of database commands are either executed or not executed, so the transaction is an indivisible logical unit of work.
When performing concurrent operations on a database system, transactions are used as the smallest control unit, which is especially suitable for database systems where multiple users operate at the same time.
As a relational database, MySQL supports transactions, which is based on MySQL5.6.
Let's first review the basics of MySQL transactions.
1. Logical architecture and storage engine
As shown in the figure above, the logical architecture of the MySQL server can be divided into three layers from top to bottom:
(1) the first layer: deal with client connection, authorization authentication and so on.
(2) the second layer: server layer, which is responsible for the parsing, optimization and caching of query statements, as well as the implementation of built-in functions, stored procedures and so on.
(3) the third layer: storage engine, which is responsible for the storage and extraction of data in MySQL. In MySQL, the server layer does not manage transactions, which are implemented by the storage engine. MySQL storage engines that support transactions include InnoDB, NDB Cluster, etc., among which InnoDB is the most widely used; other storage engines do not support transactions, such as MyIsam, Memory and so on.
Unless otherwise noted, the content described later is based on InnoDB.
two。 Commit and rollback
A typical MySQL transaction operates as follows:
Start transaction;... # one or more sql statements commit
Where start transaction identifies the start of the transaction, and commit commits the transaction and writes the execution result to the database. If there is a problem with the execution of the sql statement, rollback is called and all sql statements that have been successfully executed are rolled back. Of course, you can also use the rollback statement directly in the transaction for rollback.
Automatic submission
Autocommit (autocommit) mode is used by default in MySQL, as shown below:
In autocommit mode, if there is no start transaction to explicitly start a transaction, each sql statement is executed as a transaction.
You can turn off autocommit; in the following way. Note that the autocommit parameter is specific to the connection, and if you modify the parameter in one connection, it will not affect other connections.
If autocommit is turned off, all sql statements are in one transaction until commit or rollback is executed, and the transaction ends and another transaction starts.
Special operation
In MySQL, there are special commands that, if executed in a transaction, force the commit commit transaction to be executed immediately, such as DDL statements (create table/drop table/alter/table), lock tables statements, and so on.
However, the commonly used select, insert, update, and delete commands do not force the transaction to be committed.
3. ACID characteristics
ACID is the four characteristics that measure transactions:
Atomicity (Atomicity, or indivisibility)
Consistency (Consistency)
Isolation (Isolation)
Persistence (Durability)
According to strict standards, it is a transaction only if it satisfies the ACID feature at the same time; but in the implementation of the major database vendors, there are very few transactions that really meet the ACID. For example, NDB Cluster transactions of MySQL do not satisfy persistence and isolation; InnoDB default transaction isolation level is repeatable and does not meet isolation; Oracle default transaction isolation level is READ COMMITTED, which does not meet isolation. Therefore, ACID is not so much a condition that a transaction must meet, but rather four dimensions that measure a transaction.
The ACID features and how they are implemented are described in detail below; for ease of understanding, the order of introduction is not strictly A-C-I-D.
Atomicity
1. Define
Atomicity means that a transaction is an indivisible unit of work, in which the operations are either done or not done; if a sql statement in the transaction fails, the executed statement must also be rolled back and the database returns to the pre-transaction state.
two。 Implementation principle: undo log
Before explaining the principle of atomicity, let's first introduce MySQL's transaction log. There are many kinds of MySQL logs, such as binary log, error log, query log, slow query log and so on. In addition, InnoDB storage engine also provides two kinds of transaction log: redo log (redo log) and undo log (rollback log). Redo log is used to ensure transaction persistence; undo log is the basis for the implementation of transaction atomicity and isolation.
Let's go back to undo log. The key to atomicity is the ability to undo all sql statements that have been successfully executed when the transaction is rolled back. InnoDB rollback is achieved by undo log: when a transaction modifies the database, InnoDB generates the corresponding undo log;. If the transaction fails or calls rollback, causing the transaction to need to be rolled back, you can use the information in undo log to roll back the data to the way it was before the modification.
Undo log is a logical log, which records information related to sql execution. When a rollback occurs, InnoDB will do the opposite work according to the contents of undo log: for each insert, delete; will be executed on rollback for each delete, insert; will be executed on rollback for each update, and an opposite update will be performed on rollback to change the data back.
Take the update operation as an example: when a transaction executes update, the generated undo log contains information such as the primary key of the modified row (in order to know which rows have been modified), which columns have been modified, and the value of these columns before and after modification. When rollback, you can use this information to restore the data to the state before update.
Persistence
1. Define
Persistence means that once a transaction is committed, its changes to the database should be permanent. Other operations or failures that follow should not affect it in any way.
two。 Implementation principle: redo log
Both redo log and undo log belong to the transaction log of InnoDB. Let's talk about the background of the existence of redo log.
As the storage engine of MySQL, InnoDB stores data on disk, but if disk IO is needed for every read and write of data, the efficiency will be very low. To this end, InnoDB provides Buffer Pool, and Buffer Pool contains the mapping of some data pages on disk as a buffer to access the database: when reading data from the database, it will first read from Buffer Pool, and if not in Buffer Pool, it will be put into Buffer Pool; after reading from disk. When writing data to the database, the modified data in Buffer Pool,Buffer Pool will be written first and flushed to disk periodically (this process is called brushing).
The use of Buffer Pool greatly improves the efficiency of reading and writing data, but it also brings new problems: if MySQL is down, and the modified data in Buffer Pool has not been refreshed to disk, it will lead to the loss of data, and the durability of the transaction can not be guaranteed.
Therefore, redo log is introduced to solve this problem: when the data is modified, in addition to modifying the data in Buffer Pool, the operation is recorded in redo log; when the transaction is committed, the fsync API is called to flush the redo log. If the MySQL is down, you can read the data in the redo log and restore the database when you restart. Redo log uses WAL (Write-ahead logging, pre-written log). All modifications are first written to the log, and then updated to Buffer Pool, ensuring that data will not be lost due to MySQL downtime, thus meeting the persistence requirements.
Since redo log also needs to write logs to disk when a transaction commits, why is it faster than writing data modified in Buffer Pool directly to disk (that is, brushing)? There are two main reasons:
(1) brushing dirty is a random IO, because the position of each modified data is random, but writing redo log is an additional operation and belongs to sequential IO.
(2) brushing dirty is in terms of data pages (Page). The default page size of MySQL is 16KB, and a small modification on a Page is written to the whole page; while redo log contains only the parts that really need to be written, and the invalid IO is greatly reduced.
3. Redo log and binlog
We know that there is also a binlog (binary log) in MySQL that can also record writes and be used for data recovery, but the two are fundamentally different:
(1) different functions: redo log is used for crash recovery to ensure that MySQL outage will not affect persistence; binlog is used for point-in-time recovery to ensure that the server can recover data based on point-in-time. In addition, binlog is also used for master-slave replication.
(2) different levels: redo log is implemented by InnoDB storage engine, while binlog is implemented by the server layer of MySQL (see the introduction to MySQL logic architecture above), and supports InnoDB and other storage engines at the same time.
(3) the content is different: redo log is a physical log, and the content of disk-based Page;binlog is binary. Depending on the binlog_format parameters, it may be based on sql statement, data itself, or a mixture of the two.
(4) the timing of writing is different: binlog writes when the transaction is committed, while the timing of redo log is relatively diverse:
As mentioned earlier: when a transaction commits, fsync is called to flush the redo log; this is the default policy, which can be changed by changing the innodb_flush_log_at_trx_commit parameter, but the persistence of the transaction will not be guaranteed.
In addition to transaction commit, there are other opportunities to refresh the disk: for example, master thread flushes the disk once per second redo log and so on. The advantage is that you don't have to wait for the commit to flush the disk, and the speed of commit is greatly accelerated.
IV. Isolation
1. Define
Unlike atomicity and persistence, which focus on the transaction itself, isolation studies the interaction between different transactions. Isolation means that the operations within the transaction are isolated from other transactions, and the transactions executed concurrently cannot interfere with each other. Strict isolation corresponds to Serializable (serializability) in the transaction isolation level, but serializability is rarely used in practical applications for performance reasons.
Isolation pursues non-interference between transactions in the case of concurrency. For simplicity, we mainly consider the simplest read and write operations (special read operations such as locked reads will be specified), so the discussion of isolation can be divided into two aspects:
The effect of a write operation on a write operation of (another transaction): the locking mechanism ensures isolation
Effect of write operations on read operations of (one transaction): MVCC guarantees isolation
two。 Locking mechanism
First, let's look at the interaction between the write operations of the two transactions. Isolation requires that only one transaction can write to the data at a time, and InnoDB ensures this through the locking mechanism.
The basic principle of the locking mechanism can be summarized as follows: before the transaction modifies the data, the transaction needs to obtain the corresponding lock; after obtaining the lock, the transaction can modify the data; during the transaction operation, this part of the data is locked. If other transactions need to modify the data, they need to wait for the current transaction to commit or roll back the lock.
Row lock and table lock
According to granularity, locks can be divided into table locks, row locks, and other locks that lie between the two. The table lock locks the whole table when operating the data, and the performance is poor, while the row lock only locks the data that needs to be operated, and the concurrent performance is good. However, because locking itself consumes resources (obtaining locks, checking locks, releasing locks, etc.), using table locks in the case of more locking data can save a lot of resources. Different storage engines in MySQL support different locks. For example, MyIsam only supports table locks, while InnoDB supports both table locks and row locks, and row locks are used in most cases for performance reasons.
How to view lock information
There are several ways to view locks in InnoDB, such as:
Overview of select * from information_schema.innodb_locks; # locks show engine innodb status; # InnoDB overall status, including lock status
Let's look at an example:
# execute in transaction A: start transaction;update account SET balance = 1000 where id = 1 * * # execute in transaction B: start transaction;update account SET balance = 2000 where id = 1
Check the lock at this point:
Show engine innodb status looks at the lock-related parts:
You can see how transactions 24052 and 24053 occupy locks through the above command, where lock_type is RECORD, representing a row lock (record lock), and lock_mode is X, representing an exclusive lock (write lock).
In addition to exclusive locks (write locks), there is a concept of shared locks (read locks) in MySQL. Since this article focuses on the implementation principle of MySQL transactions, the introduction of locks ends here. Later, we will write a special article to analyze the differences and usage scenarios of different locks in MySQL.
After introducing the interaction between write operations, let's discuss the impact of write operations on read operations.
3. Dirty reading, unrepeatable reading and phantom reading
First, let's look at three types of problems that may exist in read operations in the case of concurrency:
(1) dirty reading: the uncommitted data (dirty data) of other transactions (B) can be read in the current transaction (A). This phenomenon is dirty reading. Examples are as follows (take the account balance statement as an example):
(2) unrepeatable reading: the same data is read twice in transaction A, and the result of the two reads is different. This phenomenon is called non-repeatable reading. The difference between dirty reading and unrepeatable reading is that the former reads data that is not committed by other transactions, while the latter reads data that has been committed by other transactions. Examples are as follows:
(3) Phantom reading: in transaction A, the database is queried twice according to a certain condition, and the number of query results is different. This phenomenon is called phantom reading. The difference between unrepeatable reading and phantom reading can be understood as: the former is that the data has changed, while the latter is that the number of rows of the data has changed. Examples are as follows:
4. Transaction isolation level
The SQL standard defines four isolation levels and specifies whether the above problems exist under each isolation level. In general, the lower the isolation level, the lower the system overhead, the higher the concurrency that can be supported, but the worse the isolation. The relationship between isolation levels and reading problems is as follows:
In practical applications, read uncommitted will cause a lot of problems in concurrency, but the performance improvement compared with other isolation levels is very limited, so it is used less. Serializability forces transactions to be serial, and the concurrency efficiency is very low. It is only used when data consistency is extremely high and no concurrency is acceptable, so it is less used. Therefore, in most database systems, the default isolation level is read committed (such as Oracle) or repeatable read (hereinafter referred to as RR).
You can view the global isolation level and the isolation level for this session with the following two commands:
The default isolation level for InnoDB is RR, which will focus on RR later. It should be noted that in the SQL standard, RR cannot avoid phantom reading, but RR implemented by InnoDB avoids phantom reading.
5. MVCC
RR solves the problems of dirty reading, unrepeatable reading, phantom reading and so on, using the full name of MVCC:MVCC Multi-Version Concurrency Control, that is, multi-version concurrency control protocol. The following example shows the characteristics of MVCC: different transactions may read different data at the same time (that is, multiple versions)-transaction An and transaction C can read different versions of the data at T5.
The biggest advantage of MVCC is that there is no lock on read, so there is no conflict between read and write, and the performance of concurrency is good. InnoDB implements MVCC, and multiple versions of data can coexist, mainly based on the following technologies and data structures:
1) Hidden column: every row of data in InnoDB has a hidden column. The hidden column contains the transaction id of the row data, the pointer to undo log, and so on.
2) undo log-based version chain: as mentioned earlier, the hidden column of each row of data contains a pointer to undo log, and each undo log also points to an earlier version of undo log, forming a version chain.
3) ReadView: MySQL can restore data to a specified version by hiding columns and version chains, but which version to restore to needs to be determined according to ReadView. The so-called ReadView means that the transaction (remember transaction A) takes a snapshot of the whole transaction system (trx_sys) at a certain time, and then carries on the read operation, it will compare the transaction id in the read data with the trx_sys snapshot, so as to judge whether the data is visible to the ReadView, that is, whether the data is visible to transaction A.
The main contents of trx_sys and the methods to determine visibility are as follows:
Low_limit_id: represents the id in the system that should be assigned to the next transaction when the ReadView is generated. If the transaction id of the data is greater than or equal to low_limit_id, it is not visible to that ReadView.
Up_limit_id: represents the smallest transaction id among the active read and write transactions in the current system when the ReadView is generated. If the transaction id of the data is less than up_limit_id, it is visible to that ReadView.
Rw_trx_ids: represents the id list of active read and write transactions in the current system when the ReadView was generated. If the transaction id of the data is between low_limit_id and up_limit_id, you need to determine whether the transaction id is in the rw_trx_ids: if so, the transaction is still active when the ReadView is generated, so the data is not visible to the ReadView; if not, the transaction is committed when the ReadView is generated, so the data is visible to the ReadView.
The following takes the RR isolation level as an example, combined with several problems mentioned earlier.
(1) dirty reading
A ReadView is generated before transaction A reads the balance of zhangsan at T3, and since transaction B is still active without committing, its transaction id must be in the rw_trx_ids of ReadView, so according to the rules described earlier, the modification of transaction B is not visible to ReadView. Next, transaction A queries the data of the previous version according to the undo log pointed to by the pointer, and the balance of zhangsan is 100. In this way transaction An avoids dirty reading.
(2) non-repeatable
A ReadView is generated before transaction A reads the balance of zhangsan at T2. Transaction B is discussed in two cases: one is that, as shown in the figure, the transaction has started but not committed, and the transaction id is in the rw_trx_ids of ReadView; the other is that transaction B has not yet started, and its transaction id is greater than or equal to the low_limit_id of ReadView. In either case, according to the rules described earlier, the modification of transaction B is not visible to ReadView.
When transaction A reads the balance of zhangsan again at T5, it will judge the visibility of the data according to the ReadView generated at T2, thus judging that the modification of transaction B is not visible; therefore, transaction A queries the data of the previous version according to the undo log pointed to by the pointer, and the balance of zhangsan is 100, thus avoiding non-repeatable reading.
(3) Phantom reading
The mechanism by which MVCC avoids phantom reading is very similar to avoiding unrepeatable reading.
When transaction A reads 0 at T2 time
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.