Examples of MYSQL_ multiversion concurrency control, storage engine, and indexing 07/04 Update SLTechnology News&Howtos

Examples of MYSQL_ multiversion concurrency control, storage engine, and indexing

2025-07-04 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly introduces the MYSQL_ multi-version concurrency control, storage engine, indexing examples, has a certain reference value, interested friends can refer to, I hope you can learn a lot after reading this article, the following let Xiaobian take you to understand.

Multi-version concurrency control

Most of mysql's transactional storage engines do not implement simple row-level locks. Based on the consideration of improving concurrency performance, they generally implement multi-version concurrency control at the same time.

MVCC can be thought of as a variant of row-level locking, but it avoids locking operations in many cases because of its lower overhead.

InnoDB's MVCC is achieved by saving two hidden columns at the end of each row of records, one of which saves the creation time of the row, and the other saves the expiration time (or deletion time) of the row, of course, the storage is not the actual time value, but the system version is good. Each time you start a new transaction, the system version number is automatically incremented. The system version number at the beginning of the transaction is used as the version number of the transaction, which is used to compare the version number of each row queried.

Under REPEATABLE READ isolation level, the implementation of MVCC:

SELECT

The lookup version of the InnoDB is earlier than the data row of the current transaction version number, which ensures that the row read by the transaction either exists before the transaction starts, or is inserted or modified by the transaction itself.

The deleted version of the row is either undefined or greater than the current transaction version number, which ensures that the row read by the transaction is not deleted before the transaction begins.

INSERT

InnoDB saves the current system version number as the line version number for each newly inserted row.

DELETE

InnoDB saves the current system version number as the row deletion identity for each line deleted.

UPDATE

In order to insert a new record, InnoDB saves the current system version number as the row version number and saves the current system version number to the original line as the line deletion version number.

MVCC only works at the isolation levels of REPEATABLE READ and READ COMMITED. The other two isolation levels are incompatible with MVCC. Because READ UNCOMMITED always reads the latest rows of data, not those that match the current transaction version. SERIALIZABLE, on the other hand, locks all rows of read data.

Storage engine InnoDB storage engine

InnoDB is the default transactional engine for MYSQL and the most important and widely used storage engine. Unless there is a very specific reason to use another storage engine, the InnoDB engine should be given priority.

InnoDB uses MVCC to support high concurrency and implements four standard isolation levels. The default level is REPEATABLE READ (repeatable), and the implementation of phantom reading is prevented by the gap lock + MVCC strategy, which makes InnoDB lock not only the rows designed by the query, but also the gaps in the index to prevent phantom rows from being inserted.

Gap lock: when we use range conditions rather than equal conditions to retrieve data and request sharing or exclusive locks, InnoDB will lock the index entries of existing data records that meet the conditions; for records whose key values are within the range of conditions but do not exist, it is called "GAP", and InnoDB will also lock this "gap", this locking mechanism is the so-called gap lock (Next-Key lock).

Reference: gap lock (Next-Key lock)

The primary index is a clustered index, and the data is saved in the index, so as to avoid reading the disk directly, so the query performance is greatly improved.

Many optimizations have been made within InnoDB, including predictability pre-reading when reading data from disk, adaptive hash indexes that automatically create hash indexes in memory for acceleration operations, and insert buffers that speed up insert operations.

MyISAM storage engine

In mysql5.1 and previous versions, MyISAM is the default storage engine. MyISAM provides a number of features, including full-text indexing, compression, spatial functions, etc., but does not support transactions and row-level locks, and there is no doubt that it cannot be safely recovered after a crash.

The MyISAM engine can still be used for read-only data, or for tables that are small enough to tolerate repair operations.

When creating a MyISAM table, if the DELAY_KEY_WRITE option is specified, the modified index data will not be written to disk immediately when the modification is completed, but to the key buffer in memory, and the corresponding index block will be written to disk only when the key buffer is cleaned or the table is closed. This approach can greatly improve write performance, but it can cause index corruption in the event of a database or host crash and need to be repaired.

Compare

Transactions: InnoDB supports transactions, but MyISAM does not support transactions.

Lock granularity: InnoDB supports table-level locks and row-level locks, while MyISAM only supports table-level locks.

Foreign keys: InnoDB supports foreign keys.

Backup: InnoDB supports hot backup, but requires tools.

Crash recovery: the probability of damage after a MyISAM crash is much higher than that of InnoDB, and the recovery rate is slow.

Other features: MyISAM supports full-text indexing, compression, spatial functions and other features.

Type of backup

Cold backup (cold backup): mysql service needs to be turned off, and read and write requests are not allowed.

Warm backup: the service is online, but only read requests are supported, but write requests are not allowed

Hot backup (hot backup): while backing up, the business will not be affected.

Indexes

An index (also known as a "key") is a data structure used by the storage engine to quickly find records.

B-Tree index

This index is supported by most mysql engines.

Although the term "B-Tree" is used, different storage engines may use different storage structures, and the NDB cluster storage engine that actually uses TmurTree focus InnoDB uses B+Tree.

The B-Tree index can speed up the access to data, because the storage engine does not need to perform a full table scan to get the required data, but instead starts searching from the root node of the index, so the search speed is much faster.

B-Tree stores index columns sequentially, which is suitable for finding scope data. Because the index tree is ordered, it can also be used for sorting and grouping in addition to user lookups.

You can specify multiple columns as index columns, and multiple index columns together form an index key. B-Tree indexes are suitable for full key values, range of key values, or key prefix lookups, where key prefix lookups only apply to leftmost prefix lookups. The search must start with the leftmost column of the index.

The data structure B-Tree of B-Tree index

In order to describe B-Tree, we first define a data record as a binary [key,data], with key as the key value of the record. For different data records, key is different, and data records data except key.

All nodes have the same depth, that is, the B-Tree is balanced.

The key in a node is arranged non-decreasing from left to right.

If the left and right key of a pointer is keyi and keyi+1, respectively, and not null, then all key of the pointer to the node is greater than or equal to keyi and less than or equal to keyi+1.

Search algorithm: first, binary search is carried out in the root node, and if found, the data of the corresponding node is returned, otherwise the node pointed to by the pointer in the corresponding interval is searched recursively.

Because inserting and deleting new data records will destroy the nature of B-Tree, it is necessary to split, merge, rotate and so on the tree to maintain the B-Tree property.

B+Tree

Compared with B-Tree, B+Tree has the following characteristics:

The upper limit of the pointer per node is 2d instead of 2d+1 (d is the degree of B-Tree).

Internal nodes do not store data, only external key; nodes do not store pointers.

B+Tree with sequential access pointers

The B+Tree structures commonly used in database systems or file systems are optimized on the basis of classical B+Tree, adding sequential access pointers.

The purpose of this optimization is to provide interval access performance, for example, if you want to query all records with key 18 to 49 in the figure.

advantage

Balanced trees such as red-black trees can also be used to implement indexes, but file systems and database systems generally use B-Tree as the index structure for the following two reasons:

Better retrieval times: the time complexity of the balanced tree to retrieve data is equal to the tree height h, while the tree height is approximately O (h) = O (logN), where d is the output of each node. The output of red and black trees is 2, while the output of B-Tree is generally very large, and the height of red and black trees is obviously much higher than that of B-Tree, so the number of searches is more. B+Tree is more suitable for out-of-memory indexes than B-Tree, because the data domain is removed from the nodes in B+Tree, so it can have a greater output, and the retrieval efficiency will be higher.

Make use of the computer pre-reading feature: in order to reduce the disk I / O, the disk is often not strictly read on demand, but pre-read every time. The theory of this is based on the famous locality principle in computer science: when a data is used, the data near it is usually used immediately. In the process of pre-reading, the disk is read sequentially, which does not need to seek the disk, and only needs a very short rotation time, so the speed will be very fast. The operating system generally divides memory and disk into solid-sized blocks, each of which is called a page, and the memory and disk exchange data on a page-by-page basis. The database system sets the size of one node of the index to the size of the page, so that a single Istroke O can be fully loaded into a node, and the pre-read feature can be used, and the adjacent nodes can also be preloaded.

Reference: data structure and algorithm principle behind MySQL index

Hash indexing

The InnoDB engine has a special feature called "adaptive hash index". When an index value is used very frequently, a hash index is created on top of the B+Tree index, which makes the B+Tree index have some of the advantages of a hash index, such as fast hash lookup.

The hash index can be searched at O (1) time, but it loses its order, and it has the following limitations:

The hash index contains only hash values and row pointers, but does not store field values, so you cannot use the values in the index to avoid all rows.

Cannot be used for sorting and grouping.

Only precise search is supported, and can not be used for partial search and range search.

When a hash conflict occurs, the storage engine must traverse all row pointers in the linked list.

Spatial data Index (R-Tree)

The MyISAM table supports spatial indexes and can be used as a geographic data store. Spatial indexes index data from all dimensions, and queries can be combined according to any dimension.

You must use Mysql's GIS-related functions such as MBRONTAINS () to maintain the data.

Full-text index

A full-text index is a special type of index that looks for keywords in the text rather than directly comparing the values in the index. The lookup criteria use MATCH AGAINST instead of normal WHERE.

Full-text indexing is generally implemented using an inverted sort index, which records the mapping of the document in which the keyword expires.

The MyISAM storage engine supports full-text indexing, and the InnoDB storage engine also supports full-text indexing in Mysql version 5.6.4.

Advantages of indexing

Greatly reduces the number of rows of data that the server needs to scan.

Helps the server avoid sorting and creating temporary tables (B+Tree indexes are ordered and can be used for Order by and group by operations).

Change the random B+Tree O to the sequential I hand O (the index is ordered, that is, all the adjacent data are stored together).

Thank you for reading this article carefully. I hope the article "examples of MYSQL_ multi-version concurrency control, storage engine and index" shared by the editor will be helpful to you. At the same time, I also hope you will support us and pay attention to the industry information channel. More related knowledge is waiting for you to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.