What are the knowledge points of index, lock and transaction in MySql 07/03 Update SLTechnology News&Howtos

What are the knowledge points of index, lock and transaction in MySql

2025-07-03 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly shows you "what are the index, lock and transaction knowledge points in MySql". The content is simple and easy to understand and clearly organized. I hope it can help you solve your doubts. Let Xiaobian lead you to study and learn "what are the index, lock and transaction knowledge points in MySql".

The details are as follows:

1. index

An index, similar to a book catalog, allows you to find records immediately based on a page number in the catalog.

Advantages of indexing:

Natural ordering.

Quick search.

Disadvantages of the index:

Take up space.

Slow down the rate at which tables are updated.

Note: Small tables use full table scans faster, medium and large tables use indexes. Superlarge table indexes are mostly invalid.

Indexing can be divided into two types: clustered index and secondary index (also called secondary index or non-clustered index).

Functionally speaking, it can be divided into 6 types: ordinary index, unique index, primary key index, composite index, foreign key index, full-text index.

Six indexes in detail:

Normal index: The most basic index, without any constraints.

Unique index: Similar to a regular index, but with a uniqueness constraint.

Primary key index: Special unique index, no null value allowed.

Composite index: An index is created by grouping multiple columns together and covering multiple columns.

Foreign key index: Only tables of InnoDB type can use foreign key index to ensure data consistency, integrity and cascade operation.

Full-text index: MySQL full-text index can only be used for InnoDB, MyISAM, and can only be used for full-text search in English, generally using full-text index engine (ES, Solr).

Note: The primary key is the unique index, but the unique index does not have to be the primary key. The unique index can be empty, but there can only be one null value. The primary key cannot be empty.

In addition, InnoDB clusters data by primary key. If there is no primary key defined and no clustered index defined, MySql will select a unique non-empty index instead. If there is no such index, it will implicitly define a 6-byte primary key as a clustered index, which users cannot view or access.

To put it simply:

When a primary key is set, a unique index is automatically generated, and if there is no clustered index before, the primary key is a clustered index.

When the primary key is not set, a unique index that is not empty is selected as the clustered index, and if not already, an implicit 6-byte index is generated.

MySql stores data in pages. The default page is 16kb. When you query, you will not only load a certain piece of data, but load the page where the data is located into pageCache. This is actually similar to the nearby access principle of OS.

MySql indexes use the B+ tree structure. Before talking about B+ trees, let's talk about B trees. B trees are multi-way balanced search trees. Compared with ordinary binary trees, they will not be extremely unbalanced and are multi-way.

A characteristic of B-trees is that they also store data in non-page child nodes.

See also:

This feature causes non-page child nodes to fail to store a large number of indexes.

B+ Tree is optimized for this. As shown below:

We see that B+ Tree stores all data in leaf nodes, and non-child nodes only store indexes and pointers.

We assume that a non-page child node is 16kb, each index, i.e., primary key, is bigint, i.e., 8b, and pointer is 8b. Then each page can store about 1000 indexes (16kb/ 8b + 8b).

How many indexes can a three-level B+ tree store? As shown below:

It can store about 1 billion indexes. Usually the height of the B+ tree is 2-4 layers, because MySql is running, the root node is resident in memory, so each lookup only needs about 2 - 3 IO times. It can be said that the design of B+ trees is based on the characteristics of mechanical disks.

Knowing the design of the index, we can learn something else:

MySql's primary key cannot be too large, and if UUID is used, non-leaf nodes of the B+ tree will be wasted.

MySql's primary key is best self-increasing, if you use UUID, each insertion will adjust the B+ tree, resulting in page splitting, seriously affecting performance.

So, if the project uses sub-library sub-table, we usually need a primary key for sharding, then what should we do? In the implementation, we can keep the self-increasing primary key, and the logical primary key can be used as a unique index.

2. lock mechanism

Concepts spring up about Mysql locks, but locks actually have several dimensions, so let's explain.

1. type dimension

Shared lock (read lock/ S lock)

Exclusive lock (write lock/ X lock)

Type breakdown:

Intentional shared lock

intentional exclusive lock

Pessimism lock (use lock, i.e. for update)

Optimistic locking (using the version number field, similar to CAS mechanism, i.e. user control. Disadvantages: When concurrency is high, there are many useless retries)

2. granularity of lock (granularity dimension)

table locks

Page lock (Mysql Berkeley DB engine)

Line lock (InnoDB)

3. Lock algorithm (algorithm dimension)

Record Lock

Gap Lock (locks a range but does not contain lock records)

Next-Key Lock (Record Lock + Gap Lock, lock a range, and lock the record itself, MySql prevents phantom reading, is to use this lock implementation)

4. Default read operation, locked?

The default is the MVCC mechanism ("Consistent Unlocked Read"), which guarantees isolation correctness at RR level and is unlocked.

You can choose manual locking: select xxxx for update; select xxxx lock in share mode, called "consistent lock read."

After using the lock, you can avoid phantom reading at RR level. Of course, the default MVCC reading can also avoid phantom reading.

Since RR can prevent phantom reading, what is the use of SERIALIZABLE?

Prevent missing updates. For example:

At this point, we must use SERIALIZABLE level for serial reads.

Finally, row locks are implemented by locking clustered indexes. If you do not hit the index correctly when querying, the MySql optimizer will discard row locks and use table locks.

3. Affairs

Transaction is the eternal topic of database, ACID: atomicity, consistency, isolation, persistence.

Four characteristics, the most important of which is consistency. Consistency is guaranteed by atomicity, isolation, and persistence.

Atomicity is guaranteed by Undo log. Undo Log saves records prior to each change, allowing for rollback in case of errors.

Isolation is guaranteed by MVCC and Lock. This is the last part.

Persistence is guaranteed by Redo Log. Every time before the data is actually modified, the record will be written to Redo Log. Only when Redo Log is successfully written, it will be truly written to B+ tree. If power is cut before submission, the record can be restored through Redo Log.

Then we talk about isolation.

Isolation Level:

Read Not Submitted (RU)

Submitted Read (RC)

repeatable reading (RR)

serializable (serializable)

Each level solves a different problem, usually three: dirty reading, unrepeatable reading, and phantom reading. A classic picture:

There is a note here about phantom reads, in the database specification, RR level will lead to phantom reads, but due to Mysql optimization, RR level of MySql will not lead to phantom reads: when using the default select, MySql uses MVCC mechanism to ensure that there will be no phantom reads; you can also use locks, when using locks, such as for update (X lock), lock in share mode (S lock), MySql will use Next-Key Lock to ensure that phantom reads do not occur. The former is called a snapshot read and the latter is called a current read.

Principle analysis:

RU dirty read occurs: RU principle is to lock the row record of each update statement, rather than locking the entire transaction, so dirty read will occur. RC and RR lock the entire transaction.

RC cannot read repeatedly: RC generates a new Read View every time it executes a SQL statement, and each time it reads is different. RR transactions use the same Read View from start to finish.

RR does not occur illusion reading reasons: as mentioned above.

What is the difference between RR and Serialize? A: Lost updates. The lock section of this article has already been mentioned.

MVCC: Full name for multiversion concurrency control.

innoDB Each clustered index has 4 hidden fields, which are the primary key (RowID), the most recently changed transaction ID (MVCC core), the pointer to Undo Log (isolation core), and the index deletion flag (when deleted, it will not be deleted immediately, but marked and then deleted asynchronously);

Essentially, MVCC is implemented using Undo Log lists.

MVCC implementation: The transaction modifies the original data in an exclusive lock manner, stores the data before modification in the Undo Log, and associates it with the data through a rollback pointer. If the modification succeeds, nothing is done. If the modification fails, the data in the Undo Log is restored.

By the way, we usually think of MVCC as an optimistic lock-like approach, i.e., using version numbers, when in fact, innoDB is not implemented that way. Of course, this does not affect our use of MySql.

The above is "MySql index, lock, transaction knowledge points what" all the content of this article, thank you for reading! I believe that everyone has a certain understanding, hope to share the content to help everyone, if you still want to learn more knowledge, welcome to pay attention to the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.