The method of MySQL Optimization and Index

2025-04-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article summarizes MySQL optimization and indexing methods: what an index is, which data structures it can be built on, how MyISAM and InnoDB store it on disk, and a set of related interview questions. The steps are laid out in order and should serve as a practical reference.

A brief introduction to indexes

Nature of the index:

In one sentence, an index in MySQL, as in any other relational database, trades space for time.

The role of the index:

An index is a data structure (stored on disk) that a relational database maintains to speed up the retrieval of rows from a table.

Classification of indexes

Classification by data structure:

HASH index

Equality matches are very efficient

Range lookups are not supported
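The two properties of a hash index can be illustrated with a minimal sketch. It assumes a Python dict as a stand-in for the hash index and a sorted key list as a stand-in for an ordered (tree-style) index; the sample keys and values are made up for illustration.

```python
import bisect

# Toy data: key -> row payload (hypothetical values).
rows = {5: "e", 1: "a", 9: "i", 3: "c", 7: "g"}

# Hash-style index: average O(1) equality lookups, but the keys
# carry no order that a range query could exploit.
hash_index = dict(rows)

def hash_equal(key):
    return hash_index.get(key)

def hash_range(lo, hi):
    # No ordering available, so every entry has to be inspected.
    return sorted(k for k in hash_index if lo <= k <= hi)

# Ordered index: binary search finds the start of the range,
# then neighbouring keys are read sequentially.
sorted_keys = sorted(rows)

def ordered_range(lo, hi):
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[start:end]

print(hash_equal(7))        # g
print(hash_range(3, 7))     # [3, 5, 7]
print(ordered_range(3, 7))  # [3, 5, 7]
```

Both range functions return the same keys, but the hash version can only get there by scanning every entry, which is why a hash index is treated as not supporting range lookups.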

Tree index

Binary search tree: lookups use recursive binary search, with smaller keys on the left and larger keys on the right

Balanced binary tree: a binary search tree kept balanced, mainly through left and right rotations

Drawback 1: too many disk I/O operations, because the tree is tall.

Drawback 2: each I/O is poorly utilized. A node holds a single key, so a disk read fetches far less useful data than it could.

Multiway balanced search tree (B-tree)

Characteristic: each node holds many keys, which greatly reduces the height of the tree.

B+ tree

Characteristic: keys are compared using left-closed intervals

The root and branch nodes have no data area; only the leaf nodes contain one. In other words, even if the search key matches a key in the root or a branch node, the search does not stop there, because those nodes hold no data. It always continues down to a leaf node.

For example, when we search for 13, the key may already appear in the root node and in a branch node, but the search still descends until it reaches the leaf node.
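The search for 13 can be sketched roughly as follows. The tree layout, keys, and row values here are hypothetical, not taken from the article's figure, and the node classes are simplified for illustration (no insertion or splitting logic).

```python
import bisect

class Leaf:
    def __init__(self, keys, values):
        self.keys, self.values = keys, values

class Internal:
    def __init__(self, keys, children):
        # keys[i] is the smallest key reachable through children[i];
        # internal nodes hold no row data.
        self.keys, self.children = keys, children

def search(node, key):
    """Return (value, levels_descended); value is None if the key is absent."""
    depth = 0
    while isinstance(node, Internal):
        # Left-closed comparison: child i covers [keys[i], keys[i+1]).
        i = max(bisect.bisect_right(node.keys, key) - 1, 0)
        node = node.children[i]
        depth += 1
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i], depth
    return None, depth

# Hypothetical three-level tree; 13 appears in the root and in a branch node.
root = Internal([1, 13], [
    Internal([1, 7], [Leaf([1, 3], ["r1", "r3"]), Leaf([7, 10], ["r7", "r10"])]),
    Internal([13, 20], [Leaf([13, 15], ["r13", "r15"]), Leaf([20, 25], ["r20", "r25"])]),
])

print(search(root, 13))  # ('r13', 2)
print(search(root, 14))  # (None, 2)
```

Even though 13 matches a key in the root and in a branch node, the value is only returned after two levels have been descended and a leaf has been reached.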

Comparison of the binary tree, balanced binary tree, and B-tree:

As shown in the figure, with an auto-increment primary key:

A plain binary tree is obviously unsuitable as a relational database index: every new key goes to the right, so the tree degenerates into a linked list and a lookup is no better than a full table scan.

A balanced binary tree solves this problem, but the tree is still thin and tall, which leads to the two drawbacks mentioned above: too many I/O operations and low I/O utilization.

The B-tree solves both of these problems. The following sections explain why MySQL nevertheless uses the B+ tree, and what enhancements the B+ tree makes.
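The height comparison can be made concrete with a small sketch. The key count (1000) and the fanout of 100 are assumptions picked for illustration, not measured MySQL values, and `min_levels` is a deliberately rough model that ignores how real B-trees fill their nodes.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_height(keys):
    """Insert keys into a plain, unbalanced binary search tree; return its height."""
    root, height = None, 0
    for key in keys:
        if root is None:
            root, height = Node(key), 1
            continue
        node, depth = root, 1
        while True:
            depth += 1  # level the new key would occupy below `node`
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(key))
                break
            node = child
        height = max(height, depth)
    return height

def min_levels(n, fanout):
    """Fewest levels needed to index n keys if every node fans out
    to `fanout` children (rough model)."""
    levels, capacity = 1, fanout
    while capacity < n:
        capacity *= fanout
        levels += 1
    return levels

n = 1000
print(bst_height(range(1, n + 1)))  # 1000: one level per key
print(min_levels(n, 2))             # 10
print(min_levels(n, 100))           # 2
```

With sequential keys the plain binary tree needs one level per key, a balanced binary layout needs about ten levels, and a wide multiway node brings this down to two. Since each level visited typically costs one disk read, tree height is the number that matters.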

Comparison between B-tree and B+ tree:

Improvements of the B+ tree over the B-tree:

I/O is more efficient. Every node of a B-tree keeps a data area, while the non-leaf nodes of a B+ tree do not, so each node read from disk holds more keys. If a query has to traverse three levels to find a row, the B+ tree clearly consumes less I/O on the way down.

Range searches are more efficient. As shown in the figure, the leaf nodes of a B+ tree form a natural linked list, so a range query only needs to locate the starting leaf and then follow the chain.

Index-based data scans are more efficient.
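The range search along the leaf chain might look like the following sketch. The leaf contents are hypothetical, and the descent from the root that locates the starting leaf is assumed to have happened already.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # pointer to the right sibling leaf

def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]

def range_scan(start_leaf, lo, hi):
    """Collect keys in [lo, hi], starting at the leaf a tree descent for `lo` would reach."""
    result, leaf = [], start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:
                return result  # past the range: stop following the chain
            if key >= lo:
                result.append(key)
        leaf = leaf.next
    return result

head = link([Leaf([1, 3]), Leaf([7, 10]), Leaf([13, 15]), Leaf([20, 25])])
start = head.next  # the leaf that would be reached when searching for 8
print(range_scan(start, 8, 20))  # [10, 13, 15, 20]
```

Once the starting leaf is known, the scan never goes back up the tree; it only follows `next` pointers until a key exceeds the upper bound.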

Classification of index types

Index types can be divided into two categories:

Primary key index

Auxiliary index (secondary index), which includes:

Unique index

Composite index

Ordinary index

Covering index

The primary key index generally performs best, but in SQL optimization most of the work consists of improving and adding secondary indexes.

How the B+ tree is implemented at the storage engine level

We create two tables: test_innodb (using InnoDB as the storage engine) and test_myisam (using MyISAM as the storage engine). The following figure shows the files each table produces on disk. The two storage engines persist the B+ tree to disk in completely different ways.

The B+ tree in MyISAM:

The *.frm file holds the table definition, for example the types of the id and name fields in this table

The *.MYD file (D = data) stores the data

The *.MYI file (I = index) stores the indexes

For example, when the following SQL statement is executed, MyISAM first finds 103 in test_myisam.MYI and obtains the address 0x194281, then goes to test_myisam.MYD to fetch the row at that address and returns it.

SELECT id, name FROM test_myisam WHERE id = 103;

If id is the primary key index and name is also indexed in the test_myisam table, test_myisam.MYI holds two B+ trees side by side, each of whose leaf nodes points to a row address in the .MYD file. As a result, the MyISAM engine makes no structural distinction between primary and secondary indexes. Because every index locates rows on its own, MyISAM can use multiple indexes in one query; InnoDB organizes its indexes differently.
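The MyISAM lookup path can be modelled in a few lines. This is only a sketch: a list stands in for the .MYD data file, list positions stand in for disk addresses, dicts stand in for the B+ trees in the .MYI file, and the rows other than id 103 and the index name `idx_name` are invented for illustration.

```python
# .MYD stand-in: rows stored at "addresses" (list positions here).
myd = [
    {"id": 101, "name": "lisi"},
    {"id": 103, "name": "zhangsan"},
    {"id": 105, "name": "wangwu"},
]

# .MYI stand-in: one structure per index, each mapping a key to an
# address in the data file. A dict replaces the real B+ tree for brevity.
myi = {
    "PRIMARY": {row["id"]: addr for addr, row in enumerate(myd)},
    "idx_name": {row["name"]: addr for addr, row in enumerate(myd)},
}

def myisam_lookup(index_name, key):
    addr = myi[index_name].get(key)  # step 1: index file gives an address
    if addr is None:
        return None
    return myd[addr]                 # step 2: data file gives the row

print(myisam_lookup("PRIMARY", 103))        # {'id': 103, 'name': 'zhangsan'}
print(myisam_lookup("idx_name", "wangwu"))  # {'id': 105, 'name': 'wangwu'}
```

Both indexes have the same shape and both point straight at row addresses, which is what is meant by the primary and secondary indexes being peers in MyISAM.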

The B+ tree in InnoDB:

Unlike MyISAM, which stores data in a separate .MYD file, InnoDB stores row data directly in the data area of the primary key index's leaf nodes. Each leaf entry holds the complete row for its id value.

InnoDB primary key index and secondary index relationship

We now execute the following SQL statement. InnoDB first searches the secondary index on name, reads the primary key value (103) stored in its leaf node, and then goes back to the table: a second lookup in the primary key index using 103 fetches the row to return.

SELECT id, name FROM test_innodb WHERE name = 'zhangsan';
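The two-step lookup described for this statement can be sketched as below. Dicts stand in for the two B+ trees; the `city` column and all rows except id 103 are hypothetical, and names are assumed unique to keep the model small.

```python
# Clustered (primary key) index: primary key -> full row.
clustered = {
    101: {"id": 101, "name": "lisi", "city": "shanghai"},
    103: {"id": 103, "name": "zhangsan", "city": "beijing"},
}

# Secondary index on name: name -> primary key value (not a disk address).
secondary_name = {row["name"]: pk for pk, row in clustered.items()}

def lookup_by_name(name):
    pk = secondary_name.get(name)  # step 1: search the secondary index
    if pk is None:
        return None
    return clustered[pk]           # step 2: go back to the primary key index

print(lookup_by_name("zhangsan"))
# {'id': 103, 'name': 'zhangsan', 'city': 'beijing'}
```

The secondary index only ever yields 103; the row itself comes from the second lookup in the clustered index.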

This raises a question: why does InnoDB not record the disk address directly in the secondary index, as MyISAM does, instead of taking the extra step of going back to the primary key index? The question is answered in the interview questions below.

Relevant interview questions

Why does MySQL choose the B+ tree as its index structure?

This has already been covered in the sections above.

What is the difference between how MyISAM and InnoDB implement the B+ tree?

To sum up, MyISAM persists a table in three kinds of files: the .frm file holds the table definition, the .MYD file (D = data) stores the data, and the .MYI file (I = index) stores the indexes. In MyISAM the primary key index and the secondary indexes are peers, and a query may use multiple indexes. InnoDB is the opposite: it strictly distinguishes the primary key index from secondary indexes, and a statement generally either uses one index or none.

How do you determine whether a SQL statement uses an index?

Check the execution plan by putting EXPLAIN (or DESC) in front of the SQL statement.

SET GLOBAL optimizer_trace='enabled=on' turns on optimizer tracing. MySQL then records how it arrived at the execution plan of each traced query in the OPTIMIZER_TRACE table of the information_schema database.
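MySQL's EXPLAIN needs a running server, so the sketch below swaps in a different engine: Python's built-in sqlite3 module and SQLite's `EXPLAIN QUERY PLAN`. The output format is SQLite's, not MySQL's, and the table `test_tbl`, the `city` column, and the index `idx_name` are made up for illustration. It only demonstrates the general idea of reading a plan to see whether an index is chosen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_tbl (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_name ON test_tbl (name)")

def plan(sql, params=()):
    """Return the human-readable plan text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text

indexed = plan("SELECT id, name FROM test_tbl WHERE name = ?", ("zhangsan",))
unindexed = plan("SELECT id, name FROM test_tbl WHERE city = ?", ("beijing",))

print(indexed)    # mentions idx_name
print(unindexed)  # a table scan, no index named
```

In MySQL the equivalent check is to look at the key column of the EXPLAIN output, which names the index the optimizer picked or is NULL when none is used.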

Why is an auto-increment column the best choice for the primary key index?

With an auto-increment column, new rows always carry the largest key, so the index tree only grows on its right side when data is inserted and changes to the rest of the tree are relatively small.
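A rough way to see the difference is to count how many existing entries sit to the right of each insertion point. The sketch below assumes a sorted Python list as a stand-in for the ordered index and 1000 keys as an arbitrary sample size; real B+ trees split pages rather than shift a flat array, so the numbers are only indicative.

```python
import bisect
import random

def shifted_entries(keys):
    """Insert keys one by one into a sorted list and count the existing
    entries to the right of each insertion point, which must make room."""
    ordered, shifted = [], 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        shifted += len(ordered) - pos
        ordered.insert(pos, key)
    return shifted

sequential = list(range(1, 1001))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)  # fixed seed for repeatability

print(shifted_entries(sequential))  # 0: every insert lands at the right edge
print(shifted_entries(shuffled))    # a large positive count
```

Sequential keys never disturb existing entries, while random keys keep landing in the middle of the structure.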

Why is an index not recommended for columns that change frequently?

For the same reason as in the previous question: when an indexed column changes frequently, the index tree has to be restructured frequently.

Why is indexing not recommended for columns with many repeated values?

The reason is selectivity (the discreteness of the values). Take a table with 1 million rows in which one field represents gender, with 0 for male and 1 for female. If this field is indexed, the index tree is filled with duplicate keys. The purpose of an index is to narrow the range of rows a query has to examine as much as possible, and a column with only two distinct values cannot do that.
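Discreteness can be put into numbers as the ratio of distinct values to total rows. The sketch below uses 100,000 generated rows as an assumed sample (smaller than the 1 million in the text, only to keep it light) and a hypothetical helper named `selectivity`.

```python
def selectivity(values):
    """Distinct values divided by total rows: close to 1.0 is highly
    selective, close to 0 means the column is mostly duplicates."""
    return len(set(values)) / len(values)

n = 100_000
gender = [i % 2 for i in range(n)]  # 0 = male, 1 = female
user_id = list(range(n))            # unique per row

print(selectivity(gender))   # 2e-05
print(selectivity(user_id))  # 1.0
```

An equality condition on the gender column still leaves about half of the table to examine, so an index on it narrows almost nothing.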

What is a composite index?

A composite index (also called a joint index) is an index that contains multiple columns. It is a single index, not several.

By the same token, a single-column index is a special case of a composite index.

A composite index should be created with the leftmost prefix principle in mind. When ordering its columns, give priority to the most frequently used columns, then to the most selective columns, then to the columns with the smallest footprint.
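The leftmost matching rule can be sketched as a small function. This is a simplified model that only considers equality conditions; it ignores range conditions and any optimizer features that relax the rule. The index columns `name`, `age`, `city` are hypothetical.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading columns of a multi-column index that a set of
    equality conditions can use, stopping at the first gap."""
    prefix = []
    for col in index_columns:
        if col not in equality_columns:
            break
        prefix.append(col)
    return prefix

idx = ["name", "age", "city"]  # hypothetical index on (name, age, city)

print(usable_prefix(idx, {"name", "age"}))  # ['name', 'age']
print(usable_prefix(idx, {"name", "city"})) # ['name']
print(usable_prefix(idx, {"age", "city"}))  # []
```

A query that skips the first column cannot use the index for searching at all in this model, which is why the most frequently filtered column is placed first.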

What is a covering index?

When every column a query needs can be returned directly from the entries of an index, that index is called a covering index for the query. Put plainly, no table return operation is needed: the required data is read straight from the secondary index.
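Whether an index can answer a query without a table return can be checked with a simple set comparison. The sketch relies on the fact that an InnoDB secondary index entry also carries the primary key value; the function name `is_covered` and the `city` column are hypothetical.

```python
def is_covered(selected, filtered, index_columns, primary_key):
    """True if every column the query touches is available in the
    secondary index entry (its own columns plus the primary key)."""
    available = set(index_columns) | set(primary_key)
    return set(selected) | set(filtered) <= available

# Secondary index on (name); primary key is (id).
print(is_covered(["id", "name"], ["name"], ["name"], ["id"]))          # True
print(is_covered(["id", "name", "city"], ["name"], ["name"], ["id"]))  # False
```

The second query asks for `city`, which the index entry does not hold, so the row has to be fetched from the primary key index.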

What is the ICP mechanism?

ICP stands for Index Condition Pushdown. Put simply, when a WHERE clause filters on several conditions, the storage engine evaluates the conditions that involve only index columns while it scans the index, before reading the full rows. Searching and filtering are done in one pass over the index, which reduces I/O.
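The saving can be illustrated by counting row fetches with and without the pushdown. The sketch assumes a hypothetical index on (name, age), a condition equivalent to `name LIKE 'zhang%' AND age = 20`, and five invented rows; it models the idea only, not MySQL's actual executor.

```python
# Full rows, keyed by primary key (hypothetical data).
table = {
    1: {"id": 1, "name": "zhangsan", "age": 20},
    2: {"id": 2, "name": "zhangwei", "age": 30},
    3: {"id": 3, "name": "zhangli", "age": 20},
    4: {"id": 4, "name": "zhangmin", "age": 40},
    5: {"id": 5, "name": "lisi", "age": 20},
}

# Index on (name, age): sorted (name, age, primary key) entries.
index_entries = sorted((r["name"], r["age"], pk) for pk, r in table.items())

def query(use_icp):
    """Return (matching ids, number of full-row fetches)."""
    result, row_fetches = [], 0
    for name, age, pk in index_entries:
        if not name.startswith("zhang"):
            continue  # prefix condition, resolved from the index
        if use_icp and age != 20:
            continue  # pushed-down condition, checked on the index entry
        row = table[pk]  # fetch the full row
        row_fetches += 1
        if row["age"] == 20:  # without ICP this filter runs after the fetch
            result.append(row["id"])
    return result, row_fetches

print(query(use_icp=False))  # ([3, 1], 4)
print(query(use_icp=True))   # ([3, 1], 2)
```

Both runs return the same rows, but with the age check done on the index entry only two full rows are read instead of four.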

True or false: an InnoDB table can have no primary key. Why?

The statement is true in the sense that the primary key need not be declared, but InnoDB always builds a clustered index. There are three cases:

When you explicitly declare a field as the primary key, that field is used as the clustered index.

When no primary key is explicitly declared, there are two cases:

InnoDB looks for the first unique key (UK) whose columns are all NOT NULL and uses it as the clustered index.

If neither a primary key nor a suitable unique key exists, InnoDB uses a hidden 6-byte row ID as the clustered index.
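The three situations amount to a fallback chain, which can be written down as a short decision function. The function name and its return labels are hypothetical, and the unique keys passed in are assumed to be the usable ones in declaration order.

```python
def choose_clustered_key(primary_key=None, unique_keys=()):
    """Pick what the clustered index is organized by, following the
    three situations in order."""
    if primary_key:
        return ("primary key", primary_key)
    if unique_keys:
        return ("first unique key", unique_keys[0])
    return ("hidden row id", None)

print(choose_clustered_key(primary_key="id"))               # ('primary key', 'id')
print(choose_clustered_key(unique_keys=["email", "phone"])) # ('first unique key', 'email')
print(choose_clustered_key())                               # ('hidden row id', None)
```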

What is a table return operation?

In InnoDB, when the columns a query needs cannot all be obtained from the secondary index, a second lookup in the primary key index is required. This second lookup is called the table return operation.

Why do the leaf nodes of an InnoDB secondary index record the primary key value instead of a disk address, as MyISAM does?

The reason is simple: the data structure of the primary key index changes frequently. Suppose the secondary indexes recorded disk addresses and the table had 10 of them. Every time the primary key index was restructured and rows moved, all 10 secondary indexes would have to be updated one by one. Such restructuring is common, since any insert or delete may affect the data structure of the primary key index.
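The maintenance cost in this argument can be estimated with a toy model. It assumes rows are located by a (page, slot) pair, that a page split moves two of four rows to a new page, and that there are 10 secondary indexes as in the example above; every name and value is hypothetical.

```python
def split_page(locations, moved_pks, new_page):
    """Return a new pk -> (page, slot) mapping after moving rows to a new page."""
    updated = dict(locations)
    for slot, pk in enumerate(moved_pks):
        updated[pk] = (new_page, slot)
    return updated

def entries_to_rewrite(secondary, old_loc, new_loc, stores_address):
    """Count entries of one secondary index that go stale when rows move."""
    if not stores_address:
        return 0  # entries hold primary keys, which did not change
    return sum(1 for pk in secondary.values() if old_loc[pk] != new_loc[pk])

old_loc = {1: ("p1", 0), 2: ("p1", 1), 3: ("p1", 2), 4: ("p1", 3)}
new_loc = split_page(old_loc, moved_pks=[3, 4], new_page="p2")
secondary = {"a": 1, "b": 2, "c": 3, "d": 4}  # indexed value -> primary key
n_secondary_indexes = 10

print(n_secondary_indexes * entries_to_rewrite(secondary, old_loc, new_loc, True))   # 20
print(n_secondary_indexes * entries_to_rewrite(secondary, old_loc, new_loc, False))  # 0
```

With addresses stored, a single split touching two rows forces 20 secondary index updates in this model; with primary key values stored, none are needed, at the price of the extra lookup on reads.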

