Detailed explanation of the Index of MySQL

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

1. Index basics

1.1 Introduction

In MySQL, an index (also called a "key") is a data structure that the storage engine uses to find records quickly.

Indexes are critical to good performance, and their impact grows as the amount of data in a table increases.

Index optimization is probably the most effective way to improve query performance, and creating a truly optimal index often requires rewriting the SQL queries as well.

1.2 How an index works

The easiest way to understand how an index works in MySQL is to think of the index of a book: to find a topic, you first look it up in the index, read off the chapter and page number, and then turn straight to that page.

The storage engine uses an index in a similar way: it first looks up the value in the index, then uses the matching index record to locate the corresponding data row, and finally returns the result set to the client.

1.3 Types of index

In MySQL, we usually refer to the following types of indexes:

Regular index

A regular index (INDEX or KEY), also called an ordinary index, simply improves query efficiency. A table can have multiple regular indexes. This is the most commonly used index type; when the type of an index is not stated explicitly, a regular index is meant.

Primary key index

Primary key index (Primary Key), usually just called the primary key. It improves query efficiency and enforces uniqueness. A table can have only one primary key. A column marked AUTO_INCREMENT must be indexed and is usually the primary key, but a primary key does not have to be auto-incrementing. The primary key is generally defined on a column with no business meaning (such as an ID number), and its data type is preferably numeric.

Unique index

Unique index (Unique Key), which can improve query efficiency and provide uniqueness constraints. There can be multiple unique indexes in a table.

Full-text index

Full-text index (Full Text) improves the efficiency of full-text search, although in practice Sphinx is often used instead. However, Sphinx does not support Chinese retrieval. Coreseek is a full-text search engine that does support Chinese; it is essentially Sphinx with Chinese word segmentation added. In actual projects we use Coreseek.

Foreign key index

Foreign key index (Foreign Key), usually just called a foreign key. It can improve query efficiency, and a foreign key is automatically associated with the corresponding primary key of another table. The main purpose of foreign keys is to guarantee the consistency and integrity of records.

Note: only tables that use the InnoDB storage engine support foreign keys. If no index name is specified for the foreign key column, one is generated automatically. To delete a record in a parent table (such as a category table), you must first delete the corresponding records in the child table (the table that holds the foreign key, such as an article table), or an error occurs. When creating a table, you can declare a foreign key on a column, for example foreign key (cate_id) references cms_cate (id). Because foreign keys carry a performance cost, using them is not recommended; instead, apply the idea behind foreign keys to keep the data consistent and complete.

1.4 Index methods

In MySQL, indexing is implemented at the storage engine layer, not at the server layer. The indexing methods supported by MySQL can also be described as the type of index (in a broad sense). There are mainly the following:

B-Tree index

When an index is mentioned without a type, it is usually a B-Tree index. Different storage engines use B-Tree indexes in different ways, and performance varies accordingly. For example, MyISAM uses prefix compression to make the index smaller, whereas InnoDB stores index values in their original format. As another example, MyISAM references the indexed row by the physical location of the data, whereas InnoDB references it by primary key.

A B-Tree stores the indexed columns in sorted order, so it is well suited to range lookups. It speeds up data access because the storage engine no longer needs a full table scan to find the rows it needs.

If an index contains the values of multiple fields (columns), it is a composite index. The composite index sorts multiple field values based on the order of the columns when the index was created. As follows:

create table people (
    id int unsigned not null auto_increment primary key comment 'primary key id',
    last_name varchar(20) not null default '' comment 'last name',
    first_name varchar(20) not null default '' comment 'first name',
    birthday date not null default '1970-01-01' comment 'birth date',
    gender tinyint unsigned not null default 3 comment 'gender: 1 male, 2 female, 3 unknown',
    key (last_name, first_name, birthday)
) engine=innodb default charset=utf8;

The following data has also been inserted into the people table:

id   last_name   first_name   birthday     gender
1    Clinton     Bill         1970-01-01   3
2    Allen       Cuba         1960-01-01   3
3    Bush        George       1970-01-01   3
4    Smith       Kim          1970-01-01   3
5    Allen       Cally        1989-06-08   3
...

We created a composite index key (last_name, first_name, birthday) that contains the values of the last name, first name, and date of birth for each row of data in the table. The index is also sorted and stored in this order, and if two people have the same last name and first name, the index is sorted and stored according to their date of birth.
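The sort order described above can be sketched in a few lines of Python. This is only a model, not how InnoDB stores data: the index is represented as a sorted list of tuples, the helper name build_index is made up for illustration, and Python's plain string comparison stands in for the MySQL collation.

```python
# Rows from the example people table: (id, last_name, first_name, birthday).
rows = [
    (1, "Clinton", "Bill", "1970-01-01"),
    (2, "Allen", "Cuba", "1960-01-01"),
    (3, "Bush", "George", "1970-01-01"),
    (4, "Smith", "Kim", "1970-01-01"),
    (5, "Allen", "Cally", "1989-06-08"),
]


def build_index(rows):
    """Model key (last_name, first_name, birthday): one entry per row,
    ordered by the key columns in the order they were declared.
    The row id is carried along so the row can be found afterwards."""
    return sorted((last, first, born, row_id) for row_id, last, first, born in rows)


index = build_index(rows)
for entry in index:
    print(entry)
```

The two Allen entries end up next to each other and are ordered by first name; birthday would only decide the order if last name and first name were both equal.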

B-Tree indexes are suitable for lookups by full key value, by key range, or by key prefix; key prefix lookups work only on the leftmost prefix of the index.

The composite index is valid for the following types of queries:

Full value matching

Full-value matching refers to matching with all columns in the index. For example: find someone whose last name is Allen, first name is Cuba, and date of birth is 1960-01-01.

The SQL statement is:

select id, last_name, first_name, birthday from people where last_name = 'Allen' and first_name = 'Cuba' and birthday = '1960-01-01';

Match leftmost prefix

For example, use only the first column of the index to find all people with the last name Allen. The SQL statement is:

select id, last_name, first_name, birthday from people where last_name = 'Allen';

Match column prefix

For example, match only the beginning of the value of the first column of the index, and find all people whose last names start with A. The SQL statement is:

select id, last_name, first_name, birthday from people where last_name like 'A%';

Match a range of values

For example, find people whose last name falls between Allen and Clinton. The SQL statement is:

select id, last_name, first_name, birthday from people where last_name between 'Allen' and 'Clinton';

Only the first column of the index is used here.

Exact match on the first column and range match on the next column

For example, find people whose last name is Allen and whose first name starts with the letter C. That is, the first column of the composite index is matched exactly and the second column is matched by range. The SQL statement is:

select id, last_name, first_name, birthday from people where last_name = 'Allen' and first_name like 'C%';

Query that accesses only the index

A B-Tree index can usually support "index-only queries", that is, queries that need to read only the index and not the data rows. This is related to the "covering index" optimization, which will be discussed later.

Here are some situations in which composite indexes fail:

(1) The index cannot be used if the search does not start with the leftmost column of the composite index. In the example above, the index cannot be used to find people whose first name is Cuba or who were born on a particular date, because neither column is the leftmost column of key (last_name, first_name, birthday). Similarly, it cannot be used to find people whose last name ends with a particular letter: a like pattern that begins with the % wildcard cannot use the index.

(2) If a column of the index is skipped in the search, only the index columns before the gap are used, and the columns after it are not. For example, to find people whose last name is Allen and who were born on a particular date, MySQL can use only the first column of the composite index (last_name), because no first name (first_name) is specified.

(3) If the query contains a range condition on one column, none of the columns to its right can be optimized by the index. For example, with the condition where last_name = 'Allen' and first_name like 'C%' and birthday = '1992-10-25', the query can use only the first two columns of the index, because the like here is a range condition. If the column in the range condition has only a limited number of possible values, you can replace the range condition with multiple equality conditions so that the columns to its right can also use the index.
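The leftmost-prefix rule behind these three cases can be illustrated with a small Python model. It is a sketch under simplifying assumptions: the index is a sorted list of tuples searched with the standard bisect module, and seek_prefix and scan_all are hypothetical helper names, not MySQL functions. The second value each helper returns is the number of index entries it had to read.

```python
from bisect import bisect_left

# Index entries (last_name, first_name, birthday, id), kept in key order.
index = sorted([
    ("Clinton", "Bill", "1970-01-01", 1),
    ("Allen", "Cuba", "1960-01-01", 2),
    ("Bush", "George", "1970-01-01", 3),
    ("Smith", "Kim", "1970-01-01", 4),
    ("Allen", "Cally", "1989-06-08", 5),
])


def seek_prefix(index, prefix):
    """Binary-search to the first entry >= prefix, then read entries while
    their leading columns equal the prefix. This only works when `prefix`
    covers the leftmost key columns. Returns (matching ids, entries read)."""
    pos = bisect_left(index, prefix)
    ids, read = [], 0
    while pos < len(index):
        read += 1
        if index[pos][:len(prefix)] != prefix:
            break  # first entry past the matching run: stop reading
        ids.append(index[pos][3])
        pos += 1
    return ids, read


def scan_all(index, predicate):
    """No usable leftmost prefix: every entry has to be examined."""
    ids = [entry[3] for entry in index if predicate(entry)]
    return ids, len(index)


# Leftmost column only: last_name = 'Allen'.
print(seek_prefix(index, ("Allen",)))
# Two leftmost columns: last_name = 'Allen' and first_name = 'Cuba'.
print(seek_prefix(index, ("Allen", "Cuba")))
# first_name = 'Cuba' alone skips the leftmost column, so the order of the
# index gives no starting point and all entries are read.
print(scan_all(index, lambda e: e[1] == "Cuba"))
```

With five rows the difference is trivial, but the seek reads only the matching run plus one entry, while the scan always reads the whole index.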

Now we can see how important the order of the columns in a composite index is: all of these restrictions come from the order of the index columns. When optimizing performance, you may need indexes on the same columns in different orders to serve different kinds of queries. For example, one table may need both key (last_name, first_name, birthday) and key (first_name, last_name, birthday).

The B-Tree index is the most commonly used index type. From here on, unless stated otherwise, "index" means a B-Tree index.

Hash index

A hash index is implemented on top of a hash table and is useful only for queries that exactly match all columns of the index. In MySQL, only the Memory engine explicitly supports hash indexes.
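Why a hash index needs an exact match on every indexed column can be seen in a minimal Python sketch. A Python dict is itself a hash table, so it stands in for the index here; build_hash_index and hash_lookup are hypothetical names, and this does not reflect how the Memory engine is actually implemented.

```python
rows = [
    {"id": 1, "last_name": "Clinton", "first_name": "Bill"},
    {"id": 2, "last_name": "Allen", "first_name": "Cuba"},
    {"id": 5, "last_name": "Allen", "first_name": "Cally"},
]


def build_hash_index(rows, key_columns):
    """Map the full tuple of key column values to the ids of matching rows."""
    buckets = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        buckets.setdefault(key, []).append(row["id"])
    return buckets


def hash_lookup(buckets, key):
    """Return row ids for an exact key; anything else finds nothing."""
    return buckets.get(tuple(key), [])


hash_index = build_hash_index(rows, ("last_name", "first_name"))

print(hash_lookup(hash_index, ("Allen", "Cuba")))  # full key: direct hit
print(hash_lookup(hash_index, ("Allen",)))         # partial key: no match
```

The hash of ("Allen",) has no relation to the hash of ("Allen", "Cuba"), and the buckets are not kept in key order, so prefix and range lookups cannot be served from this structure.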

Spatial index (R-Tree)

The MyISAM engine supports spatial indexes, which can be used to store geographic data. Unlike a B-Tree index, a spatial index does not require queries to use a leftmost prefix.

Full-text index

A full-text index is a special type of index that looks for keywords in the text rather than directly comparing values in the index. It is completely different from the other index types and works more like a search engine than like simple where condition matching. You can create a full-text index and a B-Tree index on the same column at the same time; the full-text index serves match ... against operations rather than ordinary where conditions.

An index can contain a column (that is, a field) or the values of multiple columns. If an index contains multiple columns, it is generally called a composite index, and the order of the columns is important because MySQL can only efficiently use the leftmost prefix column of the index. Creating an index with two columns is very different from creating two indexes with only one column.

1.5 Advantages of indexes

Indexes allow MySQL to quickly find the data we need, but this is not the only function of indexes.

The most common index type, the B-Tree index, stores data in sorted order, so MySQL can use it for order by and group by operations. Because the data is sorted, a B-Tree also stores related column values close together. Finally, because the actual column values are stored in the index, some queries can get all the data they need from the index alone, without going back to the table. From these characteristics, indexes have the following three advantages:

1. An index greatly reduces the amount of data that the MySQL server needs to scan.
2. An index helps the server avoid sorting and temporary tables.
3. An index turns random I/O into sequential I/O.

In addition, a "three-star system" is used to evaluate how well an index suits a query: the index earns one star if it places the relevant records next to each other, a second star if the order of the data in the index matches the order required by the query, and a third star if the index contains all the columns the query needs.

Indexing is not always the best tool, nor does it mean that the more indexes the better. In general, indexes are useful as long as the benefits of helping the storage engine find records quickly outweigh the extra work.

For very small tables, a simple full table scan is more efficient in most cases, and there is no need to build an index. For medium to large tables, the benefits of indexing are obvious.

2. High-performance indexing strategies

Creating and using indexes correctly is the basis of high-performance queries. Now that the various index types and their advantages and disadvantages have been introduced, let's look at how to really take advantage of them. The following sections explain how to use indexes efficiently.

2.1 Independent columns

We often see queries that use indexes improperly or that make it impossible for MySQL to use an existing index. If an indexed column in a SQL query is not independent, MySQL will not use the index. "Independent column" means that the indexed column is not part of an expression and is not an argument to a function.

For example, the following SQL query statement cannot use the primary key index id:

select id, last_name, first_name, birthday from people where id + 1 = 3;

It's easy to see that the above where expression can actually be abbreviated to where id=2, but MySQL cannot automatically parse the expression. We should get into the habit of simplifying where conditions and always put index columns on one side of the comparison operator. So to use the primary key index, write it correctly as follows:

select id, last_name, first_name, birthday from people where id = 2;

Here is another common mistake:

select ... from ... where to_days(current_date()) - to_days(date_col) ...

...

explain select id, last_name from people where id > 3 order by id desc

2. File sorting

File sorting (filesort) means that the result set is sorted in an extra step before it is returned to the client. This method does not use the index order and is less efficient. Although MySQL calls it filesort, it does not necessarily use files on disk.

If the Extra column in the explain output contains the string "Using filesort", the query uses file sorting. In that case, you should optimize the index or the SQL query. For example:

explain select id, last_name, first_name from people where id > 3 order by last_name;

MySQL can use the same index both to find rows and to sort them. Where possible, design an index so that it serves both purposes.

Index sorting is possible only if the columns of the index contain fields in the where condition and fields in the order by, and the order of the columns in the index is the same as the order of all fields contained in where + order by (note: order by is after where).

Now, let's optimize the above SQL statement to make use of index sorting.

First, add a multi-column index.

alter table people add key (id, last_name);

You will find that it is still impossible to use index sorting only by adding key (id,last_name), because the where + order by statement also has to meet the leftmost prefix requirement of the index, and where id > 3 is a range condition, which will cause subsequent order by last_name to fail to use index key (id,last_name).

Second, change the order by last_name in the SQL statement to order by id,last_name.
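The reason the range condition blocks sorting by last_name can be checked with a small Python model of key (id, last_name). It is a sketch only: the index is a sorted list of tuples, and range_scan and in_order are hypothetical helpers that mimic reading the index in key order and checking whether that order already satisfies the order by.

```python
# Entries of key (id, last_name) for the example rows, in index order.
index = sorted([
    (1, "Clinton"),
    (2, "Allen"),
    (3, "Bush"),
    (4, "Smith"),
    (5, "Allen"),
])


def range_scan(index, min_id):
    """Read the entries with id > min_id in the order the index stores them."""
    return [entry for entry in index if entry[0] > min_id]


def in_order(entries, sort_key):
    """True if the entries already come out sorted by sort_key,
    which means no extra sort step would be needed."""
    keys = [sort_key(entry) for entry in entries]
    return keys == sorted(keys)


matched = range_scan(index, 3)  # where id > 3

# order by last_name: the index order is (id, last_name), so rows with
# different ids are not ordered by last_name among themselves.
print(in_order(matched, lambda e: e[1]))

# order by id, last_name: this is exactly the index order.
print(in_order(matched, lambda e: (e[0], e[1])))
```

With these rows the scan returns (4, 'Smith') before (5, 'Allen'), so ordering by last_name alone would need an additional sort, while ordering by id, last_name would not.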

Note: if the SQL query statement is an associated query associated with multiple tables, index sorting can be used only if all the fields sorted by order by come from the first table.

Here are several situations in which index sorting cannot be used:

1. The order by sorts on multiple fields in different directions, that is, some fields are asc (ascending, the default) and others are desc (descending). For example:

explain select * from people where last_name = 'Allen' order by first_name asc, birthday desc;

2. The order by contains a field that is not in the index. For example:

explain select * from people where last_name = 'Allen' order by first_name, gender;

3. The condition on the first column of the index is a range lookup. For example:

explain select * from people where last_name like 'A%' order by first_name;

In this case, you can optimize the SQL statement to:

explain select * from people where last_name like 'A%' order by last_name, first_name;

2.8 Redundant and duplicate indexes

MySQL allows multiple indexes to be created on the same column (only with different names). Because MySQL needs to maintain duplicate indexes separately, and the optimizer needs to analyze them one by one when optimizing the query, duplicate indexes can affect performance.

A duplicate index is an index of the same type created in the same column order on the same column. Duplicate indexes should be avoided and deleted as soon as they are found.

Redundant indexes are different from duplicate indexes. If you create the index key (A, B) and then create the index key (A), the second one is a redundant index, because index (A) is only a prefix of the first index and index (A, B) can also be used as index (A). However, if you create another index (B, A), it is not a redundant index.
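The distinction can be expressed as a simple prefix test. The Python sketch below is illustrative only: classify_index is a made-up helper, indexes are represented as tuples of column names, and both indexes are assumed to be B-Tree indexes of the same type on the same table.

```python
def classify_index(existing, candidate):
    """Compare a candidate index with an existing one, both given as
    tuples of column names in index order.

    'duplicate' : same columns in the same order
    'redundant' : candidate is a leftmost prefix of the existing index
    'distinct'  : neither of the above
    """
    existing, candidate = tuple(existing), tuple(candidate)
    if candidate == existing:
        return "duplicate"
    if existing[:len(candidate)] == candidate:
        return "redundant"
    return "distinct"


print(classify_index(("A", "B"), ("A", "B")))  # duplicate
print(classify_index(("A", "B"), ("A",)))      # redundant
print(classify_index(("A", "B"), ("B", "A")))  # distinct
print(classify_index(("A", "B"), ("B",)))      # distinct
```

Index (B) is reported as distinct because B is not the leftmost column of (A, B), which matches the leftmost-prefix rule described earlier.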

Redundant indexes usually occur when a new index is added to a table. For example, someone might add a new index (A, B) instead of extending the existing index (A). In another case, a secondary index (A) is extended to (A, ID), where ID is the primary key. For InnoDB, the primary key column is already included by default in the secondary index, so this is also redundant.

In most cases, redundant indexes are not required. You should try to expand existing indexes instead of creating new ones. Sometimes, however, redundant indexes are needed for performance reasons, because extending an existing index makes it larger, which affects the performance of other query statements that use the index.

You need to be very careful when extending an index. Because the leaf nodes of a secondary index contain the primary key value, an index on column (A) is equivalent to an index on (A, ID). Index (A) is very useful for a query such as where A = 5 order by ID. However, if you change index (A) to index (A, B), it actually becomes index (A, B, ID), and the order by in the query above can no longer use index sorting and has to fall back to file sorting.
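The effect of extending the index can be reproduced with a tiny Python model. The table contents and the helper ids_for_a are invented for illustration; each index is a sorted list of tuples with the primary key appended as the last element, which mirrors the (A, ID) and (A, B, ID) layouts described above.

```python
# Hypothetical rows: (id, a, b). The values are made up for the example.
rows = [
    (1, 5, 30),
    (2, 5, 10),
    (3, 5, 20),
    (4, 7, 10),
]

# Secondary index on (A): effectively ordered by (A, ID).
index_a = sorted((a, row_id) for row_id, a, b in rows)

# Secondary index on (A, B): effectively ordered by (A, B, ID).
index_ab = sorted((a, b, row_id) for row_id, a, b in rows)


def ids_for_a(index, value):
    """Read the ids of entries with A = value, in index order."""
    return [entry[-1] for entry in index if entry[0] == value]


ids_via_a = ids_for_a(index_a, 5)
ids_via_ab = ids_for_a(index_ab, 5)

print(ids_via_a, ids_via_a == sorted(ids_via_a))
print(ids_via_ab, ids_via_ab == sorted(ids_via_ab))
```

Reading A = 5 through the (A) index yields the ids already in ascending order, whereas reading through (A, B) yields them in B order, so an order by ID would need a separate sort.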

It is recommended that you use the pt-upgrade tool in the Percona toolkit to carefully check the planned index changes.

Therefore, extend the original index only if you are clear about all the queries related to an index. Otherwise, creating a new index (making the old index redundant for the new index) is the safest way.

2.9 unused indexes

A MySQL server may contain indexes that are never used. Such indexes are pure overhead, and you should consider deleting them. Note, however, that a unique index may never be used by any query and still be needed, because its uniqueness constraint prevents duplicate data.
