What are the indexing techniques of MySQL

2025-07-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains MySQL indexing techniques: the server's logical architecture, a comparison of InnoDB and MyISAM, how B+ tree indexes work, when a composite index is used, and how to read the output of EXPLAIN.

I. MySQL three-tier logical architecture

MySQL's storage engine architecture separates query processing from the storage and retrieval of data. The logical architecture has three layers:

1. The first layer is responsible for connection management, authentication and authorization, security, etc.

Each client connection corresponds to a thread on the server. The server caches threads so that it does not have to create and destroy one for each connection. When a client connects to a MySQL server, the server authenticates it. Authentication can be done by username and password or by SSL certificates. After the client has logged in, the server also verifies that it has the privileges needed for each query it runs.

2. The second layer is responsible for parsing queries

This layer parses and compiles SQL and optimizes it (for example, adjusting the order in which tables are read and selecting appropriate indexes). In MySQL 5.7 and earlier, the server checks the query cache for a SELECT statement before parsing it; if the result is found in the cache, it is returned directly without parsing or optimization. The query cache was removed in MySQL 8.0. Stored procedures, triggers, views, etc. are implemented at this level.

3. The third layer is the storage engine.

The storage engine is responsible for storing data in MySQL, fetching data, starting a transaction, and so on. Storage engines communicate with the upper layers through APIs that mask differences between different storage engines, making them transparent to the query process at the upper layers. The storage engine does not parse SQL.

II. Compare InnoDB and MyISAM

1. Storage structure

MyISAM: Each MyISAM table is stored as three files on disk: a table definition file, a data file, and an index file. Each file name starts with the name of the table, and the extension indicates the file type. The .frm file stores the table definition, the data file has the extension .MYD (MYData), and the index file has the extension .MYI (MYIndex).

InnoDB: All tables are stored in the same data file (or in multiple files, or in separate per-table tablespace files). The size of an InnoDB table is limited only by the maximum file size of the operating system, which is 2 GB on some older file systems and far larger on modern ones.

2. Storage space

MyISAM: MyISAM supports three storage formats: static tables (the default, though note that trailing spaces in the data are removed), dynamic tables, and compressed tables. If a table will not be modified after it has been created and loaded, a compressed table can greatly reduce disk space consumption.

InnoDB: Requires more memory and storage. It builds its own dedicated buffer pool in main memory for caching data and indexes.

3. Portability, backup and recovery

MyISAM: Data is stored as files, so it is convenient to move between platforms. Individual tables can be backed up and restored on their own.

InnoDB: Free options include copying the data files, backing up the binlog, or using mysqldump, all of which become relatively painful once the data volume reaches tens of gigabytes.

4. Transaction support

MyISAM: The emphasis is on performance. Each statement is atomic, and the engine is often described as several times faster than InnoDB for simple workloads, but it provides no transaction support.

InnoDB: Provides advanced database features such as transactions and foreign keys. Its tables are transaction-safe (ACID compliant), with commit, rollback, and crash recovery capabilities.

5. AUTO_INCREMENT

MyISAM: The AUTO_INCREMENT column must be indexed, and it can be part of a composite index with other columns. In a composite index it does not have to be the first column; in that case the value is incremented within each group of rows that share the same values in the preceding columns.

InnoDB: The AUTO_INCREMENT column must be indexed and, if it is part of a composite index, it must be the first column of that index.

6. Locking

MyISAM: Only table-level locks are supported. SELECT, UPDATE, DELETE, and INSERT statements on a MyISAM table lock the table automatically. If the table meets the conditions for concurrent inserts, new rows can still be appended at the end of the table while it is being read.

InnoDB: Supports transactions and row-level locking, which is the most important feature of InnoDB. Row locking greatly improves concurrency when many users operate on the same table. However, InnoDB sets row locks on the index records it scans, so a WHERE condition that cannot use an index causes every row in the table to be locked, which behaves much like a table lock.

7. Full-text index

MyISAM: Supports full-text indexes of type FULLTEXT.

InnoDB: FULLTEXT indexes are supported from MySQL 5.6 onward. Earlier versions did not support them, and full-text search on InnoDB data was commonly provided by an external engine such as Sphinx.

8. Primary key

MyISAM: Tables are allowed to exist without any index or primary key. Index entries store the address of the row.

InnoDB: If no primary key or non-NULL unique index is defined, a hidden 6-byte primary key (invisible to the user) is generated automatically. The row data is stored in the primary key index itself, and each secondary index stores the primary key value.

9. Number of rows in the table

MyISAM: Stores the total number of rows in the table, so select count(*) from table; returns the stored value directly.

InnoDB: Does not store the total number of rows, so select count(*) from table; has to scan the whole table, which is quite expensive. Once a where condition is added, however, MyISAM and InnoDB handle the query in the same way.

10. CRUD operation

MyISAM: MyISAM is the better choice if you perform a lot of SELECT.

InnoDB: If your workload performs a lot of INSERT or UPDATE operations, you should use InnoDB tables for performance reasons.

11. Foreign key

MyISAM: Not supported

InnoDB: Supported

III. Introduction to SQL Optimization

1. Under what circumstances SQL needs optimizing

SQL needs optimizing when performance is poor: execution time is too long, wait time is too long, join queries are slow, or indexes are not being used.

2. SQL statement execution process

(1) Order in which the clauses are written

select distinct ... from ... join ... on ... where ... group by ... having ... order by ... limit ...

(2) Order in which the clauses are evaluated

from ... on ... join ... where ... group by ... having ... select distinct ... order by ... limit ...

SQL optimization is largely a matter of optimizing indexes.

An index is equivalent to a table of contents for a book.

The data structure of the index is a B+ tree.

IV. Index

1. Advantages of index

(1) Improve query efficiency (reduce IO usage)

(2) Reduce CPU usage

For example, for a query with order by age desc, the B+ tree index is already kept in sorted order, so if the query uses the index on age, the rows do not have to be sorted again.
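The point about sorted order can be illustrated with a minimal Python sketch. It is not how MySQL is implemented: the list `age_index` and the helper `index_insert` are hypothetical stand-ins for the leaf level of an index on an age column, used only to show that an ordered structure can answer order by age desc without a sort at query time.

```python
import bisect

# Hypothetical index on an "age" column: a list of (age, row_id) pairs
# kept in sorted order, standing in for the leaf level of a B+ tree.
age_index = []

def index_insert(age, row_id):
    """Insert an entry while keeping the index sorted."""
    bisect.insort(age_index, (age, row_id))

for row_id, age in enumerate([31, 18, 45, 27, 39], start=1):
    index_insert(age, row_id)

# ORDER BY age DESC: walk the already-sorted entries backwards.
# No sorting happens at query time.
ages_desc = [age for age, _ in reversed(age_index)]
print(ages_desc)  # [45, 39, 31, 27, 18]
```

The cost of ordering is paid a little at a time on each insert, which is also why an index slows down writes.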

2. Disadvantages of index

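(1) The index itself takes up a lot of space. It can be held in memory or on a hard disk, and is usually stored on a hard disk.

(2) An index is not appropriate in every case, for example: ① tables with a small amount of data ② fields that change frequently ③ fields that are rarely used in queries

(3) An index reduces the efficiency of inserts, updates, and deletes, because the index has to be maintained along with the data.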

3. Classification of index

(1) Single-column index

(2) Unique index

(3) Composite (joint) index

(4) Primary key index

Note: the difference between a unique index and a primary key index is that a primary key column cannot be NULL.

4. Create index

alter table user add INDEX `user_index_username_password` (`username`,`password`);

5. MySQL index principle: the B+ tree

The underlying data structure of a MySQL index is a B+ tree.

B+Tree is an optimization based on B-Tree, making it more suitable for implementing external storage index structure. InnoDB storage engine uses B+Tree to implement its index structure.

In a B-Tree, each node contains not only keys but also the data values that belong to them. The storage space of each page is limited, so if the data values are large, the number of keys that fit in each node (i.e., a page) is small. When a lot of data is stored, the B-Tree therefore becomes deep, which increases the number of disk I/O operations per query and hurts query efficiency. In a B+Tree, all data records are stored in leaf nodes on the same level, in key order, while non-leaf nodes store only keys. This greatly increases the number of keys that fit in each node and reduces the height of the B+Tree.

B+Tree differs from B-Tree in several ways:

Non-leaf nodes store only key information.

There is a chain pointer between all leaf nodes.

Data records are stored in leaf nodes.

Applying this optimization to the B-Tree described above: since the non-leaf nodes of the B+Tree store only key information, assume that each disk block can hold 4 keys and their pointers. (The figure showing the resulting B+Tree is not reproduced here.)

A B+Tree usually has two head pointers, one pointing to the root node and the other to the leaf node with the smallest key, and all leaf nodes (i.e., data nodes) are linked together in a chain. Two kinds of lookup are therefore possible on a B+Tree: range and paged lookups on the primary key, which follow the leaf chain, and random lookups, which start from the root node.
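A range lookup over linked leaves can be sketched as follows. This is a simplified model under stated assumptions, not InnoDB's code: the `leaves` list, its keys, and the `range_scan` function are made up for illustration, each leaf holds up to 4 keys as in the example above, and finding the starting leaf by comparing first keys stands in for the descent from the root.

```python
import bisect

# Hypothetical leaf level of a B+ tree: each leaf holds up to 4 sorted keys,
# and the next list element plays the role of the "next leaf" pointer.
leaves = [[1, 3, 5, 8], [10, 12, 15, 18], [21, 25, 28, 30]]

def range_scan(low, high):
    """Return all keys k with low <= k <= high."""
    # Step 1: locate the starting leaf. A real B+ tree descends from the
    # root; comparing against each leaf's first key stands in for that.
    first_keys = [leaf[0] for leaf in leaves]
    leaf_no = max(bisect.bisect_right(first_keys, low) - 1, 0)
    result = []
    # Step 2: follow the leaf chain until a key exceeds the upper bound.
    while leaf_no < len(leaves):
        for key in leaves[leaf_no]:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf_no += 1
    return result

print(range_scan(8, 21))  # [8, 10, 12, 15, 18, 21]
```

Once the first matching leaf is found, the scan never goes back up the tree, which is what makes range and paging queries on an indexed key cheap.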

The example above holds only 22 data records, which is too few to show the advantage of a B+Tree. Here is an estimate:

The size of a page in the InnoDB storage engine is 16KB, the primary key type of a table is INT (4 bytes) or BIGINT (8 bytes), and the pointer type is generally 4 or 8 bytes. That is to say, a page (a node in the B+Tree) stores about 16KB/(8B+8B) = 1K key values (because this is an estimate, K is taken as 10^3 for convenience of calculation). That is, a B+Tree index with depth 3 can maintain 10^3 * 10^3 * 10^3 = 1 billion records.
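The arithmetic in this estimate can be checked with a few lines of Python. The constant names and the `capacity` helper are illustrative only, and the calculation keeps the estimate's simplifications: page headers and other overhead are ignored, and every level is assumed to hold the same number of entries per node.

```python
PAGE_SIZE = 16 * 1024   # InnoDB page size in bytes (16 KB)
KEY_SIZE = 8            # BIGINT primary key
POINTER_SIZE = 8        # pointer size assumed in the estimate

# Entries per page, ignoring page headers and other overhead.
fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)

def capacity(depth, per_node=fanout):
    """Entries reachable at the given depth, assuming per_node entries
    in every node on every level (the simplification used in the text)."""
    return per_node ** depth

print(fanout)             # 1024
print(capacity(3))        # 1073741824
print(capacity(3, 1000))  # 1000000000
```

Rounding 1024 down to 10^3 gives the 1 billion figure; without rounding, the same simplification gives slightly more than that.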

In practice, each node may not be filled, so in the database, the height of the B+Tree is generally 2~4 layers. MySQL's InnoDB storage engine is designed to have the root node resident in memory, which means that it only needs 1~3 disk I/O operations to find the row record of a certain key value.

B+Tree indexes in databases can be classified into clustered indexes and secondary indexes. The B+Tree in the example above corresponds to a clustered index in the database: the leaf nodes of a clustered index store the complete row data of the table. A secondary index differs in that its leaf nodes do not contain all the data of a row, but store the clustered index key of the corresponding row, i.e., the primary key. When querying data through a secondary index, the InnoDB storage engine first traverses the secondary index to find the primary key, and then uses the primary key to find the complete row in the clustered index.
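The two-step lookup through a secondary index can be modelled with two Python dictionaries. This is only a sketch of the data flow: dictionaries are hash tables rather than B+Trees, and the table contents, the `secondary_username` index, and the assumption that usernames are unique are all invented for the example.

```python
# Clustered index: primary key -> full row (the leaf holds the row itself).
clustered = {
    1: {"id": 1, "username": "alice", "age": 30},
    2: {"id": 2, "username": "bob", "age": 25},
    3: {"id": 3, "username": "carol", "age": 41},
}

# Secondary index on username: indexed value -> primary key only.
# Usernames are assumed unique to keep the sketch short.
secondary_username = {row["username"]: pk for pk, row in clustered.items()}

def find_by_username(name):
    """Two-step lookup: secondary index first, then the clustered index."""
    pk = secondary_username.get(name)  # step 1: secondary index yields the PK
    if pk is None:
        return None
    return clustered[pk]               # step 2: clustered index yields the row

print(find_by_username("bob"))
```

If a query needs only columns that the secondary index already contains, the second step can be skipped, which is the idea behind a covering index.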

V. How to trigger the composite index

1. Create a composite (joint) index on the username and password columns of the user table

2. Trigger the composite index

(1) Using all of the index columns of the composite index triggers it.

(2) Using all of the index columns but connecting them with OR does not trigger the composite index.

(3) Using the leftmost column of the composite index on its own triggers it.

(4) Using any other column of the composite index on its own does not trigger it.
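Rules (1), (3), and (4) above follow from the leftmost-prefix behaviour of a composite (joint) index, which can be sketched in Python. The function `usable_prefix` is a hypothetical helper, not part of MySQL, and it models only equality conditions combined with AND; the real optimizer also weighs cost and may make different choices, for example scanning the whole index when it covers the query.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that AND-ed equality
    conditions on equality_columns can make use of."""
    prefix = []
    for column in index_columns:
        if column not in equality_columns:
            break  # a gap ends the usable prefix
        prefix.append(column)
    return prefix

index = ["username", "password"]
print(usable_prefix(index, {"username", "password"}))  # ['username', 'password']
print(usable_prefix(index, {"username"}))              # ['username']
print(usable_prefix(index, {"password"}))              # []
```

An empty result corresponds to rule (4): without a condition on the leftmost column, the sorted order of the index gives the server nothing to search on.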

VI. Analyzing the SQL execution plan with EXPLAIN

EXPLAIN simulates how the SQL optimizer would execute an SQL statement and shows the resulting plan.

1. Introduction to EXPLAIN

(1) User table

(2) Department table

(3) Index not triggered

(4) Trigger index

(5) Analysis of results

The table that appears first in the EXPLAIN output is the driving table.

When a join condition is specified, the table with fewer rows satisfying the query condition is the driving table.

When no join condition is specified, the table with fewer rows is the driving table.

Sorting on a column of the driving table can use an index directly; sorting on a column of a non-driving table cannot.
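Why the smaller table tends to be chosen as the driving table can be seen from a simplified nested-loop join in Python. This is a sketch under assumptions, not MySQL's join code: the tables, `build_index`, and `nested_loop_join` are invented for the example, the inner table is always reached through an index, and the count of index probes is used as a rough stand-in for cost (the rows fetched per probe also matter in practice).

```python
from collections import defaultdict

def build_index(rows, key):
    """Hypothetical index on `key`: value -> list of matching rows."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index

def nested_loop_join(driving_rows, inner_index, key):
    """For each driving row, probe the inner table's index once."""
    probes = 0
    joined = []
    for outer in driving_rows:
        probes += 1
        for inner in inner_index.get(outer[key], []):
            joined.append({**outer, **inner})
    return joined, probes

departments = [{"dept_id": d, "dept_name": f"dept{d}"} for d in range(1, 4)]
users = [{"user_id": u, "dept_id": u % 3 + 1} for u in range(1, 1001)]

# Small table drives: 3 probes into the index on users.dept_id.
rows_a, probes_a = nested_loop_join(
    departments, build_index(users, "dept_id"), "dept_id")
# Large table drives: 1000 probes into the index on departments.dept_id.
rows_b, probes_b = nested_loop_join(
    users, build_index(departments, "dept_id"), "dept_id")

print(len(rows_a), probes_a)  # 1000 3
print(len(rows_b), probes_b)  # 1000 1000
```

Both orders produce the same 1000 joined rows, but the number of index lookups equals the number of rows in the driving table.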

2. Brief introduction to the EXPLAIN output

(1) id: SELECT identifier. This is the query sequence number for SELECT.

(2) select_type: the type of SELECT:

SIMPLE: Simple SELECT (no UNION or subquery)

PRIMARY: outermost SELECT

UNION: the second or subsequent SELECT statement in UNION

DEPENDENT UNION: The second or subsequent SELECT statement in UNION, depending on the external query

UNION RESULT: The result of UNION

SUBQUERY: First SELECT in a subquery

DEPENDENT SUBQUERY: The first SELECT in a subquery, depending on the outer query

DERIVED: SELECT of a derived table (a subquery in the FROM clause)

(3) table: table name

(4) type: join type

system: The table has only one row (= system table). This is a special case of the const join type.

const: The table has at most one matching row, which will be read at the beginning of the query. Because there is only one row, the column values in that row can be considered constant by the rest of the optimizer. const is used when comparing all parts of a PRIMARY KEY or UNIQUE index with constant values.

eq_ref: For each combination of rows from the previous tables, one row is read from this table. Other than the const types, this is the best possible join type. It is used when all parts of an index are used by the join and the index is UNIQUE or PRIMARY KEY. eq_ref can be used for indexed columns compared using the = operator. The comparison value can be a constant or an expression that uses columns from tables read before this table.

ref: For each combination of rows from the previous tables, all rows with matching index values are read from this table. ref is used if the join uses only a leftmost prefix of the key, or if the key is not UNIQUE or PRIMARY KEY (in other words, if the join cannot select a single row based on the key value). If the key used matches only a small number of rows, this is a good join type. ref can be used for indexed columns compared using the = or <=> operator.

ref_or_null: This join type is the same as ref, except that MySQL does an extra search for rows that contain NULL values. This join type optimization is often used in resolving subqueries.

index_merge: This join type indicates that the index merge optimization method is used. In this case, the key column contains the list of used indexes, and key_len contains the longest key element of the used index.

unique_subquery: This type replaces ref for IN subqueries of the following form: value IN (SELECT primary_key FROM single_table WHERE some_expr). unique_subquery is an index lookup function that completely replaces the subquery, making it more efficient.

index_subquery: This join type is similar to unique_subquery. You can replace IN subqueries, but only for non-unique indexes in subqueries of the form: value IN (SELECT key_column FROM single_table WHERE some_expr)

range: Retrieves only a given range of rows, using an index to select rows. The key column shows which index is used. key_len contains the longest key element of the index used. The ref column in this type is NULL. When using =,>,>=,
