Mysql Learning 11: chapter 6: indexing 07/04 Update SLTechnology News&Howtos

Mysql Learning 11: chapter 6: indexing

2025-07-04 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

1. Index 1.1. binary index

A B+tree is derived from a binary tree, a balanced binary tree, and a B-tree.

Each node of a binary tree has at most two child nodes, and the left child tree key is always less than the right child tree and less than the root key.

1.2. balanced binary tree structure

Balanced binary tree structure on the basis of improvement, must meet the absolute value of the height difference between the left and right subtrees does not exceed 1, and the left subtree and the right subtree are a balanced binary tree, at any time to ensure that the inserted tree is balanced, through Guo left or right to make unbalanced trees balanced.

1.3. B-tree structure

B-tree is also called Btree, each node has at most 4 child nodes, except root node and leaf node, other nodes have at least 2 child nodes. All leaf nodes are at the same level, and leaf nodes do not include any keyword information.

1.4. B+tree

A B+tree is a variant of the Btree, a multi-way search tree in which all keywords and data are stored in leaf nodes and contain pointers to keyword records.

Summary: B+tree index is a doubly linked list structure, retrieval is faster than B-tree, the order of accessing keywords is continuous, there is no need to visit the previous node, and leaf nodes contain all data information.

1.4.1. Clustered and normal indexes

B+ trees are divided into two categories, one is called clustered index, and the other is called non-clustered index (ordinary index).

InnoDB storage engine is an index organization table, clustered index is a form of index organization table, the logical order of index key values determines the physical storage order of table data rows.

The leaf nodes of the aggregated index store information of all rows of data records in the table, that is, data is index and index is data. A primary key (clustered index) is created when creating a table. If a primary key is not created, InnoDB will select the first unique index that does not contain Null values as the primary key. If there is no unique index, a 6-byte rowid is generated for the table by default.

Ordinary index in the leaf node does not contain all rows of data records, only in the leaf node has its own key value and primary key value. Retrieves the data, and obtains the row data record you want to find through the primary key on the leaf node of the common index.

Common cable creation syntax:

alter table tab_name add index index_name(col1);

Or:

create index inde_name on tab_name(col1);

See which indexes are in the table;

show index from tab_name;

Index Creation Experiment

l Create a test library

mysql> create database test;

l Create test tables

DROP TABLE IF EXISTS `t`;

CREATE TABLE `t` (

`id` int(11) NOT NULL AUTO_INCREMENT,

`name` varchar(20) NOT NULL,

`address` varchar(20) NOT NULL,

PRIMARY KEY (`id`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

l View table structure

[test]>desc t;

+---------+-------------+------+-----+---------+----------------+

+---------+-------------+------+-----+---------+----------------+

| id | int(11) | NO | PRI | NULL | auto_increment |

+---------+-------------+------+-----+---------+----------------+

Creating a stored procedure

DELIMITER $$

DROP PROCEDURE IF EXISTS `proc_auto_insertdata`$$

CREATE PROCEDURE `proc_auto_insertdata`()

BEGIN

DECLARE init_data INTEGER DEFAULT 1;

WHILE init_data explain select * from t where name='name11';

l Creating an index

create index idx_tname on t(name);

l Review the implementation plan again

optimization method

l Execution plan view method:

1. Look at the query type type. If all appears, it means full table scan;

2. Look at the key column to see if the l index is used. null indicates that no index is used;

3. Look at the rows column, the number of rows scanned during SQL execution;

4. Look at the extra column to see if there is a Using filesort or Using temporary that affects performance.

5. Look at the filtered column,(5.7 added, 5.6 with explain extended added to this column), representing the percentage of rows that return results to the rows that need to be read.

l SQL optimization ideas:

1. Check whether the data type of the table is reasonably designed and whether it complies with the principle that the simpler the data type of the selection, the smaller.

2. Whether the table is defragmented.

3. Whether statistics for the table are collected.

4. View execution plan If index is not used, create it.

5. Before creating an index, check the selectivity of the index to determine whether the fields are suitable for index creation. Selectivity refers to the ratio of the index value (cardinality) that does not duplicate to the total number of records, the higher the ratio, the better.

6. After creating the index, look at the execution plan and compare it before and after.

L Reasonable index creation:

1. Frequently queried columns.

2. Columns often used for table joins.

3. Sort frequently grouped classes.

1.4.2. ICP, MRR and BKA

ICP(Index Condition Pushdown) is an optimized way for mysql to retrieve row data from tables using indexes. 5.6 Start supporting. Before the storage engine fetches all the data to the server, it uses index filtering. After ICP is used, if index can be used, the storage engine filters the data and then sends it to the server layer. ICP can reduce the number of times the engine layer accesses the base table and the number of times the server layer accesses the storage engine.

It is controlled by index_condition_pushdow in the optimizer_switch parameter, which is enabled by default.

[mysql]>show variables like '%pushdown%';

Close:

set optimizer_switch="index_condition_pushdown=on|off";

When ICP optimization is used, the execution plan extra column displays the Using index condition.

5.7 Default value of optimizer_switch parameter in:

|index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on |

MBR(Multi-Range Read Optimization), added after 5.6. Controlled by two options in the optimizer_switch parameter, which is enabled by default.

mrr_cost_basd: Determine to turn on mrr feature by cost-based algorithm, on automatic, off forced on.

MBR function: store the set of primary key values found on the leaf nodes on the ordinary index into read_rnd_buffer, and then sort the primary key values in the buffer, and then use the primary key value set of the sorting number to access the data in the table, and program the random IO sequence IO to reduce the IO overhead of the query process.

When MBR optimization is used, the execution plan extra column displays Using MBR.

BKA (Batched Key Access), an algorithm to improve table join performance. Its function is to use sequential IO when reading records in the joined table.

BKA principle: multi-table join statement, when using index to access the second join table, use a join buffer to collect the relevant column values generated by the first operation object. After BKA constructs the key, it is passed to the engine layer in batches for index lookup. The key is submitted to the engine through the MBR interface.

Controlled by batched_key_access option of optimizer_switch parameter, closed by default.

To enable this parameter, MBR must be enforced.

SET global optimizer_switch='mrr=on,mrr_cost_based=off';

SET global optimizer_switch='batched_key_access=on';

When BKA is used, the execution plan extra column displays Using join buffer(Batched Key Access).

1.4.3. Primary and unique indexes

A primary key index is a clustered index, and there can only be one index per table. Three conditions must be met:

Primary key values must be unique.

l cannot contain null values.

Make sure that the value is self-increasing. It can ensure that the order of writing data is also self-increasing, improving access efficiency.

Create Primary Key Syntax:

alter table tab_name add primary key(col);

Unique index. Duplicate values are not allowed, but null values are allowed. Multiple unique indexes are allowed.

Grammar:

alter table tab_name add unique(col);

1.4.4. coverage indices

The data is in the index, and you don't have to go back to the table to query the data when you find the index. The Using index appears in the execution plan extra column.

If you use an overlay index, be sure to let select list the columns you need, and never write select * directly.

1.4.5. prefix index

For BLOB, TEXT, or very long varchar columns, the index established for their first few characters is the prefix index. Prefix indexes can no longer be used in ORDER BY or GROUP BY, nor can they be used as overlay indexes.

alter table tab_name add key(col_name(prefix_length));

Note: the most critical parameter prefix_length, this value needs to be based on the actual table content to get the appropriate index selectivity.

1.4.6. CODIS

A federated index, also known as a composite index, is an index created by two or more columns in a table.

create index idx_c1_c2 on t(c1,c2);

The columns with high selectivity are placed first.

1.5. hash index

Hash index uses hash algorithm to convert key value into new hash value. Hash index can only be used for equivalent query, and cannot be used for sorting, fuzzy search, range query, etc. When retrieving, it does not need to search step by step from root node to leaf node like B+tree, but only needs a hash algorithm to locate the corresponding position immediately.

1.6. Index Summary

index advantage

l Improve data retrieval efficiency

l Improve the efficiency of the combination function

l Improve sorting efficiency

l Use an overlay index to avoid backtracking

Index Create Four Don't

l Do not index fields with low selectivity

Do not index columns that are rarely queried.

l Large data type fields Do not create indexes

l Avoid using NULL as much as possible, and specify NOT NULL as a column.

Use no index

If the number of rows scanned through the index exceeds 30% of the full table, the optimizer will not go through the index, but the full table scan.

In a federated index, the first query condition is not the leftmost column.

l In the union index, the first index column uses range query, only part of the index can be used, and ICP appears.

In a union index, the first query condition is not the leftmost prefix column.

The leftmost column of fuzzy query criteria starts with wildcard %.

Two single-column indexes, one for retrieval and one for user sorting. Only one index can be used, because query statements can only use one index at most, consider establishing a joint index.

l There is an index on the query field, but a function operation is used.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.