2025-01-16 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)05/31 Report--
This article introduces the syntax of MySQL indexes, together with the concepts needed to use them well. It walks through the material with practical examples, and we hope it helps you solve your problem.
Index definition
MySQL's official definition of an index is: an index is an (ordered) data structure that helps MySQL retrieve data efficiently. Indexes are added on columns of a database table and are a mechanism for improving query efficiency. In addition to the data itself, the database system maintains data structures that satisfy specific lookup algorithms; these structures reference (point to) the data in some way, so that advanced lookup algorithms can be run on them. Such a data structure is an index, as shown in the following diagram:
In fact, to put it simply, an index is an ordered data structure.
On the left is the data table, which has two columns and seven records; the leftmost value is the physical address of each record (note that logically adjacent records are not necessarily physically adjacent on disk). To speed up lookups on Col2, you can maintain the binary search tree shown on the right. Each node contains an index key value and a pointer to the physical address of the corresponding record, so a binary search quickly finds the matching data.
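As a rough illustration of the idea above, the following Python sketch keeps a sorted list of (key, address) pairs as a stand-in for the index and uses binary search instead of scanning every record. The sample rows, the addresses and the name `lookup` are made up for illustration; they are not taken from the diagram.

```python
import bisect

# Hypothetical table: physical address -> (Col1, Col2). Values are illustrative only.
table = {
    0x07: (1, 34),
    0x56: (2, 77),
    0x6A: (3, 5),
    0xF3: (4, 91),
    0x90: (5, 22),
    0x77: (6, 89),
    0xD1: (7, 23),
}

# The "index" on Col2: (key, address) pairs kept in key order.
index = sorted((row[1], addr) for addr, row in table.items())
keys = [key for key, _ in index]


def lookup(col2_value):
    """Binary-search the ordered index, then follow the pointer to the record."""
    i = bisect.bisect_left(keys, col2_value)
    if i < len(keys) and keys[i] == col2_value:
        return table[index[i][1]]
    return None
```

Calling `lookup(89)` follows the pointer stored next to key 89 and returns that record; a value that is not in the index returns `None`.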
Index advantage
Speeds up searching and sorting, and reduces the database's IO cost and CPU consumption.
By creating a unique index, you can guarantee the uniqueness of each row of data in the table.
Index disadvantage
An index is itself stored like a table: it holds the primary key and the indexed fields and points to the records of the base table, so it takes up space.
Although queries become faster, inserts, deletes and updates become slower, because every change to the table must also update the index:
Insert: new nodes have to be added to the index tree.
Delete: records pointed to from the index tree become invalid, so many nodes in the index tree may become invalid.
Update: the pointers held by nodes in the index tree may need to change.
In practice, however, MySQL does not store indexes as binary search trees. Why?
In a binary search tree a node holds only one piece of data, and in MySQL a node corresponds to a disk block. Each disk block read would therefore return only one piece of data, which is very inefficient. This is why a B-tree structure is used instead.
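To get a feel for the difference, the sketch below estimates how many levels (and therefore roughly how many disk block reads) a lookup needs when each node branches 2 ways versus many ways. The figure of 1,000,000 keys and the 100-way branching factor are assumptions chosen for illustration, not MySQL measurements.

```python
def min_height(n_keys, fanout):
    """Smallest number of levels h such that fanout**h >= n_keys."""
    height, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        height += 1
    return height


# One key per node and two-way branching, as in a balanced binary search tree.
binary_levels = min_height(1_000_000, 2)
# Assumed 100-way branching, as when one disk block holds many keys.
btree_levels = min_height(1_000_000, 100)

print(binary_levels, btree_levels)
```

Under these assumptions the binary tree needs about 20 levels while the 100-way tree needs 3, which is the motivation for storing many keys per disk block.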
Index structure
Indexing is implemented in the storage engine layer of MySQL, not in the server layer. So the indexes of each storage engine are not necessarily the same, and not all engines support all index types.
BTREE index: the most common index type; most storage engines support B-tree indexes.
HASH index: supported only by the Memory engine; its usage scenarios are simple.
R-tree index (spatial index): a special index type of the MyISAM engine, mainly used for geospatial data types (InnoDB has also supported spatial indexes since MySQL 5.7). It is rarely used and is not covered in detail here.
Full-text index: also a special index type of MyISAM, mainly used for full-text search. InnoDB has supported full-text indexes since MySQL 5.6.
Support for various index types by three storage engines: MyISAM, InnoDB and Memory
| Index type | InnoDB engine | MyISAM engine | Memory engine |
| --- | --- | --- | --- |
| BTREE index | Supported | Supported | Supported |
| HASH index | Not supported | Not supported | Supported |
| R-tree index | Supported since version 5.7 | Supported | Not supported |
| Full-text index | Supported since version 5.6 | Supported | Not supported |
Unless stated otherwise, what we usually call an index is one organized as a B+ tree (a multi-way search tree, not necessarily binary). Clustered indexes, composite indexes, prefix indexes and unique indexes all use B+ tree indexes by default, and are collectively referred to as indexes.
BTREE
A multi-way balanced search tree. An m-order (m-way) BTREE satisfies:
Each node has at most m children.
Number of children per non-leaf node (other than the root): ceil(m/2) to m
Number of keywords per node (other than the root): ceil(m/2) - 1 to m - 1
ceil means rounding up, for example ceil(2.3) = 3.
Insert keyword case
Insertion must not break the properties of the m-order B-tree.
Because the tree is of order 3, a node can hold at most two keywords. At first 26 and 30 sit in the same node; when 85 arrives the node has to split: 30, the middle keyword, moves up, 26 stays in the old node, and 85 goes to a new node on the right.
In other words, the middle keyword moves up, the left part stays in the old node, and the right part goes to a new node.
For example, when 70 is then inserted in the figure, 70 happens to be the middle keyword, so it moves up; 62 stays in the old node and 85 goes to a new node.
If the parent node overflows after receiving the promoted keyword, it has to split again.
The splitting simply continues upward in the same way.
Comparative advantage
Compared with a binary search tree, the height (depth) is lower, so query efficiency is naturally higher.
B+TREE
A B+ tree has two types of nodes: internal nodes (also known as index nodes) and leaf nodes. Internal nodes are the non-leaf nodes; they do not store data, only index keys, and the data is stored in the leaf nodes.
The keys in an internal node are arranged in ascending order. For any key in an internal node, all keys in its left subtree are less than it, and all keys in its right subtree are greater than or equal to it. The records in the leaf nodes are also arranged in key order.
Each leaf node has a pointer to the adjacent leaf node, and the leaf nodes are linked in ascending key order.
The parent node holds the key of the first element of its right child.
Comparative advantage
B+Tree query efficiency is more stable. Because only the leaf nodes of a B+Tree hold the data, a query for any key has to travel from the root to a leaf, so every lookup costs about the same.
To traverse the whole tree, you only need to traverse the leaf nodes.
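The leaf-level traversal can be sketched as a walk along a linked list. In this simplified Python sketch the leaf contents are made-up keys, and `scan` starts at the first leaf; a real range query would first descend from the root to locate the starting leaf.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted keys stored in this leaf
        self.next = None   # pointer to the adjacent leaf on the right


def link(leaves):
    """Chain the leaves from left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def scan(first_leaf, low=None, high=None):
    """Walk the leaf chain and collect keys with low <= key <= high."""
    out = []
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if high is not None and key > high:
                return out  # keys are ordered, so nothing further can match
            if low is None or key >= low:
                out.append(key)
        leaf = leaf.next
    return out


# Hypothetical leaf contents, used only for illustration.
head = link([Leaf([1, 4]), Leaf([7, 9]), Leaf([12, 15])])
all_keys = scan(head)
some_keys = scan(head, low=5, high=12)
```

Because the leaves are linked in key order, both a full traversal and a range scan touch only the leaf level.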
B+Tree in MySQL
MySQL's index data structure optimizes the classical B+Tree. On top of the original B+Tree, it adds linked-list pointers between adjacent leaf nodes (the whole structure resembles a doubly linked list), forming a B+Tree with sequential pointers that improves the performance of range access.
Looking carefully, what is the biggest difference between this picture and our binary search tree diagram?
A significant change in moving from a binary search tree to a B-tree is that a node can store multiple pieces of data, which is equivalent to storing multiple pieces of data in one disk block. This greatly reduces the number of IO operations!
Schematic diagram of the B+Tree index structure in MySQL:
Binary search tree diagram:
Principle of indexing
BTree index
Initial introduction
The light blue one is called a disk block, and you can see that each disk block contains several data items (dark blue) and pointers (yellow).
For example, disk block 1 contains data items 17 and 35, including pointers P1, P2, P3
P1 represents blocks less than 17, P2 represents blocks between 17 and 35, and P3 represents blocks greater than 35.
The real data is stored in the leaf nodes: 3, 5, 9, 10, 13, 15, 28, 29, 36, 60, 75, 79, 90, 99.
Non-leaf nodes do not store real data; they only store data items that guide the search direction. For example, 17 and 35 do not actually exist in the data table.
Search process
Suppose you are looking for data item 29. Disk block 1 is first loaded from disk into memory, which is the first IO. A binary search in memory determines that 29 lies between 17 and 35, so the P2 pointer of disk block 1 is selected. The in-memory time is negligible because it is very short compared with disk IO. Disk block 3 is then loaded into memory through the disk address in the P2 pointer of disk block 1, which is the second IO. 29 lies between 26 and 30, so the P2 pointer of disk block 3 is selected, and disk block 8 is loaded into memory through that pointer, which is the third IO. A binary search in memory then finds 29 and the query ends, after three IO operations in total.
In practice, a 3-level B+ tree can represent millions of rows. If looking up one row among millions needs only three IO operations, the performance gain is huge. Without an index, each data item would need its own IO, for millions of IO operations in total, which is obviously very expensive.
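A back-of-the-envelope estimate shows where the "millions" figure comes from. The 16 KB page size is InnoDB's default; the key size, pointer size and row size below are assumptions picked for illustration, and page headers and fill factor are ignored.

```python
PAGE_SIZE = 16 * 1024   # InnoDB's default page size in bytes
KEY_SIZE = 8            # assumption: BIGINT primary key
POINTER_SIZE = 6        # assumption: size of a child page pointer in bytes
ROW_SIZE = 1024         # assumption: about 1 KB per row

# Entries that fit in one internal (non-leaf) page.
fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)
# Rows that fit in one leaf page.
rows_per_leaf = PAGE_SIZE // ROW_SIZE
# Root page -> second-level pages -> leaf pages.
rows_in_three_levels = fanout * fanout * rows_per_leaf

print(fanout, rows_per_leaf, rows_in_three_levels)
```

Under these assumptions a three-level tree holds on the order of twenty million rows, so smaller keys or smaller rows push the capacity even higher.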
Index classification
In InnoDB, tables are stored as indexes ordered by primary key; tables stored this way are called index-organized tables. As mentioned earlier, InnoDB uses the B+ tree index model, so the data is stored in B+ trees.
Each index in InnoDB corresponds to one B+ tree.
Suppose we have a table whose primary key column is ID, with a field k and an index on k.
The statement that creates this table is:
mysql> create table T (id int primary key, k int not null, name varchar(16), index (k)) engine=InnoDB;
The (ID, k) values of rows R1~R5 in the table are (100,1), (200,2), (300,3), (500,5) and (600,6), respectively. An example diagram of the two trees is shown below:
It is not difficult to see from the figure that according to the content of the leaf node, the index type is divided into primary key index and non-primary key index.
Primary key index
The primary key column of a table uses the primary key index, which is created by default. This is why, before we learned about indexes, teachers often told us that lookups by primary key are faster: the primary key already has an index built on it.
The leaf nodes of the primary key index store entire rows of data. In InnoDB, the primary key index is also known as a clustered index.
Auxiliary index
The leaf nodes of an auxiliary index store the value of the primary key. In InnoDB, an auxiliary (non-primary key) index is also called a secondary index.
As shown below:
The primary key index stores the whole row of data.
The secondary index stores only its own indexed columns plus the id primary key, which is used to go back to the table.
Based on the above index structure, let's discuss a question: what is the difference between a query on the primary key index and a query on a secondary index?
If the statement is select * from T where ID=500, that is, a primary key query, only the ID B+ tree needs to be searched.
If the statement is select * from T where k=5, that is, a normal index query, the k index tree is searched first to get the ID value 500, and then the ID index tree is searched. This process is called returning to the table.
In other words, a query on a secondary index has to search one more index tree. Therefore, we should use primary key queries in our applications wherever possible.
The exception is when the data being queried already exists in the index tree being searched. This is called a covering index: the index columns contain all the data the query needs, so there is no need to return to the table.
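The difference between the three cases can be sketched by counting how many index trees each query touches. In this simplified Python model, dictionaries stand in for the two B+ trees of table T, populated with the (ID, k) pairs (100,1), (200,2), (300,3), (500,5) and (600,6); the function names are hypothetical and the name column is omitted.

```python
# Primary key (clustered) index: id -> whole row, here just (id, k).
primary_index = {100: (100, 1), 200: (200, 2), 300: (300, 3),
                 500: (500, 5), 600: (600, 6)}
# Secondary index on k: k -> id (the leaf content is the primary key value).
k_index = {row[1]: row[0] for row in primary_index.values()}


def select_star_by_id(id_):
    """select * from T where ID=...: one search of the ID tree."""
    return primary_index.get(id_), 1


def select_star_by_k(k):
    """select * from T where k=...: search the k tree, then return to the table."""
    id_ = k_index.get(k)
    if id_ is None:
        return None, 1
    return primary_index[id_], 2


def select_id_by_k(k):
    """select ID from T where k=...: the k index covers the query."""
    return k_index.get(k), 1


by_id = select_star_by_id(500)
by_k = select_star_by_k(5)
covered = select_id_by_k(5)
```

Each function returns the result together with the number of trees searched: one for the primary key query, two for the secondary index query that returns to the table, and one again when the index covers the query.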
Secondary indexes are further divided into the following categories (feel free to skip this for now; we will come back to it later):
Unique index (Unique Key): a unique index is also a constraint. The indexed column cannot contain duplicate values, but NULL values are allowed, and a table may have multiple unique indexes. Most of the time, a unique index is created to enforce uniqueness of the column's data rather than to improve query efficiency.
Normal index (Index): the only purpose of a normal index is to query data quickly. A table may have multiple normal indexes, and duplicate values and NULL are allowed.
Prefix index (Prefix): a prefix index applies only to string data. It indexes the first few characters of the text, so it is smaller than a normal index.
Full-text index (Full Text): a full-text index is mainly used to retrieve keyword information from large text data, and is a technique used by search engine databases. Before MySQL 5.6, only the MyISAM engine supported full-text indexes; from 5.6 on, InnoDB supports them as well.
Extension-index push-down
Push-down, as the name implies, means postponing the return to the table. MySQL avoids returning to the table whenever it can, because it is expensive. What does that mean? Let's look at the following example.
We created a composite index (name,status,address). The index is stored in the order of these fields, similar to the following:
Composite index tree (only the index columns and the primary key, which is used for returning to the table, are stored)
| Name | Status | Address | Id (primary key) |
| --- | --- | --- | --- |
| Xiaomi 1 | 0 | 1 | 1 |
| Xiaomi 2 | 1 | 1 | 2 |
We execute a statement like this:
SELECT * FROM tb_seller WHERE name like 'Xiaomi%' and status = '1';
First, on the composite index tree, we find the first name that begins with Xiaomi: Xiaomi 1.
At this point we do not rush to return to the table (returning to the table means going back to search the primary key index tree). Instead, we first check status in the composite index tree. Here status=0, which does not match status = '1', so we skip the return to the table and go straight on to the next name that starts with Xiaomi.
We find the second one, Xiaomi 2, and see that status=1, so we go to the primary key index tree with id=2 and fetch the whole row.
Checking the other where conditions on the index tree itself first, and skipping any entry that does not satisfy them without returning to the table, is called index push-down.
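The effect of push-down can be sketched by counting returns to the table with and without it. This Python model is a simplification: a list stands in for the composite index tree, a dictionary stands in for the primary key index, and the `pushdown` flag and function name are hypothetical.

```python
# Composite index entries in index order: (name, status, address, id).
index_entries = [
    ("Xiaomi 1", "0", "1", 1),
    ("Xiaomi 2", "1", "1", 2),
]
# Primary key index: id -> full row (only the columns used in the example).
rows = {
    1: {"id": 1, "name": "Xiaomi 1", "status": "0", "address": "1"},
    2: {"id": 2, "name": "Xiaomi 2", "status": "1", "address": "1"},
}


def query(prefix, status, pushdown):
    """Return (matching rows, number of returns to the table)."""
    lookups = 0
    result = []
    for name, entry_status, _address, id_ in index_entries:
        if not name.startswith(prefix):
            continue
        if pushdown and entry_status != status:
            continue  # rejected on the index tree, no return to the table
        lookups += 1
        row = rows[id_]  # return to the table
        if row["status"] == status:
            result.append(row)
    return result, lookups


with_icp = query("Xiaomi", "1", pushdown=True)
without_icp = query("Xiaomi", "1", pushdown=False)
```

Both calls return the same row for Xiaomi 2, but with push-down only one return to the table is needed, while without it every entry matching the name prefix triggers one.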
Leftmost prefix principle
The leftmost prefix can be thought of as climbing stairs. Suppose we have a composite index (name,status,address); the steps from low to high are then name, status, address. The leftmost prefix principle requires us not to skip steps, otherwise the index fails:
If the conditions climb the stairs from low to high without skipping a step, they follow the leftmost prefix principle and the index does not fail.
If a step is skipped:
If the first step, name, is skipped, the whole index fails.
If the first step is taken but the query then jumps straight to the third, only the columns before the jump are used (here only name).
This order is not determined by the order of conditions in our where clause. For example:
where name='Xiaomi Technology' and status='1' and address='Beijing'
where status='1' and name='Xiaomi Technology' and address='Beijing'
Although the fields appear in a different order in the where clause, and the second statement seems to skip a step, the effect is actually the same.
This is because MySQL has a query optimizer (Optimizer), which optimizes the SQL and chooses the best query plan to execute.
We will discuss the query optimizer, together with the logical architecture and storage engines of MySQL, in subsequent articles.
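The staircase rule can be expressed as a small Python function. This is a simplified sketch that considers only equality conditions and ignores range conditions and other optimizer behaviour; `usable_columns` is a hypothetical helper, not a MySQL API.

```python
def usable_columns(index_columns, where_columns):
    """Return the leading index columns that equality conditions in WHERE can use."""
    conditions = set(where_columns)  # the order inside WHERE does not matter
    used = []
    for column in index_columns:
        if column not in conditions:
            break  # a skipped step: later columns cannot be used
        used.append(column)
    return used


idx = ("name", "status", "address")
no_jump = usable_columns(idx, ["status", "name", "address"])
skip_first = usable_columns(idx, ["status", "address"])
skip_middle = usable_columns(idx, ["name", "address"])
```

Here `no_jump` uses all three columns even though the conditions are listed in a different order, `skip_first` uses none, and `skip_middle` uses only name.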
Index design principles
For tables
Tables with high query frequency and a large amount of data
For fields
Fields are best taken from the conditions of the where clause. If the where clause has many combinations of conditions, choose the combination of the most commonly used columns with the best filtering effect.
Other principles
Prefer unique indexes. The higher the selectivity, the more efficient the index.
More indexes are not always better. Maintaining them costs time and space. It is recommended to have no more than 5 indexes on a single table.
This is because, when choosing how to optimize a query, the MySQL optimizer evaluates each usable index according to statistics in order to generate the best execution plan. If many indexes can be used for the same query, the optimizer takes longer to generate the execution plan, which also reduces query performance.
For example:
We created three single-column indexes: name, status, address.
When the where clause filters on the status and address fields, the database selects only the best index and does not use all of the single-column indexes.
Optimal index: specifically, the index column with the highest selectivity for the query (the one whose matching values make up the smallest proportion of the table), for example a highly distinctive value such as "Xi'an" in address.
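The notion of selectivity used above can be illustrated with a short Python sketch. The sample column values are invented, and the real MySQL optimizer relies on its own statistics and cost estimates rather than this simple ratio.

```python
def selectivity(values):
    """Distinct values divided by total rows; closer to 1 means more selective."""
    return len(set(values)) / len(values)


# Hypothetical column contents, for illustration only.
columns = {
    "status": ["1", "0", "1", "1", "0", "1"],
    "address": ["Xi'an", "Beijing", "Beijing", "Shanghai", "Beijing", "Shenzhen"],
}

# Pick the column whose single-column index would filter best.
best = max(columns, key=lambda name: selectivity(columns[name]))
```

With this sample data address has four distinct values in six rows while status has only two, so `best` is `"address"`.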
Use short indexes. After it is created, an index is also stored on disk, so improving the efficiency of index access improves overall access efficiency. If the total length of the fields that make up the index is short, more index values fit in a storage block of a given size, which effectively improves the efficiency of MySQL's index access.
Use the leftmost prefix. For N fields we do not necessarily need to create N indexes; we can use a composite index.
In other words, prefer composite indexes over single-column indexes.
Creating a composite index:
CREATE INDEX idx_name_email_status ON tb_seller (name,email,status);
is equivalent to:
creating an index on name;
creating an index on name,email;
creating an index on name,email,status;
Here is an example.
Suppose we have a table with id as the primary key and no other indexes:
CREATE TABLE `tuser` (`id` int(11) NOT NULL, `name` varchar(32) DEFAULT NULL, `age` int(11) DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB;
What principles should we follow if we want to build a composite index here?
By adjusting the order, you can maintain one less index.
For example, suppose our business needs two kinds of query: by name, and by name and age.
If we build the index (age,name), then by the leftmost prefix principle it can serve queries by age and queries by age and name, but not queries by name alone (because of the skipped step). To meet our requirements we would have to build another index on name.
If we change the order to (name,age), we meet our requirements without maintaining a separate name index. That is, by adjusting the order we maintain one less index.
Consider space: prefer the shorter index
For example, suppose our business needs three kinds of query: by name, by age, and by name and age.
We have two options:
Option 1: create a composite index (name,age) and a single-column index on age.
Option 2: create a composite index (age,name) and a single-column index on name.
Both options meet our needs, so at this point we have to consider space. The name field is larger than the age field, so option 1 clearly consumes less space, and we prefer option 1.
When to build an index
Fields queried in where clauses
Fields used to join with other tables in a query, such as foreign keys
Fields used for sorting
Fields used for statistics or grouping
When not to build an index
Tables with very little data
Tables that change frequently
Fields that are updated frequently
Fields whose values are heavily duplicated and evenly distributed (if a column contains a lot of duplicate data, the binary search within the multi-way tree is of little use; the O(log n) lookup can be thought of as degraded)
Index-related syntax
Create an index
By default, an index named PRIMARY is created for the primary key.
CREATE [UNIQUE | FULLTEXT | SPATIAL] INDEX index_name [USING index_type] ON tbl_name (index_col_name,...)
index_col_name: column_name [(length)] [ASC | DESC]
View indexes
Adding \G at the end of the statement displays the result vertically.
show index from tbl_name\G
Delete an index
drop index index_name on tbl_name;
Alter a table to add indexes
1) alter table tb_name add primary key (column_list); This statement adds a primary key, which means the index values must be unique and cannot be NULL.
2) alter table tb_name add unique index_name (column_list); The values of the index created by this statement must be unique (except NULL, which may appear multiple times).
3) alter table tb_name add index index_name (column_list); This statement adds a normal index, whose values may appear multiple times.
4) alter table tb_name add fulltext index_name (column_list); This statement specifies a FULLTEXT index, used for full-text search.
View index usage
show status like 'Handler_read%'; -- view index usage for the current session
show global status like 'Handler_read%'; -- view global index usage
Handler_read_first: the number of times the first entry in an index was read. If this value is high, the server is performing many full index scans (the lower the better).
Handler_read_key: the number of times a row was read based on an index value. If indexes are working, this value is high; a low value means indexes are rarely used and bring little performance benefit (the higher the better).
Handler_read_next: the number of requests to read the next row in key order. This value increases when you query an index column with a range constraint or perform an index scan.
Handler_read_prev: the number of requests to read the previous row in key order. This read method is mainly used to optimize ORDER BY ... DESC.
Handler_read_rnd: the number of requests to read a row based on a fixed position. This value is high if you run many queries that need their results sorted. You probably have many queries that require MySQL to scan entire tables, or joins that do not use keys properly. A high value means inefficiency, and indexes should be added to remedy it.
Handler_read_rnd_next: the number of requests to read the next row in the data file. This value is high if you are doing many table scans. It usually indicates that your tables are not indexed properly or that your queries are not written to take advantage of the indexes.