2025-01-14 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article looks at how MySQL indexes work, a topic many people know only superficially. The material below is a summary intended to make it easier to understand.
0 Preface
This article does not cover the basics of indexing. It focuses on the principles behind B+ tree indexes in MySQL. Some of the material draws on the book MySQL Technology Insider and doubles as my summary of it. For background on B-trees and B+ trees, see my blog post "If the interviewer asks you about B-trees and B+ trees, hand them this article."
1 Index management
MySQL has several kinds of index: ordinary (non-unique) index, unique index, primary key index, composite index, and full-text index. Let's look at how to create and delete each of them.
1.1 How indexes are created
An index can be created in several ways.
Create an index directly
CREATE [UNIQUE | FULLTEXT] INDEX index_name ON table_name (column_name(length))
[UNIQUE | FULLTEXT]: the optional index type, either a unique index or a full-text index. If omitted, an ordinary index is created.
table_name: the name of the table to add the index to.
column_name(length): column_name is the column to index; length is optional and indexes only the first length characters of the column (a prefix index).
Add an index by altering the table
ALTER TABLE table_name ADD [UNIQUE | FULLTEXT] INDEX index_name (column(length))
Create an index when you create the table
CREATE TABLE `table` (`id` int(11) NOT NULL AUTO_INCREMENT, `title` char(255) NOT NULL, PRIMARY KEY (`id`), [UNIQUE | FULLTEXT] INDEX index_name (title(length)))
1.2 How primary key indexes and composite indexes are created
The statements above cover ordinary, unique and full-text indexes. Primary key indexes and composite indexes are created slightly differently, so they are discussed separately.
Creating a composite index
Create the index when you create the table
CREATE TABLE `table` (`id` int(11) NOT NULL AUTO_INCREMENT, `title` char(255) NOT NULL, PRIMARY KEY (`id`), INDEX index_name (id, title))
Add the index by altering the table
ALTER TABLE table_name ADD INDEX name_city_age (name, city, age)
Creating a primary key index
A primary key index is a special unique index: a table can have only one primary key, and its columns cannot contain NULL. The primary key index is normally created together with the table.
CREATE TABLE `table` (`id` int(11) NOT NULL AUTO_INCREMENT, `title` char(255) NOT NULL, PRIMARY KEY (`id`))
1.3 Deleting an index
An index can be deleted with either DROP INDEX or ALTER TABLE. As with CREATE INDEX, MySQL handles DROP INDEX as an ALTER TABLE statement. The syntax is as follows.
(1) DROP INDEX index_name ON table_name
(2) ALTER TABLE table_name DROP INDEX index_name
(3) ALTER TABLE table_name DROP PRIMARY KEY
The third form is only for dropping the primary key. Because a table can have only one PRIMARY KEY, no index name is needed.
1.4 A worked example
With the basics covered, let's walk through a concrete example.
Step 1: create the table
CREATE TABLE table_index (id int(11) NOT NULL AUTO_INCREMENT, title char(255) NOT NULL, PRIMARY KEY (id))
Step 2: add indexes
First, add an ordinary index with CREATE INDEX.
CREATE INDEX idx_a ON table_index (title)
Next, add a unique index by altering the table.
ALTER TABLE table_index ADD UNIQUE INDEX idx_b (title)
Finally, add a composite index.
ALTER TABLE table_index ADD INDEX idx_id_title (id, title)
That covers every way of creating an index described above.
Step 3: view index information with SHOW INDEX
To see the indexes on a table, use SHOW INDEX. The following statement lists the indexes on table_index.
SHOW INDEX FROM table_index\G
Each field in the output has the following meaning.
Field: Explanation
Table: the table the index belongs to.
Non_unique: whether the index is non-unique. 0 means the index is unique, that is, it cannot contain duplicate values; otherwise it is 1.
Key_name: the name of the index. For the primary key it is PRIMARY.
Seq_in_index: the position of the column within the index, starting at 1. For a composite index, columns are numbered in the order in which they were listed when the index was created.
Collation: how the column is stored in the index. It can be A or NULL. For a B+ tree index it is always A, meaning sorted.
Sub_part: whether only part of the column is indexed. If only the first 100 characters are indexed, it shows 100. If the whole column is indexed, it shows NULL.
Packed: whether the key is compressed. If not, it is NULL.
Index_type: the type of the index. InnoDB's ordinary indexes are B+ trees, so this shows BTREE.
Step 4: delete indexes
Drop an index directly
DROP INDEX idx_a ON table_index
Drop an index by altering the table
ALTER TABLE table_index DROP INDEX idx_b
1.5 The Cardinality field
The fields above are straightforward, but Cardinality deserves special attention: it is an estimate of the number of distinct values in the index, and the optimizer uses it to decide whether to use the index. A B+ tree index is only worthwhile on a highly selective column, that is, a column whose values span a wide range. A name column, for example, holds many different names, so its selectivity is high.
In general, you can judge whether an index is worthwhile from the ratio of Cardinality to the number of rows in the table. If the ratio is close to 1, the index is useful. If it is very small, consider whether the index is needed at all.
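One way to read "close to 1" is as the ratio of Cardinality to the table's row count. The following is a minimal sketch of that check. The function name and the sample figures (a name column and a two-valued gender column) are hypothetical and only illustrate the arithmetic.

```python
def index_selectivity(cardinality, row_count):
    """Return Cardinality / row_count, or 0.0 for an empty table."""
    if row_count <= 0:
        return 0.0
    return cardinality / row_count


# Hypothetical figures, not taken from a real table.
name_selectivity = index_selectivity(cardinality=980_000, row_count=1_000_000)
gender_selectivity = index_selectivity(cardinality=2, row_count=1_000_000)

print(f"name:   {name_selectivity:.6f}")    # close to 1, index is worthwhile
print(f"gender: {gender_selectivity:.6f}")  # close to 0, index is questionable
```

In practice both inputs would come from SHOW INDEX and a row count of the table.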
Note that this value is not updated in real time. To refresh it, run ANALYZE TABLE, for example:
ANALYZE TABLE table_index
Because the table contains no data yet, the value stays at zero.
How InnoDB updates Cardinality
In the InnoDB storage engine, Cardinality statistics are updated during INSERT and UPDATE operations. Updating them on every such operation would add too much load, so InnoDB refreshes the statistics only when one of the following holds:
1/16 of the data in the table has changed since the last update.
The InnoDB counter stat_modified_counter > 2000000000
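The two conditions above can be sketched as a small predicate. This is only an illustration of the rule as described here, not InnoDB's actual code: the function and parameter names are hypothetical, and treating "1/16 of the data" as "at least 1/16 of the rows" is an assumption.

```python
STAT_MODIFIED_COUNTER_LIMIT = 2_000_000_000


def should_refresh_cardinality(rows_changed, total_rows, stat_modified_counter):
    """Return True when either refresh condition described in the text holds."""
    # Assumption: "1/16 of the data has changed" means at least 1/16 of the rows.
    changed_enough = total_rows > 0 and rows_changed * 16 >= total_rows
    counter_exceeded = stat_modified_counter > STAT_MODIFIED_COUNTER_LIMIT
    return changed_enough or counter_exceeded


print(should_refresh_cardinality(200, 1600, 0))  # more than 1/16 changed
print(should_refresh_cardinality(10, 1600, 0))   # too few changes so far
```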
By default, InnoDB estimates Cardinality by sampling 8 leaf pages. The process is as follows:
Get the number of leaf pages in the B+ tree index, recorded as A.
Pick 8 leaf pages of the B+ tree index at random and count the number of distinct records on each page, recorded as p1, p2, ..., p8.
From the samples, estimate Cardinality as (p1 + p2 + ... + p8) * A / 8
Because the sampling is random, the Cardinality value can differ each time it is computed. The one exception is an index with 8 or fewer leaf pages: every sample then picks the same pages, so the result is always the same.
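The sampling steps above can be sketched in a few lines. This is a simplified model, not InnoDB's implementation: leaf pages are represented as plain Python lists of key values, the function name is hypothetical, and dividing by the number of pages actually sampled (when there are fewer than 8) and rounding down are assumptions.

```python
import random


def estimate_cardinality(leaf_pages, sample_size=8, rng=None):
    """Estimate Cardinality as (p1 + ... + pN) * A / N from sampled leaf pages."""
    if rng is None:
        rng = random.Random()
    page_count = len(leaf_pages)  # A in the formula
    if page_count == 0:
        return 0
    if page_count <= sample_size:
        sampled = list(leaf_pages)  # every page is sampled, result is stable
    else:
        sampled = rng.sample(leaf_pages, sample_size)
    # p_i is the number of distinct key values on sampled page i.
    distinct_total = sum(len(set(page)) for page in sampled)
    return distinct_total * page_count // len(sampled)


small_index = [[1, 2, 3], [4, 4, 5], [6, 7, 8]]
print(estimate_cardinality(small_index))  # same answer on every run
```

With more than 8 pages whose contents differ, repeated calls can return different estimates, which matches the behaviour described above.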
1.6 Fast Index Creation
Before MySQL 5.5, adding or dropping an index required creating a temporary table, importing the data into it, deleting the original table, and renaming the temporary table to the original name. For a large table this is very time-consuming, which was a serious drawback.
Since version 1.0.x, the InnoDB storage engine supports Fast Index Creation (FIC).
With FIC, creating a secondary index places an S (shared) lock on the table and does not rebuild the table. Dropping a secondary index only requires updating the internal views and marking the index's space as available. Both operations are therefore much faster.
1.7 Online DDL
MySQL 5.6 introduced online data definition (Online DDL): while a secondary index is being created, other DML operations such as INSERT, UPDATE and DELETE are still allowed, which greatly improves database availability.
The index can be created with the extended syntax:
ALTER TABLE table_name ADD [UNIQUE | FULLTEXT] INDEX index_name (column(length)) [ALGORITHM = {DEFAULT | INPLACE | COPY}] [LOCK = {DEFAULT | NONE | SHARED | EXCLUSIVE}]
ALGORITHM specifies how the index is created or dropped
COPY: build a temporary table
INPLACE: no temporary table is needed
DEFAULT: decided by the old_alter_table parameter. If it is OFF, INPLACE is used.
LOCK specifies how the table is locked while the index is created
NONE: no lock is taken
SHARED: take an S lock; reads can proceed, writes must wait
EXCLUSIVE: take an X lock; neither reads nor writes can proceed concurrently
DEFAULT: first check whether NONE can be used; if not, check whether SHARED can be used; if not, use EXCLUSIVE.
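The fallback order of LOCK = DEFAULT can be sketched as a simple preference list. MySQL makes this decision internally; the helper below and its `supported_modes` argument are hypothetical and only illustrate the order NONE, then SHARED, then EXCLUSIVE.

```python
# Least restrictive mode first, as described for LOCK = DEFAULT.
LOCK_PREFERENCE = ("NONE", "SHARED", "EXCLUSIVE")


def resolve_default_lock(supported_modes):
    """Return the least restrictive lock mode that the operation supports."""
    for mode in LOCK_PREFERENCE:
        if mode in supported_modes:
            return mode
    raise ValueError("operation supports no known lock mode")


print(resolve_default_lock({"SHARED", "EXCLUSIVE"}))
```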
2 Using B+ tree indexes
2.1 Composite indexes
A composite index is an index on multiple columns of a table. This section explains composite indexes through several examples.
First, create a table.
CREATE TABLE t_index (a char(2) NOT NULL DEFAULT '', b char(2) NOT NULL DEFAULT '', c char(2) NOT NULL DEFAULT '', d char(2) NOT NULL DEFAULT '') ENGINE myisam CHARSET utf8
Create a composite index
ALTER TABLE t_index ADD INDEX abcd (a, b, c, d)
Insert some test data
INSERT INTO t_index VALUES (...), (...), (...), (...), (...), (...)
With the data in place, we can look at composite indexes in more depth.
When do we need a composite index?
The purpose of any index is to speed up queries, and a composite index does this for queries with several conditions. In the table above, for example, queries that filter on several columns at once call for a composite index.
When does a composite index take effect?
We often create composite indexes without knowing when they are actually used and when they are not.
To answer that, let's look at the leftmost matching principle.
Leftmost matching principle: a composite index is matched starting from its leftmost column. A query can use the index only if its conditions include the leftmost column.
Let's look at this principle with a few examples.
EXPLAIN SELECT * FROM t_index WHERE a = 'a'\G
The plan shows that the index is used, because the query condition includes the leftmost column a. How many columns of the index are used? That is what the key_len field tells us. A utf8 character takes up to 3 bytes and each column is char(2), that is, two characters, so one column contributes 2 * 3 = 6 bytes. key_len is 6, so only one column of the index (a) is used.
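The key_len arithmetic can be written out as a short sketch. It assumes NOT NULL CHAR columns in MySQL's 3-byte utf8 charset, as in t_index; the helper names are hypothetical.

```python
# Assumption: MySQL's utf8 charset reserves up to 3 bytes per character.
UTF8_BYTES_PER_CHAR = 3


def char_key_len(char_length, bytes_per_char=UTF8_BYTES_PER_CHAR):
    """Bytes that a NOT NULL CHAR(char_length) column contributes to key_len."""
    return char_length * bytes_per_char


def columns_used(key_len, per_column_lens):
    """Count how many leading index columns fit into the reported key_len."""
    used = 0
    remaining = key_len
    for column_len in per_column_lens:
        if remaining < column_len:
            break
        remaining -= column_len
        used += 1
    return used


abcd_lens = [char_key_len(2)] * 4   # index abcd on four char(2) columns
print(columns_used(6, abcd_lens))   # key_len 6: only column a
print(columns_used(12, abcd_lens))  # key_len 12: columns a and b
```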
EXPLAIN SELECT * FROM t_index WHERE b = 'b2'\G
Here the index is not used: possible_keys is empty and rows is 6 (all six rows of test data), which means a full table scan. The condition does not include the leftmost column, so the leftmost matching principle is not satisfied.
EXPLAIN SELECT * FROM t_index WHERE a = 'a2' AND b = 'b2' ORDER BY d\G
This case adds a sort. The index is used, and key_len is 12, so two columns of the index, a and b, are used. The Extra column also shows Using filesort. The reason is that the query matches on a and b but sorts by d, and neither (a, d) nor (b, d) is kept in sorted order by the index. One benefit of a composite index is that the column following the matched prefix is already sorted. Here that column is c, not d, so sorting by c gives a different result.
EXPLAIN SELECT * FROM t_index WHERE a = 'a2' AND b = 'b2' ORDER BY c\G
When sorting by c, the rows matched through a and b are already in c order in the index, so no filesort is needed.
The examples above should make the behaviour of composite indexes clear. To finish, here are a few common questions.
Q1: Why not create an index on every column of the table?
First, creating and maintaining an index takes time, and that time grows with the amount of data.
Second, an index takes physical space. Besides the space used by the table data, every index occupies additional space, and a clustered index needs even more.
Third, whenever rows are inserted, deleted or updated, the indexes must be maintained too, which slows down writes.
Q2: Why use composite indexes?
Lower overhead. Building one composite index (col1, col2, col3) is effectively equivalent to building three indexes: (col1), (col1, col2) and (col1, col2, col3). Every additional index adds write overhead and disk space, so for large tables a composite index reduces overhead considerably.
Covering index. With the composite index (col1, col2, col3), the query select col1, col2, col3 from test where col1 = 1 and col2 = 2 can be answered by traversing the index alone, without going back to the table, which saves many random I/O operations. Reducing I/O, random I/O in particular, is one of a DBA's main optimization strategies, so in practice covering indexes are a key way to improve performance.
Higher efficiency. The more columns in the index, the fewer rows remain after the index filter. Take a table with 10 million rows and the query select * from table where col1 = 1 and col2 = 2 and col3 = 3, and assume each condition matches 10% of the rows. With a single-column index, the index narrows the result to 10 million * 10% = 1 million rows, and the server must then go back to the table and find the rows matching col2 = 2 and col3 = 3 among those 1 million before sorting and paginating. With a composite index, the index alone narrows the result to 10 million * 10% * 10% * 10% = 10,000 rows, which is far more efficient.
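The filtering arithmetic from the last point can be checked with a short sketch. It assumes, as the text does, that each condition independently matches 10% of the rows; the function name is hypothetical, and Fraction is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def rows_left_after_index(total_rows, match_fraction, conditions_in_index):
    """Rows remaining once the index has applied the given number of conditions."""
    return int(total_rows * match_fraction ** conditions_in_index)


TOTAL_ROWS = 10_000_000
TEN_PERCENT = Fraction(1, 10)

single_column = rows_left_after_index(TOTAL_ROWS, TEN_PERCENT, 1)
composite = rows_left_after_index(TOTAL_ROWS, TEN_PERCENT, 3)

print(single_column)  # rows still to be checked against the table
print(composite)      # rows left when all three conditions use the index
```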
Covering index
With a covering index, the query gets everything it needs from a secondary index without reading the rows in the clustered index. One benefit is that a secondary index does not hold the whole row, so it is much smaller than the clustered index and needs far less I/O. Another is that it helps with certain statistical queries. Consider the following example.
EXPLAIN SELECT COUNT(*) FROM t_index\G
With the MyISAM engine, the Extra column shows Select tables optimized away: MyISAM stores the total row count and returns it directly, so no covering-index optimization is needed.
With the InnoDB engine, the Extra column shows Using index, which means the optimizer used a covering index.
2.2 Index hints
MySQL supports index hints, which tell the optimizer explicitly which index to use. Index hints (INDEX HINT) are generally useful in two situations:
The MySQL optimizer picks the wrong index, making the SQL statement run very slowly.
A SQL statement has so many candidate indexes that the optimizer may spend more time choosing an execution plan than executing the statement itself.
Let's continue with the earlier example and first add some indexes to the t_index table.
ALTER TABLE t_index ADD INDEX a (a); ALTER TABLE t_index ADD INDEX b (b); ALTER TABLE t_index ADD INDEX c (c)
Next, run the following statement
EXPLAIN SELECT * FROM t_index WHERE a = 'a' AND b = 'b' AND c = 'c'\G
This statement can use any of several indexes. We can use an index hint to ask explicitly for index a:
EXPLAIN SELECT * FROM t_index USE INDEX (a) WHERE a = 'a' AND b = 'b' AND c = 'c'\G
USE INDEX only suggests index a. If the optimizer still does not choose the index you want, use FORCE INDEX instead.
EXPLAIN SELECT * FROM t_index FORCE INDEX (a) WHERE a = 'a' AND b = 'b' AND c = 'c'\G
With FORCE INDEX, the optimizer uses the index you named.
2.3 Index optimization
Multi-Range Read optimization
MySQL 5.6 introduced this optimization. It aims to reduce random disk access by turning it into more sequential access, and it applies to queries of type range, ref and eq_ref.
Benefits of Multi-Range Read optimization:
Data access becomes more sequential.
Fewer pages are replaced in the buffer pool.
Lookups by key value are processed in batches.
The mrr and mrr_cost_based flags of the optimizer_switch parameter control whether Multi-Range Read is enabled. The following setting keeps it always on:
SET @@optimizer_switch='mrr=on,mrr_cost_based=off'
Index Condition Pushdown (ICP) optimization
This optimization is also available from MySQL 5.6. Without it, an index query first fetches records through the index and then filters them against the WHERE conditions. With ICP, MySQL checks while reading the index whether the WHERE conditions can be evaluated there, that is, part of the WHERE filtering is pushed down to the storage engine layer, which greatly reduces the number of records the SQL layer has to request.
ICP supports queries of type range, ref, eq_ref and ref_or_null, and works with the MyISAM and InnoDB storage engines.
ICP can be turned on with the following statement:
SET @@optimizer_switch = "index_condition_pushdown=on"
Or turned off:
SET @@optimizer_switch = "index_condition_pushdown=off"
When ICP is in effect, the Extra column of the execution plan shows Using index condition.
3 Characteristics, advantages, disadvantages and applicable scenarios of indexes
Characteristics of indexes
They speed up data retrieval.
They slow down inserts, updates, deletes and other maintenance operations.
They can only be created on tables, not on views.
They can be created directly or indirectly.
Advantages of indexes
A unique index guarantees the uniqueness of each row in the table.
They greatly speed up data retrieval.
They speed up joins between tables, which is particularly useful for enforcing referential integrity.
They can significantly reduce query time when grouping and sorting clauses are used.
With indexes available, the optimizer can apply optimizations that improve system performance.
Disadvantages of indexes
First, creating and maintaining an index takes time, and that time grows with the amount of data.
Second, an index takes physical space. Besides the space used by the table data, every index occupies additional space, and a clustered index needs even more.
Third, whenever rows are inserted, deleted or updated, the indexes must be maintained too, which slows down writes.
Applicable scenarios for indexes
Match the full value
Every column in the index is given a specific value, that is, there is an equality condition on every column of the index.
Match a range of values
A range lookup can be done on the indexed value.
Match the leftmost prefix
Only the leftmost columns of the index are used. For example, a composite index on col1 + col2 + col3 can be used by equality queries on col1, (col1, col2) or (col1, col2, col3), but not by equality queries on col2 or (col2, col3).
The leftmost matching principle can be regarded as the first rule for using B-Tree indexes in MySQL.
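The leftmost prefix rule for equality conditions can be sketched as follows. This is a simplified model that ignores range conditions and the optimizer's cost estimates; the function name is hypothetical.

```python
def usable_index_prefix(index_columns, equality_columns):
    """Return the leading index columns that the equality conditions can use."""
    wanted = set(equality_columns)
    prefix = []
    for column in index_columns:
        if column not in wanted:
            break  # the first gap ends the usable prefix
        prefix.append(column)
    return prefix


index_columns = ["col1", "col2", "col3"]
print(usable_index_prefix(index_columns, ["col1", "col2"]))  # two columns usable
print(usable_index_prefix(index_columns, ["col2", "col3"]))  # empty: index unusable
```

An empty result corresponds to the cases where the composite index cannot be used at all.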
Query only the index
A query is more efficient when all the columns it selects are in the index, so avoid select * and select only the columns you need.
Match a column prefix
Only the first column of the index is used, and only the beginning of that column's value is matched.
Match one part of the index exactly and another part by range.
If a column is indexed, a column_name IS NULL condition can use the index. For example, the following uses the index:
EXPLAIN SELECT * FROM t_index WHERE a IS NULL\G
Columns that often appear after ORDER BY, GROUP BY or DISTINCT
Result-set columns of set operations such as UNION
Columns often used in table joins
Consider a covering index for data that is rarely updated. If users often query a few particular columns, an index on those columns turns a table scan into an index scan.
When indexes are not used
LIKE queries whose pattern starts with % cannot use a B-Tree index. The key column of the execution plan is NULL, which means no index is used.
Indexes are also not used when the indexed column undergoes implicit type conversion or is part of an expression, for example where age + 10 = 30
Applying a function to an indexed column prevents index use for the same reason
Regular expression matches do not use indexes
Comparing a string column with a number does not use the index
With a composite index, if the query conditions do not include the leftmost column of the index, that is, the leftmost matching principle is not satisfied, the composite index is not used.
If MySQL estimates that using an index would be slower than a full table scan, it does not use the index
For conditions joined with OR, if the column before OR is indexed but the column after it is not, none of the indexes involved are used.
Negative conditions (NOT, NOT IN, NOT LIKE, !=) generally do not use the index