In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
The main purpose of designing an index is to help us get query results quickly, while like queries that start with% cannot use B-Tree indexes.
Considering that all the tables in innodb are clustered tables (similar to the index organization tables in oracle), and the structure of the records in the secondary index leaf node is (index field-> primary key field), we can take a lightweight approach instead of full table scanning by rewriting sql (the mysql optimizer is stupid and needs to be prompted enough):
Use the index full scan to find the primary key, and then return to the table according to the primary key to get the data.
The speed advantage of this method is particularly obvious when the length of a single row of records is larger and there are more records in the table, because the IO overhead of full index scan is much less than that of full table scan.
The lessons learned on paper are not profound, so you must put them into practice:
Create a test table test with self-incrementing primary key primary (id) and secondary index idx_name1 (name1) with 5 million pieces of data in the table.
Mysql > desc test +-+ | Field | Type | Null | Key | Default | Extra | +- -+-+ | id | int (11) | NO | PRI | NULL | auto_increment | | name1 | varchar (20) | YES | MUL | NULL | | name2 | varchar (20) | YES | | NULL | | name3 | varchar (20) | YES | NULL | | name4 | varchar | (20) | YES | | NULL | | name5 | varchar (20) | YES | | NULL | | name6 | varchar (20) | YES | | NULL | | name7 | varchar (20) | YES | | NULL | | name8 | varchar (20) | YES | NULL | | name9 | varchar (20) | YES | | | NULL | | name10 | varchar (20) | YES | | NULL | | +-+-+ 11 rows in set (0.01 sec) mysql > show index from test\ gateway * | * 1. Row * * Table: test Non_unique: 0 Key_name: PRIMARY Seq_in_index: 1 Column_name: id Collation: a Cardinality: 4829778 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: Index_comment: * * 2. Row * * Table: test Non_unique: 1 Key_name: idx_name1 Seq_in_index: 1 Column_name: name1 Collation: a Cardinality: 2414889 Sub_part: NULL Packed: NULL Null: YES Index_type: BTREE Comment: Index_comment: 2 rows in set (0.00 sec) mysql > select count (*) from test +-+ | count (*) | +-+ | 5000000 | +-+ 1 row in set (1.59 sec)
Like query based on name1 takes 11.13s. From the execution plan, sql performs a full table scan (type: ALL):
Mysql > select * from test where name1 like'% O4JljqZw%'\ row * * id: 1167352 name1: BO4JljqZws name2: BrfLU7J69j name3: XFikCVEilI name4: lr0yz3qMsO name5: vUUDghq8dx name6: RvQvSHHg4p name7: ESiDbQuK8f name8: GugFnLtYe8 name9: OuPwY8BsiYname10: O0oNGPX9IW1 row in set (11.13 sec) mysql > explain select * from Test where name1 like'% O4JljqZw%'\ Graph * 1. Row * * id: 1 select_type: SIMPLE table: test type: ALLpossible_keys: NULL key: NULL key_len: NULL ref: NULL Rows: 4829778 Extra: Using where1 row in set (0.00 sec)
Rewrite sql to 'select a. From test a, (select id from test where name1 like'% O4JljqZw%') b where a.idfolb.idternative.'
Prompt the optimizer to use the secondary index idx_name1 to get the id in the subquery:
Mysql > select a.* from test a (select id from test where name1 like'% O4JljqqZw%') b where a.id=b.id\ Graph * 1. Row * * id: 1167352 name1: BO4JljqZws name2: BrfLU7J69j name3: XFikCVEilI name4: lr0yz3qMsO name5: vUUDghq8dx name6: RvQvSHHg4p name7: ESiDbQuK8f name8: GugFnLtYe8 name9: OuPwY8BsiYname10: O0oNGPX9IW1 row in set (2. 46 sec) mysql > explain select a.* from test a (select id from test where name1 like'% O4JljqZw%') b where a.id=b.id\ Graph * 1. Row * * id: 1 select_type: PRIMARY table: type: ALLpossible_keys: NULL key: NULL key _ len: NULL ref: NULL rows: 4829778 Extra: NULL** 2. Row * * id: 1 select_type: PRIMARY table: a type: eq_refpossible_keys: PRIMARY key: PRIMARY Key_len: 4 ref: b.id rows: 1 Extra: NULL** 3. Row * * id: 2 select_type: DERIVED table: test type: indexpossible_keys: NULL Key: idx_name1 key_len: 63 ref: NULL rows: 4829778 Extra: Using where Using index3 rows in set (0.00 sec)
After rewriting, the execution time of sql has been reduced to 2.46s, and the efficiency has been improved nearly four times.
The analysis of the implementation plan is as follows:
Step 1:mysql first overwrites the secondary index idx_name1 to retrieve the qualified id (Using where; Using index).
Step 2: perform a full table scan on the result set table: derived from sub-step 1 to get id (only one id in this case meets the criteria)
Step 3: finally, according to the id in step 2, use the primary key to go back to the table to get the data (type: eq_ref,key: PRIMARY)
Summary:
When the length of each record in the table is large, the sql efficiency will be significantly improved after being rewritten in this way.
In this experiment, the length of each record is still very small (only more than 100 bytes). If the length of each record is further increased, the execution efficiency of sql will be improved by an order of magnitude after rewriting. You can verify it by yourself.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.