In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-30 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
This article mainly introduces the case study of MySQL performance optimization, which is very detailed and has certain reference value. Friends who are interested must finish it!
Preface MySQL Index underlying data structure and algorithm MySQL performance Optimization principle-previous practice (1)-- MySQL performance Optimization
In the last article, "practice (1)-MySQL performance Optimization", we talked about some principles of database table design, the introduction of Explain tools, and the best practices for optimizing indexes with SQL statements. This article continues to talk about how MySQL chooses the right index.
MySQL Trace tool
Whether MySQL finally chooses to walk the index or whether a table involves multiple indexes, and how to select the index finally, you can use the trace tool to find out. Turning on the trace tool will affect the performance of MySQL, so you can only temporarily analyze SQL use and close it immediately after use.
Case analysis
Before we talk about trace tools, let's take a look at a case study:
# sample table CREATE TABLE`employees` (`id`TABLE`employees`) NOT NULL AUTO_INCREMENT, `name`varchar (24) NOT NULL DEFAULT''COMMENT' name', `age`int (11) NOT NULL DEFAULT'0' COMMENT 'age', `position`varchar (20) NOT NULL DEFAULT''COMMENT' position', `hire_ time`timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'entry time', PRIMARY KEY (`id`), KEY `idx_name_age_ position` (`name`, `age`, `position`) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COMMENT=' employee record table' INSERT INTO employees (name,age,position,hire_time) VALUES ('ZhangSan',23,'Manager',NOW ()); INSERT INTO employees (name,age,position,hire_time) VALUES (' HanMeimei', 23 (name,age,position,hire_time) VALUES ('Lucy',23,'dev',NOW (); copy code
How to choose the right index for MySQL
EXPLAIN select * from employees where name > 'asides; copy the code
If you use name index, you need to traverse the name field federated index tree, and then go to the primary key index tree according to the traversed primary key value to find the final data. The cost is higher than the full table scan, and can be optimized with overlay index, so you only need to traverse the federated index tree of the name field to get all the results, as shown below:
EXPLAIN select name,age,position from employees where name >'a'; copy code EXPLAIN select * from employees where name > 'zzz'; copy code
For the execution results of the above two kinds of name >'a 'and name >' zzz', whether mysql finally chooses to walk the index or whether a table involves multiple indexes, and how mysql finally selects the index, we can use the trace tool to find out that opening the trace tool will affect the performance of mysql, so we can only temporarily analyze the use of sql and close it immediately after use.
Trace tool on / off Trace# enables traceset session optimizer_trace= "enabled=on", end_markers_in_json=on;# closes traceset session optimizer_trace= "enabled=off"; copy code case 1
Execute these two sentences of sql
Select * from employees where name >'a 'order by position;sELECT * FROM information_schema.OPTIMIZER_TRACE; copy the code
Put forward the trace value, see the notes for details.
{"steps": [{"join_preparation": {--first stage: SQL preparation phase "select#": 1, "steps": [{"expanded_query": "/ * select#1 * / select `employees`.`id`AS `id`, `employees`.`name`AS `name`, `employees`.`age`AS `age`, `employees`.`position`AS `position` `employees`.`hire _ AS `employees `time`from `employees`where (`employees`.`name` >'a') order by `employees`.`position` "}] / * steps * /} / * join_preparation * /}, {" join_optimization ": {--second stage: SQL optimization phase" select# ": 1 "steps": [{"condition_processing": {- conditional processing "condition": "WHERE", "original_condition": "(`employees`.`name` >'a')", "steps": [{"transformation": "equality_propagation" "resulting_condition": "(`employees`.`name` >'a')"}, {"transformation": "constant_propagation", "resulting_condition": "(`employees`.`name` >'a')"} {"transformation": "trivial_condition_removal", "resulting_condition": "(`employees`.`name` >'a')"}] / * steps * /} / * condition_processing * /} {"substitute_generated_columns": {} / * substitute_generated_columns * /}, {"table_dependencies": [--Table dependency details {"table": "`employees`", "row_may_be_null": false "map_bit": 0, "depends_on_map_bits": [] / * depends_on_map_bits * /}] / * table_dependencies * /} {"ref_optimizer_key_uses": [] / * ref_optimizer_key_uses * /}, {"rows_estimation": [--estimate the access cost of the table {"table": "`employees`" "range_analysis": {"table_scan": {--full table scan "rows": 3-- number of rows scanned "cost": 3.7-- query cost} / * table_scan * / "potential_range_indexes": [--possible index for query {"index": "PRIMARY",-- primary key index "usable": false, "cause": "not_applicable"} {"index": "idx_name_age_position",-- auxiliary index "usable": true, "key_parts": ["name", "age", "position" "id"] / * key_parts * /}, {"index": "idx_age", "usable": false "cause": "not_applicable"}] / * potential_range_indexes * /, "setup_range_conditions": [] / * setup_range_conditions * /, "group_index_range": {"chosen": false "cause": "not_group_by_or_distinct"} / * group_index_range * / "analyzing_range_alternatives": {--analyze the cost of using each index "range_scan_alternatives": [{"index": "idx_name_age_position", "ranges": ["a"
< name" --索引使用范围 ] /* ranges */, "index_pes_for_eq_ranges": true, "rowid_ordered": false, --使用该索引获取的记录是否按照主键排序 "using_mrr": false, "index_only": false, --是否使用覆盖索引 "rows": 3, --索引扫描行数 "cost": 4.61, --索引使用成本 "chosen": false, --是否选择该索引 "cause": "cost" } ] /* range_scan_alternatives */, "analyzing_roworder_intersect": { "usable": false, "cause": "too_few_roworder_scans" } /* analyzing_roworder_intersect */ } /* analyzing_range_alternatives */ } /* range_analysis */ } ] /* rows_estimation */ }, { "considered_execution_plans": [ { "plan_prefix": [ ] /* plan_prefix */, "table": "`employees`", "best_access_path": { --最优访问路径 "considered_access_paths": [ --最终选择的访问路径 { "rows_to_scan": 3, "access_type": "scan", --访问类型:为sacn,全表扫描 "resulting_rows": 3, "cost": 1.6, "chosen": true, --确定选择 "use_tmp_table": true } ] /* considered_access_paths */ } /* best_access_path */, "condition_filtering_pct": 100, "rows_for_plan": 3, "cost_for_plan": 1.6, "sort_cost": 3, "new_cost_for_plan": 4.6, "chosen": true } ] /* considered_execution_plans */ }, { "attaching_conditions_to_tables": { "original_condition": "(`employees`.`name` >'a') "," attached_conditions_computation ": [] / * attached_conditions_computation * /," attached_conditions_summary ": [{" table ":" `employees` " "attached": "(`employees`.`name` >'a')"}] / * attached_conditions_summary * /} / * attaching_conditions_to_tables * /}, {"clause_processing": {"clause": "ORDER BY" "original_clause": "`employees`.`position`", "items": [{"item": "`employees`.`position`"}] / * items * /, "resulting_clause_is_simple": true "resulting_clause": "`employees`.`position`"} / * clause_processing * /}, {"reconsidering_access_paths_for_index_ordering": {"clause": "ORDER BY", "index_order_summary": {"table": "`employees`" "index_provides_order": false, "order_direction": "undefined", "index": "unknown", "plan_changed": false} / * index_order_summary * /} / * reconsidering_access_paths_for_index_ordering * /} {"refine_plan": [{"table": "`employees`"}] / * refine_plan * /}] / * steps * /} / * join_optimization * /} {"join_execution": {--third phase: SQL execution phase "select#": 1, "steps": [{"filesort_information": [{"direction": "asc" "table": `employees`, "field": "position"}] / * filesort_information * /, "filesort_priority_queue_optimization": {"usable": false, "cause": "not applicable (no LIMIT)"} / * filesort_priority_queue_optimization * / "filesort_execution": [] / * filesort_execution * /, "filesort_summary": {"rows": 3, "examined_rows": 3, "number_of_tmp_files": 0, "sort_buffer_size": 200704 "sort_mode": "} / * filesort_summary * /}] / * steps * /} / * join_execution * /}] / * steps * /} copy the code
Conclusion: the cost of full table scan is lower than that of index scan, so MySQL finally chooses full table scan.
Case 2select * from employees where name > 'zzz' order by position;SELECT * FROM information_schema.OPTIMIZER_TRACE; copy code
Conclusion: looking at the trace field, we can see that the cost of index scan is lower than that of full table scan, so MySQL finally chooses index scan.
Common SQL in-depth optimization Order by and Group by optimization cases 1EXPLAIN select * from employees where name = 'ZhangSan' and position =' dev' order by age copy code
Analysis:
Use the leftmost prefix rule: the middle field cannot be broken, so the query uses the name index. From key_len = 74, you can also see that the age index column is used in the sorting process because there is no using filesort in the Extra field.
Case 2EXPLAIN select * from employees where name = 'ZhangSan' order by position copy code
Analysis:
From the execution result of explain: key_len = 74, the query uses the name index, due to the use of position for sorting, skipping age, the emergence of Using filesort.
Case 3EXPLAIN select * from employees where name = 'ZhangSan' order by age,position copy code
Analysis:
The query uses only indexes name,age and position for sorting, no Using filesort.
Case 4EXPLAIN select * from employees where name = 'ZhangSan' order by position,age copy code
Analysis:
The execution result is the same as that of explain in case 3, but Using filesort appears because the index is created in the order of name,age,position, but age and position are reversed when sorting.
Case 5EXPLAIN select * from employees where name = 'ZhangSan' and age = 18 order by position,age copy code
Analysis:
Compared with case 4, * * Using filesort * * does not appear in Extra, because age is constant and optimized in sorting, so the index is not reversed and Using filesort does not appear.
Case 6EXPLAIN select * from employees where name = 'ZhangSan' order by age asc, position desc; copy code
Analysis:
Although the sorted field columns are in the same order as the index, and order by defaults to ascending order, here position desc becomes the descending order of the column, resulting in a different sort from the index, resulting in Using filesort. This query method is supported by descending indexes in MySQL8 and above.
Case 7EXPLAIN select * from employees where name in ('ZhangSan',' hjh') order by age, position; copy code
Analysis:
For sorting, multiple equality conditions are also range queries.
Case 8EXPLAIN select * from employees where name >'a 'order by name; copy code
Can be optimized with an overlay index
EXPLAIN select name,age,position from employees where name >'a 'order by name; replication code optimization summary MySQL supports two ways of sorting filesort and index. Using index means that MySQL scans the index itself to complete the sorting. The efficiency of index is high, while that of filesort is low. When order by satisfies two situations, it uses the Using index.order by statement to use the leftmost previous example of the index. The combination of the where clause and the order by clause conditional column satisfies the leftmost preceding example of the index. Try to sort on the index column and follow the leftmost prefix rule when the index is established (the order in which the index is created). If the condition of order by is not on the index column, Using filesort is generated. You can use an overlay index as much as possible. Group by is similar to order by in that it essentially sorts and then groups, following the leftmost prefix rule in the order in which the index is created. For the optimization of group by, if you do not need sorting, you can add order by null to prohibit sorting. Note: where is higher than having, so don't go to having for the qualification that can be written in where. Using filesort file sorting principle filesort file sorting method one-way sorting: take out all the fields that meet the criteria at once, and then sort them in sort buffer; you can see the sort_mode information with trace tool
< sort_key, additional_fields >Or
< sort_key, packed_additional_fields >. Two-way sorting (also called table sorting mode): first, take out the corresponding sorting field and the row ID that can directly locate the running data according to the corresponding conditions, then sort in sort buffer, and then retrieve other needed fields again after sorting; you can see the display in the sort_mode information with the trace tool.
< sort_key, rowid >MySQL determines which sort mode to use by comparing the size of the system variable max_length_for_sort_data (default 1024 bytes) with the total size of the fields that need to be queried.
Single sort mode is used if max_length_for_sort_data is larger than the total length of the query field, and two-way sort mode is used if max_length_for_sort_data is smaller than the total length of the query field. Verify various sorting methods EXPLAIN select * from employees where name = 'ZhangSan' order by position; copy code
Take a look at the corresponding trace result of this sql as follows (only the sorted part is shown):
Set session optimizer_trace= "enabled=on", end_markers_in_json=on; # Open traceselect * from employees where name = 'ZhangSan' order by position;select * from information_schema.OPTIMIZER_TRACE Copy the code "join_execution": {--SQL execution phase "select#": 1, "steps": [{"filesort_information": [{"direction": "asc", "table": "`employees`" "field": "position"}] / * filesort_information * /, "filesort_priority_queue_optimization": {"usable": false, "cause": "not applicable (no LIMIT)"} / * filesort_priority_queue_optimization * / "filesort_execution": [] / * filesort_execution * /, "filesort_summary": {--File sorting information "rows": 1,-- estimated number of scan lines "examined_rows": 1,-- lines sorted by parameters "number_of_tmp_files": 0,-- number of temporary files used This is only if 0 represents all used sort_buffer memory sort, otherwise the disk file sort "sort_buffer_size": 200704,-- sort cache size "sort_mode": "--sort method The one-way sort} / * filesort_summary * /}] / * steps * /} / * join_execution * / copy code used here
Modify the system variable max_length_for_sort_data (default 1024 bytes). The total length of all fields in the employees table must be greater than 10 bytes.
Set max_length_for_sort_data = 10; select * from employees where name = 'ZhangSan' order by position;select * from information_schema.OPTIMIZER_TRACE; copy code
Some results sorted by trace:
"join_execution": {"select#": 1, "steps": [{"filesort_information": [{"direction": "asc", "table": "`employees`", "field": "position"}] / * filesort_information * / "filesort_priority_queue_optimization": {"usable": false, "cause": "not applicable (no LIMIT)"} / * filesort_priority_queue_optimization * /, "filesort_execution": [] / * filesort_execution * /, "filesort_summary": {"rows": 1 Examined_rows: 1, number_of_tmp_files: 0, sort_buffer_size: 53248, sort_mode: "--sort by Here, use hungry two-way sorting} / * filesort_summary * /}] / * steps * /} / * join_execution * / to copy the code
The detailed process of one-way sorting:
Find the first primary key that satisfies the name='ZhangSan' condition from the index name id; takes out the whole row according to the primary key id, takes out the values of all fields, and stores them in sort_buffer; from the index name to find the next primary key that meets the name='ZhangSan' condition, id; repeats steps 2 and 3 until the data in sort_buffer is sorted by field position, and the result is returned to the client
Detailed process of two-way sorting:
Find the first primary key that satisfies name='ZhangSan' from the index name id; takes out the whole row according to the primary key id, and puts the sort field position and the primary key id into the sort_buffer; take a primary key id; from the index name that satisfies the name='ZhangSan' record and repeat steps 3 and 4 until the fields position and id in the sort_buffer are sorted by position Iterate through the sorted id and field position, and return the values of all the fields to the client according to the value of id back to the original table.
Comparing the two sorting modes, one-way sorting puts all the fields that need to be queried into sort_buffer, while two-way sorting only sorts the primary key and the fields that need to be sorted in sort_buffer, and then returns to the fields needed by the original table query through the primary key.
If the MySQL sort memory configuration is relatively small and there are no conditions to continue to increase, you can properly configure the max_length_for_sort_data to make the optimizer choose to use a two-way sorting algorithm, and you can sort more rows at once in sort_buffer, but you just need to go back to the original table to fetch data according to the primary key.
If the MySQL sorting memory conditions can be configured relatively large, you can appropriately increase the value of max_length_for_sort_data, let the optimizer first choose full-field sorting (one-way sorting), put the required fields into the sort_buffer, so that the sorted query results will be returned directly from memory.
Therefore, MySQL uses the parameter max_length_for_sort_data to control sorting and uses different sorting modes in different scenarios to improve sorting efficiency.
Note: if all use sort_buffer memory sorting will generally be more efficient than disk file sorting, but because of this can not randomly increase the sort_buffer (default 1m), MySQL many parameter settings have been optimized, do not easily adjust.
Paging query optimization
Here, let's insert some test data into employess.
Drop procedure if exists insert_emp; delimiter;; create procedure insert_emp () begin declare i int; set iTunes 1; while (I 9000 limit 5; copy code)
The query results are consistent, so let's compare the execution plan:
EXPLAIN select * from employees limit 9000 limit 5; copy code EXPLAIN select * from employees where id > 9000 limit 5; copy code
Obviously, the rewritten SQL takes the index, and the number of rows scanned is greatly reduced, and the execution efficiency is more efficient. However, this rewritten SQL is not practical in many scenarios, because some records in the table may be deleted and the primary key is vacant, resulting in inconsistent results, as shown in the following figure (delete a previous record first, and then test the original SQL and optimized SQL):
The results of the two SQL are not the same, so if the primary key is not contiguous, the optimization method described above cannot be used.
In addition, if the original SQL is a field with a non-primary key of order by, rewriting according to the above method will cause the results of the two SQL to be inconsistent. So this rewriting has to meet the following two conditions:
Case 2: paging queries sorted by non-primary key fields
Looking at a paging query sorted by non-primary key fields, the SQL is as follows:
Select * from employees ORDER BY name limit 9000.5; copy code EXPLAIN select * from employees ORDER BY name limit 900JO5; copy code
It is found that the index of the name field is not used (the value of the key field is null). The specific reason mentioned earlier is that it is more expensive to scan the entire index and find unindexed rows (which may have to traverse multiple index trees) than to scan the whole table, so the optimizer gives up using the index. Know the reason for not going to the index, so how to optimize it? In fact, the key is to make the fields returned during sorting as few as possible, so you can let the sorting and paging operations find the primary key first, and then find the corresponding record according to the primary key. SQL is rewritten as follows:
Select * from employees e inner join (select id from employees order by name limit 90000Power5) ed on e.id = ed.id; copy code
The required result is the same as the original SQL, and the execution time has been reduced by more than half. Let's compare the execution plan of sql before and after optimization:
EXPLAIN select * from employees e inner join (select id from employees order by name limit 90000Power5) ed on e.id = ed.id; copy code
The original SQL uses filesort sorting, while the optimized SQL uses index sorting.
Join associated query optimization # sample table CREATE TABLE `t1` (`id` INT (11) NOT NULL AUTO_INCREMENT, `a` INT (11) DEFAULT NULL, `b` INT (11) DEFAULT NULL, PRIMARY KEY (`id`), KEY `idx_ a` (`a`) ENGINE = INNODB AUTO_INCREMENT = 10001 DEFAULT CHARSET = utf8;CREATE TABLE T2 LIKE T1; copy code
Insert 10,000 rows of records into T1 table and 100 rows of records into T2 table
# t 110,000 records drop procedure if exists insert_emp_t1; delimiter;; create procedure insert_emp_t1 () begin declare i int; set item1; while (I)
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.