Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Note: a detailed explanation of the operation of improving the execution efficiency of SQL

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Scene

The database I use is MySQL5.6. Here is a brief introduction to the scenario.

Course schedule:

Create table Course (

C_id int PRIMARY KEY

Name varchar (10)

)

100 items of data

Student form:

Create table Student (

Id int PRIMARY KEY

Name varchar (10)

)

70000 items of data

Student report form SC

CREATE table SC (

Sc_id int PRIMARY KEY

S_id int

C_id int

Score int

)

70w items of data

The purpose of the inquiry is to find candidates who scored 100 in the Chinese language examination.

Query statement:

Select s. * from Student s where s.s_id in (select s_id from SC sc where sc.c_id = 0 and sc.score = 100)

Execution time: 30248.271s

Dizzy, why are you so slow? let's take a look at the query plan:

EXPLAIN

Select s. * from Student s where s.s_id in (select s_id from SC sc where sc.c_id = 0 and sc.score = 100)

Image

It is found that the index is not used, and the type is all ALL, so the first thing that comes to mind is to build an index, and the indexed field is of course the field under the where condition.

First index the c_id and score of the sc table

CREATE index sc_c_id_index on SC (c_id)

CREATE index sc_score_index on SC (score)

Execute the above query statement again for 1.054s

It is more than 3w faster, which greatly shortens the query time. It seems that the index can greatly improve the query efficiency, so it is necessary to build an index.

Most of the time, I forget to build an index, and I don't feel it at all when the amount of data is small, which makes it feel good.

But the time of 1 second is still too long, can you optimize it? take a closer look at the implementation plan:

View the optimized sql:

SELECT

`YSB`.`s`.`s _ id`AS `s_ id`

`YSB`.`s`.`name`AS `name`

FROM

`YSB`.`Student``s`

WHERE

< in_optimizer >

(

`YSB`.`s`.`s _ id`

< EXISTS >

(

SELECT

FROM

`YSB`.`SC``sc`

WHERE

(

(`YSB`.`sc`.`c _ id` = 0)

AND (`YSB`.`sc`.`score` = 100)

AND (

< CACHE >

(`YSB`.`s`.`s _ id`) = `YSB`.`sc`.`s _ id`

)

)

)

)

Add: here a friend asks how to view the optimized statement as follows:

Execute in the command window

There is type=all.

According to my previous idea, the order in which the sql should be executed is to execute the subquery first.

Select s_id from SC sc where sc.c_id = 0 and sc.score = 100

Time: 0.001s

The results are as follows:

And then execute.

Select s. * from Student s where s.s_id in (7, 29, 5, 000)

Time: 0.001s

This is quite fast. Instead of executing the query in the inner layer first, Mysql optimizes sql into an exists clause, and EPENDENT SUBQUERY,mysql executes the outer query first, then the query in the inner layer, so it will cycle 70007 times.

What about using join queries instead?

SELECT s. * from

Student s

INNER JOIN SC sc

On sc.s_id = s.s_id

Where sc.c_id=0 and sc.score=100

Here, in order to reanalyze the join query, delete the index sc_c_id_index,sc_score_index temporarily

Execution time: 0.057s

Efficiency has improved, take a look at the implementation plan:

There is a situation of joining tables here. I wonder if it is necessary to create an index on the s_id of the sc table.

CREATE index sc_s_id_index on SC (s_id)

Show index from SC

Performing a join query

Time: 1.076s, even the time has become longer, what is the reason? View the execution plan:

The optimized query statement is:

SELECT

`YSB`.`s`.`s _ id`AS `s_ id`

`YSB`.`s`.`name`AS `name`

FROM

`YSB`.`Student``s`

JOIN `YSB`.`SC``sc`

WHERE

(

(

`YSB`.`sc`.`s _ id` = `YSB`.`s`.`s _ id`

)

AND (`YSB`.`sc`.`score` = 100)

AND (`YSB`.`sc`.`c _ id` = 0)

)

It seems that the join query is done first, and then the where conditional filtering is carried out.

Go back to the previous execution plan:

Here, where conditional filtering is done first, and then the table is joined. The execution plan is not fixed, so let's first take a look at the standard sql execution order:

Normally, join is followed by where filtering, but in our case, if we join first, 70w pieces of data will be sent to join for exercise, so it is wise to perform where filtering first.

Now, in order to rule out the query optimization of mysql, I write an optimized sql myself.

SELECT

S.S. *

FROM

(

SELECT

*

FROM

SC sc

WHERE

Sc.c_id = 0

AND sc.score = 100

) t

INNER JOIN Student s ON t.s_id = s.s_id

That is, first perform the filtering of the sc table, and then join the table, the execution time is: 0.054s

It takes about the same time as before when the s_id index was not built. Check the execution plan:

It is much more efficient to extract sc first and then join tables. The problem now is that scanned tables appear when extracting sc, so it is clear that relevant indexes need to be established.

CREATE index sc_c_id_index on SC (c_id)

CREATE index sc_score_index on SC (score)

Then execute the query:

SELECT

S.S. *

FROM

(

SELECT

*

FROM

SC sc

WHERE

Sc.c_id = 0

AND sc.score = 100

) t

INNER JOIN Student s ON t.s_id = s.s_id

The execution time is 0.001s, which is 50 times faster.

Execute the plan:

We will see that indexes are used to extract sc and then connect tables.

So let's execute the sql again.

SELECT s. * from

Student s

INNER JOIN SC sc

On sc.s_id = s.s_id

Where sc.c_id=0 and sc.score=100

Execution time 0.001s

Execute the plan:

Here, mysql optimizes the query statement, first performs where filtering, and then performs join operations, and indexes are all used.

=

(I am the gorgeous dividing line)

Recently, some production data have been re-imported, and tests have found that the execution efficiency of sql optimized a few days ago has become lower again.

The data adjusted to the SC table increased to 300W, and the student scores were more discrete.

Let's review:

Show index from SC

Execute sql

SELECT s. * from

Student s

INNER JOIN SC sc

On sc.s_id = s.s_id

Where sc.c_id=81 and sc.score=84

Execution time: 0.061s, which is a little slow

Execute the plan:

The intersect union operation is used here, that is, the union of the results retrieved by the two indexes at the same time, and then look at the distinction between the fields score and c_id.

From a single field, the degree of differentiation is not very big. From the SC table, the result of c_id=81 search is 70001. The result of scorekeeper 84 is 39425.

The result of c_id=81 and score=84 is 897, that is, the degree of discrimination of the combination of the two fields is relatively high, so the query efficiency of establishing a federated index will be higher.

From another point of view, the data of the table is 300w, and there will be more in the future. as far as index storage is concerned, it is not a small number. with the increase of the amount of data, the index can not be loaded into memory, but should be read from disk. In this way, the more the number of indexes, the greater the cost of reading the disk.

Therefore, it is necessary to establish a multi-column federated index according to the specific business situation, so let's try it.

Alter table SC drop index sc_c_id_index

Alter table SC drop index sc_score_index

Create index sc_c_id_score_index on SC (clockwork score)

The time taken to execute the above query statement is 0.007s, which is acceptable.

Execute the plan:

The optimization of this statement is over for the time being.

Summary:

1. The efficiency of mysql nested subquery is indeed relatively low.

2. It can be optimized to join query.

3. When joining a table, you can first filter the table with the where condition, and then make a table join (although mysql will optimize the concatenated table statement)

4. Set up an appropriate index and, if necessary, establish a multi-column joint index

5. Learn to analyze the sql execution plan. Mysql will optimize the sql, so it is very important to analyze the implementation plan.

Index optimization

The optimization of subqueries and how to build indexes are mentioned above, and when multiple fields are indexed, a single index is established for each field.

Later, it is found that the efficiency of establishing a joint index will be higher, especially when the amount of data is large and the differentiation of a single column is not high.

Single column index

The query statement is as follows:

Select * from user_test_copy where sex = 2 and type = 2 and age = 10

Index:

CREATE index user_test_index_sex on user_test_copy (sex)

CREATE index user_test_index_type on user_test_copy (type)

CREATE index user_test_index_age on user_test_copy (age)

The sex,type,age field is indexed respectively, the amount of data is 300w, and the query time is 0.415s.

Execute the plan:

Discover type=index_merge

This is the optimization of multiple single-column indexes by mysql, and the intersect union operation is used for the result set.

Multi-column index

We can build multi-column indexes on these three columns and copy a table for testing.

Create index user_test_index_sex_type_age on user_test (sex,type,age)

Query statement:

Select * from user_test where sex = 2 and type = 2 and age = 10

The execution time is 0.032s, which is more than 10 times faster, and the higher the differentiation of the multi-column index, the more the speed is improved.

Execute the plan:

Leftmost prefix

Multi-column indexes also have the feature of leftmost prefix, execute the following statement:

Select * from user_test where sex = 2

Select * from user_test where sex = 2 and type = 2

Select * from user_test where sex = 2 and age = 10

The index is used, that is, the first field of the index, sex, appears in the where condition.

Index overlay

That is, all the columns of the query are indexed, so that when you get the result set, you don't have to go to disk to get the data of other columns, and you can directly return the index data, such as:

Select sex,type,age from user_test where sex = 2 and type = 2 and age = 10

Execution time: 0.003s, which is much faster than fetching all fields

Sort

Select * from user_test where sex = 2 and type = 2 ORDER BY user_name

Time: 0.139s

Establishing an index on the sort field will improve the efficiency of sorting.

Create index user_name_index on user_test (user_name)

Finally, a summary of sql tuning is attached for further study in the future:

1. Column types are defined as numeric types as far as possible, and the length is as short as possible, such as primary and foreign keys, type fields, and so on.

2. Establish a single-column index

3. Set up a multi-column joint index as needed.

When there is still a lot of data after a single column is filtered, the efficiency of the index will be relatively low, that is, the column will be less differentiated.

If you build an index on multiple columns, the distinction between multiple columns will be much greater, and there will be a significant improvement in efficiency.

1. Set up an overlay index according to the business scenario to query only the fields needed by the business. If these fields are covered by the index, the query efficiency will be greatly improved.

2. indexes need to be established on the fields of multi-table joins, which can greatly improve the efficiency of table joins.

3. An index needs to be established on the where condition field.

4. An index needs to be established on the sort field.

5. An index needs to be established on the grouping field.

6. Do not use operation functions on Where conditions to avoid index failure.

7. An index needs to be established on the sort field

8. An index needs to be established on the grouping field

Do not use operation functions on 9Where conditions to avoid index failure

Reference article

Http://www.cnblogs.com/linfangshuhellowored/p/4430293.html

Slow sql query

Http://tech.meituan.com/mysql-index.html

Cartesian product

Http://www.cnblogs.com/Toolo/p/3634563.html

Sql optimization

Http://www.cnblogs.com/mliang/p/3637937.html

Http://www.cnblogs.com/xwdreamer/archive/2012/07/19/2599494.html

Implementation Plan reference:

Http://www.cnblogs.com/ggjucheng/archive/2012/11/11/2765237.html

Original: https://www.cnblogs.com/tangyanbo/p/4462734.html

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report