What are the optimization skills of SQL

2025-07-15 Update. From: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31 report.

This article walks through common SQL optimization techniques for MySQL, grouped into index optimization, SQL statement optimization, and table design optimization.

I. Index optimization

In MySQL (InnoDB), an index is stored as a B+Tree. Lookups in a B+Tree are efficient, so creating suitable indexes improves SQL query performance.

1. Create an ordinary index

Index the table fields that frequently appear after the where keyword.

2. Create a composite index

If several fields often appear together after the where keyword, create a composite index on them. One further optimization: put the field that is most often queried on its own at the front of the index.

For example, suppose the two fields a and b often appear after the where keyword at the same time:

select * from t where a = 1 and b = 2; /* Q1 */

There are also many SQL statements that use field a alone as the query condition:

select * from t where a = 2; /* Q2 */

In this case we can create the composite index index (a, b), because both Q1 and Q2 can use it.
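The Q1/Q2 case can be checked with EXPLAIN. This is a minimal sketch: the table t and its columns come from the example, while the index name idx_a_b is an assumption.

```sql
-- Hypothetical index name; t(a, b) is the table from the example above.
CREATE INDEX idx_a_b ON t (a, b);

-- Q1: both columns of the composite index can be used.
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2;

-- Q2: a is the leftmost column, so the same index still applies.
EXPLAIN SELECT * FROM t WHERE a = 2;

-- A condition on b alone is not a leftmost prefix of (a, b),
-- so it is not expected to seek into idx_a_b.
EXPLAIN SELECT * FROM t WHERE b = 2;
```

In the EXPLAIN output, the key column shows which index was chosen and key_len hints at how many of its columns were matched.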

3. Leftmost prefix matching principle

If we are using a composite index, we should follow the leftmost prefix matching principle where possible. MySQL keeps matching index columns from left to right until it encounters a range condition (>, <, between, like), and then stops matching. For example, take a query whose condition is a = 1 and b = 2 and c > 3 and d = 4.

The composite index we should create is index (a, b, d, c) rather than index (a, b, c, d). Field c is a range condition, and MySQL stops index matching once it reaches a range condition, so with index (a, b, c, d) the field d could not be matched.

Note that the order of a, b, and d in the SQL statement can be arranged arbitrarily; the optimizer will still find the matching composite index.

Also note that the leftmost prefix principle applies not only to the leftmost N fields of a composite index, but also to the leftmost M characters of a single-column (string) index.

This is why the usual advice for the like keyword is to avoid full fuzzy queries (like '%Li%'): a pattern with a leading wildcard cannot use the index.

Use a right fuzzy query instead: select * from t where name like 'Li%' (query the information of all students surnamed Li).
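A sketch of both forms of the leftmost prefix rule follows. It assumes a hypothetical table t with columns a, b, c, d and name; the index names are made up for illustration.

```sql
-- Equality columns a, b, d come first; the range column c goes last.
CREATE INDEX idx_a_b_d_c ON t (a, b, d, c);

-- Compare key_len in this plan with the plan obtained from an index
-- built as (a, b, c, d): the longer key_len means more columns matched.
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c > 3 AND d = 4;

-- Leftmost characters of a string index: a trailing wildcard can use the index...
CREATE INDEX idx_name ON t (name);
EXPLAIN SELECT * FROM t WHERE name LIKE 'Li%';

-- ...while a leading wildcard cannot narrow the index range.
EXPLAIN SELECT * FROM t WHERE name LIKE '%Li%';
```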

4. Index push-down

In many cases we can also optimize SQL through index condition pushdown (ICP) on a composite index.

For example, suppose we have the composite index index (name, age) and the following SQL:

select * from user where name like 'Zhang%' and age = 10 and sex = 'm'

According to the leftmost prefix matching principle, MySQL stops matching after the name column of index (name, age). Without pushdown, the next step is to go back to the table by primary key for every matching index entry, check whether the age and sex conditions are both met, and then return the row to the client.

With index condition pushdown, MySQL does not go back to the table immediately after matching name. It first checks whether the age value stored in index (name, age) meets the condition, filters out the entries that do not, and only then goes back to the table to check sex.

In this way the index condition pushdown optimization introduced in MySQL 5.6 reduces the number of trips back to the table.
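One way to observe the effect is to toggle the optimizer switch for the current session and compare plans. The sketch below assumes the user table from the example and a hypothetical index name.

```sql
-- Assumed index for the example query; the name is hypothetical.
CREATE INDEX idx_name_age ON user (name, age);

-- With pushdown enabled (the default since MySQL 5.6), the Extra column
-- is expected to show "Using index condition".
EXPLAIN SELECT * FROM user WHERE name LIKE 'Zhang%' AND age = 10 AND sex = 'm';

-- For comparison, switch pushdown off for this session and re-run.
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM user WHERE name LIKE 'Zhang%' AND age = 10 AND sex = 'm';

-- Restore the default.
SET optimizer_switch = 'index_condition_pushdown=on';
```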

5. Covering index

In many cases we can also use a covering index to optimize SQL.

Case 1: the SQL only queries the primary key as the return value.

The leaf node of a primary key index (clustered index) holds the entire row, while the leaf node of an ordinary index (secondary index) holds the primary key value.

So when the SQL only queries the primary key value, it can be read directly from the leaf node of the secondary index, avoiding a trip back to the table.

Case 2: the fields queried by the SQL are all in the index.

Composite index: suppose we have the composite index index (name, age) and the following SQL:

select name, age from t where name like 'Zhang%'

Because name is matched with a right fuzzy query, the composite index can be used. Since the SQL only queries name and age, both of which are stored in the index, the index values can be returned directly with no need to go back to the table.
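Both cases can be sketched as follows. The table t is assumed to have a primary key id plus name, age, and an extra column addr that is not part of the index; the index name is also an assumption.

```sql
CREATE INDEX idx_name_age ON t (name, age);

-- Case 1: only the primary key is returned; the secondary index leaf already holds it.
EXPLAIN SELECT id FROM t WHERE name = 'Zhang San';

-- Case 2: every selected column lives in the index.
EXPLAIN SELECT name, age FROM t WHERE name LIKE 'Zhang%';

-- Selecting a column outside the index (addr) requires going back to the table.
EXPLAIN SELECT name, age, addr FROM t WHERE name LIKE 'Zhang%';
```

For the first two statements the Extra column would typically include "Using index", which is how EXPLAIN reports a covering index.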

6. Ordinary index

Prefer an ordinary index to a unique index where the business logic allows it.

For queries, the performance of an ordinary index differs little from that of a unique index. After finding the first match, an ordinary index has to read on to the next record to confirm there are no more matches; this is cheap when that record sits in the same data page, and only costs more when the scan has to move to the next page.

For updates, however, an ordinary index performs better than a unique index. The reason is simple: an ordinary index can use the change buffer for the update, whereas a unique index has to read the data page from disk into the buffer pool each time in order to check that the updated value is unique.

7. Prefix index

We should make good use of prefix indexes to keep index entries from getting too large.

For example, given a field addr of type varchar, building an index on the whole field [index (addr)] wastes disk space, so we choose the prefix index [index (addr(64))] instead.

When building a prefix index, pay attention to the selectivity of the prefix. Fields such as ID card numbers have poor prefix selectivity: as long as the place of birth is the same, many of the leading characters are identical, and in the worst case the query degenerates toward a full scan.

A prefix index also cannot avoid going back to the table, that is, it cannot serve as a covering index. The index holds only the first n characters of the field, so MySQL has to go back to the table to check whether the full field value matches the query value.
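A common way to pick the prefix length is to measure how well each candidate prefix differentiates rows. The sketch below assumes a hypothetical table t with a varchar column addr; the candidate lengths and the index name are illustrative.

```sql
-- Compare the selectivity of several prefix lengths with that of the full column;
-- the closer a prefix ratio is to the full ratio, the less the prefix gives up.
SELECT
  COUNT(DISTINCT addr) / COUNT(*)           AS full_selectivity,
  COUNT(DISTINCT LEFT(addr, 16)) / COUNT(*) AS prefix_16,
  COUNT(DISTINCT LEFT(addr, 32)) / COUNT(*) AS prefix_32,
  COUNT(DISTINCT LEFT(addr, 64)) / COUNT(*) AS prefix_64
FROM t;

-- Build the prefix index with the shortest length that is selective enough.
ALTER TABLE t ADD INDEX idx_addr (addr(64));
```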

How to solve it?

1. Reverse storage: for ID card numbers, the trailing digits are highly selective, so we can store the value reversed and query it like this:

select field_list from t where id_card = reverse('input_id_card_string')

2. Add a hash field and index the hash field.
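The hash-field approach might look like the sketch below. The column id_card_crc and its index name are assumptions, and CRC32 is used only as one readily available hash function.

```sql
-- Add a compact checksum column and index it instead of the long string.
ALTER TABLE t
  ADD COLUMN id_card_crc INT UNSIGNED NOT NULL DEFAULT 0,
  ADD INDEX idx_id_card_crc (id_card_crc);

-- Backfill the checksum for existing rows (new rows must set it on insert).
UPDATE t SET id_card_crc = CRC32(id_card);

-- CRC32 values can collide, so the query also compares the original value.
SELECT field_list
FROM t
WHERE id_card_crc = CRC32('input_id_card_string')
  AND id_card = 'input_id_card_string';
```

Like reverse storage, this only supports equality lookups; range queries on the original value cannot use the hash index.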

8. Clean index column

Index columns should not take part in calculations or function calls; keep the indexed column "clean".

Suppose we create an ordinary index on the birthday field of the student table.

The following SQL statement cannot take advantage of the index, because the function is applied to the indexed column:

select * from student where DATE_FORMAT(birthday, '%Y-%m-%d') = '2020-02-02'

We should change it to something like this, applying the function to the constant instead:

select * from student where birthday = STR_TO_DATE('2020-02-02', '%Y-%m-%d')
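If birthday were a DATETIME column carrying a time component (an assumption; the article does not state the column type), an equality test would only match midnight. A half-open range keeps the column bare and still covers the whole day:

```sql
-- The indexed column appears alone on the left side of each comparison.
SELECT *
FROM student
WHERE birthday >= '2020-02-02'
  AND birthday <  '2020-02-03';

-- The same idea applies to arithmetic: move the calculation to the constant side.
-- Instead of: WHERE age + 1 = 20
SELECT * FROM student WHERE age = 20 - 1;
```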

9. Extend existing indexes

Prefer extending an existing index to adding a new one, and try to keep a table to no more than 5 indexes. The more indexes a table has, the more expensive its update operations become.
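As an illustration, suppose student already has a single-column index idx_name (name) and queries start filtering on age as well. Both index names here are hypothetical.

```sql
-- Extend the existing index in one statement instead of adding a second one.
ALTER TABLE student
  DROP INDEX idx_name,
  ADD INDEX idx_name_age (name, age);

-- Review what the table already carries before adding anything new.
SHOW INDEX FROM student;
```

Because name is still the leftmost column, queries that used idx_name can use idx_name_age in the same way.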

II. SQL optimization

1. Order By optimization

1. Index the fields after order by where possible, so that MySQL can avoid sorting in sort_buffer.

Suppose a SQL statement lists all students ordered by birthday: select * from student order by birthday desc

To improve its query performance, we can index the birthday field:

CREATE INDEX index_birthday ON student (birthday)

2. Do not select unnecessary fields. If a single row is too long, MySQL uses rowid sorting instead of full-field sorting, which leads to more trips back to the table.

If we only need the students' names, ages, and birthdays, do not write select *.

Instead, query only the required fields: select name, age, birthday.
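The two points combine naturally: once only a few fields are selected, an index can both supply the order and cover the query. The sketch below uses a hypothetical covering index; whether the optimizer actually picks it depends on table size and statistics.

```sql
-- Before: with no usable index, Extra is expected to show "Using filesort".
EXPLAIN SELECT name, age, birthday FROM student ORDER BY birthday DESC;

-- Hypothetical covering index: rows are stored in birthday order and
-- name and age are available without going back to the table.
CREATE INDEX idx_birthday_name_age ON student (birthday, name, age);

-- After: the plan should read the index in order instead of sorting.
EXPLAIN SELECT name, age, birthday FROM student ORDER BY birthday DESC;
```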

2. Join optimization

When using join, use the small table as the driving table. Small table: the table with the smaller amount of data.

When using a join statement, make sure the index on the driven table can be used. Otherwise MySQL can only use the BNL (Block Nested-Loop Join) algorithm, in which case it is better not to use join at all.

Enable the BKA (Batched Key Access) algorithm so that the NLJ (Index Nested-Loop Join) algorithm can also make use of join_buffer: rows of the driven table that meet the criteria are looked up in batches, and the sequential disk reads of MRR (Multi-Range Read) then make going back to the table more efficient.

If we have to use join and the driven table has no usable index, we can use a temporary table (create temporary table xx (...) engine=innodb;) with an index on the join field to turn the BNL algorithm into a BKA algorithm, thus improving query performance.

join_buffer is an unordered array, so each comparison has to traverse the whole join_buffer. We can implement a hash join on the business side to speed up execution.
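The BKA switch and the temporary-table workaround can be sketched together. The tables t1 (small, driving) and t2 (large, no index on b), their columns, and the filter range are all assumptions made for illustration.

```sql
-- Enable BKA for the session. BKA relies on MRR, and mrr_cost_based=off makes
-- the optimizer use MRR without first estimating whether it pays off.
SET optimizer_switch = 'mrr=on,mrr_cost_based=off,batched_key_access=on';

-- Copy the needed rows of t2 into a temporary table that has an index on b.
CREATE TEMPORARY TABLE temp_t (
  id INT PRIMARY KEY,
  a  INT,
  b  INT,
  INDEX (b)
) ENGINE = InnoDB;

INSERT INTO temp_t
SELECT id, a, b FROM t2 WHERE b >= 1 AND b <= 2000;

-- The join can now look up temp_t through its index on b.
SELECT * FROM t1 JOIN temp_t ON t1.b = temp_t.b;
```

The extra cost is one scan of t2 to fill the temporary table, which is usually far cheaper than comparing every row of t1 against every row of t2.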

3. Group By optimization

If the results of a group by statement do not need to be sorted, add order by null after the statement.

Let the group by process use an index on the table where possible; this avoids both the temporary table and the additional sort.

If the amount of data group by has to aggregate is small, try to use only in-memory temporary tables; increasing the tmp_table_size parameter appropriately also helps avoid on-disk temporary tables.

If the amount of data is too large, use the SQL_BIG_RESULT hint to tell the optimizer to go straight to the sort-based algorithm for the group by result.
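The three knobs above look roughly like this, using the orders table as a stand-in and an illustrative size value. Note that MySQL 8.0 no longer sorts group by results implicitly, so order by null mainly matters on 5.7 and earlier.

```sql
-- Skip the implicit sort that older MySQL versions apply to GROUP BY results.
SELECT user_id, COUNT(*) AS order_count
FROM orders
GROUP BY user_id
ORDER BY NULL;

-- Raise the in-memory temporary table limit for this session (64 MB here).
-- The effective limit is the smaller of tmp_table_size and max_heap_table_size.
SET SESSION tmp_table_size = 67108864;

-- For large inputs, ask the optimizer to use the sort-based approach directly.
SELECT SQL_BIG_RESULT user_id, COUNT(*) AS order_count
FROM orders
GROUP BY user_id;
```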

4. OR optimization

Under the InnoDB engine, a query that combines conditions with the or keyword cannot use a composite index on those fields.

Suppose we have the following SQL on the orders table:

select id, product_name from orders where mobile = '12345678900' or user_id = 6

To improve the efficiency of this query, a natural first idea is to build the composite index index (mobile, user_id) on the fields mobile and user_id.

However, explain shows that the composite index does not appear in the execution plan, so the or condition cannot hit the composite index on mobile + user_id.

Instead, we can build an ordinary index on each of the two fields and use the union keyword, as follows:

(select id, product_name from orders where mobile = '12345678900') union (select id, product_name from orders where user_id = 6)

Now both the mobile and the user_id conditions are served by an index, so the query is efficient.
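A sketch of the index setup and the plan check for the union form, with assumed index names:

```sql
-- One ordinary index per field.
CREATE INDEX idx_mobile  ON orders (mobile);
CREATE INDEX idx_user_id ON orders (user_id);

-- Each branch of the UNION should show its own index in the key column.
EXPLAIN
SELECT id, product_name FROM orders WHERE mobile = '12345678900'
UNION
SELECT id, product_name FROM orders WHERE user_id = 6;
```

union removes duplicate rows, which matches the semantics of or when an order satisfies both conditions; union all would return such an order twice.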

5. IN optimization

The in keyword suits the case where the main (outer) table is large and the subquery table is small, while the exists keyword suits the case where the main table is small and the subquery table is large. As the query optimizer keeps improving, the two perform about the same in many scenarios, so it is also worth trying to rewrite them as join queries.

Suppose we have a SQL statement that queries all order data for VIP users:

select id from orders where user_id in (select id from user where level = 'VIP')

If explain shows that the subquery gets no help from an index, we can use a join query instead, as follows:

select o.id from orders o join user u on o.user_id = u.id and u.level = 'VIP'

Here the driven table should be user, so the primary key index of the user table can be used; that is, the BKA algorithm can be applied to improve the performance of the join query.
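For completeness, the exists form of the same query is sketched below, followed by a plan check for the join form. Comparing the EXPLAIN output of the in, exists, and join variants on real data is the safest way to choose among them.

```sql
-- EXISTS form: for each order, probe user by primary key.
SELECT o.id
FROM orders o
WHERE EXISTS (
  SELECT 1
  FROM user u
  WHERE u.id = o.user_id
    AND u.level = 'VIP'
);

-- Plan of the JOIN form from the text, for comparison.
EXPLAIN
SELECT o.id
FROM orders o
JOIN user u ON o.user_id = u.id AND u.level = 'VIP';
```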

6. Like optimization

like is used for fuzzy queries, but a full fuzzy query (a pattern with a leading wildcard) will not hit the index on the field.

Suppose we have the following SQL on the student table:

SELECT name, age, birthday FROM student WHERE name like '%Zhang%'

explain shows that the query does not hit the index.

Because the original requirement is to query all students surnamed Zhang, a full fuzzy query is unnecessary; a right fuzzy query is enough.

Replace it with the following:

SELECT name, age, birthday FROM student WHERE name like 'Zhang%'

But what if the product manager insists on full fuzzy matching? A FULLTEXT index is worth a try, although MySQL's default full-text parser does not segment Chinese text; Chinese queries need the ngram parser available since MySQL 5.7.6.

Beyond that, Elasticsearch is the ultimate weapon!
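Before reaching for an external search engine, the FULLTEXT option can be tried with a few statements. This sketch assumes MySQL 5.7.6 or later with InnoDB and the built-in ngram parser; the index name is hypothetical.

```sql
-- The ngram parser tokenizes text into fixed-size character sequences,
-- which is what makes text without word delimiters searchable.
ALTER TABLE student ADD FULLTEXT INDEX ft_name (name) WITH PARSER ngram;

-- Full-text search goes through MATCH ... AGAINST rather than LIKE.
-- In boolean mode the ngram parser treats the term as a phrase of ngrams.
SELECT name, age, birthday
FROM student
WHERE MATCH(name) AGAINST('Zhang' IN BOOLEAN MODE);
```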

III. Table design optimization

1. Data type: choose a type that is simpler or takes up less space.

Integer selection: choose tinyint, smallint, or mediumint according to the value range instead of using int directly.

String selection: if the length of a string is fixed and known, use the char type instead of the variable-length varchar type.

Floating-point selection: where precision matters, use decimal instead of double; you can also store the value in an integer column by multiplying it by 100 to keep two decimal places.

Date selection: prefer timestamp to datetime where its range is sufficient.

2. Avoid null values:

NULL values still take up space and make index maintenance more complex; updating a NULL value is prone to causing index page splits.

Use a meaningful value instead of NULL, such as the string "none".

3. Ultra-long strings:

Very long strings are a poor fit for varchar, so we usually use the text type.

However, fields of type text should be kept out of the main table where possible: split them out into a child table and associate it with the main table through the business primary key.
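The guidelines above can be pulled together in one hypothetical schema. Every table and column name below is an assumption chosen for illustration.

```sql
CREATE TABLE article (
  id          BIGINT UNSIGNED    NOT NULL AUTO_INCREMENT,
  status      TINYINT UNSIGNED   NOT NULL DEFAULT 0,       -- small value range
  view_count  MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
  lang_code   CHAR(2)            NOT NULL DEFAULT 'en',    -- fixed-length string
  price_cents BIGINT             NOT NULL DEFAULT 0,       -- price * 100 as an integer
  author      VARCHAR(64)        NOT NULL DEFAULT 'none',  -- meaningful default, not NULL
  created_at  TIMESTAMP          NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

-- The long text lives in a child table keyed by the main table's id.
CREATE TABLE article_body (
  article_id BIGINT UNSIGNED NOT NULL,
  body       TEXT            NOT NULL,
  PRIMARY KEY (article_id)
) ENGINE = InnoDB;
```

Queries that list or filter articles then read only the narrow main table, and the body is fetched by primary key when a single article is opened.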
