Programming and Optimization of MySQL High performance SQL statement 04/29 Update SLTechnology News&Howtos

Programming and Optimization of MySQL High performance SQL statement

2025-04-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

The following brings you about the preparation and optimization of MySQL high-performance SQL statements. I believe you must have read similar articles. What's the difference between what we bring to everyone? Let's take a look at the body. I believe you will gain something after reading the writing and optimization of MySQL high-performance SQL statements.

Environment and database

Because of my hobby, my system is Linux mint19; database version is MySQL 5.7.23-0ubuntu0.18.04.1 (Ubuntu). A brief introduction to how to view the MySQL version

# 1. If you connect to the server, you will directly output the Server version ckmike@ckmikePC:~$ mysql-uroot-pEnter password: Welcome to the MySQL monitor. Commands end with; or\ g.Your MySQL connection id is 7Server version: 5.7.23-0ubuntu0.18.04.1 (Ubuntu) Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.Oracle is a registered trademark of Oracle Corporation and/or itsaffiliates. Other names may be trademarks of their respectiveowners.Type 'help;' or'\ h' for help. Type'\ c'to clear the current input statement.mysql > # 2. After entering the CVM, use the command to view mysql > status-mysql Ver 14.14 Distrib 5.7.23, for Linux (x86 / 64) using EditLine wrapperConnection id: 7Current database: Current user: root@localhostSSL: Not in useCurrent pager: stdoutUsing outfile:''Using delimiter: Server version: 5.7.23-0ubuntu0.18.04.1 (Ubuntu) Protocol version: 10Connection: Localhost via UNIX socketServer characterset: latin1Db characterset: latin1Client characterset: utf8Conn. Characterset: utf8UNIX socket: / var/run/mysqld/mysqld.sockUptime: 2 days 13 hours 58 min 13 secThreads: 1 Questions: 53 Slow queries: 0 Opens: 112 Flush tables: 1 Open tables: 104 Queries per second avg: 0.000-mysql > # 3. Use system predefined functions to view mysql > select version ()-> +-+ | version () | +-+ | 5.7.23-0ubuntu0.18.04.1 | +-+ 1 row in set (0.00 sec) mysql > # 4. Use Linux to view ckmike@ckmikePC:~$ mysql-- versionmysql Ver 14.14 Distrib 5.7.23, and for Linux (x86 / 64) using EditLine wrapper#5.Linux to check the version rpm-qa with the help of the installation tool | grep mysql, of course, if it is installed using yum, you can also use yum to view yum list installed | grep mysql

Note: SQL is a standardized query language, regardless of the different implementation of each database vendor, the principle is the same, but according to the implementation of different vendors, there are some different ways to get the same result set. For example, MySQL uses limit to achieve paging, while oracle uses top before version 2005, followed by row_number (). Because the optimization of SQL must be aimed at the vendor implementation, I choose MySQL here, I use the most proficient, followed by oracle, and SQLServer except for college use, has not been used in work. I don't need to say more about the popularity of MySQL and why it is popular.

Consolidate the foundation of database

Before we talk about SQL performance optimization, we first need to master the basics of the database, because any SQL runs on the database engine, and just like explaining Linux commands and the Shell language, they are on a certain kernel or engine.

Database language classification

According to the difference of operation object and operation granularity, the database language is divided into four categories:

DQL (Data Query Language)

Database query language: I don't need to say any more about this.

DDL (Data Definition Language)

Database definition language: definition, modification and deletion of relational schemas.

DML (Data Manipulation Language)

Database operation language: insertion, modification and deletion of meta-ancestor data.

DCL (Data Control Language)

Database control language: authorization and recovery of permissions, rollback and commit of transactions, etc.

Whether it's MySQL,oracle, SQLServer and others who want to PSQL, Red Fox is a product made up of these. Similar to Android is a standard, but the various versions of Android are different, but the core is based on Android standards, but each distribution implements some different interactive features, but it is essentially the same standard. So it is very important for us to understand that the database standard is very important. No matter how the product changes, the internal core standard will not change. The only difference is that it is written differently because of different implementation, that's all.

Explanation: for the developer of SQL, the key lies in DQL, DDL, DML. And optimization happens to be these parts.

Basic concept

Relationship: it's simply a two-dimensional table.

Yuanzu: simply speaking, it is a row of a two-dimensional table, that is, a record.

Attribute: simply a column that represents an attribute, such as: sex

Note: this is a simple personal understanding, in fact, the definition of professional in my opinion is very mouthful, so I copied those theories, but please do not clearly grasp, understand thoroughly, my description may mislead you.

Set and algebraic operation

The math class is about to start again. OK, I'm not going to talk about those boring data theories here. Generally speaking, a set is composed of a class of elements with common attributes. For example, people are a concept of collection: they are all composed of various organs and tissues, such as eyes, nose, mouth, head, torso, feet, and so on. There are also, for example, gender, name, age, height, weight and other attributes. And a person is an element in the collection of people. In combining the basic concept is not a collection similar to a two-dimensional table (relationship), while the specific person is a meta-ancestor.

When talking about a set, it is necessary to mention the operation of the set: Union, difference, Cartesian product, selection, projection.

Continue the collection of people above to simply say and operate:

And operation: for example, Chinese = (Beijing people), (Tianjin people),. (Taiwanese), (Hong Kong people), (Macao people) and other provinces. Of course, some immigrant Chinese can be included here according to the definition, but I divide them here according to their nationality and household registration. Is to put all the elements of the set together, which is the union operation.

If you don't understand other operations, you can tutor them on your own. The length of the relationship will not be repeated.

Paradigm

Many people may not remember this concept, but as long as you are involved in a database, you will definitely use it. So the importance of this can be imagined. If you do not know, you can go to Baidu database paradigm to understand and tutoring.

DQL

DQL is the most common and frequently used language in our work, such as viewing reports.

Keyword: select

DDL

DDL is a language for defining, modifying and deleting relationships, including creating databases, tables, indexes, triggers, stored procedures, and functions.

Keyword: create

DML

DML is the language of operating Yuanzu, including deletion, insertion and modification of Yuanzu.

Keywords: insert, update, delete

We have already gone over many basic knowledge points here, but I will not talk about them here, such as grouping, functions, and sorting. Later, we will continue to explain them in detail.

Focus on SQL optimization

None of the above is the focus of today, but it is indeed a very important foundation. In fact, the optimization of SQL performance is aimed at the DQL statement. No matter you delete or update it, you have to locate the meta-ancestor first, so the SQL performance optimization we often talk about is actually aimed at the part of the query statement.

So what are the points that affect the performance of SQL queries? Let's make a list one by one next.

Search column plus index

The index is the most direct and effective way to improve the search speed, but do not abuse the index, because the index has a negative impact on updates and deletions, and the index also eats resources.

Do not use *

When I first came into contact with the database, my university teacher (known as Duan Longwang) told us: never use statements like select * from, especially in subqueries.

We can do an experiment. I now have a table user, which contains more than one million pieces of data.

# count the number of entries. They are all executed for the first time and have not been cached mysql > select count (id) from user;+-+ | count (id) | +-+ | 1050506 | +-+ 1 row in set (0.17 sec) mysql > select count (*) from user +-+ | count (*) | +-+ | 1050506 | +-+ 1 row in set (0.19 sec) mysql > the effect is not obvious, and sometimes count (*) is faster than count (id). I think this has something to do with the count () function, so let's directly query the entire table. Mysql > select * from user | | 1050504 | ckmike699997 | 18 | 17996 | 2019-05-11 | | 1050505 | ckmike699998 | 18 | 11907 | 2019-05-11 | | 1050506 | ckmike699999 | 18 | 17726 | 2019-05-11 | | 1050507 | ckmike700000 | 18 | 1563 | 2019-05-11 | +-| +-+ 1050506 rows in set (0.61sec) mysql > select name from user | | ckmike699998 | | ckmike699999 | | ckmike700000 | +-+ 1050506 rows in set (0.35 sec) |

The reason why it should not be used is that it consumes resources. Unnecessary fields in network transmission do not need to be found out, and packets can be as small as they can. This is a point. But try not to use it if you want it all automatically. This is especially true in subqueries, which can multiply the time consumed. Such a statement must not be used!

Join (left,inner,right) association

Usually the association is a competitive\ FK association, so the associated fields are indexed, and try not to use functions.

Where query field optimization

1. Query fields are indexed as much as possible

2.where query fields can only be used without functions, because it will invalidate the index and perform a full table scan.

3.where puts the query fields that can determine the smaller result set and scan the result set faster.

4. Avoid using like as much as possible

5. Functions, arithmetic, or other expression operations cannot be performed on the left side of the "=", and the index may not be used correctly.

6. Do not use! =, as much as possible, it will cause full table scan

7. If it is a field of a composite index, then the order should be based on the properties defined by the index, otherwise the index cannot be guaranteed to take effect.

8. If you use the variable @, it will cause the index to fail.

Never use'%%'in like

Like'% 'invalidates the index, resulting in a full table scan, which is very fatal. On the other hand, the index can not play at all, but it retains the pain point of the index. So is it impossible to use like? Of course not, we can use the indexing function like like 'abdc%', to take effect, and we can use like, which is a compromise, and using like is inefficient.

But demand is sometimes tough.

Subquery result set priority filtering

It is very effective to limit the size of the result set of a subquery, and try to place the filter conditions in the where statement in the subquery instead of external filtering.

Rational use of in and exists

In uses internal and external hash connections, while exists uses loop to loop through the word table.

Then there is a problem that set size affects performance. If two sets are about the same size, in fact, both sets are about the same, then the larger inner table uses exists, and the smaller inner table uses in:

Table A (small table), table B (large table) 1:select * from A where cc in (select cc from B) is inefficient, using the index of cc column on table A; select * from A where exists (select cc from B where cc=A.cc) is efficient, using the index of cc column on table B. Rational use of not in and not exists

The query statement uses not in to scan the whole table inside and outside, without using the index, while the subquery of not exists can still use the index on the table. So no matter which watch is big, it is faster to use not exists than not in.

Avoid using OR operators

If a field has an index and a field does not have an index, it will cause the engine to give up using the index and do a full table scan, at which point we can split it into two or more result sets for union.

Do you think it is what you want to write and optimize the MySQL high-performance SQL statements above? If you want to know more about it, you can continue to follow our industry information section.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.