What happens when we enter a SQL query statement? 12/23 Update SLTechnology News&Howtos

What happens when we enter a SQL query statement?

2025-12-23 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

We often say that when you look at a thing, don't get caught up in the details directly. You should get a bird's-eye view of it first, which can help you understand the problem from a high dimension. Similarly, the same is true for learning MySQL. Usually when we use the database, what we see is usually a whole. For example, if you have the simplest table with only one ID field, when executing the following query statement:

 copy code

Mysql > select * from T where ID= 10

All we see is enter a statement and return a result, but we don't know how the statement is executed within MySQL.

So today I would like to disassemble MySQL with you to see what "parts" are in it. I hope that through this disassembly process, you can have a better understanding of MySQL. In this way, when we encounter some exceptions or problems in MySQL, we can directly poke the essence, locate and solve the problem more quickly.

The following is a schematic diagram of the basic architecture of MySQL, from which you can clearly see the execution of SQL statements in various functional modules of MySQL.

Logical architecture diagram of MySQL

Generally speaking, MySQL can be divided into two parts: Server layer and storage engine layer.

The Server layer includes connectors, query caching, parsers, optimizers, executors, etc., covering most of MySQL's core service functions, as well as all built-in functions (such as date, time, math, and encryption functions, etc.). All cross-storage engine functions are implemented in this layer, such as stored procedures, triggers, views, and so on.

The storage engine layer is responsible for data storage and extraction. Its architecture mode is plug-in, supporting multiple storage engines such as InnoDB, MyISAM, Memory and so on. The most commonly used storage engine today is InnoDB, which has been the default storage engine since MySQL version 5.5.5.

In other words, if you do not specify the engine type when you perform create table table building, InnoDB is used by default. However, you can also select other engines by specifying the type of storage engine, such as using engine=memory in the create table statement, to specify that the memory engine is used to create tables. Different storage engines have different ways to access table data and support different functions. We will discuss the choice of engines in a later article.

It is not difficult to see from the figure that different storage engines share a Server layer, that is, the part from the connector to the actuator. You can first have an impression of the name of each component, and then I will walk you through the entire execution process with the SQL statement mentioned at the beginning, and look at the role of each component in turn.

Connector

In the first step, you will connect to this database first, and the connector will receive you at this time. Connectors are responsible for establishing connections with clients, obtaining permissions, maintaining and managing connections. The connection command usually goes like this:

 copy code

Mysql-h $ip-P $port-u $user-p

After typing the command, you need to enter the password in the interactive conversation. Although the password can also be written directly after-p on the command line, it may lead to your password being compromised. If you are connected to a production server, it is strongly recommended that you do not do so.

The mysql in the connection command is a client tool that is used to establish a connection with the server. After completing the classic TCP handshake, the connector will begin to authenticate you, using the user name and password you entered.

If the user name or password is wrong, you will receive a "Access denied for user" error and the client program finishes execution.

If the user name and password authentication is passed, the connector will find out the permissions you have in the permissions table. After that, the permission judgment logic in this connection will depend on the permissions read at this time.

This means that after a user successfully establishes a connection, even if you modify the permissions of the user with the administrator account, it will not affect the permissions of the existing connection. After the modification is complete, only the newly created connection will use the new permission settings.

After the connection is complete, if you have no subsequent action, the connection is idle, which you can see in the show processlist command. The figure in the text is the result of show processlist, where the Command column shows the line "Sleep", indicating that there is now a free connection in the system.

If the client does not move for too long, the connector will automatically disconnect it. This time is controlled by the parameter wait_timeout, and the default value is 8 hours.

If the client sends a request again after the connection is disconnected, it will receive an error warning: Lost connection to MySQL server during query. At this point, if you want to continue, you need to reconnect and then execute the request.

In the database, persistent connection means that after a successful connection, the client always uses the same connection if there are persistent requests from the client. A short connection means that the connection is disconnected after a few queries are executed, and a new one is established next time.

The process of establishing a connection is usually complicated, so I suggest you minimize the action of establishing a connection, that is, try to use a long connection.

But after using all persistent connections, you may find that sometimes the memory used by MySQL increases particularly fast, because the memory temporarily used by MySQL during execution is managed in the connection object. These resources are not released until the connection is disconnected. Therefore, if the long connection accumulates, it may lead to too much memory consumption and be forcibly killed by the system (OOM). From the phenomenon, it is that MySQL restarts abnormally.

How to solve this problem? You can consider the following two options.

Disconnect the long connection regularly. Use it for a period of time, or after it is judged in the program that a large memory-consuming query has been executed, disconnect, and then query and reconnect.

If you are using MySQL version 5.7 or later, you can reinitialize the connection resources by executing mysql_reset_connection after each large operation. This process does not require reconnection and redo permission verification, but restores the connection to the state it was when it was just created.

Query cache

After the connection is established, you can execute the select statement. The execution logic comes to the second step: query caching.

After MySQL gets a query request, it will first go to the query cache to see if this statement has been executed before. Previously executed statements and their results may be cached directly in memory in the form of key-value pairs. Key is the statement of the query and value is the result of the query. If your query can find key directly in this cache, then the value will be returned directly to the client.

If the statement is not in the query cache, the execution phase continues later. After the execution is completed, the execution results are stored in the query cache. As you can see, if the query hits the cache, MySQL does not need to perform the following complex operations to return the results directly, which is very efficient.

But in most cases I would advise you not to use query caching. Why? Because query caching often does more harm than good.

Query cache invalidation is very frequent, as long as there is an update to a table, all query caches on that table will be emptied. So it's possible that you took great pains to save the results, and before you could use them, you were completely emptied by an update. For databases with high update pressure, the hit rate of the query cache will be very low. Unless your business has a static table, it will take a long time to update. For example, a system configuration table, then the query on this table is suitable for using query caching.

Fortunately, MySQL also provides this "on-demand" approach. You can set the parameter query_cache_type to DEMAND so that query caching is not used for default SQL statements. For statements for which you are sure you want to use query caching, you can specify them explicitly with SQL_CACHE, like the following statement:

 copy code

Mysql > select SQL_CACHE * from T where ID= 10

It is important to note that MySQL version 8.0 removes the entire query cache function directly, which means that it has not been available since 8.0.

Analyzer

If you miss the query cache, you will start to actually execute the statement. First of all, MySQL needs to know what you want to do, so you need to parse the SQL statement.

The analyzer will first do a "lexical analysis". What you enter is a SQL statement made up of multiple strings and spaces, and MySQL needs to identify what the string is and what it represents.

MySQL is identified by the keyword "select" you typed, which is a query statement. It also recognizes the string "T" as "table name T" and the string "ID" as "column ID".

After these recognition is done, it is necessary to do "grammatical analysis". According to the results of lexical analysis, the parser will determine whether the SQL statement you enter satisfies the MySQL grammar according to the grammar rules.

If your sentence is incorrect, you will receive an error reminder of "You have an error in your SQL syntax". For example, the following sentence select misses the opening letter "s".

 copy code

Mysql > elect * from t where ID= 1

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'elect * from t where ID= 1' at line 1

General syntax errors will indicate the first place where the error occurs, so you should focus on the content immediately following the word "use near".

Optimizer

After going through the analyzer, MySQL knows what you're going to do. It has to be processed by the optimizer before starting execution.

The optimizer decides which index to use when there are multiple indexes in the table, or the join order of each table when a statement has multiple table associations (join). For example, you execute the following statement, which executes the join of two tables:

 copy code

Mysql > select * from T1 join T2 using (ID) where T1 .c = 10 and T2 .d = 20

It is possible to first take the ID value of the record of clocked 10 from table T1, and then associate it to table T2 according to the ID value, and then judge whether the value of d in T2 is equal to 20.

You can also first take the ID value of the record of daddy 20 from table T2, then associate it to T1 according to the ID value, and then judge whether the value of c in T1 is equal to 10.

The logical results of the two execution methods are the same, but the efficiency of execution will be different, and the role of the optimizer is to decide which scheme to use.

After the optimizer phase is complete, the execution plan for this statement is determined and then enters the executor phase. If you still have some questions, such as how the optimizer chooses the index, whether it is possible to choose the wrong index, etc., it doesn't matter, I will explain the optimizer separately in a later article.

Actuator

MySQL knows what you want to do through the parser and what to do through the optimizer, so it enters the executor phase and starts executing the statement.

When you start execution, you should first determine whether you have permission to execute a query on this table T. if not, an error without permission will be returned, as shown below.

 copy code

Mysql > select * from T where ID= 10

ERROR 1142 (42000): SELECT command denied to user 'b' @ 'localhost' for table 'T'

If you have permission, open the table and continue execution. When the table is opened, the executor uses the interface provided by the engine according to the engine definition of the table.

For example, in table T in our example, the ID field does not have an index, then the execution flow of the executor is as follows:

Call the InnoDB engine API to get the first row of the table to determine whether the ID value is 10. If not, skip it. If so, store the row in the result set.

Call the engine interface to take the "next row" and repeat the same judgment logic until the last row of the table is fetched.

The executor returns the recordset composed of all the rows that meet the criteria in the above traversal to the client as a result set.

At this point, the execution of the statement is complete.

For indexed tables, the same logic is performed. The interface "fetch the first line that meets the condition" is called for the first time, and then the interface "next line that meets the condition" is looped, which are already defined in the engine.

You will see a rows_examined field in the database's slow query log indicating how many lines were scanned during the execution of the statement. This value is accumulated each time the executor calls the engine to get the data row.

In some scenarios, the executor calls once and scans multiple lines inside the engine, so the number of rows scanned by the engine is not exactly the same as rows_examined.

Finally, share some Java framework materials, interview materials and learning resources to you:

Mainly includes (Java engineering, high-performance and distributed, concurrent programming, profound and simple. Videos and documents such as performance tuning, Spring, Mybatis, Netty, Redis, JVM, Zookeeper, ConcurrentHashMap1.8 source code interpretation, source code analysis, etc., as well as detailed Java advanced route brain maps)

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.