2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
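This article explains how to optimize query speed in MySQL. The editor found it very practical and is sharing it here as a reference.
Choosing optimal data types and using indexes efficiently are essential for high-performance MySQL, but they are not enough: queries also need to be well designed. If a query is badly written, it cannot perform well no matter how sound the table structure is or how appropriate the indexes are.
In MySQL performance optimization, query optimization is where the work starts, and it is the clearest indicator of whether a system is fast. This chapter and the following ones focus on query performance optimization. They introduce query optimization techniques that help you understand how MySQL actually executes a query, where it is slow, how to make it faster, and why some queries are efficient and others are not, so that you can write better SQL.
This chapter starts with the question "Why is the query so slow?", so that you know where a query can lose time and can approach optimization with a clear plan.
First, where is it slow?
The real measure of query speed is response time. If you think of a query as a task, it is made up of a series of subtasks, each of which consumes a certain amount of time. Optimizing a query really means optimizing its subtasks: eliminating some of them, reducing the number of times they are executed, or making them run faster.
What subtasks does MySQL perform when executing a query, and which of them take the most time? Answering this requires tools or methods (such as the execution plan) that analyze the query and show where it is slow.
Roughly, the life cycle of a query runs in this order: the query is sent from the client to the server, the server parses it, generates an execution plan, executes it, and returns the results to the client. Execution can be regarded as the most important stage of the life cycle. It includes a large number of calls to the storage engine to retrieve data, and the processing that follows those calls, such as sorting and grouping.
While completing these tasks, the query spends time in different places at different stages: the network, CPU computation, generating statistics and execution plans, waiting for locks, and above all the calls that retrieve data from the underlying storage engine. Those calls require memory operations and CPU operations, and may also cause a large number of context switches and system calls.
All of these operations consume time. Some of them are unnecessary, some are repeated many times, and some run slowly. This is where a query really becomes slow, and the purpose of query optimization is to reduce or eliminate the time spent on these operations.
With this overall picture of the query process, we know where a query can run into problems that make the whole query slow, which gives actual optimization work a direction.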
First, where is the slow?
The real measure of query speed is response time. If you think of a query as a task, it is made up of a series of subtasks, each of which consumes a certain amount of time. If you want to optimize the query and actually optimize its subtasks, eliminate some of the subtasks, reduce the number of times the subtasks are executed, or make the subtasks run faster.
What subtasks does MySQL have when executing a query and which subtasks take the most time? This requires the help of some tools, or some methods (such as execution plan) to analyze the query to find out where it is slow.
Generally speaking, the life cycle of the query can be seen roughly in order: from the client to the server, and then parse on the server, generate an execution plan, execute, and return the results to the client. Among them, "execution" can be regarded as the most important stage in the whole life cycle, which includes a large number of calls to retrieve data to the storage engine and data processing after the call, including sorting, grouping, and so on.
When completing these tasks, the query needs to spend time in different places at different stages, including network, CPU calculation, generating statistics and performing plans, lock waiting and other operations, especially the call operations to retrieve data from the underlying storage engine, which require in-memory operations, CPU operations, and may also generate a large number of context switches and system calls.
In all of these operations, a lot of time is consumed, and there will be some unnecessary additional operations, some of which may be repeated many times, some of which may be performed slowly, and so on. This is where queries can really be slow, and the purpose of optimizing queries is to reduce and eliminate the time spent on these operations.
Through the above analysis, we have an overall understanding of the query process, can clearly know where the query may have problems, and ultimately lead to the whole query is very slow, providing the direction for the actual query optimization.
In other words, query optimization can be approached from the following two perspectives:
Reduce the number of subtasks
Reduce additional, repeated operations
A common reason for poor query performance is that too much data is accessed. While the amount of data is small, query speed is acceptable. Once the data grows, query speed can degrade sharply, which is frustrating and makes for a poor user experience. For query optimization, you can troubleshoot from the following aspects:
Whether unwanted data is queried
Whether additional records are scanned
Second, whether unwanted data is queried
In many cases, a query requests more data than is actually needed, and the application then discards the excess. This is additional overhead for MySQL and also consumes CPU and memory on the application server.
Some typical cases are as follows:
1. Query unwanted records
This is a common mistake. It is often assumed that MySQL returns only the data you need, but in fact MySQL produces the complete result set before returning it.
Developers habitually use a SELECT statement that returns a large number of rows and then let the application or the front-end display layer take only the first N of them. For example, a news website queries 100 records but displays only the first 10 on the page.
The most effective solution is to query only as many records as you need, usually by adding a LIMIT clause to the query, that is, a paged query.
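As a minimal sketch of the news-site case above: the table `news` and its columns `id`, `title`, and `published_at` are hypothetical names chosen for illustration, not taken from the original article.

```sql
-- Hypothetical table: news(id, title, published_at).
-- Fetch only the 10 rows the first page displays,
-- instead of fetching 100 and discarding 90 in the application.
SELECT id, title, published_at
FROM news
ORDER BY published_at DESC
LIMIT 10;

-- Second page: skip the first 10 rows and return the next 10.
SELECT id, title, published_at
FROM news
ORDER BY published_at DESC
LIMIT 10 OFFSET 10;
```

The ORDER BY is there so that each page is well defined. Without it, the rows returned by LIMIT are not guaranteed to be in any particular order.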
2. Returning all columns when multiple tables are joined
If you want to query all the actors who appear in the movie Academy Dinosaur, do not query it in the following way:
SELECT *
FROM actor a
INNER JOIN film_actor fa ON fa.actorId = a.actorId
INNER JOIN film f ON f.filmId = fa.filmId
WHERE f.title = 'Academy Dinosaur'
This returns all the columns of all three tables, while the actual requirement is only the actor information. The query should be written as follows:
SELECT a.*
FROM actor a
INNER JOIN film_actor fa ON fa.actorId = a.actorId
INNER JOIN film f ON f.filmId = fa.filmId
WHERE f.title = 'Academy Dinosaur'
3. Always querying all columns
Be suspicious every time you see SELECT *. Do you really need to return all the columns?
In most cases, you do not. SELECT * does not by itself force a full table scan, but it prevents the optimizer from using optimizations such as a covering index scan, and fetching too many columns causes additional I/O, memory, and CPU consumption on the server. Even if you really need all the columns, you should list them one by one instead of using *.
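A minimal sketch of listing columns explicitly, reusing the actor query from the previous section. The column names `firstName` and `lastName` are assumptions for illustration; the article only shows `actorId`, `filmId`, and `title`.

```sql
-- firstName and lastName are assumed column names, used only for illustration.
-- Only the columns the application actually displays are requested.
SELECT a.actorId, a.firstName, a.lastName
FROM actor a
INNER JOIN film_actor fa ON fa.actorId = a.actorId
INNER JOIN film f ON f.filmId = fa.filmId
WHERE f.title = 'Academy Dinosaur';
```

An explicit column list also keeps the query stable when columns are later added to the table.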
4. Repeatedly querying the same data
If you are not careful, it is easy to execute the same query over and over again and get exactly the same data back each time.
For example, if the URL of a user's profile picture has to be looked up wherever that user comments, the same data may be queried repeatedly when the user comments several times. A better approach is to cache the data when it is first queried and read it from the cache afterwards.
Third, whether additional records are scanned
After you have determined that the query retrieves only the data you need, check whether too much data is scanned while the query runs. For MySQL, the three simplest metrics for measuring query cost are as follows:
Response time
Number of rows scanned
Number of rows returned
No single metric fully measures the cost of a query, but together they roughly reflect how much data MySQL has to access to execute the query, and roughly indicate how long the query takes to run. All three metrics are recorded in MySQL's slow query log, so checking the slow query log is one way to find queries that scan too many rows.
Slow query: a statement whose response time exceeds a threshold (long_query_time, 10 seconds by default). Slow queries are recorded in the slow query log. The log is enabled with the variable slow_query_log, which is off by default. Slow queries can be recorded in the table slow_log or in a file for inspection and analysis.
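The settings described above can be sketched as follows. This assumes an account with the privileges to set global variables, and the threshold of 1 second is an arbitrary example value rather than a recommendation from the article.

```sql
-- Enable the slow query log (off by default).
SET GLOBAL slow_query_log = 'ON';

-- Example threshold of 1 second instead of the default 10.
-- The new global value applies to connections opened after this statement.
SET GLOBAL long_query_time = 1;

-- Write slow queries to the mysql.slow_log table instead of a file.
SET GLOBAL log_output = 'TABLE';

-- Check the current settings.
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';

-- Inspect the most recent entries, including the three cost metrics.
SELECT start_time, query_time, rows_examined, rows_sent, sql_text
FROM mysql.slow_log
ORDER BY start_time DESC
LIMIT 10;
```

Settings changed with SET GLOBAL are lost when the server restarts unless they are also placed in the server configuration file.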
1. Response time
Response time is the sum of two parts: service time and queue time. Service time is how long the database actually spends processing the query. Queue time is the time during which the server is not executing the query because it is waiting for some resource, for example an I/O operation or a row lock.
Under different kinds of application load, response time follows no consistent rule or formula. Many factors affect it, such as storage engine locks (table locks, row locks), resource contention under high concurrency, and hardware response, so response time may be either the result of a problem or the cause of one, depending on the case.
When you see the response time of a query, you first need to ask yourself whether the response time is a reasonable value.
2. Number of rows scanned and rows returned
When analyzing a query, it helps to look at the number of rows it scans, because that shows whether additional records were scanned.
This metric is not perfect for finding bad queries, because not all rows have the same access cost. Shorter rows are accessed faster, and rows in memory are accessed much faster than rows on disk.
Ideally, the number of rows scanned equals the number of rows returned, but in practice this is rare. For example, in a join query the server has to scan several rows to produce each row of the result. The ratio of rows scanned to rows returned is usually small, typically between 1:1 and 10:1, but sometimes it is much larger.
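One way to look at this ratio, sketched under the assumption that slow queries are being logged to the mysql.slow_log table (that is, log_output includes TABLE). The alias `examined_per_row_sent` is an illustrative name.

```sql
-- Rank logged queries by how many rows they scanned, and show how many
-- rows were scanned for each row returned.
-- NULLIF avoids division by zero for queries that returned no rows.
SELECT sql_text,
       rows_examined,
       rows_sent,
       rows_examined / NULLIF(rows_sent, 0) AS examined_per_row_sent
FROM mysql.slow_log
ORDER BY rows_examined DESC
LIMIT 10;
```

A large ratio is a hint rather than proof of a bad query: aggregate queries, for example, legitimately scan many rows to return one.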
3. Number of rows scanned and access type
When evaluating query overhead, you need to consider the cost of finding a single row in a table. MySQL has several access methods for finding and returning a row. Some of them have to examine many rows to return one result, while others can return a result without scanning anything.
The type column in the output of the EXPLAIN statement shows the access type. There are many access types, including full table scan, index scan, range scan, unique index lookup, and constant reference. They are listed here from slowest to fastest, and from the most rows scanned to the fewest.
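As a small sketch, EXPLAIN can be placed in front of the actor query used earlier in this article. The output depends on the indexes that exist on the tables, so no particular result is shown here.

```sql
-- Prefix the statement with EXPLAIN to see the plan instead of running it.
EXPLAIN
SELECT a.actorId
FROM actor a
INNER JOIN film_actor fa ON fa.actorId = a.actorId
INNER JOIN film f ON f.filmId = fa.filmId
WHERE f.title = 'Academy Dinosaur';

-- Values of the type column, roughly from slowest to fastest:
--   ALL     full table scan
--   index   full index scan
--   range   index range scan
--   ref     lookup on a non-unique index
--   eq_ref  lookup on a unique index or primary key, one row per join row
--   const   at most one matching row, read once and treated as a constant
```

The rows column in the same output gives the optimizer's estimate of how many rows each step will examine.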
If the query does not get a good access type, the best solution is usually to add an appropriate index, which is the topic discussed earlier. It should now be clear why indexes are so important for query optimization: an index lets MySQL find the required records in the most efficient way, scanning the fewest rows.
If you find that a query scans a large amount of data but returns only a few rows, you can usually try the following techniques to optimize it:
Use a covering index scan: put all the columns the query needs into the index, so that the storage engine can return results without going back to the table to fetch the corresponding rows.
Optimize the table structure. For example, use a separate summary table to answer the query.
Rewrite the complex query so that the MySQL optimizer can execute it in a more efficient way.
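The covering index technique from the list above can be sketched as follows. The index name `idx_film_actor` and the literal `filmId = 1` are hypothetical, and the sketch assumes the film_actor table does not already have an equivalent index.

```sql
-- Hypothetical index containing every column the query below needs.
ALTER TABLE film_actor
  ADD INDEX idx_film_actor (filmId, actorId);

-- The query filters on filmId and returns only actorId, so it can be
-- answered from the index alone, without reading the table rows.
EXPLAIN
SELECT actorId
FROM film_actor
WHERE filmId = 1;

-- When an index covers the query, the Extra column of the EXPLAIN
-- output contains "Using index".
```

Column order matters here: the filtered column comes first so the index can be used for the lookup, and the selected column follows so that it can be read from the same index entry.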