What are the specific suggestions for optimizing query statements?

2025-07-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article presents specific suggestions for optimizing query statements. The content is concise and easy to follow, and I hope the detailed walkthrough gives you something useful.

I'm going to do two things: first, point out why this list is bad, and second, present my own list. I hope mine is better. Read on, intrepid readers!

Why is that list so bad?

1. His effort is focused in the wrong place.

One rule we should all follow is that before optimizing code, you first find out where the bottleneck is. Mr. Silverton, however, focuses his effort in the wrong place. I think 60% of optimization comes from a clear understanding of SQL and database basics: you need to know the difference between joins and subqueries, how column indexes work, how to normalize data, and so on. Another 35% comes from understanding the performance characteristics of the database you have chosen. For example, COUNT(*) may be fast or slow depending on which database engine you use. There are other factors to consider as well, such as when the database does not cache, when data lives on disk rather than in memory, when the database creates temporary tables, and so on. The remaining 5% is rarely encountered, yet that is where Mr. Silverton spends much of his time. I have never used SQL_SMALL_RESULT.

2. Good problems, but bad solutions.

Mr. Silverton raises some good problems. MySQL uses the dynamic row format for variable-length columns such as TEXT or BLOB, which means that sorting will take place on disk. The right approach is not to avoid these data types, but to split them out of the original table and put them into another table. The following schema illustrates the idea:

CREATE TABLE posts (
    id int UNSIGNED NOT NULL AUTO_INCREMENT,
    author_id int UNSIGNED NOT NULL,
    created timestamp NOT NULL,
    PRIMARY KEY (id)
);

CREATE TABLE posts_data (
    post_id int UNSIGNED NOT NULL,
    body text,
    PRIMARY KEY (post_id)
);

3. It's a little weird...

Many of his suggestions are very surprising, such as "remove unnecessary parentheses". It doesn't matter whether you write SELECT * FROM posts WHERE (author_id = 5 AND published = 1) or SELECT * FROM posts WHERE author_id = 5 AND published = 1. Any good DBMS will recognize and handle this automatically. This detail is like arguing whether i++ or ++i is faster in C. Really, if you spend all your energy on this, you will never get any code written.

My list

Let's see if my list is better. Let me start with the most common.

1. Benchmark, benchmark!

If we need to make a decision, we need data to back it up. What kind of query is the worst? Where is the bottleneck? Under what circumstances will I write a bad query? Benchmarking lets you simulate high-load situations; then, with the help of profiling tools, you can find problems in your database configuration. Such tools include supersmack, ab, and SysBench. They can test your database directly (such as supersmack) or simulate network traffic (such as ab).

2. Profile, profile!

Once you can create high-load situations, you need to find the problems in your configuration. This is what profiling tools do for you: they help you find the bottleneck in your setup, whether it is memory, CPU, network, disk I/O, or all of the above.

The first thing you need to do is enable the slow query log and install mtop. This gives you information about the worst offenders. Is there a query that takes 10 seconds to run and is wrecking your application? These tools will show you exactly how that query is written.
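As a rough sketch of that first step, the slow query log can usually be switched on at runtime from a MySQL client. The 1-second threshold below is an assumption chosen for illustration, not a value from this article, and the statements need an account with sufficient privileges.

```sql
-- Turn on the slow query log for the running server. This is lost on restart
-- unless it is also set in the server configuration file.
SET GLOBAL slow_query_log = 'ON';

-- Log any statement that runs longer than 1 second. The threshold is an
-- assumed value; tune it to your application. Connections opened after
-- this statement pick up the new global value.
SET GLOBAL long_query_time = 1;

-- Confirm the setting and see which file the log is written to.
SHOW VARIABLES LIKE 'slow_query_log%';
```

Setting the values at runtime keeps the experiment cheap: you can raise or lower the threshold while the benchmark is running and watch which queries show up.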

After you find those slow queries, use the tools that come with MySQL, such as EXPLAIN, SHOW STATUS, and SHOW PROCESSLIST. They will tell you where resources are being consumed and where the query is flawed, for example whether a query with three joins and subqueries sorts in memory or on disk. Of course, you should also use system tools such as top, procinfo, and vmstat to get more information about overall system performance.
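A minimal sketch of how these built-in tools might be used together, assuming the posts table from the schema earlier in this article. The specific query and the two status counters picked here are illustrative choices, not taken from the original.

```sql
-- Ask MySQL how it would execute a suspect query. In the output, watch the
-- Extra column: "Using filesort" or "Using temporary" are hints that extra
-- sorting or temporary-table work is happening.
EXPLAIN SELECT id, created
FROM posts
WHERE author_id = 5
ORDER BY created DESC;

-- See what every connection is doing right now, with full statement text.
SHOW FULL PROCESSLIST;

-- Server-wide counters: temporary tables that spilled to disk, and
-- merge passes the sort algorithm needed.
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
SHOW GLOBAL STATUS LIKE 'Sort_merge_passes';
```

Comparing the counters before and after a benchmark run gives a rough idea of whether the workload is forcing work onto disk.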

3. Tighten up your schema

Before you start writing queries, you need to design the schema. Keep in mind that the space required to load a table into memory is roughly the number of rows * the size of one row. Unless you think everyone in the world will register on your site 2.8 trillion times, you don't need BIGINT for your user_id. Similarly, if a text column has a fixed size (such as a US zip code, usually in the form "XXXXX-XXXX"), using VARCHAR adds extra bytes to every row.
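To make the sizing advice concrete, here is a small sketch of a users table. The table layout, the zip_code column name, and the VARCHAR(64) length are assumptions for illustration only.

```sql
CREATE TABLE users (
    -- INT UNSIGNED takes 4 bytes and counts up to 4,294,967,295;
    -- BIGINT would take 8 bytes per row for headroom most sites never use.
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    -- Genuinely variable-length data: VARCHAR stores a length prefix
    -- in addition to the characters themselves.
    last_name VARCHAR(64) NOT NULL,
    -- "XXXXX-XXXX" is always 10 characters, so a fixed-width CHAR(10)
    -- avoids the per-row length prefix.
    zip_code CHAR(10) NOT NULL,
    PRIMARY KEY (id)
);
```

With the rows * row size rule of thumb, saving 4 bytes on the key alone is about 4 MB per million rows, before counting any index that also stores the key.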

Some people don't care about database normalization, saying that it leads to a rather complex schema. However, proper normalization reduces redundant data. It means trading a little performance for a smaller overall footprint, which is a common trade-off in computer science. The best approach, IMO, is to start with a normalized schema and then denormalize only if performance requires it. Your database will be more logical, and you avoid optimizing prematurely.

4. Split your tables

Often only some of the columns in a table are needed frequently. For example, on a blog you need to display titles in many different places (such as a list of recent articles), but only show the teaser or full text on a particular page. Splitting the table horizontally or vertically helps:

CREATE TABLE posts (
    id int UNSIGNED NOT NULL AUTO_INCREMENT,
    author_id int UNSIGNED NOT NULL,
    title varchar(255), -- length is an assumption; the original omitted it
    created timestamp NOT NULL,
    PRIMARY KEY (id)
);

CREATE TABLE posts_data (
    post_id int UNSIGNED NOT NULL,
    teaser text,
    body text,
    PRIMARY KEY (post_id)
);

The schema above is optimized for reads. Frequently accessed data is stored in one table, and infrequently accessed data is placed in another. Once split off, the infrequently accessed data takes up less memory. You can also optimize for writes by putting frequently updated data in one table and infrequently updated data in another. This makes caching more efficient, because MySQL does not need to evict data that has not changed.
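A short sketch of how queries might look against the split schema above. The post id 42 and the LIMIT of 10 are made-up values for illustration.

```sql
-- Listing page: only the small, frequently read table is touched,
-- so the large text columns never have to be loaded.
SELECT id, title, created
FROM posts
ORDER BY created DESC
LIMIT 10;

-- Single-post page: join in the large columns only when they are needed.
SELECT p.title, p.created, d.teaser, d.body
FROM posts AS p
JOIN posts_data AS d ON d.post_id = p.id
WHERE p.id = 42;
```

The join costs one extra primary-key lookup on the detail page, which is usually a small price for keeping the hot table compact.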

5. Don't overuse artificial primary keys

Artificial primary keys are great because they make the schema less sensitive to change. If we stored geographic information in tables keyed on the US zip code and the zip code system suddenly changed, we would be in big trouble. On the other hand, a natural key is sometimes the better choice. For example, when we need a join table for a many-to-many relationship, we should not do this:

CREATE TABLE posts_tags (
    relation_id int UNSIGNED NOT NULL AUTO_INCREMENT,
    post_id int UNSIGNED NOT NULL,
    tag_id int UNSIGNED NOT NULL,
    PRIMARY KEY (relation_id),
    UNIQUE INDEX (post_id, tag_id)
);

The artificial key is completely redundant, and the number of post-tag relationships is capped by the maximum value of the integer type. The natural composite key is the better choice:

CREATE TABLE posts_tags (
    post_id int UNSIGNED NOT NULL,
    tag_id int UNSIGNED NOT NULL,
    PRIMARY KEY (post_id, tag_id)
);
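As a hedged illustration of how the composite key is used in practice, the statements below assume the posts_tags table just defined. The ids and the index name idx_posts_tags_tag are hypothetical.

```sql
-- Hypothetical data: post 1 carries tags 10 and 11. Inserting the same
-- (post_id, tag_id) pair twice would be rejected by the primary key.
INSERT INTO posts_tags (post_id, tag_id) VALUES (1, 10), (1, 11);

-- All tags for one post: served by the leading column of the primary key.
SELECT tag_id FROM posts_tags WHERE post_id = 1;

-- All posts for one tag: the primary key starts with post_id, so the
-- reverse direction may benefit from its own index.
CREATE INDEX idx_posts_tags_tag ON posts_tags (tag_id);
SELECT post_id FROM posts_tags WHERE tag_id = 10;
```

Whether the extra index on tag_id is worth it depends on how often the application looks up posts by tag.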

6. Learn your indexes

Choosing good indexes is very important; poor choices can cripple the database. For those who have not yet learned much about databases, an index can be thought of as a kind of sorted hash. For example, if we run the query SELECT * FROM users WHERE last_name = 'Goldstein' and last_name has no index, the DBMS has to check every row to see whether it equals "Goldstein". Indexes are usually B-trees (other types exist), which speed up the comparison considerably.
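The effect described above can be checked with EXPLAIN. This sketch assumes a users table with a last_name column already exists; the index name idx_users_last_name is made up for illustration.

```sql
-- Before: with no index on last_name, the plan is typically a full table
-- scan (type = ALL) that examines every row.
EXPLAIN SELECT * FROM users WHERE last_name = 'Goldstein';

-- Add a B-tree index on the column used in the WHERE clause.
CREATE INDEX idx_users_last_name ON users (last_name);

-- After: the plan should normally show an index lookup (type = ref)
-- and a much smaller estimate in the rows column.
EXPLAIN SELECT * FROM users WHERE last_name = 'Goldstein';
```

On a very small table the optimizer may still choose a scan, so it is worth testing against a realistic amount of data.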

You should index the columns you select, group, order, or join on. Obviously, the space each index requires is proportional to the number of rows in the table, so more indexes take up more memory. Indexes also affect writes, because the corresponding index must be updated every time data is written. You need to strike a balance, which depends on the needs of each system and its code.
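One way to approach that balance, sketched here under assumptions, is to let a single multi-column index serve both the filter and the sort of a common query instead of adding one index per column. The posts table is the one defined earlier in this article; the index name and the query are illustrative.

```sql
-- One composite index covers both the WHERE column and the ORDER BY column.
CREATE INDEX idx_posts_author_created ON posts (author_id, created);

-- The query this index is meant for: filter by author, newest first.
SELECT id, created
FROM posts
WHERE author_id = 5
ORDER BY created DESC
LIMIT 10;

-- Review which indexes the table carries. Every one of them must be
-- maintained on each INSERT, UPDATE, and DELETE.
SHOW INDEX FROM posts;
```

Reviewing the SHOW INDEX output from time to time helps spot indexes that no query uses but that every write still pays for.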
