Updated: 2025-04-15
Shulou (Shulou.com), 05/31 report
This article explains how to optimize MySQL statements. The methods introduced here are simple, fast, and practical.
1. Establish a benchmark, establish a benchmark!
Decisions should be driven by data. Which queries are the worst? Where is the bottleneck? Under what circumstances do I write a bad query? Benchmarking lets you simulate high-load situations, and then, with the help of performance evaluation tools, you can find errors in the configuration. Benchmarking tools include supersmack, ab, and SysBench. They can test your database directly (such as supersmack) or simulate network traffic (such as ab).
2. Performance testing, performance testing!
Once you can create high-load situations, you need to find the errors in your configuration. This is what performance evaluation tools do for you. They help you find the bottlenecks in your configuration, whether in memory, CPU, network, disk I/O, or some combination of these.
The first thing to do is turn on the slow query log and install mtop. These give you information about the worst offenders. Is a query that takes 10 seconds to run wrecking your application? These tools will show you exactly how that query is written.
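As a minimal sketch, the slow query log can be switched on at runtime from a session with sufficient privileges. The one-second threshold below is an assumed value chosen for illustration, not a recommendation from the article.

```sql
-- Turn on the slow query log for the running server.
SET GLOBAL slow_query_log = 'ON';

-- Log statements that take longer than 1 second (assumed threshold).
-- The new global value applies to connections opened after this point.
SET GLOBAL long_query_time = 1;

-- Find out which file the log is being written to.
SHOW VARIABLES LIKE 'slow_query_log_file';
```

Settings changed with SET GLOBAL are lost on restart, so a permanent setup would normally also put them in the server configuration file.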
After you find the slow queries, use the tools that come with MySQL, such as EXPLAIN, SHOW STATUS, and SHOW PROCESSLIST. They tell you where resources are being consumed and where the flaws in a query are, for example whether a query with three joined subqueries is sorted in memory or on disk. You should also use system tools such as top, procinfo, and vmstat to get more information about system performance.
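The statements below sketch how these built-in tools might be used. The query being explained reuses the posts and authors tables that appear later in this article; the ORDER BY clause is added only to make the plan more interesting.

```sql
-- Show the execution plan. In the Extra column, "Using temporary" and
-- "Using filesort" flag extra grouping/sorting work worth a closer look.
EXPLAIN
SELECT a.id, MAX(p.created) AS latest_post
FROM authors a
INNER JOIN posts p ON a.id = p.author_id
GROUP BY a.id
ORDER BY latest_post DESC;

-- List the statements running right now, with full text and elapsed time.
SHOW FULL PROCESSLIST;

-- Server-wide counters: temporary tables that spilled to disk, and
-- merge passes needed by sorts that did not fit in the sort buffer.
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
SHOW GLOBAL STATUS LIKE 'Sort_merge_passes';
```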
3. Reduce your schema
Before you start writing queries, you need to design the schema. Keep in mind that the space required to load a table into memory is roughly the number of rows * the size of a row. Unless you think everyone in the world will register billions of times on your site, you don't need BIGINT for your user_id. Similarly, if a text column has a fixed size (such as the US zip code, usually in the form "XXXXX-XXXX"), using VARCHAR adds unnecessary extra bytes to each row.
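A minimal sketch of a slimmed-down table follows. The users table, its zip column, and the column sizes are assumptions made for illustration.

```sql
CREATE TABLE users (
    -- INT UNSIGNED is 4 bytes and holds up to 4,294,967,295 ids;
    -- BIGINT would cost 8 bytes per row.
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    last_name VARCHAR(64)  NOT NULL,
    -- Fixed-width "XXXXX-XXXX" value, so CHAR avoids the length
    -- prefix that VARCHAR stores with every value.
    zip       CHAR(10)     NOT NULL,
    PRIMARY KEY (id)
);
```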
Some people dismiss database normalization, saying it leads to an overly complex schema. However, proper normalization minimizes redundant data. Fundamentally, that means a smaller overall footprint at the cost of some performance, the usual time/space tradeoff found throughout computer science. The best approach, IMO, is to normalize first and then denormalize where performance demands it. Your database will be more logical and you won't be optimizing prematurely.
4. Split your tables
Often only a few columns of a table are accessed or updated frequently. On a blog, for example, you display titles in many different places (such as a list of recent articles) and show the summary or full text only on one particular page. Partitioning the table helps; the example here splits it vertically, by column:
CREATE TABLE posts (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    author_id INT UNSIGNED NOT NULL,
    title VARCHAR(128),
    created TIMESTAMP NOT NULL,
    PRIMARY KEY (id)
);
CREATE TABLE posts_data (
    post_id INT UNSIGNED NOT NULL,
    teaser TEXT,
    body TEXT,
    PRIMARY KEY (post_id)
);
The schema above optimizes for reads. Frequently accessed data is stored in one table, and infrequently accessed data in another. Once split off, the infrequently accessed data takes up less memory. You can also optimize for writes by putting frequently updated data in one table and rarely updated data in another. This makes caching more efficient, because MySQL no longer needs to evict data that has not changed from the cache.
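To illustrate how the split tables would be read, here is a sketch of the two access patterns. The page size of 10 and the post id 42 are made-up values.

```sql
-- Listing page: touches only the narrow, frequently read table.
SELECT id, title, created
FROM posts
ORDER BY created DESC
LIMIT 10;

-- Single article page: pulls the large text columns in with a join
-- on the shared key, and only for the one post being displayed.
SELECT p.title, p.created, d.body
FROM posts p
INNER JOIN posts_data d ON d.post_id = p.id
WHERE p.id = 42;
```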
5. Don't overuse artificial primary keys
Artificial primary keys are great because they make the schema less sensitive to change. If we stored geographic information keyed on the US zip code and the zip code system suddenly changed, we would be in big trouble. On the other hand, a natural key is sometimes the better choice. For example, when we need a join table for a many-to-many relationship, we should not do this:
CREATE TABLE posts_tags (
    relation_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    post_id INT UNSIGNED NOT NULL,
    tag_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (relation_id),
    UNIQUE INDEX (post_id, tag_id)
);
Here the artificial key is completely redundant, and the number of post-tag relationships is capped by the maximum value of the integer type. Use the natural key instead:
CREATE TABLE posts_tags (
    post_id INT UNSIGNED NOT NULL,
    tag_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (post_id, tag_id)
);
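As a hedged usage sketch: a composite key can serve lookups on its leading column, so queries by post_id can use the primary key directly, while queries by tag_id alone generally need a secondary index of their own. The index name idx_tag_post and the ids 42 and 7 are hypothetical.

```sql
-- Tags of one post: served by the leading column of the primary key.
SELECT tag_id FROM posts_tags WHERE post_id = 42;

-- Posts carrying one tag: tag_id is not the leading key column, so a
-- secondary index (hypothetical name) is added for this direction.
ALTER TABLE posts_tags ADD INDEX idx_tag_post (tag_id, post_id);

SELECT post_id FROM posts_tags WHERE tag_id = 7;
```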
6. Learn your indexes
The quality of the indexes you choose matters a great deal; bad choices can cripple the database. For those who have not yet learned much about databases, an index can be thought of as a kind of hash. For example, if we run SELECT * FROM users WHERE last_name = 'Goldstein' and last_name has no index, the DBMS must examine every row to see whether it equals "Goldstein". Indexes are usually B-trees (other types exist), which speed up this lookup.
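The effect can be checked with EXPLAIN before and after adding an index. This is a sketch that assumes a users table with a last_name column; the index name idx_users_last_name is made up.

```sql
-- Without an index on last_name, the plan shows type = ALL (full table scan).
EXPLAIN SELECT * FROM users WHERE last_name = 'Goldstein';

-- Add a B-tree index on the column used in the WHERE clause.
CREATE INDEX idx_users_last_name ON users (last_name);

-- With the index in place, the plan typically shows type = ref
-- and names idx_users_last_name in the key column.
EXPLAIN SELECT * FROM users WHERE last_name = 'Goldstein';
```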
You should index the columns you select on, group by, order by, and join on. Keep in mind that the space each index needs is proportional to the number of rows in the table, so more indexes take up more memory. Indexes also affect writes, because every affected index must be updated each time data is written. You need to strike a balance, depending on the needs of each system and its code.
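One way to keep an eye on that balance is to look at what indexes exist and how much space they use. The sketch below assumes a users table in the currently selected database; for InnoDB the reported sizes are estimates.

```sql
-- List the indexes defined on a table, one row per indexed column.
SHOW INDEX FROM users;

-- Compare data size and index size (in bytes) for each table
-- in the currently selected database.
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();
```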
7. SQL is not C
C is the classic procedural language, and for a programmer it is also a trap that leads you to think SQL is procedural too (SQL is, of course, neither functional nor object-oriented either). Rather than thinking in terms of operations on data, think of sets of data and the relationships between them. The mistake shows up most often in the use of subqueries:
SELECT a.id,
    (SELECT MAX(created)
    FROM posts
    WHERE author_id = a.id)
    AS latest_post
FROM authors a;
Because this subquery is correlated (it uses information from the outer query), we should use a join instead:
SELECT a.id, MAX(p.created) AS latest_post
FROM authors a
INNER JOIN posts p
    ON (a.id = p.author_id)
GROUP BY a.id;
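One caveat worth noting: the two forms are not strictly equivalent. The subquery version returns every author, with NULL for those who have no posts, while the inner join drops such authors. If they must be kept, a LEFT JOIN variant along these lines preserves the original result:

```sql
-- Keeps authors without posts; their latest_post comes back as NULL,
-- matching what the correlated subquery version returns.
SELECT a.id, MAX(p.created) AS latest_post
FROM authors a
LEFT JOIN posts p
    ON (a.id = p.author_id)
GROUP BY a.id;
```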
8. Understand your engine.
MySQL has two main storage engines: MyISAM and InnoDB. Each has its own performance characteristics and caveats. Generally speaking, MyISAM suits read-heavy workloads and InnoDB suits write-heavy workloads, but there are many cases where the opposite is true. The biggest difference is how they handle the COUNT function.
MyISAM caches table metadata, such as the number of rows. This means that COUNT(*) costs almost nothing for a well-structured query. InnoDB has no such cache. Take paging a query as an example. Suppose you have a statement like SELECT * FROM users LIMIT 5,10. Running SELECT COUNT(*) FROM users LIMIT 5,10 finishes quickly on MyISAM, but on InnoDB it takes about as long as the first statement. MySQL has a SQL_CALC_FOUND_ROWS option that tells it to count the rows while the query runs; you then fetch the count with SELECT FOUND_ROWS(). This is specific to MySQL. Still, using InnoDB is sometimes necessary, and it gives you features such as row-level locking and transactions.
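A sketch of the paging pattern just described follows. The ORDER BY is added so that the page contents are deterministic. Note that SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17, so on recent servers a separate COUNT(*) query is the usual substitute.

```sql
-- Fetch one page (skip 5 rows, return 10) and ask the server to also
-- compute how many rows the query would match without the LIMIT.
SELECT SQL_CALC_FOUND_ROWS *
FROM users
ORDER BY id
LIMIT 5, 10;

-- Must run on the same connection, right after the query above.
SELECT FOUND_ROWS();

-- Alternative for newer servers: count with a separate statement.
SELECT COUNT(*) FROM users;
```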
9. MySQL-specific shortcuts
MySQL provides many extensions that are easy to use, for example INSERT ... SELECT, INSERT ... ON DUPLICATE KEY UPDATE, and REPLACE.
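The three extensions might be used as sketched here. The tag_counts table is hypothetical and exists only for this example; posts_tags is the join table from section 5.

```sql
-- Hypothetical summary table: number of posts per tag.
CREATE TABLE tag_counts (
    tag_id INT UNSIGNED NOT NULL,
    posts  INT UNSIGNED NOT NULL,
    PRIMARY KEY (tag_id)
);

-- INSERT ... SELECT: fill the table from the result of a query.
INSERT INTO tag_counts (tag_id, posts)
SELECT tag_id, COUNT(*) FROM posts_tags GROUP BY tag_id;

-- INSERT ... ON DUPLICATE KEY UPDATE: insert the row, or bump the
-- counter if a row with this primary key already exists.
INSERT INTO tag_counts (tag_id, posts) VALUES (7, 1)
ON DUPLICATE KEY UPDATE posts = posts + 1;

-- REPLACE: delete any row with the same key, then insert the new one.
REPLACE INTO tag_counts (tag_id, posts) VALUES (7, 12);
```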
I use them without hesitation because they are convenient and work well in many cases. But MySQL also has some dangerous keywords that should be used sparingly. INSERT DELAYED, for example, tells MySQL that the data does not need to be inserted immediately (for example, when writing a log). The problem is that under a high data volume the insert may be delayed indefinitely, filling up the insert queue. You can also use MySQL's index hints to tell it which indexes to use. MySQL gets this right most of the time, but it can perform badly if the schema is poorly designed or the statement is poorly written.
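For reference, index hints look like the sketch below. It assumes a users table that already has an index named idx_users_last_name on last_name; that name is hypothetical.

```sql
-- Suggest an index to the optimizer; it may still choose a table scan.
SELECT * FROM users USE INDEX (idx_users_last_name)
WHERE last_name = 'Goldstein';

-- Stronger form: a table scan is used only if the index cannot be.
SELECT * FROM users FORCE INDEX (idx_users_last_name)
WHERE last_name = 'Goldstein';

-- Exclude an index from consideration, e.g. to compare plans.
SELECT * FROM users IGNORE INDEX (idx_users_last_name)
WHERE last_name = 'Goldstein';
```

Hints are best treated as a last resort, checked with EXPLAIN, since they keep applying even after the data or schema changes.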