2025-03-26 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains in detail how to optimize MySQL performance. It is shared here as a practical reference.
Generally speaking, keeping a database efficient requires good work in the following four areas:
Database table design
SQL statement optimization
Database parameter configuration
Appropriate hardware resources and operating system
In addition, appropriate use of stored procedures can also improve performance. The order above also reflects how much each of the four areas affects performance.
Database table design
A plain understanding of the three normal forms is of great benefit to database design. To apply them well, we first need to understand what each one means in practice.
First normal form: 1NF, ensuring atomicity
1NF is an atomicity constraint on attributes: each attribute (column) must be atomic and cannot be decomposed further. (Any relational database satisfies 1NF.)
Second normal form: 2NF, ensuring that every column in the table depends on the primary key
2NF is a constraint on the uniqueness of records: each record must be uniquely identifiable, that is, entities must be unique.
The table must first satisfy 1NF. Then each table must have a primary key, and every column must depend on the whole primary key rather than on only part of it (this mainly concerns composite primary keys). In other words, a table stores only one kind of data, not several.
Incorrect example: a wrong design for product order information
Correct example: a correct design for product order information
Third normal form: 3NF, ensuring that every column depends on the primary key directly, not indirectly
3NF is a constraint on field redundancy: fields must not be redundant.
Under 3NF, every column in a table depends directly on the primary key, not indirectly. Transitive dependencies are not allowed, for example non-key column A depending on non-key column B, which in turn depends on the primary key:
Key field -> non-key field x -> non-key field y
Case 1:
For example, when designing an order table, the customer number can be used as a foreign key that links the order table to the customer table. Other customer fields (such as name or company) must not be added to the order table. The design shown in the following two tables meets the third normal form.
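Case 1 can be sketched in a few lines of Python. This is only an illustration: it uses the standard-library sqlite3 module rather than MySQL so that it runs anywhere, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    company     TEXT
);
-- The order table carries only the customer number as a foreign key,
-- not the customer's name or company.
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      INTEGER NOT NULL
);
INSERT INTO customer VALUES (1, 'Alice', 'Acme');
INSERT INTO orders VALUES (100, 1, 25), (101, 1, 40);
""")

# Customer details live in one place, so a change touches exactly one row.
conn.execute("UPDATE customer SET company = 'Acme Ltd' WHERE customer_id = 1")

# Reading the customer details of an order requires a join.
rows = conn.execute("""
    SELECT o.order_id, c.name, c.company
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

Every order sees the updated company name without any order row being rewritten. The price is the join needed to read it.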
Case 2:
Assume a student relation Student (student number, name, age, college, college location, college telephone number) whose key is the single field "student number", so the following dependency holds:
(student number) -> (name, age, college, college location, college telephone number)
However, the non-key fields "college location" and "college telephone number" depend on the key field "student number" only transitively, through "college". This design therefore also suffers from data redundancy and from update, insert, and delete anomalies. The student relation should instead be split into the following two tables:
Student: (student number, name, age, college)
College: (college, location, telephone)
Advantages and disadvantages of normalization
Advantages of normalization:
There is little duplicated or redundant data, updates are faster, and normalized tables are smaller and can fit in memory.
Disadvantages of normalization:
Queries often need many joins, which increases their cost. Normalization can also defeat some indexing strategies, because columns that could have belonged to a single index in one table end up in different tables.
Advantages and disadvantages of denormalization
Advantages of denormalization:
Joins are avoided because almost all the data can be read from one table, and effective indexes can be designed.
Disadvantages of denormalization:
There is more redundant data, maintenance is harder, and important information is easily lost when data is deleted.
Suggestions on table design
A database design with no redundancy is achievable, but a database without redundancy is not necessarily the best one. Sometimes, to improve runtime efficiency, it is necessary to relax the normal form and deliberately keep some redundant data. The usual approach is to follow the third normal form when designing the conceptual data model and to consider relaxing it when designing the physical data model. Relaxing the normal form means adding fields and allowing redundancy.
The Alibaba Java Development Manual likewise allows appropriate redundancy in table design to improve SQL query performance and avoid joins between tables.
Moderate redundancy to reduce joins
This suits fields that are rarely updated but queried very frequently, such as the product name in an order or the user nickname in a Weibo post.
Split large fields vertically
For example, store the blog content separately from the blog list, so that the content is not read when the list is accessed; the content belongs to a deeper level of the logic.
Split large tables horizontally
Example: in a forum system, administrators often publish posts that must be pinned to the top of every category list.
Design plan 1: add a column to the post table that marks whether a post was published by an administrator. Displaying a category list then requires two queries against the post table, one for the pinned posts and one for the regular posts, followed by merging the two results. If the post table is large, the query for pinned posts is expensive.
Design plan 2: store the pinned posts in a separate table. Because pinned posts are few but accessed very frequently, separating them from the post table makes access much cheaper.
Appropriate data types
For the same amount of data, a smaller data type takes less storage space, so retrieving the data costs less IO and performance improves accordingly. MySQL also handles different data types differently. In calculations and sorting, for example, simpler data types perform better, so choose simple data types for fields that are frequently calculated on or sorted.
SQL statement optimization
General steps for SQL optimization
1. Use the show status command to learn how often each kind of SQL statement is executed.
2. Locate the inefficient SQL statements (mainly select).
3. Analyze the inefficient SQL with explain.
4. Identify the problem and take the corresponding optimization measures.
-- Statement classification
-- select
-- DML, data manipulation language (insert, update, delete)
-- DTL, data transaction language (commit, rollback, savepoint)
-- DDL, data definition language (create, alter, drop ...)
-- DCL, data control language (grant, revoke)
-- Common show status commands
-- Query the current session, for example: show session status like 'Com_select';
show session status like 'Com_%';
-- Query globally
show global status like 'Com_%';
-- Grant privileges to a user
grant all privileges on *.* to 'abc'@'%';
-- 'abc' is the user name and the part after '@' is the host; see the mysql.user table
-- Revoke privileges
revoke all on *.* from 'abc'@'%';
-- Refresh privileges (optional)
flush privileges;
SQL statement optimization - show status parameters
After the MySQL client connects successfully, the show [session | global] status command provides server status information. session gives the statistics for the current connection, and global gives the statistics since the database was last started. The default is the session level.
show status like 'Com_%';
Here Com_XXX is the number of times the XXX statement has been executed. Key point: from Com_select, Com_insert, Com_update, and Com_delete you can tell whether the application mainly inserts and updates or mainly queries, and roughly what proportion of executions each kind of SQL accounts for.
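The proportions mentioned above are simple arithmetic once the counters have been read. The sketch below assumes the counters were already fetched into a dictionary; the function name statement_mix and the sample numbers are hypothetical, and no database driver is involved.

```python
def statement_mix(status):
    """Return each statement type's share of all select/insert/update/delete executions.

    `status` maps counter names (as printed by show status like 'Com_%')
    to counts; int() accepts both numbers and numeric strings.
    """
    names = ("Com_select", "Com_insert", "Com_update", "Com_delete")
    counts = {name: int(status.get(name, 0)) for name in names}
    total = sum(counts.values())
    if total == 0:
        # Nothing has been executed yet, so every share is zero.
        return {name: 0.0 for name in names}
    return {name: counts[name] / total for name in names}


# Made-up counter values for illustration only.
sample = {"Com_select": "900", "Com_insert": "50", "Com_update": "30", "Com_delete": "20"}
mix = statement_mix(sample)
for name, share in mix.items():
    print(f"{name}: {share:.1%}")
```

A workload dominated by Com_select, as in this made-up sample, is read-heavy, which usually makes index design and query tuning the first place to look.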
Several other commonly used parameters give a quick picture of the database:
Connections: the number of attempts to connect to the MySQL server
Uptime: how long the server has been running (in seconds)
Slow_queries: the number of slow queries (the slow query threshold defaults to 10s)
show status like 'Connections';
show status like 'Uptime';
show status like 'Slow_queries';
Query the MySQL slow query threshold
show variables like 'long_query_time';
Change the MySQL slow query threshold
set long_query_time=2;
SQL statement optimization - locating slow queries
The previous section described how to query some of the runtime status of the MySQL database, for example:
The current MySQL uptime: show status like 'Uptime';
How many times select/update/delete... have been executed: show status like 'Com_%';
The current number of connections
Locating slow queries
How to quickly locate slow statements in a project (locate slow queries)
show variables like '%query%';
slow_query_log is OFF by default; set it to ON to enable it.
slow_query_log_file is the file in which the slow query log is recorded.
long_query_time defaults to 10s; every SQL statement whose execution time exceeds this threshold is recorded.
Viewing the slow query status
Slow_queries records the number of slow queries. When one SQL statement executes slowly, the value is 1 (it counts the slow SQL statements in the current session).
show status like '%slow_queries%';
Note:
To enable the slow query log: SET GLOBAL slow_query_log = ON;
To change the threshold to 1s: SET GLOBAL long_query_time = 1;
(After changing these settings you need to reconnect to the database. PS: settings changed only this way return to their defaults when the database service restarts; permanent changes must be made in my.ini.)
SQL statement optimization - the explain tool
Use the EXPLAIN keyword to simulate how the optimizer executes an SQL statement, so that you can analyze performance bottlenecks in the query or in the table structure. When the explain keyword is added before a select statement, MySQL sets a flag on the query, and running the query returns information about the execution plan instead of executing the SQL.
Note: if the from clause contains a subquery, the subquery is still executed and its result is placed in a temporary table.
Examples of explain analysis:
DROP TABLE IF EXISTS `actor`;
CREATE TABLE `actor` (
  `id` int(11) NOT NULL,
  `name` varchar(45) DEFAULT NULL,
  `update_time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- The row values of this INSERT are missing from the source:
-- INSERT INTO `actor` (`id`, `name`, `update_time`) VALUES ...;
CREATE TABLE `film` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- The row values of this INSERT are missing from the source:
-- INSERT INTO `film` (`id`, `name`) VALUES ...;
CREATE TABLE `film_actor` (
  `id` int(11) NOT NULL,
  `film_id` int(11) NOT NULL,
  `actor_id` int(11) NOT NULL,
  `remark` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_film_actor_id` (`film_id`, `actor_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `film_actor` (`id`, `film_id`, `actor_id`) VALUES (1,1,1), (2,1,2), (3,2,1);
explain select * from actor;
Each table in the query produces one row of output; if two tables are joined in the query, two rows are output. Each column is explained later.
Two variants of explain
explain extended
This provides some additional query optimization information on top of explain. Running show warnings immediately afterwards returns the optimized query, which shows what the optimizer has changed. There is also a filtered column, which is a percentage; rows * filtered / 100 estimates the number of rows that will be joined with the previous table in the explain output (the previous table is the one whose id value is lower than that of the current table).
explain extended select * from film where id = 1;
show warnings;
explain partitions
Compared with explain, this adds a partitions field; if the query is against a partitioned table, it shows the partitions that the query will access.
Columns in Explain
The following sections describe each column in the explain output.
id column
The id column holds the sequence number of each select: there are as many ids as there are selects, and ids increase in the order in which the selects appear. The larger the id, the higher the execution priority; rows with the same id are executed from top to bottom; rows whose id is NULL are executed last.
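The ordering rule for the id column can be written as a single sort. The sketch below assumes the explain rows have already been read into dictionaries; the helper name execution_order and the sample rows are hypothetical.

```python
def execution_order(rows):
    """Order explain rows by the rule above.

    Larger id first, equal ids in their original top-to-bottom order,
    rows whose id is None (NULL) last.
    """
    # sorted() is stable, so rows with equal keys keep their original order.
    return sorted(rows, key=lambda row: (row["id"] is None, -(row["id"] or 0)))


# Hypothetical explain output, listed top to bottom.
plan = [
    {"id": 1, "table": "t_outer"},
    {"id": 3, "table": "t_inner"},
    {"id": 2, "table": "t_a"},
    {"id": 2, "table": "t_b"},
    {"id": None, "table": "union_result"},
]
ordered = [row["table"] for row in execution_order(plan)]
print(ordered)
```

The tuple key puts the NULL test first so that such rows sort after every numbered row, whatever their position in the output.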
select_type column
select_type indicates whether the corresponding row is a simple or a complex query.
simple: a simple query. The query contains no subqueries and no union.
explain select * from film where id = 2;
primary: the outermost select in a complex query.
subquery: a subquery contained in the select list (not in the from clause).
derived: a subquery contained in the from clause. MySQL stores its result in a temporary table, also called a derived table.
The following example illustrates the primary, subquery, and derived types.
explain select (select 1 from actor where id = 1) from (select * from film where id = 1) der;
Without turning off the derived table merge optimization, a new feature in MySQL 5.7, the result is as follows:
# turn off the MySQL 5.7 merge optimization for derived tables
set session optimizer_switch='derived_merge=off';
# restore the default configuration
set session optimizer_switch='derived_merge=on';
union: the second and subsequent select in a union.
explain select 1 union all select 1;
table column
This column shows which table the explain row is accessing. When the from clause contains a subquery, the table column has the format <derivedN>, meaning that the current query depends on the query with id=N, so the query with id=N is executed first. When there is a union, the table column of the UNION RESULT row is <union1,2>, where 1 and 2 are the ids of the select rows that take part in the union.
type column
This column shows the join type or access type, that is, how MySQL decides to look up rows in the table and the approximate range of rows it examines.
From best to worst, the order is:
system > const > eq_ref > ref > range > index > ALL
In general, a query should reach at least the range level, and preferably ref.
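The ranking above lends itself to a small check, for example when reviewing many explain results at once. This is only a sketch: the names ACCESS_TYPES and at_least are hypothetical, and the list contains just the seven types named in this article, so any other value raises ValueError.

```python
# Access types from best to worst, exactly as ranked in the article.
ACCESS_TYPES = ["system", "const", "eq_ref", "ref", "range", "index", "ALL"]


def at_least(access_type, target="range"):
    """Return True if access_type is as good as or better than target."""
    # A smaller position in the list means a better access type.
    return ACCESS_TYPES.index(access_type) <= ACCESS_TYPES.index(target)


for access_type in ("ref", "range", "index", "ALL"):
    verdict = "ok" if at_least(access_type) else "needs attention"
    print(f"{access_type}: {verdict}")
```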
NULL: MySQL can resolve the query during the optimization phase, so it does not need to access the table or index at execution time. For example, selecting the minimum value of an indexed column can be done with a single index lookup, without accessing the table at execution time.
explain select min(id) from film;
system, const: MySQL can optimize part of a query and convert it into a constant (see the output of show warnings). When all columns of a primary key or unique key are compared with constants, the table has at most one matching row, which is read once, so this is fast. system is a special case of const that applies when the table contains only one row.
explain extended select * from (select * from film where id = 1) tmp;
show warnings;
eq_ref: all parts of a primary key or unique key index are used in the join, and at most one matching record is returned. This is probably the best join type apart from const; it does not appear in simple select queries.
explain select * from film_actor left join film on film_actor.film_id = film.id;
ref: unlike eq_ref, this does not use a unique index but an ordinary index or a partial prefix of a unique index. The index is compared with a value, and several matching rows may be found.
(1) Simple select query. name is an ordinary (non-unique) index.
explain select * from film where name = 'film1';
(2) Join query. idx_film_actor_id is the composite index on film_id and actor_id; the left prefix film_id of film_actor is used here.
explain select film_id from film left join film_actor on film.id = film_actor.film_id;
range: a range scan, which usually occurs with in(), between, >, <, >=, and similar operators. An index is used to retrieve the rows in a given range.
explain select * from actor where id > 1;
index: a scan of the entire index, which is usually faster than ALL.
explain select * from film;
ALL: a full table scan, meaning MySQL has to read the table from beginning to end to find the required rows. This usually needs to be optimized by adding an index.
explain select * from actor;
possible_keys column
This column shows which indexes the query might use. In explain output, possible_keys may list indexes while key shows NULL; this happens when the table holds little data and MySQL decides that the index would not help much and chooses a full table scan. If this column is NULL, there is no relevant index. In that case, check the where clause to see whether a suitable index can be created, then use explain again to see the effect.
key column
This column shows which index MySQL actually uses to optimize access to the table. If no index is used, the column is NULL. To force MySQL to use or ignore an index listed in possible_keys, use force index or ignore index in the query.
key_len column
This column shows the number of bytes of the index that MySQL uses, from which you can work out which columns of the index are used. For example, the composite index idx_film_actor_id of film_actor consists of the two int columns film_id and actor_id, and each int is 4 bytes. From key_len=4 in the result, it can be inferred that the query uses only the first column, film_id, for the index lookup.
explain select * from film_actor where film_id = 2;
The key_len calculation rules are as follows:
String types
char(n): n bytes for a single-byte character set (3n with utf8)
varchar(n): 2 bytes store the string length; with utf8 the length is 3n + 2
Numeric types
tinyint: 1 byte
smallint: 2 bytes
int: 4 bytes
bigint: 8 bytes
Time types
date: 3 bytes
timestamp: 4 bytes
datetime: 8 bytes (5 bytes plus fractional seconds storage in MySQL 5.6.4 and later)
If the field allows NULL, 1 extra byte is needed to record whether it is NULL.
The maximum length of an index is 768 bytes (the exact limit depends on the storage engine and row format). When a string is too long, MySQL does something similar to left-prefix indexing and indexes only the leading characters.
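The rules above are easy to turn into a small calculator, which helps when checking which columns of a composite index a query used. The sketch below is an illustration only: the function name key_len and its parameters are hypothetical, the fixed sizes are the ones listed in this article, and char(n) is taken as bytes_per_char * n, which reduces to n for a single-byte character set.

```python
# Fixed-size types and their byte counts, as listed in the article.
FIXED_SIZES = {
    "tinyint": 1, "smallint": 2, "int": 4, "bigint": 8,
    "date": 3, "timestamp": 4, "datetime": 8,
}


def key_len(col_type, n=0, nullable=False, bytes_per_char=3):
    """Estimate the key_len contribution of one index column.

    bytes_per_char=3 corresponds to the utf8 character set used by the
    sample tables; n is the declared length of char(n) or varchar(n).
    """
    if col_type in FIXED_SIZES:
        size = FIXED_SIZES[col_type]
    elif col_type == "char":
        size = bytes_per_char * n
    elif col_type == "varchar":
        size = bytes_per_char * n + 2  # 2 extra bytes store the string length
    else:
        raise ValueError(f"unsupported column type: {col_type}")
    return size + (1 if nullable else 0)  # 1 extra byte if NULL is allowed


# film_actor.idx_film_actor_id (film_id int, actor_id int), both NOT NULL
film_id_only = key_len("int")
both_columns = key_len("int") + key_len("int")

# film.idx_name on name varchar(10) DEFAULT NULL
film_name = key_len("varchar", 10, nullable=True)

print(film_id_only, both_columns, film_name)
```

A reported key_len equal to film_id_only rather than both_columns is what tells you that only the first column of the composite index was used.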
ref column
This column shows the columns or constants that are compared with the index named in the key column to find values. Common values are const (a constant) and a field name (for example film.id).
rows column
This column is the number of rows MySQL estimates it must read and examine. Note that this is not the number of rows in the result set.
Extra column
This column shows additional information. Common important values are as follows:
(1) Using index: a covering index is used.
explain select film_id from film_actor where film_id = 1;
(2) Using where: the where clause is used to filter the results, and the queried columns are not covered by an index.
explain select * from actor where name = 'a';
(3) Using index condition: the queried columns are not fully covered by the index, and the where condition is a range on a leading column of the index.
explain select * from film_actor where film_id > 1;
(4) Using temporary: MySQL needs to create a temporary table to process the query. This usually needs to be optimized, and the first thing to consider is an index.
actor.name has no index, so a temporary table is created to perform distinct:
explain select distinct name from actor;
film.name has the idx_name index, so extra shows Using index and no temporary table is used:
explain select distinct name from film;
(5) Using filesort: an external sort is used instead of the index. When the data is small it is sorted in memory; otherwise the sort has to be done on disk. This usually also calls for optimization with an index.
actor.name has no index, so MySQL scans the whole actor table, saves the sort key name together with the corresponding id, then sorts by name and retrieves the rows:
explain select * from actor order by name;
film.name has the idx_name index, so extra shows Using index:
explain select * from film order by name;
(6) Select tables optimized away: an aggregate function (such as max or min) is applied to an indexed field.
explain select min(id) from film;
SQL statement optimization - index best practices
# sample table
CREATE TABLE `employees` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(24) NOT NULL DEFAULT '' COMMENT 'name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'age',
  `position` varchar(20) NOT NULL DEFAULT '' COMMENT 'position',
  `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'onboarding time',
  PRIMARY KEY (`id`),
  KEY `idx_name_age_position` (`name`, `age`, `position`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COMMENT='employee record table';
INSERT INTO employees (name,age,position,hire_time) VALUES ('ZhangSan',23,'Manager',NOW());
-- the position value of the next row is garbled in the source; 'dev' is assumed
INSERT INTO employees (name,age,position,hire_time) VALUES ('HanMeimei',23,'dev',NOW());
INSERT INTO employees (name,age,position,hire_time) VALUES ('Lucy',23,'dev',NOW());
Full value match
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan';
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' AND age = 22;
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' AND age = 22 AND position = 'manager';
Leftmost prefix rule
If an index covers multiple columns, follow the leftmost prefix rule: the query must start from the leftmost column of the index and must not skip columns in the index.
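The leftmost prefix rule can be modelled in a few lines. The sketch below is a deliberate simplification that considers only equality conditions and ignores everything else the real optimizer weighs; the function name usable_prefix is hypothetical, and the column list is that of idx_name_age_position from the sample table.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that equality conditions can use.

    Walk the index from its leftmost column and stop at the first column
    that has no condition, because later columns cannot be used after a gap.
    """
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break
        used.append(column)
    return used


index = ["name", "age", "position"]  # idx_name_age_position

no_leading = usable_prefix(index, {"age", "position"})
with_gap = usable_prefix(index, {"name", "position"})
full = usable_prefix(index, {"name", "age", "position"})
print(no_leading, with_gap, full)
```

A condition set that lacks name uses nothing from the index, and a gap at age means position cannot help even though it has a condition.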
EXPLAIN SELECT * FROM employees WHERE age = 22 AND position = 'manager';
EXPLAIN SELECT * FROM employees WHERE position = 'manager';
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan';
Do not perform any operation on the index column
Performing any operation on an index column (a calculation, a function, or an automatic or manual type conversion) causes the index to fail and results in a full table scan.
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan';
EXPLAIN SELECT * FROM employees WHERE left(name,3) = 'ZhangSan';
Add an ordinary index to hire_time:
ALTER TABLE `employees` ADD INDEX `idx_hire_time` (`hire_time`) USING BTREE;
EXPLAIN select * from employees where date(hire_time) = '2020-09-30';
If the condition is rewritten as a date range query, the index can be used:
EXPLAIN select * from employees where hire_time >= '2020-09-30 00:00:00' and hire_time ...
... 22 AND position = 'manager';
Use covering indexes as much as possible
Use covering indexes as much as possible (queries that access only the index, where the index columns contain all the queried columns), and reduce the use of select *.
EXPLAIN SELECT name,age FROM employees WHERE name = 'ZhangSan' AND age = 23 AND position = 'manager';
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' AND age = 23 AND position = 'manager';
Not-equal conditions (!= or <>) generally prevent MySQL from using the index and cause a full table scan
EXPLAIN SELECT * FROM employees WHERE name != 'ZhangSan';
is null and is not null generally cannot use the index either
EXPLAIN SELECT * FROM employees WHERE name is null;
like patterns that begin with a wildcard ('%abc...') make the index fail and cause a full table scan
EXPLAIN SELECT * FROM employees WHERE name like '%Zhang';
EXPLAIN SELECT * FROM employees WHERE name like 'Zhang%';
Question: how can the index be used with like '%string%'?
Use a covering index; the queried fields must all be fields of the covering index.
EXPLAIN SELECT name,age,position FROM employees WHERE name like '%Zhang%';
If a covering index cannot be used, a search engine may be needed.
Strings without single quotation marks make the index fail
EXPLAIN SELECT * FROM employees WHERE name = '1000';
EXPLAIN SELECT * FROM employees WHERE name = 1000;
Use or and in sparingly
Use or and in sparingly. With either of them, MySQL does not necessarily use the index. The optimizer decides whether to use it based on factors such as the proportion of rows retrieved and the table size. For details, see range query optimization.
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' or name = 'HanMeimei';
Range query optimization
Add a single-column index to age
ALTER TABLE `employees` ADD INDEX `idx_age` (`age`) USING BTREE;
explain select * from employees where age >= 1 and age ...
explain select * from employees where age >= 1 and age ...
explain select * from employees where age >= 1001 and age ...