2025-01-14 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to keep a MySQL database efficient. It covers table design (the normal forms and deliberate redundancy) and SQL statement optimization (status counters, the slow query log, and EXPLAIN).
First normal form (1NF): ensure atomicity
1NF is an atomicity constraint on attributes: every attribute (column) must be atomic and cannot be decomposed further. Any table in a relational database satisfies 1NF.
Second normal form (2NF): ensure that every column depends on the whole primary key
2NF is a constraint on the uniqueness of records: every record must be uniquely identifiable, that is, each entity is unique.
First satisfy 1NF. Then give each table a primary key and make sure every other column depends on the entire primary key, not on just part of it (this matters mainly for composite primary keys). In other words, a table stores one kind of data, not several.
Incorrect example:
(Figure: wrong design of the commodity order information table)
Correct example:
(Figure: correct design of the commodity order information tables)
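Since the figures are not reproduced here, the following is only a sketch of what a second normal form violation and its fix can look like for order data. All table and column names are hypothetical and are not taken from the original figures.

```sql
-- Hypothetical "wrong" design: the primary key is (order_id, product_id),
-- but order_date depends only on order_id and product_name depends only
-- on product_id. Both are partial dependencies, so 2NF is violated.
CREATE TABLE order_detail_wrong (
  order_id     INT NOT NULL,
  product_id   INT NOT NULL,
  quantity     INT NOT NULL,
  order_date   DATETIME NOT NULL,
  product_name VARCHAR(45) NOT NULL,
  PRIMARY KEY (order_id, product_id)
);

-- Hypothetical 2NF design: each table stores one kind of data, and every
-- non-key column depends on the whole primary key of its table.
CREATE TABLE shop_order (
  order_id   INT NOT NULL PRIMARY KEY,
  order_date DATETIME NOT NULL
);

CREATE TABLE product (
  product_id   INT NOT NULL PRIMARY KEY,
  product_name VARCHAR(45) NOT NULL
);

CREATE TABLE order_item (
  order_id   INT NOT NULL,
  product_id INT NOT NULL,
  quantity   INT NOT NULL,  -- depends on both order_id and product_id
  PRIMARY KEY (order_id, product_id)
);
```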
Third normal form (3NF): ensure that every column depends directly on the primary key, not indirectly
3NF is a constraint on field redundancy: it requires that fields carry no redundant information.
Every column in the table must depend directly on the primary key, not indirectly. There must be no transitive dependencies, such as non-key column A depending on non-key column B, which in turn depends on the primary key.
Key field -> non-key field x -> non-key field y
Case 1:
For example, when designing an order table, the customer number can be used as a foreign key that links each order to the customer table. Other customer fields (such as name or company) must not be added to the order table. Splitting the data into a customer table and an order table in this way satisfies the third normal form.
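As an illustration of Case 1, here is a minimal sketch of the two tables. The names customer, customer_order, and their columns are assumptions made for this example only.

```sql
-- Hypothetical customer table: customer attributes live only here.
CREATE TABLE customer (
  customer_id INT NOT NULL PRIMARY KEY,
  name        VARCHAR(45) NOT NULL,
  company     VARCHAR(45) DEFAULT NULL
);

-- Hypothetical order table: it carries only the customer number as a
-- foreign key, not the customer's name or company.
CREATE TABLE customer_order (
  order_id    INT NOT NULL PRIMARY KEY,
  customer_id INT NOT NULL,
  order_date  DATETIME NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
);

-- Customer details are obtained through a join when they are needed.
SELECT o.order_id, o.order_date, c.name, c.company
FROM customer_order AS o
JOIN customer AS c ON c.customer_id = o.customer_id;
```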
Case 2:
Assume the student relation is Student (student number, name, age, college, college location, college telephone number), and the key is the single attribute "student number", because the following dependency holds:
(student number) -> (name, age, college, college location, college telephone number)
However, the college location and college telephone number are determined by the college, not directly by the student number: (student number) -> (college) -> (college location, college telephone number). That is, the non-key fields "college location" and "college telephone number" depend transitively on the key field "student number". This causes data redundancy as well as update, insert, and delete anomalies. The student relation should instead be split into the following two tables:
Student: (student number, name, age, college)
College: (college, location, telephone)
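The split in Case 2 could be written as the following DDL. This is a sketch only: the column types and the choice of the college name as the primary key of the college table are assumptions.

```sql
-- College attributes are stored once per college.
CREATE TABLE college (
  college   VARCHAR(45)  NOT NULL PRIMARY KEY,
  location  VARCHAR(100) NOT NULL,
  telephone VARCHAR(20)  NOT NULL
);

-- The student table keeps only the columns that depend directly on the
-- student number; the college column links to the college table.
CREATE TABLE student (
  student_number INT NOT NULL PRIMARY KEY,
  name           VARCHAR(45) NOT NULL,
  age            TINYINT UNSIGNED NOT NULL,
  college        VARCHAR(45) NOT NULL,
  FOREIGN KEY (college) REFERENCES college (college)
);
```

With this layout, changing a college's telephone number updates one row in college instead of one row per student.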
Advantages and disadvantages of normalization
Advantages of normalization:
Less duplicate data and no redundancy
Quick maintenance and updates
Normalized tables are smaller and are more likely to fit in memory.
Disadvantages of normalization:
Queries often require many joins, which increases query cost. Normalization can also defeat some indexing strategies, because columns that could have belonged to one index in a single table end up in different tables.
Advantages and disadvantages of denormalization
Advantages of denormalization:
Joins are avoided, and almost all of the data can be read from one table.
Effective indexes can be designed.
Disadvantages of denormalization:
There is more redundant data, maintenance is harder, and important information is easily lost when data is deleted.
Suggestions for table design
A design with no redundancy at all is achievable, but a database without redundancy is not necessarily the best database. Sometimes, to improve operating efficiency, it is necessary to relax the normal form requirements and keep some redundant data. The usual approach is to follow the third normal form when designing the conceptual data model and to consider relaxing it when designing the physical data model. Relaxing the normal form means adding fields and allowing redundancy.
In addition, the Alibaba Java Development Manual allows appropriate redundancy in table design to improve SQL query performance and avoid joins between tables.
Moderate redundancy to reduce joins
Good candidates are fields that are rarely updated but queried very often, such as the product name in an order or the user nickname on a Weibo post.
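A sketch of this kind of redundancy, assuming a hypothetical sales_order_item table that keeps a copy of the product name:

```sql
-- Hypothetical order item table. product_name is a deliberate redundant
-- copy of the name stored in the product table, taken at order time.
CREATE TABLE sales_order_item (
  id           BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  order_id     BIGINT NOT NULL,
  product_id   BIGINT NOT NULL,
  product_name VARCHAR(45) NOT NULL,
  quantity     INT NOT NULL,
  KEY idx_order_id (order_id)
);

-- The order page can now be rendered without joining the product table.
SELECT product_name, quantity
FROM sales_order_item
WHERE order_id = 1001;
```

The trade-off is that a product rename is no longer reflected in existing rows unless the application updates them, which is acceptable only for fields that change rarely.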
Split large fields vertically
For example, store the blog content separately from the data shown in the blog list, so that the content is not read when the list is accessed. The content is only needed one level deeper, when a single post is opened.
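A minimal sketch of such a vertical split, with hypothetical blog and blog_content tables:

```sql
-- Small, frequently scanned columns used by the list page.
CREATE TABLE blog (
  id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title      VARCHAR(100) NOT NULL,
  created_at DATETIME NOT NULL
);

-- The large column lives in its own table, keyed by the blog id.
CREATE TABLE blog_content (
  blog_id INT NOT NULL PRIMARY KEY,
  content MEDIUMTEXT NOT NULL
);

-- List page: never touches the large content column.
SELECT id, title, created_at
FROM blog
ORDER BY created_at DESC
LIMIT 20;

-- Detail page: fetch the content of one post by primary key.
SELECT content
FROM blog_content
WHERE blog_id = 1;
```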
Split large tables horizontally
Example: in a forum system, administrators often publish posts that must be pinned to the top of each category list.
Design plan 1: add a column to the post table that marks whether a post was published by an administrator. To display a category list, the post table must then be queried twice, once for the pinned posts and once for the regular posts, and the two results merged. If the post table is large, the query for pinned posts becomes expensive.
Design plan 2: store the pinned posts in a separate table. There are very few pinned posts but they are accessed very often, so separating them from the post table greatly reduces the cost of accessing them.
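Design plan 2 might look like the sketch below. The table names post and top_post, the columns, and the category id 3 are all hypothetical.

```sql
-- Regular posts: potentially a very large table.
CREATE TABLE post (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  category_id INT NOT NULL,
  title       VARCHAR(100) NOT NULL,
  created_at  DATETIME NOT NULL,
  KEY idx_category_created (category_id, created_at)
);

-- Pinned posts: same shape, but only a handful of rows.
CREATE TABLE top_post (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  category_id INT NOT NULL,
  title       VARCHAR(100) NOT NULL,
  created_at  DATETIME NOT NULL,
  KEY idx_category (category_id)
);

-- Category page: the pinned posts come from the tiny table ...
SELECT id, title FROM top_post WHERE category_id = 3;

-- ... and the regular posts from the large one.
SELECT id, title
FROM post
WHERE category_id = 3
ORDER BY created_at DESC
LIMIT 20;
```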
Appropriate data types
For the same number of rows, a smaller data type takes less storage space, so retrieving the same data costs less IO and performance improves. MySQL also processes different data types in different ways. In calculations and sorting, for example, the simpler the data type, the faster the operation, so choose simple data types for fields that are frequently calculated on or sorted.
SQL statement optimization
General steps for SQL optimization
1. Use the show status command to learn how often each kind of SQL statement is executed.
2. Locate the SQL statements that execute inefficiently (mainly select statements).
3. Analyze the inefficient SQL with explain.
4. Identify the problem and apply the corresponding optimization.
-- SQL statement classification
-- DQL (data query language): select
-- DML (data manipulation language): insert, update, delete
-- DTL (data transaction language): commit, rollback, savepoint
-- DDL (data definition language): create, alter, drop ...
-- DCL (data control language): grant, revoke
-- Common show status commands
-- Query the current session
show session status like 'Com_%'; -- for example: show session status like 'Com_select';
-- Query the global counters
show global status like 'Com_%';
-- Grant privileges to a user
grant all privileges on *.* to 'abc'@'%';
-- In 'abc'@'%', 'abc' is the user name and '%' is the host; see the mysql.user table.
-- Revoke privileges
revoke all on *.* from 'abc'@'%';
-- Refresh privileges (optional)
flush privileges;
SQL statement optimization: show status parameters
After a MySQL client connects successfully, the show [session | global] status command returns server status information. session gives the statistics of the current connection, and global gives the statistics since the database was last started. The default is session.
show status like 'Com_%';
Com_XXX is the number of times the XXX statement has been executed. Key point: the counters Com_select, Com_insert, Com_update, and Com_delete show whether the application mainly inserts and updates or mainly queries, and the approximate proportion of each kind of SQL.
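To compare the four counters side by side, they can also be read as a table. The sketch below assumes MySQL 5.7 or later, where the performance_schema.global_status table is available; on older versions, use the show status form.

```sql
-- Read the four statement counters in one result set.
-- Assumes MySQL 5.7+ with the Performance Schema enabled.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM performance_schema.global_status
WHERE VARIABLE_NAME IN ('Com_select', 'Com_insert', 'Com_update', 'Com_delete');
```

A Com_select value far larger than the other three suggests a read-heavy workload, where index and query tuning usually pay off most.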
Several other commonly used parameters give a quick picture of the database:
Connections: the number of attempts to connect to the MySQL server
Uptime: how long the server has been running (in seconds)
Slow_queries: the number of slow queries (the slow query threshold defaults to 10 s)
show status like 'Connections';
show status like 'Uptime';
show status like 'Slow_queries';
Query the MySQL slow query threshold
show variables like 'long_query_time';
Modify the MySQL slow query threshold
set long_query_time = 2;
SQL statement optimization: locating slow queries
Above, we described how to query some of the running status of the MySQL database. For example:
How long MySQL has been running: show status like 'Uptime';
How many times select/update/delete/... have been executed: show status like 'Com_%';
Current number of connections
Locating slow queries
How to quickly locate slow statements in a project (locate slow queries):
show variables like '%query%';
slow_query_log is OFF by default; set it to ON to use it.
slow_query_log_file is the file that the slow query log is written to.
long_query_time defaults to 10 s; any SQL statement that runs longer than this is recorded.
View the slow query status
Slow_queries records the number of slow queries. Each time a SQL statement runs slowly, the value increases by 1 (at session scope it counts the slow statements in the current session).
show status like '%slow_queries%';
Note:
1. How to turn on the slow query log: SET GLOBAL slow_query_log = ON;
2. Change the threshold to 1 s: SET GLOBAL long_query_time = 1;
(After changing the setting you need to reconnect to the database. PS: changes made this way are lost when the database service restarts and all settings return to their defaults; permanent changes must be made in my.ini.)
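The settings in the note can be applied and checked as in the sketch below. SELECT SLEEP(2) is used here only as an artificial slow statement for testing. The SET PERSIST lines are an alternative to editing the configuration file that exists only in MySQL 8.0 and later, so treat them as an assumption about your server version.

```sql
-- Turn on the slow query log and lower the threshold to 1 second.
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;

-- Check the global values and where the log file is written.
SELECT @@global.slow_query_log,
       @@global.long_query_time,
       @@global.slow_query_log_file;

-- The global threshold applies to new connections only, so set it for
-- the current session as well before testing.
SET SESSION long_query_time = 1;
SELECT SLEEP(2);                          -- artificial slow statement
SHOW SESSION STATUS LIKE 'Slow_queries';  -- the counter should have grown

-- MySQL 8.0+ only: persist the settings across restarts without
-- editing the configuration file.
SET PERSIST slow_query_log = ON;
SET PERSIST long_query_time = 1;
```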
SQL statement optimization: the EXPLAIN tool
Use the EXPLAIN keyword to simulate how the optimizer executes a SQL statement and to find the performance bottlenecks of a query or table structure. When the explain keyword is added before a select statement, MySQL sets a flag on the query, and running the query returns information about the execution plan instead of executing the SQL.
Note: if the from clause contains a subquery, the subquery is still executed and its result is placed in a temporary table.
EXPLAIN analysis example
DROP TABLE IF EXISTS `actor`;
CREATE TABLE `actor` (
  `id` int(11) NOT NULL,
  `name` varchar(45) DEFAULT NULL,
  `update_time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `actor` (`id`, `name`, `update_time`) VALUES ...;
DROP TABLE IF EXISTS `film`;
CREATE TABLE `film` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `film` (`id`, `name`) VALUES (3,'film0'), (1,'film1'), (2,'film2');
DROP TABLE IF EXISTS `film_actor`;
CREATE TABLE `film_actor` (
  `id` int(11) NOT NULL,
  `film_id` int(11) NOT NULL,
  `actor_id` int(11) NOT NULL,
  `remark` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_film_actor_id` (`film_id`, `actor_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `film_actor` (`id`, `film_id`, `actor_id`) VALUES (1,1,1), (2,1,2), (3,2,1);
explain select * from actor;
Each table in the query produces one row of output; if two tables are joined, two rows are output. Each column is described below.
Two variants of explain
1. explain extended
This provides some additional query optimization information on top of explain. Running show warnings immediately afterwards returns the optimized query, which shows what the optimizer changed. There is also a filtered column, which is a percentage: rows * filtered / 100 estimates the number of rows that will be joined with the previous table in the explain output (the previous table is the one whose id value is lower than that of the current table).
explain extended select * from film where id = 1;
show warnings;
2. explain partitions
Compared with explain, this adds a partitions column; if the query is against a partitioned table, it shows the partitions the query will access.
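On newer servers the two variants are not needed: the EXTENDED and PARTITIONS keywords were deprecated in MySQL 5.7 and removed in 8.0, and plain EXPLAIN reports the filtered and partitions columns by default. A sketch of the equivalent statements, using the film table from the example schema:

```sql
-- MySQL 5.7+ / 8.0: plain EXPLAIN already includes the filtered and
-- partitions columns.
EXPLAIN SELECT * FROM film WHERE id = 1;

-- SHOW WARNINGS still returns the statement as rewritten by the optimizer.
SHOW WARNINGS;

-- The JSON format exposes more detail, such as cost estimates.
EXPLAIN FORMAT=JSON SELECT * FROM film WHERE id = 1;
```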
Columns in explain
Next we describe each column in the explain output.
id column
The id column holds the sequence number of each select. There are as many ids as there are selects, and the ids increase in the order in which the selects appear. The larger the id, the higher the execution priority; rows with the same id are executed from top to bottom; rows whose id is NULL are executed last.
select_type column
select_type indicates whether the corresponding row is a simple or a complex query.
simple: a simple query that contains no subqueries and no union
explain select * from film where id = 2;
primary: the outermost select in a complex query
subquery: a subquery contained in the select list (not in the from clause)
derived: a subquery contained in the from clause. MySQL stores its result in a temporary table, also called a derived table.
The following example illustrates the primary, subquery, and derived types.
explain select (select 1 from actor where id = 1) from (select * from film where id = 1) der;
With the derived table merge optimization introduced in MySQL 5.7 left on, the subquery in the from clause is merged into the outer query. The optimization can be switched off and on as follows:
set session optimizer_switch='derived_merge=off'; # turn off the MySQL 5.7 derived table merge optimization
set session optimizer_switch='derived_merge=on'; # restore the default configuration
union: the second and subsequent select in a union
explain select 1 union all select 1;
table column
This column shows which table the explain row is accessing. When the from clause contains a subquery, the table column has the form <derivedN>, meaning that the current query depends on the query with id=N, so the query with id=N is executed first. When there is a union, the table column of the UNION RESULT row is <union1,2>, where 1 and 2 are the ids of the select rows that take part in the union.
type column
This column shows the join type or access type, that is, how MySQL decides to look up rows in the table and roughly how many rows it has to examine.
From best to worst:
system > const > eq_ref > ref > range > index > ALL
In general, a query should reach at least the range level, and preferably ref.
NULL: MySQL can resolve the query during the optimization phase, so no table or index needs to be accessed at execution time. For example, the minimum value of an indexed column can be found by looking it up in the index alone, without accessing the table at execution time.
explain select min(id) from film;
system, const: MySQL can optimize part of a query into a constant (see the output of show warnings). When all columns of a primary key or unique key are compared with constants, the table has at most one matching row, which is read once, so access is fast. system is a special case of const that applies when the table contains exactly one row.
explain extended select * from (select * from film where id = 1) tmp;
show warnings;
eq_ref: all parts of a primary key or unique key index are used in the join, and at most one matching record is returned. This is probably the best join type other than const; it does not appear in simple select queries.
explain select * from film_actor left join film on film_actor.film_id = film.id;
ref: unlike eq_ref, this uses a non-unique index or only a leftmost prefix of a unique index. The index is compared with a value, and several matching rows may be found.
(1) Simple select query. name has an ordinary (non-unique) index.
explain select * from film where name = 'film1';
(2) Join query. idx_film_actor_id is the composite index on film_id and actor_id; the query uses its left prefix film_id.
explain select film_id from film left join film_actor on film.id = film_actor.film_id;
range: a range scan, which usually occurs with in(), between, >, <, >=, and so on. An index is used to retrieve the rows in a given range.
explain select * from actor where id > 1;
index: a full scan of an index, which is usually faster than ALL.
explain select * from film;
ALL: a full table scan, meaning MySQL has to read the table from beginning to end to find the required rows. This usually calls for optimization by adding an index.
explain select * from actor;
possible_keys column
This column shows which indexes the query might use. It is possible for possible_keys to list indexes while key shows NULL. This happens when the table holds little data: MySQL decides that the index would not help the query much and chooses a full table scan. If possible_keys is NULL, there is no relevant index. In that case, check the where clause to see whether an appropriate index can be created to improve query performance, then use explain again to see the effect.
key column
This column shows which index MySQL actually uses to optimize access to the table. If no index is used, the column is NULL. To force MySQL to use or ignore an index listed in possible_keys, use force index or ignore index in the query.
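The index hints mentioned above are written after the table name. A small sketch using the film table and its idx_name index from the example schema:

```sql
-- Ask the optimizer to use idx_name where it is applicable.
EXPLAIN SELECT * FROM film FORCE INDEX (idx_name) WHERE name = 'film1';

-- Ask the optimizer not to consider idx_name at all; compare the key
-- and type columns of the two plans.
EXPLAIN SELECT * FROM film IGNORE INDEX (idx_name) WHERE name = 'film1';
```

Hints are best treated as a diagnostic tool. The optimizer's own choice is usually reasonable once statistics and indexes are in good shape.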
key_len column
This column shows the number of bytes of the index that MySQL uses, which lets you work out which columns of the index are used. For example, the composite index idx_film_actor_id on film_actor consists of two int columns, film_id and actor_id, and each int is 4 bytes. From key_len=4 in the result, you can infer that the query uses only the first column, film_id, for the index lookup.
explain select * from film_actor where film_id = 2;
The key_len calculation rules are as follows:
String
char(n): n bytes for a single-byte character set; 3n bytes with utf8
varchar(n): 2 bytes store the string length; with utf8 the length is 3n + 2
Numeric types
tinyint: 1 byte
smallint: 2 bytes
int: 4 bytes
bigint: 8 bytes
Time types
date: 3 bytes
timestamp: 4 bytes
datetime: 8 bytes (5 bytes plus fractional seconds storage since MySQL 5.6.4)
If the column allows NULL, 1 extra byte is needed to record whether the value is NULL.
The length of an index key is limited (767 bytes per column for InnoDB with the older row formats). When a string is too long, MySQL handles it like a left prefix index, taking only the leading characters for the index.
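As a worked example of these rules, the statements below use the film and film_actor tables from the example schema. The key_len values in the comments are what the rules predict for those definitions (utf8 character set, 3 bytes per character); they are derived from the rules, not copied from a captured run.

```sql
-- film.name is varchar(10), utf8, nullable:
-- 3 * 10 (characters) + 2 (length) + 1 (NULL flag) = 33
EXPLAIN SELECT * FROM film WHERE name = 'film1';

-- Only the first column of idx_film_actor_id is used:
-- film_id is int NOT NULL, so key_len = 4
EXPLAIN SELECT * FROM film_actor WHERE film_id = 2;

-- Both columns of idx_film_actor_id are used:
-- 4 (film_id) + 4 (actor_id) = 8
EXPLAIN SELECT * FROM film_actor WHERE film_id = 2 AND actor_id = 1;
```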
Ref column
This column shows the columns or constants that are compared with the index named in the key column to look up values. Common values are const (a constant) and a column name (for example, film.id).
rows column
This column is the number of rows MySQL estimates it must read and examine. Note that this is not the number of rows in the result set.
Extra column
This column shows additional information. Common important values are as follows:
(1) Using index: a covering index is used
explain select film_id from film_actor where film_id = 1;
(2) Using where: the where clause is used to filter the results, and the queried columns are not covered by an index.
explain select * from actor where name = 'a';
(3) Using index condition: the queried columns are not fully covered by the index, and the where condition is a range on a leading index column.
explain select * from film_actor where film_id > 1;
(4) Using temporary: MySQL needs to create a temporary table to process the query. This usually calls for optimization, and the first thing to consider is an index.
actor.name has no index, so a temporary table is created to perform the distinct.
explain select distinct name from actor;
film.name has the idx_name index, so Extra shows Using index and no temporary table is used.
explain select distinct name from film;
(5) Using filesort: an external sort is used instead of the index. Small data sets are sorted in memory; otherwise the sort has to be done on disk. This usually calls for optimization with an index as well.
actor.name has no index, so MySQL scans the whole actor table, saves the sort key name together with the corresponding id, then sorts by name and retrieves the rows.
explain select * from actor order by name;
film.name has the idx_name index, so Extra shows Using index.
explain select * from film order by name;
(6) Select tables optimized away: an aggregate function (such as max or min) is applied to an indexed column
explain select min(id) from film;
# sample table
CREATE TABLE `employees` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(24) NOT NULL DEFAULT '' COMMENT 'name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'age',
  `position` varchar(20) NOT NULL DEFAULT '' COMMENT 'position',
  `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'entry time',
  PRIMARY KEY (`id`),
  KEY `idx_name_age_position` (`name`, `age`, `position`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COMMENT='employee record table';
INSERT INTO employees (name, age, position, hire_time) VALUES ('ZhangSan', 23, 'Manager', NOW());
INSERT INTO employees (name, age, position, hire_time) VALUES ('HanMeimei', 23, 'dev', NOW());
INSERT INTO employees (name, age, position, hire_time) VALUES ('Lucy', 23, 'dev', NOW());
Full value matching
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan';
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' AND age = 22;
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' AND age = 22 AND position = 'manager';
Leftmost prefix rule
If an index covers multiple columns, follow the leftmost prefix rule: the query must start from the leftmost column of the index and must not skip columns in the index.
EXPLAIN SELECT * FROM employees WHERE age = 22 AND position = 'manager';
EXPLAIN SELECT * FROM employees WHERE position = 'manager';
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan';
Do not perform operations on index columns
Performing any operation on an index column (a calculation, a function, or an automatic or manual type conversion) causes the index to fail and the query to fall back to a full table scan.
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan';
EXPLAIN SELECT * FROM employees WHERE left(name, 3) = 'ZhangSan';
Add an ordinary index to hire_time:
ALTER TABLE `employees` ADD INDEX `idx_hire_time` (`hire_time`) USING BTREE;
EXPLAIN select * from employees where date(hire_time) = '2020-09-30';
If the condition is rewritten as a date range, the index can be used:
EXPLAIN select * from employees where hire_time >= '2020-09-30 00:00:00' and hire_time <= '2020-09-30 23:59:59';
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' AND age > 22 AND position = 'manager';
EXPLAIN SELECT name, age FROM employees WHERE name = 'ZhangSan' AND age = 23 AND position = 'manager';
Try to use covering indexes
Prefer covering indexes, that is, queries that access only the index because the index columns contain all the queried columns, and reduce the use of select *.
EXPLAIN SELECT name, age FROM employees WHERE name = 'ZhangSan' AND age = 23 AND position = 'manager';
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' AND age = 23 AND position = 'manager';
When a query uses not-equal (!= or <>), MySQL cannot use the index, which results in a full table scan.
EXPLAIN SELECT * FROM employees WHERE name != 'ZhangSan';
is null and is not null often cannot use the index either
EXPLAIN SELECT * FROM employees WHERE name is null;
A like pattern that begins with a wildcard ('%abc...') makes the index fail and turns the query into a full table scan
EXPLAIN SELECT * FROM employees WHERE name like '%Zhang';
EXPLAIN SELECT * FROM employees WHERE name like 'Zhang%';
Question: how can the problem that like '%string%' does not use the index be solved?
Use a covering index; the queried columns must all be columns of the covering index
EXPLAIN SELECT name, age, position FROM employees WHERE name like '%Zhang%';
If a covering index cannot be used, you may need a search engine.
A string compared without single quotation marks makes the index fail
EXPLAIN SELECT * FROM employees WHERE name = '1000';
EXPLAIN SELECT * FROM employees WHERE name = 1000;
Use or and in sparingly
Use or and in sparingly. When a query uses them, MySQL does not necessarily use an index: the optimizer decides whether to use one based on factors such as the proportion of rows retrieved and the table size. For details, see Range query optimization.
EXPLAIN SELECT * FROM employees WHERE name = 'ZhangSan' or name = 'HanMeimei';
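One rewrite that is sometimes tried when an or condition does not use the index is to split it into separate selects combined with union all, so that each branch can be resolved with its own index lookup. This is a general technique, not something the original article prescribes, and whether it helps depends on the data and the optimizer, so compare both plans with explain.

```sql
-- Each branch is an equality lookup on the leading column of
-- idx_name_age_position, so each can use the index on its own.
EXPLAIN
SELECT * FROM employees WHERE name = 'ZhangSan'
UNION ALL
SELECT * FROM employees WHERE name = 'HanMeimei';
```

UNION ALL is safe here because the two conditions cannot match the same row. If the branches can overlap, use UNION, which removes duplicates.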
Range query optimization
Add a single-column index to age
ALTER TABLE `employees` ADD INDEX `idx_age` (`age`) USING BTREE;
explain select * from employees where age >= 1 and age ...
explain select * from employees where age >= 1 and age ...
explain select * from employees where age >= 1001 and age ...