
What are the high-performance optimization techniques for MySQL

2025-02-23 Update From: SLTechnology News&Howtos

This article introduces high-performance optimization practices for MySQL. It is fairly detailed and should serve as a useful reference.

Database naming specification

All database object names must use lowercase letters, with words separated by underscores.

MySQL reserved keywords must not be used in database object names (if a name does contain a keyword, it must be enclosed in backticks in queries).

Database object names should be meaningful and should not exceed 32 characters.

Temporary tables must be prefixed with tmp_ and suffixed with a date; backup tables must be prefixed with bak_ and suffixed with a date (timestamp).

Columns that store the same data must have the same name and type in every table (such columns are usually join columns; if the join column types differ, MySQL implicitly converts the data type during the query, which invalidates the index on the column and reduces query efficiency).

Basic database design specification

1. All tables must use the InnoDB storage engine

Unless there is a special requirement that InnoDB cannot meet (such as column storage or spatial data storage), all tables must use the InnoDB storage engine (MyISAM was the default engine before MySQL 5.5; InnoDB has been the default since 5.5).

InnoDB supports transactions and row-level locks, recovers better, and performs better under high concurrency.

2. Use UTF8 uniformly as the character set for databases and tables

A unified character set avoids garbled text caused by character set conversion, and converting between different character sets invalidates indexes. If emoji need to be stored in the database, use the utf8mb4 character set.

3. All tables and fields need to be commented

Use the COMMENT clause to add comments to tables and columns, and maintain the data dictionary from the start.

4. Try to control the amount of data in a single table; keeping it within 5 million rows is recommended

5 million is not a limit of the MySQL database, but too much data makes table structure changes, backup, and recovery very difficult.

The amount of data can be controlled by archiving historical data (for log data) and by splitting databases and tables, that is, sharding (for business data).

5. Use MySQL partitioned tables with caution

A partitioned table is physically multiple files but logically one table.

Choose the partition key carefully; cross-partition queries may be less efficient.

It is recommended to manage large data sets with separate physical tables instead.

6. Try to separate hot and cold data and reduce the width of tables

MySQL limits each table to a maximum of 4096 columns, and each row of data cannot exceed 65535 bytes.

Narrower tables reduce disk IO and keep the memory cache hit ratio high for hot data (the wider the table, the more memory it takes to load it into the buffer pool, and the more IO is consumed).

Make more efficient use of the cache and avoid reading useless cold data.

Put columns that are often used together into one table (to avoid extra join operations).

7. Do not create reserved fields in tables

It is hard to name a reserved field so that its meaning is clear. The data type to be stored in a reserved field cannot be known in advance, so an appropriate type cannot be chosen. Changing the type of a reserved field later locks the table.

8. Do not store large binary data such as pictures and files in the database

Such files are usually large, so the data volume grows rapidly in a short time. Reading them from the database usually involves a large number of random IO operations, and with large files these IO operations are very time-consuming.

Such files are usually stored on a file server, and the database stores only the file address.

9. Do not run database stress tests against the production environment

10. Do not connect directly to the production database from the development or test environment

Database field design specification

1. Give priority to the smallest data type that meets the storage needs

Reason:

The larger a column's fields, the more space its index needs, so fewer index nodes fit on a page, more IO is required to traverse the index, and index performance is worse.

Methods:

Convert strings to numeric types for storage, for example by converting IP addresses to integers.

MySQL provides two functions for handling IP addresses:

INET_ATON converts an IP address to an unsigned integer (an IPv4 address fits in a 4-byte INT UNSIGNED)

INET_NTOA converts an integer IP back to an address

Before inserting data, use INET_ATON to convert the IP address to an integer, which saves space. When displaying data, use INET_NTOA to convert the integer back to an address.
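The conversion described above can be sketched outside the database as well. The following is a minimal Python illustration using the standard ipaddress module; the helper names ip_to_int and int_to_ip are made up for this example and are not MySQL functions, they only mirror the mapping for dotted-quad IPv4 addresses.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Map a dotted-quad IPv4 address to its 32-bit unsigned integer value."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value: int) -> str:
    """Map a 32-bit unsigned integer back to a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


print(ip_to_int("192.168.0.1"))  # 3232235521
print(int_to_ip(3232235521))     # 192.168.0.1
```

The largest IPv4 address maps to 4294967295, which is why an unsigned 4-byte integer column is enough.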

For non-negative data (such as auto-increment IDs and integer IPs), prefer unsigned integers for storage.

Reason:

For the same storage size, an unsigned type has twice the positive value range of a signed type.

SIGNED INT: -2147483648 to 2147483647

UNSIGNED INT: 0 to 4294967295

The N in VARCHAR(N) is the number of characters, not bytes. Storing 255 Chinese characters in a UTF8 VARCHAR(255) column takes 765 bytes. An excessive length consumes more memory.
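The difference between characters and bytes is easy to check in application code. This is a small Python sketch, not a MySQL feature; the helper name utf8_bytes is hypothetical. It also shows why emoji need a 4-byte encoding.

```python
def utf8_bytes(text: str) -> int:
    """Return how many bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


chinese = "中" * 255
print(len(chinese), utf8_bytes(chinese))  # 255 765

emoji = "😀"
print(len(emoji), utf8_bytes(emoji))      # 1 4
```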

2. Avoid using TEXT and BLOB data types. The most common TEXT type can store 64 KB of data

It is recommended to move BLOB or TEXT columns into a separate extension table.

MySQL in-memory temporary tables do not support large data types such as TEXT and BLOB. If a query includes such data, sorting and similar operations cannot use in-memory temporary tables and must use on-disk temporary tables.

For this kind of data MySQL also has to perform a second read, which makes SQL performance very poor. This does not mean such data types must never be used.

If you must use them, move the BLOB or TEXT columns into a separate extension table. Do not use SELECT * when querying; fetch only the columns you need, and do not query the TEXT column when you do not need its data.

TEXT and BLOB types can only use prefix indexes

Because MySQL limits the length of indexed fields, TEXT types can only use prefix indexes, and TEXT columns cannot have default values.

3. Avoid using ENUM types

Modifying ENUM values requires an ALTER statement.

ORDER BY on an ENUM column is inefficient and requires extra operations.

Do not use numeric values as ENUM enumeration values.

4. Define all columns as NOT NULL whenever possible

Reason:

Indexing a nullable column requires extra space, so the index takes up more room.

NULL values need special handling in comparisons and calculations.

5. Use TIMESTAMP (4 bytes) or DATETIME (8 bytes) to store time

TIMESTAMP stores times in the range 1970-01-01 00:00:01 to 2038-01-19 03:14:07.

TIMESTAMP occupies 4 bytes, the same as INT, but is more readable than INT.

Use DATETIME for values outside the TIMESTAMP range.

People often store dates as strings (an incorrect practice):

Disadvantage 1: date functions cannot be used for calculation and comparison

Disadvantage 2: storing dates as strings takes up more space

6. Finance-related amounts must use the DECIMAL type

Inexact floating-point types: FLOAT, DOUBLE

Exact type: DECIMAL

DECIMAL is an exact fixed-point type and does not lose precision in calculations.

The space it occupies is determined by the declared precision: every 9 digits are packed into 4 bytes, with the integer and fractional parts stored separately.

It can be used to store integers larger than BIGINT allows.
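The same exact-versus-inexact distinction exists in application code, which is a convenient way to see why amounts should not be stored as floating point. A minimal Python sketch using the standard decimal module (the variable names are illustrative only):

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2
print(float_total == 0.3)              # False

# Decimal arithmetic on the same amounts stays exact.
exact_total = Decimal("0.10") + Decimal("0.20")
print(exact_total == Decimal("0.30"))  # True
```

When reading DECIMAL columns into a program, keeping the values in a decimal type rather than converting them to float preserves this exactness end to end.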

Index design specification

1. Limit the number of indexes on each table. No more than 5 indexes per table is recommended

More indexes are not always better. Indexes can improve efficiency, but they can also reduce it.

Indexes speed up queries but slow down inserts and updates, and in some cases they even slow down queries.

When choosing how to optimize a query, the MySQL optimizer evaluates every usable index against its statistics in order to generate the best execution plan. If many indexes could serve the query, the optimizer takes longer to generate the plan, which also reduces query performance.

2. Do not create a separate index for every column in a table

Before version 5.6, one SQL statement could use only one index per table. From 5.6 onward the index merge optimization is available, but it is still far less effective than querying with a composite index.

3. Every InnoDB table must have a primary key

InnoDB uses index-organized tables: the logical order in which data is stored is the same as the order of the index. A table can have several indexes, but it can be stored in only one order.

InnoDB organizes tables in primary key order.

Do not use frequently updated columns as the primary key, and do not use multi-column primary keys (they are equivalent to composite indexes).

Do not use UUID, MD5, HASH, or string columns as the primary key (they cannot guarantee that values grow sequentially).

An auto-increment ID is recommended for the primary key.

Common index column recommendations

1. Columns that appear in the WHERE clause of SELECT, UPDATE, and DELETE statements

2. Fields contained in ORDER BY, GROUP BY, and DISTINCT

Do not build a separate index for every column that matches items 1 and 2. It is usually better to build a composite index on those fields.

3. Join columns of multi-table joins

How to choose the order of index columns

The purpose of an index is to find data through the index, reduce random IO, and improve query performance. The fewer rows remain after filtering by the index, the less data has to be read from disk.

Put the column with the highest selectivity on the far left of the composite index (selectivity = the number of distinct values in the column / the total number of rows).

Try to put columns with a small field length on the far left of the composite index (the smaller the field length, the more entries fit on a page, and the better the IO performance).

Put the most frequently used columns on the left of the composite index (so that fewer indexes need to be built).
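The ratio of distinct values to total rows defined above can be estimated on a sample of data before deciding on a column order. The sketch below is a rough Python illustration on made-up rows; the function names and the sample columns user_id and status are assumptions for the example, and it considers only this one criterion, not field length or usage frequency.

```python
def selectivity(values):
    """Number of distinct values divided by the total number of rows."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def order_by_selectivity(rows, columns):
    """Sort candidate index columns from most to least selective."""
    return sorted(
        columns,
        key=lambda column: selectivity(row[column] for row in rows),
        reverse=True,
    )


# Hypothetical sample rows.
sample_rows = [
    {"user_id": 1, "status": "paid"},
    {"user_id": 2, "status": "paid"},
    {"user_id": 3, "status": "new"},
    {"user_id": 4, "status": "paid"},
]

print(order_by_selectivity(sample_rows, ["status", "user_id"]))
# ['user_id', 'status']
```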

Avoid building redundant and duplicate indexes (they increase the time the query optimizer needs to generate execution plans)

Duplicate index example: primary key(id), index(id), unique index(id)

Redundant index example: index(a,b,c), index(a,b), index(a)

Prefer covering indexes for frequent queries

Covering index: an index that contains all the fields a query needs (the fields in WHERE, SELECT, ORDER BY, and GROUP BY)

Benefits of a covering index:

It avoids the second lookup through the InnoDB primary key

InnoDB stores data in clustered index order. In InnoDB, a secondary index stores the row's primary key in its leaf nodes. When a query uses a secondary index, after finding the matching key it must do a second lookup through the primary key to fetch the data actually needed.

With a covering index, all the data can be obtained from the secondary index entries, which avoids the second lookup through the primary key, reduces IO, and improves query efficiency.

It can turn random IO into sequential IO and speed up queries

Because a covering index is stored in key order, an IO-intensive range lookup needs far less IO than reading each row randomly from disk, so a covering index can turn random disk reads into sequential index reads.

Index SET specification

Avoid using foreign key constraints as much as possible

Foreign key constraints (foreign key) are not recommended, but be sure to build indexes on the associated keys between tables

Foreign keys can be used to ensure the referential integrity of data, but it is recommended to implement them on the business side.

Foreign keys affect the write operations of parent and child tables, thus reducing performance

Database SQL development specification

1. Use precompiled (prepared) statements for database operations

Precompiled statements can reuse execution plans, reduce the time needed for SQL compilation, and prevent the SQL injection problems caused by dynamic SQL.

Passing only parameters is more efficient than passing whole SQL statements.

The same statement is parsed once and used many times, which improves processing efficiency.
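The injection-safety part of this recommendation can be shown with any driver that binds parameters separately from the SQL text. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a server; MySQL client libraries use their own placeholder syntax, and the customer table and the find_by_name helper are assumptions for this example.

```python
import sqlite3

# In-memory stand-in database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, phone) VALUES (?, ?, ?)",
    [(1, "alice", "555-0100"), (2, "bob", "555-0101")],
)


def find_by_name(connection, name):
    """The statement text is fixed; only the bound parameter changes."""
    cursor = connection.execute(
        "SELECT id, name, phone FROM customer WHERE name = ?", (name,)
    )
    return cursor.fetchall()


print(find_by_name(conn, "alice"))
# The injection attempt is treated as a literal name and matches nothing.
print(find_by_name(conn, "alice' OR '1'='1"))
```

Because the input never becomes part of the SQL text, the second call returns no rows instead of the whole table.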

2. Avoid implicit conversion of data types

Implicit conversion can invalidate indexes, for example:

select name, phone from customer where id = '111'

3. Make full use of the indexes that already exist on the table

Avoid query conditions with % on both sides, for example a like '%123%' (if there is no leading % and only a trailing %, the index on the column can be used).

One SQL statement can use only one column of a composite index for a range query. For example, given a composite index on columns a, b, c, if the query condition contains a range query on column a, the index on columns b and c will not be used.

When defining a composite index, if column a will be used for range lookups, put column a on the right side of the composite index.

Use left join or not exists to optimize not in operations, because not in usually invalidates the index as well.

4. Consider future expansion when designing the database

5. Programs must use different accounts to connect to different databases, and cross-database queries are prohibited

This leaves room for database migration and for splitting databases and tables

It reduces business coupling

It avoids the security risks caused by excessive permissions

6. Do not use SELECT *; use SELECT with an explicit column list

Reason:

SELECT * consumes more CPU, IO, and network bandwidth

It cannot use a covering index

An explicit column list reduces the impact of table structure changes

7. Do not use INSERT statements without a column list

For example:

insert into t values ('a', 'b', 'c');

You should use:

insert into t (c1, c2, c3) values ('a', 'b', 'c');

8. Avoid subqueries; they can be optimized into join operations

A typical subquery sits in an IN clause. When the subquery is simple SQL (it contains no UNION, GROUP BY, ORDER BY, or LIMIT clause), it can be rewritten as a join for optimization.

Reasons for poor subquery performance:

The result set of a subquery cannot use an index. It is usually stored in a temporary table, and neither in-memory nor on-disk temporary tables have indexes, so query performance suffers. The larger the subquery's result set, the greater the impact on query performance.

Because subqueries produce many temporary tables without indexes, they consume a lot of CPU and IO resources and produce many slow queries.

9. Avoid joining too many tables with JOIN

MySQL has a join buffer, whose size is set by the join_buffer_size parameter.

In MySQL, each additional table joined in the same SQL statement is allocated one more join buffer, so the more tables a statement joins, the more memory it consumes.

If a program makes heavy use of multi-table joins and join_buffer_size is set unreasonably, the server can easily run out of memory, which affects the stability of database performance.

Joins also create temporary tables, which affects query efficiency. MySQL allows at most 61 tables in a join; joining no more than 5 tables is recommended.

10. Reduce the number of interactions with the database

The database is better suited to batch operations; merging multiple identical operations into one improves processing efficiency.

11. When testing the same column against several values, use IN instead of OR

The IN list should not exceed 500 values. IN can use indexes more effectively, whereas OR rarely uses indexes in most cases.
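One way to respect the 500-value guideline in application code is to split a long ID list into chunks and issue one IN query per chunk. The following Python sketch only builds the statements and their parameters; it assumes the %s placeholder style used by common MySQL drivers, and the customer table and helper names are hypothetical.

```python
def chunked(values, size=500):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def in_clause_queries(ids, size=500):
    """Build one parameterized IN query per chunk of IDs."""
    queries = []
    for chunk in chunked(ids, size):
        placeholders = ", ".join(["%s"] * len(chunk))
        sql = f"SELECT id, name FROM customer WHERE id IN ({placeholders})"
        queries.append((sql, chunk))
    return queries


queries = in_clause_queries(list(range(1, 1201)))
print([len(params) for _, params in queries])  # [500, 500, 200]
```

Each (sql, params) pair can then be passed to the driver's execute call, so values are still bound as parameters rather than concatenated into the SQL text.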

12. Do not use order by rand() for random sorting

order by rand() loads all qualifying rows from the table into memory, then sorts them in memory by randomly generated values, possibly generating a random value for each row. If the qualifying data set is very large, this consumes a lot of CPU, IO, and memory.

It is recommended to generate a random value in the program and then fetch the matching data from the database.
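A common way to follow this recommendation is to pick a random primary key value in the program and fetch the first row at or after it. The sketch below uses Python's sqlite3 module as a stand-in database so it runs on its own; the article table and the random_row helper are assumptions for the example. Note that when IDs have gaps, rows that follow a gap are picked more often, so this is an approximation rather than a perfectly uniform sample.

```python
import random
import sqlite3

# In-memory stand-in database with deliberately non-contiguous IDs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO article (id, title) VALUES (?, ?)",
    [(i, f"title-{i}") for i in (1, 2, 5, 9)],
)


def random_row(connection, rng=random):
    """Pick a random ID in the program, then do one primary key range lookup."""
    low, high = connection.execute("SELECT MIN(id), MAX(id) FROM article").fetchone()
    if low is None:
        return None  # empty table
    target = rng.randint(low, high)
    return connection.execute(
        "SELECT id, title FROM article WHERE id >= ? ORDER BY id LIMIT 1",
        (target,),
    ).fetchone()


print(random_row(conn))
```

Both queries use the primary key, so no full scan or in-memory sort of the table is needed.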

13. Do not apply functions or calculations to columns in the WHERE clause

When a function or calculation is applied to a column, the index on that column cannot be used.

Not recommended:

where date(create_time) = '20190101'

Recommended:

where create_time >= '20190101' and create_time < '20190102'
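The bounds of such a range are easy to compute in the application, so the column itself stays untouched in the WHERE clause. A minimal Python sketch follows; the day_range helper, the orders table, and the %s placeholder style are assumptions for illustration.

```python
from datetime import date, datetime, timedelta


def day_range(day: date):
    """Return the half-open [start, end) bounds covering one calendar day."""
    start = datetime.combine(day, datetime.min.time())
    end = start + timedelta(days=1)
    fmt = "%Y-%m-%d %H:%M:%S"
    return start.strftime(fmt), end.strftime(fmt)


start, end = day_range(date(2019, 1, 1))
# Hypothetical query: the indexed column is compared directly with the bounds.
sql = "SELECT id FROM orders WHERE create_time >= %s AND create_time < %s"
print(start, end)  # 2019-01-01 00:00:00 2019-01-02 00:00:00
```

Using a half-open range avoids having to know the last representable instant of the day and handles month and year boundaries automatically.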

14. Use UNION ALL instead of UNION when it is clear that there are no duplicate values

UNION puts all rows of both result sets into a temporary table and then removes duplicates.

UNION ALL does not deduplicate the result set.

15. Split a large, complex SQL statement into several small ones

A large SQL statement is logically complex and needs a lot of CPU to compute.

In MySQL, one SQL statement can use only one CPU.

After splitting, the smaller statements can run in parallel to improve processing efficiency.

Database operation code of conduct

Batch write operations (UPDATE, DELETE, INSERT) that affect more than 1 million rows must be split into multiple batches

1. Large batch operations may cause serious master-slave delay

In a master-slave environment, large batch writes usually take a long time to execute and may cause serious replication delay.

A statement is executed on the slaves only after it has finished on the master, so the slaves lag behind the master for a long time.

2. When the binlog is in row format, a large amount of log data is generated

Large batch writes generate a lot of log data, especially with a row-format binlog. Row format records the change to every row, so the more rows are modified at once, the more log data is generated and the longer log transmission and replay take, which is one cause of master-slave delay.

3. Avoid large transactions

A large batch modification is carried out within one transaction, which locks a large amount of data in the table and causes heavy blocking, with a great impact on MySQL performance.

In particular, long-lasting blocking can use up all available database connections, which prevents other applications in the production environment from connecting to the database, so batching write operations is essential.
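Splitting a large write into batches can be sketched as a loop over primary key ranges, committing after each range so that every transaction stays small. The example below uses Python's sqlite3 module as a stand-in database so it is runnable as written; the event_log table, its expired column, the batch size, and the helper name are all assumptions, and a real job would likely also pause between batches to let replicas catch up.

```python
import sqlite3

# In-memory stand-in table: 1000 rows, odd IDs marked as expired.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, expired INTEGER)")
conn.executemany(
    "INSERT INTO event_log (id, expired) VALUES (?, ?)",
    [(i, i % 2) for i in range(1, 1001)],
)
conn.commit()


def delete_expired_in_batches(connection, batch_size=100):
    """Delete expired rows one primary key range at a time, committing each batch."""
    low, high = connection.execute("SELECT MIN(id), MAX(id) FROM event_log").fetchone()
    if low is None:
        return 0  # nothing to delete
    deleted = 0
    start = low
    while start <= high:
        cursor = connection.execute(
            "DELETE FROM event_log WHERE id >= ? AND id < ? AND expired = 1",
            (start, start + batch_size),
        )
        connection.commit()  # one small transaction per batch
        deleted += cursor.rowcount
        start += batch_size
    return deleted


removed = delete_expired_in_batches(conn)
print(removed)  # 500
```

Each statement touches at most one ID range, so locks are held briefly and each batch produces only a small amount of log data.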

Use pt-online-schema-change to modify the table structure of large tables

It avoids the master-slave delay caused by modifying a large table

It avoids locking the table while its fields are modified

Changing the structure of a large table must be done with care: it can cause severe table locking, which is intolerable in a production environment.

pt-online-schema-change first creates a new table with the same structure as the original table and applies the structure change to the new table. It then copies the data from the original table to the new table and adds triggers to the original table.

The triggers copy newly written data from the original table to the new table. After all rows have been copied, the new table is renamed to the original table's name and the original table is dropped. This breaks the original DDL operation into many small batches.

Do not grant SUPER privilege to accounts used by programs

When the maximum number of connections is reached, MySQL still allows one connection from a user with SUPER privilege

SUPER privilege should be reserved for the DBA accounts used to deal with problems

Accounts that programs use to connect to the database must follow the principle of least privilege

A program's database account may be used within one database only, not across databases

In principle, accounts used by programs must not have the DROP privilege
