2025-01-15 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
Today I will talk about the operating guidelines for MySQL, which many people may not know well. To help you understand them, the editor has summarized the following content. I hope you get something out of this article.
I. Database naming specification
All database object names must be in lowercase letters and separated by underscores
MySQL reserved keywords must not be used in database object names (if a name does contain a keyword, it has to be quoted with backticks in queries)
Database object names should be self-explanatory and should not exceed 32 characters.
Temporary tables must be prefixed with tmp and suffixed with the date; backup tables must be prefixed with bak and suffixed with the date (timestamp)
Columns that store the same data must have the same name and the same type in every table (such columns are usually join columns; if their types differ, the query performs an implicit type conversion, which invalidates the index on the column and slows the query down).
II. Basic design specification of database
1. All tables must use the Innodb storage engine
Unless there is a special requirement that InnoDB cannot meet (such as columnar storage or spatial data), all tables must use the InnoDB storage engine (MyISAM was the default before MySQL 5.5; InnoDB has been the default since 5.5). InnoDB supports transactions and row-level locking, recovers better after a crash, and performs better under high concurrency.
2. UTF8 is used in the character set of database and table.
Compatibility is better, and a unified character set avoids the garbled text caused by character set conversion. Columns with different character sets have to be converted before they can be compared, which makes the index unusable.
3. All tables and fields need to be annotated
Use comment clauses to add comments for tables and columns to maintain the data dictionary from the start.
4. Try to control the amount of data in a single table. It is recommended to control it within 5 million.
5 million is not a limit of MySQL itself, but an oversized table causes serious problems for table structure changes, backup and recovery.
The amount of data can be controlled by archiving historical data (for log data) and by splitting databases and tables (for business data).
5. Use MySQL partitioned tables cautiously
A partitioned table is physically stored as multiple files but logically behaves as one table. Choose the partition key carefully, because cross-partition queries can be less efficient. Managing large data sets with physically separate tables is recommended instead.
6. try to separate hot and cold data and reduce the width of the table.
MySQL allows at most 4096 columns per table, and a row cannot exceed 65535 bytes.
Narrower tables reduce disk IO and protect the cache hit ratio of hot data (the wider the table, the more memory it takes to load it into the buffer pool, and the more IO is consumed).
They also use the cache more efficiently by not reading useless cold data.
Put columns that are often used together into the same table (to avoid extra join operations).
7. Prohibit the establishment of reserved fields in the table
A reserved field is hard to name in a self-explanatory way. Its data type cannot be chosen properly, because what it will store is not yet known. Changing the type of a reserved field later locks the table.
8. It is prohibited to store large binary data such as pictures and files in the database
Such files are usually large and make the data volume grow quickly in a short time. Reading them from the database usually involves many random IO operations, which are very time-consuming for large files. Store the files on a file server instead, and keep only the file address in the database.
9. Do not do database stress testing online
10. It is forbidden to connect directly to the production database from the development and test environments.
III. Database field design specification
1. Give priority to selecting the smallest data type that meets the storage needs
Reason
The larger the column, the more space its index needs, so fewer index entries fit on a page, more IO is needed to traverse the index, and index performance gets worse.
Method
1) Store strings as numeric types where possible, for example by converting IP addresses to integers.
MySQL provides two functions for handling IP addresses:
Before inserting, use inet_aton to convert the IP address to an integer, which saves space. When displaying, use inet_ntoa to convert the integer back to an address.
2) For non-negative data (such as auto-increment IDs and integer IPs), prefer unsigned integers.
Reason: for the same storage size, an unsigned type holds twice the range of positive values of its signed counterpart.
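The two points above can be checked outside the database. The following is a minimal Python sketch, not MySQL code: the helper names ip_to_int and int_to_ip are made up for illustration and only mimic what inet_aton and inet_ntoa do for IPv4 addresses.

```python
import ipaddress

# Bounds of a 4-byte integer column (the same 32 bits, interpreted two ways).
SIGNED_INT_MAX = 2**31 - 1      # 2147483647
UNSIGNED_INT_MAX = 2**32 - 1    # 4294967295


def ip_to_int(address: str) -> int:
    """Hypothetical helper: dotted-quad IPv4 string to its 32-bit integer."""
    return int(ipaddress.IPv4Address(address))


def int_to_ip(number: int) -> str:
    """Hypothetical helper: 32-bit integer back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(number))


if __name__ == "__main__":
    n = ip_to_int("192.168.1.1")
    print(n, int_to_ip(n))
    # Addresses in the upper half of the IPv4 space exceed the signed bound,
    # which is why an unsigned column is the natural fit for integer IPs.
    print(ip_to_int("255.255.255.255") > SIGNED_INT_MAX)
```

The integer form needs 4 bytes, while the longest dotted-quad string needs 15 characters.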
The N in VARCHAR(N) is the number of characters, not bytes.
Storing 255 Chinese characters in a UTF8 VARCHAR(255) column takes 765 bytes. An overly long definition consumes more memory.
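The 765-byte figure follows from UTF-8 using 3 bytes for each common Chinese character. A small Python sketch of the arithmetic (the helper name utf8_bytes is an assumption for illustration, and the sample character is arbitrary):

```python
def utf8_bytes(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # 255 characters in both cases, but very different byte counts.
    print(utf8_bytes("中" * 255))  # Chinese characters: 3 bytes each
    print(utf8_bytes("a" * 255))   # ASCII letters: 1 byte each
```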
2. Avoid using TEXT and BLOB data types. The most common TEXT type can store 64 KB of data.
It is recommended to move BLOB or TEXT columns into a separate extension table
MySQL in-memory temporary tables do not support large data types such as TEXT and BLOB. If a query includes such columns, sorting and similar operations cannot use an in-memory temporary table and must fall back to an on-disk temporary table.
MySQL also needs a second lookup to fetch this kind of data, which makes the SQL perform poorly. This does not mean these types must never be used.
If you must use them, move the BLOB or TEXT columns into a separate extension table. Do not use select * when querying; fetch only the columns you need, and leave the TEXT column out when its data is not needed.
TEXT and BLOB columns can only use prefix indexes
Because MySQL limits the length of indexed fields, TEXT columns can only have prefix indexes, and TEXT columns cannot have default values.
3. Avoid using ENUM types
The ALTER statement is required to modify the ENUM value
ORDER BY of ENUM type is inefficient and requires additional operations
Prohibit the use of numeric values as enumerated values of ENUM
4. Define all columns as NOT NULL wherever possible
Reason:
Indexing a nullable column needs extra space to record NULLs, so the index takes up more room
NULL values need special handling in comparisons and calculations.
5. Use TIMESTAMP (4 bytes) or DATETIME type (8 bytes) to store time
TIMESTAMP stores times from 1970-01-01 00:00:01 to 2038-01-19 03:14:07 (UTC).
TIMESTAMP occupies 4 bytes, the same as INT, but is more readable than INT
Use DATETIME for values outside the TIMESTAMP range.
People often store dates as strings (an incorrect practice):
Disadvantage 1: date functions cannot be used to calculate and compare the values
Disadvantage 2: storing dates as strings takes up more space
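The upper end of the TIMESTAMP range is simply the largest number of seconds that fits in a signed 4-byte integer, counted from 1970-01-01 00:00:00 UTC. A short Python sketch of that arithmetic (the function name fits_in_timestamp is hypothetical, and it expects timezone-aware values):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
TIMESTAMP_MIN = EPOCH + timedelta(seconds=1)
TIMESTAMP_MAX = EPOCH + timedelta(seconds=2**31 - 1)


def fits_in_timestamp(moment: datetime) -> bool:
    """True if a timezone-aware moment lies inside the TIMESTAMP range."""
    return TIMESTAMP_MIN <= moment <= TIMESTAMP_MAX


if __name__ == "__main__":
    print(TIMESTAMP_MAX.isoformat())
    print(fits_in_timestamp(datetime(2040, 1, 1, tzinfo=timezone.utc)))
```

A value that fails this check is a candidate for DATETIME.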
6. Amounts related to finance must use the decimal type.
Approximate floating point: float, double
Exact: decimal
The decimal type stores exact numeric values and does not lose precision in calculations. The space it occupies is determined by the defined width, with every 9 digits packed into 4 bytes. It can also store integers larger than bigint.
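The difference between approximate and exact types is easy to reproduce in application code. This Python sketch uses the standard decimal module as an analogy for the MySQL decimal type; the function names and sample prices are made up for illustration.

```python
from decimal import Decimal


def total_float(amounts):
    """Sum amounts as binary floating point (approximate)."""
    return sum(float(a) for a in amounts)


def total_decimal(amounts):
    """Sum amounts as exact decimal numbers."""
    return sum((Decimal(a) for a in amounts), Decimal("0"))


if __name__ == "__main__":
    prices = ["0.10", "0.20"]
    print(total_float(prices))    # carries a small binary rounding error
    print(total_decimal(prices))  # exact decimal result
```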
IV. Specification for index design
1. Limit the number of indexes on each table. It is recommended that there are no more than 5 indexes per table.
More indexes are not always better! Indexes can improve efficiency, but they can also reduce it.
Indexes speed up queries, but they slow down inserts and updates, and in some cases they even slow down queries.
This is because, when choosing how to optimize a query, the MySQL optimizer evaluates every usable index against the statistics in order to produce the best execution plan. If many indexes can serve the same query, the optimizer needs more time to produce the plan, which also reduces query performance.
2. It is forbidden to create a separate index for each column in the table
Before version 5.6, one SQL statement could use only one index per table. From 5.6 on there is the index merge optimization, but it is still far less effective than a query that uses a composite index.
3. Each Innodb table must have a primary key
InnoDB tables are index-organized: the logical order in which the data is stored is the same as the order of the index.
A table can have many indexes but only one storage order, and InnoDB organizes the table in primary key order.
Do not use frequently updated columns as the primary key, and do not use multi-column primary keys (they are equivalent to composite indexes). Do not use UUID, MD5, HASH or string columns as the primary key (they cannot guarantee that the data grows sequentially).
An auto-increment ID is recommended as the primary key.
4. Common index column recommendations
1) Columns that appear in the WHERE clause of SELECT, UPDATE and DELETE statements
2) Columns contained in ORDER BY, GROUP BY and DISTINCT
Do not build a separate index on every column that matches 1) and 2). It is usually better to build a composite index on the columns from 1) and 2).
3) Join columns of multi-table joins
5. How to choose the order of index columns
The purpose of an index is to find data through it, reduce random IO and improve query performance. The fewer rows that remain after filtering through the index, the less data has to be read from disk.
Put the most selective column at the far left of the composite index (selectivity = number of distinct values in the column / total number of rows)
Try to put columns with a small field length at the far left of the composite index (the smaller the field, the more entries fit on a page and the better the IO performance)
Put the most frequently used columns at the left of the composite index (so that fewer indexes need to be built).
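The selectivity formula in the first rule can be applied to a sample of column values before deciding on a column order. The following Python sketch only illustrates the formula; the column names and sample data are hypothetical, and in practice the same ratio is usually computed in SQL over the real table.

```python
def selectivity(values):
    """Number of distinct values divided by the total number of rows."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def order_by_selectivity(columns):
    """Return column names sorted from most to least selective.

    columns maps a column name to a list of sampled values.
    """
    return sorted(columns, key=lambda name: selectivity(columns[name]), reverse=True)


if __name__ == "__main__":
    sample = {
        "status": ["paid", "paid", "new", "new"],  # 2 distinct values in 4 rows
        "user_id": [11, 12, 13, 14],               # 4 distinct values in 4 rows
    }
    print(order_by_selectivity(sample))
```

This covers only the first rule; field length and usage frequency still have to be weighed by hand.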
6. Avoid redundant and duplicate indexes
They increase the time the query optimizer needs to generate an execution plan.
Examples of duplicate indexes: primary key (id), index (id), unique index (id)
Examples of redundant indexes: index (a,b,c), index (a,b), index (a)
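An index is redundant in this sense when its column list is a leftmost prefix of another index on the same table. A minimal Python sketch of that check follows; the function names are hypothetical, and it deliberately ignores uniqueness, so a unique index or primary key that enforces a constraint would need separate treatment.

```python
def is_left_prefix(shorter, longer):
    """True if the columns of shorter are a strict leftmost prefix of longer."""
    shorter, longer = tuple(shorter), tuple(longer)
    return len(shorter) < len(longer) and longer[:len(shorter)] == shorter


def find_redundant(indexes):
    """Return the indexes that are covered by a longer index with the same prefix."""
    return [
        idx for idx in indexes
        if any(is_left_prefix(idx, other) for other in indexes)
    ]


if __name__ == "__main__":
    indexes = [("a", "b", "c"), ("a", "b"), ("a",)]
    # index (a,b) and index (a) are both leftmost prefixes of index (a,b,c)
    print(find_redundant(indexes))
```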
7. Prefer covering indexes for frequent queries
Covering index: an index that contains all the fields of the query (the fields in where, select, order by and group by)
Benefits of a covering index:
It avoids the second lookup on InnoDB tables
InnoDB stores data in clustered index order. In InnoDB, the leaf nodes of a secondary index store the primary key of the row.
When data is queried through a secondary index, after the matching key is found a second lookup through the primary key is needed to get the actual row. With a covering index, all the data can be taken from the secondary index entries, which avoids the second lookup on the primary key, reduces IO and improves query efficiency.
It turns random IO into sequential IO and speeds up queries
Because a covering index is stored in key order, an IO-intensive range lookup needs far less IO than reading each row from disk at random, so a covering index can turn random disk reads into sequential IO over the index.
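Whether a query is covered can be read from the query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in so that it runs without a server; this is an assumption for illustration only, since SQLite is not MySQL and its plan output differs from MySQL's EXPLAIN. The index name and the city column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT, city TEXT)"
)
# Composite index on the filter column followed by the selected column.
conn.execute("CREATE INDEX idx_city_name ON customer (city, name)")


def query_plan(sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Every column the query touches is in the index, so no row lookup is needed.
covered = query_plan("SELECT name FROM customer WHERE city = 'x'")
# phone is not in the index, so each match needs a second lookup in the table.
not_covered = query_plan("SELECT phone FROM customer WHERE city = 'x'")

if __name__ == "__main__":
    print(covered)
    print(not_covered)
```

In MySQL the analogous signal is "Using index" in the Extra column of EXPLAIN.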
8. Index SET specification
Avoid using foreign key constraints as much as possible
Foreign key constraints (foreign key) are not recommended, but be sure to build indexes on the associated keys between tables
Foreign keys can be used to ensure the referential integrity of data, but it is recommended to implement them on the business side.
Foreign keys affect the write operations of parent and child tables, thus reducing performance.
V. Database SQL development specification
1. It is recommended to use prepared (precompiled) statements for database operations.
Prepared statements can reuse execution plans, which reduces SQL compilation time, and they solve the SQL injection problem caused by dynamic SQL. Passing only parameters is more efficient than passing whole SQL statements, and the same statement can be parsed once and used many times, which improves processing efficiency.
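The injection point is easiest to see side by side. This is a sketch using Python's sqlite3 module as a stand-in database (an assumption for illustration; MySQL drivers work the same way in principle but typically use a different placeholder style). The table contents and function names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, phone) VALUES (?, ?, ?)",
    [(1, "alice", "111"), (2, "bob", "222")],
)


def find_by_name(name):
    """Parameterized query: the value is passed separately from the SQL text."""
    return conn.execute(
        "SELECT id, name FROM customer WHERE name = ?", (name,)
    ).fetchall()


def find_by_name_unsafe(name):
    """Dynamic SQL built by string concatenation (shown only as the anti-pattern)."""
    return conn.execute(
        "SELECT id, name FROM customer WHERE name = '" + name + "'"
    ).fetchall()


if __name__ == "__main__":
    malicious = "x' OR '1'='1"
    print(find_by_name(malicious))         # treated as a plain string value
    print(find_by_name_unsafe(malicious))  # the input rewrites the condition
```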
2. Avoid implicit conversion of data types
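Implicit conversion can make an index unusable. For example: select name,phone from customer where id = '111'. Here a string literal is compared with the id column; when the column and the value differ in type, MySQL converts one side, and if the column side has to be converted, the index on it cannot be used.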
3. Make full use of the existing indexes on the table
Avoid query conditions with % on both sides.
For example a like '%123%' (if there is no leading %, only a trailing %, the index on the column can be used)
One SQL statement can use only one column of a composite index for a range query.
For example, with a composite index on columns a, b and c, if the query condition contains a range lookup on column a, the index on columns b and c will not be used. When defining the composite index, a column that is used for range lookups should therefore be placed on the right side of the index.
Use left join or not exists to optimize not in operations
Because not in usually makes the index unusable as well.
4. When designing the database, we should consider the future expansion.
5. Programs use different accounts to connect to different databases, and cross-database queries are prohibited.
Leave room for database migration and for splitting databases and tables
Reduce business coupling
Avoid the security risk caused by excessive permissions
6. Prohibit the use of SELECT *; queries must use SELECT <field list>
Reason:
SELECT * consumes more CPU, IO and network bandwidth
SELECT * cannot use a covering index
An explicit field list reduces the impact of table structure changes
7. Prohibit the use of INSERT statements without field lists
For example: insert into t values ('a','b','c');
You should use: insert into t(c1,c2,c3) values ('a','b','c');
8. Avoid using subqueries and optimize them to join operations.
Typically the subquery sits in an in clause. When the subquery is simple SQL (no union, group by, order by or limit clauses), it can be rewritten as a join for optimization.
Reasons for the poor performance of subqueries:
The result set of a subquery cannot use an index. It is usually stored in a temporary table, and neither in-memory nor on-disk temporary tables have indexes, so query performance suffers to some extent.
The larger the result set returned by the subquery, the greater the impact on query performance.
Because subqueries produce many temporary tables without indexes, they consume a lot of CPU and IO resources and produce many slow queries.
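A simple in subquery and its join rewrite return the same rows, which can be checked on a small data set. The sketch below runs both forms through Python's sqlite3 module as a stand-in (an assumption; it shows only that the results match, not MySQL's performance). The orders table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
# alice has two orders, carol has one, bob has none.
conn.executemany("INSERT INTO orders (customer_id) VALUES (?)", [(1,), (1,), (3,)])

IN_SUBQUERY = (
    "SELECT id, name FROM customer "
    "WHERE id IN (SELECT customer_id FROM orders) ORDER BY id"
)
# DISTINCT is needed because the join yields one row per matching order.
JOIN_FORM = (
    "SELECT DISTINCT c.id, c.name FROM customer c "
    "JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
)


def run(sql):
    """Execute a query and return all rows."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(run(IN_SUBQUERY))
    print(run(JOIN_FORM))
```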
9. Avoid using JOIN to associate too many tables
MySQL has a join buffer, whose size is set with the join_buffer_size parameter.
For each additional table joined in the same SQL statement, MySQL allocates one more join buffer, so the more tables a statement joins, the more memory it uses.
If multi-table joins are used heavily in a program and join_buffer_size is set unreasonably, the server can easily run out of memory, which affects the stability of the database.
Joins can also create temporary tables, which affects query efficiency. MySQL allows at most 61 tables in a join; no more than 5 is recommended.
10. Reduce the number of interactions with the database
The database is better suited to batch operations. Merging several identical operations into one improves processing efficiency.
11. When testing the same column against several values, use in instead of or
An in list should not contain more than 500 values. in can make effective use of the index, while or rarely uses an index in most cases.
12. Random sorting using order by rand () is prohibited
order by rand() loads all qualifying rows into memory, generates a random value for each row, and then sorts all of them in memory by that value. If the qualifying data set is large, this consumes a lot of CPU, IO and memory.
It is recommended to generate the random value in the program and then fetch the matching data from the database.
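One common way to do this is to pick a random id between the smallest and largest primary key and fetch the first row at or after it. The following Python sketch uses sqlite3 as a stand-in database (an assumption for illustration); the function name and sample rows are made up, and rows that follow a gap in the ids are picked more often than others, so this is not a perfectly uniform sample.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (5, "carol")],  # ids 3 and 4 are missing on purpose
)


def random_row(rng=random):
    """Pick the random value in the program, then fetch one row by primary key."""
    low, high = conn.execute("SELECT MIN(id), MAX(id) FROM customer").fetchone()
    if low is None:  # empty table
        return None
    target = rng.randint(low, high)
    # The primary key lookup avoids sorting the whole table.
    return conn.execute(
        "SELECT id, name FROM customer WHERE id >= ? ORDER BY id LIMIT 1",
        (target,),
    ).fetchone()


if __name__ == "__main__":
    print(random_row())
```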
13. Function conversion and calculation of columns are prohibited in WHERE clauses
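When a column is wrapped in a function or used in a calculation, the index on it cannot be used.
Not recommended: applying a function to the indexed column in the condition, for example wrapping a datetime column in date() and comparing the result with a constant.
Recommended: leave the column bare and express the same condition as a range on the column itself.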
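The effect of wrapping a column in a function can be seen in a query plan. This sketch uses Python's sqlite3 module as a stand-in (an assumption; SQLite's planner and plan output are not MySQL's, but the same principle applies). The orders table, the create_time column and the dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, create_time TEXT NOT NULL, amount INTEGER)"
)
conn.execute("CREATE INDEX idx_create_time ON orders (create_time)")


def query_plan(sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Function applied to the column: the index on create_time cannot be searched.
wrapped = query_plan(
    "SELECT * FROM orders WHERE date(create_time) = '2019-01-01'"
)
# Same condition written as a range on the bare column: the index is usable.
bare = query_plan(
    "SELECT * FROM orders "
    "WHERE create_time >= '2019-01-01' AND create_time < '2019-01-02'"
)

if __name__ == "__main__":
    print(wrapped)
    print(bare)
```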
14. Use UNION ALL instead of UNION when it is clear that there are no duplicate values
UNION puts all the rows of both result sets into a temporary table and then removes duplicates.
UNION ALL does not deduplicate the result set.
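The difference in results is visible on two small tables that share one value. The sketch below runs both forms through Python's sqlite3 module as a stand-in (an assumption for illustration); the table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_2023 (v INTEGER)")
conn.execute("CREATE TABLE orders_2024 (v INTEGER)")
conn.executemany("INSERT INTO orders_2023 (v) VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO orders_2024 (v) VALUES (?)", [(2,), (3,)])


def values(sql):
    """Run a single-column query and return the values as a list."""
    return [row[0] for row in conn.execute(sql)]


# UNION removes the duplicate value 2, which costs an extra deduplication step.
union_rows = values(
    "SELECT v FROM orders_2023 UNION SELECT v FROM orders_2024 ORDER BY v"
)
# UNION ALL simply concatenates the two result sets.
union_all_rows = values(
    "SELECT v FROM orders_2023 UNION ALL SELECT v FROM orders_2024 ORDER BY v"
)

if __name__ == "__main__":
    print(union_rows)
    print(union_all_rows)
```

When the two inputs are known to be disjoint, both forms return the same rows and UNION ALL skips the deduplication work.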
15. Split a complex large SQL into multiple small SQL
Large SQL: a SQL statement that is logically complex and needs a lot of CPU for its computation
In MySQL, one SQL statement can use only one CPU
After the SQL is split, the parts can be executed in parallel to improve processing efficiency.
VI. Code of conduct for database operation
1. Batch write operations (UPDATE, DELETE, INSERT) of more than 1 million rows should be split into several smaller batches
Large batch operations may cause serious master-slave delay
In a master-slave environment, a large batch write usually takes a long time to execute, and it is executed on the slaves only after it has finished on the master, so it causes a long delay between master and slaves.
When the binlog is in row format, a large volume of logs is generated
Large batch writes generate a lot of binary log data, especially in row format. Row format records the change to every row, so the more rows are modified at once, the more log is produced and the longer log transfer and replay take. This is another cause of master-slave delay.
Avoid large transactions
A large batch modification runs in a single transaction, which locks a large amount of data in the table and causes heavy blocking, with a great impact on MySQL performance.
In particular, long blocking can use up all available database connections and prevent other applications in production from connecting, so batch writes need special care.
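Splitting a large delete into batches can be sketched as a loop that removes a bounded number of rows and commits after each round. The example uses Python's sqlite3 module as a stand-in (an assumption for illustration); the access_log table, the cutoff date and the batch size are hypothetical. The id subquery is the SQLite-friendly form; in MySQL a single-table DELETE accepts a LIMIT clause directly, which would replace the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_log (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)"
)
# 2500 old rows to purge and 10 recent rows to keep.
conn.executemany(
    "INSERT INTO access_log (created_at) VALUES (?)",
    [("2020-01-01",)] * 2500 + [("2030-01-01",)] * 10,
)
conn.commit()


def delete_in_batches(cutoff, batch_size=1000):
    """Delete rows older than cutoff in small transactions; return rows deleted."""
    total = 0
    while True:
        cursor = conn.execute(
            "DELETE FROM access_log WHERE id IN ("
            " SELECT id FROM access_log WHERE created_at < ? ORDER BY id LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # each batch is its own short transaction
        if cursor.rowcount == 0:
            break
        total += cursor.rowcount
    return total
```

Calling delete_in_batches("2025-01-01") removes the old rows in rounds of at most 1000. A short pause between rounds is a common addition when replicas need time to catch up.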
2. Use pt-online-schema-change to modify the table structure for large tables
Avoid the master-slave delay caused by modifying large tables
Avoid locking the table while its fields are modified
Changing the structure of a large table must be done carefully. It can lock the table for a long time, which is intolerable in production.
pt-online-schema-change first creates a new table with the same structure as the original, applies the structure change to the new table, then copies the data from the original table to the new one and adds triggers to the original table.
The triggers copy data newly written to the original table into the new table. After all rows are copied, the new table is renamed to the original name and the original table is dropped.
In this way the original DDL operation is broken down into many small batches.
3. It is prohibited to grant super permissions to accounts used by the program.
When the maximum number of connections is reached, MySQL still allows one more connection from a user with the super privilege. The super privilege should be held only by the DBA account that deals with problems.
4. Accounts used by programs to connect to the database must follow the principle of least privilege
An account used by a program may be used within one DB only, not across databases. In principle, accounts used by programs must not have the drop privilege.
After reading the above, do you have a better understanding of the MySQL operating guidelines?