2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
What are the MySQL database specifications? This article walks through the commonly used conventions in detail, together with the reasoning behind them, in the hope of helping readers who face this question find a simple and practical approach.
I. Database naming specification
All database object names must use lowercase letters, with words separated by underscores.
MySQL reserved keywords are prohibited in all database object names (if a query references a table name that contains a keyword, the name must be enclosed in backticks).
Database object names should be self-explanatory and should not exceed 32 characters.
Temporary tables must be prefixed with tmp_ and suffixed with the date; backup tables must be prefixed with bak_ and suffixed with the date (timestamp).
Columns that store the same data must have the same name and the same type in every table. Such columns are usually used as join columns; if their types differ, MySQL implicitly converts the data type during the query, which invalidates the index on the column and reduces query efficiency.
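A minimal sketch of these naming rules. All table and column names here (shop_customer, shop_order, the 20250119 date suffix) are hypothetical and chosen only for illustration.

```sql
-- Lowercase, underscore-separated names; no reserved keywords.
CREATE TABLE shop_customer (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id)
);

-- customer_id stores the same data as shop_customer.id,
-- so it uses exactly the same type (BIGINT UNSIGNED).
CREATE TABLE shop_order (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id BIGINT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  KEY idx_customer_id (customer_id)
);

-- Temporary and backup copies carry the tmp_ / bak_ prefix and a date suffix.
CREATE TABLE tmp_shop_order_20250119 LIKE shop_order;
CREATE TABLE bak_shop_order_20250119 LIKE shop_order;
```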
II. Basic design specification of database
1. All tables must use the InnoDB storage engine
Unless there is a special requirement that InnoDB cannot meet (such as column storage or spatial data storage), all tables must use the InnoDB storage engine. MyISAM was the default engine before MySQL 5.5; InnoDB has been the default since then. InnoDB supports transactions and row-level locking, recovers better after a crash, and performs better under high concurrency.
2. Use the UTF8 character set for databases and tables
UTF8 has better compatibility. A unified character set avoids the garbled text caused by character set conversion, and comparing columns with different character sets requires a conversion first, which invalidates the index.
3. All tables and fields need to be commented
Use COMMENT clauses to add comments for tables and columns, so that the data dictionary is maintained from the start.
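The first three rules can be combined in a single table definition. This is only a sketch: the table app_user and its columns are hypothetical, and utf8mb4 is an assumption on my part (it is MySQL's full 4-byte UTF-8 encoding; the plain utf8 name recommended in the text is the 3-byte variant).

```sql
CREATE TABLE app_user (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'surrogate primary key',
  name       VARCHAR(64)     NOT NULL COMMENT 'display name',
  created_at DATETIME        NOT NULL COMMENT 'row creation time',
  PRIMARY KEY (id)
) ENGINE = InnoDB                 -- rule 1: InnoDB engine
  DEFAULT CHARSET = utf8mb4       -- rule 2: one unified character set (assumed utf8mb4)
  COMMENT = 'application users';  -- rule 3: table and column comments
```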
4. Try to control the amount of data in a single table. It is recommended to keep it within 5 million rows.
5 million is not a limit of MySQL itself, but a table that is too large causes serious problems for table structure changes, backup, and recovery.
The amount of data can be controlled by archiving historical data (suited to log data) and by splitting databases and tables, i.e. sharding (suited to business data).
5. Use MySQL partitioned tables with caution
A partitioned table is physically stored as multiple files but logically appears as a single table. Choose the partition key carefully, because cross-partition queries may be less efficient. It is recommended to manage large data sets with physically separate tables instead.
6. Try to separate hot and cold data and reduce the width of the table.
MySQL limits each table to a maximum of 4096 columns, and each row cannot exceed 65535 bytes.
Narrower tables reduce disk IO and keep the memory cache hit ratio of hot data high (the wider the table, the more memory it takes to load it into the buffer pool, and the more IO is consumed).
They make more efficient use of the cache by avoiding reads of useless cold data.
Put columns that are often used together into one table (to avoid extra join operations).
7. Do not create reserved fields in a table
A reserved field is hard to name meaningfully. Because the data it will store is not yet known, an appropriate type cannot be chosen for it. Changing the type of a reserved field later locks the table.
8. Do not store large binary data such as pictures and files in the database
Such files are usually large and make the data volume grow rapidly in a short time. The database usually performs many random IO operations when reading, and with large files these IO operations are very time-consuming. Files are usually stored on a file server, and the database stores only the file address.
9. Do not run database stress tests on production
10. Do not connect directly to the production database from the development or test environment
III. Database field design specification
1. Give priority to the smallest data type that meets the storage needs
Reason:
The larger the column, the more space its index needs, so fewer index nodes fit in one page, more IO is needed to traverse the index, and index performance gets worse.
Method:
1) Store strings as numeric types where possible, for example by converting IP addresses to integers.
MySQL provides two functions for handling IP addresses: inet_aton and inet_ntoa.
Before inserting data, use inet_aton to convert the IP address to an integer, which saves space. When displaying data, use inet_ntoa to convert the integer back to an address.
2) For non-negative data (such as auto-increment IDs and integer IPs), prefer unsigned integers.
Reason: in the same number of bytes, an unsigned type can store positive values up to twice as large as the signed type.
The N in VARCHAR(N) is the number of characters, not bytes.
With UTF8, a VARCHAR(255) holding 255 Chinese characters takes 765 bytes. An overly large length consumes more memory.
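As a short illustration of the IP conversion and the unsigned rule, assuming a hypothetical login_log table:

```sql
-- Conversion in both directions.
SELECT INET_ATON('192.168.1.1');   -- 3232235777
SELECT INET_NTOA(3232235777);      -- '192.168.1.1'

-- 3232235777 exceeds the signed INT maximum (2147483647),
-- so the column must be INT UNSIGNED (maximum 4294967295).
CREATE TABLE login_log (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  ip INT UNSIGNED    NOT NULL COMMENT 'IPv4 address stored via INET_ATON',
  PRIMARY KEY (id)
) ENGINE = InnoDB;

INSERT INTO login_log (ip) VALUES (INET_ATON('192.168.1.1'));
SELECT id, INET_NTOA(ip) AS ip_text FROM login_log;
```

The integer column takes 4 bytes per address, compared with up to 15 characters for the dotted string form.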
2. Avoid using TEXT and BLOB data types. The most common TEXT type can store 64 KB of data.
It is recommended to separate BLOB or TEXT columns into a separate extension table.
MySQL in-memory temporary tables do not support large data types such as TEXT and BLOB. If a query includes such data, operations such as sorting cannot use an in-memory temporary table and must use an on-disk temporary table.
For such data MySQL also has to perform a second lookup, which makes SQL performance very poor. This does not mean these types must never be used.
If you must use them, separate the BLOB or TEXT columns into a separate extension table. Do not use select * when querying; select only the columns you need, and do not query the TEXT column when its data is not needed.
TEXT and BLOB types can only use prefix indexes.
Because MySQL limits the length of indexed fields, TEXT types can only use prefix indexes, and TEXT columns cannot have default values.
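One way to lay out such an extension table is sketched below. The article and article_body tables are hypothetical, and the prefix length of 100 characters is an arbitrary choice for the example.

```sql
-- Narrow main table: the frequently read columns only.
CREATE TABLE article (
  id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  title VARCHAR(200)    NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Extension table: the TEXT column lives on its own, keyed by the same id.
CREATE TABLE article_body (
  article_id BIGINT UNSIGNED NOT NULL,
  body       TEXT            NOT NULL,   -- no DEFAULT clause on a TEXT column
  PRIMARY KEY (article_id),
  KEY idx_body_prefix (body(100))        -- TEXT can only be indexed by prefix
) ENGINE = InnoDB;

-- List pages never touch the TEXT column.
SELECT id, title FROM article WHERE id = 42;

-- The body is fetched only when it is actually needed.
SELECT body FROM article_body WHERE article_id = 42;
```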
3. Avoid using ENUM types
Changing the values of an ENUM requires an ALTER statement.
ORDER BY on an ENUM column is inefficient and requires extra operations.
Do not use numeric values as ENUM enumeration values.
4. Define all columns as NOT NULL whenever possible
Reason:
Indexing a nullable column needs extra space, so the index takes up more room.
NULL values need special handling in comparisons and calculations.
5. Use TIMESTAMP (4 bytes) or DATETIME (8 bytes) to store time
TIMESTAMP stores times in the range 1970-01-01 00:00:01 to 2038-01-19 03:14:07.
TIMESTAMP takes 4 bytes, the same as INT, but is more readable than INT.
Use DATETIME for values outside the TIMESTAMP range.
People often store dates as strings (an incorrect practice):
Disadvantage 1: date functions cannot be used for calculation and comparison.
Disadvantage 2: storing dates as strings takes more space.
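A brief sketch of the two time types in use, assuming a hypothetical user_event table. It also shows the date functions that become available once the values are not stored as strings.

```sql
CREATE TABLE user_event (
  id           BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  created_at   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, -- within the TIMESTAMP range
  scheduled_at DATETIME  NOT NULL,                           -- may fall after 2038-01-19
  PRIMARY KEY (id)
) ENGINE = InnoDB;

INSERT INTO user_event (scheduled_at) VALUES ('2040-06-01 09:00:00');

-- Native time types can be compared and calculated with directly.
SELECT id FROM user_event WHERE created_at >= NOW() - INTERVAL 7 DAY;
SELECT id, DATEDIFF(scheduled_at, created_at) AS days_until FROM user_event;
```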
6. Finance-related amount data must use the decimal type
Approximate floating point: float, double
Exact fixed point: decimal
The DECIMAL type stores exact values and does not lose precision in calculation. The space it occupies depends on the declared width: every 9 digits take 4 bytes. It can also store integers larger than BIGINT.
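The difference between approximate and exact types can be seen with literals alone. This is an illustrative sketch: the account_balance table is hypothetical, and the comments describe the arithmetic used rather than promising a particular output.

```sql
-- Amount column declared as exact DECIMAL: 18 digits in total, 2 after the point.
CREATE TABLE account_balance (
  account_id BIGINT UNSIGNED NOT NULL,
  balance    DECIMAL(18,2)   NOT NULL DEFAULT 0.00,
  PRIMARY KEY (account_id)
) ENGINE = InnoDB;

-- Literals written with an exponent are DOUBLE values, so this comparison
-- is done in binary floating point and is subject to rounding error.
SELECT 0.1E0 + 0.2E0 = 0.3E0 AS double_equal;

-- Plain decimal literals are exact DECIMAL values, so this comparison
-- is done in exact decimal arithmetic.
SELECT 0.1 + 0.2 = 0.3 AS decimal_equal;
```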
IV. Specification for index design
1. Limit the number of indexes on each table. No more than 5 indexes per table is recommended.
More indexes are not always better. Indexes can improve efficiency, but they can also reduce it.
Indexes speed up queries, but they slow down inserts and updates, and in some cases they can even slow down queries.
This is because, when choosing how to optimize a query, the MySQL optimizer evaluates every usable index against its statistics to produce the best execution plan. If many indexes are candidates for the same query, generating the execution plan takes longer, which also reduces query performance.
2. Do not create a separate index for each column in the table
Before version 5.6, one SQL statement could use only one index per table. Since 5.6 the index merge optimization is available, but it is still far less effective than querying through a composite index.
3. Each InnoDB table must have a primary key
InnoDB tables are index-organized: the logical order in which data is stored is the same as the order of the index.
A table can have multiple indexes but only one storage order, and InnoDB organizes the table in primary key order.
Do not use frequently updated columns as the primary key, and do not use multi-column primary keys (they are equivalent to composite indexes). Do not use UUID, MD5, HASH, or string columns as the primary key (they cannot guarantee sequential growth of the data).
An auto-increment ID is recommended as the primary key.
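The sketch below applies these index rules to a hypothetical sales_order table: an auto-increment primary key, a string identifier kept as a unique secondary key instead of the primary key, and one composite index rather than a separate index per column.

```sql
CREATE TABLE sales_order (
  id          BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT,  -- sequential primary key
  order_no    CHAR(32)         NOT NULL,                 -- UUID-like value, not used as the primary key
  customer_id BIGINT UNSIGNED  NOT NULL,
  status      TINYINT UNSIGNED NOT NULL,
  created_at  DATETIME         NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uk_order_no (order_no),
  -- One composite index instead of three single-column indexes.
  KEY idx_customer_status_created (customer_id, status, created_at)
) ENGINE = InnoDB;
```

Because rows are stored in primary key order, new rows with an increasing id are appended at the end of the table instead of being inserted at random positions.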
V. Common index column recommendations
1. Columns that appear in the WHERE clause of SELECT, UPDATE, and DELETE statements
2. Columns used in ORDER BY, GROUP BY, and DISTINCT
Do not build a separate index for every column that matches 1 and 2. It is usually better to build a composite index on the fields in 1 and 2.
3. Join columns of multi-table joins
VI. How to choose the order of index columns
The purpose of an index is to find data through the index, reduce random IO, and improve query performance. The more data the index filters out, the less data has to be read from disk.
Put the column with the highest selectivity on the far left of the composite index (selectivity = number of distinct values in the column / total number of rows).
Try to put columns with a small field length on the far left of the composite index (the smaller the field, the more entries fit in one page, and the better the IO performance).
Put the most frequently used columns on the left of the composite index (so that fewer indexes need to be built).
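The ratio of distinct values to total rows can be measured directly before the column order is decided. A sketch, assuming a hypothetical visit_log table:

```sql
CREATE TABLE visit_log (
  id      BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT,
  user_id BIGINT UNSIGNED  NOT NULL,
  channel TINYINT UNSIGNED NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

-- A value close to 1 means almost every row is distinct;
-- a value close to 0 means the column filters very little.
SELECT
  COUNT(DISTINCT user_id) / COUNT(*) AS user_id_ratio,
  COUNT(DISTINCT channel) / COUNT(*) AS channel_ratio
FROM visit_log;

-- If user_id has the higher ratio, it goes first in the index.
ALTER TABLE visit_log ADD KEY idx_user_channel (user_id, channel);
```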
VII. Avoid redundant and duplicate indexes
They increase the time the query optimizer takes to generate an execution plan.
Duplicate index example: primary key(id), index(id), unique index(id)
Redundant index example: index(a,b,c), index(a,b), index(a)
VIII. Give priority to covering indexes
Prefer covering indexes for frequent queries.
Covering index: an index that contains all the fields a query needs (the fields in where, select, order by, and group by).
Benefits of a covering index:
It avoids the second lookup on InnoDB tables.
InnoDB stores rows in clustered index order. A secondary index in InnoDB stores the row's primary key in its leaf nodes.
When a query goes through a secondary index, after finding the key it must do a second lookup by primary key to fetch the data actually needed. With a covering index, all the data can be read from the secondary index entries, which avoids the lookup by primary key, reduces IO, and improves query efficiency.
It can turn random IO into sequential IO and speed up queries.
Because a covering index is stored in key order, an IO-intensive range lookup needs far less IO than reading each row randomly from disk, so a covering index turns random disk reads into sequential reads of the index.
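Whether a query is answered from the index alone can be checked with EXPLAIN. A sketch, assuming a hypothetical member table:

```sql
CREATE TABLE member (
  id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  email    VARCHAR(128)    NOT NULL,
  nickname VARCHAR(64)     NOT NULL,
  bio      VARCHAR(500)    NOT NULL DEFAULT '',
  PRIMARY KEY (id),
  KEY idx_email_nickname (email, nickname)
) ENGINE = InnoDB;

-- Every referenced column is in idx_email_nickname, so the Extra column
-- of EXPLAIN is expected to include "Using index".
EXPLAIN SELECT email, nickname FROM member WHERE email = 'a@example.com';

-- bio is not in the index, so each match needs a further lookup
-- through the primary key to read the row.
EXPLAIN SELECT email, nickname, bio FROM member WHERE email = 'a@example.com';
```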
IX. Index SET specification
Avoid foreign key constraints as much as possible
Foreign key constraints are not recommended, but be sure to build indexes on the join keys between tables.
Foreign keys can guarantee referential integrity, but it is recommended to enforce it on the business side.
Foreign keys affect writes to both parent and child tables and reduce performance.
X. Database SQL development specification
1. Use prepared statements for database operations
Prepared statements can reuse execution plans and reduce SQL compilation time, and they also prevent the SQL injection that dynamic SQL can cause. Passing only parameters is more efficient than passing whole SQL statements, and a statement is parsed once and used many times, which improves processing efficiency.
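In practice the application's database driver usually handles this. The same mechanism can be sketched in plain SQL with the server-side PREPARE syntax. The customer table here is a hypothetical stand-in, and find_customer is an arbitrary statement name.

```sql
CREATE TABLE customer (
  id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  name  VARCHAR(64)     NOT NULL,
  phone VARCHAR(20)     NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Parsed once; the ? placeholder is bound separately on each execution.
PREPARE find_customer FROM 'SELECT name, phone FROM customer WHERE id = ?';

SET @customer_id = 111;
EXECUTE find_customer USING @customer_id;

SET @customer_id = 112;
EXECUTE find_customer USING @customer_id;   -- reuses the prepared statement

DEALLOCATE PREPARE find_customer;
```

Because the parameter is bound as a value and never spliced into the statement text, user input cannot change the structure of the query.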
2. Avoid implicit conversion of data types
Implicit conversion can make an index unusable. For example, a quoted literal compared with a column of a different type: select name,phone from customer where id = '111'
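A well-documented instance of this is a string column compared with a numeric literal: MySQL has to convert the column values, so it cannot look the value up through the index. The contact table below is hypothetical.

```sql
CREATE TABLE contact (
  id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  phone VARCHAR(20)     NOT NULL,
  PRIMARY KEY (id),
  KEY idx_phone (phone)
) ENGINE = InnoDB;

-- String column compared with a number: the column is converted for the
-- comparison, so idx_phone cannot be used for a direct lookup.
EXPLAIN SELECT id FROM contact WHERE phone = 13800000000;

-- Literal type matches the column type: idx_phone can be used.
EXPLAIN SELECT id FROM contact WHERE phone = '13800000000';
```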
3. Make full use of the indexes that already exist on the table
Avoid query conditions with a % on both sides.
For example, a like '%123%' (with no leading % and only a trailing %, the index on the column can be used).
A SQL statement can use only one column of a composite index for a range query.
For example, with a composite index on columns a, b, c, a range condition on column a means the index on columns b and c is not used. When defining the composite index, put a column used for range lookups on the right side of the index.
Use left join or not exists to optimize not in operations
not in also usually makes the index unusable.
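The not in rewrite can be sketched as follows, assuming hypothetical buyer and purchase tables. The three queries return the same rows only because purchase.buyer_id is declared NOT NULL; with NULL values in the subquery, not in behaves differently.

```sql
CREATE TABLE buyer (
  id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(64)     NOT NULL,
  PRIMARY KEY (id),
  KEY idx_name (name)
) ENGINE = InnoDB;

CREATE TABLE purchase (
  id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  buyer_id BIGINT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  KEY idx_buyer_id (buyer_id)
) ENGINE = InnoDB;

-- Leading wildcard: idx_name cannot narrow the search.
SELECT id FROM buyer WHERE name LIKE '%son%';
-- Trailing wildcard only: idx_name can be used.
SELECT id FROM buyer WHERE name LIKE 'son%';

-- Buyers without any purchase, written with NOT IN ...
SELECT b.id FROM buyer b
WHERE b.id NOT IN (SELECT p.buyer_id FROM purchase p);

-- ... rewritten as LEFT JOIN ...
SELECT b.id FROM buyer b
LEFT JOIN purchase p ON p.buyer_id = b.id
WHERE p.buyer_id IS NULL;

-- ... or as NOT EXISTS.
SELECT b.id FROM buyer b
WHERE NOT EXISTS (SELECT 1 FROM purchase p WHERE p.buyer_id = b.id);
```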
4. When designing a database, consider future extension
5. Programs should use different accounts to connect to different databases, and cross-database queries are prohibited
This leaves room for database migration and for splitting databases and tables.
It reduces business coupling.
It avoids the security risk of excessive permissions.
6. Do not use SELECT *; use SELECT <column list> instead
Reason:
SELECT * consumes more CPU, IO, and network bandwidth.
It cannot use a covering index.
An explicit column list reduces the impact of table structure changes.
7. Do not use INSERT statements without a column list
For example: insert into t values ('a','b','c');
Use instead: insert into t(c1,c2,c3) values ('a','b','c');
8. Avoid subqueries; they can be rewritten as join operations
Typically the subquery is in an in clause, and when it is simple SQL (no union, group by, order by, or limit clause), it can be converted into a join for optimization.
Reasons for poor performance of subqueries:
The result set of a subquery cannot use an index. It is usually stored in a temporary table, and neither in-memory nor on-disk temporary tables have indexes, so query performance suffers.
The larger the subquery's result set, the greater the impact on query performance.
Because subqueries produce many temporary tables without indexes, they consume too much CPU and IO and produce many slow queries.
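A simple in subquery and its join form, assuming hypothetical department and employee tables. The two queries are equivalent here because department.id is a primary key, so the join cannot duplicate employee rows.

```sql
CREATE TABLE department (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
  region VARCHAR(16)  NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

CREATE TABLE employee (
  id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  name    VARCHAR(64)     NOT NULL,
  dept_id INT UNSIGNED    NOT NULL,
  PRIMARY KEY (id),
  KEY idx_dept_id (dept_id)
) ENGINE = InnoDB;

-- Subquery form.
SELECT e.id, e.name
FROM employee e
WHERE e.dept_id IN (SELECT d.id FROM department d WHERE d.region = 'EU');

-- Join form: the same rows, matched through idx_dept_id and the primary key.
SELECT e.id, e.name
FROM employee e
JOIN department d ON d.id = e.dept_id
WHERE d.region = 'EU';
```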
9. Avoid using JOIN to associate too many tables
MySQL has a join buffer whose size is set by the join_buffer_size parameter.
Within one SQL statement, MySQL allocates one more join buffer for each additional joined table, so the more tables a statement joins, the more memory it uses.
If multi-table joins are widely used in the program and join_buffer_size is set unreasonably, the server can easily run out of memory, which affects the stability of database performance.
Joins also create temporary tables, which affects query efficiency. MySQL allows at most 61 tables in a join; no more than 5 is recommended.
10. Reduce the number of interactions with the database
The database is better suited to batch operations; merging multiple identical operations improves processing efficiency.
11. When testing the same column against several values, use in instead of or
The number of values in an in list should not exceed 500. in can use indexes more effectively, while or rarely uses them in most cases.
12. Do not use order by rand() for random sorting
order by rand() loads all qualifying rows into memory, generates a random value for each row, and sorts all of them in memory by that value. If the qualifying data set is very large, this consumes a lot of CPU, IO, and memory.
It is recommended to generate the random value in the program and then fetch the data from the database.
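One possible shape of that approach is sketched below with a hypothetical quote table. The user variable @pick stands in for the random value that the application would normally generate and pass in as a parameter.

```sql
CREATE TABLE quote (
  id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  body VARCHAR(255)    NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Avoided: assigns a random value to every row and sorts them all.
-- SELECT id, body FROM quote ORDER BY RAND() LIMIT 1;

-- Sketch: pick one random id in the existing range, then seek to it
-- through the primary key.
SELECT MIN(id), MAX(id) FROM quote INTO @min_id, @max_id;
SET @pick = @min_id + FLOOR(RAND() * (@max_id - @min_id + 1));
SELECT id, body FROM quote WHERE id >= @pick ORDER BY id LIMIT 1;
```

If the id sequence has gaps, rows that follow a gap are picked more often, so this is a trade of perfect uniformity for speed.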
13. Do not apply functions or calculations to columns in the WHERE clause
When a function or calculation is applied to a column, the index on that column cannot be used.
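Not recommended (for example, with a hypothetical create_time column): where date(create_time) = '20190101'
Recommended: where create_time >= '20190101' and create_time < '20190102'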
14. Use UNION ALL instead of UNION when it is clear that there are no duplicate values
UNION puts all rows of both result sets into a temporary table and then deduplicates them.
UNION ALL does not deduplicate the result set.
15. Split a complex large SQL statement into multiple small ones
Large SQL: a statement that is logically complex and needs a lot of CPU to compute.
In MySQL, one SQL statement can use only one CPU for computation.
After splitting, the statements can run in parallel to improve processing efficiency.
XI. Code of conduct for database operation
1. Batch write operations (UPDATE, DELETE, INSERT) of more than 1 million rows should be split into multiple smaller batches
Large batch operations may cause serious master-slave delay
In a master-slave environment, a large batch operation may cause serious master-slave delay. A large write usually takes a long time, and it runs on the slaves only after it has finished on the master, so the slaves lag far behind the master.
When the binlog is in row format, a large number of logs are generated
Large batch writes generate many logs, especially with the row binlog format. Row format records the change to every row, so the more rows modified at once, the more log is produced and the longer log transfer and replay take, which is another cause of master-slave delay.
Avoid large transactions
Modifying a large amount of data in one statement necessarily happens in one transaction, which locks a large number of rows in the table and causes heavy blocking, with a great impact on MySQL performance.
In particular, long blocking can use up all available database connections and prevent other applications in production from connecting, so batching writes is important.
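A batched delete can be sketched as a stored procedure that removes a limited number of rows per statement. The access_log table, the purge_access_log procedure, the batch size of 10000, and the one-second pause are all hypothetical choices. With autocommit enabled, each DELETE commits as its own small transaction.

```sql
CREATE TABLE access_log (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  created_at DATETIME        NOT NULL,
  PRIMARY KEY (id),
  KEY idx_created_at (created_at)
) ENGINE = InnoDB;

DELIMITER //   -- mysql client command, needed to define the procedure body
CREATE PROCEDURE purge_access_log(IN cutoff DATETIME)
BEGIN
  DECLARE affected INT DEFAULT 1;
  WHILE affected > 0 DO
    -- Each pass deletes at most 10000 rows in its own short transaction.
    DELETE FROM access_log WHERE created_at < cutoff LIMIT 10000;
    SET affected = ROW_COUNT();
    -- Short pause so that replicas can catch up between batches.
    DO SLEEP(1);
  END WHILE;
END //
DELIMITER ;

CALL purge_access_log('2024-01-01 00:00:00');
```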
2. Use pt-online-schema-change to modify the structure of large tables
It avoids the master-slave delay caused by modifying a large table.
It avoids locking the table while fields are modified.
Changing the structure of a large table must be done carefully: it can cause serious table locking, which is intolerable in production.
pt-online-schema-change first creates a new table with the same structure as the original and applies the structure change to the new table. It then copies the data from the original table into the new table, while triggers added to the original table copy newly written data into the new table as well.
After all rows are copied, it renames the new table to the original name and drops the original table.
In effect, it breaks the original DDL operation into many small batches.
3. Do not grant the super privilege to accounts used by programs
When the maximum number of connections is reached, MySQL still allows one connection from a user with the super privilege. The super privilege should be reserved for the DBA account used to handle problems.
4. Accounts that programs use to connect to the database must follow the principle of least privilege
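A least-privilege application account might look like the following sketch. The account name shop_app, the host pattern, the password placeholder, and the shop schema are all hypothetical.

```sql
-- Application account limited to one schema and to data manipulation only:
-- no SUPER, no DDL, no access to other databases.
CREATE USER 'shop_app'@'10.0.0.%' IDENTIFIED BY 'change-me';
GRANT SELECT, INSERT, UPDATE, DELETE ON shop.* TO 'shop_app'@'10.0.0.%';

-- Review what the account can actually do.
SHOW GRANTS FOR 'shop_app'@'10.0.0.%';
```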
That concludes this overview of MySQL database specifications. I hope the content above is of some help. If you still have open questions, you can follow the industry information channel for more related material.