In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
MySQL development specification
File status:
[] draft
[√] officially released
[] modifying file identity:
Current version: V1.0
By: he Lei
Completion date: 2016-05-24
Change record
Serial number modification date modification content modifier auditor approval date
1 2016-05-24 MySQL Development Specification he Lei
MySQL development specification
1. Brief introduction
Continue to learn from, collect and organize some development specifications and techniques, hoping to make full use of the characteristics of MySQL and get better performance.
Norms are dead and people are alive.
The specification defined now is prepared for overturning in the future.
1.1 purpose
Provide reference for developers to facilitate writing into more efficient development.
1.2 range
Scope of the document: people who need to do application development based on MySQL.
1.3 Definitions, acronyms and acronyms
None for the time being
two。 Database design
There are three goals: functional implementation, scalability, and availability.
Key point: balance all aspects of business technology and make good choices.
80% of the performance optimization comes from the optimization of architectural design.
2.1 engine and version selection
* engine recommends using InnoDB
According to the current characteristics of our business, it is recommended to use MySQL5.6 Community Edition.
2.2 A brief discussion on the Architecture
The development of Daniel is good at, do not mention here, just mark it.
2.2.1 Separation of read and write
2.2.2 Sub-database and sub-table
3. Table design
Let's first look at an example:
Createtable erp_example (id int (10) unsigned NOT NULL AUTO_INCREMENT,/**) every table must have a primary key id, and int increments itself unsigned and has nothing to do with business. * * / server_type TINYINT (3) UNSIGNED NOT NULL DEFAULT 0grammes * if the storage type, for example, 1: Beijing, 2: Shanghai, 3: Guangzhou, if there is no negative number, use unsigned to increase the range. If you do not need letters, do not use char (1) to store * * / name varchar (5) NOT NULL DEFAULT'', / * * varchar,char assign values according to actual needs, the smaller the better, and can use NOT NULL DEFAULT'', NULL not only occupies space, but also reduces indexing efficiency, using''empty string instead. * * / PRIMARYKEY (id), KEY `idx_ name` (`name`) / * * refer to idx_ column name to mark, and only reference uniq_ column name to mark * * /) ENGINE=innodbAUTO_INCREMENT=1DEFAULTCHARSET=utf8
3.1 Design specifications
3.1.1 each table should have a primary key id int (10) UNSIGNED, which is independent of the business. NOT NULL AUTO_INCREMENT .
3.1.2 the recommended field is defined as NOT NULL, and the empty field can be defined as NOT NULL DEFAULT''or 0 (0 does not give practical meaning).
3.1.3 Library names, table names, and field names must be separated by lowercase letters and "_".
3.1.4 the library name, table name and field name shall not exceed 12 characters.
3.1.5 see the library name, table name and field name, and it is recommended to use nouns instead of verbs.
3.1.6 storing exact floating point numbers must use DECIMAL instead of FLOAT and DOUBLE.
3.1.7 it is recommended that you use UNSIGNED to store non-negative values.
3.1.8 it is recommended to use INT UNSIGNED to store IPV4.
INT (10) is recommended in the 3.1.9 × × definition instead of INT (1), INT (11) or others.
3.1.10 Storage status, gender, etc., use TINYINT, if you do not need negative numbers plus UNSIGNED. Use TINYINT UNSIGNED.
3.1.11 it is not recommended to use the ENUM type, use TINYINT instead.
3.1.12 avoid using TEXT and BLOB types as much as possible.
3.1.13 VARCHAR (N), N represents the number of characters rather than bytes, such as VARCHAR (255), which can store up to 255Chinese characters and needs to be selected according to the actual width. VARCHAR (N), N is as small as possible, because the maximum length of all VARCHAR fields in a MySQL table is 65535 bytes and 21845 Chinese characters. When sorting and creating temporary tables and other memory operations, the length of N will be used to apply for memory.
3.1.14 Select UTF8 for the character set of the table.
3.1.15 storage years use the YEAR type.
3.1.16 the storage date uses the DATE type.
3.1.17 Storage time (accurate to seconds) the TIMESTAMP type is recommended because TIMESTAMP uses 4 bytes and DATETIME uses 8 bytes. Values saved by the TIMESTAMP type cannot be earlier than 1970 or later than 2037.
3.1.18 split the overly large fields into other tables.
3.1.19 prohibit the use of VARBINARY and BLOB in the database to store pictures, files, etc.
3.1.20 changes in table structure need to be notified to DBA for review.
To sum up: if you can NOT NULL, use NOT NULL DEFAULT 'for NOT NULL,char and varchar, and use NOT NULL DEFAULT 0 for tinyint, smallint and int. The values of char and varchar should be stingy, given according to actual needs, such as no more than 5 names, varchar (5), not varchar (200). Int, tinyint, int (1) and int (13) are all the same. We use int (10), tinyint value range [- 128127], and unsigned value [0255]. If you don't need to store negative numbers, add unsigned for integer types.
3.2 Index
3.2.1 non-unique indexes must be named according to "idx_ field name _ field name [_ field name]".
3.2.2 unique indexes must be named according to "uniq_ field name _ field name [_ field name]".
3.2.3 Index names must be lowercase.
3.2.4 it is recommended that the number of fields in the index do not exceed 5.
3.2.5 the number of indexes in a single table is limited to 5.
3.2.6 the order of index fields needs to consider the number of field values after they are duplicated, and the more fields are put in front of them.
3.2.7 use EXPLAIN to determine whether the SQL statement uses the index reasonably, and try to avoid the occurrence of Using File Sort,UsingTemporary columns.
3.2.8 UPDATE and DELETE statements need to add indexes according to WHERE conditions.
3.2.9 it is not recommended to use% prefix fuzzy queries, such as LIKE "% weibo".
3.2.10 reasonably create a federated index (avoid redundancy), which is equivalent to (a), (aforce b), and (aforce b), but (aforce c) can only use part of the index.
3.2.11 make rational use of the overlay index.
3.2.12 SQL changes need to confirm whether the index needs to be changed and notify DBA.
3.3 SQL statement
3.3.1 IN should not contain too many values in the SQL statement.
3.3.2 LIMIT is not used in UPDATE and DELETE statements.
3.3.3 appropriate types must be used in WHERE conditions to avoid implicit type conversion by MySQL.
3.3.4 the SELECT statement gets only the required fields.
3.3.5 SELECT and INSERT statements must explicitly specify the field name without using SELECT * or INSERT INTO table ().
3.3.6 use SELECT column_name1, column_name2 FROM table WHERE [condition] instead of SELECT column_name1 FROM table [condition] and SELECT column_name2 FROM table WHERE [condition].
3.3.7 non-equivalent conditions in the WHERE condition (IN, BETWEEN, =) cause subsequent conditions to fail to use the index.
3.3.8 avoid mathematical or functional operations in SQL statements, and it is easy to couple business logic with DB.
3.3.9 the INSERT statement uses batch to submit (INSERT INTO table VALUES (), (), ()... The number of values should not be too large.
3.3.10 avoid using stored procedures, triggers, functions, etc., it is easy to couple business logic with DB, and there is a certain bug in MySQL stored procedures, triggers and functions.
3.3.11 avoid using JOIN.
3.3.12 use reasonable SQL statements to reduce the number of interactions with the database.
3.3.13 do not use ORDER BY RAND (), replace it with other methods.
3.3.14 it is recommended to use a reasonable paging method to improve the efficiency of paging.
3.3.15 use COUNT (*) instead of COUNT (primary_key) and COUNT (1) to record numbers in statistical tables.
3.4 Sub-table
3.4.1 it is recommended that the amount of data per table should be less than 500w.
3.4.2 use time subtable, and the table name suffix must be in a specific format, such as annual subtable user_2016 monthly subtable user_201602, daily subtable user_20160209.
4. FAQ
4.1 Library names, table names, and field names are separated by lowercase letters and "_".
A) MySQL has a configuration parameter lower_case_table_names, which cannot be changed dynamically. The default value of linux system is 0, that is, the database table name is stored in the actual situation and is case-sensitive. If it is 1, it is stored in lowercase and is case-insensitive. If it is 2, store it in reality, but compare it in lowercase.
B) if the case is mixed, multiple tables such as abc,Abc,ABC may coexist, which can easily lead to confusion.
C) the field name display is case-sensitive, but the actual use is not case-sensitive, that is, you cannot create two fields with the same name but different case.
D) for uniform specification, library names, table names, and field names are in lowercase letters.
4.2 the library name, table name and field name must not exceed 12 characters.
Library names, table names, and field names support up to 64 characters, but in order to be standardized, easy to identify, and reduce the amount of transmission, they must not exceed 12 characters.
4.3 see the library name, table name and field name, and it is recommended to use nouns instead of verbs.
A) user comments can be made with the table name usercomment or comment.
B) the library table is an objective thing and an object, so it is recommended to use nouns.
It is recommended to use the InnoDB storage engine.
A) the default lead after 5.5supports transactions, row-level locks, better recovery, better performance under high concurrency, and better support for hardware such as multi-core, large memory, ssd, etc.
B) A specific comparison can be found in the official White Paper attached.
To store exact floating point numbers, DECIMAL must be used instead of FLOAT and DOUBLE.
A) numeric types in mysql (excluding integers):
IEEE754 floating point numbers: float? (single precision), double? Or real? (double precision)
Fixed points: decimal or numeric
The effective digit binary of a single-precision floating-point number is 24 bits, 8 bits in decimal terms, and 53 bits in double-precision floating-point numbers, and 16 bits in decimal system
If the valid number of a real number is more than 8 digits, it will produce an error if it is expressed by a single-precision floating-point number. Similarly, if the significant number of a real number exceeds 16 digits, expressed as a double-precision floating-point number, there will also be an error.
B) IEEE754 standard computer floating-point numbers are internally expressed in binary, but it can also cause errors when converting a decimal number to a binary floating-point number because not all numbers can be converted to binary numbers of finite length.
That is, a binary can be accurately converted to a decimal, but a decimal with a decimal may not be accurately represented in binary.
Example:
Drop table if exists t
Create table t (value float (10j2))
Insert into t values (131072.67), (131072.68)
Select value from t
+-+
| | value |
+-+
| | 131072.67 |
| | 131072.69 |
+-+
It is recommended that you use UNSIGNED to store non-negative values.
With the same number of bytes, the range of stored values is larger. For example, tinyint signed is-128127, unsigned is 0255,
4.7 how do I use INT UNSIGNED to store ip?
Use INTUNSIGNED instead of char (15) to store ipv4 addresses and convert them through the MySQL functions inet_ntoa and inet_aton. Currently, there is no conversion function for Ipv6 address, and you need to use DECIMAL or two bigINT to store it. For example:
SELECT INET_ATON ('209.207.224.40')
3520061480
SELECT INET_NTOA (3520061480)
209.207.224.40
4.8 INT [M], what does the M value mean?
Note that the number after the parenthesis of the numerical type only indicates the width and has nothing to do with the storage range, such as INT (3) displays 3 digits by default, blanks are filled, and displays normally when it is exceeded. Python, java clients do not have this function. If we use the INT type, we all use INT (10)
4.9 it is not recommended to use ENUM or SET types, but use TINYINT instead.
A) ENUM, there are three problems: adding a new value to do DDL, default value problem (inserting an illegal value into ENUM (that is, a string other than the allowed value column), inserting an empty string as a special error value), and index value problem (inserting a number actually inserts the corresponding value of the index)
Example:
Drop table if exists t
Create table t (sex enum ('099))
Insert into t values (1)
Insert into t values ('3')
Select * from t
+-+
| | sex |
+-+
| | 0 |
| | |
+-+
2 rows in set (0.00 sec)
4.10 avoid using TEXT and BLOB types as much as possible.
A) Index sorting problem, which can only be sorted by using the length of max_sort_length or manually specifying the length of ORDER BY SUBSTRING (column,length)
B) Memory does not support text,blog type and will generate temporary tables on disk
C) more space may be wasted
D) adaptive hash index may not be available
E) causes statements that use where without an index to slow down
Will additional storage be generated in 4.11 VARCHAR?
VARCHAR (M), which uses two bytes to store the length if masking 256.
4.12 Table character set Select UTF8.
A) use the utf8 character set. If it is a Chinese character, it takes up 3 bytes, but the ASCII character is still 1 byte.
B) uniform so that there is no risk of garbled code caused by conversion
C) users in other regions (USA, India, Taiwan) can read your text normally without installing simplified Chinese support, and there will be no garbled code
D) ISO-8859-1 encoding (latin1) uses all the space in a single byte, and byte streams that transmit and store any other encoding in a system that supports ISO-8859-1 will not be discarded. That is to say, it is no problem to treat any other encoded byte stream as an ISO-8859-1 code, and the byte stream is preserved intact.
4.13 use VARBINARY to store variable-length strings.
Binary byte stream, no coding problem
4.14 Why is it recommended to use TIMESTAMP to store time instead of DATETIME?
Both DATETIME and TIMESTAMP are accurate to seconds, so TIMESTAMP is preferred because TIMESTAMP has only 4 bytes and DATETIME8 has 4 bytes. At the same time, TIMESTAMP has the characteristics of automatic assignment and automatic update.
How do I use the auto-assignment property of TIMESTAMP?
A) take the current time as the default value for ts: ts TIMESTAMP DEFAULTCURRENT_TIMESTAMP.
B) when the row is updated, update the value of ts: ts TIMESTAMP DEFAULT 0 ONUPDATE CURRENT_TIMESTAMP.
C) you can combine 1 and 2: ts TIMESTAMP DEFAULTCURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP.
The value saved by TIMESTAMP type cannot be earlier than 1970 or later than 2037
4.15 the recommended field is defined as NOT NULL.
A) if the null field is indexed, an additional 1 byte is required
B) make indexes, index statistics, and value comparisons more complex
C) can be replaced by', 0
D) if it is an index field, it must be defined as not null
For example: NOT NULL default''
Forbid the use of VARBINARY or BLOB to store pictures, files, etc.
Using a distributed file system is more efficient
4.17 Why does the performance of MySQL depend on indexes?
The query speed of MySQL depends on good index design, so indexes are essential for high performance. A reasonable index will speed up the query (including UPDATE and DELETE, MySQL will load the page containing the row into memory, and then perform UPDATE or DELETE operations), and unreasonable indexes will slow down.
The MySQL index search is similar to the pinyin and radical search of Xinhua Dictionary. When the pinyin and radical indexes do not exist, it can only be found by turning pages. When an MySQL query cannot use an index, MySQL performs a full table scan, which consumes a lot of IO.
4.18 Why can't there be too many indexes in a table?
InnoDB's secondaryindex is stored using b+tree, so b+tree needs to be adjusted in UPDATE, DELETE, and INSERT. Too many indexes will slow down the update.
4.19 EXPLAIN statement
The EXPLAIN statement (executed in the MySQL client) can get information about how MySQL executes the SELECT statement. By executing EXPLAIN on the SELECT statement, you can know whether MySQL uses information such as indexes, full table scans, temporary tables, sorting, and so on, when executing the SELECT statement. Try to avoid MySQL to scan full tables, use temporary tables, sort and so on. See the official documentation for details.
4.20 it is not recommended to use% prefix fuzzy queries, such as LIKE "% weibo".
Will cause a full table scan
4.21 what is an overlay index?
In the InnoDB storage engine, secondaryindex (non-primary key index) does not directly store row addresses and primary key values. If users need to query data columns that are not included in secondaryindex, they need to find the primary key value through secondaryindex first, and then query other data columns through the primary key, so they need to query twice.
The concept of the overlay index is that the query can be completed in an index, the overlay index will be more efficient, and the primary key query is a natural overlay index.
Reasonable creation of indexes and reasonable use of query statements can improve performance when overriding indexes are used.
4.22 LIMIT is not used in UPDATE and DELETE statements.
A) it may lead to inconsistency between master and slave data
B) error logs will be logged, causing the logs to take up a lot of space
4.23 Why do you need to avoid implicit type conversion by MySQL?
Because after implicit type conversion by MySQL, the index field type may be converted to the type with the right boundary value of the = sign, resulting in not using the index, the reason is similar to avoiding using functions in the index field.
4.24 Why is SELECT * not recommended?
Increase a lot of unnecessary consumption (cpu, io, memory, network bandwidth); increase the possibility of using overlay indexes; and the front segment also needs to be updated when the table structure changes.
Why can't I use ORDER BY rand ()?
Because ORDER BYrand () reads the data from disk and sorts it, which consumes a lot of IO and CPU, you can get a Randt value in the program, and then get the corresponding value from the database.
4.26How is paging done in MySQL?
If you have a paging statement like the following:
SELECT * FROM table ORDER BY TIME DESC LIMIT 10000 no. 10
This paging method results in a large amount of io because MySQL uses a read-ahead strategy.
Recommended paging method:
SELECT * FROM table where id > = (select id from table order by id limit 10000 limit 1) limit 10
SELECT * FROM table inner JOIN (SELECT id FROM table ORDER BY id LIMIT 10000 Magazine 10) as t USING (id)
4.27 Why avoid using complex SQL?
Refuse to use a complex SQL and split the large SQL into multiple simple SQL to execute step by step. Reason: simple SQL is easy to use MySQL's querycache; to reduce table locking time, especially MyISAM; can use multicore cpu.
2. Why does the InnoDB storage engine avoid using COUNT (*)?
Avoid using COUNT (*) operations in InnoDB tables. Memcache or redis can be used for counting statistics with strong real-time requirements, and separate tables can be used for non-real-time statistics to be updated regularly.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.