How MySQL optimizes the performance of Schema and data types

2025-07-04 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

This article explains how to optimize schema and data type performance in MySQL.

Good logical and physical design is the cornerstone of high performance, and schema should be designed according to the query statements to be executed by the system.

Denormalized design can speed up some types of queries while slowing down others. Adding counter tables and summary tables is a good way to optimize queries, but these tables can be expensive to maintain.

1. Choosing optimized data types

Smaller is usually better

Use the smallest type that can store your data correctly. Smaller data types are usually faster because they take up less space on disk, in memory, and in the CPU cache, and require fewer CPU cycles to process.

Simple is good

Operations on simpler data types usually require fewer CPU cycles. For example, integers are cheaper to compare than characters, because character sets and collations (sorting rules) make character comparisons more complex. For instance, IP addresses should be stored as integers, converted with INET_ATON() and INET_NTOA().
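As a rough illustration of the IP address case, the sketch below mirrors the conversion that MySQL's INET_ATON() and INET_NTOA() perform, using Python's standard ipaddress module. The helper names ip_to_int and int_to_ip are made up for this example.

```python
import ipaddress


def ip_to_int(dotted: str) -> int:
    """Convert a dotted-quad IPv4 string to an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def int_to_ip(value: int) -> str:
    """Convert an unsigned 32-bit integer back to dotted-quad form."""
    return str(ipaddress.IPv4Address(value))


print(ip_to_int("192.168.1.1"))  # 3232235777
print(int_to_ip(3232235777))     # 192.168.1.1
```

The integer form fits in a 4-byte unsigned INT column, whereas the text form needs up to 15 characters plus character-set handling.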

Try to avoid NULL

In general, it is best to define columns as NOT NULL. Queries that involve nullable columns are harder for MySQL to optimize, because nullable columns make indexes, index statistics, and value comparisons more complicated. Nullable columns also use more storage space, and when a nullable column is indexed, each index entry requires an extra byte. The performance gain from changing a nullable column to NOT NULL is usually small, but if you plan to index a column, you should avoid making it nullable.

1.1 Integer types

Type       Storage (bits)  Range
TINYINT    8               [-2^7, 2^7-1]
SMALLINT   16              [-2^15, 2^15-1]
MEDIUMINT  24              [-2^23, 2^23-1]
INT        32              [-2^31, 2^31-1]
BIGINT     64              [-2^63, 2^63-1]

Integer types have an optional UNSIGNED attribute, which disallows negative values and roughly doubles the upper limit of positive values. Signed and unsigned types use the same storage space and have the same performance. Integer calculations generally use 64-bit BIGINT as the intermediate type.
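The ranges above follow directly from the bit widths. A minimal sketch that derives them, including the UNSIGNED variants (the function name int_range is an illustrative choice):

```python
from typing import Tuple

# Storage widths in bits for the MySQL integer types listed above.
MYSQL_INT_BITS = {"TINYINT": 8, "SMALLINT": 16, "MEDIUMINT": 24, "INT": 32, "BIGINT": 64}


def int_range(bits: int, unsigned: bool = False) -> Tuple[int, int]:
    """Return (min, max) for an integer stored in the given number of bits."""
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, bits in MYSQL_INT_BITS.items():
    print(name, int_range(bits), int_range(bits, unsigned=True))
```

For example, int_range(8) gives (-128, 127), while the unsigned form gives (0, 255) in the same single byte.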

1.2 Real number types

Real numbers are numbers with a fractional part. They are not only for fractions, though: you can also use DECIMAL to store integers that are too large for BIGINT.

The DECIMAL type stores exact fractional numbers and supports exact arithmetic. For example, DECIMAL(18,9) stores 9 digits on each side of the decimal point. MySQL packs every nine decimal digits into 4 bytes, so the digits before the decimal point use 4 bytes and the digits after it use another 4 bytes, 8 bytes in total. DECIMAL allows a maximum of 65 digits.
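A small sketch of the digit-storage arithmetic, assuming the packing rule from the MySQL documentation: each group of nine decimal digits takes 4 bytes, and leftover digits take 1 to 4 bytes. It counts digit storage only, and the function names are hypothetical.

```python
# Bytes needed for the 0 to 8 digits left over after packing groups of nine.
LEFTOVER_BYTES = [0, 1, 1, 2, 2, 3, 3, 4, 4]


def packed_bytes(digits: int) -> int:
    """Bytes used by one side of the decimal point."""
    full_groups, leftover = divmod(digits, 9)
    return full_groups * 4 + LEFTOVER_BYTES[leftover]


def decimal_storage_bytes(precision: int, scale: int) -> int:
    """Bytes used by DECIMAL(precision, scale): integer part plus fractional part."""
    return packed_bytes(precision - scale) + packed_bytes(scale)


print(decimal_storage_bytes(18, 9))  # 8
print(decimal_storage_bytes(20, 6))  # 10
```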

Floating-point types usually take up less space than DECIMAL when storing the same range of values. MySQL uses DOUBLE for its internal floating-point calculations.

Because of the extra space and computational overhead, use DECIMAL only when you need exact results for fractional numbers, for example when storing financial data. When the data volume is large, consider using BIGINT instead of DECIMAL: multiply the monetary amount by the power of ten that matches the number of decimal places you need, and store the result as an integer.
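The BIGINT approach can be sketched as follows. This is only an illustration of the scaling idea; the helper names and the choice of two decimal places are assumptions, not something the article prescribes.

```python
from decimal import Decimal


def to_minor_units(amount: str, decimal_places: int) -> int:
    """Scale a decimal amount to an integer, e.g. '19.99' with 2 places -> 1999."""
    scaled = Decimal(amount) * (10 ** decimal_places)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more than {decimal_places} decimal places")
    return int(scaled)


def to_display(units: int, decimal_places: int) -> str:
    """Convert the stored integer back to a decimal string."""
    return str(Decimal(units).scaleb(-decimal_places))


price = to_minor_units("19.99", 2)
print(price)                 # 1999
print(to_display(price, 2))  # 19.99
print(0.1 + 0.2 == 0.3)      # False: binary floating point is inexact
```

Sums and comparisons on the stored integers are exact, which avoids both floating-point rounding and the cost of DECIMAL arithmetic.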

1.3 String types

VARCHAR

The VARCHAR type stores variable-length strings and uses less space than a fixed-length type. VARCHAR needs 1 or 2 extra bytes to record the length of the string: 1 byte if the maximum length of the column is 255 bytes or less, otherwise 2 bytes. Because VARCHAR saves storage space, it also helps performance. However, because rows are variable-length, an UPDATE can make a row longer than it was, which causes extra work. If a row grows and there is no more room in its page, InnoDB has to split the page so that the row fits.

VARCHAR is a good choice when:

1. The maximum length of the string column is much larger than the average length, and the column is rarely updated.

2. The column uses a multibyte character set such as UTF-8, where each character may be stored in a different number of bytes.

MySQL preserves trailing spaces in VARCHAR values when storing and retrieving them. InnoDB can store very long VARCHAR values as BLOBs.
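To make the length-prefix rule concrete, here is a minimal sketch that estimates the bytes used by one VARCHAR value: the encoded string plus a 1-byte or 2-byte length prefix. The function name and parameters are illustrative, and per-row overhead added by the storage engine is ignored.

```python
def varchar_stored_bytes(value: str, max_bytes: int, encoding: str = "utf-8") -> int:
    """Estimate storage for one VARCHAR value: length prefix plus encoded bytes."""
    payload = value.encode(encoding)
    if len(payload) > max_bytes:
        raise ValueError("value does not fit in the column")
    # Columns of up to 255 bytes need a 1-byte length prefix, longer ones need 2.
    prefix = 1 if max_bytes <= 255 else 2
    return prefix + len(payload)


print(varchar_stored_bytes("hello", 255))        # 6
print(varchar_stored_bytes("hello", 1000))       # 7
print(varchar_stored_bytes("h\u00e9llo", 255))   # 7: the accented letter takes 2 bytes in UTF-8
```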

CHAR

CHAR stores fixed-length strings, and MySQL removes trailing spaces from CHAR values when storing them. This can cause a uniqueness conflict between "A" and "A " (with a trailing space). How the data is stored depends on the storage engine, while the padding and trimming of spaces happens in the MySQL server layer.
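The trailing-space behavior can be modeled in a few lines. This is a simplified simulation of the rule described above, not MySQL itself, and the function names are made up for the example.

```python
from typing import List, Tuple


def char_value(value: str) -> str:
    """Model of CHAR storage: trailing spaces are stripped."""
    return value.rstrip(" ")


def unique_conflicts(values: List[str]) -> List[Tuple[str, str]]:
    """Return pairs of inputs that would collide in a unique CHAR column."""
    seen = {}
    conflicts = []
    for value in values:
        key = char_value(value)
        if key in seen:
            conflicts.append((seen[key], value))
        else:
            seen[key] = value
    return conflicts


print(unique_conflicts(["A", "A ", "B"]))  # [('A', 'A ')]
```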

Declaring a column longer than necessary costs memory: MySQL often allocates fixed-size chunks of memory to hold values internally, which is especially bad for sorting and for operations that use in-memory temporary tables.

BLOB

BLOB values are stored as binary data, with no collation or character set. The family includes TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB.

TEXT

TEXT values are stored as character data, with a collation and character set. The family includes TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT.

Unlike other types, MySQL treats each BLOB and TEXT value as a separate object, and storage engines usually handle them specially. When a BLOB or TEXT value is too large, InnoDB stores it in a special "external" storage area and keeps a pointer to that area in the original row. In addition, columns of these two types can only be indexed by prefix.

ENUM

Using ENUM is not recommended (for the details, refer to the original book).

1.4 Date and time types

DATETIME and TIMESTAMP

DATETIME is recommended here: it covers a larger range (years 1000 to 9999), does not depend on the time zone, and occupies 8 bytes (5 bytes plus any fractional-seconds storage since MySQL 5.6.4).
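For comparison, TIMESTAMP stores the number of seconds since the Unix epoch in 4 bytes, which is why its range is much smaller. The short sketch below computes the last representable moment, assuming a signed 32-bit second counter:

```python
from datetime import datetime, timezone

# Largest value of a signed 32-bit seconds-since-epoch counter.
MAX_SIGNED_32 = 2**31 - 1


def latest_timestamp() -> datetime:
    """Return the last moment (UTC) that fits in a 32-bit epoch timestamp."""
    return datetime.fromtimestamp(MAX_SIGNED_32, tz=timezone.utc)


print(latest_timestamp())  # 2038-01-19 03:14:07+00:00
```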

1.5 Bit data types

InnoDB stores each BIT column in the smallest integer type that can hold its bits, so using BIT does not save much storage space. MySQL treats BIT as a string type: when you retrieve a BIT(1) value, the result is a string containing the binary value 0 or 1, not the ASCII character "0" or "1".
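The difference between the string view and the numeric view of a BIT value can be shown with a small model. The sketch below is plain Python rather than MySQL, and the helper names are hypothetical; it interprets the same bit pattern once as raw bytes and once as a number.

```python
def bit_as_string(bits: str) -> str:
    """Interpret a bit pattern as raw bytes, as a string context would see it."""
    value = int(bits, 2)
    width = (len(bits) + 7) // 8  # whole bytes needed for the pattern
    return value.to_bytes(width, "big").decode("latin-1")


def bit_as_number(bits: str) -> int:
    """Interpret the same bit pattern as an unsigned integer."""
    return int(bits, 2)


print(bit_as_string("00111001"))  # 9  (the character with code 57)
print(bit_as_number("00111001"))  # 57
print(repr(bit_as_string("1")))   # '\x01', not the character '1'
```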

2. Pitfalls in MySQL schema design

2.1 Too many columns

MySQL's storage engine API works by copying rows between the server layer and the storage engine layer in a row buffer format; the server layer then decodes the buffer into columns. Turning the row buffer into a row structure with decoded columns can be very expensive, and the cost depends on the number of columns.

2.2 Too many joins

As a rough rule of thumb, if you want a query to execute quickly and with good concurrency, a single query should join no more than 12 tables.

2.3 NULL values

When you need to store a de facto "null value" in a table, you can use 0, a special value, or an empty string instead. MySQL stores NULL values in indexes, while Oracle does not.

3. Normalization and denormalization

In a normalized database, each fact is stored exactly once.

In a denormalized database, information is redundant and may be stored in multiple places.

3.1 Advantages and disadvantages of normalization

Advantages:

Updates to normalized data are usually faster and need to change less data.

Normalized tables are usually smaller, so they fit better in memory and operations on them run faster.

With little redundant data, fewer DISTINCT or GROUP BY queries are needed.

Disadvantages:

Joins are usually required, which are expensive and may make some indexing strategies impossible.

3.2 Advantages and disadvantages of denormalization

Advantages:

All the data is in one table, so joins can be avoided.

Without joins, even a full table scan is mostly sequential I/O.

Disadvantages:

Redundant data makes updates slower.

The table is large, takes up more memory, and can easily push hot data out of memory.

4. Faster reads, slower writes

To speed up read queries, you often need to build additional indexes, add redundant columns, or even create cache tables and summary tables, all of which add work to write queries.
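The trade-off can be seen in a small summary-table sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table names page_view and page_view_count and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page_view (page_id INTEGER NOT NULL, viewed_at TEXT NOT NULL);
    CREATE TABLE page_view_count (page_id INTEGER PRIMARY KEY, views INTEGER NOT NULL);
""")


def record_view(page_id: int, viewed_at: str) -> None:
    """Write path: every insert also updates the summary table (the extra cost)."""
    with conn:  # one transaction for both statements
        conn.execute("INSERT INTO page_view VALUES (?, ?)", (page_id, viewed_at))
        cur = conn.execute(
            "UPDATE page_view_count SET views = views + 1 WHERE page_id = ?", (page_id,)
        )
        if cur.rowcount == 0:
            conn.execute("INSERT INTO page_view_count VALUES (?, 1)", (page_id,))


def view_count(page_id: int) -> int:
    """Read path: a single primary-key lookup instead of COUNT(*) over all views."""
    row = conn.execute(
        "SELECT views FROM page_view_count WHERE page_id = ?", (page_id,)
    ).fetchone()
    return row[0] if row else 0
```

Each call to record_view now touches two tables, while view_count no longer has to scan page_view. That is the burden shifted from reads to writes.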

Slower writes are not the only price of faster reads: these techniques can also make concurrent reads and writes harder to handle.

5. Speeding up ALTER TABLE

ALTER TABLE is a major problem for very large tables.

MySQL performs most table structure changes with the following steps:

1. Create an empty table with the new structure

2. Read all the data from the old table and insert it into the new table

3. Delete the old table
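The three steps above can be sketched by hand. The example below uses Python's built-in sqlite3 module as a stand-in for MySQL, purely to make the sequence runnable; the film table and its rating column are hypothetical, and the final rename represents putting the new table in place of the old one.

```python
import sqlite3


def rebuild_with_new_column(conn: sqlite3.Connection) -> None:
    """Rebuild the film table by hand, adding a rating column."""
    with conn:
        # 1. Create an empty table with the new structure.
        conn.execute(
            "CREATE TABLE film_new ("
            "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
            "rating TEXT NOT NULL DEFAULT 'G')"
        )
        # 2. Copy every row from the old table into the new one.
        conn.execute("INSERT INTO film_new (id, title) SELECT id, title FROM film")
        # 3. Drop the old table and give the new table its name.
        conn.execute("DROP TABLE film")
        conn.execute("ALTER TABLE film_new RENAME TO film")


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE film (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.executemany("INSERT INTO film (title) VALUES (?)", [("Alien",), ("Brazil",)])
    rebuild_with_new_column(conn)
    return conn.execute("SELECT id, title, rating FROM film ORDER BY id").fetchall()


print(demo())  # [(1, 'Alien', 'G'), (2, 'Brazil', 'G')]
```

Step 2 copies every row, which is why the cost of such a change grows with the size of the table.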

In general, most ALTER TABLE operations interrupt MySQL service for the table.

For common scenarios, there are two techniques:

1. Run ALTER TABLE on a server that is not serving traffic, then switch it with the server that is.

2. Make a "shadow copy": follow the same steps as above, but use triggers to keep the new table in sync with the old one, then rename the tables to swap them.

All MODIFY COLUMN operations cause a table rebuild.

5.1 Modifying only the .frm (table structure) file

The following operations may not require a table rebuild:

Removing the AUTO_INCREMENT attribute of a column

Adding, removing, or changing ENUM and SET constants

Steps (this is a risky operation, so proceed with great care):

1. Create an empty table with the same structure and make the necessary changes to it.

2. Execute FLUSH TABLES WITH READ LOCK. This closes all tables in use and prevents any table from being opened.

3. Swap the .frm files.

4. Execute UNLOCK TABLES to release the read lock taken in step 2.

6. Summary

1. Avoid designing overly complex database schemas.

2. Use small, simple, appropriate data types, and avoid NULL values as much as possible.

3. Try to use the same data type to store similar or related values.

4. Be careful with variable-length strings: temporary tables and sorts may pessimistically allocate memory for the maximum length.

5. Try to use auto-increment integer columns as primary keys.

6. Avoid using features that MySQL no longer recommends.

7. Be cautious with BIT, ENUM, and SET.
