In this article, the editor shares an example analysis of MySQL large table optimization. I hope you find it useful; let's go through it together.
Background
Our business tables in Aliyun RDS for MySQL (version 5.7) grow by more than 10 million rows every month. As the data volume keeps increasing, large tables and slow queries have appeared in the business. During peak hours, slow queries on the main business tables take dozens of seconds, which seriously affects the business.
Overview of the solution
1. Database design and index optimization
MySQL itself is highly flexible, which means performance depends heavily on the developer's table design and index optimization skills. Here are some optimization suggestions (a sample table definition follows this list).
Convert time values to timestamp format, store them as INT, and index the column to improve query efficiency.
Define fields as NOT NULL where possible. NULL values are hard to query and optimize and take up extra index space.
Use TINYINT instead of ENUM.
Use DECIMAL instead of FLOAT or DOUBLE to store exact decimal values.
Set field lengths strictly according to business requirements; don't make them larger than necessary.
Avoid the TEXT type where possible; for example, split rarely used large fields into separate tables.
MySQL limits the length of index columns. In the InnoDB engine, each index column is limited to 767 bytes by default, and the total length of all columns in a composite index cannot exceed 3072 bytes (in MySQL 8.0 a single index can cover up to 1024 characters).
Please contact the DBA before running DDL on large tables.
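As a minimal sketch of these suggestions (the table and column names are hypothetical, not taken from this article), the definition below stores time as an INT unix timestamp, declares columns NOT NULL, uses TINYINT instead of ENUM, and uses DECIMAL for money:

-- Hypothetical order table illustrating the suggestions above
CREATE TABLE t_order (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  order_no   VARCHAR(32)     NOT NULL DEFAULT ''   COMMENT 'length kept as small as the business allows',
  status     TINYINT         NOT NULL DEFAULT 0    COMMENT 'TINYINT instead of ENUM',
  amount     DECIMAL(12,2)   NOT NULL DEFAULT 0.00 COMMENT 'DECIMAL for exact values',
  created_at INT UNSIGNED    NOT NULL DEFAULT 0    COMMENT 'unix timestamp stored as INT and indexed',
  PRIMARY KEY (id),
  KEY idx_created_at (created_at),
  KEY idx_order_no (order_no)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;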
Leftmost index matching rule
As the name implies, when creating a composite index, put the column used most frequently in the WHERE clause on the far left, based on business requirements. An important question for a composite index is how to order its columns. For example, if two fields c1 and c2 both appear after WHERE, should the index be (c1, c2) or (c2, c1)? The correct approach is to put the column with fewer repeated values first: if 95% of a column's values are unique, that column can generally go first. The examples below illustrate the rule (a runnable sketch follows them).
Composite index index(a, b, c):
where a = 3 uses only a
where a = 3 and b = 5 uses a, b
where a = 3 and b = 5 and c = 4 uses a, b, c
where b = 3, or where c = 4, does not use the index
where a = 3 and c = 4 uses only a
where a = 3 and b > 10 and c = 7 uses a, b
where a = 3 and b like 'xx%' and c = 7 uses a, b
In fact, this is equivalent to creating multiple indexes: key(a), key(a, b), and key(a, b, c).
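As a hedged illustration (the table and column names are made up for this sketch), the composite index below follows the leftmost-prefix rule, and EXPLAIN can confirm which queries hit it:

-- Hypothetical table with a composite index on (a, b, c)
CREATE TABLE t_demo (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  a  INT NOT NULL DEFAULT 0,
  b  INT NOT NULL DEFAULT 0,
  c  INT NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  KEY idx_a_b_c (a, b, c)
) ENGINE=InnoDB;

EXPLAIN SELECT * FROM t_demo WHERE a = 3 AND b = 5 AND c = 4;  -- uses a, b, c
EXPLAIN SELECT * FROM t_demo WHERE a = 3 AND c = 4;            -- uses only a
EXPLAIN SELECT * FROM t_demo WHERE b = 3 OR c = 4;             -- cannot use idx_a_b_c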
2. Switching the database to PolarDB with read-write separation
PolarDB is a next-generation relational cloud database developed by Aliyun. It is 100% compatible with MySQL, offers storage capacity of up to 100 TB, and a single cluster can be expanded to as many as 16 nodes, making it suitable for a wide range of enterprise database scenarios. PolarDB uses a separated storage and compute architecture in which all compute nodes share one copy of the data, providing minute-level configuration upgrades, second-level fault recovery, global data consistency, and free data backup and disaster recovery.
Cluster architecture with separated compute and storage
PolarDB adopts a multi-node cluster architecture with one Writer node (primary node) and multiple Reader nodes (read-only nodes). All nodes share the underlying storage (PolarStore) through a distributed file system (PolarFileSystem).
Read-write separation
When an application uses the cluster endpoint, PolarDB serves requests through an internal proxy layer (Proxy); requests pass through the proxy before reaching the database nodes. Besides performing security authentication and protection, the proxy parses SQL, sends write operations (transactions, UPDATE, INSERT, DELETE, DDL, etc.) to the primary node, and evenly distributes read operations (SELECT) across the read-only nodes, achieving automatic read-write separation. To the application, it looks as simple as a single database instance.
In online/offline hybrid scenarios, different businesses use different connection endpoints and different data nodes to avoid affecting each other.
Sysbench performance stress test report, covering two configurations: PolarDB 4-core 16 GB (2 nodes) and PolarDB 8-core 32 GB (2 nodes).
3. Migrating historical data to the MySQL 8.0 X-Engine storage engine
The business tables retain 3 months of data (per our company's requirements); historical data is split into monthly tables and archived to a history database using the X-Engine storage engine (a sketch of such a monthly archive table appears at the end of this section). Why choose X-Engine, and what are its advantages?
Cost savings. The storage cost of X-Engine is about half that of InnoDB.
X-Engine's tiered storage improves QPS: it stores hot and cold data in different tiers and compresses the cold-data tiers by default.
X-Engine is an online transaction processing (OLTP) database storage engine developed by the Alibaba Cloud database products division.
The X-Engine storage engine not only integrates seamlessly with MySQL (thanks to MySQL's pluggable storage engine framework), it also uses a tiered storage architecture. Its goal is to store massive amounts of data, provide high-concurrency transaction processing, and reduce storage cost. In most big-data scenarios, data is not accessed uniformly: the frequently accessed hot data is actually a small fraction of the total. X-Engine therefore divides data into multiple tiers by access frequency, designs each tier's storage structure around its access characteristics, and writes each tier to the appropriate storage device.
X-Engine uses an LSM-Tree as the basis of its tiered storage architecture and has redesigned it:
The hot data tier and data updates are kept in memory, and transaction processing performance is improved through in-memory database techniques (lock-free index structures, append-only writes).
A pipelined transaction mechanism runs the stages of transaction processing in parallel, greatly improving throughput.
Data with low access frequency is gradually evicted or merged into the persistent storage tiers and stored across multiple tiers of storage devices (NVM/SSD/HDD).
A large number of optimizations have been made to the compaction process, which has a major impact on performance:
Data storage is split into finer granularity and, because data update hotspots are concentrated, existing data is reused as much as possible during merges.
The shape of the LSM tree is finely controlled to reduce I/O and computation cost and to effectively mitigate space amplification during merges.
Finer-grained access control and caching mechanisms are also used to optimize read performance.
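As a hedged sketch of the archiving approach described above (the table names, columns, and month boundaries are hypothetical; ENGINE=XENGINE is the engine name Aliyun uses for X-Engine tables in its MySQL 8.0 offerings), creating and filling one monthly history table might look like this:

-- Hypothetical monthly archive table in the history database, stored in X-Engine
CREATE TABLE history_db.t_order_202006 LIKE business_db.t_order;
ALTER TABLE history_db.t_order_202006 ENGINE = XENGINE;

-- Copy one month of historical rows into the archive table
INSERT INTO history_db.t_order_202006
SELECT * FROM business_db.t_order
WHERE created_at >= UNIX_TIMESTAMP('2020-06-01')
  AND created_at <  UNIX_TIMESTAMP('2020-07-01');

-- Remove the archived rows from the business table
-- (in practice, delete in small batches to avoid long locks)
DELETE FROM business_db.t_order
WHERE created_at >= UNIX_TIMESTAMP('2020-06-01')
  AND created_at <  UNIX_TIMESTAMP('2020-07-01');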
4. Parallel query in Aliyun PolarDB MySQL 8.0
Even after splitting the tables we still have a large amount of data; splitting only reduces the size of the business tables and does not completely solve our slow queries. For the remaining slow queries, we use PolarDB's parallel query optimization.
PolarDB MySQL 8.0 introduced a parallel query framework that starts automatically when the amount of data scanned by a query reaches a certain threshold, sharply reducing query time.
At the storage layer, data is sliced across different threads, which compute in parallel; the results are pipelined to a leader thread, which performs a simple merge and returns the final result to the user, improving query efficiency.
Parallel query (Parallel Query) takes advantage of the parallel processing power of multi-core CPUs, for example on an 8-core 32 GB configuration.
Parallel queries are suitable for most SELECT statements, such as queries on large tables, multi-table joins, and computation-heavy queries. For very short queries the benefit is not significant.
Hint syntax can control parallel query for a single statement. For example, when parallel query is turned off system-wide by default but a high-frequency slow SQL statement needs to be accelerated, a Hint can enable parallelism for that specific query:
SELECT /*+ PARALLEL(x) */ ... FROM ...;                    -- x > 0
SELECT /*+ SET_VAR(max_parallel_degree=n) */ * FROM ...;   -- n > 0
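As a hedged usage example (the table and the parallel degree are hypothetical), a slow aggregate query over a large order table could be run with a parallel degree of 8 via the SET_VAR hint shown above:

-- Hypothetical: force a parallel degree of 8 for one slow aggregate query
SELECT /*+ SET_VAR(max_parallel_degree=8) */
       status, COUNT(*) AS cnt, SUM(amount) AS total_amount
FROM   business_db.t_order
WHERE  created_at >= UNIX_TIMESTAMP('2020-06-01')
GROUP  BY status;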
Query test: database configuration 16-core 32 GB, single table with more than 30 million rows.
Before enabling parallel query the query took 4326 ms; with parallel query it took 525 ms, a performance improvement of about 8.24 times.
5. Interactive analysis with Hologres
Although parallel query improves the efficiency of slow queries on large tables, some requirements, such as real-time reports and real-time dashboards, still cannot be met this way and have to be handled with big data tooling.
For this we recommend Aliyun's interactive analytics service Hologres (https://help.aliyun.com/product/113622.html).
After reading this article, I believe you have a good understanding of this example analysis of MySQL large table optimization. If you want to learn more, you are welcome to follow the industry information channel. Thank you for reading!