This article walks through the five stages of MySQL database optimization, from table design through deployment, performance tuning, and architecture scaling to ongoing maintenance. I hope the content proves practically useful.
1. Database table design
After a project is approved, the development department builds it according to the product department's requirements, and part of the development engineer's job is to design the table structure. This matters a great deal for the database: a poor design directly affects access speed and user experience. Many factors contribute, such as slow queries, inefficient query statements, improper indexing, and database blocking (deadlocks). Teams with test engineers will run stress tests and look for bugs; on teams without them, most development engineers do not think much about whether the database design is reasonable and instead focus on implementing and delivering features as quickly as possible. Once the project sees real traffic, the hidden problems are exposed, and by then they are not so easy to fix.
2. Database deployment
Now the ops engineer takes the stage. The project's initial traffic will not be very large, so a single-server deployment is enough to handle roughly 1500 QPS (queries per second). For high availability, you can use MySQL master/slave replication plus Keepalived as a dual-machine hot standby. Common HA cluster software includes Keepalived and Heartbeat.
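For illustration, here is a minimal keepalived.conf sketch for the master node (the interface name, router ID, and virtual IP are assumptions for the example; the standby node would use state BACKUP and a lower priority):

vrrp_instance VI_1 {
    state MASTER
    interface eth0              # NIC carrying the virtual IP
    virtual_router_id 51
    priority 100                # the standby node gets a lower value, e.g. 90
    advert_int 1
    virtual_ipaddress {
        192.168.0.100           # applications connect to MySQL through this VIP
    }
}

If the master fails, Keepalived moves the virtual IP to the standby node, so clients reconnect without any configuration change.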
3. Database performance optimization
If MySQL is deployed on an ordinary x86 server, it can normally handle about 2000 QPS without any optimization; after optimization, that may rise to about 2500 QPS. Otherwise, when concurrency reaches about 1500 connections, database processing slows down even though hardware resources are still plentiful, so it is time to look at the software. How do you maximize database performance? On one hand, you can run multiple MySQL instances on a single server to squeeze the most out of the hardware; on the other hand, you can optimize the database itself. The default configurations of the operating system and of the database are often conservative and limit performance; adjusting them appropriately allows the database to handle as many connections as possible.
There are three levels of specific optimization.
3.1 Database configuration optimization
MySQL has two commonly used storage engines. MyISAM does not support transactions, has fast read performance, and uses table-level locking. InnoDB supports transactions (ACID), uses row-level locking, and is designed to maximize performance when processing large volumes of data.
Table lock: low overhead, coarse locking granularity, high probability of lock contention, low concurrency.
Row lock: higher overhead, fine locking granularity, low probability of lock contention, relatively high concurrency.
Why do table locks and row locks exist? Mainly to ensure data integrity. For example, if one user is operating on a table, other users who want to operate on the same data must wait until the first user finishes. Without table locks and row locks, multiple users operating on a table at the same time could leave the data conflicting or inconsistent.
From this point of view, InnoDB is the best choice, and it is also the default storage engine from MySQL 5.5 onward. Each storage engine has many associated parameters; those that most affect database performance are listed below.
Default values of common parameters:
MyISAM parameter defaults:
key_buffer_size = 16M
# index cache size; generally set to 30-40% of physical memory
read_buffer_size = 128K
# read-operation buffer size; 16M or 32M is recommended
InnoDB parameter defaults:
innodb_buffer_pool_size = 128M
# index and data buffer size; generally set to 60%-70% of physical memory
innodb_buffer_pool_instances = 1
# number of buffer pool instances; 4 or 8 is recommended
innodb_flush_log_at_trx_commit = 1
# a key parameter. 0: the log is written to the log file and flushed to disk once per second; if the database fails, about one second of transaction data is lost. 1: the log is written and flushed to disk on every transaction commit, which is the safest but inefficient, since every commit waits on log I/O. 2: the log is written only to the system cache on each commit and flushed to disk once per second, which is very efficient, but transaction data can be lost if the server fails. When data-safety requirements are not very high, 2 is recommended: performance is high and the effect of the change is obvious.
innodb_file_per_table = OFF
# the shared tablespace is the default; its ibdata file keeps growing, which hurts I/O performance. It is recommended to enable independent tablespaces, so that each table's indexes and data live in their own tablespace file and a single table can be moved between databases.
innodb_log_buffer_size = 8M
# log buffer size. Since the log is flushed at most once per second, it generally does not need to exceed 16M.
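Putting these recommendations together, a my.cnf sketch for a hypothetical server with 16 GB of RAM might look like the following (the sizes are illustrative assumptions, not prescriptions):

[mysqld]
innodb_buffer_pool_size = 10G          # roughly 60%-70% of physical memory
innodb_buffer_pool_instances = 4       # split the pool to reduce contention
innodb_flush_log_at_trx_commit = 2     # only when data-safety requirements allow it
innodb_file_per_table = 1              # independent tablespace per table
innodb_log_buffer_size = 8M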
3.2 System kernel optimization
Most MySQL deployments run on Linux, so some operating system parameters also affect MySQL performance. Below are some appropriate optimizations of the Linux kernel.
On Linux, if a process opens more file handles than the system default of 1024, the error "too many open files" is raised, so the open file handle limit should be raised.
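For example, typical adjustments might look like the following (the values are illustrative assumptions and should be sized to the actual workload):

# /etc/sysctl.conf
net.core.somaxconn = 4096        # larger listen backlog for bursts of connections
net.ipv4.tcp_tw_reuse = 1        # reuse TIME_WAIT sockets for new connections
vm.swappiness = 0                # keep the buffer pool from being swapped out
# /etc/security/limits.conf - raise the open-file limit past the 1024 default
mysql soft nofile 65535
mysql hard nofile 65535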
3.3 Hardware configuration
Increasing physical memory improves file system performance. The Linux kernel allocates caches from memory (system cache and data cache) to hold hot data, and through the file system's write-delay mechanism, the cache is synchronized to disk only when conditions are met (for example, the cache reaches a certain percentage, or the sync command is executed). In other words, the larger the physical memory, the larger the allocated cache and the more data it holds. Of course, some cached data will be lost if the server fails.
Replace SAS disks with SSDs, and set the RAID level to RAID 10, which has better read/write performance (IOPS) than RAID 1 or RAID 5; after all, the pressure on a database comes mainly from disk I/O.
4. Database architecture scaling
As business volume grows, a single database server can no longer meet the requirements, and it is time to add machines and build a cluster. The main ideas are to spread the load of a single database, break through the disk I/O bottleneck, store hot data in a cache, and reduce the frequency of disk I/O access.
4.1 Master-slave replication and read-write separation
Because most database operations in production are reads, deploy a one-master, multi-slave architecture: the master handles writes (with a dual-machine hot standby), and multiple slaves sit behind a load balancer to handle reads. Mainstream load balancers include LVS, HAProxy, and Nginx. How is read/write separation achieved? Most enterprises implement it at the code level, which is more efficient. The alternative is a proxy program, which is rarely used in practice; common proxies include MySQL Proxy and Amoeba. With this cluster architecture, the database's capacity for high concurrency increases greatly and the single-node performance bottleneck is solved: if one slave can handle 2000 QPS, five can handle 10,000 QPS, and the database also scales horizontally with ease.
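As an illustration of code-level read/write separation, here is a minimal Python sketch (the addresses, credentials, and round-robin policy are assumptions for the example, and it relies on the PyMySQL client):

import itertools
import pymysql

MASTER = dict(host='10.0.0.10', user='app', password='secret', database='web')
SLAVES = [dict(host='10.0.0.11', user='app', password='secret', database='web'),
          dict(host='10.0.0.12', user='app', password='secret', database='web')]
slave_cycle = itertools.cycle(SLAVES)   # simple round-robin over the slaves

def get_connection(sql):
    # route SELECT statements to a slave, everything else to the master
    if sql.lstrip().upper().startswith('SELECT'):
        return pymysql.connect(**next(slave_cycle))
    return pymysql.connect(**MASTER)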
Sometimes, for write-heavy applications, a single master's write performance cannot meet business requirements. If you run dual masters, you run into data inconsistency: users of different applications may operate on both databases, and simultaneous updates cause the two databases to conflict or diverge. Within a single database, MySQL's storage engine uses table locks and row locks to guarantee data integrity, but how do you solve this across multiple masters? There is a Perl-based master-slave replication management tool called MySQL-MMM (Master-Master replication manager for MySQL). Its greatest advantage is that it exposes only one writable database at any time, which effectively guarantees data consistency.
4.2 Add a cache
Add a cache system to the database to hold hot data in memory. If the requested data is in the cache, it is returned directly and the request no longer reaches the database, which improves read performance. Caches come in two flavors: a local cache keeps data in the application server's memory or local files and is very fast; a distributed cache can hold massive amounts of data and is easy to scale. The mainstream distributed caches are memcached and redis. memcached has stable performance, caches data in memory, and is very fast, with QPS up to roughly 80,000. If the data needs to be persisted, use redis; its performance is no lower than memcached's.
Working process: the application queries the cache first; on a hit the data is returned directly, and on a miss the application queries the database and writes the result back into the cache.
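A minimal cache-aside sketch in Python with the redis-py client (the host, key format, TTL, and the fetch_from_mysql placeholder are assumptions for the example):

import redis

cache = redis.Redis(host='127.0.0.1', port=6379)

def fetch_from_mysql(user_id):
    # placeholder standing in for a real MySQL query
    return b'user-row-%d' % user_id

def get_user(user_id):
    key = 'user:%d' % user_id
    value = cache.get(key)              # 1. try the cache first
    if value is not None:
        return value                    # 2. hit: the database is never touched
    value = fetch_from_mysql(user_id)   # 3. miss: query the database
    cache.setex(key, 300, value)        # 4. write back with a 300-second TTL
    return value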
4.3 Sub-database
Sub-database splitting distributes related tables into different databases according to business, such as separate web, bbs, and blog databases. If business volume is large, the split databases can also use a master-slave architecture to further reduce the pressure on any single database.
4.4 Sub-table
As data grows daily, a single table can accumulate millions of rows, making queries and inserts take too long. How can the pressure on a single table be relieved? Consider splitting it into multiple small tables to reduce the load on each and improve processing efficiency; this is called sub-table splitting.
Sub-table splitting is more involved: the SQL statements in the program code must be modified, and the new tables must be created by hand. You can also implement it with the MERGE storage engine, which is relatively simple, as shown in the sketch below. After splitting, the program operates on a master table that stores no data itself; it only defines the relationship to its sub-tables and how data is routed to them. The master table spreads the load across the different small tables according to the query, improving concurrency and disk I/O performance.
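As an illustration of the MERGE approach (the table and column names are assumptions for the example; the underlying tables must be MyISAM and structurally identical):

CREATE TABLE posts_2023 (id INT NOT NULL, title VARCHAR(100)) ENGINE=MyISAM;
CREATE TABLE posts_2024 (id INT NOT NULL, title VARCHAR(100)) ENGINE=MyISAM;
-- the master table stores no data, only the relation to its sub-tables
CREATE TABLE posts_all (id INT NOT NULL, title VARCHAR(100))
    ENGINE=MERGE UNION=(posts_2023, posts_2024) INSERT_METHOD=LAST;
-- queries on posts_all read all sub-tables; INSERT_METHOD=LAST routes new rows to posts_2024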
Table splitting comes in two forms, vertical and horizontal:
Vertical split: split the original table with many fields into multiple tables to solve the problem of table width. You can put less commonly used fields in a separate table, large fields in a separate table, or closely related fields in a table.
Horizontal split: split the original table into multiple tables, each table has the same structure, to solve the problem of large amount of data in a single table.
4.5 Partitioning
Partitioning divides one table's data into multiple blocks, which can live on one disk or be spread across different disks. After partitioning, the data still looks like a single table on the surface, but it is stored in multiple locations. In this way, multiple disks handle different requests at the same time, improving disk I/O performance; it is also relatively simple to implement.
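For example, a sketch of RANGE partitioning by year (the table, column, and partition boundaries are assumptions for the example):

CREATE TABLE orders (
    id INT NOT NULL,
    created DATE NOT NULL
)
PARTITION BY RANGE (YEAR(created)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);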
Note: adding caches, sub-databases, sub-tables, and partitions is mainly implemented by programmers.
5. Database maintenance
Database maintenance is the main work of operation and maintenance engineers or DBA, including performance monitoring, performance analysis, performance tuning, database backup and recovery and so on.
5.1 Key performance status indicators
QPS (Queries Per Second): the number of queries the database can process per second.
TPS (Transactions Per Second): the number of transactions processed per second.
Checking the running status with show status returns more than 300 status records; several of these values help us calculate QPS and TPS, as follows:
Uptime: how long the server has actually been running, in seconds
Questions: the number of queries sent to the database
Com_select: the number of SELECT operations actually executed
Com_insert: the number of inserts
Com_delete: the number of deletes
Com_update: the number of updates
Com_commit: the number of commits
Com_rollback: the number of rollbacks
First, the method of calculating QPS based on Questions:
mysql> show global status like 'Questions';
mysql> show global status like 'Uptime';
QPS = Questions / Uptime
Calculate the TPS based on Com_commit and Com_rollback:
mysql> show global status like 'Com_commit';
mysql> show global status like 'Com_rollback';
mysql> show global status like 'Uptime';
TPS = (Com_commit + Com_rollback) / Uptime
Another calculation method: calculate QPS based on Com_select, Com_insert, Com_delete, Com_update
mysql> show global status where Variable_name in ('Com_select', 'Com_insert', 'Com_delete', 'Com_update');
Wait one second and run it again to get the interval difference; for each variable, the second value minus the first is that second's QPS. The TPS calculation method:
mysql> show global status where Variable_name in ('Com_insert', 'Com_delete', 'Com_update');
To calculate TPS, leave out the query counter and sum only the insert, delete, and update values.
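Alternatively, mysqladmin can sample these counters at a fixed interval (the credentials here are assumptions for the example):

# mysqladmin -uroot -p123 extended-status -r -i 1 | grep -E 'Com_(select|insert|delete|update)'
# -i 1 repeats every second; -r shows the difference from the previous sample, i.e. the per-second rate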
According to tests by community users, the Questions-based calculation is more accurate when the database contains mostly MyISAM tables, while the Com_*-based calculation is more accurate when it contains mostly InnoDB tables.
5.2 Enable the slow query log
Enable MySQL's slow query log to analyze which SQL statements are slow. Variables set with set are lost when the service restarts; add the parameters to my.cnf to make them permanent.
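For example, the relevant my.cnf parameters might look like this (the 2-second threshold is an illustrative assumption):

[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 2        # log statements that run longer than 2 seconds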
To analyze slow query logs, you can use the mysqldumpslow tool that ships with MySQL; its analysis is relatively simple.
# mysqldumpslow -t 3 /var/log/mysql/mysql-slow.log    # view the three slowest queries
You can also use Percona's pt-query-digest tool, which has more comprehensive analysis features and can analyze slow logs, binlogs, and general logs.
Analyze a slow query log: # pt-query-digest /var/log/mysql/mysql-slow.log
Analyze a binlog (convert it to text first): # mysqlbinlog mysql-bin.000001 > mysql-bin.000001.sql
# pt-query-digest --type=binlog mysql-bin.000001.sql
Analyze a general log: # pt-query-digest --type=genlog localhost.log
5.3 Database backup
Backing up the database is the most basic and most important task; otherwise the consequences can be severe. But because databases are often large, hundreds of gigabytes, backups are very time-consuming, so an efficient backup strategy is needed; for databases with large data volumes, incremental backup is generally used. Common backup tools include mysqldump, mysqlhotcopy, and xtrabackup. mysqldump is more suitable for small databases: it is a logical backup, so backup and restore take a long time. mysqlhotcopy and xtrabackup are physical backups with fast backup and restore, performed hot without affecting database service; xtrabackup is recommended, and it supports incremental backup.
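A sketch of a full backup followed by an incremental one with xtrabackup (the paths and credentials are assumptions for the example; older releases wrapped these operations in the innobackupex script):

# xtrabackup --backup --user=root --password=123 --target-dir=/backup/full
# xtrabackup --backup --user=root --password=123 --target-dir=/backup/inc1 --incremental-basedir=/backup/full
# the incremental run copies only the pages changed since the full backup in /backup/full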
5.4 Database repair
Sometimes the MySQL server loses power or shuts down abnormally, corrupting tables so that their data cannot be read. MySQL ships with two tools to repair them: myisamchk and mysqlcheck.
myisamchk: can only repair MyISAM tables, and the database must be stopped first
Common parameters:
-f --force    force the repair, overwriting old temporary files; generally not used
-r --recover    recovery mode
-q --quick    quick recovery (used together with -r)
-a --analyze    analyze the table
-o --safe-recover    old recovery mode; if -r cannot repair the table, try this parameter
-F --fast    only check tables that were not closed properly
Quickly repair the weibo database:
# cd /var/lib/mysql/weibo
# myisamchk -r -q *.MYI
mysqlcheck: works on both MyISAM and InnoDB tables, and the database does not need to be stopped. To repair a single table, append the table name after the database name, separated by a space.
Common parameters:
-A --all-databases    check all databases
-r --repair    repair tables
-c --check    check tables (the default option)
-a --analyze    analyze tables
-o --optimize    optimize tables
-q --quick    the fastest check or repair
-F --fast    only check tables that were not closed properly
Quickly repair the weibo database:
# mysqlcheck -r -q -uroot -p123 weibo
5.5 Additionally, check CPU and I/O performance
# View CPU performance (for example, with mpstat)
# mpstat -P ALL 2
# the -P parameter selects which CPU to display; ALL shows all of them, or you can give a single CPU number
# Check I/O performance (for example, with iostat)
# iostat -m -x 2
# the -m parameter displays values in MB; the default is KB
# %util: when it reaches 100%, disk I/O is saturated
# await: the time requests wait in the queue, which directly affects read latency
The IOPS limit (r/s + w/s) is about 1200 (IOPS: the number of read/write operations per second).
Bandwidth: for sequential reads and writes, a SAS disk's theoretical value is about 300 MB/s, and an SSD's about 600 MB/s.
That covers the five stages of MySQL database optimization; I hope you find it helpful in practice.