2025-02-24 Update From: SLTechnology News & Howtos > Database
Database level:
Application system level optimization
SQL optimization
SQL optimization generally starts by catching long-running, high-cost statements in the slow query log, analyzing their logic against the specific business, and then simplifying or rewriting them. Set slow_query_log=1 to enable slow-log recording and log_queries_not_using_indexes=1 to also record queries that use no index; note that the latter makes the slow log grow quickly, so rotate it regularly. Slow logs are usually analyzed with pt-query-digest from the Percona Toolkit or with mysqldumpslow, which ships with MySQL; I prefer pt-query-digest for its more detailed analysis. Note that pt-query-digest is actually a Perl script: on a large slow log (more than a few gigabytes) it consumes a lot of CPU, so it is best scheduled during off-peak hours.
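As a minimal sketch, the slow-log settings above can be collected in my.cnf (the file path is illustrative):

```ini
[mysqld]
slow_query_log                = 1
slow_query_log_file           = /var/log/mysql/slow.log   ; illustrative path
long_query_time               = 1                         ; seconds; tune per business
log_queries_not_using_indexes = 1                         ; grows the log quickly -- rotate it
```

The log can then be summarized off-peak with, for example, `pt-query-digest /var/log/mysql/slow.log > slow_report.txt`.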
Index optimization
Indexes are the most common objects in a database, and roughly 90% of SQL performance problems are caused by missing or inefficient indexes. Building appropriate indexes for the actual business scenario therefore gives SQL optimization twice the result for half the effort.
Index design rules: prefer unique indexes and primary keys; index fields that are frequently sorted, grouped, or joined, preferring a composite index over several single-column indexes; index fields that often appear in query conditions; limit the total number of indexes; prefer indexes on columns with small data; use prefix indexes where possible; and drop indexes that are no longer used or rarely used.
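A hypothetical sketch of the composite-index rule above (table and column names are made up):

```sql
-- One composite index serves equality on user_id plus sorting on created_at,
-- instead of two single-column indexes
CREATE INDEX idx_user_created ON orders (user_id, created_at);

-- Served by the index above (leftmost column appears in the condition)
SELECT order_id
FROM   orders
WHERE  user_id = 42
ORDER  BY created_at DESC;
```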
Some considerations for creating an index:
(1) avoid using the != or <> operator in the where clause; otherwise the engine may abandon the index and perform a full table scan
(2) avoid using or to join conditions in the where clause, and consider using union instead.
(3) avoid expression operations or function operations on fields in the where clause
(4) We should first consider indexing the columns involved in where and order by.
(5) when using an indexed field as a condition and the index is a composite index, the first field of the index must appear in the condition for the index to be used, and the order of the condition fields should match the index order as far as possible: the leftmost-prefix principle.
(6) an index improves the efficiency of the corresponding selects, but it also costs time and space to maintain, so more indexes are not always better; generally keep no more than 7 per table.
(7) when temporary tables are used and are all explicitly deleted at the end, run truncate table first and then drop table, which avoids locking the system tables for a long time.
(8) avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources. For one-time events, it is best to use an export table.
(9) prefer NOT NULL columns over nullable ones wherever possible. Comments, descriptions, remarks and the like may be set to NULL; for everything else it is best to avoid NULL.
(10) when indexing a varchar field, specify an index length instead of indexing the full column. The selectivity, count(distinct left(column, length)) / count(*), can be computed to choose a suitable length.
(11) in queries joining multiple tables, make sure the join fields are indexed and that their types match, to avoid index failure caused by implicit conversion.
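A sketch of rule (10), with hypothetical table and column names: compare the selectivity of several prefix lengths against the full column, then create the prefix index.

```sql
-- Selectivity of 8-, 12- and 16-character prefixes vs. the full column
SELECT count(DISTINCT left(email, 8))  / count(*) AS sel_8,
       count(DISTINCT left(email, 12)) / count(*) AS sel_12,
       count(DISTINCT left(email, 16)) / count(*) AS sel_16,
       count(DISTINCT email)           / count(*) AS sel_full
FROM   users;

-- If sel_12 is already close to sel_full, a 12-character prefix is enough
CREATE INDEX idx_email_prefix ON users (email(12));
```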
Database table optimization
For example: early in a business's life, table design may be neglected, so logs, messages, pictures and similar data all end up stored in database tables, usually in text/blob fields. This bloats the tables, and queries against them are hard to optimize. Such data should be designed out of the tables: store pictures, messages and similar payloads in a NoSQL database such as MongoDB, or split the large fields into a separate table and relate the two tables through redundant fields. (Doing this through MySQL foreign key constraints is not recommended: under high concurrency the resulting row locks hurt database performance, so it is strongly recommended to maintain the association in the application.)
Use a unified character set and collation, use the InnoDB engine, use a business-independent auto-increment ID column as the primary key, minimize stored procedures and user-defined functions, and avoid field types such as text/blob where possible.
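The table-design rules above might look like this in practice (the table, its columns, and the external blob store are illustrative):

```sql
CREATE TABLE user_message (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- business-independent PK
    user_id    BIGINT UNSIGNED NOT NULL,
    title      VARCHAR(128)    NOT NULL DEFAULT '',
    body_ref   VARCHAR(64)     NOT NULL DEFAULT '',      -- pointer to the payload stored elsewhere (e.g. MongoDB)
    created_at DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    KEY idx_user_created (user_id, created_at)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
```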
Table design specification
Please refer to the attached Alibaba Java development manual and 58-home database 30 military rules interpretation (https://www.oschina.net/question/54100_2231325)
Database object optimization
Memory configuration optimization
InnoDB buffer pool setting: innodb_buffer_pool_size, generally 70%-80% of the machine's memory.
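A small sketch (Linux only) for computing a starting value of about 75% of physical RAM; the percentage is this article's rule of thumb, not a MySQL default:

```shell
# Print ~75% of MemTotal, in kB, as a starting point for innodb_buffer_pool_size
awk '/^MemTotal/ {printf "%d\n", $2 * 0.75}' /proc/meminfo
```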
Percentage of dirty pages in the buffer pool: innodb_max_dirty_pages_pct, which defaults to 75%. It is recommended to set it according to business scenarios.
It is strongly recommended to disable the query cache: set query_cache_size = 0 and query_cache_type = 0 in the configuration file. (The query cache was removed entirely in MySQL 8.0.)
Redo log buffer setting: innodb_log_buffer_size. Without big transactions, 8M-16M is enough; our production environment is currently configured to 64M.
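Collected as a my.cnf fragment (the values follow this article's suggestions for a hypothetical dedicated 16G host; tune for your workload):

```ini
[mysqld]
innodb_buffer_pool_size    = 12G   ; ~70-80% of RAM on a dedicated 16G host
innodb_max_dirty_pages_pct = 75    ; the default; adjust per business scenario
query_cache_size           = 0     ; query cache disabled (removed in MySQL 8.0)
query_cache_type           = 0
innodb_log_buffer_size     = 64M   ; 8M-16M suffices without big transactions
```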
IO configuration optimization
Sync_binlog:
Sync_binlog=0: when a transaction commits, MySQL issues no fsync-style disk synchronization to flush binlog_cache to disk; the filesystem decides when to synchronize, or the cache is flushed once it is full.
Sync_binlog=n: after every n transaction commits, MySQL issues an fsync-style disk synchronization to force the data in binlog_cache to disk.
Sync_relay_log:
The same function as the sync_binlog parameter, except that the object is relay log instead of binlog.
Innodb_flush_log_at_trx_commit:
If innodb_flush_log_at_trx_commit is set to 0, the log buffer is written to the log file once a second, and the flush (to disk) of the log file happens at the same time. In this mode, committing a transaction does not actively trigger a write to disk.
If innodb_flush_log_at_trx_commit is set to 1, MySQL will write log buffer data to log file and flush (flush to disk) each time the transaction commits.
If innodb_flush_log_at_trx_commit is set to 2, MySQL writes log buffer data to log file. But the flush (flush to disk) operation does not happen at the same time. In this mode, MySQL performs a flush (flush to disk) operation once a second.
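For the strongest durability, sync_binlog and innodb_flush_log_at_trx_commit are usually set to 1 together (the so-called "dual 1" setting); a my.cnf sketch:

```ini
[mysqld]
sync_binlog                    = 1   ; fsync the binlog on every commit
innodb_flush_log_at_trx_commit = 1   ; flush the redo log on every commit
; Relaxing either value (0 or N) trades crash safety for write throughput
```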
Sync_master_info:
After how many events master.info is synced to disk. If the repository is a table (InnoDB), this setting is ignored and the table is updated on every transaction.
Sync_relay_log_info:
After how many events relay-log.info is synced to disk. If the repository is a table (InnoDB), this setting is ignored and the table is updated on every transaction.
Master_info_repository:
Records the master's binlog position; can be set to FILE (master.info) or TABLE (mysql.slave_master_info).
Relay_log_info_repository:
Records the relay log position; can be set to FILE (relay-log.info) or TABLE (mysql.slave_relay_log_info).
Innodb_io_capacity: defaults to 200; on an SSD disk it is recommended to raise it, for example to 5000.
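A sketch of these replication-metadata and IO settings in my.cnf (TABLE repositories are the crash-safe choice; the SSD value follows this article's suggestion):

```ini
[mysqld]
master_info_repository    = TABLE   ; mysql.slave_master_info instead of master.info
relay_log_info_repository = TABLE   ; mysql.slave_relay_log_info instead of relay-log.info
sync_relay_log            = 1       ; as with sync_binlog, 0 leaves flushing to the OS
innodb_io_capacity        = 5000    ; article's suggestion for SSD; the default is 200
```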
High concurrency setting
Raise the file descriptor limit:
1. Dynamic modification (root only, lost after restart, valid for the current session): ulimit -n 65535
2. Modify the configuration file for a permanent effect; add the following to the /etc/security/limits.conf configuration file:
* soft nofile 65535
* soft nproc 65535
* hard nofile 65535
* hard nproc 65535
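After logging in again, the effective limits can be checked per session; a quick sketch:

```shell
# Current soft and hard limits on open files for this shell
ulimit -Sn
ulimit -Hn
```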
Host level:
1. CPU
Set the CPU frequency governor to performance mode (i.e. performance); for reference see the blog: http://blog.csdn.net/myarrow/article/details/7917181/
2. Memory
(1) turn off the NUMA feature. The "NUMA trap" is when the server still has free memory yet is already using swap, or has even stalled. This can happen because of NUMA's restriction: if a process is limited to the memory of its own NUMA node, then once that node's memory is exhausted it will not use memory from other nodes but will start using swap instead; worse, if the machine has no swap configured, it may go down outright. It is therefore strongly recommended to turn off NUMA at the operating system level, by appending numa=off to the kernel line in /etc/grub.conf.
(2) configure as much memory as possible, so that MySQL can make full use of it to cache hot block data, avoiding the IO bottleneck of constantly flushing dirty pages for lack of memory and preventing hot blocks from being evicted from the cache.
(3) adjust the swappiness setting. Swappiness is a Linux kernel parameter that controls how aggressively physical memory is swapped out. It takes a percentage from 0 to 100 and defaults to 60. A value of 0 means use swap as little as possible; 100 means aggressively swap inactive memory pages out to swap or release cache. "Inactive" memory is memory that a program has mapped but has not used for a long time. A value of 1 is recommended, set by adding a line to the /etc/sysctl.conf file: vm.swappiness = 1
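Applied as a config fragment (running sysctl -p reloads /etc/sysctl.conf without a reboot):

```ini
; /etc/sysctl.conf -- prefer reclaiming page cache over swapping MySQL's memory
vm.swappiness = 1
```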
3. Disk IO
(1) put data files and log files on separate disks, to avoid IO contention between log flushing and data flushing.
(2) try to use high IO disks, or use disk arrays such as raid10, or directly use SSD disks.
4. Network optimization
The machines in a cluster should be deployed on the same private network or a directly connected network, to guarantee low latency and high throughput and to avoid master-slave replication anomalies in the cluster caused by network problems.
Operating system level:
1. File system
It is strongly recommended to use the xfs file system. ext4 has a bug which, when triggered, consumes most of the available IO and causes an IO bottleneck. See http://1057212.blog.51cto.com/1047212/1891734 for details.
Optimize the file system mount parameters, which are modified in the /etc/fstab file and take effect after remounting or reboot. noatime stops recording file access times, and nodiratime stops recording directory access times. barrier=0 turns off the write-barrier function on ext4; the xfs equivalent is nobarrier.
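A hypothetical /etc/fstab line for an xfs data volume (the device and mount point are made up):

```
/dev/sdb1  /data/mysql  xfs  defaults,noatime,nodiratime,nobarrier  0 0
```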
2. IO scheduling algorithm
NOOP: the full name of the NOOP algorithm is No Operation. It implements the simplest FIFO queue: all IO requests are served in first-come-first-served order.
CFQ: the full name of the CFQ algorithm is Completely Fair Queuing. Its characteristic is that it sorts requests by IO address rather than responding in first-come-first-served order.
DEADLINE: DEADLINE fixes the extreme case of IO request starvation on top of CFQ-style sorting. Besides the sorted IO queue, DEADLINE adds separate FIFO queues for read IO and write IO. The maximum wait in the read FIFO queue is 500ms, and in the write FIFO queue it is 5s. Requests in the FIFO queues have priority over those in the sorted queue, and the read FIFO queue has priority over the write FIFO queue: FIFO(Read) > FIFO(Write) > CFQ.
Generally, MySQL servers use the deadline algorithm for disk IO scheduling: it guarantees that no IO request starves while still giving read IO higher processing priority than write IO.
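A sketch for inspecting (and, as root, changing) the scheduler; the device name sda in the commented command is an assumption:

```shell
# List the active scheduler (shown in brackets) for every block device
for q in /sys/block/*/queue/scheduler; do
  [ -e "$q" ] || continue
  echo "$q: $(cat "$q")"
done
# As root, switch one device to deadline (mq-deadline on multi-queue kernels):
#   echo deadline > /sys/block/sda/queue/scheduler
```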
Optimization at the system architecture level
Load balancing
It can be divided into two categories:
1. In database clusters such as PXC, MySQL Cluster, or MySQL Group Replication, which support multi-point writes, the load balancer can balance both reads and writes, achieved by deploying haproxy or LVS in front of the database. It should be emphasized, however, that at present this type of cluster is prone to lock conflicts and lost updates under multi-point writes, so single-point writing is generally recommended: only a single node carries the write operations, and the remaining nodes load-balance the read operations.
2. Mysql one-master and multi-slave architecture: since the master database must carry write operations separately, load balancing is only aimed at read operations. Load balancing of read operations is also achieved by deploying haproxy or LVS at the front end of the database.
Caching
Generally, the in-memory database such as Redis, memcached and mysql are combined to put the hot data on the in-memory database to achieve high concurrency. Please refer to the blog: http://blog.csdn.net/stubborn_cow/article/details/50586990
Distributed optimization
Sub-database sub-table:
It can also be divided into two categories:
(1) sharding implemented in the logic of the front-end application code. This is more intrusive to the application, but the data-processing logic stays in your own hands, and exceptions can be located independently.
(2) sharding implemented through middleware; MyCAT and Cobar are commonly used for data sharding at present.
Read-write separation:
Generally implemented through database middleware; commonly used middleware includes MaxScale, MyCAT, Cobar, Atlas, etc.
Attachment: http://down.51cto.com/data/2366569