I have to tell you the principle of MySQL optimization 3 07/02 Update SLTechnology News&Howtos

I have to tell you the principle of MySQL optimization 3

2025-07-02 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Talk about MySQL configuration.

Most developers may not pay much attention to the configuration of MySQL, after all, when there is no problem with the basic configuration, it is a very pragmatic strategy to pay more attention to schema design, index optimization and SQL optimization. At this point, if you make more efforts to optimize the configuration items, the benefits are usually relatively small. More often, for security reasons, ordinary developers have little access to the MySQL configuration of the production environment. As a result, developers (including me) don't know much about MySQL configuration. I hope this article will help you understand MySQL configuration better.

What would you do if you were asked to install and configure MySQL in some environment? After installation, it should be the practice of most people to modify the sample configuration file directly by copy. First of all, the sample configuration file has a lot of commented-out configuration items, which may tempt you to open a configuration you don't know, and these comments are not necessarily accurate. Second, some configurations of MySQL are a bit out of date for modern hardware and workloads.

MySQL has a lot of configuration items to modify, but in most cases, you should not modify it casually, because the potential risks caused by errors or useless configurations are very high, and it is difficult to locate the problem. Make sure the basic configuration is correct, then carefully diagnose the problem, make sure that the problem happens to be resolved through a configuration item, and then modify the configuration.

In fact, the quickest way to create a good configuration is not to start by learning configuration items, nor by asking which configuration items should be set or modified, let alone by checking server behavior and asking which configuration items can improve performance. It's best to start by understanding the MySQL kernel and behavior, and then use that knowledge to guide you in configuring MySQL.

Start by understanding how MySQL configuration works.

How MySQL configuration works

Where does MySQL get configuration information: command line parameters and configuration files. In Unix-like systems, the configuration file is usually located in / etc/my.cnf or / etc/mysql/my.cnf. At startup, you can specify the location of the configuration file through the command line arguments, and of course other parameters can be specified on the command line. The server reads the contents of the configuration file, removes all comments and line breaks, and then processes it with the command line options.

Any configuration items intended for long-term use should be written to the configuration file, not specified on the command line. Be sure to clearly know the location of the configuration file used by MySQL, and you can't take it for granted when modifying, for example, changing the configuration item of / etc/my.cnf, but MySQL doesn't actually use this configuration file. If you don't know which profile path you are currently using, you can try:

Root@msc3:~# which mysqld/usr/sbin/mysqldroot@msc3:~# / usr/sbin/mysqld-- verbose-- help | grep-A 1 'Default options'Default options are read from the following files in the given order:/etc/my.cnf / etc/mysql/my.cnf ~ / .my.cnf

A typical configuration file contains multiple sections, each beginning with a segment name enclosed in square brackets. MySQL programs usually read segments with the same name, for example, many client programs read client parts. The server usually reads the mysqld section, and make sure that the configuration item is placed in the correct section of the file, otherwise the configuration will not take effect.

Each configuration item in MySQL is in lowercase, and words are separated by underscores or underscores. Although our common separator is underscore, if you see the following configuration on the command line or in the configuration file, you should know that they are actually equivalent:

# configuration file max_connections=5000max-connections=5000# command line / usr/sbin/mysqld-- max_connections=5000/usr/sbin/mysqld-- max-connections=5000

Configuration items can have multiple scopes: global scope, session scope (each connection plays a different role), and object scope. Many session-level configuration items are equal to the global configuration and can be considered as the default value. If you change the session-level configuration item, it only affects the changed current connection. When the connection is closed, all parameter changes will become invalid. Here are a few sample configuration items:

Query-cache-size global configuration item

Sort-buffer-size is the same globally by default, but it can also be set in each thread

Join-buffer-size defaults to global, and each thread can also be set. However, if multiple tables are associated in a query, an association cache (join-buffer) can be allocated for each association, so a query may have multiple association buffers.

There are many (but not all) of the variables (configuration items) in the configuration file that can be modified while the server is running, which MySQL classifies as dynamic configuration variables:

# set global variables. GLOBAL and @ @ global have the same function as set GLOBAL sort-buffer-size = set @ @ global.sort-buffer-size: = # set session-level variables The following six ways work the same # that is, without modifiers, SESSION, LOCAL and other modifiers, set SESSION sort-buffer-size = set @ @ session.sort-buffer-size: = set @ @ sort-buffer-size = set LOCAL sort-buffer-size = set @ @ ocal.sort-buffer-size: = set sort-buffer-size = # set command can set multiple variables at the same time However, as long as one of the variables fails to set, none of the variables are in effect. SET GLOBAL sort-buffer-size = 100, SESSION sort-buffer-size = 1000 SET GLOBAL max-connections = 1000, sort-buffer-size = 1000000

Dynamically set variables that are invalidated when MySQL is turned off. If the global value of a variable is modified while the server is running, this value has no effect on the current session or any other existing session, because the variable value of the session is initialized from the global value when the connection is created. Note that after the configuration is modified, you need to confirm that the modification is successful.

You may notice that in the above example, some use "=" and some use ": =". For the set command itself, there is no difference between the two assignment operators, and the use of either operator on the command line takes effect. In other statements, the assignment operator must be ": =" because "=" is considered a comparison operator in non-set statements. For more information, please refer to the following example: for detailed examples, please refer to stackoverflow

/ / @ exp indicates the user variable, and the above examples are all system variables / / error set @ user = 123456 SET @ group = select GROUP from USER where User = @ user;select * from USER where GROUP = @ group;// correct SET @ user: = 123456 select @ group: = `group`group` = @ user;SELECT * FROM user WHERE `group` = @ group

Some configurations use different units, such as the table-cache variable, which specifies how many bytes the table can be cached, rather than the number of bytes that the table can be cached. Key-buffer-size is in bytes.

There are also configurations that can specify suffix units, such as 1M=1024*1024 bytes, but it is important to note that this is only valid in the configuration file or as a command line argument. When using SQL's SET command, you must use an expression with a numeric value of 1048576 or 1024, but you cannot use an expression in a configuration file.

Configure MySQL carefully

We often modify the configuration dynamically, but be careful, as they can cause the database to do a lot of time-consuming work, thus affecting the overall performance of the database. For example, if a dirty block is refreshed from the cache, different refresh methods have a great impact on Ipicuro (more on that later). It is best to incorporate some good habits into the workflow as norms, such as:

Good habit 1: do not infer the role of a variable by the name of the configuration item

Don't infer the role of a variable by the name of the configuration item, because it may be completely different from what you think. For example:

Read-buffer-size: when MySQL needs to read data sequentially, if the index cannot be used, it will perform a full table scan or a full index scan. At this time, MySQL reads the data blocks according to the storage order of the data, and the data blocks read each time will first be temporarily stored in the cache. When the cache space is full or all the data is read, the data in the cache will be returned to the upper caller to improve efficiency.

Read-rnd-buffer-size: corresponding to sequential reads, this buffer is used to temporarily store read data when MySQL does non-sequential read (random read) data blocks. For example, reading table data according to index information, Join according to the sorted result set and table, and so on. In general, when data blocks need to be read in a certain order, MySQL needs to generate random reads, which in turn uses the memory buffer set by the read-rnd-buffer-size parameter.

Both configurations are valid when scanning the MyISAM table, and MySQL allocates memory for each thread. For the former, MySQL allocates memory for the cache only when the query needs it, and allocates all the memory specified by the parameter at once, while the latter allocates memory only when needed, but only allocates the amount of memory needed rather than the value specified by the parameter. The name max-read-rnd-buffer-size (actually does not have this configuration item) can better express the actual meaning of this variable.

Good habit 2: do not easily modify session-level configuration globally

For some session-level settings, do not easily increase their values globally unless you are sure that this is the right thing to do. For example: sort-buffer-size, this parameter controls the cache size of the sort operation. MySQL allocates memory for the buffer only when the query needs to sort. Once sorting is needed, it allocates the specified size of memory at one time, even if it is a very small sort operation. So you should configure it smaller in the configuration file, and then make it larger in the connection when some queries need to be sorted. For example:

SET @ @ seession.sort-buffer-size: =-- sqlSET @ @ seession.sort-buffer-size: = DEFAULT # restore the default value # can encapsulate similar code in a function for ease of use. Good habit 3: when configuring variables, the higher the value, the better.

When configuring variables, the higher the value, the better, and if you set the value too high, it may be more likely to cause memory problems. After the modification is completed, monitoring should be used to confirm the impact of the modification of the variables on the overall performance of the server.

Good habit 4: specification comments, version control

Writing comments in the configuration file may save yourself and your colleagues a lot of work, and a better habit is to put the configuration file under version control.

After talking about good habits, let's talk about bad habits.

Bad habit 1: tune according to some "ratios"

A classic rule of thumb for ratio tuning is that the cache hit ratio should be higher than a certain percentage, and if the hit ratio is too low, you should increase the cache size. This is a very wrong idea, and you can think about it carefully: is there a certain relationship between cache hit rate and cache size? When the denominator increases, the value increases. Unless the cache is really too small. The MyISAM key buffer hit ratio is described in more detail below.

Bad habit 2: feel free to use tuning scripts

Try not to use tuning scripts! Different business scenarios and different hardware environments have different performance requirements for MySQL. For example, some businesses have high requirements for data integrity, so we must ensure that the data is not lost, and the data can be recovered after a failure, while some businesses do not have such high requirements for data integrity, but higher performance requirements. Therefore, even if it is the same variable, its configuration values should be different in these two different scenarios. Can you still rest assured to use the script found online?

Several configuration items in this section are for illustration only and do not represent how important they are. Please configure them according to the actual application scenario. Like sort-buffer-size, do you really need 100m of memory to cache 10 rows of data?

Give you a basic MySQL configuration

As mentioned earlier, MySQL is so configurable that it seems to take a lot of time to configure, but in fact, the default values for most configurations are already the best, and it's best not to change too many configurations easily, and you don't even need to know that some configurations exist. Here is the smallest sample configuration file that can be used as a starting point for the server configuration file, with some configuration items required. What is the purpose of this section for you to analyze each configuration in detail? Why configure it? How to determine the appropriate value?

[mysql] # CLIENT # port = 3306socket = / var/lib/mysql/mysql.sock [mysqld] # GENERAL # user = mysqlport = 3306default-storage-engine = InnoDBsocket = / var/lib/mysql/mysql.sockpid-file = / var/lib/mysql/mysql.pid# DATA STORAGE # datadir = / var/lib/mysql/# MyISAM # key-buffer-size = 32Mmyisam-recover = FORCE BACKUP# SAFETY # max-allowed-packet = 16Mmax-connect-errors = 1000000 hours BINARY LOGGING # log-bin = / var/lib/mysql/mysql-binexpire-logs-days = 14sync-binlog = "LOGGING # log-error = / var/lib/mysql/mysql-error.loglog-queries-not-using- Indexes = 1slow-query-log = 1slow-query-log-file = / var/lib/mysql/mysql-slow.log# CACHES AND LIMITS # tmp-table-size = 32Mmax-heap-table-size = 32Mquery-cache-type = 0query-cache-size = 0max-connections = 500thread-cache-size = 50open-files-limit = 65535table-definition-cache = 4096table-open-cache = 1024 INNODB # innodb-flush-method = O_DIRECTinnodb-log-files-in-group = 2innodb-log-file-size = 256Minnodb-flush-log-at-trx-commit = 1innodb-file-per-table = 1innodb-buffer-pool-size = 12G segments

The format of the MySQL configuration file is centralized, which is usually divided into several parts, and can provide configuration for multiple programs, such as [client], [mysqld], [mysql], and so on. A MySQL program usually reads a segmented part with the same name.

[client] client default setting content

[mysql] default settings when logging in to the mysql database using the mysql command

[mysqld] default settings for the database itself

For example, the server mysqld usually reads the relevant configuration items under the [mysqld] section. If the configuration item is not located correctly, the configuration will not take effect.

GENERAL

First create a user mysql to run the mysqld process, and make sure that the user has permission to manipulate the data directory. Set the default port to 3306, which may be modified for security. The Innodb storage engine is selected by default, which is the best choice in most cases. However, if the default is InnoDB, but you need to use the MyISAM storage engine, configure it explicitly. Many users think that their databases use one storage engine but actually use another because of the default configuration.

Then set the location of the data file, where the pid file and socket file on the same location, of course, you can also choose another location, but it is important to be careful not to put the socket file and pid file on the default location of MySQL compilation, because different versions of MySQL, the default path of the two files may be inconsistent, it is best to clearly set the location of these files to avoid problems when the version upgrade.

In a UNIX-like system, the local connection to MySQL can be based on UNIX domain sockets, which requires a socket file, that is, the mysql.sock file in the configuration. When the MySQL instance starts, it writes its own process ID to a file, which is the pid file. The file can be controlled by the parameter pid-file and is located in the database directory by default. The file name is hostname. pid

DATA STORAGE

Datadir is used to configure the storage location of data files, and there is nothing to say.

Allocate memory for cache

Next, there are many configuration items involving caching, and the most direct factor in the size of the cache setting must be the size of the server memory. If the server runs only MySQL, all memory that does not need to be reserved for OS and query processing can be used in the MySQL cache. Allocating more memory to MySQL cache can effectively avoid disk access and improve database performance. The most important cache in most cases:

InnoDB buffer pool

Operating system caching of InnoDB log files and MyISAM data (MyISAM relies on OS caching data)

MyISAM key cache

Query cache

Caches that cannot be configured, such as bin-log or OS cache for table definition files

There are other caches, but they don't usually use too much memory. With regard to query caching, as described in the previous article (see the first article in this series), we do not recommend enabling query caching in most cases, so query-cache-type=0 in the above configuration means that query caching is disabled and the corresponding query cache size query-cache-size=0. Apart from the query cache, the rest of the cache related to InnoDB and MyISAM will be described in more detail next.

If only a single storage engine is used, it is much easier to configure the server. If you use only the MyISAM table, you can turn off InnoDB completely, while if you use only InnoDB, you only need to allocate the least resources to MyISAM (MySQL internal system tables use the MyISAM engine). However, if you use a mixture of various storage engines, it is difficult to find the right balance between them, so you can only make a guess based on the business, and then make adjustments after observing the health of the server in operation.

MyISAMkey-buffer-size

Key-buffer-size is used to configure the MyISAM key cache size. There is only one key cache by default, but multiple can be created. MyISAM itself only caches indexes and does not cache data (relying on OS to cache data). If most of the tables are MyISAM, you should set up more memory for key caching. But how do you determine how big the setting is?

Assuming that the index size of the table in the entire database is X, it is certainly not necessary to set the cache larger than X, so the current index size becomes an important basis for this configuration item. You can query the size of the current index in two ways:

/ / 1. Query SELECT SUM (INDEX_LENGTH) FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE = 'MYISAM'// 2. Statistical index file size $du-sch `find / path/to/mysql/data/directory/-name "* .MYI" `for example: root@dev-msc3:# du-sch `find / var/lib/mysql-name "* .MYI" `72K / var/lib/mysql/static/t_global_region.MYI40K / var/lib/mysql/mysql/db.MYI12K / var/lib/mysql/mysql/proxies_priv.MYI12K / var/lib/mysql/mysql/tables_priv.MYI4.0K / var/lib/mysql/mysql/func.MYI4.0K / var/lib/mysql/mysql/columns_priv.MYI4.0K / var/lib/mysql/mysql/proc.MYI4.0K / var/lib/mysql/mysql/event.MYI4.0K / var/lib/mysql/mysql/user.MYI4.0K / var/lib/mysql/mysql/procs_priv.MYI4 .0K / var/lib/mysql/mysql/ndb_binlog_index.MYI164K total

You may ask, the newly created database, there is no data at all, the index file size is 0, how to configure the key cache size? This can only be based on experience: no more than 25%-50% of the memory reserved for the operating system cache. Set a basic value, and after running for a period of time, adjust the key cache size according to the operation. To sum up, the index size is less than the 25% or 50% of the OS cache. Of course, you can also calculate the usage of key cache. if you haven't used up all the key caches after a period of time, you can reduce the buffer a little bit, and the utilization of the cache can be calculated by the following formula:

/ / the value of key_blocks_unused can be obtained through SHOW STATUS / / the value of key_cache_block_size can be obtained through SHOW VARIABLES (key_blocks_unused * key_cache_block_size) / key_buffer_size

The key cache block size is an important value because it affects the interaction between MyISAM, OS cache, and file systems. If the cache block is too small, you may encounter reading at write time (OS must read some data from disk before writing data). You can check the relevant knowledge about reading when writing.

Here's one more thing about cache hit rates. What is the significance of cache hit rates? In fact, this number does not have much effect. For example, the gap between 99% and 99.9% looks small, but it actually represents a 10-fold gap. The practical significance of cache hit rate also has a lot to do with applications. Some applications can work well with a hit rate of 99%, while some Imax O-intensive applications may require 99.99%. So empirically, the number of misses per second is actually more useful. For example, 5 misses per second may not cause IO to be busy, but 100 cache misses per second may be problematic.

The number of misses per second cached by MyISAM key can be monitored by the following command:

# calculate the increment of cache misses every 10s # when using this command, please bring the user and password parameters: mysqladmin-uroot-pxxx extended-status-r-I 10 | grep Key_reads$ mysqladmin extended-status-r-I 10 | grep Key_reads

Finally, even if you don't use any MyISAM tables, you still need to set key-buffer-size to a small value, such as 32m, because MyISAM tables are used internally in MySQL, such as GROUP BY statements that may create MyISAM temporary tables.

Myisam-recover

The myisam-recover option is used to configure how MyISAM finds and fixes errors. Turning on this option tells MySQL that when you open the table, check the table for corruption and fix it when a problem is found, which can be set to the following values:

DEFAULT: indicates that it is not set and will attempt to repair a table that crashes or is not completely closed, but no other actions will be performed when the data is recovered

BACKUP: back up the data file to a .bak file for later inspection

FORCE: let the restore action continue even if more than 1 line of data is lost in the .myd file

QUICK: skip recovery unless there is a deleted block

You can set multiple values, each separated by a comma. For example, BACKUP,FORCE in the configuration file forces recovery and creates backups. This configuration is useful when there are only small MyISAM tables, because it is very dangerous for the server to run corrupted MyISAM tables, which can sometimes lead to more data corruption or even server crashes. However, if there is a large table, it will cause the server to check and repair all MyISAM tables when it is opened, which may take a lot of time to check and repair, and during this time, MySQL will prevent the connection from doing anything else, which is obviously impractical.

Therefore, when using the InnoDB storage engine by default, when there are only very small MyISAM tables in the database, you only need to configure key-buffe-size to a very small value (32m) and myisam-recover=BACKUP,FORCE. When most of the tables in the database are MyISAM tables, please configure key-buffer-size reasonably according to the above formula, while myisam-recover can be turned off, and use CHECK TABLES and REPAIR TABLES commands to check and repair after startup, which has less impact on the server.

SAFETY

After the basic configuration settings are in place, MySQL is relatively secure. Here are only two configuration items that need to be paid attention to. If you need to enable some settings to make the server more secure and reliable, you can refer to the MySQL official manual, but it should be noted that some of their options may affect performance. After all, there is a price to pay to ensure security and reliability.

Max-allowed-packet

Max-allowed-packet prevents the server from sending too large packets and controls how many packets the server can receive. The default value is 4m, which may be small. If the setting is too small, sometimes there will be a problem with replication, as shown by the inability to receive replication data from the master library. If there are Blob or Text fields in the table and the amount of data is large, be careful. If the amount of data exceeds the size of this variable, they may be truncated or set to NULL. It is recommended to set it to 16m.

Max-connect-errors

This variable is a security-related counter value in MySQL, which mainly prevents the client from violently cracking passwords. If a client fails to connect to the MySQL server more than n times, MySQL unconditionally forces the client connection to be blocked until the host cache is flushed again or the MySQL server is restarted.

This value defaults to 10, which is too small. Sometimes network problems or application configuration errors lead to continuous attempts to reconnect to the server in a short period of time, and the client will be blacklisted and unable to connect. If you are in an intranet environment, you can confirm that there is no security problem. You can set this value a little higher, and the default value is too easy to cause problems.

LOGGING

Next, let's take a look at the configuration of logs. For MySQL, slow logging and bin-log are two very important kinds of logs. The former can help applications monitor performance problems, while the latter plays a very important role in data synchronization, backup, and so on.

With regard to the three configurations of bin-log, log-bin is used to configure the file storage path, and expire_logs_days allows the server to clean up the old logs after a specified number of days, that is, the configuration retains the logs for the most recent days. Unless an OPS manually backs up and cleans up the bin-log, it is strongly recommended to turn on this configuration. If it is not enabled, the server space will eventually be exhausted, causing the server to get stuck or crash.

Sync-binlog

Sync-binlog controls whether MySQL flushes the bin-log to disk when the transaction is committed. If its value is equal to 0 or greater than 1, MySQL will not flush bin-log to disk after the transaction is committed, which has the highest performance, but there is also the greatest risk, because if the system crashes, bin-log will be lost. When the value is equal to 1, it is the safest, even if the system crashes, at most the unfinished transaction will be lost, which has no substantial impact on the actual data, but the performance is poor.

It is important to note that in versions prior to 5.7.7, the default value for this choice was 0, while the later default value was 1, which is the safest policy. For high concurrency performance, you need to pay attention to this to prevent performance problems after version upgrades.

There is not much to say about the remaining four configuration items.

Log-error: the directory used to configure the error log

Slow-query-log: turn on slow log and turn it off by default

Slow-query-log-file: configure the directory where slow logs are stored

Log-queries-not-using-indexes: if the sql does not use an index, it will be written to the slow log, but whether it is really slow, it needs to be distinguished and turned off by default.

CACHES AND LIMITStmp-table-size & & max-heap-table-size

These two configurations control how much memory can be used by memory temporary tables that use the Memory engine. If the size of the implicit temporary table exceeds these two values, it will be converted to a disk MyISAM table (the implicit temporary table is created by the server, and the user saves the intermediate results of the query in progress).

If the query statement does not create a large temporary table (avoided by reasonable indexing and query design), you can make this value larger to avoid the need to convert memory temporary tables to disk temporary tables. But beware of setting this value too high, if the query does create a large temporary table, then it is better to use disk, after all, the amount of memory required will increase dramatically.

You should simply set the two variables to the same value. 32m is selected here. You can guide your setting by carefully examining the created-tmp-disk-tables and created-tmp-tables variables. The values of these two variables will show how often temporary tables are created.

Query-cache-type & & query-cache-size

Look ahead

Max-connections

It is used to set the maximum number of connections for users to ensure that the server will not be overwhelmed by the proliferation of connections to the application. If there is a problem with the application, or if the server encounters a connection delay, many new connections will be created. But if these connections cannot execute queries, it is no good to open a connection, so being rejected by "too many connections" errors is a quick and inexpensive way to fail.

If server resources permit, the max-connections can be set large enough to accommodate the load that might normally be reached. If you think there will be 300 or more connections under normal circumstances, you can set it to 500 or more (for peak periods). The default value is 100, which is too small, which is set to 500 here, but it does not mean that it is a reasonable value. You should monitor how many connections there are in the application, and then set it according to the monitoring value (observe the changes of max_used_connections over time).

Thread-cache-size

The thread cache holds threads that are not currently associated with the connection but are ready to serve later new connections. When a new connection is created, if a thread exists in the cache, MySQL removes a thread from the cache and assigns it to the new connection. When the connection is closed, MySQL puts the thread back into the cache if there is still room in the thread cache. If there is no space, MySQL destroys the thread. As long as MySQL has free threads in the cache, it can respond quickly to connection requests, because there is no need to create new threads for each connection. Thread-cache-size specifies the number of threads that MySQL can hold in the cache. If the server does not have a lot of connection requests, you generally do not need to configure this value.

How to determine how big this value should be set?

Observe the threads-connected variable. If threads-connected is between 100 and 120, then thread-cache-size is set to 20. If it stays at 500-700200, the thread cache should be large enough. It can be understood that when there are 700 connections at the same time, there may be no threads in the cache. At 500 connections, 200 cached threads are ready to be used when the load increases again to 700 connections.

Open-files-limit

We set it as large as possible on Uinux-like systems. In modern OS, the overhead of opening handles is very small, and if this parameter is set too small, you may encounter a "too many open files too many files" error.

Tablecachesize

Table caching is similar to thread caching, but the stored object is the table, which contains the parsing results of the table .frm file and some other data. To be exact, the cached data depends on the storage engine, for example, for MyISAM, caching the data of the table and the file descriptor of the index. Table caching is much less important to InnoDB's storage engine because InnoDB doesn't rely on it to do so much.

Since version 5. 1, table caching has been divided into two parts: open table cache and define table cache, configured by table-open-cache-size and table-definition-cache-size variables, respectively. You can usually set table-definition-cache-size high enough to cache all table definitions because most storage engines benefit from table-definition-cache.

INNODB

InnoDB should be the most widely used storage engine, and the most important configuration options are the following two: innodb-buffer-pool-size and innodb-log-file-size. Solving these two configurations basically solves most of the configuration problems in real scenarios.

Innodb-buffer-pool-size

If most of them are InnoDB tables, then the InnoDB buffer pool probably needs memory more than anything else, and the data buffered by the InnoDB buffer pool: indexes, row data, adaptive hash indexes, insert buffers, locks, and other internal data structures. InnoDB also uses buffer pools to help delay writes so that multiple writes can be merged and then written sequentially together to improve performance. In summary, InnoDB relies heavily on buffer pools and must allocate enough memory for it.

Of course, if the amount of data is small and does not grow rapidly, there is no need to allocate too much memory to the buffer pool, and it doesn't really make sense to configure the buffer pool to be much larger than the tables and indexes that need to be cached. Large buffer pools can also pose challenges, for example, warm-up and shutdown can take a long time. If there are many dirty pages in the buffer pool, it may take a long time to write the dirty pages back to the data file when InnoDB is turned off. Although it can be shut down quickly, more recovery work needs to be done at startup, which means we can't accelerate both shutdown and restart at the same time. When there is a large buffer pool, restarting the service takes a long time (hours or days) to warm up, especially if the disk is slow, if you want to speed up the warm-up time, you can do a full table scan or index scan immediately after restart. load the index into the buffer pool.

You can see that this value is configured to 12G in the sample configuration file, which is not a standard configuration and needs to be estimated based on the specific hardware. How do you estimate that?

In the previous section, we mentioned that there are five most important caches in MySQL, which can be simply calculated using the following formula:

InnoDB buffer pool = total server memory-OS reservation-memory occupied by other applications on the server-memory needed by MySQL itself-memory occupied by InnoDB log files-other memory (MyISAM key cache, query cache, etc.)

Specifically, you need to reserve at least 1mm 2G memory for OS. If the machine memory is large, you can reserve more memory. It is recommended that 2GB and 5% of the total memory be used as the benchmark, whichever is larger. If there are still some memory-intensive tasks running on the machine, such as backup tasks, you can reserve more memory for OS. Don't add any memory to the OS cache, because OS usually uses all the remaining memory for file caching.

In general, servers running MySQL rarely run other applications, but if any, reserve enough memory for those applications.

MySQL itself requires some memory to run, but it's usually not too large. You need to consider the amount of memory needed for each connection in MySQL. Although each connection requires very little memory, it also requires a basic amount of memory to execute any given query, and temporary table memory needs to be allocated for sorting, GROUP BY, and so on during the query process, so you need to reserve enough memory for a large number of queries to be executed at peak times. How big is this memory? It can only be monitored during operation.

If most of the tables are InnoDB, a very small value for the MyISAM key cache is sufficient, and it is recommended that the query cache be turned off.

All that's left in the formula is the InnoDB log file, which is what we're going to talk about next.

Innodb-log-file-size & & innodb-log-files-in-group

If there are a large number of writes to the InnoDB data table, choosing the appropriate innodb-log-file-size value is important to improve MySQL performance. InnoDB uses logs to reduce the overhead of committing transactions. Transactions are recorded in the log, so there is no need to flush the dirty blocks of the buffer pool (pages in the cache that are inconsistent with the data on disk) to disk when each transaction is committed. Transactional modified data and indexes are usually mapped to random locations in the tablespace, so refreshing these changes to disk requires a lot of random Ihand O. Once the log is safely written to disk, the transaction is persisted, and even if the change has not been written to the data file, InnoDB can replay the log and restore committed transactions in the event of unexpected events (such as a power outage).

InnoDB uses a background thread to intelligently refresh these changes to the data file. In fact, the transaction log converts the random Imax O of the data file into the almost sequential log file and data file Imax O, so that the refresh operation can be completed more quickly in the background, and the cache Imax O pressure.

The overall log file size is controlled by two parameters, innodb-log-file-size and innodb-log-files-in-group, which are very important for write performance. The total size of the log file is the sum of the size of each file. By default, there are only two 5m files, a total of 10m, which is too small for high-performance work, requiring at least a few hundred megabytes or G log files. Pay attention to the parameter innodb-log-files-in-group here, which controls the number of log files. From the name, it looks like configuring a log group has several files. In fact, loggroup represents a collection of files for redo logs. There are no parameters and there is no need to configure how many log groups.

To change the size of the log file, you need to completely shut down MySQL, then migrate the old log file somewhere else, reconfigure the parameters, and then restart. When restarting, you need to migrate the old log back, and then wait for MySQL to recover the data before deleting the old log file. Be sure to check the error log and confirm that the MySQL restart is successful before deleting the old log file.

To determine the ideal log file size, you need to weigh the cost of normal data changes and the time it takes to recover in the event of a crash. If the log is too small, InnoDB will have to do more checkpoints, resulting in more log writes, and in rare cases, write statements will be dragged down, waiting for changes to be flushed to the data file before the log has no room to continue writing. On the other hand, if the log is too large, it takes a lot of work to recover in the event of a crash, which may increase the recovery time. InnoDB uses the checkpoint mechanism to refresh and recover data, which speeds up the time to recover data. For more information, please see:

MySQL-checkpoint technology

How InnoDB performs a checkpoint

Innodb-flush-log-at-trx-commit

I discussed a lot of caching earlier, and InnoDB logs also have caches. When InnoDB changes any data, a change record is written to the log cache. When buffering is slow, when a transaction commits, or every second, InnoDB flushes the log of the buffer to the log file on disk. If there are large transactions, increasing the size of the log buffer can help reduce the size of the log buffer, and the variable innodb-log-buffer-size controls the size of the log buffer. Usually, there is no need to set the log buffer to be very large. After all, any of the above three conditions will flush the contents of the buffer to disk, so there is certainly not too much data in the buffer. There are a lot of BLOB records in and out of your data. Generally speaking, you can configure 1M~8M.

Since there is a buffer, how to flush the log buffer is what we need to pay attention to. The log buffer must be flushed to disk to ensure that committed transactions are fully persisted. If you care more about performance than persistence, you can modify the innodb-flush-log-at-trx-commit variable to control how often the log buffer is refreshed.

0: write the log buffer to the log file and flush it to disk every 1 second, and no processing is done when the transaction is committed

1: each time a transaction commits, the log buffer is written to the log file and flushed to disk

2: write the log buffer to the log file each time the transaction is committed, and then refresh it to disk every second

1 is the safest setting to ensure that no committed transactions are lost, which is also the default setting. The main difference between 0 and 2 is that if the MySQL fails, 2 will not lose the transaction, but 0 is possible, 2 will at least flush the log buffer to the operating system cache each time the transaction commits, while 0 will not. If the entire server is down or power is off, some transactions may still be lost.

Innodb-flush-method

What kind of strategy is used to refresh, and when to refresh the log or data? how exactly does InnoDB refresh the data? Use the innodb-flush-method option to configure how InnoDB interacts with the file system. From the name point of view, you will think that it can only affect how InnoDB writes the data, in fact, it also affects how InnoDB reads the data. The values of this option under windows and non-Windows operating systems are mutually exclusive, that is, some values can only be used under Windows and some can only be used under non-Windows. Available values under Windows: async_unbuffered, unbuffered, normal, Nosync and littlesync, and non-Windows values: fdatasync, 0_DIRECT, 0_DSYNC.

This option affects both log and data files, and sometimes different types of files are treated differently, making this option somewhat difficult to understand. It would be better if there is an option to configure the log file and an option to configure the data file, but they are actually mixed in the same configuration item. Only options for Unix-like operating systems are described here.

Fdatasync

InnoDB calls the fsync () and fdatasync () functions to refresh the data and log files, where fdatasync () only brushes the data of the file, but does not contain metadata (for example, system data describing the characteristics of the file, such as access rights, file owner, last modification time, etc.), so fsync () will generate more fsync O than fdatasync (), but in some scenarios fdatasync () will cause data corruption. So the InnoDB developer decided to use fsync () instead of fdatasync ().

The disadvantage of fsync () is that the operating system buffers some data in its own cache. In theory, double buffering is wasteful because InnoDB manages buffering itself and is smarter than the operating system. But double buffering is not necessarily a bad thing if the file system can have smarter Icano scheduling and batch operations:

Some file systems and os can accumulate writes and then merge them to improve efficiency by reordering Icano, or write to multiple devices concurrently.

Some can also do pre-read optimization, such as continuously requesting several sequential blocks, which will inform the hard disk to pre-read the next block.

These optimizations only work in specific scenarios, and fdatasync is the default value for innodb-flush-method.

0_DIRCET

This setting does not affect log files and is not valid on all Unix-like systems, but at least it is supported in Linux, FreeBSD, and Solaris. This setting still uses fsync to flush files to disk, but it completely turns off the operating system cache, and all reads and writes are set directly through the storage setting, avoiding double buffering. If the storage device supports write buffering or pre-reading, then this option does not affect the device's settings, such as raid cards.

0_DSYNC

This option synchronizes all writes, that is, only writes are returned after the data is written to disk, but it only affects the log files, not the data files.

After talking about the role of each configuration, there are some suggestions: if you use a Unix-like operating system and the RAID controller has a battery-protected write cache, it is recommended to use 0DIRECT, if not, the default or 0DIRECT may be the best choice.

Innodb-file-per-table

Finally, let's talk about InnoDB tablespaces. InnoDB stores data in tablespaces, which is essentially a virtual file system made up of one or more disk files. InnoDB tablespaces not only store tables and indexes, but also hold rollback logs, insert buffers, double write buffers, and other internal data structures, in addition to many other functions. You can customize tablespace files through innodb-data-file-path configuration items, and innodb-data-home-dir configures the location of tablespace files, such as:

Innodb-data-home-dir = / var/lib/mysql innodb-data-file-path = ibdata1:1G;ibdata2:1G;ibdata3:1G

Here, 3G tablespaces are created in three files, and in order to allow the tablespaces to grow when the allocated space is exceeded, you can configure the last file to expand automatically like this.

Innodb-data-file-path = ibdata1:1G;ibdata2:1G;ibdata3:1G:autoextend

The innodb-file-per-table option allows InnoDB to use one file for each table, which makes it much easier to reclaim space when deleting a table, and it is particularly easy to manage, and you can determine the table size by looking at the file size, so it is recommended that you open this configuration.

Summary

There are so many configuration items in MySQL that there is no way to enumerate them one by one. It is important to understand how each configuration works and, starting with a basic configuration file, set basic options that match the server's hardware and software environment and workload.

references

Baron Scbwartz et al.; Ning Haiyuan Zhou Zhenxing et al.; High performance MySQL (third Edition); Electronic Industry Press, 2013

This article has been synchronously updated to the official account of Wechat: understatement CODE > > I have to tell you the principle of MySQL optimization 3

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.