In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-22 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article will explain in detail how to optimize the index configuration in mysql. The editor thinks it is very practical, so I share it for you as a reference. I hope you can get something after reading this article.
2 Index and query optimization
Type of index
General index: this is the most basic index type, and there are no restrictions such as uniqueness.
Unique index: basically the same as a normal index, but all index column values remain unique.
Primary key: the primary key is a unique index, but must be specified as "PRIMARY KEY".
Full-text indexing: MYSQL supports full-text indexing and full-text retrieval since 3.23.23. In MYSQL, the index type of the full-text index is FULLTEXT. A full-text index can be created on a column of type VARCHAR or TEXT.
Most MySQL indexes (PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT) are stored in the B-tree. Indexes of spatial column types use R-trees, and MEMORY tables support hash indexes.
Single-column index and multi-column index (composite index)
An index can be a single-column index or a multi-column index. Using indexes on related columns is one of the best ways to improve the performance of SELECT operations.
Multi-column index:
MySQL can create indexes on multiple columns. An index can contain 15 columns. For some column types, you can index the left prefix of the column, and the order of the column is very important.
A multi-column index can be thought of as an array containing a sort of values created by concatenating the values of an index column. Generally speaking, even the most restrictive single-column index is far less restrictive than multi-column indexes.
Leftmost prefix
A feature of multi-column indexes is the leftmost prefix (Leftmost Prefixing). If a multi-column index is key (firstname lastname age), MySQL will use the multi-column index when the search criteria are the combination and order of the following columns:
Firstname,lastname,age
Firstname,lastname
Firstname
In other words, key (firstname lastname) and key (firstname) are also established.
The index is mainly used for the following operations:
Quickly find lines that match a WHERE clause.
Delete rows. When performing a join, retrieve rows from other tables.
Find the MAX () or MIN () values for the specific indexed column key_col. Optimized by the preprocessor to check that WHERE key_part_# = constant is used for all keyword elements that occur before key_col in the index. In this case, MySQL performs a keyword lookup for each MIN () or MAX () expression and replaces it with a constant. If all expressions are replaced with constants, the query returns immediately. For example:
SELECT MIN (key2), MAX (key2) FROM tb WHERE key1=10
If you sort or group the leftmost prefix of an available keyword (for example, ORDER BY key_part_1,key_part_2), sort or group a table. If all keyword elements are followed by DESC, the keyword is read in reverse order.
In some cases, a query can be optimized so that values can be retrieved without querying data rows. If the query uses only numeric columns from a table that make up the leftmost prefix of some keywords, values can be retrieved from the index tree for faster.
SELECT key_part3 FROM tb WHERE key_part1=1
Sometimes MySQL does not use indexes, even if an index is available. One scenario is when the optimizer estimates that using an index will require MySQL to access most of the rows in the table. In this case, the table scan may be faster. However, if such a query uses LIMIT to search for only part of the rows, MySQL uses the index because it can find several rows faster and return them in the results. For example:
Reasonable suggestions for indexing:
(1) smaller data types are usually better: smaller data types usually require less space in disk, memory, and CPU caches and are faster to process.
(2) simple data types are better: integer data is less expensive to process than characters, because the comparison of strings is more complex. In MySQL, you should use built-in date and time data types instead of strings to store time; and use integer data types to store IP addresses.
(3) avoid NULL as much as possible: you should specify the column NOT NULL unless you want to store NULL. In MySQL, columns with null values are difficult to optimize because they make indexes, index statistics, and comparison operations more complex. You should replace null values with 0, a special value, or an empty string.
This section is about some trivial suggestions and notes that should be paid attention to when indexing and writing SQL statements.
1. Use LIMIT 1 when the result set has only one row of data
two。 Avoid SELECT * and always specify the columns you need
The more data you read from the table, the slower the query becomes. He increases the amount of time the disk needs to operate, or if the database server is separate from the WEB server. You will experience a very long network delay simply because data is transferred unnecessarily between servers.
3. Use joins (JOIN) instead of subqueries (Sub-Queries)
Connect (JOIN).. It is more efficient because MySQL does not need to create temporary tables in memory to complete this logically two-step query.
4. Use ENUM, CHAR instead of VARCHAR, and use a reasonable field attribute length
5. Use NOT NULL whenever possible
6. Fixed length watches will be faster.
7. Split large DELETE or INSERT statements
8. The smaller the column of the query, the faster
Where condition
In the query, the WHERE condition is also an important factor, and it is very important to have as few and reasonable where conditions as possible. When there are multiple conditions, put the condition that will extract as little data as possible first, and reduce the query time of the latter where condition.
Some where conditions cause the index to be invalid:
In the query condition of the where clause! = MySQL will not be able to use the index.
The index will be invalid when the Mysql function is used in the where clause, for example: select * from tb where left (name, 4) = 'xxx'
When using LIKE for search matching, the index is valid: select * from tbl1 where name like 'xxx%', and invalid index when like'% xxx%'
III. Configuration optimization
After installing MySQL, the configuration file my.cnf is located in the / MySQL installation directory / share/mysql directory, which also contains several configuration files for reference, including my-large.cnf, my-huge.cnf and my-medium.cnf,my-small.cnf, respectively corresponding to the configuration of large, medium and small database applications. The .ini file that exists in the MySQL installation directory in the win environment.
The main variables that have a great impact on performance optimization are listed below, mainly divided into connection request variables and buffer variables.
1. Variables for connection requests:
1) max_connections
The maximum number of connections for MySQL, increasing this value increases the number of file descriptors required by mysqld. If the server has a large number of concurrent connection requests, it is recommended to increase this value to increase the number of parallel connections. Of course, this is based on the case that the machine can support, because if the number of connections is more, between MySQL will provide a connection buffer for each connection, the more memory will be spent, so adjust this value appropriately, not blindly increase the setting value.
ERROR 1040: Too many connections errors often occur when the value is too small. You can check the number of connections in the current state through the 'conn%' wildcard to determine the size of this value.
Maximum number of connections in show variables like 'max_connections'
Number of connections to show status like 'max_used_connections' response
As follows:
Mysql > show variables like 'max_connections'
+-+ +
| | Variable_name | Value |
+-+ +
| | max_connections | 256 |
+-+ +
Mysql > show status like 'max%connections'
+-+ +
| | Variable_name | Value |
+-+ +
| | max_used_connections | 256 |
+-+ +
Max_used_connections / max_connections * 100% (ideal value ≈ 85%)
If the max_used_connections is the same as max_connections, then the max_connections setting is too low or exceeds the server load limit, and less than 10% is too large.
2) back_log
The number of connections that MySQL can hold temporarily. This works when the main MySQL thread gets very many connection requests in a very short period of time. If the connection data of MySQL reaches max_connections, the new request will be stored in the stack to wait for a connection to release resources, the number of the stack is back_log, if the number of waiting connections exceeds back_log, connection resources will not be granted.
The back_ log value indicates how many requests can be stored on the stack in a short period of time before MySQL temporarily stops answering new requests. Only if you expect to have many connections in a short period of time, you need to increase it, in other words, this value is the size of the listening queue for incoming TCP/IP connections.
When looking at the list of your host processes (mysql > show full processlist), you can find a large number of 264084 | unauthenticated user | xxx.xxx.xxx.xxx | Connect | NULL | login | NULL processes to be connected, you should increase the value of back_log.
The default value is 50 and can be tuned to 128 for integers with a system setting range of less than 512.
3) interactive_timeout
The number of seconds an interactive connection waits for action before being shut down by the server. An interactive customer is defined as a customer who uses the CLIENT_INTERACTIVE option for mysql_real_connect ().
The default value is 28800 and can be tuned to 7200.
two。 Buffer variable
Global buffering:
4) key_buffer_size
Key_buffer_size specifies the size of the index buffer, which determines the speed of index processing, especially the speed of index reads. By checking the status values Key_read_requests and Key_reads, you can see whether the key_buffer_size setting is reasonable. The key_reads / key_read_requests ratio should be as low as possible, at least 1 SHOW STATUS LIKE 100 and 1 SHOW STATUS LIKE 1000 is better. (the above state values can be obtained using the state value).
Key_buffer_size only works on MyISAM tables. Use this value even if you do not use the MyISAM table, but the internal temporary disk table is the MyISAM table. You can use the check status value created_tmp_disk_tables to learn more.
Examples are as follows:
Mysql > show variables like 'key_buffer_size'
+-+ +
| | Variable_name | Value |
+-+ +
| | key_buffer_size | 536870912 | |
+-+ +
Key_buffer_size is 512MB. Let's take a look at the usage of key_buffer_size:
Mysql > show global status like 'key_read%'
+-+ +
| | Variable_name | Value |
+-+ +
| | Key_read_requests | 27813678764 |
| | Key_reads | 6798830 | |
+-+ +
There are 27813678764 index read requests, of which 6798830 are not found in memory to read the index directly from the hard disk. Calculate the probability that the index misses the cache:
Key_cache_miss_rate = Key_reads / Key_read_requests * 100%, it is better to set it to around 1max 1000.
The default configuration value is 8388600 (8m). The host has 4GB memory and can be tuned to 268435456 (256MB).
5) query_cache_size
Using query buffering, MySQL stores the query results in a buffer, and in the future, for the same SELECT statement (case sensitive), the results will be read directly from the buffer.
You can see whether the query_cache_size setting is reasonable by checking the status value Qcache_*, (the above status values can be obtained using SHOW STATUS LIKE 'Qcache%'). If the value of Qcache_lowmem_prunes is very large, it means that there is often insufficient buffering; if the value of Qcache_hits is also very large, it means that query buffering is used very frequently, and you need to increase the buffer size at this time; if the value of Qcache_hits is not large, it means that your query repetition rate is very low, in this case, using query buffering will affect efficiency, then you can consider not using query buffering. In addition, adding SQL_NO_CACHE to the SELECT statement makes it clear that query buffering is not used.
Other parameters related to query buffering are query_cache_type, query_cache_limit, and query_cache_min_res_unit.
Query_cache_type specifies whether to use query buffering, which can be set to 0, 1, 2, which is a SESSION-level variable.
Query_cache_limit specifies the size of the buffer that can be used by a single query, which defaults to 1m.
Query_cache_min_res_unit, introduced after version 4.1, specifies the minimum unit of buffer space allocated, which defaults to 4K. Check the status value Qcache_free_blocks, which, if very large, indicates that there is a lot of fragmentation in the buffer, which indicates that the query results are relatively small, and you need to reduce query_cache_min_res_unit at this point.
Examples are as follows:
Mysql > show global status like 'qcache%'
+-+ +
| | Variable_name | Value |
+-+ +
| | Qcache_free_blocks | 22756 | |
| | Qcache_free_memory | 76764704 | |
| | Qcache_hits | 213028692 | |
| | Qcache_inserts | 208894227 | |
| | Qcache_lowmem_prunes | 4010916 | |
| | Qcache_not_cached | 13385031 | |
| | Qcache_queries_in_cache | 43560 | |
| | Qcache_total_blocks | 111212 | |
+-+ +
Mysql > show variables like 'query_cache%'
+-+ +
| | Variable_name | Value |
+-+ +
| | query_cache_limit | 2097152 | |
| | query_cache_min_res_unit | 4096 | |
| | query_cache_size | 203423744 | |
| | query_cache_type | ON |
| | query_cache_wlock_invalidate | OFF |
+-+ +
Query cache fragmentation rate = Qcache_free_blocks / Qcache_total_blocks * 100%
If the query cache fragmentation rate exceeds 20%, you can use FLUSH QUERY CACHE to defragment the cache, or try to reduce query_cache_min_res_unit, if your query is a small amount of data.
Query cache utilization = (query_cache_size-Qcache_free_memory) / query_cache_size * 100%
If the query cache utilization is below 25%, it means that the query_cache_size setting is too large and can be reduced appropriately; if the query cache utilization is above 80% and the Qcache_lowmem_prunes is more than 50, the query_cache_size may be a little small, or there may be too many fragments.
Query cache hit ratio = (Qcache_hits-Qcache_inserts) / Qcache_hits * 100%
Example server query cache fragmentation rate = 20.46%, query cache utilization = 62.26%, query cache hit rate = 1.94%, the hit rate is very poor, maybe write operations are more frequent, and there may be some fragments.
Buffer per connection
6) record_buffer_size
Each thread that performs a sequential scan allocates a buffer of this size to each table it scans. If you do a lot of sequential scans, you may want to increase this value.
The default value is 131072 (128K), which can be changed to 16773120 (16m)
7) read_rnd_buffer_size
Random read buffer size. When rows are read in any order (for example, in sort order), a random read cache is allocated. When sorting a query, MySQL scans the buffer first to avoid disk search and improve query speed. If you need to sort a large amount of data, you can increase the value appropriately. However, MySQL allocates this buffer space for each customer connection, so you should try to set this value appropriately to avoid excessive memory overhead.
It can generally be set to 16m
8) sort_buffer_size
Each thread that needs to sort allocates a buffer of that size. Increase this value to speed up ORDER BY or GROUP BY operations.
The default value is 2097144 (2m), which can be changed to 16777208 (16m).
9) join_buffer_size
Buffer size that can be used by federated query operations
Record_buffer_size,read_rnd_buffer_size,sort_buffer_size,join_buffer_size is exclusive to each thread, that is, if there are 100 thread connections, the occupancy is 16M*100
10) table_cache
Size of the table cache. Whenever MySQL accesses a table, if there is space in the table buffer, the table is opened and placed in it, allowing faster access to the table contents. By checking the state values of peak time Open_tables and Opened_tables, you can decide whether you need to increase the value of table_cache. If you find that open_tables equals table_cache and opened_tables is growing, then you need to increase the value of table_cache (the above state values can be obtained using SHOW STATUS LIKE 'Open%tables'). Note that table_cache cannot be set to a large value blindly. If set too high, it may result in insufficient file descriptors, resulting in unstable performance or connection failure.
One gigabyte memory machine, the recommended value is 128mur256. Servers with memory around 4GB this parameter can be set to 256m or 384m.
11) max_heap_table_size
The size of the memory table (memory table) that the user can create. This value is used to calculate the maximum row value of the memory table. This variable supports dynamic change, i.e. set @ max_heap_table_size=#
This variable, along with tmp_table_size, limits the size of internal memory tables. If the size of an internal heap (stacking) table exceeds tmp_table_size,MySQL, you can automatically change the in-memory heap table to a hard disk-based MyISAM table as needed.
12) tmp_table_size
Increase the size of a temporary table by setting the tmp_table_size option, such as a temporary table generated by advanced GROUP BY operations. If you increase this value, MySQL will also increase the size of the heap table, which can improve the speed of the join query. It is recommended to optimize the query as far as possible to ensure that the temporary table generated during the query process is in memory, so as to avoid the generation of MyISAM table based on hard disk due to the temporary table being too large.
Mysql > show global status like 'created_tmp%'
+-+ +
| | Variable_name | Value |
+-+ +
| | Created_tmp_disk_tables | 21197 | |
| | Created_tmp_files | 58 | |
| | Created_tmp_tables | 1771587 | |
+-+ +
Each time a temporary table is created, Created_tmp_tables increases. If the temporary table size exceeds tmp_table_size, the temporary table is created on disk, and the Created_tmp_disk_tables is also increased. Created_tmp_files represents the number of temporary files created by the MySQL service. The ideal configuration is:
Created_tmp_disk_tables / Created_tmp_tables * 100%
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.