Combing the basic knowledge points of MySQL 12/06 Update SLTechnology News&Howtos

Combing the basic knowledge points of MySQL

2025-12-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly explains the "carding of basic knowledge points of MySQL". The content of the explanation is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought to study and learn "carding of basic knowledge points of MySQL".

Catalogue

I. Index correlation

2. Useful information in EXPLIAN

III. Field types and codes

IV. Summary of SQL sentences

Step on the pit

60,000,000 large tables are modified online

7. Slow query log

8. View sql process and kill process

IX. Thoughts on the performance of some databases

This paper mainly summarizes some common operations in the work, as well as unreasonable operations, and some useful data and information collected during the optimization of slow queries. This paper is suitable for developers with mysql foundation.

I. Index correlation

1. Index cardinality: the cardinality is the number of different values contained in the data column. For example, if a data column contains values 1, 3, 7, 4, 7, 3, its cardinality is 4. The index works best when the cardinality of the index is higher than the number of rows in the data table (that is, the column contains many different values, with few duplicates). If a data column contains many different ages, the index quickly distinguishes the data rows. If a data column is used to record gender (there are only "M" and "F" values), then the index is of little use. If the probability of occurrence of values is almost equal, then no matter which value you search for, you may get half of the data rows. In these cases, it is best not to use an index at all, because when the query optimizer finds that a value appears in a high percentage of the table's data rows, it generally ignores the index and performs a full table scan. The usual percentage line is "30%".

2. Reasons for index failure:

1. For index column operations, the operations include (+, -, *, /,!,%, like'%_' (% in front)

two。 Type error, such as field type varchar,where conditional with number.

3. Apply internal functions to the index, in which case you should establish a function-based index

For example, select * from template t where ROUND (t.logicdb_id) = 1

At this time, ROUND (t.logicdb_id) should be built as the index. Mysql8.0 starts to support functional indexes. 5.7. it can be supported by virtual columns. Previously, only one new ROUND (t.logicdb_id) column can be created and maintained.

4. If the condition has or, it will not be used even if there is a conditional index (which is why it is recommended to use less or). If you want to use or and want the index to be valid, you can only add an index to each column in the or condition.

5. If the column type is a string, be sure to use quotation marks in the data in the condition, otherwise do not use the index

6.B-tree index is null can't walk, is not null can walk, bitmap index is null,is not null can go.

7. The combined index follows the leftmost principle.

Establishment of index

1. The most important thing must be based on the statements frequently queried by the business

2. Try to select the highly differentiated column as the index, and the formula of distinguishing degree is COUNT (DISTINCT col) / COUNT (*). Indicates the rate at which fields are not duplicated. The higher the ratio, the fewer records we scan.

3. If the unique feature in the business is best to establish a unique key, on the one hand, the correctness of the data can be guaranteed, on the other hand, the efficiency of the index can be greatly improved.

Second, the basic usage of useful information in EXPLIAN

1. Desc or explain plus your sql

2. Extended explain plus your sql, and then you can view the actual executed statements through show warnings, which is also very useful. Many times, the actual executed code is the same after different writing methods are analyzed by sql.

Characteristics that improve performanc

1. Index covering index: all the data to be queried can be found on the index. There is no need to go back to the table and display using index in the EXTRA column.

2. ICP feature (Index Condition Pushdown): originally, index is only an access mode of data access. The data obtained by the memory engine through indexing back to the table will be passed to the MySQL server layer for where conditional filtering. When version 5.6 is opened, if some where conditions can use the indexed fields, MySQL server will push this part down to the engine layer, and you can use the where condition of index filtering to filter the data at the storage engine layer. EXTRA displays using index condition. Need to understand the architecture diagram of mysql is divided into server and storage engine layer

3. Index merging (index merge): multiple indexes are conditionally scanned separately, and then their respective results are combined (intersect/union). OR is generally used, and if it is an AND condition, consider establishing a composite index. The index type displayed by EXPLAIN will show that index_merge,EXTRA will show the specific merge algorithm and the index used.

Extra field

1. Using filesort: indicates that MySQL sorts the data using an external index instead of reading according to the index order within the table. The sorting operation that cannot be done by index in MySQL is called "file sorting". In fact, it is not necessarily file sorting, but fast sorting is used internally.

2. Using temporary: temporary tables are used to save intermediate results, and MySQL uses temporary tables when sorting query results. Common in sorting order by and grouping query group by

3. Using index: indicates that the overlay index (Covering Index) is used in the corresponding SELECT operation to avoid accessing the data rows of the table, which is very efficient.

4. The value of the impossible where: WHERE clause is always false and cannot be used to get any tuples

5. Select tables optimized away: optimize the MIN/MAX operation based on the index or optimize the COUNT (*) operation for the MyISAM storage engine without the GROUP BY clause. You do not have to wait until the execution phase to calculate. The optimization is completed in the query execution plan generation phase.

6. Distinct: optimize the distinct operation and stop looking for the same value after finding the first matching meta-ancestor.

When using filesort,using temporary appears, it should be noted that these two items consume a lot of performance. when using group by, although order by is not used, if there is no index, using filesort,using temporary may appear at the same time, because group by is sorted first in the grouping, and if there is no need for sorting, you can add an order by NULL to avoid sorting, so using filesort will be removed and can improve performance a little.

Type field

System: the table has only one row of records (equal to the system table). This is a special case of const type and usually does not appear.

Const: if found in turn by index, const is used to compare primary key indexes or unique indexes. Because only one row of data can be matched, it is very fast. If you place the primary key in the where list, MySQL can convert the query to a constant

Eq_ref: unique index scan, with only one record in the table matching for each index key. Common in primary key or unique index scans

Ref: a non-unique index scan that returns all rows that match a single value. It is essentially an index access, which returns all rows that match a single value, but it may find multiple qualified rows, so it should be a mixture of lookups and scans.

Range: only retrieve a given range of rows, using an index to select rows. The key column shows which index is used, that is, queries such as between, in, etc., appear in your where statements. This range scan index is better than a full table scan, because it only needs to start at one point of indentation and end at another point, without scanning all the indexes.

The difference between index:Full Index Scan, index and ALL is that the index type only traverses the index tree, which is usually faster than ALL because the index file is usually smaller than the data file. (that is, although both ALL and index read the whole table, index is read from the index and ALL is read from the hard disk.)

All:Full Table Scan, traversing the entire table to get matching rows

III. Field types and codes

1. Mysql returns string length: the CHARACTER_LENGTH method (like CHAR_LENGTH) returns the number of characters, and the LENGTH function returns the number of bytes. A Chinese character has three bytes.

2. Varvhar and other fields to establish index length calculation statement: select count (distinct left (test,5)) / count (*) from table; as close to 1 as possible

3. The maximum utf8 of mysql is 3 bytes and does not support emoji emoticons. Only utf8mb4 must be used. You need to configure the client character set to utf8mb4 in the mysql configuration file. The connection string of jdbc does not support the configuration of characterEncoding=utf8mb4. The best way is to specify the initialization sql in the connection pool, for example: hikari connection pool. Other connection pools are similar to spring.datasource.hikari.connection-init-sql=set names utf8mb4. Otherwise, you need to execute set names utf8mb4 before each sql execution.

4. Msyql collation (generally use _ bin and _ genera_ci):

Utf8_genera_ci is not case-sensitive, and ci is an abbreviation for case insensitive, that is, case-insensitive

Utf8_general_cs is case-sensitive, and cs is the abbreviation of case sensitive, that is, case-sensitive. However, sorting rules similar to * * _ genera_cs are no longer supported in MySQL versions, and utf8_bin is directly used instead.

Utf8_bin stores each character in a string in binary data, case-sensitive.

So, also case-sensitive, what's the difference between utf8_general_cs and utf8_bin?

Cs stands for case sensitive, which is case-sensitive; bin means binary, that is, binary coding comparison.

Under the utf8_general_cs collation, even if it is case-sensitive, some Western European characters are not distinguished from Latin characters, such as ä = a, but sometimes they do not need ä = a, which is why there is utf8_bin.

The characteristic of utf8_bin is that it uses the binary encoding of characters for operation, and any different binary coding is different, so under the utf8_bin collation: ä a

5. Specify the encoding type of the initial connection in sql yog using the initialization command of the connection configuration

Fourth, the SQL sentence summarizes the commonly used but easy to forget:

1. If there is a primary key or unique key conflict, do not insert: insert ignore into

2. Update if there is a primary key or unique key conflict. Note that this will affect the increment of self-increment: INSERT INTO room_remarks (room_id,room_remarks) VALUE (1, "sdf") ON DUPLICATE KEY UPDATE room_remarks= "234"

3. Use a new replacement if there is one. If values does not contain self-incrementing columns, the value of the self-incrementing columns will change: REPLACE INTO room_remarks (room_id,room_remarks) VALUE (1, "sdf")

4. Backup form: CREATE TABLE user_info SELECT * FROM user_info

5. Copy table structure: CREATE TABLE user_v2 LIKE user

6. Import: INSERT INTO user_v2 SELECT * FROM user or INSERT INTO user_v2 (id,num) SELECT id,num FROM user from the query statement

7. Update the linked table: UPDATE user a, room b SET a.num=a.num+1 WHERE a.room_id=b.id

8. Delete the linked table: DELETE user FROM user,black WHERE user.id=black.id

Lock related (rarely used for understanding)

1. Shared lock: select id from tb_test where id = 1 lock in share mode

2. Exclusive lock: select id from tb_test where id = 1 for update

Used in optimization:

1. Force the use of an index: select * from table force index (idx_user) limit 2

2. Prohibit using an index: select * from table ignore index (idx_user) limit 2

3. Disable caching (remove the effect of caching during testing): select SQL_NO_CACHE from table limit 2

View statu

1. View the character set SHOW VARIABLES LIKE 'character_set%'

2. View the collation SHOW VARIABLES LIKE 'collation%'

Notes on SQL Writing

1. The parsing order of where statements is from right to left, and the condition is to put where instead of having as far as possible.

2. Use delayed correlation (deferred join) technology to optimize super-paging scenarios, such as limit 10000paging 10. Delayed correlation can avoid returning to the table.

3. The distinct statement consumes a lot of performance and can be optimized by group by

4. try not to have more than three tables.

5. If you want to learn Java engineering, distributed architecture, high concurrency, high performance, simple understanding, micro service architecture, Spring,MyBatis,Netty source code analysis and other technologies, you can add the group: 479499375. There are videos of Ali Daniel LVB explanation technology and Java large-scale Internet technology in the group for free. Welcome to join the group for in-depth exchange and study.

Step on the pit

1. If there is a self-incrementing column, the truncate statement will reset the cardinality of the self-incrementing column to 0. In some scenarios, we should pay great attention to using the self-incrementing column as the business id.

2. The aggregate function will filter automatically. For example, if column an is of type int and all NULL, then SUM (a) returns NULL instead of 0.

3. Mysql cannot use "a=null" to judge the equality of null. In UnKnown,where and having, UnKnown is always regarded as a false,check constraint, and UnKnown will be treated as true. So use "an is null" to deal with

60,000,000 large tables are modified online

When mysql has a large amount of data in the table, if the table structure is modified, the table will be locked and the business request will be blocked. Mysql introduced online updates after 5.6.But in some cases, tables are still locked, so the pt tool (Percona Toolkit) is generally used.

If you add an index to a table:

As follows:

Pt-online-schema-change-- user='root'-- host='localhost'-- ask-pass-- alter "add index idx_user_id (room_id,create_time)" slow query log

Sometimes if the online request times out, you should pay attention to the slow query log, the analysis of the slow query is very simple, first find the location of the slow query log file, and then use mysqldumpslow to analyze. To query slow log information, you can view related variables directly by executing the sql command. The common sql is as follows:

-- check the slow query configuration-- whether the slow_query_log slow query log is enabled-- the value of slow_query_log_file is the recorded slow query log to the file-- long_query_time specifies the slow query threshold-- whether log_queries_not_using_indexes records all queries that do not take advantage of the index SHOW VARIABLES LIKE'% quer%' -- check whether slow queries are in the form of logs or tables SHOW VARIABLES LIKE 'log_output'-- check the number of slow queries

The tool for mysqldumpslow is very simple. I mainly use the following parameters:

-t: limit the number of lines output. I usually take the first ten lines.

-s: according to what to sort the default is the average query time at, I also often use c query times, because the query times are very frequent, but the time is not high, it is necessary to optimize, and t query time, check that statement special card.

-v: output details

Example: mysqldumpslow-v-s t-t 10 mysql_slow.log.2018-11-20-0500

8. View sql process and kill process

If you perform a sql operation, but do not return, you can look at its actual execution status by querying the process list. If the sql is very time-consuming, you can use the kill command to avoid affecting the online process. You can also visually see the current sql execution status by looking at the process list. If the current database load is high, a large number of processes may appear in the process list. It takes a long time to execute. The command is as follows:

-- View the list of processes SHOW PROCESSLIST;-- kills a process kill 183665

If you use sqlyog, there is also a graphical page in the menu bar-tools-display-process list. You can right-click to kill the process on the process list page. As follows:

IX. Thoughts on the performance of some databases

When optimizing the company's slow query log, most of the time you may forget to build an index. This kind of problem is easy to solve, just add an index. But there are two situations that can't be solved simply by adding an index:

1. Business code cycle through the database: consider such a scenario in which ten sql like this are added to the list of users' fans. In fact, it is very simple to query through linked tables, but sometimes, many developers use to take out a string of id, and then cycle through each id, so that if a lot of id puts great pressure on the database, and the performance is very low.

2. Statistical sql: most of the time, there are ranking lists in business. It is found that companies directly use databases for calculation. When doing aggregation operations on some large tables, they often take more than five seconds. These sql are generally very long and difficult to optimize. In such scenarios, if business permits (for example, statistics are not required for high consistency or statistics only after a period of time), you can do statistics from the database. In addition, I suggest using redis cache to handle this kind of business.

3. Extra-large paging: some super-large paged slow queries such as limit 4000Magi 1000 are found in the slow query log. Because mysql paging is done at the server layer, delayed association can be used to reduce back to the table. However, such requests will not appear in the normal business logic of the relevant business code, so it is likely that malicious users are brushing the interface, so it is best to intercept these malicious requests by adding verification to the interface during development.

Thank you for your reading, the above is the content of "carding the basic knowledge points of MySQL". After the study of this article, I believe you have a deeper understanding of the problem of combing the basic knowledge points of MySQL, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.