MySQL: Why InnoDB tables should use an auto-increment column as the primary key

2025-04-06 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article explains why an InnoDB table in MySQL should preferably have an auto-increment column as its primary key. It also covers two related questions: why the "double 1" settings matter for master/slave data consistency, and how the binlog formats differ.

1. Why is it better for InnoDB tables to have an auto-increment column as the primary key?

An InnoDB table is an index-organized table (IOT) built on a B+ tree.

About the B+ tree

(Figure: B+ tree structure; the picture comes from the Internet.)

Characteristics of a B+ tree:

A. All keys appear in the linked list of leaf nodes (a dense index), and the keys in that list are in sorted order.

B. A lookup can never end at a non-leaf node.

C. The non-leaf nodes act as an index (a sparse index) over the leaf nodes, and the leaf nodes form the data layer that stores the keys and their data.

1. If we define a primary key (PRIMARY KEY)

InnoDB uses the primary key as the clustered index. If no primary key is explicitly defined, InnoDB picks the first unique index whose columns are all NOT NULL as the clustered index. If there is no such unique index either, InnoDB uses its built-in 6-byte ROWID as a hidden clustered index. This ROWID increments as rows are written, but unlike Oracle's ROWID it cannot be referenced; it is implicit.
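The selection order described above can be sketched as a small Python function. This is a simplified model for illustration, not InnoDB's actual code: the `Index` and `Table` classes and the `HIDDEN_ROW_ID` placeholder name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Index:
    name: str
    unique: bool
    nullable: bool  # True if any indexed column may hold NULL


@dataclass
class Table:
    primary_key: Optional[str] = None
    # Secondary indexes in the order they were defined (hypothetical model).
    indexes: List[Index] = field(default_factory=list)


def choose_clustered_index(table: Table) -> str:
    """Return the key this toy model would cluster the table on."""
    # Rule 1: an explicit PRIMARY KEY always wins.
    if table.primary_key is not None:
        return table.primary_key
    # Rule 2: otherwise the first unique index with no nullable columns.
    for idx in table.indexes:
        if idx.unique and not idx.nullable:
            return idx.name
    # Rule 3: otherwise the implicit 6-byte row ID (placeholder name).
    return "HIDDEN_ROW_ID"
```

For example, a table with no primary key, a nullable unique index `uk_email`, and a non-nullable unique index `uk_code` would be clustered on `uk_code` under this model.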

2. The data records themselves are stored in the leaf nodes of the primary index (a B+ tree)

This requires the records within one leaf node (the size of a memory page or disk page) to be stored in primary key order. Whenever a new record is inserted, MySQL places it in the appropriate node and position according to its primary key, and when a page reaches the load factor (15/16 by default in InnoDB), a new page (node) is opened.

3. If the table uses an auto-increment primary key

Each new record is appended right after the last record in the current index node, and when a page is full, a new page is opened automatically.

4. If the table uses a non-auto-increment primary key (such as an ID card number or student number)

Because each inserted primary key value is roughly random, each new record has to go somewhere in the middle of an existing index page. MySQL has to move data to put the new record in the right place, and the target page may even have been written back to disk and evicted from the cache, so it must be read back from disk first. This adds a lot of overhead, and the frequent moves and page splits cause heavy fragmentation. The resulting index structure is not compact, and the table eventually has to be rebuilt with OPTIMIZE TABLE to repack the pages.

Summary: access is most efficient when rows are written to an InnoDB table in the same order as the leaf nodes of the B+ tree index. This holds in the following cases:

A. An auto-increment column (INT/BIGINT) is the primary key, so the write order is increasing and matches the order in which B+ tree leaf nodes split.

B. The table has no auto-increment primary key and no unique index that could serve as the primary key (the condition above). InnoDB then uses the built-in ROWID as the primary key, and the write order matches the ROWID growth order.

C. In contrast, if an InnoDB table has no explicit primary key but does have a unique index that can serve as the primary key, and that index does not grow incrementally (for example a string, a UUID, or a multi-column unique index), access to the table will be comparatively inefficient.

Here are the exact words from "High Performance MySQL"

Reference link: https://segmentfault.com/q/1010000003856705

2. Why set the "double 1" to keep master and slave data consistent?

Double 1: innodb_flush_log_at_trx_commit=1 and sync_binlog=1

sync_binlog=N: after every N transaction commits, MySQL issues a disk synchronization call such as fsync to force the data in binlog_cache to disk. With sync_binlog=0 (the default before MySQL 5.7.7), MySQL issues no forced flush at all and leaves it to the operating system. Performance is best in this mode, but so is the risk: if the system crashes, all binlog information still in binlog_cache is lost.

innodb_flush_log_at_trx_commit=1: every transaction commit (or every statement outside a transaction) flushes the log to disk. This is expensive, especially without a battery-backed cache.

innodb_flush_log_at_trx_commit=2: the log is written to the operating system cache rather than flushed to disk on commit, and it is still flushed to disk every second. You would normally lose no more than 1-2 seconds of updates, and only if the operating system goes down.

innodb_flush_log_at_trx_commit=0: somewhat faster but less safe, since transactions can be lost even if only the mysqld process crashes.
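The trade-offs above can be summarized in a small Python model. It is only a sketch of the behavior as described in this section, not of MySQL's implementation: the `BinlogModel` class and the `may_lose_commits` helper are hypothetical names, and the crash kinds `"mysqld"` and `"os"` are labels chosen for this example.

```python
class BinlogModel:
    """Toy model: committed transactions wait in a cache until fsync."""

    def __init__(self, sync_binlog):
        self.sync_binlog = sync_binlog
        self.on_disk = 0    # transactions durably on disk
        self.in_cache = 0   # written but not yet synced

    def commit(self):
        self.in_cache += 1
        # sync_binlog=N forces a sync after every N commits; 0 never forces one.
        if self.sync_binlog > 0 and self.in_cache >= self.sync_binlog:
            self.on_disk += self.in_cache
            self.in_cache = 0

    def os_crash(self):
        """Return how many committed transactions the binlog loses."""
        lost = self.in_cache
        self.in_cache = 0
        return lost


def may_lose_commits(innodb_flush_log_at_trx_commit, crash):
    """crash is 'mysqld' (server process dies) or 'os' (host goes down)."""
    if innodb_flush_log_at_trx_commit == 1:
        return False             # flushed to disk at every commit
    if innodb_flush_log_at_trx_commit == 2:
        return crash == "os"     # only the OS cache copy is at risk
    if innodb_flush_log_at_trx_commit == 0:
        return True              # at risk in either kind of crash
    raise ValueError("expected 0, 1, or 2")


model = BinlogModel(sync_binlog=4)
for _ in range(10):
    model.commit()
lost = model.os_crash()  # 2: commits 9 and 10 were never synced
```

With both settings at 1, neither log can lag behind a committed transaction in this model, which is why the pair is recommended for master/slave consistency.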

3. What binlog formats are there, and how do they differ?

Row, Statement, and Mixed (Row + Statement)

1. Row

The log records how each row of data was modified, and the same rows are then modified on the slave.

Advantages: in row mode, the binlog does not need to record context information about the executed SQL statement; it only records which rows were modified and what they were changed to. The row log therefore shows the details of every row change clearly and is easy to understand. It also avoids the cases where stored procedures, functions, or trigger invocations cannot be replicated correctly.

Disadvantages: in row mode, every executed statement is logged as a series of per-row changes, which can produce a very large volume of log data.

2. Statement

Every SQL statement that modifies data is recorded in the master's binlog. During replication, the slave's SQL thread parses the log into the same statements that ran on the master and executes them again.

Advantages: statement mode addresses the main drawback of row mode. It does not record the change to every row, which reduces binlog volume, saves I/O and storage, and improves performance, because it only needs to record the statements executed on the master plus the context in which each one ran.

Disadvantages: because statement mode records the executed statements, it must also record context information for each statement so that the statement produces the same result on the slave as it did on the master. In addition, MySQL evolves quickly and keeps adding features, which makes replication harder: the more complex the content being replicated, the more likely bugs become. Many cases are known where statement mode causes replication problems, mainly when data is modified with certain functions or features. For example, the sleep() function cannot be replicated correctly in some versions, and using last_insert_id() in a stored procedure may produce different ids on the slave and the master. Because row mode records changes row by row, it does not have these problems.
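The following toy example contrasts the two modes for a statement whose result differs on every execution. It is only an illustration: dictionaries stand in for tables, `insert_with_uuid` is a hypothetical stand-in for a statement such as `INSERT INTO t VALUES (1, UUID())`, and the "logs" are ordinary Python lists.

```python
import uuid


def insert_with_uuid(rows, row_id):
    """Stands in for: INSERT INTO t VALUES (row_id, UUID())."""
    rows[row_id] = str(uuid.uuid4())


master, replica_stmt, replica_row = {}, {}, {}

# The master executes the statement once.
insert_with_uuid(master, 1)

# Statement mode: the log holds the statement and the replica re-executes it,
# so the non-deterministic function produces a different value.
statement_log = [(insert_with_uuid, 1)]
for stmt, row_id in statement_log:
    stmt(replica_stmt, row_id)

# Row mode: the log holds the resulting row and the replica applies it as-is.
row_log = [(1, master[1])]
for row_id, value in row_log:
    replica_row[row_id] = value

print(replica_row == master)   # True: row images copy the data exactly
print(replica_stmt == master)  # False: re-execution generated a new value
```

The row log is larger in general because it carries one entry per changed row, but it does not depend on the replica reproducing the master's execution context.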

3. Mixed

According to the official documentation, earlier versions of MySQL supported only statement-based replication; row-based replication was added in MySQL 5.1.5. Since 5.0, MySQL replication has fixed a large number of cases that older versions could not replicate correctly, but the arrival of stored procedures brought new challenges for MySQL Replication. Also according to the official documentation, starting with version 5.1.8 MySQL provides a third mode besides Statement and Row: Mixed, which is a combination of the first two. In Mixed mode, MySQL chooses the log format, statement or row, for each SQL statement it executes. Statement mode in the newer versions works as before and records only the executed statements. Row mode has been optimized in the newer versions: not every change is logged in row form. Table structure changes, for example, are logged in statement form, while statements that actually modify data, such as UPDATE or DELETE, still have all their row changes logged.
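The per-statement choice made in Mixed mode can be sketched as follows. This is a deliberately simplified illustration, not MySQL's actual decision logic: the `choose_log_format` function is hypothetical, and the pattern covers only a small illustrative subset of functions (`UUID()`, `SYSDATE()`, `LOAD_FILE()`) that are treated as unsafe for statement-based logging.

```python
import re

# Illustrative subset only; the real server applies many more rules.
UNSAFE_FUNCTIONS = re.compile(r"\b(UUID|SYSDATE|LOAD_FILE)\s*\(", re.IGNORECASE)


def choose_log_format(sql: str) -> str:
    """Pick 'ROW' for statements this toy model considers unsafe."""
    if UNSAFE_FUNCTIONS.search(sql):
        return "ROW"
    return "STATEMENT"


print(choose_log_format("UPDATE t SET c = 1 WHERE id = 5"))  # STATEMENT
print(choose_log_format("INSERT INTO t VALUES (UUID())"))    # ROW
print(choose_log_format("ALTER TABLE t ADD COLUMN c INT"))   # STATEMENT
```

The idea is that statement logging is kept wherever re-execution is expected to give the same result, and row logging is used only where it is not.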

Note:

Condition 1: with the binlog format set to mixed, ordinary replication works without problems, but cascading replication can lose binlog events in special cases.

Condition 2: when a large number of rows (about 4 million) are scanned by updates, deletes, or inserts, and there are non-deterministic DML statements (such as: delete from table where data)
