Why do we recommend using an auto-increment column as the primary key?

2025-11-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Part one: characteristics of InnoDB tables.

1. An InnoDB table is an index-organized table (IOT) built on a B+ tree.

About the B+ tree

Characteristics of a B+ tree:

(1) All keys appear in the linked list of leaf nodes (a dense index), and the keys in that list are in sorted order.

(2) A lookup can never end at a non-leaf node.

(3) The non-leaf nodes act as an index (a sparse index) over the leaf nodes, and the leaf nodes are the data layer that stores the data for each key.

2. If a primary key (PRIMARY KEY) is defined, InnoDB uses it as the clustered index. If no primary key is explicitly defined, InnoDB picks the first unique index that contains no nullable columns as the clustered index. If there is no such unique index either, InnoDB uses a built-in 6-byte ROWID as an implicit clustered index (the ROWID increases as row records are written; unlike Oracle's ROWID it is hidden and cannot be referenced).

3. The data records themselves are stored in the leaf nodes of the primary index (a B+ tree). This requires the records within one leaf node (whose size is one memory page or disk page) to be stored in primary key order. So whenever a new record is inserted, MySQL places it in the appropriate node and position according to its primary key, and if the page reaches the load factor (15/16 by default in InnoDB), a new page (node) is opened.

4. If the table uses an auto-increment primary key, each new record is appended right after the last record of the current index node, and when a page fills up a new page is opened automatically.

5. If the primary key is not auto-increment (for example an ID number or a student number), the value of each inserted key is more or less random, so each new record lands somewhere in the middle of an existing index page. MySQL then has to move data to fit the new record into its position, and the target page may already have been written back to disk and evicted from the cache, so it must be read back from disk first. This adds a lot of overhead. The frequent data movement and page splits also cause heavy fragmentation and leave the index structure less compact, so that eventually OPTIMIZE TABLE is needed to rebuild the table and repack the pages.
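The clustered-index selection rules from point 2 can be checked directly. The following is a minimal sketch, assuming MySQL 8.0 with default settings (in 5.7 the views are named INNODB_SYS_TABLES and INNODB_SYS_INDEXES) and an account allowed to read the InnoDB information_schema views. The schema name demo and the table names are hypothetical.

```sql
CREATE DATABASE IF NOT EXISTS demo;
USE demo;

-- Case 1: explicit primary key, which becomes the clustered index
CREATE TABLE t_pk (
  id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  v  INT
) ENGINE=InnoDB;

-- Case 2: no primary key, but a unique index on a NOT NULL column
CREATE TABLE t_uk (
  code INT NOT NULL,
  v    INT,
  UNIQUE KEY uk_code (code)
) ENGINE=InnoDB;

-- Case 3: neither a primary key nor a usable unique index
CREATE TABLE t_none (
  v INT
) ENGINE=InnoDB;

-- List the indexes InnoDB actually created for each table
SELECT t.NAME AS table_name, i.NAME AS index_name, i.TYPE
FROM information_schema.INNODB_INDEXES AS i
JOIN information_schema.INNODB_TABLES  AS t ON t.TABLE_ID = i.TABLE_ID
WHERE t.NAME LIKE 'demo/%'
ORDER BY t.NAME, i.INDEX_ID;
```

With default settings, t_pk should list PRIMARY, t_uk should list uk_code, and t_none should list a hidden index named GEN_CLUST_INDEX, which is the implicit ROWID-based clustered index.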

To sum up, access is most efficient when the order in which data is written to an InnoDB table matches the order of the leaf nodes of the B+ tree index. That is the case in the following situations:

1. An auto-increment column (INT/BIGINT type) is used as the primary key. Writes arrive in increasing order, which is consistent with the order in which the B+ tree leaf nodes split.

2. The table specifies no auto-increment primary key and has no unique index that can serve as the primary key (the condition described above). InnoDB then uses the built-in ROWID as the primary key, and the write order matches the order in which the ROWID grows.

In addition, if an InnoDB table has no explicit primary key but does have a unique index that can serve as the primary key, and the values of that index are not inserted in increasing order (for example a string, a UUID, or a multi-column unique index), access to the table will be less efficient.
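To make the contrast concrete, here is a small sketch of the two key designs discussed above, together with the OPTIMIZE TABLE rebuild mentioned earlier. The table and column names are hypothetical, and the statistics in information_schema.TABLES are estimates that may be cached, so treat the numbers as a rough indication only.

```sql
-- Design A: the clustered index is keyed by a value that does not follow
-- insertion order, so new rows land in the middle of existing pages
CREATE TABLE orders_uuid (
  id      CHAR(36)     NOT NULL PRIMARY KEY,
  payload VARCHAR(255)
) ENGINE=InnoDB;

-- Design B: an auto-increment primary key keeps inserts append-only in the
-- clustered index; the UUID stays available through a secondary unique index
CREATE TABLE orders_ai (
  id      BIGINT       NOT NULL AUTO_INCREMENT PRIMARY KEY,
  uuid    CHAR(36)     NOT NULL,
  payload VARCHAR(255),
  UNIQUE KEY uk_uuid (uuid)
) ENGINE=InnoDB;

INSERT INTO orders_uuid (id, payload)   VALUES (UUID(), 'a'), (UUID(), 'b');
INSERT INTO orders_ai   (uuid, payload) VALUES (UUID(), 'a'), (UUID(), 'b');

-- Rough view of space used and space left free inside each table
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('orders_uuid', 'orders_ai');

-- Rebuild a fragmented table; for InnoDB this is carried out as a table
-- recreate followed by an analyze
OPTIMIZE TABLE orders_uuid;
```

Design B does not remove the random inserts entirely, since the secondary index uk_uuid still receives them, but secondary index entries are much smaller than full rows, so fewer pages are affected.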

The points above follow the original text of "High Performance MySQL".

Part two: analysis of the auto-increment lock.

The auto-increment lock is released before the transaction commits, which allows highly concurrent inserts, whereas shared locks and exclusive locks are only released after COMMIT. Values already handed out for an auto-increment column are not rolled back.
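The fact that assigned values are not rolled back is easy to reproduce. A minimal sketch, assuming a fresh hypothetical table t_gap and the default settings auto_increment_increment = 1 and auto_increment_offset = 1:

```sql
CREATE TABLE t_gap (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  v  VARCHAR(10)
) ENGINE=InnoDB;

START TRANSACTION;
INSERT INTO t_gap (v) VALUES ('first');   -- is assigned id 1
ROLLBACK;                                 -- the row is undone, the counter is not

INSERT INTO t_gap (v) VALUES ('second');  -- is assigned id 2, not 1
SELECT id, v FROM t_gap;                  -- a single row: (2, 'second')
```

The gap at id 1 is permanent, which is why auto-increment values should be treated as unique identifiers rather than as a gap-free sequence.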

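"Insert-like" statements fall into the following categories:

Simple inserts: the number of rows to be inserted can be determined before the statement runs.

Bulk inserts: statements for which the number of rows to be inserted is not known in advance, such as INSERT ... SELECT and REPLACE ... SELECT.

Mixed-mode inserts: statements that supply values for only some of the auto-increment rows, such as INSERT INTO t1 (c1, c2) VALUES (1, 'a'), (NULL, 'b'), (4, 'c'), (NULL, 'd'); and also INSERT ... ON DUPLICATE KEY UPDATE, where the update part is executed when any key is duplicated.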
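The categories can be illustrated with one statement each. This is a sketch on hypothetical tables t1 and t_src. The counter of t1 is deliberately started at 101 so that the explicit values 1 and 4 in the mixed-mode statement cannot collide with generated ones. Which values the NULL rows receive depends on innodb_autoinc_lock_mode, so none are asserted here.

```sql
CREATE TABLE t1 (
  c1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  c2 VARCHAR(10)
) ENGINE=InnoDB AUTO_INCREMENT=101;

CREATE TABLE t_src (c2 VARCHAR(10)) ENGINE=InnoDB;
INSERT INTO t_src (c2) VALUES ('x'), ('y');

-- Mixed-mode insert: some rows carry an explicit c1, others ask for a generated one
INSERT INTO t1 (c1, c2) VALUES (1, 'a'), (NULL, 'b'), (4, 'c'), (NULL, 'd');

-- Simple insert: the row count (2) is known before execution
INSERT INTO t1 (c2) VALUES ('e'), ('f');

-- Bulk insert: the row count depends on what the SELECT returns
INSERT INTO t1 (c2) SELECT c2 FROM t_src;

-- INSERT ... ON DUPLICATE KEY UPDATE: c1 = 1 already exists, so the
-- UPDATE part runs instead of inserting a new row
INSERT INTO t1 (c1, c2) VALUES (1, 'z') ON DUPLICATE KEY UPDATE c2 = 'z';

SELECT c1, c2 FROM t1 ORDER BY c1;
```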

The parameter innodb_autoinc_lock_mode has three modes:

0: traditional way

Simple insert: traditional way

Bulk insert: traditional way

For INSERT ... SELECT, other transactions cannot insert while the statement runs, so the IDs it is assigned are consecutive.

The auto-increment lock is released when the SQL statement finishes.

1: the default configuration (up to MySQL 5.7; MySQL 8.0 changed the default to 2)

Simple insert: concurrent

Bulk insert: traditional way

2: the loosest way

All auto-increment allocations are concurrent.

The auto-increment values assigned within one SQL statement may not be consecutive.

Requires row-based binlog.

How mode 1 works:

BULK INSERT:

ACQUIRE AI

INSERT ... SELECT (if the statement runs for a long time, the lock is held for a long time; because the number of rows is not known in advance, the counter is only advanced once the insert has finished, and inserts from other transactions wait in the meantime)

AI=AI+N

RELEASE AI

SIMPLE INSERT: no waiting for the SQL statement to execute

ACQUIRE AI

AI=AI+N

RELEASE AI

How mode 2 works:

BULK INSERT statements can also run concurrently. This does not help single-threaded insertion, but it does help multi-threaded insertion. Auto-increment values may be non-consecutive.

{

ACQUIRE AI LOCK

INSERT ONE REC

AI=AI+1

RELEASE AI LOCK

}

The advantage is that a bulk, time-consuming insert no longer holds the AI auto-increment lock for a long time. Instead it inserts one row (exactly one, whereas a simple insert reserves a known number M of values) and then releases the lock, so other transactions can acquire it and inserts proceed concurrently.

However, although this raises the degree of concurrency, performance does not necessarily improve. When data is imported by a single thread in particular, the lock has to be acquired and released over and over.

For batch inserts, the auto-increment values may become non-consecutive (check with the developers whether that is acceptable).

innodb_autoinc_lock_mode is a read-only variable, so the MySQL instance must be restarted after it is changed.
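A short sketch of how to inspect the setting from a client session. The statements are standard MySQL; the comment about the option file assumes a conventional [mysqld] section.

```sql
-- Current mode: 0 = traditional, 1 = consecutive, 2 = interleaved
SELECT @@innodb_autoinc_lock_mode;

-- Run separately: this is expected to fail with a "read only variable"
-- error, because the mode can only be set at server startup, for example
-- with innodb_autoinc_lock_mode=2 under [mysqld] in the option file
SET GLOBAL innodb_autoinc_lock_mode = 2;

-- Mode 2 is only safe for replication with row-based (or mixed) logging,
-- so check the binlog format before switching
SELECT @@binlog_format;
```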

Creating an auto-increment column:

In a composite index, the auto-increment column must be the first column.

CREATE TABLE jjj (a INT AUTO_INCREMENT, b INT, KEY (a, b)); -- KEY (b, a) is rejected by InnoDB

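Auto-increment counter:

The AUTO_INCREMENT counter is not persisted to disk (it is kept in memory), which makes it fast. This describes MySQL 5.7 and earlier; since MySQL 8.0 the counter is persisted and survives a restart.

When the MySQL server is restarted, the value is recalculated:

SELECT MAX(AUTO_INC_COL) FROM XX, which is resolved by an index lookup rather than a full table scan.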
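The claim that the MAX lookup uses the index can be checked with EXPLAIN. A minimal sketch on a hypothetical table t_restart; the exact EXPLAIN output varies by version, but for an indexed column the Extra column typically reports "Select tables optimized away", meaning the value was read from the end of the index without scanning rows.

```sql
CREATE TABLE t_restart (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  v  VARCHAR(10)
) ENGINE=InnoDB;

INSERT INTO t_restart (v) VALUES ('a'), ('b'), ('c');

-- How the maximum of the auto-increment column is resolved
EXPLAIN SELECT MAX(id) FROM t_restart;

-- The next value the table will hand out appears as AUTO_INCREMENT=...
-- in the table definition
SHOW CREATE TABLE t_restart;
```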

Parameters related to auto-increment:

auto_increment_increment: step value

auto_increment_offset: initial value
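The effect of the two parameters is easiest to see on an empty table. A minimal sketch using session-level settings and a hypothetical table t_step; some MySQL 8.0 releases require extra privileges to change these session variables.

```sql
SET SESSION auto_increment_increment = 10;  -- step between generated values
SET SESSION auto_increment_offset    = 5;   -- starting point of the series

CREATE TABLE t_step (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
) ENGINE=InnoDB;

INSERT INTO t_step VALUES (NULL), (NULL), (NULL);
SELECT id FROM t_step ORDER BY id;          -- 5, 15, 25

-- Restore the defaults for the rest of the session
SET SESSION auto_increment_increment = 1;
SET SESSION auto_increment_offset    = 1;
```

This pair of settings is commonly used so that several writable servers generate non-overlapping key ranges: every server uses the same increment and a different offset.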
