What are the knowledge points about the primary key

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces what you need to know about generating primary keys. Many people run into this problem in real projects, so the sections below walk through the common generation strategies and the trade-offs of each.

1. UUID mode

A universally unique identifier (UUID) is generated by a standard method and does not depend on registration with or allocation by a central authority. The probability of generating a duplicate UUID is close to zero and can be ignored. UUID has multiple versions: Version 1 is the time-based UUID, Version 2 the DCE security UUID, Version 3 the name-based UUID using MD5 (UUID.nameUUIDFromBytes() in Java), Version 4 the random UUID (UUID.randomUUID().toString() in Java), and Version 5 the name-based UUID using SHA-1. Versions 1 and 2 suit distributed computing environments and offer a high degree of uniqueness; Versions 3 and 5 suit business scenarios in which the same content must always produce the same UUID; Version 4 is not recommended here, because random values can in principle repeat. The probability of repetition is extremely low, but it should still be taken into account in the design.
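The difference between the versions is easiest to see by generating one of each. The following is a minimal sketch using Python's standard uuid module rather than the Java calls named above; the name "example.com" is only an illustrative input, and Python's standard library has no generator for Version 2.

```python
import uuid

# Version 1: time-based (timestamp plus a node id, typically the MAC address).
u1 = uuid.uuid1()

# Versions 3 and 5: name-based. The same namespace and name always
# produce the same UUID, which is what "same content, same UUID" means.
u3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.com")  # MD5
u5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")  # SHA-1

# Version 4: random.
u4 = uuid.uuid4()

for u in (u1, u3, u4, u5):
    print(u.version, u)

# Name-based UUIDs are reproducible; random ones are not.
same_again = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
print(same_again == u5)
```

Calling uuid3 or uuid5 again with the same arguments returns an equal value, while every call to uuid4 returns a new one.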

Although UUID removes the need to rely on the database to generate primary keys, it has several shortcomings: it takes a lot of storage space; it is generated randomly with no continuity, so it performs poorly as a primary key; records cannot be sorted by primary key to determine their insertion order; it is unfriendly to developers; and if the machine's MAC address is used during generation, there are certain security risks.
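The storage and ordering drawbacks can be made concrete with a few lines. This is a small sketch, again in Python, that compares the size of a UUID with a 64-bit integer key and checks whether sorting random UUIDs recovers insertion order; the 8-byte figure for an integer key is an assumption about a typical BIGINT column.

```python
import uuid

ids = [uuid.uuid4() for _ in range(5)]   # "inserted" in this order

text_size = len(str(ids[0]))     # canonical text form, hyphens included
binary_size = len(ids[0].bytes)  # raw 128-bit value
bigint_size = 8                  # assumed size of a 64-bit integer key

print(text_size, binary_size, bigint_size)

# Sorting by key generally does not give back the insertion order,
# because Version 4 values carry no time component.
print(sorted(ids) == ids)
```

Stored as text a UUID takes 36 characters and as binary 16 bytes, so every secondary index that carries the primary key grows accordingly.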

2. Step mode

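This is Flickr's sharded primary key generation scheme. Several database servers are used, and each one is configured with a different starting value and the same auto-increment step, so that the primary keys of each table stay unique across all the databases. For example, with two servers and a step of 2, one server starts at 1 and issues 1, 3, 5, ... while the other starts at 2 and issues 2, 4, 6, ... As shown in the figure: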

The step mode relieves high-concurrency pressure to a certain extent, but it has problems of its own: once the step is set, it is difficult to add servers; ids are not strictly monotonically increasing, they only trend upward; and every id still requires a read and a write against a database, so a bottleneck remains.
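The interleaving can be simulated without a database. The sketch below assumes three servers (a hypothetical count) that share a step of 3 and start at 1, 2 and 3; in MySQL the equivalent settings are the auto_increment_offset and auto_increment_increment system variables.

```python
from itertools import count, islice

def step_ids(start, step):
    """Ids handed out by one server: start, start + step, start + 2 * step, ..."""
    return count(start, step)

SERVERS = 3  # hypothetical number of database servers; also the shared step

generators = [step_ids(start, SERVERS) for start in range(1, SERVERS + 1)]
first_ids = [list(islice(g, 4)) for g in generators]
print(first_ids)  # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]

# Requests rarely hit the servers in perfect rotation, so the ids seen by
# the application trend upward without being strictly increasing.
fresh = [step_ids(start, SERVERS) for start in range(1, SERVERS + 1)]
arrival = [next(fresh[i]) for i in (0, 0, 1, 2, 1)]
print(arrival)  # [1, 4, 2, 3, 5]
```

The sketch also shows why expansion is hard: adding a fourth server means changing the step on every existing server, and the ids already issued under the old step must not be reused.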

3. Number segment mode

In this mode the application fetches ids from the database one segment at a time: it reads the current maximum id from the database, updates it to max + step, and then hands out the ids in that range from memory. When the application has used up the segment, it fetches the next range of length step from the database. This requires a table that records the current maximum id. When the application service runs as a cluster and the primary key server is a single point, several application nodes may try to fetch a segment at the same time and conflict; adding a version field and using optimistic locking controls this concurrent access.
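The table and the optimistic-lock update can be sketched as follows. This is a minimal illustration using an in-memory SQLite database; the table name id_segment, its columns, the 'order' business tag and the step of 1000 are all assumptions made for the example, not a schema taken from any particular product.

```python
import sqlite3

def init_db():
    """Create the (hypothetical) id_segment table with one business tag."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE id_segment ("
        " biz_tag TEXT PRIMARY KEY, max_id INTEGER, step INTEGER, version INTEGER)"
    )
    conn.execute("INSERT INTO id_segment VALUES ('order', 0, 1000, 0)")
    conn.commit()
    return conn

def next_segment(conn, biz_tag, max_retries=5):
    """Reserve the range (max_id, max_id + step] and return (first, last)."""
    for _ in range(max_retries):
        max_id, step, version = conn.execute(
            "SELECT max_id, step, version FROM id_segment WHERE biz_tag = ?",
            (biz_tag,),
        ).fetchone()
        # The update only succeeds if the version is unchanged since the read.
        cur = conn.execute(
            "UPDATE id_segment SET max_id = ?, version = version + 1"
            " WHERE biz_tag = ? AND version = ?",
            (max_id + step, biz_tag, version),
        )
        conn.commit()
        if cur.rowcount == 1:
            return max_id + 1, max_id + step
        # Another node changed the row first: re-read and try again.
    raise RuntimeError("could not reserve a segment")

conn = init_db()
print(next_segment(conn, "order"))  # (1, 1000)
print(next_segment(conn, "order"))  # (1001, 2000)
```

If two nodes read the same version, only one UPDATE matches a row; the other sees a row count of 0 and retries with fresh values, so no two nodes receive the same range.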

The number segment mode caches primary keys on the application server, which reduces how often the database is accessed. If the database becomes unavailable, the application service can keep running until the current segment is used up. However, when the application service restarts, the unused ids in its cached segment are lost, so id growth is not continuous.

Some mature, production-proven schemes are based on the segment mode. Meituan's Leaf-segment adds a double buffer and optimizations for high availability and disaster recovery. In double-buffer mode, the next segment is loaded asynchronously into memory once the current segment has been consumed to a certain point. The application does not wait until the segment is exhausted to fetch the next one, so threads requesting ids are not blocked while a new segment is retrieved from the database.
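The double-buffer idea can be sketched in a few dozen lines. This is an illustration of the technique only, not Leaf's actual code: the class name, the 50% threshold and the in-memory segment source that stands in for the database are all assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class DoubleBufferAllocator:
    def __init__(self, fetch_segment, threshold=0.5):
        self._fetch = fetch_segment      # callable returning (first, last)
        self._threshold = threshold      # fraction consumed before prefetching
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._first, self._last = fetch_segment()
        self._next = self._first
        self._pending = None             # Future holding the next segment

    def next_id(self):
        with self._lock:
            if self._next > self._last:  # current segment exhausted: switch
                if self._pending is None:
                    self._pending = self._pool.submit(self._fetch)
                # Usually already loaded, so this returns immediately.
                self._first, self._last = self._pending.result()
                self._next, self._pending = self._first, None
            value = self._next
            self._next += 1
            used = (value - self._first + 1) / (self._last - self._first + 1)
            if self._pending is None and used >= self._threshold:
                # Load the next segment in the background.
                self._pending = self._pool.submit(self._fetch)
            return value

def make_source(step):
    """In-memory stand-in for the database table that hands out segments."""
    state = {"max": 0}
    def fetch():
        first = state["max"] + 1
        state["max"] += step
        return first, state["max"]
    return fetch

alloc = DoubleBufferAllocator(make_source(10))
ids = [alloc.next_id() for _ in range(25)]
print(ids[:3], ids[-3:])  # [1, 2, 3] [23, 24, 25]
```

Because the second buffer is requested while half of the first is still unused, a slow or briefly unavailable database only affects callers if it stays slow for longer than it takes to consume the remaining half.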

Didi's TinyId is based on Meituan's Leaf implementation and extends it with multi-db support and tinyid-client.

4. Snowflake mode (snowflake algorithm)

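Snowflake is a distributed ID generation algorithm implemented by Twitter. Each id is a 64-bit integer with the following structure, from the highest bit to the lowest: 1 bit (sign) - 41 bits (timestamp) - 10 bits (machine id) - 12 bits (sequence number)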

1 bit: reserved sign bit, always 0, so every generated id is positive.

41 bits: timestamp in milliseconds. 41 bits can represent about 69 years.

10 bits: machine id. Of the 10 bits, 5 represent the computer room (data center) id and 5 represent the machine id, which allows 32 computer rooms with 32 machines in each.

12 bits: sequence number, incremented sequentially. It counts the ids generated by each node within one millisecond, so each node can generate 4096 ids per millisecond.
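The layout above maps directly onto shifts and masks. The following sketch packs the four fields into one integer and unpacks them again, and checks the capacity figures quoted in the list; the function names pack and unpack are illustrative, and the timestamp is taken as an already-computed number of milliseconds.

```python
TIMESTAMP_BITS, DATACENTER_BITS, WORKER_BITS, SEQUENCE_BITS = 41, 5, 5, 12

WORKER_SHIFT = SEQUENCE_BITS                           # 12
DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS          # 17
TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS   # 22

def pack(timestamp_ms, datacenter_id, worker_id, sequence):
    """Combine the fields into one id; the top (sign) bit stays 0."""
    fields = (
        (timestamp_ms, TIMESTAMP_BITS),
        (datacenter_id, DATACENTER_BITS),
        (worker_id, WORKER_BITS),
        (sequence, SEQUENCE_BITS),
    )
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return (
        (timestamp_ms << TIMESTAMP_SHIFT)
        | (datacenter_id << DATACENTER_SHIFT)
        | (worker_id << WORKER_SHIFT)
        | sequence
    )

def unpack(snowflake_id):
    """Split an id back into (timestamp_ms, datacenter_id, worker_id, sequence)."""
    return (
        snowflake_id >> TIMESTAMP_SHIFT,
        (snowflake_id >> DATACENTER_SHIFT) & ((1 << DATACENTER_BITS) - 1),
        (snowflake_id >> WORKER_SHIFT) & ((1 << WORKER_BITS) - 1),
        snowflake_id & ((1 << SEQUENCE_BITS) - 1),
    )

years = (1 << TIMESTAMP_BITS) / (1000 * 60 * 60 * 24 * 365)
print(round(years, 1))                 # 69.7
print(1 << DATACENTER_BITS, 1 << WORKER_BITS, 1 << SEQUENCE_BITS)  # 32 32 4096
print(unpack(pack(123456789, 3, 7, 42)))  # (123456789, 3, 7, 42)
```

Because the timestamp occupies the highest bits, comparing two ids as plain integers compares their timestamps first, which is why the ids trend upward over time.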

Advantages of snowflake:

Primary keys increase monotonically on a single node and trend upward over time across the cluster.

Primary key generation does not depend on the database; keys can be generated by the application itself.

Duplicate ids are not generated within a distributed cluster.

The bit allocation can be adjusted according to business requirements.

Disadvantages of snowflake:

It depends heavily on the system clock: if the clock is rolled back, duplicate primary keys can be generated.

When the cluster is large, configuring the worker ids adds cost.
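Both disadvantages show up in even a small generator. The sketch below is a simplified illustration, not Twitter's code: the custom epoch, the class name and the injectable clock parameter are assumptions, and on clock rollback it simply refuses to issue ids instead of trying to recover.

```python
import threading
import time

class SnowflakeGenerator:
    EPOCH_MS = 1_577_836_800_000  # 2020-01-01 UTC, an arbitrary custom epoch

    def __init__(self, datacenter_id, worker_id,
                 clock=lambda: int(time.time() * 1000)):
        # The ids must be configured per node, which is the operational cost.
        if not (0 <= datacenter_id < 32 and 0 <= worker_id < 32):
            raise ValueError("datacenter_id and worker_id must fit in 5 bits")
        self._node = (datacenter_id << 17) | (worker_id << 12)
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock moved backwards: refuse rather than risk a duplicate.
                raise RuntimeError(
                    f"clock moved back by {self._last_ms - now} ms")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:          # 4096 ids used this millisecond
                    while now <= self._last_ms:  # wait for the next millisecond
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self.EPOCH_MS) << 22) | self._node | self._sequence

gen = SnowflakeGenerator(datacenter_id=1, worker_id=1)
batch = [gen.next_id() for _ in range(1000)]
print(len(set(batch)) == len(batch), batch == sorted(batch))
```

Without the rollback check, a node whose clock jumped back would reuse timestamps it had already combined with low sequence numbers, which is exactly how duplicates arise.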

Meituan's Leaf-snowflake uses ZooKeeper to address snowflake's dependence on the clock and the duplicate primary keys caused by clock rollback; Baidu's UidGenerator supports customizing the timestamp, workerId, sequence number and so on.

5. Redis mode

This mode uses the atomic Redis operations INCR and INCRBY, with a Redis cluster to increase concurrency. It is similar to the step mode, except that the id generator is a Redis instance, which is faster than a traditional database. However, if Redis restarts or goes down, the current primary key value can be lost, so it must be persisted when Redis is used for primary key generation. Redis supports two persistence mechanisms, RDB and AOF. With RDB, writes made after the last snapshot may be lost, and duplicate ids may be generated after the snapshot is restored, so RDB alone is not suitable for this scenario. AOF records each write command in a separate log and replays the log on restart to restore the data. This avoids duplicate ids as long as every write is synced to disk (appendfsync always), but recovery can take a long time when the log contains many commands.
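The two commands correspond to two ways of using the counter: one id per call, or one block of ids per call. The sketch below shows both; so that it runs without a server it uses a small in-memory stand-in, on the assumption that a real client such as redis-py's redis.Redis, whose incrby(name, amount=1) method has the same shape, would be passed in its place. The key name "id:order" is hypothetical.

```python
class InMemoryCounter:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def incrby(self, name, amount=1):
        # Redis performs this read-modify-write atomically on the server.
        self._data[name] = self._data.get(name, 0) + amount
        return self._data[name]

def next_id(client, key):
    """One round trip per id (equivalent to INCR)."""
    return client.incrby(key, 1)

def reserve_block(client, key, size):
    """One round trip per block of ids (INCRBY); returns (first, last)."""
    last = client.incrby(key, size)
    return last - size + 1, last

client = InMemoryCounter()
print(next_id(client, "id:order"))             # 1
print(next_id(client, "id:order"))             # 2
print(reserve_block(client, "id:order", 100))  # (3, 102)
```

Reserving blocks reduces round trips in the same way as the number segment mode, with the same consequence that unused ids in a block are lost if the application restarts.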

The five primary key generation strategies introduced above each have trade-offs. Choose the one that best fits the specific business scenario and the actual state of the system, so as to improve database performance and keep the system stable under high concurrency.
