
How to Use the Snowflake Algorithm ID Generation Tool

Updated 2025-04-05, from SLTechnology News&Howtos


This article explains how to use an ID generation tool based on the snowflake algorithm. The method is simple, fast, and practical.

Introduction to the algorithm

❄ This is an optimized snowflake algorithm (snowflake drift) that generates shorter IDs, faster.

❄ It supports automatic scaling in container environments such as K8s (WorkerId is registered automatically) and generates unique numeric IDs in stand-alone or distributed environments.

❄ It natively supports C#, Java, Go, Rust, C, SQL, and other languages, and provides a PHP extension plus dynamic libraries (FFI) that Python and Node.js can call in a thread-safe way.

❄ It is compatible with all snowflake algorithms (segment mode or classic mode, from large vendors or small ones), so you can upgrade or switch at any time in the future. (Upgrading is generally unnecessary, but it is supported in principle.)

❄ It aims to be the most comprehensive snowflake ID generation tool available; you are welcome to surpass it.

Source of requirements

As an architect, you want to solve the problem of unique database primary keys, especially in distributed systems with multiple databases.

You want table primary keys to use the least storage space, index faster, and make Select, Insert, and Update faster.

When sharding databases and tables (or merging tables), you want the primary key value to be directly usable and to reflect the chronological order of the business data.

If the primary key value is too long and exceeds the maximum value of the front-end JS Number type, you have to convert the Long type to a String, which is frustrating.

Although a Guid can be self-incrementing, it takes up a lot of space and indexes slowly, so you do not want to use it.

You may have more than 50 application instances, each handling up to 100,000 concurrent requests per second.

You want to deploy applications in a container environment with support for horizontal replication and automatic scaling.

You do not want to rely on redis auto-increment to obtain consecutive primary key IDs, because consecutive IDs pose a business data security risk.

You want the system to run for more than 100 years.

Problems with traditional algorithms

❌ The generated IDs are too long.

❌ Instantaneous concurrency is insufficient.

❌ Clock rollback is not handled.

❌ Generating IDs for an earlier time after the fact is not supported.

❌ They may rely on an external storage system.

Characteristics of the new algorithm

✔ IDs are integers that increase monotonically over time (not necessarily consecutively). They are shorter, and with the default configuration they will not exceed the maximum value of the JS Number type for 50 years.

✔ It is 2-5 times faster than the traditional snowflake algorithm and can generate 500,000 IDs in 0.1 seconds (measured on an 8th-generation low-voltage i7).

✔ It handles clock rollback. For example, if the server time is set back by 1 second, the algorithm adapts automatically and still generates unique IDs for the affected time window.

✔ It supports manual insertion of new IDs. When the business needs to generate a new ID for a time in the past, the reserved sequence numbers of this algorithm allow 5,000 such IDs per second.

✔ It does not rely on any external cache or database. (The dynamic library that registers WorkerId automatically in a K8s environment does depend on redis.)

✔ The basic functions work out of the box, with no configuration files or database connections required.

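Performance data

(Parameters: 10-bit auto-increment sequence, maximum drift of 1000)

+--------------------------------+---------+---------+---------+
| Continuous requests            | 5,000   | 50,000  | 500,000 |
+--------------------------------+---------+---------+---------+
| Traditional snowflake algorithm| 0.0045s | 0.053s  | 0.556s  |
| Snowflake drift algorithm      | 0.0015s | 0.012s  | 0.113s  |
+--------------------------------+---------+---------+---------+

Peak performance: 5 million to 30 million IDs per second. (All test data were measured on an 8th-generation low-voltage i7.)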

How clock rollback is handled

When the system clock is set back, the algorithm generates new IDs from the reserved sequence numbers of past timestamps.

IDs generated during a rollback use sequence numbers at the front of the range by default; this can be changed so that they use the back of the range.

The clock may be rolled back as far as the algorithm's preset base value (the parameter is adjustable).
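The rollback behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the library's code: the class `ToyDriftGenerator`, its fields, and the injected `clock` callable are hypothetical, the bit lengths are the article's defaults, and the real algorithm's drift behaviour on sequence exhaustion is replaced by an exception to keep the example short.

```python
class ToyDriftGenerator:
    """Toy model (not the library's code): reserved sequence numbers 1-4
    absorb a clock rollback so that IDs stay unique."""

    WORKER_BITS = 6                  # article default for WorkerIdBitLength
    SEQ_BITS = 6                     # article default for SeqBitLength
    MIN_SEQ = 5                      # 0-4 reserved: 0 manual, 1-4 rollback
    MAX_SEQ = (1 << SEQ_BITS) - 1

    def __init__(self, worker_id, clock):
        self.worker_id = worker_id
        self.clock = clock           # callable: milliseconds since base time
        self.last_tick = 0
        self.seq = self.MIN_SEQ - 1
        self.rollback_tick = 0       # 0 means "no rollback in progress"
        self.rollback_index = 0      # cycles through 1..4, one per rollback

    def _pack(self, tick, seq):
        shift = self.WORKER_BITS + self.SEQ_BITS
        return (tick << shift) | (self.worker_id << self.SEQ_BITS) | seq

    def next_id(self):
        now = self.clock()
        if now < self.last_tick:
            # Clock went backwards: reuse past ticks with a reserved number.
            if self.rollback_tick == 0:
                self.rollback_tick = self.last_tick - 1
                self.rollback_index = self.rollback_index % 4 + 1
            tick = self.rollback_tick
            self.rollback_tick -= 1  # walk further into the past each call
            return self._pack(tick, self.rollback_index)
        self.rollback_tick = 0       # clock has caught up again
        if now > self.last_tick:
            self.last_tick = now
            self.seq = self.MIN_SEQ
        else:
            self.seq += 1
            if self.seq > self.MAX_SEQ:
                # Assumption: the real algorithm drifts into the next
                # millisecond here; the toy simply gives up.
                raise RuntimeError("toy model: sequence exhausted")
        return self._pack(self.last_tick, self.seq)


# Simulated clock readings; the fourth, fifth, and last ones go backwards.
ticks = iter([100, 100, 101, 99, 99, 102, 101])
gen = ToyDriftGenerator(worker_id=1, clock=lambda: next(ticks))
ids = [gen.next_id() for _ in range(7)]
```

Because normal IDs never use sequence numbers below `MIN_SEQ`, an ID built from a past tick and a reserved number cannot collide with one that was issued normally at that tick. Using a different reserved number for each rollback event keeps overlapping rollbacks apart as well.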

ID composition

The ID generated by this algorithm consists of three parts (following the snowflake algorithm's definition):

+------------------------------------------+-------------+--------------------+
| 1. Time difference relative to base time | 2. WorkerId | 3. Sequence number |
+------------------------------------------+-------------+--------------------+

Part 1, the time difference, is the system time at which the ID is generated minus BaseTime, in milliseconds.

Part 2, WorkerId, is the unique ID that distinguishes different machines or application instances. Its maximum value is limited by WorkerIdBitLength (default 6).

Part 3, the sequence number, is the per-millisecond counter. Its maximum value is limited by SeqBitLength (default 6).
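The three-part layout can be sketched as simple bit packing. This is a minimal illustration assuming the default bit lengths given above; the function names `compose_id` and `decompose_id` are hypothetical and are not part of the library's API.

```python
from typing import Tuple

WORKER_ID_BIT_LENGTH = 6   # default per the article
SEQ_BIT_LENGTH = 6         # default per the article


def compose_id(time_diff_ms: int, worker_id: int, seq: int) -> int:
    """Pack the three parts into one integer, time difference in the high bits."""
    if not 0 <= worker_id < (1 << WORKER_ID_BIT_LENGTH):
        raise ValueError("worker_id does not fit in WORKER_ID_BIT_LENGTH bits")
    if not 0 <= seq < (1 << SEQ_BIT_LENGTH):
        raise ValueError("seq does not fit in SEQ_BIT_LENGTH bits")
    timestamp_shift = WORKER_ID_BIT_LENGTH + SEQ_BIT_LENGTH
    return (time_diff_ms << timestamp_shift) | (worker_id << SEQ_BIT_LENGTH) | seq


def decompose_id(value: int) -> Tuple[int, int, int]:
    """Split an ID back into (time_diff_ms, worker_id, seq)."""
    seq = value & ((1 << SEQ_BIT_LENGTH) - 1)
    worker_id = (value >> SEQ_BIT_LENGTH) & ((1 << WORKER_ID_BIT_LENGTH) - 1)
    time_diff_ms = value >> (WORKER_ID_BIT_LENGTH + SEQ_BIT_LENGTH)
    return time_diff_ms, worker_id, seq


example = compose_id(time_diff_ms=1000, worker_id=3, seq=5)
parts = decompose_id(example)   # (1000, 3, 5)
```

Keeping the time difference in the highest bits is what makes IDs from the same worker sort by generation time.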

ID example

The ID generated by this algorithm is an integer (at most 8 bytes). The following IDs were generated with the default configuration:

129053495681099 (running for 1 year)
387750301904971 (running for 3 years)
646093214093387 (running for 5 years)
1292658282840139 (running for 10 years)
9007199254740992 (maximum JS Number)
165399880288699493 (ID generated by the ordinary snowflake algorithm)

The ID values generated by this algorithm are 1%-10% of the maximum JS Number value and about 1/1000 of the values generated by the ordinary snowflake algorithm, yet they are generated faster.

The maximum value of the JS Number type is 9007199254740992. While maintaining concurrency performance (50,000+ IDs per 0.01 s) and supporting up to 64 WorkerIds (6 bits), this algorithm takes about 70 years to reach that value.

Length estimation

For each additional bit of WorkerIdBitLength or SeqBitLength, the generated ID value doubles (for the baseline, see "ID example" above); each bit removed halves it.

How long will it last?

"How long it lasts" here means how long it takes for the generated ID to exceed the maximum value of long (signed 64-bit, 8 bytes).

In the default configuration, IDs can be generated for about 71,000 years without repeating.

With 1024 worker nodes, IDs can be generated for about 4,480 years without repeating.

With 4096 worker nodes, IDs can be generated for about 1,120 years without repeating.
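These lifetimes follow from how many bits remain for the millisecond time difference once WorkerIdBitLength and SeqBitLength are subtracted. The following is a rough back-of-the-envelope sketch, assuming a 365.25-day year and assuming that 1024 and 4096 worker nodes correspond to 10 and 12 WorkerId bits with the default 6 sequence bits; the helper `years_until` is hypothetical.

```python
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000


def years_until(limit_bits: int, worker_id_bit_length: int = 6,
                seq_bit_length: int = 6) -> float:
    """Years until the ID no longer fits in `limit_bits` value bits."""
    time_bits = limit_bits - worker_id_bit_length - seq_bit_length
    return (1 << time_bits) / MS_PER_YEAR


js_limit = years_until(53)              # JS Number limit, 2**53
long_default = years_until(63)          # signed 64-bit long, default config
long_1024 = years_until(63, 10, 6)      # 1024 worker nodes
long_4096 = years_until(63, 12, 6)      # 4096 worker nodes
```

The results land close to the figures quoted above (roughly 70, 71,000, 4,500, and 1,100 years); small differences come from rounding and the assumed year length. The function also shows the doubling rule: each extra WorkerId or sequence bit removes one time bit and halves the lifetime.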

Parameter settings

❄ WorkerIdBitLength, the bit length of the machine code, determines the maximum value of WorkerId. The default is 6 and the valid range is [1, 19]. In practice, some language implementations use an unsigned ushort (uint16) to receive this parameter, so the maximum is 16; with a signed short (int16), the maximum is 15.

❄ WorkerId, the machine code, is the most important parameter. It has no default, must be globally unique, and must be set by the program. With the default WorkerIdBitLength its maximum is 63; the theoretical maximum is 2^WorkerIdBitLength-1 (some language implementations may cap it at 65535 or 32767, for the same reason as the WorkerIdBitLength rule above). Different machines or application instances must not share the same value. You can set it through application configuration or obtain it from an external service. For automatic registration, this algorithm provides a default implementation: a dynamic library that registers WorkerId through redis; see "Tools\AutoRegisterWorkerId" for details.

Special note: if one server hosts multiple independent services, you must give each service a different WorkerId.

❄ SeqBitLength, the bit length of the sequence number, determines how many IDs can be generated per millisecond. The default is 6 and the valid range is [3, 21] (at least 4 is recommended). WorkerIdBitLength + SeqBitLength must not exceed 22.

❄ MinSeqNumber, the minimum sequence number. The default is 5 and the valid range is [5, MaxSeqNumber]. The first five sequence numbers in each millisecond (0-4) are reserved: 1-4 are reserved for clock rollback, and 0 is reserved for manually inserted values.

❄ MaxSeqNumber, the maximum sequence number. The valid range is [MinSeqNumber, 2^SeqBitLength-1]. The default is 0, which means the real maximum is 2^SeqBitLength-1; any non-zero value is used as the real maximum. It generally does not need to be set, unless multiple machines share a WorkerId and generate IDs in separate sequence segments (in which case MinSeqNumber must also be set correctly).

❄ BaseTime, the base time (also called origin time or epoch time), has a default value (in 2020) and is a millisecond timestamp (an integer; in .NET it is a DateTime). The difference in milliseconds between the system time and the base time at the moment of generation is used as the ID's timestamp. BaseTime generally does not need to be set. If you think the default is too old, you can reset it, but it is best not to change it again afterwards.
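The range rules above can be gathered into a single pre-flight check. The sketch below only restates the documented ranges; the `Options` class is a hypothetical stand-in for IdGeneratorOptions, and the lower bound of 0 for WorkerId is an assumption, since the text only states the upper bound.

```python
from dataclasses import dataclass


@dataclass
class Options:
    """Hypothetical stand-in for IdGeneratorOptions (not the library's type)."""
    worker_id: int
    worker_id_bit_length: int = 6
    seq_bit_length: int = 6
    min_seq_number: int = 5
    max_seq_number: int = 0   # 0 means "use 2**seq_bit_length - 1"


def validate(o: Options) -> int:
    """Return the effective maximum sequence number, or raise ValueError."""
    if not 1 <= o.worker_id_bit_length <= 19:
        raise ValueError("WorkerIdBitLength must be in [1, 19]")
    if not 3 <= o.seq_bit_length <= 21:
        raise ValueError("SeqBitLength must be in [3, 21]")
    if o.worker_id_bit_length + o.seq_bit_length > 22:
        raise ValueError("WorkerIdBitLength + SeqBitLength must not exceed 22")
    # Assumption: WorkerId starts at 0; the text only gives the maximum.
    if not 0 <= o.worker_id <= (1 << o.worker_id_bit_length) - 1:
        raise ValueError("WorkerId exceeds 2^WorkerIdBitLength - 1")
    seq_limit = (1 << o.seq_bit_length) - 1
    max_seq = o.max_seq_number or seq_limit
    if max_seq > seq_limit:
        raise ValueError("MaxSeqNumber exceeds 2^SeqBitLength - 1")
    if not 5 <= o.min_seq_number <= max_seq:
        raise ValueError("MinSeqNumber must be in [5, MaxSeqNumber]")
    return max_seq


effective_max = validate(Options(worker_id=1))   # default configuration
```

Running such a check at start-up surfaces a misconfiguration before any ID has been issued, which is far cheaper than discovering duplicate IDs later.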

General integration

1. Call in singleton mode. Calling this algorithm in parallel through multiple instances will not increase ID throughput, because the algorithm generates IDs on a single thread.

2. Specify a unique WorkerId. The external system must guarantee that the WorkerId is globally unique and pass it to this algorithm's entry method.

3. Use different WorkerIds when deploying multiple instances on one machine. Not all implementations support cross-process concurrency. To be safe, when deploying multiple application instances on the same host, make sure each one has a unique WorkerId.

4. Handle exceptions. The algorithm throws all Exceptions; the external system should catch and handle them so that they do not cause a wider system crash.

5. Understand the definition of IdGeneratorOptions carefully; it helps when integrating and using this algorithm.

6. Use the snowflake drift algorithm. Although the code also contains the traditional snowflake algorithm, and you can enable it at the entry point (Method=2), the snowflake drift algorithm (Method=1, the default) is still recommended because it scales better and performs better.

7. Do not modify the core algorithm. It has many internal parameters and complex logic. Until you have mastered the core logic, do not modify the core code and use it in production unless the change has been verified by extensive, careful, and rigorous testing.

8. Keep the configuration policy consistent within an application domain. If, after the system has been running for a while, the project needs to switch from a program-specified WorkerId to an automatically registered one, make sure all active instances in the same application domain adopt the same configuration policy. This applies not only to WorkerId but to all other configuration parameters as well.

9. Manage server time well. The snowflake algorithm depends on the system time, so do not manually set the operating system clock back by a large amount. If you have to adjust it, make sure the system time when the service restarts is later than the time it was last shut down. (Note: small changes in system time caused by global or network-level time synchronization or rollback have no effect on this algorithm.)

Configuration changes

A configuration change means adjusting the running parameters (IdGeneratorOptions values) after the system has been running for a period of time. Please note:

1. The most important rule is that BaseTime may only be moved earlier (smaller than the old value, further from now), because moving it later is very likely to reproduce timestamps that have already been used. [Adjusting BaseTime after the system is running is not recommended.]

2. Increasing WorkerIdBitLength or SeqBitLength is safe at any time, but reduce them with caution, because future IDs may then duplicate IDs generated under the old configuration. [Increasing any BitLength value after the system is running is allowed.]

3. If either WorkerIdBitLength or SeqBitLength must be reduced, one condition must hold: the sum of the two new BitLength values is greater than the sum of the old values. [Reducing any BitLength value after the system is running is not recommended.]

4. The algorithm itself does not enforce the three rules above. The integrator must assess the impact against these rules and confirm that the change is safe before applying it.
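Since the algorithm does not enforce these rules, an integrator could run a check of their own before rolling out a change. The following is a minimal sketch of the three rules as stated above; the `Config` type and the function `check_config_change` are hypothetical names, not part of the library.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """Hypothetical snapshot of the settings the three rules talk about."""
    base_time_ms: int
    worker_id_bit_length: int
    seq_bit_length: int


def check_config_change(old: Config, new: Config) -> List[str]:
    """Return a list of rule violations; an empty list means the change passes."""
    problems = []
    # Rule 1: BaseTime may only move earlier (a smaller timestamp).
    if new.base_time_ms > old.base_time_ms:
        problems.append("BaseTime moved later; it may only move earlier")
    # Rules 2 and 3: increases are fine; a reduction needs a larger sum.
    reduced = (new.worker_id_bit_length < old.worker_id_bit_length
               or new.seq_bit_length < old.seq_bit_length)
    old_sum = old.worker_id_bit_length + old.seq_bit_length
    new_sum = new.worker_id_bit_length + new.seq_bit_length
    if reduced and new_sum <= old_sum:
        problems.append("a BitLength was reduced without increasing the sum")
    return problems


current = Config(base_time_ms=1_000, worker_id_bit_length=6, seq_bit_length=6)
proposed = Config(base_time_ms=1_000, worker_id_bit_length=8, seq_bit_length=5)
issues = check_config_change(current, proposed)   # [] (sum grows from 12 to 13)
```

A check like this fits naturally into a deployment pipeline, where the old configuration is still available for comparison.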

Automatically registering WorkerId

A unique ID generator depends on WorkerId. When a business service needs to be replicated horizontally without distinction (automatic scaling), each instance must be able to register a globally unique WorkerId automatically before it can produce unique IDs.

This algorithm provides an open-source dynamic library (implemented in Go) that registers WorkerId automatically through redis in container environments such as K8s.

Registering WorkerId through redis is not the only way. You can also build a centralized configuration service from which each endpoint service obtains a unique WorkerId at start-up.

Of course, if your services do not need automatic scaling, you do not have to register WorkerId automatically; simply set a globally unique value for each service.

There are many possible approaches. As one starting point: build a centralized ID generation service that generates usable IDs (singly or in batches) for each endpoint service.
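The centralized-service idea can be illustrated with an in-memory stand-in. This is a toy under stated assumptions, not the library's redis-based implementation: the class `WorkerIdRegistry` is hypothetical, WorkerIds are assumed to start at 0, and a real service would persist its state and expire the registrations of crashed instances.

```python
import threading


class WorkerIdRegistry:
    """In-memory stand-in for a central WorkerId service (hypothetical)."""

    def __init__(self, max_worker_id: int):
        self._max = max_worker_id
        self._in_use = set()
        self._lock = threading.Lock()   # callers may register concurrently

    def register(self) -> int:
        """Hand out the lowest WorkerId that is not currently in use."""
        with self._lock:
            for candidate in range(self._max + 1):
                if candidate not in self._in_use:
                    self._in_use.add(candidate)
                    return candidate
        raise RuntimeError("no free WorkerId")

    def unregister(self, worker_id: int) -> None:
        """Release a WorkerId so that another instance can take it."""
        with self._lock:
            self._in_use.discard(worker_id)


registry = WorkerIdRegistry(max_worker_id=63)   # default WorkerIdBitLength
my_worker_id = registry.register()
```

Each starting instance would call `register` once and pass the result to the ID generator, then call `unregister` on shutdown.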

Automatic registration flow chart

Image link: https://cache.yisu.com/upload/information/20210524/357/2830.jpg

Source path: /Go/source/regworkerid/reghelper.go

Dynamic library download

Download link: https://github.com/yitter/IdGenerator/releases/download/reg_v1.0/regworkerid_lib_v1.0.zip

Dynamic library interface definition

    // Registers a WorkerId; first cancels all records registered locally
    // ip: redis server address
    // port: redis port
    // password: redis access password; may be an empty string ""
    // maxWorkerId: maximum WorkerId
    extern GoInt32 RegisterOne(char* ip, GoInt32 port, char* password, GoInt32 maxWorkerId);

    // Unregisters the WorkerId registered locally
    extern void UnRegister();

    // Checks whether the local WorkerId is valid (0 - valid, other - invalid)
    extern GoInt32 Validate(GoInt32 workerId);

Implemented languages (examples are on GitHub)

C#
Java
Go
Rust
C
C (PHP extension)
V
D

Having read this far, you should have a better understanding of how to use the snowflake ID generation tool. The best next step is to try it in practice.
