2025-02-23 Update. From: SLTechnology News&Howtos (shulou), Development
Shulou (Shulou.com), 05/31 report
This article explains how to implement the Snowflake algorithm in Go. The content is simple and easy to follow, so let's work through it step by step.
Snowflake algorithm
The snowflake algorithm was originally implemented in Scala. It is used to generate distributed IDs that are purely numeric and roughly ordered by time.
Auto-increment IDs are not suitable for data-sensitive scenarios or for distributed scenarios. GUIDs are meaningless strings: as the amount of data grows, access becomes slow, and they are not suitable for sorting.
UUID
The first option is the UUID, a 128-bit value that is usually written in hexadecimal and represented as a string. To ensure uniqueness, the specification defines elements such as the network card MAC address, timestamp, namespace, random or pseudo-random numbers, and clock sequence, together with the algorithms that generate a UUID from these elements.
There are five versions of UUID:
Version 1: based on timestamp and MAC address
Version 2: based on timestamp, MAC address and POSIX UID/GID
Version 3: based on MD5 hash algorithm
Version 4: based on random numbers
Version 5: based on SHA-1 hash algorithm
Advantages and disadvantages of UUID:
The advantage is that the code is simple and the performance is good. The disadvantages are that UUIDs are not sortable and are not guaranteed to increase in order, and that they are long: they take up a lot of space in the database, which makes retrieval and sorting less efficient.
Database auto-increment primary key
If you are using a MySQL database, setting the primary key to auto_increment is the easiest way to get a monotonically increasing unique ID, and it is also convenient for sorting and indexing.
But the disadvantages are also obvious. Because ID generation depends entirely on the database, concurrency is limited by database performance. If the amount of data grows too large, splitting it across databases and tables becomes a problem. And if the database is down, no IDs can be generated.
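As a rough illustration of version 4, the sketch below builds a random UUID using only Go's standard library. The function name `newUUIDv4` is hypothetical and chosen for this example; it shows why the result is simple to generate but long and unordered.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 is a hypothetical helper that returns a random (version 4) UUID
// in the canonical 8-4-4-4-12 hexadecimal form.
func newUUIDv4() (string, error) {
	var b [16]byte // 128 bits
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version field to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the variant field to RFC 4122
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	// The string form is 36 characters, far longer than a 64-bit integer,
	// and two consecutive calls have no ordering relationship.
	fmt.Println(id, len(id))
}
```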
Redis
Redis is now an indispensable part of many projects. It provides two commands, Incr and IncrBy. Because Redis executes commands on a single thread, these two commands are atomic and can be used to generate unique values, and the performance is also very good.
However, even with AOF and RDB persistence, Redis can still lose data, which may lead to duplicate IDs. ID generation also then depends on Redis, so it is affected whenever Redis is unstable.
Snowflake
The analysis above leads us to the distributed snowflake algorithm, Snowflake, the unique ID generation algorithm that Twitter uses in its distributed environment. It was open-sourced in 2010. The open-source version is written in Scala and can be found at the following address:
https://github.com/twitter-archive/snowflake/tags
It has the following characteristics:
It generates non-repeating IDs in a high-concurrency distributed system.
It is based on the timestamp, so IDs are guaranteed to be roughly increasing in order.
It does not rely on third-party libraries or middleware.
Implementation principle
A Snowflake ID is a 64-bit value of type int64, laid out as follows:
Position     Size      Function
0~11 bit     12 bits   Sequence number, used to generate different IDs within the same millisecond; 4096 values (0 to 4095)
12~21 bit    10 bits   Machine ID; 1024 machines in total
22~62 bit    41 bits   Timestamp in milliseconds; covers about 69 years
63 bit       1 bit     Sign bit; left unused
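To make the layout concrete, here is a minimal sketch that splits an ID back into its three fields with shifts and masks. The constant names and the `split` helper are hypothetical and exist only for this illustration; the bit widths are the ones from the table.

```go
package main

import "fmt"

const (
	sequenceBits  = 12
	machineBits   = 10
	timestampBits = 41

	sequenceMask  = int64(1)<<sequenceBits - 1  // 4095
	machineMask   = int64(1)<<machineBits - 1   // 1023
	timestampMask = int64(1)<<timestampBits - 1 // largest 41-bit value

	machineShift   = sequenceBits               // machine ID starts at bit 12
	timestampShift = sequenceBits + machineBits // timestamp starts at bit 22
)

// split is a hypothetical helper that extracts the timestamp, machine ID and
// sequence number from an ID that uses the standard layout.
func split(id int64) (ts, machine, seq int64) {
	seq = id & sequenceMask
	machine = (id >> machineShift) & machineMask
	ts = (id >> timestampShift) & timestampMask
	return ts, machine, seq
}

func main() {
	// Build an ID by hand: timestamp 1000, machine 5, sequence 42.
	id := int64(1000)<<timestampShift | int64(5)<<machineShift | int64(42)
	fmt.Println(split(id)) // 1000 5 42
}
```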
The layout above is only the usual way of dividing the 64 bits, and it can be adjusted to suit your own business. For example, suppose the business currently runs on about 10 machines, expects to grow to a three-digit number of machines, needs to be deployed across multiple data centers, and expects QPS to reach one million within a few years.
With one million QPS spread evenly over 10 machines, each machine handles about 100,000 requests per second, so a 12-bit sequence number (4096 IDs per millisecond per machine) is more than enough. To support a three-digit number of machines across multiple data centers, we only need to split the 10-bit worker ID: 3 bits identify the data center, allowing 8 data centers, and the remaining 7 bits identify the machine, allowing 128 machines per data center. The data format then becomes: 1 sign bit, 41 timestamp bits, 3 data center bits, 7 machine bits and 12 sequence bits.
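The capacity figures for this adjusted layout can be checked with a few shifts. The sketch below is illustrative only: the constant names and the `pack` helper are hypothetical, and it assumes the 3-bit data center and 7-bit machine split described above.

```go
package main

import "fmt"

// Bit widths for the adjusted layout (assumed from the text above).
const (
	sequenceBits   = 12
	workerBits     = 7
	datacenterBits = 3

	workerShift     = sequenceBits                               // 12
	datacenterShift = sequenceBits + workerBits                  // 19
	timestampShift  = sequenceBits + workerBits + datacenterBits // 22
)

// pack is a hypothetical helper that combines the four fields into one int64.
// It assumes every argument already fits in its field.
func pack(ts, datacenter, worker, seq int64) int64 {
	return ts<<timestampShift | datacenter<<datacenterShift | worker<<workerShift | seq
}

func main() {
	fmt.Println(1<<datacenterBits, 1<<workerBits) // 8 128
	fmt.Println((1 << sequenceBits) * 1000)       // 4096000 IDs per second per machine
	fmt.Println(pack(1, 1, 1, 1))                 // 4722689
}
```

At 4,096,000 IDs per second per machine, the 12-bit sequence leaves a wide margin over the 100,000 requests per second estimated above.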
Code implementation
Implementation steps
Once the data structure above is understood, implementing a snowflake algorithm yourself is straightforward. The steps are as follows:
Get the current timestamp in milliseconds.
Compare the current timestamp with the last saved timestamp.
If it equals the last saved timestamp, add one to the sequence number (sequence).
If it is not equal, reset sequence to 0.
Finally, combine the fields with bit shifts and bitwise OR to produce the int64 value that the snowflake algorithm returns.
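The steps above can be sketched as follows. This is a minimal illustration, not the article's own implementation: the `generator` type, its field names and the constants are hypothetical, it uses the standard 10-bit machine ID layout, and it assumes the worker ID already fits in 10 bits. It also adds one thing the steps do not mention, which is waiting for the next millisecond when the sequence overflows.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	epochMs        = int64(1577808000000) // custom start time in milliseconds
	sequenceBits   = 12
	workerBits     = 10
	sequenceMask   = int64(1)<<sequenceBits - 1
	workerShift    = sequenceBits
	timestampShift = sequenceBits + workerBits
)

// generator is a hypothetical type holding the state needed between calls.
type generator struct {
	mu       sync.Mutex
	lastMs   int64 // timestamp of the previous ID
	workerID int64 // assumed to be in the range 0..1023
	sequence int64
}

// next follows the listed steps and returns one new ID.
func (g *generator) next() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()

	now := time.Now().UnixMilli() // step 1: current millisecond timestamp
	if now == g.lastMs {          // step 2: compare with the saved timestamp
		g.sequence = (g.sequence + 1) & sequenceMask // step 3: same millisecond
		if g.sequence == 0 {
			// Assumption beyond the listed steps: the sequence wrapped around,
			// so spin until the clock reaches the next millisecond.
			for now <= g.lastMs {
				now = time.Now().UnixMilli()
			}
		}
	} else {
		g.sequence = 0 // step 4: new millisecond, restart the sequence
	}
	g.lastMs = now

	// step 5: splice the fields together with shifts and bitwise OR
	return (now-epochMs)<<timestampShift | g.workerID<<workerShift | g.sequence
}

func main() {
	g := &generator{workerID: 1}
	fmt.Println(g.next(), g.next())
}
```

The sketch does not handle the system clock moving backwards; a production generator would normally detect that case and return an error or wait.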
Code implementation
First we need to define a Snowflake structure:
    type Snowflake struct {
        sync.Mutex         // lock
        timestamp    int64 // timestamp, in milliseconds
        workerid     int64 // worker node id
        datacenterid int64 // data center id
        sequence     int64 // sequence number
    }
Then we need to define some constants so that we can perform bit operations when using the snowflake algorithm:
    const (
        epoch            = int64(1577808000000) // start time in milliseconds: 2020-01-01 00:00:00 (UTC+8); valid for 69 years
        timestampBits    = uint(41)             // number of bits occupied by the timestamp
        datacenteridBits = uint(2)              // number of bits occupied by the data center id
        workeridBits     = uint(7)              // number of bits occupied by the machine id
        sequenceBits     = uint(12)             // number of bits occupied by the sequence
        timestampMax     = int64(-1 ^ (-1
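The listing breaks off partway through the constants. As a hedged illustration of the kind of bit operations these constants support, the sketch below derives the largest value of an n-bit field and the shift offset of each field from the bit widths given above. The `maxValue` helper and the shift names are hypothetical and are not taken from the article.

```go
package main

import "fmt"

const (
	timestampBits    = uint(41)
	datacenteridBits = uint(2)
	workeridBits     = uint(7)
	sequenceBits     = uint(12)

	// Hypothetical shift offsets: each field starts where the previous one ends.
	workeridShift     = sequenceBits
	datacenteridShift = sequenceBits + workeridBits
	timestampShift    = sequenceBits + workeridBits + datacenteridBits
)

// maxValue is a hypothetical helper that returns the largest value that fits
// in the given number of bits: shifting -1 left clears the low bits, and the
// XOR with -1 flips the result so that only those low bits remain set.
func maxValue(bits uint) int64 {
	return int64(-1) ^ (int64(-1) << bits)
}

func main() {
	fmt.Println(maxValue(datacenteridBits), maxValue(workeridBits), maxValue(sequenceBits)) // 3 127 4095
	fmt.Println(maxValue(timestampBits))                                                    // 2199023255551
	fmt.Println(workeridShift, datacenteridShift, timestampShift)                           // 12 19 21
}
```

A constructor could use such maxima to reject a worker or data center ID that does not fit in its field before any ID is generated.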