2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
What does the snowflake algorithm mean in MySQL? This question comes up often in day-to-day study and work. The following reference material from the editor walks through it, so let's take a look.
First, why use the snowflake algorithm
1. The background of the problem
More and more companies are adopting distributed systems and microservices, so each service gets its own database. Once the data volume grows, the tables are split as well, and splitting tables raises the question of how to generate ids.
For example, in a single project the primary key id of a table is usually auto-incremented: MySQL uses AUTO_INCREMENT and Oracle uses sequences. Once a single table holds too much data, it has to be split horizontally. Alibaba's Java development guidelines suggest splitting a table once it exceeds 5 million rows, but this depends on the business; if queries are served by indexes, tens of millions of rows in a single table can also work. Horizontal splitting distributes the rows of one table across several tables. If each of those tables keeps generating its primary key by auto-increment as before, the same id will appear in more than one table, so a distributed id scheme is needed.
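To make the duplication concrete, here is a minimal sketch in plain Java. ShardTable is a hypothetical stand-in for one physical table with its own auto-increment counter; it is not a real database API, only an illustration of why independent counters collide.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for one physical table with its own auto-increment counter. */
class ShardTable {
    private final AtomicLong autoIncrement = new AtomicLong(0);

    /** Mimics an auto-increment insert: returns the next id local to this table. */
    long insert() {
        return autoIncrement.incrementAndGet();
    }
}

class ShardCollisionDemo {
    /** Inserts one row into each of two split tables and reports whether the ids collide. */
    static boolean firstIdsCollide() {
        ShardTable orders0 = new ShardTable();
        ShardTable orders1 = new ShardTable();
        long idA = orders0.insert(); // 1
        long idB = orders1.insert(); // also 1, because the counters are independent
        return idA == idB;
    }
}
```

Calling ShardCollisionDemo.firstIdsCollide() returns true: both tables hand out id 1 for their first row.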
2. Solution
2.1. Database table
You can maintain a dedicated table in one database. Whenever any table needs a new id, query the record in this table with for update to lock it, add one to the value, write the new value back, and return it. Because every request has to take the lock, this method only suits projects with relatively low concurrency.
2.2. Redis
Redis executes commands on a single thread, so you can keep a key-value pair in Redis and have whichever table needs an id go to Redis and add one to it. As with the database table, every request is serialized through one place, so this approach is mainly suitable for projects with lower concurrency.
2.3. UUID
You can use a UUID as a non-repeating primary key id. The problem is that a UUID is an unordered string: used as the primary key, it makes the primary key index far less effective, because new rows land at random positions in the index instead of being appended at the end.
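A small sketch of the ordering problem, using only java.util.UUID from the standard library. The class and method names are made up for illustration.

```java
import java.util.UUID;

class UuidKeyDemo {
    /** Returns a random (version 4) UUID in its 36-character string form. */
    static String newUuidKey() {
        return UUID.randomUUID().toString();
    }

    /**
     * Generates n UUID keys one after another and counts how many of them
     * sort before the key generated just before them.
     */
    static int outOfOrderPairs(int n) {
        int outOfOrder = 0;
        String previous = newUuidKey();
        for (int i = 1; i < n; i++) {
            String current = newUuidKey();
            if (current.compareTo(previous) < 0) {
                outOfOrder++;
            }
            previous = current;
        }
        return outOfOrder;
    }
}
```

For random UUIDs roughly half of the consecutive pairs come out in descending order, whereas an auto-increment or snowflake id never does. The key is also a 36-character string rather than an 8-byte long.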
2.4. Snowflake algorithm
The snowflake algorithm is an efficient solution for distributed ids. Many Internet companies use it, and of course some companies implement other schemes of their own.
Second, the snowflake algorithm
1. Principle
The snowflake algorithm stores an id in a 64-bit long. The highest bit is the sign bit: 0 for positive numbers and 1 for negative numbers. Ids are positive, so it always stays 0. The next 41 bits store a millisecond timestamp, 10 bits store the machine code (a 5-bit datacenterId and a 5-bit workerId), and 12 bits store the sequence number. This allows at most 2 to the power of 10, that is 1024, machines, each producing at most 2 to the power of 12, or 4096, ids per millisecond. (There is a code implementation below.)
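The bit layout can be sketched with plain shifts and masks. This is only an illustration of the 1 + 41 + 5 + 5 + 12 split described above; the class and method names are assumptions, not part of any library.

```java
class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int TIMESTAMP_BITS = 41;

    // Largest value that fits in each field.
    static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BITS);     // 4095
    static final long MAX_WORKER = ~(-1L << WORKER_BITS);         // 31
    static final long MAX_DATACENTER = ~(-1L << DATACENTER_BITS); // 31
    static final long MAX_TIMESTAMP = ~(-1L << TIMESTAMP_BITS);   // 2^41 - 1

    // How far each field is shifted to the left inside the 64-bit id.
    static final int WORKER_SHIFT = SEQUENCE_BITS;                          // 12
    static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;         // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;  // 22

    /** Packs the four fields into one long; the sign bit is left at 0. */
    static long compose(long elapsedMillis, long datacenterId, long workerId, long sequence) {
        if (elapsedMillis < 0 || elapsedMillis > MAX_TIMESTAMP
                || datacenterId < 0 || datacenterId > MAX_DATACENTER
                || workerId < 0 || workerId > MAX_WORKER
                || sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("field out of range");
        }
        return (elapsedMillis << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampOf(long id) { return id >>> TIMESTAMP_SHIFT; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & MAX_DATACENTER; }
    static long workerOf(long id) { return (id >>> WORKER_SHIFT) & MAX_WORKER; }
    static long sequenceOf(long id) { return id & MAX_SEQUENCE; }
}
```

For example, SnowflakeLayout.compose(1, 3, 7, 42) yields 4616234, and the four accessor methods recover 1, 3, 7 and 42 from it. Because the timestamp comes first, ids generated later on the same machine compare as larger numbers.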
In practice we usually don't have that many machines, so a 53-bit id can be used instead. Why 53 bits?
Because we almost always deal with web pages, the ids have to pass through JavaScript. JavaScript can represent integers exactly only up to 53 bits; beyond that range precision is lost. An id within 53 bits can be read directly by JavaScript, while a longer one has to be converted to a string so that JavaScript handles it correctly. The 53 bits are laid out as a 32-bit timestamp in seconds, a 5-bit machine code, and a 16-bit sequence number, so each machine can produce 65536 non-repeating ids per second.
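The 53-bit variant can be checked the same way. The sketch below assumes the 32 + 5 + 16 split described above; the names are hypothetical, and JS_MAX_SAFE_INTEGER mirrors JavaScript's Number.MAX_SAFE_INTEGER (2^53 - 1).

```java
class Snowflake53Layout {
    static final int SEQUENCE_BITS = 16;
    static final int MACHINE_BITS = 5;
    static final int SECONDS_BITS = 32;

    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 65535
    static final long MAX_MACHINE = (1L << MACHINE_BITS) - 1;   // 31
    static final long MAX_SECONDS = (1L << SECONDS_BITS) - 1;   // 2^32 - 1

    /** Largest integer JavaScript can represent exactly: 2^53 - 1. */
    static final long JS_MAX_SAFE_INTEGER = (1L << 53) - 1;

    /** Packs seconds, machine code and sequence number into the low 53 bits of a long. */
    static long compose(long elapsedSeconds, long machineId, long sequence) {
        if (elapsedSeconds < 0 || elapsedSeconds > MAX_SECONDS
                || machineId < 0 || machineId > MAX_MACHINE
                || sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("field out of range");
        }
        return (elapsedSeconds << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    /** True when the id survives a round trip through a JavaScript number. */
    static boolean isSafeInJavaScript(long id) {
        return id >= 0 && id <= JS_MAX_SAFE_INTEGER;
    }
}
```

Even with every field at its maximum, compose returns exactly 2^53 - 1, so any id in this layout stays within JavaScript's safe range. A full 64-bit snowflake id generally does not, which is why it has to be sent as a string.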
2. Shortcomings
Because the snowflake algorithm depends heavily on time, a server clock rollback may lead to duplicate ids. In practice companies almost never change the server time, since doing so causes all kinds of problems; they would rather add a new server than modify the clock. Special cases cannot be ruled out, though.
How can clock rollback be handled? Give the initial sequence value a step size: each time a clock rollback is detected, the initial value is increased by 10,000. This can be done at line 85 of the code below, by setting the initial value of sequence to 10000.
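One way to read this idea is sketched below. It is an interpretation, not the article's own code, and it makes several assumptions: it uses the 53-bit layout with a 16-bit sequence number (a step of 10,000 does not fit into a 12-bit one), the clock is injected as a LongSupplier returning seconds so that a rollback can be simulated, and it simply fails when the sequence range is used up.

```java
import java.util.function.LongSupplier;

class RollbackTolerantGenerator {
    private static final int SEQUENCE_BITS = 16;
    private static final int MACHINE_BITS = 5;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 65535
    private static final long MAX_MACHINE = (1L << MACHINE_BITS) - 1;   // 31
    private static final long ROLLBACK_STEP = 10_000L;

    private final long machineId;
    private final LongSupplier clockSeconds; // assumed: seconds since some fixed start time
    private long lastSeconds = -1L;
    private long sequenceStart = 0L; // raised by ROLLBACK_STEP on every rollback
    private long sequence = 0L;

    RollbackTolerantGenerator(long machineId, LongSupplier clockSeconds) {
        if (machineId < 0 || machineId > MAX_MACHINE) {
            throw new IllegalArgumentException("machineId out of range");
        }
        this.machineId = machineId;
        this.clockSeconds = clockSeconds;
    }

    synchronized long nextId() {
        long now = clockSeconds.getAsLong();
        if (now < lastSeconds) {
            // Clock went backwards: move to a fresh sequence range so that ids
            // issued for the replayed seconds differ from the earlier ones.
            sequenceStart += ROLLBACK_STEP;
            if (sequenceStart > MAX_SEQUENCE) {
                throw new IllegalStateException("too many clock rollbacks");
            }
            sequence = sequenceStart;
        } else if (now == lastSeconds) {
            sequence++;
            if (sequence > MAX_SEQUENCE) {
                throw new IllegalStateException("sequence exhausted for this second");
            }
        } else {
            sequence = sequenceStart;
        }
        lastSeconds = now;
        return (now << (MACHINE_BITS + SEQUENCE_BITS)) | (machineId << SEQUENCE_BITS) | sequence;
    }
}
```

The trade-off is visible in the numbers: the scheme only avoids duplicates while no single second before the rollback produced more than 10,000 ids, and a 16-bit sequence leaves room for six such rollbacks before the generator gives up.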
Third, code implementation
64-bit code implementation:
package com.yl.common;

/**
 * Twitter_Snowflake
 * The structure of SnowFlake is as follows (each part is separated by -):
 * 0 - 0000000000 0000000000 0000000000 0000000000 0 - 00000 - 00000 - 000000000000
 * 1-bit sign. Because the basic type long is signed in Java, the highest bit is the sign bit: 0 for positive numbers and 1 for negative numbers. Ids are generally positive, so the highest bit is 0.
 * 41-bit timestamp (milliseconds). Note that the 41-bit timestamp does not store the current time itself but the difference between two timestamps (current timestamp - start timestamp). The start timestamp is usually the time at which the id generator was first put into use, and it is specified by the program (such as the startTime attribute of the IdWorker class below). A 41-bit timestamp can be used for 69 years, T = (1L)
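As a rough illustration of the 64-bit scheme described in that comment, a generator might look like the following sketch. It is not the original IdWorker listing: the class name, the start time value (2020-01-01 00:00:00 UTC) and the injectable clock are all assumptions, and on a clock rollback this version simply refuses to generate ids rather than applying the step-size idea from the previous section.

```java
import java.util.function.LongSupplier;

class SnowflakeIdSketch {
    /** Assumed start time: 2020-01-01T00:00:00Z in epoch milliseconds. */
    private static final long START_TIME_MILLIS = 1577836800000L;

    private static final int SEQUENCE_BITS = 12;
    private static final int WORKER_BITS = 5;
    private static final int DATACENTER_BITS = 5;

    private static final long SEQUENCE_MASK = ~(-1L << SEQUENCE_BITS); // 4095
    private static final long MAX_WORKER = ~(-1L << WORKER_BITS);       // 31
    private static final long MAX_DATACENTER = ~(-1L << DATACENTER_BITS); // 31

    private static final int WORKER_SHIFT = SEQUENCE_BITS;                         // 12
    private static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;        // 17
    private static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    private final long datacenterId;
    private final long workerId;
    private final LongSupplier clockMillis;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdSketch(long datacenterId, long workerId) {
        this(datacenterId, workerId, System::currentTimeMillis);
    }

    /** The clock is injectable so that the behaviour can be tested without waiting. */
    SnowflakeIdSketch(long datacenterId, long workerId, LongSupplier clockMillis) {
        if (datacenterId < 0 || datacenterId > MAX_DATACENTER) {
            throw new IllegalArgumentException("datacenterId out of range");
        }
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
        this.clockMillis = clockMillis;
    }

    synchronized long nextId() {
        long now = clockMillis.getAsLong();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards, refusing to generate id");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // All 4096 ids of this millisecond are used: wait for the next one.
                while (now <= lastTimestamp) {
                    now = clockMillis.getAsLong();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - START_TIME_MILLIS) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }
}
```

With the default constructor, new SnowflakeIdSketch(2, 3).nextId() returns a positive long that grows over time. Each generator instance needs its own datacenterId and workerId pair, otherwise two machines can still produce the same id.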