2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article describes the problems caused by manually modifying timestamps in HBase: first a short summary of what we ran into, then a more detailed explanation of why it happens.
Business background: when data is written, a given field of a record can have multiple versions at the same time.
According to the documentation, the timestamp can be set manually when inserting data into HBase, which lets us tell the versions apart by timestamp. After doing this, however, we found that the data could no longer be deleted.
Analysis showed the cause: the manually set timestamp of the inserted data was greater than the timestamp of the delete. The same thing can happen when timestamps are taken from the client's clock and the client system time is ahead of the cluster system time.
In conclusion, the client machine that runs the HBase Java code should preferably keep its clock synchronized with the HBase cluster servers, which avoids the problem above.
Storage systems such as OB and HBase generally attach a timestamp (ts) to data when it is inserted.
HBase has a TTL (time to live) that defines how long data lives before it expires. For example, a TTL of one day (86400 seconds, that is 86400*1000 milliseconds) means the data expires one day after it is written. In HBase the TTL is a schema-level setting on the column family, expressed in seconds, and it is usually specified when the table is created.
But suppose you need to keep the data for the current day only and have it expire at 0:00 the next day. TTL cannot do this directly, because TTL only makes data expire after a fixed period of time, not at a specific point in time.
In essence, TTL compares the ts of the data with the current system time to decide whether the data has expired, so we can hack it through ts.
Assume the TTL of the data is 1 day. If I insert the data at 1 a.m., it will normally expire at 1 a.m. the next day. The actual check is currentMilliseconds - ts > 86400*1000; if it holds, the data is treated as expired.
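The check described above can be written down as a minimal sketch. This is only a model of the comparison stated in the text, not HBase's internal code; the class name `TtlCheck` and the method `isExpired` are hypothetical, and times are counted in milliseconds from an arbitrary day 0.

```java
import java.util.concurrent.TimeUnit;

// Simplified model of the TTL rule from the text (not HBase internals).
final class TtlCheck {
    // Data counts as expired once (now - ts) is strictly greater than the TTL.
    static boolean isExpired(long nowMillis, long tsMillis, long ttlMillis) {
        return nowMillis - tsMillis > ttlMillis;
    }

    public static void main(String[] args) {
        long ttl = TimeUnit.DAYS.toMillis(1);            // 86400 * 1000
        long ts = TimeUnit.HOURS.toMillis(1);            // written at 01:00 on day 0
        long nextMidnight = TimeUnit.DAYS.toMillis(1);   // 00:00 on day 1
        long justAfterOneDay = ts + ttl + 1;             // 1 ms past 01:00 on day 1

        System.out.println(isExpired(nextMidnight, ts, ttl));    // only 23 hours old
        System.out.println(isExpired(justAfterOneDay, ts, ttl)); // older than the TTL
    }
}
```

With the real write time as ts, the data is still alive at 0:00 the next day, which is exactly the limitation the article wants to work around.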
If we want the data to expire at 0:00 the next day instead, we can set the ts of the inserted data one hour earlier (to 0:00 of the current day), and the data will expire one hour early.
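One way to compute such a ts is to take the next midnight and subtract the TTL. The sketch below assumes the expiry rule stated above and a TTL of one day; `MidnightExpiry` and `timestampForMidnightExpiry` are hypothetical names, and the time zone is passed in explicitly because "midnight" depends on it.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Sketch: derive a write timestamp so that ts + TTL lands on the next midnight.
final class MidnightExpiry {
    static long timestampForMidnightExpiry(Instant now, ZoneId zone, Duration ttl) {
        // Start of the next calendar day in the given zone.
        ZonedDateTime nextMidnight = now.atZone(zone).toLocalDate().plusDays(1).atStartOfDay(zone);
        // Shift back by the TTL so the TTL check trips right after that midnight.
        return nextMidnight.toInstant().toEpochMilli() - ttl.toMillis();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2025-01-01T01:00:00Z"); // example insert time, 01:00
        long ts = timestampForMidnightExpiry(now, ZoneId.of("UTC"), Duration.ofDays(1));
        System.out.println(Instant.ofEpochMilli(ts)); // one hour before the insert time
    }
}
```

The returned value would be passed as the explicit timestamp of the insert in place of the current time.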
This scheme looks fine in theory, but if your table also involves deleting data, a pitfall is waiting.
Ordinary HBase operations are first written to the WAL (write-ahead log). Once a certain amount has accumulated (or after a certain time), the operations are merged according to their ts and then committed to the real data. This is somewhat similar to a database log.
Implicit in this is that an operation in HBase only takes effect if its ts is greater than the ts already present in the data; otherwise it has no effect. Normally this holds automatically, because time only moves forward.
For example, there are currently two operations:
put 'key', 'value', ts=1
put 'key', 'value', ts=2
After merging, only one operation actually remains:
put 'key', 'value', ts=2 (because its timestamp is the larger one)
Next, if there are three operations:
put 'key', 'value', ts=1
put 'key', 'value', ts=2
del 'key', 'value', ts=3
Then, after merging, there is only the delete operation.
This is exactly where the pitfall lies, because we set the ts of the inserted data manually. To delete data, the ts of the delete operation must be larger than the ts of the original data (in our case, both timestamps lie in the future).
If the delete operation uses the system default ts, which is smaller than the manually set ts, the data is not deleted.
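The two merge examples and the failed delete can be reproduced with a small in-memory model. This is a toy sketch, not HBase code: `CellModel` is a hypothetical class, it tracks a single cell, and it assumes that a delete hides every put whose ts is less than or equal to its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Toy model of one cell; operations are resolved purely by their ts.
final class CellModel {
    private static final class Op {
        final boolean delete;
        final long ts;
        final String value;

        Op(boolean delete, long ts, String value) {
            this.delete = delete;
            this.ts = ts;
            this.value = value;
        }
    }

    private final List<Op> ops = new ArrayList<>();

    CellModel put(String value, long ts) {
        ops.add(new Op(false, ts, value));
        return this;
    }

    CellModel delete(long ts) {
        ops.add(new Op(true, ts, null));
        return this;
    }

    // Merge by ts: the newest put wins, unless a delete with an equal or larger ts hides it.
    Optional<String> read() {
        Op newestPut = null;
        Op newestDelete = null;
        for (Op op : ops) {
            if (op.delete) {
                if (newestDelete == null || op.ts >= newestDelete.ts) newestDelete = op;
            } else if (newestPut == null || op.ts >= newestPut.ts) {
                newestPut = op;
            }
        }
        if (newestPut == null) return Optional.empty();
        if (newestDelete != null && newestDelete.ts >= newestPut.ts) return Optional.empty();
        return Optional.of(newestPut.value);
    }

    public static void main(String[] args) {
        long now = 1_000L;              // hypothetical current time
        long future = now + 3_600_000L; // manually assigned ts, one hour ahead

        // Two puts: the one with ts=2 survives.
        System.out.println(new CellModel().put("v1", 1).put("v2", 2).read());
        // Two puts and a delete with ts=3: nothing is visible.
        System.out.println(new CellModel().put("v1", 1).put("v2", 2).delete(3).read());
        // Put with a future ts, delete with the default "now": the value is still visible.
        System.out.println(new CellModel().put("v", future).delete(now).read());
        // Delete with a ts above the put: the value is gone.
        System.out.println(new CellModel().put("v", future).delete(future + 1).read());
    }
}
```

The third case is the pitfall: the delete is accepted, but because its ts is lower than the ts of the put, it never hides the data.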
So we now know to give the delete a large ts. But from that point on, if you insert data again, the ts of the new insert must be larger than the ts of the delete. In general, for an operation on a cell to take effect, its ts must be larger than the largest ts in the current sequence of operations on that cell.
And if you accidentally set the ts of the delete to Long.MAX_VALUE, you will find that you can never insert data again... (at least not until the next major compaction.)
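Following the rule as stated above, a client that assigns timestamps itself could derive the next usable ts from the largest ts already used on the cell. The sketch below is an illustration of that rule only; `NextTimestamp` and `nextValid` are hypothetical names, and how the largest ts is obtained is left out.

```java
import java.util.OptionalLong;

// Sketch of the rule "a new operation needs a ts above everything already on the cell".
final class NextTimestamp {
    static OptionalLong nextValid(long largestTsInCell, long nowMillis) {
        if (largestTsInCell == Long.MAX_VALUE) {
            // No long value is larger than Long.MAX_VALUE, so no later operation can win.
            return OptionalLong.empty();
        }
        // Use the current time when it is already large enough, otherwise step just past the cell.
        return OptionalLong.of(Math.max(nowMillis, largestTsInCell + 1));
    }

    public static void main(String[] args) {
        long now = 1_000L; // hypothetical current time
        System.out.println(nextValid(500L, now));           // current time is enough
        System.out.println(nextValid(5_000L, now));         // must exceed the future ts on the cell
        System.out.println(nextValid(Long.MAX_VALUE, now)); // nothing can follow Long.MAX_VALUE
    }
}
```

The empty result for Long.MAX_VALUE is the situation described above: once the largest possible ts has been used, no later operation can have a larger one.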
© 2024 shulou.com SLNews company. All rights reserved.