Storage and location of LSM

Updated 2025-04-06. From: SLTechnology News&Howtos (shulou)

Shulou (Shulou.com), 05/31

This article explains how an LSM tree stores data and how a record is located in that storage, using HBase as the example.

Storage of LSM

The main idea is to restructure the tree by splitting it into several levels. A write is acknowledged as soon as the first level has accepted it; the rest of the work is handled in the background.

The process is to write to the in-memory table (memtable) first, then merge it into a low-level SSTable, and finally merge that into a high-level SSTable.
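The write path above can be sketched in a few lines of Python. This is a toy model rather than how HBase or any real LSM engine is implemented: the class `TinyLSM`, its two size limits, and the two-level layout are assumptions made for illustration, and the flush and merge steps run synchronously here instead of in a background thread.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy LSM store: a memtable plus two levels of sorted, immutable tables."""

    def __init__(self, memtable_limit=2, level0_limit=2):
        self.memtable = {}        # in-memory writes, newest data
        self.level0 = []          # low-level SSTables, oldest first
        self.level1 = []          # one high-level SSTable: sorted (key, value) pairs
        self.memtable_limit = memtable_limit
        self.level0_limit = level0_limit

    def put(self, key, value):
        # The write is complete as soon as the memtable accepts it.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a sorted low-level SSTable.
        self.level0.append(sorted(self.memtable.items()))
        self.memtable = {}
        if len(self.level0) >= self.level0_limit:
            self._compact()

    def _compact(self):
        # Merge all low-level SSTables into the high-level one; newer values win.
        merged = dict(self.level1)
        for table in self.level0:  # oldest first, so later tables overwrite
            merged.update(table)
        self.level1 = sorted(merged.items())
        self.level0 = []

    def get(self, key):
        # Search newest data first: memtable, then level 0, then level 1.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.level0):
            hit = self._search(table, key)
            if hit is not None:
                return hit
        return self._search(self.level1, key)

    @staticmethod
    def _search(table, key):
        # Binary search in a sorted list of (key, value) pairs.
        i = bisect_left(table, (key,))
        if i < len(table) and table[i][0] == key:
            return table[i][1]
        return None
```

Reads check the newest data first, which is why an overwritten key returns its latest value even before the older copy has been merged away. The sketch assumes stored values are never `None`.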

The general structure of HBase is as follows: [figure not included]

Positioning

Trailer: this part has a fixed length and stores the offset of each segment. When an HFile is read, the Trailer is read first; it holds the starting position of each segment (each segment's Magic Number is used as a sanity check). The DataBlock Index is then loaded into memory, so that retrieving a key does not require scanning the entire HFile. Instead, the block containing the key is located in memory, the whole block is read into memory with a single disk I/O, and the key is then found inside that block. The DataBlock Index is evicted from memory using an LRU policy.
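A minimal sketch of that lookup, assuming the index stores the first key of every data block. `ToyHFile` is a hypothetical stand-in, not the real HFile format: it skips the Trailer and Magic Number checks and keeps the "disk" blocks in a Python list, counting each block access as one disk I/O.

```python
from bisect import bisect_right


class ToyHFile:
    """Hypothetical stand-in for an HFile: sorted data blocks plus a block index."""

    def __init__(self, sorted_items, block_size=2):
        # "Disk": sorted (key, value) pairs split into fixed-size blocks.
        self._blocks = [sorted_items[i:i + block_size]
                        for i in range(0, len(sorted_items), block_size)]
        # "DataBlock Index": the first key of every block, kept in memory.
        self.index = [block[0][0] for block in self._blocks]
        self.block_reads = 0

    def _read_block(self, n):
        self.block_reads += 1     # stands in for one disk I/O
        return self._blocks[n]

    def get(self, key):
        # Pick the last block whose first key is <= key, using only the index.
        n = bisect_right(self.index, key) - 1
        if n < 0:
            return None           # key sorts before the first block
        for k, v in self._read_block(n):
            if k == key:
                return v
        return None
```

Whatever the file size, a lookup touches at most one block; a key that sorts before the first block is rejected from the index alone, with no block read at all.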

First, we can quickly find the region (partition) that holds the row. Suppose the table has 1 billion records and occupies 1 TB, split into 500 regions of 2 GB each. Reading at most 2 GB of data is then enough to find the record.

Second, data is stored by column, or more precisely by column family. Suppose the region is divided into three column families of about 666 MB each, and the data being queried lives in one of them. A column family consists of one or more HStoreFiles; if each HStoreFile is 128 MB, this column family has five HStoreFiles on disk and the rest is in memory.

Third, the data is sorted. The record you want may be near the front or near the end; assuming it sits in the middle on average, we only need to traverse 2.5 HStoreFiles, roughly 300 MB in total (2.5 × 128 MB = 320 MB).

Finally, each HStoreFile (a wrapper around an HFile) is stored as key-value pairs, so it is enough to walk the keys in each data block and check whether they match the condition. Keys are generally of limited length. Assuming a key-to-value size ratio of 1:19 (and ignoring the other blocks in the HFile), only about 15 MB has to be examined to find the record. At a disk throughput of 100 MB/s, that takes only about 0.15 s. With the block cache (LRU policy) added, efficiency can be even higher.
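The four narrowing steps can be checked with a short calculation. The inputs are the article's own assumptions; the function name `estimate_scan` and the use of decimal units (1 TB = 1,000,000 MB) are choices made for this sketch. Without rounding 320 MB down to 300 MB, the result comes out at 16 MB and 0.16 s rather than 15 MB and 0.15 s.

```python
def estimate_scan(table_mb=1_000_000, regions=500, column_families=3,
                  store_file_mb=128, store_files_on_disk=5,
                  key_fraction=1 / 20, disk_mb_per_s=100):
    """Back-of-the-envelope estimate of the data examined to locate one record."""
    region_mb = table_mb / regions                        # step 1: one region
    family_mb = region_mb / column_families               # step 2: one column family
    scanned_mb = store_files_on_disk / 2 * store_file_mb  # step 3: half the files on average
    key_mb = scanned_mb * key_fraction                    # step 4: keys only (key:value = 1:19)
    seconds = key_mb / disk_mb_per_s
    return {
        "region_mb": region_mb,
        "family_mb": family_mb,
        "scanned_mb": scanned_mb,
        "key_mb": key_mb,
        "seconds": seconds,
    }


if __name__ == "__main__":
    for name, value in estimate_scan().items():
        print(f"{name}: {value:.2f}")
```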

That covers the storage and positioning of LSM. How it behaves in a specific deployment still needs to be verified in practice.
