This article explains the storage structure of HFile and how HBase quickly locates a rowkey within it. The content is fairly detailed; interested readers can use it as a reference, and I hope you find it helpful.
I. Introduction to the structure of HFile
To support random access to data, the HFile structure is divided into six parts:
1. Data blocks - store the table's data. Each data block consists of a block header and a number of KeyValue records, with keys stored in strictly sorted order. The block size defaults to 64 KB (set per column family at table-creation time, for example via HColumnDescriptor.setBlockSize(size)), and blocks can be stored compressed. When data is queried, blocks are loaded from disk into memory one block at a time, and the KeyValue pairs inside a block are scanned sequentially.
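As a concrete illustration, here is a minimal sketch of setting the data block size per column family at table-creation time, using the HBase 2.x Java client API. The table name, the family name, and the explicit 64 KB value (which is already the default) are all just for demonstration.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class CreateTableWithBlockSize {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            admin.createTable(TableDescriptorBuilder
                .newBuilder(TableName.valueOf("demo_table"))      // hypothetical table name
                .setColumnFamily(ColumnFamilyDescriptorBuilder
                    .newBuilder(Bytes.toBytes("cf"))              // hypothetical family name
                    .setBlocksize(64 * 1024)                      // HFile data block size, 64 KB
                    .build())
                .build());
        }
    }
}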
2. Metadata blocks (optional) - store user-defined key-value pairs and can be compressed. A Bloom filter, for example, lives in a metadata block: the block keeps only the filter's value, while its key is stored in the metadata index block. Each metadata block consists of a block header and a value. A Bloom filter makes it possible to quickly determine whether a given key could be present in this HFile at all.
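To make "quickly determine whether the key could be in this HFile" concrete, here is a toy Bloom filter sketch. It is not HBase's actual implementation (HBase keeps its Bloom filters, configured per column family via a BloomType, in these metadata blocks); it only shows why a negative answer is definitive while a positive answer may be a false positive.

import java.util.Arrays;
import java.util.BitSet;

public class ToyBloomFilter {
    private final BitSet bits;
    private final int m;  // number of bits
    private final int k;  // number of hash probes

    public ToyBloomFilter(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    // Derive the i-th probe position from one hash; illustration only.
    private int probe(byte[] rowkey, int i) {
        int h = 31 * Arrays.hashCode(rowkey) + i * 0x9E3779B9;
        return Math.floorMod(h, m);
    }

    public void add(byte[] rowkey) {
        for (int i = 0; i < k; i++) bits.set(probe(rowkey, i));
    }

    // false => the rowkey is definitely NOT in this file (skip it entirely);
    // true  => the rowkey MAY be here (the data block must still be read).
    public boolean mightContain(byte[] rowkey) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(probe(rowkey, i))) return false;
        }
        return true;
    }
}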
3. File Info - the HFile's own metadata. This section is not compressed, and users can add their own metadata here.
4. Data index block - the index over the data blocks. The key of each index entry is the key of the first record in the indexed block (entry format: data block offset, data block size, key of the first record in the block).
The hfile.index.block.max.size parameter controls the size of the index blocks in an HFile. The default is 128 KB, which means that once the accumulated index information exceeds 128 KB, a new index block is allocated. HBase reaches the data in an HFile through these index blocks: the index is what locates the data block in which the sought record lives.

Index blocks come in three kinds: root index blocks, branch index blocks, and leaf index blocks. A root index block always exists, but branch and leaf index blocks may be absent when the HFile contains only a few data blocks. When a single index block can no longer describe all of the data blocks, the index splits into leaf index blocks plus a root index block whose entries point at those leaves; if the number of data blocks keeps growing, branch index blocks appear between the root and the leaves, and the whole index tree deepens by another level.
Imagine that the entire HFile has only a root index block. The path to the actual data is then: look up the root index block to locate the data block, then scan that data block to find the record you need. The whole process involves one scan of an index block and one scan of a data block.

If the HFile has more blocks and the index structure has two levels, the access path becomes: visit the root index block to locate a leaf index block, then visit that leaf index block to locate the data block. The whole process now involves two index block scans and one data block scan.

The deeper the index tree, the longer the access path and the longer the corresponding scan time.
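Schematically, the multi-level lookup works as in the sketch below. This is a simplified illustration, not HBase's actual HFileBlockIndex code: each index block holds the sorted first keys of its children, and a lookup binary-searches for the last entry whose first key is less than or equal to the target key, descending from the root toward the data blocks.

import java.util.Arrays;
import java.util.Comparator;

// Schematic multi-level HFile index lookup (simplified illustration).
class IndexBlock {
    static final Comparator<byte[]> CMP = Arrays::compare;

    byte[][] firstKeys;       // first key of each child block, sorted ascending
    IndexBlock[] children;    // non-null for root/branch levels
    long[] dataBlockOffsets;  // non-null at leaf level: offsets of data blocks

    // Returns the file offset of the data block that may contain 'key'.
    long locate(byte[] key) {
        int pos = Arrays.binarySearch(firstKeys, key, CMP);
        if (pos < 0) pos = -pos - 2;   // last firstKey <= key
        if (pos < 0) pos = 0;          // key precedes every entry
        return (children != null)
                ? children[pos].locate(key)   // descend root -> branch -> leaf
                : dataBlockOffsets[pos];      // leaf entry points at a data block
    }
}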
Should hfile.index.block.max.size therefore be set as large as possible? No: if an index block is too large, the time spent scanning the index block itself grows significantly.
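For completeness: the knob in question is the hfile.index.block.max.size property (default 131072 bytes). A minimal sketch of reading and overriding it on a Hadoop Configuration follows; in practice it would normally be set in hbase-site.xml rather than in code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class IndexBlockSizeConfig {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Allocate a new index block once the in-progress one exceeds 128 KB.
        conf.setInt("hfile.index.block.max.size", 128 * 1024);
        System.out.println(conf.getInt("hfile.index.block.max.size", -1));
    }
}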
The root index block must be cached in memory; it is loaded into the cache when the HFile is opened.
HFile data blocks and metadata blocks are usually stored compressed. Compression greatly reduces network IO and disk IO; the cost, of course, is the CPU spent on compressing and decompressing.
Factors that affect the depth of the HFile data block index (a rough worked example follows this list):
hfile.data.block.size (default 64 KB): for the same amount of data, smaller data blocks mean more data blocks, hence more index blocks and a deeper index.
hfile.index.block.max.size (default 128 KB): controls the size of index blocks; smaller index blocks mean more index blocks are needed, and the index gets deeper.
Key length in the table: longer keys make the index larger and therefore deeper.
Amount of data stored in the HFile: the more data, the deeper the index.
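As a rough, illustrative calculation (the figure of ~50 bytes per index entry is an assumption, since entry size depends on the key length): a 128 KB index block holds about 128 × 1024 / 50 ≈ 2,600 entries. With 64 KB data blocks, a single root index block can therefore cover roughly 2,600 × 64 KB ≈ 160 MB of data; an HFile much larger than that needs a second index level, and the depth grows roughly logarithmically from there.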
5. Metadata index block (optional) - the index over the metadata blocks.
6. Trailer - a fixed-length section that records the offset of each of the other sections (each section consists of blocks of one type). When an HFile is read, the Trailer is read first; it gives the starting position of each section (each section's magic number serves as a sanity check), and then the data index is loaded into memory. After that, looking up a key no longer requires scanning the whole HFile: the block containing the key is found in memory via the index, the whole block is read into memory with a single disk IO, and the key is then located inside it. Data index blocks are evicted by an LRU mechanism.
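As a sketch of the "trailer first" open sequence just described: the trailer has a fixed length and sits at the very end of the file, so a reader seeks there first and uses the offsets stored in it to find the other sections. The TRAILER_SIZE constant and the two-offset layout below are placeholders for illustration, not HBase's real trailer format.

import java.io.IOException;
import java.io.RandomAccessFile;

public class TrailerFirstOpen {
    static final int TRAILER_SIZE = 16; // hypothetical: two 8-byte offsets

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile("some.hfile", "r")) {
            f.seek(f.length() - TRAILER_SIZE);    // jump straight to the trailer
            long dataIndexOffset = f.readLong();  // where the data index section starts
            long metaIndexOffset = f.readLong();  // where the meta index section starts
            f.seek(dataIndexOffset);              // load the data index into memory next
            // ... read index blocks here and cache the root index (LRU) ...
        }
    }
}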
II. How to find a rowkey among a set of HFiles?
If a Bloom filter was specified when the table was created, HBase can use it to quickly decide whether the rowkey could be in a given HFile.
If no Bloom filter was defined, HBase first filters by timestamp and by the queried columns, discarding those storefiles and memstores that certainly do not contain the required data, to narrow the query target as far as possible.
Even after this narrowing, there may still be multiple files to scan. Each storefile is internally sorted, but the storefiles are not sorted relative to one another; their rowkey ranges are likely to overlap. The query therefore cannot be a simple sequential pass over the storefiles.
HBase first looks at the smallest rowkey of each storefile, sorts the storefiles by it in ascending order, and puts the result into a queue. The sort follows HBase's three-dimensional ordering: by rowkey, then column, then timestamp, with rowkey and column ascending and timestamp descending.
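A small sketch of that three-dimensional ordering on simplified records (the SimpleCell type here is made up for illustration; HBase's real ordering is implemented by its CellComparator and also involves family, qualifier, and type):

import java.util.Arrays;
import java.util.Comparator;

public class CellOrder {
    // Simplified stand-in for an HBase cell (illustration only).
    record SimpleCell(byte[] rowkey, byte[] column, long timestamp) {}

    static final Comparator<byte[]> BYTES = Arrays::compare;

    // rowkey ascending, then column ascending, then timestamp DESCENDING,
    // so the newest version of a cell sorts first.
    static final Comparator<SimpleCell> KV_ORDER =
        Comparator.comparing(SimpleCell::rowkey, BYTES)
                  .thenComparing(SimpleCell::column, BYTES)
                  .thenComparing(Comparator.comparingLong(SimpleCell::timestamp).reversed());
}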
In fact, not every file that survives the timestamp and column filtering is added to this queue: HBase first probes the data in each storefile and only enqueues those storefiles that still contain records at or beyond the current query's rowkey.
The next step is to read the data, and the whole process resembles a merge sort. First the storefile at the head of the queue is polled, and one record is read from it and returned. However, the next record of that same storefile is not necessarily the next record of the overall result, because the queue is ordered only by each storefile's current smallest rowkey. So HBase compares the second record of the former head storefile with the first records of the remaining storefiles in the queue: if the former is smaller, reading continues from the same storefile; otherwise the former head is put back into the queue, the queue re-sorts, and the new head is polled. This repeats until the HFile containing the key is found; once narrowed to that HFile, the block is located via the index described above and the record is found quickly.
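The queue-based merge described above can be sketched with a Java PriorityQueue ordered by each scanner's current key: polling always yields the storefile whose next record is globally smallest, and a scanner is put back after advancing. (HBase's real implementation lives in classes such as KeyValueHeap; this toy version merges plain rowkeys.)

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class StoreFileMergeSketch {
    // One storefile's scanner: internally sorted, exposes its current rowkey.
    static class Scanner {
        final Iterator<byte[]> it;
        byte[] current;
        Scanner(List<byte[]> sortedKeys) { it = sortedKeys.iterator(); advance(); }
        void advance() { current = it.hasNext() ? it.next() : null; }
    }

    public static void main(String[] args) {
        PriorityQueue<Scanner> heap =
            new PriorityQueue<>((a, b) -> Arrays.compare(a.current, b.current));
        // Two "storefiles" whose rowkey ranges overlap.
        heap.add(new Scanner(List.of("a".getBytes(), "d".getBytes())));
        heap.add(new Scanner(List.of("b".getBytes(), "c".getBytes())));

        while (!heap.isEmpty()) {
            Scanner head = heap.poll();                   // globally smallest current key
            System.out.println(new String(head.current)); // emit one record
            head.advance();                               // step to this file's next record
            if (head.current != null) heap.add(head);     // re-compete with the others
        }
        // Prints a, b, c, d: globally sorted although no single file holds them all.
    }
}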
That covers the HFile storage structure and how a rowkey is located quickly. I hope the content above has been helpful and taught you something new. If you found the article worthwhile, feel free to share it so more people can see it.