Principle of HBase Data Storage and Detailed Explanation of Reading and Writing

2025-12-20 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

1. The data storage principle of HBase.

An HRegionServer manages many regions. A region contains many stores, and each column family corresponds to one store: if a table has only one column family, each region has one store; if a table has N column families, each region has N stores. Each store has exactly one MemStore, which is an in-memory area. Written data is first buffered in the MemStore and is later flushed to disk.

A store contains many StoreFiles, and the data is ultimately saved on HDFS as many files in the HFile format.

StoreFile is an abstraction over HFile, so for practical purposes a StoreFile can be regarded as an HFile. Every time a MemStore flushes its data to disk, a new HFile is generated.
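The containment hierarchy described above can be sketched in a few lines of Python. This is only a toy model to make the relationships concrete, not HBase code: the class names `Store` and `Region`, the helper `for_table`, and the column family names `info` and `stats` are all hypothetical, and a StoreFile is modelled as a plain sorted list.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """One store per column family: a single MemStore plus many StoreFiles."""
    memstore: dict = field(default_factory=dict)    # in-memory write buffer
    storefiles: list = field(default_factory=list)  # each entry models one HFile

    def put(self, rowkey, value):
        # Writes are buffered in the MemStore first.
        self.memstore[rowkey] = value

    def flush(self):
        # Each flush writes the buffered data out as one new, sorted HFile.
        if self.memstore:
            self.storefiles.append(sorted(self.memstore.items()))
            self.memstore = {}


@dataclass
class Region:
    """A region holds one Store for each column family of the table."""
    stores: dict

    @classmethod
    def for_table(cls, column_families):
        return cls(stores={cf: Store() for cf in column_families})


region = Region.for_table(["info", "stats"])  # hypothetical column families
region.stores["info"].put("row2", "b")
region.stores["info"].put("row1", "a")
region.stores["info"].flush()
region.stores["info"].put("row3", "c")
region.stores["info"].flush()
print(len(region.stores), len(region.stores["info"].storefiles))  # prints "2 2"
```

Two column families give two stores, and two flushes of the `info` store leave two separate files behind, which is why small files accumulate over time.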

2. HBase data reading process

Note: an HBase cluster has only one meta table. This table has only one region, and that region's data is stored on a single HRegionServer.

1. The client first connects to ZooKeeper (zk) and finds the region location of the meta table, that is, which HRegionServer stores the meta table's data. The client then establishes a connection with that HRegionServer and reads the meta table, which stores the region information of all user tables. The contents of the meta table can be viewed with scan 'hbase:meta'.

2. Using the namespace, table name, and rowkey to be queried, find the region information for the requested data.

3. Find the RegionServer that hosts the region and send the request.

4. Locate the corresponding region.

5. Look for the data in the MemStore first; if it is not there, read from the BlockCache. The memory of a RegionServer is divided into two parts: one part serves as the MemStore and is mainly used for writes; the other serves as the BlockCache and is mainly used for reads.

6. If the data is not found in the BlockCache, read it from the StoreFiles. Data read from a StoreFile is not returned directly to the client; it is first written to the BlockCache to speed up subsequent queries, and then the result is returned to the client.

3. HBase write data flow

1. The client first finds the region location of the meta table from zk and then reads the meta table, which stores the region information of the user tables.

2. Using the namespace, table name, and rowkey, find the region information for the data to be written.

3. Find the RegionServer that hosts the region and send the request.

4. Write the data to both the HLog (write-ahead log) and the MemStore.

5. When the MemStore reaches its threshold, the data is flushed to disk, generating StoreFiles.

6. Delete historical HLog data.

Supplement: the HLog (write-ahead log), also known as the WAL, is similar to the binlog in MySQL and is used for disaster recovery. The HLog records every change to the data, so once data has been modified, the change can be recovered from the log.

4. HBase flush mechanism

4.1 Flush trigger conditions

4.1.1 MemStore-level limit

When the size of any MemStore in a region reaches the upper limit (hbase.hregion.memstore.flush.size, default 128 MB), a MemStore flush is triggered.

hbase.hregion.memstore.flush.size = 134217728

4.1.2 Region-level limit

When the total size of all MemStores in a region reaches the upper limit (hbase.hregion.memstore.block.multiplier * hbase.hregion.memstore.flush.size, here 2 * 128 MB = 256 MB), a MemStore flush is triggered. The default multiplier depends on the HBase version; recent releases ship with 4.

hbase.hregion.memstore.flush.size = 134217728
hbase.hregion.memstore.block.multiplier = 2

4.1.3 RegionServer-level limit

When the total size of all MemStores in a RegionServer exceeds the low-water threshold (hbase.regionserver.global.memstore.size.lower.limit * hbase.regionserver.global.memstore.size; the former defaults to 0.95), the RegionServer starts to force flushes. It flushes the region with the largest MemStore first, then the second largest, and so on. If writes arrive faster than flushes complete, so that the total MemStore size exceeds the high-water threshold hbase.regionserver.global.memstore.size (default 40% of the JVM heap memory), the RegionServer blocks updates and forces flushes until the total MemStore size falls below the low-water threshold.

hbase.regionserver.global.memstore.size.lower.limit = 0.95
hbase.regionserver.global.memstore.size = 0.4

4.1.4 HLog count limit

When the number of HLogs in a RegionServer reaches the upper limit (configurable through the parameter hbase.regionserver.maxlogs), the system selects the region or regions corresponding to the earliest HLog and flushes them.

4.1.5 Periodic MemStore flush

By default the MemStore is flushed once every hour, which ensures that data does not stay unpersisted for a long time. To avoid the problems caused by all MemStores flushing at the same moment, the periodic flush has a random delay of about 20000.

4.1.6 Manual flush

Users can run the shell command flush 'tablename' or flush 'region name' to flush a table or a region, respectively.

4.2 The process of flush

In order to reduce the impact of the flush process on reading and writing, the whole flush process is divided into three stages:

Prepare phase: iterate over all MemStores in the current region, take a snapshot of the current data set (CellSkipListSet) in each MemStore, and then create a new CellSkipListSet. Data written afterwards goes into the new CellSkipListSet. The prepare phase must hold the updateLock to block write requests, and releases the lock when the phase ends. Because this phase contains no time-consuming operations, the lock is held only briefly.

Flush phase: iterate over all MemStores and persist the snapshots generated in the prepare phase as temporary files, which are all placed in the .tmp directory. This phase is relatively time-consuming because it involves disk I/O.

Commit phase: iterate over all MemStores, move the temporary files generated in the flush phase to the corresponding ColumnFamily directory, create the StoreFile and Reader for each HFile, add the StoreFile to the storefiles list of the HStore, and finally clear the snapshot generated in the prepare phase.

5. Compaction mechanism

To prevent an accumulation of small files and keep queries efficient, HBase merges small StoreFiles into relatively large StoreFiles when necessary. This process is called compaction.

HBase has two main types of compaction: minor compaction (small merge) and major compaction (large merge).

4.3.1 Minor compaction

Merges several HFiles in a Store into one HFile.

In this process, some small, adjacent StoreFiles are selected and merged into a larger StoreFile. Data that has exceeded its TTL, updated data, and deleted data are only marked; nothing is physically deleted. The result of a minor compaction is fewer, larger StoreFiles. This kind of merge is triggered very frequently.
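As an illustration of the merge itself, here is a minimal Python sketch. It does not implement HBase's file-selection policy: the caller simply passes the index range of adjacent files to merge. The function name `minor_compact` and the cell layout `(rowkey, timestamp, value)`, with `None` standing for a delete marker, are assumptions made for this example.

```python
import heapq


def minor_compact(storefiles, start, count):
    """Merge `count` adjacent StoreFiles beginning at index `start` into one.

    Each StoreFile is a list of (rowkey, timestamp, value) cells sorted by
    rowkey, newest version first. Nothing is physically removed here:
    delete markers (value None) and old versions are carried over as-is.
    """
    selected = storefiles[start:start + count]
    merged = list(heapq.merge(*selected, key=lambda cell: (cell[0], -cell[1])))
    return storefiles[:start] + [merged] + storefiles[start + count:]


files = [
    [("row1", 1, "a"), ("row3", 1, "c")],
    [("row2", 2, "b")],
    [("row1", 3, None), ("row4", 3, "d")],  # None models a delete marker
    [("row5", 4, "e")],
]
compacted = minor_compact(files, start=1, count=2)
print(len(compacted))  # prints 3
print(compacted[1])
```

The two middle files become one sorted file, the delete marker for `row1` is still present afterwards, and the untouched files keep their positions.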

The trigger conditions of minor compaction are determined by the following parameters:

hbase.hstore.compactionThreshold = 3
hbase.hstore.compaction.max = 10
hbase.hstore.compaction.min.size = 134217728
hbase.hstore.compaction.max.size = 9223372036854775807

4.3.2 Major compaction

Merges all HFiles in a Store into one HFile.

Merging all StoreFiles into a single StoreFile also cleans up three kinds of meaningless data: deleted data, TTL-expired data, and data whose version count exceeds the configured maximum number of versions. Major compaction runs relatively infrequently, once every 7 days by default, and is very expensive in terms of performance. In production it is recommended to turn off the automatic trigger (set hbase.hregion.majorcompaction to 0) and trigger it manually while the application is idle, so that it does not run during peak business hours.
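The three cleanup rules can be made concrete with a small Python sketch. It is a simplified model rather than HBase's implementation: it assumes one column per row, cells of the form `(rowkey, timestamp, value)` sorted by rowkey with the newest version first, `None` as a delete marker that hides every older version of the row, and hypothetical parameters `now`, `ttl`, and `max_versions`.

```python
import heapq
from itertools import groupby


def major_compact(storefiles, now, ttl, max_versions):
    """Merge every StoreFile into one and physically drop meaningless cells."""
    merged = heapq.merge(*storefiles, key=lambda cell: (cell[0], -cell[1]))
    result = []
    for rowkey, cells in groupby(merged, key=lambda cell: cell[0]):
        kept = 0
        for _, ts, value in cells:    # newest version first
            if value is None:         # delete marker: it and all older versions go
                break
            if now - ts > ttl:        # TTL expired; older versions are expired too
                break
            if kept == max_versions:  # more versions than the configured maximum
                break
            result.append((rowkey, ts, value))
            kept += 1
    return [result]


files = [
    [("row1", 10, "a1"), ("row2", 10, "b1"), ("row3", 1, "c1")],
    [("row1", 20, "a2"), ("row2", 20, None)],
    [("row1", 30, "a3"), ("row4", 30, "d1")],
]
print(major_compact(files, now=40, ttl=35, max_versions=2))
# prints [[('row1', 30, 'a3'), ('row1', 20, 'a2'), ('row4', 30, 'd1')]]
```

In this run `row1` loses its oldest version (version limit), `row2` disappears entirely (deleted), and `row3` is dropped because it is older than the TTL. Every cell of every file is read and rewritten, which is why the operation is so expensive.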

Time-based trigger condition for major compaction

hbase.hregion.majorcompaction = 604800000 (7 days, in milliseconds)

Manual trigger

## Using the major_compact command
major_compact 'tableName'
