
Detailed explanation of MemStore+Flush of HBase

2025-02-24 Update From: SLTechnology News&Howtos shulou

1. Introduction to MemStore:

The figure above gives a rough overview of the HBase read and write paths.

Write request path: client -> WAL (Write-Ahead Log) -> MemStore -> HFile -> END

Read request path: client -> MemStore -> BlockCache -> HFile -> END

The location of MemStore in HBase:

HBase is composed of a Master and HRegionServers. In practice, reads and writes rarely involve the Master; they are handled mainly by the HRegionServers. As the figure above shows, each HRegionServer consists of one HLog and multiple Regions. A Region contains multiple Stores, and each Store consists of one MemStore and multiple StoreFiles. The MemStore is an area that HBase keeps in memory, while a StoreFile is backed by an HFile, which is a file in HDFS.
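The containment hierarchy described above can be pictured with a few plain data classes. This is only an illustrative sketch: the class and field names (`Store`, `Region`, `RegionServer`, `hlog`, `store_files`) are hypothetical and are not HBase's actual classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Store:
    """One Store: a single MemStore plus zero or more StoreFiles."""
    memstore: Dict[str, str] = field(default_factory=dict)  # in-memory buffer
    store_files: List[str] = field(default_factory=list)    # names of HFiles in HDFS


@dataclass
class Region:
    """A Region contains multiple Stores (keyed here by a hypothetical family name)."""
    stores: Dict[str, Store] = field(default_factory=dict)


@dataclass
class RegionServer:
    """An HRegionServer has one HLog and multiple Regions."""
    hlog: List[str] = field(default_factory=list)
    regions: Dict[str, Region] = field(default_factory=dict)


# Build a tiny instance of the hierarchy.
server = RegionServer()
server.regions["region-1"] = Region(stores={"cf": Store()})
store = server.regions["region-1"].stores["cf"]
store.memstore["row1"] = "value1"       # data still in memory
store.store_files.append("hfile-0001")  # data already flushed to HDFS
```

Note that the HLog hangs off the server rather than off an individual Region or Store, which matches the "one HLog, multiple Regions" layout in the text.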

When MemStore comes into play:

Write: when the client initiates a write, the operation is first written to the WAL and then to the MemStore. When a preset condition is reached, the contents of the MemStore are flushed to a StoreFile, and the write path is complete.

(This raises two questions.

1. Why write to the WAL first?

The WAL is a file in HDFS, while the MemStore is an area in memory, and memory is volatile. Data in the MemStore reaches disk only when it is written to a StoreFile. If the system goes down before the MemStore has been flushed, its contents are lost, and HBase recovers them by replaying the WAL file stored in HDFS.

2. What is the flush strategy?

It is explained in detail below.

)
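The write path and the role of the WAL can be illustrated with a toy model. This is a minimal sketch under simplifying assumptions, not HBase code: `ToyRegionServer`, its methods, and the entry-count threshold are all hypothetical, and plain Python lists and dicts stand in for HDFS files and memory.

```python
class ToyRegionServer:
    """Toy model of the write path: WAL first, then MemStore, then flush."""

    def __init__(self, flush_threshold=3):
        self.wal = []          # stands in for the WAL file in HDFS (survives a crash)
        self.memstore = {}     # stands in for the MemStore (lost on a crash)
        self.store_files = []  # stands in for StoreFiles in HDFS (survive a crash)
        self.flush_threshold = flush_threshold  # assumed condition: number of entries

    def put(self, key, value):
        self.wal.append((key, value))  # 1. record the edit in the WAL
        self.memstore[key] = value     # 2. apply it to the MemStore
        if len(self.memstore) >= self.flush_threshold:
            self.flush()               # 3. flush once the condition is met

    def flush(self):
        if not self.memstore:
            return
        self.store_files.append(dict(self.memstore))  # persist as a new StoreFile
        self.memstore.clear()
        self.wal.clear()  # flushed edits are no longer needed for recovery

    def crash(self):
        self.memstore = {}  # memory contents are gone

    def recover(self):
        for key, value in self.wal:  # replay unflushed edits from the WAL
            self.memstore[key] = value


server = ToyRegionServer(flush_threshold=3)
server.put("row1", "a")
server.put("row2", "b")
server.crash()
lost = dict(server.memstore)       # empty: the MemStore did not survive
server.recover()
recovered = dict(server.memstore)  # both edits are back, rebuilt from the WAL
server.put("row3", "c")            # third entry reaches the threshold and flushes
```

If `put` wrote to the MemStore without logging first, the two edits made before `crash()` would be unrecoverable, which is the reason for the WAL-first ordering.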

Read: when the client initiates a read, HBase first looks in the MemStore of the corresponding Region. If the data is not found there, it looks in the BlockCache (a read optimization in HBase, explained in detail later). If it is still not found, it looks in the StoreFiles (HFiles), and the read is complete.
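The lookup order just described can be sketched as a small function. This is a simplified illustration that follows the description above, not HBase's real read path: the function name `read` and the dict-based stand-ins are hypothetical, and the toy cache stores individual values whereas a real block cache holds blocks of HFile data.

```python
def read(key, memstore, block_cache, store_files):
    """Return (value, source), checking MemStore, then BlockCache, then StoreFiles."""
    if key in memstore:
        return memstore[key], "memstore"
    if key in block_cache:
        return block_cache[key], "blockcache"
    # Check the newest StoreFile first so that the latest flushed value wins.
    for store_file in reversed(store_files):
        if key in store_file:
            block_cache[key] = store_file[key]  # cache it for later reads
            return store_file[key], "storefile"
    return None, "miss"


memstore = {"a": "1"}
block_cache = {}
store_files = [{"b": "old"}, {"b": "new", "c": "3"}]  # oldest first

first = read("a", memstore, block_cache, store_files)
second = read("b", memstore, block_cache, store_files)
third = read("b", memstore, block_cache, store_files)
fourth = read("z", memstore, block_cache, store_files)
```

The second read of `"b"` is served from the cache rather than from a StoreFile, which is the benefit the BlockCache is meant to provide.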

2. Introduction to Flush

Flush is an important operation in HBase, and a well-configured flush policy is necessary to keep an HBase cluster stable.

Flush is the operation that writes HBase data from memory to disk; data in the MemStore is persisted as a StoreFile only after a flush. Each flush generates a new StoreFile in the Region, after which the corresponding edits in the WAL are no longer needed and can be deleted.

Flush operates at the Region level. When the MemStore of any one Store in a Region reaches the preset condition, all the Stores in that Region are flushed.
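The Region-level behaviour can be sketched as follows. This toy model follows the description above and is not HBase code: `ToyRegion`, `flush_size`, and the way size is measured (key length plus value length) are assumptions made for illustration.

```python
class ToyRegion:
    """Toy Region: one MemStore per Store; a flush always covers every Store."""

    def __init__(self, families, flush_size):
        self.flush_size = flush_size                    # assumed per-MemStore threshold
        self.memstores = {cf: {} for cf in families}
        self.store_files = {cf: [] for cf in families}

    def _size(self, cf):
        # Assumed size measure: total characters of keys and values.
        return sum(len(k) + len(v) for k, v in self.memstores[cf].items())

    def put(self, cf, key, value):
        self.memstores[cf][key] = value
        if self._size(cf) >= self.flush_size:
            self.flush()  # one Store hit the threshold: flush the whole Region

    def flush(self):
        for cf, memstore in self.memstores.items():
            if memstore:
                self.store_files[cf].append(dict(memstore))  # new StoreFile per Store
                memstore.clear()


region = ToyRegion(["cf1", "cf2"], flush_size=10)
region.put("cf2", "r1", "x")           # size 3, below the threshold
region.put("cf1", "r1", "0123456789")  # size 12, triggers a Region-wide flush
```

Here `cf2` holds only three characters of data, yet it still produces a StoreFile because `cf1` crossed the threshold. A Store that receives few writes can therefore end up with many small StoreFiles.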

The following is the log produced when a table is flushed:

INFO [Priority.RpcServer.handler=1,port=60020] regionserver.HRegionServer: Flushing t1,,1413622522846.58fd75078b4a47b8c6a20705f23209b7.

2014-10-18 16:58 DEBUG [Priority.RpcServer.handler=1,port=60020] regionserver.HRegion: Started memstore flush for t1,,1413622522846.58fd75078b4a47b8c6a20705f23209b7., current region memstore size 168

2014-10-18 16:58:15 INFO [Priority.RpcServer.handler=1,port=60020] regionserver.DefaultStoreFlusher: Flushed, sequenceid=3, memsize=168, hasBloomFilter=true, into tmp file hdfs://beh/hbase/data/default/t1/58fd75078b4a47b8c6a20705f23209b7/.tmp/6ad49d65c8b94b678bab3c892bdb0d03

2014-10-18 16:58 DEBUG [Priority.RpcServer.handler=1,port=60020] regionserver.HRegionFileSystem: Committing store file hdfs://beh/hbase/data/default/t1/58fd75078b4a47b8c6a20705f23209b7/.tmp/6ad49d65c8b94b678bab3c892bdb0d03 as hdfs://beh/hbase/data/default/t1/58fd75078b4a47b8c6a20705f23209b7/cf/6ad49d65c8b94b678bab3c892bdb0d03

2014-10-18 16:58:15 INFO [Priority.RpcServer.handler=1,port=60020] regionserver.HStore: Added hdfs://beh/hbase/data/default/t1/58fd75078b4a47b8c6a20705f23209b7/cf/6ad49d65c8b94b678bab3c892bdb0d03, entries=1, sequenceid=3, filesize=1021

2014-10-18 16:58 INFO [Priority.RpcServer.handler=1,port=60020] regionserver.HRegion: Finished memstore flush of ~168, currentsize=0/0 for region t1,,1413622522846.58fd75078b4a47b8c6a20705f23209b7. in 1063ms, sequenceid=3, compaction requested=false

As the log shows, the MemStore is first flushed to a file under the .tmp directory, and that file is then moved to the directory of the corresponding column family (cf here) under the region directory.
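The two-step commit seen in the log (write under .tmp, then move into the column family directory) is a common way to avoid exposing half-written files. The sketch below reproduces the pattern on a local filesystem only; it does not talk to HDFS, and `flush_to_store_file` and its parameters are hypothetical names.

```python
import os
import tempfile
import uuid


def flush_to_store_file(region_dir, family, data):
    """Write data under <region_dir>/.tmp, then move it into <region_dir>/<family>."""
    tmp_dir = os.path.join(region_dir, ".tmp")
    family_dir = os.path.join(region_dir, family)
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(family_dir, exist_ok=True)

    file_name = uuid.uuid4().hex  # random 32-character hex name
    tmp_path = os.path.join(tmp_dir, file_name)
    with open(tmp_path, "wb") as f:
        f.write(data)             # step 1: the file is complete but not yet visible

    final_path = os.path.join(family_dir, file_name)
    os.replace(tmp_path, final_path)  # step 2: commit by moving it into place
    return final_path


with tempfile.TemporaryDirectory() as region_dir:
    path = flush_to_store_file(region_dir, "cf", b"row1=value1")
    committed = os.path.exists(path)
    tmp_left = os.listdir(os.path.join(region_dir, ".tmp"))
```

Because readers only ever look in the column family directory, they see either no file or a fully written one, never a partial flush.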


© 2024 shulou.com SLNews company. All rights reserved.
