Detailed explanation of the two caching mechanisms in HBase: MemStore and BlockCache (must read)


Background:

1. Caching is critically important to a database.

2. Ideally all data would be cached in memory: no request would ever touch file IO, and read/write performance would be pushed to its limit.

3. In practice we do not need to cache everything. By the 80/20 rule, 80% of business requests concentrate on 20% of the data (the hot data).

4. Caching that 20% of the data can therefore dramatically improve system performance.

HBase provides two cache structures in its implementation: MemStore and BlockCache.

MemStore

1. MemStore is known as the write cache.

2. When HBase performs a write, the data is written to MemStore and appended to the HLog.

// Reading the code, the data is actually appended to the HLog first and then written to MemStore.

3. Once certain conditions are met, the data in MemStore is flushed to disk in a single batch. This design greatly improves HBase's write performance.

4. MemStore also matters for read performance: without it, reading newly written data would require file IO, which is obviously expensive. (A minimal client-side sketch follows this list.)
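To make the write path concrete, here is a minimal client-side sketch using the standard HBase 2.x Java client; the table, family, and values are hypothetical, and the WAL/MemStore behavior described above happens inside the RegionServer when table.put() runs.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MemStoreWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("demo_table"))) { // hypothetical table
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
            // On the RegionServer, this mutation is appended to the HLog (WAL)
            // and written into the MemStore; a later flush persists it to an HFile.
            table.put(put);
        }
    }
}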

BlockCache

1. BlockCache is known as the read cache.

2. After HBase reads a Block from a file, it caches it in BlockCache, so that later requests for the same or adjacent data can be served straight from memory, avoiding expensive IO.

A brief review of the Block concept in HBase:

1. A Block is the smallest unit of data storage in HBase. The default size is 64K, and it can be set per column family with the BLOCKSIZE parameter in the create-table statement (a hedged example follows this list).

2. There are four types of Block in HBase: Data Block, Index Block, Bloom Block, and Meta Block.

3. Data Blocks store the actual data; each Data Block usually holds multiple KeyValue pairs.

4. Index Blocks and Bloom Blocks both optimize the lookup path for random reads.

5. Index Blocks speed up lookups by storing index data.

6. Bloom Blocks use a probabilistic filter to rule out data files that cannot contain the KeyValue being looked up, eliminating unnecessary IO.

7. Meta Blocks mainly store metadata for the HFile as a whole.
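As a hedged illustration of the BLOCKSIZE parameter from point 1, the same setting can be made through the HBase 2.x Java admin API; the table and family names here are made up.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class BlockSizeSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            ColumnFamilyDescriptor cf = ColumnFamilyDescriptorBuilder
                .newBuilder(Bytes.toBytes("cf"))
                .setBlocksize(128 * 1024)  // override the 64K default for this column family
                .build();
            admin.createTable(TableDescriptorBuilder
                .newBuilder(TableName.valueOf("demo_table")) // hypothetical table
                .setColumnFamily(cf)
                .build());
        }
    }
}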

1. BlockCache operates at the RegionServer level.

2. A RegionServer has a single BlockCache, initialized when the RegionServer starts.

3. To date, HBase has implemented three BlockCache schemes. LRUBlockCache is the original and still the default implementation. HBase 0.92 added the second scheme, SlabCache (HBASE-4027); HBase 0.96 added another official option, BucketCache (HBASE-7404).

4. The three schemes differ in how they manage memory.

5. LRUBlockCache keeps all data in the JVM heap and leaves memory management to the JVM.

6. SlabCache and BucketCache use different mechanisms to store part of the data off-heap, managed by HBase itself.

7. This evolution happened because the JVM garbage collector under LRUBlockCache often pauses the process for long stretches; keeping data in off-heap memory effectively avoids that.

LRUBlockCache // HBase's default BlockCache implementation

1. Memory is logically divided into three areas: single-access, multi-access, and in-memory, taking 25%, 50%, and 25% of the total BlockCache size respectively.

2. On a random read, a Block loaded from HDFS is first placed in the single-access area.

3. If the Block is accessed again, it is promoted to the multi-access area.

4. The in-memory area holds data that should stay resident in memory, typically small amounts of frequently accessed data such as metadata. Users can also place a column family in the in-memory area by setting IN_MEMORY = true when creating the table (see the sketch after this list). // This refers to the IN_MEMORY parameter discussed in the HBase create-table post: http://hbasefly.com/2016/03/23/hbase_create_table/

5. This design clearly mirrors the young, old, and perm generations in the JVM.
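Following up on point 4, this is a minimal sketch (assuming the HBase 2.x API) of marking a hypothetical column family IN_MEMORY so that its Blocks land in the in-memory area.

import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class InMemorySketch {
    // Equivalent to IN_MEMORY => 'true' in the shell's create statement.
    public static ColumnFamilyDescriptor hotFamily() {
        return ColumnFamilyDescriptorBuilder
            .newBuilder(Bytes.toBytes("meta_cf")) // hypothetical small, hot column family
            .setInMemory(true)                    // cache this family's Blocks in the in-memory area
            .build();
    }
}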

Stage summary:

LRUBlockCache mechanism: analogous to the JVM's young, old, and perm generations, the cache is divided into single-access, multi-access, and in-memory areas, taking 25%, 50%, and 25% of the total BlockCache size respectively. A randomly read Block is loaded from HDFS into the single-access area; if it is requested repeatedly it moves to the multi-access area; the in-memory area holds data meant to stay resident in memory, typically small amounts of frequently accessed data such as metadata.

// When the total size of the BlockCache reaches a threshold, the eviction mechanism kicks in and the least recently used Blocks are replaced to make room for newly loaded ones.

Disadvantage: under the LRUBlockCache mechanism, the CMS GC policy leads to heavy memory fragmentation, which can trigger the infamous Full GC and its dreaded stop-the-world pause, severely affecting the business layer.

How exactly does the CMS GC policy cause fragmentation, and how does fragmentation trigger a Full GC? See the blogger's separate post on the topic.

SlabCache // now obsolete

1. To eliminate the service pauses caused by JVM garbage collection under LRUBlockCache, the SlabCache scheme stores data off-heap using Java NIO's DirectByteBuffer, so the JVM no longer manages the data memory.

2. By default the system allocates two cache areas at initialization, taking 80% and 20% of the total BlockCache size respectively. Each area stores Blocks of a fixed size.

3. The first area stores Blocks of at most 64K, the second Blocks of at most 128K; a Block larger than that cannot be cached in either area.

4. Like LRUBlockCache, SlabCache evicts expired Blocks with a least-recently-used algorithm.

5. Unlike LRUBlockCache, when SlabCache evicts a Block it only marks the corresponding ByteBuffer slot as idle; a later cached Block simply overwrites that memory. (A minimal sketch follows.)
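To make the "mark idle, overwrite later" behavior concrete, here is a minimal slab-style pool sketch built on DirectByteBuffer. It only illustrates the idea; it is not HBase's actual SlabCache code.

import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy slab pool: one big off-heap buffer divided into fixed-size slots.
public class SlabPool {
    private final int slotSize;
    private final ByteBuffer slab;                 // one large DirectByteBuffer, outside the JVM heap
    private final Deque<Integer> freeSlots = new ArrayDeque<>();

    public SlabPool(int slotSize, int slotCount) {
        this.slotSize = slotSize;
        this.slab = ByteBuffer.allocateDirect(slotSize * slotCount);
        for (int i = 0; i < slotCount; i++) freeSlots.push(i * slotSize);
    }

    // Cache a block: copy it into a free fixed-size slot; fail if it does not fit.
    public int put(byte[] block) {
        if (block.length > slotSize || freeSlots.isEmpty()) return -1;
        int offset = freeSlots.pop();
        ByteBuffer slot = slab.duplicate();
        slot.position(offset);
        slot.put(block);
        return offset;
    }

    // "Evict" a block: just mark the slot idle; the next put overwrites the memory.
    public void evict(int offset) {
        freeSlots.push(offset);
    }
}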

The shortcoming of SlabCache, and how DoubleBlockCache came about:

In a live cluster, different tables and column families may set different BlockSize values, so a SlabCache that by default can only store two fixed Block sizes cannot cover every user scenario.

HBase therefore used SlabCache together with LRUBlockCache in the actual implementation, a combination called DoubleBlockCache.

1. The DoubleBlockCache scheme has many drawbacks. For example, the fixed-size slots in the SlabCache design lead to low effective memory utilization.

2. And caching Blocks in LRUBlockCache still produces heavy memory fragmentation through JVM GC.

3. For these reasons, the scheme is no longer recommended as of HBase 0.98.

Stage summary:

SlabCache: stores data off-heap, so data memory is no longer managed by the JVM.

DoubleBlockCache: in the actual implementation, HBase used SlabCache together with LRUBlockCache; the combination is called DoubleBlockCache.

Why use them together? Again, because of SlabCache's shortcomings:

1. In a live cluster, different tables and column families may set different BlockSize values. A SlabCache that by default can only store two fixed Block sizes cannot cover scenarios such as a user setting BlockSize = 256K; SlabCache alone simply cannot cache those Blocks.

2. On a random read, a Block loaded from HDFS is stored in both caches. On a cached read, LRUBlockCache is searched first; on a miss, SlabCache is searched, and if the Block is found there it is also put back into LRUBlockCache. (A minimal sketch of this read path follows.)
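A minimal sketch of that two-level read path, with ordinary maps standing in for the real caches (the names are hypothetical and the off-heap half is deliberately simplified):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy illustration of DoubleBlockCache's lookup order, not HBase's actual code.
public class DoubleLookupSketch {
    private final Map<String, byte[]> lruCache = new ConcurrentHashMap<>();  // stands in for LRUBlockCache
    private final Map<String, byte[]> slabCache = new ConcurrentHashMap<>(); // stands in for SlabCache

    public byte[] getBlock(String blockKey) {
        byte[] block = lruCache.get(blockKey);   // 1. look in LRUBlockCache first
        if (block != null) return block;
        block = slabCache.get(blockKey);         // 2. on a miss, look in SlabCache
        if (block != null) {
            lruCache.put(blockKey, block);       // 3. promote the SlabCache hit into LRUBlockCache
            return block;
        }
        return null;                             // 4. full miss: the caller reads from HDFS,
                                                 //    then stores the Block into both caches
    }
}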

Disadvantages:

The DoubleBlockCache scheme has many drawbacks: the fixed-size slots in the SlabCache design lead to low effective memory utilization, and caching Blocks in LRUBlockCache still produces heavy memory fragmentation through JVM GC. The scheme was therefore deprecated after HBase 0.98.

// now obsolete

BucketCache // designed by Alibaba engineers; this is the mode CDH uses. For BucketCache as a Full GC optimization (the BlockCache-side counterpart to the MemStore work), see: https://blog.51cto.com/12445535/2373223

1. In practice, the SlabCache scheme did not much improve on the GC problems of the original LRUBlockCache, and it introduced new defects such as low off-heap memory utilization. Its design was not wasted, though: its use of off-heap memory inspired the Alibaba engineers who, standing on SlabCache's shoulders, developed the BucketCache scheme and contributed it to the community.

2. In the actual implementation, HBase uses BucketCache together with LRUBlockCache, a combination called CombinedBlockCache.

3. Unlike DoubleBlockCache, the system stores Index Blocks and Bloom Blocks in LRUBlockCache and Data Blocks in BucketCache.

4. A random read therefore first finds the relevant Index Block in LRUBlockCache and then looks up the Data Block in BucketCache. BucketCache's more careful design corrects SlabCache's flaws and greatly reduces the real impact of JVM GC on business requests (see the routing sketch after this list).

5. Some problems remain, such as the extra memory copy incurred when using off-heap memory, which hurts read/write performance to a degree; later versions solved this as well.
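A sketch of the routing idea behind CombinedBlockCache, again with plain maps standing in for the real caches; this illustrates points 3 and 4 above, not HBase's actual code.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Index/Bloom Blocks stay on-heap, Data Blocks go off-heap.
public class CombinedRoutingSketch {
    enum BlockType { DATA, INDEX, BLOOM, META }

    private final Map<String, byte[]> lruCache = new ConcurrentHashMap<>();    // stands in for LRUBlockCache
    private final Map<String, byte[]> bucketCache = new ConcurrentHashMap<>(); // stands in for BucketCache

    void cacheBlock(String key, byte[] block, BlockType type) {
        if (type == BlockType.DATA) {
            bucketCache.put(key, block);  // Data Blocks are cached in BucketCache
        } else {
            lruCache.put(key, block);     // Index, Bloom (and Meta) Blocks stay in LRUBlockCache
        }
    }

    byte[] getBlock(String key, BlockType type) {
        // A random read first consults the on-heap cache for Index/Bloom Blocks,
        // then fetches the Data Block from the off-heap cache.
        return type == BlockType.DATA ? bucketCache.get(key) : lruCache.get(key);
    }
}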

Advantages:

The BucketCache mechanism allocates a fixed-size chunk of memory as the cache at initialization, eviction is no longer managed by the JVM, and caching a data Block only reads and overwrites within that space, greatly reducing memory fragmentation and with it the frequency of Full GC. // BucketCache eviction is no longer managed by the JVM, which lowers Full GC frequency.

BucketCache has three working modes: heap, offheap, and file.

In offheap mode the memory belongs to the operating system, so it causes essentially no CMS GC; under no circumstances will fragmentation of that memory trigger a Full GC.

LRUBlockCache // a deeper look

1. It uses a ConcurrentHashMap to manage the mapping from BlockKey to Block.

2. Caching a Block is simply putting the BlockKey and its Block into the HashMap; a cache query is a HashMap lookup by BlockKey.

3. The scheme also uses a strict LRU eviction algorithm: when the total size of the BlockCache reaches a threshold, eviction starts and the least recently used Blocks are replaced. Three implementation details deserve attention:

Cache tiering strategy

// The whole BlockCache is divided into three tiers: single-access, multi-access, and in-memory. Note that HBase's own system metadata lives in the in-memory tier, so set the column family property IN_MEMORY = true with great care: make sure the column family holds a small amount of frequently accessed data, or the hbase:meta metadata may be squeezed out of memory, severely hurting the performance of every workload.

Implementation of the LRU eviction algorithm

Advantages and disadvantages of the LRU scheme

// The LRU scheme manages the cache with the JVM-provided HashMap, which is simple and effective. (A minimal sketch follows.)
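A minimal sketch of that idea: a ConcurrentHashMap from key to Block plus strict least-recently-used eviction once a threshold is exceeded. This is a toy illustration, not the real LRUBlockCache.

import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class TinyLruCacheSketch {
    private static class Entry {
        final byte[] block;
        volatile long lastAccess;
        Entry(byte[] b, long t) { block = b; lastAccess = t; }
    }

    private final Map<String, Entry> map = new ConcurrentHashMap<>();
    private final AtomicLong clock = new AtomicLong();
    private final int maxBlocks;  // stands in for the real size threshold

    public TinyLruCacheSketch(int maxBlocks) { this.maxBlocks = maxBlocks; }

    public void cacheBlock(String key, byte[] block) {
        map.put(key, new Entry(block, clock.incrementAndGet()));
        if (map.size() > maxBlocks) evictOne();  // threshold reached: start eviction
    }

    public byte[] getBlock(String key) {
        Entry e = map.get(key);
        if (e == null) return null;
        e.lastAccess = clock.incrementAndGet();  // touch on every access
        return e.block;
    }

    private void evictOne() {
        map.entrySet().stream()
           .min(Comparator.comparingLong(en -> en.getValue().lastAccess))
           .ifPresent(en -> map.remove(en.getKey()));  // drop the least recently used Block
    }
}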

But fragmented space keeps accumulating until it triggers the infamous Full GC.

Especially with a large heap, a single Full GC can last a long time, even minutes. A Full GC pauses the entire process (the stop-the-world pause), so a long Full GC severely disrupts the business's normal read and write requests.

BucketCache // a deeper look; for concepts see the blog posts referenced below. CDH defaults to this caching mode.

// It does not rely on the JVM's memory management to run the cache; it manages memory itself, so fragmentation never builds up to the point of causing a Full GC.

Memory organization

Block write and read flow

BucketCache working modes

BucketCache configuration // important; this is the mode CDH uses

// The total size of the BucketCache, in MB. The size to configure depends on the amount of memory available to HBase or the size of the local SSD. If hbase.bucketcache.ioengine is set to "offheap", BucketCache consumes the configured amount of Java direct memory.

hbase.bucketcache.size = 1m
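For illustration, the same two settings can be applied through the Hadoop Configuration API; in practice they belong in hbase-site.xml on every RegionServer, and the 4096 MB figure here is just an assumed example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BucketCacheConfigSketch {
    public static void main(String[] args) {
        // Normally set in hbase-site.xml; shown programmatically only for illustration.
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.bucketcache.ioengine", "offheap"); // e.g. "offheap" or "file:/path/to/cache"
        conf.set("hbase.bucketcache.size", "4096");        // assumed example: a 4096 MB BucketCache
    }
}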

Tip:

Both heap mode and offheap mode use memory as the final storage medium, and both allocate it with Java NIO ByteBuffer. The difference: heap mode allocates with ByteBuffer.allocate(), drawing from the heap the JVM provides, while offheap mode allocates with ByteBuffer.allocateDirect(), drawing directly from the operating system.

The two allocation modes affect HBase's real-world performance differently, and GC is by far the biggest factor. Compared with heap mode, offheap mode causes essentially no CMS GC because the memory belongs to the operating system, meaning fragmentation there can never trigger a Full GC.

Allocation and read performance also differ. When allocating, heap mode must obtain memory from the operating system and then copy it into the JVM heap, which is slower than offheap's direct allocation from the OS. Conversely, when reading from the cache, heap mode reads straight from the JVM heap, while offheap mode must copy from OS memory into the JVM heap before reading, which is slower. (See the sketch below.)
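A small sketch of the difference using plain Java NIO: ByteBuffer.allocate() for heap mode versus ByteBuffer.allocateDirect() for offheap mode, with the extra copy on the off-heap read path made visible.

import java.nio.ByteBuffer;

public class HeapVsOffheapSketch {
    public static void main(String[] args) {
        // Heap mode: backed by a byte[] inside the JVM heap, managed by GC.
        ByteBuffer heapBuf = ByteBuffer.allocate(64 * 1024);

        // Offheap mode: allocated directly from the operating system.
        ByteBuffer directBuf = ByteBuffer.allocateDirect(64 * 1024);

        byte[] block = new byte[64 * 1024];
        directBuf.put(block);          // caching: copy the block into OS memory
        directBuf.flip();
        byte[] out = new byte[64 * 1024];
        directBuf.get(out);            // reading: copy back into the JVM heap first

        System.out.println(heapBuf.hasArray());    // true: on-heap, readable in place
        System.out.println(directBuf.isDirect());  // true: off-heap
    }
}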

File mode differs from the other two: it uses Fusion-IO cards or SSDs as the storage medium, offering far more capacity than expensive memory and therefore a much higher cache hit ratio.

Tip:

LRUBlockCache

SlabCache

DoubleBlockCache: in the actual implementation, HBase used SlabCache together with LRUBlockCache; the combination is called DoubleBlockCache // obsolete

BucketCache // the recommended off-heap memory mechanism; this is what CDH uses

CombinedBlockCache: HBase uses BucketCache together with LRUBlockCache; the combination is called CombinedBlockCache.

Summary:

So after the analysis above, two contenders remain: LRU (LRUBlockCache) and CBC (CombinedBlockCache).

Comparing them in detail:

Conclusion

Comparing the data for all of the important metrics leads to two conclusions:

In the 'cache hit all' scenario, LRU beats CBC across the board, so if the total data volume is small relative to the JVM's memory capacity, choose LRU. In every other scenario, where cache misses occur, LRU's GC performance is only about a third of CBC's, while throughput, read/write latency, IO, CPU, and the other metrics are essentially the same, so CBC is the recommended choice.

Theoretical explanation

1. In the 'cache hit all' scenario, every LRU metric beats CBC's.

2. In the 'many cache misses' scenario, LRU's metrics are essentially the same as CBC's.

3. The first holds because, when every read hits the cache, CBC must copy each Block from off-heap memory into the JVM before returning it to the user, a costlier path than LRU's in-heap read, so its latency is higher.

4. The second holds because, with many cache misses, memory operations make up only a small fraction of the work and the latency bottleneck is IO, so the two schemes' metrics converge.

Reference links:

HBase BlockCache series, part 1 - Stepping into BlockCache: http://hbasefly.com/2016/04/08/hbase-blockcache-1/

HBase BlockCache series, part 2 - Exploring the implementation of BlockCache: http://hbasefly.com/2016/04/26/hbase-blockcache-2/

HBase BlockCache series, part 3 - Performance comparison test report: http://hbasefly.com/2016/05/06/hbase-blockcache-3/
