
How to use ExploringCompactionPolicy of HBase Compaction algorithm

Updated: 2025-01-21. Source: SLTechnology News&Howtos (Servers)

This article explains how ExploringCompactionPolicy, one of the HBase compaction algorithms, selects the files for a compaction.

In HBase 0.98, the default compaction policy is ExploringCompactionPolicy; the previous default was RatioBasedCompactionPolicy.

ExploringCompactionPolicy extends RatioBasedCompactionPolicy and overrides its applyCompactionPolicy method, which implements the strategy for selecting files for a minor compaction.

The content of the applyCompactionPolicy method:

    public List<StoreFile> applyCompactionPolicy(final List<StoreFile> candidates,
        boolean mightBeStuck, boolean mayUseOffPeak, int minFiles, int maxFiles) {

      // This ratio is used by the algorithm below. A separate ratio can be set
      // for off-peak periods (default: 5.0) so that more data is merged.
      final double currentRatio = mayUseOffPeak
          ? comConf.getCompactionRatioOffPeak() : comConf.getCompactionRatio();

      // Start off choosing nothing.
      List<StoreFile> bestSelection = new ArrayList<StoreFile>(0);
      List<StoreFile> smallest = mightBeStuck ? new ArrayList<StoreFile>(0) : null;
      long bestSize = 0;
      long smallestSize = Long.MAX_VALUE;

      int opts = 0, optsInRatio = 0, bestStart = -1; // for debug logging
      // Consider every starting place.
      for (int start = 0; start < candidates.size(); start++) {
        // Consider every different sub list permutation in between start and end with min files.
        for (int currentEnd = start + minFiles - 1;
            currentEnd < candidates.size(); currentEnd++) {
          List<StoreFile> potentialMatchFiles = candidates.subList(start, currentEnd + 1);

          // Sanity checks
          if (potentialMatchFiles.size() < minFiles) {
            continue;
          }
          if (potentialMatchFiles.size() > maxFiles) {
            continue;
          }

          // Compute the total size of files that will
          // have to be read if this set of files is compacted.
          long size = getTotalStoreSize(potentialMatchFiles);

          // Store the smallest set of files. This stored set of files will be used
          // if it looks like the algorithm is stuck.
          if (mightBeStuck && size < smallestSize) {
            smallest = potentialMatchFiles;
            smallestSize = size;
          }

          if (size > comConf.getMaxCompactSize()) {
            continue;
          }

          ++opts;
          if (size >= comConf.getMinCompactSize()
              && !filesInRatio(potentialMatchFiles, currentRatio)) {
            continue;
          }

          ++optsInRatio;
          if (isBetterSelection(bestSelection, bestSize, potentialMatchFiles, size, mightBeStuck)) {
            bestSelection = potentialMatchFiles;
            bestSize = size;
            bestStart = start;
          }
        }
      }
      if (bestSelection.size() == 0 && mightBeStuck) {
        LOG.debug("Exploring compaction algorithm has selected " + smallest.size()
            + " files of size " + smallestSize + " because the store might be stuck");
        return new ArrayList<StoreFile>(smallest);
      }
      LOG.debug("Exploring compaction algorithm has selected " + bestSelection.size()
          + " files of size " + bestSize + " starting at candidate #" + bestStart
          + " after considering " + opts + " permutations with " + optsInRatio + " in ratio");
      return new ArrayList<StoreFile>(bestSelection);
    }

From the code, the main algorithm is as follows:

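1. Traverse the candidate files from first to last and consider every contiguous combination that meets the criteria below.

2. The number of files in a combination must be >= minFiles (default: 3).

3. The number of files in a combination must be <= maxFiles (default: 10).

4. Compute the total size of the combination; it must be <= maxCompactSize (comConf.getMaxCompactSize()).

5. A combination whose total size is < minCompactSize qualifies directly; if the size is >= minCompactSize, it must also pass the filesInRatio check.

6. filesInRatio checks the combination against the ratio: no single file may be larger than the combined size of the other files multiplied by the ratio.

7. For each qualifying combination, isBetterSelection decides whether it replaces the current best: more files wins, and for the same number of files the smaller total size wins.

The isBetterSelection method:

    private boolean isBetterSelection(List<StoreFile> bestSelection, long bestSize,
        List<StoreFile> selection, long size, boolean mightBeStuck) {
      if (mightBeStuck && bestSize > 0 && size > 0) {
        // Keep the selection that removes most files for least size. That penalizes adding
        // large files to compaction, but not small files, so we don't become totally inefficient
        // (might want to tweak that in future). Also, given the current order of looking at
        // permutations, prefer earlier files and smaller selection if the difference is small.
        final double REPLACE_IF_BETTER_BY = 1.05;
        double thresholdQuality = ((double) bestSelection.size() / bestSize) * REPLACE_IF_BETTER_BY;
        return thresholdQuality < ((double) selection.size() / size);
      }
      // Keep if this gets rid of more files. Or the same number of files for less io.
      return selection.size() > bestSelection.size()
          || (selection.size() == bestSelection.size() && size < bestSize);
    }

That concludes the main algorithm. The remaining details and optimizations are as follows.

The ratio in step 6 defaults to 1.2, but a different value can apply when the off-peak optimization is enabled; the off-peak ratio defaults to 5.0. The purpose is to merge more data while business load is low. At present the off-peak period can only be specified as a range of hours within a day, which is not very flexible.

The mightBeStuck parameter indicates whether compaction might get stuck. It is true when: number of candidate files - number of files currently being compacted + futureFiles (0 by default, 1 when some files are currently being compacted) >= hbase.hstore.blockingStoreFiles (default: 10; this setting is also used by flush and will be covered in a later analysis of flush). If it is true: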

The selection algorithm also tracks a fallback "smallest" solution: before step 4 above, it records the combination with the smallest total size.

In isBetterSelection, the comparison changes to (bestSelection.size() / bestSize) * 1.05 < selection.size() / size, so a combination is chosen by its number of files per unit of size rather than by file count alone.

On return, if no suitable best selection was found, the smallest combination is returned instead.
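To make the two comparison modes concrete, here is a minimal sketch of the isBetterSelection logic that works on plain counts and sizes instead of StoreFile lists. The class name SelectionQualitySketch and the method isBetter are hypothetical names for illustration, not part of HBase.

```java
// Hypothetical stand-alone sketch; HBase's real method takes List<StoreFile> arguments.
final class SelectionQualitySketch {
    static final double REPLACE_IF_BETTER_BY = 1.05;

    // Returns true when the candidate (count files, size bytes) should replace the current best.
    static boolean isBetter(int bestCount, long bestSize, int count, long size,
                            boolean mightBeStuck) {
        if (mightBeStuck && bestSize > 0 && size > 0) {
            // Stuck mode: compare files removed per byte read, with a 5% margin.
            double threshold = ((double) bestCount / bestSize) * REPLACE_IF_BETTER_BY;
            return threshold < ((double) count / size);
        }
        // Normal mode: more files wins; same number of files with less IO wins.
        return count > bestCount || (count == bestCount && size < bestSize);
    }

    public static void main(String[] args) {
        // Current best: 3 files, 300 bytes. Candidate: 4 files, 1000 bytes.
        System.out.println(isBetter(3, 300, 4, 1000, false)); // true: more files
        System.out.println(isBetter(3, 300, 4, 1000, true));  // false: fewer files per byte
    }
}
```

In normal mode the larger candidate wins because it removes more files; in stuck mode it loses because 4/1000 files per byte is below the threshold of (3/300) * 1.05.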

In effect, the mightBeStuck optimization ensures that when a store has accumulated a large number of files, a smallest combination can still be chosen for compaction, instead of letting the file count keep growing until a suitable combination appears.
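The condition that sets mightBeStuck can be sketched as below. This only restates the formula described in this article; the class StuckCheckSketch and its parameter names are hypothetical, and blockingStoreFiles stands in for the value of hbase.hstore.blockingStoreFiles.

```java
// Hypothetical sketch of the mightBeStuck condition described in the article.
final class StuckCheckSketch {
    static boolean mightBeStuck(int candidateFiles, int filesCompacting,
                                int blockingStoreFiles) {
        // futureFiles is 0 by default and 1 when some files are being compacted.
        int futureFiles = filesCompacting == 0 ? 0 : 1;
        return candidateFiles - filesCompacting + futureFiles >= blockingStoreFiles;
    }

    public static void main(String[] args) {
        System.out.println(mightBeStuck(12, 3, 10)); // 12 - 3 + 1 = 10, reaches the limit
        System.out.println(mightBeStuck(9, 0, 10));  // 9 is below the limit
    }
}
```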

Put simply, the difference from RatioBasedCompactionPolicy is this: RatioBasedCompactionPolicy traverses the StoreFile list from beginning to end and compacts the first sequence it finds that satisfies the ratio condition. ExploringCompactionPolicy also traverses from beginning to end, but it records the best combination seen so far and finally selects the best one overall.
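The exploring search can be tried out on plain numbers with the simplified sketch below. It is an illustration under stated assumptions, not the HBase implementation: files are represented only by their sizes (List<Long>), the mightBeStuck handling is left out, and the names ExploringSelectionSketch, select, and sum are hypothetical. The filesInRatio helper follows the rule in step 6.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simplified model: each file is just its size in bytes.
final class ExploringSelectionSketch {

    static long sum(List<Long> files) {
        long total = 0;
        for (long f : files) total += f;
        return total;
    }

    // True when no file is larger than ratio * (sum of the other files).
    static boolean filesInRatio(List<Long> files, double ratio) {
        if (files.size() < 2) return true;
        long total = sum(files);
        for (long f : files) {
            if (f > (total - f) * ratio) return false;
        }
        return true;
    }

    static List<Long> select(List<Long> candidates, int minFiles, int maxFiles,
                             long minCompactSize, long maxCompactSize, double ratio) {
        List<Long> best = new ArrayList<>();
        long bestSize = 0;
        // Try every contiguous window of at least minFiles files.
        for (int start = 0; start < candidates.size(); start++) {
            for (int end = start + minFiles - 1; end < candidates.size(); end++) {
                List<Long> window = candidates.subList(start, end + 1);
                if (window.size() > maxFiles) continue;
                long size = sum(window);
                if (size > maxCompactSize) continue;
                if (size >= minCompactSize && !filesInRatio(window, ratio)) continue;
                // Keep the window with more files, or the same count for less IO.
                if (window.size() > best.size()
                        || (window.size() == best.size() && size < bestSize)) {
                    best = window;
                    bestSize = size;
                }
            }
        }
        return new ArrayList<>(best);
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(1000L, 400L, 30L, 20L, 10L);
        System.out.println(select(sizes, 3, 10, 0L, Long.MAX_VALUE, 1.2)); // [30, 20, 10]
    }
}
```

With a ratio of 1.2, every window containing the 1000-byte or 400-byte file fails the ratio check because that file dwarfs the rest, so only the three small files are selected.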
