2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
In HBase, business reads are very frequent. For most operations, the client uses the meta table to locate the specific RegionServer and then queries the data in the relevant region.
The problem is that a region (more precisely, each column-family store in it) consists of one memstore and multiple store files. The memstore acts like a write cache in server memory, which makes inserts faster. When the memstore reaches a certain size (set by hbase.hregion.memstore.flush.size) or the user flushes manually, its contents are persisted to a disk system such as HDFS. In other words, one region can correspond to many files holding valid data. The data inside each file is sorted by rowkey, but there is no rowkey ordering between files (unless a major_compact has merged them into one file).
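The flush behaviour described above can be sketched roughly as follows. This is a toy model, not HBase code: the `Region` class is hypothetical, and the threshold counts entries, whereas the real hbase.hregion.memstore.flush.size is a size in bytes.

```python
class Region:
    """Toy model of one store: a memstore plus the files it has flushed."""

    def __init__(self, flush_size):
        # Assumption: counts entries; the real setting is a byte size.
        self.flush_size = flush_size
        self.memstore = {}
        self.store_files = []  # each file: list of (rowkey, value), sorted by rowkey

    def put(self, rowkey, value):
        self.memstore[rowkey] = value
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def flush(self):
        # Persist the memstore as one file sorted by rowkey, then clear it.
        if self.memstore:
            self.store_files.append(sorted(self.memstore.items()))
            self.memstore = {}


region = Region(flush_size=3)
for rowkey in ["row5", "row1", "row9", "row3", "row0", "row7"]:
    region.put(rowkey, "v")

# (startkey, endkey) of every flushed file
ranges = [(f[0][0], f[-1][0]) for f in region.store_files]
print(ranges)  # [('row1', 'row9'), ('row0', 'row7')]
```

Each file is sorted internally, yet the two key ranges overlap, which is the situation the rest of the article deals with.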
Suppose the user now asks for a single column (cf1:col1) of one rowkey (row1), even with a targeted command such as get 'tab','row1','cf1:col1'.
It is quite possible that row1 falls between the startkey and the endkey of every file, so the RegionServer has to scan the relevant data blocks of each file and perform multiple physical I/Os. However, there is no guarantee that every file actually contains the row key row1, so many of those physical I/Os are wasted, which hurts performance considerably. This is why a Bloom filter is used: it can determine, to a certain extent, whether a specified row key exists in a file.
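A small illustration of the wasted reads, under assumed data: the file names and row keys below are made up, and a plain range check stands in for what the RegionServer can know from startkey and endkey alone.

```python
# Hypothetical store files of one region. Each is sorted by rowkey internally,
# but the key ranges of different files overlap.
store_files = {
    "file_a": ["row0", "row1", "row5"],
    "file_b": ["row0", "row3", "row9"],
    "file_c": ["row2", "row4", "row7"],
}


def files_to_probe(row):
    """Files whose [startkey, endkey] range covers the row (range check only)."""
    return [name for name, rows in store_files.items() if rows[0] <= row <= rows[-1]]


def files_with_row(row):
    """Files that really contain the row."""
    return [name for name, rows in store_files.items() if row in rows]


candidates = files_to_probe("row1")
hits = files_with_row("row1")
wasted = len(candidates) - len(hits)
print(candidates, hits, wasted)  # ['file_a', 'file_b'] ['file_a'] 1
```

Here file_b must be read because its range covers row1, even though it holds no such row. A per-file membership check is meant to avoid exactly that read.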
Bloom filters come in two types, row and rowcol. The principle is similar, so take the rowcol type as an example:
When the memstore is written to HDFS to form a file, part of that file is a section called meta, which is built with the following algorithm during the write:
1. First, allocate a long bit array, call it bit arr[n] = {0}, with every bit initialized to 0.
2. Using k hash functions (k
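The list above stops partway through step 2. As a rough illustration only, the sketch below follows the textbook Bloom filter construction (set k bit positions on insert, check the same k positions on lookup); it is not HBase's implementation. The `BloomFilter` class, the `bloom_key` helper, the key layout, and the use of salted SHA-256 as the k hash functions are all assumptions.

```python
import hashlib


def bloom_key(row, qualifier, bloom_type):
    """Hypothetical key builder: 'row' hashes the rowkey only,
    'rowcol' hashes the rowkey together with the column qualifier."""
    return row if bloom_type == "row" else f"{row}/{qualifier}"


class BloomFilter:
    def __init__(self, n_bits, k):
        self.n_bits = n_bits
        self.k = k
        self.bits = [0] * n_bits  # the bit array, all zeros at the start

    def _positions(self, key):
        # Assumption: k hash functions simulated by salting SHA-256 with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter(n_bits=1024, k=3)
key = bloom_key("row1", "col1", "rowcol")
bf.add(key)
print(bf.might_contain(key))  # True: an added key is never reported as absent
```

A lookup that returns False lets the reader skip the file entirely, while a True answer may still be a false positive, so the file has to be read to confirm.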
© 2024 shulou.com SLNews company. All rights reserved.