Shulou(Shulou.com)05/31 Report--
This article explains how the WAL of HBase is called in the RegionServer.
HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase, large-scale structured storage clusters can be built on inexpensive commodity servers.
Basic principle
HBase is a storage system based on the LSM tree. It combines a log file with an in-memory store to turn random writes into sequential writes, which keeps the data insertion rate stable. The log file here is the WAL file, which is used to recover data that had not yet been persisted when a server crashed.
The WAL (Write-Ahead Log) is the log in which HBase's RegionServer records operations while data is being inserted and deleted. The general process is as follows. First, the client initiates an operation that modifies data. Each modification is wrapped in a KeyValue object instance and sent through an RPC call to the HRegionServer that hosts the matching region. Once the KeyValue instances arrive, they are routed to the HRegion instance that manages the corresponding row. The data is written to the WAL and then put into the MemStore of the store that actually owns the record. At the same time, the MemStore is checked to see whether it is full; if it is, its contents are flushed to disk.
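The ordering described above can be modeled in a few lines. The following is a minimal, self-contained sketch and not HBase code: the class MiniRegion, its fields, and the flush threshold are hypothetical names chosen only for illustration. It shows the sequence (append to the log first, then update the in-memory store, then flush when the store is full) and how replaying the log rebuilds unflushed data after a crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model of the write path; none of these names are HBase APIs.
final class MiniRegion {
    // Stand-in for the WAL: an append-only list of {row, value} records.
    final List<String[]> wal = new ArrayList<>();
    // Stand-in for the MemStore: a sorted in-memory map.
    final TreeMap<String, String> memStore = new TreeMap<>();
    // Stand-in for store files on disk: each flush produces one sorted map.
    final List<Map<String, String>> storeFiles = new ArrayList<>();
    private final int flushThreshold;

    MiniRegion(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String row, String value) {
        wal.add(new String[] {row, value}); // 1. sequential append to the log
        memStore.put(row, value);           // 2. update the in-memory store
        if (memStore.size() >= flushThreshold) {
            flush();                        // 3. flush when the store is full
        }
    }

    void flush() {
        storeFiles.add(new TreeMap<>(memStore));
        memStore.clear();
        wal.clear(); // flushed edits no longer need to be replayed
    }

    // Simulates a crash: the in-memory store is lost, the log and files survive.
    MiniRegion recoverAfterCrash() {
        MiniRegion recovered = new MiniRegion(flushThreshold);
        recovered.storeFiles.addAll(storeFiles);
        for (String[] edit : wal) {
            recovered.wal.add(edit);
            recovered.memStore.put(edit[0], edit[1]); // replay the log
        }
        return recovered;
    }
}
```

With a threshold of 3, the third put triggers a flush and empties both the in-memory store and the log; an edit written after that flush exists only in the log and the in-memory store, and recoverAfterCrash() restores it by replay.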
Source code analysis of the WAL call chain
This section analyzes an HBase "write" from the source code perspective, following the outline above.
The basic calling procedure is as follows:
Put, delete, and other "write" operations use the MultiRowMutationService service. Within the service, the mutateRows() method is called to handle the list of mutations. The actual call to mutateRows() lands in MultiRowMutationEndpoint, an implementation class of MultiRowMutationService. The MultiRowMutationEndpoint class implements the row transactions of HBase. Its main role is described in the MultiRowMutationEndpoint class documentation.
The implementation class of processor is MultiRowMutationProcessor.
Although there are many steps in the processRowsWithLocks method, the most critical steps are as follows:
The list of mutations is put into the MemStore here, but the data is not really in the MemStore yet. The real execution waits for the sync() method to actually flush the log entries (WALEdit) to disk; only then is the data written to the MemStore, through the asynchronous notification of the MVCC version number.
This step calls the syncOrDefer method. Except for the meta region, whose WAL is always synced, syncOrDefer decides whether to call the sync method of the WAL (FSHLog) based on the persistence level set by the client.
In HBase, the WAL persistence level determines whether the WAL mechanism is enabled and how the HLog is flushed to disk.
The client sets the WAL persistence level in code, for example: put.setDurability(Durability.SYNC_WAL)
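The "apply first, make visible only after the sync" ordering can be illustrated with a small model. This is a simplified sketch under stated assumptions, not the HBase implementation: MiniMvccStore, its write numbers, and its read point are hypothetical names, and moving entries between two lists stands in for the disk sync.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of "apply first, make visible after sync"; not HBase code.
final class MiniMvccStore {
    // One cell version, tagged with the write number that produced it.
    private static final class Cell {
        final String row;
        final String value;
        final long writeNumber;

        Cell(String row, String value, long writeNumber) {
            this.row = row;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final List<Cell> memStore = new ArrayList<>();
    private final List<String> unsyncedLog = new ArrayList<>();
    private final List<String> syncedLog = new ArrayList<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0; // readers see only cells with writeNumber <= readPoint

    // Buffer the log edit and apply the cell; the cell is not yet visible to readers.
    long apply(String row, String value) {
        long writeNumber = nextWriteNumber++;
        unsyncedLog.add(row + "=" + value);
        memStore.add(new Cell(row, value, writeNumber));
        return writeNumber;
    }

    // Persist the buffered edits (the list move stands in for a disk sync).
    void sync() {
        syncedLog.addAll(unsyncedLog);
        unsyncedLog.clear();
    }

    // Advance the read point so the write becomes visible; requires a prior sync.
    void complete(long writeNumber) {
        if (!unsyncedLog.isEmpty()) {
            throw new IllegalStateException("log must be synced before completing a write");
        }
        readPoint = Math.max(readPoint, writeNumber);
    }

    // Returns the newest visible value for the row, or null if none is visible.
    String get(String row) {
        String latest = null;
        for (Cell c : memStore) { // cells are stored in write-number order
            if (c.row.equals(row) && c.writeNumber <= readPoint) {
                latest = c.value;
            }
        }
        return latest;
    }
}
```

In this model a reader calling get() between apply() and complete() sees nothing, which is the point of the ordering: a write that has not reached the log on disk is never exposed to readers.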
1.1.3 The WAL persistence level has the following four levels, plus a USE_DEFAULT setting that defers to the default:
USE_DEFAULT: the default. If the user does not specify a persistence level, HBase persists data at the SYNC_WAL level.
SKIP_WAL: writes only the cache, not the HLog. This improves write performance because only memory (the MemStore) is written, but there is a risk of data loss.
ASYNC_WAL: writes data to the HLog asynchronously.
SYNC_WAL: writes data to the log file synchronously. The data may have reached only the file system and not necessarily the physical disk.
FSYNC_WAL: writes data to the log file synchronously and forces it to be flushed to disk. This is the strictest log write level and ensures that data is not lost, but its performance is relatively poor.
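The way these levels feed into the sync decision can be sketched as a small function. This is an illustrative model, not the body of syncOrDefer: the class SyncDecision and the method shouldSync are hypothetical, and the enum is declared locally so the example is self-contained (its constants use the spelling of HBase's Durability enum, where the default is USE_DEFAULT). The rule that the meta region always syncs, and the fallback to SYNC_WAL when no default is configured, are treated here as assumptions.

```java
// Hypothetical model of the sync decision; not the HBase syncOrDefer source.
final class SyncDecision {
    // Declared locally to keep the sketch self-contained.
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    // Returns true when the writer should block on a WAL sync before returning.
    static boolean shouldSync(Durability requested, Durability tableDefault, boolean isMetaRegion) {
        if (isMetaRegion) {
            return true; // assumption: the meta region ignores the requested level
        }
        // USE_DEFAULT defers to the table-level setting.
        Durability effective = (requested == Durability.USE_DEFAULT) ? tableDefault : requested;
        switch (effective) {
            case SYNC_WAL:
            case FSYNC_WAL:
                return true;  // both levels take the same blocking sync path
            case USE_DEFAULT:
                return true;  // no explicit default configured: behave like SYNC_WAL
            default:
                return false; // SKIP_WAL and ASYNC_WAL return without waiting
        }
    }
}
```

For example, shouldSync(Durability.ASYNC_WAL, Durability.SYNC_WAL, false) yields false, while the same request against the meta region yields true.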
In the current code, SYNC_WAL and FSYNC_WAL are handled the same way: both call the sync() method of FSHLog. sync() is a blocking method. The caller wakes up only after the data has actually been flushed to disk, and the worker thread then returns to write to the MemStore, completing one "write" operation.
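The blocking behavior of sync() can be illustrated with plain Java monitors. The following is a simplified sketch, not the FSHLog implementation: the class BlockingLog, its transaction ids, and the markSynced() method (standing in for a background thread that has finished flushing) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a blocking log sync; not the FSHLog implementation.
final class BlockingLog {
    private final List<String> edits = new ArrayList<>();
    private long lastSyncedTxid = 0;

    // Worker thread: record the edit and return its transaction id (1-based position).
    synchronized long append(String edit) {
        edits.add(edit);
        return edits.size();
    }

    // Worker thread: block until the given transaction has been marked durable.
    synchronized void sync(long txid) {
        try {
            while (lastSyncedTxid < txid) {
                wait(); // releases the monitor until markSynced() notifies
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sync", e);
        }
    }

    // Syncer thread: called after everything appended so far has been flushed.
    synchronized void markSynced() {
        lastSyncedTxid = edits.size();
        notifyAll(); // wake every worker waiting in sync()
    }

    // Non-blocking check, useful for inspection.
    synchronized boolean isSynced(long txid) {
        return lastSyncedTxid >= txid;
    }
}
```

A worker that calls sync(txid) after append() stays parked until another thread calls markSynced(); only then does it continue with the rest of the write.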