2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces the HBase WAL threading model: how append and sync requests from many worker threads are passed through a ring buffer to a single writer thread, and how that design shows up in the source code.
HBase's WAL (write-ahead log) is a key part of what lets HBase use the LSM-tree storage model, which converts random writes into sequential writes and serves reads from memory, improving the efficiency of large-scale reads and writes. The WAL's multi-producer, single-consumer threading model makes WAL writes both safe and efficient.
In the author's view, two mechanisms are most critical to writing the WAL efficiently, safely, and in order: the thread model, and the multi-producer, single-consumer pattern used inside it.
Thread model
The thread model is implemented mainly in FSHLog, the implementation class of the WAL interface, which provides the two most critical methods, append() and sync(). The model is shown in the figure:
The diagram describes how threads flow through HBase after HRegion calls append and sync. On the far left are the append and sync operations that multiple clients submit to HRegion. When append is called, the WALEdit and WALKey are wrapped in an FSWALEntry, which is in turn wrapped in a RingBufferTruck and placed into a thread-safe buffer (the LMAX Disruptor RingBuffer). When sync is called, a SyncFuture is created, wrapped in a RingBufferTruck, and placed into the same buffer; the worker thread then blocks until it is woken by notify().
On the far right, exactly one thread is dedicated to processing RingBufferTruck objects. If the truck carries an FSWALEntry, the entry is written to the Hadoop sequence file. Because of the file cache, the client's data has quite possibly not reached the disk at this point. So, if the truck carries a SyncFuture, it is handed in batches to a thread pool that flushes to disk asynchronously. Once the flush succeeds, the worker thread is woken up and the WAL write is complete.
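The flow above can be modeled in a few lines of plain Java. This is only an illustrative sketch, not HBase code: the class name WalModelSketch, the Slot holder, and the demo method are hypothetical, a LinkedBlockingQueue stands in for the Disruptor ring buffer, and the single consumer completes sync requests itself instead of handing them to a separate flushing thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified model of the append/sync flow. Not HBase code.
class WalModelSketch {
    // One queue slot carries either an edit to append or a sync request, never both.
    static final class Slot {
        final String edit;                     // non-null for an append
        final CompletableFuture<Integer> sync; // non-null for a sync
        Slot(String edit, CompletableFuture<Integer> sync) {
            this.edit = edit;
            this.sync = sync;
        }
    }

    // Stand-in for the ring buffer: thread-safe and FIFO.
    private final BlockingQueue<Slot> buffer = new LinkedBlockingQueue<>();
    // Stand-in for the log file: touched only by the single consumer thread.
    private final List<String> log = new ArrayList<>();

    WalModelSketch() {
        Thread consumer = new Thread(this::consume, "wal-consumer");
        consumer.setDaemon(true);
        consumer.start();
    }

    // Producer side: enqueue the edit and return immediately.
    void append(String edit) {
        buffer.add(new Slot(edit, null));
    }

    // Producer side: enqueue a sync marker and block until the consumer reaches it.
    // Returns how many edits had been written when the sync was handled.
    int sync() {
        CompletableFuture<Integer> future = new CompletableFuture<>();
        buffer.add(new Slot(null, future));
        return future.join();
    }

    // Consumer side: the only thread that ever writes to the log.
    private void consume() {
        try {
            while (true) {
                Slot slot = buffer.take();
                if (slot.edit != null) {
                    log.add(slot.edit);
                } else {
                    slot.sync.complete(log.size()); // wakes the blocked producer
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts several producers that append and then sync; returns the final edit count.
    static int demo(int threads, int perThread) {
        WalModelSketch wal = new WalModelSketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    wal.append("t" + id + "-" + i);
                }
                wal.sync();
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return wal.sync();
    }
}
```

Calling WalModelSketch.demo(4, 100) runs four producers against the one consumer. Because only the consumer touches the log, the log itself needs no locking.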
Source code analysis
The following walks through the implementation and its details from the point of view of the source code.
On the worker thread, when HRegion prepares the write operation of a row transaction, it passes the WALEdit and WALKey to the append method of FSHLog.
If the durability level set by the client is USE_DEFAULT, SYNC_WAL or FSYNC_WAL, HRegion on the worker thread also calls the sync() method of FSHLog.
Tracing the code shows that the sync() method puts a SyncFuture object into the ring buffer and then blocks until it is woken on completion.
As shown in the model diagram, the worker threads act as producers: each one claims a sequence number generated by the ring buffer, wraps its payload, and publishes it into the ring buffer. FSHLog contains a private inner class, RingBufferEventHandler, that implements the EventHandler interface of the LMAX Disruptor. It is the consumer of the ring buffer and implements the onEvent method. The Disruptor runs the consumer's event handling on threads supplied by a java.util.concurrent.ExecutorService. HBase's WAL starts only one such thread, and the source code comments also state that RingBufferEventHandler runs on a single thread. Because the consumer processes data in sequence order, only one thread ever writes to the log file, and it does so in a single, globally unique order even when WAL writes are submitted concurrently.
The source of the RingBufferTruck class shows its structure, and its comments explain that a truck carries either a SyncFuture or an FSWALEntry into the ring buffer, one at a time.
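A minimal sketch of that ordering guarantee follows. It does not use the Disruptor API: the class OrderedConsumerSketch and its methods are hypothetical, and a fixed array with no wrap-around stands in for the ring buffer. Producers claim a sequence and publish into the matching slot, and the single consumer handles slot 0, then slot 1, and so on, waiting whenever the next slot has not been published yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch, not the Disruptor API: no wrap-around, capacity fixed up front.
class OrderedConsumerSketch {
    private final AtomicReferenceArray<String> slots;
    private final AtomicLong nextSequence = new AtomicLong(0);

    OrderedConsumerSketch(int capacity) {
        slots = new AtomicReferenceArray<>(capacity);
    }

    // Producer: claim the next sequence, then publish the event into that slot.
    void publish(String producerName) {
        int seq = (int) nextSequence.getAndIncrement();
        slots.set(seq, producerName + "@" + seq);
    }

    // Single consumer: handle events strictly in sequence order,
    // waiting for each slot until its producer has published it.
    List<String> drain(int count) {
        List<String> handled = new ArrayList<>();
        for (int seq = 0; seq < count; seq++) {
            String event;
            while ((event = slots.get(seq)) == null) {
                Thread.yield();
            }
            handled.add(event);
        }
        return handled;
    }

    // Runs several producers against one consumer and checks the handling order.
    static boolean demo(int producers, int perProducer) {
        int total = producers * perProducer;
        OrderedConsumerSketch buffer = new OrderedConsumerSketch(total);
        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            String name = "worker-" + p;
            Thread t = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    buffer.publish(name);
                }
            });
            threads.add(t);
            t.start();
        }
        // The calling thread is the one and only consumer.
        List<String> handled = buffer.drain(total);
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        for (int seq = 0; seq < total; seq++) {
            if (!handled.get(seq).endsWith("@" + seq)) {
                return false;
            }
        }
        return true;
    }
}
```

However the producers interleave, the consumer sees the events in claimed-sequence order, which is the property the WAL relies on for a single, ordered writer.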
The source shows that append ultimately writes the FSWALEntry instance (entry) to the Hadoop sequence file, in sequence order. The order holds because each worker thread, before writing, obtains an incrementing sequence from the ring buffer through a thread-safe CAS; the ring buffer's consumer then takes the FSWALEntry objects in sequence order and writes them out. Thread safety is therefore needed only at the point where the sequence is claimed, and because Java's CAS retries in a loop without taking a lock, this is very efficient.
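The lock-free claim can be illustrated with a compare-and-set retry loop on an AtomicLong. This is a sketch of the general technique rather than the Disruptor's own code, and the names SequenceClaimSketch, next and demo are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of claiming increasing sequence numbers without a lock.
class SequenceClaimSketch {
    private final AtomicLong cursor = new AtomicLong(-1);

    // Claim the next sequence: read, compute, and retry if another thread got there first.
    long next() {
        while (true) {
            long current = cursor.get();
            long claimed = current + 1;
            if (cursor.compareAndSet(current, claimed)) {
                return claimed;
            }
            // CAS failed: another thread advanced the cursor, so loop and try again.
        }
    }

    // Returns true if every sequence in [0, threads * perThread) was claimed exactly once.
    static boolean demo(int threads, int perThread) {
        int total = threads * perThread;
        SequenceClaimSketch sequencer = new SequenceClaimSketch();
        AtomicIntegerArray claims = new AtomicIntegerArray(total);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    claims.incrementAndGet((int) sequencer.next());
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        for (int seq = 0; seq < total; seq++) {
            if (claims.get(seq) != 1) {
                return false;
            }
        }
        return true;
    }
}
```

No thread ever blocks here: a thread that loses the race simply re-reads the cursor and tries again, so no sequence is handed out twice and none is skipped.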
SyncRunner is a thread, and the WAL actually keeps a group of SyncRunner threads dedicated to flushing to disk the data that append has written to the file cache.
SyncRunner's thread method (run()) is responsible for flushing the file cache to disk. It first looks through the previously submitted SyncFuture objects for the instance with the largest sequence and takes its ring buffer sequence. It then compares that value with the largest sequence already synced; if that sequence has already been covered by an earlier flush, it calls the releaseSyncFuture() method to release the SyncFuture. In effect this is a notify that wakes the blocked sync operation so the worker thread can continue. As explained earlier, sequences are assigned in submission order, and append writes to the file cache in the same global order, so taking the largest sequence is enough: once the largest sequence has been flushed, every smaller sequence has been flushed as well. Finally, run() calls the current Hadoop sequence file writer to flush to disk and notifies the corresponding SyncFuture objects. At that point the whole WAL write is complete.
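The batching idea can be sketched as follows. This is not the SyncRunner source: SyncBatchSketch, PendingSync and the flush counter are hypothetical stand-ins, and the real flush call on the log writer is replaced by incrementing a counter.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of batched syncing: one flush covers every pending sequence up to the max.
class SyncBatchSketch {
    // Stand-in for a sync request: the sequence it waits for, plus a future to wake the caller.
    static final class PendingSync {
        final long sequence;
        final CompletableFuture<Long> done = new CompletableFuture<>();
        PendingSync(long sequence) {
            this.sequence = sequence;
        }
    }

    private long highestSynced = -1;
    private int flushCalls = 0;

    // Handles one batch. Returns true if a flush was needed, false if an
    // earlier flush already covered every sequence in the batch.
    boolean syncBatch(List<PendingSync> batch) {
        long max = -1;
        for (PendingSync pending : batch) {
            max = Math.max(max, pending.sequence);
        }
        boolean flushed = false;
        if (max > highestSynced) {
            flushCalls++;          // stand-in for the real flush on the log writer
            highestSynced = max;
            flushed = true;
        }
        // Every sequence in the batch is now <= highestSynced, so release them all.
        for (PendingSync pending : batch) {
            pending.done.complete(highestSynced);
        }
        return flushed;
    }

    int flushCalls() {
        return flushCalls;
    }

    long highestSynced() {
        return highestSynced;
    }
}
```

A batch holding sequences 3, 7 and 5 costs a single flush, and a later request for sequence 6 is released without touching the disk again, because sequence 7 has already been synced.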