What's the use of FreelistManager?

2025-02-21 Update From: SLTechnology News&Howtos shulou


This article explains what FreelistManager is used for in BlueStore.

Preface

BlueStore manages raw devices directly, so it has to handle the allocation and release of space itself. The Stupid and Bitmap allocators keep their allocation results in memory; persisting those results is the job of FreelistManager.

A block is either in use or free. Only one of the two states needs to be persisted, because the other can be deduced from it. BlueStore records the free blocks, for two main reasons: first, it makes merging free space convenient when space is reclaimed; second, the allocated space is already recorded in the Object.

FreelistManager started with two implementations, extent and bitmap. The bitmap implementation is now the default and the extent implementation has been abandoned. Free space is persisted to disk through RocksDB's Batch. FreelistManager groups blocks into segments of a fixed number of blocks, and each segment corresponds to one key-value pair. The key is the offset of the segment's first block in the disk's physical address space; the value is the state of every block in the segment, that is, a bitmap of 0s and 1s. A bit of 0 means the block is free and 1 means it is in use, which matches the all-free example later in this article, where every value is 0x00. With this layout, allocating and reclaiming space are unified into a single operation: XOR with 1.
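As a minimal sketch of this layout (not Ceph code), the following shows how a block's byte offset maps to a segment key and to a bit inside that segment, and how XOR with 1 flips the bit in either direction. The constants (4 KiB blocks, 128 blocks per key) and the helper names `segment_key` and `toggle_block` are assumptions chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Illustrative constants, not read from a real BlueStore instance.
constexpr uint64_t kBytesPerBlock = 4096;
constexpr uint64_t kBlocksPerKey = 128;
constexpr uint64_t kBytesPerKey = kBytesPerBlock * kBlocksPerKey;

// One segment value: 128 blocks, one bit per block, 16 bytes in total.
using SegmentBitmap = std::array<uint8_t, kBlocksPerKey / 8>;

// Key of the segment containing `offset`: the offset of the segment's first block.
uint64_t segment_key(uint64_t offset) {
  return offset & ~(kBytesPerKey - 1);
}

// Flip the bit of the block at `offset` inside its segment bitmap.
// The same call turns a 0 bit into 1 and a 1 bit back into 0.
void toggle_block(SegmentBitmap& bits, uint64_t offset) {
  assert(offset % kBytesPerBlock == 0);  // offsets are block aligned
  uint64_t index = (offset & (kBytesPerKey - 1)) / kBytesPerBlock;
  bits[index >> 3] ^= static_cast<uint8_t>(1u << (index & 7));
}
```

Calling `toggle_block` twice with the same offset leaves the bitmap unchanged, which is why a single XOR path can serve both allocation and release.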

Common interface

The main interfaces of FreelistManager are allocate and release.

virtual void allocate(uint64_t offset, uint64_t length,
                      KeyValueDB::Transaction txn) = 0;
virtual void release(uint64_t offset, uint64_t length,
                     KeyValueDB::Transaction txn) = 0;

Data structure

class BitmapFreelistManager : public FreelistManager {
  // rocksdb key prefixes: meta_prefix is B, bitmap_prefix is b
  std::string meta_prefix, bitmap_prefix;
  // pointer to the key-value DB, which wraps the rocksdb operations
  KeyValueDB *kvdb;
  // rocksdb merge operation: xor
  ceph::shared_ptr<KeyValueDB::MergeOperator> merge_op;
  // lock taken during enumerate operations
  std::mutex lock;
  // total size of the device
  uint64_t size;
  // total number of blocks on the device
  uint64_t blocks;
  // block size: bdev_block_size; defaults to min_alloc_size
  uint64_t bytes_per_block;
  // number of blocks per key; default is 128
  uint64_t blocks_per_key;
  // size of the space covered by each key
  uint64_t bytes_per_key;
  // block mask
  uint64_t block_mask;
  // key mask
  uint64_t key_mask;
  bufferlist all_set_bl;
  // members used to iterate over the rocksdb keys
  KeyValueDB::Iterator enumerate_p;
  uint64_t enumerate_offset;
  bufferlist enumerate_bl;
  int enumerate_bl_pos;
};

Initialize

When an OSD is first initialized, BlueStore runs mkfs and initializes FreelistManager by calling create and then init. When the process is restarted later, BlueStore runs mount instead, and only init is called on FreelistManager.
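The create/init split can be sketched as follows. This is an illustration under stated assumptions, not the BlueStore implementation: `MetaStore` is a hypothetical in-memory stand-in for the key-value database, `FreelistSketch` is a made-up type, and the value 128 for blocks per key is the default quoted in this article.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the key-value store; the real one is RocksDB.
using MetaStore = std::map<std::string, uint64_t>;

struct FreelistSketch {
  uint64_t size = 0, blocks = 0, bytes_per_block = 0, blocks_per_key = 0;
  uint64_t bytes_per_key = 0, block_mask = 0, key_mask = 0;

  // mkfs path only: persist the layout parameters once.
  static void create(MetaStore& db, uint64_t size, uint64_t min_alloc_size) {
    // The masks derived in init() only work for a power-of-two block size.
    assert(min_alloc_size != 0 && (min_alloc_size & (min_alloc_size - 1)) == 0);
    db["size"] = size;
    db["bytes_per_block"] = min_alloc_size;
    db["blocks_per_key"] = 128;
    db["blocks"] = size / min_alloc_size;
  }

  // mkfs and mount path: load the persisted parameters and derive the masks.
  bool init(const MetaStore& db) {
    auto s = db.find("size");
    auto n = db.find("blocks");
    auto b = db.find("bytes_per_block");
    auto k = db.find("blocks_per_key");
    if (s == db.end() || n == db.end() || b == db.end() || k == db.end()) {
      return false;  // create() was never run on this store
    }
    size = s->second;
    blocks = n->second;
    bytes_per_block = b->second;
    blocks_per_key = k->second;
    bytes_per_key = bytes_per_block * blocks_per_key;
    block_mask = ~(bytes_per_block - 1);
    key_mask = ~(bytes_per_key - 1);
    return true;
  }
};
```

In this sketch, calling `init` on a store that never went through `create` fails, which mirrors why mkfs has to run create before the first init.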

int BlueStore::mkfs()
{
  ...
  r = _open_fm(true);
  ...
}

int BlueStore::_open_fm(bool create)
{
  ...
  fm = FreelistManager::create(cct, freelist_type, db, PREFIX_ALLOC);
  // the first initialization has to persist the meta parameters
  if (create) {
    fm->create(bdev->get_size(), min_alloc_size, t);
  }
  ...
  int r = fm->init(bdev->get_size());
}

// create persists some meta parameters into kvdb;
// init reads these parameters back from kvdb
int BitmapFreelistManager::create(uint64_t new_size, uint64_t min_alloc_size,
                                  KeyValueDB::Transaction txn)
{
  txn->set(meta_prefix, "bytes_per_block", bl);  // min_alloc_size
  txn->set(meta_prefix, "blocks_per_key", bl);   // 128
  txn->set(meta_prefix, "blocks", bl);
  txn->set(meta_prefix, "size", bl);
}

// create/init call the following function to initialize the masks
void BitmapFreelistManager::_init_misc()
{
  // 128 >> 3 = 16: with 128 blocks per key and one bit per block,
  // the value of one key needs 16 bytes.
  bufferptr z(blocks_per_key >> 3);
  memset(z.c_str(), 0xff, z.length());
  all_set_bl.clear();
  all_set_bl.append(z);
  // 0xFFFF FFFF FFFF F000
  block_mask = ~(bytes_per_block - 1);
  bytes_per_key = bytes_per_block * blocks_per_key;
  // 0xFFFF FFFF FFF8 0000
  key_mask = ~(bytes_per_key - 1);
}

Merge

XOR Merge API implementation:

https://github.com/ceph/ceph/blob/master/src/os/bluestore/BitmapFreelistManager.cc#L21

// implements the rocksdb merge API: xor
struct XorMergeOperator : public KeyValueDB::MergeOperator {
  // old_value does not exist: rdata is assigned to new_value directly.
  void merge_nonexistent(const char *rdata, size_t rlen,
                         std::string *new_value) override {
    *new_value = std::string(rdata, rlen);
  }
  // old_value exists: XOR it with rdata byte by byte.
  void merge(const char *ldata, size_t llen,
             const char *rdata, size_t rlen,
             std::string *new_value) override {
    assert(llen == rlen);
    *new_value = std::string(ldata, llen);
    for (size_t i = 0; i < rlen; ++i) {
      (*new_value)[i] ^= rdata[i];
    }
  }
  // We use each operator name and each prefix to construct the
  // overall RocksDB operator name for consistency check at open time.
  string name() const override {
    return "bitwise_xor";
  }
};

Where the XOR Merge API is applied:

https://github.com/ceph/ceph/blob/master/src/kv/RocksDBStore.cc#L91

bool Merge(const rocksdb::Slice& key,
           const rocksdb::Slice* existing_value,
           const rocksdb::Slice& value,
           std::string* new_value,
           rocksdb::Logger* logger) const override {
  // for default column family
  // extract prefix from key and compare against each registered merge op;
  // even though merge operator for explicit CF is included in merge_ops,
  // it won't be picked up, since it won't match.
  for (auto& p : store.merge_ops) {
    if (p.first.compare(0, p.first.length(),
                        key.data(), p.first.length()) == 0 &&
        key.data()[p.first.length()] == 0) {
      // If old_value exists, merge with it; otherwise replace it directly.
      if (existing_value) {
        p.second->merge(existing_value->data(), existing_value->size(),
                        value.data(), value.size(), new_value);
      } else {
        p.second->merge_nonexistent(value.data(), value.size(), new_value);
      }
      break;
    }
  }
  return true;
}

Finally, the Merge method of RocksDB's Batch is called. A Batch applies simple writes and conditional writes atomically.
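The merge semantics described in this section can be tried out without RocksDB. The sketch below is an assumption-laden stand-in: `BitmapStore` is a hypothetical in-memory map and `xor_merge` is a made-up helper that follows the same two cases as the operator above (no existing value, or XOR with the existing value). It does not model batching or atomicity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for a store with an XOR merge operator.
using BitmapStore = std::map<std::string, std::string>;

void xor_merge(BitmapStore& db, const std::string& key,
               const std::string& operand) {
  auto it = db.find(key);
  if (it == db.end()) {
    // No existing value: the operand becomes the value.
    db[key] = operand;
    return;
  }
  // Existing value: XOR byte by byte; both sides must have the same length.
  assert(it->second.size() == operand.size());
  for (std::size_t i = 0; i < operand.size(); ++i) {
    it->second[i] ^= operand[i];
  }
}
```

Merging the same operand twice restores the previous value, so the store never needs to read the bitmap before writing an allocation or a release.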

Allocate

Allocating and freeing space are exactly the same operation: both call _xor, so let's focus on the _xor function.

void BitmapFreelistManager::allocate(uint64_t offset, uint64_t length,
                                     KeyValueDB::Transaction txn)
{
  _xor(offset, length, txn);
}

void BitmapFreelistManager::release(uint64_t offset, uint64_t length,
                                    KeyValueDB::Transaction txn)
{
  _xor(offset, length, txn);
}

void BitmapFreelistManager::_xor(uint64_t offset, uint64_t length,
                                 KeyValueDB::Transaction txn)
{
  // note that both offset and length are aligned to the block boundary
  uint64_t first_key = offset & key_mask;
  uint64_t last_key = (offset + length - 1) & key_mask;
  if (first_key == last_key) {
    // the simplest case: the operation falls within a single segment
    bufferptr p(blocks_per_key >> 3);  // 16-byte buffer
    p.zero();                          // set to all 0
    // index of the starting block within the segment
    unsigned s = (offset & ~key_mask) / bytes_per_block;
    // index of the ending block within the segment
    unsigned e = ((offset + length - 1) & ~key_mask) / bytes_per_block;
    for (unsigned i = s; i <= e; ++i) {
      // i >> 3 locates the byte that holds the block's bit;
      // 1ull << (i & 7) selects the bit within that byte
      p[i >> 3] ^= 1ull << (i & 7);
    }
    ...
  } else {
    // the first segment
    ...
    // the middle segments
    while (first_key < last_key) {
      ...
      // XOR operation with the current value
      txn->merge(bitmap_prefix, k, all_set_bl);
      // advance the key to locate the next segment
      first_key += bytes_per_key;
    }
    // the last segment
    {
      // similar to the previous operations
    }
  }
}

The _xor function looks complex, but it consists entirely of bit operations. On close inspection, allocation and release are the same operation: XOR the bits of the affected blocks in a segment with the segment's current value. A segment corresponds to a group of blocks, 128 by default, and to one value in the key-value store. For example, when all disk space is free, the key-value state is: (b00000000, 0x00), (b00001000, 0x00), (b00002000, 0x00)... Here b is the key prefix and stands for bitmap.
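To make the effect of a range that crosses a segment boundary concrete, here is a simplified sketch. It is not the _xor implementation: it walks block by block instead of building one mask per segment, `Bitmaps` is a hypothetical in-memory map in which a missing key means all zeros, and the 4 KiB block size is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>

constexpr uint64_t kBytesPerBlock = 4096;  // assumed block size
constexpr uint64_t kBlocksPerKey = 128;    // default quoted in the article
constexpr uint64_t kBytesPerKey = kBytesPerBlock * kBlocksPerKey;

using Segment = std::array<uint8_t, kBlocksPerKey / 8>;
// Hypothetical store: segment offset -> bitmap; new entries start as all zeros.
using Bitmaps = std::map<uint64_t, Segment>;

// XOR the bit of every block in [offset, offset + length).
void xor_range(Bitmaps& db, uint64_t offset, uint64_t length) {
  assert(length > 0);
  assert(offset % kBytesPerBlock == 0 && length % kBytesPerBlock == 0);
  for (uint64_t pos = offset; pos < offset + length; pos += kBytesPerBlock) {
    uint64_t key = pos & ~(kBytesPerKey - 1);                  // segment offset
    uint64_t i = (pos & (kBytesPerKey - 1)) / kBytesPerBlock;  // block index
    db[key][i >> 3] ^= static_cast<uint8_t>(1u << (i & 7));
  }
}

// Count the blocks whose bit is currently set.
uint64_t count_set(const Bitmaps& db) {
  uint64_t n = 0;
  for (const auto& kv : db) {
    for (uint8_t byte : kv.second) {
      for (int bit = 0; bit < 8; ++bit) {
        n += (byte >> bit) & 1u;
      }
    }
  }
  return n;
}
```

Applying `xor_range` to three blocks starting at block 126 touches two segments: the last two bits of the first segment and the first bit of the second. Applying it again to the same range clears all three bits.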

Release

Freeing space works exactly the same way as allocating it.

void BitmapFreelistManager::release(uint64_t offset, uint64_t length,
                                    KeyValueDB::Transaction txn)
{
  _xor(offset, length, txn);
}
