In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >
Share
Shulou(Shulou.com)06/02 Report--
Code structure (2) space manager
In this and the next article, we will introduce the space management part of dm dedup and the core process Imaco writing process.
Before that, let's take a look at the resources used and understand dm dedup's space manager space manager
The space manager is a huge array marked by the allocptr application pointer to scan the entire space for a week (back to current allocptr).
It is used to find the free block (white), assign it to a hash request, turn it into a green block, and hash_pbn this information into the kvs_hash_pbn table.
The space for kvs_lbn_pbn and kvs_hash_pbn is pre-allocated, so the index of both tables has a definite meaning, which is very easy to find.
Dc- > kvs_hash_pbn = dc- > mdops- > kvs_create_sparse (md, crypto_key_size,sizeof (struct hash_pbn_value), dc- > pblocks, unformatted); dc- > kvs_lbn_pbn = dc- > mdops- > kvs_create_linear (md, 8 memsizeof (struct lbn_pbn_value), dc- > lblocks, unformatted)
When creating kvs (key value space), there are two options, one is that linear is a mapped lbn-pbn, and the other is hash index.
Both ksize and vsize will be saved in different ways due to different types of space.
Create space mode inram of ① kvs-lbn-pbn,linear
Static struct kvstore * kvs_create_linear_inram (struct metadata * md,u32 ksize, U32 vsize,u32 kmax, bool unformatted) {struct kvstore_inram * kvs; U64 kvstore_size, tmp; kvs = kmalloc (sizeof (* kvs), GFP_NOIO); if (! kvs) return ERR_PTR (- ENOMEM); kvstore_size = (kmax + 1) * vsize; kvs- > store = vmalloc (kvstore_size) / * determine the size of kvs- > store. The idea here is very simple: 64 bit lbn is addressed to a ksize pbn, and the general pbn is 64 bit*/ / * kmax is the size of a logical device. This map-table means the mapping of lbn-pbn * / tmp = kvstore_size; (void) do_div (tmp, (1024 * 1024)). Memset (kvs- > store, EMPTY_ENTRY, kvstore_size); kvs- > ckvs.vsize = vsize; kvs- > ckvs.ksize = ksize; kvs- > kmax = kmax; kvs- > ckvs.kvs_insert = kvs_insert_linear_inram; / * insert api*/ kvs- > ckvs.kvs_lookup = kvs_lookup_linear_inram; / * find api*/ kvs- > ckvs.kvs_delete = kvs_delete_linear_inram / * Delete api*/ kvs- > ckvs.kvs_iterate = kvs_iterate_linear_inram; / * iterative api*/ md- > kvs_linear = kvs; return & (kvs- > ckvs);}
Let's take a quick look at kvs_insert_linear_inram and kvs_lookup_linear_inram, deletions and iterations left in the garbage collection section.
Static int kvs_insert_linear_inram (struct kvstore * kvs, void * key,s32 ksize, void * value,int32_t vsize) {u64 idx; char * ptr; struct kvstore_inram * kvinram = NULL; kvinram = container_of (kvs, struct kvstore_inram, ckvs); idx = * ((uint64_t *) key); ptr = kvinram- > store + kvs- > vsize * idx; / * map*/ memcpy with lbn as key,pbn (ptr, value, kvs- > vsize) Return 0;}
The inserted code is so simple that it can be understood that linear is an array like U64 kvs [kmax], lbn is the array subscript, and value is the array content.
Static int kvs_lookup_linear_inram (struct kvstore * kvs, void * key, S32 ksize, void * value, int32_t * vsize) {u64 idx; char * ptr; int r =-ENODATA; struct kvstore_inram * kvinram = NULL; kvinram = container_of (kvs, struct kvstore_inram, ckvs); idx = * (uint64_t *) key); ptr = kvinram- > store + kvs- > vsize * idx If (is_empty (ptr, kvs- > vsize) return r; memcpy (value, ptr, kvs- > vsize); * vsize = kvs- > vsize; return 0;}
Create space mode inram of ② kvs-hash-pbn,sparse
The way of sparse is different from that of linear, and its organization will be more complex.
He wants to save both key and value. Key_size is the size of the hash algorithm. For example, md5 is 128bit,value, pbn is 64 bit.
Static struct kvstore * kvs_create_sparse_inram (struct metadata * md, U32 ksize, U32 vsize, U32 knummax, bool unformatted) {struct kvstore_inram * kvs; U64 kvstore_size, tmp; kvs = kmalloc (sizeof (* kvs), GFP_NOIO) / * the maximum value of knummax key is applied for here according to the maximum value of pbn. The size of physical equipment * / knummax + = (knummax * HASHTABLE_OVERPROV) / 100 spare vsize * / kvstore_size = (vsize + ksize)); / * the applicant is vsize (pbn) 64bit ksize (hash_size) 128 bit*/ kvs- > store = vmalloc (kvstore_size); tmp = kvstore_size (void) do_div (tmp, (1024 * 1024)); memset (kvs- > store, EMPTY_ENTRY, kvstore_size); / * change all key to EMPTY_ENTRY= 0xFB in advance (the latest code 4.13 is 0xFF) * / kvs- > ckvs.vsize = vsize; kvs- > ckvs.ksize = ksize; kvs- > kmax = knummax; kvs- > ckvs.kvs_insert = kvs_insert_sparse_inram / * insert api*/ kvs- > ckvs.kvs_lookup = kvs_lookup_sparse_inram; / * find api*/ kvs- > ckvs.kvs_delete = kvs_delete_sparse_inram; / * Delete api*/ kvs- > ckvs.kvs_iterate = kvs_iterate_sparse_inram; / * iterative api*/ md- > kvs_sparse = kvs; return & (kvs- > ckvs);}
Next, let's take a look at the insert and find function, which will be a little more difficult than linear's, as long as a few examples will be easy to understand.
You can see the initialization process in the debugging information.
It also includes two writing processes, including lookup and insert, which will be easier to follow in order to help you understand quickly.
Static int kvs_insert_sparse_inram (struct kvstore * kvs, void * key,s32 ksize, void * value, S32 vsize) {u64 idxhead = * ((uint64_t *) key); / * whether key is 128bit or 64bit, we take 64bit out * / U32 entry_size, head, tail; char * ptr; struct kvstore_inram * kvinram = NULL; kvinram = container_of (kvs, struct kvstore_inram, ckvs); entry_size = kvs- > vsize + kvs- > ksize / * establish the entry_size of each unit, generally: 64bit + 128bit*/ head = do_div (idxhead, kvinram- > kmax); / * this step calculates the position of the specific key in the hash, taking the remainder of the idxhead as head, and the same probability of the key from the hash is very low. Dividing by kmax to take the remainder is to find the location for key, where head is also part of the hash attribute of key * / tail = head. / * this loop is very funny, although my debugging did not show that it worked, but we said earlier that head has certain hash properties, but it is not unique here, because key itself is very large 128bit, it can represent the uniqueness of the content of a pbn, the remaining head is a small value, it can not only represent pbn, although you can think that head:150702 represents idxhead:0x24100420901436 And save it in this head location, but it is not unique. Then if the * ptr of the location where the head after the remainder is generated already exists, you need to find another head for the key to save it. The code here is to take it backwards, find a NULL or * ptr dropped by deleted, and record the key at this location * / do {ptr = kvinram- > store + entry_size * head. If (is_empty (ptr, entry_size) | | is_deleted (ptr, entry_size) {memcpy (ptr, key, kvs- > ksize); memcpy (ptr + kvs- > ksize, value, kvs- > vsize); return 0;} head = next_head (head, kvinram- > kmax);} while (head! = tail); return-ENOSPC;}
The search code is almost the same as the inserted code.
That is, in do {} while, start looking from head, find the same key value, and take out its value (pbn).
Static int kvs_lookup_sparse_inram (struct kvstore * kvs, void * key,s32 ksize, void * value, int32_t * vsize) {U64 idxhead = * ((uint64_t *) key); U32 entry_size, head, tail; char * ptr; struct kvstore_inram * kvinram = NULL; int r =-ENODATA; kvinram = container_of (kvs, struct kvstore_inram, ckvs); entry_size = kvs- > vsize + kvs- > ksize; head = do_div (idxhead, kvinram- > kmax) Tail = head; do {ptr = kvinram- > store + entry_size * head; if (is_empty (ptr, entry_size)) return r; if (memcmp (ptr, key, kvs- > ksize)) {head = next_head (head, kvinram- > kmax);} else {memcpy (value, ptr + kvs- > ksize, kvs- > vsize); return 0 }} while (head! = tail); return r;}
Here you may feel very strange, why here is such a simple search method, why not sort and so on.
My thinking here is like this: first of all, this is a search in inram, so the performance of the search itself is still very high. Secondly, the head value of the remainder of this key, although it is not unique, it has the hash attribute of key, that is to say, the probability of conflict is not high, so when pbn does not apply for many applications, it should be found in one step, and even if there is a conflict. It is also looking backward, and it is likely to be the next one, so even if it collides, its saved location is also related, and it can be found in the next few of it. I think this performance should be good.
[this article is only posted by 51cto blogger "underlying Storage Technology" https://blog.51cto.com/12580077, official account release: storage Valley], if you need to reprint, please contact me, thank you.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.