2025-04-06 Update From: SLTechnology News&Howtos > Servers
Shulou (Shulou.com) 05/31 Report --
This article, the last in a series on RGW bucket shard design and optimization, walks through recovering an OSD whose bucket-index omap has grown too large, using a real incident as the example.
Recovering an OSD service with an oversized omap
When the omap of the OSD holding a bucket index grows too large, any exception that crashes the OSD process turns into on-the-spot firefighting: the OSD service has to be restored as quickly as possible. That incident is the subject of this article.
First determine the omap size of the affected OSD. A large omap forces the OSD to spend a lot of time and resources loading leveldb data at startup, which can prevent the OSD from starting at all (suicide timeout). Starting such an OSD also consumes a very large amount of memory, so be sure to reserve enough: here the machine had about 40 GB of physical RAM, and swap is no substitute.
root@demo:/# du -sh /var/lib/osd/ceph-214/current/omap/
22G     /var/lib/osd/ceph-214/current/omap/
2017-08-11 11:52:46.601938 7f298ae2e700  1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7f2980894700' had suicide timed out after 180
2017-08-11 11:52:46.605728 7f298ae2e700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, time_t)' thread 7f298ae2e700 time 2017-08-11 11:52:46.601952
common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")
Raise the OSD suicide timeout before starting the OSD service.
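As a rough worked example of the memory reservation above, assuming the observation later in this article (an OSD with ~16 GB of omap needed ~37 GB of RAM, roughly 2.3x) scales linearly; the ratio is an extrapolation from this one incident, not a rule:

```shell
# Hypothetical sizing check: scale the observed ~2.3x omap-to-RAM ratio
# to the 22 GB omap measured by the du command above.
omap_gb=22
need_gb=$((omap_gb * 23 / 10))   # integer approximation of omap_gb * 2.3
echo "omap=${omap_gb}G, reserve at least ${need_gb}G of RAM before restarting"
```

With the 22 GB omap measured above this suggests reserving about 50 GB, consistent with the ~40 GB machine in this incident being near its limit.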
Add the following to ceph.conf on the failed node:
[osd]
debug osd = 20                                    # raise the debug level
[osd.214]
...
filestore_op_thread_suicide_timeout = 7000        # raise the timeout so the OSD is not killed by the suicide check during startup
Watch the log:
tailf /home/ceph/log/ceph-osd.214.log
Start the service:
/etc/init.d/ceph start osd.214
Also launch top in another session to watch the process's resource consumption. At present, an OSD with about 16 GB of omap needs roughly 37 GB of memory, and during recovery the OSD process uses a very large amount of memory and CPU. (The original article showed a screenshot of top output here.)
Releasing memory at the right moment
When the following record appears in the log, you can free the memory (or you can leave this until the very end):
2017-08-11 15:08:14.551305 7f2b3fcab900  0 osd.214 29425 load_pgs opened ... pgs
The command to free memory is as follows:
ceph tell osd.214 heap release
Monitoring during OSD service recovery
After the above operations the OSD continues to recover omap data, and the whole process takes quite a while. You can open watch ceph -s in another terminal to follow it. The recovery rate is typically around 14 MB/s, which gives the estimation formula below.
Recovery time (in seconds) = total omap size (in MB) / 14. Note: the total omap size is the figure obtained from the earlier du command.
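The formula above can be checked with the numbers from this incident (22 GB of omap from the du command, the observed ~14 MB/s rate):

```shell
# Worked example of the recovery-time estimate, using figures from this article.
omap_mb=$((22 * 1024))        # 22 GB of omap expressed in MB
rate=14                       # observed recovery rate, MB per second
secs=$((omap_mb / rate))
echo "estimated recovery: ${secs}s (~$((secs / 60)) minutes)"
```

So the 22 GB omap here implies roughly 27 minutes of recovery, which matches the article's description of a "relatively long" process.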
The logs during the recovery are as follows
2017-08-11 15:11:25.049357 7f2a3b327700 10 osd.214 pg_epoch: 29450 pg[76.2b6( v 29425'5676261 lc 29425'5676261 (29296'5672800,29425'5676261] local-les=29449 n=4 ec=20531 les/c 29449/29447 29448/29448/28171) [70,23,214] r=2 lpr=29448 pi=20532-29447/35 luod=0'0 crt=29425'5676261 lcod 0'0 active m=1] handle_message: 0x651131200
2017-08-11 15:11:25.049380 7f2a3b327700 10 osd.214 pg_epoch: 29450 pg[76.2b6( ... active m=1] handle_push ObjectRecoveryInfo(6f648ab6/.dir.hxs1.55076.1.6/head//76@29425'5676261, copy_subset: [], clone_subset: {}), ObjectRecoveryProgress(!first, data_recovered_to:0, data_complete:false, omap_recovered_to:0_00001948372.1948372.3, omap_complete:false)
2017-08-11 15:11:25.049400 7f2a3b327700 10 osd.214 pg_epoch: 29450 pg[76.2b6( ... active m=1] submit_push_data: Creating oid 6f648ab6/.dir.hxs1.55076.1.6/head//76 in the temp collection
2017-08-11 15:11:25.123153 7f2a3b327700 10 osd.214 29450 dequeue_op 0x651131200 finish
2017-08-11 15:11:25.138155 7f2b357a1700  5 osd.214 29450 tick
2017-08-11 15:11:25.138186 7f2b357a1700 20 osd.214 29450 scrub_should_schedule should run between 0 - 24 now 15 = yes
2017-08-11 15:11:25.138210 7f2b357a1700 20 osd.214 29450 scrub_should_schedule loadavg 3.34 >= max, load too high
2017-08-11 15:11:25.138221 7f2b357a1700 20 osd.214 29450 sched_scrub load_is_low=0
2017-08-11 15:11:25.138223 7f2b357a1700 10 osd.214 29450 sched_scrub 76.2a9 high load at 2017-08-10 ... : 99109.8 < max (604800 seconds)
2017-08-11 15:11:25.138235 7f2b357a1700 20 osd.214 29450 sched_scrub done
2017-08-11 15:11:25.138239 7f2b357a1700 10 osd.214 29450 do_waiters -- start
2017-08-11 15:11:25.138... 7f2b357a1700 10 osd.214 29450 do_waiters -- finish
2017-08-11 15:11:25.163988 7f2aaef77700 20 osd.214 29450 share_map_peer 0x66b4e0260 already has epoch 29450
2017-08-11 15:11:25.164042 7f2ab077a700 20 osd.214 29450 share_map_peer 0x66b4e0260 already has epoch 29450
2017-08-11 15:11:25.268001 7f2aaef77700 20 osd.214 29450 share_map_peer 0x66b657a20 already has epoch 29450
2017-08-11 15:11:25.268075 7f2ab077a700 20 osd.214 29450 share_map_peer 0x66b657a20 already has epoch 29450
When the OSD's PGs return to a normal state, you can move on to the wrap-up steps below.
Wrap-up work
Free the memory
After the OSD finishes recovering its data, CPU usage drops, but the memory is not released automatically, so be sure to free it with the same command as before (ceph tell osd.214 heap release).
Adjust the log level:
ceph tell osd.214 injectargs "--debug_osd=0/5"
Delete the temporary entries added to ceph.conf earlier.
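A minimal sketch of that cleanup, run here against a scratch copy rather than the real ceph.conf; the section and key names match the fragment added during the incident, while the sed patterns and temp-file handling are this sketch's own:

```shell
# Build a scratch copy containing the temporary recovery settings.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[osd]
debug osd = 20
[osd.214]
filestore_op_thread_suicide_timeout = 7000
EOF
# Remove the two lines that were added only for the recovery.
sed -i '/^debug osd = 20$/d; /^filestore_op_thread_suicide_timeout/d' "$conf"
result=$(cat "$conf")
echo "$result"
rm -f "$conf"
```

On the real node you would edit /etc/ceph/ceph.conf (or wherever your cluster keeps it) the same way, leaving the [osd] and [osd.214] sections' other contents untouched.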
With this, all three articles in the bucket shard series are complete.
© 2024 shulou.com SLNews company. All rights reserved.