2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article walks through the read and write paths of Ceph on the OSD side, with references to the relevant source code. It is meant as a reference for readers who want to trace these paths themselves.
1. Modules in the OSD
Message encapsulation
Messengers send and receive messages on the OSD. There are two of them:
1. cluster_messenger: communicates with other OSDs and with the monitors
2. client_messenger: communicates with clients
Message scheduling
The Dispatcher class is mainly responsible for classifying incoming messages.
Work queue
1. OpWQ: handles ops (from clients) and sub ops (from other OSDs). Runs on op_tp.
2. PeeringWQ: handles peering tasks. Runs on op_tp.
3. CommandWQ: processes cmd commands. Runs on command_tp.
4. RecoveryWQ: data recovery. Runs on recovery_tp.
5. SnapTrimWQ: snapshot trimming. Runs on disk_tp.
6. ScrubWQ: scrub. Runs on disk_tp.
7. ScrubFinalizeWQ: scrub. Runs on disk_tp.
8. RepScrubWQ: scrub. Runs on disk_tp.
9. RemoveWQ: deletes old PG directories. Runs on disk_tp.
Thread pool
The OSD has four thread pools:
1. op_tp: handles ops and sub ops
2. recovery_tp: handles recovery tasks
3. disk_tp: handles disk-intensive tasks
4. command_tp: handles commands
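The relationship between the work queues and the thread pools listed above can be restated as a small lookup table. This is only a sketch: the helper names work_queue_pools, pool_for and queues_on are made up for illustration and are not taken from the Ceph source.

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative table only: it restates the lists above and is not Ceph code.
inline const std::map<std::string, std::string>& work_queue_pools() {
    static const std::map<std::string, std::string> table = {
        {"OpWQ", "op_tp"},
        {"PeeringWQ", "op_tp"},
        {"CommandWQ", "command_tp"},
        {"RecoveryWQ", "recovery_tp"},
        {"SnapTrimWQ", "disk_tp"},
        {"ScrubWQ", "disk_tp"},
        {"ScrubFinalizeWQ", "disk_tp"},
        {"RepScrubWQ", "disk_tp"},
        {"RemoveWQ", "disk_tp"},
    };
    return table;
}

// Returns the thread pool a work queue runs on, or "" for an unknown queue.
inline std::string pool_for(const std::string& work_queue) {
    auto it = work_queue_pools().find(work_queue);
    return it == work_queue_pools().end() ? std::string() : it->second;
}

// Counts how many work queues share the given thread pool.
inline int queues_on(const std::string& pool) {
    assert(!pool.empty());
    int count = 0;
    for (const auto& entry : work_queue_pools()) {
        if (entry.second == pool) ++count;
    }
    return count;
}
```

For example, pool_for("RecoveryWQ") yields "recovery_tp", and queues_on("disk_tp") shows that five of the nine queues share the disk pool.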
2. Ceph read process
Note: at first I was unclear about the format of the index, how it is looked up and updated, and how it is persisted.
It turns out there is no index; object locations follow fixed rules.
The file name format for each object is:
objectname_key_head(snap_num)_hash_namespace_poolid
· objectname: the object name
· key and namespace: both are specified by the client and are used to subdivide the namespace. For block devices they are generally left empty.
· head (snap_num): the snapshot version
· hash: computed from objectname, of type u_int32_t, printed as hexadecimal characters, such as 3AF0B980
· poolid: the id of the pool
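The naming rule above can be sketched as a small string-building function. This is an illustration under assumptions, not Ceph's implementation: object_file_name is a hypothetical helper, the hash is assumed to print as eight uppercase hex digits (as in the 3AF0B980 example), the snapshot field is passed in as ready-made text, and any escaping of special characters in names is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: joins the fields in the order described above.
// `snap` is the snapshot field as text ("head" for the live object).
std::string object_file_name(const std::string& objectname,
                             const std::string& key,
                             const std::string& snap,
                             std::uint32_t hash,
                             const std::string& name_space,
                             long poolid) {
    assert(poolid >= 0);
    // Assumption: the 32-bit hash is printed as 8 uppercase hex digits.
    char hex[9];
    std::snprintf(hex, sizeof(hex), "%08X", static_cast<unsigned int>(hash));
    return objectname + "_" + key + "_" + snap + "_" + hex + "_" +
           name_space + "_" + std::to_string(poolid);
}
```

With an empty key and namespace, the separators collapse into double underscores, which is why block-device object names contain "__head_" and end in "__" followed by the pool id.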
Directory structure:
data directory/PG name/subdirectories/object file name
For example:
/data09/ceph/osd2/current/0.0_head/DIR_0/DIR_8/DIR_9/10000007af4.00000000__head_3AF0B980__0
The subdirectories are derived from the hash field of the object file name, taking its characters in reverse order: the hash 3AF0B980 read backwards starts with 0, 8, 9, which gives DIR_0/DIR_8/DIR_9. When the number of files in a directory exceeds the configured value (merge_threshold * 16 * split_multiplier), a new level of subdirectories is created and the files are moved into it.
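The directory rule can be checked against the example path with a few lines of code. This is a minimal sketch: hash_subdir and split_threshold are hypothetical helper names, and the values passed to split_threshold in the usage note are example inputs, not a statement about Ceph's defaults.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the nested directory path from the last `levels` hex characters
// of the hash, taken in reverse order.
std::string hash_subdir(const std::string& hash_hex, std::size_t levels) {
    assert(levels <= hash_hex.size());
    std::string path;
    for (std::size_t i = 0; i < levels; ++i) {
        if (i > 0) path += '/';
        path += "DIR_";
        path += hash_hex[hash_hex.size() - 1 - i];
    }
    return path;
}

// File count above which a directory is split, using the formula from the text.
long split_threshold(long merge_threshold, long split_multiplier) {
    return merge_threshold * 16 * split_multiplier;
}
```

hash_subdir("3AF0B980", 3) gives DIR_0/DIR_8/DIR_9, matching the example path. With example inputs of 10 and 2, split_threshold returns 320 files.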
The relevant source files are:
ReplicatedPG.h
ReplicatedPG.cc
In the function int ReplicatedPG::do_osd_ops(OpContext *ctx, vector<OSDOp>& ops), the case CEPH_OSD_OP_READ branch makes these calls:
r = osd->store->fiemap(coll, soid, op.extent.offset, op.extent.length, bl);
r = pgbackend->objects_read_sync(
soid, miter->first, miter->second, &tmpbl);
pgbackend->objects_read_sync resolves to int ReplicatedBackend::objects_read_sync, which calls store->read(coll, hoid, off, len, *bl), that is, ObjectStore::read.
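The shape of this call chain, a backend that forwards a synchronous read to the object store, can be sketched with in-memory stand-ins. Everything here is simplified and hypothetical: SketchObjectStore and SketchBackend only borrow the method names from the text, the signatures are reduced (no collection handle, plain strings instead of bufferlists), and the -1 error code is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in for the object store: object name -> contents, held in memory.
struct SketchObjectStore {
    std::map<std::string, std::string> objects;

    // Copies up to `len` bytes starting at `off` into *out.
    // Returns the number of bytes read, or -1 if the object does not exist.
    int read(const std::string& oid, std::size_t off, std::size_t len,
             std::string* out) const {
        assert(out != nullptr);
        auto it = objects.find(oid);
        if (it == objects.end()) return -1;
        const std::string& data = it->second;
        if (off >= data.size()) {
            out->clear();
            return 0;
        }
        *out = data.substr(off, len);  // substr clamps len to the object end
        return static_cast<int>(out->size());
    }
};

// Stand-in for the PG backend: the synchronous read just forwards to the store.
struct SketchBackend {
    const SketchObjectStore* store;

    int objects_read_sync(const std::string& oid, std::size_t off,
                          std::size_t len, std::string* out) const {
        return store->read(oid, off, len, out);
    }
};
```

The point of the sketch is only the delegation: the PG-level code asks the backend, and the backend asks the store for a byte range of one object.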
3. Ceph write process
Phase 1: the primary node sends a request
Phase 2: the replica node processes the request
osd->store->queue_transactions(&osr, rm->tls, onapply, oncommit);
Two callbacks are registered here:
Context *oncommit = new C_OSD_RepModifyCommit(rm); called when the journal entry has been written to disk
Context *onapply = new C_OSD_RepModifyApply(rm); called when the operation has been applied
onapply sends an ACK response to the primary node, and oncommit sends an ON_DISK response.
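The two-callback pattern can be sketched with std::function in place of Ceph's Context objects. This is an illustration only: SketchStore and SketchReplica are hypothetical types, queue_transactions here takes just the two callbacks, and finish_apply and finish_commit are made-up hooks that stand in for the store completing each stage. The sketch makes no claim about which stage finishes first in a real OSD.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the object store: it remembers the two completion callbacks
// and fires each one when the corresponding stage finishes.
struct SketchStore {
    std::function<void()> onapply;
    std::function<void()> oncommit;

    void queue_transactions(std::function<void()> apply_cb,
                            std::function<void()> commit_cb) {
        assert(apply_cb && commit_cb);
        onapply = std::move(apply_cb);
        oncommit = std::move(commit_cb);
    }
    void finish_apply() { if (onapply) onapply(); }    // operation applied
    void finish_commit() { if (oncommit) oncommit(); } // journal on disk
};

// Stand-in for the replica: each callback records the reply it would send
// to the primary node.
struct SketchReplica {
    SketchStore store;
    std::vector<std::string> sent;

    void handle_rep_modify() {
        store.queue_transactions(
            [this] { sent.push_back("ACK"); },
            [this] { sent.push_back("ON_DISK"); });
    }
};
```

Registering both callbacks up front lets the replica report the two milestones independently, whichever the store reaches first.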
Note: I have not yet had time to look at the details of transaction encapsulation, journal writes, or object writes.
Phase 3: the primary node receives the responses from the replica nodes and responds to the client
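One way to picture phase 3 is a small tracker on the primary that records which replicas have answered. The article does not describe the actual bookkeeping, so this sketch rests on an assumption: the primary replies to the client once every replica has sent the corresponding response. SketchRepGather and its members are hypothetical names.

```cpp
#include <cassert>
#include <set>

// Hypothetical tracker kept by the primary for one replicated write.
struct SketchRepGather {
    std::set<int> waiting_for_ack;      // replica ids that have not sent ACK
    std::set<int> waiting_for_on_disk;  // replica ids that have not sent ON_DISK

    explicit SketchRepGather(const std::set<int>& replicas)
        : waiting_for_ack(replicas), waiting_for_on_disk(replicas) {
        assert(!replicas.empty());
    }

    void on_ack(int osd) { waiting_for_ack.erase(osd); }
    void on_disk(int osd) { waiting_for_on_disk.erase(osd); }

    // Assumption: the client is answered only after all replicas responded.
    bool all_acked() const { return waiting_for_ack.empty(); }
    bool all_on_disk() const { return waiting_for_on_disk.empty(); }
};
```

With replicas 1 and 2, all_acked() becomes true only after both on_ack(1) and on_ack(2) have been called, and all_on_disk() is tracked separately.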