What is the structure of Ceph?

Updated 2025-01-18. From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31

This article introduces the structure of Ceph: the layers of the Ceph system, the logical structure of RADOS, and the logical structure of an OSD.

4.1 Hierarchical structure of the Ceph system

The logical hierarchy of the Ceph storage system is shown in the following figure [1].

From the bottom up, the Ceph system can be divided into four tiers:

(1) Basic storage system: RADOS (Reliable, Autonomic Distributed Object Store)

As the name implies, this layer itself is a complete object storage system, and all user data stored in the Ceph system is actually stored by this layer. The high reliability, high scalability, high performance, high automation and other features of Ceph are essentially provided by this layer. Therefore, understanding RADOS is the basis and key to understanding Ceph.

Physically, RADOS is composed of a large number of storage device nodes, each of which has its own hardware resources (CPU, memory, hard disk, network) and runs the operating system and file system. Sections 4.2 and 4.3 will introduce RADOS.

(2) Basic library: librados

This layer abstracts and encapsulates RADOS and provides an API to the upper layers, so that applications can be developed directly on RADOS (rather than on Ceph as a whole). Note that RADOS is an object storage system, so the API implemented by librados covers object storage capabilities only.

RADOS is written in C++ and provides native librados APIs in both C and C++, as documented in [2]. Physically, librados and the application built on it run on the same machine, which is why it is also called the local API. The application calls the librados API on the local machine, and librados in turn communicates with the nodes of the RADOS cluster over sockets to carry out the requested operations.
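To make the "local API" idea concrete, here is a minimal sketch in Python. It assumes the `rados` Python binding that ships with Ceph (a wrapper over the C API; the text above only mentions C and C++). `FakeIoctx`, `store_and_fetch` and `open_real_ioctx` are hypothetical names introduced for illustration, and the in-memory stand-in exists only so the sketch can run without a cluster.

```python
class FakeIoctx:
    """In-memory stand-in for a librados I/O context (hypothetical, for dry runs)."""

    def __init__(self):
        self._objects = {}

    def write_full(self, name, data):
        # Replace the whole object, mirroring write_full in the binding.
        self._objects[name] = bytes(data)

    def read(self, name, length=8192, offset=0):
        # Return up to `length` bytes starting at `offset`.
        return self._objects[name][offset:offset + length]


def store_and_fetch(ioctx, name, payload):
    """Write one object and read it back through an ioctx-like handle."""
    ioctx.write_full(name, payload)
    return ioctx.read(name, len(payload))


def open_real_ioctx(conffile, pool):
    """Open a pool on a real cluster; requires the Ceph `rados` module."""
    import rados  # imported lazily so the sketch runs without Ceph installed
    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    return cluster, cluster.open_ioctx(pool)


# Dry run against the stand-in; no cluster is contacted here.
result = store_and_fetch(FakeIoctx(), "greeting", b"hello RADOS")
print(result)
```

Against a real cluster, the handle returned by `open_real_ioctx` would be passed to `store_and_fetch` in place of `FakeIoctx()`. Note that the calls name only a pool and an object: there is no account or container concept at this level.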

(3) High-level application interface

This layer consists of three parts: RADOS GW (RADOS Gateway), RBD (RADOS Block Device) and Ceph FS (Ceph File System). Building on the librados library, it provides interfaces at a higher level of abstraction that are more convenient for applications and clients to use.

RADOS GW is a gateway that provides a RESTful API compatible with Amazon S3 and Swift, for developing the corresponding object storage applications. Its API sits at a higher level of abstraction than librados but is less powerful, so developers should choose between the two according to their needs.

RBD provides a standard block device interface and is often used to create volumes for virtual machines in virtualization scenarios. At present, Red Hat has integrated the RBD driver into KVM/QEMU to improve the performance of virtual machine access.

Ceph FS is a POSIX-compatible distributed file system. When this article was written it was still under development, and the Ceph official website did not recommend it for production use; it has been declared stable since the Jewel release.

(4) Application layer

This layer comprises the applications built on Ceph's interfaces in different scenarios, such as object storage applications developed directly on librados, object storage applications developed on RADOS GW, and cloud disks based on RBD.

In the above introduction, there is one area that may easily cause confusion: since RADOS itself is already an object storage system and can also provide librados API, why develop a separate RADOS GW?

Answering this question helps to understand the nature of RADOS, so it is worth analyzing here. At first glance, the difference is that librados provides a native API while RADOS GW provides a RESTful API, so their programming models and actual performance differ. More fundamentally, the two levels of abstraction target different application scenarios. RADOS, S3 and Swift are all distributed object storage systems, but RADOS offers more basic and richer functionality. A comparison makes this clear.

Because the API functions supported by Swift and S3 are similar, Swift is used as the example here. The API features provided by Swift mainly include:

User management operations: user authentication, obtaining account information, listing containers, etc.

Container management operations: create / delete containers, read container information, list objects in containers, etc.

Object management operations: write, read, copy, update, delete, access permission settings, metadata read or update of objects.

The API provided by Swift (and S3) therefore operates on only three kinds of "object": the user account, the container in which the user stores data objects, and the data object itself. None of the operations touch the underlying hardware or system information of the storage system. Such an API is clearly aimed at developers and users of object storage applications, and it assumes that they care mainly about account and data management, not about the details of the underlying storage system, let alone in-depth tuning of efficiency and performance.
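The three-level model above shows up directly in the request paths of the Swift REST API, which take the form `/v1/{account}/{container}/{object}`. The sketch below maps a few of the operations listed above to an HTTP method and a path. The function name `swift_request` and the operation labels are hypothetical, and authentication headers and percent-encoding are left out for brevity.

```python
def swift_request(operation, account, container=None, obj=None):
    """Return (HTTP method, path) for a Swift-style storage operation."""
    # operation label -> (HTTP method, number of path components needed)
    operations = {
        "list_containers": ("GET", 1),
        "create_container": ("PUT", 2),
        "delete_container": ("DELETE", 2),
        "list_objects": ("GET", 2),
        "write_object": ("PUT", 3),
        "read_object": ("GET", 3),
        "delete_object": ("DELETE", 3),
        "update_object_metadata": ("POST", 3),
    }
    method, depth = operations[operation]
    parts = [account, container, obj][:depth]
    if any(part is None for part in parts):
        raise ValueError(f"{operation} needs {depth} path components")
    # Names are joined as-is; a real client would percent-encode them.
    return method, "/v1/" + "/".join(parts)


# Hypothetical account, container and object names.
print(swift_request("read_object", "AUTH_demo", "photos", "cat.jpg"))
```

Every path names an account, a container or an object and nothing else; no request addresses a storage node or a placement policy.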

The design of the librados API is completely different. On the one hand, librados has no high-level concepts such as accounts and containers. On the other hand, the librados API exposes a large amount of RADOS status information and many configuration parameters, allowing developers to observe the state of the RADOS system and of the objects stored in it, and to exert strong control over the system's storage policy. In other words, by calling the librados API an application can not only operate on data objects but also manage and configure the RADOS system. For the RESTful API design of S3 and Swift, this would be both unthinkable and unnecessary.

From this comparison it follows that librados suits advanced users who understand the system in depth and have strong requirements for feature customization and performance optimization. librados-based development is a better fit for dedicated applications on private Ceph systems, or for background data management and processing applications on Ceph-based public storage systems. RADOS GW is a better fit for common web-based object storage applications, such as object storage services on a public cloud.

4.2 Logical structure of RADOS

A RADOS cluster consists mainly of two kinds of nodes: a large number of OSDs (Object Storage Devices), which store and maintain data, and several monitors, which detect and maintain system state. OSDs and monitors exchange node state information to build up the overall working state of the system, recorded in a global data structure called the cluster map. Combined with the algorithm that RADOS provides, this data structure implements Ceph's core mechanism of "no table lookup, just computation" and several of its other strengths.
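As a rough illustration of how node state reports can be folded into one global record, here is a toy model. `ClusterMap`, `Monitor`, `report` and the `epoch` counter are names chosen for this sketch; it is a simplification under stated assumptions and does not reproduce the real monitor protocol or the actual contents of the cluster map.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterMap:
    epoch: int = 0                                # version counter (assumed for this sketch)
    osd_up: dict = field(default_factory=dict)    # OSD id -> True if the OSD is up


class Monitor:
    """Toy monitor: folds OSD state reports into one versioned map."""

    def __init__(self):
        self.cluster_map = ClusterMap()

    def report(self, osd_id, up):
        """Record a state report; bump the epoch only when state really changes."""
        if self.cluster_map.osd_up.get(osd_id) == up:
            return self.cluster_map.epoch         # no change, the map stays as-is
        self.cluster_map.osd_up[osd_id] = up
        self.cluster_map.epoch += 1
        return self.cluster_map.epoch


mon = Monitor()
mon.report(0, True)    # OSD 0 joins
mon.report(1, True)    # OSD 1 joins
mon.report(0, True)    # repeated report, map unchanged
mon.report(1, False)   # OSD 1 fails
print(mon.cluster_map)
```

The point of the version counter is that a client holding a copy of the map only needs a fresh one when the counter has moved, which happens on state changes rather than on every data access.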

When a RADOS system is in use, a large number of client programs obtain the cluster map by interacting with an OSD or a monitor, then compute the storage location of an object locally and communicate directly with the corresponding OSDs to carry out data operations. As long as the cluster map is not updated frequently, a client can complete data access without relying on any metadata server and without any table lookup. While RADOS is running, the cluster map is updated only when the state of the system changes, and only two common events cause such a change: the failure of an OSD or the expansion of the RADOS cluster. In normal use, both events occur far less often than client data accesses.
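The idea of computing a location instead of looking it up can be sketched as follows. This is not Ceph's actual placement algorithm: it substitutes plain rendezvous (highest-random-weight) hashing as a stand-in, and `locate` and `_score` are hypothetical names. The only input besides the object name is the list of OSDs currently up, which a client would take from its copy of the cluster map.

```python
import hashlib


def _score(key, osd_id):
    """Deterministic pseudo-random weight for a (key, OSD) pair."""
    digest = hashlib.sha256(f"{key}:{osd_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def locate(object_name, up_osds, replicas=3):
    """Pick the OSDs for an object from the map alone, with no lookup table."""
    ranked = sorted(up_osds, key=lambda osd: _score(object_name, osd), reverse=True)
    return ranked[:replicas]


osds = list(range(8))                      # OSD ids taken from the cluster map
placement = locate("volume-42/chunk-7", osds)
print(placement)

# After an OSD failure the map changes, and every client recomputes locally.
survivors = [osd for osd in osds if osd != placement[0]]
print(locate("volume-42/chunk-7", survivors))
```

Any two clients holding the same map reach the same answer independently, and when one OSD drops out only the objects that were placed on it move. That is why infrequent map updates are enough to keep clients consistent.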

4.3 Logical structure of OSD

By definition, an OSD can be abstracted into two components: the system part and the OSD daemon part.

The system part of OSD is essentially a computer with an operating system and file system installed, and its hardware part includes at least a single-core processor, a certain amount of memory, a hard disk and a network card.

Because an x86 server that small is not practical (in fact, such machines are essentially never seen), multiple OSDs are usually deployed together on a larger server in practice. When choosing a system configuration, make sure that each OSD gets a certain amount of computing power, a certain amount of memory, and a hard disk, and that the server has sufficient network bandwidth. For specific hardware configuration choices, please refer to [4].

On this system platform, each OSD has its own OSD daemon. The daemon carries out all the logical functions of the OSD: communicating with monitors and other OSDs (in practice, the daemons of other OSDs) to maintain and update system state, working with other OSDs to store and maintain data, communicating with clients to perform operations on data objects, and so on.
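The sizing rule above amounts to a simple minimum over resources. The sketch below is only arithmetic: the function name and all figures in the usage line are hypothetical, not recommendations, and network bandwidth is not modelled.

```python
def max_osds_per_server(cores, memory_gb, disks, cores_per_osd, memory_gb_per_osd):
    """Number of OSDs a server can host, assuming one disk per OSD."""
    by_cpu = int(cores // cores_per_osd)            # OSDs the CPU budget allows
    by_memory = int(memory_gb // memory_gb_per_osd)  # OSDs the memory budget allows
    return min(by_cpu, by_memory, disks)


# Hypothetical server: 16 cores, 64 GB RAM, 12 disks; 1 core and 8 GB per OSD.
print(max_osds_per_server(16, 64, 12, cores_per_osd=1, memory_gb_per_osd=8))
```

In this made-up case memory is the limiting resource, so four of the twelve disks would go unused unless more memory were added.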
