How to Build Object Storage Based on Ceph

Updated 2025-01-30. From: SLTechnology News&Howtos (Shulou.com)

This article introduces how to build object storage based on Ceph and is intended as a practical reference.

Storage development

Data storage is an eternal topic and a theme constantly explored by human beings.

Knotted cords

In primitive societies, before writing was invented, people kept records by tying knots in ropes.

Punched cards

Punched cards were a mainstream storage method from the early 20th century and the earliest form of mechanized information storage. From the 1960s onward they were gradually replaced by other storage methods. Today punched cards are rarely used, except to read historical data that was stored on them at the time.

Magnetic drum memory

In the 1950s, magnetic drums were used as main memory in the IBM 650. Drums were later used for swap space and paging storage in machines such as the IBM System/360 Model 91 and the DEC PDP-11. A representative product is the IBM 2301 fixed-head drum storage unit. A drum stores data in magnetic material coated on the surface of an aluminum cylinder. Because the drum rotates at high speed, access is fast. Drums used saturation magnetic recording and evolved from fixed heads to floating heads and from magnetic coatings to electroplated continuous magnetic media. All of this laid the foundation for later disk storage.

The biggest disadvantage of the magnetic drum is its small capacity. A large cylinder offers only its outer surface for storage, whereas both sides of a disk platter can be used, so a disk makes far better use of its volume. Once disks appeared, drums were phased out.

Magnetic tape

Magnetic tape has been used as a data storage medium since 1951. Among commonly used storage media, it has the lowest cost per unit of storage, the largest capacity, and the highest degree of standardization. From the late 1970s to the 1980s, compact cassette tapes appeared; each side of a 90-minute tape could hold about 660 KB of data.

Floppy disk

The floppy disk was invented in 1969; the first model was 8 inches in diameter with a single-sided capacity of 80 KB. Four years later, a 5.25-inch floppy disk with a capacity of 320 KB appeared. Over time, floppy disks became smaller in diameter, larger in capacity, and more reliable. Figure 2-10 shows three typical floppy disks: (a) shows floppy disks of different sizes, and the 3.5-inch floppy disk in (b) has a capacity of 1.44 MB and was widely used as the main removable storage medium. In the late 1990s a 3.5-inch floppy disk with a capacity of 250 MB appeared, but it never became widespread because of compatibility, reliability, cost, and other issues.

Optical disc

Early optical discs were mainly used in the film industry. The first disc entered the market in 1987, with a diameter of 30 cm and 60 minutes of audio and video per side.

Hard disk storage

The first hard drive, the IBM Model 350 Disk File, was manufactured in 1956. It contained 50 24-inch platters with a total capacity of less than 5 MB. Mechanical hard drives have since developed to the point where the capacity of a single drive exceeds 16 TB.

Three ways of storage

Block storage

DAS

Direct-Attached Storage (DAS) is the simplest form of external storage: it connects directly to an expansion interface on a server or client through a data cable. It is just a stack of hardware without a storage operating system of its own, so it cannot provide storage services independently of the server. A common form of DAS is an external disk array, typically configured as a RAID controller plus a set of disks. DAS is easy to install and inexpensive, which makes it especially suitable for small and medium-sized data centers with modest capacity requirements and few servers.

SAN

Storage Area Network (SAN). By default, the term SAN refers to FC-SAN. SAN storage comes in two forms:

FC-SAN

A typical SAN uses Fibre Channel (FC) technology to connect nodes and Fibre Channel switches (FC switches) to provide network switching. Unlike in a general-purpose data network, data transfer in a storage area network is based on the FC protocol stack, and the SCSI protocol running on top of it provides storage access. The iSCSI storage protocol offers a lower-cost alternative in which SCSI runs on top of the TCP/IP stack. To distinguish the two kinds of storage area network, the former is usually called FC SAN and the latter IP SAN.

IP-SAN

Because FC-SAN is expensive, people began to consider building storage networks on Ethernet, which led to iSCSI. iSCSI (Internet Small Computer System Interface) is a standard for transferring data blocks over TCP/IP. It was initiated by Cisco and IBM and has been strongly supported by the major storage vendors. The commands transmitted in a SAN are SCSI read and write commands, not IP packets, so iSCSI encapsulates SCSI commands for transport over an IP network. This allows fast data access and backup over high-speed Gigabit Ethernet. To distinguish it from FC SAN, which is based on Fibre Channel, this technology is called IP SAN.

Advantages

High performance, centralized management, and guaranteed stability and security.

Disadvantages

High cost; in addition, compatibility constraints between disk arrays limit device choice and resource sharing.

NAS storage

Image source: Red Hat official website

Network Attached Storage (NAS) exposes data through the NFS or CIFS protocol, transfers data at the file level, and implements network storage over TCP/IP. It scales well, is inexpensive, and is easy for users to manage. NFS, for example, is widely used in cluster computing.

Advantages

Low cost: any server with network file storage software installed can export storage for other servers to mount and access.

File-level data sharing.

Disadvantages

Low read/write speed.

Object storage

Block storage reads and writes quickly but is poorly suited to data sharing; file storage makes sharing easy but reads and writes slowly. Object storage was created to provide storage that is both fast and shareable. Block storage and file storage are the two mainstream storage types that most people are familiar with, whereas object storage (Object-based Storage) is a newer network storage architecture.

Three core concepts

Object

An object is the smallest unit in object storage. A photo, for example, is one object, made up of metadata (Metadata, such as Length and lastModify), user data (Data), user-defined metadata (photographer, shooting device, and so on), and a file name (Key).
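
The parts of an object listed above can be pictured as a small record. The following is only an illustrative sketch: the class name StoredObject and its field names are hypothetical and are not a Ceph or S3 data structure.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    """Hypothetical in-memory model of one object (not a Ceph structure)."""
    key: str                                            # file name (Key)
    data: bytes                                         # user data (Data)
    user_metadata: dict = field(default_factory=dict)   # user-defined information
    last_modified: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def length(self) -> int:
        # System metadata such as Length is derived from the data itself.
        return len(self.data)


photo = StoredObject(
    key="holiday/beach.jpg",
    data=b"\xff\xd8\xff\xe0",   # placeholder bytes, not a real image
    user_metadata={"photographer": "alice", "device": "example-camera"},
)
print(photo.key, photo.length, photo.user_metadata["photographer"])
```

Keeping the user-defined fields in a separate dictionary mirrors the distinction the text draws between system metadata and user-defined information.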

Storage bucket

A bucket is a container for storing objects.

User

A user is the consumer of the object store and the owner of buckets. Each user has an AccessKeyId and a SecretAccessKey, which are used with a symmetric (shared-secret) signature scheme to verify the identity of the sender of a request.

What object storage is suitable for

Object storage is suited to large amounts of unstructured data. It stores data as objects rather than as traditional files or data blocks, and each object holds data, metadata, and a unique identifier. Typical content includes:

Pictures

Video

Audio

Documents

Code (JS/HTML)

Disadvantages

Application code must be changed to use it, and an object cannot be modified in place: it has to be written in full in a single operation.
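
To make the whole-object write rule concrete, here is a minimal sketch in which an "edit" is emulated as read, modify locally, and write the complete object back. The Bucket class and the append_line helper are hypothetical names used only for illustration; they do not correspond to any real client library.

```python
class Bucket:
    """Hypothetical in-memory bucket: a container mapping keys to whole objects."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects = {}          # key -> complete object bytes

    def put(self, key: str, data: bytes) -> None:
        # A PUT always stores the complete object, replacing any previous version.
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]


def append_line(bucket: Bucket, key: str, line: bytes) -> None:
    """Emulate an edit as a read-modify-write of the whole object."""
    current = bucket.get(key)
    bucket.put(key, current + line)


logs = Bucket("logs")
logs.put("app.log", b"started\n")
append_line(logs, "app.log", b"stopped\n")
print(logs.get("app.log"))
```

This is why applications written against a file system usually need code changes: there is no seek-and-write, only get and put of complete objects.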

Advantages

Virtually unlimited scalability.

Building object storage based on Ceph in practice

What is Ceph

Ceph is a new-generation, free-software distributed file system originally designed by Sage Weil (co-founder of DreamHost) at the University of California, Santa Cruz. It is Software-Defined Storage (SDS) and a unified storage solution that provides three kinds of storage: block storage, file storage, and object storage. The architecture of Ceph is as follows:

Image source: Ceph official website

Ceph components

Ceph Monitor (Mon)

The Mon maintains the health of the entire cluster by keeping a map of the cluster state, with separate map information for each component. All cluster nodes report their status to the Mon nodes.

RADOS

RADOS (Reliable Autonomic Distributed Object Store) is the foundation of the storage cluster. All data in Ceph is stored as objects, and RADOS is responsible for storing them regardless of data type.

Ceph Object Storage Device (OSD)

The OSD is the object storage daemon of the Ceph distributed object storage system. It stores objects on the local file system and makes them accessible over the network.

RADOS Gateway (RGW)

RGW provides a RESTful API compatible with Amazon S3 and the OpenStack Object Storage API (Swift). It supports multi-tenancy and OpenStack Keystone authentication.

MDS (Ceph Metadata Server)

The MDS tracks the file hierarchy and stores metadata for CephFS.

Librados

The librados library gives programming languages such as PHP, Ruby, Java, Python, C, and C++ convenient access to the RADOS interface.
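
As an illustration of librados access from Python, the sketch below writes one object and reads it back through the `rados` Python bindings. It assumes a reachable Ceph cluster, an existing pool, and a readable configuration file; the function name write_and_read, the pool name in the usage note, and the default config path are placeholders chosen for this example.

```python
def write_and_read(pool: str, key: str, data: bytes,
                   conffile: str = "/etc/ceph/ceph.conf") -> bytes:
    """Write one object with librados and read it back.

    Assumes a reachable cluster and an existing pool; the pool name and
    config path passed in are placeholders for this sketch.
    """
    import rados  # imported lazily so this file loads without the bindings

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            ioctx.write_full(key, data)          # replace the whole object
            return ioctx.read(key, len(data))    # read the same bytes back
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

On a host with cluster access, a call such as write_and_read("demo-pool", "greeting", b"hello") would be expected to return the bytes that were written, provided the hypothetical pool demo-pool exists.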

RBD (RADOS Block Device)

The Ceph block device, formerly known as the RADOS block device, provides clients with reliable, distributed, high-performance block storage. It stripes block data sequentially across multiple OSDs and supports enterprise features such as thin provisioning, dynamic resizing, full and incremental snapshots, and copy-on-write cloning. The RBD service is built as a native interface on top of librados.
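
The idea of cutting a block device into sequential pieces can be sketched as simple arithmetic. The example below assumes the simplest layout, in which the image is split into fixed-size objects one after another, and an object size of 4 MiB; both the layout and the size are assumptions made for illustration, and the function names are hypothetical.

```python
from typing import List, Tuple

OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size in bytes (4 MiB)


def locate(offset: int, object_size: int = OBJECT_SIZE) -> Tuple[int, int]:
    """Map a byte offset in the image to (object index, offset inside object)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, object_size)


def objects_for_range(offset: int, length: int,
                      object_size: int = OBJECT_SIZE) -> List[int]:
    """Indices of all objects touched by an I/O of `length` bytes at `offset`."""
    if length <= 0:
        return []
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return list(range(first, last + 1))


MIB = 1024 * 1024
print(locate(10 * MIB))                    # which object holds byte 10 MiB
print(objects_for_range(3 * MIB, 2 * MIB)) # a 2 MiB write that crosses a boundary
```

Because each object can live on a different OSD, an I/O that spans several objects can be served by several OSDs in parallel.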

CephFS (Ceph Filesystem)

The Ceph file system is a POSIX-compatible file system that uses a Ceph storage cluster to store user data. Like RBD and RGW, it is built as a native interface on top of librados.

Characteristics of Ceph

High performance

Ceph abandons the traditional scheme of looking up data locations in centralized metadata and instead uses the CRUSH algorithm to compute them, which gives balanced data distribution and high parallelism.

High availability

Strong data consistency and self-healing in many failure scenarios.

High scalability

Decentralized design and flexible expansion.

Rich features

Three storage interfaces are supported: block storage, object storage, and file storage.

Drivers and custom interfaces are supported for multiple languages (Python, C++, Java, PHP, Ruby, and others).
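
The high-performance point above rests on computing an object's location instead of looking it up in a central table. The sketch below illustrates that idea with rendezvous (highest-random-weight) hashing. This is explicitly not the CRUSH algorithm, which also models device weights and failure domains; it is a much simpler stand-in, and the OSD names and the place function are hypothetical.

```python
import hashlib
from typing import List


def place(object_name: str, osds: List[str], replicas: int = 3) -> List[str]:
    """Pick `replicas` OSDs for an object by highest-random-weight hashing.

    A simplified stand-in for CRUSH: every client that knows the OSD list
    computes the same answer without asking a central metadata server.
    """
    def score(osd: str) -> int:
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


osds = [f"osd.{i}" for i in range(6)]      # hypothetical OSD names
print(place("photos/beach.jpg", osds))
```

With this scheme, removing an OSD that does not hold a given object leaves that object's placement unchanged, which hints at why computed placement limits data movement when the cluster changes.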

Practice of object storage based on Ceph

The client sends requests over HTTP. Layer 7 load balancing forwards each request to the object storage gateway (RADOS Gateway), and the gateway communicates with the cluster through sockets. This completes the data transfer path.

User authentication

Before sending a request, the application computes a digital signature from the user's secret key (secret_key), the request content, and other fields, using the algorithm agreed with the RGW gateway. It then puts the signature and the user's access key (access_key) into the request and sends it to the RGW gateway.

After receiving the request, the RGW gateway uses the access key as an index to read the user's information from the RADOS cluster and obtains the user's secret key from it.

The gateway then computes a digital signature from the user's secret key, the request content, and the other fields, using the same agreed algorithm.

Finally, the gateway checks whether the signature it generated matches the signature in the request. If they match, the request is considered genuine and the user is authenticated. If they do not match, S3 error 403 (SignatureDoesNotMatch) is returned.
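
The sign-and-verify exchange described above can be sketched with an HMAC. This is a simplified illustration rather than the actual S3 signing format: the string that is signed here (method, path, and date joined by newlines), the use of SHA-256, the USERS table, and the example keys are all assumptions made for this sketch.

```python
import base64
import hashlib
import hmac

# Hypothetical credential table standing in for user info kept in RADOS.
USERS = {"AKIDEXAMPLE": b"example-secret-key"}


def sign(secret_key: bytes, method: str, path: str, date: str) -> str:
    """Client side: HMAC over a simplified string-to-sign."""
    string_to_sign = "\n".join([method, path, date]).encode()
    mac = hmac.new(secret_key, string_to_sign, hashlib.sha256)
    return base64.b64encode(mac.digest()).decode()


def verify(access_key: str, signature: str,
           method: str, path: str, date: str):
    """Gateway side: look up the secret key, recompute, and compare."""
    secret_key = USERS.get(access_key)
    if secret_key is None:
        return 403, "InvalidAccessKeyId"
    expected = sign(secret_key, method, path, date)
    if not hmac.compare_digest(expected, signature):
        return 403, "SignatureDoesNotMatch"
    return 200, "OK"


date = "Thu, 30 Jan 2025 08:00:00 GMT"
sig = sign(USERS["AKIDEXAMPLE"], "GET", "/photos/beach.jpg", date)
print(verify("AKIDEXAMPLE", sig, "GET", "/photos/beach.jpg", date))
print(verify("AKIDEXAMPLE", sig, "PUT", "/photos/beach.jpg", date))
```

The secret key never travels with the request; only the access key and the signature do, so a request altered in transit no longer matches the signature the gateway recomputes.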

I/O path analysis of object storage

The application sends a request to the object storage gateway over HTTP. After receiving the request, the gateway parses the S3 or Swift data from the HTTP semantics and carries out a series of checks. Once the checks pass, it executes the data-processing logic for the requested API operation, uses the librados interface to GET data from or PUT data to the RADOS cluster, and thereby completes the whole I/O process.
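
The parse, check, and dispatch sequence above can be sketched as a small request handler. This is a toy model, not RGW's implementation: FakeRados is a dictionary standing in for the RADOS cluster, the path-style /bucket/key layout and the internal object naming are assumptions, and the handle function is a hypothetical name.

```python
class FakeRados:
    """Dict-backed stand-in for the RADOS cluster reached through librados."""

    def __init__(self):
        self._objects = {}

    def write_full(self, name, data):
        self._objects[name] = data

    def read(self, name):
        return self._objects.get(name)


def handle(store, method, path, body=b""):
    """Parse a path-style request, check it, and run the matching operation."""
    parts = path.lstrip("/").split("/", 1)
    if len(parts) != 2 or not all(parts):
        return 400, b"expected /<bucket>/<key>"
    bucket, key = parts
    name = f"{bucket}/{key}"          # hypothetical internal object name
    if method == "PUT":
        store.write_full(name, body)
        return 200, b""
    if method == "GET":
        data = store.read(name)
        return (200, data) if data is not None else (404, b"NoSuchKey")
    return 405, b"MethodNotAllowed"


store = FakeRados()
print(handle(store, "PUT", "/photos/beach.jpg", b"jpeg-bytes"))
print(handle(store, "GET", "/photos/beach.jpg"))
print(handle(store, "GET", "/photos/missing.jpg"))
```

In a real deployment the authentication check would run before the dispatch step, and the store would be the RADOS cluster rather than a dictionary.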
