2025-01-20 Update. From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to design the architecture of a web object storage service. It walks through the four basic modules of such a service (gateway, data storage, metadata storage, and asynchronous task queue) and the role each one plays.
The basic framework of object storage service architecture design
1. Gateway Service (Gateway):
The client sends its request (Request) to the gateway service (Gateway), which converts the request into the corresponding data (Data), metadata (Metadata), and message queue (MQ) operations. Generally speaking, the gateway service is responsible for the following functions:
Protocol conversion: converts the front-end client protocol (HTTP/RPC) into the protocols used by the back-end modules (TCP/RPC/MQ).
Request distribution: dispatches front-end requests to different back-end modules according to request type (data operations, metadata operations, asynchronous queue operations).
Collaboration and scheduling: some front-end requests involve several back-end modules at once, so the gateway service also has to coordinate and schedule those requests across the modules.
Load balancing: balances client requests across back-end instances to improve the concurrent throughput of the whole system.
Cache: caches hot data to raise the hit rate of client requests and reduce the access pressure on the underlying modules. When an underlying module is unavailable, the cache can still serve part of the data, providing a degraded service that improves overall availability to some extent.
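The cache behaviour described above can be sketched as follows. This is a minimal illustration, not the article's implementation: the names GatewayCache and BackendUnavailable and the fetch callback are hypothetical, and a real cache would also need eviction and expiry.

```python
from typing import Callable, Dict, Optional


class BackendUnavailable(Exception):
    """Raised by the (hypothetical) backend call when the underlying module is down."""


class GatewayCache:
    """Toy hot-data cache: hits never touch the backend, so they survive an outage."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                 # hypothetical call into the storage module
        self._store: Dict[str, bytes] = {}  # no eviction or TTL in this sketch
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[bytes]:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        try:
            value = self._fetch(key)
        except BackendUnavailable:
            # Degraded service: keys that were never cached cannot be served.
            return None
        self._store[key] = value
        return value
```

Objects that were read before the outage keep being served from the cache; everything else returns None until the backend recovers.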
Given these functions, if the whole object storage system is compared to a super truck, the gateway service is its "steering wheel, gearbox, and dashboard": the parts closest to the driver, which largely determine whether the vehicle drives smoothly.
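Request distribution and coordination in the gateway can be sketched roughly as below. The routing rules in classify (for example, that a PUT touches both data and metadata, or that a DELETE updates metadata and enqueues garbage collection) are assumptions made for illustration, as are all class and function names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class RequestType(Enum):
    DATA = "data"
    METADATA = "metadata"
    ASYNC = "async"


@dataclass
class Request:
    method: str      # HTTP verb sent by the client
    path: str        # e.g. "/bucket/key"
    query: str = ""  # raw query string, e.g. "metadata"


def classify(req: Request) -> List[RequestType]:
    """Map one client request to the back-end modules it involves (assumed rules)."""
    if req.method == "PUT":
        return [RequestType.DATA, RequestType.METADATA]
    if req.method == "DELETE":
        return [RequestType.METADATA, RequestType.ASYNC]
    if req.method == "HEAD" or req.query == "metadata":
        return [RequestType.METADATA]
    if req.method == "POST":
        return [RequestType.ASYNC]
    return [RequestType.DATA]


class Gateway:
    """Dispatches each request to every back-end module it needs, in order."""

    def __init__(self, backends: Dict[RequestType, Callable[[Request], str]]) -> None:
        self._backends = backends

    def handle(self, req: Request) -> List[str]:
        return [self._backends[kind](req) for kind in classify(req)]
```

A request that maps to more than one module is the "collaboration and scheduling" case: the gateway calls the modules one after another and collects their results.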
2. Data Storage Service (Data store):
The data storage service has to deliver the core properties of distributed storage (horizontal scalability, high performance, and high availability) all at once, and it provides the underlying foundation for the whole object storage system; in a word, it must be "rock solid". The module can expose several types of data storage I/O interfaces, such as file storage, object storage, and block storage, and the upper layers store object content by calling these standardized interfaces. If the object storage system is compared to a car, the data storage service is its "body, suspension, and tires".
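A standardized storage interface of this kind might look like the sketch below. The BlobStore protocol, the in-memory backend, and the choice of a SHA-256 content hash as the blob ID are all assumptions for illustration; the article does not specify the interface.

```python
import hashlib
from typing import Dict, Protocol


class BlobStore(Protocol):
    """Hypothetical minimal interface the upper layers program against."""

    def put(self, blob_id: str, data: bytes) -> None: ...
    def get(self, blob_id: str) -> bytes: ...
    def delete(self, blob_id: str) -> None: ...


class InMemoryBlobStore:
    """Stand-in backend; a file, block, or distributed store would fit the same shape."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, blob_id: str, data: bytes) -> None:
        self._blobs[blob_id] = data

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]

    def delete(self, blob_id: str) -> None:
        self._blobs.pop(blob_id, None)


def store_object(store: BlobStore, data: bytes) -> str:
    """Upper-layer helper: stores content and returns its ID (content hash, assumed)."""
    blob_id = hashlib.sha256(data).hexdigest()
    store.put(blob_id, data)
    return blob_id
```

Because store_object depends only on the interface, the backend can be swapped without touching the upper layer.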
3. Metadata Storage (KV store):
A complete object consists of its data content and its metadata, so in addition to the data storage service described above, the metadata also has to be stored somewhere. Note that the data content is generally unstructured or semi-structured, while the metadata is generally structured: the file's MIME type, MD5 value, modification time (mtime), owner, and so on. This information is usually stored as key-value pairs associated with a specific object, and it often needs to be traversed, queried, and updated quickly. To keep the modules decoupled, it is worth separating metadata storage from data storage and keeping it in a dedicated KV storage engine. Once the amount of object data becomes large, an independent KV store (metadata store) goes a long way toward keeping metadata access from becoming the performance bottleneck of the whole system. It is no exaggeration to say that metadata storage is as important as the "transmission" of the whole object storage system.
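As a rough sketch of the idea, metadata can be kept as one key-value record per object, keyed by bucket and object name. The key layout "bucket/name", the JSON encoding, and the MetadataStore class are assumptions; a plain dict stands in for the KV engine.

```python
import hashlib
import json
from typing import Dict, Iterator, Tuple


class MetadataStore:
    """Toy metadata store: one JSON record per object in a dict-backed KV."""

    def __init__(self) -> None:
        self._kv: Dict[str, str] = {}

    @staticmethod
    def _key(bucket: str, name: str) -> str:
        return f"{bucket}/{name}"

    def put(self, bucket: str, name: str, data: bytes,
            mime: str, owner: str, mtime: float) -> None:
        record = {
            "mime": mime,
            "md5": hashlib.md5(data).hexdigest(),
            "mtime": mtime,
            "owner": owner,
        }
        self._kv[self._key(bucket, name)] = json.dumps(record)

    def get(self, bucket: str, name: str) -> dict:
        return json.loads(self._kv[self._key(bucket, name)])

    def scan(self, bucket: str, prefix: str = "") -> Iterator[Tuple[str, dict]]:
        """Traverse objects in key order; a real KV engine would do a range scan."""
        start = self._key(bucket, prefix)
        for key in sorted(self._kv):
            if key.startswith(start):
                yield key, json.loads(self._kv[key])
```

Listing a "directory" then becomes a prefix scan over the metadata store and never touches the data storage service.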
4. Asynchronous task queue (Async queue):
Why an object storage system needs an independent asynchronous task queue is a question that puzzles many "novice drivers". The answer again starts from decoupling; consider the following scenarios.
1). User data requires periodic operations, such as filtering and purging data according to lifecycle rules, or migrating data from a hot storage pool to a cold storage pool.
2). A user deletes an object, and the underlying layer has to trigger garbage collection (GC) according to certain rules to free the occupied disk space.
3). Users need to synchronize data between storage clusters in different physical regions. Given network latency, disk latency, and other factors, these operations cannot be synchronized in real time.
4). Users need to process uploaded data, for example transcoding an uploaded video, compressing an image, or encrypting a file. These operations consume a lot of computing resources and cannot return results in real time.
With these scenarios in mind, it becomes clear that requiring every client operation to return its result immediately through a synchronous mechanism is unrealistic, at least at the hardware level. The practical choice is to design an independent asynchronous task queue and hand the time-consuming operations over to it. Introducing an asynchronous queue does solve the pain point of operations that cannot complete in real time, but it also introduces new problems:
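The basic pattern, accept the task, return at once, and do the work later, can be sketched with the standard library as below. AsyncTaskQueue and its single in-process worker thread are illustrative assumptions; a production system would normally use a separate, persistent queue service.

```python
import queue
import threading
import uuid
from typing import Callable, Dict


class AsyncTaskQueue:
    """Toy in-process task queue: submit() returns a task ID without waiting."""

    def __init__(self) -> None:
        self._tasks: queue.Queue = queue.Queue()
        self._status: Dict[str, str] = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, job: Callable[[], object]) -> str:
        task_id = uuid.uuid4().hex
        with self._lock:
            self._status[task_id] = "pending"
        self._tasks.put((task_id, job))
        return task_id  # the client can poll status() with this ID later

    def status(self, task_id: str) -> str:
        with self._lock:
            return self._status[task_id]

    def wait_idle(self) -> None:
        """Block until every submitted task has been processed."""
        self._tasks.join()

    def _run(self) -> None:
        while True:
            task_id, job = self._tasks.get()
            try:
                job()
                result = "done"
            except Exception:
                result = "failed"
            with self._lock:
                self._status[task_id] = result
            self._tasks.task_done()
```

A transcoding or compression request would call submit() and hand the task ID back to the client, which checks status() later instead of holding the connection open.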
1). Data consistency: when users operate on data and metadata frequently, how do you guarantee the atomicity of these asynchronous operations so that they match users' expectations of consistency as closely as possible?
2). Robustness of the queue itself: how do you ensure that every task submitted to the queue is executed promptly and effectively, and, when the queue itself fails, how do you recover the affected tasks quickly and correctly?
3). Smooth horizontal scaling: how do you handle the tasks already in the queue while scaling the whole queue system out smoothly?
4). Task ordering and priority: object storage systems generally provide eventual consistency, so how do you ensure that all tasks run strictly in time order or according to some other rule, and how do you decide the priority of different operations on the same object at the same time?
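Two of these problems, ordering/priority and retrying failed tasks, can be illustrated with a small sketch. It assumes that lower numbers mean higher priority, that ties are broken by submission order, and that task actions are idempotent so they can safely be retried; OrderedTaskQueue and its dead_letter list are hypothetical names, and nothing here addresses persistence or scaling.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(order=True)
class Task:
    priority: int                 # lower value runs first (assumed convention)
    seq: int                      # submission order, breaks priority ties
    name: str = field(compare=False)
    action: Callable[[], None] = field(compare=False)
    attempts: int = field(default=0, compare=False)


class OrderedTaskQueue:
    """Runs tasks by (priority, submission order) with bounded retries."""

    def __init__(self, max_attempts: int = 3) -> None:
        self._heap: List[Task] = []
        self._seq = itertools.count()
        self._max_attempts = max_attempts
        self.dead_letter: List[str] = []  # tasks that exhausted their retries

    def submit(self, name: str, action: Callable[[], None], priority: int = 10) -> None:
        heapq.heappush(self._heap, Task(priority, next(self._seq), name, action))

    def run_all(self) -> List[str]:
        completed: List[str] = []
        while self._heap:
            task = heapq.heappop(self._heap)
            try:
                task.action()  # must be idempotent: it may run more than once
            except Exception:
                task.attempts += 1
                if task.attempts < self._max_attempts:
                    # Same (priority, seq), so the retry keeps its place in line.
                    heapq.heappush(self._heap, task)
                else:
                    self.dead_letter.append(task.name)
                continue
            completed.append(task.name)
        return completed
```

Because a retried task keeps its original sequence number, a later operation on the same object cannot overtake it, which is one simple way to keep per-object ordering.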
These are only a few of the problems that an asynchronous queue brings with it. Readers will each have their own view of them (a thousand readers, a thousand Hamlets), and space does not allow going further here. The point I mainly want to make is that the asynchronous task queue is a "double-edged sword". With deep enough skills, you can build many features beyond what you imagined and raise the functionality of the whole object storage service by several levels; fall into one of its deep pits, however, and the whole project may be doomed. My personal advice is therefore to stay wary of asynchronous queues and involve this module as little as possible, which can be summed up as "simple is best". Frankly, the asynchronous task queue is also a key indicator of how mature an object storage product is, much like advanced features such as "parking sensors and cruise control". If choosing an object storage product is compared to buying a car, this is what separates an "ordinary car" from a "high-end car".
Thank you for reading. That concludes this look at how to design the architecture of a web object storage service. You should now have a clearer picture of the design, though how it applies in any specific case still has to be verified in practice.
© 2024 shulou.com SLNews company. All rights reserved.