What is the difference between cloud native storage and cloud storage?

Updated 2025-04-27. From: SLTechnology News&Howtos (Shulou.com), 06/02 report.

Author | Senior technical expert, Aliyun Intelligent Business Group

Guide: As new enterprise and intelligent workloads are containerized and migrated to the cloud, they run into storage problems of performance, elasticity, high availability, encryption, isolation, observability and lifecycle management. Solving these requires improvements not only at the storage product level but also in the cloud native control and data planes, which drives the evolution of both cloud native storage and cloud storage. This article introduces the problem scenarios, explores possible solutions, and finally works out what cloud native storage and cloud storage can do today and what remains to be done.

Introduction

Recently, I had the honor of taking part in the Meetup on cloud native persistent applications jointly organized by Infra Meetup and Kubernetes & Cloud Native Meetup. Drawing on my recent thinking about cloud storage, open source storage and cloud native storage, I reflected further on what cloud native storage is and what challenges lie ahead, and shared several preliminary views.

Cloud native applications demand mobility, scalability and dynamism. Cloud native storage in turn faces requirements of density, speed and hybrid deployment, which place new demands on the basic capabilities of cloud storage in efficiency, elasticity, autonomy, stability, low coupling with applications, GuestOS optimization and security.

Cloud native status quo

Containers and cloud native computing are quickly accepted by enterprises

Forrester predicts that by 2022, the proportion of global organizations and companies running containerized applications in production will rise from less than 30% today to more than 75%, and that the trend toward containerizing enterprise applications is unstoppable.

Meanwhile, according to IDC's forecast of growth in the enterprise storage market, demand for cloud storage in 2020 will be more than three times that of 2015. Within the enterprise storage market, the share of storage consumed by enterprises' core data-management data will rise from 15% to 23%, and structured data and DBMS data will further strengthen their position.

In cloud native environments, core enterprise applications and intelligent applications increasingly use cloud native storage to deploy production-ready stateful applications, and the trend is accelerating. Overseas storage giants EMC and NetApp have embraced cloud native storage and are actively building cloud native storage orchestration solutions such as REX-Ray flexrex and Trident.

Kubernetes has gradually become the infrastructure of the cloud native era.

In the past year (2018-2019), Kubernetes has gradually become the infrastructure of the cloud native era. More and more stateful core enterprise applications, such as Internet services, databases and message queues, have migrated to the Kubernetes cloud native platform. This places different requirements on the latency, throughput and stability of the various kinds of cloud block storage, for example:

Stable millisecond-level latency on par with NVMe SSDs, to meet the requirements of high-performance KV stores and databases;
as the deployment density of applications on a single node increases, the density of block storage per node becomes a challenge;
sharing local block storage also places higher requirements on the elasticity and isolation of block storage.

In the cloud native environment, satisfying these different business scenarios declaratively has become a challenge for cloud native storage in implementing both the control plane and the data plane.
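As a concrete illustration of the declarative style, a storage request in Kubernetes is typically written as a PersistentVolumeClaim. The minimal sketch below builds such a manifest as a Python dictionary. The claim name, the storage class name `example-cloud-disk` and the helper `build_pvc` are hypothetical; the manifest fields themselves are the standard Kubernetes ones.

```python
import json


def build_pvc(name: str, storage_class: str, size: str, shared: bool) -> dict:
    """Build a PersistentVolumeClaim manifest (helper name is hypothetical)."""
    # ReadWriteMany lets many nodes mount the volume; ReadWriteOnce is exclusive.
    access_mode = "ReadWriteMany" if shared else "ReadWriteOnce"
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,  # assumed class name, defined elsewhere
            "resources": {"requests": {"storage": size}},
        },
    }


# The application states what it needs; the control plane decides how to provide it.
pvc = build_pvc("mysql-data", "example-cloud-disk", "100Gi", shared=False)
print(json.dumps(pvc, indent=2))
```

The application only declares capacity and access mode here; which cloud disk, file system or local volume backs the claim is left to the storage class and its provisioner.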

In AI scenarios for intelligent applications, high-performance computing and stream computing are also being deployed on the Kubernetes cloud native platform, using cloud storage for training, computation and inference. This challenges how cloud storage is selected and used in a Kubernetes environment. For example, there is evidence that the Spark ecosystem is gradually migrating from Hadoop YARN to Kubernetes's native scheduler and extended schedulers, e.g. Gang Scheduler.

In the cloud computing environment, because of cost and the separation of storage and compute, HDFS still exists as a storage protocol, but the storage itself will gradually migrate from three-replica HDFS to object storage (OSS, S3). Running multi-GPU MPI computing and Flink stream computing on Kubernetes has gradually become mainstream, and storage access is usually through object storage.

However, when object storage is used, the computational efficiency of big data / AI applications still faces severe challenges:

Reduce the network IO generated by repeatedly pulling the same block on the same node;
reduce the write IO generated by data shuffle;
make computation data-aware, so that computing happens close to the data.

Neither the current Kubernetes scheduler nor current cloud storage features offer a good solution, which gives cloud native storage a stage on which to accelerate big data computing and make up for insufficient IO throughput.

Offline big data computing, such as genomics computing, already runs large-scale computing tasks on the Kubernetes cloud native platform. Its rigid demand for 10GBps-30GBps of peak file storage throughput requires standalone high-throughput file storage to evolve in both form and delivery method in the cloud native environment.
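The first goal in the list above, avoiding repeated pulls of the same block on one node, can be sketched with a node-local cache. This is only an illustration: `NodeBlockCache`, `fake_object_store` and the LRU eviction policy are assumptions made for the example, not part of any Kubernetes or OSS/S3 API.

```python
from collections import OrderedDict
from typing import Callable


class NodeBlockCache:
    """Hypothetical per-node LRU cache for blocks read from object storage."""

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity: int) -> None:
        self._fetch_remote = fetch_remote  # assumed callable that does the network read
        self._capacity = capacity          # maximum number of cached blocks
        self._blocks: "OrderedDict[str, bytes]" = OrderedDict()
        self.remote_reads = 0              # number of network fetches, for observability

    def read(self, block_id: str) -> bytes:
        if block_id in self._blocks:
            self._blocks.move_to_end(block_id)   # mark as most recently used
            return self._blocks[block_id]
        data = self._fetch_remote(block_id)      # cache miss: go to object storage
        self.remote_reads += 1
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)     # evict the least recently used block
        return data


def fake_object_store(block_id: str) -> bytes:
    """Stand-in for an object storage GET; returns deterministic bytes."""
    return block_id.encode("utf-8") * 4


cache = NodeBlockCache(fake_object_store, capacity=2)
for block in ["b1", "b2", "b1", "b1", "b3", "b2"]:
    cache.read(block)
print(cache.remote_reads)  # 4 network reads for 6 block reads
```

Two of the six reads are served locally. A real accelerator would also have to deal with consistency, cache sizing and sharing the cache between Pods on the node, which this sketch ignores.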

Container service becomes the infrastructure of the cloud native era.

As more and more enterprises containerize their applications, container services have seen significant business growth across cloud vendors. Container services have gradually become a new infrastructure of the cloud native era and the best entry point for using cloud resources. Cloud native storage also takes on a new meaning relative to cloud computing and cloud storage, so it is necessary to rethink the essential differences and relationship between cloud storage and cloud native storage.

Thoughts on Cloud Native Storage and Cloud Storage

Cloud Native Storage vs Cloud Storage:

Opposition or unity?
The connection between the two?
Differences and priorities?

1. Cloud native storage = cloud storage UI: application-oriented, declarative, application-layer storage + efficiency

Six elements of cloud native storage declaration:

Capacity (size);
performance (IOPS, throughput, latency);
accessibility (shared / exclusive);
IO observability;
QoS;
multi-tenant isolation.

2. Layered storage: reuse the infrastructure dividend and do not reinvent the wheel; some storage forms move up the stack according to new workload types.

3. Achieve efficiency and autonomy in the control plane to maximize storage stability and security.

Cloud native storage on the market

To better understand how to build cloud native storage in a cloud environment, let us look at several mainstream cloud native storage systems deployed in enterprise Kubernetes environments and compare them with the forms of cloud storage:

Ceph on Kubernetes with Rook
Portworx
OpenEBS

Ceph on Kubernetes with Rook

Ceph was developed in 2003 by Sage Weil at the University of California, Santa Cruz, as part of his doctoral work. Ceph LTS releases are mature, stable and highly available, have a strong ecosystem, and integrate tightly with Kubernetes in the cloud native era. Ceph is built on RADOS (Reliable Autonomic Distributed Object Store), a highly available store released in 2003, before the cloud native era. It has been widely deployed and supports a broad range of access protocols: block storage (RBD), POSIX file storage (CephFS) and object storage.

RedHat and SUSE are currently the main commercial supporters of Ceph. In several container platform deployments, RBD and CephFS serve as the platform's main storage to make up for the lack of basic cloud storage.

Rook is currently a production-grade orchestration tool for deploying and operating Ceph on Kubernetes.

The basic architecture of Ceph consists of the data plane, OSDs (RADOS), and the control plane, MON/RBD/RADOSGW/CEPHFS. The CRUSH algorithm is the core algorithm that handles data redundancy and high availability. Upper-layer applications read and write data directly against the data plane OSDs through librados. Ceph supports snapshots, backup, monitoring and observability, and can be exposed directly through Kubernetes via Rook. RedHat and SUSE also provide standalone cluster installation.
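The key idea behind CRUSH is that clients compute where an object's replicas live instead of consulting a central lookup table. The sketch below is not CRUSH: it uses rendezvous (highest-random-weight) hashing as a much simpler stand-in to illustrate computed, deterministic replica placement. The OSD names and the `place` helper are hypothetical, and real CRUSH additionally handles device weights and failure-domain hierarchies.

```python
import hashlib
from typing import List


def _score(object_id: str, osd: str) -> int:
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{object_id}:{osd}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_id: str, osds: List[str], replicas: int) -> List[str]:
    """Pick the `replicas` highest-scoring OSDs for the object."""
    if replicas > len(osds):
        raise ValueError("not enough OSDs for the requested replica count")
    ranked = sorted(osds, key=lambda osd: _score(object_id, osd), reverse=True)
    return ranked[:replicas]


osds = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]
placement = place("volume-42/chunk-7", osds, replicas=3)
print(placement)

# If one of the chosen OSDs fails, the surviving replicas stay where they are
# and only one new OSD is added to the set.
survivors = [osd for osd in osds if osd != placement[0]]
print(place("volume-42/chunk-7", survivors, replicas=3))
```

Any client that knows the list of OSDs arrives at the same placement without coordination, which is the property that lets upper-layer applications talk to the data plane directly.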

Some basic architectural features and capabilities of Ceph:

Control plane: MON/RBD/RADOSGW/CEPHFS;
data plane: OSDs (RADOS);
snapshot and backup; IO monitoring and other storage performance monitoring; server-side rate limiting through RBD QoS.

Portworx

Portworx is deployed as a container service. Each node is called a PX; it connects downward to the block storage of various public clouds or to bare metal servers, and provides block or file services upward.

It is not tied to any hardware configuration or vendor and can connect to any public cloud or self-built server cluster (which only needs to support the iSCSI or FC protocol). Currently, Portworx's main capabilities are cloud disaster recovery (DR), multi-cloud replication, full snapshots (ROW), multi-cloud management, synchronous replication (RTO, seconds), asynchronous replication (RPO10000 Volume), and support for topology-aware computing.

OpenEBS

OpenEBS is an open source, EBS-like storage system built on Kubernetes. It provides software-defined PVs by pooling and managing all kinds of media, including local disks and cloud storage, and uses iSCSI as the storage protocol. This is one reason it can access all kinds of storage flexibly without being bound to a particular vendor; in a sense, it is also more flexible and lightweight. However, it depends heavily on the container network and adds an OpenEBS abstraction layer that every write must pass through, and each volume (PV) has its own controller, which adds overhead. Although it is more flexible, it is at a significant performance disadvantage compared with Portworx and Ceph.

Some basic functional / performance characteristics of OpenEBS:

Control plane: extends the container orchestration system to support hyperconvergence. Compared with block devices, volumes are more numerous and can be sized arbitrarily, which is more flexible;
high availability: each volume can have multiple replicas, and data is synchronized between different storage pools;
snapshot, backup and storage performance monitoring features;
good integration with cloud native tools: you can use cloud native tools (such as Prometheus, Grafana, Fluentd, Weavescope, Jaeger) to configure, monitor and manage storage resources.

Understanding cloud storage: Pangu vs RADOS

Having looked at the three open source / enterprise storage systems above, and to make the cloud storage architecture easier to understand, we compare Pangu's layered architecture with Ceph's layered storage.

Pangu's CS (Chunk Server) can be compared to the Ceph OSD service process, and Pangu's Master process to the Ceph MDS process.

Among cloud products, block storage corresponds to Ceph RBD, file storage to CephFS, and object storage to RADOSGW. Local block storage and the high-performance file storage product CPFS have no counterpart.

With the evolution of the Pangu architecture, including the full rollout of Pangu 2.0, the user-mode TCP network protocol stack, a fully RDMA storage network and comprehensive RPC performance optimization, the storage products built on top have also enjoyed the huge dividend of this underlying storage revolution and entered an era of sub-millisecond latency and millions of IOPS. Cloud native storage must be able to inherit these capabilities on top of the product storage layer.

Differences between cloud native storage in public cloud and private (proprietary) cloud

Analyzing the cloud native storage on the market shows that these systems have one thing in common: they support a declarative API through which performance, capacity, functionality and so on can be measured and declared, and they support quality, stability and security to varying degrees.
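To show what "measuring and declaring" can look like, the sketch below models the six declared elements listed earlier as a small data structure and checks a declaration against what a backend offers. It is a hypothetical model: `StorageDeclaration`, `StorageOffering` and `unmet_elements` are not Kubernetes or vendor types, and every number in the example offerings is made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StorageDeclaration:
    """What an application declares (hypothetical model, not a Kubernetes type)."""
    capacity_gib: int             # capacity
    min_iops: int                 # performance: IOPS
    min_throughput_mbps: int      # performance: throughput
    max_latency_ms: float         # performance: latency
    shared: bool                  # accessibility: shared (True) or exclusive (False)
    needs_io_metrics: bool        # IO observability
    needs_qos: bool               # QoS
    needs_tenant_isolation: bool  # multi-tenant isolation


@dataclass(frozen=True)
class StorageOffering:
    """What a storage backend offers (all figures below are made up)."""
    name: str
    max_capacity_gib: int
    iops: int
    throughput_mbps: int
    latency_ms: float
    supports_sharing: bool
    exposes_io_metrics: bool
    supports_qos: bool
    isolates_tenants: bool


def unmet_elements(decl: StorageDeclaration, offer: StorageOffering) -> List[str]:
    """Return the declared elements that the offering cannot satisfy."""
    unmet = []
    if decl.capacity_gib > offer.max_capacity_gib:
        unmet.append("capacity")
    if (decl.min_iops > offer.iops
            or decl.min_throughput_mbps > offer.throughput_mbps
            or decl.max_latency_ms < offer.latency_ms):
        unmet.append("performance")
    if decl.shared and not offer.supports_sharing:
        unmet.append("accessibility")
    if decl.needs_io_metrics and not offer.exposes_io_metrics:
        unmet.append("observability")
    if decl.needs_qos and not offer.supports_qos:
        unmet.append("qos")
    if decl.needs_tenant_isolation and not offer.isolates_tenants:
        unmet.append("isolation")
    return unmet


block = StorageOffering("example-block", 32768, 100000, 350, 1.0, False, True, True, True)
shared_fs = StorageOffering("example-file", 10485760, 30000, 1000, 5.0, True, True, False, True)

database = StorageDeclaration(500, 20000, 200, 2.0, False, True, True, True)
print(unmet_elements(database, block))      # []
print(unmet_elements(database, shared_fs))  # ['performance', 'qos']
```

Because the requirement is stated as data rather than as a provisioning script, a control plane could run a check of this kind before binding a workload to a backend.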

Furthermore, cloud native workloads can use the capacity, performance and accessibility of product storage directly and without loss through the data plane. The control plane continues to improve user-facing IO observability, application-level QoS and multi-tenant isolation, implements declarative storage interfaces such as CSI/Flexvolume through the control plane interface, and provides Operators for part of the storage lifecycle. Container orchestration glues business applications and storage together into actual workload declarations, which may be a more correct way to use cloud storage.

Because the public cloud already offers a complete range of infrastructure storage products, lighter-weight data planes (virtio, nfs-utils, cpfs-sdk, oss-sdk) can be used to access product storage.

Proprietary cloud environments vary greatly. Some are virtualized and some are not, and SAN and raw disks are the main forms of storage. You need to build Ceph RADOS or Pangu to implement SDS, and then access the storage through a data plane (librados/px/pv-controller).

Private clouds built on vSphere, OpenStack or Feitian deliver storage in a way similar to the public cloud, but differences in the deployed modules lead to differences in what the control and data planes support.

To put it simply:

Public cloud: Cloud Native Storage = Declarative API + Cloud Storage
Private cloud: Cloud Native Storage = Declarative API + Native Storage

Cloud native storage is layered: reuse the infrastructure dividend, do not reinvent the wheel.

Cloud native storage improves the consistency of the data plane (kernel/OS/net/client/sdk optimization parameters and version control); builds a unified control plane of CSI/Flexvolume/Operator that provides a user-facing declarative API; and achieves topology awareness at the scheduling level, including cloud disk zone awareness and local node awareness.
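The two kinds of topology awareness mentioned above can be sketched as a node filter: a cloud disk can only be attached to nodes in its own availability zone, and a local volume is tied to a single node. The `Node` and `Volume` types, the `feasible_nodes` helper and the zone names are all hypothetical; this is not the Kubernetes scheduler's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    zone: str
    local_free_gib: int   # free capacity on the node's local disks


@dataclass(frozen=True)
class Volume:
    name: str
    kind: str                   # "cloud-disk" or "local"
    zone: str = ""              # set for cloud disks
    node: Optional[str] = None  # set once a local volume has been provisioned
    size_gib: int = 0


def feasible_nodes(volume: Volume, nodes: List[Node]) -> List[str]:
    """Return the names of nodes on which a Pod using this volume could run."""
    if volume.kind == "cloud-disk":
        # Zone awareness: the disk and the node must be in the same zone.
        return [n.name for n in nodes if n.zone == volume.zone]
    if volume.kind == "local":
        if volume.node is not None:
            # Node awareness: an existing local volume pins the Pod to its node.
            return [n.name for n in nodes if n.name == volume.node]
        # Not provisioned yet: any node with enough free local capacity will do.
        return [n.name for n in nodes if n.local_free_gib >= volume.size_gib]
    raise ValueError(f"unknown volume kind: {volume.kind}")


nodes = [Node("node-a", "zone-1", 200), Node("node-b", "zone-2", 50), Node("node-c", "zone-1", 20)]
print(feasible_nodes(Volume("disk-1", "cloud-disk", zone="zone-1"), nodes))  # ['node-a', 'node-c']
print(feasible_nodes(Volume("scratch", "local", size_gib=100), nodes))       # ['node-a']
print(feasible_nodes(Volume("pinned", "local", node="node-b"), nodes))       # ['node-b']
```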

Block storage

In the control plane, kernel cgroup blkio is used together with Aliyun Linux 2 OS to achieve process-level control of buffered IO, which gives the application layer finer-grained QoS control over local disks and cloud disks. Splitting local disks with LVM increases per-node disk density. IO observability is achieved by measuring and collecting IO metrics for mount points and devices.

Cloud native storage: main characteristics of block storage:

Capacity: 32TB per disk
Latency: 0.2ms-10ms
IOPS: 5K-1M
Throughput: 300Mbps-4Gbps (local NVMe ESSD: 2GBps)
Accessibility: exclusive within a single availability zone
QoS: single-disk isolation, process isolation
Multi-tenancy: single-disk isolation
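One way to collect device-level IO metrics on Linux is to sample the per-device counters the kernel exposes in /proc/diskstats and turn the deltas into rates. This is a sketch under assumptions: the article does not say which data source its collector uses, and the helper names and the two sample lines below are made up for illustration.

```python
from typing import Dict, NamedTuple

SECTOR_BYTES = 512  # /proc/diskstats always counts sectors of 512 bytes


class DiskCounters(NamedTuple):
    reads: int            # reads completed
    sectors_read: int
    writes: int           # writes completed
    sectors_written: int


def parse_diskstats(text: str) -> Dict[str, DiskCounters]:
    """Parse /proc/diskstats-style text into per-device cumulative counters."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        # fields: major, minor, name, reads, reads merged, sectors read,
        #         ms reading, writes, writes merged, sectors written, ...
        counters[fields[2]] = DiskCounters(
            int(fields[3]), int(fields[5]), int(fields[7]), int(fields[9]))
    return counters


def io_rates(before: DiskCounters, after: DiskCounters, interval_s: float) -> Dict[str, float]:
    """Turn two cumulative samples into IOPS and bytes-per-second rates."""
    return {
        "read_iops": (after.reads - before.reads) / interval_s,
        "write_iops": (after.writes - before.writes) / interval_s,
        "read_bytes_per_s": (after.sectors_read - before.sectors_read) * SECTOR_BYTES / interval_s,
        "write_bytes_per_s": (after.sectors_written - before.sectors_written) * SECTOR_BYTES / interval_s,
    }


# Two made-up samples taken 10 seconds apart; on a real node, read /proc/diskstats.
sample_t0 = "   8       0 sda 1000 0 80000 500 2000 0 160000 900 0 700 1400"
sample_t1 = "   8       0 sda 1500 0 120000 700 2600 0 208000 1100 0 900 1800"

rates = io_rates(parse_diskstats(sample_t0)["sda"], parse_diskstats(sample_t1)["sda"], 10.0)
print(rates)
```

For these samples the result is 50 read IOPS, 60 write IOPS, 2,048,000 read bytes per second and 2,457,600 write bytes per second. Mapping a device back to the Pod or mount point that uses it is a separate step not shown here.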

For more information, please see Cloud disk performance.

File storage

In the control plane, Pod Security Policy and SecurityContext are used to enforce mandatory UID/GID control on applications and to implement ACL control of the file system. The control plane also manages the lifecycle of the file system. IO observability is achieved by measuring and collecting IO metrics for the mount point.

Cloud native storage: main characteristics of file storage:

Capacity: 10PB per file system
Latency: 100K-10ms
IOPS: 15K-50K
Throughput: 150Mbps-20GBps
Accessibility: shared across multiple clusters and multiple availability zones
QoS: IO contention
Multi-tenancy: PSP ACL (namespace)

CPFS parallel file system

The control plane implements ACL control of the file system, configurable client-side rate limiting for QoS, declarative lifecycle management of the file system through an Operator, and declarative deployment of the CPFS file system in the cloud native environment.

Cloud native storage: main characteristics of high-performance file storage:

Capacity: 100PB per file system
Latency: 0.5ms-10ms
IOPS: 50K-1M
Throughput: 10Gbps-1000GBps
Accessibility: shared across multiple clusters and multiple availability zones
QoS: client-side rate limiting supported
Multi-tenancy: PSP ACL (namespace)

Summary: cloud native storage v1-functionality

Today's cloud native storage supports the full range of Ali Cloud storage products in the control plane / control plane interface, and most system-level and client-level optimizations in the data plane are complete. However, as large numbers of persistent enterprise applications and intelligent applications are containerized, we still face more problems and challenges.

In the whole development process of cloud native storage v1, I would like to thank Ali Cloud Storage team for their cooperation and help in file storage, block storage and object storage to create cloud native storage.

Cloud native applications demand mobility, scalability and dynamism, and cloud native storage in turn faces requirements of density, speed and hybrid deployment. These place new demands on the basic capabilities of cloud storage in efficiency, elasticity, autonomy, stability, low coupling with applications, GuestOS optimization and security. The problems that new enterprise and intelligent workloads meet during containerization and cloud migration, namely storage performance, elasticity, high availability, encryption, isolation, observability and lifecycle, require improvements not only at the storage product level but also in the cloud native control and data planes. Driving this evolution of cloud native storage and cloud storage is the outlook and plan for cloud native storage v2. We will cover these new scenarios, requirements, solutions and development directions in subsequent articles.

"Alibaba Cloud's native Wechat official account (ID:Alicloudnative) focuses on micro-services, Serverless, containers, Service Mesh and other technology areas, focuses on cloud native popular technology trends, and large-scale cloud native landing practices, and is the technical official account that best understands cloud native developers."
