What is the design and basic architecture of Kubernetes storage?

2025-07-19 Update. From: SLTechnology News&Howtos (Shulou.com)

This article explains the design and basic architecture of Kubernetes storage: the basics of container platform storage, the common storage systems, persistent storage design, and how Kubernetes provisions and mounts volumes.

1. Basic knowledge of container platform storage

Storage is an essential part of a container platform. It keeps container data safe, so it should be a top priority in the overall design.

Data stored inside a container is temporary: when the container disappears, so does its data, which is why persistent storage is needed. On the Kubernetes platform, multiple containers can run in the same Pod, and these containers often need to share data. Kubernetes abstracts the Volume object to solve both problems. Docker also has a concept of volumes, but manages them only loosely: a Docker volume is a directory on disk or in another container. Volume lifecycle management and volume drivers were added later, although the functionality is still very limited.

A Kubernetes volume has an explicit lifecycle, the same as the Pod that encloses it. As a result, the volume outlives any container running in the Pod, and the data is retained when a container restarts. When the Pod no longer exists, the volume ceases to exist as well. Kubernetes supports many types of volumes, and a Pod can use any number of them at the same time.

At its core, a volume is a directory that contains some data and can be accessed by the containers in a Pod. The volume type determines how this directory is created, what medium backs it, and what it contains. To use a volume, the Pod declaration must provide the volume and its type (the .spec.volumes field) and the location where the volume is mounted (the .spec.containers[*].volumeMounts field). Processes in a container see a file system view made up of their container image and the volumes. The image is at the root of the file system hierarchy, and each volume is mounted at the specified path within the image. Volumes cannot be mounted onto other volumes or have hard links to other volumes. Each container in the Pod must independently specify the mount location for each volume it uses.
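As a minimal sketch of the two fields described above, the following Python script builds a Pod manifest in which two containers share one volume and prints it as JSON, which kubectl accepts just like YAML. The Pod, container, and volume names, the busybox image, and the emptyDir volume type are illustrative assumptions, not taken from the original article.

```python
import json

# All names below (shared-data-demo, writer, reader, shared-data) are hypothetical.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "shared-data-demo"},
    "spec": {
        # .spec.volumes declares each volume and its type (emptyDir is assumed here).
        "volumes": [{"name": "shared-data", "emptyDir": {}}],
        "containers": [
            {
                "name": "writer",
                "image": "busybox",
                "command": ["sh", "-c", "date > /out/started; sleep 3600"],
                # Each container chooses its own mount path for the volume.
                "volumeMounts": [{"name": "shared-data", "mountPath": "/out"}],
            },
            {
                "name": "reader",
                "image": "busybox",
                "command": ["sh", "-c", "sleep 3600"],
                "volumeMounts": [
                    {"name": "shared-data", "mountPath": "/in", "readOnly": True}
                ],
            },
        ],
    },
}


def undeclared_mounts(pod_manifest):
    """Return (container, volume) pairs whose volumeMount names no declared volume."""
    declared = {v["name"] for v in pod_manifest["spec"]["volumes"]}
    return [
        (c["name"], m["name"])
        for c in pod_manifest["spec"]["containers"]
        for m in c.get("volumeMounts", [])
        if m["name"] not in declared
    ]


print(json.dumps(pod, indent=2))
```

The helper undeclared_mounts is only a local sanity check: every volumeMounts entry has to refer to a name listed under .spec.volumes, otherwise the API server rejects the Pod.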

Types of container storage

A container architecture uses three types of storage:

(1) Image storage

Image storage can use existing shared storage, similar to the way virtual machine images are distributed and protected in a virtualized environment. A container image is much smaller than a full virtual machine image because the operating system code is not copied into it. In addition, the content of a container image is fixed when it is built, so it can be stored and shared more efficiently. For the same reason, a container image cannot store the data that an application produces at run time.

(2) Configuration data storage

This data is used to manage containers and consists mainly of management data such as configuration and logs. It can be kept on existing storage.

(3) Application data storage

Application data is important, is temporary by default, and is the most difficult to store. Containers are designed to be shorter-lived than virtual machines, and once a container is destroyed, all of its temporary storage disappears. Therefore, the data that an application really needs to keep should be written to persistent volumes. Because microservice-based container applications are mostly distributed systems, containers may start, stop, scale, or migrate dynamically across multiple nodes. When a container application has persistent data, that data must therefore be accessible from different nodes. In addition, a container is an application-oriented runtime environment, and its data is usually saved to a file system, so a file-based storage interface is the most suitable for application access.

Persistent data volumes need to be planned in advance, and the storage must also support container migration. Different storage should be provided for different business needs. etcd stores the status and configuration information of the platform, so its requirements for performance, security, and stability are relatively high.

Storage application scenarios in the platform

The use of storage in Kubernetes focuses on the following aspects:

Reading basic configuration files, managing passwords and keys, etc.

Storing service state, application data, etc.

Sharing data between different services or applications

The services deployed and run in Kubernetes fall roughly into three categories:

(1) Stateless service

Kubernetes uses a ReplicaSet to guarantee the number of instances of a service. If a Pod instance dies or crashes for some reason, the ReplicaSet immediately creates a replacement from the Pod template. Because the service is stateless, the new Pod is interchangeable with the old one. In addition, Kubernetes provides a stable access point through a Service (one Service can front multiple Pods) to achieve high availability.

(2) General stateful service

Compared with a stateless service, it additionally needs to preserve state. Kubernetes provides a storage system based on Volume and Persistent Volume, which can save the service state.

(3) Stateful cluster service

Compared with an ordinary stateful service, it additionally needs cluster management. Running a stateful cluster service means solving two problems: state preservation and cluster management. Kubernetes developed StatefulSet for this purpose, to make it easier to deploy and manage stateful cluster services.
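To illustrate how a StatefulSet ties state preservation to each cluster member, here is a hedged sketch that builds a StatefulSet manifest with a volumeClaimTemplates entry and computes the names of the PVCs that Kubernetes creates for it (template name, StatefulSet name, and ordinal joined by hyphens). The names db and data, the image example/db:1.0, and the 10Gi size are hypothetical.

```python
import json

# Hypothetical StatefulSet: names, image, and size are illustrative only.
statefulset = {
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {"name": "db"},
    "spec": {
        "serviceName": "db",  # assumed headless Service that gives Pods stable DNS names
        "replicas": 3,
        "selector": {"matchLabels": {"app": "db"}},
        "template": {
            "metadata": {"labels": {"app": "db"}},
            "spec": {
                "containers": [
                    {
                        "name": "db",
                        "image": "example/db:1.0",
                        "volumeMounts": [{"name": "data", "mountPath": "/var/lib/db"}],
                    }
                ]
            },
        },
        # One PVC is created per replica from this template.
        "volumeClaimTemplates": [
            {
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "10Gi"}},
                },
            }
        ],
    },
}


def pvc_names(sts):
    """List the per-replica PVC names: <template>-<statefulset>-<ordinal>."""
    name = sts["metadata"]["name"]
    return [
        f'{tpl["metadata"]["name"]}-{name}-{ordinal}'
        for tpl in sts["spec"]["volumeClaimTemplates"]
        for ordinal in range(sts["spec"]["replicas"])
    ]


print(json.dumps(statefulset, indent=2))
print(pvc_names(statefulset))
```

Because each replica keeps its own claim, a rescheduled Pod such as db-1 is reattached to the same data, which covers the state preservation half of the problem described above.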

The importance of Container Cloud Storage Design

Data is an important asset of an enterprise. Making good use of data helps the enterprise prosper, and preserving data well is what backs that prosperity. Therefore, thorough technical preparation and project research must be done in the planning stage, before the platform is built, and the importance of data must be given due attention.

To run the business well, an enterprise must ensure that data is not lost and that it remains complete and consistent. Containers differ in focus from virtual machines and physical machines: containers concentrate on stateless applications, so to support stateful applications, data storage must be considered and planned in advance based on business requirements.

A container cloud is a foundation platform that involves platform components, images, applications, middleware, and other parts, each of which may have different storage requirements. To achieve the desired performance and results, every part has to be considered, and storage, as an infrastructure resource, is an essential one.

Containers host applications, and the data at every level of an application has potential value. Capturing, processing, storing, and analyzing this data is how that value is obtained. Therefore, the persistence of application data is one of the important basic capabilities with which a container cloud platform supports business applications. Only a well-built foundation can serve applications well.

2. Several common storage systems in Kubernetes

According to the official Kubernetes documentation, the volume plugins supported by Kubernetes are shown in the following table:

Kubernetes already provides a wealth of Volume and Persistent Volume plugins, which can be chosen according to the characteristics of each business to provide storage services to containers. For the usage and precautions of each plugin, see: https://kubernetes.io/zh/docs/concepts/storage/

The Container Storage Interface (CSI) is a cross-industry standard initiative designed to lower the barrier to cloud native storage development and to ensure a level of compatibility. In Kubernetes, CSI makes installing a new volume plugin as simple as deploying a Pod, and allows third-party storage vendors to develop their own solutions without touching the core Kubernetes code base.

3. Persistent storage design

As mentioned earlier, data is the foundation on which enterprises conduct business and create value, and it is a core asset. Important data must be persisted and backed up in accordance with regulatory and business requirements. Container persistent storage can generally be implemented in two forms. The first is local storage: it is easy to use, but difficult to migrate, share, and scale. The second is a shared storage cluster: it allows data sharing, offers a variety of storage interfaces, and scales flexibly, but its architecture is somewhat more complex. Persistent storage requires different selection strategies for different scenarios:

4. Design and basic architecture of Kubernetes storage

4.1 Conceptual introduction of Persistent Volume and Persistent Volume Claim

By default, a running container writes to the writable layer of its layered (copy-on-write) file system. This poses a serious challenge for developers when migrating applications from development to production: when the container hangs, crashes, or ends, any data associated with it is lost. To prevent this data loss, the data store has to be persisted, which is the purpose of the Persistent Volume.

Kubernetes uses two resources to manage storage:

Persistent Volume (PV for short): a description of a piece of storage added by an administrator. It is a cluster-wide resource that records the type, size, access mode, and so on. Its lifecycle is independent of any Pod; for example, destroying the Pod that currently uses a PV has no effect on the PV.

Persistent Volume Claim (PVC for short): a namespaced resource that describes a request for a PV, including storage size, access mode, and so on.

Volumes in Kubernetes extend the Docker volume concept, which mounts a file directory on the host into the container. Generally speaking, a Pod in Kubernetes accesses storage in one of three ways:
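The relationship between the two resources can be sketched as a pair of manifests plus a deliberately simplified matching check. The NFS server address, paths, names, and sizes are hypothetical, and the can_bind function is an assumption for illustration: the real binding logic in Kubernetes considers more fields (for example volume mode and label selectors).

```python
import json

# Hypothetical PV backed by an NFS export; the server and path are placeholders.
pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "pv-demo"},  # cluster-scoped: no namespace
    "spec": {
        "capacity": {"storage": "10Gi"},
        "accessModes": ["ReadWriteOnce", "ReadOnlyMany"],
        "persistentVolumeReclaimPolicy": "Retain",
        "nfs": {"server": "nfs.example.com", "path": "/exports/data"},
    },
}

# Hypothetical PVC requesting part of that capacity from within a namespace.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "pvc-demo", "namespace": "default"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "5Gi"}},
    },
}


def gib(quantity):
    """Parse a quantity such as '10Gi' into GiB (this sketch handles only the Gi suffix)."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported quantity in this sketch: {quantity}")
    return int(quantity[:-2])


def can_bind(volume, claim):
    """Simplified check: enough capacity, requested access modes offered, same class."""
    enough = gib(volume["spec"]["capacity"]["storage"]) >= gib(
        claim["spec"]["resources"]["requests"]["storage"]
    )
    modes_ok = set(claim["spec"]["accessModes"]) <= set(volume["spec"]["accessModes"])
    same_class = volume["spec"].get("storageClassName", "") == claim["spec"].get(
        "storageClassName", ""
    )
    return enough and modes_ok and same_class


print(json.dumps([pv, pvc], indent=2))
print("bindable:", can_bind(pv, pvc))
```

The split mirrors the division of roles in the text: the administrator describes what storage exists in the PV, while the application only states what it needs in the PVC.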

(1) Direct access

This method has poor portability and scalability, and it fully exposes the details of the Volume to users, which is a serious security risk. It also requires coordinating the access of different users to the Volume.

(2) Static provisioning

(3) Dynamic provisioning

Dynamic provisioning is clearly more flexible than static provisioning. It decouples the compute layer from the storage layer of Kubernetes and, more importantly, gives storage vendors a pluggable development model: a vendor only needs to develop a volume plugin based on this model to provide storage services to Kubernetes. There are three ways to do this:

In-tree Volume Plugin

Out-of-tree Provisioner

Out-of-tree CSI Driver

In the first approach, storage plugins are implemented inside the Kubernetes code base to support some mainstream network storage.

In the second, when the official plugins cannot meet the requirements, the storage provider customizes or optimizes a storage plugin and integrates it into Kubernetes as needed.

The third is the Container Storage Interface (CSI), the open storage interface of Kubernetes. Any implementation of this interface can be integrated into Kubernetes. The community has announced that it will no longer develop the in-tree and out-of-tree provisioner approaches and will migrate all existing features to CSI, so the CSI approach is the recommended one for storage vendors and users.
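Dynamic provisioning through a CSI driver can be sketched as a StorageClass that names the driver, plus a PVC that refers to the class. The provisioner name csi.example.com and the parameters entry are placeholders: the real values depend entirely on the driver a storage vendor ships.

```python
import json

# Hypothetical StorageClass; "csi.example.com" stands in for a real CSI driver name.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "fast-csi"},
    "provisioner": "csi.example.com",
    "parameters": {"type": "fast"},  # driver-specific, assumed for illustration
    "reclaimPolicy": "Delete",
    "volumeBindingMode": "WaitForFirstConsumer",
}

# A PVC that asks for storage from that class; no PV has to exist beforehand.
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data", "namespace": "default"},
    "spec": {
        "storageClassName": storage_class["metadata"]["name"],
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "20Gi"}},
    },
}

print(json.dumps([storage_class, claim], indent=2))
```

With static provisioning, an administrator would create a matching PV by hand; here the class tells Kubernetes which plugin to call to create one on demand.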

In general, the lifecycle of a PV and PVC is divided into five phases:

(1) Provisioning: the PV is created, either directly (statically) or dynamically through a StorageClass

(2) Binding: the PV is assigned to a PVC

(3) Using: the Pod uses the Volume through the PVC

(4) Releasing: the Pod releases the Volume and the PVC is deleted

(5) Reclaiming: the PV is reclaimed; it can be kept for the next use or deleted directly from storage

Across these five phases, a PV can be in one of four states:

Available: free and not yet bound to a PVC

Bound: has been assigned to a PVC

Released: the PVC has been deleted, but the reclaim policy has not yet been carried out

Failed: an error occurred during reclamation
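The four states and the moves between them can be modeled as a small state machine. This is a simplified sketch: the event names (bind, claim_deleted, reclaim_failed) are invented for illustration and are not Kubernetes API terms, and deletion of the PV object after a successful reclaim is left out.

```python
# Allowed transitions between PV states, keyed by (current state, event).
# Event names are hypothetical labels for this sketch.
TRANSITIONS = {
    ("Available", "bind"): "Bound",               # Binding: PV assigned to a PVC
    ("Bound", "claim_deleted"): "Released",       # Releasing: PVC deleted
    ("Released", "reclaim_failed"): "Failed",     # Reclaiming went wrong
}


def next_phase(phase, event):
    """Return the state that follows `phase` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(phase, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {phase!r}") from None


def replay(events, start="Available"):
    """Apply a sequence of events to a freshly provisioned PV and return the final state."""
    phase = start
    for event in events:
        phase = next_phase(phase, event)
    return phase


print(replay(["bind"]))
print(replay(["bind", "claim_deleted"]))
```

A PV that is kept for reuse stays in the Released state until its reclaim policy is carried out, which is why that state has no automatic successor in this sketch.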

4.2 Basic Kubernetes storage architecture

Kubernetes storage follows a declarative architecture. To be compatible with as many storage platforms as possible, Kubernetes connects to different storage systems through in-tree plugins, so that users can choose the plugin that suits their business needs to provide storage to containers. Custom FlexVolume and CSI plugins are also supported. Compared with Docker volumes, this offers richer and more diverse storage functions.

The basic process of mounting a PV in Kubernetes is as follows:

(1) The user creates a Pod that references a PVC through the API

(2) The Scheduler assigns this Pod to a Node

(3) The kubelet on that Node starts waiting for the Volume Manager to prepare the device

(4) The PV controller calls the corresponding Volume Plugin (in-tree or out-of-tree), creates a PV, and binds it to the corresponding PVC

(5) The Attach/Detach controller or the Volume Manager attaches the device to the Node through the Volume Plugin

(6) Once the device is attached, the Volume Manager mounts the volume to the specified directory on the Node

(7) When the kubelet on the Node is told that the Volume is ready, it starts the Pod and mounts the PV into the corresponding container through volume mapping
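The ordering constraints in steps (4) to (7) can be made explicit with a toy model in which each step refuses to run until the previous one has completed. This is only an illustration of the sequence described above: the VolumeFlow class and its methods are hypothetical and do not correspond to any Kubernetes component API.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeFlow:
    """Toy model of the provision -> attach -> mount -> start sequence."""

    log: list = field(default_factory=list)
    bound: bool = False
    attached: bool = False
    mounted: bool = False
    pod_started: bool = False

    def provision_and_bind(self):
        # Step (4): the PV controller creates a PV and binds it to the PVC.
        self.bound = True
        self.log.append("PV created and bound to PVC")

    def attach(self):
        # Step (5): the device is attached to the Node.
        if not self.bound:
            raise RuntimeError("cannot attach: no bound PV")
        self.attached = True
        self.log.append("device attached to node")

    def mount(self):
        # Step (6): the volume is mounted to a directory on the Node.
        if not self.attached:
            raise RuntimeError("cannot mount: device not attached")
        self.mounted = True
        self.log.append("volume mounted on node directory")

    def start_pod(self):
        # Step (7): the kubelet starts the Pod once the volume is ready.
        if not self.mounted:
            raise RuntimeError("cannot start Pod: volume not ready")
        self.pod_started = True
        self.log.append("Pod started with volume mapped into container")


def run_flow():
    """Run the four steps in order and return the resulting flow object."""
    flow = VolumeFlow()
    flow.provision_and_bind()
    flow.attach()
    flow.mount()
    flow.start_pod()
    return flow


for line in run_flow().log:
    print(line)
```

Calling start_pod on a fresh VolumeFlow raises RuntimeError, which mirrors the kubelet waiting in step (3) until the Volume Manager reports that the volume is ready.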

Kubernetes storage architecture design diagram

