CNCF official ambassador Zhang Lei: is Kubernetes a "database"?

Author | Zhang Lei, senior technical expert at Alibaba Cloud, official CNCF ambassador, co-chair of the CNCF application delivery area, and senior maintainer of the Kubernetes project

Recently, a discussion claiming that "Kubernetes is the new database" has attracted a lot of attention in the Kubernetes community. More precisely, the claim is that the Kubernetes project itself works like a database, not that you should use Kubernetes as a database.

At first glance, the claim that "Kubernetes is a database" seems rather odd. After all, the working principles we usually associate with Kubernetes, such as the controller pattern and the declarative API, seem to have nothing to do with databases. But there is a fundamental idea behind the claim, and explaining it starts with one of the most basic theories in the Kubernetes project.

Theoretical basis of declarative application management in Kubernetes

When we talk about Kubernetes, we often mention a concept called "declarative application management". This design is what sets the Kubernetes project apart from all other infrastructure projects, and it is a capability unique to Kubernetes. So have you ever wondered what declarative application management actually looks like in Kubernetes?

1. Declarative application management is not just "declarative API"

If we review the core working principles of Kubernetes, it is not hard to see that most of its functions follow the "controller" pattern that we so often emphasize, whether it is kubelet running containers, kube-proxy applying iptables rules, kube-scheduler scheduling Pods, or a Deployment managing ReplicaSets. That is, the user expresses a desired state, the final state of the system (be it network or storage), through YAML files or other means, and the various components of Kubernetes then drive the state of the whole cluster toward that declared final state until the two agree completely. The process in which the actual state gradually approaches the desired state is called reconcile. The same principle is at the core of how Operators and custom Controllers work.

This way of working, in which declarative description files drive controllers to run reconcile and bring the two states together, is the most direct embodiment of declarative application management. Note that the process actually has two parts:

A declarative description of the desired state. This description must be strictly the final state that the user wants. If you put an intermediate state into the description, or try to adjust the desired state dynamically, you break the accurate execution of the declarative semantics.

A state-convergence process based on reconcile. The reconcile process is what guarantees, in theory, that the system state stays consistent with the final state. More precisely, reconcile continuously runs a "check -> diff -> execute" loop, so the system can always detect the difference between its own state and the final state and take the necessary action. A declarative description alone is not enough. The reason is easy to see: the system reaching the desired state when you first submit the description does not guarantee that it will still be in that state an hour later. Many people confuse "declarative application management" with "declarative API" precisely because they do not appreciate why reconcile is necessary.
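The "check -> diff -> execute" loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not Kubernetes code: the `State` class, the replica counter, and the function names are assumptions made for the example, and a real controller would act on cluster resources rather than on an in-memory integer.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Toy stand-in for a system state: just a replica count (hypothetical)."""
    replicas: int


def diff(desired: State, actual: State) -> int:
    """Diff step: positive means replicas are missing, negative means too many."""
    return desired.replicas - actual.replicas


def reconcile_once(desired: State, actual: State) -> bool:
    """Run one check -> diff -> execute pass; return True if an action was taken."""
    delta = diff(desired, actual)
    if delta == 0:
        return False  # already at the desired state, nothing to do
    # Execute step: move the actual state one step toward the desired state.
    actual.replicas += 1 if delta > 0 else -1
    return True


def reconcile(desired: State, actual: State, max_passes: int = 100) -> int:
    """Repeat passes until the states agree; return how many passes acted."""
    actions = 0
    for _ in range(max_passes):
        if not reconcile_once(desired, actual):
            break
        actions += 1
    return actions


desired = State(replicas=3)
actual = State(replicas=0)
reconcile(desired, actual)   # first submission: converge to the desired state
actual.replicas = 1          # simulate drift an hour later (two replicas lost)
reconcile(desired, actual)   # the loop notices the difference and repairs it
```

The second call is the point of the example: the declarative description did not change, yet action was still needed, which is why a description without a reconcile loop is not enough.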

You may be wondering what benefits such a declarative application management system brings to Kubernetes.

2. The essence of declarative application management: Infrastructure as Data

In fact, the theoretical basis behind declarative application management is an idea called Infrastructure as Data (IaD). According to this idea, managing infrastructure should not be coupled to any particular programming language or configuration method. It should instead be expressed as pure, structured, machine-readable data that fully represents the system state the user expects.

Note: Infrastructure as Data is sometimes called Configuration as Data, and the meaning behind it is the same.

The advantage is that any operation I want to perform on the infrastructure ultimately amounts to creating, reading, updating, or deleting a piece of data. More importantly, how I perform those operations on the data has nothing to do with the infrastructure itself. So interacting with the infrastructure is not tied to a programming language, a remote invocation protocol, or an SDK. As long as I can generate data in the right format, I can operate on the infrastructure in any way I like.

In Kubernetes, this benefit shows up concretely: to do anything on Kubernetes, I only need to submit a YAML file and then create, read, update, or delete that YAML file, rather than being required to use the RESTful API or an SDK of the Kubernetes project. The content of the YAML file is precisely the Data of Kubernetes as an IaD system.

That is why, from the very beginning, Kubernetes has defined all of its functions as "API objects", which are really just Data. Kubernetes users can then achieve their goals by creating, reading, updating, and deleting this Data, rather than being bound to a specific language or SDK.
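To make "an API object is just Data" concrete, here is a minimal sketch that builds a Deployment as a plain Python dictionary, changes its desired state by editing the data, and serializes it using only the standard library. The object name `web`, the image `nginx:1.25`, and the `scale` helper are illustrative assumptions. The output is JSON, which YAML parsers and the Kubernetes API also accept, so no Kubernetes SDK is involved.

```python
import copy
import json

# A Deployment expressed as plain data. Field names follow the apps/v1
# Deployment API; the name and image are illustrative values only.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{"name": "web", "image": "nginx:1.25"}]},
        },
    },
}


def scale(obj: dict, replicas: int) -> dict:
    """'Update' in CRUD terms: return a copy with a new desired replica count."""
    updated = copy.deepcopy(obj)
    updated["spec"]["replicas"] = replicas
    return updated


scaled = scale(deployment, 5)
manifest = json.dumps(scaled, indent=2)  # ready to save to a file and submit
```

Nothing in this snippet knows how a Deployment is implemented. It only manipulates data in the expected format, which is the decoupling that the IaD idea describes.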

More importantly, compared with proprietary, imperative APIs or SDKs, declarative YAML-based data hides the underlying implementation more easily, which makes it easier to connect and integrate existing infrastructure capabilities. This is the secret weapon that has let the Kubernetes ecosystem grow at such an astonishing speed: the declarative API and controller model that follow from the IaD idea make the whole community more willing to write plug-ins and integrate all kinds of capabilities for Kubernetes, and those plug-ins and capabilities are highly generic and portable. Other projects such as Mesos and OpenStack cannot match this.

It is fair to say that IaD is the core strength that lets Kubernetes pursue its goal of being "The Platform for Platform". By now you probably see it: the Data in this IaD design is the declarative Kubernetes API object, and the control loops in Kubernetes ensure that the system itself always stays consistent with the state described by that Data. Seen this way, Kubernetes is essentially a regulating system that expresses the system's set point as Data and keeps the system at that set point through the actions of controllers.

Wait a minute, does the theory of "keeping the system at a set value" sound familiar?

In fact, the foundation behind Kubernetes is a basic course that most readers with an engineering background have probably taken: "control theory".

Do you feel suddenly enlightened?

Once we understand this nature of Kubernetes, some of its harder-to-grasp design choices become easier to understand.

For example, the reason we write so many YAML files when using Kubernetes today is that we need some way to submit Data to Kubernetes, the control system. In this process, YAML is merely the carrier that lets humans write Data in a structured format. By analogy, YAML is like the character grid in the workbooks we practiced handwriting in as children, while the characters written inside the grid are the Data that Kubernetes really cares about and that drives the operation of the whole system.

Attentive readers will have wondered by now: since Kubernetes has to process this Data, shouldn't the Data have a fixed "format" so that Kubernetes can parse it? Yes. In Kubernetes this format is called the Schema of an API object. If you often write custom Controllers, you will know this Schema well: a CRD is a special API object whose specific purpose is to define a Schema.
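The role a Schema plays can be illustrated with a deliberately tiny validator. This is a sketch under stated assumptions, not how Kubernetes validates objects: the `SCHEMA` dictionary only loosely imitates the shape of the OpenAPI v3 schema that a CRD carries, it supports just three types, and `validate` is a hypothetical helper written for this example.

```python
# A hypothetical, heavily simplified schema for the "spec" of a custom object.
SCHEMA = {
    "type": "object",
    "required": ["replicas", "image"],
    "properties": {
        "replicas": {"type": "integer"},
        "image": {"type": "string"},
    },
}

# Mapping from schema type names to Python types (only what the sketch needs).
TYPES = {"object": dict, "integer": int, "string": str}


def validate(value, schema, path="spec"):
    """Return a list of human-readable errors; an empty list means valid."""
    kind = schema["type"]
    wrong_type = not isinstance(value, TYPES[kind])
    # bool is a subclass of int in Python, so reject it explicitly.
    if wrong_type or (kind == "integer" and isinstance(value, bool)):
        return [f"{path}: expected {kind}"]
    errors = []
    if kind == "object":
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required field missing")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors.extend(validate(value[name], sub_schema, f"{path}.{name}"))
    return errors


ok = validate({"replicas": 3, "image": "nginx"}, SCHEMA)
bad = validate({"replicas": "three"}, SCHEMA)
```

Here `ok` is an empty list, while `bad` reports both the missing `image` field and the wrongly typed `replicas` field. Data that does not match the Schema is rejected before any controller ever sees it.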

YAML engineer? No, you're a database engineer!

The IaD nature of Kubernetes described above means that it works more like a "database" than like a traditional distributed system. This difference is also a fundamental reason for the steep learning curve of Kubernetes.

From this point of view, the API objects exposed by Kubernetes are really tables with predefined Schemas, and the YAML files we rack our brains to write are really create, read, update, and delete (CRUD) operations on the data in those tables. YAML itself, like SQL, is a tool and carrier that helps you manipulate the data in the database. The one difference from a traditional database is that, once Kubernetes has the data, its goal is not to persist it but to use it to drive Controllers to perform operations that gradually bring the state of the whole system to the final state declared in the data. This takes us back to the "control theory" mentioned earlier.

Precisely because the whole Kubernetes system revolves around data as a first-class citizen, "writing and manipulating YAML files" has become almost the only daily work of Kubernetes engineers. But now that you understand the IaD idea introduced in this article, you can think of yourself as a "database engineer", a title that really is more fitting than "YAML engineer".

View layer of Kubernetes project

As mentioned earlier, if you re-examine the design of Kubernetes from a "database" perspective, it is not hard to see that many of its design choices have ingenious ideas behind them. For example:

Data model: the various Kubernetes API objects and the CRD mechanism

Data interception, validation, and modification: the Kubernetes Admission Hook

Data-driven execution: the Kubernetes Controller/Operator

Watching data changes and indexing: the Kubernetes Informer mechanism

...
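As one illustration of the mapping above, the "interception, validation, and modification" idea can be sketched as a small pipeline that first mutates an incoming object and then validates it, the same order in which Kubernetes runs mutating and validating admission. Everything here is a hypothetical in-process toy: real admission hooks are HTTP webhooks called by the API server, and the function names and the replica rule are assumptions made for the example.

```python
def default_replicas(obj: dict) -> dict:
    """Mutating step: fill in a default when the field is omitted."""
    obj.setdefault("spec", {}).setdefault("replicas", 1)
    return obj


def require_positive_replicas(obj: dict):
    """Validating step: return an error message, or None to allow the object."""
    if obj["spec"]["replicas"] < 1:
        return "spec.replicas must be at least 1"
    return None


def admit(obj: dict, mutators, validators) -> dict:
    """Intercept a write: run every mutator, then every validator."""
    for mutate in mutators:
        obj = mutate(obj)
    for check in validators:
        error = check(obj)
        if error is not None:
            raise ValueError(error)  # the write is rejected, nothing is stored
    return obj


admitted = admit({"kind": "Deployment"}, [default_replicas], [require_positive_replicas])
```

In database terms this resembles a trigger combined with a constraint: the data can be adjusted on the way in, and data that breaks the rules never reaches the table.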

On the other hand, as Kubernetes infrastructure grows more complex and third-party plug-ins and capabilities multiply, community maintainers have found that the built-in "tables" of the Kubernetes "database" are growing explosively in both number and complexity. So the Kubernetes community has long been discussing how to design a "view" for Kubernetes.

A "view layer" built on top of the built-in Kubernetes API resources would benefit Kubernetes users in much the same way that views benefit database users. For example:

1. Simplify and change data format and presentation

The Kubernetes view layer needs to expose more concise, more abstract application-level API objects to developers and operators, rather than the raw infrastructure-level API objects. How a view-layer object is defined should be entirely up to the user, without being constrained by the Schema of the underlying built-in Kubernetes objects.

2. Simplify complex data operations (simplify SQL)

The abstracted view-layer objects must not only be simpler in their user interface, they must also be able to define and manage very complex topologies of underlying Kubernetes resources, reducing the complexity and mental burden of managing Kubernetes applications.

3. Protect the underlying data table

Developers and operators work directly with view-layer objects, so the underlying native Kubernetes objects are protected. These native objects can then be changed and upgraded freely without users noticing.

4. Reuse data operations (reuse SQL)

Because view-layer objects are fully decoupled from the underlying infrastructure, an application or operational capability declared through the view layer can move between any Kubernetes clusters without worrying about whether those clusters support different capabilities.

5. The view is still a table and supports standard table operations

A view-layer object must still be a standard Kubernetes object, so that every operation and primitive Kubernetes provides for API objects also applies to view-layer objects. We must not add extra mental burden to the Kubernetes API model.

Although the idea of building a view layer for Kubernetes was never adopted upstream, it has become mainstream practice among most large players in the community. For example, Pinterest designed a CRD called PinterestService on top of Kubernetes to describe and define Pinterest's applications. This CRD is in effect a view-layer object. But for most enterprises this practice is still too crude. A "view" of data is not just a simple abstraction and translation of the data, and at least the following key questions must be answered before view layers can be used at scale in a real production environment:

How should the mapping between view-layer objects and underlying K8s objects be defined and managed? Note that this is by no means a simple one-to-one mapping: one view-layer object may correspond to multiple K8s objects.

How should "operational capabilities" be modeled and abstracted? A real application is not just a Deployment or an Operator. It is always an organic combination of the program to be run and the corresponding operational capabilities (for example, a containerized application and its horizontal scaling policy). How should these operational capabilities appear in the application definition? Is it feasible to define them all as annotations?

How should the binding between an operational capability and the program to be run be managed? How is this binding mapped to the actual execution relationship in the underlying K8s?

How can cloud resources, such as an Alibaba Cloud RDS instance, be defined in a standardized way through view-layer objects?

These questions are among the main reasons why upstream Kubernetes never adopted a "view layer", and they are also the main concerns of Kubernetes application-layer open source projects such as the Open Application Model (OAM). It should be pointed out that a "specification" such as OAM alone is not enough to solve all of the problems above. Building a Kubernetes view layer also requires support at the implementation level from a standard view-layer library before the advantages and convenience of a "data view" can really be enjoyed in Kubernetes. At present, the most capable Kubernetes view-layer library in the community is oam-kubernetes-runtime from the Alibaba team.
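The first question, that one view-layer object may map to several underlying K8s objects, can be sketched as a plain function from data to data. This is not how OAM or oam-kubernetes-runtime works: the `WebService` kind, the `example.com/v1alpha1` group, its `image`, `port`, and `autoscale` fields, and the `render` function are all hypothetical. Only the field names of the generated Deployment, Service, and HorizontalPodAutoscaler follow the real Kubernetes APIs.

```python
def render(view: dict) -> list[dict]:
    """Expand one hypothetical WebService view object into native objects."""
    name = view["metadata"]["name"]
    spec = view["spec"]
    labels = {"app": name}

    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": spec.get("replicas", 1),
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": spec["image"]}]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels, "ports": [{"port": spec["port"]}]},
    }
    objects = [deployment, service]

    # An operational capability (horizontal scaling) declared next to the
    # program it applies to, then bound to the Deployment by name.
    autoscale = spec.get("autoscale")
    if autoscale:
        objects.append({
            "apiVersion": "autoscaling/v2",
            "kind": "HorizontalPodAutoscaler",
            "metadata": {"name": name},
            "spec": {
                "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
                "minReplicas": autoscale["min"],
                "maxReplicas": autoscale["max"],
            },
        })
    return objects


view = {
    "apiVersion": "example.com/v1alpha1",  # hypothetical group/version
    "kind": "WebService",                  # hypothetical view-layer kind
    "metadata": {"name": "shop"},
    "spec": {"image": "example/shop:1.0", "port": 80, "autoscale": {"min": 2, "max": 10}},
}
kinds = [obj["kind"] for obj in render(view)]
```

A handful of user-facing fields expands into three native objects, much as a database view hides a join over several tables. The harder questions listed above, such as keeping this mapping manageable as capabilities multiply, are exactly what such a simple function leaves unanswered.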

Open Application Model (OAM) Project address: https://github.com/oam-dev/spec

oam-kubernetes-runtime project address: https://github.com/crossplane/oam-kubernetes-runtime

Summary

The "database"-like design of Kubernetes, with IaD at its core, is the key theoretical foundation behind the prosperity and growth of its community. However, the IaD idea is a double-edged sword. The other side of the booming community it has produced is a multitude of Controllers and Operators that each go their own way, and extremely complex Kubernetes clusters assembled from them. A Kubernetes cluster of such production-level complexity is still a very long way from a cloud native application management platform that developers and operators genuinely love.

Over the past five years, the great success of the Kubernetes project has really been a process in which infrastructure capabilities (such as networking, storage, and containers) were gradually standardized and unified under a declarative API. As Kubernetes application-layer technologies such as OAM become more widespread, a standardized application-layer ecosystem is starting to surface. More and more teams are trying to expose friendlier APIs to end users through more user-friendly data view layers, while giving infrastructure engineers more powerful, horizontally connected, modular platform capabilities.

At the same time, the other missing pieces of Kubernetes as a "database" are bound to keep emerging in the community. For example, the rapidly maturing Open Policy Agent (OPA) project can be seen as the continued evolution of the "data interception, validation, and modification" mechanism. Likewise, the control-path performance tuning that Alibaba has driven on its "ten-thousand-node" clusters is similar, in both theory and practice, to database performance optimization today. If you have any ideas about IaD systems, you are welcome to join the group and discuss them with us!
