This article gives an overview of the Kubernetes system: its core components, the Kubelet architecture, the Container Runtime Interface (CRI), CRI Tools, and where container runtimes are headed.
Introduction to Kubernetes
As we know, Kubernetes is an open-source container cluster management system. It has developed rapidly and has become the most active container orchestration project.
Architecturally, Kubernetes components fall into two parts, Master and Node. The Master is the brain of the whole cluster, responsible for orchestration, scheduling, API access, and so on.
Specifically, the Master includes the following components:
etcd stores the state of the entire cluster.
kube-apiserver provides the single entry point for resource operations and supplies mechanisms such as authentication, authorization, admission control, and API registration and discovery (a minimal client sketch follows this list).
kube-controller-manager maintains the state of the cluster; it contains the controllers for the many resource types and is the brain that makes the Kubernetes declarative API work.
kube-scheduler handles resource scheduling, placing Pods onto suitable Nodes according to the configured scheduling policy.
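To make kube-apiserver's role as the single entry point concrete, here is a minimal Go sketch using the official client-go library: it loads a kubeconfig (the path is an assumption; adjust it for your environment) and lists the Pods in the default namespace, a request that, like every resource operation, goes through kube-apiserver.

package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: a kubeconfig at this conventional path; adjust as needed.
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}
	// All resource operations go through kube-apiserver, the cluster's
	// single entry point; here we simply list Pods in "default".
	pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for _, pod := range pods.Items {
		fmt.Println(pod.Name)
	}
}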
The Node, on the other hand, runs the actual containers and provides them with the necessary storage and networking:
Kubelet maintains the lifecycle of containers and also manages their volumes (via CSI) and networking (via CNI).
The container runtime manages images and actually runs Pods and containers; Kubelet's default container runtime is Docker.
kube-proxy provides in-cluster service discovery and load balancing for Services.
The network plugin configures the container network based on CNI (Container Network Interface).
Beyond these core components, Kubernetes of course offers many more features, which are deployed as Addons. For example, kube-dns and metrics-server run as containers in the cluster and provide APIs for other components to call.
Tip: in the Kubernetes world you will usually hear about two different ways of extending Kubernetes: 1) Addons, such as dashboard, EFK, Prometheus, and various Operators; these extensions do not require Kubernetes to provide a standard interface, they simply add new functionality on top of Kubernetes. 2) Plugins, such as CNI, CRI, CSI, and Device Plugins; these are standard interfaces built into the core Kubernetes components, and external plug-ins implement them, extending Kubernetes to more scenarios.
Kubelet architecture
As mentioned earlier, Kubelet maintains the lifecycle of containers. It also works with kube-controller-manager to manage the containers' storage volumes and with CNI to manage their networks. Below is a simplified architecture diagram of Kubelet:
As the diagram shows, Kubelet itself consists of many components, including:
Kubelet Server exposes an API for kube-apiserver, metrics-server, and other services to call. For example, kubectl exec interacts with a container through the Kubelet API /exec/{token}.
Container Manager manages the container's resources, such as cgroups, QoS, cpusets, and devices.
Volume Manager manages the container's storage volumes, for example formatting a disk, mounting it onto the Node, and passing the mount path to the container.
Eviction evicts containers, for example expelling low-priority containers when resources run short so that high-priority containers can keep running.
cAdvisor provides the metrics data source for containers.
Metrics and stats provide metrics for containers and nodes. For example, the metrics that metrics-server pulls through /stats/summary are the basis for HPA autoscaling (a minimal query sketch follows).
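As a small illustration of that metrics path, the following Go sketch queries the Kubelet summary API directly. It assumes the node still exposes the Kubelet read-only port (10255); hardened clusters disable it, in which case the authenticated port 10250 with TLS and a bearer token must be used instead.

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Assumption: the Kubelet read-only port is enabled on this node.
	resp, err := http.Get("http://127.0.0.1:10255/stats/summary")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	// The response is a JSON summary of node and per-Pod usage, the same
	// data that metrics-server aggregates for HPA.
	fmt.Println(string(body))
}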
Further down in the diagram is the Generic Runtime Manager, the container runtime manager, which interacts with the runtime through CRI and manages containers and images.
Below CRI there are two kinds of container runtime implementations:
One is the built-in dockershim, which implements support for the Docker container engine and for CNI network plug-ins (including kubenet).
The other is external container runtimes, which support engines such as runc, containerd, and gVisor.
Kubelet talks to an external container runtime through the CRI interface. The architecture of a CRI container runtime, shown on the right of the diagram, usually consists of the following components:
CRI Server: the CRI gRPC server, listening on a unix socket. When discussing container runtimes, this server is often called the CRI shim (a layer sandwiched between the container engine and the Kubelet).
Streaming Server: provides the streaming API used by the streaming interfaces Exec, Attach, and PortForward.
Management of containers and images, such as pulling images and creating and starting containers.
Support for CNI network plug-ins to configure the container's network.
Management of the container engine, such as support for runc, containerd, or even multiple container engines.
Viewed this way, the container runtime in Kubernetes divides into three parts by function:
First, the Generic Runtime Manager, the runtime management module inside Kubelet, which manages containers and images through CRI.
Second, the Container Runtime Interface, the communication interface between the Kubelet and external container runtimes.
Third, the concrete runtime implementations, including Kubelet's built-in dockershim and external container runtimes (such as cri-o and cri-containerd).
Container Runtime Interface (CRI)
The Container Runtime Interface (CRI), as its name implies, is how Kubernetes abstracts the container runtime so that users can plug in the container engine of their choice.
CRI is based on gRPC; implementers do not need to care about the internal communication logic, only about implementing the defined interfaces, RuntimeService and ImageService: RuntimeService manages the lifecycle of Pods and containers, while ImageService manages the lifecycle of images (an abridged sketch of both services appears in the "CRI interface" section below).
Besides the gRPC API, CRI also includes libraries for implementing a streaming server (for Exec, Attach, PortForward, and so on) and CRI Tools; both are introduced in detail later.
A container runtime built on the CRI interface is often called a CRI shim. It is a gRPC server listening on a local unix socket, while the Kubelet calls the CRI interface as a gRPC client; a skeleton of such a server follows.
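Here is a minimal sketch of such a shim's skeleton in Go, assuming a hypothetical socket path; the CRI service registrations are left as comments because the generated protobuf bindings are beyond the scope of this sketch.

package main

import (
	"log"
	"net"
	"os"

	"google.golang.org/grpc"
)

func main() {
	const sock = "/run/example-shim.sock" // hypothetical socket path
	_ = os.Remove(sock)                   // clear a stale socket from a previous run
	lis, err := net.Listen("unix", sock)
	if err != nil {
		log.Fatal(err)
	}
	srv := grpc.NewServer()
	// A real shim would register its CRI implementations here with the
	// generated RegisterRuntimeServiceServer / RegisterImageServiceServer
	// functions from the CRI protobuf package.
	log.Printf("CRI shim listening on %s", sock)
	if err := srv.Serve(lis); err != nil {
		log.Fatal(err)
	}
}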
In addition, an external container runtime is responsible for managing the container's network. Using CNI is recommended, as it keeps the runtime consistent with the Kubernetes network model.
The launch of CRI brought new prosperity to the container community, and a series of container runtimes for different scenarios emerged, such as cri-o, frakti, and cri-containerd:
cri-containerd: a container runtime based on containerd
cri-o: a container runtime based on OCI
frakti: a container runtime based on virtualization
On top of these runtimes, new container engines can be connected easily. For example, engines such as Clear Containers and gVisor can plug into Kubernetes through cri-o or cri-containerd, extending Kubernetes to the strongly isolated, multi-tenant scenarios that previously only traditional IaaS could offer.
To use a CRI runtime, set the Kubelet's --container-runtime flag to remote and point --container-runtime-endpoint at the runtime's listening unix socket (tcp or npipe on Windows), for example unix:///run/containerd/containerd.sock for containerd.
CRI interface
So what does the CRI interface look like?
The CRI interface consists of two services, RuntimeService and ImageService, which can live in a single gRPC server or be split into two separate ones. Most community runtimes today implement both in one gRPC server.
ImageService, which manages images, provides five interfaces: listing images, pulling an image to the node, querying image status, deleting a local image, and querying the space images occupy. These map easily onto the docker API or CLI.
RuntimeService provides more interfaces, which fall into four groups by function (an abridged Go sketch of both services follows this list):
PodSandbox management: a PodSandbox is Kubernetes's abstraction for a Pod's environment. It provides containers with an isolated environment (for example, placing them in the same cgroup) and with shared namespaces such as the network namespace. A PodSandbox usually corresponds to a pause container or a virtual machine.
Container management: creating, starting, stopping, and deleting containers inside a specified PodSandbox.
Streaming interfaces: the three interfaces for exchanging data with a container, Exec, Attach, and PortForward. Rather than interacting with the container directly, they return the URL of the runtime's streaming server.
Status interfaces: querying the API version and the runtime's status.
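Below is the promised abridged Go sketch of the two services. The method set is trimmed and the placeholder types stand in for the real CRI protobuf messages, which wrap every call in request/response types; only the overall shape is faithful to the interface described above.

package cri

import "context"

// Placeholder types standing in for the real CRI protobuf messages.
type (
	PodSandboxConfig struct{}
	ContainerConfig  struct{}
	Image            struct{ Ref string }
)

// RuntimeService (abridged): PodSandbox and container lifecycle plus the
// streaming calls, which return a streaming-server URL instead of data.
type RuntimeService interface {
	RunPodSandbox(ctx context.Context, cfg *PodSandboxConfig) (sandboxID string, err error)
	StopPodSandbox(ctx context.Context, sandboxID string) error
	RemovePodSandbox(ctx context.Context, sandboxID string) error
	CreateContainer(ctx context.Context, sandboxID string, cfg *ContainerConfig) (containerID string, err error)
	StartContainer(ctx context.Context, containerID string) error
	StopContainer(ctx context.Context, containerID string, timeoutSeconds int64) error
	RemoveContainer(ctx context.Context, containerID string) error
	Exec(ctx context.Context, containerID string, cmd []string) (streamURL string, err error)
	Status(ctx context.Context) (ready bool, err error)
}

// ImageService (abridged): the five image operations described above.
type ImageService interface {
	ListImages(ctx context.Context) ([]Image, error)
	ImageStatus(ctx context.Context, ref string) (*Image, error)
	PullImage(ctx context.Context, ref string) (imageRef string, err error)
	RemoveImage(ctx context.Context, ref string) error
	ImageFsInfo(ctx context.Context) (bytesUsed uint64, err error)
}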
Streaming API
The streaming APIs are used where the client needs to interact with a container, which is why they are streaming rather than one-shot interfaces: Exec, PortForward, Attach, and so on. Kubelet's built-in dockershim supports these features through tools such as nsenter and socat, but those tricks do not necessarily carry over to other runtimes. CRI therefore defines these APIs explicitly and requires the container runtime to return the URL of a streaming server, so that the Kubelet can redirect the streaming requests sent by the API server.
A complete Exec flow then looks like this:
The client runs kubectl exec -i -t ...
kube-apiserver sends a streaming request /exec/ to the Kubelet.
The Kubelet requests an Exec URL from the CRI shim through the CRI interface.
The CRI shim returns the Exec URL to the Kubelet.
The Kubelet returns a redirect response to kube-apiserver.
kube-apiserver redirects the streaming request to the Exec URL; the streaming server inside the CRI shim then exchanges data with kube-apiserver, completing the Exec request and response.
In v1.10 and earlier, the container runtime had to return a URL that the API server could reach directly (usually the same address the Kubelet listens on). Starting with v1.11, the Kubelet added a --redirect-container-streaming flag (default false): by default it no longer redirects but proxies the streaming requests itself, so the runtime can return a localhost URL. The advantage of proxying through the Kubelet is that the Kubelet handles the authentication involved in communicating with the API server.
In fact, the streaming-server handling is similar across runtimes, so the Kubelet also provides a streaming server library that container runtime developers can reuse; a sketch of an Exec handler built around this idea follows.
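As a minimal sketch of the runtime side, assuming hypothetical request/response types, a hypothetical session cache, and a hypothetical streaming port, an Exec handler could look like the following. The point is only that Exec returns a URL on the shim's streaming server instead of streaming data itself; with the Kubelet proxying (the post-v1.11 default), a localhost address suffices.

package shim

import (
	"context"
	"fmt"
)

// Hypothetical types mirroring the CRI Exec request/response messages.
type ExecRequest struct {
	ContainerID string
	Cmd         []string
}
type ExecResponse struct {
	URL string
}

// sessionCache is a hypothetical store that keeps the request until the
// streaming server is contacted; the token makes the URL single-use.
type sessionCache interface {
	Cache(req *ExecRequest) (token string, err error)
}

type shim struct {
	sessions      sessionCache
	streamingAddr string // e.g. "127.0.0.1:10350" (hypothetical port)
}

// Exec does not run the command itself: it caches the request under a
// one-time token and returns the streaming-server URL for that session.
func (s *shim) Exec(ctx context.Context, req *ExecRequest) (*ExecResponse, error) {
	token, err := s.sessions.Cache(req)
	if err != nil {
		return nil, err
	}
	return &ExecResponse{
		URL: fmt.Sprintf("http://%s/exec/%s", s.streamingAddr, token),
	}, nil
}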
Container runtime evolution process
Now that we understand the fundamentals of the container runtime interface, let's take a look at the evolution of the container runtime.
The evolution of the container runtime can be divided into three phases:
First, before Kubernetes v1.5, the Kubelet had built-in support for Docker and rkt and configured their container networks through CNI network plug-ins. Customizing runtime behavior at this stage was painful: it meant modifying Kubelet code, and such changes most likely could not be pushed upstream, leaving users to maintain their own fork, which is troublesome to maintain and upgrade.
Different container runtime implementations have their own strengths, and many users wanted Kubernetes to support custom runtimes. So, starting with v1.5, the CRI interface was added, removing these obstacles through a container-runtime abstraction layer and making it possible to use other runtimes without modifying the Kubelet. CRI includes a set of Protocol Buffers, a gRPC API, libraries for the streaming interfaces, and a series of tools for debugging and verification. At this stage the built-in Docker support was gradually migrated behind CRI as well; rkt, however, was not fully migrated, its CRI migration being planned as a separate repository (rktlet) so it could be maintained and managed independently.
In the third phase, starting with v1.11, the built-in rkt code was removed from the Kubelet and the CNI implementation was moved into dockershim. From then on, every container runtime other than Docker integrates through CRI. Besides implementing the CRI interface, an external container runtime is also responsible for configuring the container's network. CNI is generally recommended because it supports the community's many network plug-ins, but it is not required; a network plug-in only needs to satisfy Kubernetes's basic network assumption, IP-per-Pod: every Pod and Node can reach one another directly by IP.
CRI container runtime
The launch of CRI brought new prosperity to the container community, and runtimes suited to a variety of scenarios have emerged, such as the cri-o, frakti, and cri-containerd projects listed earlier.
Also note the difference between a CRI container runtime and a container engine:
A CRI container runtime implements the Kubelet's CRI interface and can therefore be integrated seamlessly into Kubernetes.
A container engine is only a service for managing container images and running containers. It, too, has a standard: OCI (Open Container Initiative).
For example, CNCF's Container Runtime Landscape lists many "container runtimes". Some of them, such as cri-o, implement CRI; many more are merely container engines, which can be used with Kubernetes through cri-o, cri-containerd, and so on.
CRI Tools
CRI Tools is a very useful set of tools for debugging Kubernetes container runtimes and container applications. A subproject of SIG Node, it works with any container runtime that implements the CRI interface. CRI Tools includes two components: crictl, for troubleshooting and debugging, and critest, for conformance verification of container runtime implementations.
crictl
Let's look at crictl first. crictl provides a command-line experience similar to the docker command. When troubleshooting a container application, a system administrator sometimes needs to log in to the node and inspect the state of containers or images there to gather information about the system and the application. crictl is the recommended tool for this, because it offers a consistent experience across different container runtimes.
In usage, crictl closely resembles the docker command-line tool: for example, crictl pods lists PodSandboxes, crictl ps lists containers, and crictl images lists images.
Note that crictl is designed for troubleshooting, not as a replacement for docker or kubectl. For example, since CRI defines no interface for building images, crictl has no equivalent of docker build. But because crictl speaks the Kubernetes interface, it gives a clearer view of containers and Pods than docker does.
critest
critest is a conformance verification tool that checks whether a container runtime meets the Kubelet CRI requirements. It is the other component of CRI Tools. Besides the validation tests, critest also provides performance tests of the CRI interface, such as critest benchmark.
It is recommended to integrate critest into the CRI container runtime's DevOps pipeline, ensuring that each change does not break CRI's basic functionality.
In addition, the results of critest and of the Kubernetes Node E2E tests can be submitted to SIG Node's TestGrid for presentation to the community and users.
Prospects for the future
Docker Runtime split
Docker is currently built into the Kubelet and is the default container runtime, which means the Kubelet effectively depends on Docker and carries a corresponding maintenance burden.
For example, some features inside the Kubelet may apply only to the Docker runtime, and when serious defects are found in Docker or in the components Docker depends on (such as containerd or runc), fixing them requires rebuilding and re-releasing the Kubelet.
Moreover, when users want new features in the Docker runtime, those features can be hard to bring into the Kubelet, especially under the three-month release cycle in which new features normally do not enter existing stable branches. From the Docker runtime's perspective, new functionality therefore lands slowly.
Splitting the Docker engine support out of the Kubelet into a separate cri-docker would solve all of the above problems.
However, because Docker is the default container engine and is already widely used in production, the split and migration would affect the vast majority of users, so the concrete migration path needs to be discussed in detail by the community.
Strongly isolated container engine
Although Kubernetes provides basic multi-tenancy, isolating applications in different namespaces and using RBAC to control users' access to resources, Docker's shared-kernel design still carries great security risks when running untrusted applications in Kubernetes. Strongly isolated container engines emerged to eliminate this problem.
The earliest strongly isolated container engines were hyperd and Clear Containers, the forerunners of Kata Containers. They run a Kubernetes Pod as a virtual machine, so container applications are strongly isolated through virtualization. Virtualization is the foundation of IaaS in cloud computing and its security has been widely validated, so the isolation it provides is well proven. The two projects have since merged into Kata Containers.
Besides Kata Containers, Google and AWS are also promoting strongly isolated container engines: gVisor and Firecracker, respectively.
Unlike Kata Containers, gVisor does not create a complete VM. Instead it implements its own sandbox (its documentation calls it a user-space kernel) that intercepts and filters the container's syscalls to achieve security isolation. gVisor is lighter than a VM, but the interception and filtering come at a real cost, so the container application loses some performance.
Firecracker, in turn, implements a lightweight VM on top of KVM, called a microVM. Unlike Kata, it does not use QEMU; it builds a minimal device model in Rust, so that each microVM consumes only about 5 MB of memory.
Multi-container runtime
With strongly isolated container engines, some new problems inevitably arise. For example, many of Kubernetes's own services and extensions cannot run in a strongly isolated environment because they require HostNetwork or privileged mode. Hence the multi-container-runtime setup emerged.
In such a setup, runc/docker runs the privileged applications while a strongly isolated container engine runs the ordinary ones. Typical combinations are:
runc + Kata
runc + gVisor
Windows Server Containers + Hyper-V Containers
Previously, many container runtimes supported multiple container engines inside their CRI shim, selected through annotations. With the new RuntimeClass resource, the runtime can now be selected directly in the Pod spec:
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: myclass  # RuntimeClass is non-namespaced
handler: myconfiguration
---
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  runtimeClassName: myclass
  # ...
RuntimeClass itself is still at a relatively early stage and will be further enhanced in the future, for example in scheduling.