Today, I will talk to you about how Serverless responds to the resource supply demands of K8s in offline scenarios. Many people may not know much about this topic, so to help you understand it better, I have summarized the following content. I hope you will gain something from this article.
We discuss colocation on K8s, that is, running online and offline workloads of different priorities on shared nodes, because we find that once a business has been migrated to K8s, colocation and resource utilization become unavoidable topics for the operations team.
First of all, there is no doubt that the system capabilities of Kubernetes, and the cloud native ecosystem it drives as an engine, are very powerful. They have brought many advanced ideas into large-scale production use, including microservices, autoscaling, CI/CD, service mesh, application colocation, and so on.
Some of these are well supported by the existing K8s system itself, such as microservices and autoscaling. Others rely on integrating K8s with the ecosystem: CI/CD and service mesh, for example, depend on wiring K8s together with community DevOps and service mesh systems, but most of that integration is already mature, so we usually do not need to do much work ourselves.
However, today's topic, application colocation under the K8s architecture, is a special area. On the one hand, when an enterprise upgrades its infrastructure to a cloud native architecture, it naturally gains some colocation capability, which brings obvious benefits such as improved resource utilization. It is fair to say that containerization and K8s have opened the door for the whole industry to apply colocation at scale. On the other hand, once we really enter this field, even standing on the shoulders of K8s, colocation remains a huge challenge to an enterprise's architectural capability.
Before containerization, when applications were deployed on physical or virtual servers, resource utilization was usually very low. One reason is that many applications show tidal patterns in their load; another is that in most cases a server hosted only one application, rather than packing multiple applications onto one node as K8s does. Yet even after containerizing and hosting workloads on a K8s cluster, we often find that resource utilization is still not high.
A typical resource curve of an online service in a K8s cluster shows the container's resource request sitting well above its real usage curve, leaving a large gap between request and usage. This is because container resource requests cannot be estimated precisely, and on top of that the load has peaks and troughs, so the average utilization ends up low.
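To put a number on that gap in your own cluster, here is a minimal sketch, not something the article prescribes, assuming the official `kubernetes` Python client and a metrics-server installation: it sums CPU requests across all pods and compares them with live usage.

```python
# Minimal sketch: measure the cluster-wide CPU request-vs-usage gap.
# Assumes kubeconfig access and metrics-server (metrics.k8s.io) installed.
from kubernetes import client, config

def parse_cpu(q: str) -> float:
    """Convert a Kubernetes CPU quantity ('250m', '2', '1500000n') to cores."""
    if q.endswith("n"):
        return float(q[:-1]) / 1e9
    if q.endswith("u"):
        return float(q[:-1]) / 1e6
    if q.endswith("m"):
        return float(q[:-1]) / 1e3
    return float(q)

def request_usage_gap():
    config.load_kube_config()
    core = client.CoreV1Api()
    metrics = client.CustomObjectsApi()

    # Sum of CPU requests declared by all pods.
    requested = 0.0
    for pod in core.list_pod_for_all_namespaces().items:
        for c in pod.spec.containers:
            reqs = (c.resources.requests if c.resources else None) or {}
            if "cpu" in reqs:
                requested += parse_cpu(reqs["cpu"])

    # Sum of live CPU usage reported by metrics-server.
    used = 0.0
    pod_metrics = metrics.list_cluster_custom_object("metrics.k8s.io", "v1beta1", "pods")
    for item in pod_metrics["items"]:
        for c in item["containers"]:
            used += parse_cpu(c["usage"]["cpu"])

    ratio = used / requested if requested else 0.0
    print(f"requested={requested:.1f} cores, used={used:.1f} cores, "
          f"gap={requested - used:.1f} cores ({ratio:.0%} of requests)")

if __name__ == "__main__":
    request_usage_gap()
```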
Does that mean K8s does not work? Of course not. Although K8s has not solved every problem of application colocation for us, it is certainly one of the best platforms to build on.
Its excellent system capabilities make K8s naturally suitable for colocation, both the colocation of online services with each other and the online-offline colocation that is now a hot topic in the industry. Tencent has also significantly improved resource utilization in many scenarios through K8s.
Besides the star apps everyone is familiar with, a large share of Tencent's computing power goes to supporting services, offline computing, and so on. By colocating these workloads, together with products and services that show obvious tidal patterns, the improvement in resource utilization is very significant.
In the industry, Google, whose Borg system is the predecessor of K8s, has published a number of colocation-related papers since 2015. A paper published last year shows Borg's colocation pushing CPU utilization toward 60%. Comparing its colocation results in 2011 and 2019, we can see that besides the rise in utilization, the most notable changes are that application priority tiers are graded at a finer granularity and that applications at every tier run more smoothly. The second point in particular reflects Borg's progress in the mechanisms that support colocation, such as scheduling enhancements, resource prediction and reclamation, and task avoidance.
What is the key to making colocation effective? First of all, we need to clarify two questions.
**First question: what is the purpose of colocation?** The purpose of colocation is to reuse idle resources and improve overall resource utilization on the premise that the quality of online services is guaranteed. Guaranteeing online service quality is the precondition; without it, no matter how high the utilization, it is meaningless.
**Second question: what kinds of applications are suitable for colocation?** Two kinds of applications suit colocation. One is periodic applications that need a lot of computing power, usually offline computing tasks. The other is applications that tend to waste resources, usually long-running online services with tidal load patterns. Note, however, that some online services are highly sensitive to certain resources; such services are the biggest challenge for a colocation system, because the slightest carelessness defeats the purpose of colocation, hurts online service quality, and ends up costing more than it gains.
Having settled these two questions, let's look at the mechanisms a colocation system needs. They are usually divided into three layers:
The first is the application management layer, which profiles colocated applications, grades them into priority tiers, and allocates resource quotas. This layer defines application priority levels and the timing of colocation, and uses quotas to keep resource allocation from getting out of control (a minimal sketch follows this list).
The second is the core system of the colocated cluster: scheduling, isolation, resource reuse, and task avoidance. This layer is the heart of colocation and largely determines what effect the cluster can achieve.
Finally, a complete set of adaptive, automated operations tooling is needed.
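As an illustration of the first layer, here is a minimal sketch, again assuming the official `kubernetes` Python client and made-up names such as `offline-low`: it grades the offline tier with a PriorityClass and caps what that tier may request with a ResourceQuota scoped to the priority class.

```python
# Minimal sketch: grade the offline tier and bound its resource quota.
from kubernetes import client, config

def setup_offline_tier(namespace: str = "batch"):
    config.load_kube_config()

    # 1. A priority class that marks offline/batch workloads as lower priority,
    #    so scheduling and eviction logic can tell the tiers apart.
    client.SchedulingV1Api().create_priority_class(
        client.V1PriorityClass(
            metadata=client.V1ObjectMeta(name="offline-low"),
            value=1000,                 # well below the value used for online services
            global_default=False,
            preemption_policy="Never",  # offline tasks should not preempt anyone
            description="Offline batch tasks co-located with online services",
        )
    )

    # 2. A ResourceQuota scoped to that priority class, so the offline tier can
    #    never request more than the capacity we are willing to lend it.
    client.CoreV1Api().create_namespaced_resource_quota(
        namespace,
        client.V1ResourceQuota(
            metadata=client.V1ObjectMeta(name="offline-low-quota"),
            spec=client.V1ResourceQuotaSpec(
                hard={"requests.cpu": "200", "requests.memory": "400Gi", "pods": "500"},
                scope_selector=client.V1ScopeSelector(
                    match_expressions=[
                        client.V1ScopedResourceSelectorRequirement(
                            operator="In",
                            scope_name="PriorityClass",
                            values=["offline-low"],
                        )
                    ]
                ),
            ),
        ),
    )

if __name__ == "__main__":
    setup_offline_tier()
```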
The basic principle of colocation is the redistribution of idle resources, which usually come from two sources. First, a cluster always has fragmented resources left over by the granularity of allocation, and these fragments can be handed to applications with a smaller footprint. Second, the resources that online services request are usually larger than what they actually use; with application profiling, the peaks and troughs of such services can be predicted, and that idle capacity can be lent to other applications.
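A minimal sketch of measuring the first source, the per-node fragments, under the same `kubernetes` Python client assumption: it subtracts the CPU already claimed by pod requests from each node's allocatable capacity.

```python
# Minimal sketch: per-node CPU capacity that no pod request has claimed.
from collections import defaultdict
from kubernetes import client, config

def parse_cpu(q: str) -> float:
    if q.endswith("n"):
        return float(q[:-1]) / 1e9
    if q.endswith("m"):
        return float(q[:-1]) / 1e3
    return float(q)

def node_fragments():
    config.load_kube_config()
    core = client.CoreV1Api()

    requested = defaultdict(float)  # node name -> sum of pod CPU requests
    for pod in core.list_pod_for_all_namespaces().items:
        if not pod.spec.node_name or pod.status.phase in ("Succeeded", "Failed"):
            continue
        for c in pod.spec.containers:
            reqs = (c.resources.requests if c.resources else None) or {}
            if "cpu" in reqs:
                requested[pod.spec.node_name] += parse_cpu(reqs["cpu"])

    for node in core.list_node().items:
        alloc = parse_cpu(node.status.allocatable["cpu"])
        free = alloc - requested[node.metadata.name]
        print(f"{node.metadata.name}: {free:.2f} of {alloc:.2f} cores unrequested")

if __name__ == "__main__":
    node_fragments()
```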
From this background come the two core sub-modules of colocation: resource reuse and task avoidance. Resource reuse, as the name implies, redistributes the two kinds of idle resources above to colocated applications in real time, through prediction and reclamation mechanisms. Task avoidance checks whether core online services are being affected by colocation; as soon as interference appears, the conflict-handling mechanism kicks in and performs operations such as suppression and rescheduling.
These two modules determine both the effect of colocation and the range of workloads it can be applied to. Beyond the theory, there are practical issues that must be considered. To guarantee the colocation effect, frequent real-time prediction and resource reclamation place an extra burden on the cluster itself, so how do we balance reusing as many resources as possible against keeping the frequency of prediction and reclamation low? In addition, to protect online service quality, task avoidance is usually unavoidable, which lowers the execution efficiency of lower-priority applications; under high load this can lead to frequent retries and a backlog of tasks, which may drag down the whole cluster.
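The article does not show its implementation, but a much-simplified task-avoidance loop might look like the following sketch, with the pressure threshold and the `offline-low` tier name as assumptions: when a node's CPU usage crosses the threshold, the low-priority pods on it are evicted so they can be rescheduled, for example onto the elastic virtual node.

```python
# Minimal sketch: evict low-priority pods from CPU-pressured nodes.
from kubernetes import client, config

CPU_PRESSURE_THRESHOLD = 0.90           # assumed: 90% of allocatable CPU
OFFLINE_PRIORITY_CLASS = "offline-low"  # assumed tier name from the quota sketch

def parse_cpu(q: str) -> float:
    if q.endswith("n"):
        return float(q[:-1]) / 1e9
    if q.endswith("m"):
        return float(q[:-1]) / 1e3
    return float(q)

def avoid_interference():
    config.load_kube_config()
    core = client.CoreV1Api()
    metrics = client.CustomObjectsApi()

    allocatable = {
        n.metadata.name: parse_cpu(n.status.allocatable["cpu"])
        for n in core.list_node().items
    }
    node_usage = metrics.list_cluster_custom_object("metrics.k8s.io", "v1beta1", "nodes")

    for item in node_usage["items"]:
        name = item["metadata"]["name"]
        cap = allocatable.get(name)
        if not cap or parse_cpu(item["usage"]["cpu"]) / cap < CPU_PRESSURE_THRESHOLD:
            continue

        # Node is under pressure: evict low-priority pods so they can be
        # rescheduled elsewhere (e.g. onto the elastic virtual node).
        pods = core.list_pod_for_all_namespaces(
            field_selector=f"spec.nodeName={name}").items
        for pod in pods:
            if pod.spec.priority_class_name != OFFLINE_PRIORITY_CLASS:
                continue
            core.create_namespaced_pod_eviction(
                name=pod.metadata.name,
                namespace=pod.metadata.namespace,
                body=client.V1Eviction(
                    metadata=client.V1ObjectMeta(
                        name=pod.metadata.name,
                        namespace=pod.metadata.namespace,
                    )
                ),
            )

if __name__ == "__main__":
    avoid_interference()
```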
This is where Serverless elastic capacity comes in: with virtual nodes backed by elastic containers, both the scaling of existing workloads and newly created workloads can be scheduled in a smooth way, and this capability adds no extra maintenance cost to the cluster.
The core value of this capability for colocation is that it expands the cluster's resource pool at no extra cost, reduces the risk of resource conflicts, and improves the redundancy and applicability of the colocated cluster. In addition, when a conflict such as a resource shortage is detected, in many scenarios the lower-priority tasks can be scaled out or rescheduled onto the elastic containers and continue running there, following the principle of not interrupting tasks that have already started as far as possible, which improves the efficiency of the whole system.
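How a workload becomes eligible to overflow onto the virtual node depends on the specific product; the sketch below assumes the common virtual-kubelet convention, and the taint key, node label, and deployment name are illustrative assumptions. It adds a toleration so the scheduler may place low-priority pods on the virtual node, plus a preference that keeps real nodes as the first choice.

```python
# Minimal sketch: let a low-priority Deployment overflow onto the virtual node.
from kubernetes import client, config

def allow_overflow_to_virtual_node(deployment: str, namespace: str = "batch"):
    config.load_kube_config()
    apps = client.AppsV1Api()

    patch = {
        "spec": {
            "template": {
                "spec": {
                    # Tolerate the taint the virtual node registers with, so the
                    # scheduler may place these pods there when real nodes are full.
                    "tolerations": [{
                        "key": "virtual-kubelet.io/provider",  # assumed taint key
                        "operator": "Exists",
                        "effect": "NoSchedule",
                    }],
                    # Prefer (but do not require) real nodes, so the elastic pool
                    # is used only as overflow capacity.
                    "affinity": {
                        "nodeAffinity": {
                            "preferredDuringSchedulingIgnoredDuringExecution": [{
                                "weight": 100,
                                "preference": {
                                    "matchExpressions": [{
                                        "key": "type",          # assumed node label
                                        "operator": "NotIn",
                                        "values": ["virtual-kubelet"],
                                    }]
                                },
                            }]
                        }
                    },
                }
            }
        }
    }
    apps.patch_namespaced_deployment(deployment, namespace, patch)

if __name__ == "__main__":
    allow_overflow_to_virtual_node("offline-batch-worker")  # hypothetical name
```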
Several typical scenarios for this kind of colocated cluster:
1. When the cluster load is low, backfill tasks to run more work and improve cluster resource utilization.
2. When online services suffer interference, cordon the affected nodes and evict the lower-priority tasks to the virtual node so that they continue running there.
3. When cluster resources are tight, cordon the affected nodes. If a compressible resource such as CPU or IO is short, suppress the lower-priority tasks; if an incompressible resource such as memory or storage is short, evict the lower-priority tasks to the virtual node. In this situation all new Pods are scheduled to the virtual node, putting no pressure on the cluster's fixed resources and avoiding an avalanche (a sketch of this flow follows the list).
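A minimal sketch of scenario 3, with the compressible/incompressible classification and the tier name as assumptions: cordon the pressured node, then either leave suppression to the node agent or evict the offline pods so they restart on the virtual node.

```python
# Minimal sketch: cordon a resource-short node, then suppress or evict offline pods.
from kubernetes import client, config

COMPRESSIBLE = {"cpu", "io"}            # can be throttled without killing tasks
OFFLINE_PRIORITY_CLASS = "offline-low"  # assumed tier name

def handle_node_shortage(node: str, scarce_resource: str):
    config.load_kube_config()
    core = client.CoreV1Api()

    # Cordon: stop scheduling anything new onto the pressured node; new pods
    # will land on the virtual node instead.
    core.patch_node(node, {"spec": {"unschedulable": True}})

    offline_pods = [
        p for p in core.list_pod_for_all_namespaces(
            field_selector=f"spec.nodeName={node}").items
        if p.spec.priority_class_name == OFFLINE_PRIORITY_CLASS
    ]

    if scarce_resource in COMPRESSIBLE:
        # Compressible shortage: suppression (e.g. tightening CPU shares in the
        # node agent's cgroups) is enough; pods stay where they are.
        print(f"{node}: throttle {len(offline_pods)} offline pods via the node agent")
        return

    # Incompressible shortage: evict offline pods so they restart on the
    # elastic virtual node rather than being OOM-killed here.
    for p in offline_pods:
        core.create_namespaced_pod_eviction(
            name=p.metadata.name,
            namespace=p.metadata.namespace,
            body=client.V1Eviction(
                metadata=client.V1ObjectMeta(
                    name=p.metadata.name, namespace=p.metadata.namespace)),
        )

if __name__ == "__main__":
    handle_node_shortage("node-1", "memory")  # hypothetical node name
```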
After reading the above, do you have a better understanding of how Serverless responds to the resource supply demands of K8s in offline scenarios? If you want to learn more, please follow the industry information channel. Thank you for your support.