

How to Use the Kubernetes Scheduler



This article explains how to use the Kubernetes scheduler. The approach introduced here is simple, fast, and practical, so let's walk through how the Kubernetes scheduler works and how to use it.

Kube-scheduler is one of the core components of a Kubernetes cluster and is responsible for scheduling across the cluster's resources. According to specific scheduling algorithms and policies, it places each Pod on the most suitable worker node, making cluster resource usage more reasonable and complete. This is an important reason to choose Kubernetes in the first place: if a new technology cannot help an enterprise save costs and improve efficiency, it is very hard to promote.

Scheduling process

By default, the scheduling policies provided by kube-scheduler meet most requirements; the examples covered so far basically use the default policy, which ensures that a Pod is assigned to a node with sufficient resources to run. But in real online projects we may know our own applications better than Kubernetes does. For example, we may want a Pod to run only on a few specific nodes, or want certain nodes to run only specific types of applications. This requires that the scheduler be controllable.

Kube-scheduler is the scheduler of Kubernetes. Its main job is to place Pods onto suitable nodes according to specific scheduling algorithms and policies. It is an independent binary that, after startup, continuously watches the API Server, picks up Pods whose PodSpec.NodeName is empty, and creates a binding for each of them.
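As a concrete illustration of that watch, the following sketch (not kube-scheduler's actual code) uses client-go to list the Pods that have not been bound to any node yet; building the clientset from the in-cluster config is an assumption made for brevity:

package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
)

func main() {
    // Assumes the program runs inside the cluster; outside it, build the
    // config from a kubeconfig file with clientcmd instead.
    config, err := rest.InClusterConfig()
    if err != nil {
        panic(err)
    }
    clientset := kubernetes.NewForConfigOrDie(config)

    // "spec.nodeName=" matches Pods whose PodSpec.NodeName is still empty,
    // i.e. Pods that no scheduler has bound yet.
    pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{
        FieldSelector: "spec.nodeName=",
    })
    if err != nil {
        panic(err)
    }
    for _, p := range pods.Items {
        fmt.Printf("unscheduled pod: %s/%s\n", p.Namespace, p.Name)
    }
}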

Kube-scheduler structure (diagram)

This process seems simple, but in a real production environment there are many issues to consider:

How to ensure scheduling fairness across all nodes? Keep in mind that the resource configurations of nodes are not all the same.

How to ensure that each node can be allocated resources?

How can cluster resources be used efficiently?

How can cluster resources be maximized?

How to ensure the performance and efficiency of Pod scheduling?

Can users customize their scheduling strategies according to their actual needs?

Considering the variety of complex situations in real environments, the Kubernetes scheduler is implemented in plug-in form, which makes it easy to customize or extend: we can write a custom scheduler and integrate it with Kubernetes as a plug-in.

The source code of the Kubernetes scheduler lives in kubernetes/pkg/scheduler; the rough directory structure is as follows (it may differ between versions):

kubernetes/pkg/scheduler
|-- scheduler.go          // scheduling entry
|-- algorithm
|   |-- predicates        // node filtering (Predicates) policies
|   |-- priorities        // node scoring (Priorities) policies
|-- algorithmprovider
|   |-- defaults          // defines the default scheduler

The core program that creates and runs the Scheduler is in pkg/scheduler/scheduler.go; the entry point of kube-scheduler itself is in cmd/kube-scheduler/scheduler.go.

Scheduling is mainly divided into the following parts:

First is the pre-selection process, which filters out the nodes that do not meet the conditions; this is called Predicates.

Then comes the prioritization process, which sorts the nodes that passed by priority; this is called Priorities.

Finally, the node with the highest priority is selected. If any intermediate step reports an error, the error is returned directly.

The Predicates phase first traverses all nodes and filters out those that do not meet the conditions; these are hard rules. All nodes that pass this stage are recorded and used as input to the second stage. If no node meets the conditions, the Pod remains in the Pending state until some node does, and the scheduler keeps retrying in the meantime.

Therefore, when deploying an application, if we find that a Pod stays in the Pending state, it means no node satisfies its scheduling conditions, and we can check whether node resources are sufficient.

In the Priorities phase, the remaining nodes are scored: if multiple nodes meet the conditions, they are sorted by priority and the node with the highest priority is chosen to run the Pod.
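The two phases can be sketched as follows (illustrative Go, not the real kube-scheduler types; Pod, Node, Predicate, and Priority here are simplified placeholders):

package sched

import "fmt"

// Simplified placeholders; the real scheduler works on v1.Pod and v1.Node.
type Pod struct{ Name string }
type Node struct{ Name string }

// A Predicate is a hard filter; a Priority scores a node from 0 to 10.
type Predicate func(pod Pod, node Node) bool
type Priority struct {
    Score  func(pod Pod, node Node) int
    Weight int
}

func scheduleOne(pod Pod, nodes []Node, preds []Predicate, prios []Priority) (Node, error) {
    // Phase 1 (Predicates): any failed predicate excludes the node.
    var feasible []Node
    for _, n := range nodes {
        ok := true
        for _, p := range preds {
            if !p(pod, n) {
                ok = false
                break
            }
        }
        if ok {
            feasible = append(feasible, n)
        }
    }
    if len(feasible) == 0 {
        // No candidate: the Pod stays Pending and scheduling is retried.
        return Node{}, fmt.Errorf("no feasible node for pod %s", pod.Name)
    }
    // Phase 2 (Priorities): weighted sum of scores, highest total wins.
    best, bestScore := feasible[0], -1
    for _, n := range feasible {
        total := 0
        for _, pr := range prios {
            total += pr.Weight * pr.Score(pod, n)
        }
        if total > bestScore {
            best, bestScore = n, total
        }
    }
    return best, nil
}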

The following is a simple diagram of the scheduling process:

kube-scheduler filtering process (diagram)

The more detailed process is as follows:

First, a client creates a Pod resource through the API Server's REST API or the kubectl tool.

After receiving the user's request, API Server stores the relevant data in the etcd database.

The scheduler watches the API Server for the list of Pods waiting to be bound and tries to allocate a node for each of them. This allocation process consists of the two stages described above:

In the pre-selection phase (Predicates), nodes are filtered: the scheduler applies a set of rules to exclude nodes that do not meet the requirements. For example, if the Pod sets resource requests, nodes with fewer available resources than the Pod requests are obviously filtered out.

In the prioritization phase (Priorities), the nodes that passed the previous stage are scored. The scheduler also considers cluster-wide optimization strategies, such as spreading the replicas of a Deployment across different hosts, or preferring the least-loaded host.

After the filtering and scoring above, the node with the highest score is selected and bound to the Pod, and the result is stored in etcd.

Finally, the kubelet on the selected node performs the operations that actually create the Pod.
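The binding step in this flow can be expressed with client-go's Bind call; a minimal sketch, assuming the clientset from the earlier snippet and that pod and nodeName come out of the two scheduling stages:

package sched

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// bindPod asks the API Server to bind pod to nodeName; the API Server then
// persists the decision in etcd by filling in the Pod's spec.nodeName.
func bindPod(ctx context.Context, clientset *kubernetes.Clientset, pod *corev1.Pod, nodeName string) error {
    binding := &corev1.Binding{
        ObjectMeta: metav1.ObjectMeta{Name: pod.Name, Namespace: pod.Namespace},
        Target: corev1.ObjectReference{
            Kind:       "Node",
            APIVersion: "v1",
            Name:       nodeName,
        },
    }
    return clientset.CoreV1().Pods(pod.Namespace).Bind(ctx, binding, metav1.CreateOptions{})
}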

Among them, Predicates filtering has a series of algorithms that can be used. Here are a few:

PodFitsResources: whether the remaining resources on the node are larger than those requested by Pod

PodFitsHost: if Pod specifies NodeName, check whether the node name matches NodeName

PodFitsHostPorts: whether the port already used on the node conflicts with the port applied for by Pod

PodSelectorMatches: filter out nodes that do not match the label specified by Pod

NoDiskConflict: the volumes already mounted on the node must not conflict with the volumes the Pod specifies, unless both are read-only

CheckNodeDiskPressure: check whether the node is reporting disk pressure (insufficient disk space)

CheckNodeMemoryPressure: check whether the node is reporting memory pressure (insufficient memory)

In addition to these filtering algorithms, there are some other algorithms. For more details, we can look at the source file: k8s.io/kubernetes/pkg/scheduler/algorithm/predicates/predicates.go.
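To make the idea concrete, here is a toy predicate in the spirit of PodFitsResources; it is a simplification with assumed plain-integer fields, while the real implementation works on resource.Quantity values:

package sched

// Simplified resource view; the real predicate uses v1.ResourceList.
type Resources struct {
    MilliCPU int64
    Memory   int64
}

type NodeInfo struct {
    Allocatable Resources // total schedulable capacity of the node
    Requested   Resources // sum of requests of Pods already placed there
}

// podFitsResources mirrors the PodFitsResources idea: the node's remaining
// resources must cover everything the incoming Pod requests.
func podFitsResources(podRequest Resources, node NodeInfo) bool {
    freeCPU := node.Allocatable.MilliCPU - node.Requested.MilliCPU
    freeMem := node.Allocatable.Memory - node.Requested.Memory
    return podRequest.MilliCPU <= freeCPU && podRequest.Memory <= freeMem
}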

Priorities are configured as a series of key-value pairs, where the key is the name of the priority function and the value is its weight. Here are several representative options:

LeastRequestedPriority: computes a score from CPU and memory utilization; the lower the utilization, the higher the score. This makes sense: the less used a node's resources are, the more capable it is of running additional Pods.

SelectorSpreadPriority: for better availability, spread multiple Pod replicas of the same Deployment or RC across different nodes as much as possible. When a Pod is scheduled, the scheduler finds the Pod's controller and counts the controller's existing Pods on each node; the fewer of them a node runs, the higher that node's score.

ImageLocalityPriority: if the images the Pod needs are already present on a node, the larger the total size of those images, the higher the node's score.

NodeAffinityPriority: computes a score based on node affinity; we will explain the use of affinity in detail later.

Besides these strategies there are many others; see the source directory k8s.io/kubernetes/pkg/scheduler/algorithm/priorities/ for details. Each priority function returns a score from 0 to 10 (the higher the score, the better the node) and has an associated weight. The final score of a host is computed with the following formula:

FinalScoreNode = (weight1 * priorityFunc1) + (weight2 * priorityFunc2) + ... + (weightn * priorityFuncn)
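For example, with hypothetical scores, a node where LeastRequestedPriority (weight 1) returns 8 and BalancedResourceAllocation (weight 1) returns 6 ends up with FinalScoreNode = 1 * 8 + 1 * 6 = 14, and it beats any node whose total is lower.

As a sketch of one such priority function, the LeastRequestedPriority idea can be written against the NodeInfo and Resources placeholders from the predicate sketch above (the classic formula is score = (capacity - requested) * 10 / capacity, averaged over CPU and memory):

// leastRequestedScore returns 0-10: the lower the node's utilization after
// placing the Pod, the higher the score.
func leastRequestedScore(podRequest Resources, node NodeInfo) int64 {
    cpu := scoreOne(node.Requested.MilliCPU+podRequest.MilliCPU, node.Allocatable.MilliCPU)
    mem := scoreOne(node.Requested.Memory+podRequest.Memory, node.Allocatable.Memory)
    return (cpu + mem) / 2
}

func scoreOne(requested, capacity int64) int64 {
    if capacity == 0 || requested > capacity {
        return 0
    }
    return (capacity - requested) * 10 / capacity
}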

Custom scheduling

The above is the basic flow of kube-scheduler's default scheduling. Besides using the default scheduler, we can also customize the scheduling policy.

Scheduler extension

When it starts, kube-scheduler can be given a scheduling policy file with the --policy-config-file parameter, and we can assemble Predicates and Priority functions according to our own needs. Choosing different filter and priority functions, controlling the weights of the priority functions, and adjusting the order of the filter functions all affect the scheduling process.

Here is an example of an official Policy file:

{"kind": "Policy", "apiVersion": "v1", "predicates": [{"name": "PodFitsHostPorts"}, {"name": "PodFitsResources"}, {"name": "NoDiskConflict"}, {"name": "NoVolumeZoneConflict"}, {"name": "MatchNodeSelector"} {"name": "HostName"}], "priorities": [{"name": "LeastRequestedPriority", "weight": 1}, {"name": "BalancedResourceAllocation", "weight": 1}, {"name": "ServiceSpreadingPriority", "weight": 1}, {"name": "EqualPriority", "weight": 1}]} multiple schedulers

Multiple schedulers

If the default scheduler does not meet the requirements, a custom scheduler can also be deployed. Multiple scheduler instances can run in the same cluster at the same time, and spec.schedulerName selects which scheduler to use (the built-in scheduler is used by default).

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  schedulerName: my-scheduler   # choose the custom scheduler my-scheduler
  containers:
  - name: nginx
    image: nginx:1.10

Developing our own scheduler, such as the my-scheduler above, is also straightforward:

First, obtain the nodes and Pods through the API Server.

Then select the Pods whose status.phase is Pending and whose spec.schedulerName is my-scheduler.

Next, compute the most suitable target node for each such Pod according to our custom scheduling algorithm.

Finally, once the placement is decided, create a Binding object to bind the Pod to that node (a minimal sketch follows this list).
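Putting those steps together, a minimal my-scheduler loop might look like the following. This is a sketch under stated assumptions: the client-go clientset and bindPod helper from the earlier snippets, a hypothetical pickNode standing in for the custom algorithm, and a recent Kubernetes version that supports the spec.schedulerName field selector (older clusters would have to filter client-side):

package sched

import (
    "context"
    "time"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

func runMyScheduler(ctx context.Context, clientset *kubernetes.Clientset) error {
    for {
        // Steps 1-2: find the Pending Pods that asked for us by name.
        pods, err := clientset.CoreV1().Pods("").List(ctx, metav1.ListOptions{
            FieldSelector: "status.phase=Pending,spec.schedulerName=my-scheduler",
        })
        if err != nil {
            return err
        }
        nodes, err := clientset.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
        if err != nil {
            return err
        }
        for i := range pods.Items {
            pod := &pods.Items[i]
            // Step 3: the custom algorithm picks a target node.
            nodeName := pickNode(pod, nodes.Items)
            if nodeName == "" {
                continue // no feasible node; leave the Pod Pending
            }
            // Step 4: bind the Pod to the node (bindPod shown earlier).
            if err := bindPod(ctx, clientset, pod, nodeName); err != nil {
                return err
            }
        }
        // A real scheduler would use a watch/informer instead of polling.
        time.Sleep(2 * time.Second)
    }
}

// pickNode is a hypothetical placeholder for the custom algorithm; it just
// returns the first node, for illustration only.
func pickNode(pod *corev1.Pod, nodes []corev1.Node) string {
    if len(nodes) == 0 {
        return ""
    }
    return nodes[0].Name
}

Note that a production scheduler would also need conflict handling: when several schedulers place Pods concurrently, a Bind call can fail and must be retried against fresh cluster state.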

Priority scheduling

Unlike the node priorities (Priorities) discussed in the scheduling strategy above, the priority here refers to the priority of the Pod itself: high-priority Pods are scheduled first, and when resources are insufficient, low-priority Pods may be sacrificed so that important Pods can be deployed.

To define a Pod's priority, first define a PriorityClass object, which is cluster-scoped rather than namespaced:

apiVersion: scheduling.k8s.io/v1   # scheduling.k8s.io/v1beta1 or v1alpha1 on older clusters
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "This priority class should be used for XYZ service pods only."

Where:

value is the priority, a 32-bit integer; the larger the value, the higher the priority.

globalDefault applies to Pods that do not set priorityClassName; at most one PriorityClass in the whole cluster should set it to true.

Then the defined PriorityClass can be referenced by name in a Pod's spec.priorityClassName:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  priorityClassName: high-priority

It is also worth noting that preemption is triggered when no node has enough resources for the scheduler to place a Pod, leaving it Pending. Preemption tries to evict lower-priority Pods from a node to free up resources so that the high-priority Pod can be scheduled there.

Reviewing the figures above, the Kubernetes scheduling process should now be much clearer.

At this point, you should have a deeper understanding of how to use the Kubernetes scheduler; it is well worth trying out in practice.
