This article looks at how the Kubernetes scheduler uses a map/reduce pattern to implement its priority (scoring) calculation. It is quite practical, so I am sharing it here; I hope you get something out of it. Let's take a look.
1. Design foundations
1.1 Two stages: single point and aggregation
During priority calculation, apart from the final summation, the work of a single algorithm is split into two stages: single point and aggregation.
In the single-point stage, the algorithm scores one node at a time; in the aggregation stage, it aggregates those per-node results once the single-point stage has completed.
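As a rough illustration of this split (all names below are hypothetical, not the scheduler's own types), each algorithm can be modeled as a per-node map function plus a per-algorithm reduce function:

// HostScore is one algorithm's score for one node (hypothetical type).
type HostScore struct {
	Host  string
	Score int
}

// algorithm bundles the two stages: mapper scores a single node
// (single-point stage), reduce aggregates that algorithm's whole
// score column (aggregation stage).
type algorithm struct {
	name   string
	weight int
	mapper func(node string) HostScore
	reduce func(scores []HostScore)
}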
1.2 Parallelism: nodes and algorithms
Both the single-point and the aggregation phases run in parallel, but over different objects: the single-point phase parallelizes over individual nodes, while the aggregation phase parallelizes over algorithms. This separation avoids data races between goroutines, so the priority calculation is accelerated without any locking; a minimal sketch of the two dimensions follows.
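Here is a minimal sketch of the two parallel dimensions, using the toy types above (a simplified model, not the scheduler's code). Node-level goroutines each fill one column of the result matrix, then one goroutine per algorithm reduces its own row, so no two goroutines ever write the same slot:

import "sync"

// scoreAll runs the single-point stage in parallel over nodes, then the
// aggregation stage in parallel over algorithms. results[i][j] is
// algorithm i's score for node j; every goroutine writes a disjoint set
// of slots, so no locking is required.
func scoreAll(algs []algorithm, nodes []string) [][]HostScore {
	results := make([][]HostScore, len(algs))
	for i := range results {
		results[i] = make([]HostScore, len(nodes))
	}

	var wg sync.WaitGroup
	for j := range nodes { // single-point stage: parallel over nodes
		wg.Add(1)
		go func(j int) {
			defer wg.Done()
			for i := range algs {
				results[i][j] = algs[i].mapper(nodes[j])
			}
		}(j)
	}
	wg.Wait()

	for i := range algs { // aggregation stage: parallel over algorithms
		if algs[i].reduce == nil {
			continue
		}
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			algs[i].reduce(results[i])
		}(i)
	}
	wg.Wait()
	return results
}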
1.3 Map and reduce
Map and reduce correspond to the two parallel dimensions above: map scores a single node, while reduce aggregates the scores produced in the map phase and then derives a second score from that summary.
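For example, a hypothetical map/reduce pair (not one of the scheduler's built-in algorithms) could emit a raw score per node and, once the whole column is known, rescale it to a fixed range; this is exactly the kind of "second score" that only the reduce stage can compute:

// nameLenMap emits a raw score for a single node (hypothetical rule:
// longer node name, higher score).
func nameLenMap(node string) HostScore {
	return HostScore{Host: node, Score: len(node)}
}

// normalizeReduce rescales one algorithm's scores to 0..10; it needs
// the maximum over all nodes, so it can only run after the map phase.
func normalizeReduce(scores []HostScore) {
	max := 0
	for _, s := range scores {
		if s.Score > max {
			max = s.Score
		}
	}
	if max == 0 {
		return
	}
	for i := range scores {
		scores[i].Score = scores[i].Score * 10 / max
	}
}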
1.4 Weight
Both map and reduce phases are computed by the algorithms themselves. If we want to make custom adjustments, we can tune the weight of an individual algorithm in the scoring process and thereby tailor the priority calculation to our own needs.
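Concretely, a node's final score is the weighted sum of its per-algorithm scores (section 2.4 shows the scheduler's real summation); with the toy types above:

// totalScore weights and sums the per-algorithm scores of node j.
func totalScore(algs []algorithm, results [][]HostScore, j int) int {
	sum := 0
	for i := range algs {
		sum += algs[i].weight * results[i][j].Score
	}
	return sum
}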
1.5 Random distribution
When priorities are determined, several nodes will often share the same highest priority. During host selection, a random draw decides whether the current node replaces the previously most suitable node of equal priority, so every tied node ends up selected with equal probability (see selectHost in section 2.5).
2. Source code analysis
The core of the scoring flow is in PrioritizeNodes; only the key data-structure design is introduced here.
2.1 Lock-free storage of calculation results
Lock-free storage of the calculation results is achieved through the two-dimensional slice below. To store the result of one algorithm for one node, only two indexes are needed: the algorithm index and the node index. Likewise, if each goroutine is assigned the index of a single node, it can compute in parallel without interfering with any other goroutine.
// nodes []*v1.Node is passed in and holds all the nodes; the node
// index used below refers to positions in this slice.
results := make([]schedulerapi.HostPriorityList, len(priorityConfigs), len(priorityConfigs))
2.2 Map calculation based on the node index
ParallelizeUntil, whose implementation was introduced with the pre-selection (predicate) stage, generates work indexes up to the count passed in and puts them into a chan; worker goroutines then take indexes from the chan and compute directly. A simplified sketch of this helper follows the snippet below.
workqueue.ParallelizeUntil(context.TODO(), 16, len(nodes), func(index int) {
	// compute for this node against every configured algorithm
	nodeInfo := nodeNameToInfo[nodes[index].Name]
	// iterate over the algorithms by index
	for i := range priorityConfigs {
		if priorityConfigs[i].Function != nil {
			// legacy whole-cluster priority functions are handled elsewhere
			continue
		}

		var err error
		// save the result for this (algorithm, node) pair through the two indexes
		results[i][index], err = priorityConfigs[i].Map(pod, meta, nodeInfo)
		if err != nil {
			appendError(err)
			results[i][index].Host = nodes[index].Name
		}
	}
})
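For reference, here is a simplified sketch of the chan-of-indexes idea behind ParallelizeUntil (the real helper lives in k8s.io/client-go/util/workqueue and is more careful; this toy version only illustrates the mechanism):

import (
	"context"
	"sync"
)

// parallelizeUntil pre-fills a buffered chan with all work indexes,
// then lets a fixed number of workers drain it until the chan is
// empty or the context is cancelled.
func parallelizeUntil(ctx context.Context, workers, pieces int, doWork func(int)) {
	toProcess := make(chan int, pieces)
	for i := 0; i < pieces; i++ {
		toProcess <- i
	}
	close(toProcess)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for index := range toProcess {
				select {
				case <-ctx.Done():
					return
				default:
					doWork(index)
				}
			}
		}()
	}
	wg.Wait()
}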
2.3 Reduce calculation based on the algorithm index
Parallelizing over algorithms, one goroutine is started per algorithm. Through its algorithm index, each goroutine reads and aggregates all of that algorithm's map-phase results, and the output is stored back into the same slots.
// compute each policy's score: one goroutine per algorithm
for i := range priorityConfigs {
	if priorityConfigs[i].Reduce == nil {
		continue
	}
	wg.Add(1)
	go func(index int) {
		defer wg.Done()
		// read and aggregate this algorithm's whole map-phase column
		if err := priorityConfigs[index].Reduce(pod, meta, nodeNameToInfo, results[index]); err != nil {
			appendError(err)
		}
		if klog.V(10) {
			for _, hostPriority := range results[index] {
				klog.Infof("%v -> %v: %v, Score: (%d)", util.GetPodFullName(pod), hostPriority.Host, priorityConfigs[index].Name, hostPriority.Score)
			}
		}
	}(i)
}
// Wait for all computations to be finished.
wg.Wait()
2.4 Priority score summation
After the map/reduce phases, the next step is to accumulate, for every node, the results of all algorithms.
// Summarize all scores.
result := make(schedulerapi.HostPriorityList, 0, len(nodes))

for i := range nodes {
	result = append(result, schedulerapi.HostPriority{Host: nodes[i].Name, Score: 0})
	// iterate over all algorithm configurations
	for j := range priorityConfigs {
		result[i].Score += results[j][i].Score * priorityConfigs[j].Weight
	}

	for j := range scoresMap {
		result[i].Score += scoresMap[j][i].Score
	}
}
2.5 Randomly selecting a host among equal priorities
Random selection here means that when multiple hosts share the same highest priority, there is a certain probability that the previously chosen node (the best one found so far) is replaced by the current node of equal priority. It is implemented through cntOfMaxScore and rand.Intn(cntOfMaxScore), which amounts to reservoir sampling over the tied nodes: each ends up selected with probability 1/cntOfMaxScore.
func (g *genericScheduler) selectHost(priorityList schedulerapi.HostPriorityList) (string, error) {
	if len(priorityList) == 0 {
		return "", fmt.Errorf("empty priorityList")
	}

	maxScore := priorityList[0].Score
	selected := priorityList[0].Host
	cntOfMaxScore := 1
	for _, hp := range priorityList[1:] {
		if hp.Score > maxScore {
			maxScore = hp.Score
			selected = hp.Host
			cntOfMaxScore = 1
		} else if hp.Score == maxScore {
			cntOfMaxScore++
			if rand.Intn(cntOfMaxScore) == 0 {
				// Replace the candidate with probability of 1/cntOfMaxScore
				selected = hp.Host
			}
		}
	}
	return selected, nil
}
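A quick way to convince yourself that this tie-breaking is uniform: replay the same loop over three hosts tied at the same score, and each should win roughly a third of the trials (a standalone demo, not scheduler code):

package main

import (
	"fmt"
	"math/rand"
)

func main() {
	wins := make(map[int]int)
	for trial := 0; trial < 30000; trial++ {
		selected, cntOfMaxScore := 0, 1
		for i := 1; i < 3; i++ { // the remaining tied candidates
			cntOfMaxScore++
			if rand.Intn(cntOfMaxScore) == 0 {
				selected = i
			}
		}
		wins[selected]++
	}
	fmt.Println(wins) // every count should be close to 10000
}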
That is how the Kubernetes scheduler uses the map/reduce pattern to implement its priority calculation. Some of these points may well come up in day-to-day work; I hope this article taught you something new.