In this article, the editor will show you how to parallelize a genetic algorithm with MPI for Python. The content is rich and analyzed from a professional point of view; I hope you can get something out of it after reading.
When we use a genetic algorithm to optimize an objective function, the function is usually high-dimensional and its derivatives are generally difficult to obtain. As a result, evaluating the fitness function is usually time-consuming.
For example, when using a genetic algorithm to search for an optimal structure, we usually need to call quantum chemistry software to calculate the structure's total energy, which is a very time-consuming process. Likewise, when we optimize force field parameters, taking the error between the energy calculated by the force field and a benchmark energy as the fitness, we also need to call the corresponding force field program to obtain the total energy, and that process is also relatively time-consuming.
This leads to a problem: when the population is relatively large, using the fitness information to produce the next generation makes the reproduction step of each generation very time-consuming. Fortunately, the selection, crossover, and mutation operations are independent for the individuals in the population, so we can process this part in parallel to accelerate the iteration of the genetic algorithm.
Further encapsulating the mpi4py interface
To make the MPI interface more convenient to call within GAFT, I decided to further wrap mpi4py for the parts the genetic algorithm needs, so I wrote a separate MPIUtil class. For details, see gaft/mpiutil.py.
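Before going into the concrete interfaces, here is a minimal sketch of what such a wrapper can look like. It is an illustration assuming mpi4py is installed, not the exact gaft implementation (for that, see gaft/mpiutil.py):
Python
from itertools import chain  # used by merge_seq below
import logging

try:
    from mpi4py import MPI
    MPI_INSTALLED = True
except ImportError:
    MPI_INSTALLED = False

class MPIUtil(object):
    def __init__(self):
        # Logger for warnings such as over-splitting a small workload.
        logger_name = 'gaft.{}'.format(self.__class__.__name__)
        self._logger = logging.getLogger(logger_name)

    @property
    def rank(self):
        # Rank of the current process; 0 when MPI is not available.
        return MPI.COMM_WORLD.Get_rank() if MPI_INSTALLED else 0

    @property
    def size(self):
        # Number of processes in the communicator; 1 without MPI.
        return MPI.COMM_WORLD.Get_size() if MPI_INSTALLED else 1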
Intra-group collective communication interface
Since the parallelized tasks take place during the reproduction of the population, I need to divide the previous generation's population into multiple sub-parts and then carry out the genetic operations (selection, crossover, and mutation) on each sub-part in its own process. At the end, the sub-populations obtained from the sub-parts are gathered and merged. I wrote several interfaces for this partitioning and collection (they are methods of the MPIUtil class; MPI and chain come from the module-level imports of mpi4py and itertools):
Python
def split_seq(self, sequence):
    '''
    Split the sequence according to rank and processor number.
    '''
    starts = [i for i in range(0, len(sequence), len(sequence)//self.size)]
    ends = starts[1:] + [len(sequence)]
    start, end = list(zip(starts, ends))[self.rank]
    return sequence[start: end]

def split_size(self, size):
    '''
    Split a size number (int) to sub-size numbers.
    '''
    if size < self.size:
        warn_msg = ('Splitting size ({}) is smaller than process ' +
                    'number ({}), more processors would be ' +
                    'superfluous').format(size, self.size)
        self._logger.warning(warn_msg)
        splited_sizes = [1]*size + [0]*(self.size - size)
    elif size % self.size != 0:
        residual = size % self.size
        splited_sizes = [size // self.size]*self.size
        for i in range(residual):
            splited_sizes[i] += 1
    else:
        splited_sizes = [size // self.size]*self.size

    return splited_sizes[self.rank]

def merge_seq(self, seq):
    '''
    Gather data in sub-process to root process.
    '''
    if self.size == 1:
        return seq

    mpi_comm = MPI.COMM_WORLD
    merged_seq = mpi_comm.allgather(seq)
    return list(chain(*merged_seq))
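As a quick usage sketch (the names here are illustrative; it assumes an MPIUtil instance created as mpi = MPIUtil() with the methods above), the three interfaces behave like this when launched with, e.g., mpiexec -n 2 python demo.py:
Python
# Hypothetical demo, not part of gaft.
mpi = MPIUtil()

seq = list(range(10))
local = mpi.split_seq(seq)       # each process gets its own slice
print('rank {} got {}'.format(mpi.rank, local))

jobs = mpi.split_size(5)         # this process's share of 5 jobs
merged = mpi.merge_seq(local)    # allgather: every process ends up with the full list
assert merged == seq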
Add parallelism to the main loop of the genetic algorithm
During population reproduction, the population is divided according to the number of processes, the genetic operations are carried out in parallel, and the sub-populations are then merged; this completes the parallelization with few code changes. For details, see: https://github.com/PytLab/gaft/blob/master/gaft/engine.py#L67
Python
# Enter evolution iteration.
for g in range(ng):
    # Scatter jobs to all processes.
    local_indvs = []
    local_size = mpi.split_size(self.population.size // 2)

    # Fill the new population.
    for _ in range(local_size):
        # Select father and mother.
        parents = self.selection.select(self.population, fitness=self.fitness)
        # Crossover.
        children = self.crossover.cross(*parents)
        # Mutation.
        children = [self.mutation.mutate(child) for child in children]
        # Collect children.
        local_indvs.extend(children)

    # Gather individuals from all processes.
    indvs = mpi.merge_seq(local_indvs)
    # The next generation.
    self.population.individuals = indvs
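To see the scatter/compute/gather pattern in isolation, here is a self-contained toy written directly against mpi4py (the quadratic "fitness" and all names are hypothetical stand-ins, not gaft code):
Python
# Run with e.g. `mpiexec -n 4 python toy.py`.
from itertools import chain
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

population = list(range(50))                          # stand-in individuals
local = population[rank::size]                        # scatter: one slice per process
local_scores = [(x, -(x - 20)**2) for x in local]     # an expensive evaluation would go here
scores = list(chain(*comm.allgather(local_scores)))   # gather on every process
if rank == 0:
    print('best individual:', max(scores, key=lambda s: s[1]))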
Next, I ran parallel acceleration tests on the one-dimensional optimization example in the project to see the effect of the acceleration. The example code is in /examples/ex01/
Since my laptop has a limited number of cores, I installed gaft on the laboratory cluster and used MPI with multiple cores to run the one-dimensional optimization. The population size is 50 and the number of generations is 100; varying the number of cores gives different optimization times and speedups.
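The absolute numbers depend on the machine, but a minimal sketch of how such a timing can be taken (illustrative; it assumes an already-configured gaft GAEngine instance) is:
Python
import time

def timed_run(engine, ng=100):
    '''Wall time of one GA run; speedup = serial time / parallel time.'''
    start = time.time()
    engine.run(ng=ng)    # assumed gaft engine entry point
    return time.time() - start

The same script is then launched with different core counts via mpiexec, and the serial wall time is divided by each parallel wall time to obtain the speedup.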
The above is how to use MPI for Python to parallelize a genetic algorithm, as shared by the editor. If you happen to have similar doubts, you may refer to the above analysis. If you want to learn more, you are welcome to follow the industry information channel.