This article explains how to optimize garbage collection with a genetic algorithm (GA) in Python. The explanation is kept simple and clear, and should be easy to follow.
Background
In one chapter, Mitchell introduces a robot called Robby, whose sole purpose in life is to pick up garbage, and describes how to use a GA to optimize Robby's control strategy. Below I explain my solution to this problem and show how to implement the algorithm in Python. There are some good packages for building such algorithms (such as DEAP), but in this tutorial I use only base Python, NumPy, and tqdm (optional).
Although this is just a toy example, GAs are used in many practical applications. As a data scientist, I often use them for hyperparameter optimization and model selection. GAs are computationally expensive, but they allow us to explore multiple regions of the search space in parallel and are a good option when gradients cannot be computed.
Problem description
A robot named Robby lives in a two-dimensional grid world full of garbage and surrounded by four walls. The goal of this project is to evolve an optimal control strategy for Robby so that he picks up garbage efficiently rather than bumping into walls.
Robby can only see the four squares adjacent to him (up, down, left, and right) plus the square he is on, and each of these squares can be in one of three states: empty, containing garbage, or a wall. Robby can therefore find himself in 3^5 = 243 different situations. Robby can perform seven different actions: move in one of the four directions (up, down, left, right), move randomly, pick up garbage, or stay still.
Robby's control strategy can therefore be encoded as a "DNA" string of 243 digits between 0 and 6 (corresponding to the action Robby should take in each of the 243 possible situations).
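To make this encoding concrete, here is a minimal sketch of how a situation maps to a position in the DNA string. The names used here (situ_index, the sample dna, the sample situation) are purely illustrative; the Robot class later in this article builds the equivalent lookup dictionary itself.

# Minimal sketch: index a 243-digit policy string by a five-cell situation.
# Each situation is a string over {'w' (wall), 'o' (empty), 'x' (garbage)} in the
# order up, right, down, left, current -- the same convention used later on.
from itertools import product

con = ['w', 'o', 'x']
situations = [''.join(s) for s in product(con, repeat=5)]   # 3**5 = 243 situations
situ_index = {s: i for i, s in enumerate(situations)}

dna = '0123456' * 34 + '01234'       # a hypothetical 243-digit policy string
situation = 'owoxx'                  # wall to the right, garbage to the left and on the current square
action = dna[situ_index[situation]]  # a digit 0-6 encoding the action to take
print(len(situations), action)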
Method
The optimization steps for any GA are as follows:
Generate a "population" of initial random solutions to the problem
Evaluate each individual's "fitness" according to how well it solves the problem
Let the fittest solutions "reproduce" and pass "genetic" material on to the next generation
Repeat steps 2 and 3 until we are left with a population of optimized solutions
In our task, the first generation of Robbys is initialized with random DNA strings (corresponding to random control strategies). The robots are then simulated in randomly generated grid worlds and their performance is observed.
Fitness
A robot's fitness depends on how much garbage it picks up in n moves and how many times it bumps into a wall. In our example, the robot earns 10 points for every piece of garbage it picks up and loses 5 points every time it hits a wall. The robots then "mate" with probability proportional to their fitness (that is, robots that pick up a lot of garbage are more likely to reproduce), and a new generation is born.
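As a quick sanity check of this scoring scheme (the numbers below are made up for illustration only):

# A robot that picks up 20 pieces of garbage and hits a wall 4 times in one run:
rubbish_collected = 20
wall_hits = 4
fitness = rubbish_collected * 10 + wall_hits * (-5)  # +10 per pickup, -5 per wall hit
print(fitness)  # 180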
Mating
There are several different ways to implement mating. In Mitchell's version, she randomly splices the two parents' DNA strands and joins the pieces together to create a child for the next generation. In my implementation, I assign each gene randomly from one of the two parents (that is, for each of the 243 genes, I flip a coin to decide which parent it is inherited from).
For example, using my method, the first 10 genes of two parents and a possible child might look like this:
Parent 1: 1440623161
Parent 2: 2430661132
Child: 2440621161
Mutation
Another concept from natural selection that we replicate with this algorithm is mutation. Although the vast majority of a child's genes are inherited from its parents, I also allow a small probability that a gene mutates (that is, is assigned at random). This mutation rate gives us the ability to explore new possibilities.
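Here is a minimal sketch of coin-flip crossover plus mutation, applied to the 10-digit example strings above. The mutation rate of 0.05 is purely illustrative; the full 243-gene version appears in the Robot class below.

import numpy as np

p1 = '1440623161'
p2 = '2430661132'
m_rate = 0.05  # illustrative mutation rate

# inherit each gene from a randomly chosen parent
child = ''.join(np.random.choice([a, b]) for a, b in zip(p1, p2))
# occasionally replace a gene with a random digit 0-6
child = ''.join(str(np.random.randint(7)) if np.random.rand() < m_rate else g for g in child)
print(child)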
Python implementation
The first step is to import the required packages and set the parameters for this task. I have chosen these parameters as a starting point, but they can be adjusted, and I encourage you to experiment with them.
"" Import package "" import numpy as np from tqdm.notebook import tqdm "set parameters" # Simulation Settings pop_size = 200# number of robots per generation num_breeders = 100 # number of robots capable of mating per generation num_gen = 400 # Total algebraic iter_per_sim = 100 # number of garbage collection simulations per robot moves_per_iter = 200 # machine The number of movements that people can do per simulation # Grid setting rubbish_prob = 0.5 # probability of garbage in each grid grid_size = 10 # 0 grid size (except walls) # Evolutionary setting wall_penalty =-5 # fit point no_rub_penalty =-1 # deducted points for picking up garbage in empty blocks rubbish_score = 10 # probability of variation mutation_rate = 0.01for picking up garbage in empty squares
Next, we define a class for the grid-world environment. We use the labels 'o', 'x' and 'w' to represent each cell, corresponding to an empty cell, a cell containing garbage, and a wall, respectively.
class Environment:
    """
    Class for representing a grid environment full of garbage.
    Each cell can be:
        'o': empty
        'x': garbage
        'w': wall
    """
    def __init__(self, p=rubbish_prob, g_size=grid_size):
        self.p = p            # probability that a cell contains garbage
        self.g_size = g_size  # grid size, excluding walls

        # initialise the grid and randomly assign garbage
        self.grid = np.random.choice(['o', 'x'], size=(self.g_size+2, self.g_size+2), p=(1 - self.p, self.p))

        # set the outer cells to walls
        self.grid[:, [0, self.g_size+1]] = 'w'
        self.grid[[0, self.g_size+1], :] = 'w'

    def show_grid(self):
        # print the grid in its current state
        print(self.grid)

    def remove_rubbish(self, i, j):
        # remove garbage from the specified cell (i, j)
        if self.grid[i, j] == 'o':  # cell is already empty
            return False
        else:
            self.grid[i, j] = 'o'
            return True

    def get_pos_string(self, i, j):
        # return a string representing the cells "visible" to a robot in cell (i, j)
        return self.grid[i-1, j] + self.grid[i, j+1] + self.grid[i+1, j] + self.grid[i, j-1] + self.grid[i, j]
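As a quick, illustrative check (assuming the parameter block above has been run), we can build an environment and look at what a robot standing on the top-left interior cell would "see":

env = Environment()
env.show_grid()                  # 12x12 array: the 10x10 interior plus a wall border
print(env.get_pos_string(1, 1))  # five characters: up, right, down, left, current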
Next, we create a class to represent our robots. This class includes methods for performing actions, calculating fitness, and generating new DNA from a pair of parent robots.
class Robot:
    """
    Class for representing a garbage-collecting robot
    """
    def __init__(self, p1_dna=None, p2_dna=None, m_rate=mutation_rate, w_pen=wall_penalty,
                 nr_pen=no_rub_penalty, r_score=rubbish_score):
        self.m_rate = m_rate              # mutation rate
        self.wall_penalty = w_pen         # penalty for hitting a wall
        self.no_rub_penalty = nr_pen      # penalty for picking up in an empty cell
        self.rubbish_score = r_score      # reward for picking up garbage
        self.p1_dna = p1_dna              # DNA of parent 1
        self.p2_dna = p2_dna              # DNA of parent 2

        # generate a dictionary to look up the gene index from a situation string
        con = ['w', 'o', 'x']  # wall, empty, garbage
        self.situ_dict = dict()
        count = 0
        for up in con:
            for right in con:
                for down in con:
                    for left in con:
                        for pos in con:
                            self.situ_dict[up+right+down+left+pos] = count
                            count += 1

        # initialise the DNA
        self.get_dna()

    def get_dna(self):
        # initialise the robot's DNA string
        if self.p1_dna is None:
            # generate DNA randomly when there are no parents
            self.dna = ''.join([str(x) for x in np.random.randint(7, size=243)])
        else:
            self.dna = self.mix_dna()

    def mix_dna(self):
        # generate the robot's DNA from its parents' DNA
        mix_dna = ''.join([np.random.choice([self.p1_dna, self.p2_dna])[i] for i in range(243)])

        # add mutations
        for i in range(243):
            if np.random.rand() > 1 - self.m_rate:
                mix_dna = mix_dna[:i] + str(np.random.randint(7)) + mix_dna[i+1:]

        return mix_dna

    def simulate(self, n_iterations, n_moves, debug=False):
        # simulate garbage collection
        tot_score = 0
        for it in range(n_iterations):
            self.score = 0  # fitness score
            self.envir = Environment()
            self.i, self.j = np.random.randint(1, self.envir.g_size+1, size=2)  # randomly assign initial position
            if debug:
                print('before')
                print('start position:', self.i, self.j)
                self.envir.show_grid()
            for move in range(n_moves):
                self.act()
            tot_score += self.score
            if debug:
                print('after')
                print('end position:', self.i, self.j)
                self.envir.show_grid()
                print('score:', self.score)
        return tot_score / n_iterations  # average score over n iterations

    def act(self):
        # perform an action based on the DNA and the robot's position
        post_str = self.envir.get_pos_string(self.i, self.j)  # robot's current situation
        gene_idx = self.situ_dict[post_str]                   # index into the DNA for the current situation
        act_key = self.dna[gene_idx]                          # read the action from the DNA
        if act_key == '5':
            # random move
            act_key = np.random.choice(['0', '1', '2', '3'])

        if act_key == '0':
            self.mv_up()
        elif act_key == '1':
            self.mv_right()
        elif act_key == '2':
            self.mv_down()
        elif act_key == '3':
            self.mv_left()
        elif act_key == '6':
            self.pickup()

    def mv_up(self):
        # move up
        if self.i == 1:
            self.score += self.wall_penalty
        else:
            self.i -= 1

    def mv_right(self):
        # move right
        if self.j == self.envir.g_size:
            self.score += self.wall_penalty
        else:
            self.j += 1

    def mv_down(self):
        # move down
        if self.i == self.envir.g_size:
            self.score += self.wall_penalty
        else:
            self.i += 1

    def mv_left(self):
        # move left
        if self.j == 1:
            self.score += self.wall_penalty
        else:
            self.j -= 1

    def pickup(self):
        # pick up garbage
        success = self.envir.remove_rubbish(self.i, self.j)
        if success:
            # garbage successfully picked up
            self.score += self.rubbish_score
        else:
            # no garbage in the current cell
            self.score += self.no_rub_penalty
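As a quick, illustrative check, we can create a first-generation robot with random DNA and evaluate it over a handful of short simulations (the iteration and move counts below are arbitrary; pass debug=True to print the grids before and after each run):

rob = Robot()
print(len(rob.dna))              # 243 genes, one per situation
avg_score = rob.simulate(5, 50)  # 5 random grids, 50 moves each
print('average score:', avg_score)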
Finally, it's time to run the genetic algorithm. In the code below, we generate an initial population of robots and let natural selection run its course. I should mention that there are certainly faster ways to implement this algorithm (for example, using parallelization), but for the purposes of this tutorial I sacrificed speed for clarity.
# initial population
pop = [Robot() for x in range(pop_size)]
results = []

# run the evolution
for i in tqdm(range(num_gen)):
    scores = np.zeros(pop_size)

    # loop over all robots
    for idx, rob in enumerate(pop):
        # run garbage-collection simulations and calculate fitness
        score = rob.simulate(iter_per_sim, moves_per_iter)
        scores[idx] = score

    results.append([scores.mean(), scores.max()])  # keep the mean and max of each generation

    best_robot = pop[scores.argmax()]  # keep the best robot

    # limit the number of robots allowed to mate
    inds = np.argpartition(scores, -num_breeders)[-num_breeders:]  # indices of the top robots by fitness
    subpop = []
    for idx in inds:
        subpop.append(pop[idx])
    scores = scores[inds]

    # square and normalise the scores
    norm_scores = (scores - scores.min()) ** 2
    norm_scores = norm_scores / norm_scores.sum()

    # create the next generation of robots
    new_pop = []
    for child in range(pop_size):
        # select parents with probability proportional to their normalised fitness
        p1, p2 = np.random.choice(subpop, p=norm_scores, size=2, replace=False)
        new_pop.append(Robot(p1.dna, p2.dna))

    pop = new_pop
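Once the loop finishes, best_robot holds the fittest robot from the final generation. A quick, illustrative way to watch it in action on a single fresh grid, using the debug flag defined in the Robot class above:

# Watch the best robot collect garbage on one new grid; debug=True prints the grid
# before and after, the start and end positions, and the score.
final_score = best_robot.simulate(1, moves_per_iter, debug=True)
print('score on this grid:', final_score)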
Although at first most robots pick up no garbage and constantly bump into walls, within a few generations we begin to see some simple strategies emerge (such as "if you are on a square with garbage, pick it up" and "if you are next to a wall, don't move into it"). After a few hundred iterations, we are left with a generation of incredible garbage-collecting geniuses!
Results
Over 400 generations of robots, we were able to "evolve" a successful garbage-collection strategy.
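The original chart is not reproduced here, but here is a minimal sketch of how such a chart could be recreated from the results list collected above, assuming matplotlib is available (it is not used elsewhere in this tutorial):

import matplotlib.pyplot as plt

mean_scores = [r[0] for r in results]
max_scores = [r[1] for r in results]
plt.plot(mean_scores, label='mean fitness')
plt.plot(max_scores, label='max fitness')
plt.xlabel('generation')
plt.ylabel('fitness')
plt.legend()
plt.show()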
To assess the quality of the evolved control strategy, I manually created a benchmark strategy containing a few intuitively reasonable rules (a sketch of one possible implementation follows the list):
If there is garbage in the current square, pick it up
If garbage is visible in an adjacent square, move to that square
If you are next to a wall, move in the opposite direction
Otherwise, move randomly
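The article does not show how this benchmark was implemented, so the following is only one plausible encoding of those rules as a function over the five-character situation string (up, right, down, left, current), reusing the Robot class's situation-to-index dictionary to build a 243-digit benchmark DNA string:

def benchmark_action(situ):
    # situ is a five-character string: up, right, down, left, current
    up, right, down, left, pos = situ
    if pos == 'x':
        return '6'  # pick up garbage in the current square
    moves = {'up': '0', 'right': '1', 'down': '2', 'left': '3'}
    opposite = {'up': '2', 'right': '3', 'down': '0', 'left': '1'}
    neighbours = list(zip(['up', 'right', 'down', 'left'], [up, right, down, left]))
    for name, cell in neighbours:
        if cell == 'x':
            return moves[name]  # move towards visible garbage
    for name, cell in neighbours:
        if cell == 'w':
            return opposite[name]  # move away from the (first) adjacent wall
    return '5'  # otherwise move randomly

bench = Robot()  # reuse its situation-to-index dictionary
bench.dna = ''.join(benchmark_action(s) for s in sorted(bench.situ_dict, key=bench.situ_dict.get))
print('benchmark fitness:', bench.simulate(iter_per_sim, moves_per_iter))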
On average, this benchmark strategy achieved a fitness of 426.9, while our final "evolved" robot achieved an average fitness of 475.9.
Strategy analysis
The coolest thing about this optimization approach is that it can find counter-intuitive solutions. Not only did the robots learn sensible rules that a human might design, they also spontaneously came up with strategies that a human might never consider. One sophisticated technique that emerged was the use of "markers" to overcome short-sightedness and lack of memory.
For example, if a robot is on a square containing garbage and can also see garbage on the squares to its east and west, a naive strategy would be to immediately pick up the garbage on the current square and then move to one of the squares with garbage. The problem with this strategy is that once the robot moves (say, to the west), it has no way of remembering that there is another piece of garbage to the east. To overcome this problem, we watched our evolved robots perform the following steps:
Move west (leaving the garbage on the current square as a marker)
Pick up the garbage on the west square and move back east (the marker garbage is visible)
Pick up the marker garbage and move east
Pick up the last piece of garbage
Another example of counter-intuitive strategies emerging from this kind of optimization comes from OpenAI, which used reinforcement learning, a more sophisticated optimization method, to teach agents to play hide-and-seek. There, too, the agents learn "human" strategies at first, but eventually discover surprising new solutions.
Thank you for reading. That concludes "how to optimize garbage collection with genetic algorithms in Python"; after working through this article, you should have a deeper understanding of the approach.