Deep Learning-Based Operators for Evolutionary Algorithms

Eliad Shem-Tov, Moshe Sipper, Achiya Elyasaf

TL;DR

The paper introduces two domain-independent evolutionary operators driven by deep learning: Deep Neural Crossover (DNC) for genetic algorithms and BERT Mutation for genetic programming (GP). DNC uses an encoder-decoder trained with deep reinforcement learning (DRL), together with a pointer network, to learn correlations between genes and guide offspring construction; a transfer-learning strategy reduces the cost of training for each new domain. BERT Mutation applies masked language modeling to GP trees: it masks nodes and uses reinforcement learning to replace them with alternatives expected to improve fitness, training online on population data. In the reported experiments, DNC (especially in multi-parent mode) outperforms standard crossover operators on Graph Coloring and Bin Packing, while BERT Mutation consistently achieves the best test RMSE on the symbolic regression benchmarks, at the cost of higher per-generation time. Together, the operators deliver performance gains across problem domains, which suggests practical value for automated, domain-agnostic optimization workflows.

Abstract

We present two novel domain-independent genetic operators that harness the capabilities of deep learning: a crossover operator for genetic algorithms and a mutation operator for genetic programming. Deep Neural Crossover leverages the capabilities of deep reinforcement learning and an encoder-decoder architecture to select offspring genes. BERT Mutation masks multiple GP-tree nodes and then tries to replace these masks with nodes that will most likely improve the individual's fitness. We show the efficacy of both operators through experimentation.
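
The mask-and-replace step that the abstract attributes to BERT Mutation can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the primitive set `ARITY`, the prefix-notation encoding of the tree, and the arity-preserving replacement rule are not taken from the paper, and `uniform_policy` is only a placeholder for the learned model, which the paper trains with reinforcement learning to favor fitness-improving nodes.

```python
import random

# Hypothetical primitive set (an assumption): symbol -> arity, 0 = terminal.
ARITY = {"add": 2, "mul": 2, "sub": 2, "sin": 1, "cos": 1, "x": 0, "y": 0, "1": 0}
MASK = "[MASK]"


def mask_nodes(tree, n_masks, rng):
    """Replace n_masks randomly chosen nodes with MASK; return masked tree and positions."""
    positions = sorted(rng.sample(range(len(tree)), n_masks))
    masked = [MASK if i in positions else tok for i, tok in enumerate(tree)]
    return masked, positions


def uniform_policy(masked, position, candidates, rng):
    """Placeholder for the learned model: pick any candidate uniformly at random."""
    return rng.choice(candidates)


def mask_and_replace(tree, n_masks, policy=uniform_policy, seed=None):
    """Mask n_masks nodes of a prefix-notation tree and fill each mask via the policy."""
    rng = random.Random(seed)
    masked, positions = mask_nodes(tree, n_masks, rng)
    child = list(masked)
    for pos in positions:
        # Assumption: only symbols with the same arity as the original node
        # are offered, so the result is always a well-formed tree.
        arity = ARITY[tree[pos]]
        candidates = [sym for sym, a in ARITY.items() if a == arity]
        child[pos] = policy(child, pos, candidates, rng)
    return child


def is_valid_prefix(tree):
    """Check that a prefix-notation token list encodes exactly one complete tree."""
    need = 1  # number of subtrees still expected
    for tok in tree:
        if need == 0:
            return False  # extra tokens after the tree is complete
        need += ARITY[tok] - 1
    return need == 0
```

For example, `mask_and_replace(["add", "mul", "x", "y", "sin", "x"], n_masks=3, seed=0)` returns a tree of the same shape in which three nodes have been resampled. Swapping `uniform_policy` for a model that scores candidates given the surrounding tree is where a trained network would plug in.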

Paper Structure

This paper contains 13 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Encoder architecture. Parents' genes are embedded into a shared latent feature space and subsequently processed by an LSTM network to produce an embedded representation of the parents.
  • Figure 2: Decoder architecture. The decoder input is the encoder's output, i.e., the embedded representation of the two parents, which is subsequently processed and decoded by an LSTM network to produce a child genome. At each decoding step, the output is sent to a pointer network that chooses a gene from one of the parents and returns an embedded representation of the sampled gene. Thus, the child's genome is constructed gene by gene, from left to right, starting from an empty genome.
  • Figure 3: Fitness value of the best individual vs. generation for each operator on the zeroin.i.2 graph-coloring problem. The fitness value is averaged over 20 runs. The black dots and triangles along each line mark the 5- and 10-minute runtime cutoffs, respectively.
  • Figure 4: Illustration of Masked Language Modeling using BERT (Joshi et al., 2020): "Super Bowl 50 was an American football game to determine the champion" becomes "Super Bowl 50 was # # # # to determine the champion," where # represents a mask. The model is then trained to predict the masked tokens, thereby inferring the missing words. This approach enables the model to learn bidirectional contextual representations by incorporating both left and right contexts during training.
  • Figure 5: A GP tree.
  • ...and 2 more figures
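
The left-to-right child construction described in the Figure 2 caption can be sketched as a simple loop. This is a hedged illustration, not the paper's implementation: the function names, the `scorer` interface, and the assumption that the pointer at position `i` chooses between the two parents' genes at that same position are all introduced here. The learned encoder, LSTM decoder, and pointer network are collapsed into a single `scorer` callable that returns two logits.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a short list of logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def constant_scorer(parent_a, parent_b, child_so_far, position):
    """Placeholder for the learned decoder and pointer: equal logits for both parents."""
    return [0.0, 0.0]


def pointer_crossover(parent_a, parent_b, scorer=constant_scorer, seed=None):
    """Build a child gene by gene, sampling at each step which parent to copy from."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have equal length")
    rng = random.Random(seed)
    parents = (parent_a, parent_b)
    child, choices = [], []  # the child starts as an empty genome
    for i in range(len(parent_a)):
        # The scorer sees the partial child, mirroring a decoder that
        # conditions on the genes emitted so far.
        probs = softmax(scorer(parent_a, parent_b, child, i))
        k = rng.choices([0, 1], weights=probs)[0]
        child.append(parents[k][i])
        choices.append(k)
    return child, choices
```

With `constant_scorer` the loop reduces to ordinary uniform crossover; the point of a learned scorer is to make the per-gene choice depend on the parents and on the genes already placed. The returned `choices` list records which parent supplied each gene, which is the kind of trace a reinforcement-learning update would need.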