Table of Contents
Fetching ...

Learning Semantics-aware Search Operators for Genetic Programming

Piotr Wyrwiński, Krzysztof Krawiec

TL;DR

The paper tackles the problem of rugged fitness landscapes in genetic programming by introducing NEON, a semantics-aware search operator that uses a graph neural network to guide expansion and grafting of subprograms. NEON maintains a library of promising subprograms, expands candidates using domain-aware semantics, and selects high-potential components via a saliency-guided mechanism, rather than relying solely on fitness. Experiments on symbolic regression benchmarks show NEON improves success rates and yields smaller final expressions compared to traditional tree-based GP and ablated variants, demonstrating the benefit of infusing search with problem-specific semantic information. The approach is modular and extensible, suggesting broad applicability to other DSLs and domains where partial solutions can be meaningfully expanded and recombined.

Abstract

Fitness landscapes in test-based program synthesis are known to be extremely rugged, with even minimal modifications of programs often leading to fundamental changes in their behavior and, consequently, fitness values. Relying on fitness as the only guidance in iterative search algorithms like genetic programming is thus unnecessarily limiting, especially when combined with purely syntactic search operators that are agnostic about their impact on program behavior. In this study, we propose a semantics-aware search operator that steers the search towards candidate programs that are valuable not only actually (high fitness) but also only potentially, i.e. are likely to be turned into high-quality solutions even if their current fitness is low. The key component of the method is a graph neural network that learns to model the interactions between program instructions and processed data, and produces a saliency map over graph nodes that represents possible search decisions. When applied to a suite of symbolic regression benchmarks, the proposed method outperforms conventional tree-based genetic programming and the ablated variant of the method.

Learning Semantics-aware Search Operators for Genetic Programming

TL;DR

The paper tackles the problem of rugged fitness landscapes in genetic programming by introducing NEON, a semantics-aware search operator that uses a graph neural network to guide expansion and grafting of subprograms. NEON maintains a library of promising subprograms, expands candidates using domain-aware semantics, and selects high-potential components via a saliency-guided mechanism, rather than relying solely on fitness. Experiments on symbolic regression benchmarks show NEON improves success rates and yields smaller final expressions compared to traditional tree-based GP and ablated variants, demonstrating the benefit of infusing search with problem-specific semantic information. The approach is modular and extensible, suggesting broad applicability to other DSLs and domains where partial solutions can be meaningfully expanded and recombined.

Abstract

Fitness landscapes in test-based program synthesis are known to be extremely rugged, with even minimal modifications of programs often leading to fundamental changes in their behavior and, consequently, fitness values. Relying on fitness as the only guidance in iterative search algorithms like genetic programming is thus unnecessarily limiting, especially when combined with purely syntactic search operators that are agnostic about their impact on program behavior. In this study, we propose a semantics-aware search operator that steers the search towards candidate programs that are valuable not only actually (high fitness) but also only potentially, i.e. are likely to be turned into high-quality solutions even if their current fitness is low. The key component of the method is a graph neural network that learns to model the interactions between program instructions and processed data, and produces a saliency map over graph nodes that represents possible search decisions. When applied to a suite of symbolic regression benchmarks, the proposed method outperforms conventional tree-based genetic programming and the ablated variant of the method.

Paper Structure

This paper contains 23 sections, 2 figures, 3 tables.

Figures (2)

  • Figure 1: An example of graph expansion performed by NEON. The subtree $s$ selected from the source program is converted to a graph representation. The graph is then expanded by a layer of nodes that combines $s$ with the available operations and subtrees of $s$. Then, the GNN is queried on the expanded graph and decorates the expanded nodes with saliency values, which are used to appoint the programs to be added to the library (see Sec. \ref{['sec:expander']}).
  • Figure 2: The architecture of the GNN used for saliency estimation in NEON; $n$: the number of examples in the dataset that specifies the SR problem instance; $N$: the number of message passing iterations and of the GAT layers of the model.