Learning Semantics-aware Search Operators for Genetic Programming
Piotr Wyrwiński, Krzysztof Krawiec
TL;DR
The paper tackles the problem of rugged fitness landscapes in genetic programming by introducing NEON, a semantics-aware search operator that uses a graph neural network to guide expansion and grafting of subprograms. NEON maintains a library of promising subprograms, expands candidates using domain-aware semantics, and selects high-potential components via a saliency-guided mechanism, rather than relying solely on fitness. Experiments on symbolic regression benchmarks show NEON improves success rates and yields smaller final expressions compared to traditional tree-based GP and ablated variants, demonstrating the benefit of infusing search with problem-specific semantic information. The approach is modular and extensible, suggesting broad applicability to other DSLs and domains where partial solutions can be meaningfully expanded and recombined.
Abstract
Fitness landscapes in test-based program synthesis are known to be extremely rugged, with even minimal modifications of programs often leading to fundamental changes in their behavior and, consequently, their fitness values. Relying on fitness as the only guidance in iterative search algorithms like genetic programming is thus unnecessarily limiting, especially when combined with purely syntactic search operators that are agnostic about their impact on program behavior. In this study, we propose a semantics-aware search operator that steers the search towards candidate programs that are valuable not only in actual terms (high fitness) but also potentially, i.e., likely to be turned into high-quality solutions even if their current fitness is low. The key component of the method is a graph neural network that learns to model the interactions between program instructions and processed data, and produces a saliency map over graph nodes that represents possible search decisions. When applied to a suite of symbolic regression benchmarks, the proposed method outperforms conventional tree-based genetic programming and an ablated variant of the method.
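
The two ideas the abstract relies on, rugged fitness and "potential" value, can be made concrete with a toy symbolic regression task. The sketch below is only an illustration and not the proposed operator: the target function, the test points, and all names (`semantics`, `fitness`, `square`, `grafted`, `flipped`) are assumptions introduced here, and no neural network is involved. It treats a program's semantics as the vector of its outputs on the test inputs, which is the usual reading of the term in genetic programming.

```python
# Toy task (hypothetical, not from the paper): fit f(x) = x*x + x
# on five test points.
XS = [-2.0, -1.0, 0.0, 1.0, 2.0]
TARGET = [x * x + x for x in XS]


def semantics(program):
    """Semantics of a program: the vector of its outputs on the test inputs."""
    return [program(x) for x in XS]


def fitness(program):
    """Mean absolute error against the target (lower is better)."""
    outputs = semantics(program)
    return sum(abs(o - t) for o, t in zip(outputs, TARGET)) / len(XS)


# A partial solution: imperfect fitness, but one step away from the target.
square = lambda x: x * x

# Grafting one extra term onto the partial solution solves the task exactly.
grafted = lambda x: square(x) + x

# A one-symbol change of the perfect program ('+' replaced by '-').
flipped = lambda x: square(x) - x

if __name__ == "__main__":
    for name, prog in [("square", square), ("grafted", grafted), ("flipped", flipped)]:
        print(name, semantics(prog), fitness(prog))
```

On this toy task `grafted` has error 0.0, `square` has error 1.2, and `flipped` has error 2.4. A single-symbol edit therefore turns a perfect program into one that scores worse than the bare subprogram `x * x`, while that subprogram, despite its nonzero error, is one grafting step from an exact solution. Fitness alone does not reveal this; the abstract's saliency map is meant to supply this kind of information.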
