Transformer Semantic Genetic Programming for Symbolic Regression
Philipp Anthes, Dominik Sobania, Franz Rothlauf
TL;DR
Standard genetic programming (GP) exploits solution semantics inefficiently when applied to symbolic regression (SR). This paper addresses the problem with Transformer Semantic Genetic Programming (TSGP), a semantic-aware variation operator based on a pretrained encoder–decoder transformer. In a model-building phase, the transformer is trained on synthetic SR problems: a $k$-NN similarity search over the semantic distance $SD(s(f_i),s(f_o))$, where $s(f)$ denotes the semantics of a function $f$, pairs each input function $f_i$ with semantically related output functions $f_o$ to form the input–output training pairs. During GP search, the transformer samples offspring from a parent function at a controllable target semantic distance $SD^d$, while syntax control ensures valid tree structures. The model is trained once and generalizes to unseen problems. Empirical results on five black-box SR datasets show that TSGP achieves comparable or superior $RMSE$ to stdGP, SLIM_GSGP, DSR, and DAE-GP, produces significantly smaller solutions than SLIM_GSGP, and generates variations with higher semantic similarity. The work demonstrates that a semantic-aware transformer can effectively guide semantic exploration in GP, offering faster convergence and improved interpretability without inflating solution size.
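The pairing step of the model-building phase can be illustrated with a minimal sketch. It assumes that the semantics $s(f)$ of a function is its vector of outputs on a fixed set of sample inputs and that $SD$ is the Euclidean distance between two such vectors; the summary above does not fix either definition, and the authors may use a normalized or otherwise different measure. The names `semantics`, `semantic_distance`, and `knn_training_pairs` are hypothetical and chosen for illustration only.

```python
import numpy as np

def semantics(f, X):
    """Assumed semantics s(f): outputs of f on the sample inputs X."""
    return np.array([f(x) for x in X], dtype=float)

def semantic_distance(s_a, s_b):
    """Assumed SD: Euclidean distance between two semantics vectors."""
    return float(np.linalg.norm(s_a - s_b))

def knn_training_pairs(functions, X, k):
    """Pair every function f_i with its k semantically nearest neighbours f_o.

    Returns a list of (i, o, SD(s(f_i), s(f_o))) index triples.
    """
    S = np.stack([semantics(f, X) for f in functions])
    # Pairwise Euclidean distances between all semantics vectors.
    D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)  # a function is never its own neighbour
    pairs = []
    for i in range(len(functions)):
        for o in np.argsort(D[i])[:k]:
            pairs.append((i, int(o), float(D[i, o])))
    return pairs

# Toy pool of candidate functions (illustrative, not from the paper).
X = np.linspace(-1.0, 1.0, 5)
pool = [lambda x: x, lambda x: x + 0.1, lambda x: x ** 2, lambda x: -x]
pairs = knn_training_pairs(pool, X, k=1)
```

In this toy pool, `x` and `x + 0.1` end up paired with each other because their output vectors differ by a small constant offset. A brute-force distance matrix is used here for clarity; a large synthetic pool would more plausibly rely on an approximate nearest-neighbour index.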
Abstract
In standard genetic programming (stdGP), solutions are varied by modifying their syntax, with uncertain effects on their semantics. Geometric-semantic genetic programming (GSGP), a popular variant of GP, effectively searches the semantic solution space using variation operations based on linear combinations, although this results in significantly larger solutions. This paper presents Transformer Semantic Genetic Programming (TSGP), a novel and flexible semantic approach that uses a generative transformer model as a search operator. The transformer is trained on synthetic test problems and learns semantic similarities between solutions. Once trained, the model can be used to create offspring solutions with high semantic similarity, even for unseen and unknown problems. Experiments on several symbolic regression problems show that TSGP generates solutions with comparable or even significantly better prediction quality than stdGP, SLIM_GSGP, DSR, and DAE-GP. Like SLIM_GSGP, TSGP is able to create new solutions that are semantically similar without creating solutions of large size. An analysis of the search dynamics reveals that the solutions generated by TSGP are semantically more similar than those generated by the benchmark approaches, allowing a better exploration of the semantic solution space.
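The size growth attributed to GSGP above can be made concrete with a small sketch. It assumes the commonly described geometric semantic crossover for real-valued functions, in which the offspring is the linear combination $r \cdot p_1 + (1-r) \cdot p_2$ with a constant $r \in [0,1]$; the abstract does not specify the operator, so this is an illustration, not the configuration used in the experiments. Expression trees are represented as nested tuples, and all function names are hypothetical.

```python
import numpy as np

def size(expr):
    """Number of nodes in an expression tree given as nested tuples."""
    if isinstance(expr, tuple):
        return 1 + sum(size(child) for child in expr[1:])
    return 1

def evaluate(expr, x):
    """Evaluate a tree over '+', '*', the variable 'x', and constants."""
    if isinstance(expr, tuple):
        op, left, right = expr
        a, b = evaluate(left, x), evaluate(right, x)
        return a + b if op == "+" else a * b
    return x if expr == "x" else float(expr)

def geometric_crossover(p1, p2, r):
    """Assumed GSGP crossover: offspring = r * p1 + (1 - r) * p2."""
    return ("+", ("*", r, p1), ("*", 1.0 - r, p2))

X = np.linspace(-1.0, 1.0, 5)
p1 = ("*", "x", "x")   # x^2, 3 nodes
p2 = ("+", "x", 1.0)   # x + 1, 3 nodes
child = geometric_crossover(p1, p2, 0.5)

# Track tree size over repeated crossover of a tree with itself.
sizes, tree = [size(p1)], p1
for _ in range(3):
    tree = geometric_crossover(tree, tree, 0.5)
    sizes.append(size(tree))
```

The offspring's semantics is exactly the weighted average of the parents' semantics, which is what makes the operator geometric. The cost is that every offspring contains both parents as subtrees, so its size is the sum of the parent sizes plus five extra nodes, and sizes compound from one generation to the next.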
