A Functional Analysis Approach to Symbolic Regression
Kirill Antonov, Roman Kalkreuth, Kaifeng Yang, Thomas Bäck, Niki van Stein, Anna V Kononova
TL;DR
This work reframes symbolic regression (SR) as a norm-minimization problem in a Hilbert space $\mathbb{F}$ of functions on the training set and introduces Fourier Tree Growing (FTG), an SR algorithm inspired by functional analysis (FA). FTG iteratively builds a linearly independent set of atom functions $v_i$ from elementary operators and computes coefficients $\alpha_i$ by projecting the target function $F$ onto their span using Gram-matrix computations, which yields a Fourier-like expansion of $F$. Empirically, FTG substantially outperforms traditional tree-based genetic programming (GP) on classical one-dimensional SR benchmarks and on a large-scale polynomial (LSP) benchmark. The experiments also identify Gram-matrix conditioning as a key numerical bottleneck, which motivates future FA-based refinements and hybrid GP-FTG approaches. The work thus offers a new theoretical lens and a practical algorithm for advancing SR and explainable machine learning through functional-analytic methods.
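The projection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the inner product on $\mathbb{F}$ is the plain sum over training points, and the function name `project_onto_atoms`, the three monomial atoms, and the quadratic target are hypothetical choices made only for illustration.

```python
import numpy as np

def project_onto_atoms(atoms, X, y):
    """Coefficients of the orthogonal projection of y onto span(atoms).

    Assumption: <f, g> is the sum of f(x) * g(x) over the training inputs X.
    """
    # Column i holds atom v_i evaluated on the training inputs.
    V = np.column_stack([v(X) for v in atoms])
    G = V.T @ V                    # Gram matrix, G[i, j] = <v_i, v_j>
    b = V.T @ y                    # right-hand side, b[i] = <v_i, F>
    alpha = np.linalg.solve(G, b)  # requires linearly independent atoms
    residual = y - V @ alpha       # part of F outside the span
    return alpha, residual

# Hypothetical example: the target lies exactly in the span of the atoms.
X = np.linspace(-1.0, 1.0, 20)
y = 2.0 * X**2 - 3.0 * X + 1.0
atoms = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]
alpha, residual = project_onto_atoms(atoms, X, y)
```

Solving the Gram system directly only works while the atoms stay linearly independent and the Gram matrix stays well conditioned, which is where the conditioning bottleneck mentioned above enters.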
Abstract
Symbolic regression (SR) poses a significant challenge for randomized search heuristics because it requires synthesizing expressions for input-output mappings. Although traditional genetic programming (GP) algorithms have achieved success in various domains, they exhibit limited performance when tree-based representations are used for SR. To address these limitations, we introduce a novel SR approach called Fourier Tree Growing (FTG) that draws insights from functional analysis. This new perspective enables us to perform optimization directly in a function space, thus avoiding the manipulation of intricate symbolic expressions. Our proposed algorithm exhibits significant performance improvements over traditional GP methods on a range of classical one-dimensional benchmarking problems. To identify and explain limiting factors of GP and FTG, we perform experiments on a large-scale polynomial benchmark with high-order polynomials up to degree 100. To the best of the authors' knowledge, this work is the first application of functional analysis to SR problems. The superior performance of the proposed algorithm and insights into the limitations of GP open the way for further advancing GP for SR and related areas of explainable machine learning.
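One way to see why high-degree polynomial targets can be numerically delicate for a Gram-based method is to look at how the Gram matrix condition number behaves as the degree grows. The sketch below is an illustration only: it assumes monomial atoms $1, x, \dots, x^d$ on equally spaced points in $[-1, 1]$, which need not match the atoms FTG constructs, and the helper name `gram_condition_number` is hypothetical.

```python
import numpy as np

def gram_condition_number(degree, n_points=50):
    """Condition number of the Gram matrix of monomials up to `degree`.

    Assumption: atoms are 1, x, ..., x^degree sampled on equally spaced
    points in [-1, 1]; the inner product is the sum over those points.
    """
    X = np.linspace(-1.0, 1.0, n_points)
    # Columns are x^0, x^1, ..., x^degree evaluated on X.
    V = np.vander(X, degree + 1, increasing=True)
    G = V.T @ V
    return np.linalg.cond(G)

cond_low = gram_condition_number(5)
cond_high = gram_condition_number(20)
print(f"degree 5:  cond(G) = {cond_low:.3e}")
print(f"degree 20: cond(G) = {cond_high:.3e}")
```

Under these assumptions the condition number rises sharply with the degree, so coefficients obtained from a direct solve lose accuracy long before degree 100 is reached.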
