ParFam -- (Neural Guided) Symbolic Regression Based on Continuous Global Optimization
Philipp Scholl, Katharina Bieker, Hillary Hauger, Gitta Kutyniok
TL;DR
This paper introduces ParFam, a symbolic regression (SR) method that recasts the discrete search for interpretable equations as a continuous optimization problem over a parametric family built from rational-function blocks and physically informed base functions. Despite its structural constraints, the family remains highly expressive: a theoretical analysis shows high coverage of low-complexity formulas. A basin-hopping global optimizer with local BFGS refinement is used to reliably locate sparse, accurate models. An extension, DL-ParFam, uses a pre-trained Set Transformer to predict ParFam model parameters, achieving speedups of up to roughly 100× while maintaining competitive accuracy and enabling end-to-end differentiable inference in many cases. On SRBench, ParFam and DL-ParFam reach state-of-the-art performance on ground-truth datasets such as Feynman and Strogatz, with DL-ParFam delivering substantial runtime advantages and robust cross-domain generalization in synthetic-to-real settings. The work highlights practical, interpretable SR with a clear trade-off between model flexibility and optimization tractability, and points to future work on regularization, parametrization, and scalability to higher-dimensional problems.
Abstract
The problem of symbolic regression (SR) arises in many different applications, such as identifying physical laws or deriving mathematical equations that describe the behavior of financial markets from given data. Various methods exist to address SR, often based on genetic programming. However, these methods are usually complicated and involve many hyperparameters. In this paper, we present our new approach, ParFam, which uses parametric families of suitable symbolic functions to translate the discrete symbolic regression problem into a continuous one, resulting in a more straightforward setup than current state-of-the-art methods. Combined with a global optimizer, this approach yields a highly effective method for SR. We theoretically analyze the expressivity of ParFam and demonstrate its performance with extensive numerical experiments on the common SR benchmark suite SRBench, showing that we achieve state-of-the-art results. Moreover, we present an extension, DL-ParFam, which incorporates a pre-trained transformer network to guide ParFam, accelerating the optimization process by up to two orders of magnitude. Our code and results can be found at https://github.com/Philipp238/parfam.
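To illustrate the core idea of replacing a discrete search over expressions with continuous optimization over the parameters of a fixed family, the following is a minimal sketch, not the implementation from the repository above. The one-dimensional toy family f(x) = a0 + a1*x + a2*sin(b0 + b1*x), the function names `toy_family`, `objective`, and `fit`, the L1 penalty weight `lam`, and the sample data are all assumptions made for illustration; ParFam's actual parametrization is more general. The global optimizer is SciPy's `basinhopping` with BFGS as the local minimizer.

```python
import numpy as np
from scipy.optimize import basinhopping


def toy_family(theta, x):
    """Toy parametric family f(x) = a0 + a1*x + a2*sin(b0 + b1*x) (an assumption)."""
    a0, a1, a2, b0, b1 = theta
    return a0 + a1 * x + a2 * np.sin(b0 + b1 * x)


def objective(theta, x, y, lam=1e-3):
    """Mean squared error plus an L1 penalty that favors sparse parameters."""
    residual = toy_family(theta, x) - y
    return np.mean(residual ** 2) + lam * np.sum(np.abs(theta))


def fit(x, y, n_iter=50):
    """Run basin-hopping with BFGS local refinement, starting from all zeros."""
    theta0 = np.zeros(5)
    result = basinhopping(
        objective,
        theta0,
        niter=n_iter,
        minimizer_kwargs={"method": "BFGS", "args": (x, y)},
    )
    # result.x holds the best parameters found, result.fun the objective there.
    return result.x, result.fun


if __name__ == "__main__":
    # Hypothetical data generated from a formula that lies inside the toy family.
    x = np.linspace(-3.0, 3.0, 200)
    y = 0.5 * x + 2.0 * np.sin(1.5 * x)
    theta, loss = fit(x, y)
    print("parameters:", np.round(theta, 3))
    print("objective:", loss)
```

Basin-hopping is stochastic, so the recovered parameters can differ between runs, and the sketch makes no claim about how often the generating formula is recovered. Parameters driven close to zero by the penalty correspond to terms that can be dropped from the final symbolic expression.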
