ParFam -- (Neural Guided) Symbolic Regression Based on Continuous Global Optimization

Philipp Scholl; Katharina Bieker; Hillary Hauger; Gitta Kutyniok

TL;DR

This paper introduces ParFam, a symbolic regression method that recasts the discrete search for interpretable equations as a continuous optimization problem using a parametric family of rational-function blocks and physically informed base functions. It demonstrates strong expressivity despite structural constraints and employs a basin-hopping global optimizer with a local BFGS refinement to reliably locate sparse, accurate models; a theoretical analysis shows high coverage for low-complexity formulas. An extension, DL-ParFam, uses a pre-trained Set Transformer to predict ParFam model parameters, achieving up to roughly 100× speedups while maintaining competitive accuracy, and enabling end-to-end differentiable inference in many cases. On SRBench, ParFam and DL-ParFam reach state-of-the-art performance across ground-truth datasets like Feynman and Strogatz, with DL-ParFam delivering substantial runtime advantages and robust cross-domain generalization in synthetic-to-real settings. The work highlights practical, interpretable SR with a clear tradeoff between model flexibility and optimization tractability, and points to future work in regularization, parametrization, and scalability for higher-dimensional problems.

Abstract

The problem of symbolic regression (SR) arises in many different applications, such as identifying physical laws or deriving mathematical equations that describe the behavior of financial markets from given data. Various methods exist to address SR, often based on genetic programming. However, these methods are usually complicated and involve many hyperparameters. In this paper, we present our new approach, ParFam, which uses parametric families of suitable symbolic functions to translate the discrete symbolic regression problem into a continuous one, resulting in a more straightforward setup than current state-of-the-art methods. In combination with a global optimizer, this approach yields a highly effective method for SR. We theoretically analyze the expressivity of ParFam and demonstrate its performance with extensive numerical experiments on the common SR benchmark suite SRBench, showing that we achieve state-of-the-art results. Moreover, we present an extension, DL-ParFam, which incorporates a pre-trained transformer network to guide ParFam, accelerating the optimization process by up to two orders of magnitude. Our code and results can be found at https://github.com/Philipp238/parfam.
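To make the idea of continuous global optimization over a parametric family concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy three-parameter family `a*sin(w*x) + c*x`, a hypothetical target function, and SciPy's `basinhopping` with BFGS as the local minimizer (the optimizer combination named in the TL;DR). The names `model`, `loss`, and `theta0` are illustrative only.

```python
import numpy as np
from scipy.optimize import basinhopping

# Synthetic data from a hypothetical target y = 2*sin(3x) + 0.5x (assumption).
x = np.linspace(-2.0, 2.0, 200)
y = 2.0 * np.sin(3.0 * x) + 0.5 * x


def model(theta, x):
    """Toy parametric family a*sin(w*x) + c*x (far smaller than ParFam's)."""
    a, w, c = theta
    return a * np.sin(w * x) + c * x


def loss(theta):
    """Mean squared error between the model and the data."""
    return float(np.mean((model(theta, x) - y) ** 2))


# Starting point for the search; chosen arbitrarily.
theta0 = np.array([1.0, 1.0, 0.0])

# Basin-hopping alternates random perturbations of theta with local BFGS runs
# and keeps the best local minimum it has seen. Runs are randomized, so the
# recovered parameters can differ between executions.
result = basinhopping(
    loss,
    theta0,
    niter=50,
    minimizer_kwargs={"method": "BFGS"},
)

print("parameters:", result.x)
print("final loss:", result.fun)
```

The frequency parameter `w` makes the loss landscape non-convex, which is why a purely local method started at `theta0` may stall and a global strategy is used on top of it.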

Paper Structure (53 sections, 7 theorems, 62 equations, 12 figures, 15 tables, 1 algorithm)

Introduction
Our Contributions
Related work
Methods
ParFam
The Structure of the Parametric Family
Optimization
Expressivity of ParFam
DL-ParFam
Benchmark
Competitors
Metrics
Hyperparameters
Results
Discussion and Conclusion
...and 38 more sections

Key Result

Theorem 1

For $(c_l)_{l\in\mathbb{N}}$, the number of unary-binary trees expressible by ParFam with complexity $l$, it holds that with some constants $x_1, v_0, v_1\in\mathbb{R}$ depending on the number of binary operators $b$, number of unary operators $k$, and number of variables $n$.

Figures (12)

Figure 1: The architecture of ParFam: ParFam can be interpreted as a residual neural network with one hidden layer. Instead of linear weights between the layers, it applies rational functions $Q_i(\cdot)=p_{d_i^1}(\cdot) /p_{d_i^2}(\cdot)$. Furthermore, the standard basis functions are substituted by physically relevant functions like $\sin, \exp, \sqrt{}$, etc. The learnable parameters are the coefficients of $p_{d_i^1}$ and $p_{d_i^2}$.
Figure 2: Examples for the different kinds of expression trees counted by $b_l$, $c_l$, and $d_l$.
Figure 3: DL-ParFam first applies the pre-trained neural network to input data $(x_i,y_i)_{i=1,\ldots,N}$, which outputs the model parameters for ParFam: the degrees of the polynomials used in $Q_1,\ldots,Q_{k+1}$ and the basis functions $g_1,\ldots,g_k$. Afterwards, ParFam can run using these settings to find the best parameters $\theta^*$ and, therefore, identify the best fitting function $f_\theta$.
Figure 4: Mean results on the SRBench ground-truth problems. Following SRBench terminology, training time refers to the time each algorithm requires to compute a result for a specific problem, which corresponds to inference time for pre-trained methods.
Figure 5: Median $R^2$, formula complexity, and training time on the 77 black-box problems from SRBench (La Cava et al., 2021) with at most 10 independent variables. An asterisk marks symbolic regression methods.
...and 7 more figures
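
The architecture described in the captions of Figures 1 and 3 can be sketched as a short forward pass. This is an illustrative reading, not the authors' code: it assumes a scalar input $x$ and the composition $f_\theta(x) = Q_{k+1}(x, g_1(Q_1(x)), \ldots, g_k(Q_k(x)))$, with each $Q_i$ a ratio of polynomials. The dictionary encoding of polynomials and the names `poly`, `rational`, and `parfam_forward` are hypothetical.

```python
import math


def poly(coeffs, point):
    """Evaluate a multivariate polynomial given as {exponent tuple: coefficient}."""
    return sum(
        c * math.prod(v ** e for v, e in zip(point, exps))
        for exps, c in coeffs.items()
    )


def rational(num, den, point):
    """Rational function Q(point) = num(point) / den(point)."""
    return poly(num, point) / poly(den, point)


def parfam_forward(x, inner, bases, outer):
    """Compute Q_{k+1}(x, g_1(Q_1(x)), ..., g_k(Q_k(x))) for a scalar x."""
    # Hidden layer: each base function g_i is applied to its own rational Q_i(x).
    hidden = [g(rational(num, den, (x,))) for g, (num, den) in zip(bases, inner)]
    # Output layer: x is passed through unchanged alongside the hidden values,
    # which is the residual (skip) connection mentioned in the Figure 1 caption.
    num, den = outer
    return rational(num, den, (x, *hidden))


# Hypothetical coefficients encoding f(x) = x * sin(2x) / (1 + x^2).
inner = [({(1,): 2.0}, {(0,): 1.0})]                  # Q_1(x) = 2x / 1
bases = [math.sin]                                    # g_1 = sin
outer = ({(1, 1): 1.0}, {(0, 0): 1.0, (2, 0): 1.0})   # (x * h_1) / (1 + x^2)

value = parfam_forward(0.5, inner, bases, outer)
print(value)
```

In this encoding the learnable parameters are exactly the dictionary values, that is, the polynomial coefficients of the numerators and denominators, which matches the caption's statement about what is learned.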

Theorems & Definitions (12)

Theorem 1
Theorem 2
Lemma 1
Lemma 2
Theorem 3
Lemma 3
Theorem 4
proof
proof
proof
...and 2 more
