[RETRACTED] Evolving Form and Function: Dual-Objective Optimization in Neural Symbolic Regression Networks
Amanda Bertschinger, James Bagrow, Joshua Bongard
TL;DR
The paper addresses generalization in symbolic regression (SR) by measuring form and function separately and proposing SRNE, a dual-objective SR method that evolves neural networks to minimize both a symbolic loss $CE(eq,\hat{eq})$ and a numeric loss $MSE(Y,\hat{Y})$. SRNE pretrains networks with gradient descent and then applies evolutionary weight optimization, producing data-to-equation networks that are reported to outperform state-of-the-art gradient-based neural symbolic regression (GNSR) methods on unseen equations, with near-zero NMSE and near-zero CE on test sets. The key contribution is demonstrating that jointly optimizing symbolic and numeric objectives via neuroevolution yields more accurate and more generalizable equations, which suggests a productive synergy between evolutionary algorithms and gradient-based training for SR. The practical significance lies in automated, interpretable modeling across diverse datasets: the method offers a path toward equation discovery that requires neither manual form specification nor per-dataset retraining.
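To make the two objectives concrete, the following is a minimal sketch of how a symbolic loss $CE(eq,\hat{eq})$ and a numeric loss $MSE(Y,\hat{Y})$ could be computed for a candidate equation. The token vocabulary, the prefix-notation encoding, the helper names, and the use of Pareto dominance to compare candidates are all assumptions made for illustration; the summary above does not specify how SRNE encodes equations or combines the two losses.

```python
import numpy as np

VOCAB = ["x", "+", "*", "sin"]  # hypothetical token vocabulary (assumption)

def cross_entropy(logits, target_ids):
    """Mean token-level cross-entropy between predicted logits and target token ids."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(target_ids)), target_ids].mean())

def mse(y, y_hat):
    """Mean squared error between observed values and an equation's predictions."""
    return float(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2))

def objectives(logits, target_ids, y, y_hat):
    """Return the (symbolic, numeric) objective pair for one candidate."""
    return (cross_entropy(logits, target_ids), mse(y, y_hat))

def dominates(a, b):
    """Pareto dominance: a is no worse on every objective and strictly better on one."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def confident_logits(ids, scale=5.0):
    """Build logits that put most of the probability mass on the given token ids."""
    logits = np.zeros((len(ids), len(VOCAB)))
    logits[np.arange(len(ids)), ids] = scale
    return logits

# Target equation x * x, written in prefix notation as the tokens "*", "x", "x".
x = np.linspace(-1.0, 1.0, 21)
y = x * x
target_ids = np.array([2, 0, 0])

# Candidate A emits "* x x"; candidate B emits "+ x x".
cand_a = objectives(confident_logits([2, 0, 0]), target_ids, y, x * x)
cand_b = objectives(confident_logits([1, 0, 0]), target_ids, y, x + x)
print(cand_a, cand_b, dominates(cand_a, cand_b))
```

In this toy case candidate A matches the target in both form and function, so it is better on both objectives. A scalarized sum of the two losses would be another way to rank candidates; the dominance test is used here only because it avoids choosing a weighting.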
Abstract
[RETRACTED] Data increasingly abound, but distilling their underlying relationships down to something interpretable remains challenging. One approach is genetic programming, which "symbolically regresses" a data set down into an equation. However, symbolic regression (SR) must be trained from scratch for each new dataset. To generalize across datasets, deep learning techniques have been applied to SR. These networks, however, can only be trained using a symbolic objective, in which NN-generated and target equations are compared symbolically. This does not consider the predictive power of the generated equations, which could be measured by a behavioral objective that compares an equation's predictions to the actual data. Here we introduce a method that combines gradient descent and evolutionary computation to yield neural networks that minimize both the symbolic and the behavioral errors of the equations they generate from data. As a result, these evolved networks are shown to generate more symbolically and behaviorally accurate equations than networks trained by state-of-the-art gradient-based neural symbolic regression methods. We hope this method suggests that evolutionary algorithms, combined with gradient descent, can improve SR results by yielding equations with more accurate form and function.
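The two-stage idea in the abstract (gradient descent first, evolutionary computation second) can be illustrated with a deliberately small toy. Everything below is an assumption made for illustration and not the paper's architecture or algorithm: the "network" is a single linear layer that picks one of three hypothetical candidate equations, stage 1 is plain gradient descent on the symbolic loss alone, and stage 2 is a (1+1) hill climber that keeps a mutated weight matrix only if neither the symbolic nor the behavioral loss gets worse.

```python
import numpy as np

# Hypothetical candidate equations; the toy model only has to pick one of them.
EQUATIONS = [lambda x: x, lambda x: x ** 2, np.sin]

x = np.linspace(-2.0, 2.0, 32)
Y = np.stack([f(x) for f in EQUATIONS])  # one dataset per target equation
F = Y / np.sqrt(x.size)                  # scaled data used as input features
targets = np.arange(len(EQUATIONS))      # dataset k was generated by equation k

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def symbolic_loss(W):
    """Cross-entropy between predicted and target equation ids."""
    logp = log_softmax(F @ W.T)
    return float(-logp[np.arange(len(targets)), targets].mean())

def behavioral_loss(W):
    """MSE between the data and the predictions of the equations the model picks."""
    chosen = (F @ W.T).argmax(axis=1)
    Y_hat = np.stack([EQUATIONS[k](x) for k in chosen])
    return float(np.mean((Y - Y_hat) ** 2))

def pretrain(W, steps=200, lr=0.1):
    """Stage 1: gradient descent on the symbolic loss only."""
    onehot = np.eye(len(EQUATIONS))[targets]
    for _ in range(steps):
        probs = np.exp(log_softmax(F @ W.T))
        grad = (probs - onehot).T @ F / len(targets)
        W = W - lr * grad
    return W

def evolve(W, generations=300, sigma=0.05, seed=0):
    """Stage 2: (1+1) hill climber; keep a mutant only if neither loss gets worse."""
    rng = np.random.default_rng(seed)
    best = (symbolic_loss(W), behavioral_loss(W))
    for _ in range(generations):
        child = W + sigma * rng.normal(size=W.shape)
        score = (symbolic_loss(child), behavioral_loss(child))
        if score[0] <= best[0] and score[1] <= best[1]:
            W, best = child, score
    return W

W0 = np.zeros((len(EQUATIONS), x.size))
W_pre = pretrain(W0)
W_evo = evolve(W_pre)
print("pretrained:", symbolic_loss(W_pre), behavioral_loss(W_pre))
print("evolved:   ", symbolic_loss(W_evo), behavioral_loss(W_evo))
```

Note that the behavioral loss here depends on an `argmax` over equations, so it provides no useful gradient with respect to the weights. A mutation-and-selection loop can still use it directly as a selection criterion, which is one plausible reason to pair the two training styles.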
