Neuro-Evolutionary Approach to Physics-Aware Symbolic Regression
Jiří Kubalík, Robert Babuška
TL;DR
The paper addresses the challenge of learning accurate, sparse analytic models from data by combining evolutionary search over neural network topologies with gradient-based weight optimization. The EN4SR framework introduces a master topology, subtopologies, a weight memory, and memory-guided genetic operators, so that each topology needs only a short run of backpropagation updates while promising models are progressively refined. Empirical results across four problems, including a real-world quadcopter dynamics identification task, show that EN4SR outperforms purely NN-based SR and GP-based baselines in both accuracy and computational efficiency, which the authors attribute to its hybrid exploration-exploitation strategy. The work advances symbolic regression by integrating prior knowledge with a memory-based evolution mechanism, paving the way for more robust physics-aware SR with scalable search.
Abstract
Symbolic regression is a technique that automatically derives analytic models from data. Traditionally, it has been implemented primarily through genetic programming, which evolves populations of candidate solutions generated by the genetic operators of crossover and mutation. More recently, neural networks have been employed to learn the entire analytic model, i.e., its structure and coefficients, using regularized gradient-based optimization. Although this approach tunes the model's coefficients better, it is prone to premature convergence to suboptimal model structures. Here, we propose a neuro-evolutionary symbolic regression method that combines the strengths of evolutionary search for optimal neural network (NN) topologies with gradient-based tuning of the network's parameters. Because evolutionary algorithms are inherently computationally demanding, it is not feasible to train the parameters of every candidate NN topology to full convergence. Our method therefore employs a memory-based strategy and population perturbations to enhance exploitation and reduce the risk of being trapped in suboptimal NNs. In this way, each NN topology can be trained using only a short sequence of backpropagation iterations. The proposed method was experimentally evaluated on three real-world test problems and shown to outperform other NN-based approaches in the quality of the models obtained.
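
To make the interplay of evolutionary topology search, short gradient training, and a weight memory concrete, the following toy sketch may help. It is not the authors' EN4SR implementation. It assumes a drastically simplified "master topology" consisting of four weighted basis units, represents each subtopology as a binary mask over those units, and uses a single shared weight vector as the memory. The target function, the sparsity penalty, and all names (`short_train`, `evolve`, `memory`) are hypothetical choices made for illustration.

```python
import math
import random

random.seed(0)

# Toy data: y = 2*sin(x) - 3*cos(2x) on a uniform periodic grid (illustrative target).
N = 64
xs = [-math.pi + 2.0 * math.pi * j / N for j in range(N)]
ys = [2.0 * math.sin(x) - 3.0 * math.cos(2.0 * x) for x in xs]

# Simplified "master topology": one weighted unit per candidate basis function.
NAMES = ["sin(x)", "cos(x)", "sin(2x)", "cos(2x)"]
UNITS = [math.sin, math.cos, lambda x: math.sin(2.0 * x), lambda x: math.cos(2.0 * x)]
FEATS = [[u(x) for u in UNITS] for x in xs]  # N rows, one column per unit
K = len(UNITS)


def predict(w, row):
    return sum(wi * fi for wi, fi in zip(w, row))


def short_train(mask, start, steps=10, lr=0.3):
    """Run a few gradient steps on the MSE; only units active in `mask` are updated."""
    w = [s * m for s, m in zip(start, mask)]
    for _ in range(steps):
        grad = [0.0] * K
        for row, y in zip(FEATS, ys):
            r = predict(w, row) - y
            for k in range(K):
                grad[k] += 2.0 * r * row[k] / N
        w = [wk - lr * gk * mk for wk, gk, mk in zip(w, grad, mask)]
    return w


def loss(mask, w, sparsity=0.01):
    """MSE plus a penalty per active unit (an assumed stand-in for regularization)."""
    mse = sum((predict(w, row) - y) ** 2 for row, y in zip(FEATS, ys)) / N
    return mse + sparsity * sum(mask)


def evolve(generations=60, pop_size=8):
    memory = [0.0] * K  # weight memory shared by all subtopologies
    population = [[random.randint(0, 1) for _ in range(K)] for _ in range(pop_size)]
    best = (float("inf"), None, None)  # (loss, mask, weights)
    for _ in range(generations):
        scored = []
        for mask in population:
            w = short_train(mask, memory)  # warm start from the memory
            scored.append((loss(mask, w), mask, w))
        scored.sort(key=lambda t: t[0])
        if scored[0][0] < best[0]:
            best = scored[0]
            # Remember the tuned weights of the units used by the new best model.
            memory = [wk if mk else mem
                      for wk, mk, mem in zip(best[2], best[1], memory)]
        # Keep the better half and add one bit-flip mutant per survivor.
        parents = [m for _, m, _ in scored[: pop_size // 2]]
        children = []
        for p in parents:
            child = list(p)
            k = random.randrange(K)
            child[k] = 1 - child[k]  # toggle one unit on or off
            children.append(child)
        population = parents + children
    return best
```

Calling `evolve()` returns a `(loss, mask, weights)` tuple for the best subtopology found. Because every candidate starts its short training run from the shared memory, the weights of a promising model keep improving across generations even though no single run trains to convergence. This is the effect the abstract attributes to the memory-based strategy. The sketch omits crossover and population perturbations.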
