Neuro-Evolutionary Approach to Physics-Aware Symbolic Regression

Jiří Kubalík, Robert Babuška

TL;DR

The paper addresses the challenge of learning accurate, sparse analytic models from data by combining evolutionary search over neural network topologies with gradient-based weight optimization. The EN4SR framework introduces a master topology, subtopologies, a weight memory, and memory-guided genetic operators, so that each topology needs only short backpropagation updates while promising models are progressively refined. Empirical results across four problems, including a real-world quadcopter dynamics identification task, show that EN4SR outperforms purely NN-based SR and GP-based baselines in both accuracy and computational efficiency, thanks to its hybrid exploration-exploitation strategy. The work advances symbolic regression by integrating prior knowledge and a memory-based evolution mechanism, paving the way for more robust physics-aware SR with scalable search.

Abstract

Symbolic regression is a technique that can automatically derive analytic models from data. Traditionally, symbolic regression has been implemented primarily through genetic programming that evolves populations of candidate solutions sampled by genetic operators, crossover and mutation. More recently, neural networks have been employed to learn the entire analytical model, i.e., its structure and coefficients, using regularized gradient-based optimization. Although this approach tunes the model's coefficients better, it is prone to premature convergence to suboptimal model structures. Here, we propose a neuro-evolutionary symbolic regression method that combines the strengths of evolutionary-based search for optimal neural network (NN) topologies with gradient-based tuning of the network's parameters. Due to the inherent high computational demand of evolutionary algorithms, it is not feasible to learn the parameters of every candidate NN topology to full convergence. Thus, our method employs a memory-based strategy and population perturbations to enhance exploitation and reduce the risk of being trapped in suboptimal NNs. In this way, each NN topology can be trained using only a short sequence of backpropagation iterations. The proposed method was experimentally evaluated on three real-world test problems and has been shown to outperform other NN-based approaches regarding the quality of the models obtained.
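
The interplay described in the abstract (evolutionary search over topologies, short gradient-based training per candidate, and a memory that lets training resume where it stopped) can be illustrated with a toy sketch. This is not the authors' EN4SR implementation: the "master topology" is assumed here to be a small fixed library of basis units combined linearly, a subtopology is a boolean mask over those units, and only bit-flip mutation is used (no crossover or population perturbation). The names `features`, `short_train`, `evolve`, the target function, and all hyperparameters are hypothetical.

```python
import numpy as np

def features(x):
    # Hypothetical "master topology": every candidate unit the search may switch on.
    return np.stack([np.ones_like(x), x, x**2, x**3, np.sin(x)], axis=1)

def short_train(F, y, mask, w, steps=20, lr=0.1):
    # A short run of gradient descent on the MSE; only active units are updated.
    m = mask.astype(float)
    w = w * m
    for _ in range(steps):
        err = F @ w - y
        grad = 2.0 * F.T @ err / len(y)
        w = w - lr * grad * m
    loss = float(np.mean((F @ w - y) ** 2))
    return w, loss

def evolve(x, y, generations=60, pop_size=8, penalty=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    F = features(x)
    n = F.shape[1]
    memory = {}  # weight memory: mask (as tuple) -> (weights, loss)
    pop = [rng.random(n) < 0.5 for _ in range(pop_size)]
    for _ in range(generations):
        # Mutation: each parent produces one child with a single unit toggled.
        children = []
        for mask in pop:
            child = mask.copy()
            i = rng.integers(n)
            child[i] = not child[i]
            children.append(child)
        # Short training, resumed from the memory when the mask was seen before.
        scored = {}
        for mask in pop + children:
            key = tuple(bool(b) for b in mask)
            w0 = memory[key][0] if key in memory else np.zeros(n)
            memory[key] = short_train(F, y, mask, w0)
            # Fitness: error plus a small sparsity penalty on active units.
            scored[key] = memory[key][1] + penalty * sum(key)
        # Selection: keep the best distinct masks.
        best_keys = sorted(scored, key=scored.get)[:pop_size]
        pop = [np.array(k) for k in best_keys]
    best = best_keys[0]
    return np.array(best), memory[best][0], memory[best][1]

# Hypothetical target used only for illustration.
x = np.linspace(-1.0, 1.0, 200)
y = 2.0 * x - 0.5 * x**2
mask, w, loss = evolve(x, y)
```

Because surviving masks keep their weights in `memory`, each generation adds only 20 gradient steps per candidate, yet masks that persist across generations accumulate training. This mirrors, in a much simplified form, the idea that no single topology has to be trained to full convergence.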

Paper Structure

This paper contains 18 sections, 5 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Master topology with two hidden layers as proposed for N4SR. The blue lines mark links with learnable weights. The red lines represent skip connections leading from the source units of layer $k-1$ to the copy units in layer $k$. For simplicity, this scheme does not show the bias links leading to every $z$-node.
  • Figure 2: Reference models, training data $D_t$, interpolation domain $\mathbb{D}_i$, and extrapolation domain $\mathbb{D}_e$ of the three test problems.
  • Figure 3: Convergence trajectories of models with 6 and 7 active units observed in 30 independent runs of EN4SR and EN4SR-base on the magic problem.
  • Figure 4: Quadcopter model simulation on unseen test data. Blue solid line: measured data, red dashed line: model.