Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning
Chenglu Sun, Shuo Shen, Wenzhi Tao, Deyi Xue, Zixia Zhou
TL;DR
Symbolic regression often fails to recover accurate expressions when data are contaminated with noise. The paper proposes Noise-Resilient Symbolic Regression (NRSR), which combines a Noise-Resilient Gating Module (NGM) for input filtering with a reinforcement-learning-based expression generator and a Mixed Path Entropy (MPE) bonus to promote diverse exploration. The approach uses a PPO-based policy with a reward $R(\tau)=1/(1+\mathrm{NRMSE})$ and a joint entropy objective $L(\theta)=L_p(\theta)+\alpha H_\tau(\pi_\theta)+\beta H(\pi_\theta)$ to drive robust symbol selection. Empirical results on the Nguyen benchmarks show that NRSR achieves state-of-the-art performance on high-noise data and strong results on clean data, outperforming multiple baselines in recovery rate (RR), explored-expression number (EEN), and NMSE. The work demonstrates the effectiveness and modularity of NGM and MPE for robust SR and suggests future directions, such as distributed RL, to further improve exploration in large search spaces.
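To make the reward concrete, the following is a minimal sketch of $R(\tau)=1/(1+\mathrm{NRMSE})$ for a candidate expression evaluated on a dataset. The summary does not define the normalization, so this sketch assumes NRMSE is the RMSE divided by the standard deviation of the targets. The function names `nrmse` and `reward` and the toy data are illustrative and not taken from the paper.

```python
import numpy as np

def nrmse(y_true, y_pred):
    """RMSE divided by the standard deviation of the targets.

    Assumption: the normalization constant is std(y_true); the source
    text does not specify which normalization is used.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / np.std(y_true)

def reward(y_true, y_pred):
    """R(tau) = 1 / (1 + NRMSE); equals 1 for a perfect fit, tends to 0 as error grows."""
    return 1.0 / (1.0 + nrmse(y_true, y_pred))

# Toy data from the Nguyen-1 target x^3 + x^2 + x (evenly spaced inputs for illustration).
x = np.linspace(-1.0, 1.0, 20)
y = x**3 + x**2 + x

exact_reward = reward(y, x**3 + x**2 + x)  # candidate identical to the target
poor_reward = reward(y, x)                 # candidate missing two terms
```

Because the reward is bounded in $(0, 1]$, a candidate that matches the target exactly scores 1, while poorer fits score strictly less, which keeps the policy-gradient signal on a fixed scale regardless of the magnitude of the targets.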
Abstract
Symbolic regression (SR) has emerged as a pivotal technique for uncovering the intrinsic information within data and enhancing the interpretability of AI models. However, current state-of-the-art (SOTA) SR methods struggle to correctly recover symbolic expressions from high-noise data. To address this issue, we introduce a novel noise-resilient SR (NRSR) method capable of recovering expressions from high-noise data. Our method combines a reinforcement learning (RL) approach with a purpose-built noise-resilient gating module (NGM) to learn symbol selection policies. The gating module dynamically filters out meaningless information from high-noise data, making the SR process highly resilient to noise. We also design a mixed path entropy (MPE) bonus term in the RL process to increase the exploration capability of the policy. Experimental results demonstrate that our method significantly outperforms several popular baselines on benchmarks with high-noise data. Furthermore, our method also achieves SOTA performance on benchmarks with clean data, showcasing its robustness and efficacy in SR tasks.
