Active Symbolic Discovery of Ordinary Differential Equations via Phase Portrait Sketching
Nan Jiang, Md Nasim, Yexiang Xue
TL;DR
The paper addresses the challenge of discovering symbolic ODEs from trajectory data when fixed training sets lead to overfitting in chaotic systems. It introduces APPS, an active framework that sketches phase portraits to identify informative regions in phase space and then samples batches of near-neighbor initial conditions, with a Transformer-based decoder generating candidate ODEs and a REINFORCE-trained data-query loop. By evaluating candidates on region-specific phase portraits and using NMSE-based rewards, APPS consistently outperforms passive baselines on Strogatz and ODEBase datasets under noiseless, noisy, and irregular-time settings. The approach reduces data requirements while improving accuracy and ranking of predicted ODEs, offering a scalable path for active discovery of dynamical laws in complex systems.
Abstract
The symbolic discovery of Ordinary Differential Equations (ODEs) from trajectory data plays a pivotal role in AI-driven scientific discovery. Existing symbolic methods predominantly rely on fixed, pre-collected training datasets, which often results in suboptimal performance, as demonstrated in our case study in Figure 1. Drawing inspiration from active learning, we investigate strategies to query informative trajectory data that can enhance the evaluation of predicted ODEs. However, the butterfly effect in dynamical systems means that small variations in initial conditions can lead to drastically different trajectories, so conventional active learning would need to store vast quantities of trajectory data. To address this, we introduce Active Symbolic Discovery of Ordinary Differential Equations via Phase Portrait Sketching (APPS). Instead of directly selecting individual initial conditions, APPS first identifies an informative region within the phase space and then samples a batch of initial conditions from this region. Compared to traditional active learning methods, APPS reduces the burden of maintaining a large amount of data. Extensive experiments demonstrate that APPS consistently discovers more accurate ODE expressions than baseline methods using passively collected datasets.
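The region-then-batch idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the helper names (`sample_region`, `sketch`, `disagreement`, `nmse`), the Euler rollouts, the box-shaped regions, the toy Duffing-like system, and the rule of picking the region where two candidate ODEs disagree most are all assumptions made for illustration. NMSE is taken here as mean squared error divided by the variance of the reference trajectories, which may differ from the paper's exact definition.

```python
import numpy as np

def sample_region(center, width, batch_size, rng):
    """Draw a batch of initial conditions uniformly from an axis-aligned box."""
    center = np.asarray(center, dtype=float)
    return rng.uniform(center - width / 2, center + width / 2,
                       size=(batch_size, center.size))

def sketch(f, x0_batch, dt=0.01, steps=20):
    """Short Euler rollouts: a coarse sketch of the phase portrait in a region."""
    x = np.array(x0_batch, dtype=float)
    states = [x.copy()]
    for _ in range(steps):
        x = x + dt * f(x)
        states.append(x.copy())
    return np.stack(states, axis=1)  # shape: (batch, steps + 1, dim)

def nmse(pred, truth):
    """Mean squared error normalized by the variance of the reference data."""
    return float(np.mean((pred - truth) ** 2) / np.var(truth))

# Hypothetical ground truth (stands in for the queried system) and two candidates.
def true_ode(x):
    return np.stack([x[:, 1], -x[:, 0] - 0.5 * x[:, 0] ** 3], axis=1)

candidates = {
    "A": lambda x: np.stack([x[:, 1], -x[:, 0]], axis=1),  # misses the cubic term
    "B": true_ode,                                          # matches the truth
}

rng = np.random.default_rng(0)
regions = [(0.0, 0.0), (2.0, 0.0)]  # assumed candidate region centers
width, batch_size = 0.5, 32

def disagreement(center):
    """Assumed informativeness score: how much the candidates' sketches differ."""
    x0 = sample_region(center, width, batch_size, rng)
    return np.mean((sketch(candidates["A"], x0) - sketch(candidates["B"], x0)) ** 2)

# Step 1: pick the region where the candidates are easiest to tell apart.
best = max(regions, key=disagreement)

# Step 2: query a batch of nearby initial conditions there and score candidates.
x0 = sample_region(best, width, batch_size, rng)
truth = sketch(true_ode, x0)
scores = {name: nmse(sketch(f, x0), truth) for name, f in candidates.items()}
```

In this toy setup the two candidates are nearly indistinguishable near the origin, where the cubic term is negligible, so trajectories queried there would say little about which expression is correct. Sampling a batch from the region around `(2.0, 0.0)` separates them clearly, which is the intuition behind querying regions rather than individual initial conditions.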
