Bayesian data-driven discovery of partial differential equations with variable coefficients
Aoxue Chen, Yifan Du, Liyao Mars Gao, Guang Lin
TL;DR
This work tackles data-driven discovery of partial differential equations with spatiotemporally varying coefficients by casting the problem as Bayesian grouped sparse regression. It introduces threshold Bayesian group Lasso with spike-and-slab priors (tBGL-SS) and a block Gibbs sampler, augmented with an approximate MCMC thresholding strategy to achieve scalable inference. A Bayesian total error bar is proposed for model selection, accounting for the uncertainty that classical criteria neglect. Across four benchmark PDEs and varying noise levels, the method demonstrates improved coefficient recovery, credible uncertainty estimates, and robust model selection, offering a practical tool for physics-informed discovery from noisy data.
Abstract
The discovery of partial differential equations (PDEs) is an essential task in applied science and engineering. However, data-driven discovery of PDEs is generally challenging, primarily because the discovered equation is sensitive to noise and model selection is complex. In this work, we propose an advanced Bayesian sparse learning algorithm for discovering PDEs with variable coefficients, particularly when the coefficients are spatially or temporally dependent. Specifically, we apply threshold Bayesian group Lasso regression with a spike-and-slab prior (tBGL-SS) and use a Gibbs sampler for Bayesian posterior estimation of the PDE coefficients. This approach not only enhances the robustness of point estimation with valid uncertainty quantification but also reduces the computational burden of Bayesian inference by integrating coefficient thresholds as an approximate MCMC method. Moreover, from the quantified uncertainties, we propose a Bayesian total error bar criterion for model selection, which outperforms classic metrics including the root mean square error and the Akaike information criterion. The capability of this method is illustrated by the discovery of several classical benchmark PDEs with spatially or temporally varying coefficients from solution data obtained from reference simulations. In the experiments, we show that the tBGL-SS method is more robust than the baseline methods under noisy conditions and provides better model selection criteria along the regularization path.
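To make the grouped sparse regression setting more concrete, the following is a minimal sketch, not the tBGL-SS sampler itself. It assumes a hypothetical setup in which each spatial location has its own library matrix of candidate terms, and each candidate term forms one group made up of its coefficients across all locations. A plain least-squares fit with hard thresholding on the group magnitudes stands in for the Bayesian group Lasso with a spike-and-slab prior; the function names, the synthetic data, and the tolerance `tol` are all illustrative assumptions.

```python
import numpy as np

def fit_active(thetas, targets, active):
    """Least-squares fit at each location, restricted to the active terms."""
    n_loc, _, n_terms = thetas.shape
    coefs = np.zeros((n_loc, n_terms))
    for i in range(n_loc):
        sol, *_ = np.linalg.lstsq(thetas[i][:, active], targets[i], rcond=None)
        coefs[i, active] = sol
    return coefs

def grouped_threshold_regression(thetas, targets, tol, max_iter=10):
    """Drop whole terms (groups) whose RMS coefficient across locations is below tol."""
    active = np.ones(thetas.shape[2], dtype=bool)
    for _ in range(max_iter):
        coefs = fit_active(thetas, targets, active)
        group_rms = np.sqrt(np.mean(coefs**2, axis=0))
        keep = active & (group_rms >= tol)
        # Stop if nothing changes, or if thresholding would remove every term.
        if not keep.any() or np.array_equal(keep, active):
            break
        active = keep
    # Refit so the returned coefficients are consistent with the final support.
    return fit_active(thetas, targets, active), active

# Synthetic stand-in for a PDE library (assumption: random features, not real derivatives).
rng = np.random.default_rng(0)
n_loc, n_samples, n_terms = 20, 50, 5
x = np.linspace(0.0, 1.0, n_loc)
thetas = rng.normal(size=(n_loc, n_samples, n_terms))

# Two active terms with spatially varying coefficients; the rest are zero.
true_coefs = np.zeros((n_loc, n_terms))
true_coefs[:, 1] = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)
true_coefs[:, 3] = 0.5 + x

targets = np.einsum("ist,it->is", thetas, true_coefs)
targets += 0.01 * rng.normal(size=targets.shape)  # small additive noise

coefs, active = grouped_threshold_regression(thetas, targets, tol=0.1)
print("active terms:", np.flatnonzero(active))
print("max abs coefficient error:", np.max(np.abs(coefs - true_coefs)))
```

Thresholding on a group-level statistic rather than on individual entries is what lets a term stay in the model even where its coefficient passes near zero at some locations. A Bayesian treatment such as the one described in the abstract would additionally return posterior uncertainty for each coefficient, which this point-estimate sketch does not provide.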
