Bayesian Inference for Missing Physics

Arno Strouwen

Abstract

Model-based approaches for (bio)process systems often suffer from incomplete knowledge of the underlying physical, chemical, or biological laws. Universal differential equations, which embed neural networks within differential equations, have emerged as powerful tools to learn this missing physics from experimental data. However, neural networks are inherently opaque, motivating their post-processing via symbolic regression to obtain interpretable mathematical expressions. Genetic algorithm-based symbolic regression is a popular approach for this post-processing step, but provides only point estimates and cannot quantify the confidence we should place in a discovered equation. We address this limitation by applying Bayesian symbolic regression, which uses Reversible Jump Markov Chain Monte Carlo to sample from the posterior distribution over symbolic expression trees. This approach naturally quantifies uncertainty in the recovered model structure. We demonstrate the methodology on a Lotka-Volterra predator-prey system and then show how a well-designed experiment leads to lower uncertainty in a fed-batch bioreactor case study.

Paper Structure (10 sections, 11 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Lotka-Volterra system. Solid lines: true state values. Thick dots: noisy measurements of states. Dashed lines: predicted states after replacing missing physics with a neural network. Dotted lines: predicted states after replacing missing physics with 10 draws from Bayesian symbolic regression.
  • Figure 2: Expression tree representing $(2x_1+1) + \sin(x_2)$. Circular nodes contain operators ($+$, $\mathop{\mathrm{lt}}\nolimits$, $\sin$). Rectangular nodes are terminals: rounded rectangles denote features ($x_1$, $x_2$), and plain rectangles denote constants. The $\mathop{\mathrm{lt}}\nolimits$ node implements the linear transformation $\mathop{\mathrm{lt}}\nolimits(x_1,2,1) = 2x_1 + 1$.
  • Figure 3: Bioreactor system. States and measurements.
  • Figure 4: Fed-batch bioreactor system. Blue solid line: missing physics, the Monod equation. Thick orange dots: predictions of the missing physics by the neural network. Green dashed lines: predicted missing physics by the top 10 trees generated by Bayesian symbolic regression.
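To make the expression tree of Figure 2 concrete, the following is a minimal sketch of how such a tree could be represented and evaluated. The class names (`Feature`, `Constant`, `Lt`, `Sin`, `Add`) and the `evaluate` function are hypothetical names chosen for illustration; they are not taken from the paper or from any particular symbolic regression library, and the sketch covers only the three operators that appear in the figure.

```python
import math
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Feature:
    """Terminal node holding an input feature, e.g. x1 or x2."""
    name: str


@dataclass(frozen=True)
class Constant:
    """Terminal node holding a numeric constant."""
    value: float


@dataclass(frozen=True)
class Lt:
    """Linear transformation node: lt(child, a, b) = a * child + b."""
    child: "Node"
    scale: Constant
    offset: Constant


@dataclass(frozen=True)
class Sin:
    """Unary operator node: sin(child)."""
    child: "Node"


@dataclass(frozen=True)
class Add:
    """Binary operator node: left + right."""
    left: "Node"
    right: "Node"


Node = Union[Feature, Constant, Lt, Sin, Add]


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Recursively evaluate a tree given feature values in `env`."""
    if isinstance(node, Feature):
        return env[node.name]
    if isinstance(node, Constant):
        return node.value
    if isinstance(node, Lt):
        return node.scale.value * evaluate(node.child, env) + node.offset.value
    if isinstance(node, Sin):
        return math.sin(evaluate(node.child, env))
    if isinstance(node, Add):
        return evaluate(node.left, env) + evaluate(node.right, env)
    raise TypeError(f"unknown node type: {type(node).__name__}")


# The tree from Figure 2: (2*x1 + 1) + sin(x2)
tree = Add(
    Lt(Feature("x1"), Constant(2.0), Constant(1.0)),
    Sin(Feature("x2")),
)

value = evaluate(tree, {"x1": 0.5, "x2": 0.0})
```

With $x_1 = 0.5$ and $x_2 = 0$, the tree evaluates to $(2 \cdot 0.5 + 1) + \sin(0) = 2$. A sampler over tree structures, such as the Reversible Jump Markov Chain Monte Carlo scheme described in the abstract, would propose changes to trees of this kind; that sampling step is not shown here.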