MEP-Net: Generating Solutions to Scientific Problems with Limited Knowledge by Maximum Entropy Principle

Wuyue Yang, Liangrong Peng, Guojie Li, Liu Hong

TL;DR

MEP-Net tackles the challenge of reconstructing probability distributions from limited information by fusing the Maximum Entropy Principle with neural networks. It trains with a two-term objective that combines a constraint loss enforcing moment data with a fixed-point entropy surrogate that regularizes updates, steering the learned distribution toward the maximum entropy solution $p^*(x) = \frac{1}{Z}\exp\left(\sum_i \lambda_i f_i(x)\right)$, where $Z$ is the normalization constant. The method is validated on problems including unimodal and multimodal distributions, high-dimensional Gaussian mixtures, time-dependent Gaussians, the Schlögl chemical master equation, diffusion in confined domains, and the Allen-Cahn gradient flow, achieving high accuracy and stability. The results show that MEP-Net yields physically meaningful distributions in data-scarce regimes and opens paths for inverse problems and hybrid modeling.
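
The exponential form of the maximum entropy solution quoted above follows from a standard Lagrange-multiplier argument. The sketch below is the textbook derivation, not a reproduction of the paper's own equations; the symbols $m_i$ (the prescribed moment values) and $\lambda_0$ (the multiplier for normalization) are notation assumed here for illustration.

```latex
\begin{align*}
\mathcal{L}[p] &= -\int p(x)\ln p(x)\,dx
  + \lambda_0\left(\int p(x)\,dx - 1\right)
  + \sum_i \lambda_i\left(\int f_i(x)\,p(x)\,dx - m_i\right), \\
\frac{\delta \mathcal{L}}{\delta p(x)} &= -\ln p(x) - 1 + \lambda_0 + \sum_i \lambda_i f_i(x) = 0, \\
p^*(x) &= \frac{1}{Z}\exp\left(\sum_i \lambda_i f_i(x)\right), \qquad
Z = e^{1-\lambda_0} = \int \exp\left(\sum_i \lambda_i f_i(x)\right)dx.
\end{align*}
```

The remaining multipliers $\lambda_i$ are then fixed by requiring $\int f_i(x)\,p^*(x)\,dx = m_i$ for every constraint.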

Abstract

The maximum entropy principle (MEP) offers an effective and unbiased approach to inferring unknown probability distributions when faced with incomplete information, while neural networks provide the flexibility to learn complex distributions from data. This paper proposes a novel neural network architecture, the MEP-Net, which combines the MEP with neural networks to generate probability distributions from moment constraints. We also provide a comprehensive overview of the fundamentals of the maximum entropy principle, its mathematical formulations, and a rigorous justification for its applicability to non-equilibrium systems based on the large deviations principle. Through numerical experiments, we demonstrate that the MEP-Net can be particularly useful in modeling the evolution of probability distributions in biochemical reaction networks and in generating complex distributions from data.

Paper Structure

This paper contains 20 sections, 1 theorem, 69 equations, 6 figures, 2 tables.

Key Result

Proposition A.1

For a $1$-d random variable $X$ with two observables, the mean $\langle X \rangle\equiv \mu$ and the variance $\langle (X - \langle X \rangle)^2\rangle=\langle X^2 \rangle - \langle X \rangle^2 \equiv \sigma^2$. Then if and only if the probability density function is normally distributed with its m
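
The statement above is cut off in the source. Assuming it is the standard result that, among densities with prescribed mean $\mu$ and variance $\sigma^2$, the normal distribution maximizes entropy, the claim can be checked numerically. The following minimal sketch is not the MEP-Net training procedure: it solves the convex dual problem $\min_\lambda \log Z(\lambda) - \sum_i \lambda_i m_i$ on a grid with a damped Newton iteration. The function name `maxent_on_grid`, the grid, and the values of `mu` and `sigma` are all hypothetical choices for illustration.

```python
import numpy as np

def maxent_on_grid(x, features, targets, lam0, iters=50, tol=1e-10):
    """Damped Newton on the convex dual  log Z(lam) - lam . targets."""
    dx = x[1] - x[0]
    F = np.stack([f(x) for f in features])        # feature values, shape (k, n)
    lam = np.asarray(lam0, dtype=float)

    def dual_and_density(lam):
        logw = lam @ F
        shift = logw.max()                        # subtract max to avoid overflow
        w = np.exp(logw - shift)
        Z = w.sum() * dx                          # Riemann-sum normalization
        return np.log(Z) + shift - lam @ targets, w / Z

    val, p = dual_and_density(lam)
    for _ in range(iters):
        mean = (F * p).sum(axis=1) * dx           # E_p[f_i]
        grad = mean - targets                     # gradient of the dual
        if np.linalg.norm(grad) < tol:
            break
        centered = F - mean[:, None]
        hess = (centered * p) @ centered.T * dx   # Cov_p(f_i, f_j)
        step = np.linalg.solve(hess, grad)
        t = 1.0
        while True:                               # halve the step until the dual decreases
            new_val, new_p = dual_and_density(lam - t * step)
            if new_val <= val or t < 1e-8:
                break
            t *= 0.5
        lam, val, p = lam - t * step, new_val, new_p
    return lam, p

# Hypothetical example values: constrain <X> = mu and <X^2> = sigma^2 + mu^2.
mu, sigma = 0.5, 1.2
x = np.linspace(-10.0, 10.0, 4001)
lam, p = maxent_on_grid(
    x,
    [lambda x: x, lambda x: x**2],
    np.array([mu, sigma**2 + mu**2]),
    lam0=[0.0, -0.5],                             # start from a standard normal
)
gauss = np.exp(-(x - mu) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
print("multipliers:", lam)
print("max |p - gauss|:", np.abs(p - gauss).max())
```

Under these assumptions the recovered multipliers should approach $\lambda_1 = \mu/\sigma^2$ and $\lambda_2 = -1/(2\sigma^2)$, the values obtained by expanding the exponent of the Gaussian density.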

Figures (6)

  • Figure 1: A schematic diagram for the MEP-Net. (a) Generation of binomial features $\phi_{i_1,i_2,\ldots,i_d}(\vec{x})$ based on data points. (b) Architecture of the MEP-Net. The inputs $\vec{x}$ and $t$ are scaled and then fed into the neural network. The output is the generated probability distribution $\hat{p}(\vec{x},t)$. (c) and (d) illustrate the generated probability distribution by the MEP-Net compared with the true one, and the evolution of the MSE during the training procedure.
  • Figure 2: Generation of one-dimensional Gaussian mixture distributions by MEP-Net. (a) Unimodal Gaussian distribution with $\mu_1=1$ and $\sigma_1=\frac{1}{5}$. (b) Bimodal Gaussian distribution with $\mu_1=\frac{1}{4},\sigma_1=\frac{1}{14}$ for the first mode and $\mu_2=\frac{1}{2},\sigma_2=\frac{1}{20}$ for the second mode. (c) Reconstruction of a bimodal Gaussian distribution by polynomial moments. (d) Tri-modal Gaussian distribution with $\mu_1=\frac{1}{4},\sigma_1=\frac{1}{25}$ for the first mode, $\mu_2=\frac{1}{2},\sigma_2=\frac{1}{20}$ and $\mu_3=\frac{3}{4},\sigma_3=\frac{1}{20}$ for the second and third modes. (e) Quad-modal Gaussian distribution with means at $\frac{1}{5},\frac{2}{5},\frac{3}{5}$, and $\frac{4}{5}$, and standard deviations of $\frac{1}{14},\frac{1}{20},\frac{1}{20}$, and $\frac{1}{14}$. (f) Beta distribution, $\alpha=\beta=\frac{1}{2}$. (g) Comparison of the reconstruction abilities of different types of moments. Checkmarks ($\checkmark$) indicate successful generations, while crosses ($\times$) indicate unsuccessful ones.
  • Figure 3: 5-dimensional Gaussian mixture distributions and time-dependent Gaussian function. Comparison of (a) the true 5-dimensional Gaussian mixture distribution and (b) the one generated by the MEP-Net, with the contour plots for $(x_1, x_2), (x_2, x_3), (x_3, x_4), (x_4, x_5)$, and $(x_5, x_1)$ listed from left to right. Their absolute errors are given in (c). (d) The true values of the time-dependent Gaussian function (left panel), the values predicted by the MEP-Net (middle panel), and their difference (right panel). (e) Snapshots of the time-dependent Gaussian function at $t=0$, $t=1.1$ and $t=2.0$, respectively.
  • Figure 4: Dynamics of the Schlögl model and generation by the MEP-Net. (a) The mechanism of the Schlögl model and data generated via the Gillespie algorithm. (b) Logarithmic Mean Squared Errors (MSE) of the MEP-Net with and without the entropy loss as a function of the training epoch. (c) Heatmaps for the dynamics predicted by the Schlögl model (left) and by the MEP-Net (center), with their absolute errors shown on the right. (d) Snapshots of the pdfs of the Schlögl model and the MEP-Net generated at $T=0.1$, $T=2.6$, and $T=5.0$, respectively.
  • Figure 5: Particle diffusion in a confined domain. (a) Architecture of the MEP-Net with the entropy loss being replaced by the Rayleighian. (b) Comparison between the approximate solution obtained by the MEP-Net (represented by the dashed line) and the exact solution (represented by the blue solid line) at time instances $t = 0.01, 0.21, 0.41, 0.60, 0.80$, and $1.00$. The spatial domain is defined as $[-1, 1]$ with initial conditions $\eta_0 = 0.5$ and $\eta_1 = 0.45$, and the diffusion coefficient $D = 1.0$.
  • ...and 1 more figure
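
The Figure 4 caption mentions training data generated via the Gillespie algorithm for the Schlögl model. A minimal sketch of such a simulation is given below, assuming the usual form of the model, $A + 2X \rightleftharpoons 3X$ and $B \rightleftharpoons X$, with the populations of $A$ and $B$ held constant. The function name `gillespie_schlogl`, the rate constants, and the initial condition are hypothetical illustrative values and are not taken from the paper.

```python
import numpy as np

def gillespie_schlogl(x0, t_end, c, A, B, rng):
    """Exact stochastic simulation of  A + 2X <-> 3X,  B <-> X  (A, B constant)."""
    c1, c2, c3, c4 = c
    change = np.array([+1, -1, +1, -1])          # change in X for each reaction
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        # Propensities from combinatorial counts of reactant molecules.
        a = np.array([
            c1 * A * x * (x - 1) / 2.0,          # A + 2X -> 3X
            c2 * x * (x - 1) * (x - 2) / 6.0,    # 3X -> A + 2X
            c3 * B,                              # B -> X
            c4 * x,                              # X -> B
        ])
        a0 = a.sum()
        if a0 <= 0.0:                            # no reaction can fire
            break
        t += rng.exponential(1.0 / a0)           # waiting time to the next reaction
        if t > t_end:
            break
        x += change[rng.choice(4, p=a / a0)]     # pick which reaction fires
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

# Hypothetical parameters, chosen only for illustration.
rng = np.random.default_rng(0)
times, states = gillespie_schlogl(
    x0=250, t_end=0.5, c=(3e-7, 1e-4, 1e-3, 3.5), A=1e5, B=2e5, rng=rng
)
print("number of reaction events:", len(times) - 1)
print("final copy number of X:", states[-1])
```

Repeating the simulation over many independent trajectories and histogramming the copy number at fixed times would give empirical distributions and moments of the kind a moment-constrained model could be trained on.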

Theorems & Definitions (5)

  • Proposition A.1
  • proof
  • Remark A.2
  • Remark A.3
  • Remark A.4