A Periodic Bayesian Flow for Material Generation

Hanlin Wu, Yuxuan Song, Jingjing Gong, Ziyao Cao, Yawen Ouyang, Jianbing Zhang, Hao Zhou, Wei-Ying Ma, Jingjing Liu

TL;DR

The paper tackles crystal generation under periodic symmetry by modeling non-Euclidean variables on the hyper-torus $\mathbb{T}^{3\times N}$. It introduces CrysBFN, a periodic E(3)-equivariant Bayesian flow that uses a von Mises input distribution and an entropy-conditioning mechanism to handle non-additive accuracy, enabling fast, non-autoregressive sampling. The method jointly handles atom types, lattice parameters, and fractional coordinates, achieving state-of-the-art performance on ab initio generation and crystal structure prediction, with orders-of-magnitude improvements in sampling efficiency. This work broadens the applicability of Bayesian flows to non-Euclidean, periodically invariant data and provides a framework that can be extended to other hyper-torus domains.

Abstract

Generative modeling of crystal data distribution is an important yet challenging task due to the unique periodic physical symmetry of crystals. Diffusion-based methods have shown early promise in modeling crystal distribution. More recently, Bayesian Flow Networks (BFNs) were introduced to aggregate noisy latent variables, resulting in a variance-reduced parameter space that has been shown to be advantageous for modeling Euclidean data distributions with structural constraints (Song et al., 2023). Inspired by this, we seek to unlock their potential for modeling variables located on non-Euclidean manifolds, e.g., those within crystal structures, by overcoming challenging theoretical issues. We introduce CrysBFN, a novel crystal generation method built on a periodic Bayesian flow, which differs essentially from the original Gaussian-based BFN by exhibiting non-monotonic entropy dynamics. To realize the periodic Bayesian flow, CrysBFN integrates a new entropy conditioning mechanism, and we empirically demonstrate its significance compared to time-conditioning. Extensive experiments on both crystal ab initio generation and crystal structure prediction tasks demonstrate the superiority of CrysBFN, which consistently achieves new state-of-the-art results on all benchmarks. Surprisingly, we found that CrysBFN enjoys a significant improvement in sampling efficiency (e.g., ~100x speedup, 10 vs. 2000 network forward passes) compared with previous diffusion-based methods on the MP-20 dataset. Code is available at https://github.com/wu-han-lin/CrysBFN.

Paper Structure

This paper contains 25 sections, 7 theorems, 66 equations, 7 figures, 7 tables, 2 algorithms.

Key Result

Proposition 4.1

The probability density function of the Bayesian flow distribution defined by Eqs. (eq:cirflow_equiv) and (eq:cirflow_equiv2) is equivalent to the original definition in Eq. (eq:flow_frac).

Figures (7)

  • Figure 1: Framework of CrysBFN. Left: overview of the training and sampling process. At training time, the network receives $\bm{\theta}_{i-1}$ from the Bayesian flow based on the data distribution, and tries to improve the belief $\bm{\theta}_{i-1}$ over the ground truth $\mathcal{M}$ by outputting an estimated distribution $p_O$ and minimizing the gap between the estimate and the ground truth. At sampling time with the trained network, the uninformative prior $\bm{\theta}_0$ is gradually improved by belief updates until it reaches $\bm{\theta}_n$ with high fidelity. Right: illustration of the periodic equivariant Bayesian flow.
  • Figure 2: Visualization of the proposed periodic Bayesian flow with mean parameter $\mu$ and accumulated accuracy parameter $c$, which corresponds to the entropy/uncertainty. For $x = 0.3$, $\beta(1) = 1000$, and $\alpha_i$ defined in the appendix (ref. appd:bfn_cir), this figure plots three colored stochastic parameter trajectories for the receiver mean parameter $m$ and the accumulated accuracy parameter $c$, superimposed on log-scale heatmaps of the Bayesian flow distributions $p_F(m|x,\alpha_1,\alpha_2,\dots,\alpha_i)$ and $p_F(c|x,\alpha_1,\alpha_2,\dots,\alpha_i)$. Note the periodicity of $m$ and the non-monotonic, non-additive behavior of $c$; passing $c$ to the network as a condition can inform it of the entropy of the mean parameter $m$.
  • Figure 3: An intuitive illustration of the non-additive accuracy Bayesian update on the torus. The lengths of the arrows represent the uncertainty/entropy of the belief (e.g., $1/\sigma^2$ for Gaussian and $c$ for von Mises). The directions of the arrows represent the believed location (e.g., $\mu$ for Gaussian and $m$ for von Mises).
  • Figure 4: Experimental results on MP-20 with different Number of Function Evaluations (NFE) i.e. number of network forward passes.
  • Figure 5: Depiction of von Mises distributions with different direction parameters $m$ and concentration parameters $c$. The parameter $m$ is the location about which the distribution is centered, while $c$ measures how concentrated the distribution is. When $c = 0$, the distribution is uniform on the circle. As $c$ increases, the distribution becomes more concentrated around $m$. In the limit $c \rightarrow +\infty$, the distribution converges to $\delta(m)$, a Dirac delta distribution centered at $m$.
  • ...and 2 more figures
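
The arrow picture in the Figure 3 caption can be made concrete with a small sketch. It assumes the textbook conjugate update for a von Mises belief with a von Mises observation of known concentration: multiplying the two densities gives another von Mises whose parameters are the vector sum of the two arrows. The function names (von_mises_update, gaussian_update) and the use of angles in radians are illustrative assumptions, not taken from the paper or its code release.

```python
import cmath
import math


def von_mises_update(m, c, y, alpha):
    """Combine a von Mises belief VM(m, c) with an observation y of accuracy alpha.

    Assumption: standard conjugate update. The belief is the arrow c*exp(i*m),
    the observation is the arrow alpha*exp(i*y), and the posterior is their sum.
    """
    z = c * cmath.exp(1j * m) + alpha * cmath.exp(1j * y)
    m_new = cmath.phase(z) % (2 * math.pi)  # wrap the angle to [0, 2*pi)
    c_new = abs(z)                          # length of the summed arrow
    return m_new, c_new


def gaussian_update(mu, rho, y, alpha):
    """Gaussian counterpart for comparison: precisions always add."""
    rho_new = rho + alpha
    mu_new = (rho * mu + alpha * y) / rho_new
    return mu_new, rho_new


# Observation agrees with the belief: the concentration adds up.
m1, c1 = von_mises_update(m=1.0, c=2.0, y=1.0, alpha=3.0)

# Observation points the opposite way: the concentration shrinks.
m2, c2 = von_mises_update(m=1.0, c=2.0, y=1.0 + math.pi, alpha=1.5)

# Gaussian case: the precision grows regardless of where y falls.
mu_g, rho_g = gaussian_update(mu=0.0, rho=2.0, y=1.0, alpha=3.0)
```

In the second call the updated concentration is smaller than the prior concentration, which is the non-additive, non-monotonic behavior the captions for Figures 2 and 3 describe; the Gaussian precision, by contrast, can only increase.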

Theorems & Definitions (15)

  • Proposition 4.1
  • Proposition 4.2
  • Proposition 4.3
  • Definition 1: Permutation Invariance (Jiao et al., 2023)
  • Definition 2: O(3) Invariance (Jiao et al., 2023)
  • Definition 3: Periodic Translation Invariance (Jiao et al., 2023)
  • Definition 4
  • Lemma 1 (Xu et al., 2021)
  • proof
  • Proposition B.1
  • ...and 5 more