Diffusion priors for Bayesian 3D reconstruction from incomplete measurements

Julian L. Möbius, Michael Habeck

TL;DR

This paper tackles the reconstruction of 3D structures from highly incomplete measurements by embedding rich 3D priors in a Bayesian framework. It uses diffusion posterior sampling (DPS), which combines diffusion priors trained on 3D point clouds with likelihood guidance from forward models to sample from the conditional posterior. The method is validated on the ShapeNet and CryoStruct datasets, where it improves reconstruction quality under sparse observations, and it is applied to cryo-EM reconstruction tasks that combine multiple data sources such as projections, coarse-grained structures, and subunits. The results suggest that diffusion priors can distill knowledge of biological and synthetic shapes from large databases to regularize ill-posed inverse problems in structural biology and beyond, although runtime remains a concern that motivates further efficiency improvements.

Abstract

Many inverse problems are ill-posed and need to be complemented by prior information that restricts the class of admissible models. Bayesian approaches encode this information as prior distributions that impose generic properties on the model such as sparsity, non-negativity or smoothness. However, in the case of complex structured models such as images, graphs or three-dimensional (3D) objects, generic prior distributions tend to favor models that differ greatly from those observed in the real world. Here we explore the use of diffusion models as priors that are combined with experimental data within a Bayesian framework. We use 3D point clouds to represent 3D objects such as household items or biomolecular complexes formed from proteins and nucleic acids. We train diffusion models that generate coarse-grained 3D structures at a medium resolution and integrate these with incomplete and noisy experimental data. To demonstrate the power of our approach, we focus on the reconstruction of biomolecular assemblies from cryo-electron microscopy (cryo-EM) images, which is an important inverse problem in structural biology. We find that posterior sampling with diffusion model priors allows for 3D reconstruction from very sparse, low-resolution and partial observations.
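
The core idea of combining a prior with data can be made concrete through Bayes' rule at the level of scores: $\nabla_x \log p(x \mid y) = \nabla_x \log p(x) + \nabla_x \log p(y \mid x)$. The following is a minimal toy sketch of that decomposition, not the authors' algorithm. It assumes an analytic Gaussian prior as a stand-in for a trained diffusion model, a linear forward model that projects a single 3D point onto two coordinates, and plain unadjusted Langevin dynamics instead of a reverse diffusion process. All function names and parameter values are hypothetical.

```python
import numpy as np

def prior_score(x, mu, sigma_p):
    # Score of an isotropic Gaussian prior N(mu, sigma_p^2 I).
    # ASSUMPTION: stands in for the learned score of a diffusion model.
    return (mu - x) / sigma_p**2

def likelihood_score(x, y, A, sigma_n):
    # Gradient of log N(y; A x, sigma_n^2 I) with respect to x.
    return A.T @ (y - A @ x) / sigma_n**2

def langevin_posterior_samples(y, A, mu, sigma_p, sigma_n,
                               n_steps=20000, step=0.05,
                               burn_in=2000, seed=0):
    # Unadjusted Langevin dynamics driven by the posterior score,
    # i.e. the sum of the prior score and the likelihood score.
    rng = np.random.default_rng(seed)
    x = np.array(mu, dtype=float)
    samples = []
    for i in range(n_steps):
        score = prior_score(x, mu, sigma_p) + likelihood_score(x, y, A, sigma_n)
        x = x + step * score + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        if i >= burn_in:
            samples.append(x.copy())
    return np.array(samples)

if __name__ == "__main__":
    # Hypothetical setup: observe only the first two coordinates of a 3D point.
    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    y = np.array([1.0, -1.0])
    mu = np.zeros(3)
    samples = langevin_posterior_samples(y, A, mu, sigma_p=1.0, sigma_n=0.5)
    print("posterior mean estimate:", samples.mean(axis=0))
```

In this toy setting the unobserved third coordinate is determined by the prior alone, while the two observed coordinates are pulled toward the data. This mirrors, in a very reduced form, how a prior can fill in what incomplete measurements leave undetermined.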

Paper Structure

This paper contains 30 sections, 23 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Results for five different reconstruction tasks. In all examples, the maximum-likelihood (ML) reconstruction assigns a higher likelihood to the input data than the models obtained with approximate DPS. However, the ML-based models show a higher reconstruction error than those from DPS. The results are also part of the tests presented in Table \ref{tab:shapenet} and correspond to rows 9, 8, 1, 8 and 2 (from left to right).
  • Figure 2: Outcomes for five cryo-EM reconstruction tasks. The top row shows the sparse input measurements. The second row shows all ten point clouds generated with DPS. The third row shows the 1024 component means of a mixture model fitted to the atomic models (last row). (A) Nucleosome-CHD4 from five projections (PDB code 6ryr). (B) F-ATP Synthase from four projections (PDB code 6rdm). (C) RNA polymerase transcription open promoter complex with Sorangicin from three projections (PDB code 6vvy). (D) Human spliceosome after Prp43 loaded from one projection and a low-resolution structure consisting of 40 particles (PDB code 6id1). (E) 26S proteasome from three projections and a known 20S structure (PDB code 6fvt).
  • Figure 3: Unconditional samples from the diffusion prior trained on the ShapeNet-Chair dataset. Sampled with Algorithm \ref{alg} using $\beta(t) = 1/t \text{ if } t > 1 \text{ else } 1$, $t_{\max} = 80$ and $100$ time steps. The images are created using the surface mode of PyMOL.
  • Figure 4: Unconditional samples from the diffusion prior trained on the ShapeNet-Mixed dataset. Sampled with Algorithm \ref{alg} using $\beta(t) = 1/t \text{ if } t > 1 \text{ else } 1$, $t_{\max} = 80$ and $100$ time steps. The images are created using the surface mode of PyMOL.
  • Figure 5: Unconditional samples from the diffusion prior trained on the CryoStruct dataset. Sampled with Algorithm \ref{alg} using $\beta(t) = 1/t \text{ if } t > 0.8 \text{ else } 1$, $t_{\max} = 80$ and $100$ time steps. The images are created using the surface mode of PyMOL.
  • ...and 1 more figure
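
The captions of Figures 3 to 5 specify a piecewise schedule $\beta(t)$ that equals $1/t$ above a threshold and $1$ otherwise. A minimal sketch of that formula is given below. The function name `beta` and the parameter `threshold` are hypothetical, and the role that $\beta(t)$ plays inside the sampling algorithm is not described in this excerpt, so the sketch only encodes the formula as stated in the captions.

```python
def beta(t: float, threshold: float = 1.0) -> float:
    """Piecewise schedule from the figure captions: 1/t if t > threshold, else 1.

    ASSUMPTION: `threshold` generalizes the two values quoted in the captions
    (1 for the ShapeNet figures, 0.8 for the CryoStruct figure).
    """
    if t <= 0:
        raise ValueError("t must be positive")
    return 1.0 / t if t > threshold else 1.0

if __name__ == "__main__":
    # Evaluate at the maximum time quoted in the captions (t_max = 80)
    # and at a small time below both thresholds.
    print(beta(80.0), beta(0.5))
    print(beta(0.9, threshold=0.8))
```

With the threshold of 1 the schedule is continuous at the switch point, whereas with the threshold of 0.8 quoted for the CryoStruct figure it jumps from 1 to 1.25 there.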