CryoFM: A Flow-based Foundation Model for Cryo-EM Densities

Yi Zhou, Yilai Li, Jing Yuan, Quanquan Gu

TL;DR

CryoFM presents a flow-matching foundation model that learns the prior distribution of high-quality cryo-EM density maps and enables posterior sampling conditioned on observed data without task-specific fine-tuning. It deploys two architectures, CryoFM-S for local detail and CryoFM-L for global structure, built on a 3D HDiT transformer and trained on EMDB maps to model $p_0({\mathbf{x}}_0)$. By deriving a flow posterior sampling method, CryoFM can tackle downstream tasks such as spectral and anisotropic noise denoising, missing wedge restoration, and ab initio modeling within a plug-and-play framework, achieving state-of-the-art performance on most tasks. The work demonstrates the potential of flow-based foundation models to unify and accelerate cryo-EM density processing, while noting challenges in real-world noisy densities and reconstruction from raw 2D particles for future research.

Abstract

Cryo-electron microscopy (cryo-EM) is a powerful technique in structural biology and drug discovery, enabling the study of biomolecules at high resolution. Significant advancements by structural biologists using cryo-EM have led to the production of over 38,626 protein density maps at various resolutions¹. However, cryo-EM data processing algorithms have yet to fully benefit from our knowledge of biomolecular density maps, with only a few recent models being data-driven but limited to specific tasks. In this study, we present CryoFM, a foundation model designed as a generative model, learning the distribution of high-quality density maps and generalizing effectively to downstream tasks. Built on flow matching, CryoFM is trained to accurately capture the prior distribution of biomolecular density maps. Furthermore, we introduce a flow posterior sampling method that leverages CryoFM as a flexible prior for several downstream tasks in cryo-EM and cryo-electron tomography (cryo-ET) without the need for fine-tuning, achieving state-of-the-art performance on most tasks and demonstrating its potential as a foundational model for broader applications in these fields.

Paper Structure

This paper contains 48 sections, 39 equations, 17 figures, 10 tables, 3 algorithms.

Figures (17)

  • Figure 1: The overview of CryoFM. In the training stage, CryoFM learns a vector field ${\textnormal{v}}_\Theta(t,{\mathbf{x}}_t)$, whose corresponding probability flow generates the data distribution $p_0({\mathbf{x}}_0)$ of high-quality protein densities. In the inference stage, given an observation ${\mathbf{y}}$, a likelihood term $p_t({\mathbf{y}}|{\mathbf{x}}_t)$ is incorporated to convert the unconditional vector field ${\textnormal{v}}_\Theta(t,{\mathbf{x}}_t)$ to a conditional one ${\textnormal{v}}_\Theta(t,{\mathbf{x}}_t|{\mathbf{y}})$, so that we can sample from the posterior distribution $p_0({\mathbf{x}}_0|{\mathbf{y}})$. This enables signal restoration of the density map, resulting in improved resolution of the alpha helices in the shown case.
  • Figure 2: CryoFM's architecture. The side length $D$ of the input ${\mathbf{x}}_t$ undergoes dimension reduction through down-sampling layers and is then expanded back to its original size. To minimize computational cost near the input and output, the model employs neighborhood attention (NA). Neighborhood attention attends only to a localized area, whereas global attention (GA) calculates attention across all positions.
  • Figure 3: Result of the spectral noise denoising task. Spectral noise was added to two density maps from EMDB so that the estimated resolution is $4.3$ Å. The degraded density maps (results after applying the forward model) were filtered by relion_postprocess for visual clarity.
  • Figure 4: Forward operators for (i) the anisotropic noise and (ii) the missing wedge.
  • Figure 5: Result of the anisotropic noise denoising task. Anisotropic spectral noise was added to two density maps from EMDB so that the estimated resolution is $4.38$ Å. The degraded density maps were filtered by relion_postprocess for visual clarity.
  • ...and 12 more figures
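
The Figure 1 caption describes a vector field whose probability flow generates the data distribution $p_0({\mathbf{x}}_0)$. A minimal sketch of that idea follows. It is a toy and not CryoFM itself: the learned network ${\textnormal{v}}_\Theta$ is replaced by an analytic velocity field for a 1D Gaussian "data" distribution, and the linear path ${\mathbf{x}}_t = (1-t){\mathbf{x}}_0 + t\,\epsilon$ (data at $t=0$, standard Gaussian noise at $t=1$) is an assumption made for illustration. The names `MU`, `S`, `velocity`, and `sample` are hypothetical.

```python
import numpy as np

# Toy "data" distribution p_0 = N(MU, S^2). Hypothetical values, standing in
# for the distribution of high-quality density maps.
MU, S = 2.0, 0.5


def velocity(t, x):
    """Analytic v(t, x) = E[eps - x_0 | x_t = x] for the assumed linear path
    x_t = (1 - t) * x_0 + t * eps, with eps ~ N(0, 1).

    In a trained model this function would be a neural network.
    """
    var_t = (1.0 - t) ** 2 * S**2 + t**2                      # Var[x_t]
    x0_hat = MU + (1.0 - t) * S**2 / var_t * (x - (1.0 - t) * MU)  # E[x_0 | x_t]
    return (x - x0_hat) / t


def sample(n, steps=200, seed=0):
    """Integrate dx/dt = v(t, x) from t = 1 (noise) down to t = 0 (data)
    with fixed-step Euler updates."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)       # x_1 ~ N(0, 1)
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt             # t runs 1, 1 - dt, ..., dt (never 0)
        x = x - dt * velocity(t, x)  # step backwards in time
    return x


samples = sample(20000)
print("mean:", samples.mean(), "std:", samples.std())
```

Under these assumptions the sample mean and standard deviation should land close to `MU` and `S`. Conditioning on an observation ${\mathbf{y}}$, as in the inference stage of Figure 1, would modify `velocity` with a likelihood term; that step is not sketched here because its exact form is not given in this section.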