CryoSAMU: Enhancing 3D Cryo-EM Density Maps of Protein Structures at Intermediate Resolution with Structure-Aware Multimodal U-Nets

Chenwei Zhang, Khanh Dao Duc

TL;DR

CryoSAMU enhances cryo-EM density maps at intermediate resolution ($4{-}8$ Å) by jointly modeling 3D map features and fixed-size structural embeddings derived from ESM-IF1 in a structure-aware multimodal 3D U-Net. Cross-attention integrates the structural embeddings at the bottleneck of the density-based U-Net, which is trained on a curated dataset with simulated targets. The method shows competitive improvements in real-space and Fourier-space metrics while processing maps significantly faster than prior methods. An ablation study confirms that structural information adds value for map enhancement and protein-structure modeling, reducing boundary artifacts and improving RSCC and residue coverage; inference remains feasible when structural embeddings are unavailable. The work suggests practical impact for large-scale cryo-EM analysis. Proposed future directions include global context modeling (e.g., Swin Transformers), additional losses (e.g., SSIM), and expanding the dataset to higher-resolution maps.
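
To make the fusion step concrete, the following is a minimal NumPy sketch of single-head cross-attention between bottleneck map features and structural embeddings. It is an illustration under stated assumptions, not the authors' implementation: the choice of queries from the map and keys/values from the structure, the residual connection, the random projection weights, and all sizes (64 bottleneck voxels, 32 channels, 8 structural tokens of width 512) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(map_tokens, struct_tokens, w_q, w_k, w_v):
    """Fuse structural tokens into map tokens (assumed design, single head).

    map_tokens:    (N, d_map)    flattened bottleneck voxel features
    struct_tokens: (M, d_struct) fixed-size structural embedding
    """
    q = map_tokens @ w_q                      # (N, d_attn) queries from the map
    k = struct_tokens @ w_k                   # (M, d_attn) keys from the structure
    v = struct_tokens @ w_v                   # (M, d_map)  values from the structure
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product, (N, M)
    weights = softmax(scores, axis=-1)        # each map token attends over M tokens
    fused = map_tokens + weights @ v          # residual connection (assumption)
    return fused, weights

# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
n_vox, d_map = 64, 32        # e.g. a 4x4x4 bottleneck with 32 channels
n_struct, d_struct = 8, 512  # assumed shape of the structural embedding
d_attn = 16

map_tokens = rng.standard_normal((n_vox, d_map))
struct_tokens = rng.standard_normal((n_struct, d_struct))
w_q = rng.standard_normal((d_map, d_attn)) / np.sqrt(d_map)
w_k = rng.standard_normal((d_struct, d_attn)) / np.sqrt(d_struct)
w_v = rng.standard_normal((d_struct, d_map)) / np.sqrt(d_struct)

fused, weights = cross_attention(map_tokens, struct_tokens, w_q, w_k, w_v)
```

Because the fused output keeps the shape of the map features, a decoder could consume it unchanged, and skipping the call entirely leaves a density-only path, which is one plausible way inference could proceed when no structural embedding is available.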

Abstract

Enhancing cryogenic electron microscopy (cryo-EM) 3D density maps at intermediate resolution (4-8 Å) is crucial in protein structure determination. Recent advances in deep learning have led to the development of automated approaches for enhancing experimental cryo-EM density maps. Yet, these methods are not optimized for intermediate-resolution maps and rely on map density features alone. To address this, we propose CryoSAMU, a novel method designed to enhance 3D cryo-EM density maps of protein structures using structure-aware multimodal U-Nets and trained on curated intermediate-resolution density maps. We comprehensively evaluate CryoSAMU across various metrics and demonstrate its competitive performance compared to state-of-the-art methods. Notably, CryoSAMU achieves significantly faster processing speed, showing promise for future practical applications. Our code is available at https://github.com/chenwei-zhang/CryoSAMU.

Paper Structure

This paper contains 16 sections, 7 equations, 8 figures, 7 tables.

Figures (8)

  • Figure 1: Overview of the CryoSAMU framework. a: Generating protein multimodal representations. Structure features are derived from a frozen pretrained ESM-IF1 model, with self-attention weighting to obtain a fixed-size representation; map voxel features are simulated via a resolution-lowering point spread function and partitioned into smaller cubes. b: The CryoSAMU architecture. The experimental map is partitioned into smaller cubes and processed by a U-Net with residual blocks and linear attention modules. Structural embeddings are integrated into the bottleneck layer via a cross-attention mechanism. The output cubes are reassembled into the full-size enhanced map.
  • Figure 2: Visual and quantitative comparison of deposited (blue) and CryoSAMU-enhanced (green) maps, with the corresponding PDB structures superimposed (brown). a, b: Maps are shown at two contour levels. Left: recommended contour level (volume = 85.74e3). Right: higher contour level (volume = 22.57e3). c: RSCC comparisons between deposited and CryoSAMU-enhanced maps. The example protein is CX3CL1-US28-G11iN18-scFv16 in TL-state (PDB-7RKF, EMDB-24496, reported resolution of 4.00 Å).
  • Figure 3: Violin plots comparing different methods across four evaluation metrics (see Section \ref{bm1}) over 75 test samples.
  • Figure 4: a-b: Polar plots comparing protein structures built from deposited (blue) and CryoSAMU-enhanced (green) maps, using (a) residue coverage and (b) sequence match. c-d: Box-whisker plots comparing different methods across two evaluation metrics over 20 test samples. See Section \ref{sec:bm2}.
  • Figure 5: The scatter plot of map processing time against map volume. Each dot represents the processing time for an individual map based on its volume. The shaded area around the regression line denotes the confidence interval of the regression estimate.
  • ...and 3 more figures
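
The cube partitioning and reassembly mentioned in the Figure 1 caption can be sketched as follows. This is a simplified illustration, assuming non-overlapping cubes and zero padding; the function names `partition` and `reassemble` and the cube size are hypothetical, and the paper's actual tiling scheme (for example, any overlap between cubes) is not specified here.

```python
import numpy as np

def partition(volume, cube):
    """Zero-pad a 3D map so every axis is a multiple of `cube`,
    then split it into non-overlapping cube x cube x cube blocks."""
    pad = [(0, (-s) % cube) for s in volume.shape]
    padded = np.pad(volume, pad)
    nz, ny, nx = (s // cube for s in padded.shape)
    cubes = (padded.reshape(nz, cube, ny, cube, nx, cube)
                   .transpose(0, 2, 4, 1, 3, 5)   # group block indices first
                   .reshape(-1, cube, cube, cube))
    return cubes, padded.shape

def reassemble(cubes, padded_shape, original_shape):
    """Invert `partition`: stitch cubes back and crop away the padding."""
    cube = cubes.shape[-1]
    nz, ny, nx = (s // cube for s in padded_shape)
    padded = (cubes.reshape(nz, ny, nx, cube, cube, cube)
                   .transpose(0, 3, 1, 4, 2, 5)   # interleave block and voxel axes
                   .reshape(padded_shape))
    z, y, x = original_shape
    return padded[:z, :y, :x]

# Toy map whose dimensions are deliberately not multiples of the cube size.
volume = np.arange(10 * 9 * 7, dtype=np.float32).reshape(10, 9, 7)
cubes, padded_shape = partition(volume, cube=4)

# A network would process `cubes` here; the identity stands in for it.
restored = reassemble(cubes, padded_shape, volume.shape)
```

With an identity in place of the network, the round trip returns the original map exactly, which is a convenient sanity check before plugging in a real model.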