Struc2mapGAN: improving synthetic cryo-EM density maps with generative adversarial networks

Chenwei Zhang, Anne Condon, Khanh Dao Duc

TL;DR

This work tackles the challenge of generating high-fidelity synthetic cryo-EM density maps from molecular structures, where traditional simulation-based methods fail to reproduce complex features such as secondary structure elements (SSEs). It introduces struc2mapGAN, a data-driven 3D GAN with a nested U-Net++ generator that is trained against curated experimental maps with an additional SmoothL1Loss term, achieving higher correlation with, and structural similarity to, real maps while remaining computationally efficient. Using a preprocessing, augmentation, and tile-based training pipeline, the model outperforms conventional simulators on multiple metrics (SSIM, ChimeraX correlation, PCC) and performs robustly across resolutions, with map generation times suitable for real-time workflows. The work suggests future directions in resolution-conditioned generation and diffusion- or attention-based enhancements, and highlights potential applications in template-based particle picking and in combination with structures predicted by AlphaFold.
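
As an illustration of one of the metrics named above, the Pearson correlation coefficient (PCC) between a generated map and an experimental map can be sketched as follows. This is a minimal standard-library sketch, not the authors' evaluation code: it assumes both maps have already been aligned, resampled to the same grid, and flattened into equal-length lists of voxel values, and the function name `pearson_cc` is hypothetical.

```python
import math
from typing import Sequence


def pearson_cc(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient between two flattened voxel lists.

    Assumption: `a` and `b` hold the same voxels in the same order.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("maps must be non-empty and have the same number of voxels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance and variances, left unnormalized because the 1/N factors cancel.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("PCC is undefined for a constant map")
    return cov / math.sqrt(var_a * var_b)
```

A value near 1 means the two maps vary together voxel by voxel; the score is insensitive to a global rescaling or offset of either map.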

Abstract

Generating synthetic cryogenic electron microscopy 3D density maps from molecular structures has potentially important applications in structural biology. Yet existing simulation-based methods cannot mimic all the complex features present in experimental maps, such as secondary structure elements. As an alternative, we propose struc2mapGAN, a novel data-driven method that employs a generative adversarial network to produce improved experimental-like density maps from molecular structures. More specifically, struc2mapGAN uses a nested U-Net architecture as the generator, with an additional L1 loss term and further processing of raw training experimental maps to enhance learning efficiency. Once trained, struc2mapGAN generates maps promptly, and we demonstrate that it outperforms existing simulation-based methods for a wide array of tested maps and across various evaluation metrics.
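
To make the additional loss term concrete, the smooth L1 (Huber-style) loss can be sketched as below. This is a minimal standard-library sketch that follows the usual definition of smooth L1 with mean reduction; the threshold `beta=1.0`, the function name `smooth_l1`, and the flattened-list inputs are assumptions for illustration, and how the term is weighted against the adversarial loss is not stated in this summary.

```python
from typing import Sequence


def smooth_l1(pred: Sequence[float], target: Sequence[float], beta: float = 1.0) -> float:
    """Mean smooth L1 loss between predicted and target voxel values.

    Quadratic for small errors (|d| < beta), linear for large ones,
    so outlier voxels are penalized less harshly than with a squared loss.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        if d < beta:
            total += 0.5 * d * d / beta  # quadratic region
        else:
            total += d - 0.5 * beta      # linear region
    return total / len(pred)
```

For example, with errors of 0.5 and 2.0 the two branches contribute 0.125 and 1.5, giving a mean loss of 0.8125.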

Paper Structure (29 sections, 10 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: The data preprocessing workflow. The top panel depicts the process of curating raw experimental maps and dividing the curated maps into 3D image subsets as training targets. The bottom panel depicts the process of generating simulated maps and dividing them into 3D image subsets as training inputs.
  • Figure 2: The struc2mapGAN architecture. The bottom panel illustrates the U-Net++ architecture. $X^{i,j}$ refers to the convolution block at depth $i$ and position $j$ of the network.
  • Figure 3: Validation loss curves of the generator and discriminator in black and red, respectively, with snapshots of generated maps from the models trained at specific epochs.
  • Figure 4: Examples of struc2mapGAN (gray) and molmap (cyan) generated maps, and the raw experimental maps (orange). The PDB structures of $\alpha$-helices (pink) and $\beta$-sheets (blue) are superimposed on the maps. a. Human STEAP4 bound to NADP, FAD, heme and Fe(III)-NTA (EMDB ID: 0199; PDB ID: 6HCY; reported resolution: 3.1 Å). b. AAA+ ATPase, ClpL from Streptococcus pneumoniae: ATPrS-bound (EMDB ID: 0967; PDB ID: 6LT4; reported resolution: 4.5 Å). c. Follicle stimulating hormone receptor (EMDB ID: 35136; PDB ID: 8I2H; reported resolution: 6 Å). Visualizations of cryo-EM density maps and PDB structures were produced with UCSF ChimeraX.
  • Figure 5: Scatter plots comparing ChimeraX correlation (left) and SSIM (right) for struc2mapGAN (blue dots), molmap with a resolution cutoff of 2 Å (orange dots), and molmap with the reported resolution as cutoff (brown dots), across 130 test examples. The test examples are sorted by reported resolution from high to low. The shaded area around each colored regression line represents the confidence interval of the regression estimate.
  • ...and 3 more figures
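
The step described in Figure 1, dividing maps into 3D image subsets, can be sketched as computing the corner coordinates of cubic tiles that cover a volume. This is a minimal standard-library sketch under stated assumptions, not the authors' preprocessing code: the cube size of 32 voxels, the stride, the edge-clamping rule, and the names `axis_starts` and `tile_origins` are all hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def axis_starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis so tiles of size `tile` cover [0, length)."""
    if length <= 0 or tile <= 0 or stride <= 0:
        raise ValueError("length, tile and stride must be positive")
    if length <= tile:
        return [0]  # a single tile; a caller would zero-pad the volume up to `tile`
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] + tile < length:
        starts.append(length - tile)  # clamp one last tile to the far edge
    return starts


def tile_origins(shape: Sequence[int], tile: int = 32, stride: int = 32) -> List[Tuple[int, ...]]:
    """Corner coordinates of all cubic tiles covering a volume of the given shape."""
    return list(product(*(axis_starts(n, tile, stride) for n in shape)))
```

Applying the same origins to a simulated map and its curated experimental counterpart yields paired input and target cubes. A stride smaller than the tile size gives overlapping tiles, which is one common way to reduce seams when generated cubes are stitched back into a full map.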