GEM: 3D Gaussian Splatting for Efficient and Accurate Cryo-EM Reconstruction

Huaizhi Qu, Xiao Wang, Gengwei Zhang, Jie Peng, Tianlong Chen

TL;DR

GEM introduces 3D Gaussian Splatting (3DGS) to cryo-EM reconstruction, enabling explicit, sparse density representations that are efficient in real space. By modeling proteins as a set of 3D Gaussians and using a closed-form, CTF-aware projection with gradient updates localized to contributing Gaussians, GEM achieves faster training, lower memory, and higher local resolution than both Fourier-space and NeRF-based baselines. Extensive experiments on four datasets show GEM delivering state-of-the-art resolutions (often near the instrument’s limit) and robust directional consistency, with substantial efficiency advantages. This work demonstrates that explicit Gaussian-based density representations can unify speed, efficiency, and accuracy in cryo-EM reconstruction, offering practical benefits for large-scale structural biology.

Abstract

Cryo-electron microscopy (cryo-EM) has become a central tool for high-resolution structural biology, yet the massive scale of datasets (often exceeding 100k particle images) renders 3D reconstruction both computationally expensive and memory intensive. Traditional Fourier-space methods are efficient but lose fidelity due to repeated transforms, while recent real-space approaches based on neural radiance fields (NeRFs) improve accuracy but incur cubic memory and computation overhead. Therefore, we introduce GEM, a novel cryo-EM reconstruction framework built on 3D Gaussian Splatting (3DGS) that operates directly in real space while maintaining high efficiency. Instead of modeling the entire density volume, GEM represents proteins with compact 3D Gaussians, each parameterized by only 11 values. To further improve training efficiency, we design a novel gradient computation that is restricted to the 3D Gaussians contributing to each voxel. This design substantially reduces both memory footprint and training cost. On standard cryo-EM benchmarks, GEM achieves up to 48% faster training and 12% lower memory usage compared to state-of-the-art methods, while improving local resolution by as much as 38.8%. These results establish GEM as a practical and scalable paradigm for cryo-EM reconstruction, unifying speed, efficiency, and high-resolution accuracy. Our code is available at https://github.com/UNITES-Lab/GEM.
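
To make the "11 values per Gaussian" and "closed-form projection" ideas concrete, here is a minimal sketch in plain Python. It is not taken from the GEM code base. It assumes the common 3DGS parameterization of a 3D mean (3 values), per-axis scales (3), a rotation quaternion (4), and a scalar weight (1), which happens to total 11; the abstract does not spell out the split, so treat it as an assumption. The function names (`project_along_z`, `evaluate_2d`, and the helpers) are hypothetical. The sketch uses the standard result that integrating an unnormalized 3D Gaussian with covariance $\Sigma$ along $z$ yields a 2D Gaussian whose covariance is the upper-left $2 \times 2$ block $\Sigma_{xy}$ and whose amplitude is scaled by $\sqrt{2\pi \det\Sigma / \det\Sigma_{xy}}$. Particle pose and the CTF are left out.

```python
import math

def quat_to_rotation(q):
    """Rotation matrix from a quaternion (w, x, y, z); normalizes the input."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def covariance(scale, quat):
    """Sigma = R diag(scale^2) R^T for per-axis scales and a rotation."""
    r = quat_to_rotation(quat)
    return [[sum(r[i][k] * scale[k] ** 2 * r[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def project_along_z(mean, scale, quat, weight):
    """Integrate weight * exp(-0.5 * d^T Sigma^-1 d) over z in closed form.

    Returns the 2D mean, the 2x2 covariance, and the 2D peak amplitude.
    Assumed parameterization (3 + 3 + 4 + 1 = 11 values), not confirmed by the source.
    """
    sigma = covariance(scale, quat)
    a, b, c = sigma[0][0], sigma[0][1], sigma[1][1]
    det2 = a * c - b * b
    amplitude = weight * math.sqrt(2 * math.pi * det3(sigma) / det2)
    return (mean[0], mean[1]), [[a, b], [b, c]], amplitude

def evaluate_2d(u, v, mean2, cov2, amplitude):
    """Value of the projected 2D Gaussian at image coordinate (u, v)."""
    (a, b), (_, c) = cov2
    det2 = a * c - b * b
    dx, dy = u - mean2[0], v - mean2[1]
    quad = (c * dx * dx - 2 * b * dx * dy + a * dy * dy) / det2
    return amplitude * math.exp(-0.5 * quad)
```

For an axis-aligned Gaussian (identity quaternion) the amplitude reduces to `weight * sqrt(2 * pi) * scale_z`, which is a quick sanity check on the formula. A full renderer would sum such footprints over all Gaussians, after rotating them into each particle's pose, and then apply the CTF.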

Paper Structure

This paper contains 28 sections, 17 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: GEM achieves lower memory usage, faster speed, and higher reconstruction resolution compared to existing approaches.
  • Figure 2: Cryo-EM imaging process. Randomly oriented proteins are illuminated by the electron beam, leaving traces on the sensor as particle images $I_1$, $I_2$, and $I_3$. In this work we focus on homogeneous reconstruction, where all particles $V_1$, $V_2$, and $V_3$ share the same underlying 3D structure and differ only in orientation (i.e., multiple copies of the same object).
  • Figure 3: Initialization before training and density reconstruction after training.
  • Figure 4: Overview of GEM training. Training begins with randomly initialized Gaussians. The 3D Gaussians are projected following Equation \ref{eq:render}. The CTF is then applied to the projection (Equation \ref{eq:loss}), and the result is compared with the noisy experimental image to compute the loss (Equation \ref{eq:loss}).
  • Figure 5: GSFSC of GEM and the baselines. The horizontal axis denotes resolution in ångströms (Å), and the vertical axis denotes the FSC value. The two dashed horizontal lines mark the FSC thresholds of $0.5$ and $0.143$. The final resolution is defined as the resolution at which the GSFSC curve first drops below the $0.143$ threshold (or the smallest possible resolution if there is no intersection). Intersection points further to the right correspond to better reconstruction quality.
  • ...and 2 more figures
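
The resolution rule described in the Figure 5 caption can be sketched as a short function. This is an illustrative reading of the caption, not the paper's evaluation code. The function name `fsc_resolution`, the input layout (ascending spatial frequencies in 1/Å paired with FSC values), and the linear interpolation between the two samples that bracket the threshold are all assumptions.

```python
def fsc_resolution(freqs, fsc, threshold=0.143):
    """Resolution in Å at which an FSC curve first drops below `threshold`.

    freqs: ascending, strictly positive spatial frequencies in 1/Å (assumed layout).
    fsc:   FSC value at each frequency.
    If the curve never drops below the threshold, the resolution at the highest
    sampled frequency is returned (the smallest resolution the curve can report).
    """
    if not freqs or len(freqs) != len(fsc):
        raise ValueError("freqs and fsc must be non-empty and of equal length")
    for i, value in enumerate(fsc):
        if value < threshold:
            if i == 0:
                return 1.0 / freqs[0]
            # Linearly interpolate the crossing between samples i-1 and i.
            f0, f1 = freqs[i - 1], freqs[i]
            v0 = fsc[i - 1]
            t = (v0 - threshold) / (v0 - value)
            return 1.0 / (f0 + t * (f1 - f0))
    return 1.0 / freqs[-1]
```

A smaller returned value means a finer resolution, which matches the caption's remark that intersections further to the right indicate better reconstructions.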