Joint Denoising of Cryo-EM Projection Images using Polar Transformers
Joakim Andén, Justus Sagemüller
TL;DR
The paper tackles reconstructing a molecular structure from multiple noisy cryo-EM projections that are related by random in-plane rotations and suffer from very low SNR. It introduces the polar transformer, a symmetry-aware architecture that combines a polar representation, angular convolutions, and angular attention to jointly align, cluster, and denoise multiple projections while preserving rotational equivariance. On simulated data, it achieves up to a $2\times$ reduction in relative MSE at $\mathrm{SNR}=0.02$ and improves downstream 3D reconstructions, demonstrating the potential of data-driven, geometry-aware reconstruction in cryo-EM. The approach provides a path toward end-to-end, symmetry-preserving reconstruction pipelines for cryo-EM and related tomographic modalities.
Abstract
Many imaging modalities involve reconstruction of unknown objects from collections of noisy projections related by random rotations. In one of these modalities, cryogenic electron microscopy (cryo-EM), the extremely low signal-to-noise ratio (SNR) makes integration of information from multiple images crucial. Existing approaches to cryo-EM processing, however, either rely on handcrafted priors or apply deep learning only to select portions of the pipeline, such as particle picking, micrograph denoising, or refinement. A fully end-to-end reconstruction approach requires a neural network architecture that integrates information from multiple images while respecting the rotational symmetry of the measurement process. In this work, we introduce the polar transformer, a new neural network architecture that combines polar representations and transformers along with a convolutional attention mechanism that preserves the rotational symmetry of the problem. We apply it to the particle-level denoising problem, where it is able to learn discriminative features in the images, enabling optimal clustering, alignment, and denoising. On simulated datasets, the polar transformer achieves up to a $2\times$ reduction in mean squared error (MSE) at an SNR of $0.02$, suggesting new opportunities for data-driven approaches to reconstruction in cryo-EM and related tomographic modalities.
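To illustrate why a polar representation pairs naturally with rotational symmetry, the following is a minimal sketch, not the authors' implementation. It assumes an image already sampled on a polar grid (radii by angles), so that an in-plane rotation by a multiple of the angular step becomes a circular shift along the angular axis. The function name `angular_conv`, the grid sizes, and the random data are hypothetical choices made for illustration. The sketch checks numerically that a circular convolution along the angular axis commutes with such shifts, which is the equivariance property the abstract refers to.

```python
import numpy as np

def angular_conv(x, kernel):
    """Circular convolution along the angular (last) axis, computed via FFT.

    x, kernel: arrays of shape (n_radii, n_angles); each radius has its own filter.
    """
    x_hat = np.fft.fft(x, axis=-1)
    k_hat = np.fft.fft(kernel, axis=-1)
    return np.real(np.fft.ifft(x_hat * k_hat, axis=-1))

rng = np.random.default_rng(0)
n_radii, n_angles = 8, 32  # hypothetical polar grid size

image_polar = rng.standard_normal((n_radii, n_angles))  # stand-in for a polar-sampled image
kernel = rng.standard_normal((n_radii, n_angles))       # stand-in for a learned angular filter

# On this grid, rotating by shift * (2*pi / n_angles) is a circular shift in angle.
shift = 5
rotated = np.roll(image_polar, shift, axis=-1)

# Equivariance: filtering then rotating equals rotating then filtering.
filter_then_rotate = np.roll(angular_conv(image_polar, kernel), shift, axis=-1)
rotate_then_filter = angular_conv(rotated, kernel)

equivariance_error = np.max(np.abs(filter_then_rotate - rotate_then_filter))
print(f"max equivariance error: {equivariance_error:.2e}")
```

The error should be at the level of floating-point round-off. Rotations that are not multiples of the angular step would require interpolation, which this sketch does not cover.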
