Joint Denoising of Cryo-EM Projection Images using Polar Transformers

Joakim Andén, Justus Sagemüller

TL;DR

The paper tackles reconstructing a molecular structure from multiple noisy cryo-EM projections that are related by random in-plane rotations and suffer from very low SNR. It introduces the polar transformer, a symmetry-aware architecture that combines a polar representation, angular convolutions, and angular attention to jointly align, cluster, and denoise multiple projections while preserving rotational equivariance. On simulated data, it achieves up to a $2\times$ reduction in relative MSE at $\text{SNR} = 0.02$ and improves downstream 3D reconstructions, demonstrating the potential of data-driven, geometry-aware reconstruction in cryo-EM. The approach provides a path toward end-to-end, symmetry-preserving reconstruction pipelines for cryo-EM and related tomographic modalities.

Abstract

Many imaging modalities involve reconstruction of unknown objects from collections of noisy projections related by random rotations. In one of these modalities, cryogenic electron microscopy (cryo-EM), the extremely low signal-to-noise ratio (SNR) makes integration of information from multiple images crucial. Existing approaches to cryo-EM processing, however, either rely on handcrafted priors or apply deep learning only on select portions of the pipeline, such as particle picking, micrograph denoising, or refinement. A fully end-to-end reconstruction approach requires a neural network architecture that integrates information from multiple images while respecting the rotational symmetry of the measurement process. In this work, we introduce the polar transformer, a new neural network architecture that combines polar representations and transformers along with a convolutional attention mechanism that preserves the rotational symmetry of the problem. We apply it to the particle-level denoising problem, where it is able to learn discriminative features in the images, enabling optimal clustering, alignment, and denoising. On simulated datasets, this achieves up to a $2\times$ reduction in mean squared error (MSE) at a signal-to-noise ratio (SNR) of $0.02$, suggesting new opportunities for data-driven approaches to reconstruction in cryo-EM and related tomographic modalities.
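To make the noise level concrete, the following is a minimal sketch of what $\text{SNR} = 0.02$ means for a single image and why combining several images helps. It assumes SNR is defined as mean signal power divided by noise variance and that relative MSE is the squared error normalized by the squared norm of the clean image; both definitions, the helper names `add_noise_at_snr` and `relative_mse`, and the random stand-in image are illustrative assumptions, not taken from the paper. The images here are already aligned, which sidesteps the alignment and clustering that the polar transformer has to learn.

```python
import numpy as np

def add_noise_at_snr(image, snr, rng):
    """Add white Gaussian noise so that (mean signal power) / (noise variance) == snr."""
    signal_power = np.mean(image ** 2)
    noise_std = np.sqrt(signal_power / snr)
    return image + rng.normal(0.0, noise_std, size=image.shape)

def relative_mse(estimate, reference):
    """Squared error normalized by the squared norm of the reference image."""
    return np.sum((estimate - reference) ** 2) / np.sum(reference ** 2)

rng = np.random.default_rng(0)
clean = rng.standard_normal((64, 64))  # stand-in for a clean projection (assumption)

# One noisy observation: the relative MSE is about 1 / SNR = 50.
single = add_noise_at_snr(clean, 0.02, rng)
err_single = relative_mse(single, clean)

# Averaging 100 aligned observations divides the noise variance by 100,
# so the relative MSE drops to about 0.5.
stack = np.stack([add_noise_at_snr(clean, 0.02, rng) for _ in range(100)])
err_avg = relative_mse(stack.mean(axis=0), clean)

print(f"single image: {err_single:.2f}, average of 100: {err_avg:.3f}")
```

Under these assumptions a single image carries roughly fifty times more noise energy than signal energy, which is why the abstract stresses integrating information across images.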

Paper Structure

This paper contains 21 sections, 1 theorem, 35 equations, 5 figures, 1 table.

Key Result

Proposition 1

Let […]. Then […], where $\star$ denotes a discrete 2D convolution and […].

Figures (5)

  • Figure 1: (a) A simulated $64\times 64$ projection image of PDB ID 2pkq. (b) The projection image with a polar grid superimposed (downsampled by $4\times$ for visualization). (c) The polar representation (horizontal axis: angles; vertical axis: radii). (d) The reconstructed image with relative MSE of $3.5 \cdot 10^{-4}$.
  • Figure 2: Sample images with CTF, shifts, and Gaussian noise at $\text{SNR} = 0.02$ (top five rows), and for the Parakeet simulator (bottom row).
  • Figure 3: (left) 3D reconstructions of DMSO reductase from denoised images: ground truth (gray), polar transformer (red), and DnCNN (blue). (right) FSC curves of the reconstructions.
  • Figure 4: Outline of the polar CNN architecture. The Cartesian image is first transformed into the polar representation using the mapping $P$. It then passes through a series of 1D convolutional layers defined by the angular filters $h_1, h_2, \ldots, h_n$, each of which includes a ReLU activation and a group normalization step (not shown above). Finally, the result is converted back to a Cartesian image using $P^{-1}$.
  • Figure 5: Outline of the polar transformer architecture. The Cartesian images are first preprocessed by a polar CNN (after being converted to the polar representation), then fed into the angular attention mechanism. This mechanism combines the information from the various images by computing the key and query vectors for each image (using another set of polar CNNs), calculating the attention coefficients, and using these together with the value vectors to compute the output. The output is then processed by another polar CNN and finally converted to a Cartesian image. Note that the angular attention block can be repeated, but we have found that one such block suffices for the purposes of denoising.
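The polar representation in Figure 1 and the angular convolutions in Figure 4 can be illustrated with a small NumPy sketch. It shows the property the architecture relies on: an in-plane rotation of the Cartesian image becomes a circular shift along the angular axis of the polar representation, and a circular 1D convolution along that axis (followed by ReLU) commutes with such shifts. The grid sizes, the bilinear interpolation, the helper names `to_polar` and `angular_conv`, and the use of one shared filter for all radii are assumptions made for illustration; the paper's mapping $P$ and its layers (which also include group normalization) may differ. The check uses a 90° rotation because it can be applied to the pixel grid without interpolation error.

```python
import numpy as np

def to_polar(image, n_radii, n_angles):
    """Bilinearly resample a square image onto a polar grid (rows: radii, columns: angles)."""
    n = image.shape[0]
    c = (n - 1) / 2.0  # center of the pixel grid
    radii = np.linspace(0.0, c, n_radii)
    angles = 2.0 * np.pi * np.arange(n_angles) / n_angles
    rows = c + radii[:, None] * np.sin(angles)[None, :]
    cols = c + radii[:, None] * np.cos(angles)[None, :]
    # Indices of the top-left corner of each interpolation cell, kept inside the image.
    r0 = np.clip(np.floor(rows).astype(int), 0, n - 2)
    c0 = np.clip(np.floor(cols).astype(int), 0, n - 2)
    wr, wc = rows - r0, cols - c0
    return ((1 - wr) * (1 - wc) * image[r0, c0]
            + (1 - wr) * wc * image[r0, c0 + 1]
            + wr * (1 - wc) * image[r0 + 1, c0]
            + wr * wc * image[r0 + 1, c0 + 1])

def angular_conv(polar, h):
    """Circular 1D convolution along the angular axis (same filter for every radius), then ReLU."""
    out = sum(h[k] * np.roll(polar, k, axis=1) for k in range(len(h)))
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))  # stand-in for a projection image (assumption)
h = rng.standard_normal(5)             # hypothetical angular filter
n_angles = 128

polar = to_polar(image, n_radii=32, n_angles=n_angles)
polar_rot = to_polar(np.rot90(image), n_radii=32, n_angles=n_angles)

# A 90-degree rotation of the image is a shift by a quarter turn along the angular axis.
shift = -(n_angles // 4)
rotation_is_shift = np.allclose(polar_rot, np.roll(polar, shift, axis=1))

# The angular layer commutes with that shift, i.e. it is rotation-equivariant.
layer_is_equivariant = np.allclose(
    angular_conv(polar_rot, h),
    np.roll(angular_conv(polar, h), shift, axis=1),
)
print(rotation_is_shift, layer_is_equivariant)
```

Because ReLU acts pointwise, stacking several such layers keeps the shift property, which is what lets a network built this way treat rotated copies of a projection consistently.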

Theorems & Definitions (2)

  • Proposition 1
  • proof